
Getting AI to behave: How Prolific helped researchers align AI with public opinion

Simon Banks
June 24, 2025

Challenge

When it comes to sensitive topics like mental health, getting AI to behave in ways that genuinely match public expectations is a complex task. That’s why researchers from Remesh, AIDF, the University of Washington, the University of Notre Dame, and the University of Liverpool wanted to find a reliable way to measure whether language models (LMs) truly align with public values. 

Past approaches often struggled because they entangled what the public genuinely wanted with uncertain assumptions about how best to achieve it. To overcome this, the research team partnered with Prolific to gather representative and trustworthy insights directly from the public.

The study needed a participant sample large and diverse enough to accurately reflect the views of the US public, along with dependable data that could confidently underpin the study’s findings.

Solution

Prolific provided the research team with exactly what was needed: highly engaged participants who genuinely represented US public opinion, coupled with reliable data that could withstand rigorous scrutiny.

Diverse, representative recruitment

Using Prolific’s participant pool, the researchers engaged around 600 US participants. Prolific ensured that the sample closely matched important demographic factors, including gender, age, and political affiliation, giving the researchers confidence that the data accurately represented public views.

Reliable and high-quality data

Prolific’s stringent quality standards meant the researchers received thoughtful, dependable responses. With careful participant selection and clear instructions, they could trust the data to reflect genuine public opinion, even on challenging topics like mental health.

Flexible, efficient integration

Prolific’s easy integration with external research tools allowed the team to focus on their innovative method—the "chain of alignment" (CoA)—without worrying about logistical or technical hurdles. This streamlined collaboration among participants, researchers, and experts.

Execution

Using collective dialogues conducted via Prolific, participants defined clear and broadly supported objectives related to AI responses in three key mental health areas. Mental health experts then translated these public objectives into specific, actionable rules for evaluating AI behavior. These rules were integrated into a reward system designed to measure how closely AI responses aligned with public values.
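To make the shape of that pipeline concrete, here is a minimal Python sketch of a rule-based reward of the kind the study describes: expert rules score a model response, and the scores are aggregated into a single measure of alignment. The rule functions and keyword heuristics below are hypothetical placeholders for illustration, not the expert rules or the evaluation machinery from the study.

from typing import Callable

# A rule scores a model response on a 0-to-1 scale.
Rule = Callable[[str], float]

def points_to_professional_support(response: str) -> float:
    # Hypothetical rule: mental health responses should point
    # toward professional support.
    keywords = ("therapist", "counselor", "professional", "hotline")
    return 1.0 if any(k in response.lower() for k in keywords) else 0.0

def avoids_diagnosing(response: str) -> float:
    # Hypothetical rule: responses should not offer a diagnosis.
    forbidden = ("you clearly have", "your diagnosis is")
    return 0.0 if any(p in response.lower() for p in forbidden) else 1.0

def alignment_reward(response: str, rules: list[Rule]) -> float:
    # Aggregate per-rule scores into a single reward in [0, 1].
    return sum(rule(response) for rule in rules) / len(rules)

rules = [points_to_professional_support, avoids_diagnosing]
reply = "I'm sorry you're struggling. A counselor or a crisis hotline can help."
print(f"Alignment reward: {alignment_reward(reply, rules):.2f}")  # 1.00

In the study itself, each rule is grounded in a public-defined objective and evaluated far more carefully than simple keyword matching; the sketch only shows how per-rule scores can be aggregated into a measurable reward.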

Results

The CoA method produced clear and impactful results:

  • Public-defined objectives received exceptionally high levels of support (between 96% and 98%), confirming a strong consensus.

  • Expert-crafted rules effectively turned abstract public objectives into practical guidelines for AI alignment.

  • The resulting evaluation system closely matched expert judgments (Pearson’s r = 0.841), validating the approach’s accuracy and reliability.
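
The reported agreement with experts is a Pearson correlation, which is straightforward to reproduce once you have paired scores. The sketch below computes Pearson’s r between an automated reward system’s scores and expert ratings; the two score lists are made-up illustrative data, not numbers from the study.

import math

def pearson_r(xs: list[float], ys: list[float]) -> float:
    # Pearson correlation coefficient between two equal-length samples.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

auto_scores = [0.9, 0.4, 0.7, 0.2, 0.8]     # hypothetical reward-system scores
expert_scores = [0.95, 0.5, 0.6, 0.1, 0.85]  # hypothetical expert ratings
print(f"r = {pearson_r(auto_scores, expert_scores):.3f}")  # r = 0.964

A value near 1 means the automated scores rise and fall with the expert ratings; the study’s r = 0.841 indicates the reward system tracked expert judgment closely.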

Conclusion

The collaboration between Remesh, AIDF, the University of Washington, the University of Notre Dame, and the University of Liverpool shows how strong, representative public data can significantly enhance AI alignment research. Prolific’s ability to deliver precise, dependable insights enabled the researchers to create a scalable method that reliably aligns AI with public opinion.

For researchers tackling ambitious AI studies, Prolific offers a proven solution for collecting the diverse, high-quality public input essential for achieving meaningful alignment results.

Citation

Konya, A., Ovadya, A., Feng, K., Chen, Q. Z., Schirch, L., Irwin, C., & Zhang, A. X. (2024). Chain of Alignment: Integrating Public Will with Expert Intelligence for Language Model Alignment. arXiv:2411.10534.

Research institutions: Remesh, AIDF, University of Washington, University of Notre Dame, University of Liverpool.
