How Ai2 builds breakthrough AI faster with quality human data from Prolific
Ai2 is a non-profit AI lab that’s building breakthrough AI to solve the world’s biggest problems. Founded in 2014, they develop foundational AI research and innovation to deliver real-world impact through large-scale open models, data, robotics, conservation, and beyond.
Collecting high-quality human data for Ai2's state-of-the-art multimodal models could take weeks, slowing down research cycles. Prolific's platform and partnership approach transformed that workflow. Now Ai2's researchers gather the data they need in just hours and use it to build models that match leading closed-source systems.
How do you get rich human data without slowing down research?
To build AI that can tackle the world's biggest challenges, Ai2 needs transparent, scientifically grounded models trained on rich, reliable human data.
"AI tools are built for - and by - humans," explains Sophie Lebrecht, Chief Operating Officer at Ai2. "Without strong human input, the outcomes just won't align with real-world expectations."
But collecting that human data at the scale and speed Ai2 needed was a major bottleneck. The research teams, including the Large Language Model team and Multimodal Model team, are pushing the boundaries of video language models and vision systems. These complex models require iterative research cycles with constant human feedback.
Before Prolific, researchers had to be much more hands-on. There weren't streamlined processes or automated tooling to support their work, especially at scale.
"We didn't have a great partner who streamlined this process for us and did the heavy lifting to take the burden off researchers," says Taira Anderson, Program Manager supporting Ai2's multimodal teams.
The challenge was threefold:
- Speed without sacrificing rigor: Researchers needed to iterate fast, testing hypotheses and refining data strategies in days, not weeks.
- Consistent quality: Complex annotation tasks require engaged, skilled annotators who can provide rich, reliable data across thousands of tasks.
- Research-friendly workflows: AI research doesn't follow a linear path. The team needed a partner who could pivot with them as they discovered new insights and adjusted their approach.
A common industry assumption is that you have to choose between collecting data quickly and collecting high-quality data. Ai2 needed both.
The solution: A partnership built for both speed and quality
When Ai2 turned to Prolific, they found a partner who understood the iterative nature of AI research and could scale in lockstep with their project cycles.
Streamlined tooling that removes operational burden
Prolific gave Ai2's researchers a platform that handles the complexity of large-scale human data collection, so they can focus on what matters: advancing AI capabilities.

For Zixian Ma, a PhD student and researcher on Ai2's multimodal team, the platform's intuitive design made launching studies straightforward. "The UI is very user-friendly. We create a draft study, configure a few features, and we can launch," she explains.
Two features proved especially valuable:
- Taskflow for seamless study management: Researchers upload a CSV file with URLs, and Prolific automatically allocates them to participants. No manual assignment logic needed.
- API for programmatic scale: For approval processes involving thousands of annotations, the API lets researchers review and approve submissions programmatically, cutting days of manual work down to hours.
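To make these two workflows concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Ai2's pipeline or Prolific's API: the column name `task_url`, the helper names, the submission fields, and the word-count quality check are all hypothetical. The `approve` argument stands in for whatever API call a team would actually use to approve a submission.

```python
import csv
import io
from typing import Callable, Iterable


def build_task_csv(urls: Iterable[str], column: str = "task_url") -> str:
    """Return CSV text with one task URL per row.

    The column name is an assumption made for this sketch.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([column])
    for url in urls:
        writer.writerow([url])
    return buffer.getvalue()


def review_submissions(
    submissions: Iterable[dict],
    passes_check: Callable[[dict], bool],
    approve: Callable[[str], None],
) -> list:
    """Approve submissions that pass the check; return ids held for manual review."""
    held = []
    for submission in submissions:
        if passes_check(submission):
            # In practice this callable would wrap a real API request.
            approve(submission["id"])
        else:
            held.append(submission["id"])
    return held


# Example usage with made-up data.
csv_text = build_task_csv(
    ["https://example.org/clip/1", "https://example.org/clip/2"]
)

approved = []
held = review_submissions(
    [
        {"id": "a1", "caption": "A person pours coffee into a red mug on a desk."},
        {"id": "b2", "caption": "ok"},
    ],
    # Hypothetical quality rule: a caption needs at least five words.
    passes_check=lambda s: len(s["caption"].split()) >= 5,
    approve=approved.append,
)
```

Passing the approval step in as a callable keeps the review rule separate from any particular API client, so the same rule can be tested offline on a small pilot batch before it is run over thousands of annotations.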
A pool of participants ready for complex tasks
Ai2's annotation tasks aren't simple. They need participants who can handle sophisticated instructions, from identifying specific objects in video frames to providing dense captions that capture nuanced visual details.
Prolific's pool of AI Taskers - participants skilled in everything from image and video annotation to factuality testing and comparative reasoning - gave Ai2 access to annotators ready to deliver from day one. They could instantly find specialists who understood complex video annotation work and could provide the detailed, nuanced captions that sophisticated vision models require.
"These taskers have already gone through a whole process of screening," Zixian notes. "They've provided high-quality annotations, which have been very helpful."
The screening process is crucial for Ai2. They filter for approval rates and specific experience, like video annotation skills. With Prolific's diverse pool, they can find the exact participants they need for each task.
A responsive partnership that evolves with research needs
Beyond the platform features, Prolific's team worked directly with Ai2's researchers to build solutions tailored to their research goals.
"A big reason why we continue to choose Prolific is that you've been such great partners with us," Taira says.

"The Prolific team has been very supportive and responsive," Zixian notes. "They always reach out to us asking for feedback, offer suggestions, even offer ideas for new features that would help us collect even higher-quality data."
"You've gone the extra mile to build new features and work with us to resolve any snags, just to get those roadblocks out of the way so our researchers can really excel at what they need to do," Taira adds. "I don't believe that's the level of relationship we would get on certain other platforms."
The results: From bottleneck to breakthrough
Since partnering with Prolific, Ai2 has transformed its approach to human data collection. Key outcomes from the partnership across Ai2's research teams include:
- Pilot data collected in hours, enabling same-day iteration
- Full-scale datasets ready within a week, supporting rapid model development
- Models matching the performance of leading closed-source systems, built on high-quality human data
- New model capabilities shaped by nuanced human insights
- Thousands of annotations processed programmatically through the API, cutting manual work from days to hours
Same-day iteration replaces week-long cycles
Researchers can now move at the pace of their ideas. Pilot studies that used to require extensive setup now launch and complete in a single day.
"Prolific has helped me personally iterate on model development much faster than before," Zixian says.

State-of-the-art performance from quality data
The high-quality data Ai2 collects through Prolific directly impacts model performance. In their video project, models trained on Prolific data matched the performance of leading closed-source systems like Gemini and GPT.
"This is a really nice benefit of the high-quality human data we collected," Zixian notes.
The data quality also revealed new capabilities. When participants showed different approaches to annotation tasks - like pointing to individual people versus the center of a group - the research team incorporated both behaviors into their models, making them more human-like and versatile.
Scale without trade-offs
In Ai2's experience, the assumption that speed and quality can't coexist has not held up.
"Prolific doesn't slow us down," Taira explains. "It's able to iterate and scale in lockstep with our project cycles. We're a research institute - our cycles are going to be different than other places. Researchers are working in iterative steps and processes. Prolific enables that research-driven process to move forward."

The human data advantage in AI research
The partnership between Ai2 and Prolific shows what's possible when you refuse to compromise on either speed or quality in AI development.
For Ai2, this means they can pursue their mission of building breakthrough AI to tackle the world's biggest challenges - from multimodal models that understand vision and video to agents that can help scientists with complex research tasks.
The work happening at Ai2 has shown that quality beats quantity in multimodal language model development. "People say 'bigger isn't always better,' but you don't believe it until it happens," Taira reflects. "You realize quality data, even at a smaller scale, can beat big data that's just not as good. The value Prolific brought in this experience was profound."
As Ai2 continues to push boundaries in multimodal language models, video understanding, and autonomous agents, Prolific will support the human data collection that makes it all possible.
"The relationship between human data collection and AI development is symbiotic," Taira notes. "They co-evolve. AI is a tool created for and by humans. That tool must be relevant to and useful for humans. It's important that the feedback loop is robust. Prolific is a great partner in that."
Build better AI with quality human data
If you're developing AI models that require high-quality human feedback, evaluation, or alignment, Prolific can help you move faster without compromising on data quality. Learn more about Prolific for AI.





