Articles

Beyond the algorithms at AI UK

Simon Banks
May 16, 2025

In a compelling talk at AI UK, an event from the Alan Turing Institute, Prolific CEO Phelim Bradley challenged the dominant narrative around artificial intelligence by highlighting what truly makes these systems work: human data.

"For every magical moment you've had with an LLM, countless hours of human labor have gone into making that possible," Phelim noted. "What makes artificial intelligence 'intelligent' is human intelligence."

This perspective shifts the focus from the computational power and algorithms that typically dominate headlines to the often-invisible human workforce that shapes how AI systems understand and respond to our world.

The evolution of humans in AI development

Phelim illustrated how our relationship with artificial intelligence has evolved using a growth analogy that resonated with the audience:

"In the past, AI models were like toddlers. They needed an explanation for everything. Today's AI is more like an adolescent. It is mostly independent but occasionally needs guidance on complex matters."

It’s an evolution that means human contributors are transitioning from basic data labeling (identifying cats and buses) to more sophisticated judgment tasks, such as guiding models toward responsible outcomes, ensuring factual accuracy, and providing nuanced contextual understanding.

Cultural fingerprints in our models

One of the most illuminating moments in Phelim’s talk came through an anecdote about early ChatGPT responses that overused the word "delve", a pattern reflecting frequent usage by annotators whose English was influenced by phrasing common in African business contexts.

"This cultural fingerprint reveals a profound truth," Phelim explained. "Who trains your AI model matters. Who provides this data matters."

The observation underscores why diversity in human feedback is essential for creating AI systems that work effectively across different cultural contexts. As AI becomes more advanced, having a broad range of human perspectives becomes more—not less—important.

Beyond traditional benchmarks

Phelim argued that traditional AI evaluation methods miss crucial dimensions of performance that matter to real users. In collaboration with partners like the Collective Intelligence Project and Hugging Face, Prolific is developing a human-centric benchmarking approach that evaluates AI systems through five key dimensions:

  • Preference – How well does the model align with user choices?
  • Usefulness – Does it help people accomplish their goals?
  • Ease of use – How intuitive is the interaction?
  • Trustworthiness – Do users feel confident in the model’s output?
  • Alignment – Does it respect users' values and boundaries?

“What good is a perfect technical benchmark score,” Phelim asked, “if the system fails real users in a real-world context?”
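To make the framework concrete, here is a minimal sketch of how ratings across the five dimensions might be aggregated into a per-model summary. The dimension names, the 1–5 rating scale, and the `aggregate_scores` function are illustrative assumptions, not Prolific's actual benchmarking implementation.

```python
from statistics import mean

# Illustrative dimension names, matching the five listed above.
DIMENSIONS = ["preference", "usefulness", "ease_of_use", "trustworthiness", "alignment"]

def aggregate_scores(ratings):
    """Average each dimension across all participant ratings (1-5 scale)."""
    return {dim: mean(r[dim] for r in ratings) for dim in DIMENSIONS}

# Hypothetical ratings from two participants evaluating one model.
ratings = [
    {"preference": 4, "usefulness": 5, "ease_of_use": 4, "trustworthiness": 3, "alignment": 5},
    {"preference": 3, "usefulness": 4, "ease_of_use": 5, "trustworthiness": 4, "alignment": 4},
]
print(aggregate_scores(ratings))
```

A real leaderboard would of course weight dimensions, normalize across raters, and control for task difficulty; the point here is simply that each dimension is scored by humans and reported separately rather than collapsed into one technical benchmark number.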

Explore Prolific's AI user experience leaderboard and discover how we evaluate AI systems through a human-centric lens.

From data extraction to human collaboration

Phelim concluded his talk with a vision for how the AI industry should evolve its relationship with human contributors, moving from what he termed "extractive practices" to "the renewable energy approach to human data."

The approach treats contributors not solely as data sources but as collaborators, trainers, and mentors with valuable expertise. It also means bringing the work "out of the shadows" and making it a more open, collaborative process with fair compensation and recognition.

Why Phelim's insights matter for the future of AI

As AI capabilities continue to grow, Phelim's talk highlighted several critical considerations for anyone working in AI development. At Prolific, these insights aren't just theoretical concepts—they form the foundation of how we've built our human data infrastructure. Here's how we're putting the vision into practice:

Embodying the data quality principle

"The cultural fingerprints in our models reveal who trained them," Phelim explained during his talk. This understanding drives Prolific's approach to data quality. Our global participant pool of more than 200,000 verified participants reflects Phelim's emphasis on quality human-verified data as the foundation for AI that truly understands real-world contexts.

Extending this principle further, we've developed tools like Authenticity Check, which detects AI-generated responses with 98.7% accuracy and addresses concerns about maintaining genuine human input in an increasingly AI-saturated environment.

Implementing domain-specific human guidance

When Phelim described AI's evolution from "toddler to adolescent" stages, he highlighted how human contributors must transition from basic labeling to sophisticated judgment tasks. Insights like these inform Prolific's Domain Expert network, which connects AI developers with verified specialists across numerous fields, from healthcare professionals to software developers to financial analysts.

Our approach directly reflects this shift from humans as mere data providers to collaborators and mentors, enabling precise evaluation of domain-specific AI applications with the specialized knowledge essential for vertical-specific AI solutions.

Beyond metrics: human-centric evaluation

"What good is a perfect technical benchmark score," Phelim asked in his talk, "if the system fails real users in a real-world context?" This question guides Prolific's approach to human-centric benchmarking, which goes beyond binary right and wrong metrics to measure what Phelim identified as truly mattering: usefulness, trustworthiness, and alignment with user values.

Our evaluation methodology directly implements the five dimensions outlined, ensuring AI systems are assessed on how effectively they serve real human needs rather than just technical performance.

Diversity as a cornerstone of AI development

Phelim's "delve" anecdote illustrates how cultural fingerprints become embedded in AI systems. As he noted, "Who trains your AI model matters." Prolific has built diversity into our core infrastructure, so AI systems are tested across varied perspectives from our global participant pool.

This approach directly addresses concerns about cultural biases, helping create systems that work effectively for everyone, not just dominant user segments—a principle he identified as becoming more critical as AI advances.

Check out Phelim's full talk at AI UK

Building a human-centric AI future

Phelim's talk made one thing clear: as AI systems grow more autonomous and integrate further into our daily lives, the infrastructure for meaningful human involvement becomes more essential than ever. Experience the principles from Phelim's talk in action.

Ready to implement these principles in your AI development? Discover how Prolific helps organizations access high-quality human feedback in hours, not weeks.