HUMAINE: the human-centered AI leaderboard

See how frontier models perform in real-world use—measured by real people, not just technical benchmarks.

In partnership with Hugging Face

How HUMAINE measures real-world AI

Comparative by design
Models are evaluated side by side, so differences are clear and meaningful—not hidden in abstract rating scales.
Multi-dimensional metrics
Models are rated across reasoning, communication style, core task performance, adaptiveness, trust, ethics, and safety.
Statistically rigorous
Model comparisons are scheduled via a TrueSkill tournament design, and rankings are derived via hierarchical Bayesian modelling, ensuring reliable, unbiased results (see the sketch below).
THE HUMAN EXPERIENCE BEHIND THE RANKINGS

HUMAINE is a human-preference leaderboard that evaluates frontier AI models based on real-world usage. Unlike traditional benchmarks that mainly track technical performance, HUMAINE captures how diverse users actually experience AI—across everyday tasks, trust and safety, adaptability, and more.

By combining rigorous methodology with feedback from a representative pool of real people, HUMAINE offers the insights model creators and evaluators need to understand not just which model performs best, but why. Updated regularly, it provides a dynamic view of model strengths, weaknesses, and user satisfaction.

Explore the leaderboard

See how leading AI models stack up in real-world use.

FAQs