Why AI leaderboards miss the mark
Tuesday, 22 July 2025 at 12:00 PM EDT
AI systems increasingly interact with humans in sensitive and meaningful ways. Leaderboards should be a useful, reliable tool for understanding which models perform best in certain areas.
However, criticisms of some of the most popular leaderboards, ranging from gaming of the rankings and the underrepresentation of open-source models to misalignment with real-world ability, have cast doubt on their reliability.
This panel will discuss where current leaderboards are falling short, what can be done to improve them, and how useful they actually are in the real world. Come along to better understand leaderboards, their limitations, and how you should and shouldn’t use them.
