Look at that!
Current benchmarks have become too easy for recent models; evaluating on them makes about as much sense as grading high school students on middle school problems. So the team built a new version of the Open LLM Leaderboard with new benchmarks.
Stellar work from @clefourrier @SaylorTwift and the team!
Read the blog post: open-llm-leaderboard/blog
Explore the leaderboard: open-llm-leaderboard/open_llm_leaderboard