Join the conversation

Join the community of Machine Learners and AI enthusiasts.

Sign Up
onekq 
posted an update about 1 month ago
Post
555
October version of Claude 3.5 lifts SOTA (set by its June version) by 7 points.
onekq-ai/WebApp1K-models-leaderboard

Closed sourced models are widening the gap again.

Note: Our frontier leaderboard now uses double test scenarios because the single-scenario test suit has been saturated.
In this post