@onekq on Hugging Face: "October version of Claude 3.5 lifts SOTA (set by its June version) by 7…"

onekq

posted an update about 1 month ago

October version of Claude 3.5 lifts SOTA (set by its June version) by 7 points.
onekq-ai/WebApp1K-models-leaderboard

Closed-source models are widening the gap again.

Note: Our frontier leaderboard now uses double test scenarios because the single-scenario test suite has been saturated.

onekq Yi Cui