MMLU is only 25.64, anything wrong?

by cloudyu - opened Apr 5

Apr 5

I just checked https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard and found that the score for this model is really bad. Is there anything wrong with the score?

sarahooker

Cohere For AI org Apr 5

Hey @cloudyu -- I don't see this score or any model with this score average on the leaderboard. Can you specify the model name you are seeing?

Command R plus is not yet on the leaderboard, but it should appear there shortly. We submitted it jointly with Hugging Face yesterday, and my understanding is that the results will be made public soon.

sarahooker

Cohere For AI org Apr 5

I'm going to close this for now -- but feel free to re-open with additional details.

sarahooker changed discussion status to closed Apr 5

clefourrier

Cohere For AI org Apr 5

Hi @cloudyu , a random results file was accidentally pushed on our side under the wrong namespace - you can find the c4ai-command-r-plus details here while the leaderboard is rebuilding.

cloudyu

Apr 5

Now MMLU is 75.73 on the leaderboard; that's great.

sarahooker

Cohere For AI org Apr 6

Thanks @cloudyu! Our full results on the Open LLM leaderboard are now public at https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard -- here is a quick comparison with a subset of other relevant models whose scores are publicly available on the leaderboard.

Hope this is helpful!

sarahooker changed discussion status to open Apr 6

sarahooker changed discussion status to closed Apr 7
