MMLU is only 25.64, anything wrong?

#8
by cloudyu - opened

I just checked https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard and found that the score for this model is really bad. Is there anything wrong with the score?

Cohere For AI org

Hey @cloudyu -- I don't see this score or any model with this score average on the leaderboard. Can you specify the model name you are seeing?

Command R plus is not yet on the leaderboard. We submitted it jointly with Hugging Face yesterday, and my understanding is that it will be made public shortly.

Cohere For AI org

I'm going to close this for now -- but feel free to re-open with additional details.

sarahooker changed discussion status to closed
Cohere For AI org

Hi @cloudyu , a random results file was accidentally pushed on our side under the wrong namespace - you can find the c4ai-command-r-plus details here while the leaderboard is rebuilding.

Now MMLU is 75.73 on the leaderboard; that's great.

Cohere For AI org

Thanks @cloudyu ! Our full results on the Open LLM leaderboard are now public at https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard -- here is a quick comparison with a subset of other relevant models whose scores are publicly available on the leaderboard.

Hope this is helpful!

Screenshot 2024-04-06 at 11.02.07 AM.png

sarahooker changed discussion status to open
sarahooker changed discussion status to closed
