Failed evaluation again for `fblgit/TheBeagle-v2beta-32B-MGS`

#994
by fblgit - opened

hi @clefourrier @alozowski

I pushed a new tokenizer_config.json to the model and submitted the evaluation again to use the chat_template as well, but it failed:

https://huggingface.co/datasets/open-llm-leaderboard/requests/blob/main/fblgit/TheBeagle-v2beta-32B-MGS_eval_request_False_bfloat16_Original.json
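For anyone who hits the same situation, a quick local sanity check before resubmitting is to confirm that the pushed `tokenizer_config.json` actually defines a `chat_template` entry. The snippet below is only a minimal sketch: the helper name `has_chat_template` is hypothetical, it assumes the file has already been downloaded locally, and it does not reproduce whatever validation the leaderboard itself performs.

```python
import json
from pathlib import Path


def has_chat_template(config_path):
    """Return True if the tokenizer config file defines a non-empty chat_template.

    Hypothetical helper for illustration; not part of any leaderboard tooling.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    template = config.get("chat_template")
    # Assumption: a single template is stored as a string, while some
    # configs store several named templates as a list.
    if isinstance(template, str):
        return bool(template.strip())
    if isinstance(template, list):
        return len(template) > 0
    # Key missing or of an unexpected type.
    return False
```

Calling `has_chat_template("tokenizer_config.json")` on the downloaded file returns `False` when the key is missing or empty, which would at least rule out one possible cause before requesting another run.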

Open LLM Leaderboard org

Hi @fblgit ,

No worries, I've resubmitted your model and it should be fine now!

I'm closing this discussion. Please ping me here if you encounter any problems with this model again, or feel free to open a new discussion.

alozowski changed discussion status to closed

Thanks @alozowski, I really appreciate your time and effort. To be honest, I'm still trying to figure out what's going on with the chat template: I'm seeing radical differences and would like to find a way to bring them together. I hope TheBeagle doesn't cause any more failures, and I'm sorry for the overhead.

Best Regards

@alozowski I see it keeps failing :( Is anything wrong with it? Could you try rerunning the non-chat_template one to see if that's the issue?

Open LLM Leaderboard org

Hi @fblgit ,

Thanks for pinging!
I'm keeping an eye on this model. Unfortunately, it failed because of a network issue on our cluster, which should no longer be a problem. I'll debug the evaluation process and get back to you soon!

@alozowski
The model 'fblgit/TheBeagle-v2beta-32B-MGS' with revision 'dfaae005c6aa9a3aa5b49b8ee4b4773cc7aaea62' and precision 'bfloat16' has already been submitted.

I'd like to evaluate this model in bfloat16 with the fixed tokenizer that I submitted a while ago. Can you please set it to PENDING? It failed multiple times last week.

It should be this file https://huggingface.co/datasets/open-llm-leaderboard/requests/blob/main/fblgit/TheBeagle-v2beta-32B-MGS_eval_request_False_bfloat16_Original.json

Open LLM Leaderboard org

Hi @fblgit ,

The evaluation has been corrected. You will be able to find your model today once the Leaderboard restarts (within one hour of this message)!
