Can't reproduce MATH performance
#66
by
jpiabrantes
- opened
In the model card you say you achieved 51.9 on MATH using Llama 3.1 8B with zero-shots.
I can't reproduce this. Did you use lm-evaluation-harness or some other code?
You may need to remove the system role if you use the default apply_chat_template
function.