[Bug] Average is taking empty columns as 0
#14 by natolambert - opened
Currently, the avg column treats empty columns as zero. I don't think this should be the case!
The page also needs to be manually refreshed whenever someone wants to see up-to-date data, instead of defaulting to the latest leaderboard.
I don't think this is undesirable. It prevents models from appearing high quality while some evals are missing, which is less misleading than having them rank really high because they did well on one test.
The missing benchmarks (HellaSwag and MMLU) have random baselines of 25%, so the filler value should at least not be 0. That assumes no model performs worse than random, which should hold for HellaSwag and MMLU, though not necessarily for the other two benchmarks.
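To make the three options raised in this thread concrete, here is a rough sketch comparing them: counting a missing eval as 0, skipping it, or filling it with the random baseline. This is not the leaderboard's actual code. The function name `average_score`, the benchmark names `bench_a` and `bench_b`, the example scores, and the fallback of 0 for benchmarks without a known baseline are all assumptions made for illustration; only the 25% baselines for HellaSwag and MMLU come from the discussion above.

```python
from typing import Dict, Optional

# Random-chance baselines in percent. Only these two values come from the
# thread; benchmarks not listed here have no assumed baseline.
RANDOM_BASELINE: Dict[str, float] = {"HellaSwag": 25.0, "MMLU": 25.0}


def average_score(scores: Dict[str, Optional[float]], policy: str) -> Optional[float]:
    """Average benchmark scores, where None marks an eval that has not run.

    policy "zero":     count a missing score as 0 (the behavior reported here)
    policy "skip":     average only over the scores that exist
    policy "baseline": count a missing score as the random baseline,
                       falling back to 0 when no baseline is known (assumption)
    """
    if policy == "zero":
        values = [0.0 if s is None else s for s in scores.values()]
    elif policy == "skip":
        values = [s for s in scores.values() if s is not None]
    elif policy == "baseline":
        values = [
            RANDOM_BASELINE.get(name, 0.0) if s is None else s
            for name, s in scores.items()
        ]
    else:
        raise ValueError(f"unknown policy: {policy!r}")
    # With "skip", a model that has no results at all has no average.
    return sum(values) / len(values) if values else None


# Made-up scores for a hypothetical model with two evals still missing.
example = {"bench_a": 60.0, "bench_b": 40.0, "HellaSwag": None, "MMLU": None}

for policy in ("zero", "skip", "baseline"):
    print(policy, average_score(example, policy))
```

With these made-up numbers, the zero policy gives 25.0, skipping gives 50.0, and the baseline fill gives 37.5. The skip policy is the one that can rank a partially evaluated model above fully evaluated ones, which is the concern raised above; the baseline fill sits in between.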
@natolambert was this fixed?
clefourrier changed discussion status to closed