task,metric,value,err,version
anli_r1,acc,0.354,0.015129868238451772,0
anli_r2,acc,0.36,0.015186527932040122,0
anli_r3,acc,0.365,0.013903485981413582,0
arc_challenge,acc,0.27559726962457337,0.013057169655761838,0
arc_challenge,acc_norm,0.30119453924914674,0.013406741767847617,0
arc_easy,acc,0.5269360269360269,0.01024488474062011,0
arc_easy,acc_norm,0.5122053872053872,0.010256726235129012,0
boolq,acc,0.5116207951070336,0.008742692742551265,1
cb,acc,0.48214285714285715,0.0673769750864465,1
cb,f1,0.33071988595866,,1
copa,acc,0.71,0.045604802157206845,0
hellaswag,acc,0.4561840270862378,0.004970585328297622,0
hellaswag,acc_norm,0.5903206532563234,0.004907694727935688,0
piqa,acc,0.720348204570185,0.01047189953030656,0
piqa,acc_norm,0.7252448313384113,0.010415033676676056,0
rte,acc,0.5054151624548736,0.03009469812323996,0
sciq,acc,0.817,0.012233587399477823,0
sciq,acc_norm,0.79,0.01288666233227455,0
storycloze_2016,acc,0.6990913949759487,0.010606289538707334,0
winogrande,acc,0.5406471981057617,0.014005973823825124,0