task,metric,value,err,version
anli_r1,acc,0.356,0.01514904265930662,0
anli_r2,acc,0.366,0.01524061272640576,0
anli_r3,acc,0.35583333333333333,0.013826518748493315,0
arc_challenge,acc,0.2713310580204778,0.012993807727545792,0
arc_challenge,acc_norm,0.310580204778157,0.013522292098053054,0
arc_easy,acc,0.5231481481481481,0.010248782484554471,0
arc_easy,acc_norm,0.5046296296296297,0.010259343705889728,0
boolq,acc,0.5131498470948013,0.008742030090044975,1
cb,acc,0.5,0.06741998624632421,1
cb,f1,0.34491725768321513,,1
copa,acc,0.71,0.045604802157206845,0
hellaswag,acc,0.4587731527584147,0.004972790690640181,0
hellaswag,acc_norm,0.5902210714997013,0.004907877144720023,0
piqa,acc,0.7236126224156693,0.010434162388275624,0
piqa,acc_norm,0.7312295973884657,0.010343392940090011,0
rte,acc,0.4729241877256318,0.030052303463143706,0
sciq,acc,0.795,0.012772554096113109,0
sciq,acc_norm,0.783,0.01304151375727071,0
storycloze_2016,acc,0.689470871191876,0.010700112173178448,0
winogrande,acc,0.5406471981057617,0.014005973823825136,0