pythia-160m-sft-hh / README.md

wandb run: https://wandb.ai/eleutherai/pythia-rlhf/runs/e0drjcsz?workspace=user-yongzx

Model Evals:

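| Task           | Version | Filter | Metric     |   Value |   | Stderr |
|----------------|---------|--------|------------|--------:|---|-------:|
| arc_challenge  | Yaml    | none   | acc        |  0.1877 | ± | 0.0114 |
|                |         | none   | acc_norm   |  0.2372 | ± | 0.0124 |
| arc_easy       | Yaml    | none   | acc        |  0.4390 | ± | 0.0102 |
|                |         | none   | acc_norm   |  0.4082 | ± | 0.0101 |
| logiqa         | Yaml    | none   | acc        |  0.1889 | ± | 0.0154 |
|                |         | none   | acc_norm   |  0.2473 | ± | 0.0169 |
| piqa           | Yaml    | none   | acc        |  0.6213 | ± | 0.0113 |
|                |         | none   | acc_norm   |  0.6279 | ± | 0.0113 |
| sciq           | Yaml    | none   | acc        |  0.7230 | ± | 0.0142 |
|                |         | none   | acc_norm   |  0.6840 | ± | 0.0147 |
| winogrande     | Yaml    | none   | acc        |  0.5162 | ± | 0.0140 |
| wsc            | Yaml    | none   | acc        |  0.3654 | ± | 0.0474 |
| lambada_openai | Yaml    | none   | perplexity | 58.9478 | ± | 2.7662 |
|                |         | none   | acc        |  0.2602 | ± | 0.0061 |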
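
To help read the Value ± Stderr columns, here is a minimal sketch that turns a few of the reported numbers into approximate 95% intervals. It assumes the Stderr column is a standard error and that a normal approximation (value ± 1.96 × stderr) is adequate; neither is stated in this README. The helper name `confidence_interval` is hypothetical, and the numbers are copied from the table above.

```python
# Normal-approximation 95% interval: value ± 1.96 * stderr.
# Assumption: the reported Stderr is a standard error of the metric.
Z_95 = 1.96


def confidence_interval(value, stderr, z=Z_95):
    """Return (low, high) for value ± z * stderr."""
    return value - z * stderr, value + z * stderr


# (value, stderr) pairs copied from the acc rows of the evals table.
results = {
    "arc_challenge": (0.1877, 0.0114),
    "winogrande": (0.5162, 0.0140),
    "wsc": (0.3654, 0.0474),
}

for task, (value, stderr) in results.items():
    low, high = confidence_interval(value, stderr)
    print(f"{task}: {value:.4f} (95% CI {low:.4f} to {high:.4f})")
```

Under these assumptions, the winogrande interval spans 0.5, so that score may not be distinguishable from guessing on a two-choice task.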