eval
README.md
CHANGED
@@ -100,7 +100,7 @@ litgpt evaluate --tasks 'leaderboard' --out_dir 'evaluate-0/' --batch_size 4 --d
 litgpt evaluate --tasks 'hellaswag,gsm8k,truthfulqa_mc2,mmlu,winogrande,arc_challenge' --out_dir 'evaluate-1/' --batch_size 4 --dtype 'bfloat16' out/pretrain/final/
 ```
 
-Tasks |Version| Filter |n-shot| Metric | |Value | |Stderr|
+| Tasks |Version| Filter |n-shot| Metric | |Value | |Stderr|
 |---------------------------------------|------:|----------------|-----:|-----------|---|-----:|---|-----:|
 |arc_challenge | 1|none | 0|acc |↑ |0.2082|± |0.0119|
 | | |none | 0|acc_norm |↑ |0.2474|± |0.0126|