Update README.md
README.md
@@ -53,7 +53,7 @@ The wand db log is below:
 ## MMLU-SR benchmarks
 
 Below are before and after [MMLU-SR benchmark](https://github.com/EleutherAI/lm-evaluation-harness/tree/main/lm_eval/tasks/mmlusr) scores for the MMLU medical topics listed below were measured before and afterwards. MMLU-SR is a dataset
-used by the LM Evaluation Harness for
+used by the LM Evaluation Harness for rigorous benchmarking of true model comprehension.
 
 ### Before (unquantized internistai lm-eval run on Apple Metal)
 