Update README.md
README.md
CHANGED
@@ -131,18 +131,18 @@ Core model results for the new and original 7B model are found below.
 
 And for the 1B model:
 
-| task | random | [StableLM 2 1.6b](https://huggingface.co/stabilityai/stablelm-2-1_6b)\* | [Pythia 1B](https://huggingface.co/EleutherAI/pythia-1b) | [TinyLlama 1.1B](https://huggingface.co/TinyLlama/TinyLlama-1.1B-intermediate-step-1195k-token-2.5T) | **OLMo 1B** (ours) |
-| ------------- | ------ | ----------------- | --------- | -------------------------------------- | ------- |
-| arc_challenge | 25 | 43.81 | 33.11 | 34.78 | 34.45 |
-| arc_easy | 25 | 63.68 | 50.18 | 53.16 | 58.07 |
-| boolq | 50 | 76.6 | 61.8 | 64.6 | 60.7 |
-| copa | 50 | 84 | 72 | 78 | 79 |
-| hellaswag | 25 | 68.2 | 44.7 | 58.7 | 62.5 |
-| openbookqa | 25 | 45.8 | 37.8 | 43.6 | 46.4 |
-| piqa | 50 | 74 | 69.1 | 71.1 | 73.7 |
-| sciq | 25 | 94.7 | 86 | 90.5 | 88.1 |
-| winogrande | 50 | 64.9 | 53.3 | 58.9 | 58.9 |
-| Average | 36.11 | 68.41 | 56.44 | 61.48 | 62.42 |
+| task | random | [StableLM 2 1.6b](https://huggingface.co/stabilityai/stablelm-2-1_6b)\* | [Pythia 1B](https://huggingface.co/EleutherAI/pythia-1b) | [TinyLlama 1.1B](https://huggingface.co/TinyLlama/TinyLlama-1.1B-intermediate-step-1195k-token-2.5T) | OLMo 1B | **OLMo 1.7-1B** (ours) |
+| ------------- | ------ | ----------------- | --------- | -------------------------------------- | ------- | ---- |
+| arc_challenge | 25 | 43.81 | 33.11 | 34.78 | 34.45 | 36.5 |
+| arc_easy | 25 | 63.68 | 50.18 | 53.16 | 58.07 | 55.3 |
+| boolq | 50 | 76.6 | 61.8 | 64.6 | 60.7 | 67.5 |
+| copa | 50 | 84 | 72 | 78 | 79 | 83.0 |
+| hellaswag | 25 | 68.2 | 44.7 | 58.7 | 62.5 | 66.9 |
+| openbookqa | 25 | 45.8 | 37.8 | 43.6 | 46.4 | 46.4 |
+| piqa | 50 | 74 | 69.1 | 71.1 | 73.7 | 74.9 |
+| sciq | 25 | 94.7 | 86 | 90.5 | 88.1 | 93.4 |
+| winogrande | 50 | 64.9 | 53.3 | 58.9 | 58.9 | 61.4 |
+| Average | 36.11 | 68.41 | 56.44 | 61.48 | 62.42 | 65.0 |
 
 \*Unlike OLMo, Pythia, and TinyLlama, StabilityAI has not yet disclosed the data StableLM was trained on, making comparisons with other efforts challenging.
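As a quick arithmetic check, the Average row in each column is the unweighted mean of the nine task scores. A minimal Python sketch, with the values copied from the table above (variable names are illustrative, not part of the repo):

```python
# Per-task scores for the two OLMo columns, in table order:
# arc_challenge, arc_easy, boolq, copa, hellaswag, openbookqa,
# piqa, sciq, winogrande
olmo_1b = [34.45, 58.07, 60.7, 79, 62.5, 46.4, 73.7, 88.1, 58.9]
olmo_17_1b = [36.5, 55.3, 67.5, 83.0, 66.9, 46.4, 74.9, 93.4, 61.4]

def mean(scores):
    """Unweighted mean, as used for the table's Average row."""
    return sum(scores) / len(scores)

print(round(mean(olmo_1b), 2))     # 62.42, matching the table
print(round(mean(olmo_17_1b), 1))  # 65.0, matching the table
```

Note the columns are averaged with equal task weights; the random baseline's 36.11 is the same mean over the per-task chance scores.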