Update README.md (#4)
- Update README.md (8f3fc9a733aa72076d79e747063c80b06ee5beee)
README.md
CHANGED
@@ -274,8 +274,8 @@ Data used for model training and how the data was processed.
 
 ### Training Dataset
 
-These models were trained on a dataset of text data that includes a wide variety
-
+These models were trained on a dataset of text data that includes a wide variety of sources. The 27B model was trained with 13 trillion tokens and the 9B model was trained with 8 trillion tokens.
+Here are the key components:
 
 * Web Documents: A diverse collection of web text ensures the model is exposed
 to a broad range of linguistic styles, topics, and vocabulary. Primarily