Update README.md
README.md
CHANGED
@@ -4,9 +4,9 @@ datasets:
 - togethercomputer/RedPajama-Data-1T
 ---
 
-# MPT-1b-RedPajama-200b
+# MPT-1b-RedPajama-200b-dolly
 
-MPT-1b-RedPajama-200b is a 1.3 billion parameter decoder-only transformer pre-trained on the [RedPajama dataset](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T) and subsequently fine-tuned on the [Databricks Dolly](https://github.com/databrickslabs/dolly/tree/master/data) instruction dataset.
+MPT-1b-RedPajama-200b-dolly is a 1.3 billion parameter decoder-only transformer pre-trained on the [RedPajama dataset](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T) and subsequently fine-tuned on the [Databricks Dolly](https://github.com/databrickslabs/dolly/tree/master/data) instruction dataset.
 The model was pre-trained for 200B tokens by sampling from the subsets of the RedPajama dataset in the same proportions as were used by the [Llama series of models](https://arxiv.org/abs/2302.13971).
 This model was trained by [MosaicML](https://www.mosaicml.com) and follows a modified decoder-only transformer architecture.
 
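For context, the model described in the updated card is a causal language model, so it should be loadable through the Hugging Face `transformers` auto classes. Below is a minimal sketch, assuming the checkpoint is published on the Hub as `mosaicml/mpt-1b-redpajama-200b-dolly` (the repo id is an assumption here) and that the repo ships custom modeling code, which is why `trust_remote_code=True` is passed:

```python
# Minimal usage sketch for the model described in the updated README.
# Assumptions: the checkpoint lives on the Hugging Face Hub under
# "mosaicml/mpt-1b-redpajama-200b-dolly" and includes custom modeling code,
# so trust_remote_code=True is required when loading it.
import transformers

model_name = "mosaicml/mpt-1b-redpajama-200b-dolly"  # assumed Hub repo id

tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
model = transformers.AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True,
)

# Generate a short instruction-style completion.
prompt = "Write a short note explaining what the RedPajama dataset is.\n"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The tokenizer is resolved from the same repo id as the model, so no separate tokenizer name is needed; swap in a different repo id if the checkpoint is hosted elsewhere.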