sam-mosaic
commited on
Commit
•
19180fa
1
Parent(s):
c75058f
Update README.md
Browse files
README.md
CHANGED
@@ -32,7 +32,7 @@ This model uses the MosaicML LLM codebase, which can be found in the [llm-foundr
|
|
32 |
|
33 |
MPT-7B-8k is
|
34 |
|
35 |
-
* **Licensed for the possibility of commercial use
|
36 |
* **Trained on a large amount of data** (1.5T tokens like [XGen](https://huggingface.co/Salesforce/xgen-7b-8k-base) vs. 1T for [LLaMA](https://arxiv.org/abs/2302.13971), 1T for [MPT-7B](https://www.mosaicml.com/blog/mpt-7b), 300B for [Pythia](https://github.com/EleutherAI/pythia), 300B for [OpenLLaMA](https://github.com/openlm-research/open_llama), and 800B for [StableLM](https://github.com/Stability-AI/StableLM)).
|
37 |
* **Prepared to handle long inputs** thanks to [ALiBi](https://arxiv.org/abs/2108.12409). With ALiBi, the model can extrapolate beyond the 8k training sequence length to up to 10k, and with a few million tokens it can be finetuned to extrapolate much further.
|
38 |
* **Capable of fast training and inference** via [FlashAttention](https://arxiv.org/pdf/2205.14135.pdf) and [FasterTransformer](https://github.com/NVIDIA/FasterTransformer)
|
|
|
32 |
|
33 |
MPT-7B-8k is
|
34 |
|
35 |
+
* **Licensed for the possibility of commercial use.**
|
36 |
* **Trained on a large amount of data** (1.5T tokens like [XGen](https://huggingface.co/Salesforce/xgen-7b-8k-base) vs. 1T for [LLaMA](https://arxiv.org/abs/2302.13971), 1T for [MPT-7B](https://www.mosaicml.com/blog/mpt-7b), 300B for [Pythia](https://github.com/EleutherAI/pythia), 300B for [OpenLLaMA](https://github.com/openlm-research/open_llama), and 800B for [StableLM](https://github.com/Stability-AI/StableLM)).
|
37 |
* **Prepared to handle long inputs** thanks to [ALiBi](https://arxiv.org/abs/2108.12409). With ALiBi, the model can extrapolate beyond the 8k training sequence length to up to 10k, and with a few million tokens it can be finetuned to extrapolate much further.
|
38 |
* **Capable of fast training and inference** via [FlashAttention](https://arxiv.org/pdf/2205.14135.pdf) and [FasterTransformer](https://github.com/NVIDIA/FasterTransformer)
|