Update README.md
README.md
CHANGED
@@ -12,7 +12,7 @@ tags:
 
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/630417380907b9a115c6aa9f/tJ3zBlUS83BzKx0G0VutU.png)
 
-The original model, LLaMA 1 was pre-trained at a sequence length of 2048 tokens. We went through two individual runs, targeting a sequence length of 16,
+The original model, LLaMA 1 was pre-trained at a sequence length of 2048 tokens. We went through two individual runs, targeting a sequence length of 16,384 which is a
 significant increase over the original length. While it was originally pre-trained on 1.4T tokens, it was shown to respond positively to our 500M token train and will
 coherently write and keep the same writing format (granted some caveats) up to 12K tokens relatively consistently.
 
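For readers who want to exercise the 16,384-token window described in the edited paragraph, here is a minimal usage sketch with Hugging Face `transformers`. It is not part of this commit: the repo id, the input file, and the 12,288-token truncation budget are placeholder assumptions chosen to stay under the roughly 12K tokens of coherent output the card reports.

```python
# Minimal sketch, assuming the extended-context checkpoint is published as a
# standard Hugging Face causal LM. Repo id and file name are hypothetical.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-org/llama-16k"  # placeholder: substitute the actual model repo

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")

# The run targets a 16,384-token window; the card reports coherent output up to
# roughly 12K tokens, so keep prompt + generated tokens inside that budget.
with open("long_document.txt") as f:
    prompt = f.read()

inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=12288)
inputs = {k: v.to(model.device) for k, v in inputs.items()}

outputs = model.generate(**inputs, max_new_tokens=512)
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```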