Update README.md
README.md
CHANGED
@@ -11,7 +11,7 @@ tags:
 # Granite-3.1-1B-A400M-Base
 
 **Model Summary:**
-Granite-3.1-1B-A400M-Base extends the context length of Granite-3.0-1B-A400M-Base from 4K to 128K using a progressive training strategy by increasing the supported context length in increments while adjusting RoPE theta until the model has successfully adapted to desired length of 128K.
+Granite-3.1-1B-A400M-Base extends the context length of Granite-3.0-1B-A400M-Base from 4K to 128K using a progressive training strategy by increasing the supported context length in increments while adjusting RoPE theta until the model has successfully adapted to desired length of 128K. This long-context pre-training stage was performed using approximately 500B tokens.
 
 - **Developers:** Granite Team, IBM
 - **GitHub Repository:** [ibm-granite/granite-3.1-language-models](https://github.com/ibm-granite/granite-3.1-language-models)
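
The model summary mentions adjusting RoPE theta as the context length grows. As a rough illustration of why that helps, the sketch below computes the wavelength of each rotary frequency pair under the standard RoPE formulation and shows that a larger theta stretches the longest wavelength, so that distant positions remain distinguishable. The function name `rope_wavelengths`, the head dimension of 64, and every value in `schedule` are assumptions made for illustration; they are not taken from the Granite training recipe.

```python
import math


def rope_wavelengths(head_dim: int, theta: float) -> list[float]:
    """Wavelength, in tokens, of each rotary frequency pair for a given base theta.

    Standard RoPE uses inv_freq_i = theta ** (-2i / head_dim), so the
    wavelength of pair i is 2 * pi * theta ** (2i / head_dim).
    """
    return [2 * math.pi * theta ** (2 * i / head_dim) for i in range(head_dim // 2)]


# Hypothetical progressive schedule: (context length, RoPE theta).
# These numbers are illustrative only, not the values used for Granite.
schedule = [
    (4_096, 10_000.0),
    (32_768, 500_000.0),
    (131_072, 10_000_000.0),
]

for ctx_len, theta in schedule:
    longest = rope_wavelengths(64, theta)[-1]
    print(f"context={ctx_len:>7}  theta={theta:>12.0f}  longest wavelength={longest:,.0f} tokens")
```

Running the loop prints one line per stage; the longest wavelength grows with theta, which is the intuition behind raising theta alongside the supported context length.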