Update README.md
README.md CHANGED
```diff
@@ -3,7 +3,7 @@ library_name: transformers
 tags: []
 ---
 
-# 
+# Boreas-10_7B-v1
 
 This is the result of step 2 of the upscaling of [Boreas-7B](https://huggingface.co/yhavinga/Boreas-7B) with [mergekit](https://github.com/cg123/mergekit).
 It attempts to reproduce the upscaling described in [SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling](https://arxiv.org/abs/2312.15166).
@@ -13,6 +13,7 @@ This model is the result after step 2 from the figure below:
 
 ![SOLAR 10.7B Depth up scaling](img_2.png)
 
 The model was continuously pretrained on a mix of Dutch and English for 20B tokens.
+It must be finetuned on an instruct or chat dataset to be useful.
 
 ## Model Details
```
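For context, the depth up-scaling step that mergekit performs here is typically expressed as a passthrough merge that stacks two overlapping copies of the base model's layers. The config below is a minimal sketch following the SOLAR 10.7B recipe (a 32-layer model duplicated into 48 layers via ranges 0–23 and 8–31); the exact layer ranges used for Boreas-10_7B-v1 are an assumption, not taken from this commit.

```yaml
# Illustrative mergekit config for SOLAR-style depth up-scaling.
# Layer ranges follow the SOLAR 10.7B paper and are assumed here,
# not the verified Boreas-10_7B-v1 settings.
slices:
  - sources:
      - model: yhavinga/Boreas-7B
        layer_range: [0, 24]   # first 24 of the 32 transformer layers
  - sources:
      - model: yhavinga/Boreas-7B
        layer_range: [8, 32]   # last 24 layers, overlapping layers 8-23
merge_method: passthrough
dtype: bfloat16
```

With mergekit installed, a config like this is applied with `mergekit-yaml config.yml ./output-dir`, producing the deeper merged checkpoint that is then continuously pretrained.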
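Since the card notes the merged model still needs instruct or chat finetuning before it is useful, here is a minimal loading sketch with transformers as a starting point; the repository id `yhavinga/Boreas-10_7B-v1` is inferred from the model name and may differ.

```python
# Minimal sketch: load the merged base model for further finetuning.
# The repo id is assumed from the model name; adjust if it differs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "yhavinga/Boreas-10_7B-v1"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # merged weights; bf16 keeps memory manageable
    device_map="auto",
)

# This is a raw language model: pass it to a finetuning framework
# (e.g. an SFT trainer) with an instruct or chat dataset before chat use.
```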