Commit bdea023
Parent(s): c2f91b5
Update README.md
README.md CHANGED
@@ -7,4 +7,15 @@ language:
 - bo
 base_model:
 - meta-llama/Llama-2-7b-hf
----
+---
+
+A continually pre-trained model based on Llama-2-7b-hf.
+
+We use the **Tibetan texts** in MC^2 and **English texts** in RedPajama with a proportion of **4:1** for training.
+
+#### Hyper-parameters:
+* lr: 3e-5
+* batch size: 1M (2K*512)
+* lr scheduler: cosine
+* min lr: 1e-6
+* lr decay iters: 10240
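For readers checking the hyper-parameters added in this commit, here is a minimal sketch of what they imply. It rests on assumptions the README does not state: that "2K*512" means 512 sequences of 2048 tokens each, that "lr decay iters" is the number of steps over which the cosine curve runs, and that there is no warmup. The names `cosine_lr`, `SEQ_LEN`, and `SEQS_PER_BATCH` are hypothetical and only illustrate the arithmetic; this is not the authors' training code.

```python
import math

# Values taken from the README hyper-parameter list.
PEAK_LR = 3e-5        # "lr"
MIN_LR = 1e-6         # "min lr"
DECAY_ITERS = 10240   # "lr decay iters"

# Assumption: "2K*512" is read as 2048 tokens per sequence x 512 sequences.
SEQ_LEN = 2048
SEQS_PER_BATCH = 512


def cosine_lr(step: int) -> float:
    """Cosine decay from PEAK_LR to MIN_LR over DECAY_ITERS steps.

    Assumption: no warmup phase, and the rate stays at MIN_LR
    once the decay window has passed.
    """
    if step >= DECAY_ITERS:
        return MIN_LR
    progress = step / DECAY_ITERS
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return MIN_LR + (PEAK_LR - MIN_LR) * cosine


# Tokens per optimizer step under the assumption above.
tokens_per_batch = SEQ_LEN * SEQS_PER_BATCH

if __name__ == "__main__":
    print("tokens per batch:", tokens_per_batch)
    for step in (0, DECAY_ITERS // 2, DECAY_ITERS):
        print(step, cosine_lr(step))
```

Under this reading, one batch holds 1,048,576 tokens, which matches the "1M" figure in the list, and the learning rate passes through the midpoint of 3e-5 and 1e-6 halfway through the decay window.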