atsuki-yamaguchi committed commit 580e3ff (1 parent: 7b0c67b)
Upload README.md with huggingface_hub
README.md CHANGED
@@ -6,13 +6,13 @@ language:
 base_model: meta-llama/Meta-Llama-3-8B
 library_name: transformers
 ---
-# Llama3 8B for Sinhala:
+# Llama3 8B for Sinhala: 5000 target vocabulary size + Mean target vocabulary initialization + T&B2LS/MTP/512 training

 This model is built on top of Llama3 8B adapted for Sinhala using 30K target language sentences sampled from CC-100.

 ## Model Details

-* **Vocabulary**: This model has an additional
+* **Vocabulary**: This model has an additional 5,000-token target vocabulary.
 * **Target vocabulary initialization**: The target weights of the embedding and LM head were initialized using Mean initialization.
 * **Training**: This model was additionally pre-trained on 30K target language sentences sampled from CC-100. The training was conducted with the T&B2LS/MTP/512 strategies introduced in the paper.

@@ -34,10 +34,10 @@ Use the code below to get started with the model.
 from transformers import AutoTokenizer, AutoModelForCausalLM

 model = AutoModelForCausalLM.from_pretrained(
-    "atsuki-yamaguchi/Llama-3-8B-si-30K-
+    "atsuki-yamaguchi/Llama-3-8B-si-30K-5000-mean"
 )
 tokenizer = AutoTokenizer.from_pretrained(
-    "atsuki-yamaguchi/Llama-3-8B-si-30K-
+    "atsuki-yamaguchi/Llama-3-8B-si-30K-5000-mean"
 )
 ```
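For a quick check after this update, the loading example in the README can be extended with a short generation call. The sketch below uses the repository ID from the updated README; the Sinhala prompt, greedy decoding, and `max_new_tokens=50` are illustrative choices rather than settings from the model card.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Repository ID taken from the updated README; generation settings are illustrative.
model_id = "atsuki-yamaguchi/Llama-3-8B-si-30K-5000-mean"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Encode a short Sinhala prompt ("Sri Lanka") with the extended tokenizer.
inputs = tokenizer("ශ්‍රී ලංකාව", return_tensors="pt")

# Greedy decoding of up to 50 new tokens.
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```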