atsuki-yamaguchi committed on
Commit 580e3ff
1 Parent(s): 7b0c67b

Upload README.md with huggingface_hub

Files changed (1):
  1. README.md +4 -4
README.md CHANGED
@@ -6,13 +6,13 @@ language:
 base_model: meta-llama/Meta-Llama-3-8B
 library_name: transformers
 ---
-# Llama3 8B for Sinhala: 100 target vocabulary size + Mean target vocabulary initialization + T&B2LS/MTP/512 training
+# Llama3 8B for Sinhala: 5000 target vocabulary size + Mean target vocabulary initialization + T&B2LS/MTP/512 training

 This model is built on top of Llama3 8B adapted for Sinhala using 30K target language sentences sampled from CC-100.

 ## Model Details

-* **Vocabulary**: This model has an additional 100 target vocabulary.
+* **Vocabulary**: This model has an additional 5000 target vocabulary.
 * **Target vocabulary initialization**: The target weights of the embedding and LM head were initialized using Mean initialization.
 * **Training**: This model was additionally pre-trained on 30K target language sentences sampled from CC-100. The training was conducted with the T&B2LS/MTP/512 strategies introduced in the paper.

@@ -34,10 +34,10 @@ Use the code below to get started with the model.
 from transformers import AutoTokenizer, AutoModelForCausalLM

 model = AutoModelForCausalLM.from_pretrained(
-    "atsuki-yamaguchi/Llama-3-8B-si-30K-100-mean"
+    "atsuki-yamaguchi/Llama-3-8B-si-30K-5000-mean"
 )
 tokenizer = AutoTokenizer.from_pretrained(
-    "atsuki-yamaguchi/Llama-3-8B-si-30K-100-mean"
+    "atsuki-yamaguchi/Llama-3-8B-si-30K-5000-mean"
 )
 ```
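For context on the "Mean target vocabulary initialization" referenced in the updated card, the sketch below illustrates the general idea under stated assumptions: it is not the authors' code. It adds a placeholder list of target-language tokens to the tokenizer, resizes the embedding and LM head matrices, and sets each newly added row to the mean of the original rows before continued pre-training. The `new_tokens` list and variable names are hypothetical.

```python
# Hedged sketch of mean target-vocabulary initialization (not the authors' implementation).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "meta-llama/Meta-Llama-3-8B"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Hypothetical: new Sinhala subword tokens to add (the card adds 5000 of them).
new_tokens = ["සිංහල", "භාෂාව"]  # placeholder examples only
tokenizer.add_tokens(new_tokens)

# Remember the original vocabulary size, then grow the embedding/LM-head matrices.
old_vocab_size = model.get_input_embeddings().weight.shape[0]
model.resize_token_embeddings(len(tokenizer))

with torch.no_grad():
    emb = model.get_input_embeddings().weight    # (new_vocab, hidden)
    head = model.get_output_embeddings().weight  # LM head rows, same layout
    # Mean initialization: each added row starts as the average of the
    # original (pre-extension) rows of the corresponding matrix.
    emb[old_vocab_size:] = emb[:old_vocab_size].mean(dim=0)
    head[old_vocab_size:] = head[:old_vocab_size].mean(dim=0)
```

Recent transformers releases can perform a similar mean-based initialization inside `resize_token_embeddings` itself; the explicit assignments above simply make the operation visible.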