atsuki-yamaguchi committed on
Commit 580e3ff
1 Parent(s): 7b0c67b

Upload README.md with huggingface_hub

Files changed (1):
  1. README.md +4 -4
README.md CHANGED
@@ -6,13 +6,13 @@ language:
 base_model: meta-llama/Meta-Llama-3-8B
 library_name: transformers
 ---
-# Llama3 8B for Sinhala: 100 target vocabulary size + Mean target vocabulary initialization + T&B2LS/MTP/512 training
+# Llama3 8B for Sinhala: 5000 target vocabulary size + Mean target vocabulary initialization + T&B2LS/MTP/512 training

 This model is built on top of Llama3 8B adapted for Sinhala using 30K target language sentences sampled from CC-100.

 ## Model Details

-* **Vocabulary**: This model has an additional 100 target vocabulary.
+* **Vocabulary**: This model has an additional 5000 target vocabulary.
 * **Target vocabulary initialization**: The target weights of the embedding and LM head were initialized using Mean initialization.
 * **Training**: This model was additionally pre-trained on 30K target language sentences sampled from CC-100. The training was conducted with the T&B2LS/MTP/512 strategies introduced in the paper.

@@ -34,10 +34,10 @@ Use the code below to get started with the model.
 from transformers import AutoTokenizer, AutoModelForCausalLM

 model = AutoModelForCausalLM.from_pretrained(
-    "atsuki-yamaguchi/Llama-3-8B-si-30K-100-mean"
+    "atsuki-yamaguchi/Llama-3-8B-si-30K-5000-mean"
 )
 tokenizer = AutoTokenizer.from_pretrained(
-    "atsuki-yamaguchi/Llama-3-8B-si-30K-100-mean"
+    "atsuki-yamaguchi/Llama-3-8B-si-30K-5000-mean"
 )
 ```
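For context on the "Mean target vocabulary initialization" referenced in the updated card, the sketch below illustrates the general idea under stated assumptions: it is not the authors' code. It adds a placeholder list of target-language tokens to the tokenizer, resizes the embedding and LM head matrices, and sets each newly added row to the mean of the original rows before continued pre-training. The `new_tokens` list and variable names are hypothetical.

```python
# Hedged sketch of mean target-vocabulary initialization (not the authors' implementation).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "meta-llama/Meta-Llama-3-8B"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Hypothetical: new Sinhala subword tokens to add (the card adds 5000 of them).
new_tokens = ["සිංහල", "භාෂාව"]  # placeholder examples only
tokenizer.add_tokens(new_tokens)

# Remember the original vocabulary size, then grow the embedding/LM-head matrices.
old_vocab_size = model.get_input_embeddings().weight.shape[0]
model.resize_token_embeddings(len(tokenizer))

with torch.no_grad():
    emb = model.get_input_embeddings().weight    # (new_vocab, hidden)
    head = model.get_output_embeddings().weight  # LM head rows, same layout
    # Mean initialization: each added row starts as the average of the
    # original (pre-extension) rows of the corresponding matrix.
    emb[old_vocab_size:] = emb[:old_vocab_size].mean(dim=0)
    head[old_vocab_size:] = head[:old_vocab_size].mean(dim=0)
```

Recent transformers releases can perform a similar mean-based initialization inside `resize_token_embeddings` itself; the explicit assignments above simply make the operation visible.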