atsuki-yamaguchi/gemma-2-9b-my-30K-500-mean

@@ -6,7 +6,7 @@ language:
 base_model: google/gemma-2-9b
 library_name: transformers
 ---
-# Gemma2 9B for Burmese: 500 target vocabulary size + Mean target vocabulary initialization + T&B2LS/MTP/512 training
+# Gemma2 9B for Burmese: 500 target vocabulary size + Mean target vocabulary initialization + 2x2LS/MTP/512 training
 This model is built on top of Gemma2 9B adapted for Burmese using 30K target language sentences sampled from CC-100.
@@ -14,7 +14,7 @@ This model is built on top of Gemma2 9B adapted for Burmese using 30K target lan
 * **Vocabulary**: This model has an additional 500 target vocabulary.
 * **Target vocabulary initialization**: The target weights of the embedding were initialized using Mean initialization.
-* **Training**: This model was additionally pre-trained on 30K target language sentences sampled from CC-100. The training was conducted with the T&B2LS/MTP/512 strategies introduced in the paper.
+* **Training**: This model was additionally pre-trained on 30K target language sentences sampled from CC-100. The training was conducted with the 2x2LS/MTP/512 strategies introduced in the paper.
 ## Model Description
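
The card names "Mean initialization" for the added target vocabulary but does not spell out the procedure. A minimal sketch of one common reading, assuming each new token is initialized to the average of the source embedding rows of the pieces the original tokenizer would split it into, might look like the following. The function name `mean_initialize`, the toy embedding matrix, and the piece mapping are all hypothetical and are not taken from the model's actual training code.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def mean_initialize(piece_ids: Sequence[int],
                    source_embeddings: Sequence[Vector]) -> Vector:
    """Average the source embedding rows for the pieces of one new token."""
    if not piece_ids:
        raise ValueError("a new token must map to at least one source piece")
    dim = len(source_embeddings[0])
    total = [0.0] * dim
    for pid in piece_ids:
        for j, value in enumerate(source_embeddings[pid]):
            total[j] += value
    return [t / len(piece_ids) for t in total]


# Toy source embedding matrix: 3 source tokens, 2 dimensions (assumed values).
source_embeddings: List[Vector] = [
    [1.0, 0.0],
    [0.0, 1.0],
    [2.0, 2.0],
]

# Hypothetical mapping from each new target token to the ids of the source
# pieces it decomposes into under the original tokenizer.
new_token_pieces: Dict[str, List[int]] = {
    "new_token_a": [0, 1],
    "new_token_b": [2],
}

# Append one mean-initialized row per new token to a copy of the matrix.
expanded: List[Vector] = [row[:] for row in source_embeddings]
new_token_ids: Dict[str, int] = {}
for token, pieces in new_token_pieces.items():
    new_token_ids[token] = len(expanded)
    expanded.append(mean_initialize(pieces, source_embeddings))
```

In this sketch the original rows are left untouched and only the appended rows are computed, which matches the card's statement that the initialization applies to the target weights of the embedding. In a real setup the same averaging would be applied to the model's input embedding matrix after resizing it for the 500 added tokens.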