---
license: llama2
datasets:
- togethercomputer/RedPajama-Data-1T
- uonlp/CulturaX
language:
- en
- te
base_model:
- meta-llama/Llama-2-7b-hf
---

A continually pre-trained model based on Llama-2-7b-hf.

We train on the **Telugu texts** in CulturaX and the **English texts** in RedPajama, mixed at a **4:1** ratio.
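A minimal loading-and-generation sketch using Hugging Face Transformers; the repo id below is a placeholder for this model's actual Hub path, which the card does not state:

```python
# Minimal usage sketch (assumes the `transformers` library; the repo id is a
# placeholder -- replace it with this model's actual Hugging Face Hub path).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "<namespace>/<this-model>"  # placeholder, not a real repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # load in the checkpoint's native precision
    device_map="auto",   # requires `accelerate`; places weights on available devices
)

prompt = "Telugu is a language spoken in"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```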

#### Hyper-parameters
* lr: 3e-5
* batch size: 1M tokens (2K × 512)
* lr scheduler: cosine (see the sketch below)
* min lr: 1e-6
* lr decay iters: 10240
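
For reference, a minimal sketch of the cosine decay these settings describe; warmup is not listed on this card, so it is omitted here (an assumption):

```python
import math

# Cosine decay from lr=3e-5 to min lr=1e-6 over 10240 iterations,
# holding at the floor afterwards. Warmup is intentionally omitted
# because the card does not specify one.
MAX_LR, MIN_LR, DECAY_ITERS = 3e-5, 1e-6, 10240

def lr_at(step: int) -> float:
    if step >= DECAY_ITERS:
        return MIN_LR  # hold at the floor after the decay window
    cosine = 0.5 * (1.0 + math.cos(math.pi * step / DECAY_ITERS))
    return MIN_LR + (MAX_LR - MIN_LR) * cosine

print(lr_at(0), lr_at(5120), lr_at(10240))  # 3e-05, ~1.55e-05, 1e-06
```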

## Citation
If you find this model useful in your work, please cite:
```
@inproceedings{tao-etal-2024-unlocking,
    title = "Unlocking the Potential of Model Merging for Low-Resource Languages",
    author = "Tao, Mingxu and
      Zhang, Chen and
      Huang, Quzhe and
      Ma, Tianyao and
      Huang, Songfang and
      Zhao, Dongyan and
      Feng, Yansong",
    editor = "Al-Onaizan, Yaser and
      Bansal, Mohit and
      Chen, Yun-Nung",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2024",
    month = nov,
    year = "2024",
    address = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.findings-emnlp.508",
    doi = "10.18653/v1/2024.findings-emnlp.508",
    pages = "8705--8720"
}
```