---
license: llama2
datasets:
- togethercomputer/RedPajama-Data-1T
- uonlp/CulturaX
language:
- en
- te
base_model:
- meta-llama/Llama-2-7b-hf
---

A continually pre-trained model based on Llama-2-7b-hf.

We train on the **Telugu texts** in CulturaX and the **English texts** in RedPajama, mixed at a **4:1** ratio.
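A minimal loading-and-generation sketch using Hugging Face Transformers; the repo id below is a placeholder for this model's actual Hub path, which the card does not state:

```python
# Minimal usage sketch (assumes the `transformers` library; the repo id is a
# placeholder -- replace it with this model's actual Hugging Face Hub path).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "<namespace>/<this-model>"  # placeholder, not a real repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # load in the checkpoint's native precision
    device_map="auto",   # requires `accelerate`; places weights on available devices
)

prompt = "Telugu is a language spoken in"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```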

#### Hyper-parameters
* lr: 3e-5
* batch size: 1M tokens (2K × 512)
* lr scheduler: cosine (see the sketch below)
* min lr: 1e-6
* lr decay iters: 10240
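
For reference, a minimal sketch of the cosine decay these settings describe; warmup is not listed on this card, so it is omitted here (an assumption):

```python
import math

# Cosine decay from lr=3e-5 to min lr=1e-6 over 10240 iterations,
# holding at the floor afterwards. Warmup is intentionally omitted
# because the card does not specify one.
MAX_LR, MIN_LR, DECAY_ITERS = 3e-5, 1e-6, 10240

def lr_at(step: int) -> float:
    if step >= DECAY_ITERS:
        return MIN_LR  # hold at the floor after the decay window
    cosine = 0.5 * (1.0 + math.cos(math.pi * step / DECAY_ITERS))
    return MIN_LR + (MAX_LR - MIN_LR) * cosine

print(lr_at(0), lr_at(5120), lr_at(10240))  # 3e-05, ~1.55e-05, 1e-06
```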

## Citation
If you find this model useful in your work, please cite:
```
@inproceedings{tao-etal-2024-unlocking,
    title = "Unlocking the Potential of Model Merging for Low-Resource Languages",
    author = "Tao, Mingxu and
      Zhang, Chen and
      Huang, Quzhe and
      Ma, Tianyao and
      Huang, Songfang and
      Zhao, Dongyan and
      Feng, Yansong",
    editor = "Al-Onaizan, Yaser and
      Bansal, Mohit and
      Chen, Yun-Nung",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2024",
    month = nov,
    year = "2024",
    address = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.findings-emnlp.508",
    doi = "10.18653/v1/2024.findings-emnlp.508",
    pages = "8705--8720"
}
```