---
|
license: llama2 |
|
datasets: |
|
- pkupie/mc2_corpus |
|
- togethercomputer/RedPajama-Data-1T |
|
language: |
|
- en |
|
- bo |
|
base_model: |
|
- meta-llama/Llama-2-7b-hf |
|
--- |
|
|
|
A continually pre-trained model based on Llama-2-7b-hf. |
|
|
|
We train on the **Tibetan texts** from MC^2 and the **English texts** from RedPajama, mixed at a **4:1** ratio.
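
For reference, a minimal loading and generation sketch with Hugging Face Transformers is given below. The repository id in the snippet is a placeholder for this model's actual id, and the generation settings are illustrative assumptions rather than recommended values.

```python
# Minimal usage sketch with standard Hugging Face Transformers APIs.
# NOTE: "your-org/your-model-id" is a placeholder; substitute this repository's id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/your-model-id"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# The model is continually pre-trained on Tibetan and English, so prompts in
# either language are reasonable; this Tibetan prompt is just an example.
inputs = tokenizer("བོད་ཀྱི་", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```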
|
|
|
#### Hyper-parameters: |
|
* lr: 3e-5 |
|
* batch size: 1M tokens (2K × 512)
|
* lr scheduler: cosine (see the schedule sketch below)
|
* min lr: 1e-6 |
|
* lr decay iters: 10240 |
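
The learning-rate schedule implied by the values above can be sketched as follows. This is an illustrative reconstruction (assuming no warmup, which is not listed), not the exact training code.

```python
import math

PEAK_LR = 3e-5       # lr
MIN_LR = 1e-6        # min lr
DECAY_ITERS = 10240  # lr decay iters

def cosine_lr(step: int) -> float:
    """Cosine decay from PEAK_LR to MIN_LR over DECAY_ITERS, then constant at MIN_LR."""
    if step >= DECAY_ITERS:
        return MIN_LR
    progress = step / DECAY_ITERS
    return MIN_LR + 0.5 * (PEAK_LR - MIN_LR) * (1.0 + math.cos(math.pi * progress))

# Learning rate at the start, midpoint, and end of the decay window.
for s in (0, 5120, 10240):
    print(s, f"{cosine_lr(s):.2e}")
```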
|
|
|
## Citation |
|
If you find this model useful in your work, please cite:
|
```bibtex
@inproceedings{tao-etal-2024-unlocking,
    title = "Unlocking the Potential of Model Merging for Low-Resource Languages",
    author = "Tao, Mingxu and
      Zhang, Chen and
      Huang, Quzhe and
      Ma, Tianyao and
      Huang, Songfang and
      Zhao, Dongyan and
      Feng, Yansong",
    editor = "Al-Onaizan, Yaser and
      Bansal, Mohit and
      Chen, Yun-Nung",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2024",
    month = nov,
    year = "2024",
    address = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.findings-emnlp.508",
    doi = "10.18653/v1/2024.findings-emnlp.508",
    pages = "8705--8720"
}
```