---
license: llama2
datasets:
- pkupie/mc2_corpus
- togethercomputer/RedPajama-Data-1T
language:
- en
- bo
base_model:
- meta-llama/Llama-2-7b-hf
---

A continually pre-trained model based on Llama-2-7b-hf. We train on the **Tibetan texts** in MC^2 and the **English texts** in RedPajama, mixed at a **4:1** ratio.

#### Hyper-parameters:
* lr: 3e-5
* batch size: 1M tokens (2K*512)
* lr scheduler: cosine
* min lr: 1e-6
* lr decay iters: 10240
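
#### Usage

Since the checkpoint is a standard Llama-2 causal LM, it can be loaded with 🤗 Transformers. The sketch below is illustrative: the repository id is a placeholder (the card does not state the Hub id), and the dtype/device settings are assumptions, not part of the original recipe.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repository id -- replace with the actual Hub id of this checkpoint.
model_id = "path/to/this-model"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # fp16 is an assumption; any supported dtype works
    device_map="auto",          # requires the `accelerate` package
)

# The model is trained on Tibetan (bo) and English (en) text, so either language works as input.
prompt = "The Tibetan Plateau is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```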