yulan-team / YuLan-Mini-Before-Annealing / optimizer_states

IvanHU committed on Dec 27, 2024

Commit 34a2d27 · 1 Parent(s): eb753da

Update README

Files changed (1)

README.md +2 -0

README.md CHANGED

@@ -8,6 +8,8 @@ Both [**YuLan-Mini**](https://huggingface.co/yulan-team/YuLan-Mini) and **YuLan-
 This version includes the optimizer, allowing you to resume training using the Hugging Face Trainer and DeepSpeed Universal Checkpoint.
+For easier inference and deployment, we merged the re-parameterized added parameters and scaling factors into the final released models ([**YuLan-Mini**](https://huggingface.co/yulan-team/YuLan-Mini) and **YuLan-Mini-Intermediate-4K**), enabling them to run on the Llama architecture. However, these parameters are still retained in the intermediate checkpoints from the training process.
 ## What you can do with these pre-training resources
 1. **Pre-train** your own LLM. You can use [our data](https://huggingface.co/yulan-team/YuLan-Mini-Datasets) and curriculum to train a model that's just as powerful as YuLan-Mini.
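
The added README line says that extra parameters and scaling factors were merged into the released weights so that the models run on a plain Llama architecture. The sketch below only illustrates the general idea of such a merge on a toy linear layer; it is not the YuLan-Mini implementation. It assumes a layer of the form `y = scale * (W x)` with a single scalar `scale`, and the helper names (`matvec`, `scaled_linear`, `merge_scale`) are hypothetical.

```python
# Toy illustration (assumption: the extra parameter is one scalar per layer).
# Folding the scalar into the weight matrix gives a plain linear layer
# that produces the same output without the extra parameter.

def matvec(weight, x):
    """Plain matrix-vector product: the only op a standard linear layer needs."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]

def scaled_linear(weight, scale, x):
    """Training-time form: output of the linear layer multiplied by a scale."""
    return [scale * v for v in matvec(weight, x)]

def merge_scale(weight, scale):
    """Fold the scale into the weights so that no extra parameter remains."""
    return [[scale * w for w in row] for row in weight]

weight = [[0.5, -1.0], [2.0, 0.25]]
scale = 4.0
x = [1.0, 2.0]

merged = merge_scale(weight, scale)
before = scaled_linear(weight, scale, x)  # needs the custom scaled layer
after = matvec(merged, x)                 # runs as an ordinary linear layer
```

In this toy case `before` and `after` are identical. The intermediate checkpoints in this repository keep the unmerged parameters, so, per the README line above, they are not in the merged Llama-compatible form.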