---
license: mit
---


# Z1-Coder

*Unleashing the Reasoning Power of Large Language Models to Code Generation*


# News

- **[2025/01/17]** 🎉 We have released our Z1-Coder-1.5B and Z1-Coder-7B models and data through [HuggingFace](https://huggingface.co/Z1-Coder)!

# Links

- [GitHub](https://github.com/Z1-Coder/Z1-Coder)
- 🤗 [Z1-Coder models](https://huggingface.co/collections/Z1-Coder/z1-coder-models-678f26c001517fc438f84894)
- 🤗 [Z1-Coder data](https://huggingface.co/collections/Z1-Coder/z1-coder-dataset-678f26e7c52dc4f4152d1fe1)

# Getting Started

We open-source the code and scripts we used for data curation, training, and evaluation of the Z1-Coder models; you can find more details in each directory.

- ``src/eval``: Evaluation for Z1-Coder. We use greedy decoding for all reported results, with the maximum sequence length set to 1280; a minimal inference sketch is given at the end of this README.
- ``src/train``: Training scripts for Z1-Coder. We train all models with Fully Sharded Data Parallel (FSDP), using a global batch size of 1024 for 3 epochs on 2 NVIDIA A800-80G GPUs and a learning rate of 5e-5 for both training stages.

To generate our training data, we use the QwQ-32B-Preview model to curate reasoning trajectories on code-related datasets (see the Introduction below).

# Introduction

To train Z1-Coder, we curate reasoning trajectories on code-related datasets and propose [self-invoking](https://github.com/CodeEval-Pro/CodeEval-Pro) evolving to further refine models' reasoning behaviour in code generation.

| Stage | Trajectory Dataset Download | Reference |
|--------------------|-----------------------------------|--------------------------------|
| SFT *stage 1* Data | [Z1Coder-Evol-CoT-110K](https://huggingface.co/datasets/Z1-Coder/Z1Coder-Evol-CoT-110K) | https://github.com/nlpxucan/WizardLM |
| SFT *stage 2* Data | [Z1Coder-SelfInvoking-CoT-20K](https://huggingface.co/datasets/Z1-Coder/Z1Coder-SelfInvoking-CoT-20K) | https://github.com/CodeEval-Pro/CodeEval-Pro |

We fine-tune the Qwen2.5-Coder base models (1.5B and 7B) in two stages on these two trajectory datasets, yielding Z1-Coder-1.5B and Z1-Coder-7B respectively. Z1-Coder-7B surpasses the best 7B code LLM, Qwen2.5-Coder-7B-Instruct, with only 1% of its post-training data.

| Model | Z1-Coder-7B | Qwen2.5-Coder-7B-Instruct |
|--------------------|--------------------|---------------------------------|
| Base Model | Qwen2.5-Coder-7B | Qwen2.5-Coder-7B |
| SFT Data *stage 1* | 110K (open-source) | 10M+ (open-source and in-house) |
| SFT Data *stage 2* | 20K (open-source) | 1M+ (in-house) |
| Offline RL | No | DPO |

# Citation

The work in this repository is described in the citation below. Please consider citing it if you find the repository helpful.

```bibtex
@misc{z1-coder,
  author       = {Z1-Coder Team},
  title        = {Z1-Coder: Unleashing the Reasoning Power of Large Language Models to Code Generation},
  publisher    = {GitHub},
  journal      = {GitHub repository},
  howpublished = {\url{https://github.com/Z1-Coder/Z1-Coder}},
  note         = {Accessed: 2025-01-17},
  year         = {2025}
}
```
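
# Inference Example

For reference, below is a minimal sketch of running a released checkpoint with greedy decoding via 🤗 Transformers. It is not the evaluation harness in ``src/eval``; the repo id ``Z1-Coder/Z1-Coder-7B``, the use of a chat template, and the new-token budget are illustrative assumptions.

```python
# Minimal inference sketch for Z1-Coder with greedy decoding.
# Assumptions (not taken from src/eval): the HuggingFace repo id, the presence
# of a chat template in the tokenizer, and max_new_tokens=1280 (loosely
# mirroring the 1280 maximum sequence length mentioned above).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Z1-Coder/Z1-Coder-7B"  # assumed repo id under the Z1-Coder org
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

prompt = "Write a Python function that returns the n-th Fibonacci number."
input_ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Greedy decoding (do_sample=False), as used for the reported results.
output_ids = model.generate(input_ids, max_new_tokens=1280, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```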