Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -82,7 +82,7 @@ Throughout the entire training process, we did not experience any irrecoverable
 
 **Post-Training: Knowledge Distillation from DeepSeek-R1**
 
-- We introduce an innovative methodology to distill reasoning capabilities from the long-Chain-of-Thought (CoT) model, specifically from one of the DeepSeek R1 series models, into standard LLMs, particularly DeepSeek-V3. Our pipeline elegantly incorporates the verification and reflection patterns of R1 into DeepSeek-V3 and notably improves its reasoning performance. Meanwhile, we also maintain a control over the output style and length of DeepSeek-V3.
+- We introduce an innovative methodology to distill reasoning capabilities from the long-Chain-of-Thought (CoT) model, specifically from one of the DeepSeek R1 series models, into standard LLMs, particularly DeepSeek-V3. Our pipeline elegantly incorporates the verification and reflection patterns of R1 into DeepSeek-V3 and notably improves its reasoning performance. Meanwhile, we also maintain control over the output style and length of DeepSeek-V3.
 
 ---
 