Update README.md (#37), opened by TomGrc

README.md CHANGED
@@ -82,7 +82,7 @@ Throughout the entire training process, we did not experience any irrecoverable
 
 **Post-Training: Knowledge Distillation from DeepSeek-R1**
 
-- We introduce an innovative methodology to distill reasoning capabilities from the long-Chain-of-Thought (CoT) model, specifically from one of the DeepSeek R1 series models, into standard LLMs, particularly DeepSeek-V3. Our pipeline elegantly incorporates the verification and reflection patterns of R1 into DeepSeek-V3 and notably improves its reasoning performance. Meanwhile, we also maintain
+- We introduce an innovative methodology to distill reasoning capabilities from the long-Chain-of-Thought (CoT) model, specifically from one of the DeepSeek R1 series models, into standard LLMs, particularly DeepSeek-V3. Our pipeline elegantly incorporates the verification and reflection patterns of R1 into DeepSeek-V3 and notably improves its reasoning performance. Meanwhile, we also maintain control over the output style and length of DeepSeek-V3.
 
 ---
 
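The paragraph completed in this hunk describes distilling R1-style reasoning into DeepSeek-V3. As a rough illustration of what such a pipeline can look like, here is a minimal sketch, assuming a rejection-sampling-plus-SFT recipe: sample long-CoT traces from an R1-series teacher, keep only traces whose final answer verifies, and fine-tune the student on the survivors. The checkpoint ids, the `is_correct` verifier, and all sampling settings are illustrative assumptions, not DeepSeek's published pipeline.

```python
# A minimal sketch of long-CoT distillation, assuming a rejection-sampling
# plus supervised fine-tuning (SFT) recipe. Checkpoint ids, the verifier,
# and the sampling settings below are illustrative assumptions only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

TEACHER = "deepseek-ai/DeepSeek-R1"  # assumed teacher (long-CoT) checkpoint
STUDENT = "deepseek-ai/DeepSeek-V3"  # assumed student checkpoint

tok = AutoTokenizer.from_pretrained(TEACHER)
teacher = AutoModelForCausalLM.from_pretrained(TEACHER, torch_dtype=torch.bfloat16)


def sample_trace(prompt: str, max_new_tokens: int = 2048) -> str:
    """Draw one long-CoT trace (reasoning steps + final answer) from the teacher."""
    inputs = tok(prompt, return_tensors="pt")
    out = teacher.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=True, temperature=0.7
    )
    # Decode only the newly generated tokens, not the prompt.
    return tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)


def is_correct(trace: str, reference: str) -> bool:
    """Placeholder verifier: accept a trace whose last line matches the
    reference answer. A real pipeline would use a much stronger checker."""
    lines = trace.strip().splitlines()
    return bool(lines) and lines[-1].strip() == reference.strip()


def build_sft_pairs(problems, samples_per_problem: int = 4):
    """Rejection sampling: keep only verified traces as SFT targets.

    problems: iterable of (prompt, reference_answer) pairs.
    """
    pairs = []
    for prompt, reference in problems:
        for _ in range(samples_per_problem):
            trace = sample_trace(prompt)
            if is_correct(trace, reference):
                pairs.append({"prompt": prompt, "completion": trace})
                break  # one verified trace per problem is enough here
    return pairs

# The filtered pairs would then go to a standard SFT trainer to fine-tune
# the STUDENT checkpoint; capping trace length at this stage is one way to
# keep the student's output style and length under control.
```

In this kind of recipe, filtering before fine-tuning is what lets the student pick up the teacher's verification and reflection patterns without also inheriting unbounded output length, which matches the control over style and length that the completed sentence mentions.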