Update README.md (#37), opened by TomGrc

README.md CHANGED
@@ -82,7 +82,7 @@ Throughout the entire training process, we did not experience any irrecoverable
 
 **Post-Training: Knowledge Distillation from DeepSeek-R1**
 
-- We introduce an innovative methodology to distill reasoning capabilities from the long-Chain-of-Thought (CoT) model, specifically from one of the DeepSeek R1 series models, into standard LLMs, particularly DeepSeek-V3. Our pipeline elegantly incorporates the verification and reflection patterns of R1 into DeepSeek-V3 and notably improves its reasoning performance. Meanwhile, we also maintain
+- We introduce an innovative methodology to distill reasoning capabilities from the long-Chain-of-Thought (CoT) model, specifically from one of the DeepSeek R1 series models, into standard LLMs, particularly DeepSeek-V3. Our pipeline elegantly incorporates the verification and reflection patterns of R1 into DeepSeek-V3 and notably improves its reasoning performance. Meanwhile, we also maintain control over the output style and length of DeepSeek-V3.
 
 ---
 
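The paragraph completed in this hunk describes distilling R1-style reasoning into DeepSeek-V3. As a rough illustration of what such a pipeline can look like, here is a minimal sketch, assuming a rejection-sampling-plus-SFT recipe: sample long-CoT traces from an R1-series teacher, keep only traces whose final answer verifies, and fine-tune the student on the survivors. The checkpoint ids, the `is_correct` verifier, and all sampling settings are illustrative assumptions, not DeepSeek's published pipeline.

```python
# A minimal sketch of long-CoT distillation, assuming a rejection-sampling
# plus supervised fine-tuning (SFT) recipe. Checkpoint ids, the verifier,
# and the sampling settings below are illustrative assumptions only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

TEACHER = "deepseek-ai/DeepSeek-R1"  # assumed teacher (long-CoT) checkpoint
STUDENT = "deepseek-ai/DeepSeek-V3"  # assumed student checkpoint

tok = AutoTokenizer.from_pretrained(TEACHER)
teacher = AutoModelForCausalLM.from_pretrained(TEACHER, torch_dtype=torch.bfloat16)


def sample_trace(prompt: str, max_new_tokens: int = 2048) -> str:
    """Draw one long-CoT trace (reasoning steps + final answer) from the teacher."""
    inputs = tok(prompt, return_tensors="pt")
    out = teacher.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=True, temperature=0.7
    )
    # Decode only the newly generated tokens, not the prompt.
    return tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)


def is_correct(trace: str, reference: str) -> bool:
    """Placeholder verifier: accept a trace whose last line matches the
    reference answer. A real pipeline would use a much stronger checker."""
    lines = trace.strip().splitlines()
    return bool(lines) and lines[-1].strip() == reference.strip()


def build_sft_pairs(problems, samples_per_problem: int = 4):
    """Rejection sampling: keep only verified traces as SFT targets.

    problems: iterable of (prompt, reference_answer) pairs.
    """
    pairs = []
    for prompt, reference in problems:
        for _ in range(samples_per_problem):
            trace = sample_trace(prompt)
            if is_correct(trace, reference):
                pairs.append({"prompt": prompt, "completion": trace})
                break  # one verified trace per problem is enough here
    return pairs

# The filtered pairs would then go to a standard SFT trainer to fine-tune
# the STUDENT checkpoint; capping trace length at this stage is one way to
# keep the student's output style and length under control.
```

In this kind of recipe, filtering before fine-tuning is what lets the student pick up the teacher's verification and reflection patterns without also inheriting unbounded output length, which matches the control over style and length that the completed sentence mentions.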