Update README.md
README.md
CHANGED
@@ -120,7 +120,7 @@ model-index:
 
 ![SmolTulu Banner](smoltulubanner.png)
 
-SmolTulu-
+SmolTulu-1.7b-Instruct is the first model in a series of models meant to leverage [AllenAI's Tulu 3 post-training pipeline](https://allenai.org/blog/tulu-3-technical) to tune the [base version of Huggingface's SmolLM2-1.7b](https://huggingface.co/HuggingFaceTB/SmolLM2-1.7B)! The post training pipeline AllenAI came up with seemed like something perfect to apply here.
 
 This model scores the highest current score in both IFEval and GSM8k while maintaining the extremely low contamination levels in Tulu 3 and SmolLM2! I've listed the datasets used to do both the SFT (supervised finetuning) and DPO (direct preference optimization) stages.
 