stojchet/k1

Text Generation

Generated from Trainer

text-generation-inference

Inference Endpoints


stojchet committed on Jul 18

Commit 45af223 • 1 Parent(s): 59bced1

End of training

Files changed (1)

README.md +8 -8


@@ -17,13 +17,13 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [deepseek-ai/deepseek-coder-1.3b-base](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-base) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.1596
-- Eval/rewards/chosen: 2.1581
-- Eval/logps/chosen: -121.0155
-- Eval/rewards/rejected: -14.2694
-- Eval/logps/rejected: -323.7450
-- Eval/rewards/margins: 16.4275
-- Eval/kl: 3.1515
+- Loss: 0.1316
+- Eval/rewards/chosen: 2.3534
+- Eval/logps/chosen: -119.0631
+- Eval/rewards/rejected: -14.8466
+- Eval/logps/rejected: -329.5165
+- Eval/rewards/margins: 17.1999
+- Eval/kl: 2.9675
 ## Model description
@@ -59,7 +59,7 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch  | Step | Validation Loss |        |
 |:-------------:|:------:|:----:|:---------------:|:------:|
-| 0.0966        | 1.7058 | 100  | 0.1596          | 3.1515 |
+| 0.0963        | 1.7058 | 100  | 0.1316          | 2.9675 |
 ### Framework versions
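
The numbers in this diff can be sanity-checked. The README does not define its metrics, so the sketch below assumes that the reward margin is the chosen reward minus the rejected reward, which the reported values are consistent with up to rounding. The dictionary keys and the helper names `margin` and `deltas` are hypothetical, chosen only for illustration. The metric values are copied from the diff.

```python
# Metric values copied from the README diff (before and after commit 45af223).
# Key names are illustrative; they are not the keys used by the training code.
before = {
    "loss": 0.1596,
    "rewards_chosen": 2.1581,
    "rewards_rejected": -14.2694,
    "rewards_margins": 16.4275,
    "kl": 3.1515,
}
after = {
    "loss": 0.1316,
    "rewards_chosen": 2.3534,
    "rewards_rejected": -14.8466,
    "rewards_margins": 17.1999,
    "kl": 2.9675,
}


def margin(metrics):
    """Assumed definition: chosen reward minus rejected reward."""
    return metrics["rewards_chosen"] - metrics["rewards_rejected"]


def deltas(old, new):
    """Change in each metric from the old commit to the new one."""
    return {key: new[key] - old[key] for key in old}


if __name__ == "__main__":
    for label, metrics in (("before", before), ("after", after)):
        print(f"{label}: computed margin {margin(metrics):.4f}, "
              f"reported {metrics['rewards_margins']:.4f}")
    for key, change in deltas(before, after).items():
        print(f"{key}: {change:+.4f}")
```

Under that assumption, the computed and reported margins agree to within rounding of the four-decimal values, and the deltas show that evaluation loss and KL both decreased while the margin increased in this commit.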