stojchet/sft8

stojchet committed on Jul 16

Commit 7ae3c96 • 1 Parent(s): 45b1aa7

End of training

Files changed (1)

README.md +10 -0

@@ -15,9 +15,12 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/stojchets/huggingface/runs/sft8)
 # sft8
 This model is a fine-tuned version of [deepseek-ai/deepseek-coder-1.3b-base](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-base) on the generator dataset.
+It achieves the following results on the evaluation set:
+- Loss: 1.2180
 ## Model description
@@ -47,6 +50,13 @@ The following hyperparameters were used during training:
 - lr_scheduler_warmup_steps: 200
 - num_epochs: 3
+### Training results
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| 1.0923        | 2.56  | 100  | 1.2180          |
 ### Framework versions
 - Transformers 4.43.0.dev0
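
The added evaluation numbers can be put in more familiar terms with a little arithmetic. The sketch below is not part of the commit. It assumes the reported validation loss is a mean per-token cross-entropy in nats (the usual case for a causal language model trained with the Trainer, but not stated in the card), and the helper names `perplexity` and `steps_per_epoch` are made up for illustration.

```python
import math

# Values copied from the "Training results" row added in this commit.
EVAL_LOSS = 1.2180   # validation loss at step 100
STEP = 100
EPOCH = 2.56


def perplexity(loss_nats: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss_nats)


def steps_per_epoch(step: int, epoch: float) -> float:
    """Estimate optimizer steps per epoch from one logged (step, epoch) pair."""
    return step / epoch


if __name__ == "__main__":
    print(f"perplexity ~ {perplexity(EVAL_LOSS):.2f}")
    print(f"steps per epoch ~ {steps_per_epoch(STEP, EPOCH):.1f}")
```

Under that assumption the loss corresponds to a perplexity of roughly 3.38, and the single logged row implies about 39 optimizer steps per epoch. If so, a 3-epoch run would finish before the 200 warmup steps listed in the hyperparameters were completed.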