lesso/8cad90b9-d437-42b4-afb4-e51b67e64e50

Generated from Trainer


lesso committed 4 days ago

Commit c3e5bc0 (1 parent: 14109f8)

End of training

Files changed (2)

README.md +4 -4
adapter_model.bin +1 -1

README.md CHANGED

@@ -105,7 +105,7 @@ xformers_attention: null
 This model is a fine-tuned version of [tokyotech-llm/Llama-3-Swallow-8B-v0.1](https://huggingface.co/tokyotech-llm/Llama-3-Swallow-8B-v0.1) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.1841
+- Loss: 2.1838
 ## Model description
@@ -141,9 +141,9 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
 | 2.4235        | 0.0027 | 1    | 2.2008          |
-| 2.6546        | 0.0081 | 3    | 2.1996          |
-| 2.3495        | 0.0162 | 6    | 2.1935          |
-| 2.3295        | 0.0243 | 9    | 2.1841          |
+| 2.6545        | 0.0081 | 3    | 2.1995          |
+| 2.3491        | 0.0162 | 6    | 2.1932          |
+| 2.3294        | 0.0243 | 9    | 2.1838          |
 ### Framework versions

adapter_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:74735c2e63d406e7970a6509b86dea08dac56b622c28910801aff96f76320d50
+oid sha256:307e08f581ad4e466f44d672b3101edf2d76a91dbe94a4af047504ac92cb0dd2
 size 84047370
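
The adapter_model.bin diff only changes a Git LFS pointer: the `oid` line carries the SHA-256 of the real file and `size` its byte count. As a minimal sketch (not part of this commit), the Python below parses the new pointer and checks a downloaded file against it. The helper names `parse_lfs_pointer` and `matches_pointer` and the local path in the usage note are hypothetical, chosen for illustration.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    if algo != "sha256":
        # This sketch only handles the sha256 oids shown in the diff.
        raise ValueError(f"unsupported oid algorithm: {algo}")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file at `path` has the size and SHA-256 the pointer records."""
    p = Path(path)
    if p.stat().st_size != pointer["size"]:
        return False
    h = hashlib.sha256()
    with p.open("rb") as f:
        # Hash in 1 MiB chunks so the whole file is never held in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]


# Pointer contents after this commit, copied from the diff above.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:307e08f581ad4e466f44d672b3101edf2d76a91dbe94a4af047504ac92cb0dd2
size 84047370
"""

pointer = parse_lfs_pointer(NEW_POINTER)
```

Assuming the file has been downloaded to a local path such as `adapter_model.bin`, calling `matches_pointer("adapter_model.bin", pointer)` would indicate whether it is the version from this commit. Because the size is identical before and after the commit, only the digest distinguishes the two versions.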