End of training

Files changed (5)

README.md CHANGED

@@ -96,7 +96,7 @@ xformers_attention: true
 This model is a fine-tuned version of [migtissera/Tess-v2.5-Phi-3-medium-128k-14B](https://huggingface.co/migtissera/Tess-v2.5-Phi-3-medium-128k-14B) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.7149
+- Loss: 1.7096
 ## Model description
@@ -134,8 +134,8 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
 | 3.6352        | 0.0016 | 1    | 5.1592          |
-| 2.2293        | 0.0390 | 25   | 1.7845          |
-| 2.1153        | 0.0780 | 50   | 1.7149          |
+| 2.2311        | 0.0390 | 25   | 1.7820          |
+| 2.0822        | 0.0780 | 50   | 1.7096          |
 ### Framework versions
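
To make the README change easier to read, the sketch below compares the validation losses from the old and new versions of the table, step by step. The numbers are copied from the diff. The dictionary names and the `loss_deltas` helper are hypothetical and exist only for this illustration.

```python
# Validation losses copied from the old and new README tables in this diff.
old_run = {1: 5.1592, 25: 1.7845, 50: 1.7149}
new_run = {1: 5.1592, 25: 1.7820, 50: 1.7096}


def loss_deltas(old, new):
    """Return {step: new - old} for steps logged in both runs (4 decimals)."""
    shared_steps = sorted(old.keys() & new.keys())
    return {step: round(new[step] - old[step], 4) for step in shared_steps}


deltas = loss_deltas(old_run, new_run)
for step, delta in deltas.items():
    # A negative delta means the new run reports a lower validation loss.
    print(f"step {step:>2}: {delta:+.4f}")
```

The step 1 row is identical in both versions, and the later rows differ only in the third decimal place. The diff alone does not say whether a difference of that size is meaningful or just run-to-run variation.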

adapter_config.json CHANGED

@@ -20,10 +20,10 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "gate_up_proj",
-    "o_proj",
     "qkv_proj",
-    "down_proj"
+    "down_proj",
+    "gate_up_proj",
+    "o_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_dora": false,

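The `adapter_config.json` hunk looks like it only reorders `target_modules`. A minimal check, assuming order carries no meaning for this field (an assumption, not something the diff states), is to compare the two lists as unordered collections. The variable names and the `same_modules` helper are hypothetical.

```python
# Module names copied from the old and new "target_modules" arrays.
old_targets = ["gate_up_proj", "o_proj", "qkv_proj", "down_proj"]
new_targets = ["qkv_proj", "down_proj", "gate_up_proj", "o_proj"]


def same_modules(a, b):
    """True when both lists hold exactly the same names, ignoring order."""
    return sorted(a) == sorted(b)


same = same_modules(old_targets, new_targets)
print("same modules, different order" if same else "module set changed")
```

Comparing sorted lists rather than sets also catches an accidental duplicate entry, which a set comparison would hide.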
adapter_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:406b998a7075c4caaf16acb0c862225334ff17151cec2f234d1a217f482aba8d
+oid sha256:451c3f5fd718da045d39607408451556308cc8b62b2ace594defdbdd28084120
 size 445760970

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:fe4c111824ae0addcdf78322c527745999170d7f3b17df8aa6e772131bd49a9f
+oid sha256:cd386869160576e247f818b418f34612f1c7896a0fbfa1c8145e8625aeb86c00
 size 445688440

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0159e238933649e4b1712b80c029ece5145df05759de3a9be54b50d9c93a5826
+oid sha256:b095aab98c9d0f056114cfd27d18b2ae9c09fc7600a46ac9acc6ecf782b705ca
 size 6776
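
The three binary files above are stored as Git LFS pointers: a `version` line, an `oid sha256:` digest, and a `size` in bytes. In each case the size is unchanged and only the digest differs, so the contents changed while the byte length stayed the same. The following is a minimal sketch of parsing such a pointer and checking a downloaded payload against it. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, not part of any Git LFS tooling, and the sketch assumes the three-line pointer layout shown in this diff.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse 'key value' lines of a Git LFS pointer into a small dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """True when the bytes have the pointer's size and SHA-256 digest."""
    if pointer["algo"] != "sha256" or len(data) != pointer["size"]:
        return False
    return hashlib.sha256(data).hexdigest() == pointer["digest"]


# New pointer for training_args.bin, copied from the diff.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:b095aab98c9d0f056114cfd27d18b2ae9c09fc7600a46ac9acc6ecf782b705ca
size 6776
"""
pointer = parse_lfs_pointer(pointer_text)
print(pointer["algo"], pointer["size"])
```

To verify a local copy, one could read the file in binary mode and pass the bytes to `matches_pointer`. For the roughly 445 MB adapter files, hashing in chunks would be kinder to memory than reading the whole file at once.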