psvishnu/Phi-3.5-mini-instruct-qlora

Files changed (4)

README.md CHANGED

@@ -18,7 +18,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [microsoft/Phi-3.5-mini-instruct](https://huggingface.co/microsoft/Phi-3.5-mini-instruct) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.8569
 ## Model description
@@ -50,18 +50,18 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 1.6558        | 0.3333 | 10   | 1.4457          |
-| 1.2244        | 0.6667 | 20   | 1.1363          |
-| 1.0198        | 1.0    | 30   | 1.0057          |
-| 0.92          | 1.3333 | 40   | 0.9328          |
-| 0.8365        | 1.6667 | 50   | 0.8991          |
-| 0.7908        | 2.0    | 60   | 0.8744          |
-| 0.7432        | 2.3333 | 70   | 0.8794          |
-| 0.7421        | 2.6667 | 80   | 0.8692          |
-| 0.739         | 3.0    | 90   | 0.8599          |
-| 0.6996        | 3.3333 | 100  | 0.8603          |
-| 0.7016        | 3.6667 | 110  | 0.8560          |
-| 0.7205        | 4.0    | 120  | 0.8569          |
 ### Framework versions

 This model is a fine-tuned version of [microsoft/Phi-3.5-mini-instruct](https://huggingface.co/microsoft/Phi-3.5-mini-instruct) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.8242
 ## Model description
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
+| 1.6254        | 0.3333 | 10   | 1.2928          |
+| 1.0852        | 0.6667 | 20   | 0.9771          |
+| 0.8786        | 1.0    | 30   | 0.8939          |
+| 0.7889        | 1.3333 | 40   | 0.8575          |
+| 0.7281        | 1.6667 | 50   | 0.8336          |
+| 0.6876        | 2.0    | 60   | 0.8175          |
+| 0.6217        | 2.3333 | 70   | 0.8238          |
+| 0.6066        | 2.6667 | 80   | 0.8274          |
+| 0.614         | 3.0    | 90   | 0.8193          |
+| 0.5568        | 3.3333 | 100  | 0.8235          |
+| 0.5435        | 3.6667 | 110  | 0.8242          |
+| 0.5699        | 4.0    | 120  | 0.8242          |
 ### Framework versions
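
As a reading aid for the two tables in this hunk, the following minimal sketch finds the step with the lowest validation loss in each run and the change in the final reported loss. The numbers are copied from the diff; the variable and function names are hypothetical and not part of the repository.

```python
# Validation loss at each logged step, copied from the old (-) and new (+)
# README tables in this diff. Steps run from 10 to 120 in increments of 10.
steps = list(range(10, 130, 10))
old_val = [1.4457, 1.1363, 1.0057, 0.9328, 0.8991, 0.8744,
           0.8794, 0.8692, 0.8599, 0.8603, 0.8560, 0.8569]
new_val = [1.2928, 0.9771, 0.8939, 0.8575, 0.8336, 0.8175,
           0.8238, 0.8274, 0.8193, 0.8235, 0.8242, 0.8242]


def best_step(losses):
    """Return (step, loss) for the lowest validation loss in a run."""
    i = min(range(len(losses)), key=losses.__getitem__)
    return steps[i], losses[i]


old_best = best_step(old_val)   # (110, 0.856)
new_best = best_step(new_val)   # (60, 0.8175)

# Change in the final (step 120) validation loss between the two commits.
final_delta = round(new_val[-1] - old_val[-1], 4)  # -0.0327

print(old_best, new_best, final_delta)
```

In the new run the minimum falls at step 60 (epoch 2.0) while the training loss keeps decreasing afterwards, which may point to mild overfitting in the later epochs; the diff alone does not confirm this.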

adapter_config.json CHANGED

@@ -20,8 +20,11 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "qkv_proj",
-    "o_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_dora": false,

   "rank_pattern": {},
   "revision": null,
   "target_modules": [
+    "o_proj",
+    "gate_up_proj",
+    "down_proj",
+    "up_proj",
+    "qkv_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_dora": false,
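
To illustrate what the `target_modules` change above likely means in practice, here is a small sketch of suffix-based name matching, which is how PEFT resolves a list of target names (a module is selected when its name equals an entry or ends with `.<entry>`). The module names listed for a Phi-3 decoder layer are an assumption based on the transformers implementation and are not stated in this diff; `matched` and `unused` are hypothetical helpers.

```python
# Linear submodules of one Phi-3 decoder layer. ASSUMPTION: names taken from
# the transformers Phi-3 implementation, not from this diff.
layer_modules = [
    "model.layers.0.self_attn.qkv_proj",
    "model.layers.0.self_attn.o_proj",
    "model.layers.0.mlp.gate_up_proj",
    "model.layers.0.mlp.down_proj",
]

# target_modules before and after this commit, copied from adapter_config.json.
old_targets = ["qkv_proj", "o_proj"]
new_targets = ["o_proj", "gate_up_proj", "down_proj", "up_proj", "qkv_proj"]


def _hits(module, target):
    """True if the module name equals the target or ends with '.<target>'."""
    return module == target or module.endswith("." + target)


def matched(targets):
    """Modules in the layer that at least one target entry selects."""
    return [m for m in layer_modules if any(_hits(m, t) for t in targets)]


def unused(targets):
    """Target entries that select no module in the layer."""
    return [t for t in targets if not any(_hits(m, t) for m in layer_modules)]


print(len(matched(old_targets)))  # 2
print(len(matched(new_targets)))  # 4
print(unused(new_targets))        # ['up_proj']
```

Under that assumption, the new configuration extends the adapter from the attention projections to the MLP projections, and the `up_proj` entry selects nothing because `gate_up_proj` does not end with `.up_proj`.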

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2b7193a9250b868e45abb649dca070d3a2d1b5f3097562fbca78a91c4777a6f3
-size 37766064

 version https://git-lfs.github.com/spec/v1
+oid sha256:016263d091dc1ff8810193befca6d2ef09c865ce406784c07c53b6fcc9e6047b
+size 100697728
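
The size change above (37766064 to 100697728 bytes) can be sanity-checked against the `target_modules` change with a rough parameter count. This is a sketch under several assumptions that do not appear in this diff: the Phi-3.5-mini dimensions (hidden size 3072, intermediate size 8192, 32 layers), a LoRA rank of 16, and float32 storage. Each adapted linear layer of shape `in x out` contributes `rank * (in + out)` parameters.

```python
# ASSUMPTIONS (not stated in the diff): model dimensions, LoRA rank, dtype.
HIDDEN, INTERMEDIATE, LAYERS = 3072, 8192, 32
RANK, BYTES_PER_PARAM = 16, 4  # rank 16, float32

# (in_features, out_features) for each adaptable projection in one layer.
shapes = {
    "qkv_proj": (HIDDEN, 3 * HIDDEN),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_up_proj": (HIDDEN, 2 * INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_bytes(targets):
    """Estimated tensor bytes for LoRA A/B matrices over all layers."""
    per_layer = sum(RANK * (i + o) for name, (i, o) in shapes.items()
                    if name in targets)
    return per_layer * LAYERS * BYTES_PER_PARAM


old_est = lora_bytes(["qkv_proj", "o_proj"])
new_est = lora_bytes(["o_proj", "gate_up_proj", "down_proj",
                      "up_proj", "qkv_proj"])

# File sizes copied from the git-lfs pointers in this diff.
old_size, new_size = 37766064, 100697728

print(old_est, old_size - old_est)  # 37748736 17328
print(new_est, new_size - new_est)  # 100663296 34432
```

With these assumed values the estimates fall within a few tens of kilobytes of the recorded sizes, a gap that is plausibly the safetensors header. This is consistent with the larger file coming from the added MLP projections, but it does not prove the rank or dtype actually used.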

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:e6f77cc20f3a9851b6b2714db72118558e4efb0e9002500e9a0100018c491dfd
 size 5496

 version https://git-lfs.github.com/spec/v1
+oid sha256:cd0312728633f86d6242ef91f16063fed1b9483cb90a5fd129467ac7a4bead91
 size 5496