Training complete

Files changed (4)

README.md CHANGED

@@ -17,7 +17,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [pszemraj/pegasus-x-large-book-summary](https://huggingface.co/pszemraj/pegasus-x-large-book-summary) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.8273
 ## Model description
@@ -37,21 +37,23 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
-- train_batch_size: 1
 - eval_batch_size: 8
 - seed: 42
-- gradient_accumulation_steps: 2
-- total_train_batch_size: 2
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 1
-- num_epochs: 1
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 1.1182        | 1.0   | 2    | 0.8273          |
 ### Framework versions

 This model is a fine-tuned version of [pszemraj/pegasus-x-large-book-summary](https://huggingface.co/pszemraj/pegasus-x-large-book-summary) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.4437
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
+- train_batch_size: 4
 - eval_batch_size: 8
 - seed: 42
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 1
+- num_epochs: 3
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 0.5953        | 0.992 | 62   | 0.4814          |
+| 0.5178        | 2.0   | 125  | 0.4507          |
+| 0.4823        | 2.976 | 186  | 0.4437          |
 ### Framework versions
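
The batch-size hyperparameters in this README diff are related by simple arithmetic: `total_train_batch_size` is `train_batch_size` times `gradient_accumulation_steps` (4 × 4 = 16 on the added side, 1 × 2 = 2 on the removed side). The sketch below checks that relation and also checks that the logged epoch values are consistent with about 62.5 optimizer steps per epoch. The single-device setup, the 62.5 figure, and the function names are inferences made for illustration, not values stated in the commit.

```python
def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Samples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * grad_accum * num_devices


def epoch_at_step(step: int, steps_per_epoch: float) -> float:
    """Fractional epoch reached after `step` optimizer steps, rounded to 3 places."""
    return round(step / steps_per_epoch, 3)


# Values taken from the removed and added sides of the README diff.
old_total = effective_batch_size(per_device=1, grad_accum=2)
new_total = effective_batch_size(per_device=4, grad_accum=4)

# Assumed, not stated in the commit: 62.5 optimizer steps per epoch.
STEPS_PER_EPOCH = 62.5
logged_epochs = [epoch_at_step(s, STEPS_PER_EPOCH) for s in (62, 125, 186)]

print(old_total, new_total)
print(logged_epochs)
```

If the run did use one device, 62.5 steps per epoch at 16 samples per step would correspond to roughly 1000 training examples, but the commit itself does not state the dataset size.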

adapter_config.json CHANGED

@@ -20,8 +20,8 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "k_proj",
     "q_proj",
     "dense",
     "v_proj"
   ],

   "rank_pattern": {},
   "revision": null,
   "target_modules": [
     "q_proj",
+    "k_proj",
     "dense",
     "v_proj"
   ],
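
The only change in this hunk is the position of `k_proj` within `target_modules`. A minimal sketch, assuming the order of the entries carries no meaning for the adapter (an assumption worth confirming against the PEFT version in use), compares the two lists as sets:

```python
import json

# The two sides of the hunk, reduced to the one key that changed.
old_cfg = json.loads('{"target_modules": ["k_proj", "q_proj", "dense", "v_proj"]}')
new_cfg = json.loads('{"target_modules": ["q_proj", "k_proj", "dense", "v_proj"]}')

old_modules = set(old_cfg["target_modules"])
new_modules = set(new_cfg["target_modules"])

# True when both configs name exactly the same modules, whatever the order.
same_modules = old_modules == new_modules

print(same_modules)
print(sorted(new_modules))
```

Under that assumption, the adapter targets the same four module names before and after this commit.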

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:50c563c9d556182b2fb72b22effe9bf588ef87a2b1c9f8c97e66875fa43d587a
 size 302031872

 version https://git-lfs.github.com/spec/v1
+oid sha256:69c70c335317fb47faa05cc1791d2e42e98fc480c299047186d48211f59e0fce
 size 302031872
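
Both sides of this diff are Git LFS pointer files rather than the weights themselves: each has a `version` line, an `oid` line carrying a SHA-256 digest, and a `size` line in bytes. The sketch below parses the two pointers from this hunk and checks that the content hash changed while the byte size did not. `parse_lfs_pointer` is a hypothetical helper written for this illustration, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:50c563c9d556182b2fb72b22effe9bf588ef87a2b1c9f8c97e66875fa43d587a
size 302031872"""

NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:69c70c335317fb47faa05cc1791d2e42e98fc480c299047186d48211f59e0fce
size 302031872"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)

hash_changed = old["oid"] != new["oid"]
size_unchanged = int(old["size"]) == int(new["size"])
size_mib = int(new["size"]) / 2**20  # bytes to MiB

print(hash_changed, size_unchanged)
print(f"{size_mib:.2f} MiB")
```

An unchanged size alongside a new hash fits an adapter whose tensor shapes stayed the same while its values were retrained, though the pointer files alone cannot confirm that.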

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5f41289dfbf2bee000cf9e1de602cfad9220a9549d160989f7721e424ea7b6fd
 size 5240

 version https://git-lfs.github.com/spec/v1
+oid sha256:2f9af75046ad1319b3a45f02b59e8a6d74cfda6616a87ea22287c60a4e2e7b4c
 size 5240