End of training

Files changed (4)

README.md CHANGED

@@ -15,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.1363
 ## Model description
@@ -35,22 +35,19 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 3e-05
-- train_batch_size: 2
 - eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 5
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 0.192         | 1.0   | 430  | 0.1363          |
-| 0.2402        | 2.0   | 860  | 0.1363          |
-| 0.2249        | 3.0   | 1290 | 0.1363          |
-| 0.2243        | 4.0   | 1720 | 0.1363          |
-| 0.2243        | 5.0   | 2150 | 0.1363          |
 ### Framework versions

 This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.1314
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 3e-05
+- train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 2
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 0.2287        | 1.0   | 430  | 0.1314          |
+| 0.2014        | 2.0   | 860  | 0.1314          |
 ### Framework versions
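
Note on the schedule: the new card lists `learning_rate: 3e-05`, `lr_scheduler_type: linear`, and 2 epochs of 430 steps each. A rough sketch of what that implies for the learning rate at a given step follows. It assumes zero warmup steps, which the card does not state, and `linear_lr` is a hypothetical helper name, not a library function.

```python
def linear_lr(step, base_lr=3e-05, total_steps=860, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the model card does not list warmup.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


steps_per_epoch = 430  # from the "Step" column of the results table
num_epochs = 2         # from the updated hyperparameters
total_steps = steps_per_epoch * num_epochs

# Learning rate at the start, after epoch 1, and at the end of training.
for step in (0, steps_per_epoch, total_steps):
    print(step, linear_lr(step, total_steps=total_steps))
```

Under these assumptions the rate is halved by the end of epoch 1 and reaches zero at step 860.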

adapter_config.json CHANGED

@@ -5,10 +5,10 @@
   "num_attention_heads": 16,
   "num_layers": 24,
   "num_transformer_submodules": 2,
-  "num_virtual_tokens": 20,
   "peft_type": "PROMPT_TUNING",
-  "prompt_tuning_init": "TEXT",
-  "prompt_tuning_init_text": "Answer this question using the provided context",
   "revision": null,
   "task_type": "SEQ_2_SEQ_LM",
   "token_dim": 1024,

   "num_attention_heads": 16,
   "num_layers": 24,
   "num_transformer_submodules": 2,
+  "num_virtual_tokens": 8,
   "peft_type": "PROMPT_TUNING",
+  "prompt_tuning_init": "RANDOM",
+  "prompt_tuning_init_text": null,
   "revision": null,
   "task_type": "SEQ_2_SEQ_LM",
   "token_dim": 1024,

adapter_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:34bd6fbe4ca9710c3217bb88f627333c27861fdf642d2e815636091dee8aecdb
-size 164605

 version https://git-lfs.github.com/spec/v1
+oid sha256:353f2f19234d0dc4d54ae03fec116695608b592244b6987b8f887109541e2e11
+size 66301
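
Note on the size change: the drop from 164605 to 66301 bytes is consistent with the smaller prompt in `adapter_config.json`. The sketch below estimates the tensor payload under two assumptions not stated in the diff: the checkpoint stores only the prompt embeddings, and they are saved as 32-bit floats. `prompt_param_bytes` is a hypothetical helper name.

```python
def prompt_param_bytes(num_virtual_tokens, num_transformer_submodules=2,
                       token_dim=1024, bytes_per_param=4):
    """Estimated bytes for the prompt embedding matrix (float32 assumed)."""
    rows = num_virtual_tokens * num_transformer_submodules
    return rows * token_dim * bytes_per_param


old_payload = prompt_param_bytes(20)  # 20 virtual tokens before the change
new_payload = prompt_param_bytes(8)   # 8 virtual tokens after the change

# File sizes reported by the Git LFS pointers in this diff.
old_file_size = 164605
new_file_size = 66301

# What remains after subtracting the payload is serialization overhead.
print(old_payload, old_file_size - old_payload)
print(new_payload, new_file_size - new_payload)
```

Both files leave the same 765-byte remainder, which supports reading the size change as purely the reduction from 20 to 8 virtual tokens.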

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0f8eb94c1337f4adf82eb85b1ebcdbf245054a02b9da2716bbd86d8e45880b6b
 size 4219

 version https://git-lfs.github.com/spec/v1
+oid sha256:36ea94b94e9d3e7a936caa6a922bfe5dc8f6f93ac97ef9d562dbec62eb709047
 size 4219