flan-t5-rouge-durga-q5-clean-4e
This model is a fine-tuned version of google/flan-t5-base on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.0068
- Rouge1: 0.7279
- Rouge2: 0.6940
- RougeL: 0.7270
- RougeLsum: 0.7260
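As a minimal usage sketch, the checkpoint can be loaded with the standard transformers text2text pipeline. The input string below is a placeholder: the expected prompt format is an assumption, since the training data is not documented.

```python
from transformers import pipeline

# Minimal sketch: load the fine-tuned checkpoint from the Hub.
# The input is a placeholder; the prompt format the model expects
# is undocumented, so treat it as an assumption.
generator = pipeline(
    "text2text-generation",
    model="devagonal/flan-t5-rouge-durga-q5-clean-4e",
)

result = generator("your input text here", max_new_tokens=64)
print(result[0]["generated_text"])
```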
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training (see the configuration sketch after this list):
- learning_rate: 0.0002
- train_batch_size: 24
- eval_batch_size: 24
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- num_epochs: 60
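For reference, here is a hedged sketch of how these settings map onto `Seq2SeqTrainingArguments`. Dataset loading and preprocessing are omitted because the training data is undocumented; the `output_dir` and the per-epoch evaluation setting are assumptions inferred from the results table below.

```python
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_name = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-rouge-durga-q5-clean-4e",  # assumed output path
    learning_rate=2e-4,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60,
    eval_strategy="epoch",  # assumed: the table reports one eval per epoch
)

# The trainer would be built like this once train/eval datasets exist:
# trainer = Seq2SeqTrainer(model=model, args=args, tokenizer=tokenizer,
#                          train_dataset=..., eval_dataset=...)
```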
Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum |
|---|---|---|---|---|---|---|---|
2.1863 | 1.0 | 9 | 1.7470 | 0.2666 | 0.0753 | 0.2615 | 0.2609 |
2.2667 | 2.0 | 18 | 1.3513 | 0.3106 | 0.1072 | 0.3001 | 0.2994 |
1.4223 | 3.0 | 27 | 1.0986 | 0.3466 | 0.1348 | 0.3382 | 0.3376 |
1.5007 | 4.0 | 36 | 0.8739 | 0.3505 | 0.1562 | 0.3430 | 0.3432 |
1.4099 | 5.0 | 45 | 0.7260 | 0.3892 | 0.1758 | 0.3780 | 0.3779 |
0.9957 | 6.0 | 54 | 0.5489 | 0.3756 | 0.1913 | 0.3661 | 0.3672 |
0.9214 | 7.0 | 63 | 0.4321 | 0.3928 | 0.2141 | 0.3832 | 0.3825 |
0.7139 | 8.0 | 72 | 0.3473 | 0.4109 | 0.2480 | 0.4020 | 0.4023 |
0.6069 | 9.0 | 81 | 0.2602 | 0.4478 | 0.2897 | 0.4377 | 0.4380 |
0.5718 | 10.0 | 90 | 0.2200 | 0.4416 | 0.2834 | 0.4325 | 0.4321 |
0.4169 | 11.0 | 99 | 0.1599 | 0.4401 | 0.3084 | 0.4346 | 0.4345 |
0.3225 | 12.0 | 108 | 0.1387 | 0.4636 | 0.3224 | 0.4579 | 0.4584 |
0.3575 | 13.0 | 117 | 0.1143 | 0.4867 | 0.3583 | 0.4806 | 0.4801 |
0.2931 | 14.0 | 126 | 0.1022 | 0.4854 | 0.3662 | 0.4820 | 0.4828 |
0.2127 | 15.0 | 135 | 0.0913 | 0.4831 | 0.3654 | 0.4798 | 0.4814 |
0.2038 | 16.0 | 144 | 0.0848 | 0.5087 | 0.3932 | 0.5069 | 0.5070 |
0.264 | 17.0 | 153 | 0.0770 | 0.5041 | 0.3951 | 0.5008 | 0.5017 |
0.225 | 18.0 | 162 | 0.0745 | 0.5207 | 0.4148 | 0.5186 | 0.5191 |
0.1799 | 19.0 | 171 | 0.0660 | 0.5238 | 0.4168 | 0.5218 | 0.5214 |
0.1969 | 20.0 | 180 | 0.0585 | 0.5367 | 0.4385 | 0.5354 | 0.5348 |
0.1936 | 21.0 | 189 | 0.0537 | 0.5336 | 0.4331 | 0.5311 | 0.5313 |
0.1648 | 22.0 | 198 | 0.0482 | 0.5679 | 0.4742 | 0.5634 | 0.5651 |
0.139 | 23.0 | 207 | 0.0474 | 0.5563 | 0.4650 | 0.5535 | 0.5541 |
0.1243 | 24.0 | 216 | 0.0424 | 0.5729 | 0.4932 | 0.5733 | 0.5735 |
0.1175 | 25.0 | 225 | 0.0405 | 0.6096 | 0.5354 | 0.6074 | 0.6071 |
0.1087 | 26.0 | 234 | 0.0374 | 0.6148 | 0.5404 | 0.6140 | 0.6139 |
0.08 | 27.0 | 243 | 0.0351 | 0.6005 | 0.5205 | 0.5976 | 0.6004 |
0.1133 | 28.0 | 252 | 0.0317 | 0.5957 | 0.5140 | 0.5939 | 0.5947 |
0.0514 | 29.0 | 261 | 0.0300 | 0.6371 | 0.5702 | 0.6376 | 0.6368 |
0.0819 | 30.0 | 270 | 0.0281 | 0.6332 | 0.5680 | 0.6315 | 0.6318 |
0.1079 | 31.0 | 279 | 0.0248 | 0.6358 | 0.5696 | 0.6345 | 0.6344 |
0.079 | 32.0 | 288 | 0.0271 | 0.6231 | 0.5504 | 0.6222 | 0.6222 |
0.0633 | 33.0 | 297 | 0.0239 | 0.6607 | 0.5957 | 0.6572 | 0.6584 |
0.0557 | 34.0 | 306 | 0.0207 | 0.6658 | 0.6084 | 0.6639 | 0.6636 |
0.0614 | 35.0 | 315 | 0.0176 | 0.6767 | 0.6228 | 0.6753 | 0.6744 |
0.1288 | 36.0 | 324 | 0.0197 | 0.6603 | 0.5977 | 0.6592 | 0.6583 |
0.0324 | 37.0 | 333 | 0.0172 | 0.6853 | 0.6355 | 0.6852 | 0.6849 |
0.0275 | 38.0 | 342 | 0.0167 | 0.6881 | 0.6352 | 0.6852 | 0.6853 |
0.0346 | 39.0 | 351 | 0.0162 | 0.6772 | 0.6256 | 0.6760 | 0.6759 |
0.0343 | 40.0 | 360 | 0.0148 | 0.6960 | 0.6493 | 0.6935 | 0.6937 |
0.0413 | 41.0 | 369 | 0.0133 | 0.6883 | 0.6375 | 0.6869 | 0.6862 |
0.0871 | 42.0 | 378 | 0.0127 | 0.7170 | 0.6764 | 0.7146 | 0.7148 |
0.0351 | 43.0 | 387 | 0.0126 | 0.7072 | 0.6630 | 0.7067 | 0.7066 |
0.0391 | 44.0 | 396 | 0.0123 | 0.7108 | 0.6658 | 0.7091 | 0.7095 |
0.019 | 45.0 | 405 | 0.0122 | 0.7049 | 0.6601 | 0.7045 | 0.7044 |
0.0366 | 46.0 | 414 | 0.0117 | 0.7078 | 0.6670 | 0.7077 | 0.7072 |
0.0222 | 47.0 | 423 | 0.0110 | 0.7039 | 0.6591 | 0.7033 | 0.7037 |
0.03 | 48.0 | 432 | 0.0103 | 0.7098 | 0.6643 | 0.7076 | 0.7089 |
0.0534 | 49.0 | 441 | 0.0096 | 0.7184 | 0.6801 | 0.7171 | 0.7176 |
0.0363 | 50.0 | 450 | 0.0093 | 0.7207 | 0.6829 | 0.7193 | 0.7195 |
0.0175 | 51.0 | 459 | 0.0093 | 0.7208 | 0.6849 | 0.7185 | 0.7196 |
0.0325 | 52.0 | 468 | 0.0080 | 0.7276 | 0.6930 | 0.7261 | 0.7261 |
0.0341 | 53.0 | 477 | 0.0072 | 0.7310 | 0.6980 | 0.7296 | 0.7287 |
0.015 | 54.0 | 486 | 0.0072 | 0.7261 | 0.6896 | 0.7246 | 0.7244 |
0.0391 | 55.0 | 495 | 0.0074 | 0.7279 | 0.6940 | 0.7270 | 0.7260 |
0.0271 | 56.0 | 504 | 0.0072 | 0.7251 | 0.6883 | 0.7242 | 0.7240 |
0.0528 | 57.0 | 513 | 0.0073 | 0.7243 | 0.6870 | 0.7229 | 0.7226 |
0.0292 | 58.0 | 522 | 0.0069 | 0.7279 | 0.6940 | 0.7270 | 0.7260 |
0.0465 | 59.0 | 531 | 0.0068 | 0.7279 | 0.6940 | 0.7270 | 0.7260 |
0.0218 | 60.0 | 540 | 0.0068 | 0.7279 | 0.6940 | 0.7270 | 0.7260 |
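The ROUGE columns above follow the key names returned by the evaluate library's rouge metric. As a hedged sketch (the actual evaluation code for this run is not published), scores of this form are typically computed like so:

```python
import evaluate

# Sketch of a typical ROUGE computation; the predictions and
# references below are hypothetical stand-ins, not data from this run.
rouge = evaluate.load("rouge")
predictions = ["the generated answer"]  # hypothetical model outputs
references = ["the reference answer"]   # hypothetical gold answers

scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum
```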
Framework versions
- Transformers 4.46.0
- Pytorch 2.5.0+cu121
- Datasets 3.0.2
- Tokenizers 0.20.1