flan-t5-rouge-durga-q5-clean-4e

This model is a fine-tuned version of google/flan-t5-base on an unspecified dataset. It achieves the following results on the evaluation set (a usage sketch follows the metrics):

  • Loss: 0.0068
  • ROUGE-1: 0.7279
  • ROUGE-2: 0.6940
  • ROUGE-L: 0.7270
  • ROUGE-Lsum: 0.7260
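
Since the task format and training data are not documented, the snippet below is only a minimal, hedged sketch of loading the checkpoint with the transformers API; the prompt is an illustrative assumption.

```python
# Minimal inference sketch. Assumption: the model is used as a standard
# seq2seq text-to-text model; the prompt below is illustrative only.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "devagonal/flan-t5-rouge-durga-q5-clean-4e"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("Who is the goddess Durga?", return_tensors="pt")  # hypothetical prompt
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```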

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0002
  • train_batch_size: 24
  • eval_batch_size: 24
  • seed: 42
  • optimizer: adamw_torch (AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 60
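
These settings map directly onto the standard transformers Seq2SeqTrainingArguments fields. Below is a minimal, hedged sketch of that configuration; the output_dir is a placeholder, and the dataset, preprocessing, and Trainer wiring are omitted because they are not documented here.

```python
# Sketch of the training configuration implied by the hyperparameters above.
# Assumptions are marked: output_dir is a placeholder, and the eval cadence is
# inferred from the per-epoch rows in the results table below.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-rouge-durga-q5-clean-4e",  # placeholder path
    learning_rate=2e-4,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    seed=42,
    optim="adamw_torch",          # AdamW; betas=(0.9, 0.999) and eps=1e-8 are the defaults
    lr_scheduler_type="linear",
    num_train_epochs=60,
    eval_strategy="epoch",        # assumption: validation ran once per epoch (9 steps apart)
    predict_with_generate=True,   # assumption: needed so ROUGE can score generated text
)
```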

Training results

| Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:----------:|
| 2.1863        | 1.0   | 9    | 1.7470          | 0.2666  | 0.0753  | 0.2615  | 0.2609     |
| 2.2667        | 2.0   | 18   | 1.3513          | 0.3106  | 0.1072  | 0.3001  | 0.2994     |
| 1.4223        | 3.0   | 27   | 1.0986          | 0.3466  | 0.1348  | 0.3382  | 0.3376     |
| 1.5007        | 4.0   | 36   | 0.8739          | 0.3505  | 0.1562  | 0.3430  | 0.3432     |
| 1.4099        | 5.0   | 45   | 0.7260          | 0.3892  | 0.1758  | 0.3780  | 0.3779     |
| 0.9957        | 6.0   | 54   | 0.5489          | 0.3756  | 0.1913  | 0.3661  | 0.3672     |
| 0.9214        | 7.0   | 63   | 0.4321          | 0.3928  | 0.2141  | 0.3832  | 0.3825     |
| 0.7139        | 8.0   | 72   | 0.3473          | 0.4109  | 0.2480  | 0.4020  | 0.4023     |
| 0.6069        | 9.0   | 81   | 0.2602          | 0.4478  | 0.2897  | 0.4377  | 0.4380     |
| 0.5718        | 10.0  | 90   | 0.2200          | 0.4416  | 0.2834  | 0.4325  | 0.4321     |
| 0.4169        | 11.0  | 99   | 0.1599          | 0.4401  | 0.3084  | 0.4346  | 0.4345     |
| 0.3225        | 12.0  | 108  | 0.1387          | 0.4636  | 0.3224  | 0.4579  | 0.4584     |
| 0.3575        | 13.0  | 117  | 0.1143          | 0.4867  | 0.3583  | 0.4806  | 0.4801     |
| 0.2931        | 14.0  | 126  | 0.1022          | 0.4854  | 0.3662  | 0.4820  | 0.4828     |
| 0.2127        | 15.0  | 135  | 0.0913          | 0.4831  | 0.3654  | 0.4798  | 0.4814     |
| 0.2038        | 16.0  | 144  | 0.0848          | 0.5087  | 0.3932  | 0.5069  | 0.5070     |
| 0.264         | 17.0  | 153  | 0.0770          | 0.5041  | 0.3951  | 0.5008  | 0.5017     |
| 0.225         | 18.0  | 162  | 0.0745          | 0.5207  | 0.4148  | 0.5186  | 0.5191     |
| 0.1799        | 19.0  | 171  | 0.0660          | 0.5238  | 0.4168  | 0.5218  | 0.5214     |
| 0.1969        | 20.0  | 180  | 0.0585          | 0.5367  | 0.4385  | 0.5354  | 0.5348     |
| 0.1936        | 21.0  | 189  | 0.0537          | 0.5336  | 0.4331  | 0.5311  | 0.5313     |
| 0.1648        | 22.0  | 198  | 0.0482          | 0.5679  | 0.4742  | 0.5634  | 0.5651     |
| 0.139         | 23.0  | 207  | 0.0474          | 0.5563  | 0.4650  | 0.5535  | 0.5541     |
| 0.1243        | 24.0  | 216  | 0.0424          | 0.5729  | 0.4932  | 0.5733  | 0.5735     |
| 0.1175        | 25.0  | 225  | 0.0405          | 0.6096  | 0.5354  | 0.6074  | 0.6071     |
| 0.1087        | 26.0  | 234  | 0.0374          | 0.6148  | 0.5404  | 0.6140  | 0.6139     |
| 0.08          | 27.0  | 243  | 0.0351          | 0.6005  | 0.5205  | 0.5976  | 0.6004     |
| 0.1133        | 28.0  | 252  | 0.0317          | 0.5957  | 0.5140  | 0.5939  | 0.5947     |
| 0.0514        | 29.0  | 261  | 0.0300          | 0.6371  | 0.5702  | 0.6376  | 0.6368     |
| 0.0819        | 30.0  | 270  | 0.0281          | 0.6332  | 0.5680  | 0.6315  | 0.6318     |
| 0.1079        | 31.0  | 279  | 0.0248          | 0.6358  | 0.5696  | 0.6345  | 0.6344     |
| 0.079         | 32.0  | 288  | 0.0271          | 0.6231  | 0.5504  | 0.6222  | 0.6222     |
| 0.0633        | 33.0  | 297  | 0.0239          | 0.6607  | 0.5957  | 0.6572  | 0.6584     |
| 0.0557        | 34.0  | 306  | 0.0207          | 0.6658  | 0.6084  | 0.6639  | 0.6636     |
| 0.0614        | 35.0  | 315  | 0.0176          | 0.6767  | 0.6228  | 0.6753  | 0.6744     |
| 0.1288        | 36.0  | 324  | 0.0197          | 0.6603  | 0.5977  | 0.6592  | 0.6583     |
| 0.0324        | 37.0  | 333  | 0.0172          | 0.6853  | 0.6355  | 0.6852  | 0.6849     |
| 0.0275        | 38.0  | 342  | 0.0167          | 0.6881  | 0.6352  | 0.6852  | 0.6853     |
| 0.0346        | 39.0  | 351  | 0.0162          | 0.6772  | 0.6256  | 0.6760  | 0.6759     |
| 0.0343        | 40.0  | 360  | 0.0148          | 0.6960  | 0.6493  | 0.6935  | 0.6937     |
| 0.0413        | 41.0  | 369  | 0.0133          | 0.6883  | 0.6375  | 0.6869  | 0.6862     |
| 0.0871        | 42.0  | 378  | 0.0127          | 0.7170  | 0.6764  | 0.7146  | 0.7148     |
| 0.0351        | 43.0  | 387  | 0.0126          | 0.7072  | 0.6630  | 0.7067  | 0.7066     |
| 0.0391        | 44.0  | 396  | 0.0123          | 0.7108  | 0.6658  | 0.7091  | 0.7095     |
| 0.019         | 45.0  | 405  | 0.0122          | 0.7049  | 0.6601  | 0.7045  | 0.7044     |
| 0.0366        | 46.0  | 414  | 0.0117          | 0.7078  | 0.6670  | 0.7077  | 0.7072     |
| 0.0222        | 47.0  | 423  | 0.0110          | 0.7039  | 0.6591  | 0.7033  | 0.7037     |
| 0.03          | 48.0  | 432  | 0.0103          | 0.7098  | 0.6643  | 0.7076  | 0.7089     |
| 0.0534        | 49.0  | 441  | 0.0096          | 0.7184  | 0.6801  | 0.7171  | 0.7176     |
| 0.0363        | 50.0  | 450  | 0.0093          | 0.7207  | 0.6829  | 0.7193  | 0.7195     |
| 0.0175        | 51.0  | 459  | 0.0093          | 0.7208  | 0.6849  | 0.7185  | 0.7196     |
| 0.0325        | 52.0  | 468  | 0.0080          | 0.7276  | 0.6930  | 0.7261  | 0.7261     |
| 0.0341        | 53.0  | 477  | 0.0072          | 0.7310  | 0.6980  | 0.7296  | 0.7287     |
| 0.015         | 54.0  | 486  | 0.0072          | 0.7261  | 0.6896  | 0.7246  | 0.7244     |
| 0.0391        | 55.0  | 495  | 0.0074          | 0.7279  | 0.6940  | 0.7270  | 0.7260     |
| 0.0271        | 56.0  | 504  | 0.0072          | 0.7251  | 0.6883  | 0.7242  | 0.7240     |
| 0.0528        | 57.0  | 513  | 0.0073          | 0.7243  | 0.6870  | 0.7229  | 0.7226     |
| 0.0292        | 58.0  | 522  | 0.0069          | 0.7279  | 0.6940  | 0.7270  | 0.7260     |
| 0.0465        | 59.0  | 531  | 0.0068          | 0.7279  | 0.6940  | 0.7270  | 0.7260     |
| 0.0218        | 60.0  | 540  | 0.0068          | 0.7279  | 0.6940  | 0.7270  | 0.7260     |
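
The ROUGE columns correspond to the rouge1/rouge2/rougeL/rougeLsum keys reported by the evaluate library, which is the usual way such scores are computed during seq2seq evaluation; below is a small illustrative sketch (the prediction/reference strings are made up, not from the training data).

```python
# Sketch: computing ROUGE with the evaluate library. The output keys
# (rouge1, rouge2, rougeL, rougeLsum) match the table columns above.
# The example strings are illustrative only.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["Durga is worshipped during Navaratri."],
    references=["Durga is worshipped during the Navaratri festival."],
)
print(scores)  # {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```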

Framework versions

  • Transformers 4.46.0
  • PyTorch 2.5.0+cu121
  • Datasets 3.0.2
  • Tokenizers 0.20.1