mt5-small_old / README.md
metadata
license: apache-2.0
base_model: google/mt5-small
tags:
  - generated_from_trainer
metrics:
  - rouge
  - bleu
model-index:
  - name: mt5-small_epochs_new_new
    results: []

mt5-small_epochs_new_new

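This model is a fine-tuned version of google/mt5-small. The training dataset is not recorded in this card (the auto-generated text lists it as None). It achieves the following results on the evaluation set, matching the final logged checkpoint (step 6004):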

  • Loss: 1.0848
  • Rouge1: 41.527
  • Rouge2: 33.324
  • Rougel: 38.4866
  • Rougelsum: 38.4856
  • Bleu: 29.906
  • Gen Len: 17.1296
  • Meteor: 0.377
  • No ans accuracy: 47

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 9
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
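
The values above can be read as keyword arguments for a Trainer configuration. The following is a minimal sketch, not the author's actual training script: the dictionary keys follow field names of `transformers.TrainingArguments`, and mapping the card's entries onto them is an assumption. The helper names (`effective_batch_size`, `linear_lr`), the single-device setting, the zero warmup, and the total step count of 20 × 316 are also assumptions made for illustration; the card states none of them directly.

```python
# Hyperparameters copied from the list above. Key names follow
# transformers.TrainingArguments fields; this mapping is an assumption.
training_kwargs = {
    "learning_rate": 3e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 9,
    "gradient_accumulation_steps": 4,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 20,
}

# Assumption: one device, since 8 * 4 = 32 matches total_train_batch_size.
NUM_DEVICES = 1


def effective_batch_size(kwargs, num_devices=NUM_DEVICES):
    """Examples consumed per optimizer update."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step, total_steps, base_lr, warmup_steps=0):
    """Linear warmup followed by linear decay to zero (hypothetical helper)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# 316 optimizer steps per epoch is read from the training results table.
STEPS_PER_EPOCH = 316
total_steps = STEPS_PER_EPOCH * training_kwargs["num_train_epochs"]

print("effective batch size:", effective_batch_size(training_kwargs))
print("assumed total steps:", total_steps)
for step in (0, total_steps // 2, total_steps):
    lr = linear_lr(step, total_steps, training_kwargs["learning_rate"])
    print(f"step {step}: lr = {lr:.2e}")
```

To reuse the dictionary with the real library, it could be unpacked into `Seq2SeqTrainingArguments(output_dir=..., **training_kwargs)`, where `output_dir` is a path of your choosing.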

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Bleu | Gen Len | Meteor | No ans accuracy |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 9.0096 | 1.0 | 316 | 2.5283 | 23.4252 | 15.294 | 21.7032 | 21.7597 | 9.5703 | 12.1099 | 0.2117 | 0 |
| 3.2564 | 2.0 | 632 | 1.8337 | 33.2328 | 25.7922 | 31.2553 | 31.2709 | 15.51 | 13.8804 | 0.3108 | 0 |
| 2.5244 | 3.0 | 948 | 1.5796 | 34.1863 | 26.9908 | 32.3162 | 32.3197 | 17.5684 | 14.1904 | 0.3262 | 0 |
| 2.1686 | 3.99 | 1264 | 1.4179 | 34.565 | 27.4829 | 32.6012 | 32.6306 | 18.1896 | 14.2814 | 0.329 | 0 |
| 1.9465 | 4.99 | 1580 | 1.3050 | 41.2984 | 32.8587 | 38.3901 | 38.3985 | 28.3953 | 17.1212 | 0.3724 | 24 |
| 1.8009 | 5.99 | 1896 | 1.2428 | 41.5784 | 33.0684 | 38.6495 | 38.6555 | 28.9287 | 17.2045 | 0.3755 | 27 |
| 1.6954 | 6.99 | 2212 | 1.1992 | 40.4868 | 32.2937 | 37.6021 | 37.5986 | 28.2477 | 16.8056 | 0.3662 | 54 |
| 1.6322 | 7.99 | 2528 | 1.1769 | 37.6427 | 30.0271 | 34.8637 | 34.8951 | 26.433 | 15.5656 | 0.34 | 124 |
| 1.5845 | 8.99 | 2844 | 1.1574 | 40.3396 | 32.2547 | 37.3672 | 37.4137 | 28.6687 | 16.6457 | 0.3638 | 66 |
| 1.5425 | 9.98 | 3160 | 1.1500 | 39.1906 | 31.3426 | 36.3113 | 36.3654 | 27.8135 | 16.1679 | 0.3542 | 95 |
| 1.5137 | 10.98 | 3476 | 1.1367 | 41.4173 | 33.1848 | 38.4473 | 38.4306 | 29.6548 | 17.0306 | 0.3755 | 51 |
| 1.4826 | 11.98 | 3792 | 1.1161 | 41.4856 | 33.1913 | 38.4806 | 38.4896 | 29.5512 | 17.1031 | 0.3762 | 44 |
| 1.4514 | 12.98 | 4108 | 1.1182 | 41.8374 | 33.5091 | 38.7582 | 38.7679 | 29.8577 | 17.2987 | 0.3797 | 37 |
| 1.4444 | 13.98 | 4424 | 1.1056 | 42.0345 | 33.6905 | 38.9576 | 38.9795 | 30.1371 | 17.2669 | 0.3823 | 38 |
| 1.425 | 14.98 | 4740 | 1.0973 | 41.5086 | 33.2216 | 38.4098 | 38.4115 | 29.7019 | 17.1244 | 0.3767 | 50 |
| 1.407 | 15.97 | 5056 | 1.0890 | 41.7122 | 33.4259 | 38.605 | 38.6225 | 29.9984 | 17.1908 | 0.3794 | 44 |
| 1.4005 | 16.97 | 5372 | 1.0881 | 41.5731 | 33.2998 | 38.521 | 38.5259 | 29.9097 | 17.1027 | 0.3775 | 49 |
| 1.3865 | 17.97 | 5688 | 1.0860 | 40.9767 | 32.8412 | 37.9532 | 37.9637 | 29.4171 | 16.9404 | 0.372 | 55 |
| 1.3849 | 18.97 | 6004 | 1.0848 | 41.527 | 33.324 | 38.4866 | 38.4856 | 29.906 | 17.1296 | 0.377 | 47 |
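
The headline results reported at the top of the card equal the last logged row. As a rough sketch of how one might compare checkpoints, the snippet below copies three columns from the results above (step, validation loss, Rouge1) and picks the best row under each criterion. The variable names are hypothetical, and nothing here implies the author selected checkpoints this way.

```python
# (step, validation_loss, rouge1) copied from the training results.
rows = [
    (316, 2.5283, 23.4252),
    (632, 1.8337, 33.2328),
    (948, 1.5796, 34.1863),
    (1264, 1.4179, 34.565),
    (1580, 1.3050, 41.2984),
    (1896, 1.2428, 41.5784),
    (2212, 1.1992, 40.4868),
    (2528, 1.1769, 37.6427),
    (2844, 1.1574, 40.3396),
    (3160, 1.1500, 39.1906),
    (3476, 1.1367, 41.4173),
    (3792, 1.1161, 41.4856),
    (4108, 1.1182, 41.8374),
    (4424, 1.1056, 42.0345),
    (4740, 1.0973, 41.5086),
    (5056, 1.0890, 41.7122),
    (5372, 1.0881, 41.5731),
    (5688, 1.0860, 40.9767),
    (6004, 1.0848, 41.527),
]

# Lowest validation loss and highest Rouge1 need not fall on the same step.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_rouge1 = max(rows, key=lambda r: r[2])

print("lowest validation loss:", best_by_loss)
print("highest Rouge1:", best_by_rouge1)
```

On these numbers the lowest validation loss falls on the final logged step (6004), while Rouge1 peaks earlier, at step 4424, so the choice of checkpoint depends on which metric matters for the downstream task.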

Framework versions

  • Transformers 4.31.0
  • PyTorch 2.0.1+cu118
  • Datasets 2.13.1
  • Tokenizers 0.13.3