metadata
license: apache-2.0
base_model: google/mt5-large
tags:
  - generated_from_keras_callback
model-index:
  - name: pakawadeep/mt5-large-finetuned-ctfl
    results: []

pakawadeep/mt5-large-finetuned-ctfl

This model is a fine-tuned version of google/mt5-large on an unknown dataset. At the final epoch (27) it achieves the following results on the training and validation sets:

  • Train Loss: 0.7379
  • Validation Loss: 0.8383
  • Train Rouge1: 8.9816
  • Train Rouge2: 2.3762
  • Train Rougel: 8.9109
  • Train Rougelsum: 8.9109
  • Train Gen Len: 11.9752
  • Epoch: 27

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
  • training_precision: float32
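
The logged optimizer settings can be turned back into an optimizer object. The sketch below is one way to do that, not necessarily how the original training script did it. It assumes the `AdamWeightDecay` class exported by Transformers for TensorFlow. The helper names `optimizer_kwargs` and `build_optimizer` are hypothetical. Dropping the `name` and `decay` keys is also an assumption: `decay` is logged as 0.0, so omitting it should not change behaviour. Any weight-decay exclusion patterns used in training were not logged and are not reproduced here.

```python
# Optimizer settings copied verbatim from the hyperparameter list above.
LOGGED_CONFIG = {
    "name": "AdamWeightDecay",
    "learning_rate": 2e-05,
    "decay": 0.0,
    "beta_1": 0.9,
    "beta_2": 0.999,
    "epsilon": 1e-07,
    "amsgrad": False,
    "weight_decay_rate": 0.01,
}


def optimizer_kwargs(config):
    """Return constructor keyword arguments from a logged optimizer config.

    Assumption: "name" is bookkeeping and "decay" (0.0 here) is a no-op,
    so both are dropped rather than passed to the constructor.
    """
    skipped = {"name", "decay"}
    return {key: value for key, value in config.items() if key not in skipped}


def build_optimizer(config=LOGGED_CONFIG):
    """Build the optimizer (hypothetical helper; needs TensorFlow installed)."""
    # Imported lazily so optimizer_kwargs() works without TensorFlow.
    from transformers import AdamWeightDecay

    return AdamWeightDecay(**optimizer_kwargs(config))
```

Calling `build_optimizer()` in an environment with TensorFlow and Transformers installed should return an object that can be passed to `model.compile(optimizer=...)`.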

Training results

| Train Loss | Validation Loss | Train Rouge1 | Train Rouge2 | Train Rougel | Train Rougelsum | Train Gen Len | Epoch |
|------------|-----------------|--------------|--------------|--------------|-----------------|---------------|-------|
| 11.3596    | 5.2319          | 2.5366       | 0.5088       | 2.5148       | 2.4929          | 19.0          | 0     |
| 6.1803     | 3.1508          | 2.7057       | 0.5265       | 2.6846       | 2.6643          | 19.0          | 1     |
| 4.6767     | 2.5774          | 2.7471       | 0.5265       | 2.7054       | 2.6899          | 18.3218       | 2     |
| 3.8698     | 2.8216          | 2.9763       | 0.2200       | 2.8792       | 2.8987          | 16.6238       | 3     |
| 4.7045     | 2.7911          | 3.2793       | 0.5501       | 3.1484       | 3.2486          | 14.1881       | 4     |
| 4.0342     | 2.4191          | 5.9406       | 0.6365       | 5.7206       | 5.8306          | 10.6980       | 5     |
| 3.4642     | 2.1307          | 5.9406       | 0.9406       | 5.7756       | 5.8463          | 11.2228       | 6     |
| 3.0690     | 1.9079          | 6.0644       | 0.9901       | 5.9406       | 6.0644          | 11.2228       | 7     |
| 2.6140     | 1.7092          | 5.7756       | 0.8251       | 5.6518       | 5.8168          | 11.4604       | 8     |
| 2.4520     | 1.6478          | 5.8581       | 0.8251       | 5.6931       | 5.8581          | 11.0842       | 9     |
| 2.2701     | 1.5641          | 5.9406       | 0.8251       | 5.8581       | 5.8581          | 10.8465       | 10    |
| 2.0735     | 1.4839          | 7.3020       | 1.0726       | 7.1370       | 7.2814          | 11.0891       | 11    |
| 1.8757     | 1.3780          | 7.4670       | 1.0726       | 7.3020       | 7.4257          | 11.2228       | 12    |
| 1.7313     | 1.3204          | 7.3020       | 1.0726       | 7.1370       | 7.2814          | 11.5842       | 13    |
| 1.5944     | 1.2466          | 7.4670       | 1.0726       | 7.3020       | 7.4257          | 11.6485       | 14    |
| 1.4894     | 1.1993          | 8.0858       | 1.5677       | 7.9208       | 8.1271          | 11.6139       | 15    |
| 1.3939     | 1.1446          | 8.1271       | 2.0627       | 8.0033       | 8.0858          | 11.7030       | 16    |
| 1.3065     | 1.0837          | 7.7558       | 1.5677       | 7.5083       | 7.5908          | 11.8168       | 17    |
| 1.2367     | 1.0604          | 8.0387       | 1.9307       | 7.9915       | 7.9679          | 11.9356       | 18    |
| 1.1569     | 1.0071          | 7.6143       | 1.4356       | 7.4257       | 7.4965          | 11.8515       | 19    |
| 1.0732     | 0.9713          | 8.5809       | 1.9307       | 8.4158       | 8.4512          | 11.8465       | 20    |
| 1.0204     | 0.9582          | 8.5809       | 1.9307       | 8.4158       | 8.4512          | 11.8317       | 21    |
| 0.9636     | 0.9317          | 8.5809       | 1.9307       | 8.4158       | 8.4512          | 11.8416       | 22    |
| 0.9054     | 0.8921          | 8.5809       | 1.9307       | 8.4158       | 8.4512          | 11.8663       | 23    |
| 0.8685     | 0.8795          | 9.0759       | 2.4257       | 8.9109       | 8.9109          | 11.8861       | 24    |
| 0.8100     | 0.8666          | 8.9816       | 2.3762       | 8.9109       | 8.9109          | 11.9455       | 25    |
| 0.7749     | 0.8524          | 8.9816       | 2.3762       | 8.9109       | 8.9109          | 11.9505       | 26    |
| 0.7379     | 0.8383          | 8.9816       | 2.3762       | 8.9109       | 8.9109          | 11.9752       | 27    |
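
The loss columns above can be summarised with a short script. The sketch below uses only the Python standard library. It copies the epoch, training loss and validation loss from the results table, then reports the epoch with the lowest validation loss and the first epoch where validation loss exceeds training loss. The function names are illustrative and are not part of any training script for this model.

```python
# Epoch, train loss, validation loss: copied from the training results table.
LOSSES = """\
0 11.3596 5.2319
1 6.1803 3.1508
2 4.6767 2.5774
3 3.8698 2.8216
4 4.7045 2.7911
5 4.0342 2.4191
6 3.4642 2.1307
7 3.0690 1.9079
8 2.6140 1.7092
9 2.4520 1.6478
10 2.2701 1.5641
11 2.0735 1.4839
12 1.8757 1.3780
13 1.7313 1.3204
14 1.5944 1.2466
15 1.4894 1.1993
16 1.3939 1.1446
17 1.3065 1.0837
18 1.2367 1.0604
19 1.1569 1.0071
20 1.0732 0.9713
21 1.0204 0.9582
22 0.9636 0.9317
23 0.9054 0.8921
24 0.8685 0.8795
25 0.8100 0.8666
26 0.7749 0.8524
27 0.7379 0.8383
"""


def parse_losses(text):
    """Parse 'epoch train val' lines into (int, float, float) tuples."""
    rows = []
    for line in text.strip().splitlines():
        epoch, train, val = line.split()
        rows.append((int(epoch), float(train), float(val)))
    return rows


def best_epoch(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def first_crossover(rows):
    """Return the first epoch where validation loss exceeds training loss."""
    for epoch, train, val in rows:
        if val > train:
            return epoch
    return None


rows = parse_losses(LOSSES)
epoch, train, val = best_epoch(rows)
print(f"lowest validation loss {val:.4f} at epoch {epoch}")
print(f"validation loss first exceeds training loss at epoch {first_crossover(rows)}")
```

On these numbers the lowest validation loss is at the final logged epoch (27), so validation loss had not yet plateaued when logging stopped. Validation loss first exceeds training loss at epoch 24.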

Framework versions

  • Transformers 4.38.2
  • TensorFlow 2.15.0
  • Datasets 2.18.0
  • Tokenizers 0.15.2
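
To check whether a local environment matches the versions listed above, a small standard-library script is enough. This is a sketch: the PyPI distribution names (`transformers`, `tensorflow`, `datasets`, `tokenizers`) are assumed to correspond to the frameworks listed, and the helper names are hypothetical.

```python
from importlib import metadata

# Versions copied from the "Framework versions" list above.
EXPECTED = {
    "transformers": "4.38.2",
    "tensorflow": "2.15.0",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_report(expected=EXPECTED):
    """Map each package to a tuple of (expected version, installed version)."""
    return {pkg: (want, installed_version(pkg)) for pkg, want in expected.items()}


def mismatches(report):
    """List packages whose installed version differs from the expected one."""
    return sorted(pkg for pkg, (want, have) in report.items() if have != want)


if __name__ == "__main__":
    report = version_report()
    for pkg, (want, have) in report.items():
        print(f"{pkg}: expected {want}, installed {have or 'not installed'}")
```

A different installed version does not necessarily mean the model will fail to load. It only flags a departure from the environment recorded in this card.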