mt5-large-finetuned-scope-summarization

This model is a fine-tuned version of google/mt5-large on an unknown dataset. After the final training epoch (epoch 20, step 260), it achieves the following results on the evaluation set:

  • Loss: 8.2775
  • Rouge1: 5.918
  • Rouge2: 1.0667
  • Rougel: 5.7247
  • Rougelsum: 5.552
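
The ROUGE values above are reported on a 0 to 100 scale. As a rough illustration of what a ROUGE-1 F-measure captures, the sketch below counts clipped unigram overlap between a candidate and a reference summary. It assumes lowercased whitespace tokenization and no stemming, so it will not reproduce the numbers in this card; the function name `rouge1_f1` is hypothetical, and this is not the evaluation code that produced these results.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F-measure on a 0-100 scale (illustrative only)."""
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    if not cand_tokens or not ref_tokens:
        return 0.0

    # Clipped overlap: a token counts at most as often as it occurs in both texts.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 100.0 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(rouge1_f1("the cat", "the cat sat"))
```

A score near 6, as reported here, would correspond to very little unigram overlap between generated and reference summaries under this kind of measure.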

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
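
The settings listed above can be sketched as a `Seq2SeqTrainingArguments` configuration, together with the learning-rate curve a linear schedule would produce. This is a sketch under assumptions, not the original training script: zero warmup steps, a single device (so `train_batch_size` equals the per-device batch size), and `predict_with_generate=True` are all assumed, and the helper names `linear_lr` and `build_training_args` are hypothetical. The 13 steps per epoch come from the training results table.

```python
# Values copied from the hyperparameter list in this card.
HYPERPARAMS = {
    "learning_rate": 5.6e-05,
    "train_batch_size": 8,
    "eval_batch_size": 8,
    "seed": 42,
    "adam_betas": (0.9, 0.999),
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_epochs": 20,
}

STEPS_PER_EPOCH = 13  # from the training results table
TOTAL_STEPS = STEPS_PER_EPOCH * HYPERPARAMS["num_epochs"]


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate at `step` for linear decay to zero.

    Assumption: warmup_steps=0, since the card lists no warmup setting.
    """
    base = HYPERPARAMS["learning_rate"]
    if step < warmup_steps:
        return base * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return base * remaining / max(1, TOTAL_STEPS - warmup_steps)


def build_training_args(output_dir: str = "mt5-large-finetuned-scope-summarization"):
    """Build training arguments mirroring the card (requires `transformers`)."""
    from transformers import Seq2SeqTrainingArguments  # imported lazily

    beta1, beta2 = HYPERPARAMS["adam_betas"]
    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        learning_rate=HYPERPARAMS["learning_rate"],
        # Assumption: one device, so per-device size equals the listed size.
        per_device_train_batch_size=HYPERPARAMS["train_batch_size"],
        per_device_eval_batch_size=HYPERPARAMS["eval_batch_size"],
        seed=HYPERPARAMS["seed"],
        adam_beta1=beta1,
        adam_beta2=beta2,
        adam_epsilon=HYPERPARAMS["adam_epsilon"],
        lr_scheduler_type=HYPERPARAMS["lr_scheduler_type"],
        num_train_epochs=HYPERPARAMS["num_epochs"],
        predict_with_generate=True,  # assumption: needed for ROUGE during eval
    )


if __name__ == "__main__":
    for step in (0, 130, 260):
        print(step, linear_lr(step))
```

If the run used a single device with no gradient accumulation and did not drop the last partial batch, 13 steps per epoch at batch size 8 would imply a training set of roughly 97 to 104 examples.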

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---|---|---|---|---|---|---|---|
| 27.8719 | 1.0 | 13 | 15.8303 | 9.9779 | 0.8912 | 8.8304 | 8.8653 |
| 25.4142 | 2.0 | 26 | 20.3410 | 11.3301 | 1.0662 | 9.8807 | 9.8442 |
| 24.8026 | 3.0 | 39 | 16.5876 | 11.1912 | 1.5008 | 9.9776 | 9.9685 |
| 23.7918 | 4.0 | 52 | 14.0667 | 11.5953 | 1.6391 | 10.2961 | 10.1512 |
| 21.945 | 5.0 | 65 | 12.3075 | 10.6522 | 1.2121 | 10.0748 | 10.0261 |
| 18.8588 | 6.0 | 78 | 11.8270 | 11.4944 | 1.4152 | 9.9891 | 9.9505 |
| 16.587 | 7.0 | 91 | 10.7425 | 9.9989 | 1.425 | 8.9661 | 8.9811 |
| 15.9949 | 8.0 | 104 | 10.2228 | 10.0086 | 1.6533 | 8.9911 | 9.0047 |
| 15.2301 | 9.0 | 117 | 11.2979 | 9.2011 | 1.425 | 8.9267 | 8.8763 |
| 14.9655 | 10.0 | 130 | 11.3654 | 9.3934 | 1.6533 | 8.9243 | 8.8443 |
| 14.7982 | 11.0 | 143 | 10.7718 | 8.5085 | 1.4133 | 8.0936 | 8.0127 |
| 13.5222 | 12.0 | 156 | 10.0961 | 7.849 | 1.1637 | 7.3283 | 7.1943 |
| 13.0959 | 13.0 | 169 | 9.4677 | 8.0846 | 1.1637 | 7.1215 | 7.0501 |
| 13.0554 | 14.0 | 182 | 8.9576 | 7.0454 | 1.2494 | 6.7761 | 6.6897 |
| 13.1098 | 15.0 | 195 | 8.7926 | 7.9192 | 1.4133 | 7.742 | 7.6718 |
| 12.4133 | 16.0 | 208 | 8.5472 | 7.0176 | 1.2819 | 6.8465 | 6.8276 |
| 12.4751 | 17.0 | 221 | 8.5494 | 5.918 | 1.0667 | 5.7247 | 5.552 |
| 11.9681 | 18.0 | 234 | 8.5223 | 5.918 | 1.0667 | 5.7247 | 5.552 |
| 11.8797 | 19.0 | 247 | 8.3327 | 5.918 | 1.0667 | 5.7247 | 5.552 |
| 11.8815 | 20.0 | 260 | 8.2775 | 5.918 | 1.0667 | 5.7247 | 5.552 |
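
Validation loss and ROUGE do not peak at the same point in this log: the lowest validation loss is at epoch 20, while the highest Rouge1 is at epoch 4. The sketch below makes that comparison explicit. The tuples are copied from the Epoch, Validation Loss and Rouge1 columns of the training results; the variable names are hypothetical.

```python
# (epoch, validation_loss, rouge1) copied from the training results.
LOG = [
    (1, 15.8303, 9.9779),
    (2, 20.3410, 11.3301),
    (3, 16.5876, 11.1912),
    (4, 14.0667, 11.5953),
    (5, 12.3075, 10.6522),
    (6, 11.8270, 11.4944),
    (7, 10.7425, 9.9989),
    (8, 10.2228, 10.0086),
    (9, 11.2979, 9.2011),
    (10, 11.3654, 9.3934),
    (11, 10.7718, 8.5085),
    (12, 10.0961, 7.849),
    (13, 9.4677, 8.0846),
    (14, 8.9576, 7.0454),
    (15, 8.7926, 7.9192),
    (16, 8.5472, 7.0176),
    (17, 8.5494, 5.918),
    (18, 8.5223, 5.918),
    (19, 8.3327, 5.918),
    (20, 8.2775, 5.918),
]

# Epoch with the lowest validation loss.
best_by_loss = min(LOG, key=lambda row: row[1])
# Epoch with the highest Rouge1 (first occurrence if tied).
best_by_rouge1 = max(LOG, key=lambda row: row[2])

if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss)
    print("highest Rouge1:", best_by_rouge1)
```

The card does not say how the published checkpoint was selected. If selection by ROUGE were wanted, `Seq2SeqTrainingArguments` provides `load_best_model_at_end` and `metric_for_best_model` for that purpose; whether this run used them is not stated.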

Framework versions

  • Transformers 4.38.1
  • Pytorch 2.2.1+cu121
  • Datasets 2.17.1
  • Tokenizers 0.15.2

Model size: 1.23B params (Safetensors, tensor type F32)
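
A minimal sketch for loading the checkpoint and generating a summary with the `transformers` library, plus a back-of-the-envelope estimate of the memory the F32 weights need. The repository id is taken from the model page; the helper names, `max_length=512`, `max_new_tokens=64` and `num_beams=4` are assumptions chosen for illustration, since the card does not document input format or generation settings. The mT5 tokenizer also needs the `sentencepiece` package installed.

```python
MODEL_ID = "nandavikas16/mt5-large-finetuned-scope-summarization"


def weight_memory_gb(num_params: float, bytes_per_param: int = 4) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes); F32 uses 4 bytes."""
    return num_params * bytes_per_param / 1e9


def summarize(text: str, max_new_tokens: int = 64) -> str:
    """Generate a summary for `text` (downloads the model on first use)."""
    # Imported lazily so the size estimate above works without transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Assumption: truncate inputs to 512 tokens; the card gives no limit.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    # Assumption: beam search with 4 beams; generation settings are undocumented.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Only prints the size estimate; call summarize(...) to run the model.
    print(f"approx. weight size: {weight_memory_gb(1.23e9):.2f} GB")
```

The estimate covers the weights only; activations and generation buffers add to it at inference time.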