---
license: mit
tags:
- simplification
- generated_from_trainer
metrics:
- rouge
model-index:
- name: mbart-large-50-clara-med
  results: []
---

# mbart-large-50-clara-med

This model is a fine-tuned version of [facebook/mbart-large-50](https://huggingface.co/facebook/mbart-large-50) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 3.0952
- Rouge1: 49.4298
- Rouge2: 31.7193
- Rougel: 44.732
- Rougelsum: 44.9281
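
Below is a minimal inference sketch for this checkpoint. The Hub id `joheras/mbart-large-50-clara-med` and the Spanish language code `es_XX` are assumptions inferred from the card's name and its `simplification` tag, not details stated in this card.

```python
# Minimal inference sketch. The Hub id and the Spanish source/target language
# ("es_XX") are assumptions; adjust them to the actual checkpoint and corpus.
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

checkpoint = "joheras/mbart-large-50-clara-med"  # assumed repository id
tokenizer = MBart50TokenizerFast.from_pretrained(checkpoint)
model = MBartForConditionalGeneration.from_pretrained(checkpoint)

tokenizer.src_lang = "es_XX"  # assumed: simplification of Spanish medical text
inputs = tokenizer("Texto médico que se desea simplificar.", return_tensors="pt")
outputs = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["es_XX"],  # same-language output
    num_beams=4,
    max_length=256,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```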

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged code reconstruction follows the list):

- learning_rate: 5.6e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
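
As a reading aid, here is a hedged sketch of `Seq2SeqTrainingArguments` matching the list above. The actual training script is not part of this card, so the flags marked "assumed" are inferences, not documented settings.

```python
# Hedged reconstruction of the training setup from the hyperparameters above.
# Data loading, preprocessing, and compute_metrics are omitted; flags marked
# "assumed" are not stated in this card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mbart-large-50-clara-med",
    learning_rate=5.6e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=30,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the defaults:
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",  # assumed: the results table has one row per epoch
    predict_with_generate=True,   # assumed: required to compute ROUGE on generations
)
```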

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| No log        | 1.0   | 190  | 9.5151          | 8.9002  | 0.0056  | 8.9059  | 8.8991    |
| No log        | 2.0   | 380  | 1.7786          | 44.8765 | 27.9652 | 40.2081 | 40.3457   |
| 4.488         | 3.0   | 570  | 1.7104          | 46.4054 | 28.8582 | 41.6579 | 41.86     |
| 4.488         | 4.0   | 760  | 1.7601          | 47.6046 | 30.1854 | 42.9171 | 43.0745   |
| 1.1057        | 5.0   | 950  | 1.9232          | 48.1693 | 30.1535 | 43.0418 | 43.1796   |
| 1.1057        | 6.0   | 1140 | 2.2791          | 43.831  | 26.9216 | 39.1768 | 39.3672   |
| 1.1057        | 7.0   | 1330 | 2.4800          | 42.4614 | 25.2371 | 37.6735 | 37.9309   |
| 0.4401        | 8.0   | 1520 | 2.4991          | 46.6653 | 28.9836 | 42.1188 | 42.2492   |
| 0.4401        | 9.0   | 1710 | 2.5826          | 47.2784 | 29.8703 | 42.622  | 42.7514   |
| 0.2523        | 10.0  | 1900 | 2.6356          | 48.0382 | 30.8884 | 43.3523 | 43.5068   |
| 0.2523        | 11.0  | 2090 | 2.6141          | 47.6911 | 29.3254 | 42.4938 | 42.6519   |
| 0.2523        | 12.0  | 2280 | 2.6942          | 48.7597 | 30.9279 | 43.5391 | 43.6974   |
| 0.1613        | 13.0  | 2470 | 2.7194          | 49.0916 | 30.9767 | 43.9943 | 44.1572   |
| 0.1613        | 14.0  | 2660 | 2.7911          | 47.8223 | 30.6173 | 43.1809 | 43.3471   |
| 0.1305        | 15.0  | 2850 | 2.8370          | 47.5629 | 29.7783 | 42.7168 | 42.8503   |
| 0.1305        | 16.0  | 3040 | 2.8588          | 49.4762 | 31.6101 | 44.5422 | 44.7027   |
| 0.1305        | 17.0  | 3230 | 2.9082          | 49.1502 | 31.4654 | 44.2166 | 44.3186   |
| 0.141         | 18.0  | 3420 | 2.8887          | 48.9675 | 31.0485 | 44.177  | 44.3258   |
| 0.141         | 19.0  | 3610 | 2.9043          | 49.2936 | 31.5204 | 44.2215 | 44.4216   |
| 0.1096        | 20.0  | 3800 | 2.9549          | 48.0316 | 30.4505 | 42.9444 | 43.0893   |
| 0.1096        | 21.0  | 3990 | 2.9666          | 49.2276 | 31.2755 | 44.2435 | 44.4591   |
| 0.1096        | 22.0  | 4180 | 2.9697          | 49.1008 | 31.4931 | 44.1893 | 44.382    |
| 0.0773        | 23.0  | 4370 | 2.9970          | 49.3707 | 31.4672 | 44.6066 | 44.7685   |
| 0.0773        | 24.0  | 4560 | 3.0081          | 49.2172 | 31.4693 | 44.4235 | 44.5458   |
| 0.048         | 25.0  | 4750 | 2.9968          | 49.4847 | 31.8341 | 44.8464 | 45.0286   |
| 0.048         | 26.0  | 4940 | 3.0405          | 49.5724 | 31.612  | 44.5192 | 44.7717   |
| 0.048         | 27.0  | 5130 | 3.0651          | 49.0194 | 31.2473 | 44.177  | 44.3837   |
| 0.0274        | 28.0  | 5320 | 3.0740          | 49.2999 | 31.5672 | 44.56   | 44.8105   |
| 0.0274        | 29.0  | 5510 | 3.0842          | 49.2898 | 31.602  | 44.5414 | 44.754    |
| 0.0168        | 30.0  | 5700 | 3.0952          | 49.4298 | 31.7193 | 44.732  | 44.9281   |
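
The ROUGE columns above were presumably produced by a `compute_metrics` hook. A minimal sketch using the `evaluate` library (an assumption; the card does not list that dependency) would look like this:

```python
# Illustrative ROUGE computation; inputs here are toy strings, and the use of
# the `evaluate` library is an assumption rather than a detail from this card.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["el texto simplificado generado"],      # decoded model outputs
    references=["el texto simplificado de referencia"],  # gold simplifications
)
# Scores are fractions in [0, 1]; the table reports them scaled to percentages.
print({k: round(v * 100, 4) for k, v in scores.items()})
```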

### Framework versions

- Transformers 4.25.1
- PyTorch 1.13.0
- Datasets 2.8.0
- Tokenizers 0.12.1