---
license: mit
tags:
- simplification
- generated_from_trainer
metrics:
- rouge
model-index:
- name: mbart-large-50-clara-med
  results: []
---

# mbart-large-50-clara-med

This model is a fine-tuned version of [facebook/mbart-large-50](https://huggingface.co/facebook/mbart-large-50) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 3.2121
- Rouge1: 49.1001
- Rouge2: 31.2516
- Rougel: 44.0446
- Rougelsum: 44.1075
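The card ships without a usage snippet; the sketch below shows one plausible way to run inference with 🤗 Transformers. The hub id `joheras/mbart-large-50-clara-med`, the example input, and the generation settings are assumptions, not values taken from this card.

```python
# Minimal inference sketch (not from the original card).
# The hub id below is an assumption inferred from the model name.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "joheras/mbart-large-50-clara-med"  # hypothetical repository path
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Sentence to be simplified."  # placeholder input
inputs = tokenizer(text, return_tensors="pt", truncation=True)
output_ids = model.generate(**inputs, num_beams=4, max_length=128)  # assumed settings
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```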

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5.6e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
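These settings map onto `Seq2SeqTrainingArguments` as in the sketch below. It is a hedged reconstruction, not the original training script: the `output_dir`, `evaluation_strategy`, and `predict_with_generate` values are assumptions (the per-epoch validation rows and ROUGE scores in the next table suggest them, but the card does not state them).

```python
# Hedged reconstruction of the listed hyperparameters; values marked as
# assumptions do not appear in the original card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mbart-large-50-clara-med",  # assumption: hypothetical path
    learning_rate=5.6e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=30,
    evaluation_strategy="epoch",   # assumption: matches the per-epoch eval rows
    predict_with_generate=True,    # assumption: needed to compute ROUGE in eval
)
```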

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| No log        | 1.0   | 190  | 1.8633          | 44.8593 | 28.0451 | 40.7724 | 40.8654   |
| No log        | 2.0   | 380  | 1.6667          | 46.8654 | 29.5857 | 42.6056 | 42.7844   |
| 3.317         | 3.0   | 570  | 1.6847          | 48.1605 | 30.163  | 43.1965 | 43.3317   |
| 3.317         | 4.0   | 760  | 1.7845          | 48.7615 | 30.8887 | 43.6946 | 43.8016   |
| 0.7441        | 5.0   | 950  | 2.0090          | 48.4207 | 30.64   | 43.654  | 43.7979   |
| 0.7441        | 6.0   | 1140 | 2.2425          | 49.1967 | 31.2644 | 44.0566 | 44.2112   |
| 0.7441        | 7.0   | 1330 | 2.4520          | 47.0568 | 28.7501 | 41.8219 | 41.9605   |
| 0.2396        | 8.0   | 1520 | 2.5336          | 47.969  | 30.0618 | 42.9924 | 43.1481   |
| 0.2396        | 9.0   | 1710 | 2.6153          | 47.2037 | 28.9732 | 42.0939 | 42.2242   |
| 0.1112        | 10.0  | 1900 | 2.7299          | 48.3657 | 30.3342 | 43.2025 | 43.3223   |
| 0.1112        | 11.0  | 2090 | 2.7696          | 48.0929 | 30.0156 | 42.9385 | 43.026    |
| 0.1112        | 12.0  | 2280 | 2.8627          | 48.1979 | 30.2714 | 43.0959 | 43.2027   |
| 0.0938        | 13.0  | 2470 | 2.8788          | 47.7685 | 29.5733 | 42.7561 | 42.9112   |
| 0.0938        | 14.0  | 2660 | 2.9128          | 47.5374 | 29.8217 | 42.7097 | 42.7803   |
| 0.0394        | 15.0  | 2850 | 2.9470          | 48.6385 | 30.1425 | 43.3326 | 43.3963   |
| 0.0394        | 16.0  | 3040 | 3.0039          | 48.6657 | 30.6642 | 43.471  | 43.592    |
| 0.0394        | 17.0  | 3230 | 3.0380          | 48.2351 | 30.5653 | 43.257  | 43.3788   |
| 0.023         | 18.0  | 3420 | 3.0289          | 48.6593 | 30.6916 | 43.7861 | 43.9098   |
| 0.023         | 19.0  | 3610 | 3.0733          | 49.2114 | 31.2737 | 44.0852 | 44.1993   |
| 0.0122        | 20.0  | 3800 | 3.1089          | 48.5431 | 30.5305 | 43.4128 | 43.5288   |
| 0.0122        | 21.0  | 3990 | 3.0684          | 48.4197 | 30.4005 | 43.2305 | 43.3214   |
| 0.0122        | 22.0  | 4180 | 3.1252          | 48.6007 | 30.5594 | 43.4008 | 43.5336   |
| 0.0071        | 23.0  | 4370 | 3.1572          | 48.7297 | 30.7028 | 43.436  | 43.5106   |
| 0.0071        | 24.0  | 4560 | 3.1716          | 48.9335 | 30.9918 | 43.7764 | 43.8044   |
| 0.0041        | 25.0  | 4750 | 3.1687          | 48.8731 | 31.1055 | 43.8021 | 43.8987   |
| 0.0041        | 26.0  | 4940 | 3.1845          | 48.9432 | 31.0766 | 43.8628 | 43.9726   |
| 0.0041        | 27.0  | 5130 | 3.2133          | 49.2016 | 31.1265 | 44.052  | 44.1427   |
| 0.0025        | 28.0  | 5320 | 3.2146          | 49.1473 | 31.3109 | 44.0372 | 44.1189   |
| 0.0025        | 29.0  | 5510 | 3.2121          | 49.2815 | 31.4258 | 44.1661 | 44.2436   |
| 0.0019        | 30.0  | 5700 | 3.2121          | 49.1001 | 31.2516 | 44.0446 | 44.1075   |
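The ROUGE columns above can be reproduced with a scorer along the lines of the sketch below. Whether training used the `evaluate` package or another ROUGE implementation is an assumption on my part, and the inputs shown are placeholders.

```python
# Hedged ROUGE-scoring sketch; the `evaluate` package is an assumed choice,
# and the prediction/reference pair is a placeholder.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["the model's simplified output"]  # placeholder
references = ["the reference simplification"]    # placeholder
scores = rouge.compute(predictions=predictions, references=references)
# Scores are fractions in [0, 1]; the table above reports them scaled by 100.
print({name: round(value * 100, 4) for name, value in scores.items()})
```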

### Framework versions

- Transformers 4.25.1
- Pytorch 1.13.0
- Datasets 2.8.0
- Tokenizers 0.12.1