---
license: mit
tags:
  - simplification
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: mbart-large-50-clara-med
    results: []
---

# mbart-large-50-clara-med

This model is a fine-tuned version of [facebook/mbart-large-50](https://huggingface.co/facebook/mbart-large-50) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 3.2175
- Rouge1: 48.3311
- Rouge2: 30.5638
- Rougel: 43.5214
- Rougelsum: 43.6488
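
Since the card does not document usage, here is a minimal inference sketch with `transformers`. The Hub id `joheras/mbart-large-50-clara-med` and the Spanish language code `es_XX` are assumptions (CLARA-MeD is a Spanish medical-text-simplification corpus), not details confirmed by this card:

```python
# Hedged usage sketch: Hub id and language code "es_XX" are assumptions.
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

checkpoint = "joheras/mbart-large-50-clara-med"  # assumed Hub id
tokenizer = MBart50TokenizerFast.from_pretrained(
    checkpoint, src_lang="es_XX", tgt_lang="es_XX"
)
model = MBartForConditionalGeneration.from_pretrained(checkpoint)

text = "El paciente presenta hipertensión arterial sistémica."  # placeholder input
inputs = tokenizer(text, return_tensors="pt", truncation=True)
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["es_XX"],
    num_beams=4,
    max_length=128,
)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```

For mBART-50, `forced_bos_token_id` pins the language of the generated sequence; since simplification is monolingual here, the source and target codes are the same.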

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the training-arguments sketch after this list):

- learning_rate: 5.6e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
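
As a reading aid, here is how those values would map onto `Seq2SeqTrainingArguments`. The listed Adam betas and epsilon are the `Trainer` defaults, and every argument not in the list above is an assumption, since the training script is not included in this card:

```python
# Sketch only: values above come from the card; everything else is assumed.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mbart-large-50-clara-med",  # assumed
    learning_rate=5.6e-05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=30,
    evaluation_strategy="epoch",   # assumed from the per-epoch results table below
    predict_with_generate=True,    # assumed; needed to compute ROUGE at eval time
)
```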

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| No log        | 1.0   | 190  | 3.2394          | 16.8539 | 2.7013  | 12.425  | 12.5286   |
| No log        | 2.0   | 380  | 1.7381          | 44.5316 | 27.8022 | 40.1591 | 40.3177   |
| 3.4249        | 3.0   | 570  | 1.7198          | 45.6463 | 28.6925 | 41.263  | 41.4703   |
| 3.4249        | 4.0   | 760  | 1.9450          | 43.0233 | 26.3397 | 38.7518 | 38.9154   |
| 0.8377        | 5.0   | 950  | 2.1068          | 46.5936 | 28.7218 | 41.7184 | 41.8448   |
| 0.8377        | 6.0   | 1140 | 2.2815          | 46.4517 | 28.5639 | 41.8107 | 41.9996   |
| 0.8377        | 7.0   | 1330 | 2.4726          | 46.0403 | 28.1887 | 40.9183 | 41.0318   |
| 0.3195        | 8.0   | 1520 | 2.5690          | 47.255  | 29.1482 | 42.4463 | 42.5728   |
| 0.3195        | 9.0   | 1710 | 2.6753          | 46.5967 | 28.5688 | 41.414  | 41.5889   |
| 0.1925        | 10.0  | 1900 | 2.7276          | 46.3251 | 28.4889 | 41.4556 | 41.581    |
| 0.1925        | 11.0  | 2090 | 2.7638          | 46.9325 | 29.2558 | 41.726  | 41.8413   |
| 0.1925        | 12.0  | 2280 | 2.8273          | 47.0344 | 29.1298 | 41.7291 | 41.9236   |
| 0.1313        | 13.0  | 2470 | 2.8633          | 47.5234 | 29.6376 | 42.3409 | 42.4372   |
| 0.1313        | 14.0  | 2660 | 2.8989          | 47.0396 | 29.117  | 41.9893 | 42.1846   |
| 0.1117        | 15.0  | 2850 | 2.9691          | 47.8406 | 29.889  | 42.5645 | 42.7676   |
| 0.1117        | 16.0  | 3040 | 2.9763          | 46.9489 | 28.9919 | 41.8404 | 42.0141   |
| 0.1117        | 17.0  | 3230 | 2.9985          | 47.6628 | 29.7341 | 42.6382 | 42.7649   |
| 0.0824        | 18.0  | 3420 | 3.0511          | 48.0627 | 30.4108 | 43.1693 | 43.3489   |
| 0.0824        | 19.0  | 3610 | 3.0102          | 48.05   | 29.9552 | 43.1462 | 43.3421   |
| 0.0467        | 20.0  | 3800 | 3.0520          | 47.5451 | 29.6129 | 42.6499 | 42.7968   |
| 0.0467        | 21.0  | 3990 | 3.0978          | 47.5042 | 29.6191 | 42.6093 | 42.7341   |
| 0.0467        | 22.0  | 4180 | 3.1270          | 47.8301 | 29.9484 | 42.6866 | 42.9179   |
| 0.0246        | 23.0  | 4370 | 3.1435          | 47.6683 | 30.1974 | 43.0456 | 43.1496   |
| 0.0246        | 24.0  | 4560 | 3.1599          | 47.8652 | 30.2751 | 43.0445 | 43.1898   |
| 0.013         | 25.0  | 4750 | 3.1750          | 48.1352 | 30.4185 | 43.0485 | 43.2456   |
| 0.013         | 26.0  | 4940 | 3.1939          | 47.9653 | 30.3968 | 43.1271 | 43.2522   |
| 0.013         | 27.0  | 5130 | 3.2054          | 48.2122 | 30.6    | 43.3461 | 43.4629   |
| 0.0071        | 28.0  | 5320 | 3.1964          | 47.924  | 30.3089 | 43.0402 | 43.2016   |
| 0.0071        | 29.0  | 5510 | 3.2123          | 48.2967 | 30.5088 | 43.431  | 43.5384   |
| 0.005         | 30.0  | 5700 | 3.2175          | 48.3311 | 30.5638 | 43.5214 | 43.6488   |
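
The ROUGE columns above can be reproduced with the `evaluate` library. This is only a sketch on placeholder strings, since the card does not document the evaluation script's pre- or post-processing:

```python
# Sketch: score generated simplifications against references with ROUGE.
# The example strings are placeholders, not data from the model's eval set.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["frase simplificada generada por el modelo"],  # placeholder
    references=["frase simplificada de referencia"],            # placeholder
)
# Recent versions of `evaluate` return fractions in [0, 1];
# the table above reports them multiplied by 100.
print({name: round(value * 100, 4) for name, value in scores.items()})
```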

### Framework versions

- Transformers 4.25.1
- Pytorch 1.13.0
- Datasets 2.8.0
- Tokenizers 0.12.1