---
license: mit
tags:
- simplification
- generated_from_trainer
metrics:
- rouge
model-index:
- name: mbart-large-50-clara-med
  results: []
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# mbart-large-50-clara-med
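This model is a fine-tuned version of [facebook/mbart-large-50](https://huggingface.co/facebook/mbart-large-50) on an unspecified dataset (the Trainer did not record a dataset name).
After the final training epoch (epoch 30, step 5700), it achieves the following results on the evaluation set: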
- Loss: 3.2121
- Rouge1: 49.1001
- Rouge2: 31.2516
- Rougel: 44.0446
- Rougelsum: 44.1075
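To give a sense of what the ROUGE figures above measure, here is a minimal sketch of a ROUGE-1 style F-measure. It is a simplified stand-in, not the scorer used to produce this card's numbers: it assumes lowercased whitespace tokens with no stemming, and it assumes the reported values are F-measures scaled to 0-100. The function name `rouge1_f` and the example sentences are hypothetical.

```python
from collections import Counter


def rouge1_f(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F-measure on whitespace tokens, scaled to 0-100."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped unigram overlap: a token counts at most as often as it
    # appears in both the reference and the candidate.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100 * 2 * precision * recall / (precision + recall)


# Hypothetical reference/candidate pair, for illustration only.
score = rouge1_f(
    "the patient should rest for two days",
    "the patient must rest two days",
)
print(f"ROUGE-1 F (simplified): {score:.2f}")
```

ROUGE-2 applies the same idea to bigrams, and ROUGE-L uses the longest common subsequence instead of n-gram overlap.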
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5.6e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
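The hyperparameters above imply a specific learning-rate curve and a rough training-set size, which the following sketch works out. It assumes zero warmup steps, a single device, and no gradient accumulation, none of which the card states; the 190 steps per epoch are read from the training results table. The helper `linear_lr` is a hypothetical re-implementation of a linear decay schedule, not a call into the Trainer.

```python
LEARNING_RATE = 5.6e-05   # from the hyperparameter list
NUM_EPOCHS = 30           # from the hyperparameter list
TRAIN_BATCH_SIZE = 16     # from the hyperparameter list
STEPS_PER_EPOCH = 190     # from the results table (step 190 at epoch 1.0)
WARMUP_STEPS = 0          # ASSUMPTION: the card lists no warmup setting

TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# ASSUMPTION: one device, no gradient accumulation, last partial batch kept.
# 190 batches of 16 then bound the number of training examples.
min_train_examples = (STEPS_PER_EPOCH - 1) * TRAIN_BATCH_SIZE + 1
max_train_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

print(f"total steps: {TOTAL_STEPS}")
print(f"lr at step 0 / midpoint / end: "
      f"{linear_lr(0):.2e} / {linear_lr(TOTAL_STEPS // 2):.2e} / {linear_lr(TOTAL_STEPS):.2e}")
print(f"training examples: between {min_train_examples} and {max_train_examples}")
```

Under these assumptions the run takes 5700 optimizer steps in total, which matches the last row of the training results table.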
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| No log | 1.0 | 190 | 1.8633 | 44.8593 | 28.0451 | 40.7724 | 40.8654 |
| No log | 2.0 | 380 | 1.6667 | 46.8654 | 29.5857 | 42.6056 | 42.7844 |
| 3.317 | 3.0 | 570 | 1.6847 | 48.1605 | 30.163 | 43.1965 | 43.3317 |
| 3.317 | 4.0 | 760 | 1.7845 | 48.7615 | 30.8887 | 43.6946 | 43.8016 |
| 0.7441 | 5.0 | 950 | 2.0090 | 48.4207 | 30.64 | 43.654 | 43.7979 |
| 0.7441 | 6.0 | 1140 | 2.2425 | 49.1967 | 31.2644 | 44.0566 | 44.2112 |
| 0.7441 | 7.0 | 1330 | 2.4520 | 47.0568 | 28.7501 | 41.8219 | 41.9605 |
| 0.2396 | 8.0 | 1520 | 2.5336 | 47.969 | 30.0618 | 42.9924 | 43.1481 |
| 0.2396 | 9.0 | 1710 | 2.6153 | 47.2037 | 28.9732 | 42.0939 | 42.2242 |
| 0.1112 | 10.0 | 1900 | 2.7299 | 48.3657 | 30.3342 | 43.2025 | 43.3223 |
| 0.1112 | 11.0 | 2090 | 2.7696 | 48.0929 | 30.0156 | 42.9385 | 43.026 |
| 0.1112 | 12.0 | 2280 | 2.8627 | 48.1979 | 30.2714 | 43.0959 | 43.2027 |
| 0.0938 | 13.0 | 2470 | 2.8788 | 47.7685 | 29.5733 | 42.7561 | 42.9112 |
| 0.0938 | 14.0 | 2660 | 2.9128 | 47.5374 | 29.8217 | 42.7097 | 42.7803 |
| 0.0394 | 15.0 | 2850 | 2.9470 | 48.6385 | 30.1425 | 43.3326 | 43.3963 |
| 0.0394 | 16.0 | 3040 | 3.0039 | 48.6657 | 30.6642 | 43.471 | 43.592 |
| 0.0394 | 17.0 | 3230 | 3.0380 | 48.2351 | 30.5653 | 43.257 | 43.3788 |
| 0.023 | 18.0 | 3420 | 3.0289 | 48.6593 | 30.6916 | 43.7861 | 43.9098 |
| 0.023 | 19.0 | 3610 | 3.0733 | 49.2114 | 31.2737 | 44.0852 | 44.1993 |
| 0.0122 | 20.0 | 3800 | 3.1089 | 48.5431 | 30.5305 | 43.4128 | 43.5288 |
| 0.0122 | 21.0 | 3990 | 3.0684 | 48.4197 | 30.4005 | 43.2305 | 43.3214 |
| 0.0122 | 22.0 | 4180 | 3.1252 | 48.6007 | 30.5594 | 43.4008 | 43.5336 |
| 0.0071 | 23.0 | 4370 | 3.1572 | 48.7297 | 30.7028 | 43.436 | 43.5106 |
| 0.0071 | 24.0 | 4560 | 3.1716 | 48.9335 | 30.9918 | 43.7764 | 43.8044 |
| 0.0041 | 25.0 | 4750 | 3.1687 | 48.8731 | 31.1055 | 43.8021 | 43.8987 |
| 0.0041 | 26.0 | 4940 | 3.1845 | 48.9432 | 31.0766 | 43.8628 | 43.9726 |
| 0.0041 | 27.0 | 5130 | 3.2133 | 49.2016 | 31.1265 | 44.052 | 44.1427 |
| 0.0025 | 28.0 | 5320 | 3.2146 | 49.1473 | 31.3109 | 44.0372 | 44.1189 |
| 0.0025 | 29.0 | 5510 | 3.2121 | 49.2815 | 31.4258 | 44.1661 | 44.2436 |
| 0.0019 | 30.0 | 5700 | 3.2121 | 49.1001 | 31.2516 | 44.0446 | 44.1075 |
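The table shows validation loss rising after the early epochs while ROUGE keeps fluctuating, so the final checkpoint is not the best one by either measure. The sketch below copies the epoch, validation loss, and Rouge1 columns from the table and picks the best epoch under each criterion. Which criterion suits a given use is a judgment call that the card does not address; the variable names are hypothetical.

```python
# (epoch, validation_loss, rouge1) copied from the training results table.
RESULTS = [
    (1, 1.8633, 44.8593), (2, 1.6667, 46.8654), (3, 1.6847, 48.1605),
    (4, 1.7845, 48.7615), (5, 2.0090, 48.4207), (6, 2.2425, 49.1967),
    (7, 2.4520, 47.0568), (8, 2.5336, 47.969), (9, 2.6153, 47.2037),
    (10, 2.7299, 48.3657), (11, 2.7696, 48.0929), (12, 2.8627, 48.1979),
    (13, 2.8788, 47.7685), (14, 2.9128, 47.5374), (15, 2.9470, 48.6385),
    (16, 3.0039, 48.6657), (17, 3.0380, 48.2351), (18, 3.0289, 48.6593),
    (19, 3.0733, 49.2114), (20, 3.1089, 48.5431), (21, 3.0684, 48.4197),
    (22, 3.1252, 48.6007), (23, 3.1572, 48.7297), (24, 3.1716, 48.9335),
    (25, 3.1687, 48.8731), (26, 3.1845, 48.9432), (27, 3.2133, 49.2016),
    (28, 3.2146, 49.1473), (29, 3.2121, 49.2815), (30, 3.2121, 49.1001),
]

# Lowest validation loss and highest Rouge1 are reached at different epochs.
best_by_loss = min(RESULTS, key=lambda row: row[1])
best_by_rouge1 = max(RESULTS, key=lambda row: row[2])

print(f"lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[1]})")
print(f"highest Rouge1:         epoch {best_by_rouge1[0]} ({best_by_rouge1[2]})")
```

The widening gap between training loss (0.0019 at epoch 30) and validation loss (3.2121) suggests overfitting in the token-level loss, even though ROUGE on the evaluation set stays roughly stable.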
### Framework versions
- Transformers 4.25.1
- PyTorch 1.13.0
- Datasets 2.8.0
- Tokenizers 0.12.1
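When trying to reproduce these results, it can help to compare the local environment against the versions listed above. The following is a minimal sketch using only the standard library; the PyPI distribution names (notably `torch` for PyTorch) are assumptions, and the helper `compare_versions` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card; PyPI distribution names are assumed.
PINNED = {
    "transformers": "4.25.1",
    "torch": "1.13.0",
    "datasets": "2.8.0",
    "tokenizers": "0.12.1",
}


def compare_versions() -> dict:
    """Map each package to (pinned, installed); installed is None if the package is absent."""
    report = {}
    for name, pinned in PINNED.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        report[name] = (pinned, installed)
    return report


for name, (pinned, installed) in compare_versions().items():
    # Drop local build tags such as "+cu117" before comparing.
    matches = installed is not None and installed.split("+")[0] == pinned
    print(f"{name}: card={pinned} installed={installed} ({'ok' if matches else 'differs'})")
```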