---
license: mit
tags:
- simplification
- generated_from_trainer
metrics:
- rouge
model-index:
- name: mbart-large-50-clara-med
  results: []
---
|
|
|
|
|
|
# mbart-large-50-clara-med |
|
|
|
This model is a fine-tuned version of [facebook/mbart-large-50](https://huggingface.co/facebook/mbart-large-50) on the CLARA-MeD dataset (inferred from the model name; the Trainer did not record a dataset).
|
It achieves the following results on the evaluation set:
- Loss: 3.2121
- Rouge1: 49.1001
- Rouge2: 31.2516
- RougeL: 44.0446
- RougeLsum: 44.1075
|
|
|
## Model description |
|
|
|
[facebook/mbart-large-50](https://huggingface.co/facebook/mbart-large-50) is a multilingual sequence-to-sequence Transformer pre-trained on 50 languages. This checkpoint fine-tunes it as a text simplification system: given a technical sentence, it generates a plain-language paraphrase.
|
|
|
## Intended uses & limitations |
|
|
|
The model is intended for automatic text simplification, presumably of Spanish medical text given the CLARA-MeD name. As with any generative model, simplifications can omit or distort facts, so outputs in sensitive domains such as health should be reviewed by an expert before use.
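For illustration, a minimal inference sketch; the repo id, the `es_XX` language code, and the generation settings are assumptions rather than recorded values:

```python
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

model_id = "mbart-large-50-clara-med"  # hypothetical repo id; adjust to the published one
tokenizer = MBart50TokenizerFast.from_pretrained(model_id, src_lang="es_XX")
model = MBartForConditionalGeneration.from_pretrained(model_id)

text = "La hipertensión arterial es una patología crónica y frecuentemente asintomática."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
outputs = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["es_XX"],  # generate in the same language
    max_length=128,
    num_beams=4,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```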
|
|
|
## Training and evaluation data |
|
|
|
Not documented by the Trainer. The model name points to the CLARA-MeD corpus of paired technical and plain-language medical texts. From the log below, each epoch covers 190 optimization steps at batch size 16, i.e. roughly 3,000 training pairs.
|
|
|
## Training procedure |
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training (reproduced in the sketch after this list):
- learning_rate: 5.6e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
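
A hedged reconstruction as `Seq2SeqTrainingArguments` (argument names follow Transformers 4.25; `output_dir` and the evaluation strategy are assumptions, not recorded values):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mbart-large-50-clara-med",  # assumed output path
    learning_rate=5.6e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=30,
    evaluation_strategy="epoch",   # assumed: the table below logs once per epoch
    predict_with_generate=True,    # assumed: required to compute ROUGE at eval time
)
```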
|
|
|
### Training results |
|
|
|
| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| No log        | 1.0   | 190  | 1.8633          | 44.8593 | 28.0451 | 40.7724 | 40.8654   |
| No log        | 2.0   | 380  | 1.6667          | 46.8654 | 29.5857 | 42.6056 | 42.7844   |
| 3.317         | 3.0   | 570  | 1.6847          | 48.1605 | 30.163  | 43.1965 | 43.3317   |
| 3.317         | 4.0   | 760  | 1.7845          | 48.7615 | 30.8887 | 43.6946 | 43.8016   |
| 0.7441        | 5.0   | 950  | 2.0090          | 48.4207 | 30.64   | 43.654  | 43.7979   |
| 0.7441        | 6.0   | 1140 | 2.2425          | 49.1967 | 31.2644 | 44.0566 | 44.2112   |
| 0.7441        | 7.0   | 1330 | 2.4520          | 47.0568 | 28.7501 | 41.8219 | 41.9605   |
| 0.2396        | 8.0   | 1520 | 2.5336          | 47.969  | 30.0618 | 42.9924 | 43.1481   |
| 0.2396        | 9.0   | 1710 | 2.6153          | 47.2037 | 28.9732 | 42.0939 | 42.2242   |
| 0.1112        | 10.0  | 1900 | 2.7299          | 48.3657 | 30.3342 | 43.2025 | 43.3223   |
| 0.1112        | 11.0  | 2090 | 2.7696          | 48.0929 | 30.0156 | 42.9385 | 43.026    |
| 0.1112        | 12.0  | 2280 | 2.8627          | 48.1979 | 30.2714 | 43.0959 | 43.2027   |
| 0.0938        | 13.0  | 2470 | 2.8788          | 47.7685 | 29.5733 | 42.7561 | 42.9112   |
| 0.0938        | 14.0  | 2660 | 2.9128          | 47.5374 | 29.8217 | 42.7097 | 42.7803   |
| 0.0394        | 15.0  | 2850 | 2.9470          | 48.6385 | 30.1425 | 43.3326 | 43.3963   |
| 0.0394        | 16.0  | 3040 | 3.0039          | 48.6657 | 30.6642 | 43.471  | 43.592    |
| 0.0394        | 17.0  | 3230 | 3.0380          | 48.2351 | 30.5653 | 43.257  | 43.3788   |
| 0.023         | 18.0  | 3420 | 3.0289          | 48.6593 | 30.6916 | 43.7861 | 43.9098   |
| 0.023         | 19.0  | 3610 | 3.0733          | 49.2114 | 31.2737 | 44.0852 | 44.1993   |
| 0.0122        | 20.0  | 3800 | 3.1089          | 48.5431 | 30.5305 | 43.4128 | 43.5288   |
| 0.0122        | 21.0  | 3990 | 3.0684          | 48.4197 | 30.4005 | 43.2305 | 43.3214   |
| 0.0122        | 22.0  | 4180 | 3.1252          | 48.6007 | 30.5594 | 43.4008 | 43.5336   |
| 0.0071        | 23.0  | 4370 | 3.1572          | 48.7297 | 30.7028 | 43.436  | 43.5106   |
| 0.0071        | 24.0  | 4560 | 3.1716          | 48.9335 | 30.9918 | 43.7764 | 43.8044   |
| 0.0041        | 25.0  | 4750 | 3.1687          | 48.8731 | 31.1055 | 43.8021 | 43.8987   |
| 0.0041        | 26.0  | 4940 | 3.1845          | 48.9432 | 31.0766 | 43.8628 | 43.9726   |
| 0.0041        | 27.0  | 5130 | 3.2133          | 49.2016 | 31.1265 | 44.052  | 44.1427   |
| 0.0025        | 28.0  | 5320 | 3.2146          | 49.1473 | 31.3109 | 44.0372 | 44.1189   |
| 0.0025        | 29.0  | 5510 | 3.2121          | 49.2815 | 31.4258 | 44.1661 | 44.2436   |
| 0.0019        | 30.0  | 5700 | 3.2121          | 49.1001 | 31.2516 | 44.0446 | 44.1075   |
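
Validation loss is lowest at epoch 2 (1.6667) and rises steadily afterwards while ROUGE improves only marginally, so earlier checkpoints may be worth considering depending on the selection criterion. The ROUGE columns are scores scaled to 0-100; a minimal sketch of how such figures can be computed with the `evaluate` library (the example strings are placeholders):

```python
import evaluate

rouge = evaluate.load("rouge")
predictions = ["la tensión alta es una enfermedad crónica"]         # model outputs (placeholder)
references = ["la hipertensión arterial es una patología crónica"]  # gold simplifications (placeholder)
scores = rouge.compute(predictions=predictions, references=references)
print({k: round(v * 100, 4) for k, v in scores.items()})  # rouge1, rouge2, rougeL, rougeLsum
```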
|
|
|
|
|
### Framework versions |
|
|
|
- Transformers 4.25.1
- PyTorch 1.13.0
- Datasets 2.8.0
- Tokenizers 0.12.1
|
|