---
tags:
- simplification
- generated_from_trainer
metrics:
- rouge
model-index:
- name: marimari-r2r-mlsum-clara-med
  results: []
---
# marimari-r2r-mlsum-clara-med

This model is a fine-tuned version of [IIC/marimari-r2r-mlsum](https://huggingface.co/IIC/marimari-r2r-mlsum). The training dataset was not recorded in the Trainer metadata.

It achieves the following results on the evaluation set after the final epoch (epoch 30, step 5700):

- Loss: 3.9618
- Rouge1: 42.6764
- Rouge2: 24.4569
- Rougel: 37.0033
- Rougelsum: 37.1595
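
As a rough illustration of what the Rouge1 figure measures, the sketch below computes a simplified ROUGE-1 F1 score from unigram overlap. It assumes lowercased whitespace tokenization and a 0 to 100 scale. It is not the implementation used to produce the numbers above, which may apply different tokenization or stemming. The two sentences are hypothetical and do not come from the evaluation set.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap on lowercased whitespace tokens."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: a token counts at most as often as it appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical sentences, not taken from the evaluation data.
reference = "el paciente debe tomar el medicamento cada ocho horas"
candidate = "el paciente toma el medicamento cada ocho horas"

# Scale to 0-100 (assumption: the card reports ROUGE on that scale).
score = 100 * rouge1_f1(reference, candidate)
print(f"Simplified ROUGE-1 F1: {score:.4f}")
```

Rouge2 applies the same idea to bigrams, and Rougel is based on the longest common subsequence rather than n-gram counts.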

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5.6e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
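
To make the `linear` scheduler setting concrete, here is a minimal sketch of how the learning rate would decay over the run. The total of 5700 steps and the 190 steps per epoch are read from the training results table. The zero warmup is an assumption, since the card lists no warmup setting. The function `linear_lr` is a hypothetical helper written for this illustration, not a library call.

```python
LEARNING_RATE = 5.6e-05  # learning_rate from the hyperparameter list
TOTAL_STEPS = 5700       # final step in the training results table
STEPS_PER_EPOCH = 190    # step count reported after epoch 1.0
WARMUP_STEPS = 0         # assumption: the card does not list any warmup


def linear_lr(step: int) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


for epoch in (0, 10, 15, 30):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:2d} (step {step:4d}): lr = {linear_lr(step):.3e}")
```

Under these assumptions the rate is halved by epoch 15 and reaches zero at step 5700, so the late epochs make only small updates.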

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| No log        | 1.0   | 190  | 2.3970          | 40.7426 | 23.212  | 35.7093 | 35.8437   |
| No log        | 2.0   | 380  | 2.3165          | 42.5676 | 24.6494 | 37.1225 | 37.2619   |
| 1.9699        | 3.0   | 570  | 2.4711          | 42.0346 | 23.7633 | 36.3472 | 36.4433   |
| 1.9699        | 4.0   | 760  | 2.7339          | 41.1717 | 22.8419 | 35.3263 | 35.4823   |
| 0.6485        | 5.0   | 950  | 2.9593          | 40.714  | 22.6931 | 34.8859 | 35.0647   |
| 0.6485        | 6.0   | 1140 | 3.1316          | 41.3218 | 23.2054 | 35.3103 | 35.5063   |
| 0.6485        | 7.0   | 1330 | 3.2542          | 41.2786 | 23.4853 | 35.8236 | 35.972    |
| 0.1529        | 8.0   | 1520 | 3.3470          | 41.2991 | 22.8385 | 35.0524 | 35.2153   |
| 0.1529        | 9.0   | 1710 | 3.4324          | 41.3838 | 23.1045 | 35.3472 | 35.5779   |
| 0.0719        | 10.0  | 1900 | 3.5187          | 42.0833 | 23.8538 | 36.3282 | 36.5294   |
| 0.0719        | 11.0  | 2090 | 3.5527          | 41.2993 | 23.0323 | 35.3116 | 35.4687   |
| 0.0719        | 12.0  | 2280 | 3.6624          | 41.6524 | 23.8925 | 35.9281 | 36.1012   |
| 0.0393        | 13.0  | 2470 | 3.6536          | 41.188  | 23.2066 | 35.371  | 35.5616   |
| 0.0393        | 14.0  | 2660 | 3.6656          | 40.8222 | 22.5651 | 35.0515 | 35.1399   |
| 0.0266        | 15.0  | 2850 | 3.7349          | 41.844  | 23.7839 | 36.102  | 36.3169   |
| 0.0266        | 16.0  | 3040 | 3.7254          | 41.5535 | 23.3996 | 35.9619 | 36.0981   |
| 0.0266        | 17.0  | 3230 | 3.7919          | 41.5683 | 23.2824 | 36.0855 | 36.2475   |
| 0.0151        | 18.0  | 3420 | 3.8152          | 42.1272 | 24.0548 | 36.5784 | 36.785    |
| 0.0151        | 19.0  | 3610 | 3.8213          | 41.9185 | 23.5975 | 36.1182 | 36.3194   |
| 0.0087        | 20.0  | 3800 | 3.8501          | 41.3409 | 23.0081 | 35.7662 | 35.9451   |
| 0.0087        | 21.0  | 3990 | 3.8690          | 41.9496 | 23.7032 | 36.0116 | 36.1843   |
| 0.0087        | 22.0  | 4180 | 3.8809          | 42.5366 | 24.6413 | 37.2644 | 37.459    |
| 0.0044        | 23.0  | 4370 | 3.8865          | 42.4346 | 24.2278 | 36.7284 | 36.8846   |
| 0.0044        | 24.0  | 4560 | 3.9044          | 42.9781 | 24.8423 | 37.3582 | 37.4807   |
| 0.0024        | 25.0  | 4750 | 3.9138          | 42.6738 | 24.4737 | 36.8959 | 37.0031   |
| 0.0024        | 26.0  | 4940 | 3.9361          | 42.5267 | 24.4155 | 36.8414 | 36.9915   |
| 0.0024        | 27.0  | 5130 | 3.9477          | 42.4844 | 24.5483 | 36.8857 | 37.0219   |
| 0.0013        | 28.0  | 5320 | 3.9561          | 42.7199 | 24.5977 | 37.1206 | 37.2374   |
| 0.0013        | 29.0  | 5510 | 3.9599          | 42.7088 | 24.4474 | 37.0513 | 37.1971   |
| 0.001         | 30.0  | 5700 | 3.9618          | 42.6764 | 24.4569 | 37.0033 | 37.1595   |
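
The table shows validation loss and ROUGE moving in different directions, which matters when choosing a checkpoint. The sketch below copies the epoch, validation loss, and Rouge1 columns from the table and looks up the best epoch under each criterion. It is only an illustrative reading of the reported numbers, not part of the original training code.

```python
# (epoch, validation_loss, rouge1) copied from the training results table.
RESULTS = [
    (1, 2.3970, 40.7426), (2, 2.3165, 42.5676), (3, 2.4711, 42.0346),
    (4, 2.7339, 41.1717), (5, 2.9593, 40.7140), (6, 3.1316, 41.3218),
    (7, 3.2542, 41.2786), (8, 3.3470, 41.2991), (9, 3.4324, 41.3838),
    (10, 3.5187, 42.0833), (11, 3.5527, 41.2993), (12, 3.6624, 41.6524),
    (13, 3.6536, 41.1880), (14, 3.6656, 40.8222), (15, 3.7349, 41.8440),
    (16, 3.7254, 41.5535), (17, 3.7919, 41.5683), (18, 3.8152, 42.1272),
    (19, 3.8213, 41.9185), (20, 3.8501, 41.3409), (21, 3.8690, 41.9496),
    (22, 3.8809, 42.5366), (23, 3.8865, 42.4346), (24, 3.9044, 42.9781),
    (25, 3.9138, 42.6738), (26, 3.9361, 42.5267), (27, 3.9477, 42.4844),
    (28, 3.9561, 42.7199), (29, 3.9599, 42.7088), (30, 3.9618, 42.6764),
]

best_by_loss = min(RESULTS, key=lambda row: row[1])   # lowest validation loss
best_by_rouge = max(RESULTS, key=lambda row: row[2])  # highest Rouge1
final = RESULTS[-1]                                   # checkpoint the card reports

for label, (epoch, loss, rouge1) in [
    ("lowest validation loss", best_by_loss),
    ("highest Rouge1", best_by_rouge),
    ("final epoch (reported)", final),
]:
    print(f"{label:24s} epoch {epoch:2d}  val_loss {loss:.4f}  rouge1 {rouge1:.4f}")
```

Validation loss is lowest at epoch 2 and rises afterwards while training loss falls towards zero, a pattern consistent with overfitting. Rouge1 stays within roughly 40.7 to 43.0 throughout and peaks at epoch 24. The headline results correspond to the final epoch rather than to either of these optima.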

### Framework versions

- Transformers 4.25.1
- PyTorch 1.13.0
- Datasets 2.8.0
- Tokenizers 0.12.1