---
license: apache-2.0
tags:
  - simplification
  - generated_from_trainer
metrics:
  - rouge
  - sari
model-index:
  - name: flan-t5-large-clara-med
    results: []
datasets:
  - lcampillos/CLARA-MeD
language:
  - es
---

# flan-t5-large-clara-med

This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on the [lcampillos/CLARA-MeD](https://huggingface.co/datasets/lcampillos/CLARA-MeD) dataset. It achieves the following results on the evaluation set:

- Loss: 1.0898
- Rouge1: 28.9888
- Rouge2: 16.3801
- Rougel: 27.4186
- Rougelsum: 27.4043
- Sari: 39.1731
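
Below is a minimal inference sketch (not part of the original card). It assumes the checkpoint is hosted under the hypothetical repository id `joheras/flan-t5-large-clara-med` and that no task prefix was used during fine-tuning; adjust both if the actual setup differs.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical repository id; point this at wherever the checkpoint lives.
model_id = "joheras/flan-t5-large-clara-med"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Placeholder Spanish medical sentence to simplify.
text = "El paciente presenta disnea y cefalea intensa."

inputs = tokenizer(text, return_tensors="pt", truncation=True)
outputs = model.generate(**inputs, max_new_tokens=128, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```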

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
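
A hedged sketch of loading the CLARA-MeD dataset with the `datasets` library, using the id from the card metadata; splits and column names are not documented here, so the snippet only inspects the structure.

```python
from datasets import load_dataset

# Dataset id taken from the card metadata; availability, splits, and
# column names are assumptions to verify against the dataset card.
ds = load_dataset("lcampillos/CLARA-MeD")
print(ds)  # shows splits, column names, and sizes
```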

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `Seq2SeqTrainingArguments` sketch follows the list):

- learning_rate: 5.6e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
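
A sketch, not the authors' actual training script: one way these hyperparameters map onto `transformers`' `Seq2SeqTrainingArguments`. The output directory, evaluation strategy, and generation flag are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-large-clara-med",  # assumed output directory
    learning_rate=5.6e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=5,
    evaluation_strategy="epoch",  # assumed: the results table shows one eval per epoch
    predict_with_generate=True,   # assumed: ROUGE/SARI need generated text, not logits
)
# The default AdamW optimizer already uses betas=(0.9, 0.999) and epsilon=1e-08.
```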

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| No log        | 1.0   | 380  | 1.1948          | 28.2861 | 15.6461 | 26.7126 | 26.7389   |
| No log        | 2.0   | 760  | 1.1361          | 28.3528 | 15.8519 | 26.8151 | 26.8069   |
| 1.3561        | 3.0   | 1140 | 1.1051          | 29.6216 | 16.8227 | 28.0662 | 28.0613   |
| 1.3561        | 4.0   | 1520 | 1.0915          | 29.3603 | 16.5008 | 27.7915 | 27.7761   |
| 1.0939        | 5.0   | 1900 | 1.0898          | 28.9888 | 16.3801 | 27.4186 | 27.4043   |
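
The reported ROUGE and SARI scores can be computed with the `evaluate` library; the sketch below uses placeholder data rather than CLARA-MeD examples. Note that SARI additionally needs the original source sentences.

```python
import evaluate

rouge = evaluate.load("rouge")
sari = evaluate.load("sari")

# Placeholder data; real evaluation would iterate over the CLARA-MeD eval split.
sources = ["El paciente presenta disnea."]
predictions = ["El paciente tiene dificultad para respirar."]
references = [["El paciente tiene problemas para respirar."]]

print(rouge.compute(predictions=predictions, references=references))
# SARI compares predictions against both the sources and the references.
print(sari.compute(sources=sources, predictions=predictions, references=references))
```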

### Framework versions

- Transformers 4.25.1
- Pytorch 1.13.0
- Datasets 2.8.0
- Tokenizers 0.12.1