license: apache-2.0
base_model: google/mt5-small
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: ft-wmt14-5
    results: []

ft-wmt14-5

This model is a fine-tuned version of google/mt5-small on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.0597
  • Bleu: 20.6113
  • Gen Len: 30.701

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: Adafactor
  • lr_scheduler_type: constant
  • training_steps: 100000
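
As a rough sketch, the hyperparameters above can be written as keyword arguments in the style of Hugging Face `Seq2SeqTrainingArguments`. The mapping from the card's labels to argument names (for example `training_steps` to `max_steps`), the `output_dir` value, and the single-device setup are assumptions for illustration; the card does not include the original training script.

```python
# Hyperparameters from the card, keyed by the argument names used by
# transformers' Seq2SeqTrainingArguments (name mapping is an assumption).
training_kwargs = {
    "output_dir": "ft-wmt14-5",          # hypothetical output path
    "learning_rate": 5e-4,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "optim": "adafactor",
    "lr_scheduler_type": "constant",
    "max_steps": 100_000,
}

# Assumption: one device, which is consistent with 8 * 2 = 16.
num_devices = 1

# Examples consumed per optimizer step.
effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
    * num_devices
)
print(effective_batch_size)
```

With `transformers` installed, the dictionary could be passed as `Seq2SeqTrainingArguments(**training_kwargs)`. The computed effective batch size matches the reported `total_train_batch_size` of 16 under the single-device assumption.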

Training results

| Training Loss | Epoch  | Step   | Bleu    | Gen Len | Validation Loss |
|---------------|--------|--------|---------|---------|-----------------|
| 1.9166        | 0.2778 | 10000  | 15.8119 | 32.097  | 2.3105          |
| 1.7184        | 0.5556 | 20000  | 17.5903 | 31.1153 | 2.1993          |
| 1.6061        | 0.8333 | 30000  | 18.9604 | 30.327  | 2.1380          |
| 1.516         | 1.1111 | 40000  | 19.1444 | 30.2727 | 2.1366          |
| 1.4675        | 1.3889 | 50000  | 19.7588 | 30.1127 | 2.1208          |
| 1.4416        | 1.6667 | 60000  | 19.9263 | 30.4463 | 2.0889          |
| 1.4111        | 1.9444 | 70000  | 20.3323 | 30.1207 | 2.0795          |
| 1.3603        | 2.2222 | 80000  | 20.5373 | 30.5943 | 2.0850          |
| 1.3378        | 2.5    | 90000  | 20.7584 | 30.499  | 2.0604          |
| 1.3381        | 2.7778 | 100000 | 20.6113 | 30.701  | 2.0597          |
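
The logged values allow a few quick sanity checks. The sketch below transcribes the step, epoch, validation loss, and BLEU figures from the training results, then finds the checkpoint with the highest BLEU, the one with the lowest validation loss, and an estimate of the number of optimizer steps per epoch. The training-set size is only an inference from these numbers and the reported total batch size of 16; the card does not state it.

```python
from statistics import median

# (step, epoch, validation_loss, bleu) transcribed from the training results.
rows = [
    (10000, 0.2778, 2.3105, 15.8119),
    (20000, 0.5556, 2.1993, 17.5903),
    (30000, 0.8333, 2.1380, 18.9604),
    (40000, 1.1111, 2.1366, 19.1444),
    (50000, 1.3889, 2.1208, 19.7588),
    (60000, 1.6667, 2.0889, 19.9263),
    (70000, 1.9444, 2.0795, 20.3323),
    (80000, 2.2222, 2.0850, 20.5373),
    (90000, 2.5, 2.0604, 20.7584),
    (100000, 2.7778, 2.0597, 20.6113),
]

# Checkpoint with the highest BLEU and the one with the lowest validation loss.
best_bleu_step = max(rows, key=lambda r: r[3])[0]
best_loss_step = min(rows, key=lambda r: r[2])[0]

# Each row gives step / epoch, an estimate of optimizer steps per epoch;
# the median smooths out rounding in the logged epoch values.
steps_per_epoch = round(median(step / epoch for step, epoch, _, _ in rows))

# Assumption: every optimizer step consumes total_train_batch_size = 16 examples.
approx_train_examples = steps_per_epoch * 16

print(best_bleu_step, best_loss_step, steps_per_epoch, approx_train_examples)
```

BLEU peaks at step 90000 (20.7584), while the final checkpoint at step 100000 has the lowest validation loss (2.0597) and is the one reported in the summary at the top of the card. The step-to-epoch ratio suggests roughly 36,000 optimizer steps per epoch, so the 100,000 training steps cover a little under three passes over the training data.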

Framework versions

  • Transformers 4.40.0
  • PyTorch 2.2.2+cu121
  • Datasets 2.19.0
  • Tokenizers 0.19.1