mt5-small-finetuned-amazon-en-es

This model is a fine-tuned version of google/mt5-small; the training dataset is not specified in the card (the model name suggests Amazon product reviews in English and Spanish). It achieves the following results on the evaluation set (a usage sketch follows the list):

  • Loss: 2.3554
  • ROUGE-1: 24.0279
  • ROUGE-2: 14.2784
  • ROUGE-L: 23.3319
  • ROUGE-Lsum: 23.3852
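
The card does not document usage, so the following is a minimal inference sketch, assuming the checkpoint is used for summarization (as the model name suggests); the sample review text is purely illustrative:

```python
from transformers import pipeline

# Hedged sketch: the summarization task and the generation settings
# are assumptions based on the model name, not stated in this card.
summarizer = pipeline(
    "summarization",
    model="aalbero/mt5-small-finetuned-amazon-en-es",
)

review = (
    "I bought this notebook for work. The paper quality is great, "
    "but the cover is too stiff and it is hard to write in."
)
print(summarizer(review, max_length=30, min_length=5)[0]["summary_text"])
```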

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5.6e-05
  • train_batch_size: 28
  • eval_batch_size: 28
  • seed: 42
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 8
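
These values map onto Seq2SeqTrainingArguments roughly as sketched below; output_dir is an assumed name, and every argument not listed above is left at its library default:

```python
from transformers import Seq2SeqTrainingArguments

# Reconstruction of the reported hyperparameters; output_dir is an
# assumption, and unlisted arguments keep their library defaults.
training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-finetuned-amazon-en-es",  # assumed name
    learning_rate=5.6e-5,
    per_device_train_batch_size=28,
    per_device_eval_batch_size=28,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=8,
)
```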

Training results

| Training Loss | Epoch | Step  | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum |
|--------------:|------:|------:|----------------:|--------:|--------:|--------:|-----------:|
| 4.0088        | 1.0   | 4679  | 2.5872          | 21.5939 | 12.0659 | 20.9575 | 20.9943    |
| 2.92          | 2.0   | 9358  | 2.4859          | 22.4633 | 12.6865 | 21.7729 | 21.8407    |
| 2.7704        | 3.0   | 14037 | 2.4248          | 22.9471 | 13.2621 | 22.2334 | 22.274     |
| 2.6884        | 4.0   | 18716 | 2.3995          | 23.9663 | 14.0107 | 23.2218 | 23.2728    |
| 2.6323        | 5.0   | 23395 | 2.3759          | 24.057  | 14.1794 | 23.3209 | 23.3744    |
| 2.5971        | 6.0   | 28074 | 2.3649          | 24.0271 | 14.2326 | 23.3148 | 23.3478    |
| 2.5713        | 7.0   | 32753 | 2.3595          | 23.9268 | 14.1398 | 23.1927 | 23.2442    |
| 2.5577        | 8.0   | 37432 | 2.3554          | 24.0279 | 14.2784 | 23.3319 | 23.3852    |
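
The card does not state how these ROUGE scores were computed; a common recipe with the Hugging Face evaluate library (shown here on toy strings, not the actual evaluation data) looks like this:

```python
import evaluate

# Hedged sketch with toy strings; the real predictions/references would
# come from the model's decoded summaries and the evaluation set.
rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["the cat sat on the mat"],
    references=["a cat was sitting on the mat"],
)
print(scores)  # dict with rouge1, rouge2, rougeL, rougeLsum
```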

Framework versions

  • Transformers 4.46.3
  • PyTorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.20.3

Model tree for aalbero/mt5-small-finetuned-amazon-en-es

  • Base model: google/mt5-small