---
language:
  - en
  - bgc
license: apache-2.0
base_model: google/mt5-base
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: tst-translation
    results: []
---

tst-translation

This model is a fine-tuned version of google/mt5-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.8421
  • Bleu: 13.1948
  • Gen Len: 49.9179
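The Bleu figure above is a corpus-level score; the trainer's `compute_metrics` typically computes it with sacreBLEU, which adds its own tokenization and smoothing. As a rough illustration of what the metric measures, here is a minimal sentence-level BLEU sketch (uniform 4-gram weights, no smoothing — a simplification, not the exact scorer used for the number above):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(reference, hypothesis, max_n=4):
    """Sentence-level BLEU: geometric mean of 1..max_n n-gram
    precisions, scaled by a brevity penalty for short hypotheses."""
    ref, hyp = reference.split(), hypothesis.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_ngrams = ngrams(hyp, n)
        ref_ngrams = ngrams(ref, n)
        # Clipped matches: a hypothesis n-gram counts at most as often
        # as it appears in the reference.
        overlap = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
        total = max(sum(hyp_ngrams.values()), 1)
        if overlap == 0:
            return 0.0  # unsmoothed: any zero precision zeroes the score
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: punish hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(sum(log_precisions) / max_n)
```

For example, `bleu("the cat sat on the mat", "the cat sat on the mat")` returns 1.0, and a hypothesis sharing no n-grams with the reference scores 0.0.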

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20.0
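With `lr_scheduler_type: linear`, the learning rate decays linearly from `learning_rate` to 0 over the total number of optimizer steps (about 4,000 here, per the table below). Assuming no warmup (none is listed), the schedule can be sketched as:

```python
def linear_lr(step, total_steps=4000, base_lr=5e-4, warmup_steps=0):
    """Linear decay from base_lr to 0 over total_steps, with an
    optional linear warmup. Values match this run's config; warmup
    is assumed to be 0 since none is listed in the card."""
    if step < warmup_steps:
        return base_lr * step / max(warmup_steps, 1)
    remaining = max(total_steps - step, 0)
    return base_lr * remaining / max(total_steps - warmup_steps, 1)
```

So the rate is 5e-4 at step 0, halves to 2.5e-4 at step 2,000, and reaches 0 at step 4,000.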

Training results

| Training Loss | Epoch   | Step | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-------:|:----:|:---------------:|:-------:|:-------:|
| 3.4257        | 1.9900  | 400  | 2.1087          | 4.1008  | 77.8284 |
| 1.8571        | 3.9801  | 800  | 1.9292          | 8.6198  | 61.1418 |
| 1.2467        | 5.9701  | 1200 | 1.9779          | 10.7074 | 48.3184 |
| 0.8749        | 7.9602  | 1600 | 2.0539          | 11.8538 | 49.3483 |
| 0.6141        | 9.9502  | 2000 | 2.1948          | 12.4452 | 51.1269 |
| 0.4446        | 11.9403 | 2400 | 2.3902          | 12.3052 | 48.0995 |
| 0.3251        | 13.9303 | 2800 | 2.5698          | 12.5824 | 49.1244 |
| 0.2501        | 15.9204 | 3200 | 2.6631          | 13.0619 | 50.6095 |
| 0.1986        | 17.9104 | 3600 | 2.7877          | 13.0557 | 51.1443 |
| 0.1692        | 19.9005 | 4000 | 2.8421          | 13.1948 | 49.9179 |
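Note that validation loss bottoms out around epoch 4 (step 800) and then rises, while BLEU keeps improving through the final epoch — a common pattern when fine-tuning on a small dataset, and a reason to select checkpoints by the task metric rather than by loss. A small sketch over the rows above makes the disagreement explicit:

```python
# (epoch, step, val_loss, bleu) rows copied from the results table above
results = [
    (1.99, 400, 2.1087, 4.1008),
    (3.98, 800, 1.9292, 8.6198),
    (5.97, 1200, 1.9779, 10.7074),
    (7.96, 1600, 2.0539, 11.8538),
    (9.95, 2000, 2.1948, 12.4452),
    (11.94, 2400, 2.3902, 12.3052),
    (13.93, 2800, 2.5698, 12.5824),
    (15.92, 3200, 2.6631, 13.0619),
    (17.91, 3600, 2.7877, 13.0557),
    (19.90, 4000, 2.8421, 13.1948),
]

# The two selection criteria point at different checkpoints:
best_by_loss = min(results, key=lambda r: r[2])  # step 800
best_by_bleu = max(results, key=lambda r: r[3])  # step 4000
```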

Framework versions

  • Transformers 4.43.0.dev0
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1