license: mit
base_model: facebook/m2m100_1.2B
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: cs_m2m_2e-5_50_v0.2
    results: []

cs_m2m_2e-5_50_v0.2

This model is a fine-tuned version of facebook/m2m100_1.2B. The fine-tuning dataset is not documented. At the end of training (epoch 50, step 300), it achieves the following results on the evaluation set:

  • Loss: 2.3546
  • Bleu: 46.5499
  • Gen Len: 19.8571
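Since the card gives no usage instructions, the following is a minimal sketch of how an M2M100 checkpoint is typically loaded for translation with the Transformers library. Everything specific here is an assumption rather than something the card states: the `translate` helper is hypothetical, the `cs` to `en` language pair is only a guess based on the model name, and `model_id` defaults to the base model named above, so it should be replaced with the repository id of this fine-tuned checkpoint.

```python
def translate(text, src_lang="cs", tgt_lang="en", model_id="facebook/m2m100_1.2B"):
    """Translate one sentence with an M2M100 checkpoint.

    Assumptions (not stated in the card): the language pair, and that the
    fine-tuned checkpoint keeps the base model's tokenizer and language codes.
    """
    # Imported lazily so that defining the helper does not load torch/transformers.
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    tokenizer = M2M100Tokenizer.from_pretrained(model_id)
    model = M2M100ForConditionalGeneration.from_pretrained(model_id)

    # M2M100 needs the source language set on the tokenizer and the target
    # language forced as the first generated token.
    tokenizer.src_lang = src_lang
    encoded = tokenizer(text, return_tensors="pt")
    generated = model.generate(
        **encoded,
        forced_bos_token_id=tokenizer.get_lang_id(tgt_lang),
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
```

Calling `translate("Dobrý den.")` downloads the 1.2B-parameter weights on first use, so it is best tried on a machine with enough memory and disk space.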

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
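
To make the listed settings concrete, here is a small sketch of what a linear schedule implies for the learning rate over the 300 optimizer steps reported in the training results, plus the range of training-set sizes consistent with 6 steps per epoch at batch size 16. The helper names are hypothetical, and several points are assumptions because the card does not list them: zero warmup steps, a single device, no gradient accumulation, and a final partial batch that is kept.

```python
BASE_LR = 2e-5      # learning_rate from the card
TOTAL_STEPS = 300   # last Step value in the training results (50 epochs x 6 steps)
WARMUP_STEPS = 0    # assumption: the card lists no warmup setting


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate at a given optimizer step under linear warmup then linear decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def train_set_size_bounds(steps_per_epoch, batch_size):
    """Smallest and largest dataset sizes that yield `steps_per_epoch` batches.

    Assumes one device, no gradient accumulation, and that the last
    partial batch is not dropped.
    """
    smallest = (steps_per_epoch - 1) * batch_size + 1
    largest = steps_per_epoch * batch_size
    return smallest, largest


if __name__ == "__main__":
    for step in (0, 150, 300):
        print(step, linear_lr(step))
    print(train_set_size_bounds(6, 16))
```

Under these assumptions the rate decays from 2e-05 at step 0 to zero at step 300, and the training set would hold between 81 and 96 examples.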

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|---|---|---|---|---|---|
| 0.0 | 1.0 | 6 | 2.4739 | 49.3833 | 19.8571 |
| 0.0 | 2.0 | 12 | 2.4967 | 47.4622 | 20.7143 |
| 0.0087 | 3.0 | 18 | 2.6016 | 47.8384 | 20.9524 |
| 0.0 | 4.0 | 24 | 2.6004 | 49.8858 | 19.9048 |
| 0.0825 | 5.0 | 30 | 2.4731 | 50.7434 | 19.9524 |
| 0.0 | 6.0 | 36 | 2.4229 | 45.2602 | 20.7619 |
| 0.0002 | 7.0 | 42 | 2.4148 | 45.5274 | 20.5238 |
| 0.0001 | 8.0 | 48 | 2.3583 | 47.4096 | 19.9524 |
| 0.0 | 9.0 | 54 | 2.3559 | 49.1212 | 20.1905 |
| 0.0 | 10.0 | 60 | 2.3610 | 47.0296 | 20.0952 |
| 0.0001 | 11.0 | 66 | 2.3423 | 47.2022 | 19.8571 |
| 0.0002 | 12.0 | 72 | 2.2938 | 48.5473 | 20.0952 |
| 0.0 | 13.0 | 78 | 2.2591 | 49.6382 | 19.4762 |
| 0.0001 | 14.0 | 84 | 2.2492 | 49.5102 | 19.6667 |
| 0.0001 | 15.0 | 90 | 2.2740 | 49.1707 | 19.6667 |
| 0.0 | 16.0 | 96 | 2.2876 | 48.9631 | 19.3333 |
| 0.0023 | 17.0 | 102 | 2.2842 | 48.7639 | 19.6667 |
| 0.0001 | 18.0 | 108 | 2.2830 | 45.9993 | 19.5238 |
| 0.0 | 19.0 | 114 | 2.2872 | 49.1391 | 19.7619 |
| 0.0 | 20.0 | 120 | 2.2893 | 49.1623 | 19.8095 |
| 0.0 | 21.0 | 126 | 2.2948 | 48.5803 | 20.0 |
| 0.0 | 22.0 | 132 | 2.3048 | 48.9732 | 20.0476 |
| 0.0 | 23.0 | 138 | 2.3114 | 49.1156 | 19.9524 |
| 0.0 | 24.0 | 144 | 2.3169 | 49.1156 | 19.9524 |
| 0.0 | 25.0 | 150 | 2.3202 | 48.4435 | 20.0 |
| 0.0 | 26.0 | 156 | 2.3227 | 48.4435 | 20.0 |
| 0.0 | 27.0 | 162 | 2.3236 | 48.4435 | 20.0 |
| 0.0 | 28.0 | 168 | 2.3244 | 48.4435 | 20.0 |
| 0.0 | 29.0 | 174 | 2.3268 | 48.4435 | 20.0 |
| 0.0002 | 30.0 | 180 | 2.3296 | 45.9582 | 19.8571 |
| 0.0 | 31.0 | 186 | 2.3319 | 45.9582 | 19.8571 |
| 0.0 | 32.0 | 192 | 2.3338 | 45.9582 | 19.8571 |
| 0.0 | 33.0 | 198 | 2.3401 | 46.8428 | 19.8571 |
| 0.0 | 34.0 | 204 | 2.3473 | 46.586 | 19.8095 |
| 0.0001 | 35.0 | 210 | 2.3513 | 46.586 | 19.8095 |
| 0.0 | 36.0 | 216 | 2.3539 | 48.1767 | 20.0476 |
| 0.0 | 37.0 | 222 | 2.3554 | 48.1966 | 19.9048 |
| 0.0 | 38.0 | 228 | 2.3563 | 48.1966 | 19.9048 |
| 0.0 | 39.0 | 234 | 2.3563 | 48.1966 | 19.9048 |
| 0.0 | 40.0 | 240 | 2.3550 | 46.5682 | 19.8095 |
| 0.0001 | 41.0 | 246 | 2.3541 | 46.5499 | 19.9524 |
| 0.0 | 42.0 | 252 | 2.3534 | 46.5499 | 19.8571 |
| 0.0001 | 43.0 | 258 | 2.3533 | 46.5499 | 19.8571 |
| 0.0 | 44.0 | 264 | 2.3533 | 46.5499 | 19.8571 |
| 0.0 | 45.0 | 270 | 2.3537 | 46.5499 | 19.8571 |
| 0.0001 | 46.0 | 276 | 2.3540 | 46.5499 | 19.8571 |
| 0.0 | 47.0 | 282 | 2.3543 | 46.5499 | 19.8571 |
| 0.0 | 48.0 | 288 | 2.3544 | 46.5499 | 19.8571 |
| 0.0 | 49.0 | 294 | 2.3545 | 46.5499 | 19.8571 |
| 0.0 | 50.0 | 300 | 2.3546 | 46.5499 | 19.8571 |
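
The headline results match the final epoch, and the card does not say whether any earlier checkpoint was kept. As a rough illustration of how the per-epoch results could be scanned for a better stopping point, the sketch below copies the epoch, validation loss, and BLEU columns from the results above and picks the best row by each metric. The `best_by` helper is hypothetical and is not part of the original training setup.

```python
# (epoch, validation_loss, bleu) copied from the training results
RESULTS = [
    (1, 2.4739, 49.3833), (2, 2.4967, 47.4622), (3, 2.6016, 47.8384),
    (4, 2.6004, 49.8858), (5, 2.4731, 50.7434), (6, 2.4229, 45.2602),
    (7, 2.4148, 45.5274), (8, 2.3583, 47.4096), (9, 2.3559, 49.1212),
    (10, 2.3610, 47.0296), (11, 2.3423, 47.2022), (12, 2.2938, 48.5473),
    (13, 2.2591, 49.6382), (14, 2.2492, 49.5102), (15, 2.2740, 49.1707),
    (16, 2.2876, 48.9631), (17, 2.2842, 48.7639), (18, 2.2830, 45.9993),
    (19, 2.2872, 49.1391), (20, 2.2893, 49.1623), (21, 2.2948, 48.5803),
    (22, 2.3048, 48.9732), (23, 2.3114, 49.1156), (24, 2.3169, 49.1156),
    (25, 2.3202, 48.4435), (26, 2.3227, 48.4435), (27, 2.3236, 48.4435),
    (28, 2.3244, 48.4435), (29, 2.3268, 48.4435), (30, 2.3296, 45.9582),
    (31, 2.3319, 45.9582), (32, 2.3338, 45.9582), (33, 2.3401, 46.8428),
    (34, 2.3473, 46.586), (35, 2.3513, 46.586), (36, 2.3539, 48.1767),
    (37, 2.3554, 48.1966), (38, 2.3563, 48.1966), (39, 2.3563, 48.1966),
    (40, 2.3550, 46.5682), (41, 2.3541, 46.5499), (42, 2.3534, 46.5499),
    (43, 2.3533, 46.5499), (44, 2.3533, 46.5499), (45, 2.3537, 46.5499),
    (46, 2.3540, 46.5499), (47, 2.3543, 46.5499), (48, 2.3544, 46.5499),
    (49, 2.3545, 46.5499), (50, 2.3546, 46.5499),
]


def best_by(results, column, highest=False):
    """Return the row with the lowest (or highest) value in the given column.

    On ties, the earliest epoch wins because min/max keep the first match.
    """
    pick = max if highest else min
    return pick(results, key=lambda row: row[column])


best_loss = best_by(RESULTS, column=1)                # lowest validation loss
best_bleu = best_by(RESULTS, column=2, highest=True)  # highest BLEU
final = RESULTS[-1]

if __name__ == "__main__":
    print("lowest validation loss:", best_loss)
    print("highest BLEU:", best_bleu)
    print("final epoch:", final)
```

On these numbers, validation loss is lowest at epoch 14 (2.2492) and BLEU is highest at epoch 5 (50.7434), while the final epoch sits at 2.3546 and 46.5499. Whether an earlier checkpoint would actually serve better depends on the undocumented evaluation set.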

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2