
jako_13p_tokenie_run1

This model is a fine-tuned version of facebook/mbart-large-50-many-to-many-mmt on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.3560
  • Bleu: 50.2114
  • Gen Len: 18.2159
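A minimal inference sketch using the Transformers MBart-50 API is shown below. The repository id and the ja_XX → ko_KR translation direction are assumptions inferred from the card's naming ("jako"); adjust them to match the actual checkpoint and language pair.

```python
# Minimal inference sketch (assumed repo id and language codes; adjust as needed).
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

model_id = "yesj1234/jako_mbartLarge_13p_tokenize_run1"  # assumed repository id
tokenizer = MBart50TokenizerFast.from_pretrained(model_id, src_lang="ja_XX", tgt_lang="ko_KR")
model = MBartForConditionalGeneration.from_pretrained(model_id)

text = "今日はいい天気ですね。"
inputs = tokenizer(text, return_tensors="pt")
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["ko_KR"],  # force the target language
    max_length=64,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```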

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • total_eval_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 200
  • num_epochs: 10
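These settings correspond roughly to the Seq2SeqTrainingArguments sketched below. This is an approximation for reference only; the output directory, logging/evaluation cadence, and the rest of the Trainer setup are not documented in this card.

```python
# Approximate Seq2SeqTrainingArguments matching the hyperparameters above (sketch only).
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="jako_13p_tokenize_run1",    # assumed output directory
    learning_rate=5e-5,
    per_device_train_batch_size=4,          # train_batch_size per device
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,          # 4 GPUs x 4 x 2 = 32 effective train batch
    num_train_epochs=10,
    lr_scheduler_type="linear",
    warmup_steps=200,
    seed=42,
    predict_with_generate=True,             # needed to report BLEU / gen_len during eval
)
```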

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
| 1.2125        | 0.83  | 1600  | 1.1356          | 44.2732 | 18.9394 |
| 0.8519        | 1.66  | 3200  | 1.0618          | 47.1622 | 18.3936 |
| 0.6394        | 2.49  | 4800  | 1.0923          | 47.7818 | 18.3397 |
| 0.532         | 3.32  | 6400  | 1.1294          | 48.4283 | 18.3375 |
| 0.3543        | 4.15  | 8000  | 1.1765          | 47.7916 | 18.4422 |
| 0.2569        | 4.99  | 9600  | 1.2103          | 48.1268 | 18.5385 |
| 0.1732        | 5.82  | 11200 | 1.2549          | 48.9329 | 18.2085 |
| 0.1228        | 6.65  | 12800 | 1.3022          | 49.0248 | 18.2133 |
| 0.0937        | 7.48  | 14400 | 1.3179          | 49.3503 | 18.1673 |
| 0.0627        | 8.31  | 16000 | 1.3409          | 49.5551 | 18.2672 |
| 0.0558        | 9.14  | 17600 | 1.3545          | 49.7808 | 18.2815 |
| 0.0442        | 9.97  | 19200 | 1.3560          | 50.2114 | 18.2159 |
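For reference, the Bleu and Gen Len columns are typically computed during evaluation along the lines of the standard run_translation.py recipe, sketched below using the evaluate library's sacrebleu metric; the exact metric code used for this run is not included in the card.

```python
# Sketch of how Bleu and Gen Len are commonly computed during evaluation
# (standard run_translation.py-style recipe; not necessarily the exact code used here).
import numpy as np
import evaluate

bleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds, tokenizer):
    preds, labels = eval_preds
    # Replace label padding (-100) before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    result = bleu.compute(
        predictions=[p.strip() for p in decoded_preds],
        references=[[l.strip()] for l in decoded_labels],
    )
    # Gen Len = average length (in tokens) of the generated sequences.
    gen_lens = [np.count_nonzero(p != tokenizer.pad_token_id) for p in preds]
    return {"bleu": result["score"], "gen_len": float(np.mean(gen_lens))}
```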

Framework versions

  • Transformers 4.34.1
  • Pytorch 2.1.0+cu121
  • Datasets 2.14.6
  • Tokenizers 0.14.1