metadata

license: apache-2.0
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: bart-base-en-to-de
    results: []

bart-base-en-to-de

This model is a fine-tuned version of ahazeemi/bart-base-finetuned-en-to-de. The training dataset was not recorded (the auto-generated card lists it as None). It achieves the following results on the evaluation set:

Loss: 0.9665
Bleu: 4.7851
Gen Len: 19.453 (average length of the generated sequences, in tokens)

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 32
eval_batch_size: 32
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 1
mixed_precision_training: Native AMP
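
The linear scheduler listed above can be illustrated with a small sketch. It assumes no warmup (none is listed in the hyperparameters) and a total of roughly 141,000 optimizer steps, a figure estimated from the step and epoch columns of the training results rather than stated anywhere in this card. The function name linear_lr and the constant TOTAL_STEPS are hypothetical and exist only for this illustration.

```python
def linear_lr(step, total_steps, base_lr=3e-05, warmup_steps=0):
    """Learning rate at `step`: linear warmup (if any), then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: about 141,000 steps in the single epoch, estimated from the
# results table (step 140000 is reported at epoch 0.99).
TOTAL_STEPS = 141_000

for step in (0, 70_000, 140_000):
    print(f"step {step:>7}: lr = {linear_lr(step, TOTAL_STEPS):.3e}")
```

Under these assumptions the learning rate starts at 3e-05 and falls to about half that value midway through the epoch, which is one reason late checkpoints change more slowly than early ones.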

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Gen Len
1.319	0.04	5000	1.1247	4.4467	19.447
1.295	0.07	10000	1.1012	4.4235	19.458
1.2901	0.11	15000	1.0923	4.4386	19.4423
1.2678	0.14	20000	1.0803	4.5259	19.4557
1.267	0.18	25000	1.0724	4.5534	19.4653
1.2444	0.21	30000	1.0591	4.4944	19.4623
1.2365	0.25	35000	1.0509	4.5736	19.446
1.2137	0.28	40000	1.0400	4.5346	19.4553
1.214	0.32	45000	1.0340	4.5733	19.4543
1.218	0.35	50000	1.0283	4.6076	19.4693
1.2118	0.39	55000	1.0225	4.6192	19.454
1.1948	0.43	60000	1.0152	4.6082	19.4553
1.1932	0.46	65000	1.0128	4.665	19.449
1.1889	0.5	70000	1.0028	4.6929	19.4493
1.2154	0.53	75000	1.0004	4.7151	19.4477
1.194	0.57	80000	0.9950	4.6655	19.467
1.1847	0.6	85000	0.9966	4.708	19.451
1.1848	0.64	90000	0.9897	4.7794	19.458
1.1762	0.67	95000	0.9866	4.7204	19.4523
1.1818	0.71	100000	0.9803	4.7137	19.458
1.1613	0.75	105000	0.9788	4.7652	19.4573
1.1738	0.78	110000	0.9775	4.8088	19.453
1.1569	0.82	115000	0.9752	4.7522	19.4577
1.1631	0.85	120000	0.9713	4.7301	19.4513
1.1517	0.89	125000	0.9690	4.7935	19.456
1.1577	0.92	130000	0.9686	4.791	19.4543
1.1607	0.96	135000	0.9676	4.7529	19.4533
1.153	0.99	140000	0.9665	4.7851	19.453
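
The table can also be scanned programmatically, for example to compare the checkpoint with the best BLEU against the one with the lowest validation loss. The sketch below copies the step, validation loss, and BLEU columns from the table; the variable names are hypothetical. The estimate of training examples seen assumes a single device and no gradient accumulation, neither of which is stated in this card.

```python
# (step, validation_loss, bleu) copied from the training results table.
ROWS = [
    (5000, 1.1247, 4.4467), (10000, 1.1012, 4.4235), (15000, 1.0923, 4.4386),
    (20000, 1.0803, 4.5259), (25000, 1.0724, 4.5534), (30000, 1.0591, 4.4944),
    (35000, 1.0509, 4.5736), (40000, 1.0400, 4.5346), (45000, 1.0340, 4.5733),
    (50000, 1.0283, 4.6076), (55000, 1.0225, 4.6192), (60000, 1.0152, 4.6082),
    (65000, 1.0128, 4.665), (70000, 1.0028, 4.6929), (75000, 1.0004, 4.7151),
    (80000, 0.9950, 4.6655), (85000, 0.9966, 4.708), (90000, 0.9897, 4.7794),
    (95000, 0.9866, 4.7204), (100000, 0.9803, 4.7137), (105000, 0.9788, 4.7652),
    (110000, 0.9775, 4.8088), (115000, 0.9752, 4.7522), (120000, 0.9713, 4.7301),
    (125000, 0.9690, 4.7935), (130000, 0.9686, 4.791), (135000, 0.9676, 4.7529),
    (140000, 0.9665, 4.7851),
]

TRAIN_BATCH_SIZE = 32  # from the hyperparameters section

best_bleu = max(ROWS, key=lambda row: row[2])  # highest BLEU
best_loss = min(ROWS, key=lambda row: row[1])  # lowest validation loss

# Assumption: one device, no gradient accumulation, so each step sees 32 examples.
examples_seen = ROWS[-1][0] * TRAIN_BATCH_SIZE

print("best BLEU:", best_bleu)
print("lowest validation loss:", best_loss)
print("examples seen by the last logged step:", examples_seen)
```

The two criteria pick different checkpoints here: BLEU peaks at an intermediate step while validation loss is lowest at the final logged step, so the reported evaluation results correspond to the last checkpoint rather than the best-BLEU one.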

Framework versions

Transformers 4.22.2
PyTorch 1.12.0+cu116
Datasets 2.5.1
Tokenizers 0.12.1