north-translation

This model is a fine-tuned version of facebook/nllb-200-distilled-600M on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 0.1809
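
For intuition, the reported loss can be converted to a perplexity. The sketch below assumes the loss is the mean per-token cross-entropy in nats, which the card does not state explicitly.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 0.1809

# Assumption: eval_loss is mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)
print(f"perplexity = {perplexity:.4f}")
```

A perplexity close to 1 means the model assigns high probability to the reference tokens of the evaluation set. It says nothing about translation quality metrics such as BLEU or chrF, which the card does not report.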

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 2
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 8
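
As a rough sketch, the values above can be written as keyword arguments using the parameter names of `transformers.TrainingArguments`, together with the learning rate a linear schedule would produce at a given step. Two things here are assumptions rather than facts from the card: training ran on a single device (so `train_batch_size` maps to `per_device_train_batch_size`), and there was no warmup (none is listed). The total of 1200 steps is taken from the last row of the training results.

```python
# Hyperparameters from the card, keyed by TrainingArguments keyword names.
# Assumption: single device, so the listed batch sizes are per-device values.
hparams = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 2,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 8,
}

TOTAL_STEPS = 1200  # last step logged in the training results
WARMUP_STEPS = 0    # assumption: the card lists no warmup


def linear_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    base = hparams["learning_rate"]
    if step < WARMUP_STEPS:
        return base * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return base * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


for step in (0, 300, 600, 1200):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate is halved by step 600 and reaches zero at step 1200. The dictionary could be unpacked into `Seq2SeqTrainingArguments(output_dir=..., **hparams)`, where the output directory is left to the reader.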

Training results

Training Loss	Epoch	Step	Validation Loss
6.6535	0.3333	50	6.5107
4.8917	0.6667	100	4.6859
3.3227	1.0	150	3.1355
1.89	1.3333	200	1.7087
0.8322	1.6667	250	0.6969
0.3733	2.0	300	0.3310
0.2401	2.3333	350	0.2457
0.2166	2.6667	400	0.2203
0.2199	3.0	450	0.2098
0.1924	3.3333	500	0.2011
0.2007	3.6667	550	0.1967
0.1901	4.0	600	0.1922
0.1753	4.3333	650	0.1905
0.1523	4.6667	700	0.1882
0.1863	5.0	750	0.1842
0.1432	5.3333	800	0.1851
0.1525	5.6667	850	0.1833
0.1377	6.0	900	0.1812
0.1319	6.3333	950	0.1823
0.1608	6.6667	1000	0.1821
0.1535	7.0	1050	0.1807
0.14	7.3333	1100	0.1811
0.1347	7.6667	1150	0.1812
0.1484	8.0	1200	0.1809
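
A few quantities can be read off the table with simple arithmetic. The sketch below derives the number of steps per epoch, a range for the training set size, and the checkpoint with the lowest validation loss. The size estimate assumes a single device, no gradient accumulation, and a data loader that keeps the last partial batch; none of these is confirmed by the card.

```python
TRAIN_BATCH_SIZE = 8

# (epoch, step, validation_loss) rows copied from the table above.
rows = [
    (0.3333, 50, 6.5107), (0.6667, 100, 4.6859), (1.0, 150, 3.1355),
    (1.3333, 200, 1.7087), (1.6667, 250, 0.6969), (2.0, 300, 0.3310),
    (2.3333, 350, 0.2457), (2.6667, 400, 0.2203), (3.0, 450, 0.2098),
    (3.3333, 500, 0.2011), (3.6667, 550, 0.1967), (4.0, 600, 0.1922),
    (4.3333, 650, 0.1905), (4.6667, 700, 0.1882), (5.0, 750, 0.1842),
    (5.3333, 800, 0.1851), (5.6667, 850, 0.1833), (6.0, 900, 0.1812),
    (6.3333, 950, 0.1823), (6.6667, 1000, 0.1821), (7.0, 1050, 0.1807),
    (7.3333, 1100, 0.1811), (7.6667, 1150, 0.1812), (8.0, 1200, 0.1809),
]

# Steps per epoch, from the first logged row (step / epoch fraction).
first_epoch, first_step, _ = rows[0]
steps_per_epoch = round(first_step / first_epoch)

# If ceil(N / batch_size) == steps_per_epoch, N lies in this range.
min_examples = (steps_per_epoch - 1) * TRAIN_BATCH_SIZE + 1
max_examples = steps_per_epoch * TRAIN_BATCH_SIZE

# Checkpoint with the lowest validation loss versus the final one.
best = min(rows, key=lambda r: r[2])
final = rows[-1]

print("steps per epoch:", steps_per_epoch)
print("training examples (estimate):", min_examples, "to", max_examples)
print("best validation loss:", best[2], "at step", best[1])
print("final validation loss:", final[2], "at step", final[1])
```

The validation loss flattens after roughly epoch 6, and the lowest logged value (0.1807 at step 1050) is marginally below the final value reported at the top of the card (0.1809 at step 1200).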

Framework versions

Transformers 4.42.4
PyTorch 2.3.1+cu121
Datasets 2.20.0
Tokenizers 0.19.1
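
To compare a local environment against the versions listed above, a small check along these lines may help. It is only a sketch: the helper names `installed_versions` and `mismatches` are hypothetical, and the keys are the PyPI distribution names (PyTorch is published as `torch`).

```python
from importlib import metadata

# Versions listed in the card, keyed by PyPI distribution name.
EXPECTED = {
    "transformers": "4.42.4",
    "torch": "2.3.1+cu121",
    "datasets": "2.20.0",
    "tokenizers": "0.19.1",
}


def installed_versions(names):
    """Return {name: installed version, or None if not installed}."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def mismatches(expected, found):
    """Return {name: (expected, found)} for every version that differs."""
    return {
        name: (want, found.get(name))
        for name, want in expected.items()
        if found.get(name) != want
    }


if __name__ == "__main__":
    diffs = mismatches(EXPECTED, installed_versions(EXPECTED))
    for name, (want, have) in diffs.items():
        print(f"{name}: card lists {want}, found {have}")
```

An exact match is not necessarily required to load the model; the comparison is strict only so that any difference is visible.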

Model size: 0.6B params (Safetensors, tensor type F32)

Model tree for Thirawarit/north-translation: fine-tuned from the base model facebook/nllb-200-distilled-600M.