genz_model

This model was trained from scratch on an unspecified dataset (the card records the dataset name as None). It achieves the following results on the evaluation set:

Loss: 1.2536
Bleu: 40.0734
Gen Len: 15.8667

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 50
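
As a rough illustration of what `lr_scheduler_type: linear` implies for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card does not list any) and takes the total of 2050 steps from the final row of the training results; the function name `linear_lr` is hypothetical and not part of any library.

```python
# Sketch of a linear learning-rate decay.
# Assumption: no warmup steps, since the card does not list any.
BASE_LR = 2e-05      # learning_rate from the card
TOTAL_STEPS = 2050   # final step in the training results (50 epochs x 41 steps)
WARMUP_STEPS = 0     # assumption: not stated in the card


def linear_lr(step: int) -> float:
    """Return the learning rate at a given optimizer step."""
    if step < WARMUP_STEPS:
        # Linear ramp-up during warmup (unused when WARMUP_STEPS is 0).
        return BASE_LR * step / max(1, WARMUP_STEPS)
    # Linear decay from BASE_LR down to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


for step in (0, 1025, 2050):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at the configured value, reaches half of it at the midpoint of training, and falls to zero at the last step.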

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Gen Len
No log	1.0	41	1.9667	16.4087	16.3333
No log	2.0	82	1.8242	30.3437	15.4788
No log	3.0	123	1.7376	35.0542	15.6545
No log	4.0	164	1.6830	36.3815	15.9091
No log	5.0	205	1.6438	37.3325	15.9212
No log	6.0	246	1.6052	37.5162	16.0364
No log	7.0	287	1.5723	37.5334	16.097
No log	8.0	328	1.5484	38.2319	16.1152
No log	9.0	369	1.5249	38.3884	16.1455
No log	10.0	410	1.5040	38.4443	16.1394
No log	11.0	451	1.4852	38.8279	16.1879
No log	12.0	492	1.4706	39.4717	16.0424
1.7321	13.0	533	1.4525	39.6365	16.103
1.7321	14.0	574	1.4361	39.7667	16.0545
1.7321	15.0	615	1.4237	39.934	16.0182
1.7321	16.0	656	1.4084	39.8808	16.0606
1.7321	17.0	697	1.4013	39.958	16.0606
1.7321	18.0	738	1.3875	39.4972	16.0788
1.7321	19.0	779	1.3770	39.4976	15.9394
1.7321	20.0	820	1.3681	39.4927	15.9818
1.7321	21.0	861	1.3592	39.8584	15.9818
1.7321	22.0	902	1.3512	39.9409	15.9515
1.7321	23.0	943	1.3414	39.8891	15.9576
1.7321	24.0	984	1.3367	40.0053	15.9576
1.3831	25.0	1025	1.3298	39.9729	15.9636
1.3831	26.0	1066	1.3231	40.0029	15.9333
1.3831	27.0	1107	1.3157	39.9874	15.9394
1.3831	28.0	1148	1.3093	39.8156	15.9152
1.3831	29.0	1189	1.3051	40.1371	15.9152
1.3831	30.0	1230	1.3006	40.0601	15.897
1.3831	31.0	1271	1.2950	40.2356	15.8727
1.3831	32.0	1312	1.2899	40.3369	15.8848
1.3831	33.0	1353	1.2871	40.452	15.8667
1.3831	34.0	1394	1.2836	40.5232	15.8364
1.3831	35.0	1435	1.2804	40.455	15.8485
1.3831	36.0	1476	1.2768	40.4874	15.8485
1.2414	37.0	1517	1.2728	40.5694	15.8424
1.2414	38.0	1558	1.2692	40.4767	15.8424
1.2414	39.0	1599	1.2679	40.5449	15.8424
1.2414	40.0	1640	1.2650	40.5121	15.8667
1.2414	41.0	1681	1.2625	40.0705	15.8545
1.2414	42.0	1722	1.2604	40.056	15.8545
1.2414	43.0	1763	1.2597	40.1238	15.8667
1.2414	44.0	1804	1.2579	40.0473	15.8606
1.2414	45.0	1845	1.2565	40.0792	15.8667
1.2414	46.0	1886	1.2553	40.0734	15.8667
1.2414	47.0	1927	1.2545	40.0734	15.8667
1.2414	48.0	1968	1.2539	40.0734	15.8667
1.179	49.0	2009	1.2537	40.0734	15.8667
1.179	50.0	2050	1.2536	40.0734	15.8667
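
The table can be read in two ways that a short sketch makes concrete. The code below uses a subset of rows copied from the table to find the epoch with the highest BLEU and the epoch with the lowest validation loss, and it estimates the training set size from the 41 steps per epoch. The size estimate assumes a single device, no gradient accumulation, and that the last partial batch is kept; none of these are stated in the card. The helper `train_size_bounds` is a hypothetical name.

```python
# Selected (epoch, validation_loss, bleu) rows copied from the table above.
ROWS = [
    (1, 1.9667, 16.4087),
    (12, 1.4706, 39.4717),
    (24, 1.3367, 40.0053),
    (34, 1.2836, 40.5232),
    (37, 1.2728, 40.5694),
    (40, 1.2650, 40.5121),
    (41, 1.2625, 40.0705),
    (50, 1.2536, 40.0734),
]


def train_size_bounds(steps_per_epoch: int, batch_size: int) -> tuple:
    """Smallest and largest example counts that yield this many steps.

    Assumes one device, no gradient accumulation, and a kept partial batch.
    """
    smallest = batch_size * (steps_per_epoch - 1) + 1
    largest = batch_size * steps_per_epoch
    return smallest, largest


best_bleu = max(ROWS, key=lambda row: row[2])   # row with the highest BLEU
best_loss = min(ROWS, key=lambda row: row[1])   # row with the lowest validation loss

print("highest BLEU:", best_bleu)
print("lowest validation loss:", best_loss)
print("training examples (bounds):", train_size_bounds(41, 16))
```

In the full table the highest BLEU is likewise at epoch 37 (40.5694), while validation loss keeps falling through epoch 50, so the reported final checkpoint (BLEU 40.0734) is not the best one by BLEU.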

Framework versions

Transformers 4.31.0
Pytorch 2.0.1+cu118
Datasets 2.14.2
Tokenizers 0.13.3
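
To check whether a local environment matches the versions listed above, something like the following sketch could be used. It relies only on the standard library's `importlib.metadata`; the package names are the usual PyPI distribution names (`torch` for PyTorch), and the helper `compare_versions` is a hypothetical name introduced for illustration.

```python
from importlib import metadata

# Versions listed in the card; "torch" is the PyPI name for PyTorch.
PINNED = {
    "transformers": "4.31.0",
    "torch": "2.0.1+cu118",
    "datasets": "2.14.2",
    "tokenizers": "0.13.3",
}


def compare_versions(pinned: dict) -> dict:
    """Map each package to (pinned, installed or None, exact match)."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed, installed == wanted)
    return report


for name, (wanted, installed, ok) in compare_versions(PINNED).items():
    print(f"{name}: pinned {wanted}, installed {installed}, match={ok}")
```

The comparison is an exact string match, so a different build of the same release (for example a CPU-only PyTorch wheel) is reported as a mismatch.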

ethannhzhouu/genz_model
