augmented_step_val_25_gemma-2-2b_hs2_iter1_sftsd0

This model is a fine-tuned version of jkazdan/step_val_25_gemma-2-2b_hs2_iter1_sftsd2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.5241
  • Num Input Tokens Seen: 7902160

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an illustrative sketch of these settings as TrainingArguments follows the list):

  • learning_rate: 8e-06
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 0
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 1
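
For reference, the settings above can be expressed as Hugging Face TrainingArguments. This is an illustrative sketch only: the actual training script is not part of this card, the output directory is a placeholder, and bf16=True is an assumption based on the BF16 tensor type reported further down, not something stated in the training section.

```python
from transformers import TrainingArguments

# Illustrative reconstruction of the hyperparameters listed above.
# output_dir is a placeholder; bf16=True is an assumption (the published
# checkpoint is stored in BF16), not stated in the training section.
training_args = TrainingArguments(
    output_dir="augmented_step_val_25_gemma-2-2b_hs2_iter1_sftsd0",
    learning_rate=8e-6,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=16,
    seed=0,
    gradient_accumulation_steps=16,  # 8 x 16 = 128 total train batch size
    lr_scheduler_type="constant_with_warmup",
    warmup_ratio=0.05,
    num_train_epochs=1,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    bf16=True,
)
```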

Training results

Training Loss   Epoch    Step   Validation Loss   Input Tokens Seen
No log          0        0      1.0950            0
1.4641          0.0363   5      1.0952            288056
1.2843          0.0726   10     1.1081            573696
1.179           0.1089   15     1.1363            864312
1.0141          0.1452   20     1.1791            1155592
0.9315          0.1815   25     1.2351            1442896
0.825           0.2178   30     1.3062            1738192
0.6513          0.2541   35     1.3937            2026640
0.5567          0.2904   40     1.4694            2311728
0.5304          0.3267   45     1.4723            2603472
0.372           0.3630   50     1.4773            2895216
0.3612          0.3993   55     1.4670            3177072
0.3167          0.4356   60     1.4953            3464608
0.2068          0.4719   65     1.5190            3749472
0.1664          0.5082   70     1.4786            4033064
0.2256          0.5445   75     1.4518            4326968
0.1704          0.5808   80     1.4577            4611416
0.1391          0.6171   85     1.5038            4903168
0.2488          0.6534   90     1.4373            5191528
0.1726          0.6897   95     1.5123            5474696
0.1696          0.7260   100    1.4582            5757304
0.1919          0.7623   105    1.4735            6047208
0.1987          0.7985   110    1.4654            6343824
0.256           0.8348   115    1.4215            6627376
0.0984          0.8711   120    1.5130            6915440
0.108           0.9074   125    1.4880            7206272
0.1414          0.9437   130    1.4197            7504304
0.1076          0.9800   135    1.5077            7784504

Framework versions

  • Transformers 4.44.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
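
As a usage example, the snippet below loads this checkpoint with the framework versions listed above. It assumes the model is published under the repo id in the card title and loads as a standard Gemma 2 causal language model; the prompt is a placeholder.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the checkpoint is hosted under this Hugging Face repo id.
repo_id = "jkazdan/augmented_step_val_25_gemma-2-2b_hs2_iter1_sftsd0"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # the published weights are stored in BF16
    device_map="auto",
)

# Placeholder prompt to sanity-check generation.
inputs = tokenizer("The quick brown fox", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```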
Model size and format

  • Format: Safetensors
  • Parameters: 2.61B
  • Tensor type: BF16

Model tree for jkazdan/augmented_step_val_25_gemma-2-2b_hs2_iter1_sftsd0

  • Base model: google/gemma-2-2b
  • This model: a fine-tune descended from that base