collapse_gemma-2-2b_hs2_replace_iter7_sftsd2

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed

The following hyperparameters were used during training:

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3956	0
1.5364	0.0316	5	1.3068	253784
1.1259	0.0632	10	1.2465	510904
0.6887	0.0947	15	1.3550	769904
0.5128	0.1263	20	1.5270	1016400
0.3194	0.1579	25	1.6846	1268560
0.1918	0.1895	30	1.8390	1517032
0.1256	0.2211	35	2.0239	1770608
0.0896	0.2527	40	2.2047	2023392
0.0555	0.2842	45	2.3706	2273832
0.0402	0.3158	50	2.4122	2529912
0.0338	0.3474	55	2.4354	2783520
0.0345	0.3790	60	2.3862	3030728
0.0281	0.4106	65	2.4360	3277464
0.0285	0.4422	70	2.4702	3520992
0.0296	0.4737	75	2.4709	3770984
0.0284	0.5053	80	2.5290	4026776
0.0254	0.5369	85	2.5619	4275136
0.0288	0.5685	90	2.5185	4524936
0.0260	0.6001	95	2.4887	4781848
0.0265	0.6317	100	2.4976	5021704
0.0254	0.6632	105	2.4820	5274368
0.0322	0.6948	110	2.4803	5529272
0.0308	0.7264	115	2.4894	5783496
0.0263	0.7580	120	2.5027	6034168
0.0254	0.7896	125	2.4805	6291808
0.0263	0.8212	130	2.4729	6544088
0.0335	0.8527	135	2.4893	6791616
0.0264	0.8843	140	2.5056	7045456
0.0251	0.9159	145	2.5100	7294288
0.0245	0.9475	150	2.5167	7548384
0.0277	0.9791	155	2.5252	7791720
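
The table above shows training loss falling steadily while validation loss bottoms out early and then climbs. As a minimal sketch of how one might locate that turning point, the snippet below copies a handful of rows from the table and reports the step with the lowest validation loss, along with the gap between validation and training loss. The helper names `best_step` and `generalization_gap` are hypothetical and are not part of the original training code; only a subset of rows is included for brevity.

```python
# A few (step, training_loss, validation_loss) rows copied from the table above.
# Step 0 logged no training loss, so it is recorded as None.
rows = [
    (0, None, 1.3956),
    (5, 1.5364, 1.3068),
    (10, 1.1259, 1.2465),
    (15, 0.6887, 1.3550),
    (20, 0.5128, 1.5270),
    (50, 0.0402, 2.4122),
    (100, 0.0265, 2.4976),
    (155, 0.0277, 2.5252),
]


def best_step(rows):
    """Return the (step, validation_loss) pair with the lowest validation loss."""
    step, _, val = min(rows, key=lambda r: r[2])
    return step, val


def generalization_gap(rows):
    """Map step -> validation loss minus training loss, skipping rows with no training loss."""
    return {
        step: round(val - train, 4)
        for step, train, val in rows
        if train is not None
    }


if __name__ == "__main__":
    step, val = best_step(rows)
    print(f"lowest validation loss {val} at step {step}")
    for s, gap in generalization_gap(rows).items():
        print(f"step {s}: validation - training = {gap}")
```

On these rows the lowest validation loss falls at step 10, after which the gap widens as training loss approaches zero. This pattern is commonly read as overfitting, so an earlier checkpoint may generalize better than the final one.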