collapse_gemma-2-2b_hs2_accumulatesubsample_iter7_sftsd1

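This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. Final evaluation-set results are not listed here; at the last logged step in the training results (step 90), the validation loss was 1.1720 after 4,804,824 input tokens seen.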

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The hyperparameters used during training are not listed here.

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.4117	0.0532	5	1.2712	266592
1.1724	0.1063	10	1.1918	533048
1.0606	0.1595	15	1.1870	799288
0.8876	0.2126	20	1.2049	1066760
0.7677	0.2658	25	1.2150	1336544
0.823	0.3189	30	1.2520	1607968
0.6771	0.3721	35	1.2333	1874784
0.6136	0.4252	40	1.2067	2139928
0.6083	0.4784	45	1.2110	2411200
0.6399	0.5316	50	1.1935	2679224
0.5353	0.5847	55	1.1854	2944064
0.5082	0.6379	60	1.1890	3209088
0.4659	0.6910	65	1.1827	3473936
0.5292	0.7442	70	1.1786	3744800
0.4468	0.7973	75	1.1750	4009560
0.453	0.8505	80	1.1796	4274632
0.4064	0.9037	85	1.1718	4536408
0.4862	0.9568	90	1.1720	4804824
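
As a reading aid for the table above, the following is a minimal sketch that finds the logged step with the lowest validation loss and the gap between validation and training loss at each step. The rows are copied from the table (the step 0 row is skipped because it has no training loss); the function names `best_checkpoint` and `loss_gap` are illustrative and are not part of the training code.

```python
# Rows copied from the training results table:
# (step, training loss, validation loss, input tokens seen)
rows = [
    (5, 1.4117, 1.2712, 266592),
    (10, 1.1724, 1.1918, 533048),
    (15, 1.0606, 1.1870, 799288),
    (20, 0.8876, 1.2049, 1066760),
    (25, 0.7677, 1.2150, 1336544),
    (30, 0.823, 1.2520, 1607968),
    (35, 0.6771, 1.2333, 1874784),
    (40, 0.6136, 1.2067, 2139928),
    (45, 0.6083, 1.2110, 2411200),
    (50, 0.6399, 1.1935, 2679224),
    (55, 0.5353, 1.1854, 2944064),
    (60, 0.5082, 1.1890, 3209088),
    (65, 0.4659, 1.1827, 3473936),
    (70, 0.5292, 1.1786, 3744800),
    (75, 0.4468, 1.1750, 4009560),
    (80, 0.453, 1.1796, 4274632),
    (85, 0.4064, 1.1718, 4536408),
    (90, 0.4862, 1.1720, 4804824),
]


def best_checkpoint(rows):
    """Return the logged row with the lowest validation loss."""
    return min(rows, key=lambda r: r[2])


def loss_gap(rows):
    """Return (step, validation loss - training loss) for each logged step."""
    return [(step, round(val - train, 4)) for step, train, val, _ in rows]


best = best_checkpoint(rows)
gaps = loss_gap(rows)

print(f"lowest validation loss: {best[2]} at step {best[0]}")
print(f"gap at first logged step: {gaps[0][1]}")
print(f"gap at last logged step: {gaps[-1][1]}")
```

On these rows the validation loss stays within a narrow band (roughly 1.17 to 1.25) after step 10, while the training loss keeps falling, so the gap between the two widens over the run.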