collapse_gemma-2-2b_hs2_accumulatesubsample_iter10_sftsd0

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed

More information needed

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.3566	0.0539	5	1.2744	272448
1.1352	0.1079	10	1.2127	547040
0.9502	0.1618	15	1.1996	825664
0.8885	0.2158	20	1.2152	1093864
0.8554	0.2697	25	1.2327	1366616
0.8248	0.3237	30	1.2216	1642200
0.7493	0.3776	35	1.2269	1920624
0.6711	0.4316	40	1.2200	2193904
0.7145	0.4855	45	1.2118	2471312
0.5736	0.5394	50	1.2113	2743896
0.6077	0.5934	55	1.2109	3020256
0.5245	0.6473	60	1.2123	3293520
0.566	0.7013	65	1.2143	3567816
0.5426	0.7552	70	1.1968	3834000
0.5058	0.8092	75	1.2092	4106144
0.4798	0.8631	80	1.1969	4379288
0.4227	0.9171	85	1.2013	4654784
0.494	0.9710	90	1.2026	4929264