collapse_gemma-2-2b_hs2_accumulatesubsample_iter11_sftsd0

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. At the last logged evaluation (step 90, 4895192 input tokens seen) it reaches a validation loss of 1.1996; the full per-step log is in the table below.

Model description

More information needed

Training hyperparameters

More information needed

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.3516	0.0539	5	1.2769	268176
1.1484	0.1077	10	1.2134	539128
0.9886	0.1616	15	1.2129	816696
0.8987	0.2155	20	1.2197	1085552
0.76	0.2694	25	1.2295	1359368
0.7064	0.3232	30	1.2342	1633752
0.7212	0.3771	35	1.2166	1907824
0.6333	0.4310	40	1.2243	2178048
0.6637	0.4848	45	1.2205	2450344
0.5582	0.5387	50	1.2237	2715928
0.5408	0.5926	55	1.2263	2988808
0.4935	0.6465	60	1.1999	3260920
0.5121	0.7003	65	1.2196	3530496
0.5136	0.7542	70	1.2042	3804800
0.4048	0.8081	75	1.2149	4080712
0.4924	0.8620	80	1.2057	4350992
0.353	0.9158	85	1.2012	4616600
0.5064	0.9697	90	1.1996	4895192
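
The log above can be summarized programmatically. The following is a minimal sketch, not part of the original training code: it copies the rows of the table into a Python list, finds the logged step with the lowest validation loss, and computes the gap between validation and training loss at a given step. The names `LogRow`, `LOG`, `best_checkpoint`, and `loss_gap` are hypothetical helpers introduced only for this illustration.

```python
from typing import List, NamedTuple, Optional


class LogRow(NamedTuple):
    train_loss: Optional[float]  # None where the table says "No log"
    epoch: float
    step: int
    val_loss: float
    tokens_seen: int


# Values transcribed from the training results table above.
LOG: List[LogRow] = [
    LogRow(None, 0.0, 0, 1.3909, 0),
    LogRow(1.3516, 0.0539, 5, 1.2769, 268176),
    LogRow(1.1484, 0.1077, 10, 1.2134, 539128),
    LogRow(0.9886, 0.1616, 15, 1.2129, 816696),
    LogRow(0.8987, 0.2155, 20, 1.2197, 1085552),
    LogRow(0.76, 0.2694, 25, 1.2295, 1359368),
    LogRow(0.7064, 0.3232, 30, 1.2342, 1633752),
    LogRow(0.7212, 0.3771, 35, 1.2166, 1907824),
    LogRow(0.6333, 0.4310, 40, 1.2243, 2178048),
    LogRow(0.6637, 0.4848, 45, 1.2205, 2450344),
    LogRow(0.5582, 0.5387, 50, 1.2237, 2715928),
    LogRow(0.5408, 0.5926, 55, 1.2263, 2988808),
    LogRow(0.4935, 0.6465, 60, 1.1999, 3260920),
    LogRow(0.5121, 0.7003, 65, 1.2196, 3530496),
    LogRow(0.5136, 0.7542, 70, 1.2042, 3804800),
    LogRow(0.4048, 0.8081, 75, 1.2149, 4080712),
    LogRow(0.4924, 0.8620, 80, 1.2057, 4350992),
    LogRow(0.353, 0.9158, 85, 1.2012, 4616600),
    LogRow(0.5064, 0.9697, 90, 1.1996, 4895192),
]


def best_checkpoint(rows: List[LogRow]) -> LogRow:
    """Return the logged row with the lowest validation loss."""
    return min(rows, key=lambda r: r.val_loss)


def loss_gap(row: LogRow) -> Optional[float]:
    """Validation loss minus training loss, or None if no training loss was logged."""
    if row.train_loss is None:
        return None
    return row.val_loss - row.train_loss


if __name__ == "__main__":
    best = best_checkpoint(LOG)
    print(f"lowest validation loss: {best.val_loss} at step {best.step}")
    print(f"gap at final step: {loss_gap(LOG[-1]):.4f}")
```

Read this way, the training loss keeps falling through the epoch while the validation loss stays roughly between 1.20 and 1.23 after step 10. Such a widening gap is often read as a sign of overfitting, though the table alone does not establish the cause.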