collapse_gemma-2-2b_hs2_accumulatesubsample_iter12_sftsd0

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed

More information needed

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.4166	0.0539	5	1.2769	264784
1.181	0.1077	10	1.2129	544496
0.9778	0.1616	15	1.2133	810896
0.9186	0.2155	20	1.2228	1084848
0.8617	0.2694	25	1.2353	1354664
0.7313	0.3232	30	1.2450	1624584
0.5778	0.3771	35	1.2580	1893920
0.6233	0.4310	40	1.2506	2170792
0.5564	0.4848	45	1.2258	2440400
0.6599	0.5387	50	1.2106	2719504
0.4742	0.5926	55	1.2293	2999912
0.474	0.6465	60	1.2061	3261600
0.6034	0.7003	65	1.2262	3537920
0.4712	0.7542	70	1.1980	3804184
0.544	0.8081	75	1.2184	4076984
0.4497	0.8620	80	1.1947	4344192
0.4503	0.9158	85	1.2074	4615280
0.4385	0.9697	90	1.2039	4885496