collapse_gemma-2-2b_hs2_accumulatesubsample_iter12_sftsd1

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. At the last logged evaluation step (step 90), it reaches a validation loss of 1.1970 after 4798896 input tokens.

Model description

More information needed

The hyperparameters used during training are not listed here. Training and validation loss were logged every 5 steps:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.3649	0.0529	5	1.2745	267096
1.1224	0.1058	10	1.2058	530288
0.9974	0.1587	15	1.2049	800248
0.8189	0.2116	20	1.2372	1058320
0.7833	0.2646	25	1.2189	1325704
0.6665	0.3175	30	1.2693	1584760
0.5681	0.3704	35	1.2443	1856304
0.5335	0.4233	40	1.2355	2125480
0.5541	0.4762	45	1.2238	2393968
0.4262	0.5291	50	1.2276	2656976
0.4628	0.5820	55	1.2021	2920640
0.3494	0.6349	60	1.2094	3190360
0.4511	0.6878	65	1.1954	3457336
0.3678	0.7407	70	1.1997	3727624
0.4241	0.7937	75	1.1929	3995904
0.3534	0.8466	80	1.1951	4259976
0.3476	0.8995	85	1.1903	4524480
0.4014	0.9524	90	1.1970	4798896
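
The log above can be summarized programmatically. The following is a minimal sketch, assuming the rows are copied by hand from the table; `LogRow`, `best_validation_row`, and `generalization_gap` are hypothetical helper names introduced for illustration, not part of any training library. It finds the step with the lowest validation loss and the difference between validation and training loss at a given step.

```python
from typing import NamedTuple, Optional


class LogRow(NamedTuple):
    train_loss: Optional[float]  # None where the table shows "No log"
    epoch: float
    step: int
    val_loss: float
    tokens_seen: int


# Rows transcribed from the training log table above.
ROWS = [
    LogRow(None, 0.0, 0, 1.3909, 0),
    LogRow(1.3649, 0.0529, 5, 1.2745, 267096),
    LogRow(1.1224, 0.1058, 10, 1.2058, 530288),
    LogRow(0.9974, 0.1587, 15, 1.2049, 800248),
    LogRow(0.8189, 0.2116, 20, 1.2372, 1058320),
    LogRow(0.7833, 0.2646, 25, 1.2189, 1325704),
    LogRow(0.6665, 0.3175, 30, 1.2693, 1584760),
    LogRow(0.5681, 0.3704, 35, 1.2443, 1856304),
    LogRow(0.5335, 0.4233, 40, 1.2355, 2125480),
    LogRow(0.5541, 0.4762, 45, 1.2238, 2393968),
    LogRow(0.4262, 0.5291, 50, 1.2276, 2656976),
    LogRow(0.4628, 0.5820, 55, 1.2021, 2920640),
    LogRow(0.3494, 0.6349, 60, 1.2094, 3190360),
    LogRow(0.4511, 0.6878, 65, 1.1954, 3457336),
    LogRow(0.3678, 0.7407, 70, 1.1997, 3727624),
    LogRow(0.4241, 0.7937, 75, 1.1929, 3995904),
    LogRow(0.3534, 0.8466, 80, 1.1951, 4259976),
    LogRow(0.3476, 0.8995, 85, 1.1903, 4524480),
    LogRow(0.4014, 0.9524, 90, 1.1970, 4798896),
]


def best_validation_row(rows):
    """Return the logged row with the lowest validation loss."""
    return min(rows, key=lambda r: r.val_loss)


def generalization_gap(row):
    """Validation loss minus training loss, or None if no training loss was logged."""
    if row.train_loss is None:
        return None
    return row.val_loss - row.train_loss


best = best_validation_row(ROWS)
last = ROWS[-1]
print(f"lowest validation loss: {best.val_loss} at step {best.step}")
print(f"gap at step {last.step}: {generalization_gap(last):.4f}")
```

On this log, training loss keeps falling while validation loss stays close to 1.2 from step 10 onward, so the gap between the two widens over the epoch. Whether that reflects overfitting depends on the dataset and setup, which are not documented here.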