collapse_gemma-2-2b_hs2_accumulatesubsample_iter13_sftsd1

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. Its last logged evaluation result, taken from the training results below, is a validation loss of 1.1987 at step 90 (4,751,368 input tokens seen).

Model description

More information needed

Training hyperparameters

More information needed

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.4147	0.0528	5	1.2762	263776
1.1751	0.1056	10	1.2089	527904
0.9507	0.1584	15	1.2023	795176
0.8056	0.2112	20	1.2344	1051032
0.6575	0.2640	25	1.2598	1316456
0.6252	0.3168	30	1.2757	1583656
0.5477	0.3696	35	1.2561	1847776
0.5462	0.4224	40	1.2272	2113544
0.5597	0.4752	45	1.2306	2386008
0.4005	0.5281	50	1.2235	2650504
0.5095	0.5809	55	1.2107	2915648
0.3978	0.6337	60	1.2088	3173912
0.3427	0.6865	65	1.2017	3439032
0.3256	0.7393	70	1.2081	3699752
0.3051	0.7921	75	1.1954	3970736
0.4045	0.8449	80	1.1972	4229528
0.4072	0.8977	85	1.1940	4490136
0.3070	0.9505	90	1.1987	4751368
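
The table can be summarized with a few lines of Python. The sketch below is not part of the original training code: the names `ROWS`, `best_validation`, `tokens_per_step`, and `loss_gap` are illustrative, and the rows are copied by hand from the table above. It finds the logged step with the lowest validation loss, estimates the average number of input tokens consumed per optimizer step, and reports the gap between validation and training loss, which widens as the training loss keeps falling while the validation loss stays near 1.2.

```python
# Rows copied from the training results table:
# (training_loss, epoch, step, validation_loss, input_tokens_seen)
# training_loss is None where the table says "No log".
ROWS = [
    (None,   0.0,    0,  1.3909, 0),
    (1.4147, 0.0528, 5,  1.2762, 263776),
    (1.1751, 0.1056, 10, 1.2089, 527904),
    (0.9507, 0.1584, 15, 1.2023, 795176),
    (0.8056, 0.2112, 20, 1.2344, 1051032),
    (0.6575, 0.2640, 25, 1.2598, 1316456),
    (0.6252, 0.3168, 30, 1.2757, 1583656),
    (0.5477, 0.3696, 35, 1.2561, 1847776),
    (0.5462, 0.4224, 40, 1.2272, 2113544),
    (0.5597, 0.4752, 45, 1.2306, 2386008),
    (0.4005, 0.5281, 50, 1.2235, 2650504),
    (0.5095, 0.5809, 55, 1.2107, 2915648),
    (0.3978, 0.6337, 60, 1.2088, 3173912),
    (0.3427, 0.6865, 65, 1.2017, 3439032),
    (0.3256, 0.7393, 70, 1.2081, 3699752),
    (0.3051, 0.7921, 75, 1.1954, 3970736),
    (0.4045, 0.8449, 80, 1.1972, 4229528),
    (0.4072, 0.8977, 85, 1.1940, 4490136),
    (0.3070, 0.9505, 90, 1.1987, 4751368),
]


def best_validation(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r[3])


def tokens_per_step(rows):
    """Average input tokens per step, based on the last logged row."""
    _, _, step, _, tokens = rows[-1]
    return tokens / step


def loss_gap(rows):
    """Return (step, validation_loss - training_loss) for rows with a logged training loss."""
    return [(r[2], round(r[3] - r[0], 4)) for r in rows if r[0] is not None]


if __name__ == "__main__":
    best = best_validation(ROWS)
    print(f"lowest validation loss: {best[3]} at step {best[2]}")
    print(f"average tokens per step: {tokens_per_step(ROWS):.0f}")
    for step, gap in loss_gap(ROWS):
        print(f"step {step:>2}: validation - training = {gap}")
```

On these rows the lowest logged validation loss is 1.1940 at step 85, slightly below the final value of 1.1987 at step 90.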