collapse_gemma-2-2b_hs2_accumulatesubsample_iter8_sftsd0

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed

More information needed

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.3991	0.0543	5	1.2725	268768
1.1703	0.1086	10	1.1974	544736
1.0865	0.1629	15	1.1898	814424
1.0812	0.2172	20	1.1993	1093688
0.8983	0.2716	25	1.2050	1373160
0.8093	0.3259	30	1.2215	1647872
0.734	0.3802	35	1.2129	1921464
0.6783	0.4345	40	1.2141	2206248
0.5858	0.4888	45	1.2226	2476264
0.6223	0.5431	50	1.2036	2753528
0.7186	0.5974	55	1.1927	3034280
0.452	0.6517	60	1.2088	3302232
0.5381	0.7060	65	1.1925	3575192
0.6065	0.7604	70	1.1956	3848736
0.5219	0.8147	75	1.1899	4125440
0.4986	0.8690	80	1.1895	4392672
0.4997	0.9233	85	1.1895	4661944
0.5353	0.9776	90	1.1928	4941648