collapse_gemma-2-2b_hs2_accumulatesubsample_iter4_sftsd0

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed

More information needed

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.317	0.0543	5	1.2676	278888
1.2103	0.1087	10	1.1836	560856
1.1544	0.1630	15	1.1540	844528
1.1964	0.2174	20	1.1470	1128496
0.9374	0.2717	25	1.1433	1409880
0.9893	0.3261	30	1.1511	1694568
0.9799	0.3804	35	1.1555	1983024
0.9148	0.4348	40	1.1759	2267152
0.872	0.4891	45	1.1720	2553896
0.7683	0.5435	50	1.1734	2832280
0.7309	0.5978	55	1.1710	3116288
0.7317	0.6522	60	1.1715	3400728
0.6844	0.7065	65	1.1663	3683408
0.6955	0.7609	70	1.1680	3959976
0.6387	0.8152	75	1.1771	4241544
0.6381	0.8696	80	1.1675	4526832
0.6677	0.9239	85	1.1682	4803712
0.6433	0.9783	90	1.1650	5085136