collapse_gemma-2-2b_hs2_accumulatesubsample_iter6_sftsd2

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. Its final evaluation results are not listed; the last logged validation loss is 1.1799 at step 90, after 4,922,280 input tokens.

Model description

More information needed

The hyperparameters used during training are not listed. Training and validation loss were logged every 5 steps:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.3794	0.0534	5	1.2685	268544
1.1344	0.1068	10	1.1908	540984
1.092	0.1602	15	1.1709	814280
0.9745	0.2136	20	1.1742	1088888
0.8742	0.2670	25	1.1718	1360592
0.9279	0.3204	30	1.1893	1633024
0.8757	0.3738	35	1.1800	1905464
0.7368	0.4272	40	1.2066	2182616
0.7263	0.4806	45	1.1794	2457160
0.5811	0.5340	50	1.1940	2735040
0.5781	0.5874	55	1.1842	3007976
0.6488	0.6409	60	1.1876	3283704
0.6015	0.6943	65	1.1807	3548216
0.6332	0.7477	70	1.1787	3816768
0.638	0.8011	75	1.1893	4083896
0.6347	0.8545	80	1.1804	4371088
0.5831	0.9079	85	1.1794	4646192
0.5994	0.9613	90	1.1799	4922280
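
The log above can be summarized with a short script. The following is a minimal sketch, not part of the original training code: the names `LOG`, `best_validation_row`, and `perplexity` are hypothetical, the rows are copied by hand from the table, and the perplexity conversion assumes the reported loss is a mean per-token cross-entropy in nats, which the card does not state.

```python
import math

# Rows copied from the log above:
# (step, training_loss, validation_loss, input_tokens_seen).
# training_loss is None where the log prints "No log".
LOG = [
    (0, None, 1.3909, 0),
    (5, 1.3794, 1.2685, 268544),
    (10, 1.1344, 1.1908, 540984),
    (15, 1.092, 1.1709, 814280),
    (20, 0.9745, 1.1742, 1088888),
    (25, 0.8742, 1.1718, 1360592),
    (30, 0.9279, 1.1893, 1633024),
    (35, 0.8757, 1.1800, 1905464),
    (40, 0.7368, 1.2066, 2182616),
    (45, 0.7263, 1.1794, 2457160),
    (50, 0.5811, 1.1940, 2735040),
    (55, 0.5781, 1.1842, 3007976),
    (60, 0.6488, 1.1876, 3283704),
    (65, 0.6015, 1.1807, 3548216),
    (70, 0.6332, 1.1787, 3816768),
    (75, 0.638, 1.1893, 4083896),
    (80, 0.6347, 1.1804, 4371088),
    (85, 0.5831, 1.1794, 4646192),
    (90, 0.5994, 1.1799, 4922280),
]


def best_validation_row(rows):
    """Return the logged row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def perplexity(loss):
    """Convert a loss to perplexity.

    Assumption: the loss is a mean per-token cross-entropy in nats.
    """
    return math.exp(loss)


if __name__ == "__main__":
    best = best_validation_row(LOG)
    last = LOG[-1]
    print(f"lowest validation loss: {best[2]} at step {best[0]}")
    print(f"last validation loss:   {last[2]} at step {last[0]}")
    print(f"perplexity at last step (under the stated assumption): "
          f"{perplexity(last[2]):.3f}")
    print(f"average input tokens per step: {last[3] / last[0]:.0f}")
```

On these rows, the lowest validation loss is logged at step 15, while the training loss keeps falling through step 90. Validation loss staying flat as training loss drops is a pattern that often indicates overfitting, though the card gives no further detail to confirm that here.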