collapse_gemma-2-2b_hs2_accumulatesubsample_iter6_sftsd0

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed

More information needed

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.3996	0.0547	5	1.2737	278864
1.2739	0.1094	10	1.1914	559168
1.0342	0.1640	15	1.1756	843088
0.9859	0.2187	20	1.1705	1120864
0.8532	0.2734	25	1.1868	1389432
0.886	0.3281	30	1.1878	1659616
0.8437	0.3828	35	1.1879	1944408
0.8296	0.4375	40	1.1998	2221448
0.6965	0.4921	45	1.2044	2496880
0.7313	0.5468	50	1.1847	2774592
0.654	0.6015	55	1.1892	3058288
0.6299	0.6562	60	1.1958	3340632
0.5727	0.7109	65	1.1848	3619656
0.5546	0.7656	70	1.1821	3898496
0.632	0.8202	75	1.1899	4177160
0.5853	0.8749	80	1.1794	4460344
0.5044	0.9296	85	1.1827	4736688
0.5797	0.9843	90	1.1790	5018304