collapse_gemma-2-2b_hs2_accumulatesubsample_iter6_sftsd1

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.1630
  • Num Input Tokens Seen: 5028016
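
A minimal loading sketch, assuming the standard Gemma 2 causal-LM interface in Transformers; the prompt is illustrative only, and BF16 matches the stored tensor type of this checkpoint:

```python
# Minimal sketch (not an official usage snippet): load the checkpoint and generate.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RylanSchaeffer/collapse_gemma-2-2b_hs2_accumulatesubsample_iter6_sftsd1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("Write a haiku about fine-tuning:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```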

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a Trainer-API sketch follows the list):

  • learning_rate: 8e-06
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 1
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 1
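
For reference, a minimal sketch (not the original training script) of how these settings map onto Hugging Face `TrainingArguments`; the output directory is a hypothetical placeholder, and the Adam betas/epsilon listed above are the Trainer defaults, so they need no explicit arguments:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gemma-2-2b-sft",      # hypothetical placeholder, not the actual path
    learning_rate=8e-6,
    per_device_train_batch_size=8,    # train_batch_size above
    per_device_eval_batch_size=16,    # eval_batch_size above
    seed=1,
    gradient_accumulation_steps=16,   # 8 * 16 = 128 total train batch size
    lr_scheduler_type="constant_with_warmup",
    warmup_ratio=0.05,
    num_train_epochs=1,
    bf16=True,                        # checkpoint is stored in BF16
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the Trainer defaults.
)
```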

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| No log        | 0      | 0    | 1.3909          | 0                 |
| 1.4282        | 0.0533 | 5    | 1.2707          | 272080            |
| 1.0872        | 0.1065 | 10   | 1.1923          | 534176            |
| 1.0069        | 0.1598 | 15   | 1.1791          | 810424            |
| 0.9708        | 0.2130 | 20   | 1.1779          | 1085464           |
| 0.841         | 0.2663 | 25   | 1.1966          | 1354360           |
| 0.7559        | 0.3196 | 30   | 1.2040          | 1626064           |
| 0.726         | 0.3728 | 35   | 1.1940          | 1897496           |
| 0.7034        | 0.4261 | 40   | 1.1953          | 2174648           |
| 0.5682        | 0.4794 | 45   | 1.1947          | 2445704           |
| 0.575         | 0.5326 | 50   | 1.1886          | 2714920           |
| 0.566         | 0.5859 | 55   | 1.1807          | 2982200           |
| 0.5243        | 0.6391 | 60   | 1.1784          | 3246752           |
| 0.5905        | 0.6924 | 65   | 1.1718          | 3518224           |
| 0.473         | 0.7457 | 70   | 1.1766          | 3783208           |
| 0.5029        | 0.7989 | 75   | 1.1662          | 4047576           |
| 0.5819        | 0.8522 | 80   | 1.1747          | 4321368           |
| 0.5147        | 0.9055 | 85   | 1.1620          | 4594208           |
| 0.4796        | 0.9587 | 90   | 1.1722          | 4862792           |

Framework versions

  • Transformers 4.44.0
  • PyTorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Model weights

  • Model size: 2.61B params
  • Tensor type: BF16 (Safetensors)

Base model

  • google/gemma-2-2b