collapse_gemma-2-9b_hs2_accumulate_iter2_sftsd2

This model is a fine-tuned version of google/gemma-2-9b on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed

More information needed

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.2335	0
1.2828	0.0271	5	1.1073	265560
1.0134	0.0542	10	1.0202	534484
1.033	0.0814	15	0.9863	798368
0.8674	0.1085	20	0.9835	1059736
0.8101	0.1356	25	0.9846	1317664
0.7184	0.1627	30	0.9897	1578212
0.7122	0.1899	35	0.9834	1838956
0.7129	0.2170	40	0.9781	2102368
0.643	0.2441	45	0.9751	2365072
0.6169	0.2712	50	0.9738	2626376
0.7176	0.2984	55	0.9700	2886524
0.5972	0.3255	60	0.9665	3149448
0.573	0.3526	65	0.9639	3415664
0.6035	0.3797	70	0.9629	3676764
0.6096	0.4068	75	0.9598	3940104
0.5832	0.4340	80	0.9585	4204740
0.6262	0.4611	85	0.9572	4467556
0.6814	0.4882	90	0.9555	4731864
0.6672	0.5153	95	0.9533	4997040
0.5181	0.5425	100	0.9519	5263636
0.5759	0.5696	105	0.9515	5527476
0.597	0.5967	110	0.9507	5790324
0.5898	0.6238	115	0.9501	6054116
0.6857	0.6510	120	0.9496	6313184
0.5666	0.6781	125	0.9490	6573064
0.5007	0.7052	130	0.9491	6839704
0.5295	0.7323	135	0.9473	7101760
0.5782	0.7595	140	0.9458	7359916
0.5476	0.7866	145	0.9456	7629448
0.5752	0.8137	150	0.9457	7888404
0.48	0.8408	155	0.9444	8151276
0.6858	0.8679	160	0.9448	8410464
0.569	0.8951	165	0.9454	8677664
0.5906	0.9222	170	0.9441	8944556
0.5673	0.9493	175	0.9441	9201860
0.6069	0.9764	180	0.9446	9467072