collapse_gemma-2-2b_hs2_replace_iter8_sftsd2

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. Its evaluation-set results are the validation losses logged in the training results table at the end of this card.

Model description

More information needed

The hyperparameters used during training are not listed here.

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3956	0
1.6775	0.0317	5	1.3078	249640
1.1595	0.0633	10	1.2385	493152
0.7855	0.0950	15	1.3530	742568
0.4733	0.1267	20	1.5512	993096
0.3632	0.1584	25	1.6760	1241784
0.2405	0.1900	30	1.8061	1488408
0.1741	0.2217	35	2.0582	1736448
0.1075	0.2534	40	2.1692	1986304
0.054	0.2850	45	2.3143	2236192
0.0417	0.3167	50	2.4076	2485904
0.0317	0.3484	55	2.4662	2728784
0.0307	0.3800	60	2.4682	2981024
0.0281	0.4117	65	2.4533	3224056
0.0296	0.4434	70	2.4544	3466368
0.0281	0.4751	75	2.4627	3716376
0.029	0.5067	80	2.4801	3956352
0.0276	0.5384	85	2.5255	4200848
0.0266	0.5701	90	2.5232	4438112
0.0266	0.6017	95	2.5296	4681008
0.0271	0.6334	100	2.5345	4929272
0.026	0.6651	105	2.5361	5177360
0.0269	0.6968	110	2.5346	5422624
0.031	0.7284	115	2.5489	5664432
0.0277	0.7601	120	2.5383	5911992
0.0269	0.7918	125	2.5107	6157840
0.0245	0.8234	130	2.5145	6401608
0.0284	0.8551	135	2.5238	6639192
0.0269	0.8868	140	2.5150	6886080
0.0276	0.9184	145	2.5151	7135760
0.026	0.9501	150	2.5276	7383616
0.0259	0.9818	155	2.5393	7629864
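
In the log above, validation loss is lowest at step 10 (1.2385) and then climbs to about 2.5 while training loss falls below 0.03, a pattern usually read as overfitting. The sketch below shows one way to pick the best-scoring checkpoint and measure the gap between the two losses. It uses only an excerpt of the rows above, and the helper names `best_checkpoint` and `generalization_gap` are hypothetical, not part of any library or of the original training script.

```python
# Excerpt of the log above: (step, training_loss, validation_loss).
# training_loss is None where the card reports "No log".
LOG = [
    (0, None, 1.3956),
    (5, 1.6775, 1.3078),
    (10, 1.1595, 1.2385),
    (15, 0.7855, 1.3530),
    (20, 0.4733, 1.5512),
    (25, 0.3632, 1.6760),
    (50, 0.0417, 2.4076),
    (100, 0.0271, 2.5345),
    (155, 0.0259, 2.5393),
]


def best_checkpoint(log):
    """Return the (step, validation_loss) pair with the lowest validation loss."""
    step, _, val = min(log, key=lambda row: row[2])
    return step, val


def generalization_gap(log):
    """Return (step, validation_loss - training_loss) for rows with a training loss."""
    return [(s, round(v - t, 4)) for s, t, v in log if t is not None]


if __name__ == "__main__":
    print("lowest validation loss at (step, loss):", best_checkpoint(LOG))
    print("gap at final logged step:", generalization_gap(LOG)[-1])
```

If intermediate checkpoints were saved, this kind of check suggests that one near step 10 would score better on the evaluation set than the final one. Whether such checkpoints exist is not stated in this card.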