collapse_gemma-2-2b_hs2_replace_iter9_sftsd2

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. Final evaluation results are not listed; the validation loss logged during training appears in the table below.

Model description

More information needed

The hyperparameters used during training are not listed. Training and validation loss were logged every 5 steps over roughly one epoch:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3956	0
1.5959	0.0315	5	1.3066	254632
1.0717	0.0630	10	1.2465	502128
0.7125	0.0945	15	1.3592	744976
0.504	0.1260	20	1.5120	986472
0.2848	0.1575	25	1.6652	1237336
0.2452	0.1890	30	1.8288	1482344
0.1578	0.2205	35	1.9980	1732136
0.0569	0.2520	40	2.1960	1978848
0.0667	0.2835	45	2.3046	2223360
0.0341	0.3150	50	2.4331	2460800
0.0289	0.3465	55	2.4497	2702840
0.027	0.3780	60	2.5245	2953304
0.0265	0.4094	65	2.5800	3203880
0.0271	0.4409	70	2.5911	3452328
0.0265	0.4724	75	2.6014	3694936
0.0237	0.5039	80	2.6018	3940776
0.0253	0.5354	85	2.5984	4186160
0.0254	0.5669	90	2.6081	4427280
0.026	0.5984	95	2.6275	4674224
0.0249	0.6299	100	2.6499	4922464
0.0263	0.6614	105	2.6559	5169512
0.0295	0.6929	110	2.6640	5411768
0.0241	0.7244	115	2.6679	5655504
0.0259	0.7559	120	2.6763	5901264
0.0255	0.7874	125	2.6777	6144528
0.024	0.8189	130	2.6766	6387936
0.0228	0.8504	135	2.6707	6633736
0.0258	0.8819	140	2.6821	6868528
0.024	0.9134	145	2.6846	7115712
0.0257	0.9449	150	2.6769	7363728
0.0263	0.9764	155	2.6716	7603744
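
Reading the table, validation loss is lowest early in the run (1.2465 at step 10) and then climbs to about 2.67 while training loss falls to about 0.025, a pattern commonly associated with overfitting. The sketch below is one way to check this from the logged numbers. It uses a handful of rows copied from the table; the helper names `best_step` and `generalization_gap` are hypothetical and are not part of the original training code.

```python
# A few rows copied from the table above: (step, training_loss, validation_loss).
# Step 0 has no training loss logged ("No log"), so it is recorded as None.
rows = [
    (0, None, 1.3956),
    (5, 1.5959, 1.3066),
    (10, 1.0717, 1.2465),
    (15, 0.7125, 1.3592),
    (20, 0.504, 1.5120),
    (50, 0.0341, 2.4331),
    (100, 0.0249, 2.6499),
    (155, 0.0263, 2.6716),
]


def best_step(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r[2])


def generalization_gap(row):
    """Validation loss minus training loss; None when no training loss was logged."""
    _, train_loss, val_loss = row
    return None if train_loss is None else val_loss - train_loss


best = best_step(rows)
print("lowest validation loss at step", best[0], "with value", best[2])

for row in rows:
    gap = generalization_gap(row)
    if gap is not None:
        print(f"step {row[0]:>3}: validation - training = {gap:.4f}")
```

If intermediate checkpoints were saved (the card does not say), the one nearest step 10 would likely be the better candidate by validation loss than the final one.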