collapse_gemma-2-2b_hs2_replace_iter9_sftsd0

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. Its validation loss on the evaluation set is reported in the training results table below.

Model description

More information needed


Training results

Training and validation loss, logged every 5 steps:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3956	0
1.6327	0.0315	5	1.3089	247632
1.1975	0.0630	10	1.2561	504648
0.7639	0.0945	15	1.3571	747296
0.513	0.1259	20	1.5568	1008344
0.3238	0.1574	25	1.6800	1260640
0.1575	0.1889	30	1.8364	1510496
0.1051	0.2204	35	2.0453	1761320
0.0558	0.2519	40	2.1727	2016736
0.0395	0.2834	45	2.3560	2261448
0.0314	0.3148	50	2.4808	2507808
0.0264	0.3463	55	2.5543	2762208
0.0291	0.3778	60	2.5322	3006848
0.0324	0.4093	65	2.5662	3253536
0.0291	0.4408	70	2.6164	3500992
0.0271	0.4723	75	2.6075	3746280
0.0269	0.5037	80	2.5847	3999584
0.0248	0.5352	85	2.5952	4246920
0.0244	0.5667	90	2.6096	4503608
0.0247	0.5982	95	2.6188	4754536
0.0233	0.6297	100	2.6244	5005328
0.0248	0.6612	105	2.6260	5255024
0.0237	0.6926	110	2.6294	5501344
0.0277	0.7241	115	2.6373	5757840
0.0256	0.7556	120	2.6263	6010232
0.0254	0.7871	125	2.6241	6256112
0.0232	0.8186	130	2.6195	6504184
0.0243	0.8501	135	2.6216	6755968
0.0243	0.8815	140	2.6244	7006776
0.0255	0.9130	145	2.6225	7262768
0.0251	0.9445	150	2.6236	7515400
0.0243	0.9760	155	2.6327	7765712
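
The table can be summarized with a few lines of Python. The sketch below is illustrative only: it copies a subset of rows from the table by hand, and the names `ROWS`, `best_validation_row`, and `tokens_per_step` are hypothetical helpers, not part of any training script for this model. It locates the logged step with the lowest validation loss and estimates the average number of input tokens consumed per logged step.

```python
# Rows copied by hand from the training-results table above:
# (step, training loss, validation loss, input tokens seen).
# Only a subset is included for brevity; step 0 has no training loss logged.
ROWS = [
    (0, None, 1.3956, 0),
    (5, 1.6327, 1.3089, 247632),
    (10, 1.1975, 1.2561, 504648),
    (15, 0.7639, 1.3571, 747296),
    (20, 0.513, 1.5568, 1008344),
    (50, 0.0314, 2.4808, 2507808),
    (100, 0.0233, 2.6244, 5005328),
    (155, 0.0243, 2.6327, 7765712),
]


def best_validation_row(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def tokens_per_step(rows):
    """Average input tokens per step, computed from the last logged row."""
    step, _, _, tokens = rows[-1]
    return tokens / step


best = best_validation_row(ROWS)
print(f"lowest validation loss {best[2]} at step {best[0]}")
print(f"about {tokens_per_step(ROWS):.0f} input tokens per step")
```

On these rows the minimum falls early in training, after which validation loss rises while training loss keeps falling. That pattern is usually read as overfitting, although the card itself does not comment on it.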