collapse_gemma-2-2b_hs2_replace_iter9_sftsd1

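This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. On the evaluation set, the validation loss starts at 1.3956 before training, reaches its lowest logged value of 1.2480 at step 10, and rises to 2.6733 by step 155; the full log is in the table below.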

Model description

More information needed

The hyperparameters used during training are not listed. Training results, logged every 5 steps:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3956	0
1.5468	0.0315	5	1.3110	253928
1.182	0.0630	10	1.2480	514960
0.8084	0.0945	15	1.3189	773760
0.6416	0.1259	20	1.4974	1041648
0.3966	0.1574	25	1.6165	1307976
0.2039	0.1889	30	1.8225	1565976
0.1576	0.2204	35	1.9499	1822872
0.0829	0.2519	40	2.1969	2080200
0.0476	0.2834	45	2.3565	2335552
0.0338	0.3148	50	2.4119	2590880
0.0303	0.3463	55	2.5071	2851232
0.0381	0.3778	60	2.5463	3110576
0.0307	0.4093	65	2.5668	3369800
0.0279	0.4408	70	2.5711	3630600
0.0262	0.4723	75	2.6104	3884416
0.0284	0.5037	80	2.6201	4140232
0.0265	0.5352	85	2.6255	4390344
0.0265	0.5667	90	2.6473	4646944
0.0288	0.5982	95	2.6452	4907960
0.0242	0.6297	100	2.6281	5157432
0.0235	0.6612	105	2.6248	5417680
0.0256	0.6926	110	2.6399	5680504
0.0224	0.7241	115	2.6534	5934288
0.0246	0.7556	120	2.6607	6188664
0.0313	0.7871	125	2.6628	6444560
0.0252	0.8186	130	2.6540	6702464
0.0258	0.8501	135	2.6528	6962424
0.0276	0.8815	140	2.6468	7217352
0.0245	0.9130	145	2.6580	7472288
0.025	0.9445	150	2.6685	7739408
0.0285	0.9760	155	2.6733	8001312
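
The table shows training loss falling toward zero while validation loss rises after step 10, which is easier to see when the two columns are compared directly. The following is a minimal sketch, not part of the original training code: it copies a handful of rows from the table by hand, and the helper names `best_step` and `generalization_gap` are hypothetical, introduced only for illustration.

```python
# A few rows copied from the table above: (step, training loss, validation loss).
# The "No log" entry at step 0 is represented as None.
rows = [
    (0, None, 1.3956),
    (5, 1.5468, 1.3110),
    (10, 1.182, 1.2480),
    (15, 0.8084, 1.3189),
    (20, 0.6416, 1.4974),
    (50, 0.0338, 2.4119),
    (100, 0.0242, 2.6281),
    (155, 0.0285, 2.6733),
]


def best_step(rows):
    """Return (step, validation_loss) for the row with the lowest validation loss."""
    step, _, val = min(rows, key=lambda r: r[2])
    return step, val


def generalization_gap(rows):
    """Return {step: validation_loss - training_loss}, skipping rows without a training loss."""
    return {step: round(val - train, 4) for step, train, val in rows if train is not None}


best = best_step(rows)
gaps = generalization_gap(rows)

print("lowest validation loss at step", best[0], "with loss", best[1])
for step, gap in gaps.items():
    print(step, gap)
```

On these rows the lowest validation loss is at step 10, and the gap between validation and training loss widens at every later step, which is the usual signature of overfitting. Whether step 10 is the best checkpoint for downstream use is a separate question that this log alone does not answer.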