collapse_gemma-2-2b_hs2_replace_iter10_sftsd2

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed

The following hyperparameters were used during training:

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3956	0
1.6727	0.0315	5	1.3102	246184
1.009	0.0630	10	1.2543	489448
0.667	0.0945	15	1.3838	726824
0.4294	0.1259	20	1.5708	970024
0.2248	0.1574	25	1.6675	1212248
0.1908	0.1889	30	1.8604	1467208
0.1451	0.2204	35	2.0014	1710152
0.0621	0.2519	40	2.1971	1954192
0.0381	0.2834	45	2.3063	2191576
0.0367	0.3148	50	2.3949	2433592
0.036	0.3463	55	2.4774	2679696
0.0294	0.3778	60	2.5588	2928552
0.0283	0.4093	65	2.5792	3173464
0.0285	0.4408	70	2.6130	3413776
0.0246	0.4723	75	2.6031	3659144
0.0239	0.5037	80	2.6188	3912088
0.023	0.5352	85	2.6231	4148400
0.0251	0.5667	90	2.5840	4398984
0.0236	0.5982	95	2.5662	4651040
0.0264	0.6297	100	2.5629	4894920
0.0243	0.6612	105	2.5727	5137152
0.0256	0.6926	110	2.5955	5378304
0.0235	0.7241	115	2.6078	5624672
0.0242	0.7556	120	2.6111	5877704
0.024	0.7871	125	2.6151	6124640
0.0265	0.8186	130	2.6286	6367576
0.0224	0.8501	135	2.6392	6614328
0.0242	0.8815	140	2.6356	6856504
0.023	0.9130	145	2.6439	7105832
0.0238	0.9445	150	2.6567	7354200
0.0244	0.9760	155	2.6456	7601504
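
In the table above, validation loss is lowest early in the run and then rises while training loss keeps falling. The following is a minimal sketch of how one might locate the lowest-validation-loss step and measure the gap between validation and training loss. It assumes a handful of rows copied by hand from the table (step 0 is left out because it has no logged training loss); the helper names `best_row` and `generalization_gap` are illustrative and not part of any library or of the original training script.

```python
# Rows copied by hand from the table above:
# (step, training loss, validation loss, input tokens seen)
ROWS = [
    (5, 1.6727, 1.3102, 246184),
    (10, 1.009, 1.2543, 489448),
    (15, 0.667, 1.3838, 726824),
    (20, 0.4294, 1.5708, 970024),
    (50, 0.0367, 2.3949, 2433592),
    (100, 0.0264, 2.5629, 4894920),
    (155, 0.0244, 2.6456, 7601504),
]


def best_row(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def generalization_gap(row):
    """Return validation loss minus training loss for one row."""
    return row[2] - row[1]


if __name__ == "__main__":
    step, train_loss, val_loss, tokens = best_row(ROWS)
    print(f"lowest validation loss {val_loss} at step {step} ({tokens} input tokens seen)")
    for row in ROWS:
        print(f"step {row[0]:>3}: gap = {generalization_gap(row):.4f}")
```

A widening gap like this is commonly read as a sign of overfitting, so the checkpoint nearest the lowest validation loss may be a better candidate than the final one, depending on what the run was meant to study.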