collapse_gemma-2-2b_hs2_replace_iter4_sftsd0

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. Its training and validation loss over the course of training are logged in the table below.

Model description

More information needed

Training results

Training loss and validation loss were logged every 5 steps for just under one epoch (155 steps, about 8.1M input tokens):

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3956	0
1.5456	0.0318	5	1.3070	265304
1.2703	0.0636	10	1.2218	523568
0.9657	0.0954	15	1.2481	782936
0.6809	0.1271	20	1.3512	1047016
0.5361	0.1589	25	1.4749	1317216
0.3923	0.1907	30	1.5943	1576968
0.2609	0.2225	35	1.7244	1835720
0.1476	0.2543	40	1.9387	2096288
0.0973	0.2861	45	2.0971	2356272
0.1096	0.3178	50	2.1618	2620648
0.0651	0.3496	55	2.1722	2877880
0.0528	0.3814	60	2.1908	3133480
0.0455	0.4132	65	2.2489	3397968
0.0474	0.4450	70	2.2538	3659672
0.0919	0.4768	75	2.2320	3919344
0.0311	0.5085	80	2.1670	4187912
0.0319	0.5403	85	2.2046	4447608
0.0278	0.5721	90	2.2165	4705488
0.0284	0.6039	95	2.2048	4979080
0.0549	0.6357	100	2.1651	5237472
0.0319	0.6675	105	2.1267	5496736
0.0304	0.6992	110	2.1044	5759232
0.0274	0.7310	115	2.0821	6021968
0.0307	0.7628	120	2.0860	6280048
0.0297	0.7946	125	2.1247	6547056
0.0283	0.8264	130	2.1514	6801816
0.0295	0.8582	135	2.1703	7057840
0.0326	0.8899	140	2.1964	7323848
0.0283	0.9217	145	2.1959	7580872
0.0439	0.9535	150	2.1948	7845552
0.0282	0.9853	155	2.1853	8107432
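
One way to read the log above is to find the step with the lowest validation loss and compare it with the training loss at the same point. The following is a minimal sketch under stated assumptions: it uses only a handful of rows copied by hand from the table, and the helper names `best_checkpoint` and `generalization_gap` are hypothetical illustrations, not part of any library or of the original training code.

```python
# A few (step, training_loss, validation_loss) rows copied from the table above.
# Training loss is None at step 0, where the table shows "No log".
ROWS = [
    (0, None, 1.3956),
    (5, 1.5456, 1.3070),
    (10, 1.2703, 1.2218),
    (15, 0.9657, 1.2481),
    (20, 0.6809, 1.3512),
    (50, 0.1096, 2.1618),
    (100, 0.0549, 2.1651),
    (155, 0.0282, 2.1853),
]


def best_checkpoint(rows):
    """Return the (step, train_loss, val_loss) row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def generalization_gap(row):
    """Return validation loss minus training loss, or None if training loss was not logged."""
    _, train_loss, val_loss = row
    if train_loss is None:
        return None
    return val_loss - train_loss


if __name__ == "__main__":
    step, _, val_loss = best_checkpoint(ROWS)
    print(f"lowest validation loss {val_loss:.4f} at step {step}")
    for row in ROWS:
        gap = generalization_gap(row)
        gap_text = "n/a" if gap is None else f"{gap:.4f}"
        print(f"step {row[0]:>3}: gap = {gap_text}")
```

On these rows the lowest validation loss is 1.2218 at step 10, which is also the minimum over the full table. After that point validation loss rises while training loss keeps falling, a pattern usually read as overfitting, so an early checkpoint may be preferable to the final one if validation loss is the selection criterion.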