collapse_gemma-2-2b_hs2_accumulate_iter2_sftsd0

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. The evaluation summary is not included here; at the last logged step (step 180, epoch 0.9866) the validation loss was 1.0892 after 10,513,336 input tokens.

Model description

More information needed

The training hyperparameters are not recorded here. Training and validation loss were logged every 5 steps:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.4986	0.0274	5	1.3330	291568
1.3182	0.0548	10	1.2111	587448
1.2698	0.0822	15	1.1561	878712
1.1636	0.1096	20	1.1285	1172912
1.1254	0.1370	25	1.1113	1462432
1.1388	0.1644	30	1.1125	1754352
1.0632	0.1918	35	1.1148	2044296
1.0854	0.2193	40	1.1123	2336344
1.0012	0.2467	45	1.1118	2629112
0.9763	0.2741	50	1.1233	2922992
0.8928	0.3015	55	1.1148	3212144
0.9294	0.3289	60	1.1208	3498808
0.9218	0.3563	65	1.1160	3790240
0.8805	0.3837	70	1.1220	4084176
0.8095	0.4111	75	1.1249	4369920
0.8382	0.4385	80	1.1195	4666480
0.8528	0.4659	85	1.1163	4959872
0.8016	0.4933	90	1.1147	5254800
0.8473	0.5207	95	1.1142	5546992
0.7947	0.5481	100	1.1122	5834416
0.7363	0.5755	105	1.1072	6127320
0.6941	0.6029	110	1.1062	6426288
0.7032	0.6304	115	1.1080	6714832
0.73	0.6578	120	1.1044	7008720
0.6667	0.6852	125	1.1017	7302184
0.6676	0.7126	130	1.1011	7596152
0.7638	0.7400	135	1.0994	7884552
0.7206	0.7674	140	1.0979	8179512
0.7141	0.7948	145	1.0960	8470208
0.7504	0.8222	150	1.0947	8761968
0.6988	0.8496	155	1.0930	9055184
0.7438	0.8770	160	1.0927	9343128
0.667	0.9044	165	1.0902	9637976
0.7389	0.9318	170	1.0913	9930512
0.7248	0.9592	175	1.0880	10226368
0.7772	0.9866	180	1.0892	10513336
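
The table above can be summarized with a few lines of Python. The following is a minimal sketch, not part of the original training code: it assumes the rows are available as tab-separated text in the same column order, and the names `LOG`, `LogRow`, and `parse_log` are hypothetical helpers introduced only for illustration. It reports the step with the lowest validation loss, the gap between validation and training loss at the last logged step, and the average number of input tokens per step.

```python
from typing import Optional, TypedDict


class LogRow(TypedDict):
    train_loss: Optional[float]  # None where the table says "No log"
    epoch: float
    step: int
    val_loss: float
    tokens_seen: int


# A subset of rows copied from the table above (tab-separated, same column order).
LOG = """\
No log\t0\t0\t1.3909\t0
1.4986\t0.0274\t5\t1.3330\t291568
1.1254\t0.1370\t25\t1.1113\t1462432
0.9763\t0.2741\t50\t1.1233\t2922992
0.7947\t0.5481\t100\t1.1122\t5834416
0.7504\t0.8222\t150\t1.0947\t8761968
0.7248\t0.9592\t175\t1.0880\t10226368
0.7772\t0.9866\t180\t1.0892\t10513336
"""


def parse_log(text: str) -> list[LogRow]:
    """Parse tab-separated log rows into typed dictionaries."""
    rows: list[LogRow] = []
    for line in text.strip().splitlines():
        train, epoch, step, val, tokens = line.split("\t")
        rows.append(
            LogRow(
                train_loss=None if train == "No log" else float(train),
                epoch=float(epoch),
                step=int(step),
                val_loss=float(val),
                tokens_seen=int(tokens),
            )
        )
    return rows


rows = parse_log(LOG)

# Checkpoint with the lowest validation loss among the parsed rows.
best = min(rows, key=lambda r: r["val_loss"])

# Gap between validation and training loss at the last logged step.
last = rows[-1]
gap = last["val_loss"] - last["train_loss"]

# Average number of input tokens consumed per optimizer step.
tokens_per_step = last["tokens_seen"] / last["step"]

print(f"best step: {best['step']} (validation loss {best['val_loss']:.4f})")
print(f"validation - training loss at step {last['step']}: {gap:.4f}")
print(f"average input tokens per step: {tokens_per_step:.0f}")
```

On the full table the lowest validation loss is also at step 175 (1.0880). Validation loss stays close to 1.11 between steps 25 and 100 while training loss keeps falling, so the widening gap may be worth checking before training for more epochs.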