collapse_gemma-2-2b_hs2_accumulatesubsample_iter4_sftsd1

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed

More information needed

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.4366	0.0537	5	1.2652	272808
1.205	0.1075	10	1.1792	543728
1.0601	0.1612	15	1.1613	817168
0.9601	0.2149	20	1.1532	1092816
0.9378	0.2686	25	1.1523	1371272
0.9852	0.3224	30	1.1648	1652968
0.9609	0.3761	35	1.1735	1931576
0.8948	0.4298	40	1.1661	2213968
0.8069	0.4835	45	1.1685	2496776
0.6446	0.5373	50	1.1695	2771880
0.7284	0.5910	55	1.1612	3049008
0.6245	0.6447	60	1.1637	3321840
0.5641	0.6985	65	1.1559	3594864
0.5613	0.7522	70	1.1590	3871512
0.6246	0.8059	75	1.1572	4140888
0.6635	0.8596	80	1.1523	4417664
0.626	0.9134	85	1.1528	4694904
0.579	0.9671	90	1.1477	4973416