collapse_gemma-2-2b_hs2_accumulatesubsample_iter13_sftsd2

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. At the last logged evaluation (step 90, epoch 0.9568) it reaches a validation loss of 1.2051 after 4811336 input tokens seen.

Model description

More information needed

The training hyperparameters are not listed here. Training loss, validation loss, and input tokens seen were logged every 5 steps during training:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.3546	0.0532	5	1.2770	264160
1.0602	0.1063	10	1.2137	532584
0.9619	0.1595	15	1.2117	800536
0.8456	0.2126	20	1.2305	1064552
0.8874	0.2658	25	1.2288	1334288
0.7271	0.3189	30	1.2471	1604456
0.6848	0.3721	35	1.2268	1869408
0.6600	0.4252	40	1.2269	2137928
0.5898	0.4784	45	1.2345	2405736
0.5111	0.5316	50	1.2218	2670688
0.5592	0.5847	55	1.2104	2939792
0.4165	0.6379	60	1.2177	3205680
0.5257	0.6910	65	1.2159	3475424
0.3911	0.7442	70	1.2172	3741984
0.4243	0.7973	75	1.2121	4012288
0.5120	0.8505	80	1.2124	4271576
0.4730	0.9037	85	1.2070	4541040
0.3554	0.9568	90	1.2051	4811336
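
The log above can be summarized with a few lines of Python. The following is a minimal sketch, not part of the original training code: the `ROWS` list is copied by hand from the table, and the helper names (`best_validation`, `tokens_per_step`, `final_gap`) are hypothetical. It finds the step with the lowest validation loss, the average number of input tokens per logged step, and the difference between validation and training loss at the last logged step.

```python
# Rows copied from the table above:
# (step, training_loss, validation_loss, input_tokens_seen).
# Step 0 has no training loss logged ("No log"), so it is stored as None.
ROWS = [
    (0, None, 1.3909, 0),
    (5, 1.3546, 1.2770, 264160),
    (10, 1.0602, 1.2137, 532584),
    (15, 0.9619, 1.2117, 800536),
    (20, 0.8456, 1.2305, 1064552),
    (25, 0.8874, 1.2288, 1334288),
    (30, 0.7271, 1.2471, 1604456),
    (35, 0.6848, 1.2268, 1869408),
    (40, 0.6600, 1.2269, 2137928),
    (45, 0.5898, 1.2345, 2405736),
    (50, 0.5111, 1.2218, 2670688),
    (55, 0.5592, 1.2104, 2939792),
    (60, 0.4165, 1.2177, 3205680),
    (65, 0.5257, 1.2159, 3475424),
    (70, 0.3911, 1.2172, 3741984),
    (75, 0.4243, 1.2121, 4012288),
    (80, 0.5120, 1.2124, 4271576),
    (85, 0.4730, 1.2070, 4541040),
    (90, 0.3554, 1.2051, 4811336),
]


def best_validation(rows):
    """Return (step, validation_loss) for the row with the lowest validation loss."""
    step, _, val_loss, _ = min(rows, key=lambda row: row[2])
    return step, val_loss


def tokens_per_step(rows):
    """Average input tokens seen per step between the first and last logged rows."""
    first, last = rows[0], rows[-1]
    return (last[3] - first[3]) / (last[0] - first[0])


def final_gap(rows):
    """Validation loss minus training loss at the last logged step."""
    _, train_loss, val_loss, _ = rows[-1]
    return val_loss - train_loss


if __name__ == "__main__":
    step, val_loss = best_validation(ROWS)
    print(f"lowest validation loss: {val_loss:.4f} at step {step}")
    print(f"average tokens per step: {tokens_per_step(ROWS):.0f}")
    print(f"final validation - training loss: {final_gap(ROWS):.4f}")
```

On these rows the lowest validation loss is the last logged one (1.2051 at step 90), while the training loss at that step is 0.3554. Validation loss stays between roughly 1.205 and 1.247 from step 10 onward while training loss keeps falling, so the widening gap is worth checking before training for more steps on the same data.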