collapse_gemma-2-2b_hs2_accumulatesubsample_iter13_sftsd0

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed

More information needed

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.34	0.0538	5	1.2783	273912
1.1013	0.1075	10	1.2124	550336
0.9918	0.1613	15	1.2140	821232
0.8609	0.2151	20	1.2190	1094808
0.7352	0.2688	25	1.2393	1360608
0.7336	0.3226	30	1.2311	1633144
0.6607	0.3763	35	1.2354	1902744
0.543	0.4301	40	1.2269	2170672
0.5362	0.4839	45	1.2253	2438088
0.5783	0.5376	50	1.2295	2709272
0.4413	0.5914	55	1.2153	2982760
0.5566	0.6452	60	1.2091	3250856
0.5763	0.6989	65	1.2251	3522440
0.4629	0.7527	70	1.2077	3792592
0.4905	0.8065	75	1.2210	4052656
0.4028	0.8602	80	1.2064	4317496
0.4751	0.9140	85	1.2065	4590056
0.4461	0.9677	90	1.2108	4861056