collapse_gemma-2-2b_hs2_massive_iter1_sftsd1

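This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. On the evaluation set, validation loss falls from 1.3909 before training (step 0) to 1.0642 at the last logged step (step 95, about 5.59M input tokens seen).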

Model description

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.2754	0.0511	5	1.2593	285512
1.2153	0.1021	10	1.1717	578296
1.1556	0.1532	15	1.1341	873440
1.1445	0.2042	20	1.1080	1168560
1.0672	0.2553	25	1.0979	1463952
1.1502	0.3063	30	1.0929	1754024
1.0342	0.3574	35	1.0884	2046160
1.0635	0.4084	40	1.0853	2341224
1.1419	0.4595	45	1.0824	2635056
1.0155	0.5105	50	1.0796	2927424
1.0927	0.5616	55	1.0768	3221968
1.1001	0.6126	60	1.0747	3519568
1.0711	0.6637	65	1.0727	3816688
1.0622	0.7147	70	1.0711	4117768
1.0785	0.7658	75	1.0695	4418488
1.1540	0.8168	80	1.0683	4709408
1.1034	0.8679	85	1.0669	5000912
1.0458	0.9190	90	1.0655	5295112
1.0685	0.9700	95	1.0642	5591032
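
The log above can be summarized with a few lines of Python. This is a minimal sketch, not part of the original training code: it assumes the validation loss is a mean cross-entropy in nats per token (so that perplexity is `exp(loss)`), and the names `ROWS`, `perplexity`, and `summarize` are hypothetical helpers introduced here for illustration. Only a handful of rows are copied from the table.

```python
import math

# Rows copied from the training log above:
# (step, epoch, validation_loss, input_tokens_seen)
ROWS = [
    (0, 0.0, 1.3909, 0),
    (5, 0.0511, 1.2593, 285512),
    (50, 0.5105, 1.0796, 2927424),
    (95, 0.9700, 1.0642, 5591032),
]


def perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss to perplexity.

    Assumption: the loss is measured in nats per token.
    """
    return math.exp(loss)


def summarize(rows):
    """Compare the first and last logged rows of the training log."""
    first_step, _, first_loss, first_tokens = rows[0]
    last_step, last_epoch, last_loss, last_tokens = rows[-1]
    steps = last_step - first_step
    return {
        "ppl_start": perplexity(first_loss),
        "ppl_end": perplexity(last_loss),
        # Average number of input tokens consumed per logged step.
        "tokens_per_step": (last_tokens - first_tokens) / steps,
        # Rough estimate of how many steps make up one full epoch.
        "steps_per_epoch": last_step / last_epoch,
    }


if __name__ == "__main__":
    for key, value in summarize(ROWS).items():
        print(f"{key}: {value:.4f}")
```

Under that assumption, the drop in validation loss corresponds to perplexity falling from roughly 4.0 to roughly 2.9 over a little under one epoch.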