collapse_gemma-2-2b_hs2_accumulate_iter3_sftsd2

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed

More information needed

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.5228	0.0530	5	1.2631	285904
1.2844	0.1060	10	1.1761	573320
1.269	0.1589	15	1.1464	856296
1.1088	0.2119	20	1.1267	1134800
1.0695	0.2649	25	1.1290	1413200
1.027	0.3179	30	1.1306	1697288
0.9688	0.3709	35	1.1340	1980216
0.9701	0.4238	40	1.1427	2266568
0.949	0.4768	45	1.1409	2548552
0.9408	0.5298	50	1.1578	2839880
0.9139	0.5828	55	1.1506	3115520
0.8606	0.6358	60	1.1560	3398440
0.8238	0.6887	65	1.1561	3687696
0.8161	0.7417	70	1.1506	3977240
0.7423	0.7947	75	1.1503	4256976
0.7188	0.8477	80	1.1514	4544776
0.6642	0.9007	85	1.1464	4827760
0.6403	0.9536	90	1.1524	5108184