collapse_gemma-2-2b_hs2_accumulate_iter3_sftsd0

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed

More information needed

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.3102	0.0539	5	1.2636	283856
1.3691	0.1077	10	1.1765	571632
1.1242	0.1616	15	1.1491	857944
1.1425	0.2155	20	1.1270	1146080
1.1527	0.2694	25	1.1235	1423176
1.0294	0.3232	30	1.1268	1709384
0.9761	0.3771	35	1.1413	1997472
1.0079	0.4310	40	1.1340	2289288
0.9212	0.4848	45	1.1454	2577432
0.871	0.5387	50	1.1548	2863824
0.8043	0.5926	55	1.1584	3143184
0.7448	0.6465	60	1.1527	3429216
0.8393	0.7003	65	1.1466	3713984
0.8134	0.7542	70	1.1457	4005488
0.7978	0.8081	75	1.1524	4284408
0.7489	0.8620	80	1.1426	4569048
0.6384	0.9158	85	1.1419	4853184
0.6986	0.9697	90	1.1415	5141336