opt-babylm2-clean-spacy-32k_seed-42_3e-4

This model was trained from scratch on the kanishka/babylm2-clean-spacy dataset. After the final epoch (20), it reaches a loss of 3.0190 and an accuracy of 0.4234 on the evaluation set; per-epoch results are listed in the table below.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training results

Each row reports the training loss, validation loss, and accuracy logged at the end of one epoch; one epoch is 15,543 steps.

Training Loss	Epoch	Step	Validation Loss	Accuracy
3.3944	1.0	15543	3.4212	0.3734
3.1245	2.0	31086	3.2037	0.3940
2.9807	3.0	46629	3.0794	0.4073
2.8872	4.0	62172	3.0205	0.4140
2.8286	5.0	77715	2.9885	0.4180
2.779	6.0	93258	2.9699	0.4206
2.7316	7.0	108801	2.9588	0.4222
2.6909	8.0	124344	2.9554	0.4233
2.6504	9.0	139887	2.9544	0.4238
2.6246	10.0	155430	2.9523	0.4244
2.5988	11.0	170973	2.9568	0.4248
2.5639	12.0	186516	2.9595	0.4248
2.5361	13.0	202059	2.9698	0.4248
2.5098	14.0	217602	2.9747	0.4247
2.4899	15.0	233145	2.9792	0.4247
2.4626	16.0	248688	2.9882	0.4244
2.4399	17.0	264231	2.9961	0.4243
2.4186	18.0	279774	3.0051	0.4239
2.3869	19.0	295317	3.0119	0.4237
2.3686	20.0	310860	3.0190	0.4234
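
The table can be summarized programmatically. The following is a minimal sketch, not part of the original training code: it copies the rows above into a list, picks the epoch with the lowest validation loss, and converts losses to perplexity. The names `RESULTS`, `best_by_validation_loss`, and `perplexity` are hypothetical, and the perplexity conversion assumes the reported loss is a mean per-token cross-entropy in nats, which the card does not state.

```python
import math

# (epoch, training_loss, validation_loss, accuracy), copied from the table above.
RESULTS = [
    (1, 3.3944, 3.4212, 0.3734),
    (2, 3.1245, 3.2037, 0.3940),
    (3, 2.9807, 3.0794, 0.4073),
    (4, 2.8872, 3.0205, 0.4140),
    (5, 2.8286, 2.9885, 0.4180),
    (6, 2.7790, 2.9699, 0.4206),
    (7, 2.7316, 2.9588, 0.4222),
    (8, 2.6909, 2.9554, 0.4233),
    (9, 2.6504, 2.9544, 0.4238),
    (10, 2.6246, 2.9523, 0.4244),
    (11, 2.5988, 2.9568, 0.4248),
    (12, 2.5639, 2.9595, 0.4248),
    (13, 2.5361, 2.9698, 0.4248),
    (14, 2.5098, 2.9747, 0.4247),
    (15, 2.4899, 2.9792, 0.4247),
    (16, 2.4626, 2.9882, 0.4244),
    (17, 2.4399, 2.9961, 0.4243),
    (18, 2.4186, 3.0051, 0.4239),
    (19, 2.3869, 3.0119, 0.4237),
    (20, 2.3686, 3.0190, 0.4234),
]


def best_by_validation_loss(rows):
    """Return the row whose validation loss (index 2) is lowest."""
    return min(rows, key=lambda row: row[2])


def perplexity(loss):
    """Convert a loss to perplexity, assuming mean cross-entropy in nats."""
    return math.exp(loss)


best = best_by_validation_loss(RESULTS)
final = RESULTS[-1]

print(f"best epoch:  {best[0]}, validation loss {best[2]:.4f}, "
      f"perplexity {perplexity(best[2]):.2f}")
print(f"final epoch: {final[0]}, validation loss {final[2]:.4f}, "
      f"perplexity {perplexity(final[2]):.2f}")
```

Validation loss is lowest at epoch 10 (2.9523) and rises in every later epoch while training loss keeps falling, a pattern consistent with overfitting. Accuracy peaks at 0.4248 in epochs 11 to 13. If the published weights come from the final epoch, they are not the checkpoint with the lowest validation loss.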