calculator_model_test

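This model is a fine-tuned version of a base model that the card does not name, trained on a dataset that it also leaves unspecified. It reaches a validation loss of 0.1432 at epoch 40, the final entry in the training log below.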

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 512
eval_batch_size: 512
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 40
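
To make the linear scheduler setting concrete, here is a minimal stdlib-only sketch of a linear-decay learning rate. It assumes zero warmup steps (none are listed) and 240 total optimizer steps (the last step in the training log). The function name `linear_lr` is hypothetical; it illustrates the usual decay-to-zero rule and is not the code used to train this model.

```python
def linear_lr(step, base_lr=1e-3, total_steps=240, warmup_steps=0):
    """Learning rate after `step` optimizer steps: linear warmup, then linear decay to zero.

    Assumptions (not stated in the card): warmup_steps=0, total_steps=240.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, midpoint, and end of training.
for step in (0, 120, 240):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at 0.001, is roughly halved by step 120, and reaches zero at step 240.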

Training Loss	Epoch	Step	Validation Loss
2.9619	1.0	6	2.2703
2.0327	2.0	12	1.7673
1.6053	3.0	18	1.3886
1.2679	4.0	24	1.1616
1.0723	5.0	30	1.0302
0.9910	6.0	36	0.9907
0.9064	7.0	42	0.8454
0.8019	8.0	48	0.7476
0.7394	9.0	54	0.7726
0.7155	10.0	60	0.6386
0.6290	11.0	66	0.6013
0.5877	12.0	72	0.5625
0.5574	13.0	78	0.5217
0.5265	14.0	84	0.4952
0.5026	15.0	90	0.4686
0.4780	16.0	96	0.4439
0.4424	17.0	102	0.4193
0.4208	18.0	108	0.4193
0.4181	19.0	114	0.3753
0.3887	20.0	120	0.3754
0.3926	21.0	126	0.3477
0.3558	22.0	132	0.3137
0.3339	23.0	138	0.3153
0.3184	24.0	144	0.2835
0.2936	25.0	150	0.2796
0.2708	26.0	156	0.2514
0.2641	27.0	162	0.2385
0.2414	28.0	168	0.2228
0.2311	29.0	174	0.2163
0.2236	30.0	180	0.2022
0.2206	31.0	186	0.1944
0.2064	32.0	192	0.1902
0.1955	33.0	198	0.1744
0.1864	34.0	204	0.1688
0.1777	35.0	210	0.1662
0.1765	36.0	216	0.1539
0.1627	37.0	222	0.1517
0.1600	38.0	228	0.1478
0.1561	39.0	234	0.1442
0.1537	40.0	240	0.1432
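
The log above can be checked programmatically. The following stdlib-only sketch copies the validation losses from the table and lists the epochs where validation loss did not improve on the previous epoch. It also bounds the training-set size implied by 6 steps per epoch at batch size 512, assuming a single device, no gradient accumulation, and a final partial batch that is kept. The card confirms none of these assumptions, and the helper names are hypothetical.

```python
# Validation loss per epoch (epochs 1..40), copied from the training log.
val_loss = [
    2.2703, 1.7673, 1.3886, 1.1616, 1.0302, 0.9907, 0.8454, 0.7476,
    0.7726, 0.6386, 0.6013, 0.5625, 0.5217, 0.4952, 0.4686, 0.4439,
    0.4193, 0.4193, 0.3753, 0.3754, 0.3477, 0.3137, 0.3153, 0.2835,
    0.2796, 0.2514, 0.2385, 0.2228, 0.2163, 0.2022, 0.1944, 0.1902,
    0.1744, 0.1688, 0.1662, 0.1539, 0.1517, 0.1478, 0.1442, 0.1432,
]


def non_improving_epochs(losses):
    """Return 1-based epochs whose loss is not strictly below the previous epoch's."""
    return [i + 1 for i in range(1, len(losses)) if losses[i] >= losses[i - 1]]


def train_size_bounds(steps_per_epoch, batch_size):
    """Smallest and largest n with ceil(n / batch_size) == steps_per_epoch."""
    return (steps_per_epoch - 1) * batch_size + 1, steps_per_epoch * batch_size


print(non_improving_epochs(val_loss))  # epochs 9, 18, 20, 23
print(train_size_bounds(6, 512))       # (2561, 3072)
```

Validation loss stalls or ticks up at only four epochs and otherwise falls steadily, so the log shows no sustained overfitting within 40 epochs.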

Safetensors

Model size: 7.8M params

Tensor type: F32
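
As a rough check on what these figures imply for storage, the sketch below estimates the size of the weights alone. It assumes that 7.8M is a rounded parameter count and that every parameter is stored as F32 (4 bytes); file headers and metadata are ignored. The helper name `weight_bytes` is hypothetical.

```python
def weight_bytes(num_params, bytes_per_param=4):
    """Approximate storage for the weights alone (F32 uses 4 bytes per parameter)."""
    return num_params * bytes_per_param


size = weight_bytes(7_800_000)  # 7.8M is a rounded count, so this is an estimate
print(f"{size / 1e6:.1f} MB, {size / 2**20:.1f} MiB")
```

That comes to about 31 MB of weights.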
