calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9541
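The reported loss is the mean cross-entropy on the evaluation set. Assuming it is measured in nats (the usual convention for Transformers models), it can be read as a perplexity, sketched below:

```python
import math

# Final evaluation loss reported above (mean cross-entropy, assumed in nats)
eval_loss = 0.9541

# Perplexity is the exponential of the cross-entropy loss
perplexity = math.exp(eval_loss)
print(f"perplexity = {perplexity:.2f}")
```

This gives a perplexity of roughly 2.6, i.e. the model is about as uncertain as a uniform choice over ~2.6 tokens at each step.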

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
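As a sketch, the hyperparameters above roughly correspond to the following Hugging Face `TrainingArguments`; the `output_dir` is a placeholder, and exact argument names can vary across Transformers versions:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the training configuration listed above;
# "calculator_model_test" is used as a placeholder output directory.
training_args = TrainingArguments(
    output_dir="calculator_model_test",
    learning_rate=1e-4,
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    seed=42,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=40,
)
```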

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.6580 | 1.0 | 6 | 3.2453 |
| 3.1079 | 2.0 | 12 | 2.8981 |
| 2.8039 | 3.0 | 18 | 2.6629 |
| 2.6131 | 4.0 | 24 | 2.5056 |
| 2.4617 | 5.0 | 30 | 2.3752 |
| 2.3350 | 6.0 | 36 | 2.2642 |
| 2.2296 | 7.0 | 42 | 2.1748 |
| 2.1459 | 8.0 | 48 | 2.0943 |
| 2.0633 | 9.0 | 54 | 2.0142 |
| 1.9671 | 10.0 | 60 | 1.9194 |
| 1.8781 | 11.0 | 66 | 1.8104 |
| 1.7735 | 12.0 | 72 | 1.6989 |
| 1.6788 | 13.0 | 78 | 1.6040 |
| 1.5891 | 14.0 | 84 | 1.5262 |
| 1.5152 | 15.0 | 90 | 1.4682 |
| 1.4592 | 16.0 | 96 | 1.4185 |
| 1.4132 | 17.0 | 102 | 1.3766 |
| 1.3728 | 18.0 | 108 | 1.3385 |
| 1.3356 | 19.0 | 114 | 1.3063 |
| 1.3025 | 20.0 | 120 | 1.2722 |
| 1.2656 | 21.0 | 126 | 1.2434 |
| 1.2433 | 22.0 | 132 | 1.2168 |
| 1.2150 | 23.0 | 138 | 1.1880 |
| 1.1891 | 24.0 | 144 | 1.1642 |
| 1.1656 | 25.0 | 150 | 1.1414 |
| 1.1386 | 26.0 | 156 | 1.1177 |
| 1.1159 | 27.0 | 162 | 1.0988 |
| 1.0934 | 28.0 | 168 | 1.0751 |
| 1.0739 | 29.0 | 174 | 1.0548 |
| 1.0619 | 30.0 | 180 | 1.0354 |
| 1.0372 | 31.0 | 186 | 1.0206 |
| 1.0235 | 32.0 | 192 | 1.0072 |
| 1.0175 | 33.0 | 198 | 0.9949 |
| 1.0034 | 34.0 | 204 | 0.9841 |
| 0.9859 | 35.0 | 210 | 0.9759 |
| 0.9895 | 36.0 | 216 | 0.9683 |
| 0.9779 | 37.0 | 222 | 0.9622 |
| 0.9706 | 38.0 | 228 | 0.9574 |
| 0.9735 | 39.0 | 234 | 0.9549 |
| 0.9607 | 40.0 | 240 | 0.9541 |

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2

Model size

  • 7.8M params (F32, Safetensors)