calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0710

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
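With a linear scheduler and no warmup, the learning rate decays from its initial value to zero over the total number of optimizer steps. A minimal sketch of that decay, assuming 200 total steps (40 epochs × 5 steps per epoch, consistent with the results table below); the function name and warmup handling are illustrative, not part of the training code:

```python
def linear_lr(step, base_lr=0.001, total_steps=200, warmup_steps=0):
    """Linearly decay base_lr to 0 over total_steps (optional linear warmup)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Learning rate at the start, midpoint, and end of training
print(linear_lr(0), linear_lr(100), linear_lr(200))  # → 0.001 0.0005 0.0
```

This mirrors what `lr_scheduler_type: linear` does in the Transformers `Trainer`, where the step count is derived from the dataset size, batch size, and epoch count.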

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 3.0915        | 1.0   | 5    | 2.3521          |
| 2.1484        | 2.0   | 10   | 1.8601          |
| 1.7128        | 3.0   | 15   | 1.4527          |
| 1.3553        | 4.0   | 20   | 1.1778          |
| 1.1163        | 5.0   | 25   | 1.0277          |
| 0.9841        | 6.0   | 30   | 0.9234          |
| 0.8693        | 7.0   | 35   | 0.7778          |
| 0.7649        | 8.0   | 40   | 0.7049          |
| 0.7043        | 9.0   | 45   | 0.6547          |
| 0.6440        | 10.0  | 50   | 0.6092          |
| 0.6069        | 11.0  | 55   | 0.5777          |
| 0.5713        | 12.0  | 60   | 0.5318          |
| 0.5384        | 13.0  | 65   | 0.4881          |
| 0.5031        | 14.0  | 70   | 0.4651          |
| 0.4705        | 15.0  | 75   | 0.4390          |
| 0.4453        | 16.0  | 80   | 0.4080          |
| 0.4165        | 17.0  | 85   | 0.3966          |
| 0.3953        | 18.0  | 90   | 0.3614          |
| 0.3782        | 19.0  | 95   | 0.3430          |
| 0.3625        | 20.0  | 100  | 0.3272          |
| 0.3394        | 21.0  | 105  | 0.3016          |
| 0.3107        | 22.0  | 110  | 0.2624          |
| 0.2814        | 23.0  | 115  | 0.2426          |
| 0.2610        | 24.0  | 120  | 0.2223          |
| 0.2468        | 25.0  | 125  | 0.1960          |
| 0.2233        | 26.0  | 130  | 0.1802          |
| 0.2052        | 27.0  | 135  | 0.1603          |
| 0.1890        | 28.0  | 140  | 0.1367          |
| 0.1708        | 29.0  | 145  | 0.1219          |
| 0.1577        | 30.0  | 150  | 0.1105          |
| 0.1487        | 31.0  | 155  | 0.1030          |
| 0.1400        | 32.0  | 160  | 0.0960          |
| 0.1308        | 33.0  | 165  | 0.0913          |
| 0.1265        | 34.0  | 170  | 0.0849          |
| 0.1187        | 35.0  | 175  | 0.0796          |
| 0.1138        | 36.0  | 180  | 0.0781          |
| 0.1113        | 37.0  | 185  | 0.0747          |
| 0.1079        | 38.0  | 190  | 0.0743          |
| 0.1083        | 39.0  | 195  | 0.0714          |
| 0.1057        | 40.0  | 200  | 0.0710          |
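Each epoch covers 5 optimizer steps at a train batch size of 512, which implies a training set of at most 2,560 examples. A short sketch (variable names are illustrative, and only a few rows from the table are copied in) checking that validation loss falls monotonically across the logged epochs and computing that implied size:

```python
# A few (epoch, validation_loss) pairs copied from the table above
val_loss = {1: 2.3521, 10: 0.6092, 20: 0.3272, 30: 0.1105, 40: 0.0710}

# Validation loss should fall monotonically across these checkpoints
epochs = sorted(val_loss)
assert all(val_loss[a] > val_loss[b] for a, b in zip(epochs, epochs[1:]))

# Implied upper bound on training-set size: steps_per_epoch * train_batch_size
steps_per_epoch, train_batch_size = 5, 512
max_examples = steps_per_epoch * train_batch_size
print(max_examples)  # → 2560
```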

Framework versions

  • Transformers 5.0.0
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2

Model details

  • Model size: 7.8M parameters
  • Tensor type: F32
  • Format: Safetensors