calculator_model_test

This model is a fine-tuned version of an unspecified base model, trained on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7002

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused, `ADAMW_TORCH_FUSED`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
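The hyperparameters above can be collected into a configuration sketch. This is a minimal reconstruction, assuming the standard `transformers.TrainingArguments` field names; the original training script is not included in this card:

```python
# Hypothetical reconstruction of the training setup as a plain dict.
# Field names mirror transformers.TrainingArguments, but this is a sketch,
# not the original training script.
training_config = {
    "learning_rate": 1e-3,
    "per_device_train_batch_size": 512,
    "per_device_eval_batch_size": 512,
    "seed": 42,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 40,
}

# If transformers is installed, these could be passed through directly, e.g.:
# from transformers import TrainingArguments
# args = TrainingArguments(output_dir="calculator_model_test", **training_config)
```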

Training results

Training Loss | Epoch | Step | Validation Loss
3.4966        | 1.0   | 5    | 2.9381
2.5644        | 2.0   | 10   | 2.1833
1.9501        | 3.0   | 15   | 1.7482
1.6572        | 4.0   | 20   | 1.6286
1.5672        | 5.0   | 25   | 1.5984
1.5456        | 6.0   | 30   | 1.5522
1.5138        | 7.0   | 35   | 1.5623
1.5187        | 8.0   | 40   | 1.5271
1.4693        | 9.0   | 45   | 1.4929
1.4264        | 10.0  | 50   | 1.4364
1.3861        | 11.0  | 55   | 1.3868
1.3580        | 12.0  | 60   | 1.3506
1.3041        | 13.0  | 65   | 1.3024
1.2614        | 14.0  | 70   | 1.2353
1.1888        | 15.0  | 75   | 1.1557
1.1243        | 16.0  | 80   | 1.0897
1.0817        | 17.0  | 85   | 1.0597
1.0505        | 18.0  | 90   | 1.0613
1.0286        | 19.0  | 95   | 0.9825
0.9680        | 20.0  | 100  | 0.9919
0.9733        | 21.0  | 105  | 0.9443
0.9319        | 22.0  | 110  | 0.9685
0.9913        | 23.0  | 115  | 0.9770
0.9304        | 24.0  | 120  | 0.9019
0.8845        | 25.0  | 125  | 0.8840
0.8630        | 26.0  | 130  | 0.8568
0.8613        | 27.0  | 135  | 0.8521
0.8631        | 28.0  | 140  | 0.8390
0.8458        | 29.0  | 145  | 0.8094
0.8052        | 30.0  | 150  | 0.7904
0.7919        | 31.0  | 155  | 0.8096
0.7809        | 32.0  | 160  | 0.7677
0.7595        | 33.0  | 165  | 0.7542
0.7481        | 34.0  | 170  | 0.7461
0.7381        | 35.0  | 175  | 0.7301
0.7230        | 36.0  | 180  | 0.7188
0.7149        | 37.0  | 185  | 0.7115
0.7137        | 38.0  | 190  | 0.7115
0.7057        | 39.0  | 195  | 0.7009
0.7026        | 40.0  | 200  | 0.7002
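The logged schedule also lets us back out an approximate training-set size: 200 optimizer steps over 40 epochs is 5 steps per epoch. A minimal sketch of that arithmetic, assuming full batches of 512 (the last batch of each epoch may be partial in practice):

```python
# Infer steps per epoch and bounds on the training-set size
# from the schedule logged in the table above.
train_batch_size = 512
total_steps = 200
num_epochs = 40

steps_per_epoch = total_steps // num_epochs                        # 5 steps per epoch
max_train_examples = steps_per_epoch * train_batch_size            # at most 2560 examples
min_train_examples = (steps_per_epoch - 1) * train_batch_size + 1  # at least 2049 examples

print(steps_per_epoch, min_train_examples, max_train_examples)
```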

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model size: 7.8M params (F32, Safetensors)