calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5655

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
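With lr_scheduler_type set to linear and no warmup listed, the learning rate decays linearly from 0.001 to zero over the course of training. A minimal sketch of that decay in plain Python, assuming the 200 total optimizer steps shown in the results table (the function name is illustrative, not part of the card):

```python
def linear_lr(step, total_steps=200, base_lr=0.001):
    """Linearly decay the learning rate from base_lr to zero over total_steps."""
    return base_lr * max(0.0, 1.0 - step / total_steps)

# Learning rate at the start, midpoint, and end of training
print(linear_lr(0))    # 0.001
print(linear_lr(100))  # 0.0005
print(linear_lr(200))  # 0.0
```

In the Transformers Trainer this schedule is applied per optimizer step, so the rate shrinks a little after every batch rather than once per epoch.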

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.4577 | 1.0 | 5 | 2.8856 |
| 2.5445 | 2.0 | 10 | 2.1155 |
| 1.9433 | 3.0 | 15 | 1.7402 |
| 1.6716 | 4.0 | 20 | 1.5984 |
| 1.5706 | 5.0 | 25 | 1.5216 |
| 1.4716 | 6.0 | 30 | 1.4133 |
| 1.4078 | 7.0 | 35 | 1.4063 |
| 1.3399 | 8.0 | 40 | 1.2725 |
| 1.2563 | 9.0 | 45 | 1.1975 |
| 1.1916 | 10.0 | 50 | 1.1727 |
| 1.1233 | 11.0 | 55 | 1.0583 |
| 1.0715 | 12.0 | 60 | 1.0389 |
| 1.0457 | 13.0 | 65 | 1.0173 |
| 1.0079 | 14.0 | 70 | 0.9519 |
| 0.9752 | 15.0 | 75 | 0.9463 |
| 0.9617 | 16.0 | 80 | 0.9936 |
| 0.9496 | 17.0 | 85 | 0.8543 |
| 0.8664 | 18.0 | 90 | 0.8184 |
| 0.8374 | 19.0 | 95 | 0.8107 |
| 0.8280 | 20.0 | 100 | 0.8515 |
| 0.8739 | 21.0 | 105 | 0.8150 |
| 0.8371 | 22.0 | 110 | 0.7676 |
| 0.7851 | 23.0 | 115 | 0.7262 |
| 0.7736 | 24.0 | 120 | 0.7195 |
| 0.7422 | 25.0 | 125 | 0.7092 |
| 0.7317 | 26.0 | 130 | 0.6800 |
| 0.7113 | 27.0 | 135 | 0.6642 |
| 0.6888 | 28.0 | 140 | 0.6617 |
| 0.6923 | 29.0 | 145 | 0.6446 |
| 0.6785 | 30.0 | 150 | 0.6787 |
| 0.6754 | 31.0 | 155 | 0.6197 |
| 0.6616 | 32.0 | 160 | 0.6091 |
| 0.6520 | 33.0 | 165 | 0.6110 |
| 0.6432 | 34.0 | 170 | 0.5926 |
| 0.6334 | 35.0 | 175 | 0.5935 |
| 0.6235 | 36.0 | 180 | 0.5870 |
| 0.6212 | 37.0 | 185 | 0.5840 |
| 0.6134 | 38.0 | 190 | 0.5712 |
| 0.6072 | 39.0 | 195 | 0.5662 |
| 0.6043 | 40.0 | 200 | 0.5655 |
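The step column implies 5 optimizer steps per epoch, which with train_batch_size=512 puts an upper bound of about 2,560 examples on the training set (the last batch of an epoch may be partial, so the true count could be somewhat lower). A quick check of that arithmetic, using only numbers from the table and hyperparameters above:

```python
total_steps = 200        # final step count in the results table
num_epochs = 40          # from the hyperparameters
train_batch_size = 512   # from the hyperparameters

steps_per_epoch = total_steps // num_epochs
max_train_examples = steps_per_epoch * train_batch_size

print(steps_per_epoch)      # 5
print(max_train_examples)   # 2560
```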

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cpu
  • Datasets 4.0.0
  • Tokenizers 0.22.2

Model size

  • 7.8M parameters (F32 tensors, Safetensors format)