chesspythia-70m-daryo

This model is a fine-tuned version of EleutherAI/pythia-70m-deduped on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9213
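
For context, assuming this is the usual mean token-level cross-entropy reported by the Trainer, the loss converts to perplexity via exp(loss); a minimal check:

```python
import math

# Final evaluation loss reported above (cross-entropy, nats per token).
eval_loss = 0.9213

# Perplexity = exp(cross-entropy) ~= 2.51 here.
print(f"perplexity = {math.exp(eval_loss):.2f}")
```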

Model description

More information needed

Intended uses & limitations

More information needed
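
In the absence of documented usage, the checkpoint should load like any causal LM with transformers. A minimal, untested sketch follows; the repo id is taken from the model page, and the PGN-style prompt is purely an assumption based on the model name, since the training format is not documented here:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "darwinfegarido/chesspythia-70m-daryo"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# Hypothetical prompt: a PGN-style move list (assumed, not confirmed by the card).
prompt = "1. e4 e5 2. Nf3"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```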

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • num_epochs: 1
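
As a rough guide to reproducing this run, the settings above map onto a transformers TrainingArguments configuration as sketched below. This is an assumption-laden sketch: the output directory and the every-17-steps evaluation cadence (inferred from the results table) are not stated on the card, and the dataset and data collator are undocumented.

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above; output_dir, eval_steps,
# and logging_steps are assumptions inferred from the results table.
args = TrainingArguments(
    output_dir="chesspythia-70m-daryo",
    learning_rate=5e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    num_train_epochs=1,
    eval_strategy="steps",
    eval_steps=17,
    logging_steps=17,
)
```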

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.5097        | 0.0100 | 17   | 1.4338          |
| 1.2602        | 0.0200 | 34   | 1.2784          |
| 1.2292        | 0.0301 | 51   | 1.2254          |
| 1.1925        | 0.0401 | 68   | 1.1904          |
| 1.1533        | 0.0501 | 85   | 1.1670          |
| 1.1705        | 0.0601 | 102  | 1.1515          |
| 1.1103        | 0.0701 | 119  | 1.1351          |
| 1.1797        | 0.0801 | 136  | 1.1217          |
| 1.0768        | 0.0902 | 153  | 1.1103          |
| 1.1207        | 0.1002 | 170  | 1.1013          |
| 1.1476        | 0.1102 | 187  | 1.0919          |
| 1.1478        | 0.1202 | 204  | 1.0913          |
| 1.1316        | 0.1302 | 221  | 1.0768          |
| 1.0524        | 0.1402 | 238  | 1.0750          |
| 1.0392        | 0.1503 | 255  | 1.0614          |
| 1.0935        | 0.1603 | 272  | 1.0641          |
| 1.0097        | 0.1703 | 289  | 1.0508          |
| 1.0855        | 0.1803 | 306  | 1.0528          |
| 1.0863        | 0.1903 | 323  | 1.0413          |
| 1.0812        | 0.2004 | 340  | 1.0400          |
| 1.0675        | 0.2104 | 357  | 1.0338          |
| 1.1371        | 0.2204 | 374  | 1.0348          |
| 1.0607        | 0.2304 | 391  | 1.0307          |
| 1.0659        | 0.2404 | 408  | 1.0268          |
| 1.046         | 0.2504 | 425  | 1.0200          |
| 1.0169        | 0.2605 | 442  | 1.0173          |
| 1.0329        | 0.2705 | 459  | 1.0125          |
| 1.0181        | 0.2805 | 476  | 1.0125          |
| 1.0158        | 0.2905 | 493  | 1.0063          |
| 1.0837        | 0.3005 | 510  | 1.0077          |
| 1.0016        | 0.3105 | 527  | 1.0113          |
| 1.0054        | 0.3206 | 544  | 1.0029          |
| 1.0125        | 0.3306 | 561  | 0.9970          |
| 1.027         | 0.3406 | 578  | 0.9977          |
| 1.0072        | 0.3506 | 595  | 0.9903          |
| 1.0993        | 0.3606 | 612  | 0.9918          |
| 1.0218        | 0.3707 | 629  | 0.9872          |
| 0.961         | 0.3807 | 646  | 0.9841          |
| 1.0845        | 0.3907 | 663  | 0.9827          |
| 1.0536        | 0.4007 | 680  | 0.9848          |
| 0.9998        | 0.4107 | 697  | 0.9825          |
| 1.0145        | 0.4207 | 714  | 0.9814          |
| 0.9812        | 0.4308 | 731  | 0.9794          |
| 0.9736        | 0.4408 | 748  | 0.9761          |
| 0.9738        | 0.4508 | 765  | 0.9699          |
| 1.0023        | 0.4608 | 782  | 0.9703          |
| 1.0239        | 0.4708 | 799  | 0.9709          |
| 0.9626        | 0.4808 | 816  | 0.9673          |
| 0.9331        | 0.4909 | 833  | 0.9679          |
| 0.9569        | 0.5009 | 850  | 0.9643          |
| 0.9414        | 0.5109 | 867  | 0.9653          |
| 0.9671        | 0.5209 | 884  | 0.9613          |
| 0.9531        | 0.5309 | 901  | 0.9607          |
| 0.9611        | 0.5410 | 918  | 0.9591          |
| 1.0037        | 0.5510 | 935  | 0.9582          |
| 1.0062        | 0.5610 | 952  | 0.9581          |
| 0.9264        | 0.5710 | 969  | 0.9555          |
| 0.97          | 0.5810 | 986  | 0.9546          |
| 0.9121        | 0.5910 | 1003 | 0.9505          |
| 0.9815        | 0.6011 | 1020 | 0.9489          |
| 0.9873        | 0.6111 | 1037 | 0.9475          |
| 0.9398        | 0.6211 | 1054 | 0.9467          |
| 0.942         | 0.6311 | 1071 | 0.9455          |
| 0.9716        | 0.6411 | 1088 | 0.9471          |
| 0.9642        | 0.6511 | 1105 | 0.9436          |
| 0.93          | 0.6612 | 1122 | 0.9424          |
| 0.9498        | 0.6712 | 1139 | 0.9410          |
| 0.9216        | 0.6812 | 1156 | 0.9420          |
| 0.9522        | 0.6912 | 1173 | 0.9380          |
| 0.9366        | 0.7012 | 1190 | 0.9382          |
| 0.9293        | 0.7113 | 1207 | 0.9353          |
| 0.9097        | 0.7213 | 1224 | 0.9356          |
| 1.0044        | 0.7313 | 1241 | 0.9352          |
| 0.9624        | 0.7413 | 1258 | 0.9319          |
| 0.9621        | 0.7513 | 1275 | 0.9315          |
| 0.9402        | 0.7613 | 1292 | 0.9314          |
| 0.9148        | 0.7714 | 1309 | 0.9314          |
| 0.9373        | 0.7814 | 1326 | 0.9300          |
| 0.9458        | 0.7914 | 1343 | 0.9289          |
| 0.917         | 0.8014 | 1360 | 0.9283          |
| 0.9305        | 0.8114 | 1377 | 0.9282          |
| 0.8832        | 0.8214 | 1394 | 0.9273          |
| 0.908         | 0.8315 | 1411 | 0.9257          |
| 0.9667        | 0.8415 | 1428 | 0.9258          |
| 0.9673        | 0.8515 | 1445 | 0.9245          |
| 0.9462        | 0.8615 | 1462 | 0.9246          |
| 0.9475        | 0.8715 | 1479 | 0.9236          |
| 0.9716        | 0.8816 | 1496 | 0.9237          |
| 0.936         | 0.8916 | 1513 | 0.9231          |
| 0.9497        | 0.9016 | 1530 | 0.9229          |
| 0.9507        | 0.9116 | 1547 | 0.9223          |
| 0.955         | 0.9216 | 1564 | 0.9221          |
| 0.9212        | 0.9316 | 1581 | 0.9220          |
| 0.9257        | 0.9417 | 1598 | 0.9218          |
| 0.9765        | 0.9517 | 1615 | 0.9215          |
| 0.9094        | 0.9617 | 1632 | 0.9214          |
| 0.9401        | 0.9717 | 1649 | 0.9213          |
| 0.9492        | 0.9817 | 1666 | 0.9213          |
| 0.971         | 0.9918 | 1683 | 0.9213          |
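
Validation loss drops steeply over roughly the first 500 steps and then flattens. A quick way to visualize this, using a handful of (step, validation loss) pairs transcribed from the table above (assuming matplotlib is available):

```python
import matplotlib.pyplot as plt

# A subset of (step, validation loss) pairs from the table above.
steps = [17, 170, 340, 510, 680, 850, 1020, 1190, 1360, 1530, 1683]
val_loss = [1.4338, 1.1013, 1.0400, 1.0077, 0.9848, 0.9643, 0.9489,
            0.9382, 0.9283, 0.9229, 0.9213]

plt.plot(steps, val_loss, marker="o")
plt.xlabel("Step")
plt.ylabel("Validation loss")
plt.title("chesspythia-70m-daryo validation loss")
plt.show()
```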

Framework versions

  • Transformers 4.46.2
  • Pytorch 2.5.1+cu121
  • Datasets 3.1.0
  • Tokenizers 0.20.3