# chesspythia-70m-daryo
This model is a fine-tuned version of [EleutherAI/pythia-70m-deduped](https://huggingface.co/EleutherAI/pythia-70m-deduped) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.9213
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- num_epochs: 1
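The hyperparameters above map directly onto the `transformers` `Trainer` API. A minimal sketch of an equivalent configuration follows; the output directory is a placeholder and the training/evaluation datasets are unspecified in this card, so those parts are left as comments:

```python
from transformers import TrainingArguments

# Hyperparameters taken from this card; output_dir is an illustrative placeholder.
args = TrainingArguments(
    output_dir="chesspythia-70m",  # placeholder, not from the card
    learning_rate=5e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",           # betas=(0.9, 0.999) and eps=1e-8 are the defaults
    lr_scheduler_type="cosine",
    num_train_epochs=1,
    eval_strategy="steps",         # the results table logs validation loss every 17 steps
    eval_steps=17,
)

# The base model and datasets would then be passed to a Trainer:
# from transformers import AutoModelForCausalLM, Trainer
# model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-70m-deduped")
# trainer = Trainer(model=model, args=args,
#                   train_dataset=..., eval_dataset=...)  # dataset is unknown per this card
# trainer.train()
```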
### Training results

Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
1.5097 | 0.0100 | 17 | 1.4338 |
1.2602 | 0.0200 | 34 | 1.2784 |
1.2292 | 0.0301 | 51 | 1.2254 |
1.1925 | 0.0401 | 68 | 1.1904 |
1.1533 | 0.0501 | 85 | 1.1670 |
1.1705 | 0.0601 | 102 | 1.1515 |
1.1103 | 0.0701 | 119 | 1.1351 |
1.1797 | 0.0801 | 136 | 1.1217 |
1.0768 | 0.0902 | 153 | 1.1103 |
1.1207 | 0.1002 | 170 | 1.1013 |
1.1476 | 0.1102 | 187 | 1.0919 |
1.1478 | 0.1202 | 204 | 1.0913 |
1.1316 | 0.1302 | 221 | 1.0768 |
1.0524 | 0.1402 | 238 | 1.0750 |
1.0392 | 0.1503 | 255 | 1.0614 |
1.0935 | 0.1603 | 272 | 1.0641 |
1.0097 | 0.1703 | 289 | 1.0508 |
1.0855 | 0.1803 | 306 | 1.0528 |
1.0863 | 0.1903 | 323 | 1.0413 |
1.0812 | 0.2004 | 340 | 1.0400 |
1.0675 | 0.2104 | 357 | 1.0338 |
1.1371 | 0.2204 | 374 | 1.0348 |
1.0607 | 0.2304 | 391 | 1.0307 |
1.0659 | 0.2404 | 408 | 1.0268 |
1.046 | 0.2504 | 425 | 1.0200 |
1.0169 | 0.2605 | 442 | 1.0173 |
1.0329 | 0.2705 | 459 | 1.0125 |
1.0181 | 0.2805 | 476 | 1.0125 |
1.0158 | 0.2905 | 493 | 1.0063 |
1.0837 | 0.3005 | 510 | 1.0077 |
1.0016 | 0.3105 | 527 | 1.0113 |
1.0054 | 0.3206 | 544 | 1.0029 |
1.0125 | 0.3306 | 561 | 0.9970 |
1.027 | 0.3406 | 578 | 0.9977 |
1.0072 | 0.3506 | 595 | 0.9903 |
1.0993 | 0.3606 | 612 | 0.9918 |
1.0218 | 0.3707 | 629 | 0.9872 |
0.961 | 0.3807 | 646 | 0.9841 |
1.0845 | 0.3907 | 663 | 0.9827 |
1.0536 | 0.4007 | 680 | 0.9848 |
0.9998 | 0.4107 | 697 | 0.9825 |
1.0145 | 0.4207 | 714 | 0.9814 |
0.9812 | 0.4308 | 731 | 0.9794 |
0.9736 | 0.4408 | 748 | 0.9761 |
0.9738 | 0.4508 | 765 | 0.9699 |
1.0023 | 0.4608 | 782 | 0.9703 |
1.0239 | 0.4708 | 799 | 0.9709 |
0.9626 | 0.4808 | 816 | 0.9673 |
0.9331 | 0.4909 | 833 | 0.9679 |
0.9569 | 0.5009 | 850 | 0.9643 |
0.9414 | 0.5109 | 867 | 0.9653 |
0.9671 | 0.5209 | 884 | 0.9613 |
0.9531 | 0.5309 | 901 | 0.9607 |
0.9611 | 0.5410 | 918 | 0.9591 |
1.0037 | 0.5510 | 935 | 0.9582 |
1.0062 | 0.5610 | 952 | 0.9581 |
0.9264 | 0.5710 | 969 | 0.9555 |
0.97 | 0.5810 | 986 | 0.9546 |
0.9121 | 0.5910 | 1003 | 0.9505 |
0.9815 | 0.6011 | 1020 | 0.9489 |
0.9873 | 0.6111 | 1037 | 0.9475 |
0.9398 | 0.6211 | 1054 | 0.9467 |
0.942 | 0.6311 | 1071 | 0.9455 |
0.9716 | 0.6411 | 1088 | 0.9471 |
0.9642 | 0.6511 | 1105 | 0.9436 |
0.93 | 0.6612 | 1122 | 0.9424 |
0.9498 | 0.6712 | 1139 | 0.9410 |
0.9216 | 0.6812 | 1156 | 0.9420 |
0.9522 | 0.6912 | 1173 | 0.9380 |
0.9366 | 0.7012 | 1190 | 0.9382 |
0.9293 | 0.7113 | 1207 | 0.9353 |
0.9097 | 0.7213 | 1224 | 0.9356 |
1.0044 | 0.7313 | 1241 | 0.9352 |
0.9624 | 0.7413 | 1258 | 0.9319 |
0.9621 | 0.7513 | 1275 | 0.9315 |
0.9402 | 0.7613 | 1292 | 0.9314 |
0.9148 | 0.7714 | 1309 | 0.9314 |
0.9373 | 0.7814 | 1326 | 0.9300 |
0.9458 | 0.7914 | 1343 | 0.9289 |
0.917 | 0.8014 | 1360 | 0.9283 |
0.9305 | 0.8114 | 1377 | 0.9282 |
0.8832 | 0.8214 | 1394 | 0.9273 |
0.908 | 0.8315 | 1411 | 0.9257 |
0.9667 | 0.8415 | 1428 | 0.9258 |
0.9673 | 0.8515 | 1445 | 0.9245 |
0.9462 | 0.8615 | 1462 | 0.9246 |
0.9475 | 0.8715 | 1479 | 0.9236 |
0.9716 | 0.8816 | 1496 | 0.9237 |
0.936 | 0.8916 | 1513 | 0.9231 |
0.9497 | 0.9016 | 1530 | 0.9229 |
0.9507 | 0.9116 | 1547 | 0.9223 |
0.955 | 0.9216 | 1564 | 0.9221 |
0.9212 | 0.9316 | 1581 | 0.9220 |
0.9257 | 0.9417 | 1598 | 0.9218 |
0.9765 | 0.9517 | 1615 | 0.9215 |
0.9094 | 0.9617 | 1632 | 0.9214 |
0.9401 | 0.9717 | 1649 | 0.9213 |
0.9492 | 0.9817 | 1666 | 0.9213 |
0.971 | 0.9918 | 1683 | 0.9213 |
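Although the dataset is not documented, the logging cadence in the table lets one estimate the training scale: validation ran every 17 steps (about 1% of an epoch), and the last logged row is step 1683 at epoch 0.9918. A quick back-of-the-envelope check (numbers read off the table; the resulting dataset size is an estimate, not stated in this card):

```python
# Values from the last row of the training results table.
last_step, last_epoch = 1683, 0.9918
batch_size = 64  # train_batch_size from the hyperparameters

# Extrapolate to a full epoch, then multiply by batch size.
steps_per_epoch = round(last_step / last_epoch)  # ~1697 optimizer steps
approx_examples = steps_per_epoch * batch_size   # ~108,608 training examples
```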
### Framework versions

- Transformers 4.46.2
- PyTorch 2.5.1+cu121
- Datasets 3.1.0
- Tokenizers 0.20.3