---
library_name: transformers
license: apache-2.0
base_model: EleutherAI/pythia-70m-deduped
tags:
- generated_from_trainer
model-index:
- name: chesspythia-70m-daryo
  results: []
---

# chesspythia-70m-daryo

This model is a fine-tuned version of [EleutherAI/pythia-70m-deduped](https://huggingface.co/EleutherAI/pythia-70m-deduped) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.9213

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: cosine
- num_epochs: 1
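For reference, here is a minimal sketch of how these hyperparameters map onto the 🤗 `Trainer` API. The corpus is a placeholder (this card does not document the actual training data), and the 17-step evaluation interval is inferred from the results table below; treat this as an illustration under those assumptions, not the author's training script.

```python
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Base checkpoint named in this card.
model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-70m-deduped")
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-70m-deduped")
tokenizer.pad_token = tokenizer.eos_token  # Pythia's tokenizer ships without a pad token

# Placeholder corpus: the real chess training data is undocumented here.
texts = ["1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 4. Ba4 Nf6 5. O-O Be7"] * 256
ds = Dataset.from_dict({"text": texts}).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

# Hyperparameters copied from the list above.
args = TrainingArguments(
    output_dir="chesspythia-70m-daryo",
    learning_rate=5e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    num_train_epochs=1,
    eval_strategy="steps",
    eval_steps=17,   # inferred from the step intervals in the results table
    logging_steps=17,
    report_to="none",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=ds,
    eval_dataset=ds,  # placeholder; a held-out split was presumably used
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```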
### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.5097 | 0.0100 | 17 | 1.4338 |
| 1.2602 | 0.0200 | 34 | 1.2784 |
| 1.2292 | 0.0301 | 51 | 1.2254 |
| 1.1925 | 0.0401 | 68 | 1.1904 |
| 1.1533 | 0.0501 | 85 | 1.1670 |
| 1.1705 | 0.0601 | 102 | 1.1515 |
| 1.1103 | 0.0701 | 119 | 1.1351 |
| 1.1797 | 0.0801 | 136 | 1.1217 |
| 1.0768 | 0.0902 | 153 | 1.1103 |
| 1.1207 | 0.1002 | 170 | 1.1013 |
| 1.1476 | 0.1102 | 187 | 1.0919 |
| 1.1478 | 0.1202 | 204 | 1.0913 |
| 1.1316 | 0.1302 | 221 | 1.0768 |
| 1.0524 | 0.1402 | 238 | 1.0750 |
| 1.0392 | 0.1503 | 255 | 1.0614 |
| 1.0935 | 0.1603 | 272 | 1.0641 |
| 1.0097 | 0.1703 | 289 | 1.0508 |
| 1.0855 | 0.1803 | 306 | 1.0528 |
| 1.0863 | 0.1903 | 323 | 1.0413 |
| 1.0812 | 0.2004 | 340 | 1.0400 |
| 1.0675 | 0.2104 | 357 | 1.0338 |
| 1.1371 | 0.2204 | 374 | 1.0348 |
| 1.0607 | 0.2304 | 391 | 1.0307 |
| 1.0659 | 0.2404 | 408 | 1.0268 |
| 1.046 | 0.2504 | 425 | 1.0200 |
| 1.0169 | 0.2605 | 442 | 1.0173 |
| 1.0329 | 0.2705 | 459 | 1.0125 |
| 1.0181 | 0.2805 | 476 | 1.0125 |
| 1.0158 | 0.2905 | 493 | 1.0063 |
| 1.0837 | 0.3005 | 510 | 1.0077 |
| 1.0016 | 0.3105 | 527 | 1.0113 |
| 1.0054 | 0.3206 | 544 | 1.0029 |
| 1.0125 | 0.3306 | 561 | 0.9970 |
| 1.027 | 0.3406 | 578 | 0.9977 |
| 1.0072 | 0.3506 | 595 | 0.9903 |
| 1.0993 | 0.3606 | 612 | 0.9918 |
| 1.0218 | 0.3707 | 629 | 0.9872 |
| 0.961 | 0.3807 | 646 | 0.9841 |
| 1.0845 | 0.3907 | 663 | 0.9827 |
| 1.0536 | 0.4007 | 680 | 0.9848 |
| 0.9998 | 0.4107 | 697 | 0.9825 |
| 1.0145 | 0.4207 | 714 | 0.9814 |
| 0.9812 | 0.4308 | 731 | 0.9794 |
| 0.9736 | 0.4408 | 748 | 0.9761 |
| 0.9738 | 0.4508 | 765 | 0.9699 |
| 1.0023 | 0.4608 | 782 | 0.9703 |
| 1.0239 | 0.4708 | 799 | 0.9709 |
| 0.9626 | 0.4808 | 816 | 0.9673 |
| 0.9331 | 0.4909 | 833 | 0.9679 |
| 0.9569 | 0.5009 | 850 | 0.9643 |
| 0.9414 | 0.5109 | 867 | 0.9653 |
| 0.9671 | 0.5209 | 884 | 0.9613 |
| 0.9531 | 0.5309 | 901 | 0.9607 |
| 0.9611 | 0.5410 | 918 | 0.9591 |
| 1.0037 | 0.5510 | 935 | 0.9582 |
| 1.0062 | 0.5610 | 952 | 0.9581 |
| 0.9264 | 0.5710 | 969 | 0.9555 |
| 0.97 | 0.5810 | 986 | 0.9546 |
| 0.9121 | 0.5910 | 1003 | 0.9505 |
| 0.9815 | 0.6011 | 1020 | 0.9489 |
| 0.9873 | 0.6111 | 1037 | 0.9475 |
| 0.9398 | 0.6211 | 1054 | 0.9467 |
| 0.942 | 0.6311 | 1071 | 0.9455 |
| 0.9716 | 0.6411 | 1088 | 0.9471 |
| 0.9642 | 0.6511 | 1105 | 0.9436 |
| 0.93 | 0.6612 | 1122 | 0.9424 |
| 0.9498 | 0.6712 | 1139 | 0.9410 |
| 0.9216 | 0.6812 | 1156 | 0.9420 |
| 0.9522 | 0.6912 | 1173 | 0.9380 |
| 0.9366 | 0.7012 | 1190 | 0.9382 |
| 0.9293 | 0.7113 | 1207 | 0.9353 |
| 0.9097 | 0.7213 | 1224 | 0.9356 |
| 1.0044 | 0.7313 | 1241 | 0.9352 |
| 0.9624 | 0.7413 | 1258 | 0.9319 |
| 0.9621 | 0.7513 | 1275 | 0.9315 |
| 0.9402 | 0.7613 | 1292 | 0.9314 |
| 0.9148 | 0.7714 | 1309 | 0.9314 |
| 0.9373 | 0.7814 | 1326 | 0.9300 |
| 0.9458 | 0.7914 | 1343 | 0.9289 |
| 0.917 | 0.8014 | 1360 | 0.9283 |
| 0.9305 | 0.8114 | 1377 | 0.9282 |
| 0.8832 | 0.8214 | 1394 | 0.9273 |
| 0.908 | 0.8315 | 1411 | 0.9257 |
| 0.9667 | 0.8415 | 1428 | 0.9258 |
| 0.9673 | 0.8515 | 1445 | 0.9245 |
| 0.9462 | 0.8615 | 1462 | 0.9246 |
| 0.9475 | 0.8715 | 1479 | 0.9236 |
| 0.9716 | 0.8816 | 1496 | 0.9237 |
| 0.936 | 0.8916 | 1513 | 0.9231 |
| 0.9497 | 0.9016 | 1530 | 0.9229 |
| 0.9507 | 0.9116 | 1547 | 0.9223 |
| 0.955 | 0.9216 | 1564 | 0.9221 |
| 0.9212 | 0.9316 | 1581 | 0.9220 |
| 0.9257 | 0.9417 | 1598 | 0.9218 |
| 0.9765 | 0.9517 | 1615 | 0.9215 |
| 0.9094 | 0.9617 | 1632 | 0.9214 |
| 0.9401 | 0.9717 | 1649 | 0.9213 |
| 0.9492 | 0.9817 | 1666 | 0.9213 |
| 0.971 | 0.9918 | 1683 | 0.9213 |

### Framework versions

- Transformers 4.46.2
- Pytorch 2.5.1+cu121
- Datasets 3.1.0
- Tokenizers 0.20.3
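A minimal generation sketch is below. The model id is an assumption (replace it with this repository's actual hub id or a local checkpoint path), and the PGN-style prompt is likewise a guess at the input format, since the training data is undocumented.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical model id; point this at the actual hub repo or local path.
model_id = "chesspythia-70m-daryo"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# PGN-style move prompt is an assumption about the training format.
prompt = "1. e4 e5 2. Nf3"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```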