amdchess-v9

This model is a fine-tuned version of amd/AMD-Llama-135m on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6367
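
The loss above is easier to interpret as a perplexity. The following is a minimal sketch, assuming the reported value is a mean per-token cross-entropy in nats (the usual convention for causal language models trained with the Transformers Trainer; the card itself does not state this). The function name `perplexity` is illustrative only.

```python
import math

# Evaluation loss reported on the card.
eval_loss = 0.6367


def perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss)


ppl = perplexity(eval_loss)
print(f"perplexity ~ {ppl:.2f}")
```

Under that assumption the perplexity comes out at roughly 1.89 per token.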

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: grokadamw with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • num_epochs: 1
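
To illustrate what the cosine schedule implies for the learning rate over the run, here is a small standalone sketch. It is not the training code used for this model. Two values are assumptions: `TOTAL_STEPS` is estimated from the training results table (step 1683 at epoch 0.9935), and `WARMUP_STEPS` is set to 0 because the card lists no warmup.

```python
import math

BASE_LR = 3e-05      # learning_rate from the card
TOTAL_STEPS = 1694   # assumption: estimated from the results table
WARMUP_STEPS = 0     # assumption: no warmup is listed on the card


def cosine_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Half-cosine decay from base_lr to 0, with optional linear warmup."""
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(max(progress, 0.0), 1.0)  # clamp to [0, 1]
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Print the learning rate at the start, quarter, midpoint, and end of the run.
for s in (0, TOTAL_STEPS // 4, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(s, f"{cosine_lr(s):.3e}")
```

With these assumptions the rate starts at 3e-05, passes through 1.5e-05 at the midpoint, and decays to approximately zero by the final step, which is consistent with the validation loss flattening out late in the run.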

Training results

Training Loss  Epoch   Step  Validation Loss
1.4763         0.0100  17    1.4222
1.0937         0.0201  34    1.1053
1.0732         0.0301  51    1.0270
0.991          0.0401  68    0.9671
1.0235         0.0502  85    0.9474
0.8849         0.0602  102   0.9239
0.9108         0.0702  119   0.8907
0.8907         0.0803  136   0.8745
0.8685         0.0903  153   0.8619
0.9375         0.1004  170   0.8547
0.7897         0.1104  187   0.8412
0.8594         0.1204  204   0.8293
0.8495         0.1305  221   0.8226
0.8618         0.1405  238   0.8129
0.8643         0.1505  255   0.8052
0.7375         0.1606  272   0.7985
0.7322         0.1706  289   0.7953
0.7991         0.1806  306   0.7923
0.8269         0.1907  323   0.7856
0.8031         0.2007  340   0.7776
0.7605         0.2107  357   0.7737
0.804          0.2208  374   0.7664
0.7683         0.2308  391   0.7600
0.7667         0.2409  408   0.7610
0.7823         0.2509  425   0.7508
0.7608         0.2609  442   0.7484
0.7291         0.2710  459   0.7457
0.8157         0.2810  476   0.7393
0.7526         0.2910  493   0.7353
0.7099         0.3011  510   0.7360
0.8242         0.3111  527   0.7331
0.7849         0.3211  544   0.7285
0.7558         0.3312  561   0.7224
0.6278         0.3412  578   0.7225
0.7135         0.3512  595   0.7197
0.6425         0.3613  612   0.7180
0.7721         0.3713  629   0.7137
0.8091         0.3813  646   0.7097
0.7518         0.3914  663   0.7063
0.7299         0.4014  680   0.7053
0.7563         0.4115  697   0.7051
0.658          0.4215  714   0.6997
0.7096         0.4315  731   0.6966
0.7555         0.4416  748   0.6954
0.7292         0.4516  765   0.6936
0.6349         0.4616  782   0.6908
0.6996         0.4717  799   0.6892
0.6849         0.4817  816   0.6892
0.7023         0.4917  833   0.6847
0.6547         0.5018  850   0.6850
0.7549         0.5118  867   0.6826
0.6987         0.5218  884   0.6798
0.648          0.5319  901   0.6796
0.7308         0.5419  918   0.6775
0.7245         0.5519  935   0.6756
0.6915         0.5620  952   0.6745
0.7287         0.5720  969   0.6716
0.739          0.5821  986   0.6704
0.7168         0.5921  1003  0.6686
0.685          0.6021  1020  0.6671
0.7183         0.6122  1037  0.6656
0.7138         0.6222  1054  0.6644
0.6738         0.6322  1071  0.6620
0.634          0.6423  1088  0.6611
0.703          0.6523  1105  0.6606
0.6538         0.6623  1122  0.6584
0.7167         0.6724  1139  0.6564
0.6717         0.6824  1156  0.6545
0.6633         0.6924  1173  0.6538
0.6035         0.7025  1190  0.6535
0.6444         0.7125  1207  0.6514
0.7171         0.7226  1224  0.6502
0.7157         0.7326  1241  0.6489
0.7028         0.7426  1258  0.6480
0.681          0.7527  1275  0.6479
0.6711         0.7627  1292  0.6464
0.7113         0.7727  1309  0.6454
0.7329         0.7828  1326  0.6454
0.694          0.7928  1343  0.6436
0.6304         0.8028  1360  0.6431
0.7129         0.8129  1377  0.6420
0.6531         0.8229  1394  0.6411
0.6791         0.8329  1411  0.6406
0.6963         0.8430  1428  0.6401
0.6285         0.8530  1445  0.6402
0.6484         0.8630  1462  0.6398
0.6505         0.8731  1479  0.6394
0.6985         0.8831  1496  0.6389
0.6643         0.8932  1513  0.6386
0.6292         0.9032  1530  0.6381
0.6237         0.9132  1547  0.6377
0.6159         0.9233  1564  0.6375
0.7027         0.9333  1581  0.6372
0.7068         0.9433  1598  0.6371
0.6021         0.9534  1615  0.6369
0.6812         0.9634  1632  0.6368
0.6805         0.9734  1649  0.6368
0.628          0.9835  1666  0.6367
0.6507         0.9935  1683  0.6367
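
The Epoch and Step columns allow a rough back-of-the-envelope estimate of the run length and training set size. The sketch below is an estimate only: it assumes a single device and no gradient accumulation (neither is stated on the card), so that one optimizer step consumes exactly `train_batch_size` examples. The helper name `steps_per_epoch` is hypothetical.

```python
def steps_per_epoch(step: int, epoch_fraction: float) -> float:
    """Extrapolate the number of optimizer steps in one full epoch."""
    return step / epoch_fraction


# Last logged row of the results table.
last_step, last_epoch = 1683, 0.9935
TRAIN_BATCH_SIZE = 8  # train_batch_size from the card

est_steps = steps_per_epoch(last_step, last_epoch)
# Assumption: single device, no gradient accumulation.
est_examples = est_steps * TRAIN_BATCH_SIZE

# Relative drop in validation loss between the first and last evaluation.
first_val, last_val = 1.4222, 0.6367
relative_drop = (first_val - last_val) / first_val

print(f"steps per epoch ~ {est_steps:.0f}")
print(f"training examples ~ {est_examples:.0f}")
print(f"validation loss reduced by ~ {relative_drop:.0%}")
```

Under these assumptions the epoch spans roughly 1,694 steps, or about 13,500 training examples, with evaluation every 17 steps.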

Framework versions

  • Transformers 4.46.1
  • Pytorch 2.5.0+cu121
  • Datasets 3.1.0
  • Tokenizers 0.20.1

Model size

  • Parameters: 134M
  • Tensor type: F32
  • Weights format: Safetensors
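
The parameter count and tensor type give a rough idea of how much memory the weights alone occupy. This is a minimal sketch: the 134M figure is the rounded value shown on the page, and the estimate ignores activations, optimizer state, and any framework overhead. The lower-precision entries in `BYTES_PER_PARAM` are included only for comparison and are not formats the card reports.

```python
PARAMS = 134_000_000  # rounded parameter count from the card

# Bytes per parameter; only F32 is reported on the card.
BYTES_PER_PARAM = {"F32": 4, "F16": 2, "BF16": 2}


def weights_size_mib(n_params: int, dtype: str = "F32") -> float:
    """Approximate size of the weights in MiB for a given tensor type."""
    return n_params * BYTES_PER_PARAM[dtype] / (1024 ** 2)


f32_mib = weights_size_mib(PARAMS, "F32")
print(f"F32 weights ~ {f32_mib:.0f} MiB")
```

At F32 this works out to a little over 500 MiB for the weights.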

Model tree for nlpguy/amdchess-v9

  • Base model: amd/AMD-Llama-135m