Model card for chessPT

A pretrained decoder-only transformer model for chess move prediction with only ~14M parameters.

Intended use

Predict new moves in a chess game based on PGN tokens.

Implementation

The model implementation is based on Andrej Karpathy's nanoGPT, following his "Neural Networks: Zero to Hero" video series on YouTube.
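For readers unfamiliar with how a nanoGPT-style model produces new tokens, here is a minimal sketch of the autoregressive sampling loop. It is not the code from this repository: `generate`, `sample_from_logits` and `toy_model` are hypothetical names, the toy model stands in for the real network, and the tokenizer that maps PGN text to token ids is not shown because it is not described in this card. The only value taken from the card is the context size of 256.

```python
import math
import random

CONTEXT_SIZE = 256  # context_size from the training configuration in this card


def sample_from_logits(logits, rng):
    """Sample a token id from unnormalized logits via a softmax."""
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def generate(next_token_logits, tokens, max_new_tokens,
             context_size=CONTEXT_SIZE, seed=0):
    """Extend `tokens` one token at a time.

    `next_token_logits` is a hypothetical callable: it takes a list of
    token ids and returns one logit per vocabulary entry.
    """
    rng = random.Random(seed)
    out = list(tokens)
    for _ in range(max_new_tokens):
        window = out[-context_size:]  # the model only sees the last context_size tokens
        logits = next_token_logits(window)
        out.append(sample_from_logits(logits, rng))
    return out


def toy_model(window):
    """Hypothetical stand-in for the network: prefers (last token + 1) mod 5."""
    vocab_size = 5
    target = (window[-1] + 1) % vocab_size
    return [10.0 if i == target else -10.0 for i in range(vocab_size)]


if __name__ == "__main__":
    print(generate(toy_model, [0], max_new_tokens=8))
```

With the real model, `next_token_logits` would wrap a forward pass and the resulting token ids would be decoded back into PGN text. Cropping to the last `context_size` tokens matters for long games, since positions beyond the trained context length have no position embedding.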

Training

You can find the training script in the repository's files under train.py. It also contains the hyperparameters used:

context_size = 256
batch_size = 128
max_iters = 30_000
learning_rate = 3e-5
eval_interval = 100
eval_iters = 20
n_embed = 384
n_layer = 6
n_head = 6
dropout = 0.2
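To relate these settings to the model size, the following sketch counts parameters under the assumption that the layers are laid out as in the GPT built in Karpathy's lecture: bias-free query/key/value projections, a 4x feed-forward expansion, and separate (untied) input embedding and output head. The vocabulary size is not stated in this card, so `vocab_size` is left as an argument. The helper names `count_params` and `tokens_seen` are illustrative and do not come from train.py.

```python
def count_params(vocab_size, context_size=256, n_embed=384, n_layer=6):
    """Parameter count for a lecture-style GPT (assumed layout, see lead-in).

    n_head does not appear: splitting n_embed across heads does not change
    the total size of the query/key/value projections.
    """
    token_emb = vocab_size * n_embed
    pos_emb = context_size * n_embed
    # Attention: q, k, v without bias, plus an output projection with bias.
    attn = 3 * n_embed * n_embed + (n_embed * n_embed + n_embed)
    # Feed-forward: n_embed -> 4*n_embed -> n_embed, both layers with bias.
    mlp = (n_embed * 4 * n_embed + 4 * n_embed) + (4 * n_embed * n_embed + n_embed)
    # Two LayerNorms per block, each with a weight and a bias vector.
    norms = 2 * 2 * n_embed
    block = attn + mlp + norms
    final_norm = 2 * n_embed
    lm_head = n_embed * vocab_size + vocab_size
    return token_emb + pos_emb + n_layer * block + final_norm + lm_head


def tokens_seen(batch_size=128, context_size=256, max_iters=30_000):
    """Tokens processed in training, assuming every sequence is full length."""
    return batch_size * context_size * max_iters


if __name__ == "__main__":
    # 65 is the character vocabulary of the lecture's dataset, used here only
    # as a sanity check. It is NOT the chessPT vocabulary size.
    print(count_params(vocab_size=65))
    print(tokens_seen())
```

With `vocab_size=65` the function returns 10,788,929, which serves as a sanity check on the formula rather than a statement about chessPT. The transformer blocks alone account for about 10.6M parameters, so the rest of the ~14M quoted above would come from the embedding table and output head, whose size depends on the vocabulary, or from architectural differences not described here. Under the full-length assumption, training processes 128 x 256 x 30,000 = 983,040,000 tokens.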

