---
license: gpl-2.0
pipeline_tag: text-generation
---
I got bored, so I used [nanoGPT](https://github.com/karpathy/nanoGPT) to train a model on all the Python snippets from [this Kaggle dataset](https://www.kaggle.com/datasets/simiotic/github-code-snippets).

The model was trained with the default `train.py` settings, except for:
```
eval_interval = 20
eval_iters = 40
batch_size = 2
gradient_accumulation_steps = 64
```
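To put the batch settings above in perspective, here is a rough sketch of the effective batch size they imply. The `block_size` value of 1024 is an assumption (this card does not state the context length that was used), so treat the token count as illustrative only.

```python
# Values taken from the overrides listed above.
batch_size = 2
gradient_accumulation_steps = 64

# ASSUMPTION: context length of 1024 tokens; not stated in this model card.
block_size = 1024

# Gradients are accumulated over several small batches before each
# optimizer step, so the effective batch is the product of the two.
sequences_per_step = batch_size * gradient_accumulation_steps
tokens_per_step = sequences_per_step * block_size

print(sequences_per_step)  # 128
print(tokens_per_step)     # 131072
```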
I used these reduced settings because I was training locally on an RTX 2060, which was not powerful enough for higher settings.

The model is stored in the `model` folder, which contains the model itself and an `info.txt` file with:
- `iter_num` - number of iterations
- `train_loss` - training loss at the time of the checkpoint
- `val_loss` - validation loss at the time of the checkpoint
- `config` - nanoGPT config

At first I only saved the model when the validation loss improved, to guard against overfitting. After some time I decided to risk it, turned that check off, and let it save every time. Luckily, it worked out fine.
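
The two saving policies described above can be sketched roughly as follows. This is a minimal illustration, not the actual training code: `should_save_checkpoint`, `saved_evals`, and the loss values are all hypothetical. nanoGPT's `train.py` has an `always_save_checkpoint` setting that works along these lines, but whether that was the exact switch used here is an assumption.

```python
def should_save_checkpoint(val_loss, best_val_loss, always_save):
    """Return True if a checkpoint should be written after this evaluation."""
    return always_save or val_loss < best_val_loss


# Hypothetical validation losses from four consecutive evaluations.
val_losses = [2.10, 1.95, 2.00, 1.90]


def saved_evals(always_save):
    """Return the indices of the evaluations that would trigger a save."""
    best = float("inf")
    saved = []
    for i, loss in enumerate(val_losses):
        if should_save_checkpoint(loss, best, always_save):
            saved.append(i)
        best = min(best, loss)
    return saved


print(saved_evals(always_save=False))  # [0, 1, 3]
print(saved_evals(always_save=True))   # [0, 1, 2, 3]
```

With the stricter policy, the third evaluation is skipped because its loss is worse than the best seen so far. Saving every time keeps the latest weights even when validation loss has gone up.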