---
license: gpl-2.0
pipeline_tag: text-generation
---

Got bored, so I used [nanoGPT](https://github.com/karpathy/nanoGPT) to train a model on all the Python snippets from https://www.kaggle.com/datasets/simiotic/github-code-snippets

The model was trained with the default train.py settings, except:
```
eval_interval = 20
eval_iters = 40
batch_size = 2
gradient_accumulation_steps = 64
```
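nanoGPT's train.py accepts such overrides on the command line (via its configurator.py), so a run with these settings would look roughly like this (a sketch, not necessarily the exact command used):

```shell
# Override the defaults in train.py from the command line;
# flag names mirror the variables in nanoGPT's train.py.
python train.py \
  --eval_interval=20 \
  --eval_iters=40 \
  --batch_size=2 \
  --gradient_accumulation_steps=64
```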
This was because I was training locally on an RTX 2060, which could not handle higher settings.

The model is stored in the "model" folder, which contains the model itself and an "info.txt" file listing:
- iter_num - number of training iterations
- train_loss - training loss at the time of the checkpoint
- val_loss - validation loss at the time of the checkpoint
- config - the nanoGPT config
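If you want to read those values programmatically, a small parser like the one below works, assuming info.txt uses simple `key: value` lines; the actual file format isn't specified here, so adjust the separator to match.

```python
def parse_info(text):
    """Parse key/value lines (e.g. "iter_num: 2000") into a dict.

    Assumes one "key: value" pair per line; this format is an
    assumption, not confirmed by the model card.
    """
    info = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank or free-form lines
        key, _, value = line.partition(":")
        info[key.strip()] = value.strip()
    return info


# Hypothetical contents of info.txt:
sample = "iter_num: 2000\ntrain_loss: 1.23\nval_loss: 1.45"
print(parse_info(sample))
```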

At first I made it save only when the validation loss improved, to avoid overfitting, but after some time I decided to risk it and let it save every time; luckily, it worked out fine.
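The save-on-improvement logic described above can be sketched as a small helper (names are illustrative, not from the actual training script):

```python
def should_checkpoint(val_loss, best_val_loss, always_save=False):
    """Decide whether to write a checkpoint.

    Saves only when validation loss improves, unless always_save
    is on (the "risk it" mode mentioned above). Returns the save
    decision and the updated best validation loss.
    """
    improved = val_loss < best_val_loss
    if improved:
        best_val_loss = val_loss
    return (improved or always_save), best_val_loss


save, best = should_checkpoint(1.2, float("inf"))   # first eval: improved
save, best = should_checkpoint(1.6, best)           # worse: skipped
save, best = should_checkpoint(1.6, best, always_save=True)  # saved anyway
```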