---
license: gpl-2.0
pipeline_tag: text-generation
---

Got bored, so I used [nanoGPT](https://github.com/karpathy/nanoGPT) to train a model on all the Python snippets from https://www.kaggle.com/datasets/simiotic/github-code-snippets

The model was trained with the default train.py settings, except:
```
eval_interval=20
eval_iters=40
batch_size=2
gradient_accumulation_steps = 64
```
This was because I was training it locally on an RTX 2060 and did not have enough GPU power to train with larger settings.
The current model was trained for 8880 iterations, which took around 20 hours.
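
To get a rough feel for what those numbers add up to, here is a small back-of-the-envelope sketch. `block_size = 1024` is an assumption (I believe it is the nanoGPT default in train.py, and it may not match what was actually used), and the 20 hours is only approximate, so treat the results as ballpark figures.

```python
# Values taken from the settings and training notes above
batch_size = 2
gradient_accumulation_steps = 64
iterations = 8880
training_hours = 20  # approximate

# Assumption: block_size was left at the nanoGPT default
block_size = 1024

# Sequences and tokens processed per optimizer step
sequences_per_step = batch_size * gradient_accumulation_steps
tokens_per_step = sequences_per_step * block_size

# Totals over the whole run
total_tokens = tokens_per_step * iterations
seconds_per_iteration = training_hours * 3600 / iterations

print(sequences_per_step)               # 128
print(tokens_per_step)                  # 131072
print(total_tokens)                     # 1163919360
print(round(seconds_per_iteration, 1))  # 8.1
```

So the small `batch_size` is made up for by gradient accumulation: each optimizer step still sees 128 sequences, it just takes 64 small forward/backward passes to get there.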
At first I only saved the model when validation loss improved, to avoid keeping an overfitted checkpoint, but after a while I decided to risk it, turned that off, and let it save every time. Luckily it worked out fine.
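
The two saving policies can be sketched roughly like this. The function and argument names are made up for illustration; nanoGPT's train.py has an `always_save_checkpoint` setting that behaves along these lines, but this is not a copy of its code and not necessarily how the switch was done here.

```python
def should_save_checkpoint(val_loss, best_val_loss, always_save):
    """Return True if a checkpoint should be written after this evaluation.

    Hypothetical helper: saves when validation loss improved,
    or unconditionally when always_save is True.
    """
    return always_save or val_loss < best_val_loss


# Early phase: only save when validation loss improves
print(should_save_checkpoint(1.9, 2.0, always_save=False))  # True
print(should_save_checkpoint(2.1, 2.0, always_save=False))  # False

# Later phase: save after every evaluation, improved or not
print(should_save_checkpoint(2.1, 2.0, always_save=True))   # True
```

The catch with saving every time is that the latest checkpoint can be worse on validation than an earlier one, which is the risk mentioned above.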