MicroPanda123 committed on
Commit 2b41332 · 1 Parent(s): 984ac0a

Update README.md

  1. README.md +12 -1
README.md CHANGED
@@ -2,4 +2,15 @@
  license: gpl-2.0
  ---
 
- Got bored, so I used [nanoGPT](https://github.com/karpathy/nanoGPT) to train a model on all Python snippets from https://www.kaggle.com/datasets/simiotic/github-code-snippets
+ Got bored, so I used [nanoGPT](https://github.com/karpathy/nanoGPT) to train a model on all Python snippets from https://www.kaggle.com/datasets/simiotic/github-code-snippets
+
+ The model was trained with the default train.py settings, except:
+ ```
+ eval_interval=20
+ eval_iters=40
+ batch_size=2
+ gradient_accumulation_steps=64
+ ```
+ I used these values because I was training locally on an RTX 2060 and did not have enough compute to train with larger settings.
+ The current model was trained for 8880 iterations.
+ At first I only saved the model when validation loss improved, to avoid overfitting. After some time I decided to risk it, turned that off, and let it save every time. Luckily, it worked out fine.
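
To put the settings above in perspective, here is a minimal sketch of the arithmetic behind them and of the two checkpoint policies described. It assumes the context length was left at `block_size = 1024` (the README does not state it) and that one iteration is one optimizer step after all gradient-accumulation micro-batches. `should_save` is a hypothetical helper written for illustration, not code copied from nanoGPT, although nanoGPT's train.py has a comparable `always_save_checkpoint` switch.

```python
# Values taken from the README.
batch_size = 2
gradient_accumulation_steps = 64
iterations = 8880

# Assumption: context length was not changed from 1024 (not stated in the README).
block_size = 1024

# Sequences and tokens processed per optimizer step on a single GPU.
sequences_per_step = batch_size * gradient_accumulation_steps
tokens_per_step = sequences_per_step * block_size

# Rough total number of training tokens seen over the whole run.
tokens_seen = tokens_per_step * iterations


def should_save(val_loss: float, best_val_loss: float, always_save: bool) -> bool:
    """Hypothetical helper: decide whether to write a checkpoint after an evaluation.

    always_save=False keeps only checkpoints that improve validation loss;
    always_save=True writes a checkpoint at every evaluation.
    """
    return always_save or val_loss < best_val_loss


if __name__ == "__main__":
    print(f"sequences per step: {sequences_per_step}")
    print(f"tokens per step:    {tokens_per_step}")
    print(f"tokens seen:        {tokens_seen}")
```

Under these assumptions, the small `batch_size` is offset by the large accumulation count, so each step still averages gradients over 128 sequences. Saving at every evaluation means the latest checkpoint is not guaranteed to be the one with the best validation loss, which is the risk the README mentions.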