MicroPanda123 committed on
Commit c202300 · 1 Parent(s): 5c8c72a

Update README.md

Files changed (1): README.md (+2 −1)
README.md CHANGED
@@ -1,5 +1,6 @@
  ---
  license: gpl-2.0
+ pipeline_tag: text-generation
  ---
 
  Got bored so used [nanoGPT](https://github.com/karpathy/nanoGPT) to train a model on all Python snippets from https://www.kaggle.com/datasets/simiotic/github-code-snippets
@@ -12,5 +13,5 @@ batch_size=2
  gradient_accumulation_steps = 64
  ```
  This was because I was training it locally on an RTX 2060 and did not have enough power to train it more.
- Current model was trained for 8880 iterations.
+ Current model was trained for 8880 iterations. Took around 20 hours.
  At first I made it only save the model after validation loss improved, to prevent overfitting, but after some time I decided to risk it, turned that off, and allowed it to save every time; luckily it worked out fine.
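The `batch_size=2` / `gradient_accumulation_steps = 64` config in the README relies on gradient accumulation: 64 small micro-batches are accumulated before one optimizer update, behaving like a single step on 2 × 64 = 128 samples. A minimal pure-Python sketch of that pattern, using a hand-written gradient for a 1-D linear fit (hypothetical data and learning rate, standing in for nanoGPT's actual PyTorch loop):

```python
# Sketch of gradient accumulation: micro-batches of batch_size samples are
# accumulated over gradient_accumulation_steps before each optimizer step,
# mimicking one update on the full effective batch.
import random

random.seed(0)
batch_size = 2                     # fits in limited GPU memory
gradient_accumulation_steps = 64   # accumulate before each update
effective_batch = batch_size * gradient_accumulation_steps  # 128 samples

# Synthetic data: y = 3x plus noise; we fit w in y ≈ w * x.
xs = [random.uniform(-1, 1) for _ in range(effective_batch)]
data = [(x, 3.0 * x + random.gauss(0, 0.1)) for x in xs]

w, lr = 0.0, 0.5
for _ in range(200):  # optimizer iterations
    grad = 0.0
    for micro in range(gradient_accumulation_steps):
        chunk = data[micro * batch_size:(micro + 1) * batch_size]
        # Mean-squared-error gradient over this micro-batch, scaled so the
        # accumulated total equals the mean gradient over all 128 samples.
        g = sum(2 * (w * x - y) * x for x, y in chunk) / batch_size
        grad += g / gradient_accumulation_steps
    w -= lr * grad  # one optimizer step per 64 accumulated micro-batches

print(w)  # recovers a slope close to the true value of 3
```

The division by `gradient_accumulation_steps` inside the loop mirrors the usual `loss / accumulation_steps` scaling in deep-learning training loops, so the accumulated gradient matches what a single large batch would produce.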