MicroPanda123 committed on
Commit 850ba12 · 1 Parent(s): 57df228

Update README.md

Files changed (1)
  1. README.md +6 -1
README.md CHANGED
@@ -13,5 +13,10 @@ batch_size=2
 gradient_accumulation_steps = 64
 ```
 This was because I was training it locally on RTX2060 and did not have enough power to train it on higher settings.
-Current model was trained for 8880 iterations. Took around 20 hours.
+Model is stored in "model" folder that contains model itself and "info.txt" file containing:
+iter_num - number of iterations
+train_loss - training loss at time of checkpoint
+val_loss - validation loss at time of checkpoint
+config - nanoGPT config
+
 At first I made it only save model after validation loss improved, to not allow overfitting, but after some time I decided to risk it and turned that off and allowed it to save everytime, luckly it worked out fine.
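For readers of this diff, the checkpoint policy and the "info.txt" fields described above can be sketched roughly as follows. This is a minimal illustration, not the author's training code: the helper names `should_save` and `write_info` are hypothetical, and the JSON layout of "info.txt" is an assumption, since the README only names the four fields. Stock nanoGPT has an `always_save_checkpoint` setting that behaves like the `always_save` argument here, but whether the author used that setting is not stated.

```python
import json
from pathlib import Path


def should_save(val_loss, best_val_loss, always_save):
    """Decide whether to write a checkpoint after an evaluation step.

    always_save=False: save only when validation loss improved.
    always_save=True:  save after every evaluation, accepting the
                       risk of keeping an overfitted checkpoint.
    """
    return always_save or val_loss < best_val_loss


def write_info(folder, iter_num, train_loss, val_loss, config):
    """Write the four fields named in the README to <folder>/info.txt.

    The JSON format is an assumption made for this sketch.
    """
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    info = {
        "iter_num": iter_num,      # number of iterations
        "train_loss": train_loss,  # training loss at time of checkpoint
        "val_loss": val_loss,      # validation loss at time of checkpoint
        "config": config,          # nanoGPT config
    }
    path = folder / "info.txt"
    path.write_text(json.dumps(info, indent=2))
    return path
```

Keeping the save decision in its own small function makes the switch from "save on improvement" to "save every time" a one-argument change.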