MicroPanda123/PythonBasic

Text Generation

MicroPanda123 committed on Jul 14, 2023

Commit 850ba12 · 1 Parent(s): 57df228

Update README.md

Files changed (1)

README.md +6 -1

@@ -13,5 +13,10 @@ batch_size=2
 gradient_accumulation_steps = 64
 ```
 This was because I was training it locally on RTX2060 and did not have enough power to train it on higher settings.
-Current model was trained for 8880 iterations. Took around 20 hours.
+Model is stored in "model" folder that contains model itself and "info.txt" file containing:
+iter_num - number of iterations
+train_loss - training loss at time of checkpoint
+val_loss - validation loss at time of checkpoint
+config - nanoGPT config
 At first I made it only save model after validation loss improved, to not allow overfitting, but after some time I decided to risk it and turned that off and allowed it to save everytime, luckly it worked out fine.
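
For context on the `batch_size=2` and `gradient_accumulation_steps = 64` settings shown in the hunk, here is a minimal sketch of the arithmetic behind them, assuming single-GPU training where gradients from each micro-batch are accumulated before one optimizer step. The function names are hypothetical, and the `block_size` value in the second call is purely illustrative, since the README excerpt does not state it.

```python
def sequences_per_step(batch_size: int, gradient_accumulation_steps: int) -> int:
    """Sequences that contribute to a single optimizer step."""
    return batch_size * gradient_accumulation_steps


def tokens_per_step(batch_size: int, gradient_accumulation_steps: int, block_size: int) -> int:
    """Tokens per optimizer step; block_size is the context length (an assumed parameter)."""
    return sequences_per_step(batch_size, gradient_accumulation_steps) * block_size


# Values taken from the README hunk above.
print(sequences_per_step(2, 64))

# block_size=256 is a made-up value for illustration only.
print(tokens_per_step(2, 64, 256))
```

With the README's values, each optimizer step sees 2 × 64 = 128 sequences, which is how a small per-step batch can still fit on a card like the RTX2060 while keeping the effective batch reasonably large.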