MicroPanda123
committed on
Commit d2e1b5a · 1 Parent(s): c202300
Update README.md
README.md
CHANGED
@@ -12,6 +12,6 @@ eval_iters=40
 batch_size=2
 gradient_accumulation_steps = 64
 ```
-This was because I was training it locally on RTX2060 and did not have enough power to train it
+This was because I was training it locally on RTX2060 and did not have enough power to train it on higher settings.
 Current model was trained for 8880 iterations. Took around 20 hours.
 At first I made it only save model after validation loss improved, to not allow overfitting, but after some time I decided to risk it and turned that off and allowed it to save everytime, luckly it worked out fine.
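
Note on the settings in this hunk: the README pairs a small `batch_size` with a large `gradient_accumulation_steps` and describes saving a checkpoint only when validation loss improves. The sketch below illustrates both ideas in plain Python. It is not the author's training code; the function names and the `always_save` flag are hypothetical and chosen only for illustration.

```python
def effective_batch_size(batch_size: int, gradient_accumulation_steps: int) -> int:
    """Number of samples that contribute to one optimizer step
    when gradients are accumulated over several micro-batches."""
    return batch_size * gradient_accumulation_steps


def should_save(val_loss: float, best_val_loss: float, always_save: bool) -> bool:
    """Decide whether to write a checkpoint after an evaluation.

    always_save (hypothetical flag): when False, save only if validation
    loss improved; when True, save after every evaluation.
    """
    return always_save or val_loss < best_val_loss


# Values taken from the README config shown in the diff.
print(effective_batch_size(2, 64))  # 128
```

With the values from the diff, each optimizer step sees 2 × 64 = 128 samples while only 2 are held in GPU memory at a time, which is the usual reason for this combination on a card with limited memory. Setting the hypothetical `always_save` flag to `True` corresponds to the later choice described in the README to save after every evaluation.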