bhadresh-savani
committed on
Commit: 739a21e
Parent(s): 07580c3
Update README.md
README.md CHANGED
@@ -27,4 +27,13 @@ Instantaneous batch size per device = 16
 Total train batch size (w. parallel, distributed & accumulation) = 16
 Gradient Accumulation steps = 1
 Total optimization steps = 31728
+```
+
+## TrainOutput:
+```
+'train_runtime': 9182.5173,
+'train_samples_per_second': 55.282,
+'train_steps_per_second': 3.455,
+'total_flos': 8968626056739648.0,
+'train_loss': 0.12085497042373672,
 ```
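
The logged figures in this diff can be cross-checked against each other. The sketch below is not part of the commit. It copies the values from the README and assumes that `train_runtime` is in seconds and that the throughput figures are simple ratios over that runtime. The variable names are illustrative only.

```python
# Values copied from the README diff; names are illustrative, not from the commit.
total_optimization_steps = 31728
train_runtime = 9182.5173            # assumed to be seconds
reported_steps_per_second = 3.455
reported_samples_per_second = 55.282

# Steps per second should be close to total steps divided by runtime.
steps_per_second = total_optimization_steps / train_runtime

# Samples per step should be close to the total train batch size (16).
samples_per_step = reported_samples_per_second / reported_steps_per_second

print(f"steps/s recomputed: {steps_per_second:.3f}")
print(f"samples per step:   {samples_per_step:.2f}")
```

Both ratios agree with the logged batch size of 16 and the reported `train_steps_per_second`, which suggests the added `TrainOutput` block belongs to the same run as the log lines above it.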