Update README.md
README.md
CHANGED
@@ -79,11 +79,13 @@ Use the code below to get started with the model.
 
 #### Training Hyperparameters
 
-- **Training regime:**
-
-
-
-
+- **Training regime:** <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+
+- Training regime: Mixed precision training using bf16
+- Number of epochs: 3
+- Learning rate: 1e-4
+- Batch size: 16
+- Seq length: 512
 
 #### Speeds, Sizes, Times [optional]
 
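
Note on the added values: the hyperparameters in this change can be collected into a small config object, which makes derived quantities such as the token budget per batch easy to check. The following is a minimal sketch, not the author's training code. The class name `TrainingHyperparameters`, the `"bf16-mixed"` label, and the helper `max_tokens_per_batch` are hypothetical names introduced for illustration. The README does not say whether the batch size is per device or global, or whether gradient accumulation was used, so the computed figure is only an upper bound for a single batch of 16 sequences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingHyperparameters:
    """Values copied from the README lines added in this diff."""

    # Hypothetical label for "Mixed precision training using bf16".
    precision: str = "bf16-mixed"
    num_epochs: int = 3
    learning_rate: float = 1e-4
    batch_size: int = 16
    seq_length: int = 512

    def max_tokens_per_batch(self) -> int:
        # Upper bound: assumes every sequence is padded or truncated
        # to exactly seq_length tokens.
        return self.batch_size * self.seq_length


hparams = TrainingHyperparameters()
print(hparams.max_tokens_per_batch())  # 16 * 512 = 8192
```

If the batch size turns out to be per device, the effective figure would scale with the device count and any gradient accumulation steps, neither of which is stated in the README.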