cmarkea/bloomz-7b1-mt-sft-chat

Cyrile committed on Sep 14, 2023

Commit d71114e • 1 Parent(s): 0e0abea

Update README.md

Files changed (1)

README.md +9 -9

@@ -52,16 +52,16 @@ Here is the table summarizing the architecture used for training, along with the
 |     Hyperparameter    |    Value   |
 |:---------------------:|:----------:|
-| label smoothing       | 0.05       |
-| optimize              | AdamW      |
-| betas                 | 0.9, 0.999 |
-| learning rate         | 5e-6       |
-| anneal strategy       | cos        |
-| div factor            | 100        |
-| final div factor      | 0.1        |
-| batch size            | 2          |
+|       label smoothing | 0.05       |
+|              optimize | AdamW      |
+|                 betas | 0.9, 0.999 |
+|         learning rate | 5e-6       |
+|       anneal strategy | cos        |
+|            div factor | 100        |
+|      final div factor | 0.1        |
+|            batch size | 2          |
 | gradient accumulation | 200        |
-| max length            | 2048       |
+|            max length | 2048       |
 Experimentations
 ----------------
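
Note on the hyperparameter table touched by this diff: the commit only right-aligns the label column, and the values are unchanged. As a rough aid to reading those values, here is a minimal sketch that derives the effective batch size and the learning-rate bounds. It assumes the `div factor` / `final div factor` / `anneal strategy` rows follow the conventions of PyTorch's `OneCycleLR` scheduler and that `learning rate` is the peak value; the README does not confirm either. The function names and the `num_devices` parameter are hypothetical, introduced only for illustration.

```python
import math

# Values copied from the hyperparameter table in the diff.
HPARAMS = {
    "label_smoothing": 0.05,
    "betas": (0.9, 0.999),
    "learning_rate": 5e-6,      # assumed to be the peak (max) learning rate
    "div_factor": 100,
    "final_div_factor": 0.1,
    "batch_size": 2,
    "gradient_accumulation": 200,
    "max_length": 2048,
}


def effective_batch_size(hp, num_devices=1):
    """Sequences per optimizer step. num_devices is a hypothetical
    parameter: the table does not state how many devices were used."""
    return hp["batch_size"] * hp["gradient_accumulation"] * num_devices


def one_cycle_bounds(hp):
    """(initial, peak, final) learning rates under the assumed
    OneCycleLR convention: initial = peak / div_factor and
    final = initial / final_div_factor."""
    peak = hp["learning_rate"]
    initial = peak / hp["div_factor"]
    final = initial / hp["final_div_factor"]
    return initial, peak, final


def cos_anneal(start, end, pct):
    """Cosine interpolation from start (pct=0) to end (pct=1)."""
    return end + (start - end) / 2.0 * (1.0 + math.cos(math.pi * pct))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(HPARAMS))
    print("lr bounds (initial, peak, final):", one_cycle_bounds(HPARAMS))
```

Under these assumptions, a `final div factor` below 1 means the schedule would end at a learning rate higher than the one it starts from, since the final rate is the initial rate divided by 0.1.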