nihalnayak committed
Commit 1e33e79 · Parent(s): a8e503d
Update README.md
README.md CHANGED
@@ -123,7 +123,7 @@ The training takes about 4 days on four GPUs to complete.
 
 We use the following hyperparameters:
 - Q-LoRA rank (r): 64
-- Q-LoRA scaling factor (
+- Q-LoRA scaling factor (alpha): 4
 - Q-LoRA dropout: 0
 - Optimizer: Paged AdamW
 - Learning rate scheduler: linear
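For reference, the corrected hyperparameter list corresponds roughly to the following fine-tuning configuration. This is a minimal sketch assuming the Hugging Face peft and transformers libraries; the output directory, the 32-bit Paged AdamW variant, and the bias/task_type settings are assumptions not stated in the README.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# Q-LoRA adapter settings from the README's hyperparameter list.
lora_config = LoraConfig(
    r=64,              # Q-LoRA rank (r)
    lora_alpha=4,      # Q-LoRA scaling factor (alpha), fixed by this commit
    lora_dropout=0.0,  # Q-LoRA dropout
    bias="none",       # assumption: not specified in the README
    task_type="CAUSAL_LM",  # assumption: not specified in the README
)

# Optimizer and scheduler settings from the README's hyperparameter list.
training_args = TrainingArguments(
    output_dir="qlora-output",   # hypothetical path
    optim="paged_adamw_32bit",   # Paged AdamW (32-bit variant assumed)
    lr_scheduler_type="linear",  # linear learning-rate scheduler
)
```

Note that with r=64 and alpha=4, the effective LoRA scaling factor alpha/r is 1/16, so the adapter updates are scaled down substantially relative to the base weights.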