kaizuberbuehler committed • Commit 853042c • Parent: 451dbc4

Update README.md

README.md CHANGED @@ -12,22 +12,22 @@ license: llama3
## Training Details

- Hardware: 1x RTX 4090
- Duration: 30 hours in total (2 hours for the first phase and 28 hours for the second phase)

### Hyperparameters

- Adapter: QLoRA
- Precision: 4-bit
- Optimizer: adamw_bnb_8bit
- LoRA Rank: 256
- LoRA Alpha: 256
- Learning Rate: 1e-5
- Context Length: 4096 tokens
- Batch Size: 1
- Gradient Accumulation Steps: 1
- Sample Packing: Off for the first phase, on for the second phase
- Epochs: 2
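As a rough sketch (not tied to whatever training framework was actually used), the hyperparameters above can be gathered into a plain Python dict; all key names below are illustrative, not framework-specific:

```python
# Hyperparameters from the Training Details section, collected into a
# plain dict for reference. Key names are illustrative only.
train_config = {
    "adapter": "qlora",
    "precision_bits": 4,
    "optimizer": "adamw_bnb_8bit",
    "lora_rank": 256,
    "lora_alpha": 256,
    "learning_rate": 1e-5,
    "context_length": 4096,
    "micro_batch_size": 1,
    "gradient_accumulation_steps": 1,
    "epochs": 2,
}

# Effective batch size is micro-batch size times accumulation steps;
# with both set to 1, every optimizer step uses a single sequence.
effective_batch = (
    train_config["micro_batch_size"]
    * train_config["gradient_accumulation_steps"]
)

# With sample packing on (second phase), each step therefore sees at
# most one packed sequence of context_length tokens.
max_tokens_per_step = effective_batch * train_config["context_length"]
```

Note that with LoRA alpha equal to the rank, the common `alpha / rank` scaling factor applied to the adapter output is 1.0.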
## Limitations