mpasila committed b5d0b3d · verified · 1 parent: 03ac665

Update README.md

Files changed (1): README.md +24 -0

README.md CHANGED
@@ -19,6 +19,30 @@ Uses Llama 3.1 formatting.
 
 Merged model: [mpasila/Llama-3.1-Discord-Short-8B](https://huggingface.co/mpasila/Llama-3.1-Discord-Short-8B)
 
+ Trained with regular LoRA (not quantized/QLoRA), with LoRA rank 128 and alpha 32. Trained for 1 epoch on an A40 GPU for about 5.5 hours.
+
+ ```python
+ from unsloth import UnslothTrainingArguments, is_bfloat16_supported
+
+ args = UnslothTrainingArguments(
+     per_device_train_batch_size = 1,
+     gradient_accumulation_steps = 8,
+
+     warmup_ratio = 0.1,
+     num_train_epochs = 1,
+
+     learning_rate = 5e-5,
+     embedding_learning_rate = 5e-6,
+
+     fp16 = not is_bfloat16_supported(),
+     bf16 = is_bfloat16_supported(),
+     logging_steps = 1,
+     optim = "adamw_8bit",
+     weight_decay = 0.00,
+     lr_scheduler_type = "cosine",
+     seed = 3407,
+     output_dir = "outputs",
+ )
+ ```
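Two properties fall out of these settings (a quick illustrative check, not part of the commit): the effective batch size is per-device batch size × gradient accumulation steps = 1 × 8 = 8, and, assuming the usual LoRA scaling convention (update scaled by alpha/rank), rank 128 with alpha 32 gives a scaling factor of 0.25.

```python
# Sanity checks on the hyperparameters above (illustrative only,
# not part of the training script).

# Effective batch size = per-device batch size * gradient accumulation steps.
per_device_train_batch_size = 1
gradient_accumulation_steps = 8
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps
print(effective_batch_size)  # 8

# LoRA typically applies W + (alpha / rank) * B @ A, so rank 128 with
# alpha 32 scales the learned update by 32 / 128 = 0.25.
lora_rank, lora_alpha = 128, 32
print(lora_alpha / lora_rank)  # 0.25
```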
+
 # Uploaded model
 
 - **Developed by:** mpasila