Training hyperparameters

by chenuneris - opened Sep 1, 2023

Discussion

chenuneris

Sep 1, 2023

•

edited Sep 1, 2023

Hey friend,

Amazing work, thanks very much for sharing it.
Could you also share the training hyperparameters, such as the learning rate, optimizer, etc.?

CyberNative

Owner Sep 6, 2023

•

edited Sep 6, 2023

Thanks! Here you go:

lora_r: 256
lora_alpha: 128
lora_dropout: 0.05
lora_target_linear: true

gradient_accumulation_steps: 2
micro_batch_size: 1
num_epochs: 3
optimizer: paged_adamw_32bit
lr_scheduler: cosine
learning_rate: 0.0002

train_on_inputs: false
group_by_length: false
bf16: true
fp16: false
tf32: false

gradient_checkpointing: true
logging_steps: 10
xformers_attention: false
flash_attention: true

warmup_steps: 10
eval_steps: 500
weight_decay: 0.0
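
For anyone reading these values, here is a rough Python sketch of a few quantities they imply. It rests on assumptions not stated in this thread: that the adapter uses the standard LoRA scaling of lora_alpha / lora_r, and that the cosine scheduler uses linear warmup followed by cosine decay to zero. The helper names, the device count, and the total step count are hypothetical, since the dataset size and hardware were not shared.

```python
import math

# Values copied from the config posted above.
config = {
    "lora_r": 256,
    "lora_alpha": 128,
    "gradient_accumulation_steps": 2,
    "micro_batch_size": 1,
    "learning_rate": 0.0002,
    "warmup_steps": 10,
}


def lora_scaling(cfg):
    """Scale applied to the low-rank update, assuming standard alpha / r scaling."""
    return cfg["lora_alpha"] / cfg["lora_r"]


def sequences_per_optimizer_step(cfg, num_devices=1):
    """Sequences consumed per optimizer step; num_devices is a hypothetical input."""
    return cfg["micro_batch_size"] * cfg["gradient_accumulation_steps"] * num_devices


def lr_at_step(step, total_steps, cfg):
    """Learning rate at a step, assuming linear warmup then cosine decay to zero."""
    warmup = cfg["warmup_steps"]
    peak = cfg["learning_rate"]
    if step < warmup:
        return peak * step / warmup
    progress = (step - warmup) / max(1, total_steps - warmup)
    return peak * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total_steps = 1000  # hypothetical; the real value depends on dataset size
    print("LoRA scaling:", lora_scaling(config))
    print("Sequences per optimizer step (1 device):", sequences_per_optimizer_step(config))
    for step in (0, 5, 10, 500, 1000):
        print(step, lr_at_step(step, total_steps, config))
```

Under these assumptions the adapter update is scaled by 0.5, and each optimizer step sees 2 sequences per device. The actual trainer may implement the schedule differently, so treat the learning rate curve as an approximation.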

chenuneris changed discussion status to closed Oct 13, 2023
