# Model Card for Model ID

## Model Details
This model is Llama-3-8B fine-tuned with Direct Preference Optimization (DPO) on the Orca DPO dataset.
## Training Details

### Training Data
Trained on the Orca dataset formatted as DPO preference pairs.
### Training Procedure
NEFTune noise is added to the embeddings for robustness, and the model is fine-tuned with the DPO trainer (a configuration sketch follows the hyperparameter list below).
#### Training Hyperparameters
- lora_alpha = 16
- lora_r = 64
- lora_dropout = 0.1
- adam_beta1 = 0.9
- adam_beta2 = 0.999
- weight_decay = 0.001
- max_grad_norm = 0.3
- learning_rate = 2e-4
- bnb_4bit_quant_type = nf4
- optim = "paged_adamw_32bit"
- max_steps = 5000
- gradient_accumulation_steps = 4
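
A minimal sketch of how these hyperparameters could be wired together with the Hugging Face `transformers`, `peft`, `bitsandbytes`, and `trl` libraries. The base checkpoint (`meta-llama/Meta-Llama-3-8B`), the dataset name (`Intel/orca_dpo_pairs`), the prompt formatting, the NEFTune noise alpha, and the output directory are assumptions not stated in this card, and exact argument names vary across TRL versions.

```python
# Sketch only: reproduces the hyperparameters listed above.
# Base checkpoint, dataset name, prompt formatting, and neftune_noise_alpha are assumptions.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig
from trl import DPOConfig, DPOTrainer

model_id = "meta-llama/Meta-Llama-3-8B"  # assumed base model checkpoint

# 4-bit NF4 quantization (bnb_4bit_quant_type = "nf4")
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

# LoRA adapter matching the listed values
peft_config = LoraConfig(r=64, lora_alpha=16, lora_dropout=0.1, task_type="CAUSAL_LM")

# Assumed Orca DPO-pairs dataset; DPOTrainer expects prompt/chosen/rejected columns
dataset = load_dataset("Intel/orca_dpo_pairs", split="train")
dataset = dataset.map(
    lambda row: {
        "prompt": row["system"] + "\n" + row["question"],
        "chosen": row["chosen"],
        "rejected": row["rejected"],
    }
)

# Optimizer and schedule values from the hyperparameter list above
training_args = DPOConfig(
    output_dir="llama3-8b-orca-dpo",
    learning_rate=2e-4,
    max_steps=5000,
    gradient_accumulation_steps=4,
    optim="paged_adamw_32bit",
    weight_decay=0.001,
    max_grad_norm=0.3,
    adam_beta1=0.9,
    adam_beta2=0.999,
    neftune_noise_alpha=5,  # enables NEFTune; the alpha value is an assumption
)

trainer = DPOTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    tokenizer=tokenizer,  # newer TRL versions name this argument processing_class
    peft_config=peft_config,
)
trainer.train()
```

Because `DPOConfig` extends `TrainingArguments`, the optimizer, gradient-clipping, and NEFTune settings are all passed through it, while the LoRA and 4-bit quantization settings live in `LoraConfig` and `BitsAndBytesConfig` respectively.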