Llama 3 8B fine-tuned on mlabonne/orpo-dpo-mix-40k with ORPO (Odds Ratio Preference Optimization).
The maximum sequence length was reduced to 1024 tokens, and LoRA (r=16) and 4-bit quantization were used to improve memory efficiency.
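For reference, below is a minimal sketch of what this setup could look like with TRL's `ORPOTrainer`. Only the max length (1024), LoRA rank (16), 4-bit quantization, and dataset are taken from this card; every other hyperparameter (LoRA alpha, target modules, learning rate, batch size, epochs) is an assumption, not the value actually used.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import ORPOConfig, ORPOTrainer

model_id = "meta-llama/Meta-Llama-3-8B"  # assumed base checkpoint

# 4-bit quantization (NF4) to fit the 8B model in limited VRAM
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# LoRA with r=16 as stated in the card; alpha, dropout, and targets are assumptions
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Depending on the TRL version, the chat-formatted chosen/rejected columns
# may first need to be rendered to strings with tokenizer.apply_chat_template.
dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train")

orpo_args = ORPOConfig(
    output_dir="llama-3-8b-orpo",
    max_length=1024,                # stated in the card
    max_prompt_length=512,          # assumption
    per_device_train_batch_size=2,  # assumption
    gradient_accumulation_steps=8,  # assumption
    learning_rate=8e-6,             # assumption
    num_train_epochs=1,             # assumption
)

trainer = ORPOTrainer(
    model=model,
    args=orpo_args,
    train_dataset=dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```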

| Benchmark | Llama 3 8B | Llama 3 8B Instruct | Llama 3 8B ORPO V1 | Llama 3 8B ORPO V2 (WIP) |
|---|---|---|---|---|
| MMLU | 62.12 | 63.92 | 61.87 | TBD |
| BoolQ | 81.04 | 83.21 | 82.42 | TBD |
| Winogrande | 73.24 | 72.06 | 74.43 | TBD |
| ARC-Challenge | 53.24 | 56.91 | 52.90 | TBD |
| TriviaQA | 63.33 | 51.09 | 63.93 | TBD |
| GSM-8K (flexible) | 50.27 | 75.13 | 52.16 | TBD |
| SQuAD V2 (F1) | 32.48 | 29.68 | 33.68 | TBD |
| LogiQA | 29.23 | 32.87 | 30.26 | TBD |
All scores were obtained with lm-evaluation-harness v0.4.2.
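To re-run the evaluation, the harness can be driven from Python via `lm_eval.simple_evaluate`. The sketch below assumes v0.4.2; the exact task names (in particular for SQuAD V2) and the batch size are assumptions and may need adjusting.

```python
# Sketch of re-running the evaluation with lm-evaluation-harness v0.4.2;
# task names and batch size are assumptions, not the exact command used.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=Z3R6X/Llama-3-8B-ORPO-V1,dtype=float16",
    tasks=["mmlu", "boolq", "winogrande", "arc_challenge",
           "triviaqa", "gsm8k", "logiqa"],  # SQuAD V2 task name varies by version
    batch_size=8,
)
print(results["results"])
```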