
ORPO

`Updates (24.03.25)`

A sample script for ORPOTrainer in 🤗TRL has been added at trl/test_orpo_trainer_demo.py
A new model, 🤗kaist-ai/mistral-orpo-capybara-7k, has been added to the 🤗ORPO Collection
You can now try ORPO in 🤗TRL and Axolotl🔥
We are preparing a general guideline for training LLMs with ORPO; stay tuned🔥
Mistral-ORPO-β achieved a 14.7% length-controlled (LC) win rate on the official AlpacaEval Leaderboard🔥

This is the official repository for ORPO: Monolithic Preference Optimization without Reference Model. The detailed results from the paper can be found in the sections below.

`Model Checkpoints`

Our models trained with ORPO can be found at:

Mistral-ORPO-Capybara-7k: 🤗 kaist-ai/mistral-orpo-capybara-7k
Mistral-ORPO-⍺: 🤗 kaist-ai/mistral-orpo-alpha
Mistral-ORPO-β: 🤗 kaist-ai/mistral-orpo-beta

The corresponding training logs, which track the average log probabilities of the chosen and rejected responses, are reported in:

Mistral-ORPO-Capybara-7k: TBU
Mistral-ORPO-⍺: Wandb Report for Mistral-ORPO-⍺
Mistral-ORPO-β: Wandb Report for Mistral-ORPO-β
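
The average log probabilities in these logs are the quantities from which ORPO's odds-ratio term is built. The sketch below is a minimal pure-Python illustration, assuming the formulation in the paper: the odds of a response are `p / (1 - p)`, where `p` is the exponential of its average token log probability, and the penalty is the negative log-sigmoid of the log odds ratio between the chosen and rejected responses. The function names and the `lam` weight are hypothetical and are not taken from this repository's training code.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logp), with avg_logp < 0."""
    # log1p(-exp(x)) computes log(1 - p) without forming 1 - p directly.
    return avg_logp - math.log1p(-math.exp(avg_logp))


def odds_ratio_loss(chosen_avg_logp: float, rejected_avg_logp: float) -> float:
    """Negative log-sigmoid of the log odds ratio (chosen vs. rejected)."""
    z = log_odds(chosen_avg_logp) - log_odds(rejected_avg_logp)
    # -log(sigmoid(z)) = softplus(-z), written in a numerically stable form.
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


def orpo_objective(nll: float, chosen_avg_logp: float,
                   rejected_avg_logp: float, lam: float) -> float:
    """Illustrative combination: SFT loss plus lam times the odds-ratio term.

    `nll` is the usual negative log-likelihood on the chosen response and
    `lam` is a weighting hyperparameter (hypothetical name).
    """
    return nll + lam * odds_ratio_loss(chosen_avg_logp, rejected_avg_logp)


if __name__ == "__main__":
    # The chosen response is more likely than the rejected one: small penalty.
    print(odds_ratio_loss(-0.5, -1.5))
    # The rejected response is more likely than the chosen one: larger penalty.
    print(odds_ratio_loss(-1.5, -0.5))
```

When both responses have the same average log probability, the term equals `log(2)`; it shrinks as the chosen response becomes more likely relative to the rejected one, which is the trend to look for in the logged curves.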

`AlpacaEval`

**Figure 1.** AlpacaEval 2.0 scores for the models trained with different alignment methods.

`MT-Bench`

`IFEval`

IFEval scores are measured with EleutherAI/lm-evaluation-harness, with the chat template applied. The scores for Llama-2-Chat (70B), Zephyr-β (7B), and Mixtral-8X7B-Instruct-v0.1 were originally reported in this tweet.

| Model Type | Prompt-Strict | Prompt-Loose | Inst-Strict | Inst-Loose |
|---|---|---|---|---|
| Llama-2-Chat (70B) | 0.4436 | 0.5342 | 0.5468 | 0.6319 |
| Zephyr-β (7B) | 0.4233 | 0.4547 | 0.5492 | 0.5767 |
| Mixtral-8X7B-Instruct-v0.1 | 0.5213 | 0.5712 | 0.6343 | 0.6823 |
| Mistral-ORPO-⍺ (7B) | 0.5009 | 0.5083 | 0.5995 | 0.6163 |
| Mistral-ORPO-β (7B) | 0.5287 | 0.5564 | 0.6355 | 0.6619 |
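
To make the column names concrete, the sketch below shows how prompt-level and instruction-level accuracy differ, assuming the usual IFEval definitions: a prompt counts as correct only if every instruction attached to it is followed, while instruction-level accuracy pools all instructions across prompts. The function name and the toy data are hypothetical; the strict and loose variants differ only in how each individual instruction check is performed, not in this aggregation.

```python
from typing import List, Tuple


def ifeval_accuracy(results: List[List[bool]]) -> Tuple[float, float]:
    """Aggregate per-instruction pass/fail flags into two accuracies.

    `results` holds one inner list per prompt, with one boolean per
    instruction in that prompt (True means the instruction was followed).
    Returns (prompt_level_accuracy, instruction_level_accuracy).
    """
    # Prompt level: a prompt passes only if all of its instructions pass.
    prompt_level = sum(all(flags) for flags in results) / len(results)
    # Instruction level: every instruction is counted independently.
    total_instructions = sum(len(flags) for flags in results)
    inst_level = sum(sum(flags) for flags in results) / total_instructions
    return prompt_level, inst_level


if __name__ == "__main__":
    # Toy data: three prompts carrying 2, 2, and 1 instructions.
    toy = [[True, True], [True, False], [False]]
    print(ifeval_accuracy(toy))
```

Because a single missed instruction fails the whole prompt, prompt-level scores are never higher than the matching instruction-level scores, which is consistent with every row of the table.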
