# Mistral-Small-Drummer-22B
mistralai/Mistral-Small-Instruct-2409 fine-tuned on jondurbin/gutenberg-dpo-v0.1 and nbeerbower/gutenberg2-dpo.
## Method
ORPO tuned with 2x A40 on RunPod for 1 epoch.

Hyperparameters:

- `learning_rate=4e-6`
- `lr_scheduler_type="linear"`
- `beta=0.1`
- `per_device_train_batch_size=4`
- `per_device_eval_batch_size=4`
- `gradient_accumulation_steps=8`
- `optim="paged_adamw_8bit"`
- `num_train_epochs=1`
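The hyperparameter names above match TRL's `ORPOConfig`; a minimal sketch of the corresponding configuration, assuming TRL was the training library (the card does not name it):

```python
# Sketch only: ORPOConfig field names are TRL's; the card lists the values
# but does not state which trainer produced the model.
from trl import ORPOConfig

config = ORPOConfig(
    learning_rate=4e-6,
    lr_scheduler_type="linear",
    beta=0.1,                       # weight of ORPO's odds-ratio penalty term
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=8,  # effective batch: 4 x 8 x 2 GPUs = 64
    optim="paged_adamw_8bit",       # paged 8-bit AdamW (bitsandbytes)
    num_train_epochs=1,
)
```

This config would then be passed to an `ORPOTrainer` together with the model and the prompt/chosen/rejected dataset.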
The dataset was prepared using the Mistral-Small Instruct format.
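A sketch of what that preparation might look like for one DPO record. The `[INST]` template and spacing below are assumptions; in practice the base model tokenizer's `apply_chat_template` should be used instead of hand-written tags:

```python
# Hypothetical helper: wraps a gutenberg-dpo-style record (prompt, chosen,
# rejected) in Mistral instruct tags. The exact template is an assumption.

def format_pair(prompt: str, chosen: str, rejected: str) -> dict:
    """Return a prompt/chosen/rejected dict in Mistral [INST] format."""
    return {
        "prompt": f"<s>[INST] {prompt}[/INST]",
        "chosen": f" {chosen}</s>",
        "rejected": f" {rejected}</s>",
    }
```

Applied over each row of the two Gutenberg DPO datasets, this yields the prompt/chosen/rejected columns a preference trainer expects.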
## Open LLM Leaderboard Evaluation Results
Detailed results can be found on the Open LLM Leaderboard.
| Metric | Value |
|---|---|
| Avg. | 29.45 |
| IFEval (0-Shot) | 63.31 | 
| BBH (3-Shot) | 40.12 | 
| MATH Lvl 5 (4-Shot) | 16.69 | 
| GPQA (0-shot) | 12.42 | 
| MuSR (0-shot) | 9.80 | 
| MMLU-PRO (5-shot) | 34.39 | 

