This model is a fine-tuned version of Qwen/Qwen2-72B-Instruct. My goal was to create a versatile and robust model that performs well across a wide range of benchmarks and real-world applications.
This model is suitable for a wide range of applications.
All GGUF models are available here: MaziyarPanahi/calme-2.1-qwen2-72b-GGUF
Detailed results can be found here
Metric | Value |
---|---|
Avg. | 43.61 |
IFEval (0-shot) | 81.63 |
BBH (3-shot) | 57.33 |
MATH Lvl 5 (4-shot) | 36.03 |
GPQA (0-shot) | 17.45 |
MuSR (0-shot) | 20.15 |
MMLU-PRO (5-shot) | 49.05 |
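
The Avg. row is consistent with an unweighted mean of the six benchmark scores. A quick check, as a minimal sketch (the `scores` dictionary is only an illustrative container for the values in the table):

```python
# Benchmark scores copied from the table above.
scores = {
    "IFEval": 81.63,
    "BBH": 57.33,
    "MATH Lvl 5": 36.03,
    "GPQA": 17.45,
    "MuSR": 20.15,
    "MMLU-PRO": 49.05,
}

# Unweighted mean of the six scores.
average = sum(scores.values()) / len(scores)
print(f"Avg. = {average:.2f}")  # → Avg. = 43.61
```
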

Tasks | Version | Filter | n-shot | Metric | Value | | Stderr |
---|---|---|---|---|---|---|---|
truthfulqa_mc2 | 2 | none | 0 | acc | 0.6761 | ± | 0.0148 |
winogrande | 1 | none | 5 | acc | 0.8248 | ± | 0.0107 |
arc_challenge | 1 | none | 25 | acc | 0.6852 | ± | 0.0136 |
arc_challenge | 1 | none | 25 | acc_norm | 0.7184 | ± | 0.0131 |
gsm8k | 3 | strict-match | 5 | exact_match | 0.8582 | ± | 0.0096 |
gsm8k | 3 | flexible-extract | 5 | exact_match | 0.8893 | ± | 0.0086 |
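
The Stderr column gives the standard error of each score. Assuming a normal approximation (an assumption made here, not something stated in the results), an approximate 95% interval is the value plus or minus 1.96 times the standard error. A minimal sketch, where `confidence_interval` is a hypothetical helper and the numbers are copied from the results above:

```python
def confidence_interval(value, stderr, z=1.96):
    """Return an approximate (lower, upper) interval under a normal approximation."""
    margin = z * stderr
    return value - margin, value + margin


# label -> (value, stderr), copied from the results above
results = {
    "truthfulqa_mc2 acc": (0.6761, 0.0148),
    "winogrande acc": (0.8248, 0.0107),
    "arc_challenge acc": (0.6852, 0.0136),
    "arc_challenge acc_norm": (0.7184, 0.0131),
    "gsm8k exact_match (strict-match)": (0.8582, 0.0096),
    "gsm8k exact_match (flexible-extract)": (0.8893, 0.0086),
}

for label, (value, stderr) in results.items():
    low, high = confidence_interval(value, stderr)
    print(f"{label}: {value:.4f} (approx. 95% interval {low:.4f} to {high:.4f})")
```

Intervals this narrow suggest that differences of a few points between models on these tasks are likely larger than the sampling noise, though the normal approximation is only a rough guide.
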
This model uses the ChatML prompt template:
<|im_start|>system
{System}
<|im_end|>
<|im_start|>user
{User}
<|im_end|>
<|im_start|>assistant
{Assistant}
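
The template above can be filled in with plain string formatting. The sketch below follows the layout exactly as shown and leaves the assistant turn open so the model can complete it. `build_chatml_prompt` is a hypothetical helper, not part of any library; in practice, the tokenizer's `apply_chat_template` method is usually the safer choice because it uses the template shipped with the model.

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Fill the ChatML template, ending at the open assistant turn."""
    return (
        "<|im_start|>system\n"
        f"{system}\n"
        "<|im_end|>\n"
        "<|im_start|>user\n"
        f"{user}\n"
        "<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


# Example values; the system message here is only an illustration.
prompt = build_chatml_prompt("You are a helpful assistant.", "Who are you?")
print(prompt)
```
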
# Use a pipeline as a high-level helper
from transformers import pipeline

messages = [
    {"role": "user", "content": "Who are you?"},
]
# The pipeline applies the model's chat template to the message list.
pipe = pipeline("text-generation", model="MaziyarPanahi/calme-2.1-qwen2-72b")
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("MaziyarPanahi/calme-2.1-qwen2-72b")
model = AutoModelForCausalLM.from_pretrained("MaziyarPanahi/calme-2.1-qwen2-72b")
As with any large language model, users should be aware of potential biases and limitations. I recommend implementing appropriate safeguards and human oversight when deploying this model in production environments.