MaziyarPanahi/calme-2.1-qwen2-72b-GGUF

The GGUF quantized models in this repository are based on the MaziyarPanahi/calme-2.1-qwen2-72b model.

How to download

Instead of cloning the entire repository, you can download only the quants you need:

huggingface-cli download MaziyarPanahi/calme-2.1-qwen2-72b-GGUF --local-dir . --include '*Q2_K*gguf'
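
The same selective download can be done from Python. The following is a minimal sketch, assuming the `huggingface_hub` package is installed; its `snapshot_download` function accepts `allow_patterns`, which plays the role of `--include` in the command above. The helper names `quant_pattern` and `download_quant` are hypothetical and exist only for illustration.

```python
def quant_pattern(quant: str) -> str:
    """Build the glob that selects one quantization level, e.g. '*Q2_K*gguf'."""
    return f"*{quant}*gguf"


def download_quant(quant: str = "Q2_K", local_dir: str = ".") -> str:
    """Download only the files matching one quant and return the local folder path."""
    # Imported lazily so quant_pattern() works even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="MaziyarPanahi/calme-2.1-qwen2-72b-GGUF",
        local_dir=local_dir,
        allow_patterns=[quant_pattern(quant)],
    )
```

Calling `download_quant("Q2_K")` should fetch only the Q2_K files into the current directory; pass a different quant name to select another file set.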

Load GGUF models

./llama.cpp/main -m model_name.Q2_K.gguf -p "<|im_start|>user\nJust say 1, 2, 3 hi and NOTHING else\n<|im_end|>\n<|im_start|>assistant\n" -n 1024

Original README

MaziyarPanahi/calme-2.1-qwen2-72b

This is a fine-tuned version of the Qwen/Qwen2-72B-Instruct model. It aims to improve the base model across all benchmarks.

⚡ Quantized GGUF

All GGUF models are available here: MaziyarPanahi/calme-2.1-qwen2-72b-GGUF

🏆 Open LLM Leaderboard Evaluation Results

Tasks	Version	Filter	n-shot	Metric	Value		Stderr
truthfulqa_mc2	2	none	0	acc	0.6761	±	0.0148
winogrande	1	none	5	acc	0.8248	±	0.0107
arc_challenge	1	none	25	acc	0.6852	±	0.0136
arc_challenge	1	none	25	acc_norm	0.7184	±	0.0131
gsm8k	3	strict-match	5	exact_match	0.8582	±	0.0096
gsm8k	3	flexible-extract	5	exact_match	0.8893	±	0.0086

Prompt Template

This model uses the ChatML prompt template:

<|im_start|>system
{System}
<|im_end|>
<|im_start|>user
{User}
<|im_end|>
<|im_start|>assistant
{Assistant}
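
When passing a raw prompt string (for example to llama.cpp with `-p`), the template has to be filled in by hand. The sketch below renders a single-turn prompt in exactly the layout shown above, leaving the assistant turn open for the model to complete. The function name `build_chatml_prompt` and the sample system message are assumptions made for illustration, not part of the model's tooling.

```python
from typing import Optional


def build_chatml_prompt(user: str, system: Optional[str] = None) -> str:
    """Render a single-turn prompt in the ChatML layout used by this model card."""
    parts = []
    if system is not None:
        parts.append(f"<|im_start|>system\n{system}\n<|im_end|>\n")
    parts.append(f"<|im_start|>user\n{user}\n<|im_end|>\n")
    # Leave the assistant turn open so the model writes the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


# Illustrative values only; any system and user text can be substituted.
prompt = build_chatml_prompt("Who are you?", system="You are a helpful assistant.")
print(prompt)
```

When using `transformers`, the tokenizer's chat template normally handles this formatting, so a manual builder like this is mainly useful for tools that take plain prompt strings.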

How to use


# Use a pipeline as a high-level helper

from transformers import pipeline

messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe = pipeline("text-generation", model="MaziyarPanahi/calme-2.1-qwen2-72b")
# Print the result so the generated reply is visible when run as a script
print(pipe(messages))


# Load model directly

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("MaziyarPanahi/calme-2.1-qwen2-72b")
model = AutoModelForCausalLM.from_pretrained("MaziyarPanahi/calme-2.1-qwen2-72b")
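
Loading the model directly does not by itself produce text. The following sketch shows one way to go from chat messages to a decoded reply, assuming `transformers`, `torch`, and `accelerate` are installed and that enough GPU memory is available for a 72B model. The helper names `build_messages` and `generate_reply`, and the choices of `torch_dtype="auto"`, `device_map="auto"`, and `max_new_tokens=256`, are assumptions for illustration rather than settings recommended by the model author.

```python
def build_messages(user, system=None):
    """Return a chat message list in the role/content format transformers expects."""
    messages = []
    if system is not None:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": user})
    return messages


def generate_reply(messages, max_new_tokens=256):
    """Load the model, apply its chat template, and return the decoded reply."""
    # Imported lazily: loading transformers and the 72B weights is expensive.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "MaziyarPanahi/calme-2.1-qwen2-72b"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # Assumed settings: device_map="auto" needs the accelerate package.
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )

    # The tokenizer's chat template inserts the ChatML markers.
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens and decode only the newly generated ones.
    prompt_len = inputs["input_ids"].shape[-1]
    return tokenizer.decode(output_ids[0, prompt_len:], skip_special_tokens=True)
```

For example, `generate_reply(build_messages("Who are you?"))` would return the model's answer as a plain string.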
