Gemma-2-Ataraxy-Gemmasutra-9B-slerp
💻 Usage
# Notebook cell: install dependencies (drop the leading "!" in a regular shell)
!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "recoilme/Gemma-2-Ataraxy-Gemmasutra-9B-slerp"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Render the chat messages into a single prompt string using the model's chat template
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Build a text-generation pipeline; device_map="auto" lets accelerate place the weights
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample up to 256 new tokens and print the result
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
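By default, the `text-generation` pipeline returns the prompt followed by the completion in `generated_text`. Below is a minimal sketch of separating the two, assuming that default behavior. `extract_completion` is a hypothetical helper, not part of transformers, and the sample data is a stand-in shaped like the pipeline's return value, not real model output.

```python
def extract_completion(prompt: str, outputs: list[dict]) -> str:
    """Return only the newly generated text from a text-generation result.

    Assumes each item stores the prompt followed by the completion under
    "generated_text", which is the pipeline default (return_full_text=True).
    """
    full_text = outputs[0]["generated_text"]
    if full_text.startswith(prompt):
        # Drop the echoed prompt and surrounding whitespace.
        return full_text[len(prompt):].strip()
    # The prompt was not echoed back, so the text is already completion-only.
    return full_text.strip()


# Stand-in data (hypothetical prompt and answer, for illustration only).
example_prompt = "QUESTION: Hi\nANSWER: "
example_outputs = [{"generated_text": example_prompt + "Hello there.\n"}]
print(extract_completion(example_prompt, example_outputs))
```

Passing `return_full_text=False` in the pipeline call is an alternative: the pipeline then omits the prompt from `generated_text` itself.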
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
| Metric | Value |
|---|---|
| Avg. | 29.87 |
| IFEval (0-Shot) | 76.49 |
| BBH (3-Shot) | 42.25 |
| MATH Lvl 5 (4-Shot) | 1.74 |
| GPQA (0-shot) | 10.74 |
| MuSR (0-shot) | 12.39 |
| MMLU-PRO (5-shot) | 35.63 |
Open Portuguese LLM Leaderboard Evaluation Results
Detailed results can be found here and on the 🚀 Open Portuguese LLM Leaderboard
| Metric | Value |
|---|---|
| Average | 73.97 |
| ENEM Challenge (No Images) | 75.65 |
| BLUEX (No Images) | 64.26 |
| OAB Exams | 53.76 |
| Assin2 RTE | 93.21 |
| Assin2 STS | 80.91 |
| FaQuAD NLI | 77.39 |
| HateBR Binary | 87.61 |
| PT Hate Speech Binary | 66.84 |
| tweetSentBR | 66.14 |
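
The Avg. and Average rows in the two tables above are consistent with an unweighted arithmetic mean of the individual benchmark scores. Here is a quick check, assuming equal weighting (the leaderboards' own aggregation may differ in detail; the variable and function names are illustrative):

```python
from statistics import mean

# Scores copied from the Open LLM Leaderboard table.
open_llm = {
    "IFEval (0-Shot)": 76.49,
    "BBH (3-Shot)": 42.25,
    "MATH Lvl 5 (4-Shot)": 1.74,
    "GPQA (0-shot)": 10.74,
    "MuSR (0-shot)": 12.39,
    "MMLU-PRO (5-shot)": 35.63,
}

# Scores copied from the Open Portuguese LLM Leaderboard table.
open_pt_llm = {
    "ENEM Challenge (No Images)": 75.65,
    "BLUEX (No Images)": 64.26,
    "OAB Exams": 53.76,
    "Assin2 RTE": 93.21,
    "Assin2 STS": 80.91,
    "FaQuAD NLI": 77.39,
    "HateBR Binary": 87.61,
    "PT Hate Speech Binary": 66.84,
    "tweetSentBR": 66.14,
}


def table_average(scores: dict[str, float]) -> float:
    """Unweighted mean of the benchmark scores, rounded to two decimals."""
    return round(mean(scores.values()), 2)


print(table_average(open_llm))     # 29.87
print(table_average(open_pt_llm))  # 73.97
```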
Evaluation results
- strict accuracy on IFEval (0-Shot), Open LLM Leaderboard: 76.490
- normalized accuracy on BBH (3-Shot), Open LLM Leaderboard: 42.250
- exact match on MATH Lvl 5 (4-Shot), Open LLM Leaderboard: 1.740
- acc_norm on GPQA (0-shot), Open LLM Leaderboard: 10.740
- acc_norm on MuSR (0-shot), Open LLM Leaderboard: 12.390
- accuracy on MMLU-PRO (5-shot), test set, Open LLM Leaderboard: 35.630
- accuracy on ENEM Challenge (No Images), Open Portuguese LLM Leaderboard: 75.650
- accuracy on BLUEX (No Images), Open Portuguese LLM Leaderboard: 64.260
- accuracy on OAB Exams, Open Portuguese LLM Leaderboard: 53.760
- f1-macro on Assin2 RTE, test set, Open Portuguese LLM Leaderboard: 93.210