FuseO1-DeekSeekR1-QwQ-SkyT1-32B-Preview
#1 opened by AaronFeng753
I know this name is a mouthful, but this model scores even higher than the official DeepSeek-R1-Distill-Qwen-32B & QwQ-32B-Preview models on AIME24, MATH500, and GPQA-Diamond.
https://huggingface.co/FuseAI/FuseO1-DeekSeekR1-QwQ-SkyT1-32B-Preview
Models | AIME24 | MATH500 | GSM8K | GPQA-Diamond | ARC-Challenge | MMLU-Pro | MMLU | LiveCodeBench |
---|---|---|---|---|---|---|---|---|
o1-preview | 44.60 | 85.50 | - | 73.30 | - | - | 90.80 | - |
o1-mini | 63.60 | 90.00 | - | 60.00 | - | 80.30 | 85.20 | 53.80 |
deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | 46.67 | 88.20 | - | 57.58 | - | - | - | - |
Qwen/QwQ-32B-Preview | 43.33 | 87.80 | 95.45 | 49.49 | 95.73 | 63.49 | 85.19 | 51.86 |
NovaSky-AI/Sky-T1-32B-Preview | 43.33 | 86.80 | 95.15 | 50.51 | 95.56 | 65.80 | 82.71 | 51.66 |
Qwen/Qwen2.5-32B-Instruct | 20.00 | 81.60 | 93.63 | 46.46 | 95.22 | 56.27 | 79.63 | 48.53 |
FuseAI/FuseO1-DeekSeekR1-Qwen2.5-Instruct-32B-Preview | 46.67 | 87.20 | - | 55.05 | - | - | - | - |
FuseAI/FuseO1-DeekSeekR1-QwQ-32B-Preview | 56.67 | 85.60 | - | 62.12 | - | - | - | - |
FuseAI/FuseO1-DeekSeekR1-QwQ-SkyT1-32B-Preview | 60.00 | 90.00 | - | 62.12 | - | - | - | - |
Could you consider uploading a GGUF for this model? Thanks!
@AaronFeng753 Hey, I've seen you around lately. I just got the Q4_K_M of that to try from the "official" repo: https://huggingface.co/FuseAI/FuseO1-DeekSeekR1-QwQ-SkyT1-32B-Preview-GGUF/tree/main
But yeah, folks including myself like bartowski's IQ flavors too!
Curious to see how the new DeepSeek-R1-Distill merges pan out!
I plan to make this soon, pipeline is clogged with R1 full haha :)