This is an MXFP4 quant of Qwen3-Next-80B-A3B-Thinking.
Download the latest llama.cpp in order to use it.
The context has been extended from 256k to 1M with YaRN, as described in the original repo.
To enable it, run llama.cpp with options like: `--ctx-size 0 --rope-scaling yarn --rope-scale 4`
`--ctx-size 0` uses the full 1M context; otherwise set a smaller value such as 524288 for 512k.
You can also use it as normal if you don't want the extended context.
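As a minimal sketch, a llama-server invocation with the extended context could look like the following; the GGUF filename is a placeholder for whichever quant file you actually downloaded:

```bash
# Sketch: serve the model with the extended 1M context enabled via YaRN.
# The filename below is a placeholder for the downloaded MXFP4 GGUF.
./llama-server \
  -m Qwen3-Next-80B-A3B-Thinking-MXFP4.gguf \
  --ctx-size 0 \
  --rope-scaling yarn \
  --rope-scale 4
# --ctx-size 0 uses the model's full context (1M here);
# pass e.g. --ctx-size 524288 instead for 512k.
```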
Base model: Qwen/Qwen3-Next-80B-A3B-Thinking (4-bit quantization)