Great model

#1
by SerialKicked - opened

This is seriously a very good model, no idea how I didn't notice it sooner.

It's showing a surprisingly good understanding of scenes and continuity (at least in the 20K context range). it's adaptive, and the intelligence is decently preserved. Qwentile is a great addition in that regard.

I disagree regarding your statement that it's not a CoT model, though. While it'll very rarely fumble, most of the time it can do CoT just fine as long as the thinking tag is prefilled. CoT will damage style for creative tasks, sure (and as usual, really), but it works just fine for more complex 1 shot questions / evaluations. (I guess the fumbles are a result of the think tags not being tokenized)

Some Chinese occasionally slips through, even when system prompted not to (it also translates it most of the time in the same sentence, which is kinda funny), but it's so rare that it's hard to hold it against the model. Plus maybe the quantization (q4ks) I'm running at has a part in it.

Thanks. I have noticed Chinese sneaks in as well, but it seems to generally translate to the correct concept. Alibaba did a really good job getting cross-language understanding; which probably organized the hidden space in a way that put the tokens very close together.

Yeah they did a good job with language support. It seems more rare with Qwen 3 models, but they have there own different host of issues. As I said, it's really not a common / big problem. I just had to find something to nitpick.

Anyway, I'll publish a IQ4_NL version, as I noticed mradermacher didn't make one, and it's quickly becoming my favorite quant level at 24GB.

Have a nice week-end.

I'll be sure to link it once you post it.

I generally use the IQ4_NL quants myself unless mixed precision ones are available. If you are interested in advanced quantization techniques, you should look in to how unsloth does their UD quants, and w4a16/w4a8 style consolidating around https://github.com/vllm-project/llm-compressor.

maldv changed discussion status to closed

Thanks for the link, I'll have a look!

It's done btw.
https://huggingface.co/SerialKicked/QwentileLambda2.5-32B-Instruct-GGUF-IQ4_NL

Cheers.

Sign up or log in to comment