request: iMat GGUF quants for QueenLiz-120B

#11
by Noodlz - opened

https://huggingface.co/Noodlz/QueenLiz-120B

Hi! I just made a new 120B merge and would love your help quantizing this into iMat GGUF formats for the community to try out.

Sure, congrats on your "first successful merge". It's at the top of my queue now, but it will take a few days.

(Hmm, QuartetAnemoi, that gets me interested.)

Thanks!! Excited to see people try it out and put it through its paces so I can learn, make more, and improve :)

Cheers - https://huggingface.co/mradermacher/QueenLiz-120B-i1-GGUF

It's halfway there; the remaining quants will come in over the next day or two.
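In case anyone wants to pull a single quant as the files land, here is a minimal sketch using huggingface_hub; the filename below is hypothetical, so check the repo's file list for the actual names (large quants may also be split into parts):

```python
# Minimal sketch: download one quant file from the repo.
# The filename is hypothetical; check the repo for the real files.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="mradermacher/QueenLiz-120B-i1-GGUF",
    filename="QueenLiz-120B.i1-Q4_K_M.gguf",  # hypothetical name
)
print(path)  # local path to the downloaded GGUF
```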

mradermacher changed discussion status to closed

Or maybe it will be slower, as it is currently sharing the server with Grok, so it runs at about half speed.

Oh wow~~~ this is amazing. Thanks!!! Will share on my thread on Reddit =)

Hope it comes in useful.

I gave it a try as well, and it looks quite good actually, although I feel it deviates from instructions more often than QuartetAnemoi.

I'll also quantize and try QLiz and see what that does.

Such a pity: it seems QLiz immediately overflows when trying to create an imatrix. This is likely a problem with the model weights. Such a shame; I was looking forward to trying it out, but it seems there will be no imatrix quants (only static ones).

Oh interesting. Is it because of the way I merged it? I did a linear merge and sliced it at roughly every 20 layers out of the 80 layers, but staggered (so layers 1-20 for one slice of model A, 10-30 for model B, 20-40 for model A, and so on; see the sketch below). If you have any insights on how I can improve, please do let me know =)
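For illustration, here is a minimal sketch of the staggered layout described above; the source names are placeholders, and the numbers (80 layers, 20-layer slices, 10-layer stride) are taken from the post rather than from any actual merge config:

```python
# Hypothetical sketch of the staggered slicing described above.
# "model_a" / "model_b" are placeholders, not the actual repos.

def staggered_slices(total_layers=80, slice_len=20, stride=10):
    """Yield (source_model, start, end) layer ranges, alternating sources."""
    sources = ["model_a", "model_b"]
    start, i = 0, 0
    while start + slice_len <= total_layers:
        yield sources[i % 2], start, start + slice_len
        start += stride
        i += 1

for src, lo, hi in staggered_slices():
    print(f"{src}: layers {lo}-{hi}")
# model_a: layers 0-20
# model_b: layers 10-30
# model_a: layers 20-40
# ... up to layers 60-80
```

Stacking overlapping slices like this yields roughly 140 layers from two 80-layer models, which is presumably how the result ends up around 120B.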

It's relatively common: basically, some values overflow somewhere among the many billions of entries in a tensor, and the error propagates. It happens to a lot of models. I am not an expert, but I think there is nothing that can be done "right": it sometimes happens, and you have no real way of preventing it. There are also half a dozen other problems that can occur (many of which are bugs in llama.cpp), so it's really just bad luck.
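For what it's worth, here is a rough first-pass sanity check one could run on a merge before attempting an imatrix; this is only a sketch, not mradermacher's actual pipeline. It assumes the merged weights are saved as .safetensors shards, the directory name is a placeholder, and the fp16 headroom threshold is an arbitrary heuristic. Since the overflow often occurs in fp16 activations during the imatrix run itself, clean weights here are no guarantee:

```python
# Rough first-pass check: scan merged safetensors shards for non-finite
# weights and for magnitudes close to the fp16 limit. Clean weights do
# not rule out activation overflow; this only catches the obvious cases.
import glob

import torch
from safetensors import safe_open

FP16_MAX = 65504.0  # largest finite fp16 value

def scan_weights(model_dir: str) -> None:
    for shard in sorted(glob.glob(f"{model_dir}/*.safetensors")):
        with safe_open(shard, framework="pt") as f:
            for name in f.keys():
                t = f.get_tensor(name).float()
                if not torch.isfinite(t).all():
                    print(f"{shard}: {name} contains inf/NaN")
                elif t.abs().max().item() > 0.5 * FP16_MAX:  # arbitrary margin
                    print(f"{shard}: {name} is close to the fp16 range")

scan_weights("QLiz")  # placeholder directory name
```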

Aw damn ok. Welp I’ll be making more blends. I actually just made one last night.

https://huggingface.co/Noodlz/DolphinStar-12.5B

Basically a merge of Eric Hartford's Dolphin 2.8 (uncensored, based on Mistral 7B v0.2), adding Starling-LM 7B Beta, which is a finetune of the older v0.1.

Would ya wanna quantize this for imat / test it out?

Sure, it's in the queue. It's not as if I am stingy in what I quantize :)

Awesome! Thanks again =)
