request: iMat GGUF quants for QueenLiz-120B

#11
by Noodlz - opened

https://huggingface.co/Noodlz/QueenLiz-120B

Hi! I just made a new 120B merge and would love your help quantizing this into iMat GGUF formats for the community to try out.

Sure, congrats on your "first successful merge". It's at the top of my queue now, but it will take a few days.

(Hmm, QuartetAnemoi, that gets me interested.)

Thanks!! Excited to see people try it out and put it through its paces so I can learn, make more, and improve :)

Cheers - https://huggingface.co/mradermacher/QueenLiz-120B-i1-GGUF

It's halfway there; the remaining quants will come in over the next day or two.
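In case anyone wants to pull a single quant as the files land, here is a minimal sketch using huggingface_hub; the filename below is hypothetical, so check the repo's file list for the actual names (large quants may also be split into parts):

```python
# Minimal sketch: download one quant file from the repo.
# The filename is hypothetical; check the repo for the real files.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="mradermacher/QueenLiz-120B-i1-GGUF",
    filename="QueenLiz-120B.i1-Q4_K_M.gguf",  # hypothetical name
)
print(path)  # local path to the downloaded GGUF
```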

mradermacher changed discussion status to closed

Or maybe it will be slower, as it is currently sharing the server with Grok, so it runs at about half speed.

Oh wow~~~ this is amazing. Thanks!!! Will share on my thread on Reddit =)

Hope it comes in useful.

I gave it a try as well, and it looks quite good actually, although I feel it deviates from instructions more often than QuartetAnemoi.

I'll also quantize and try QLiz and see what that does.

Such a pity: it seems QLiz immediately overflows when trying to create an imatrix. This is likely a problem with the model weights. Such a shame; I was looking forward to trying it out, but it seems there will be no imatrix quants (only static ones).

Oh interesting. Is it because of the way I merged it? I did a linear merge and sliced it at roughly every 20 layers out of the 80 layers, but staggered (so layers 1-20 for one slice of model A, 10-30 for model B, 20-40 for model A, and so on; see the sketch below). If you have any insights on how I can improve, please do let me know =)
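For illustration, here is a minimal sketch of the staggered layout described above; the source names are placeholders, and the numbers (80 layers, 20-layer slices, 10-layer stride) are taken from the post rather than from any actual merge config:

```python
# Hypothetical sketch of the staggered slicing described above.
# "model_a" / "model_b" are placeholders, not the actual repos.

def staggered_slices(total_layers=80, slice_len=20, stride=10):
    """Yield (source_model, start, end) layer ranges, alternating sources."""
    sources = ["model_a", "model_b"]
    start, i = 0, 0
    while start + slice_len <= total_layers:
        yield sources[i % 2], start, start + slice_len
        start += stride
        i += 1

for src, lo, hi in staggered_slices():
    print(f"{src}: layers {lo}-{hi}")
# model_a: layers 0-20
# model_b: layers 10-30
# model_a: layers 20-40
# ... up to layers 60-80
```

Stacking overlapping slices like this yields roughly 140 layers from two 80-layer models, which is presumably how the result ends up around 120B.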

It's relatively common: basically, some values overflow somewhere among the many billions of entries in a tensor, and the error propagates. It happens to a lot of models. I am not an expert, but I think there is nothing that can be done "right": it sometimes happens, and you have no real way of preventing it. There are also half a dozen other problems that can occur (many of which are bugs in llama.cpp), so it's really just bad luck.
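For what it's worth, here is a rough first-pass sanity check one could run on a merge before attempting an imatrix; this is only a sketch, not mradermacher's actual pipeline. It assumes the merged weights are saved as .safetensors shards, the directory name is a placeholder, and the fp16 headroom threshold is an arbitrary heuristic. Since the overflow often occurs in fp16 activations during the imatrix run itself, clean weights here are no guarantee:

```python
# Rough first-pass check: scan merged safetensors shards for non-finite
# weights and for magnitudes close to the fp16 limit. Clean weights do
# not rule out activation overflow; this only catches the obvious cases.
import glob

import torch
from safetensors import safe_open

FP16_MAX = 65504.0  # largest finite fp16 value

def scan_weights(model_dir: str) -> None:
    for shard in sorted(glob.glob(f"{model_dir}/*.safetensors")):
        with safe_open(shard, framework="pt") as f:
            for name in f.keys():
                t = f.get_tensor(name).float()
                if not torch.isfinite(t).all():
                    print(f"{shard}: {name} contains inf/NaN")
                elif t.abs().max().item() > 0.5 * FP16_MAX:  # arbitrary margin
                    print(f"{shard}: {name} is close to the fp16 range")

scan_weights("QLiz")  # placeholder directory name
```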

Aw damn ok. Welp I’ll be making more blends. I actually just made one last night.

https://huggingface.co/Noodlz/DolphinStar-12.5B

Basically a merge of Eric Hartford's Dolphin 2.8 (uncensored, based on Mistral 7B v0.2), adding Starling-LM 7B Beta, which is a finetune of the older v0.1.

Would ya wanna quantize this for imat / test it out?

Sure, it's in the queue. It's not as if I am stingy in what I quantize :)

Awesome! Thanks again =)
