On GGUF quantization code and UNET extraction

#1
by city96 - opened

Thank god someone else got the quantization script to work as well lol.
Did you convert from the diffusers unet back to the reference format, or did you just merge the two reference ones? If it's the latter, just disregard this.

Hi, I merged the two original models in Comfy; I modified the native save-checkpoint node to save only the unet. I didn't convert it to another format, at least not intentionally. In your script I had to hardcode the model architecture to flux and comment out the key-length check to get it running.
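For anyone curious, the key filtering a save-only-the-unet node does boils down to something like this (a minimal sketch only; plain strings stand in for the real tensors, and the key names are just illustrative examples of a ComfyUI-style checkpoint layout):

```python
# Minimal sketch: keep only the UNet weights from a full checkpoint
# state dict. In a real checkpoint the values are tensors; plain
# strings stand in for them here, and the keys are illustrative.
full_sd = {
    "model.diffusion_model.input_blocks.0.weight": "unet tensor",
    "first_stage_model.decoder.conv_in.weight": "vae tensor",
    "cond_stage_model.transformer.wte.weight": "text encoder tensor",
}

# ComfyUI stores the UNet weights under the model.diffusion_model. prefix,
# so filtering on that prefix drops the VAE and text encoder.
unet_sd = {k: v for k, v in full_sd.items()
           if k.startswith("model.diffusion_model.")}

print(sorted(unet_sd))
```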
Thank you for the script btw, I couldn't have done it without it! I'm not so good at coding.
Here's the node I used to save the unet: https://github.com/Anibaaal/ComfyUI-UX-Nodes/blob/main/utils.py

Ah, it looks like the keys you ended up with are invalid. You should remove the model.diffusion_model. prefix and reupload the files with that change, as it's non-standard and not consistent with what the reference uses. The key checks were there for a reason, please don't disable them lol.


This should be usable to modify your original checkpoint:

from safetensors.torch import load_file, save_file

sd = load_file("your_unet.safetensors")
# strip the non-standard prefix from every key
sd = {k.replace("model.diffusion_model.", ""): v for k, v in sd.items()}
save_file(sd, "your_fixed_unet.safetensors")

Thanks for your help and guidance, I will try that and update the files asap.
It seems that when I made Q4, Q5 and Q8 I had an old version of your script without the key checks. I only updated it when I tried Q4_1 and Q5_1 (they were super slow in Comfy and I wanted to see if it would improve), and that's when I disabled the checks :) sorry.

No worries! This stuff is still evolving pretty fast so just trying to make sure we end up with at least somewhat similar quants across the board lol.

It went well. The corrected files are uploaded, thanks again!

Anibaaal changed discussion status to closed

Hey, just letting you know I've updated the instructions on how to make K quants too, if that's something you're interested in. It's a bit more involved since it requires compiling llama.cpp, but overall it's cleaner than the hacky python script from before lol.
Also added the key replace/removal change into the main convert script to make sure the format is always correct (together with instructions on how to convert a diffusers model; there's a base ComfyUI node that can do this now).

Hey! Ohh nice!! I will try your new script. I attempted to make the K ones the other day but I ended up giving up. Thanks for letting me know!

Good luck and report back if you get stuck on anything!

You may want to rename the current ones if you add the K ones, to avoid confusion. You can use the full llama.cpp names (Q4_0, Q4_K_S, etc); it lists those when you run llama-quantize without any arguments.
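If there are a lot of files, a tiny helper along these lines could work out the new names (a sketch only: the filename pattern and the suffix map are assumptions, and the full type names are the ones llama-quantize prints):

```python
from pathlib import Path

# Assumed mapping from the short suffixes used so far to the full
# llama.cpp quant type names.
FULL_NAMES = {"q4": "Q4_0", "q4_1": "Q4_1", "q5": "Q5_0",
              "q5_1": "Q5_1", "q8": "Q8_0"}

def renamed(filename: str) -> str:
    """Expand a short quant suffix to the full llama.cpp name,
    e.g. 'flux1-dev-q4.gguf' -> 'flux1-dev-Q4_0.gguf'.
    Filenames without a known suffix are returned unchanged."""
    path = Path(filename)
    stem, _, suffix = path.stem.rpartition("-")
    full = FULL_NAMES.get(suffix)
    return f"{stem}-{full}{path.suffix}" if full else filename

print(renamed("flux1-dev-q4.gguf"))  # flux1-dev-Q4_0.gguf
```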
