On GGUF quantization code and UNET extraction

#1
by city96 - opened

Thank god someone else got the quantization script to work as well lol.
Did you convert from the diffusers unet back to the reference format, or did you just merge the two reference ones? If it's the latter, just disregard this.

Hi, I merged the two original models in Comfy; I modified the native save-checkpoint node to save only the unet. I didn't convert it to another format, at least not intentionally. In your script I had to hardcode the model architecture to flux and comment out the key-length check to get it running.
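For anyone curious, the key filtering a save-only-the-unet node does boils down to something like this (a minimal sketch only; plain strings stand in for the real tensors, and the key names are just illustrative examples of a ComfyUI-style checkpoint layout):

```python
# Minimal sketch: keep only the UNet weights from a full checkpoint
# state dict. In a real checkpoint the values are tensors; plain
# strings stand in for them here, and the keys are illustrative.
full_sd = {
    "model.diffusion_model.input_blocks.0.weight": "unet tensor",
    "first_stage_model.decoder.conv_in.weight": "vae tensor",
    "cond_stage_model.transformer.wte.weight": "text encoder tensor",
}

# ComfyUI stores the UNet weights under the model.diffusion_model. prefix,
# so filtering on that prefix drops the VAE and text encoder.
unet_sd = {k: v for k, v in full_sd.items()
           if k.startswith("model.diffusion_model.")}

print(sorted(unet_sd))
```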
Thank you for the script btw, I couldn't have done it without it! I'm not so good at coding.
Here's the node I used to save the unet: https://github.com/Anibaaal/ComfyUI-UX-Nodes/blob/main/utils.py

Ah, it looks like the keys you ended up with are invalid. You should remove the model.diffusion_model. prefix and reupload the files with that change, as it's non-standard and not consistent with what the reference uses. The key checks were there for a reason, please don't disable them lol.


This should be usable to modify your original checkpoint:

from safetensors.torch import load_file, save_file

sd = load_file("your_unet.safetensors")
# strip the non-standard prefix from every key
sd = {k.replace("model.diffusion_model.", ""): v for k, v in sd.items()}
save_file(sd, "your_fixed_unet.safetensors")

Thanks for your help and guidance, I will try that and update the files asap.
It seems that when I made Q4, Q5 and Q8 I had an old version of your script without the key checks. I only updated it when I tried Q4_1 and Q5_1 (they were super slow in Comfy and I wanted to see if it would improve), and that's when I disabled the checks :) sorry.

No worries! This stuff is still evolving pretty fast so just trying to make sure we end up with at least somewhat similar quants across the board lol.

It went well. The corrected files are uploaded, thanks again!

Anibaaal changed discussion status to closed

Hey, just letting you know I've updated the instructions on how to make K quants too, if that's something you're interested in. It's a bit more involved since it requires compiling llama.cpp, but overall it's cleaner than the hacky python script from before lol.
Also added the key replace/removal change into the main convert script to make sure the format is always correct (together with instructions on how to convert a diffusers model; there's a base ComfyUI node that can do this now).

Hey! Ohh nice!! I will try your new script. I attempted to make the K ones the other day but I ended up giving up. Thanks for letting me know!

Good luck and report back if you get stuck on anything!

You may want to rename the current ones if you add the K ones, to avoid confusion. You can use the full llama.cpp names (Q4_0, Q4_K_S, etc); it lists those when you run llama-quantize without any arguments.
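If there are a lot of files, a tiny helper along these lines could work out the new names (a sketch only: the filename pattern and the suffix map are assumptions, and the full type names are the ones llama-quantize prints):

```python
from pathlib import Path

# Assumed mapping from the short suffixes used so far to the full
# llama.cpp quant type names.
FULL_NAMES = {"q4": "Q4_0", "q4_1": "Q4_1", "q5": "Q5_0",
              "q5_1": "Q5_1", "q8": "Q8_0"}

def renamed(filename: str) -> str:
    """Expand a short quant suffix to the full llama.cpp name,
    e.g. 'flux1-dev-q4.gguf' -> 'flux1-dev-Q4_0.gguf'.
    Filenames without a known suffix are returned unchanged."""
    path = Path(filename)
    stem, _, suffix = path.stem.rpartition("-")
    full = FULL_NAMES.get(suffix)
    return f"{stem}-{full}{path.suffix}" if full else filename

print(renamed("flux1-dev-q4.gguf"))  # flux1-dev-Q4_0.gguf
```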
