Any chance of Qwen Image Edit quants?
There are some from QuantStack, but the quality seems quite low at <= Q3. Maybe they're not using the "new dynamic logic where the first/last layer is kept in high precision" technique.
Edit: I should mention that your set of quants is great and gives really good results even down at Q3.
Closing because I see the metadata has the same structure in the QuantStack quants, so that's probably not the issue.
Yeah, I think those are also created with the latest code, which has the dynamic logic enabled by default for Qwen Image.
Might try and see if it can be improved for the edit model, though (IIRC I was seeing something similar with the Wan I2V model before, where it was more sensitive to quantization than the T2V one, so who knows).
For image-edit 2509, I did a first test. I still have to redo it with all models, but I get a clean image on the Q2_K quant:
// first/last block high precision test
if (arch == LLM_ARCH_QWEN_IMAGE) {
    if (
        (name.find("transformer_blocks.0.")  != std::string::npos) ||
        (name.find("transformer_blocks.59.") != std::string::npos)   // this should be dynamic (see the sketch below)
    ) {
        if (ftype == LLAMA_FTYPE_MOSTLY_Q2_K   ||
            ftype == LLAMA_FTYPE_MOSTLY_Q3_K_S ||
            ftype == LLAMA_FTYPE_MOSTLY_Q3_K_M ||
            ftype == LLAMA_FTYPE_MOSTLY_Q3_K_L ||
            ftype == LLAMA_FTYPE_MOSTLY_Q4_0   ||
            ftype == LLAMA_FTYPE_MOSTLY_Q4_1   ||
            ftype == LLAMA_FTYPE_MOSTLY_Q4_K_S ||
            ftype == LLAMA_FTYPE_MOSTLY_Q4_K_M) {
            new_type = GGML_TYPE_Q5_K; // minimum Q5_K for low quants
        } else if (ftype == LLAMA_FTYPE_MOSTLY_Q5_K_M) {
            new_type = GGML_TYPE_Q6_K;
        }
    }
}
did the trick, further testing...
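For reference, here is a minimal sketch of how the hardcoded 59 could be made dynamic, as the comment in the snippet suggests. The helper names (block_index, last_block_index) are hypothetical and assume a pre-pass over the tensor names before quantization starts:

#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: parse the block index out of a tensor name like
// "transformer_blocks.59.attn.to_q.weight"; returns -1 if the name does
// not belong to a transformer block.
static int block_index(const std::string & name) {
    const std::string prefix = "transformer_blocks.";
    const size_t pos = name.find(prefix);
    if (pos == std::string::npos) {
        return -1;
    }
    const size_t start = pos + prefix.size();
    const size_t end   = name.find('.', start);
    if (end == std::string::npos || end == start) {
        return -1;
    }
    const std::string idx = name.substr(start, end - start);
    if (idx.find_first_not_of("0123456789") != std::string::npos) {
        return -1; // not a numeric block index
    }
    return std::stoi(idx);
}

// One pass over all tensor names yields the highest block index, so the
// "last block" check no longer hardcodes 59.
static int last_block_index(const std::vector<std::string> & names) {
    int last = -1;
    for (const auto & n : names) {
        last = std::max(last, block_index(n));
    }
    return last;
}

With that, the two hardcoded string matches above reduce to checking block_index(name) == 0 || block_index(name) == last, which would also cover models with a different block count.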
So simply a bit higher precision for the first/last block? That would be rather easy, nice!
As I said, still testing, but compared to what I normally get it's like C64 vs. VGA. And it's my own GGUF from https://huggingface.co/Phr00t/Qwen-Image-Edit-Rapid-AIO, so it includes some LoRAs for 4- and 8-step generation.
Further pushing these 28 tensors to Q8_0 improves overall quality; I tried it since I was getting bad results with 8-step.
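For context, a minimal sketch of that Q8_0 variant, assuming the same quantization hook as the snippet above; it simply forces the matched tensors to Q8_0 instead of applying the Q5_K/Q6_K minimums:

// Q8_0 variant of the test above: keep every tensor in the first and
// last transformer block at Q8_0 regardless of the target ftype.
if (arch == LLM_ARCH_QWEN_IMAGE) {
    if (name.find("transformer_blocks.0.")  != std::string::npos ||
        name.find("transformer_blocks.59.") != std::string::npos) {
        new_type = GGML_TYPE_Q8_0; // the 28 tensors mentioned above
    }
}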