For more information (including how to compress models yourself), check out https://huggingface.co/DFloat11 and https://github.com/LeanModels/DFloat11
Feel free to request other models for compression as well, although compressing models that do not use the Flux architecture might be tricky for me.
This compressed model was made from rockerBOO/flux.1-dev-SRPO's BF16 quantization. Thanks to rockerBOO, without whose BF16 checkpoint I would not have been able to work with this model directly. (My PC only has 48 GB of system RAM, which is too little to handle a 12B-parameter model in FP32 precision.)
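For context, going from FP32 to BF16 is conceptually just a dtype cast of the weights, which halves their memory footprint before DFloat11 compression is applied. A minimal sketch of such a cast is shown below; the file names are placeholders, not the actual SRPO checkpoint layout, and this is not necessarily how rockerBOO produced the BF16 checkpoint:

```python
import torch
from safetensors.torch import load_file, save_file

# Hypothetical file names for illustration only
state_dict = load_file("diffusion_pytorch_model-fp32.safetensors")

# Cast only floating-point tensors to bfloat16; leave any integer tensors untouched
bf16_state_dict = {
    k: (v.to(torch.bfloat16) if v.is_floating_point() else v)
    for k, v in state_dict.items()
}

save_file(bf16_state_dict, "diffusion_pytorch_model-bf16.safetensors")
```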
How to Use
diffusers
Install the DFloat11 pip package (installs the CUDA kernel automatically; requires a CUDA-compatible GPU and PyTorch installed):
```bash
pip install dfloat11[cuda12]
# or if you have CUDA version 11:
# pip install dfloat11[cuda11]
```

To use the DFloat11 model, run the following example code in Python:
```python
import torch
from diffusers import FluxPipeline, FluxTransformer2DModel
from transformers.modeling_utils import no_init_weights
from dfloat11 import DFloat11Model

# Build an empty BF16 transformer from the FLUX.1-dev config;
# the weights are filled in by DFloat11 below
with no_init_weights():
    transformer = FluxTransformer2DModel.from_config(
        FluxTransformer2DModel.load_config(
            "black-forest-labs/FLUX.1-dev", subfolder="transformer"
        ),
        torch_dtype=torch.bfloat16,
    ).to(torch.bfloat16)

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)

# Load the DFloat11-compressed weights into the transformer
DFloat11Model.from_pretrained(
    "mingyi456/SRPO-DF11",
    device="cpu",
    bfloat16_model=pipe.transformer,
)

pipe.enable_model_cpu_offload()

prompt = "A futuristic cityscape at sunset, with flying cars, neon lights, and reflective water canals"

image = pipe(
    prompt,
    guidance_scale=3.5,
    num_inference_steps=30,
    max_sequence_length=256,
    generator=torch.Generator("cpu").manual_seed(0),
).images[0]
image.save("SRPO.png")
```
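Once the pipeline is set up, it can be reused for further generations without reloading or re-decompressing the model. For example, a small seed sweep using the same parameters as above (the output file names are just an illustration):

```python
# Reuse the already-loaded pipeline for several seeds
for seed in (0, 1, 2):
    image = pipe(
        prompt,
        guidance_scale=3.5,
        num_inference_steps=30,
        max_sequence_length=256,
        generator=torch.Generator("cpu").manual_seed(seed),
    ).images[0]
    image.save(f"SRPO_seed{seed}.png")
```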
ComfyUI
Refer to this model page instead, and follow the instructions there.
Model tree for mingyi456/SRPO-DF11
Base model: tencent/SRPO