For more information (including how to compress models yourself), check out https://huggingface.co/DFloat11 and https://github.com/LeanModels/DFloat11
Feel free to request other models for compression as well, although compressing models that do not use the Flux architecture might be tricky for me.
This compressed model was made from rockerBOO/flux.1-dev-SRPO's BF16 quantization. Thanks to rockerBOO, without whose BF16 checkpoint I would not have been able to work with this model directly. (My PC only has 48 GB of system RAM, which is too little to handle a 12B-parameter model in FP32 precision.)
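For context, going from FP32 to BF16 is conceptually just a dtype cast of the weights, which halves their memory footprint before DFloat11 compression is applied. A minimal sketch of such a cast is shown below; the file names are placeholders, not the actual SRPO checkpoint layout, and this is not necessarily how rockerBOO produced the BF16 checkpoint:

```python
import torch
from safetensors.torch import load_file, save_file

# Hypothetical file names for illustration only
state_dict = load_file("diffusion_pytorch_model-fp32.safetensors")

# Cast only floating-point tensors to bfloat16; leave any integer tensors untouched
bf16_state_dict = {
    k: (v.to(torch.bfloat16) if v.is_floating_point() else v)
    for k, v in state_dict.items()
}

save_file(bf16_state_dict, "diffusion_pytorch_model-bf16.safetensors")
```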
How to Use
diffusers
Install the DFloat11 pip package (installs the CUDA kernel automatically; requires a CUDA-compatible GPU and PyTorch installed):
```bash
pip install dfloat11[cuda12]
# or if you have CUDA version 11:
# pip install dfloat11[cuda11]
```

To use the DFloat11 model, run the following example code in Python:
```python
import torch
from diffusers import FluxPipeline, FluxTransformer2DModel
from transformers.modeling_utils import no_init_weights
from dfloat11 import DFloat11Model

# Build an empty BF16 transformer from the FLUX.1-dev config;
# the weights are filled in by DFloat11 below
with no_init_weights():
    transformer = FluxTransformer2DModel.from_config(
        FluxTransformer2DModel.load_config(
            "black-forest-labs/FLUX.1-dev", subfolder="transformer"
        ),
        torch_dtype=torch.bfloat16,
    ).to(torch.bfloat16)

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)

# Load the DFloat11-compressed weights into the transformer
DFloat11Model.from_pretrained(
    "mingyi456/SRPO-DF11",
    device="cpu",
    bfloat16_model=pipe.transformer,
)

pipe.enable_model_cpu_offload()

prompt = "A futuristic cityscape at sunset, with flying cars, neon lights, and reflective water canals"

image = pipe(
    prompt,
    guidance_scale=3.5,
    num_inference_steps=30,
    max_sequence_length=256,
    generator=torch.Generator("cpu").manual_seed(0),
).images[0]
image.save("SRPO.png")
```
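Once the pipeline is set up, it can be reused for further generations without reloading or re-decompressing the model. For example, a small seed sweep using the same parameters as above (the output file names are just an illustration):

```python
# Reuse the already-loaded pipeline for several seeds
for seed in (0, 1, 2):
    image = pipe(
        prompt,
        guidance_scale=3.5,
        num_inference_steps=30,
        max_sequence_length=256,
        generator=torch.Generator("cpu").manual_seed(seed),
    ).images[0]
    image.save(f"SRPO_seed{seed}.png")
```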
ComfyUI
Refer to this model page instead, and follow the instructions there.
Model tree for mingyi456/SRPO-DF11
Base model: tencent/SRPO