---
license: openrail++
tags:
- text-to-image
- stable-diffusion
library_name: diffusers
---

# SDXL-Lightning

![Intro Image](images/intro.jpg)

SDXL-Lightning is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps. For more information, please refer to our paper: [SDXL-Lightning: Progressive Adversarial Diffusion Distillation](https://huggingface.co/ByteDance/SDXL-Lightning/resolve/main/sdxl_lightning_report.pdf). The models are released for research purposes only.

Our models are distilled from [stabilityai/stable-diffusion-xl-base-1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0). This repository contains checkpoints for 1-step, 2-step, 4-step, and 8-step distilled models. The 2-step, 4-step, and 8-step models generate high-quality images. The 1-step model is more experimental.

We provide both full UNet and LoRA checkpoints. The full UNet models have the best quality, while the LoRA models can be applied to other base models.


## Diffusers Usage

Always use the checkpoint that matches the number of inference steps you run.
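
To make the step/checkpoint pairing harder to get wrong, the file name can be derived from the step count. The sketch below uses a hypothetical helper, `lightning_checkpoint`, which is not part of any library. The 4-step and 1-step file names come from the examples in this section; the 2-step and 8-step names are an assumption that they follow the same pattern, so please check them against the repository file list.

```python
def lightning_checkpoint(steps: int, kind: str = "unet") -> str:
    """Return the checkpoint file name for a step count (hypothetical helper).

    Assumes the 2-step and 8-step files follow the same naming pattern as
    the 4-step file shown in this README.
    """
    if steps not in (1, 2, 4, 8):
        raise ValueError(f"no checkpoint for {steps} steps; expected 1, 2, 4, or 8")
    if kind not in ("unet", "lora"):
        raise ValueError(f"unknown checkpoint kind: {kind!r}")
    if steps == 1:
        # This README only describes a 1-step UNet, which predicts x0 ("sample").
        if kind != "unet":
            raise ValueError("this README only describes a 1-step UNet checkpoint")
        return "sdxl_lightning_1step_unet_x0.pth"
    return f"sdxl_lightning_{steps}step_{kind}.pth"


steps = 4
ckpt = lightning_checkpoint(steps, "unet")
print(ckpt)  # sdxl_lightning_4step_unet.pth
```

Passing the same `steps` value to both the helper and `num_inference_steps` keeps the two settings in sync.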

### 2-Step, 4-Step, 8-Step UNet

```python
import torch
from diffusers import StableDiffusionXLPipeline, UNet2DConditionModel, EulerDiscreteScheduler
from huggingface_hub import hf_hub_download

base = "stabilityai/stable-diffusion-xl-base-1.0"
repo = "ByteDance/SDXL-Lightning"
ckpt = "sdxl_lightning_4step_unet.pth" # Use the correct ckpt for your step setting!

# Load model.
unet = UNet2DConditionModel.from_config(base, subfolder="unet").to("cuda", torch.float16)
unet.load_state_dict(torch.load(hf_hub_download(repo, ckpt), map_location="cuda"))
pipe = StableDiffusionXLPipeline.from_pretrained(base, unet=unet, torch_dtype=torch.float16, variant="fp16").to("cuda")

# Ensure sampler uses "trailing" timesteps.
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing")

# Use the same number of inference steps as the loaded checkpoint and set guidance_scale (CFG) to 0.
pipe("A girl smiling", num_inference_steps=4, guidance_scale=0).images[0].save("output.png")
```
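
To illustrate why the "trailing" setting matters for few-step sampling, the sketch below reproduces the timestep arithmetic in plain Python. It is a simplified approximation of how we understand the "trailing" and "leading" spacing modes in Diffusers, not the library's code. The function names are hypothetical, and the defaults of 1000 training timesteps and `steps_offset=1` are assumptions about the SDXL base scheduler config.

```python
def trailing_timesteps(num_inference_steps: int, num_train_timesteps: int = 1000) -> list[int]:
    """Evenly spaced timesteps that start at the last (noisiest) training timestep."""
    step = num_train_timesteps / num_inference_steps
    return [round(num_train_timesteps - i * step) - 1 for i in range(num_inference_steps)]


def leading_timesteps(num_inference_steps: int, num_train_timesteps: int = 1000,
                      steps_offset: int = 1) -> list[int]:
    """Evenly spaced timesteps that start from timestep 0 (plus an offset)."""
    ratio = num_train_timesteps // num_inference_steps
    return [i * ratio + steps_offset for i in reversed(range(num_inference_steps))]


print(trailing_timesteps(4))  # [999, 749, 499, 249]
print(leading_timesteps(4))   # [751, 501, 251, 1]
```

Under these assumptions, "trailing" spacing begins sampling at timestep 999, while "leading" spacing begins at 751 for 4 steps and the gap widens as the step count shrinks.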

### 2-Step, 4-Step, 8-Step LoRA

```python
import torch
from diffusers import StableDiffusionXLPipeline, EulerDiscreteScheduler
from huggingface_hub import hf_hub_download

base = "stabilityai/stable-diffusion-xl-base-1.0"
repo = "ByteDance/SDXL-Lightning"
ckpt = "sdxl_lightning_4step_lora.pth" # Use the correct ckpt for your step setting!

# Load model.
pipe = StableDiffusionXLPipeline.from_pretrained(base, torch_dtype=torch.float16, variant="fp16").to("cuda")
pipe.load_lora_weights(hf_hub_download(repo, ckpt))
pipe.fuse_lora()

# Ensure sampler uses "trailing" timesteps.
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing")

# Use the same number of inference steps as the loaded checkpoint and set guidance_scale (CFG) to 0.
pipe("A girl smiling", num_inference_steps=4, guidance_scale=0).images[0].save("output.png")
```

### 1-Step UNet

The 1-step model uses "sample" prediction instead of "epsilon" prediction! The scheduler needs to be configured correctly.

```python
import torch
from diffusers import StableDiffusionXLPipeline, UNet2DConditionModel, EulerDiscreteScheduler
from huggingface_hub import hf_hub_download

base = "stabilityai/stable-diffusion-xl-base-1.0"
repo = "ByteDance/SDXL-Lightning"
ckpt = "sdxl_lightning_1step_unet_x0.pth" # Use the correct ckpt for your step setting!

# Load model.
unet = UNet2DConditionModel.from_config(base, subfolder="unet").to("cuda", torch.float16)
unet.load_state_dict(torch.load(hf_hub_download(repo, ckpt), map_location="cuda"))
pipe = StableDiffusionXLPipeline.from_pretrained(base, unet=unet, torch_dtype=torch.float16, variant="fp16").to("cuda")

# Ensure sampler uses "trailing" timesteps and "sample" prediction type.
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing", prediction_type="sample")

# Use the same number of inference steps as the loaded checkpoint and set guidance_scale (CFG) to 0.
pipe("A girl smiling", num_inference_steps=1, guidance_scale=0).images[0].save("output.png")
```
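
The difference between the two prediction types can be shown with a small numeric sketch. It assumes the noising relation `x_t = x0 + sigma * eps`, which is our understanding of the parameterization the Euler scheduler works in. The function `predicted_clean_sample` is hypothetical and uses plain Python lists in place of tensors.

```python
def predicted_clean_sample(model_output, noisy_sample, sigma, prediction_type):
    """Recover the clean-sample estimate x0 from the model output (illustrative only)."""
    if prediction_type == "epsilon":
        # The model predicted the noise, so subtract it from the noisy sample.
        return [x - sigma * e for x, e in zip(noisy_sample, model_output)]
    if prediction_type == "sample":
        # The model predicted x0 directly.
        return list(model_output)
    raise ValueError(f"unsupported prediction_type: {prediction_type!r}")


x0 = [0.5, -1.0]    # clean sample
eps = [1.0, 2.0]    # noise
sigma = 2.0
x_t = [a + sigma * b for a, b in zip(x0, eps)]  # noisy sample: [2.5, 3.0]

# Both prediction types recover x0 when the scheduler setting matches the model.
print(predicted_clean_sample(eps, x_t, sigma, "epsilon"))  # [0.5, -1.0]
print(predicted_clean_sample(x0, x_t, sigma, "sample"))    # [0.5, -1.0]

# An x0-predicting model read as "epsilon" gives a wrong estimate.
print(predicted_clean_sample(x0, x_t, sigma, "epsilon"))   # [1.5, 5.0]
```

The last call shows why the scheduler must be configured with `prediction_type="sample"` for the 1-step checkpoint: interpreting an x0 output as noise yields an estimate far from the clean sample.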


## ComfyUI Usage

Always use the checkpoint that matches the number of inference steps you run, and use the Euler sampler with the `sgm_uniform` scheduler.

### 2-Step, 4-Step, 8-Step UNet

1. Download the UNet checkpoint to `/ComfyUI/models/unet`.
2. Download our [ComfyUI UNet workflow](comfyui/sdxl_lightning_unet.json).

![SDXL-Lightning ComfyUI UNet Workflow](images/comfyui_unet.png)

### 2-Step, 4-Step, 8-Step LoRA

1. Download the LoRA checkpoint to `/ComfyUI/models/loras`.
2. Download our [ComfyUI LoRA workflow](comfyui/sdxl_lightning_lora.json).

![SDXL-Lightning ComfyUI LoRA Workflow](images/comfyui_lora.png)

### 1-Step UNet

ComfyUI does not yet support changing the model formulation to x0-prediction, so the 1-step model cannot be used in ComfyUI for now. We hope ComfyUI will add support soon.