---
license: mit
library_name: diffusers
tags:
- text-to-image
- stable-diffusion
- diffusion distillation
---

![image/png](https://cdn-uploads.huggingface.co/production/uploads/63943c882b9483beb473ec25/f8ws6nGK2ZkPEiizha2t9.png)

> [**Distribution Backtracking Builds A Faster Convergence Trajectory for One-step Diffusion Distillation**](https://github.com/SYZhang0805/DisBack)
>
> *[Shengyuan Zhang](https://github.com/SYZhang0805)<sup>1</sup>, [Ling Yang](https://github.com/YangLing0818)<sup>2</sup>, [Zejian Li*](https://zejianli.github.io/)<sup>1</sup>, An Zhao<sup>1</sup>, Chenye Meng<sup>1</sup>, Changyuan Yang<sup>3</sup>, Guang Yang<sup>3</sup>, Zhiyuan Yang<sup>3</sup>, [Lingyun Sun](https://person.zju.edu.cn/sly)<sup>1</sup>*
>
> <sup>1</sup>Zhejiang University <sup>2</sup>Peking University <sup>3</sup>Alibaba Group

## Contact

Feel free to contact us if you have any questions about the paper!

Shengyuan Zhang [zhangshengyuan@zju.edu.cn](mailto:zhangshengyuan@zju.edu.cn)

## Usage

For one-step text-to-image generation, DisBack can be used with the standard diffusers pipeline:

```python
import torch
from diffusers import DiffusionPipeline, UNet2DConditionModel, LCMScheduler
from huggingface_hub import hf_hub_download

base_model_id = "stabilityai/stable-diffusion-xl-base-1.0"
repo_name = "SYZhang0805/DisBack"
ckpt_name = "SDXL_DisBack.bin"

# Build the SDXL UNet architecture from its config, then load the DisBack weights into it.
unet = UNet2DConditionModel.from_config(base_model_id, subfolder="unet").to("cuda", torch.float16)
unet.load_state_dict(torch.load(hf_hub_download(repo_name, ckpt_name), map_location="cuda"))

# Reuse the remaining SDXL components (VAE, text encoders) together with the DisBack UNet.
pipe = DiffusionPipeline.from_pretrained(
    base_model_id, unet=unet, torch_dtype=torch.float16, use_safetensors=True, variant="fp16"
).to("cuda")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)

# One-step sampling: a single denoising step at timestep 399, with guidance disabled.
prompt = "A photo of a dog."
image = pipe(
    prompt=prompt, num_inference_steps=1, guidance_scale=0, timesteps=[399], height=1024, width=1024
).images[0]
image.save("output.png", "PNG")
```
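
To generate several images reproducibly, the one-step settings from the snippet above can be collected in one place and reused for each prompt. The following is only a minimal sketch: `ONE_STEP_SETTINGS`, `build_call_kwargs`, and `generate_batch` are hypothetical names introduced here for illustration and are not part of DisBack or diffusers. It assumes `pipe` was constructed as shown above and relies on the pipeline call accepting a `generator` argument for seeding.

```python
import copy
from typing import Any, Dict, List

# One-step sampling settings copied from the usage snippet above.
ONE_STEP_SETTINGS: Dict[str, Any] = {
    "num_inference_steps": 1,
    "guidance_scale": 0,
    "timesteps": [399],
    "height": 1024,
    "width": 1024,
}


def build_call_kwargs(prompt: str, **overrides: Any) -> Dict[str, Any]:
    """Return keyword arguments for one pipeline call (hypothetical helper)."""
    kwargs = copy.deepcopy(ONE_STEP_SETTINGS)  # never mutate the shared defaults
    kwargs.update(overrides)
    kwargs["prompt"] = prompt
    return kwargs


def generate_batch(pipe, prompts: List[str], seed: int = 0, device: str = "cuda"):
    """Generate one image per prompt; image i is always sampled with seed + i."""
    import torch  # imported lazily so build_call_kwargs works without torch installed

    images = []
    for offset, prompt in enumerate(prompts):
        # A fresh generator per prompt keeps each result independent of batch order.
        generator = torch.Generator(device=device).manual_seed(seed + offset)
        images.append(pipe(**build_call_kwargs(prompt), generator=generator).images[0])
    return images
```

With the pipeline from the usage snippet, `generate_batch(pipe, ["A photo of a dog.", "A photo of a cat."], seed=42)` would return two PIL images that can be saved with `image.save(...)` as before.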

For more details, please refer to our [GitHub repository](https://github.com/SYZhang0805/DisBack).

## License

DisBack is released under the [MIT license](https://choosealicense.com/licenses/mit/).

## Citation

If you find our paper useful or relevant to your research, please cite it:

```bib
@article{zhang2024distributionbacktrackingbuildsfaster,
  title={Distribution Backtracking Builds A Faster Convergence Trajectory for One-step Diffusion Distillation},
  author={Shengyuan Zhang and Ling Yang and Zejian Li and An Zhao and Chenye Meng and Changyuan Yang and Guang Yang and Zhiyuan Yang and Lingyun Sun},
  journal={arXiv preprint arXiv:2408.15991},
  year={2024}
}
```

Paper: https://huggingface.co/papers/2408.15991

## Credits

DisBack builds heavily on the following amazing open-source projects:

[DMD2](https://tianweiy.github.io/dmd2/): Improved Distribution Matching Distillation for Fast Image Synthesis

[Diff-Instruct](https://github.com/pkulwj1994/diff_instruct/tree/main): Diff-Instruct: A Universal Approach for Transferring Knowledge From Pre-trained Diffusion Models

[ScoreGAN](https://github.com/White-Link/gpm): Unifying GANs and Score-Based Diffusion as Generative Particle Models

Thanks to the maintainers of these projects for their contributions to this work!