|
# Latent Consistency Distillation Example: |
|
|
|
[Latent Consistency Models (LCMs)](https://arxiv.org/abs/2310.04378) are a method for distilling a latent diffusion model to enable swift inference with minimal steps. This example demonstrates how to use latent consistency distillation to distill stable-diffusion-v1-5 for inference with only a few timesteps.
|
|
|
## Full model distillation |
|
|
|
### Running locally with PyTorch |
|
|
|
#### Installing the dependencies |
|
|
|
Before running the scripts, make sure to install the library's training dependencies: |
|
|
|
**Important** |
|
|
|
To make sure you can successfully run the latest versions of the example scripts, we highly recommend **installing from source** and keeping the installation up to date, since we update the example scripts frequently and install some example-specific requirements. To do this, execute the following steps in a new virtual environment:
|
```bash
git clone https://github.com/huggingface/diffusers
cd diffusers
pip install -e .
```
|
|
|
Then cd into the example folder and run:
|
```bash
pip install -r requirements.txt
```
|
|
|
And initialize an [🤗 Accelerate](https://github.com/huggingface/accelerate/) environment with: |
|
|
|
```bash
accelerate config
```
|
|
|
Or, for a default Accelerate configuration without answering questions about your environment:
|
|
|
```bash
accelerate config default
```
|
|
|
Or, if your environment doesn't support an interactive shell (e.g., a notebook):
|
|
|
```python
from accelerate.utils import write_basic_config
write_basic_config()
```
|
|
|
When running `accelerate config`, setting torch compile mode to True can yield dramatic speedups.
|
|
|
|
|
#### Example |
|
|
|
The following uses the [Conceptual Captions 12M (CC12M) dataset](https://github.com/google-research-datasets/conceptual-12m) as an example, for illustrative purposes only. For best results, consider a large, high-quality text-image dataset such as [LAION](https://laion.ai/blog/laion-400-open-dataset/). You may also need to search the hyperparameter space for the dataset you use.
|
|
|
```bash
export MODEL_NAME="runwayml/stable-diffusion-v1-5"
export OUTPUT_DIR="path/to/saved/model"

accelerate launch train_lcm_distill_sd_wds.py \
    --pretrained_teacher_model=$MODEL_NAME \
    --output_dir=$OUTPUT_DIR \
    --mixed_precision=fp16 \
    --resolution=512 \
    --learning_rate=1e-6 --loss_type="huber" --ema_decay=0.95 --adam_weight_decay=0.0 \
    --max_train_steps=1000 \
    --max_train_samples=4000000 \
    --dataloader_num_workers=8 \
    --train_shards_path_or_url="pipe:curl -L -s https://huggingface.co/datasets/laion/conceptual-captions-12m-webdataset/resolve/main/data/{00000..01099}.tar?download=true" \
    --validation_steps=200 \
    --checkpointing_steps=200 --checkpoints_total_limit=10 \
    --train_batch_size=12 \
    --gradient_checkpointing --enable_xformers_memory_efficient_attention \
    --gradient_accumulation_steps=1 \
    --use_8bit_adam \
    --resume_from_checkpoint=latest \
    --report_to=wandb \
    --seed=453645634 \
    --push_to_hub
```
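
Once distillation finishes, the student UNet can be plugged into the regular Stable Diffusion pipeline together with `LCMScheduler` for few-step inference. The snippet below is a minimal sketch, assuming the distilled UNet was saved under a `unet` subfolder of `$OUTPUT_DIR` (adjust the paths to match your actual output layout):

```python
import torch
from diffusers import DiffusionPipeline, LCMScheduler, UNet2DConditionModel

# Load the distilled student UNet (path and subfolder are assumptions; adjust to your output layout).
unet = UNet2DConditionModel.from_pretrained(
    "path/to/saved/model", subfolder="unet", torch_dtype=torch.float16
)

# Plug it into the original SD v1.5 pipeline and switch to the LCM scheduler.
pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", unet=unet, torch_dtype=torch.float16
)
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.to("cuda")

# LCMs are designed for very few inference steps.
image = pipe(
    "a photo of an astronaut riding a horse on mars",
    num_inference_steps=4,
    guidance_scale=1.0,
).images[0]
image.save("lcm_sample.png")
```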
|
|
|
## LCM-LoRA |
|
|
|
Instead of fine-tuning the full model, we can also just train a LoRA that can be injected into any fine-tuned Stable Diffusion model.
|
|
|
### Example |
|
|
|
The following uses the [Conceptual Captions 12M (CC12M) dataset](https://github.com/google-research-datasets/conceptual-12m) as an example. For best results, consider a large, high-quality text-image dataset such as [LAION](https://laion.ai/blog/laion-400-open-dataset/).
|
|
|
```bash
export MODEL_NAME="runwayml/stable-diffusion-v1-5"
export OUTPUT_DIR="path/to/saved/model"

accelerate launch train_lcm_distill_lora_sd_wds.py \
    --pretrained_teacher_model=$MODEL_NAME \
    --output_dir=$OUTPUT_DIR \
    --mixed_precision=fp16 \
    --resolution=512 \
    --lora_rank=64 \
    --learning_rate=1e-4 --loss_type="huber" --adam_weight_decay=0.0 \
    --max_train_steps=1000 \
    --max_train_samples=4000000 \
    --dataloader_num_workers=8 \
    --train_shards_path_or_url="pipe:curl -L -s https://huggingface.co/datasets/laion/conceptual-captions-12m-webdataset/resolve/main/data/{00000..01099}.tar?download=true" \
    --validation_steps=200 \
    --checkpointing_steps=200 --checkpoints_total_limit=10 \
    --train_batch_size=12 \
    --gradient_checkpointing --enable_xformers_memory_efficient_attention \
    --gradient_accumulation_steps=1 \
    --use_8bit_adam \
    --resume_from_checkpoint=latest \
    --report_to=wandb \
    --seed=453645634 \
    --push_to_hub
```
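
After training, the resulting LCM-LoRA can be loaded on top of the base model for few-step inference. The snippet below is a minimal sketch, assuming the LoRA weights were saved to `$OUTPUT_DIR` in a format that `load_lora_weights` understands:

```python
import torch
from diffusers import DiffusionPipeline, LCMScheduler

# Load the base SD v1.5 pipeline and switch to the LCM scheduler.
pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.to("cuda")

# Inject the distilled LCM-LoRA weights (path is an assumption; point it at your output directory).
pipe.load_lora_weights("path/to/saved/model")

image = pipe(
    "a photo of an astronaut riding a horse on mars",
    num_inference_steps=4,
    guidance_scale=1.0,
).images[0]
image.save("lcm_lora_sample.png")
```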