	# Dreambooth for the inpainting model

	This script was added by @thedarkzeno.

	Please note that this script is not actively maintained. If you run into a problem, you can still open an issue and tag @thedarkzeno or @patil-suraj.

	```bash
	export MODEL_NAME="runwayml/stable-diffusion-inpainting"
	export INSTANCE_DIR="path-to-instance-images"
	export OUTPUT_DIR="path-to-save-model"

	accelerate launch train_dreambooth_inpaint.py \
	--pretrained_model_name_or_path="$MODEL_NAME" \
	--instance_data_dir="$INSTANCE_DIR" \
	--output_dir="$OUTPUT_DIR" \
	--instance_prompt="a photo of sks dog" \
	--resolution=512 \
	--train_batch_size=1 \
	--gradient_accumulation_steps=1 \
	--learning_rate=5e-6 \
	--lr_scheduler="constant" \
	--lr_warmup_steps=0 \
	--max_train_steps=400
	```

	### Training with prior-preservation loss

	Prior preservation is used to avoid overfitting and language drift; refer to the DreamBooth paper to learn more about it. To use it, we first generate images from the model with a class prompt, then use those images during training alongside our own data.
	The paper recommends generating `num_epochs * num_samples` images for prior preservation; 200-300 works well in most cases.
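
As a rough illustration of how prior preservation enters the training objective, the sketch below adds a weighted loss on the generated class images to the loss on the instance images. It uses plain Python on toy lists rather than the tensor code in `train_dreambooth_inpaint.py`; the helper names `mse` and `dreambooth_loss` are made up for this example, and `prior_loss_weight` is assumed to play the role of the `--prior_loss_weight` flag.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences of floats."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def dreambooth_loss(instance_pred, instance_target,
                    class_pred, class_target,
                    prior_loss_weight=1.0):
    """Instance loss plus a weighted prior-preservation loss (illustrative only)."""
    instance_loss = mse(instance_pred, instance_target)
    prior_loss = mse(class_pred, class_target)
    return instance_loss + prior_loss_weight * prior_loss


loss = dreambooth_loss(
    instance_pred=[0.0, 2.0], instance_target=[0.0, 0.0],  # instance MSE = 2.0
    class_pred=[1.0, 1.0], class_target=[0.0, 0.0],        # prior MSE = 1.0
    prior_loss_weight=0.5,
)
print(loss)  # 2.5
```

With `prior_loss_weight=1.0`, as in the command below, both terms count equally; lowering the weight lets the model drift further from its original notion of the class.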

	```bash
	export MODEL_NAME="runwayml/stable-diffusion-inpainting"
	export INSTANCE_DIR="path-to-instance-images"
	export CLASS_DIR="path-to-class-images"
	export OUTPUT_DIR="path-to-save-model"

	accelerate launch train_dreambooth_inpaint.py \
	--pretrained_model_name_or_path="$MODEL_NAME" \
	--instance_data_dir="$INSTANCE_DIR" \
	--class_data_dir="$CLASS_DIR" \
	--output_dir="$OUTPUT_DIR" \
	--with_prior_preservation --prior_loss_weight=1.0 \
	--instance_prompt="a photo of sks dog" \
	--class_prompt="a photo of dog" \
	--resolution=512 \
	--train_batch_size=1 \
	--gradient_accumulation_steps=1 \
	--learning_rate=5e-6 \
	--lr_scheduler="constant" \
	--lr_warmup_steps=0 \
	--num_class_images=200 \
	--max_train_steps=800
	```
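
Before launching, it can help to know how many class images still have to be generated. The sketch below assumes the script counts the files already in the class directory and only generates the shortfall up to `--num_class_images`; that behaviour is an assumption here, and `missing_class_images` is a hypothetical helper, not part of the script.

```python
from pathlib import Path
import tempfile


def missing_class_images(class_dir, num_class_images):
    """Return how many class images are still needed in `class_dir`."""
    class_path = Path(class_dir)
    if not class_path.exists():
        # Nothing on disk yet, so every image has to be generated.
        return num_class_images
    existing = sum(1 for entry in class_path.iterdir() if entry.is_file())
    return max(num_class_images - existing, 0)


# Demo on a throwaway directory holding 3 placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(3):
        (Path(tmp) / f"class_{i}.jpg").touch()
    shortfall = missing_class_images(tmp, num_class_images=200)

print(shortfall)  # 197
```

If the shortfall is large, expect the first run to spend a while on generation before any training step happens.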


	### Training with gradient checkpointing and 8-bit optimizer

	With the help of gradient checkpointing and the 8-bit optimizer from bitsandbytes, it's possible to train DreamBooth on a 16GB GPU.

	To install `bitsandbytes`, please refer to this [readme](https://github.com/TimDettmers/bitsandbytes#requirements--installation).
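
A quick way to confirm the install before starting a long run is to check that the package is importable. The sketch below uses only the standard library; `has_module` is a hypothetical helper, and the printed advice assumes that `--use_8bit_adam` depends on `bitsandbytes` being present.

```python
import importlib.util


def has_module(name):
    """Return True if the top-level module `name` can be found for import."""
    return importlib.util.find_spec(name) is not None


if has_module("bitsandbytes"):
    print("bitsandbytes found: --use_8bit_adam should be usable")
else:
    print("bitsandbytes not found: install it or drop --use_8bit_adam")
```

This only checks that the package can be located; it does not verify that its CUDA binaries match your driver.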

	```bash
	export MODEL_NAME="runwayml/stable-diffusion-inpainting"
	export INSTANCE_DIR="path-to-instance-images"
	export CLASS_DIR="path-to-class-images"
	export OUTPUT_DIR="path-to-save-model"

	accelerate launch train_dreambooth_inpaint.py \
	--pretrained_model_name_or_path="$MODEL_NAME" \
	--instance_data_dir="$INSTANCE_DIR" \
	--class_data_dir="$CLASS_DIR" \
	--output_dir="$OUTPUT_DIR" \
	--with_prior_preservation --prior_loss_weight=1.0 \
	--instance_prompt="a photo of sks dog" \
	--class_prompt="a photo of dog" \
	--resolution=512 \
	--train_batch_size=1 \
	--gradient_accumulation_steps=2 --gradient_checkpointing \
	--use_8bit_adam \
	--learning_rate=5e-6 \
	--lr_scheduler="constant" \
	--lr_warmup_steps=0 \
	--num_class_images=200 \
	--max_train_steps=800
	```
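
The command above raises `--gradient_accumulation_steps` to 2 while keeping the per-device batch size at 1. The arithmetic below sketches what that means for the effective batch size, assuming each training step counted by `--max_train_steps` corresponds to one optimizer update; the function names are illustrative and `num_processes` stands for the number of GPUs used by `accelerate`.

```python
def effective_batch_size(train_batch_size, gradient_accumulation_steps, num_processes=1):
    """Samples contributing to a single optimizer update."""
    return train_batch_size * gradient_accumulation_steps * num_processes


def samples_processed(max_train_steps, train_batch_size,
                      gradient_accumulation_steps, num_processes=1):
    """Total samples drawn over training, if each step is one optimizer update."""
    return max_train_steps * effective_batch_size(
        train_batch_size, gradient_accumulation_steps, num_processes
    )


# Values from the command above, on a single GPU.
batch = effective_batch_size(train_batch_size=1, gradient_accumulation_steps=2)
total = samples_processed(max_train_steps=800, train_batch_size=1,
                          gradient_accumulation_steps=2)
print(batch, total)  # 2 1600
```

Accumulating gradients trades wall-clock time for memory: the GPU only ever holds one sample's activations, but each update averages over two.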

	### Fine-tune text encoder with the UNet

	The script also lets you fine-tune the `text_encoder` along with the `unet`. Experiments have shown that fine-tuning the `text_encoder` gives much better results, especially on faces.
	Pass the `--train_text_encoder` argument to the script to enable training the `text_encoder`.

	___Note: Training the text encoder requires more memory; with this option enabled, training won't fit on a 16GB GPU. It needs at least 24GB of VRAM.___
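
One way to see where the extra memory goes: when the text encoder is trained, its parameters join the UNet's in the optimizer, which then keeps gradients and optimizer state for both models. The sketch below shows that selection on placeholder strings instead of real tensors; `params_to_optimize` is a hypothetical helper and the exact wiring inside the script is assumed, not quoted.

```python
import itertools


def params_to_optimize(unet_params, text_encoder_params, train_text_encoder):
    """Return the iterable of parameters that would be handed to the optimizer."""
    if train_text_encoder:
        # Both models are updated, so both sets of parameters are chained.
        return itertools.chain(unet_params, text_encoder_params)
    return iter(unet_params)


unet_params = ["unet.w1", "unet.w2"]   # stand-ins for UNet tensors
text_encoder_params = ["text.w1"]      # stand-in for text encoder tensors

print(list(params_to_optimize(unet_params, text_encoder_params, True)))
# ['unet.w1', 'unet.w2', 'text.w1']
print(list(params_to_optimize(unet_params, text_encoder_params, False)))
# ['unet.w1', 'unet.w2']
```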

	```bash
	export MODEL_NAME="runwayml/stable-diffusion-inpainting"
	export INSTANCE_DIR="path-to-instance-images"
	export CLASS_DIR="path-to-class-images"
	export OUTPUT_DIR="path-to-save-model"

	accelerate launch train_dreambooth_inpaint.py \
	--pretrained_model_name_or_path="$MODEL_NAME" \
	--train_text_encoder \
	--instance_data_dir="$INSTANCE_DIR" \
	--class_data_dir="$CLASS_DIR" \
	--output_dir="$OUTPUT_DIR" \
	--with_prior_preservation --prior_loss_weight=1.0 \
	--instance_prompt="a photo of sks dog" \
	--class_prompt="a photo of dog" \
	--resolution=512 \
	--train_batch_size=1 \
	--use_8bit_adam \
	--gradient_checkpointing \
	--learning_rate=2e-6 \
	--lr_scheduler="constant" \
	--lr_warmup_steps=0 \
	--num_class_images=200 \
	--max_train_steps=800
	```