InteractDiffusion Diffuser Implementation

Project Page | Paper | WebUI | Demo | Video | Diffuser | Colab

How to Use

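from diffusers import DiffusionPipeline
import torch

# trust_remote_code=True lets diffusers run the custom pipeline code shipped in the model repo.
# variant="fp16" with torch_dtype=torch.float16 loads the half-precision weights.
pipeline = DiffusionPipeline.from_pretrained(
    "interactdiffusion/diffusers-v1-2",
    trust_remote_code=True,
    variant="fp16", torch_dtype=torch.float16
)
pipeline = pipeline.to("cuda")

# Each interaction is a (subject, action, object) triplet; the i-th entries
# of the phrase and box lists below describe the same interaction.
images = pipeline(
    prompt="a person is feeding a cat",
    interactdiffusion_subject_phrases=["person"],
    interactdiffusion_object_phrases=["cat"],
    interactdiffusion_action_phrases=["feeding"],
    interactdiffusion_subject_boxes=[[0.0332, 0.1660, 0.3359, 0.7305]],
    interactdiffusion_object_boxes=[[0.2891, 0.4766, 0.6680, 0.7930]],
    interactdiffusion_scheduled_sampling_beta=1,
    output_type="pil",
    num_inference_steps=50,
).images

# With output_type="pil", the pipeline returns a list of PIL images.
images[0].save('out.jpg')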
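
The box values in the example above look like fractions of the image size. The sketch below shows one way to produce such values from pixel coordinates. It assumes the boxes follow the GLIGEN convention of normalized [x_min, y_min, x_max, y_max], which this page does not state explicitly, and that the layout was drawn on a 512x512 canvas. The helper name `to_normalized_box` is hypothetical and is not part of the pipeline.

```python
def to_normalized_box(box_px, width, height, ndigits=4):
    """Convert a pixel-space (x_min, y_min, x_max, y_max) box to fractions of the image size.

    Assumption: the pipeline expects normalized [x_min, y_min, x_max, y_max]
    boxes, as in GLIGEN. This helper is illustrative only.
    """
    x_min, y_min, x_max, y_max = box_px
    if not (0 <= x_min < x_max <= width and 0 <= y_min < y_max <= height):
        raise ValueError(f"box {box_px} does not fit in a {width}x{height} image")
    return [
        round(x_min / width, ndigits),
        round(y_min / height, ndigits),
        round(x_max / width, ndigits),
        round(y_max / height, ndigits),
    ]


# Pixel boxes on an assumed 512x512 canvas that reproduce the values used above.
subject_box = to_normalized_box((17, 85, 172, 374), 512, 512)
object_box = to_normalized_box((148, 244, 342, 406), 512, 512)

print(subject_box)  # [0.0332, 0.166, 0.3359, 0.7305]
print(object_box)   # [0.2891, 0.4766, 0.668, 0.793]
```

The resulting lists can be passed as the single entries of `interactdiffusion_subject_boxes` and `interactdiffusion_object_boxes`, one box per phrase.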

For more information, please check the project homepage.

Citation

@inproceedings{hoe2023interactdiffusion,
      title={InteractDiffusion: Interaction Control in Text-to-Image Diffusion Models}, 
      author={Jiun Tian Hoe and Xudong Jiang and Chee Seng Chan and Yap-Peng Tan and Weipeng Hu},
      year={2024},
      booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
}

Acknowledgement

This work builds on the GLIGEN and LDM codebases.
