metadata

library_name: diffusers

MGIE

This repository contains the UNet and LLaVA model checkpoints for MGIE, the method introduced in the paper "Guiding Instruction-based Image Editing via Multimodal Large Language Models" (ICLR 2024).

For a detailed example of usage, refer to this notebook and the official repository.
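As a starting point before opening the notebook, the checkpoints can be fetched locally with `huggingface_hub`. The sketch below is a minimal example, not the official loading procedure: the repository id is a placeholder you must fill in, the `mgie_ckpt` folder name and both helper functions are hypothetical, and no file layout is assumed for the repository, so the script only lists what was downloaded.

```python
from pathlib import Path


def download_checkpoints(repo_id: str, local_dir: str = "mgie_ckpt") -> Path:
    """Download every file of a Hub repository and return the local folder."""
    # Imported lazily so list_files() stays usable without huggingface_hub.
    from huggingface_hub import snapshot_download

    return Path(snapshot_download(repo_id=repo_id, local_dir=local_dir))


def list_files(folder: Path) -> list[str]:
    """Return the sorted relative paths of all files under `folder`."""
    return sorted(
        p.relative_to(folder).as_posix()
        for p in folder.rglob("*")
        if p.is_file()
    )


if __name__ == "__main__":
    # Placeholder (assumption): replace with this repository's actual Hub id.
    ckpt_dir = download_checkpoints("<namespace>/<repo-name>")
    for name in list_files(ckpt_dir):
        print(name)
```

Listing the files first makes it easy to see which entries belong to the UNet and which to LLaVA before wiring them into the code from the official repository.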

Citation

@inproceedings{fu2024mgie,
  author = {Tsu-Jui Fu and Wenze Hu and Xianzhi Du and William Yang Wang and Yinfei Yang and Zhe Gan},
  title = {{Guiding Instruction-based Image Editing via Multimodal Large Language Models}}, 
  booktitle = {International Conference on Learning Representations (ICLR)}, 
  year = {2024} 
}