library_name: diffusers
MGIE
This repository contains the UNet and LLaVA model checkpoints from the paper "Guiding Instruction-based Image Editing via Multimodal Large Language Models" (ICLR 2024).
For a detailed example of usage, refer to this notebook and the official repository.
Citation
@inproceedings{fu2024mgie,
author = {Tsu-Jui Fu and Wenze Hu and Xianzhi Du and William Yang Wang and Yinfei Yang and Zhe Gan},
title = {{Guiding Instruction-based Image Editing via Multimodal Large Language Models}},
booktitle = {International Conference on Learning Representations (ICLR)},
year = {2024}
}