This repository contains the UNet and LLaVA model checkpoints from [Guiding Instruction-based Image Editing via Multimodal Large Language Models](https://arxiv.org/abs/2309.17102).

For a detailed example of usage, refer to [this notebook](https://github.com/apple/ml-mgie/blob/main/demo.ipynb) and the [official repository](https://github.com/apple/ml-mgie). Additionally, this notebook is a memory-optimized version of the original one: it decouples the MGIE inference pipeline into three broad stages, with a code sketch after the list:

1. Calculate all the embeddings in a batched manner with the LLaVA model and the edit head.
2. Pop the LLaVA model off memory to free up VRAM.
3. Load the InstructPix2Pix pipeline and perform the editing.
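
Below is a minimal sketch of that decoupled flow. It is illustrative only: `load_mgie_llava` and `compute_edit_embeddings` are hypothetical stand-ins for the LLaVA and edit-head code in the official repository, and `timbrooks/instruct-pix2pix` is assumed as a placeholder base pipeline; adapt the names and checkpoints to the actual MGIE code.

```python
# Memory-optimized MGIE inference sketch.
# `load_mgie_llava` and `compute_edit_embeddings` are hypothetical stand-ins
# for the LLaVA + edit-head code in the official MGIE repository.
import torch
from diffusers import StableDiffusionInstructPix2PixPipeline
from PIL import Image

device = "cuda"
image = Image.open("input.jpg").convert("RGB")
instruction = "make the sky look like a sunset"

# Stage 1: batch-compute the edit embeddings with LLaVA and the edit head.
llava = load_mgie_llava().to(device)  # hypothetical loader
prompt_embeds = compute_edit_embeddings(llava, [image], [instruction])  # hypothetical

# Stage 2: pop the LLaVA model off memory to reclaim VRAM for diffusion.
del llava
torch.cuda.empty_cache()

# Stage 3: load the InstructPix2Pix pipeline and condition it on the
# precomputed embeddings instead of a raw text prompt.
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "timbrooks/instruct-pix2pix", torch_dtype=torch.float16
).to(device)
edited = pipe(
    prompt_embeds=prompt_embeds.to(device=device, dtype=torch.float16),
    image=image,
).images[0]
edited.save("edited.jpg")
```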

💡 MGIE needs additional setup steps that are important to follow before running inference. Please refer to the official repository for those instructions.

## Citation