This repository contains the UNet and LLaVA model checkpoints from [Guiding Instruction-based Image Editing via Multimodal Large Language Models](https://arxiv.org/abs/2309.17102).

For a detailed example of usage, refer to [this notebook](https://github.com/apple/ml-mgie/blob/main/demo.ipynb) and the [official repository](https://github.com/apple/ml-mgie). Additionally, this notebook is a memory-optimized version of the original one: it decouples the MGIE inference pipeline into three broad stages, with a code sketch after the list:

1. Calculate all the embeddings in a batched manner with the LLaVA model and the edit head.
2. Pop the LLaVA model off memory to free up VRAM.
3. Load the InstructPix2Pix pipeline and perform the editing.
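
Below is a minimal sketch of that decoupled flow. It is illustrative only: `load_mgie_llava` and `compute_edit_embeddings` are hypothetical stand-ins for the LLaVA and edit-head code in the official repository, and `timbrooks/instruct-pix2pix` is assumed as a placeholder base pipeline; adapt the names and checkpoints to the actual MGIE code.

```python
# Memory-optimized MGIE inference sketch.
# `load_mgie_llava` and `compute_edit_embeddings` are hypothetical stand-ins
# for the LLaVA + edit-head code in the official MGIE repository.
import torch
from diffusers import StableDiffusionInstructPix2PixPipeline
from PIL import Image

device = "cuda"
image = Image.open("input.jpg").convert("RGB")
instruction = "make the sky look like a sunset"

# Stage 1: batch-compute the edit embeddings with LLaVA and the edit head.
llava = load_mgie_llava().to(device)  # hypothetical loader
prompt_embeds = compute_edit_embeddings(llava, [image], [instruction])  # hypothetical

# Stage 2: pop the LLaVA model off memory to reclaim VRAM for diffusion.
del llava
torch.cuda.empty_cache()

# Stage 3: load the InstructPix2Pix pipeline and condition it on the
# precomputed embeddings instead of a raw text prompt.
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "timbrooks/instruct-pix2pix", torch_dtype=torch.float16
).to(device)
edited = pipe(
    prompt_embeds=prompt_embeds.to(device=device, dtype=torch.float16),
    image=image,
).images[0]
edited.save("edited.jpg")
```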

💡 MGIE needs additional setup steps that are important to follow before running inference. Please refer to the official repository for those instructions.

## Citation