Upload README.md
README.md
CHANGED
@@ -1,196 +1,12 @@
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-<p>
-We propose <strong>RF-Solver</strong> to solve the rectified flow ODE with less error, thus enhancing both sampling quality and inversion-reconstruction accuracy for rectified-flow-based generative models. Furthermore, we propose <strong>RF-Edit</strong> to leverage the <strong>RF-Solver</strong> for image and video editing tasks. Our methods achieve impressive performance on various tasks, including text-to-image generation, image/video inversion, and image/video editing.
-</p>
-
-
-
-<p align="center">
-<img src="assets/repo_figures/Picture1.jpg" width="1080px"/>
-</p>
-
-# 🔥 News
-- [2024.11.18] More examples for style transfer are available!
-- [2024.11.18] Gradio Demo for image editing is available!
-- [2024.11.11] The homepage of the project is available!
-- [2024.11.08] Code for image editing is released!
-- [2024.11.08] Paper released!
-
-# 👨‍💻 ToDo
-- ☑️ Release the gradio demo
-- ☑️ Release scripts for more image editing cases
-- ☐ Release the code for video editing
-
-
-# 📖 Method
-## RF-Solver
-<p>
-<img src="assets/repo_figures/Picture2.jpg" width="1080px"/>
-We derive the exact formulation of the solution to the rectified flow ODE. The non-linear part of this solution is approximated by a Taylor expansion. A higher-order expansion significantly reduces the approximation error, which improves both text-to-image sampling and image/video inversion.
-</p>
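To make the effect of a higher-order expansion concrete, here is a minimal numerical sketch. It is not the repository's implementation: `velocity` is a toy scalar field standing in for the learned network, and estimating the derivative of the velocity with a half-step finite difference is an assumption made for illustration (the README only states that a Taylor expansion is used). The sketch compares a first-order (Euler) step with a second-order Taylor step on an ODE whose exact solution is known.

```python
import math


def velocity(z, t):
    # Toy velocity field (an assumption for illustration): dz/dt = -z + t.
    return -z + t


def exact_solution(z0, t):
    # Closed-form solution of dz/dt = -z + t with z(0) = z0.
    return t - 1.0 + (z0 + 1.0) * math.exp(-t)


def euler_step(z, t, dt):
    # First-order Taylor expansion: keep only the velocity term.
    return z + dt * velocity(z, t)


def taylor2_step(z, t, dt):
    # Second-order Taylor expansion: add 0.5 * dt^2 * dv/dt, where dv/dt is
    # estimated by a finite difference over half a step along the trajectory.
    v = velocity(z, t)
    half = 0.5 * dt
    v_mid = velocity(z + half * v, t + half)
    dv_dt = (v_mid - v) / half
    return z + dt * v + 0.5 * dt * dt * dv_dt


def integrate(step, z0, t0, t1, num_steps):
    # Apply `step` repeatedly on a uniform time grid from t0 to t1.
    dt = (t1 - t0) / num_steps
    z, t = z0, t0
    for _ in range(num_steps):
        z = step(z, t, dt)
        t += dt
    return z


if __name__ == "__main__":
    target = exact_solution(1.0, 1.0)
    for name, step in (("first order", euler_step), ("second order", taylor2_step)):
        error = abs(integrate(step, 1.0, 0.0, 1.0, 10) - target)
        print(f"{name}: absolute error {error:.5f}")
```

With the same number of steps, the second-order update lands much closer to the exact solution, at the cost of one extra velocity evaluation per step. With this particular derivative estimate the update coincides algebraically with the explicit midpoint rule.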
-
-## RF-Edit
-<p>
-<img src="assets/repo_figures/Picture3.jpg" width="1080px"/>
-Based on RF-Solver, we further propose RF-Edit for image and video editing. The RF-Edit framework reuses features from inversion in the denoising process, which enables high-quality editing while preserving the structural information of the source image/video. RF-Edit contains two sub-modules, for image editing and video editing respectively.
-</p>
-
-# 🛠️ Code Setup
-The environment for our code is the same as FLUX. You can refer to the [official repo](https://github.com/black-forest-labs/flux/tree/main) of FLUX, or run the following commands to set up the environment.
-```
-conda create --name RF-Solver-Edit python=3.10
-conda activate RF-Solver-Edit
-pip install -e ".[all]"
-```
-# 🚀 Examples for Image Editing
-We provide several scripts to reproduce the results in the paper, covering three types of editing: Stylization, Adding, and Replacing. We suggest running the experiments on a single A100 GPU.
-
-## Stylization
-<table class="center">
-<tr>
-<td width=10% align="center">Ref Style</td>
-<td width=30% align="center"><img src="assets/repo_figures/examples/source/nobel.jpg" raw=true></td>
-<td width=30% align="center"><img src="assets/repo_figures/examples/source/art.jpg" raw=true></td>
-<td width=30% align="center"><img src="assets/repo_figures/examples/source/cartoon.jpg" raw=true></td>
-</tr>
-<tr>
-<td width="10%" align="center">Editing Scripts</td>
-<td width="30%" align="center"><a href="src/run_nobel_trump.sh">Trump</a></td>
-<td width="30%" align="center"><a href="src/run_art_mari.sh">Marilyn Monroe</a></td>
-<td width="30%" align="center"><a href="src/run_cartoon_ein.sh">Einstein</a></td>
-</tr>
-<tr>
-<td width=10% align="center">Edited image</td>
-<td width=30% align="center"><img src="assets/repo_figures/examples/edit/nobel_Trump.jpg" raw=true></td>
-<td width=30% align="center"><img src="assets/repo_figures/examples/edit/art_mari.jpg" raw=true></td>
-<td width=30% align="center"><img src="assets/repo_figures/examples/edit/cartoon_ein.jpg" raw=true></td>
-</tr>
-
-<tr>
-<td width="10%" align="center">Editing Scripts</td>
-<td width="30%" align="center"><a href="src/run_nobel_biden.sh">Biden</a></td>
-<td width="30%" align="center"><a href="src/run_art_batman.sh">Batman</a></td>
-<td width="30%" align="center"><a href="src/run_cartoon_herry.sh">Harry Potter</a></td>
-</tr>
-<tr>
-<td width=10% align="center">Edited image</td>
-<td width=30% align="center"><img src="assets/repo_figures/examples/edit/nobel_Biden.jpg" raw=true></td>
-<td width=30% align="center"><img src="assets/repo_figures/examples/edit/art_batman.jpg" raw=true></td>
-<td width=30% align="center"><img src="assets/repo_figures/examples/edit/cartoon_herry.jpg" raw=true></td>
-</tr>
-</table>
-
-## Adding & Replacing
-<table class="center">
-<tr>
-<td width=10% align="center">Source image</td>
-<td width=30% align="center"><img src="assets/repo_figures/examples/source/hiking.jpg" raw=true></td>
-<td width=30% align="center"><img src="assets/repo_figures/examples/source/horse.jpg" raw=true></td>
-<td width=30% align="center"><img src="assets/repo_figures/examples/source/boy.jpg" raw=true></td>
-</tr>
-<tr>
-<td width="10%" align="center">Editing Scripts</td>
-<td width="30%" align="center"><a href="src/run_boy.sh">+ hiking stick</a></td>
-<td width="30%" align="center"><a href="src/run_horse.sh">horse -> camel</a></td>
-<td width="30%" align="center"><a href="src/run_boy.sh">+ dog</a></td>
-</tr>
-<tr>
-<td width=10% align="center">Edited image</td>
-<td width=30% align="center"><img src="assets/repo_figures/examples/edit/hiking.jpg" raw=true></td>
-<td width=30% align="center"><img src="assets/repo_figures/examples/edit/horse.jpg" raw=true></td>
-<td width=30% align="center"><img src="assets/repo_figures/examples/edit/boy.jpg" raw=true></td>
-</tr>
-
-</table>
-
-
-# 🪄 Edit Your Own Image
-
-## Gradio Demo
-We provide a Gradio demo for image editing. Run the following commands:
-```
-cd src
-python gradio_demo.py
-```
-Here is an example of using the Gradio demo to edit an image. Note that "Number of inject steps" is the number of feature-sharing steps in RF-Edit, which strongly affects the quality of the edited result. We suggest tuning this parameter and selecting the result with the best visual quality.
-<div style="text-align: center;">
-<img src="assets/repo_figures/Picture7.jpg" style="width:100%; display: block; margin: 0 auto;" />
-</div>
-
-
-## Command Line
-You can also run the following script to edit your own image.
-```
-cd src
-python edit.py --source_prompt [describe the content of your image, or leave it as null] \
-               --target_prompt [describe your editing requirements] \
-               --guidance 2 \
-               --source_img_dir [the path of your source image] \
-               --num_steps 30 \
-               --inject [typically set to a number between 2 and 8] \
-               --name 'flux-dev' --offload \
-               --output_dir [output path]
-```
-Similarly, `--inject` is the number of feature-sharing steps in RF-Edit, which strongly affects editing performance.
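Since the README recommends tuning `--inject` and keeping the best-looking result, one way to do that is to generate one `edit.py` command per value and compare the outputs. The sketch below only builds the commands, using the flags listed above. The helper names, the prompts, and the image and output paths are hypothetical placeholders, and the per-value output directory layout is an assumption.

```python
import shlex


def build_edit_command(source_prompt, target_prompt, source_img_dir,
                       output_dir, inject, guidance=2, num_steps=30):
    """Return the argument list for one edit.py run, mirroring the README flags."""
    return [
        "python", "edit.py",
        "--source_prompt", source_prompt,
        "--target_prompt", target_prompt,
        "--guidance", str(guidance),
        "--source_img_dir", source_img_dir,
        "--num_steps", str(num_steps),
        "--inject", str(inject),
        "--name", "flux-dev",
        "--offload",
        "--output_dir", output_dir,
    ]


def sweep_inject(source_prompt, target_prompt, source_img_dir, output_root,
                 inject_values=range(2, 9)):
    """Build one command per inject value, each writing to its own directory."""
    return [
        build_edit_command(source_prompt, target_prompt, source_img_dir,
                           f"{output_root}/inject_{k}", k)
        for k in inject_values
    ]


if __name__ == "__main__":
    # Hypothetical prompts and paths, for illustration only.
    commands = sweep_inject("", "a boy with a dog",
                            "my_images/boy.jpg", "outputs/boy")
    for cmd in commands:
        print(shlex.join(cmd))
```

Each printed line can be run from the `src` directory, or passed to `subprocess.run(cmd, cwd="src", check=True)` to execute the sweep directly.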
-
-
-
-# 🖼️ Gallery
-## Inversion and Reconstruction
-
-<p align="center">
-<img src="assets/repo_figures/Picture4.jpg" width="1080px"/>
-</p>
-
-## Image Stylization
-
-<p align="center">
-<img src="assets/repo_figures/Picture8.jpg" width="1080px"/>
-</p>
-
-## Image Editing
-
-<p align="center">
-<img src="assets/repo_figures/Picture5.jpg" width="1080px"/>
-</p>
-
-## Video Editing
-
-<p align="center">
-<img src="assets/repo_figures/Picture6.jpg" width="1080px"/>
-</p>
-
-# 🖋️ Citation
-
-If you find our work helpful, please **star 🌟** this repo and **cite 📑** our paper. Thanks for your support!
-
-```
-@article{wang2024taming,
-  title={Taming Rectified Flow for Inversion and Editing},
-  author={Wang, Jiangshan and Pu, Junfu and Qi, Zhongang and Guo, Jiayi and Ma, Yue and Huang, Nisha and Chen, Yuxin and Li, Xiu and Shan, Ying},
-  journal={arXiv preprint arXiv:2411.04746},
-  year={2024}
-}
-```
-
-# Acknowledgements
-We thank [FLUX](https://github.com/black-forest-labs/flux/tree/main) for their clean codebase.
-
-# Contact
-The code in this repository is still being reorganized. Errors introduced during this process may cause the code to malfunction or to deviate from the results reported in the paper. If you have any questions or concerns, please email wjs23@mails.tsinghua.edu.cn.
+title: RF-Solver-Edit
+emoji: 🪄
+colorFrom: purple
+colorTo: red
+sdk: gradio
+sdk_version: 5.0.1
+app_file: src/gradi_demo.py
+pinned: false
+license: other
+license_name: flux-1-dev-non-commercial-license
+license_link: LICENSE.md
+short_description: High Quality Inversion and Editing of FLUX and OpenSora.
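The added metadata names `src/gradi_demo.py` as the app file, while the removed README runs `python gradio_demo.py` from `src`. Whether that difference is intentional cannot be told from this diff alone, so a quick local check may be useful before pushing. The sketch below is a minimal helper, not an official tool: it parses flat `key: value` lines like the ones above and reports whether the declared `app_file` exists under a given repository root. The function names are hypothetical, and nested YAML is deliberately not handled.

```python
from pathlib import Path


def parse_flat_metadata(text):
    """Parse flat `key: value` lines into a dict (no nested YAML support)."""
    meta = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and front-matter delimiters if present.
        if not line or line == "---" or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def app_file_exists(meta, repo_root):
    """Return True if the declared app_file is a file under repo_root."""
    app_file = meta.get("app_file")
    return bool(app_file) and (Path(repo_root) / app_file).is_file()


if __name__ == "__main__":
    metadata_text = """\
title: RF-Solver-Edit
sdk: gradio
sdk_version: 5.0.1
app_file: src/gradi_demo.py
"""
    meta = parse_flat_metadata(metadata_text)
    # Checks the current directory; run this from the repository root.
    print(meta["app_file"], "exists:", app_file_exists(meta, "."))
```

Run from the repository root, a `False` result would indicate that the path in the metadata does not match any file on disk.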