wujunqiang committed on
Commit
a48af49
1 Parent(s): f91a46b

Update README.md

  1. README.md +138 -3
README.md CHANGED
@@ -1,3 +1,138 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - en
5
+ tags:
6
+ - Kolors
7
+ - text-to-image
8
+ - stable-diffusion
9
+ library_name: diffusers
10
+ ---
11
+
12
+
13
+
14
+ Kolors-ControlNet-Canny weights and inference code
15
+
16
+
17
+ ## <a name="Introduction"></a>📖 Introduction
18
+
19
+ We provide two sets of ControlNet weights, Canny and Depth, together with inference code based on Kolors-BaseModel. Example images are shown below.
20
+
21
+
22
+ **1. ControlNet Demos**
23
+
24
+ <img src="demo1.png">
25
+
26
+
27
+
28
+
29
+ **2. ControlNet and IP-Adapter-Plus Demos**
30
+
31
+ We also provide inference code for using Kolors-IP-Adapter-Plus together with Kolors-ControlNet.
32
+
33
+
34
+ <img src="demo2.png">
35
+
36
+ <br>
37
+
38
+
39
+ <br>
40
+
41
+
42
+ ## <a name="Evaluation"></a>📊 Evaluation
43
+ To evaluate model performance, we compiled a test set of more than 200 images and text prompts and invited several image experts to rate the results generated by each model. The experts rated the generated images on four criteria: visual appeal, text faithfulness, conditional controllability, and overall satisfaction. Conditional controllability measures the ControlNet's ability to preserve spatial structure, while the other criteria follow the evaluation standards of the BaseModel. The results are summarized in the tables below, where Kolors-ControlNet scored higher than the SDXL baseline on every criterion.
44
+
45
+ **1. Canny**
46
+
47
+ | Model | Average Overall Satisfaction | Average Visual Appeal | Average Text Faithfulness | Average Conditional Controllability |
48
+ | :--------------: | :--------: | :--------: | :--------: | :--------: |
49
+ | SDXL-ControlNet-Canny | 3.14 | 3.63 | 4.37 | 2.84 |
50
+ | **Kolors-ControlNet-Canny** | **4.06** | **4.64** | **4.45** | **3.52** |
51
+
52
+
53
+
54
+ **2. Depth**
55
+
56
+ | Model | Average Overall Satisfaction | Average Visual Appeal | Average Text Faithfulness | Average Conditional Controllability |
57
+ | :--------------: | :--------: | :--------: | :--------: | :--------: |
58
+ | SDXL-ControlNet-Depth | 3.35 | 3.77 | 4.26 | 4.5 |
59
+ | **Kolors-ControlNet-Depth** | **4.12** | **4.12** | **4.62** | **4.6** |
60
+
61
+
62
+ <font color=gray style="font-size:12px">*Both [SDXL-ControlNet-Canny](https://huggingface.co/diffusers/controlnet-canny-sdxl-1.0) and [SDXL-ControlNet-Depth](https://huggingface.co/diffusers/controlnet-depth-sdxl-1.0) use [DreamShaper-XL](https://civitai.com/models/112902?modelVersionId=351306) as the backbone model.*</font>
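To make the size of the gaps in the two tables easier to read, the per-criterion difference between Kolors-ControlNet and the SDXL baseline can be computed directly from the reported averages. The snippet below is only an illustrative sketch: the scores are copied from the tables, while the names `CRITERIA`, `SCORES`, and `score_gains` are made up for this example and are not part of the Kolors code.

```python
# Average expert ratings copied from the Canny and Depth tables above.
# Tuple order: overall satisfaction, visual appeal, text faithfulness,
# conditional controllability.
CRITERIA = (
    "overall satisfaction",
    "visual appeal",
    "text faithfulness",
    "conditional controllability",
)
SCORES = {
    "Canny": {
        "SDXL baseline": (3.14, 3.63, 4.37, 2.84),
        "Kolors": (4.06, 4.64, 4.45, 3.52),
    },
    "Depth": {
        "SDXL baseline": (3.35, 3.77, 4.26, 4.5),
        "Kolors": (4.12, 4.12, 4.62, 4.6),
    },
}


def score_gains(condition):
    """Return Kolors minus baseline for each criterion, rounded to 2 decimals."""
    rows = SCORES[condition]
    return {
        name: round(kolors - baseline, 2)
        for name, baseline, kolors in zip(
            CRITERIA, rows["SDXL baseline"], rows["Kolors"]
        )
    }


if __name__ == "__main__":
    for condition in SCORES:
        print(condition, score_gains(condition))
```

On these numbers the largest gap is Canny visual appeal and the smallest is Canny text faithfulness; every difference is positive.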
63
+
64
+ <img src="compare_dmeo.png">
65
+
66
+
67
+ ------
68
+
69
+
70
+ ## <a name="Usage"></a>🛠️ Usage
71
+
72
+ ### Requirements
73
+
74
+ Dependencies and installation are largely the same as for [Kolors-BaseModel](https://huggingface.co/Kwai-Kolors/Kolors).
75
+
76
+ <br>
77
+
78
+
79
+ 1. Download the weights:
80
+ ```bash
81
+ # Canny - ControlNet
82
+ huggingface-cli download --resume-download Kwai-Kolors/Kolors-ControlNet-Canny --local-dir weights/Kolors-ControlNet-Canny
83
+
84
+ # Depth - ControlNet
85
+ huggingface-cli download --resume-download Kwai-Kolors/Kolors-ControlNet-Depth --local-dir weights/Kolors-ControlNet-Depth
86
+ ```
87
+
88
+ If you plan to use the depth estimation network, also download its model weights:
89
+ ```bash
90
+ huggingface-cli download lllyasviel/Annotators ./dpt_hybrid-midas-501f0c75.pt --local-dir ./controlnet/annotator/ckpts
91
+ ```
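The same downloads can also be scripted from Python. The sketch below assumes the `huggingface_hub` package is installed and uses its `snapshot_download` and `hf_hub_download` functions; the repository IDs, file name, and target directories are taken from the commands above, while `DOWNLOADS` and `fetch_all` are hypothetical names introduced for illustration. It is not part of the Kolors repository.

```python
from pathlib import Path

# (repo_id, filename or None for the whole repo, local_dir),
# mirroring the huggingface-cli commands above.
DOWNLOADS = [
    ("Kwai-Kolors/Kolors-ControlNet-Canny", None, "weights/Kolors-ControlNet-Canny"),
    ("Kwai-Kolors/Kolors-ControlNet-Depth", None, "weights/Kolors-ControlNet-Depth"),
    ("lllyasviel/Annotators", "dpt_hybrid-midas-501f0c75.pt", "controlnet/annotator/ckpts"),
]


def fetch_all(dry_run=True):
    """Download every entry; with dry_run=True only report the target paths."""
    targets = []
    for repo_id, filename, local_dir in DOWNLOADS:
        target = Path(local_dir) / filename if filename else Path(local_dir)
        targets.append(target.as_posix())
        if dry_run:
            continue
        # Imported lazily so the dry run works without huggingface_hub installed.
        from huggingface_hub import hf_hub_download, snapshot_download

        if filename:
            # Single file, e.g. the depth estimator checkpoint.
            hf_hub_download(repo_id=repo_id, filename=filename, local_dir=local_dir)
        else:
            # Whole repository, e.g. the ControlNet weights.
            snapshot_download(repo_id=repo_id, local_dir=local_dir)
    return targets


if __name__ == "__main__":
    # Print the planned targets; call fetch_all(dry_run=False) to download.
    for path in fetch_all(dry_run=True):
        print(path)
```

The dry run is the default so the plan can be checked before any large files are fetched.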
92
+
93
+
94
+ ### Inference:
95
+
96
+
97
+ **a. Using canny ControlNet:**
98
+
99
+ ```bash
100
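+ # Prompt: "A beautiful girl, high quality, ultra clear, vivid colors, ultra-high resolution, best quality, 8k, HD, 4K"
+ python ./controlnet/sample_controlNet.py ./controlnet/assets/woman_1.png "一个漂亮的女孩,高品质,超清晰,色彩鲜艳,超高分辨率,最佳品质,8k,高清,4K" Canny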
101
+
102
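+ # Prompt: "Panorama, a cute white puppy sitting in a cup, looking at the camera, anime style, 3D rendering, Octane render"
+ python ./controlnet/sample_controlNet.py ./controlnet/assets/dog.png "全景,一只可爱的白色小狗坐在杯子里,看向镜头,动漫风格,3d渲染,辛烷值渲染" Canny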
103
+
104
+ # The image will be saved to "controlnet/outputs/"
105
+ ```
106
+
107
+ **b. Using depth ControlNet:**
108
+
109
+ ```bash
110
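+ # Prompt: "Makoto Shinkai style, rich colors, a woman in a green shirt standing in a field, beautiful scenery, fresh and bright, dappled light and shadow, best quality, ultra-detailed, 8K quality"
+ python ./controlnet/sample_controlNet.py ./controlnet/assets/woman_2.png "新海诚风格,丰富的色彩,穿着绿色衬衫的女人站在田野里,唯美风景,清新明亮,斑驳的光影,最好的质量,超细节,8K画质" Depth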
111
+
112
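+ # Prompt: "A brightly colored little bird, high quality, ultra clear, vivid colors, ultra-high resolution, best quality, 8k, HD, 4K"
+ python ./controlnet/sample_controlNet.py ./controlnet/assets/bird.png "一只颜色鲜艳的小鸟,高品质,超清晰,色彩鲜艳,超高分辨率,最佳品质,8k,高清,4K" Depth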
113
+
114
+ # The image will be saved to "controlnet/outputs/"
115
+ ```
116
+
117
+
118
+
119
+ **c. Using depth ControlNet + IP-Adapter-Plus:**
120
+
121
+ If you plan to use Kolors-IP-Adapter-Plus, download its model weights first.
122
+
123
+ ```bash
124
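+ # Prompt: "A girl with red hair, beautiful scenery, fresh and bright, dappled light and shadow, best quality, ultra-detailed, 8K quality"
+ python ./controlnet/sample_controlNet_ipadapter.py ./controlnet/assets/woman_2.png ./ipadapter/asset/2.png "一个红色头发的女孩,唯美风景,清新明亮,斑驳的光影,最好的质量,超细节,8K画质" Depth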
125
+
126
+ # The image will be saved to "controlnet/outputs/"
127
+ ```
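When running many prompts, it can be convenient to launch the sample scripts from Python rather than typing each command. The sketch below assumes only the positional argument order shown in the commands above (condition image, optional IP-Adapter reference image, prompt, then `Canny` or `Depth`); `build_command` and `run` are hypothetical helpers written for this example, not functions provided by the Kolors repository. Passing the prompt as a single list element avoids shell quoting problems with spaces or punctuation.

```python
import subprocess
import sys

# Script paths as they appear in the commands above.
CONTROLNET_SCRIPT = "./controlnet/sample_controlNet.py"
CONTROLNET_IPADAPTER_SCRIPT = "./controlnet/sample_controlNet_ipadapter.py"
MODES = ("Canny", "Depth")


def build_command(condition_image, prompt, mode, ip_image=None):
    """Build the argument list for one inference run.

    With ip_image set, the ControlNet + IP-Adapter-Plus script is used and the
    reference image is passed right after the condition image.
    """
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}, got {mode!r}")
    if ip_image is None:
        return [sys.executable, CONTROLNET_SCRIPT, condition_image, prompt, mode]
    return [
        sys.executable,
        CONTROLNET_IPADAPTER_SCRIPT,
        condition_image,
        ip_image,
        prompt,
        mode,
    ]


def run(command):
    """Run one command from the repository root; raises if the script fails."""
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Only prints the command; call run(...) to actually start inference.
    print(build_command("./controlnet/assets/dog.png", "a cute white puppy", "Canny"))
```

The examples above only show the IP-Adapter-Plus script with `Depth`, so whether it accepts `Canny` is not confirmed here.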
128
+
129
+ <br>
130
+
131
+
132
+ ### Acknowledgments
133
+ - Thanks to [ControlNet](https://github.com/lllyasviel/ControlNet) for providing the codebase.
134
+
135
+ <br>
136
+
137
+
138
+