adhikjoshi committed on
Commit
fc0edda
1 Parent(s): 81196a4

Add files using upload-large-folder tool

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. .gitattributes +2 -0
  2. README.md +203 -0
  3. config.json +57 -0
  4. images/000000_pose_concat.webp +0 -0
  5. images/000001_openpose_scribble_concat.webp +0 -0
  6. images/000001_pose_concat.webp +0 -0
  7. images/000002_openpose_scribble_concat.webp +0 -0
  8. images/000002_pose_concat.webp +0 -0
  9. images/000003_openpose_scribble_concat.webp +0 -0
  10. images/000003_pose_concat.webp +0 -0
  11. images/000004_openpose_scribble_concat.webp +0 -0
  12. images/000004_pose_concat.webp +0 -0
  13. images/000005_depth_concat.webp +0 -0
  14. images/000005_openpose_scribble_concat.webp +0 -0
  15. images/000006_depth_concat.webp +0 -0
  16. images/000006_openpose_scribble_concat.webp +0 -0
  17. images/000007_depth_concat.webp +0 -0
  18. images/000007_openpose_canny_concat.webp +0 -0
  19. images/000008_depth_concat.webp +0 -0
  20. images/000008_openpose_canny_concat.webp +0 -0
  21. images/000009_depth_concat.webp +0 -0
  22. images/000009_openpose_canny_concat.webp +0 -0
  23. images/000010_canny_concat.webp +0 -0
  24. images/000010_openpose_canny_concat.webp +0 -0
  25. images/000011_canny_concat.webp +0 -0
  26. images/000011_openpose_canny_concat.webp +0 -0
  27. images/000012_canny_concat.webp +0 -0
  28. images/000012_openpose_canny_concat.webp +0 -0
  29. images/000013_canny_concat.webp +0 -0
  30. images/000013_openpose_depth_concat.webp +0 -0
  31. images/000014_canny_concat.webp +0 -0
  32. images/000014_openpose_depth_concat.webp +0 -0
  33. images/000015_lineart_concat.webp +0 -0
  34. images/000015_openpose_depth_concat.webp +0 -0
  35. images/000016_lineart_concat.webp +0 -0
  36. images/000016_openpose_depth_concat.webp +0 -0
  37. images/000017_lineart_concat.webp +0 -0
  38. images/000017_openpose_depth_concat.webp +0 -0
  39. images/000018_lineart_concat.webp +0 -0
  40. images/000018_openpose_depth_concat.webp +0 -0
  41. images/000019_lineart_concat.webp +0 -0
  42. images/000019_openpose_normal_concat.webp +0 -0
  43. images/000020_anime_lineart_concat.webp +0 -0
  44. images/000020_openpose_normal_concat.webp +0 -0
  45. images/000021_openpose_normal_concat.webp +0 -0
  46. images/000022_anime_lineart_concat.webp +0 -0
  47. images/000023_anime_lineart_concat.webp +0 -0
  48. images/000024_openpose_normal_concat.webp +0 -0
  49. images/000025_mlsd_concat.webp +0 -0
  50. images/000026_mlsd_concat.webp +0 -0
.gitattributes CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ images/000027_mlsd_concat.webp filter=lfs diff=lfs merge=lfs -text
37
+ images/masonry.webp filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,203 @@
1
+ ---
2
+ license: apache-2.0
3
+ tags:
4
+ - Text-to-Image
5
+ - ControlNet
6
+ - Diffusers
7
+ - Stable Diffusion
8
+ pipeline_tag: text-to-image
9
+ ---
10
+
11
+ # **ControlNet++: All-in-one ControlNet for image generation and editing!**
12
+ ## **ProMax Model has been released!! 12 control + 5 advanced editing, just try it!!!**
13
+ ![images_display](./images/masonry.webp)
14
+
15
+ ## Network Architecture
16
+ ![images](./images/ControlNet++.png)
17
+
18
+ ## Advantages of the model
19
+ - Uses bucket training like NovelAI, so it can generate high-resolution images of any aspect ratio
20
+ - Uses a large amount of high-quality data (over 10,000,000 images); the dataset covers a diverse range of situations
21
+ - Uses re-captioned prompts like DALL-E 3, with CogVLM generating detailed descriptions, for good prompt-following ability
22
+ - Uses many useful tricks during training, including but not limited to data augmentation, multiple losses, and multi-resolution training
23
+ - Uses almost the same number of parameters as the original ControlNet, with no obvious increase in network parameters or computation
24
+ - Supports 10+ control conditions, with no obvious performance drop on any single condition compared with training each independently
25
+ - Supports multi-condition generation; condition fusion is learned during training, so there is no need to set hyperparameters or design prompts
26
+ - Compatible with other open-source SDXL models, such as BluePencilXL and CounterfeitXL, and with other LoRA models
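The first bullet mentions bucket training. A minimal sketch of the general idea (aspect-ratio bucketing) is given here, assuming a hypothetical set of bucket resolutions of roughly one megapixel each. The bucket list and the helpers `nearest_bucket` and `group_by_bucket` are illustrative names, not the authors' training code.

```python
import math

# Hypothetical (width, height) buckets of roughly 1 megapixel each.
# These values are illustrative assumptions, not the model's actual training buckets.
BUCKETS = [
    (1024, 1024),
    (1152, 896), (896, 1152),
    (1216, 832), (832, 1216),
    (1344, 768), (768, 1344),
]


def nearest_bucket(width, height, buckets=BUCKETS):
    """Return the bucket whose aspect ratio is closest to the image's, compared in log space."""
    target = math.log(width / height)
    return min(buckets, key=lambda wh: abs(math.log(wh[0] / wh[1]) - target))


def group_by_bucket(sizes, buckets=BUCKETS):
    """Group image indices by bucket so that every batch can share one resolution."""
    groups = {}
    for index, (width, height) in enumerate(sizes):
        groups.setdefault(nearest_bucket(width, height, buckets), []).append(index)
    return groups


if __name__ == "__main__":
    # A landscape photo, a portrait phone image, and a square image.
    print(group_by_bucket([(4000, 3000), (1080, 1920), (512, 512)]))
```

Comparing ratios in log space treats a 2:1 and a 1:2 image symmetrically; each image would then be resized and cropped to its bucket before batching.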
27
+
28
+
29
+ ***We design a new architecture that supports 10+ control types in conditional text-to-image generation and can generate high-resolution images visually comparable with
30
+ Midjourney***. The network is based on the original ControlNet architecture, and we propose two new modules that: 1. Extend the original ControlNet to support different image
31
+ conditions using the same network parameters. 2. Support multiple condition inputs without increasing the computational load, which is especially important for designers
32
+ who want to edit images in detail; different conditions share the same condition encoder, so no extra computation or parameters are added. We conduct thorough experiments
33
+ on SDXL and achieve superior performance in both control ability and aesthetic score. We release the method and the model to the open-source community so that everyone
34
+ can enjoy it.
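One plausible way for a single network to tell which conditions are active is a fixed-length multi-hot vector, one slot per control type. The sketch below is an assumption for illustration: the slot names, their order, and the helper `control_type_vector` are hypothetical, and only the length of 8 is taken from the `num_control_type` field in `config.json`. The linked inference scripts are the authoritative reference for the real encoding.

```python
# Hypothetical slot order, for illustration only. The real index assignment is
# defined by the project's inference scripts, not by this README.
CONTROL_SLOTS = [
    "openpose", "depth", "softedge", "canny_lineart",
    "normal", "segment", "tile", "inpaint",
]


def control_type_vector(active, slots=CONTROL_SLOTS):
    """Build a multi-hot vector with 1.0 in each slot whose control type is active."""
    unknown = set(active) - set(slots)
    if unknown:
        raise ValueError(f"unknown control types: {sorted(unknown)}")
    return [1.0 if name in active else 0.0 for name in slots]


if __name__ == "__main__":
    # Two conditions at once, e.g. a pose image plus a depth map.
    print(control_type_vector(["openpose", "depth"]))
```

Under this scheme, adding a second condition changes only which slots are set; the vector length and the network that consumes it stay the same.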
35
+
36
+ Inference scripts and more details can be found at: https://github.com/xinsir6/ControlNetPlus/tree/main
37
+
38
+ **If you find it useful, please give me a star. Thank you very much!**
39
+
40
+ **The SDXL ProMax version has been released!!! Enjoy it!!!**
41
+
42
+ **I am sorry that, because the project's revenue and expenditure are difficult to balance, the GPU resources have been assigned to other projects that are more likely to be profitable. SD3 training is stopped until I find enough GPU support; I will try my best to find GPUs to continue training. If this causes you inconvenience, I sincerely apologize. I want to thank everyone who likes this project; your support is what keeps me going.**
43
+
44
+ Note: we put the ProMax model, with a promax suffix, in the same [huggingface model repo](https://huggingface.co/xinsir/controlnet-union-sdxl-1.0); detailed instructions will be added later.
45
+ ## Advanced editing features in Promax Model
46
+ ### Tile Deblur
47
+ ![blur0](./images/100000_tile_blur_concat.webp)
48
+ ![blur1](./images/100001_tile_blur_concat.webp)
49
+ ![blur2](./images/100002_tile_blur_concat.webp)
50
+ ![blur3](./images/100003_tile_blur_concat.webp)
51
+ ![blur4](./images/100004_tile_blur_concat.webp)
52
+ ![blur5](./images/100005_tile_blur_concat.webp)
53
+ ### Tile variation
54
+ ![var0](./images/100006_tile_var_concat.webp)
55
+ ![var1](./images/100007_tile_var_concat.webp)
56
+ ![var2](./images/100008_tile_var_concat.webp)
57
+ ![var3](./images/100009_tile_var_concat.webp)
58
+ ![var4](./images/100010_tile_var_concat.webp)
59
+ ![var5](./images/100011_tile_var_concat.webp)
60
+
61
+ ### Tile Super Resolution
62
+ The following examples show upscaling from 1M resolution --> 9M resolution
63
+ <div style="display: flex; justify-content: space-between;">
64
+ <img src="./images/tile_super1.webp" alt="Image 1" style="width: 49%; margin: 1%;">
65
+ <img src="./images/tile_super1_9upscale.webp" alt="Image 2" style="width: 49%; margin: 1%;">
66
+ </div>
67
+
68
+ <div style="display: flex; justify-content: space-between;">
69
+ <img src="./images/tile_super2.webp" alt="Image 1" style="width: 49%; margin: 1%;">
70
+ <img src="./images/tile_super2_9upscale.webp" alt="Image 2" style="width: 49%; margin: 1%;">
71
+ </div>
72
+
73
+ ### Image Inpainting
74
+ ![inp0](./images/100018_inpainting_concat.webp)
75
+ ![inp1](./images/100019_inpainting_concat.webp)
76
+ ![inp2](./images/100020_inpainting_concat.webp)
77
+ ![inp3](./images/100021_inpainting_concat.webp)
78
+ ![inp4](./images/100022_inpainting_concat.webp)
79
+ ![inp5](./images/100023_inpainting_concat.webp)
80
+
81
+ ### Image Outpainting
82
+ ![oup0](./images/100012_outpainting_concat.webp)
83
+ ![oup1](./images/100013_outpainting_concat.webp)
84
+ ![oup2](./images/100014_outpainting_concat.webp)
85
+ ![oup3](./images/100015_outpainting_concat.webp)
86
+ ![oup4](./images/100016_outpainting_concat.webp)
87
+ ![oup5](./images/100017_outpainting_concat.webp)
88
+
89
+
90
+ ## Visual Examples
91
+ ### Openpose
92
+ ![pose0](./images/000000_pose_concat.webp)
93
+ ![pose1](./images/000001_pose_concat.webp)
94
+ ![pose2](./images/000002_pose_concat.webp)
95
+ ![pose3](./images/000003_pose_concat.webp)
96
+ ![pose4](./images/000004_pose_concat.webp)
97
+ ### Depth
98
+ ![depth0](./images/000005_depth_concat.webp)
99
+ ![depth1](./images/000006_depth_concat.webp)
100
+ ![depth2](./images/000007_depth_concat.webp)
101
+ ![depth3](./images/000008_depth_concat.webp)
102
+ ![depth4](./images/000009_depth_concat.webp)
103
+ ### Canny
104
+ ![canny0](./images/000010_canny_concat.webp)
105
+ ![canny1](./images/000011_canny_concat.webp)
106
+ ![canny2](./images/000012_canny_concat.webp)
107
+ ![canny3](./images/000013_canny_concat.webp)
108
+ ![canny4](./images/000014_canny_concat.webp)
109
+ ### Lineart
110
+ ![lineart0](./images/000015_lineart_concat.webp)
111
+ ![lineart1](./images/000016_lineart_concat.webp)
112
+ ![lineart2](./images/000017_lineart_concat.webp)
113
+ ![lineart3](./images/000018_lineart_concat.webp)
114
+ ![lineart4](./images/000019_lineart_concat.webp)
115
+ ### AnimeLineart
116
+ ![animelineart0](./images/000020_anime_lineart_concat.webp)
117
+ ![animelineart1](./images/000021_anime_lineart_concat.webp)
118
+ ![animelineart2](./images/000022_anime_lineart_concat.webp)
119
+ ![animelineart3](./images/000023_anime_lineart_concat.webp)
120
+ ![animelineart4](./images/000024_anime_lineart_concat.webp)
121
+ ### Mlsd
122
+ ![mlsd0](./images/000025_mlsd_concat.webp)
123
+ ![mlsd1](./images/000026_mlsd_concat.webp)
124
+ ![mlsd2](./images/000027_mlsd_concat.webp)
125
+ ![mlsd3](./images/000028_mlsd_concat.webp)
126
+ ![mlsd4](./images/000029_mlsd_concat.webp)
127
+ ### Scribble
128
+ ![scribble0](./images/000030_scribble_concat.webp)
129
+ ![scribble1](./images/000031_scribble_concat.webp)
130
+ ![scribble2](./images/000032_scribble_concat.webp)
131
+ ![scribble3](./images/000033_scribble_concat.webp)
132
+ ![scribble4](./images/000034_scribble_concat.webp)
133
+ ### Hed
134
+ ![hed0](./images/000035_hed_concat.webp)
135
+ ![hed1](./images/000036_hed_concat.webp)
136
+ ![hed2](./images/000037_hed_concat.webp)
137
+ ![hed3](./images/000038_hed_concat.webp)
138
+ ![hed4](./images/000039_hed_concat.webp)
139
+ ### Pidi(Softedge)
140
+ ![pidi0](./images/000040_softedge_concat.webp)
141
+ ![pidi1](./images/000041_softedge_concat.webp)
142
+ ![pidi2](./images/000042_softedge_concat.webp)
143
+ ![pidi3](./images/000043_softedge_concat.webp)
144
+ ![pidi4](./images/000044_softedge_concat.webp)
145
+ ### Teed
146
+ ![ted0](./images/000045_ted_concat.webp)
147
+ ![ted1](./images/000046_ted_concat.webp)
148
+ ![ted2](./images/000047_ted_concat.webp)
149
+ ![ted3](./images/000048_ted_concat.webp)
150
+ ![ted4](./images/000049_ted_concat.webp)
151
+ ### Segment
152
+ ![segment0](./images/000050_seg_concat.webp)
153
+ ![segment1](./images/000051_seg_concat.webp)
154
+ ![segment2](./images/000052_seg_concat.webp)
155
+ ![segment3](./images/000053_seg_concat.webp)
156
+ ![segment4](./images/000054_seg_concat.webp)
157
+ ### Normal
158
+ ![normal0](./images/000055_normal_concat.webp)
159
+ ![normal1](./images/000056_normal_concat.webp)
160
+ ![normal2](./images/000057_normal_concat.webp)
161
+ ![normal3](./images/000058_normal_concat.webp)
162
+ ![normal4](./images/000059_normal_concat.webp)
163
+
164
+ ## Multi Control Visual Examples
165
+ ### Openpose + Canny
166
+ ![pose_canny0](./images/000007_openpose_canny_concat.webp)
167
+ ![pose_canny1](./images/000008_openpose_canny_concat.webp)
168
+ ![pose_canny2](./images/000009_openpose_canny_concat.webp)
169
+ ![pose_canny3](./images/000010_openpose_canny_concat.webp)
170
+ ![pose_canny4](./images/000011_openpose_canny_concat.webp)
171
+ ![pose_canny5](./images/000012_openpose_canny_concat.webp)
172
+
173
+ ### Openpose + Depth
174
+ ![pose_depth0](./images/000013_openpose_depth_concat.webp)
175
+ ![pose_depth1](./images/000014_openpose_depth_concat.webp)
176
+ ![pose_depth2](./images/000015_openpose_depth_concat.webp)
177
+ ![pose_depth3](./images/000016_openpose_depth_concat.webp)
178
+ ![pose_depth4](./images/000017_openpose_depth_concat.webp)
179
+ ![pose_depth5](./images/000018_openpose_depth_concat.webp)
180
+
181
+ ### Openpose + Scribble
182
+ ![pose_scribble0](./images/000001_openpose_scribble_concat.webp)
183
+ ![pose_scribble1](./images/000002_openpose_scribble_concat.webp)
184
+ ![pose_scribble2](./images/000003_openpose_scribble_concat.webp)
185
+ ![pose_scribble3](./images/000004_openpose_scribble_concat.webp)
186
+ ![pose_scribble4](./images/000005_openpose_scribble_concat.webp)
187
+ ![pose_scribble5](./images/000006_openpose_scribble_concat.webp)
188
+
189
+ ### Openpose + Normal
190
+ ![pose_normal0](./images/000019_openpose_normal_concat.webp)
191
+ ![pose_normal1](./images/000020_openpose_normal_concat.webp)
192
+ ![pose_normal2](./images/000021_openpose_normal_concat.webp)
193
+ ![pose_normal3](./images/000022_openpose_normal_concat.webp)
194
+ ![pose_normal4](./images/000023_openpose_normal_concat.webp)
195
+ ![pose_normal5](./images/000024_openpose_normal_concat.webp)
196
+
197
+ ### Openpose + Segment
198
+ ![pose_segment0](./images/000025_openpose_sam_concat.webp)
199
+ ![pose_segment1](./images/000026_openpose_sam_concat.webp)
200
+ ![pose_segment2](./images/000027_openpose_sam_concat.webp)
201
+ ![pose_segment3](./images/000028_openpose_sam_concat.webp)
202
+ ![pose_segment4](./images/000029_openpose_sam_concat.webp)
203
+ ![pose_segment5](./images/000030_openpose_sam_concat.webp)
config.json ADDED
@@ -0,0 +1,57 @@
1
+ {
2
+ "_class_name": "ControlNetModel",
3
+ "_diffusers_version": "0.20.0.dev0",
4
+ "act_fn": "silu",
5
+ "addition_embed_type": "text_time",
6
+ "addition_embed_type_num_heads": 64,
7
+ "addition_time_embed_dim": 256,
8
+ "attention_head_dim": [
9
+ 5,
10
+ 10,
11
+ 20
12
+ ],
13
+ "block_out_channels": [
14
+ 320,
15
+ 640,
16
+ 1280
17
+ ],
18
+ "class_embed_type": null,
19
+ "conditioning_channels": 3,
20
+ "conditioning_embedding_out_channels": [
21
+ 16,
22
+ 32,
23
+ 96,
24
+ 256
25
+ ],
26
+ "controlnet_conditioning_channel_order": "rgb",
27
+ "cross_attention_dim": 2048,
28
+ "down_block_types": [
29
+ "DownBlock2D",
30
+ "CrossAttnDownBlock2D",
31
+ "CrossAttnDownBlock2D"
32
+ ],
33
+ "downsample_padding": 1,
34
+ "encoder_hid_dim": null,
35
+ "encoder_hid_dim_type": null,
36
+ "flip_sin_to_cos": true,
37
+ "freq_shift": 0,
38
+ "global_pool_conditions": false,
39
+ "in_channels": 4,
40
+ "layers_per_block": 2,
41
+ "mid_block_scale_factor": 1,
42
+ "norm_eps": 1e-05,
43
+ "norm_num_groups": 32,
44
+ "num_attention_heads": null,
45
+ "num_class_embeds": null,
46
+ "only_cross_attention": false,
47
+ "projection_class_embeddings_input_dim": 2816,
48
+ "resnet_time_scale_shift": "default",
49
+ "transformer_layers_per_block": [
50
+ 1,
51
+ 2,
52
+ 10
53
+ ],
54
+ "upcast_attention": null,
55
+ "use_linear_projection": true,
56
+ "num_control_type": 8
57
+ }
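The `projection_class_embeddings_input_dim` value in this config can be sanity-checked against the other fields. The sketch below assumes the usual SDXL micro-conditioning layout, which is not stated in this diff: six additional time ids, each embedded to `addition_time_embed_dim` channels, concatenated with a 1280-dimensional pooled text embedding. The constants `NUM_TIME_IDS` and `POOLED_TEXT_EMBED_DIM` and the helper `expected_projection_dim` are illustrative names.

```python
import json

# Subset of the config.json above, with values copied from the diff.
CONFIG_TEXT = """
{
  "addition_embed_type": "text_time",
  "addition_time_embed_dim": 256,
  "block_out_channels": [320, 640, 1280],
  "cross_attention_dim": 2048,
  "projection_class_embeddings_input_dim": 2816,
  "transformer_layers_per_block": [1, 2, 10],
  "num_control_type": 8
}
"""

# Assumptions not present in the diff: SDXL-style conditioning passes
# 6 time ids and a 1280-dimensional pooled text embedding.
NUM_TIME_IDS = 6
POOLED_TEXT_EMBED_DIM = 1280


def expected_projection_dim(config):
    """Width of the concatenated time-id embeddings plus the pooled text embedding."""
    return NUM_TIME_IDS * config["addition_time_embed_dim"] + POOLED_TEXT_EMBED_DIM


config = json.loads(CONFIG_TEXT)

if __name__ == "__main__":
    print("expected:", expected_projection_dim(config))
    print("in config:", config["projection_class_embeddings_input_dim"])
```

Under these assumptions the two numbers agree (6 × 256 + 1280 = 2816), which is consistent with `addition_embed_type` being `text_time`.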
images/000000_pose_concat.webp ADDED
images/000001_openpose_scribble_concat.webp ADDED
images/000001_pose_concat.webp ADDED
images/000002_openpose_scribble_concat.webp ADDED
images/000002_pose_concat.webp ADDED
images/000003_openpose_scribble_concat.webp ADDED
images/000003_pose_concat.webp ADDED
images/000004_openpose_scribble_concat.webp ADDED
images/000004_pose_concat.webp ADDED
images/000005_depth_concat.webp ADDED
images/000005_openpose_scribble_concat.webp ADDED
images/000006_depth_concat.webp ADDED
images/000006_openpose_scribble_concat.webp ADDED
images/000007_depth_concat.webp ADDED
images/000007_openpose_canny_concat.webp ADDED
images/000008_depth_concat.webp ADDED
images/000008_openpose_canny_concat.webp ADDED
images/000009_depth_concat.webp ADDED
images/000009_openpose_canny_concat.webp ADDED
images/000010_canny_concat.webp ADDED
images/000010_openpose_canny_concat.webp ADDED
images/000011_canny_concat.webp ADDED
images/000011_openpose_canny_concat.webp ADDED
images/000012_canny_concat.webp ADDED
images/000012_openpose_canny_concat.webp ADDED
images/000013_canny_concat.webp ADDED
images/000013_openpose_depth_concat.webp ADDED
images/000014_canny_concat.webp ADDED
images/000014_openpose_depth_concat.webp ADDED
images/000015_lineart_concat.webp ADDED
images/000015_openpose_depth_concat.webp ADDED
images/000016_lineart_concat.webp ADDED
images/000016_openpose_depth_concat.webp ADDED
images/000017_lineart_concat.webp ADDED
images/000017_openpose_depth_concat.webp ADDED
images/000018_lineart_concat.webp ADDED
images/000018_openpose_depth_concat.webp ADDED
images/000019_lineart_concat.webp ADDED
images/000019_openpose_normal_concat.webp ADDED
images/000020_anime_lineart_concat.webp ADDED
images/000020_openpose_normal_concat.webp ADDED
images/000021_openpose_normal_concat.webp ADDED
images/000022_anime_lineart_concat.webp ADDED
images/000023_anime_lineart_concat.webp ADDED
images/000024_openpose_normal_concat.webp ADDED
images/000025_mlsd_concat.webp ADDED
images/000026_mlsd_concat.webp ADDED