saltknox committed on
Commit
629a3bd
•
1 Parent(s): aaeb7e2

Update README.md

Files changed (1)
  1. README.md +9 -3
README.md CHANGED
@@ -10,35 +10,40 @@ tags:
10
  ---
11
 
12
  ## <a name="Introduction"></a>📖 Introduction
 
13
  We provide Kolors-Inpainting inference code and weights which were initialized with [Kolors-Basemodel](https://huggingface.co/Kwai-Kolors/Kolors). Examples of Kolors-Inpainting results are as follows:
14
 
15
  <img src="imgs/example.png">
16
 
17
  <br>
18
 
19
- **Our improvements**
20
 
21
  - For inpainting, the UNet has 5 additional input channels (4 for the encoded masked image and 1 for the mask itself). The weights for the encoded masked-image channels were initialized from the non-inpainting checkpoint, while the weights for the mask channel were zero-initialized.
22
  - To improve the robustness of the inpainting model, we adopt a more diverse strategy for generating masks, including random masks, subject segmentation masks, rectangular masks, and masks based on dilation operations.
23
 
 
24
  <br>
25
 
26
 
27
  ## <a name="Evaluation"></a>📊 Evaluation
28
- For evaluation, we created a test set of 200 masked images and text prompts. We invited several image experts to provide fair ratings for the generated results of different models. The experts rated the generated images based on four criteria: visual appeal, text faithfulness, conditional controllability, and overall satisfaction. Conditional controllability measures the sense of boundaries of the inpainting results, while the other criteria follow the evaluation criteria of the BaseModel. The specific results are summarized in the table below, where Kolors-Inpainting achieved the highest overall satisfaction score.
29
 
30
- | Model | Average Overall Satisfaction | Average Conditional Controllability | Average Visual Appeal | Average Text Faithfulness |
31
  | :-----------------: | :-----------: | :-----------: | :-----------: | :-----------: |
32
  | SDXL-Inpainting | 2.573 | 3.795 | 3.000 | 4.299 |
33
  | **Kolors-Inpainting** | **3.493** | **4.796** | **3.855** | **4.346** |
34
 
 
35
  <br>
36
 
37
  <img src="imgs/contrast.png">
38
 
 
39
  <font color=gray style="font-size:12px"> *Kolors-Inpainting employs Chinese prompts, while SDXL-Inpainting uses English prompts.*</font>
40
 
41
 
 
42
  ## <a name="Usage"></a>🛠️ Usage
43
 
44
  ### Requirements
@@ -78,3 +83,4 @@ python inpainting/sample_inpainting.py ./inpainting/asset/4.png ./inpainting/ass
78
  ```
79
 
80
  <br>
 
 
10
  ---
11
 
12
  ## <a name="Introduction"></a>📖 Introduction
13
+
14
  We provide Kolors-Inpainting inference code and weights which were initialized with [Kolors-Basemodel](https://huggingface.co/Kwai-Kolors/Kolors). Examples of Kolors-Inpainting results are as follows:
15
 
16
  <img src="imgs/example.png">
17
 
18
  <br>
19
 
20
+ **Model details**
21
 
22
  - For inpainting, the UNet has 5 additional input channels (4 for the encoded masked image and 1 for the mask itself). The weights for the encoded masked-image channels were initialized from the non-inpainting checkpoint, while the weights for the mask channel were zero-initialized.
23
  - To improve the robustness of the inpainting model, we adopt a more diverse strategy for generating masks, including random masks, subject segmentation masks, rectangular masks, and masks based on dilation operations.
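
The channel layout described in the first bullet can be sketched as follows. This is a minimal NumPy illustration, not the authors' training code. It assumes the base input convolution has 4 latent channels, that "initialized from the non-inpainting checkpoint" means copying those latent-channel weights, and that the channels are ordered latents, mask, masked-image latents. The function name `expand_conv_in_weight` and the shape `(320, 4, 3, 3)` are hypothetical choices for the example.

```python
import numpy as np


def expand_conv_in_weight(base_weight: np.ndarray, mask_channels: int = 1) -> np.ndarray:
    """Build a 9-channel input-conv weight from a 4-channel one.

    base_weight has shape (out_ch, latent_ch, kh, kw) and stands in for the
    non-inpainting checkpoint. Assumed channel order of the result:
    [latents | mask | masked-image latents].
    """
    out_ch, latent_ch, kh, kw = base_weight.shape
    # Mask channel: zero-initialized, so it contributes nothing at the start.
    mask_weight = np.zeros((out_ch, mask_channels, kh, kw), dtype=base_weight.dtype)
    # Masked-image channels: copied from the base weights (assumption).
    masked_image_weight = base_weight.copy()
    return np.concatenate([base_weight, mask_weight, masked_image_weight], axis=1)


# Illustrative shapes only; the random values replace real checkpoint weights.
rng = np.random.default_rng(0)
base = rng.standard_normal((320, 4, 3, 3)).astype(np.float32)
expanded = expand_conv_in_weight(base)
print(expanded.shape)
```

Zero-initializing the mask channel means the mask input has no effect on the first forward pass, and its influence is learned during fine-tuning.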
24
 
25
+
26
  <br>
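
Two of the mask strategies listed under Model details, rectangular masks and dilation-based masks, can be sketched with NumPy as shown here. This is only an illustration under assumptions: the size ranges, the square structuring element, and the helper names `rectangular_mask` and `dilate` are hypothetical and do not come from the Kolors code. Random free-form masks and subject segmentation masks are not covered.

```python
import numpy as np


def rectangular_mask(height: int, width: int, rng: np.random.Generator) -> np.ndarray:
    """Return a binary mask with one random rectangle set to 1."""
    # Rectangle covers between a quarter and a half of each side (assumed range).
    h = int(rng.integers(height // 4, height // 2 + 1))
    w = int(rng.integers(width // 4, width // 2 + 1))
    top = int(rng.integers(0, height - h + 1))
    left = int(rng.integers(0, width - w + 1))
    mask = np.zeros((height, width), dtype=np.uint8)
    mask[top:top + h, left:left + w] = 1
    return mask


def dilate(mask: np.ndarray, radius: int = 1) -> np.ndarray:
    """Binary dilation with a (2 * radius + 1) square structuring element."""
    h, w = mask.shape
    padded = np.pad(mask, radius, mode="constant")
    out = np.zeros_like(mask)
    # OR together every shifted copy of the mask inside the square window.
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out


rng = np.random.default_rng(0)
mask = rectangular_mask(64, 64, rng)
grown = dilate(mask, radius=2)
```

Dilating a mask enlarges the region to be repainted, which gives the model training examples where the mask boundary does not hug the object exactly.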
27
 
28
 
29
  ## <a name="Evaluation"></a>📊 Evaluation
30
+ For evaluation, we created a test set comprising 200 masked images and text prompts. We invited several image experts to provide unbiased ratings for the generated results of different models. The experts assessed the generated images based on four criteria: visual appeal, text faithfulness, inpainting artifacts, and overall satisfaction. Inpainting artifacts measure the perceptual boundaries in the inpainting results, while the other criteria adhere to the evaluation standards of the BaseModel. The specific results are summarized in the table below, where Kolors-Inpainting achieved the highest overall satisfaction score.
31
 
32
+ | Model | Average Overall Satisfaction | Average Inpainting Artifacts | Average Visual Appeal | Average Text Faithfulness |
33
  | :-----------------: | :-----------: | :-----------: | :-----------: | :-----------: |
34
  | SDXL-Inpainting | 2.573 | 3.795 | 3.000 | 4.299 |
35
  | **Kolors-Inpainting** | **3.493** | **4.796** | **3.855** | **4.346** |
36
 
37
+
38
  <br>
39
 
40
  <img src="imgs/contrast.png">
41
 
42
+
43
  <font color=gray style="font-size:12px"> *Kolors-Inpainting employs Chinese prompts, while SDXL-Inpainting uses English prompts.*</font>
44
 
45
 
46
+
47
  ## <a name="Usage"></a>🛠️ Usage
48
 
49
  ### Requirements
 
83
  ```
84
 
85
  <br>
86
+