This repository is a test project comparing different loss weighting schemes for SDXL training.

## Test Cases

This repository includes test models using different weighting schemes:

1. **test_normal_weight**
- Baseline model using standard weighting

2. **test_edm2_weighting**
- New loss weighting scheme
- Implementation by A

3. **test_min_snr_1** (incomplete)
- Baseline model with `--min_snr_gamma = 1` (see the weighting sketch after this list)

4. …
- `--debiased_estimation_loss`
- `--scale_v_pred_loss_like_noise_pred`

5. **test_edm2_weight_new** (incomplete)
- New loss weighting scheme
- Implementation by madman404
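
For context, Min-SNR-γ weighting (Hang et al., 2023) clips each timestep's loss weight by its signal-to-noise ratio, which is what `--min_snr_gamma = 1` enables in kohya-ss sd-scripts. The following is a minimal sketch of the idea only; function and variable names are hypothetical, not the actual sd-scripts code:

```python
import torch

def min_snr_weight(snr: torch.Tensor, gamma: float = 1.0,
                   v_prediction: bool = False) -> torch.Tensor:
    """Min-SNR-gamma loss weight (Hang et al., 2023).

    snr is the signal-to-noise ratio of each sampled timestep,
    SNR_t = alpha_t^2 / sigma_t^2 under a DDPM-style schedule.
    """
    clipped = torch.minimum(snr, torch.full_like(snr, gamma))
    # epsilon-prediction: w_t = min(SNR_t, gamma) / SNR_t
    # v-prediction:       w_t = min(SNR_t, gamma) / (SNR_t + 1)
    return clipped / (snr + 1) if v_prediction else clipped / snr

# usage sketch: loss = (min_snr_weight(snr, gamma=1.0) * per_sample_mse).mean()
```

With γ = 1 the weight min(SNR, 1)/SNR shrinks toward 0 as SNR grows, so low-noise (high-SNR) timesteps are down-weighted aggressively.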

## Training Parameters
For detailed parameters, please refer to the `.toml` files in each model directory.
Training uses the `sdxl_train.py` in each model directory (test_edm2_weighting additionally uses `t.py`, and test_edm2_weight_new additionally uses `lossweightMLP.py`); a sketch of the EDM2-style learned weighting idea follows below.
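
EDM2 (Karras et al., 2024) learns a per-noise-level log-uncertainty u(σ) and minimizes loss/exp(u) + u. The sketch below illustrates that idea only; it is an assumption about the general approach, not the contents of this repository's `lossweightMLP.py`:

```python
import torch
import torch.nn as nn

class LossWeightMLP(nn.Module):
    """Hypothetical EDM2-style learned loss weighting.

    Predicts a log-uncertainty u(sigma) per noise level and returns
    loss / exp(u) + u, so noise levels with persistently high loss
    are automatically down-weighted during training.
    """

    def __init__(self, hidden: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(1, hidden),
            nn.SiLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, sigma: torch.Tensor,
                per_sample_loss: torch.Tensor) -> torch.Tensor:
        # u(sigma): one scalar per sample, conditioned on log(noise level)
        u = self.mlp(sigma.log().unsqueeze(-1)).squeeze(-1)
        return (per_sample_loss / u.exp() + u).mean()
```

The weighting network is optimized jointly with the diffusion model; at the optimum u approaches the log of the expected loss at each noise level, which is what makes the weighting self-calibrating.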

Common parameters:
- Samples: 57,373
- …
- Batch size: 8
- Gradient accumulation steps: 4
- Optimizer: Adafactor (stochastic rounding)
- Training time: 13.5 GPU hours (RTX4090) per trial
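
For scale: with batch size 8 and 4 gradient-accumulation steps, the effective batch size is 8 × 4 = 32, so one pass over the 57,373 samples takes roughly 57,373 / 32 ≈ 1,793 optimizer steps.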

## Dataset Information
The dataset used for testing consists of:
- ~53,000 images extracted from danbooru2023 based on specific artist styles (approximately 300 artists)
- ~4,000 carefully selected danbooru images for standardization

**Note**: As this dataset is a subset of my regular training data focused on specific artists, the model's generalization might be limited. A wildcard file (wildcard_style.txt) containing the list of included artists is provided for reference.

### Tag Format
The training follows the tag format from [Kohaku-XL-Epsilon](https://huggingface.co/KBlueLeaf/Kohaku-XL-Epsilon):

`<1girl/1boy/1other/...>, <character>, <series>, <artists>, <general tags>, <quality tags>, <year tags>, <meta tags>, <rating tags>`
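
As an illustration only (the tags below are invented for the example, following Kohaku-XL-Epsilon's conventions as I understand them), a prompt in this format might look like:

```
1girl, hatsune miku, vocaloid, torino aqua, smile, looking at viewer, masterpiece, newest, absurdres, safe
```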

### Style Prompts
The following style prompts from Kohaku-XL-Epsilon might be compatible (untested):
```
ask \(askzy\), torino aqua, migolu, (jiu ye sang:1.1), (rumoon:0.9), (mizumi zumi:1.1)
```
```
ciloranko, maccha \(mochancc\), lobelia \(saclia\), migolu,
ask \(askzy\), wanke, (jiu ye sang:1.1), (rumoon:0.9), (mizumi zumi:1.1)
```
```
shiro9jira, ciloranko, ask \(askzy\), (tianliang duohe fangdongye:0.8)
```
```
(azuuru:1.1), (torino aqua:1.2), (azuuru:1.1), kedama milk,
fuzichoco, ask \(askzy\), chen bin, atdan, hito, mignon
```
```
ask \(askzy\), torino aqua, migolu
```

*This model card was written with the assistance of Claude 3.5 Sonnet.*