V-Prediction Loss Weighting Test
Notice
This repository contains personal experimental records. No guarantees are made regarding accuracy or reproducibility.
These models are for verification purposes only and are not intended for general use.
Overview
This repository is a test project comparing different loss weighting schemes for Stable Diffusion v-prediction training.
Environment
- sd-scripts dev branch
- Commit hash: 6adb69b, with local modifications
Test Cases
This repository includes test models using different weighting schemes:
test_normal_weight
- Baseline model using standard weighting
test_edm2_weighting
- New loss weighting scheme
- Implementation by A
test_min_snr_1
- Baseline model with --min_snr_gamma = 1
test_debias_scale-like
- Baseline model with additional parameters:
--debiased_estimation_loss
--scale_v_pred_loss_like_noise_pred
test_edm2_weight_new
- New loss weighting scheme
- Implementation by madman404
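For orientation, the per-timestep weights behind the non-EDM2 test cases can be sketched as functions of the signal-to-noise ratio (SNR). This is a minimal sketch, assuming the formulas commonly associated with these sd-scripts options; the function names are hypothetical, the modified commit used here may differ, and the debiased weight is shown in its 1/sqrt(SNR) form, which may not be the variant the branch applies under v-prediction.

```python
import math


def min_snr_weight_v_pred(snr: float, gamma: float) -> float:
    """Min-SNR weight for v-prediction: min(SNR, gamma) / (SNR + 1). Assumed form."""
    return min(snr, gamma) / (snr + 1.0)


def scale_like_noise_pred_weight(snr: float) -> float:
    """Rescales a v-prediction loss toward a noise-prediction loss: SNR / (SNR + 1). Assumed form."""
    return snr / (snr + 1.0)


def debiased_estimation_weight(snr: float, snr_cap: float = 1000.0) -> float:
    """Debiased estimation weight 1 / sqrt(SNR), with SNR capped (cap value is an assumption)."""
    return 1.0 / math.sqrt(min(snr, snr_cap))


if __name__ == "__main__":
    # Compare the weights at a noisy, a mid, and a nearly clean timestep.
    for snr in (0.01, 1.0, 100.0):
        print(
            f"snr={snr:>6}: "
            f"min_snr(gamma=1)={min_snr_weight_v_pred(snr, 1.0):.4f}, "
            f"scale_like_noise={scale_like_noise_pred_weight(snr):.4f}, "
            f"debiased={debiased_estimation_weight(snr):.4f}"
        )
```

With gamma = 1, the Min-SNR weight never exceeds 0.5 and falls off at both very low and very high SNR, whereas the other two weights are monotonic in SNR.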
Training Parameters
For detailed parameters, refer to the .toml files in each model directory.
Each model is trained with the sdxl_train.py in its own model directory. test_edm2_weighting additionally uses t.py, and test_edm2_weight_new additionally uses lossweightMLP.py.
Common parameters:
- Samples: 57,373
- Epochs: 3
- U-Net only
- Learning rate: 3.5e-6
- Batch size: 8
- Gradient accumulation steps: 4
- Optimizer: Adafactor (stochastic rounding)
- Training time: 13.5 GPU hours (RTX4090) per trial
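The common parameters above imply an effective batch size and a rough optimizer step count. The arithmetic below is only an estimate, assuming one pass over each sample per epoch with no dataset repeats; aspect-ratio bucketing can add a few partial batches, so the actual step count in the logs may be slightly higher.

```python
import math

samples = 57_373
epochs = 3
batch_size = 8
grad_accum_steps = 4

# Number of samples contributing to each optimizer update.
effective_batch = batch_size * grad_accum_steps

# Dataloader batches per epoch, then optimizer updates per epoch.
batches_per_epoch = math.ceil(samples / batch_size)
steps_per_epoch = math.ceil(batches_per_epoch / grad_accum_steps)
total_steps = steps_per_epoch * epochs

print(f"effective batch size: {effective_batch}")   # 32
print(f"optimizer steps per epoch: {steps_per_epoch}")  # 1793
print(f"total optimizer steps: {total_steps}")      # 5379
```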
Dataset Information
The dataset used for testing consists of:
- ~53,000 images extracted from danbooru2023 based on specific artist styles (approximately 300 artists)
- ~4,000 carefully selected danbooru images for standardization
Note: As this dataset is a subset of my regular training data focused on specific artists, the model's generalization might be limited. A wildcard file (wildcard_style.txt) containing the list of included artists is provided for reference.
Tag Format
The training follows the tag format from Kohaku-XL-Epsilon:
<1girl/1boy/1other/...>, <character>, <series>, <artists>, <general tags>, <quality tags>, <year tags>, <meta tags>, <rating tags>
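As an illustration of this ordering, a prompt can be assembled from tag groups as in the sketch below. The group keys, the helper name `build_prompt`, and the example tags are all hypothetical placeholders chosen for this example; only the group order comes from the format above.

```python
# Group order taken from the tag format above; the key names are hypothetical.
TAG_ORDER = (
    "subject",    # 1girl / 1boy / 1other / ...
    "character",
    "series",
    "artists",
    "general",
    "quality",
    "year",
    "meta",
    "rating",
)


def build_prompt(groups: dict[str, list[str]]) -> str:
    """Join tag groups in the documented order, skipping groups that are missing or empty."""
    parts: list[str] = []
    for key in TAG_ORDER:
        parts.extend(groups.get(key, []))
    return ", ".join(parts)


if __name__ == "__main__":
    # Placeholder tags for illustration only.
    example = {
        "general": ["solo", "smile"],
        "subject": ["1girl"],
        "artists": ["ciloranko"],
        "quality": ["masterpiece"],
    }
    print(build_prompt(example))  # 1girl, ciloranko, solo, smile, masterpiece
```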
Style Prompts
The following style prompts from Kohaku-XL-Epsilon might be compatible (untested):
ask \(askzy\), torino aqua, migolu, (jiu ye sang:1.1), (rumoon:0.9), (mizumi zumi:1.1)
ciloranko, maccha \(mochancc\), lobelia \(saclia\), migolu,
ask \(askzy\), wanke, (jiu ye sang:1.1), (rumoon:0.9), (mizumi zumi:1.1)
shiro9jira, ciloranko, ask \(askzy\), (tianliang duohe fangdongye:0.8)
(azuuru:1.1), (torino aqua:1.2), (azuuru:1.1), kedama milk,
fuzichoco, ask \(askzy\), chen bin, atdan, hito, mignon
ask \(askzy\), torino aqua, migolu
This model card was written with the assistance of Claude 3.5 Sonnet.