---
license: other
license_name: fair-ai-public-license-1.0-sd
license_link: https://freedevproject.org/faipl-1.0-sd/
language:
- en
base_model:
- Laxhar/noobai-XL-1.0
pipeline_tag: text-to-image
library_name: diffusers
tags:
- safetensors
- diffusers
- stable-diffusion
- stable-diffusion-xl
- art
---

# V-Prediction Loss Weighting Test

## Notice

This repository contains personal experimental records. No guarantees are made regarding accuracy or reproducibility.

## Overview

This repository is a test project comparing different loss weighting schemes for Stable Diffusion v-prediction training.

## Environment

- [sd-scripts](https://github.com/kohya-ss/sd-scripts), dev branch
- Commit hash: `6adb69b`, with local modifications

## Test Cases

This repository includes test models using different weighting schemes (rough sketches of the schemes are given in the appendix at the end of this card):

1. **test_normal_weight** - Baseline model using standard weighting
2. **test_edm2_weighting** - New loss weighting scheme, implementation by A
3. **test_min_snr_1** (incomplete) - Baseline model with `--min_snr_gamma=1`
4. **test_debias_scale-like** (incomplete) - Baseline model with additional parameters:
   - `--debiased_estimation_loss`
   - `--scale_v_pred_loss_like_noise_pred`
5. **test_edm2_weight_new** (incomplete) - New loss weighting scheme, implementation by madman404

## Training Parameters

For detailed parameters, please refer to the `.toml` files in each model directory. Each model directory also contains the `sdxl_train.py` used for that run (plus `t.py` for test_edm2_weighting, and `lossweightMLP.py` for test_edm2_weight_new).

Common parameters:

- Samples: 57,373
- Epochs: 3
- U-Net only
- Learning rate: 3.5e-6
- Batch size: 8
- Gradient accumulation steps: 4
- Optimizer: Adafactor (stochastic rounding)
- Training time: 13.5 GPU-hours (RTX 4090) per trial

## Dataset Information

The dataset used for testing consists of:

- ~53,000 images extracted from danbooru2023 based on specific artist styles (approximately 300 artists)
- ~4,000 carefully selected danbooru images for standardization

**Note**: As this dataset is a subset of my regular training data focused on specific artists, the model's generalization might be limited. A wildcard file (`wildcard_style.txt`) containing the list of included artists is provided for reference.

### Tag Format

The training follows the tag format from [Kohaku-XL-Epsilon](https://huggingface.co/KBlueLeaf/Kohaku-XL-Epsilon): `<1girl/1boy/1other/...>, <character>, <series>, <artist>, <general tags>, <quality tags>, <year tags>, <meta tags>, <rating tags>`

### Style Prompts

The following style prompts from Kohaku-XL-Epsilon might be compatible (untested):

```
ask \(askzy\), torino aqua, migolu, (jiu ye sang:1.1), (rumoon:0.9), (mizumi zumi:1.1)
```

```
ciloranko, maccha \(mochancc\), lobelia \(saclia\), migolu, ask \(askzy\), wanke, (jiu ye sang:1.1), (rumoon:0.9), (mizumi zumi:1.1)
```

```
shiro9jira, ciloranko, ask \(askzy\), (tianliang duohe fangdongye:0.8)
```

```
(azuuru:1.1), (torino aqua:1.2), (azuuru:1.1), kedama milk, fuzichoco, ask \(askzy\), chen bin, atdan, hito, mignon
```

```
ask \(askzy\), torino aqua, migolu
```

*This model card was written with the assistance of Claude 3.5 Sonnet.*
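
## Appendix: Weighting Scheme Sketches

For reference, the following is a minimal sketch of how the fixed per-timestep loss weights compared above are typically computed for v-prediction training. It follows the published formulas (Min-SNR-gamma, Hang et al. 2023; debiased estimation) rather than sd-scripts' exact code, and the function and variable names are illustrative only.

```python
import torch


def timestep_loss_weights(alphas_cumprod: torch.Tensor,
                          timesteps: torch.Tensor,
                          scheme: str,
                          min_snr_gamma: float = 1.0) -> torch.Tensor:
    """Per-sample loss weights for v-prediction training (illustrative)."""
    abar = alphas_cumprod[timesteps]
    snr = abar / (1.0 - abar)  # signal-to-noise ratio at each sampled timestep
    if scheme == "normal":
        # test_normal_weight: uniform weighting over timesteps.
        return torch.ones_like(snr)
    if scheme == "min_snr":
        # test_min_snr_1: Min-SNR-gamma; for v-prediction the clipped
        # SNR is divided by (SNR + 1) instead of by SNR.
        return torch.clamp(snr, max=min_snr_gamma) / (snr + 1.0)
    if scheme == "debias_scale_like":
        # test_debias_scale-like: debiased estimation (1/sqrt(SNR))
        # combined with scaling the v-pred loss like a noise-pred loss
        # (SNR / (SNR + 1)); the product simplifies to sqrt(SNR)/(SNR+1).
        return torch.sqrt(snr) / (snr + 1.0)
    raise ValueError(f"unknown scheme: {scheme}")


# Usage: multiply the per-sample MSE before reduction, e.g.
#   per_sample = ((v_pred - v_target) ** 2).mean(dim=(1, 2, 3))
#   loss = (timestep_loss_weights(ac, t, "min_snr") * per_sample).mean()
```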
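The two EDM2 test cases use a learned weighting instead of a fixed one. The sketch below shows the uncertainty-based adaptive weighting from the EDM2 paper (Karras et al., 2024) that such schemes build on; it is not the repository's `t.py` or `lossweightMLP.py`, and the module name, conditioning, and architecture here are hypothetical.

```python
import torch
import torch.nn as nn


class AdaptiveLossWeight(nn.Module):
    """EDM2-style uncertainty weighting (hypothetical sketch).

    Learns a log-weight u(t) and minimizes loss / exp(u(t)) + u(t),
    which self-balances the loss scale across noise levels.
    """

    def __init__(self, num_timesteps: int = 1000, hidden: int = 256):
        super().__init__()
        self.num_timesteps = num_timesteps
        self.net = nn.Sequential(
            nn.Linear(1, hidden),
            nn.SiLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, per_sample_loss: torch.Tensor,
                timesteps: torch.Tensor) -> torch.Tensor:
        # Normalize timesteps to [0, 1] as a simple conditioning signal
        # (EDM2 itself conditions an MLP on Fourier features of log-sigma).
        t = timesteps.float().unsqueeze(1) / self.num_timesteps
        u = self.net(t).squeeze(1)  # learned log-weight u(t)
        return (per_sample_loss / u.exp() + u).mean()
```

The weighting network is trained jointly with the U-Net by the same backward pass; because `u` also appears as an additive penalty, timesteps with consistently large loss receive a larger `u` and hence a smaller effective weight.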