V-Prediction Loss Weighting Test

Notice

This repository contains personal experimental records. No guarantees are made regarding accuracy or reproducibility.
These models are for verification purposes only and are not intended for general use.

Overview

This repository is a test project comparing different loss weighting schemes for Stable Diffusion v-prediction training.

Environment

  • sd-scripts dev branch
    • Commit hash: [6adb69b], with local modifications

Test Cases

This repository includes test models using different weighting schemes:

  1. test_normal_weight

    • Baseline model using standard weighting
  2. test_edm2_weighting

    • New loss weighting scheme
    • Implementation by A
  3. test_min_snr_1

    • Baseline model with --min_snr_gamma = 1
  4. test_debias_scale-like

    • Baseline model with additional parameters:
      • --debiased_estimation_loss
      • --scale_v_pred_loss_like_noise_pred
  5. test_edm2_weight_new

    • New loss weighting scheme
    • Implementation by madman404
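For reference, the per-timestep weights behind the flags in test cases 3 and 4 can be sketched as functions of the signal-to-noise ratio (SNR). This is a minimal sketch, not code from this repository: the function names are hypothetical, the Min-SNR form is the v-prediction variant from the Min-SNR paper, and the debiased-estimation form (1/sqrt(SNR) with the SNR capped at 1000) is an assumption that may not match the modified commit used here. The two EDM2-style schemes are not covered.

```python
import math


def min_snr_weight_vpred(snr: float, gamma: float) -> float:
    """Min-SNR-gamma weight for v-prediction: min(SNR, gamma) / (SNR + 1)."""
    return min(snr, gamma) / (snr + 1.0)


def scale_like_noise_pred(snr: float) -> float:
    """Scale that makes a v-prediction loss resemble a noise-prediction loss: SNR / (SNR + 1)."""
    return snr / (snr + 1.0)


def debiased_estimation_weight(snr: float, snr_cap: float = 1000.0) -> float:
    """Assumed debiased-estimation weight: 1 / sqrt(min(SNR, cap))."""
    return 1.0 / math.sqrt(min(snr, snr_cap))


# Compare the schemes at a noisy, a balanced, and a nearly clean timestep.
for snr in (0.01, 1.0, 100.0):
    min_snr_1 = min_snr_weight_vpred(snr, gamma=1.0)
    # Assumption: when both flags are set, the two factors multiply.
    debias_scale = debiased_estimation_weight(snr) * scale_like_noise_pred(snr)
    print(f"SNR={snr:>6}: min_snr_1={min_snr_1:.4f}, debias*scale={debias_scale:.4f}")
```

Under these assumptions, both schemes down-weight nearly clean timesteps (high SNR) relative to the unweighted baseline, but they differ in how strongly they treat very noisy timesteps.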

Training Parameters

For detailed parameters, refer to the .toml files in each model directory. Each model was trained with the sdxl_train.py in its own model directory; test_edm2_weighting additionally uses t.py, and test_edm2_weight_new additionally uses lossweightMLP.py.

Common parameters:

  • Samples: 57,373
  • Epochs: 3
  • U-Net only
  • Learning rate: 3.5e-6
  • Batch size: 8
  • Gradient accumulation steps: 4
  • Optimizer: Adafactor (stochastic rounding)
  • Training time: 13.5 GPU hours (RTX4090) per trial
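As a rough check on scale, the common parameters above imply the following effective batch size and number of optimizer updates. This is a back-of-the-envelope sketch that assumes a single GPU, no dataset repeats, and no extra partial batches from aspect-ratio bucketing; the actual step count in the training logs may differ.

```python
import math

# Values taken from the common parameters listed above.
samples = 57_373
epochs = 3
batch_size = 8
grad_accum_steps = 4

# Samples contributing to each optimizer update.
effective_batch = batch_size * grad_accum_steps

# Forward/backward passes per epoch, then optimizer updates per epoch.
batches_per_epoch = math.ceil(samples / batch_size)
updates_per_epoch = math.ceil(batches_per_epoch / grad_accum_steps)
total_updates = updates_per_epoch * epochs

print(effective_batch, updates_per_epoch, total_updates)  # 32 1793 5379
```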

Dataset Information

The dataset used for testing consists of:

  • ~53,000 images extracted from danbooru2023 based on specific artist styles (approximately 300 artists)
  • ~4,000 carefully selected danbooru images for standardization

Note: As this dataset is a subset of my regular training data focused on specific artists, the model's generalization might be limited. A wildcard file (wildcard_style.txt) containing the list of included artists is provided for reference.

Tag Format

The training follows the tag format from Kohaku-XL-Epsilon: <1girl/1boy/1other/...>, <character>, <series>, <artists>, <general tags>, <quality tags>, <year tags>, <meta tags>, <rating tags>
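The tag order above can be illustrated with a small helper that joins tag groups in that sequence. This is only a sketch: the function name, the group keys, and the example tags are hypothetical placeholders, not taken from the training data or from Kohaku-XL-Epsilon's tooling.

```python
# Group order follows the tag format described above.
TAG_GROUP_ORDER = [
    "subject",    # 1girl / 1boy / 1other / ...
    "character",
    "series",
    "artists",
    "general",
    "quality",
    "year",
    "meta",
    "rating",
]


def build_caption(groups: dict[str, list[str]]) -> str:
    """Join tag groups in the documented order, skipping missing groups."""
    parts: list[str] = []
    for name in TAG_GROUP_ORDER:
        parts.extend(groups.get(name, []))
    return ", ".join(parts)


# Placeholder tags for illustration only.
example = {
    "general": ["solo", "long hair"],
    "subject": ["1girl"],
    "artists": ["example artist"],
}
print(build_caption(example))  # 1girl, example artist, solo, long hair
```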

Style Prompts

The following style prompts from Kohaku-XL-Epsilon might be compatible (untested):

ask \(askzy\), torino aqua, migolu, (jiu ye sang:1.1), (rumoon:0.9), (mizumi zumi:1.1)
ciloranko, maccha \(mochancc\), lobelia \(saclia\), migolu, 
ask \(askzy\), wanke, (jiu ye sang:1.1), (rumoon:0.9), (mizumi zumi:1.1)
shiro9jira, ciloranko, ask \(askzy\), (tianliang duohe fangdongye:0.8)
(azuuru:1.1), (torino aqua:1.2), (azuuru:1.1), kedama milk, 
fuzichoco, ask \(askzy\), chen bin, atdan, hito, mignon
ask \(askzy\), torino aqua, migolu

This model card was written with the assistance of Claude 3.5 Sonnet.

