---
license: creativeml-openrail-m
---

SD1.5 experiments with Huber and MSE loss. All models were trained for 4 epochs on approximately 250k images from a variety of sources. Approximately half come from LAION Aesthetics, and a few thousand are 4K video rips with CogVLM captions.

![](interpolated_huber_mse.png)

Trained using the EveryDream2 trainer (https://github.com/victorchall/EveryDream2trainer) on an RTX 6000 Ada 48 GB. Each epoch takes approximately 10 hours, for a total of about 40 hours per model.

- Multi-aspect-ratio training with a nominal size of <=768^2 pixels for each bucket
- Batch size 12 with gradient accumulation of 10
- AdamW 8-bit optimizer with standard betas of (0.9, 0.999) and weight decay of 0.010
- Automatic mixed precision FP16 (note: the grad scaler value was surprisingly identical on all runs)
- TF32 matmul and SDP attention
- 3.0e-6 LR on a cosine schedule with a ~12 epoch decay target, ending around 2.3e-6 at the end of the 4 trained epochs
- Pyramid noise using discount 0.03
- Zero offset noise of 0.02
- Min SNR gamma of 5.0
- UNet-only training, with the text encoder left frozen
- Conditional dropout of 10%
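
As a rough sanity check on the settings above, the sketch below computes the effective batch size and the learning rate reached after 4 epochs. It assumes a plain cosine decay from the base LR toward zero over the 12 epoch target with no warmup; `cosine_lr` is a hypothetical helper written for illustration, and the trainer's actual scheduler may differ in its details.

```python
import math

def cosine_lr(epoch, base_lr=3.0e-6, decay_epochs=12.0):
    """Cosine decay from base_lr toward 0 over decay_epochs (assumed: no warmup)."""
    progress = min(epoch / decay_epochs, 1.0)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Samples contributing to each optimizer step: batch size x gradient accumulation.
effective_batch = 12 * 10

# Training stops after 4 of the 12 target epochs, so the LR never reaches 0.
lr_end = cosine_lr(4)

print(f"effective batch size: {effective_batch}")
print(f"LR after 4 epochs: {lr_end:.3e}")
```

Under these assumptions the final LR works out to about 2.25e-6, which is consistent with the "around 2.3e-6" figure in the list.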

The following models were produced:

- 768_huber.safetensors - Huber loss only
- 768_mse_plus_huberd1.5.safetensors - MSE plus Huber (d=1.5) loss
- 768_ts0huber_ts999mse.safetensors - Huber loss at timestep 0 interpolated to MSE loss at timestep 999
- 768_ts0mse_ts999huber.safetensors - MSE loss at timestep 0 interpolated to Huber loss at timestep 999

Note that timestep 0 is the least-noised step and timestep 999 is the most-noised step.
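
One way to picture the two interpolated variants is as a timestep-weighted blend of the two losses. The sketch below is an illustration only, not the trainer's code: it assumes the interpolation is linear in the timestep, uses the standard Huber definition with `delta=1.0` as a placeholder, and works on plain Python lists rather than tensors. The names `mse`, `huber`, and `interpolated_loss` are hypothetical.

```python
def mse(pred, target):
    """Mean squared error over paired values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def huber(pred, target, delta=1.0):
    """Standard Huber loss: quadratic within delta, linear beyond it."""
    total = 0.0
    for p, t in zip(pred, target):
        err = abs(p - t)
        if err <= delta:
            total += 0.5 * err ** 2
        else:
            total += delta * (err - 0.5 * delta)
    return total / len(pred)

def interpolated_loss(pred, target, timestep, loss_at_t0, loss_at_t999,
                      num_timesteps=1000):
    """Blend two losses by timestep (assumed linear in the timestep)."""
    w = timestep / (num_timesteps - 1)  # 0.0 at timestep 0, 1.0 at timestep 999
    return (1.0 - w) * loss_at_t0(pred, target) + w * loss_at_t999(pred, target)

pred, target = [0.0, 3.0], [0.0, 0.0]
# Analogous to the ts0huber_ts999mse variant: Huber at low noise, MSE at high noise.
low_noise = interpolated_loss(pred, target, 0, huber, mse)
high_noise = interpolated_loss(pred, target, 999, huber, mse)
print(low_noise, high_noise)
```

Swapping the last two arguments gives the analogue of the ts0mse_ts999huber variant. With the large error in this toy input, the Huber end of the blend penalizes the outlier far less than the MSE end does.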