Flux-ControlNet: Text-to-Image Diffusion Model with Caption Alignment
This repository hosts Flux-ControlNet, a customized ControlNet-based diffusion model designed for generating text-embedded images.
Key Features
- Flux-ControlNet: Enhanced ControlNet architecture for better control over text-to-image generation.
- Optimized Diffusion: Built on Hugging Face Diffusers and Accelerate for efficient training and inference.
How It Works
- Input: a text prompt and a conditioning image.
- Processing: Flux-ControlNet conditions the diffusion process on the prompt and the conditioning image to synthesize an aligned image.
- Output: a high-quality, text-embedded image (see the usage sketch below).
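Until the official inference script is published (see the Inference Code section below), the snippet that follows is only a minimal sketch of this flow using the Diffusers `FluxControlNetPipeline`. The checkpoint paths, prompt, conditioning image, and sampler settings are illustrative assumptions, not values shipped with this repository.

```python
import torch
from diffusers import FluxControlNetModel, FluxControlNetPipeline
from diffusers.utils import load_image

# Placeholder checkpoint names: substitute the Flux base model and the
# Flux-ControlNet weights from this repository.
controlnet = FluxControlNetModel.from_pretrained(
    "path/to/flux-controlnet", torch_dtype=torch.bfloat16
)
pipe = FluxControlNetPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", controlnet=controlnet, torch_dtype=torch.bfloat16
)
pipe.to("cuda")

# Input: a text prompt plus a conditioning image.
prompt = "A storefront sign that reads 'OPEN 24 HOURS'"       # illustrative prompt
control_image = load_image("path/to/conditioning_image.png")  # illustrative path

# Processing and output: the pipeline denoises under both conditions and
# returns an image aligned with the prompt and the conditioning image.
image = pipe(
    prompt,
    control_image=control_image,
    controlnet_conditioning_scale=0.7,  # assumed value
    num_inference_steps=28,             # assumed value
    guidance_scale=3.5,                 # assumed value
    height=512,
    width=512,
).images[0]
image.save("output.png")
```

The output resolution is set to 512x512 to match the training resolution listed below.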
Training Parameters for Flux-ControlNet
General Parameters:
- Model Architecture: Flux-based ControlNet
- Image Resolution: 512x512
- Batch Size: 4
- Epochs: 50
- Optimizer: AdamW
- Learning Rate: 1e-5 (with cosine scheduler)
- Weight Decay: 0.01
- Gradient Clipping: 1.0
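As a rough illustration of how these hyperparameters could be wired together with Accelerate and the Diffusers scheduler utilities, here is a minimal training-loop sketch. The toy model, dataset, and loss stand in for the actual Flux-ControlNet and its image/caption data, and the warm-up length is an assumption not listed above.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator
from diffusers.optimization import get_cosine_schedule_with_warmup

# Toy stand-ins so the optimizer wiring is runnable end to end; in the real
# setup these would be the Flux-ControlNet and its image/caption dataset.
model = torch.nn.Linear(16, 16)
dataset = TensorDataset(torch.randn(64, 16), torch.randn(64, 16))
dataloader = DataLoader(dataset, batch_size=4, shuffle=True)  # batch size 4

accelerator = Accelerator()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5, weight_decay=0.01)

num_epochs = 50
lr_scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=500,  # warm-up length is an assumption
    num_training_steps=num_epochs * len(dataloader),
)

model, optimizer, dataloader, lr_scheduler = accelerator.prepare(
    model, optimizer, dataloader, lr_scheduler
)

for epoch in range(num_epochs):
    for inputs, targets in dataloader:
        loss = torch.nn.functional.mse_loss(model(inputs), targets)  # placeholder loss
        accelerator.backward(loss)
        accelerator.clip_grad_norm_(model.parameters(), max_norm=1.0)  # gradient clipping at 1.0
        optimizer.step()
        lr_scheduler.step()
        optimizer.zero_grad()
```

Using `accelerator.clip_grad_norm_` rather than `torch.nn.utils.clip_grad_norm_` keeps the clipping correct when Accelerate shards gradients across devices.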
Inference Code
Official inference code will be added soon; in the meantime, the sketch under How It Works illustrates a typical Diffusers-based invocation.