Utilize HF's "balanced" device_map + dynamically pair diffusion components to relevant execution cores
#1 by diopside - opened
By using balanced mode and explicitly pairing diffusion components on grouped GPUs, we avoid OOM and are able to run on 4*40Ls.
Distribution approach (example):
- Text encoder on GPU 1: 16.6 GB
- Everything else on GPU 2: 44.5 GB, including ControlNet (4.23 GB), VAE (254 MB), and Transformer (40 GB)
This keeps the overall memory usage efficiently split across the GPUs while ensuring all components that need to interact directly are on the same device.
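
Here is a minimal sketch of that placement. The post does not name the exact model, so the pipeline class and checkpoint IDs below are placeholders; adapt them to whatever pipeline you are actually running.

```python
import torch
from diffusers import FluxControlNetModel, FluxControlNetPipeline

# Placeholder checkpoints -- swap in the ControlNet and base model you use.
controlnet = FluxControlNetModel.from_pretrained(
    "your-org/your-controlnet",
    torch_dtype=torch.bfloat16,
)

pipe = FluxControlNetPipeline.from_pretrained(
    "your-org/your-base-model",
    controlnet=controlnet,
    torch_dtype=torch.bfloat16,
)

# "GPU 1" in the post (cuda:0 here): text encoders only, ~16.6 GB.
pipe.text_encoder.to("cuda:0")
pipe.text_encoder_2.to("cuda:0")

# "GPU 2" in the post (cuda:1 here): everything that interacts on every
# denoising step, kept together to avoid per-step device transfers:
# Transformer (~40 GB), ControlNet (~4.23 GB), VAE (~254 MB).
pipe.transformer.to("cuda:1")
pipe.controlnet.to("cuda:1")
pipe.vae.to("cuda:1")
```

Alternatively, `DiffusionPipeline.from_pretrained` accepts `device_map="balanced"`, which distributes whole components across the visible GPUs automatically; the manual placement above is only needed when you want to force particular components onto the same device. Note that with a manual split, intermediate tensors (e.g. prompt embeddings produced on `cuda:0`) may need to be moved to the other GPU before the denoising loop, which is why everything that interacts per step is grouped on one device.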