Stuck trying to generate with the example ComfyUI workflow

#12, opened by thiscrazyusernameishopping

I have a 9070 XT GPU. I followed the ComfyUI manual installation instructions, which included installing the latest ROCm nightly with:

pip install --pre torch torchvision torchaudio --index-url https://rocm.nightlies.amd.com/v2/gfx120X-all/
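
To sanity-check that the ROCm nightly is actually the build in use (run in the same Python environment that runs main.py), this one-liner prints the torch version, the HIP version (torch.version.hip is set on ROCm builds and None on CUDA builds), and whether the GPU is visible:

python -c "import torch; print(torch.__version__, torch.version.hip, torch.cuda.is_available())"

In my case the version string matches what ComfyUI reports in the log below (2.10.0a0+rocm7.10.0a20251121), so the nightly is definitely the one loading.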

Afterwards, I followed the rest of the directions and was able to use an SDXL model in a standard workflow.

I then followed the instructions here:
https://comfyanonymous.github.io/ComfyUI_examples/z_image/

I can get to the image generation stage, but then it hangs while attempting to generate. Here is the full console output, from startup to the generation attempt:

C:\ComfyUI>python main.py
Checkpoint files will always be loaded safely.
Total VRAM 16304 MB, total RAM 65461 MB
pytorch version: 2.10.0a0+rocm7.10.0a20251121
Set: torch.backends.cudnn.enabled = False for better AMD performance.
AMD arch: gfx1201
ROCm version: (7, 10)
Set vram state to: NORMAL_VRAM
Device: cuda:0 AMD Radeon RX 9070 XT : native
Enabled pinned memory 29457.0
Using sub quadratic optimization for attention, if you have memory or speed issues try using: --use-split-cross-attention
Python version: 3.11.9 (tags/v3.11.9:de54cf5, Apr 2 2024, 10:12:12) [MSC v.1938 64 bit (AMD64)]
ComfyUI version: 0.3.75
ComfyUI frontend version: 1.32.9
[Prompt Server] web root: C:\Users\<redacted>\AppData\Local\Programs\Python\Python311\Lib\site-packages\comfyui_frontend_package\static
Total VRAM 16304 MB, total RAM 65461 MB
pytorch version: 2.10.0a0+rocm7.10.0a20251121
Set: torch.backends.cudnn.enabled = False for better AMD performance.
AMD arch: gfx1201
ROCm version: (7, 10)
Set vram state to: NORMAL_VRAM
Device: cuda:0 AMD Radeon RX 9070 XT : native
Enabled pinned memory 29457.0

Import times for custom nodes:
0.0 seconds: C:\ComfyUI\custom_nodes\websocket_image_save.py
0.0 seconds: C:\ComfyUI\custom_nodes\comfyui_lora_tag_loader
0.0 seconds: C:\ComfyUI\custom_nodes\ComfyUI-Gallery

Context impl SQLiteImpl.
Will assume non-transactional DDL.
No target revision found.
Starting server

To see the GUI go to: http://127.0.0.1:8188
got prompt
Using split attention in VAE
Using split attention in VAE
VAE load device: cuda:0, offload device: cpu, dtype: torch.bfloat16
CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float16
Requested to load ZImageTEModel_
loaded completely; 14633.80 MB usable, 7672.25 MB loaded, full load: True
model weight dtype torch.bfloat16, manual cast: None
model_type FLOW
unet missing: ['norm_final.weight']
Requested to load Lumina2
loaded partially; 11125.08 MB usable, 11096.95 MB loaded, 642.60 MB offloaded, 28.12 MB buffer reserved, lowvram patches: 0
11%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1/9 [00:01<00:15, 1.93s/it]

It just hangs at this first step. I'll keep experimenting, but I'm opening this discussion in case anyone has suggestions.
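
The first thing I plan to try is the attention flag the startup log itself suggests for memory or speed issues, and possibly --lowvram as well, since the Lumina2 model only loaded partially:

python main.py --use-split-cross-attention

(Both are standard ComfyUI launch flags; I don't know yet whether either actually addresses the hang.)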
