Stuck trying to generate with the example ComfyUI workflow

#12, opened by thiscrazyusernameishopping

I have a 9070 XT GPU. I followed the ComfyUI manual installation instructions, which included installing the latest ROCm nightly with:

pip install --pre torch torchvision torchaudio --index-url https://rocm.nightlies.amd.com/v2/gfx120X-all/
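
To sanity-check that the ROCm nightly is actually the build in use (run in the same Python environment that runs main.py), this one-liner prints the torch version, the HIP version (torch.version.hip is set on ROCm builds and None on CUDA builds), and whether the GPU is visible:

python -c "import torch; print(torch.__version__, torch.version.hip, torch.cuda.is_available())"

In my case the version string matches what ComfyUI reports in the log below (2.10.0a0+rocm7.10.0a20251121), so the nightly is definitely the one loading.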

Afterwards, I followed the rest of the directions and was able to use an SDXL model in a standard workflow.

I then followed the instructions here:
https://comfyanonymous.github.io/ComfyUI_examples/z_image/

I can get to the image generation stage, but then it hangs while attempting to generate. Here is the full console output, from startup to the generation attempt:

C:\ComfyUI>python main.py
Checkpoint files will always be loaded safely.
Total VRAM 16304 MB, total RAM 65461 MB
pytorch version: 2.10.0a0+rocm7.10.0a20251121
Set: torch.backends.cudnn.enabled = False for better AMD performance.
AMD arch: gfx1201
ROCm version: (7, 10)
Set vram state to: NORMAL_VRAM
Device: cuda:0 AMD Radeon RX 9070 XT : native
Enabled pinned memory 29457.0
Using sub quadratic optimization for attention, if you have memory or speed issues try using: --use-split-cross-attention
Python version: 3.11.9 (tags/v3.11.9:de54cf5, Apr 2 2024, 10:12:12) [MSC v.1938 64 bit (AMD64)]
ComfyUI version: 0.3.75
ComfyUI frontend version: 1.32.9
[Prompt Server] web root: C:\Users\<redacted>\AppData\Local\Programs\Python\Python311\Lib\site-packages\comfyui_frontend_package\static
Total VRAM 16304 MB, total RAM 65461 MB
pytorch version: 2.10.0a0+rocm7.10.0a20251121
Set: torch.backends.cudnn.enabled = False for better AMD performance.
AMD arch: gfx1201
ROCm version: (7, 10)
Set vram state to: NORMAL_VRAM
Device: cuda:0 AMD Radeon RX 9070 XT : native
Enabled pinned memory 29457.0

Import times for custom nodes:
0.0 seconds: C:\ComfyUI\custom_nodes\websocket_image_save.py
0.0 seconds: C:\ComfyUI\custom_nodes\comfyui_lora_tag_loader
0.0 seconds: C:\ComfyUI\custom_nodes\ComfyUI-Gallery

Context impl SQLiteImpl.
Will assume non-transactional DDL.
No target revision found.
Starting server

To see the GUI go to: http://127.0.0.1:8188
got prompt
Using split attention in VAE
Using split attention in VAE
VAE load device: cuda:0, offload device: cpu, dtype: torch.bfloat16
CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float16
Requested to load ZImageTEModel_
loaded completely; 14633.80 MB usable, 7672.25 MB loaded, full load: True
model weight dtype torch.bfloat16, manual cast: None
model_type FLOW
unet missing: ['norm_final.weight']
Requested to load Lumina2
loaded partially; 11125.08 MB usable, 11096.95 MB loaded, 642.60 MB offloaded, 28.12 MB buffer reserved, lowvram patches: 0
11%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1/9 [00:01<00:15, 1.93s/it]

It just hangs at this first step. I'll keep experimenting, but I'm opening this discussion in case anyone has suggestions.
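
The first thing I plan to try is the attention flag the startup log itself suggests for memory or speed issues, and possibly --lowvram as well, since the Lumina2 model only loaded partially:

python main.py --use-split-cross-attention

(Both are standard ComfyUI launch flags; I don't know yet whether either actually addresses the hang.)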
