Diffusers
Safetensors
WuerstchenPriorPipeline
dome272 committed · Commit ced6b20 · 1 Parent(s): 77287c4

Update README.md

Files changed (1): README.md (+7 -1)
README.md CHANGED
@@ -19,7 +19,7 @@ inference, its job is to generate the image latents given text. These image late
 ### Image Sizes
 Würstchen was trained on image resolutions between 1024x1024 & 1536x1536. We sometimes also observe good outputs at resolutions like 1024x2048. Feel free to try it out.
 We also observed that the Prior (Stage C) adapts extremely fast to new resolutions. So finetuning it at 2048x2048 should be computationally cheap.
-<img src="https://cdn-uploads.huggingface.co/production/uploads/634cb5eefb80cc6bcaf63c3e/IfVsUDcP15OY-5wyLYKnQ.jpeg" width=1000>
+<img src="https://cdn-uploads.huggingface.co/production/uploads/634cb5eefb80cc6bcaf63c3e/5pA5KUfGmvsObqiIjdGY1.jpeg" width=1000>
 
 ## How to run
 This pipeline should be run together with https://huggingface.co/warp-ai/wuerstchen:
@@ -62,6 +62,12 @@ decoder_output = decoder_pipeline(
 ).images
 ```
 
+### Image Sampling Times
+The figure shows the inference times (on an A100) for different batch sizes (`num_images_per_prompt`) on Würstchen compared to [Stable Diffusion XL](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) (without refiner).
+The left figure shows inference times (using torch > 2.0), whereas the right figure applies `torch.compile` to both pipelines in advance.
+![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/634cb5eefb80cc6bcaf63c3e/UPhsIH2f079ZuTA_sLdVe.jpeg)
+
+
 ## Model Details
 - **Developed by:** Pablo Pernias, Dominic Rampas
 - **Model type:** Diffusion-based text-to-image generation model
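
The hunks above reference the prior/decoder "How to run" snippet and the `torch.compile` timing comparison without showing the full code. Below is a minimal sketch of how the two stages might be wired together and optionally compiled, assuming the standard diffusers `WuerstchenPriorPipeline`/`WuerstchenDecoderPipeline` API, the `warp-ai/wuerstchen-prior` repository id for this prior, and the `prior`/`decoder` module attributes; the README's own example may differ.

```python
# Minimal sketch (not the README's exact example): run the prior (Stage C) and
# decoder (Stage B) back to back, optionally compiling both denoising networks
# ahead of time as in the sampling-time figure.
import torch
from diffusers import WuerstchenDecoderPipeline, WuerstchenPriorPipeline
from diffusers.pipelines.wuerstchen import DEFAULT_STAGE_C_TIMESTEPS

device = "cuda"
dtype = torch.float16

# Repository ids assumed here: "warp-ai/wuerstchen-prior" for Stage C,
# "warp-ai/wuerstchen" for the Stage B decoder linked in the README.
prior_pipeline = WuerstchenPriorPipeline.from_pretrained(
    "warp-ai/wuerstchen-prior", torch_dtype=dtype
).to(device)
decoder_pipeline = WuerstchenDecoderPipeline.from_pretrained(
    "warp-ai/wuerstchen", torch_dtype=dtype
).to(device)

# Optional (torch >= 2.0): compile both networks in advance, mirroring the
# right-hand panel of the timing figure. Attribute names are assumptions.
prior_pipeline.prior = torch.compile(prior_pipeline.prior, mode="reduce-overhead")
decoder_pipeline.decoder = torch.compile(decoder_pipeline.decoder, mode="reduce-overhead")

caption = "Anthropomorphic cat dressed as a firefighter"

# Stage C: generate image embeddings (latents) from the text prompt.
prior_output = prior_pipeline(
    prompt=caption,
    height=1024,
    width=1536,
    timesteps=DEFAULT_STAGE_C_TIMESTEPS,
    negative_prompt="",
    guidance_scale=4.0,
    num_images_per_prompt=2,
)

# Stage B: decode the image embeddings into PIL images.
decoder_output = decoder_pipeline(
    image_embeddings=prior_output.image_embeddings,
    prompt=caption,
    negative_prompt="",
    guidance_scale=0.0,
    output_type="pil",
).images
```

Larger values of `num_images_per_prompt` correspond to the batch sizes compared against Stable Diffusion XL in the figure added by this commit.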