Patrick Esser committed on
Commit b3874d2 · 1 Parent(s): bfd8d6f

update readme

Former-commit-id: d39f5b51a8d607fd855425a0d546b9f871034c3d

Files changed (1)
  1. README.md +25 -18
README.md CHANGED
@@ -78,6 +78,9 @@ steps show the relative improvements of the checkpoints:
 
 Stable Diffusion is a latent diffusion model conditioned on the (non-pooled) text embeddings of a CLIP ViT-L/14 text encoder.
 
+
+#### Sampling Script
+
 After [obtaining the weights](#weights), link them
 ```
 mkdir -p models/ldm/stable-diffusion-v1/
@@ -88,24 +91,6 @@ and sample with
 python scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms
 ```
 
-Another way to download and sample Stable Diffusion is by using the [diffusers library](https://github.com/huggingface/diffusers/tree/main#new--stable-diffusion-is-now-fully-compatible-with-diffusers)
-```py
-# make sure you're logged in with `huggingface-cli login`
-from torch import autocast
-from diffusers import StableDiffusionPipeline, LMSDiscreteScheduler
-
-pipe = StableDiffusionPipeline.from_pretrained(
-    "CompVis/stable-diffusion-v1-3-diffusers",
-    use_auth_token=True
-)
-
-prompt = "a photo of an astronaut riding a horse on mars"
-with autocast("cuda"):
-    image = pipe(prompt)["sample"][0]
-
-image.save("astronaut_rides_horse.png")
-```
-
 By default, this uses a guidance scale of `--scale 7.5`, [Katherine Crowson's implementation](https://github.com/CompVis/latent-diffusion/pull/51) of the [PLMS](https://arxiv.org/abs/2202.09778) sampler,
 and renders images of size 512x512 (which it was trained on) in 50 steps. All supported arguments are listed below (type `python scripts/txt2img.py --help`).
 
@@ -149,6 +134,28 @@ non-EMA to EMA weights. If you want to examine the effect of EMA vs no EMA, we p
 which contain both types of weights. For these, `use_ema=False` will load and use the non-EMA weights.
 
 
+#### Diffusers Integration
+
+Another way to download and sample Stable Diffusion is by using the [diffusers library](https://github.com/huggingface/diffusers/tree/main#new--stable-diffusion-is-now-fully-compatible-with-diffusers)
+```py
+# make sure you're logged in with `huggingface-cli login`
+from torch import autocast
+from diffusers import StableDiffusionPipeline, LMSDiscreteScheduler
+
+pipe = StableDiffusionPipeline.from_pretrained(
+    "CompVis/stable-diffusion-v1-3-diffusers",
+    use_auth_token=True
+)
+
+prompt = "a photo of an astronaut riding a horse on mars"
+with autocast("cuda"):
+    image = pipe(prompt)["sample"][0]
+
+image.save("astronaut_rides_horse.png")
+```
+
+
+
 ### Image Modification with Stable Diffusion
 
 By using a diffusion-denoising mechanism as first proposed by [SDEdit](https://arxiv.org/abs/2108.01073), the model can be used for different