dorkian committed on
Commit
5cc63d9
1 Parent(s): 96616f6

update model card

Files changed (1)
  1. README.md +210 -2
README.md CHANGED

---
license: cc0-1.0
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
widget:
- text: "A high tech solarpunk utopia in the Amazon rainforest"
  example_title: Amazon rainforest
- text: "A pikachu fine dining with a view to the Eiffel Tower"
  example_title: Pikachu in Paris
- text: "A mecha robot in a favela in expressionist style"
  example_title: Expressionist robot
- text: "an insect robot preparing a delicious meal"
  example_title: Insect robot
- text: "A small cabin on top of a snowy mountain in the style of Disney, artstation"
  example_title: Snowy Disney cabin
extra_gated_prompt: |-
  This model is open access and available to all, with a Creative Commons Zero (CC0) license further specifying rights and usage.
extra_gated_heading: Please read the LICENSE to access this model
---

# Stable Diffusion v1-4 Model Card

Stable Diffusion is a latent text-to-image diffusion model capable of generating photo-realistic images given any text input.
For more information about how Stable Diffusion functions, please have a look at [🤗's Stable Diffusion with 🧨 Diffusers blog](https://huggingface.co/blog/stable_diffusion).

The **Stable-Diffusion-v1-4** checkpoint was initialized with the weights of the [Stable-Diffusion-v1-2](https://huggingface.co/CompVis/stable-diffusion-v1-2)
checkpoint and subsequently fine-tuned for 225k steps at resolution 512x512 on "laion-aesthetics v2 5+", with 10% dropping of the text-conditioning to improve [classifier-free guidance sampling](https://arxiv.org/abs/2207.12598).
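
As a brief, hedged illustration (not code from this repository): classifier-free guidance blends the model's unconditional and text-conditioned noise predictions at sampling time, and dropping the text-conditioning during training is what gives the model a usable unconditional mode:

```python
import torch

# Minimal sketch of classifier-free guidance (illustrative only).
# The denoiser is evaluated with and without the text condition;
# a guidance scale > 1 pushes samples toward the text condition.
def cfg_noise_prediction(noise_uncond: torch.Tensor,
                         noise_text: torch.Tensor,
                         guidance_scale: float = 7.5) -> torch.Tensor:
    return noise_uncond + guidance_scale * (noise_text - noise_uncond)
```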
32
+
33
+ This weights here are intended to be used with the 🧨 Diffusers library. If you are looking for the weights to be loaded into the CompVis Stable Diffusion codebase, [come here](https://huggingface.co/CompVis/stable-diffusion-v-1-4-original)
34
+
35
+ ## Model Details
36
+
37
+ ## Examples

We recommend using [🤗's Diffusers library](https://github.com/huggingface/diffusers) to run Stable Diffusion.

### PyTorch

```bash
pip install --upgrade diffusers transformers scipy
```

Running the pipeline with the default PNDM scheduler:

```python
import torch
from diffusers import StableDiffusionPipeline

model_id = "CompVis/stable-diffusion-v1-4"
device = "cuda"

# Load the pipeline in float16 and move it to the GPU
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to(device)

prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt).images[0]

image.save("astronaut_rides_horse.png")
```

**Note**:
If you are limited by GPU memory and have less than 4GB of GPU RAM available, please make sure to load the `StableDiffusionPipeline` in float16 precision, as done above, instead of the default float32 precision. You can additionally enable attention slicing to reduce peak memory usage:

```py
import torch
from diffusers import StableDiffusionPipeline

model_id = "CompVis/stable-diffusion-v1-4"
device = "cuda"

pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to(device)
# Trade a little speed for a large reduction in peak GPU memory
pipe.enable_attention_slicing()

prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt).images[0]

image.save("astronaut_rides_horse.png")
```

To swap out the noise scheduler, pass it to `from_pretrained`:

```python
import torch
from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler

model_id = "CompVis/stable-diffusion-v1-4"

# Use the Euler scheduler here instead of the default PNDM scheduler
scheduler = EulerDiscreteScheduler.from_pretrained(model_id, subfolder="scheduler")
pipe = StableDiffusionPipeline.from_pretrained(model_id, scheduler=scheduler, torch_dtype=torch.float16)
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt).images[0]

image.save("astronaut_rides_horse.png")
```
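
Other schedulers shipped with Diffusers (for example `LMSDiscreteScheduler` or `DPMSolverMultistepScheduler`) can be swapped in the same way; only the scheduler's `from_pretrained` call changes.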

### JAX/Flax

To use Stable Diffusion on TPUs and GPUs for faster inference, you can leverage JAX/Flax.

Running the pipeline with the default PNDMScheduler:

```python
import jax
import numpy as np
from flax.jax_utils import replicate
from flax.training.common_utils import shard
from diffusers import FlaxStableDiffusionPipeline

pipeline, params = FlaxStableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", revision="flax"
)

prompt = "a photo of an astronaut riding a horse on mars"
prng_seed = jax.random.PRNGKey(0)
num_inference_steps = 50

# One sample per device
num_samples = jax.device_count()
prompt = num_samples * [prompt]
prompt_ids = pipeline.prepare_inputs(prompt)

# shard inputs and rng
params = replicate(params)
prng_seed = jax.random.split(prng_seed, num_samples)
prompt_ids = shard(prompt_ids)

images = pipeline(prompt_ids, params, prng_seed, num_inference_steps, jit=True).images
images = pipeline.numpy_to_pil(np.asarray(images.reshape((num_samples,) + images.shape[-3:])))
```
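
Note that `jit=True` parallelizes generation across all available devices with `pmap`, which is why the prompt, parameters, and RNG are replicated and sharded `jax.device_count()` ways above.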

**Note**:
If you are limited by TPU memory, please make sure to load the `FlaxStableDiffusionPipeline` in `bfloat16` precision instead of the default `float32` precision as done above. You can do so by telling diffusers to load the weights from the "bf16" branch.

```python
import jax
import numpy as np
from flax.jax_utils import replicate
from flax.training.common_utils import shard
from diffusers import FlaxStableDiffusionPipeline

# Load the bfloat16 weights from the "bf16" branch to cut TPU memory usage
pipeline, params = FlaxStableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", revision="bf16", dtype=jax.numpy.bfloat16
)

prompt = "a photo of an astronaut riding a horse on mars"
prng_seed = jax.random.PRNGKey(0)
num_inference_steps = 50

num_samples = jax.device_count()
prompt = num_samples * [prompt]
prompt_ids = pipeline.prepare_inputs(prompt)

# shard inputs and rng
params = replicate(params)
prng_seed = jax.random.split(prng_seed, num_samples)
prompt_ids = shard(prompt_ids)

images = pipeline(prompt_ids, params, prng_seed, num_inference_steps, jit=True).images
images = pipeline.numpy_to_pil(np.asarray(images.reshape((num_samples,) + images.shape[-3:])))
```

# Uses

## Direct Use
The model is intended for research purposes only. Possible research areas and
tasks include:

- Safe deployment of models which have the potential to generate harmful content.
- Probing and understanding the limitations and biases of generative models.
- Generation of artworks and use in design and other artistic processes.
- Applications in educational or creative tools.
- Research on generative models.

Excluded uses are described below.

### Misuse, Malicious Use, and Out-of-Scope Use
_Note: This section is taken from the [DALLE-MINI model card](https://huggingface.co/dalle-mini/dalle-mini), but applies in the same way to Stable Diffusion v1._

The model should not be used to intentionally create or disseminate images that create hostile or alienating environments for people. This includes generating images that people would foreseeably find disturbing, distressing, or offensive, or content that propagates historical or current stereotypes.

#### Out-of-Scope Use
The model was not trained to produce factual or true representations of people or events, and therefore using the model to generate such content is out of scope for its abilities.

#### Misuse and Malicious Use
Using the model to generate content that is cruel to individuals is a misuse of this model. This includes, but is not limited to:

- Generating demeaning, dehumanizing, or otherwise harmful representations of people or their environments, cultures, religions, etc.
- Intentionally promoting or propagating discriminatory content or harmful stereotypes.
- Impersonating individuals without their consent.
- Sexual content without the consent of the people who might see it.
- Mis- and disinformation.
- Representations of egregious violence and gore.
- Sharing of copyrighted or licensed material in violation of its terms of use.
- Sharing content that is an alteration of copyrighted or licensed material in violation of its terms of use.

## Limitations and Bias

### Limitations

### Bias

### Safety Module
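
The intended way to run this model is with the safety checker that ships in Diffusers: `StableDiffusionPipeline` loads an image-screening module alongside the generator and blacks out any output it flags. A minimal sketch of how to inspect it (illustrative only, not a substitute for full safety documentation):

```python
from diffusers import StableDiffusionPipeline

# The pipeline bundles a safety checker; flagged images are returned
# blacked out and reported via `nsfw_content_detected` on the output.
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
print(pipe.safety_checker)  # StableDiffusionSafetyChecker module (None if disabled)
```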

## Training

**Training Data**
The model was trained on RaCC0on illustrations by Diego Rodriguez.

**Training Procedure**
...

## Environmental Impact

Refer to Stable Diffusion's environmental impact.

## Citation
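
The underlying latent diffusion work is usually cited as Rombach et al. (CVPR 2022); a BibTeX entry along these lines could be used here:

```bibtex
@InProceedings{Rombach_2022_CVPR,
    author    = {Rombach, Robin and Blattmann, Andreas and Lorenz, Dominik and Esser, Patrick and Ommer, Bj\"orn},
    title     = {High-Resolution Image Synthesis With Latent Diffusion Models},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {10684-10695}
}
```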

*This model card was written by Dorian Collier and is based on the [DALL-E Mini model card](https://huggingface.co/dalle-mini/dalle-mini).*