---
license: cc0-1.0
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
widget:
- text: "racc0ons, full_body, baseball_uniform, backwards_baseball_hat, angry, attitude"
  example_title: Batter up
- text: "racc0ons, full_body, aviator_sunglasses, leather_jacket, slicked_back_hair"
  example_title: Ayyyyyy!
- text: "racc0ons, full_body, sitting, throne, wearing_gold_crown, bright_background"
  example_title: Royal RaCC0on
- text: "racc0ons, full_body, gold chain, tattoo, afro"
  example_title: RaCC0on with gold chain
extra_gated_prompt: |-
  This model is open access and available to all, with a Creative Commons 0 (CC0) license further specifying rights and usage.
extra_gated_heading: Please read the LICENSE to access this model
---

# RaCC0ons Diffusion 0.5

RaCC0ons Diffusion is a latent text-to-image diffusion model capable of generating cuddly trash pandas based on [RaCC0ons World](https://racc0ons.com/).

## Prompts

Get creative, but remember to always include "racc0ons" in your prompt in order to generate the right style.
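For example, the showcase prompts used in the widget above follow this pattern of comma-separated tags:

```
racc0ons, full_body, sitting, throne, wearing_gold_crown, bright_background
racc0ons, full_body, aviator_sunglasses, leather_jacket, slicked_back_hair
```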
## Usage

We recommend using [🤗 Diffusers](https://github.com/huggingface/diffusers) to run the model.

### PyTorch

First install the required libraries:

```bash
pip install --upgrade diffusers transformers scipy
```

Running the pipeline with the default PNDM scheduler:
```python
import torch
from diffusers import StableDiffusionPipeline

model_id = "CompVis/stable-diffusion-v1-4"
device = "cuda"

pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to(device)

prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt).images[0]

image.save("astronaut_rides_horse.png")
```

**Note**:
If you are limited by GPU memory and have less than 4GB of GPU RAM available, make sure to load the StableDiffusionPipeline in float16 precision instead of the default float32 precision, as is done above. You can additionally enable attention slicing to reduce peak memory usage at a small cost to inference speed:
```py
import torch
from diffusers import StableDiffusionPipeline

model_id = "CompVis/stable-diffusion-v1-4"
device = "cuda"

pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to(device)
# Compute attention in sequential slices to lower peak memory usage
pipe.enable_attention_slicing()

prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt).images[0]

image.save("astronaut_rides_horse.png")
```

To swap out the noise scheduler, pass it to `from_pretrained`:
```python
import torch
from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler

model_id = "CompVis/stable-diffusion-v1-4"

# Use the Euler scheduler here instead of the default PNDM scheduler
scheduler = EulerDiscreteScheduler.from_pretrained(model_id, subfolder="scheduler")
pipe = StableDiffusionPipeline.from_pretrained(model_id, scheduler=scheduler, torch_dtype=torch.float16)
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt).images[0]

image.save("astronaut_rides_horse.png")
```

### JAX/Flax

To run Stable Diffusion on TPUs and GPUs for faster inference, you can leverage JAX/Flax.
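The examples below assume Flax is installed alongside a JAX build that matches your accelerator; a typical setup might look like:

```bash
pip install --upgrade diffusers transformers flax
# Install the JAX build matching your accelerator (CPU/GPU/TPU);
# see the JAX installation docs for the right command for your hardware.
```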

Running the pipeline with the default PNDMScheduler:
```python
import jax
import numpy as np
from flax.jax_utils import replicate
from flax.training.common_utils import shard
from diffusers import FlaxStableDiffusionPipeline

pipeline, params = FlaxStableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", revision="flax"
)

prompt = "a photo of an astronaut riding a horse on mars"
prng_seed = jax.random.PRNGKey(0)
num_inference_steps = 50

# Generate one image per available device
num_samples = jax.device_count()
prompt = num_samples * [prompt]
prompt_ids = pipeline.prepare_inputs(prompt)

# shard inputs and rng across devices
params = replicate(params)
prng_seed = jax.random.split(prng_seed, num_samples)
prompt_ids = shard(prompt_ids)

images = pipeline(prompt_ids, params, prng_seed, num_inference_steps, jit=True).images
images = pipeline.numpy_to_pil(np.asarray(images.reshape((num_samples,) + images.shape[-3:])))
```

**Note**:
If you are limited by TPU memory, make sure to load the `FlaxStableDiffusionPipeline` in `bfloat16` precision instead of the default `float32` precision used above. You can do so by telling diffusers to load the weights from the `bf16` branch:
```python
import jax
import numpy as np
from flax.jax_utils import replicate
from flax.training.common_utils import shard
from diffusers import FlaxStableDiffusionPipeline

# Load the half-precision weights from the "bf16" branch
pipeline, params = FlaxStableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", revision="bf16", dtype=jax.numpy.bfloat16
)

prompt = "a photo of an astronaut riding a horse on mars"
prng_seed = jax.random.PRNGKey(0)
num_inference_steps = 50

num_samples = jax.device_count()
prompt = num_samples * [prompt]
prompt_ids = pipeline.prepare_inputs(prompt)

# shard inputs and rng across devices
params = replicate(params)
prng_seed = jax.random.split(prng_seed, num_samples)
prompt_ids = shard(prompt_ids)

images = pipeline(prompt_ids, params, prng_seed, num_inference_steps, jit=True).images
images = pipeline.numpy_to_pil(np.asarray(images.reshape((num_samples,) + images.shape[-3:])))
```

# Uses

## Direct Use

The model is intended for research purposes only. Possible research areas and tasks include:

- Safe deployment of models which have the potential to generate harmful content.
- Probing and understanding the limitations and biases of generative models.
- Generation of artworks and use in design and other artistic processes.
- Applications in educational or creative tools.
- Research on generative models.

Excluded uses are described below.

### Misuse, Malicious Use, and Out-of-Scope Use

_Note: This section is taken from the [DALLE-MINI model card](https://huggingface.co/dalle-mini/dalle-mini), but applies in the same way to Stable Diffusion v1._

The model should not be used to intentionally create or disseminate images that create hostile or alienating environments for people. This includes generating images that people would foreseeably find disturbing, distressing, or offensive; or content that propagates historical or current stereotypes.

#### Out-of-Scope Use

The model was not trained to produce factual or true representations of people or events, so using it to generate such content is out of scope for its abilities.

#### Misuse and Malicious Use

Using the model to generate content that is cruel to individuals is a misuse of this model. This includes, but is not limited to:

- Generating demeaning, dehumanizing, or otherwise harmful representations of people or their environments, cultures, religions, etc.
- Intentionally promoting or propagating discriminatory content or harmful stereotypes.
- Impersonating individuals without their consent.
- Sexual content without consent of the people who might see it.
- Mis- and disinformation.
- Representations of egregious violence and gore.
- Sharing of copyrighted or licensed material in violation of its terms of use.
- Sharing content that is an alteration of copyrighted or licensed material in violation of its terms of use.

## Limitations and Bias

### Limitations

As a fine-tune of Stable Diffusion, this model inherits the limitations of the base model; see the [Stable Diffusion v1-4 model card](https://huggingface.co/CompVis/stable-diffusion-v1-4) for details.

### Bias

The model likewise inherits the biases of the base Stable Diffusion model, as well as those of its own fine-tuning data, which consists of a single artist's illustrations.

### Safety Module

The model is intended to be used with the safety checker that ships with Diffusers, which screens generated images against a set of hard-coded NSFW concepts.
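As a minimal sketch (assuming the standard `StableDiffusionPipeline` components), the checker is loaded by default and its verdicts are returned alongside the images:

```python
from diffusers import StableDiffusionPipeline

# The safety checker is part of the pipeline by default; images it flags
# are blacked out and marked in `nsfw_content_detected`.
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
print(pipe.safety_checker)  # StableDiffusionSafetyChecker instance

result = pipe("racc0ons, full_body, sitting, throne")
print(result.nsfw_content_detected)  # e.g. [False]
```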
## Training

**Training Data**

RaCC0on illustrations by Diego Rodriguez were used.

**Training Procedure**

...

## Environmental Impact

For the environmental impact of the base model, refer to the [Stable Diffusion v1-4 model card](https://huggingface.co/CompVis/stable-diffusion-v1-4).

## Citation

*This model card was written by Dorian Collier and is based on the [DALL-E Mini model card](https://huggingface.co/dalle-mini/dalle-mini).*