---
license: other
tags:
  - stable-diffusion
  - text-to-image
inference: false
---

# Stable Diffusion

Stable Diffusion is a latent text-to-image diffusion model capable of generating photo-realistic images given any text input. This model card gives an overview of all available model checkpoints. For more in-detail model cards, please have a look at the model repositories listed under Model Access.

## Stable Diffusion V1

In its first version, four model checkpoints were released: stable-diffusion-v1-1, stable-diffusion-v1-2, stable-diffusion-v1-3, and stable-diffusion-v1-4. Higher versions have been trained for longer and are therefore usually better in terms of image generation quality than lower versions.

The model can be used either with 🤗's diffusers library or with the original Stable Diffusion GitHub repository.

## Model Access

Each checkpoint can be accessed once you have accepted its license ("click-requested" access) on the respective model repository.

For 🤗's diffusers library:
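A minimal sketch of loading a checkpoint through diffusers' `StableDiffusionPipeline` — assuming `diffusers` and `transformers` are installed and the license for `CompVis/stable-diffusion-v1-4` (one of the checkpoints listed above) has been accepted on the Hub:

```python
# Minimal text-to-image sketch using diffusers' StableDiffusionPipeline.
# Assumes: `pip install diffusers transformers` and that access to the
# CompVis/stable-diffusion-v1-4 repository has been click-requested.

def generate_image(prompt, model_id="CompVis/stable-diffusion-v1-4"):
    """Load the pipeline for `model_id` and render one image for `prompt`."""
    # Deferred import so the sketch can be read without diffusers installed.
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(model_id)
    return pipe(prompt).images[0]


if __name__ == "__main__":
    image = generate_image("a photograph of an astronaut riding a horse")
    image.save("astronaut_rides_horse.png")
```

Swapping `model_id` for any of the other checkpoints (e.g. `CompVis/stable-diffusion-v1-3`) works the same way, provided access has been granted for that repository.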

For the original Stable Diffusion GitHub repository:
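A rough setup sketch for the original CompVis repository — the environment file and sampling script names follow that repository's layout, and the checkpoint path shown is an assumption (the `.ckpt` weights must be downloaded manually from the model repository):

```shell
# Clone the original repository and set up its conda environment.
git clone https://github.com/CompVis/stable-diffusion.git
cd stable-diffusion
conda env create -f environment.yaml
conda activate ldm

# Place the manually downloaded checkpoint where the scripts expect it
# (path is an assumption based on the repository's default config), then sample.
mkdir -p models/ldm/stable-diffusion-v1
# cp /path/to/sd-v1-4.ckpt models/ldm/stable-diffusion-v1/model.ckpt
python scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms
```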

## Citation

    @InProceedings{Rombach_2022_CVPR,
        author    = {Rombach, Robin and Blattmann, Andreas and Lorenz, Dominik and Esser, Patrick and Ommer, Bj\"orn},
        title     = {High-Resolution Image Synthesis With Latent Diffusion Models},
        booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
        month     = {June},
        year      = {2022},
        pages     = {10684-10695}
    }

This model card was written by Robin Rombach and Patrick Esser, and is based on the DALL-E Mini model card.