---
license: mit
metrics:
- mse
library_name: diffusers
tags:
- diffusion
pipeline_tag: text-to-image
datasets:
- nroggendorff/zelda
language:
- en
---
# Zelda Diffusion Model Card

SDZelda is a latent text-to-image diffusion model capable of generating images of Zelda from The Legend of Zelda. For more information about how Stable Diffusion works, have a look at [🤗's Stable Diffusion blog](https://huggingface.co/blog/stable_diffusion).

You can use this model with the 🧨 Diffusers library from Hugging Face.
## Diffusers

```py
from diffusers import StableDiffusionPipeline
import torch

# Load the pipeline in half precision and move it to the GPU
pipeline = StableDiffusionPipeline.from_pretrained(
    "nroggendorff/zelda-diffusion", torch_dtype=torch.float16, use_safetensors=True
).to("cuda")

image = pipeline(prompt="a drawing of a woman in a blue dress and gold crown").images[0]
image.save("zelda.png")
```
## Model Details

- `train_batch_size`: 1
- `gradient_accumulation_steps`: 4
- `learning_rate`: 1e-2
- `lr_warmup_steps`: 500
- `mixed_precision`: "fp16"
- `eval_metric`: "mean_squared_error"
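The hyperparameters above are commonly gathered into a config object when training with 🧨 Diffusers; a minimal sketch under that assumption (the `TrainingConfig` dataclass and the effective-batch-size calculation are illustrative, not the actual training code used for this model):

```py
from dataclasses import dataclass

@dataclass
class TrainingConfig:
    # Values taken from the Model Details section above
    train_batch_size: int = 1
    gradient_accumulation_steps: int = 4
    learning_rate: float = 1e-2
    lr_warmup_steps: int = 500
    mixed_precision: str = "fp16"
    eval_metric: str = "mean_squared_error"

config = TrainingConfig()

# With gradient accumulation, gradients from several forward passes are summed
# before each optimizer step, so the effective batch size is the product of
# the per-step batch size and the number of accumulation steps.
effective_batch_size = config.train_batch_size * config.gradient_accumulation_steps
print(effective_batch_size)  # 4
```

Note that a learning rate of 1e-2 is high for diffusion fine-tuning; combined with the small per-device batch size, gradient accumulation helps keep updates stable.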
## Limitations

- The model does not achieve perfect photorealism.
- The model cannot render legible text.
- The model was trained on a small-scale dataset: [nroggendorff/zelda](https://huggingface.co/datasets/nroggendorff/zelda).
## Developed by

- Noa Linden Roggendorff

*This model card was written by Noa Roggendorff and is based on the [Stable Diffusion v1-5 Model Card](https://huggingface.co/runwayml/stable-diffusion-v1-5).*