---
license: mit
metrics:
- mse
library_name: diffusers
tags:
- diffusion
pipeline_tag: text-to-image
datasets:
- nroggendorff/zelda
language:
- en
---
# Zelda Diffusion Model Card

SDZelda is a latent text-to-image diffusion model capable of generating images of Zelda from The Legend of Zelda. For more information about how Stable Diffusion works, have a look at [🤗's Stable Diffusion blog](https://huggingface.co/blog/stable_diffusion).

You can use this model with the 🧨 Diffusers library from Hugging Face.
## Diffusers

```py
from diffusers import StableDiffusionPipeline
import torch

# Load the pipeline in half precision and move it to the GPU
pipeline = StableDiffusionPipeline.from_pretrained(
    "nroggendorff/zelda-diffusion", torch_dtype=torch.float16, use_safetensors=True
).to("cuda")

image = pipeline(prompt="a drawing of a woman in a blue dress and gold crown").images[0]
image.save("zelda.png")
```
## Model Details

- `train_batch_size`: 1
- `gradient_accumulation_steps`: 4
- `learning_rate`: 1e-2
- `lr_warmup_steps`: 500
- `mixed_precision`: "fp16"
- `eval_metric`: "mean_squared_error"
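The hyperparameters above are commonly gathered into a config object when training with 🧨 Diffusers; a minimal sketch under that assumption (the `TrainingConfig` dataclass and the effective-batch-size calculation are illustrative, not the actual training code used for this model):

```py
from dataclasses import dataclass

@dataclass
class TrainingConfig:
    # Values taken from the Model Details section above
    train_batch_size: int = 1
    gradient_accumulation_steps: int = 4
    learning_rate: float = 1e-2
    lr_warmup_steps: int = 500
    mixed_precision: str = "fp16"
    eval_metric: str = "mean_squared_error"

config = TrainingConfig()

# With gradient accumulation, gradients from several forward passes are summed
# before each optimizer step, so the effective batch size is the product of
# the per-step batch size and the number of accumulation steps.
effective_batch_size = config.train_batch_size * config.gradient_accumulation_steps
print(effective_batch_size)  # 4
```

Note that a learning rate of 1e-2 is high for diffusion fine-tuning; combined with the small per-device batch size, gradient accumulation helps keep updates stable.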
## Limitations

- The model does not achieve perfect photorealism.
- The model cannot render legible text.
- The model was trained on a small-scale dataset: [nroggendorff/zelda](https://huggingface.co/datasets/nroggendorff/zelda).
## Developed by

- Noa Linden Roggendorff

*This model card was written by Noa Roggendorff and is based on the [Stable Diffusion v1-5 Model Card](https://huggingface.co/runwayml/stable-diffusion-v1-5).*