---
license: mit
tags:
- mdm
---

# Matryoshka Diffusion Models

Matryoshka Diffusion Models (MDM) were introduced in [the paper of the same name](https://huggingface.co/papers/2310.15111) by Jiatao Gu, Shuangfei Zhai, Yizhe Zhang, Josh Susskind, and Navdeep Jaitly.

This repository contains the **Flickr 256** checkpoint.

![Generation Examples from the MDM repository](samples.png)

### Highlights

* This checkpoint was trained on a dataset of 50M text-image pairs collected from Flickr.
* This model was trained using nested UNets at various resolutions, and generates images with a resolution of 256 × 256.
* Despite training on relatively small datasets, MDMs show strong zero-shot capabilities of generating high-resolution images and videos.

## Checkpoints

| Model                                                   | Dataset    | Resolution  | Nested UNets |
|---------------------------------------------------------|------------|-------------|--------------|
| [mdm-flickr-64](https://hf.co/pcuenq/mdm-flickr-64)     | Flickr 50M | 64 × 64     | ❌           |
| [mdm-flickr-256](https://hf.co/pcuenq/mdm-flickr-256)   | Flickr 50M | 256 × 256   | ✅           |
| [mdm-flickr-1024](https://hf.co/pcuenq/mdm-flickr-1024) | Flickr 50M | 1024 × 1024 | ✅           |

## How to Use

Please refer to the [original repository](https://github.com/apple/ml-mdm) for training and inference instructions.

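
If you only need the checkpoint files locally before following those instructions, something like the sketch below may help. It assumes the `huggingface_hub` package is installed and uses its `snapshot_download` function; the repository IDs are copied from the Checkpoints table, while `CHECKPOINTS`, `repo_id_for`, and `download_checkpoint` are hypothetical helper names introduced here for illustration. It only fetches files and does not load the model or run inference.

```python
from typing import Optional

# Repository IDs as listed in the Checkpoints table of this model card.
CHECKPOINTS = {
    64: "pcuenq/mdm-flickr-64",
    256: "pcuenq/mdm-flickr-256",
    1024: "pcuenq/mdm-flickr-1024",
}


def repo_id_for(resolution: int) -> str:
    """Map an output resolution to its Hub repository ID (hypothetical helper)."""
    try:
        return CHECKPOINTS[resolution]
    except KeyError:
        raise ValueError(
            f"No checkpoint for resolution {resolution}; "
            f"choose one of {sorted(CHECKPOINTS)}"
        ) from None


def download_checkpoint(resolution: int = 256, local_dir: Optional[str] = None) -> str:
    """Download every file of the chosen checkpoint and return the local folder path."""
    # Imported lazily so the lookup helper above works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id_for(resolution), local_dir=local_dir)
```

Calling `download_checkpoint(256)` should return the path of a local folder holding this repository's files, which can then be passed to the tooling in the original repository.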
## Citation

```
@misc{gu2023matryoshkadiffusionmodels,
  title={Matryoshka Diffusion Models},
  author={Jiatao Gu and Shuangfei Zhai and Yizhe Zhang and Josh Susskind and Navdeep Jaitly},
  year={2023},
  eprint={2310.15111},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2310.15111},
}
```