---
license: mit
tags:
- mdm
---
# Matryoshka Diffusion Models
Matryoshka Diffusion Models (MDM) were introduced in [the paper of the same name](https://huggingface.co/papers/2310.15111) by Jiatao Gu, Shuangfei Zhai, Yizhe Zhang, Josh Susskind, and Navdeep Jaitly.
This repository contains the **Flickr 256** checkpoint.
![Generation Examples from the MDM repository](samples.png)
### Highlights
* This checkpoint was trained on a dataset of 50M text-image pairs collected from Flickr.
* This model was trained using nested UNets at various resolutions, and generates images with a resolution of 256 × 256.
* Despite being trained on relatively small datasets, MDMs show strong zero-shot capabilities for generating high-resolution images and videos.
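
To make "various resolutions" concrete, the following is a minimal, illustrative sketch of a multi-resolution image pyramid built by 2×2 average pooling. It is not the authors' implementation: the helper names `downsample_2x` and `resolution_pyramid`, the pooling method, and the toy 8 × 8 input are all assumptions chosen to keep the example small.

```python
# Illustrative only: a toy image pyramid, NOT the MDM training code.
# `downsample_2x` and `resolution_pyramid` are hypothetical helper names.

def downsample_2x(image):
    """Halve height and width by averaging each 2x2 block of pixels."""
    height, width = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, width, 2)
        ]
        for r in range(0, height, 2)
    ]


def resolution_pyramid(image, levels):
    """Return [full-res, half-res, ...] with `levels` entries in total."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample_2x(pyramid[-1]))
    return pyramid


# Toy 8x8 grayscale "image" standing in for a real 256x256 one.
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
pyramid = resolution_pyramid(image, levels=3)
sizes = [(len(level), len(level[0])) for level in pyramid]
print(sizes)  # [(8, 8), (4, 4), (2, 2)]
```

Each level is a coarser view of the same image. The exact resolutions this checkpoint uses internally are not stated here, so consult the original repository for the real configuration.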
## Checkpoints
| Model | Dataset | Resolution | Nested UNets |
|---------------------------------------------------------|------------|-------------|--------------|
| [mdm-flickr-64](https://hf.co/pcuenq/mdm-flickr-64) | Flickr 50M | 64 × 64 | ❎ |
| [mdm-flickr-256](https://hf.co/pcuenq/mdm-flickr-256) | Flickr 50M | 256 × 256 | ✅ |
| [mdm-flickr-1024](https://hf.co/pcuenq/mdm-flickr-1024) | Flickr 50M | 1024 × 1024 | ✅ |
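
If you need to choose a checkpoint programmatically, the table above can be mirrored as a small lookup. This is a convenience sketch only: the `Checkpoint` dataclass and the `checkpoint_for` helper are hypothetical names, not part of any MDM or Hugging Face API. The repository IDs are taken from the links in the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Checkpoint:
    """One row of the checkpoint table (hypothetical helper type)."""
    repo_id: str
    dataset: str
    resolution: int      # output images are resolution x resolution
    nested_unets: bool


# Values copied from the table in this model card.
CHECKPOINTS = (
    Checkpoint("pcuenq/mdm-flickr-64", "Flickr 50M", 64, False),
    Checkpoint("pcuenq/mdm-flickr-256", "Flickr 50M", 256, True),
    Checkpoint("pcuenq/mdm-flickr-1024", "Flickr 50M", 1024, True),
)


def checkpoint_for(resolution):
    """Return the checkpoint whose output resolution matches exactly."""
    for ckpt in CHECKPOINTS:
        if ckpt.resolution == resolution:
            return ckpt
    available = sorted(c.resolution for c in CHECKPOINTS)
    raise ValueError(f"No checkpoint for {resolution}; available: {available}")


print(checkpoint_for(256).repo_id)  # pcuenq/mdm-flickr-256
```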
## How to Use
Please refer to the [original repository](https://github.com/apple/ml-mdm) for training and inference instructions.
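
Before following those instructions, you will likely need the checkpoint files locally. A minimal sketch, assuming the `huggingface_hub` package is installed and that this repository is `pcuenq/mdm-flickr-256` (the ID linked in the checkpoint table), is shown below. The function name `download_checkpoint` and the default `local_dir` are illustrative choices, and how the downloaded files are passed to the MDM code is defined by the original repository, not by this snippet.

```python
# Sketch only: fetches the repository snapshot from the Hugging Face Hub.
# Assumes `pip install huggingface_hub`; the helper name is hypothetical.

REPO_ID = "pcuenq/mdm-flickr-256"  # assumed ID, taken from the table links


def download_checkpoint(repo_id=REPO_ID, local_dir="mdm-flickr-256"):
    """Download every file in the model repo and return the local path."""
    # Imported lazily so this module loads even without the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

Calling `download_checkpoint()` requires network access and returns the directory that contains the downloaded files.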
## Citation
```
@misc{gu2023matryoshkadiffusionmodels,
title={Matryoshka Diffusion Models},
author={Jiatao Gu and Shuangfei Zhai and Yizhe Zhang and Josh Susskind and Navdeep Jaitly},
year={2023},
eprint={2310.15111},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2310.15111},
}
```