Matryoshka Diffusion Models
Matryoshka Diffusion Models (MDMs) were introduced in the paper of the same name by Jiatao Gu, Shuangfei Zhai, Yizhe Zhang, Josh Susskind, and Navdeep Jaitly.
This repository contains the Flickr 1024 checkpoint.
Highlights
- This checkpoint was trained on a dataset of 50M text-image pairs collected from Flickr.
- This model was trained using nested UNets at various resolutions and generates images at a resolution of 1024 × 1024.
- Despite being trained on relatively small datasets, MDMs show a strong zero-shot ability to generate high-resolution images and videos.
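To make the "various resolutions" point concrete, here is a minimal sketch of building a multi-resolution pyramid from a single square image, using the 64, 256, and 1024 sizes listed for the checkpoints. The function names `downsample` and `build_pyramid` are hypothetical, and plain average pooling is an assumption chosen for illustration; it is not taken from the original repository, and it does not implement the nested UNet itself.

```python
def downsample(image, size):
    """Average-pool a square grayscale image (a list of rows) to size x size.

    Assumption: box-filter average pooling, used here only for illustration.
    """
    src = len(image)
    if src % size != 0:
        raise ValueError("target size must divide the source size")
    f = src // size  # pooling factor along each axis
    return [
        [
            sum(image[r * f + dr][c * f + dc] for dr in range(f) for dc in range(f))
            / (f * f)
            for c in range(size)
        ]
        for r in range(size)
    ]


def build_pyramid(image, resolutions=(64, 256, 1024)):
    """Return {resolution: image} for each requested size, coarsest first."""
    return {size: downsample(image, size) for size in sorted(resolutions)}


# Small demonstration on an 8 x 8 image so the sketch runs instantly.
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
pyramid = build_pyramid(image, resolutions=(2, 4, 8))
```

Each entry of `pyramid` is the same image at a coarser or finer scale. A model that works at several resolutions at once would consume such a set of views together, rather than only the full-resolution image.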
Checkpoints
Model | Dataset | Resolution | Nested UNets |
---|---|---|---|
mdm-flickr-64 | Flickr 50M | 64 × 64 | โ |
mdm-flickr-256 | Flickr 50M | 256 × 256 | โ |
mdm-flickr-1024 | Flickr 50M | 1024 × 1024 | โ |
How to Use
Please refer to the original repository for training and inference instructions.
Citation
@misc{gu2023matryoshkadiffusionmodels,
title={Matryoshka Diffusion Models},
author={Jiatao Gu and Shuangfei Zhai and Yizhe Zhang and Josh Susskind and Navdeep Jaitly},
year={2023},
eprint={2310.15111},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2310.15111},
}