Upload 5 files
- LICENSE +25 -0
- README.md +11 -0
- model_card.md +41 -0
- requirements.txt +8 -0
- setup.py +16 -0
LICENSE
ADDED
@@ -0,0 +1,25 @@
+Modified MIT License
+
+Software Copyright (c) 2021 OpenAI
+
+We don’t claim ownership of the content you create with the DALL-E discrete VAE, so it is yours to
+do with as you please. We only ask that you use the model responsibly and clearly indicate that it
+was used.
+
+Permission is hereby granted, free of charge, to any person obtaining a copy of this software and
+associated documentation files (the "Software"), to deal in the Software without restriction,
+including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense,
+and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so,
+subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included
+in all copies or substantial portions of the Software.
+The above copyright notice and this permission notice need not be included
+with content created by the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
+INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE
+OR OTHER DEALINGS IN THE SOFTWARE.
README.md
ADDED
@@ -0,0 +1,11 @@
+# Overview
+
+[[Blog]](https://openai.com/blog/dall-e/) [[Paper]](https://arxiv.org/abs/2102.12092) [[Model Card]](model_card.md) [[Usage]](notebooks/usage.ipynb)
+
+This is the official PyTorch package for the discrete VAE used for DALL·E. The transformer used to generate the images from the text is not part of this code release.
+
+# Installation
+
+Before running [the example notebook](notebooks/usage.ipynb), you will need to install the package using
+
+    pip install DALL-E
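For readers who want to try the package right after `pip install DALL-E`, here is a minimal reconstruction round trip in the spirit of [the example notebook](notebooks/usage.ipynb). It is a sketch, not part of this commit: it assumes the `load_model`, `map_pixels`, and `unmap_pixels` helpers exported by `dall_e`, the `vocab_size` attribute on the encoder, and the checkpoint URLs used in that notebook.

```python
import torch
import torch.nn.functional as F
from dall_e import load_model, map_pixels, unmap_pixels

dev = torch.device("cpu")

# Checkpoint URLs as used in the example notebook.
enc = load_model("https://cdn.openai.com/dall-e/encoder.pkl", dev)
dec = load_model("https://cdn.openai.com/dall-e/decoder.pkl", dev)

# x: a [1, 3, 256, 256] float tensor with values in [0, 1];
# map_pixels shifts it into the range the encoder expects.
x = map_pixels(torch.rand(1, 3, 256, 256, device=dev))

# Encode to a 32x32 grid of discrete token ids, then re-expand the
# ids to one-hot vectors so the decoder can consume them.
z_logits = enc(x)
z = torch.argmax(z_logits, dim=1)
z = F.one_hot(z, num_classes=enc.vocab_size).permute(0, 3, 1, 2).float()

# Decode and map back to ordinary [0, 1] pixel values.
x_stats = dec(z).float()
x_rec = unmap_pixels(torch.sigmoid(x_stats[:, :3]))
print(x_rec.shape)  # torch.Size([1, 3, 256, 256])
```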
model_card.md
ADDED
@@ -0,0 +1,41 @@
+# Model Card: DALL·E dVAE
+
+Following [Model Cards for Model Reporting (Mitchell et al.)](https://arxiv.org/abs/1810.03993) and [Lessons from
+Archives (Jo & Gebru)](https://arxiv.org/pdf/1912.10389.pdf), we're providing some information about the discrete
+VAE (dVAE) that was used to train DALL·E.
+
+## Model Details
+
+The dVAE was developed by researchers at OpenAI to reduce the memory footprint of the transformer trained on the
+text-to-image generation task. The details involved in training the dVAE are described in [the paper][dalle_paper]. This
+model card describes the first version of the model, released in February 2021. The model consists of a convolutional
+encoder and decoder whose architectures are described [here](dall_e/encoder.py) and [here](dall_e/decoder.py), respectively.
+For questions or comments about the models or the code release, please file a GitHub issue.
+
+## Model Use
+
+### Intended Use
+
+The model is intended for others to use for training their own generative models.
+
+### Out-of-Scope Use Cases
+
+This model is inappropriate for high-fidelity image processing applications. We also do not recommend its use as a
+general-purpose image compressor.
+
+## Training Data
+
+The model was trained on publicly available text-image pairs collected from the internet. This data consists partly of
+[Conceptual Captions][cc] and a filtered subset of [YFCC100M][yfcc100m]. We used a subset of the filters described in
+[Sharma et al.][cc_paper] to construct this dataset; further details are described in [our paper][dalle_paper]. We will
+not be releasing the dataset.
+
+## Performance and Limitations
+
+The heavy compression from the encoding process results in a noticeable loss of detail in the reconstructed images. This
+renders it inappropriate for applications that require fine-grained details of the image to be preserved.
+
+[dalle_paper]: https://arxiv.org/abs/2102.12092
+[cc]: https://ai.google.com/research/ConceptualCaptions
+[cc_paper]: https://www.aclweb.org/anthology/P18-1238/
+[yfcc100m]: http://projects.dfki.uni-kl.de/yfcc100m/
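To put a rough number on the "heavy compression" noted under Performance and Limitations: [the paper](https://arxiv.org/abs/2102.12092) describes the dVAE as mapping a 256×256 RGB image to a 32×32 grid of tokens drawn from an 8192-entry codebook, which works out to roughly a 118× reduction in bits. A back-of-the-envelope check:

```python
import math

# Figures from the DALL-E paper: 256x256 RGB input, 32x32 token grid,
# 8192-entry codebook (log2(8192) = 13 bits per token).
raw_bits = 256 * 256 * 3 * 8             # raw image at 8 bits per channel
token_bits = 32 * 32 * math.log2(8192)   # 1024 tokens * 13 bits each
print(raw_bits / token_bits)             # ~118.2x fewer bits
```

Discarding that many bits is what makes the token grid short enough for the transformer, and it is also exactly why fine detail does not survive reconstruction.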
requirements.txt
ADDED
@@ -0,0 +1,8 @@
+Pillow
+blobfile
+mypy
+numpy
+pytest
+requests
+torch
+torchvision
setup.py
ADDED
@@ -0,0 +1,16 @@
+from setuptools import setup
+
+def parse_requirements(filename):
+    lines = (line.strip() for line in open(filename))
+    return [line for line in lines if line and not line.startswith("#")]
+
+setup(name='DALL-E',
+      version='0.1',
+      description='PyTorch package for the discrete VAE used for DALL·E.',
+      url='http://github.com/openai/DALL-E',
+      author='Aditya Ramesh',
+      author_email='aramesh@openai.com',
+      license='BSD',
+      packages=['dall_e'],
+      install_requires=parse_requirements('requirements.txt'),
+      zip_safe=True)
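As a quick sanity check on the `parse_requirements` helper above, here is a sketch of what it returns for the `requirements.txt` in this commit: whitespace is stripped, blank lines and `#` comments are dropped, and each remaining line becomes one entry of `install_requires`.

```python
def parse_requirements(filename):
    # Same helper as in setup.py: strip whitespace, drop blanks and comments.
    lines = (line.strip() for line in open(filename))
    return [line for line in lines if line and not line.startswith("#")]

print(parse_requirements("requirements.txt"))
# ['Pillow', 'blobfile', 'mypy', 'numpy', 'pytest', 'requests',
#  'torch', 'torchvision']
```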