---
library_name: keras
tags:
- generative
- denoising
- diffusion
- ddim
- ddpm
---

## Model description

The model's architecture is a [U-Net](https://arxiv.org/abs/1505.04597) with identical input and output dimensions. U-Net is a popular semantic segmentation architecture that progressively downsamples and then upsamples its input image, adding skip connections between layers of the same resolution. The network takes two inputs, the noisy images and the variances of their noise components, which it encodes using [sinusoidal embeddings](https://arxiv.org/abs/1706.03762). More details can be found in the corresponding Keras code example.
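
The sinusoidal embedding of the noise variances can be sketched as follows. This is a minimal NumPy illustration, not the model's exact code: the function name and the minimum frequency of 1.0 are assumptions, while the embedding dimensions (32) and max frequency (1000.0) come from the hyperparameter table below.

```python
import numpy as np

def sinusoidal_embedding(noise_variances, embedding_dims=32, max_frequency=1000.0):
    """Map scalar noise variances to sin/cos features at log-spaced frequencies."""
    min_frequency = 1.0  # assumption: lowest frequency band
    frequencies = np.exp(
        np.linspace(np.log(min_frequency), np.log(max_frequency), embedding_dims // 2)
    )
    angular_speeds = 2.0 * np.pi * frequencies
    x = np.asarray(noise_variances, dtype=np.float64)[..., None]
    # Concatenate the sine and cosine components into one embedding vector.
    return np.concatenate(
        [np.sin(angular_speeds * x), np.cos(angular_speeds * x)], axis=-1
    )
```

This gives the network a smooth, multi-scale representation of the noise level, similar to the positional encodings of Transformers.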

## Intended uses & limitations

The model is intended for educational purposes, as a simple example of denoising diffusion generative models. It has modest compute requirements while offering reasonable natural image generation performance.

## Training and evaluation data

The model is trained on the [Oxford Flowers 102](https://www.tensorflow.org/datasets/catalog/oxford_flowers102) dataset, a diverse natural image dataset containing around 8,000 photographs of flowers. Since the official splits are imbalanced (most of the images are in the test split), I created new splits (80% train, 20% validation) using the [TensorFlow Datasets slicing API](https://www.tensorflow.org/datasets/splits). Center crops were used for preprocessing.
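
The center-crop step can be sketched like this (a hypothetical helper; the exact preprocessing code in the Keras example may differ):

```python
import numpy as np

def center_crop(image):
    """Crop the largest centered square from an (H, W, C) image array."""
    height, width = image.shape[:2]
    side = min(height, width)
    top = (height - side) // 2
    left = (width - side) // 2
    return image[top:top + side, left:left + side]
```

After cropping, the square image would be resized to the 64x64 training resolution listed in the hyperparameter table below.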

## Training procedure

The model is trained to denoise noisy images, and it can generate images by iteratively denoising pure Gaussian noise. More details can be found in the corresponding Keras code example.
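
The iterative denoising loop can be sketched as follows. This is a simplified NumPy version: the cosine schedule is an assumption consistent with the min/max signal rates in the table below, `denoise_fn` stands in for the trained U-Net, and all names here are illustrative rather than the model card's actual code.

```python
import numpy as np

def diffusion_schedule(diffusion_times, min_signal_rate=0.02, max_signal_rate=0.95):
    """Cosine schedule: interpolate angles so signal decays as diffusion time grows."""
    start_angle = np.arccos(max_signal_rate)
    end_angle = np.arccos(min_signal_rate)
    angles = start_angle + diffusion_times * (end_angle - start_angle)
    return np.sin(angles), np.cos(angles)  # noise_rates, signal_rates

def reverse_diffusion(denoise_fn, initial_noise, diffusion_steps):
    """Generate images by iteratively denoising pure Gaussian noise (DDIM-style)."""
    step_size = 1.0 / diffusion_steps
    noisy_images = initial_noise
    for step in range(diffusion_steps):
        diffusion_time = 1.0 - step * step_size
        noise_rate, signal_rate = diffusion_schedule(diffusion_time)
        # Separate the current images into predicted noise and image components.
        pred_noises = denoise_fn(noisy_images, noise_rate ** 2)
        pred_images = (noisy_images - noise_rate * pred_noises) / signal_rate
        # Remix the two components at the next, lower noise level.
        next_noise_rate, next_signal_rate = diffusion_schedule(diffusion_time - step_size)
        noisy_images = next_signal_rate * pred_images + next_noise_rate * pred_noises
    return pred_images
```

Each step uses the network's noise prediction to estimate a clean image, then re-noises that estimate slightly less than before, so the images sharpen progressively.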
### Training hyperparameters

The following hyperparameters were used during training:

| Hyperparameter | Value |
| :-- | :-- |
| num epochs | 80 |
| dataset repetitions per epoch | 5 |
| image resolution | 64 |
| min signal rate | 0.02 |
| max signal rate | 0.95 |
| embedding dimensions | 32 |
| embedding max frequency | 1000.0 |
| widths | 32, 64, 96, 128 |
| block depth | 2 |
| batch size | 64 |
| exponential moving average | 0.999 |
| optimizer | [AdamW](https://arxiv.org/abs/1711.05101) |
| learning rate | 1e-3 |
| weight decay | 1e-4 |
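
As one illustration of the exponential-moving-average setting in the table, the EMA weights would be updated with decay 0.999 roughly like this (a hypothetical helper, not the model card's actual code):

```python
import numpy as np

def ema_update(ema_weights, model_weights, decay=0.999):
    """Blend each EMA weight toward the current model weight by (1 - decay)."""
    return [decay * ema + (1.0 - decay) * w
            for ema, w in zip(ema_weights, model_weights)]
```

Diffusion models are often sampled with the EMA copy of the weights, since averaging over training steps tends to give smoother generations.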
47 |
|
48 |
## Model Plot
|
49 |
|