File size: 14,071 Bytes
448ebbd |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 |
# ClimateGAN
- [ClimateGAN](#climategan)
- [Setup](#setup)
- [Coding conventions](#coding-conventions)
- [updates](#updates)
- [interfaces](#interfaces)
- [Logging on comet](#logging-on-comet)
- [Resources](#resources)
- [Example](#example)
- [Release process](#release-process)
## Setup
**`PyTorch >= 1.1.0`** otherwise optimizer.step() and scheduler.step() are in the wrong order ([docs](https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate))
**pytorch==1.6** to use pytorch-xla or automatic mixed precision (`amp` branch).
Configuration files use the **YAML** syntax. If you don't know what `&` and `<<` mean, you'll have a hard time reading the files. Have a look at:
* https://dev.to/paulasantamaria/introduction-to-yaml-125f
* https://stackoverflow.com/questions/41063361/what-is-the-double-left-arrow-syntax-in-yaml-called-and-wheres-it-specced/41065222
**pip**
```
$ pip install comet_ml scipy opencv-python torch torchvision omegaconf==1.4.1 hydra-core==0.11.3 scikit-image imageio addict tqdm torch_optimizer
```
## Coding conventions
* Tasks
* `x` is an input image, in [-1, 1]
* `s` is a segmentation target with `long` classes
* `d` is a depth map target in R, may be actually `log(depth)` or `1/depth`
* `m` is a binary mask with 1s where water is/should be
* Domains
* `r` is the *real* domain for the masker. Input images are real pictures of urban/suburban/rural areas
* `s` is the *simulated* domain for the masker. Input images are taken from our Unity world
* `rf` is the *real flooded* domain for the painter. Training images are pairs `(x, m)` of flooded scenes for which the water should be reconstructed, in the validation data input images are not flooded and we provide a manually labeled mask `m`
* `kitti` is a special `s` domain to pre-train the masker on [Virtual Kitti 2](https://europe.naverlabs.com/research/computer-vision/proxy-virtual-worlds-vkitti-2/)
* it alters the `trainer.loaders` dict to select relevant data sources from `trainer.all_loaders` in `trainer.switch_data()`. The rest of the code is identical.
* Flow
* This describes the call stack for the trainers standard training procedure
* `train()`
* `run_epoch()`
* `update_G()`
* `zero_grad(G)`
* `get_G_loss()`
* `get_masker_loss()`
* `masker_m_loss()` -> masking loss
* `masker_s_loss()` -> segmentation loss
* `masker_d_loss()` -> depth estimation loss
* `get_painter_loss()` -> painter's loss
* `g_loss.backward()`
* `g_opt_step()`
* `update_D()`
* `zero_grad(D)`
* `get_D_loss()`
* painter's disc losses
* `masker_m_loss()` -> masking AdvEnt disc loss
* `masker_s_loss()` -> segmentation AdvEnt disc loss
* `d_loss.backward()`
* `d_opt_step()`
* `update_learning_rates()` -> update learning rates according to schedules defined in `opts.gen.opt` and `opts.dis.opt`
* `run_validation()`
* compute val losses
* `eval_images()` -> compute metrics
* `log_comet_images()` -> compute and upload inferences
* `save()`
### Resuming
Set `train.resume` to `True` in `opts.yaml` and specify where to load the weights:
Use a config's `load_path` namespace. It should have sub-keys `m`, `p` and `pm`:
```yaml
load_paths:
p: none # Painter weights
m: none # Masker weights
pm: none # Painter + Masker weights (single ckpt for both)
```
1. any path which leads to a dir will be loaded as `path / checkpoints / latest_ckpt.pth`
2. if you want to specify a specific checkpoint (not the latest), it MUST be a `.pth` file
3. resuming a `P` **OR** an `M` model, you may only specify 1 of `load_path.p` **OR** `load_path.m`.
You may also leave **BOTH** at `none`, in which case `output_path / checkpoints / latest_ckpt.pth`
will be used
4. resuming a P+M model, you may specify (`p` AND `m`) **OR** `pm` **OR** leave all at `none`,
in which case `output_path / checkpoints / latest_ckpt.pth` will be used to load from
a single checkpoint
### Generator
* **Encoder**:
`trainer.G.encoder` Deeplabv2 or v3-based encoder
* Code borrowed from
* https://github.com/valeoai/ADVENT/blob/master/advent/model/deeplabv2.py
* https://github.com/CoinCheung/DeepLab-v3-plus-cityscapes
* **Decoders**:
* `trainer.G.decoders["s"]` -> *Segmentation* -> DLV3+ architecture (ASPP + Decoder)
* `trainer.G.decoders["d"]` -> *Depth* -> ResBlocks + (Upsample + Conv)
* `trainer.G.decoders["m"]` -> *Mask* -> ResBlocks + (Upsample + Conv) -> Binary mask: 1 = water should be there
* `trainer.G.mask()` predicts a mask and optionally applies `sigmoid` from an `x` input or a `z` input
* **Painter**: `trainer.G.painter` -> [GauGAN SPADE-based](https://github.com/NVlabs/SPADE)
* input = masked image
* `trainer.G.paint(m, x)` higher level function which takes care of masking
* If `opts.gen.p.paste_original_content` the painter should only create water and not reconstruct outside the mask: the output of `paint()` is `painted * m + x * (1 - m)`
High level methods of interest:
* `trainer.infer_all()` creates a dictionary of events with keys `flood` `wildfire` and `smog`. Can take in a single image or a batch, of numpy arrays or torch tensors, on CPU/GPU/TPU. This method calls, amongst others:
* `trainer.G.encode()` to compute the shared latent vector `z`
* `trainer.G.mask(z=z)` to infer the mask
* `trainer.compute_fire(x, segmentation)` to create a wildfire image from `x` and inferred segmentation
* `trainer.compute_smog(x, depth)` to create a smog image from `x` and inferred depth
* `trainer.compute_flood(x, mask)` to create a flood image from `x` and inferred mask using the painter (`trainer.G.paint(m, x)`)
* `Trainer.resume_from_path()` static method to resume a trainer from a path
### Discriminator
## updates
multi-batch:
```
multi_domain_batch = {"rf: batch0, "r": batch1, "s": batch2}
```
## interfaces
### batches
```python
batch = Dict({
"data": {
"d": depthmap,,
"s": segmentation_map,
"m": binary_mask
"x": real_flooded_image,
},
"paths":{
same_keys: path_to_file
}
"domain": list(rf | r | s),
"mode": list(train | val)
})
```
### data
#### json files
| name | domain | description | author |
| :--------------------------------------------- | :----: | :------------------------------------------------------------------------- | :-------: |
| **train_r_full.json, val_r_full.json** | r | MiDaS+ Segmentation pseudo-labels .pt (HRNet + Cityscapes) | Mélisande |
| **train_s_full.json, val_s_full.json** | s | Simulated data from Unity11k urban + Unity suburban dataset | *** |
| train_s_nofences.json, val_s_nofences.json | s | Simulated data from Unity11k urban + Unity suburban dataset without fences | Alexia |
| train_r_full_pl.json, val_r_full_pl.json | r | MegaDepth + Segmentation pseudo-labels .pt (HRNet + Cityscapes) | Alexia |
| train_r_full_midas.json, val_r_full_midas.json | r | MiDaS+ Segmentation (HRNet + Cityscapes) | Mélisande |
| train_r_full_old.json, val_r_full_old.json | r | MegaDepth+ Segmentation (HRNet + Cityscapes) | *** |
| train_r_nopeople.json, val_r_nopeople.json | r | Same training data as above with people removed | Sasha |
| train_rf_with_sim.json | rf | Doubled train_rf's size with sim data (randomly chosen) | Victor |
| train_rf.json | rf | UPDATE (12/12/20): added 50 ims & masks from ADE20K Outdoors | Victor |
| train_allres.json, val_allres.json | rf | includes both lowres and highres from ORCA_water_seg | Tianyu |
| train_highres_only.json, val_highres_only.json | rf | includes only highres from ORCA_water_seg | Tianyu |
```yaml
# data file ; one for each r|s
- x: /path/to/image
m: /path/to/mask
s: /path/to/segmentation map
- x: /path/to/another image
d: /path/to/depth map
m: /path/to/mask
s: /path/to/segmentation map
- x: ...
```
or
```json
[
{
"x": "/Users/victor/Documents/ccai/github/climategan/example_data/gsv_000005.jpg",
"s": "/Users/victor/Documents/ccai/github/climategan/example_data/gsv_000005.npy",
"d": "/Users/victor/Documents/ccai/github/climategan/example_data/gsv_000005_depth.jpg"
},
{
"x": "/Users/victor/Documents/ccai/github/climategan/example_data/gsv_000006.jpg",
"s": "/Users/victor/Documents/ccai/github/climategan/example_data/gsv_000006.npy",
"d": "/Users/victor/Documents/ccai/github/climategan/example_data/gsv_000006_depth.jpg"
}
]
```
The json files used are located at `/network/tmp1/ccai/data/climategan/`. In the basenames, `_s` denotes simulated domain data and `_r` real domain data.
The `base` folder contains json files with paths to images (`"x"`key) and masks (taken as ground truth for the area that should be flooded, `"m"` key).
The `seg` folder contains json files and keys `"x"`, `"m"` and `"s"` (segmentation) for each image.
loaders
```
loaders = Dict({
train: { r: loader, s: loader},
val: { r: loader, s: loader}
})
```
### losses
`trainer.losses` is a dictionary mapping to loss functions to optimize for the 3 main parts of the architecture: generator `G`, discriminators `D`:
```python
trainer.losses = {
"G":{ # generator
"gan": { # gan loss from the discriminators
"a": GANLoss, # adaptation decoder
"t": GANLoss # translation decoder
},
"cycle": { # cycle-consistency loss
"a": l1 | l2,,
"t": l1 | l2,
},
"auto": { # auto-encoding loss a.k.a. reconstruction loss
"a": l1 | l2,
"t": l1 | l2
},
"tasks": { # specific losses for each auxillary task
"d": func, # depth estimation
"h": func, # height estimation
"s": cross_entropy_2d, # segmentation
"w": func, # water generation
},
"classifier": l1 | l2 | CE # loss from fooling the classifier
},
"D": GANLoss, # discriminator losses from the generator and true data
"C": l1 | l2 | CE # classifier should predict the right 1-h vector [rf, rn, sf, sn]
}
```
## Logging on comet
Comet.ml will look for api keys in the following order: argument to the `Experiment(api_key=...)` call, `COMET_API_KEY` environment variable, `.comet.config` file in the current working directory, `.comet.config` in the current user's home directory.
If your not managing several comet accounts at the same time, I recommend putting `.comet.config` in your home as such:
```
[comet]
api_key=<api_key>
workspace=vict0rsch
rest_api_key=<rest_api_key>
```
### Tests
Run tests by executing `python test_trainer.py`. You can add `--no_delete` not to delete the comet experiment at exit and inspect uploads.
Write tests as scenarios by adding to the list `test_scenarios` in the file. A scenario is a dict of overrides over the base opts in `shared/trainer/defaults.yaml`. You can create special flags for the scenario by adding keys which start with `__`. For instance, `__doc` is a mandatory key in any scenario describing it succinctly.
## Resources
[Tricks and Tips for Training a GAN](https://chloes-dl.com/2019/11/19/tricks-and-tips-for-training-a-gan/)
[GAN Hacks](https://github.com/soumith/ganhacks)
[Keep Calm and train a GAN. Pitfalls and Tips on training Generative Adversarial Networks](https://medium.com/@utk.is.here/keep-calm-and-train-a-gan-pitfalls-and-tips-on-training-generative-adversarial-networks-edd529764aa9)
## Example
**Inference: computing floods**
```python
from pathlib import Path
from skimage.io import imsave
from tqdm import tqdm
from climategan.trainer import Trainer
from climategan.utils import find_images
from climategan.tutils import tensor_ims_to_np_uint8s
from climategan.transforms import PrepareInference
model_path = "some/path/to/output/folder" # not .ckpt
input_folder = "path/to/a/folder/with/images"
output_path = "path/where/images/will/be/written"
# resume trainer
trainer = Trainer.resume_from_path(model_path, new_exp=None, inference=True)
# find paths for all images in the input folder. There is a recursive option.
im_paths = sorted(find_images(input_folder), key=lambda x: x.name)
# Load images into tensors
# * smaller side resized to 640 - keeping aspect ratio
# * then longer side is cropped in the center
# * result is a 1x3x640x640 float tensor in [-1; 1]
xs = PrepareInference()(im_paths)
# send to device
xs = [x.to(trainer.device) for x in xs]
# compute flood
# * compute mask
# * binarize mask if bin_value > 0
# * paint x using this mask
ys = [trainer.compute_flood(x, bin_value=0.5) for x in tqdm(xs)]
# convert 1x3x640x640 float tensors in [-1; 1] into 640x640x3 numpy arrays in [0, 255]
np_ys = [tensor_ims_to_np_uint8s(y) for y in tqdm(ys)]
# write images
for i, n in tqdm(zip(im_paths, np_ys), total=len(im_paths)):
imsave(Path(output_path) / i.name, n)
```
## Release process
In the `release/` folder
* create a `model/` folder
* create folders `model/masker/` and `model/painter/`
* add the climategan code in `release/`: `git clone git@github.com:cc-ai/climategan.git`
* move the code to `release/`: `cp climategan/* . && rm -rf climategan`
* update `model/masker/opts/events` with `events:` from `shared/trainer/opts.yaml`
* update `model/masker/opts/val.val_painter` to `"model/painter/checkpoints/latest_ckpt.pth"`
* update `model/masker/opts/load_paths.m` to `"model/masker/checkpoints/latest_ckpt.pth"`
|