ClimateGAN
Setup
PyTorch >= 1.1.0
otherwise optimizer.step() and scheduler.step() are in the wrong order (docs)
pytorch==1.6 to use pytorch-xla or automatic mixed precision (amp
branch).
Configuration files use the YAML syntax. If you don't know what &
and <<
mean, you'll have a hard time reading the files. Have a look at:
- https://dev.to/paulasantamaria/introduction-to-yaml-125f
- https://stackoverflow.com/questions/41063361/what-is-the-double-left-arrow-syntax-in-yaml-called-and-wheres-it-specced/41065222
pip
$ pip install comet_ml scipy opencv-python torch torchvision omegaconf==1.4.1 hydra-core==0.11.3 scikit-image imageio addict tqdm torch_optimizer
Coding conventions
- Tasks
x
is an input image, in [-1, 1]s
is a segmentation target withlong
classesd
is a depth map target in R, may be actuallylog(depth)
or1/depth
m
is a binary mask with 1s where water is/should be
- Domains
r
is the real domain for the masker. Input images are real pictures of urban/suburban/rural areass
is the simulated domain for the masker. Input images are taken from our Unity worldrf
is the real flooded domain for the painter. Training images are pairs(x, m)
of flooded scenes for which the water should be reconstructed, in the validation data input images are not flooded and we provide a manually labeled maskm
kitti
is a specials
domain to pre-train the masker on Virtual Kitti 2- it alters the
trainer.loaders
dict to select relevant data sources fromtrainer.all_loaders
intrainer.switch_data()
. The rest of the code is identical.
- it alters the
- Flow
- This describes the call stack for the trainers standard training procedure
train()
run_epoch()
update_G()
zero_grad(G)
get_G_loss()
get_masker_loss()
masker_m_loss()
-> masking lossmasker_s_loss()
-> segmentation lossmasker_d_loss()
-> depth estimation loss
get_painter_loss()
-> painter's loss
g_loss.backward()
g_opt_step()
update_D()
zero_grad(D)
get_D_loss()
- painter's disc losses
masker_m_loss()
-> masking AdvEnt disc lossmasker_s_loss()
-> segmentation AdvEnt disc loss
d_loss.backward()
d_opt_step()
update_learning_rates()
-> update learning rates according to schedules defined inopts.gen.opt
andopts.dis.opt
run_validation()
- compute val losses
eval_images()
-> compute metricslog_comet_images()
-> compute and upload inferences
save()
Resuming
Set train.resume
to True
in opts.yaml
and specify where to load the weights:
Use a config's load_path
namespace. It should have sub-keys m
, p
and pm
:
load_paths:
p: none # Painter weights
m: none # Masker weights
pm: none # Painter + Masker weights (single ckpt for both)
- any path which leads to a dir will be loaded as
path / checkpoints / latest_ckpt.pth
- if you want to specify a specific checkpoint (not the latest), it MUST be a
.pth
file - resuming a
P
OR anM
model, you may only specify 1 ofload_path.p
ORload_path.m
. You may also leave BOTH atnone
, in which caseoutput_path / checkpoints / latest_ckpt.pth
will be used - resuming a P+M model, you may specify (
p
ANDm
) ORpm
OR leave all atnone
, in which caseoutput_path / checkpoints / latest_ckpt.pth
will be used to load from a single checkpoint
Generator
Encoder:
trainer.G.encoder
Deeplabv2 or v3-based encoderDecoders:
trainer.G.decoders["s"]
-> Segmentation -> DLV3+ architecture (ASPP + Decoder)trainer.G.decoders["d"]
-> Depth -> ResBlocks + (Upsample + Conv)trainer.G.decoders["m"]
-> Mask -> ResBlocks + (Upsample + Conv) -> Binary mask: 1 = water should be theretrainer.G.mask()
predicts a mask and optionally appliessigmoid
from anx
input or az
input
Painter:
trainer.G.painter
-> GauGAN SPADE-based- input = masked image
trainer.G.paint(m, x)
higher level function which takes care of maskingIf
opts.gen.p.paste_original_content
the painter should only create water and not reconstruct outside the mask: the output ofpaint()
ispainted * m + x * (1 - m)
High level methods of interest:
trainer.infer_all()
creates a dictionary of events with keysflood
wildfire
andsmog
. Can take in a single image or a batch, of numpy arrays or torch tensors, on CPU/GPU/TPU. This method calls, amongst others:trainer.G.encode()
to compute the shared latent vectorz
trainer.G.mask(z=z)
to infer the masktrainer.compute_fire(x, segmentation)
to create a wildfire image fromx
and inferred segmentationtrainer.compute_smog(x, depth)
to create a smog image fromx
and inferred depthtrainer.compute_flood(x, mask)
to create a flood image fromx
and inferred mask using the painter (trainer.G.paint(m, x)
)
Trainer.resume_from_path()
static method to resume a trainer from a path
Discriminator
updates
multi-batch:
multi_domain_batch = {"rf: batch0, "r": batch1, "s": batch2}
interfaces
batches
batch = Dict({
"data": {
"d": depthmap,,
"s": segmentation_map,
"m": binary_mask
"x": real_flooded_image,
},
"paths":{
same_keys: path_to_file
}
"domain": list(rf | r | s),
"mode": list(train | val)
})
data
json files
name | domain | description | author |
---|---|---|---|
train_r_full.json, val_r_full.json | r | MiDaS+ Segmentation pseudo-labels .pt (HRNet + Cityscapes) | Mélisande |
train_s_full.json, val_s_full.json | s | Simulated data from Unity11k urban + Unity suburban dataset | *** |
train_s_nofences.json, val_s_nofences.json | s | Simulated data from Unity11k urban + Unity suburban dataset without fences | Alexia |
train_r_full_pl.json, val_r_full_pl.json | r | MegaDepth + Segmentation pseudo-labels .pt (HRNet + Cityscapes) | Alexia |
train_r_full_midas.json, val_r_full_midas.json | r | MiDaS+ Segmentation (HRNet + Cityscapes) | Mélisande |
train_r_full_old.json, val_r_full_old.json | r | MegaDepth+ Segmentation (HRNet + Cityscapes) | *** |
train_r_nopeople.json, val_r_nopeople.json | r | Same training data as above with people removed | Sasha |
train_rf_with_sim.json | rf | Doubled train_rf's size with sim data (randomly chosen) | Victor |
train_rf.json | rf | UPDATE (12/12/20): added 50 ims & masks from ADE20K Outdoors | Victor |
train_allres.json, val_allres.json | rf | includes both lowres and highres from ORCA_water_seg | Tianyu |
train_highres_only.json, val_highres_only.json | rf | includes only highres from ORCA_water_seg | Tianyu |
# data file ; one for each r|s
- x: /path/to/image
m: /path/to/mask
s: /path/to/segmentation map
- x: /path/to/another image
d: /path/to/depth map
m: /path/to/mask
s: /path/to/segmentation map
- x: ...
or
[
{
"x": "/Users/victor/Documents/ccai/github/climategan/example_data/gsv_000005.jpg",
"s": "/Users/victor/Documents/ccai/github/climategan/example_data/gsv_000005.npy",
"d": "/Users/victor/Documents/ccai/github/climategan/example_data/gsv_000005_depth.jpg"
},
{
"x": "/Users/victor/Documents/ccai/github/climategan/example_data/gsv_000006.jpg",
"s": "/Users/victor/Documents/ccai/github/climategan/example_data/gsv_000006.npy",
"d": "/Users/victor/Documents/ccai/github/climategan/example_data/gsv_000006_depth.jpg"
}
]
The json files used are located at /network/tmp1/ccai/data/climategan/
. In the basenames, _s
denotes simulated domain data and _r
real domain data.
The base
folder contains json files with paths to images ("x"
key) and masks (taken as ground truth for the area that should be flooded, "m"
key).
The seg
folder contains json files and keys "x"
, "m"
and "s"
(segmentation) for each image.
loaders
loaders = Dict({
train: { r: loader, s: loader},
val: { r: loader, s: loader}
})
losses
trainer.losses
is a dictionary mapping to loss functions to optimize for the 3 main parts of the architecture: generator G
, discriminators D
:
trainer.losses = {
"G":{ # generator
"gan": { # gan loss from the discriminators
"a": GANLoss, # adaptation decoder
"t": GANLoss # translation decoder
},
"cycle": { # cycle-consistency loss
"a": l1 | l2,,
"t": l1 | l2,
},
"auto": { # auto-encoding loss a.k.a. reconstruction loss
"a": l1 | l2,
"t": l1 | l2
},
"tasks": { # specific losses for each auxillary task
"d": func, # depth estimation
"h": func, # height estimation
"s": cross_entropy_2d, # segmentation
"w": func, # water generation
},
"classifier": l1 | l2 | CE # loss from fooling the classifier
},
"D": GANLoss, # discriminator losses from the generator and true data
"C": l1 | l2 | CE # classifier should predict the right 1-h vector [rf, rn, sf, sn]
}
Logging on comet
Comet.ml will look for api keys in the following order: argument to the Experiment(api_key=...)
call, COMET_API_KEY
environment variable, .comet.config
file in the current working directory, .comet.config
in the current user's home directory.
If your not managing several comet accounts at the same time, I recommend putting .comet.config
in your home as such:
[comet]
api_key=<api_key>
workspace=vict0rsch
rest_api_key=<rest_api_key>
Tests
Run tests by executing python test_trainer.py
. You can add --no_delete
not to delete the comet experiment at exit and inspect uploads.
Write tests as scenarios by adding to the list test_scenarios
in the file. A scenario is a dict of overrides over the base opts in shared/trainer/defaults.yaml
. You can create special flags for the scenario by adding keys which start with __
. For instance, __doc
is a mandatory key in any scenario describing it succinctly.
Resources
Tricks and Tips for Training a GAN GAN Hacks Keep Calm and train a GAN. Pitfalls and Tips on training Generative Adversarial Networks
Example
Inference: computing floods
from pathlib import Path
from skimage.io import imsave
from tqdm import tqdm
from climategan.trainer import Trainer
from climategan.utils import find_images
from climategan.tutils import tensor_ims_to_np_uint8s
from climategan.transforms import PrepareInference
model_path = "some/path/to/output/folder" # not .ckpt
input_folder = "path/to/a/folder/with/images"
output_path = "path/where/images/will/be/written"
# resume trainer
trainer = Trainer.resume_from_path(model_path, new_exp=None, inference=True)
# find paths for all images in the input folder. There is a recursive option.
im_paths = sorted(find_images(input_folder), key=lambda x: x.name)
# Load images into tensors
# * smaller side resized to 640 - keeping aspect ratio
# * then longer side is cropped in the center
# * result is a 1x3x640x640 float tensor in [-1; 1]
xs = PrepareInference()(im_paths)
# send to device
xs = [x.to(trainer.device) for x in xs]
# compute flood
# * compute mask
# * binarize mask if bin_value > 0
# * paint x using this mask
ys = [trainer.compute_flood(x, bin_value=0.5) for x in tqdm(xs)]
# convert 1x3x640x640 float tensors in [-1; 1] into 640x640x3 numpy arrays in [0, 255]
np_ys = [tensor_ims_to_np_uint8s(y) for y in tqdm(ys)]
# write images
for i, n in tqdm(zip(im_paths, np_ys), total=len(im_paths)):
imsave(Path(output_path) / i.name, n)
Release process
In the release/
folder
- create a
model/
folder - create folders
model/masker/
andmodel/painter/
- add the climategan code in
release/
:git clone git@github.com:cc-ai/climategan.git
- move the code to
release/
:cp climategan/* . && rm -rf climategan
- update
model/masker/opts/events
withevents:
fromshared/trainer/opts.yaml
- update
model/masker/opts/val.val_painter
to"model/painter/checkpoints/latest_ckpt.pth"
- update
model/masker/opts/load_paths.m
to"model/masker/checkpoints/latest_ckpt.pth"