## CroCo-Stereo and CroCo-Flow

This README explains how to use CroCo-Stereo and CroCo-Flow as well as how they were trained.
All commands should be launched from the root directory.

### Simple inference example

We provide a simple inference example for CroCo-Stereo and CroCo-Flow in the notebook `croco-stereo-flow-demo.ipynb`.
Before running it, please download the trained models with:
```
bash stereoflow/download_model.sh crocostereo.pth
bash stereoflow/download_model.sh crocoflow.pth
```

### Prepare data for training or evaluation

Put the datasets used for training/evaluation in `./data/stereoflow` (or update the paths at the top of `stereoflow/datasets_stereo.py` and `stereoflow/datasets_flow.py`).
The expected file structure for each dataset is shown below:
<details>
<summary>FlyingChairs</summary>

```
./data/stereoflow/FlyingChairs/
└───chairs_split.txt
└───data/
    └─── ...
```
</details>

<details>
<summary>MPI-Sintel</summary>

```
./data/stereoflow/MPI-Sintel/
└───training/
│   └───clean/
│   └───final/
│   └───flow/
└───test/
    └───clean/
    └───final/
```
</details>

<details>
<summary>SceneFlow (including FlyingThings)</summary>

```
./data/stereoflow/SceneFlow/
└───Driving/
│   └───disparity/
│   └───frames_cleanpass/
│   └───frames_finalpass/
└───FlyingThings/
│   └───disparity/
│   └───frames_cleanpass/
│   └───frames_finalpass/
│   └───optical_flow/
└───Monkaa/
    └───disparity/
    └───frames_cleanpass/
    └───frames_finalpass/
```
</details>

<details>
<summary>TartanAir</summary>

```
./data/stereoflow/TartanAir/
└───abandonedfactory/
│   └───.../
└───abandonedfactory_night/
│   └───.../
└───.../
```
</details>

<details>
<summary>Booster</summary>

```
./data/stereoflow/booster_gt/
└───train/
    └───balanced/
        └───Bathroom/
        └───Bedroom/
        └───...
```
</details>

<details>
<summary>CREStereo</summary>

```
./data/stereoflow/crenet_stereo_trainset/
└───stereo_trainset/
    └───crestereo/
        └───hole/
        └───reflective/
        └───shapenet/
        └───tree/
```
</details>

<details>
<summary>ETH3D Two-view Low-res</summary>

```
./data/stereoflow/eth3d_lowres/
└───test/
│   └───lakeside_1l/
│   └───...
└───train/
│   └───delivery_area_1l/
│   └───...
└───train_gt/
    └───delivery_area_1l/
    └───...
```
</details>

<details>
<summary>KITTI 2012</summary>

```
./data/stereoflow/kitti-stereo-2012/
└───testing/
│   └───colored_0/
│   └───colored_1/
└───training/
    └───colored_0/
    └───colored_1/
    └───disp_occ/
    └───flow_occ/
```
</details>

<details>
<summary>KITTI 2015</summary>

```
./data/stereoflow/kitti-stereo-2015/
└───testing/
│   └───image_2/
│   └───image_3/
└───training/
    └───image_2/
    └───image_3/
    └───disp_occ_0/
    └───flow_occ/
```
</details>

<details>
<summary>Middlebury</summary>

```
./data/stereoflow/middlebury
└───2005/
│   └───train/
│       └───Art/
│       └───...
└───2006/
│   └───Aloe/
│   └───Baby1/
│   └───...
└───2014/
│   └───Adirondack-imperfect/
│   └───Adirondack-perfect/
│   └───...
└───2021/
│   └───data/
│       └───artroom1/
│       └───artroom2/
│       └───...
└───MiddEval3_F/
    └───test/
    │   └───Australia/
    │   └───...
    └───train/
        └───Adirondack/
        └───...
```
</details>

<details>
<summary>Spring</summary>

```
./data/stereoflow/spring/
└───test/
│   └───0003/
│   └───...
└───train/
    └───0001/
    └───...
```
</details>
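
Before launching training or evaluation, it can help to check that the top-level dataset folders are in place. The following is a minimal sketch, not part of the repository: the helper name `missing_datasets` is hypothetical, the folder names are copied from the layouts above, and it only checks the first directory level under the default `./data/stereoflow` root.

```python
from pathlib import Path

# Top-level dataset folders, copied from the layouts listed in this README.
EXPECTED_DIRS = [
    "FlyingChairs", "MPI-Sintel", "SceneFlow", "TartanAir", "booster_gt",
    "crenet_stereo_trainset", "eth3d_lowres", "kitti-stereo-2012",
    "kitti-stereo-2015", "middlebury", "spring",
]


def missing_datasets(root="./data/stereoflow", expected=EXPECTED_DIRS):
    """Return the expected dataset folders that are absent under `root`.

    Hypothetical helper: it does not inspect the sub-folders of each dataset.
    """
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_datasets()
    if missing:
        print("Missing dataset folders:", ", ".join(missing))
    else:
        print("All expected dataset folders found.")
```

Only the datasets needed for a given command have to be present, so a non-empty list is not necessarily an error.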


### CroCo-Stereo

##### Main model 

The main training of CroCo-Stereo was performed on a series of datasets, and the resulting model was used as is for the Middlebury v3 benchmark.

```
# Download the model
bash stereoflow/download_model.sh crocostereo.pth
# Middlebury v3 submission
python stereoflow/test.py --model stereoflow_models/crocostereo.pth --dataset "MdEval3('all_full')" --save submission --tile_overlap 0.9
# Training command that was used, using checkpoint-last.pth
python -u stereoflow/train.py stereo --criterion "LaplacianLossBounded2()" --dataset "CREStereo('train')+SceneFlow('train_allpass')+30*ETH3DLowRes('train')+50*Md05('train')+50*Md06('train')+50*Md14('train')+50*Md21('train')+50*MdEval3('train_full')+Booster('train_balanced')" --val_dataset "SceneFlow('test1of100_finalpass')+SceneFlow('test1of100_cleanpass')+ETH3DLowRes('subval')+Md05('subval')+Md06('subval')+Md14('subval')+Md21('subval')+MdEval3('subval_full')+Booster('subval_balanced')" --lr 3e-5 --batch_size 6 --epochs 32 --pretrained pretrained_models/CroCo_V2_ViTLarge_BaseDecoder.pth --output_dir xps/crocostereo/main/
# or it can be launched on multiple GPUs (while maintaining the effective batch size), e.g. on 3 GPUs:
torchrun --nproc_per_node 3 stereoflow/train.py stereo --criterion "LaplacianLossBounded2()" --dataset "CREStereo('train')+SceneFlow('train_allpass')+30*ETH3DLowRes('train')+50*Md05('train')+50*Md06('train')+50*Md14('train')+50*Md21('train')+50*MdEval3('train_full')+Booster('train_balanced')" --val_dataset "SceneFlow('test1of100_finalpass')+SceneFlow('test1of100_cleanpass')+ETH3DLowRes('subval')+Md05('subval')+Md06('subval')+Md14('subval')+Md21('subval')+MdEval3('subval_full')+Booster('subval_balanced')" --lr 3e-5 --batch_size 2 --epochs 32 --pretrained pretrained_models/CroCo_V2_ViTLarge_BaseDecoder.pth --output_dir xps/crocostereo/main/
```
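
The two training commands above are meant to be equivalent in terms of effective batch size. As a small sketch of the arithmetic, assuming that `--batch_size` is the per-GPU batch size and that `--accum_iter` (used in the fine-tuning commands further down) multiplies it through gradient accumulation, the hypothetical helper below computes the number of samples per optimizer step:

```python
def effective_batch_size(batch_size, num_gpus=1, accum_iter=1):
    """Samples contributing to one optimizer step.

    Assumption: `batch_size` is per GPU, as the commands in this README suggest.
    """
    return batch_size * num_gpus * accum_iter


single_gpu = effective_batch_size(batch_size=6)              # single-GPU command
three_gpus = effective_batch_size(batch_size=2, num_gpus=3)  # torchrun command
print(single_gpu, three_gpus)  # 6 6
```

When changing the number of GPUs, adjust `--batch_size` so that this product stays at 6.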

For evaluation on the validation sets, we also provide the model trained on the `subtrain` subset of the training sets.

```
# Download the model
bash stereoflow/download_model.sh crocostereo_subtrain.pth
# Evaluation on validation sets 
python stereoflow/test.py --model stereoflow_models/crocostereo_subtrain.pth --dataset "MdEval3('subval_full')+ETH3DLowRes('subval')+SceneFlow('test_finalpass')+SceneFlow('test_cleanpass')" --save metrics --tile_overlap 0.9
# Training command that was used (same as above but on subtrain, using checkpoint-best.pth), can also be launched on multiple gpus
python -u stereoflow/train.py stereo --criterion "LaplacianLossBounded2()" --dataset "CREStereo('train')+SceneFlow('train_allpass')+30*ETH3DLowRes('subtrain')+50*Md05('subtrain')+50*Md06('subtrain')+50*Md14('subtrain')+50*Md21('subtrain')+50*MdEval3('subtrain_full')+Booster('subtrain_balanced')" --val_dataset "SceneFlow('test1of100_finalpass')+SceneFlow('test1of100_cleanpass')+ETH3DLowRes('subval')+Md05('subval')+Md06('subval')+Md14('subval')+Md21('subval')+MdEval3('subval_full')+Booster('subval_balanced')" --lr 3e-5 --batch_size 6 --epochs 32 --pretrained pretrained_models/CroCo_V2_ViTLarge_BaseDecoder.pth --output_dir xps/crocostereo/main_subtrain/
```
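
The `--dataset` and `--val_dataset` arguments combine several datasets in one string, with terms joined by `+` and an optional integer multiplier such as `30*`. To make long specifications easier to read, the sketch below splits such a string into its terms. It is only an illustration: `parse_dataset_spec` is a hypothetical helper, not the parser used by `stereoflow/train.py`, and the meaning of the multiplier (presumably a repetition factor for smaller datasets) is an assumption.

```python
import re

# Matches an optional integer multiplier, a dataset name, and a quoted split.
_TERM = re.compile(r"^(?:(\d+)\*)?(\w+)\('([^']*)'\)$")


def parse_dataset_spec(spec):
    """Split a spec like "30*A('x')+B('y')" into (multiplier, name, split) tuples."""
    terms = []
    for term in spec.split("+"):
        match = _TERM.match(term.strip())
        if match is None:
            raise ValueError(f"Unrecognized dataset term: {term!r}")
        mult, name, split = match.groups()
        terms.append((int(mult) if mult else 1, name, split))
    return terms


spec = "CREStereo('train')+SceneFlow('train_allpass')+30*ETH3DLowRes('subtrain')+50*Md05('subtrain')"
for mult, name, split in parse_dataset_spec(spec):
    print(f"{mult:>3} x {name}({split})")
```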

##### Other models 

<details>
	<summary>Model for ETH3D</summary> 
	The model used for the submission on ETH3D is trained with the same command but using an unbounded Laplacian loss.
	
	# Download the model
	bash stereoflow/download_model.sh crocostereo_eth3d.pth
	# ETH3D submission
	python stereoflow/test.py --model stereoflow_models/crocostereo_eth3d.pth --dataset "ETH3DLowRes('all')" --save submission --tile_overlap 0.9
	# Training command that was used
	python -u stereoflow/train.py stereo --criterion "LaplacianLoss()" --tile_conf_mode conf_expbeta3 --dataset "CREStereo('train')+SceneFlow('train_allpass')+30*ETH3DLowRes('train')+50*Md05('train')+50*Md06('train')+50*Md14('train')+50*Md21('train')+50*MdEval3('train_full')+Booster('train_balanced')" --val_dataset "SceneFlow('test1of100_finalpass')+SceneFlow('test1of100_cleanpass')+ETH3DLowRes('subval')+Md05('subval')+Md06('subval')+Md14('subval')+Md21('subval')+MdEval3('subval_full')+Booster('subval_balanced')" --lr 3e-5 --batch_size 6 --epochs 32 --pretrained pretrained_models/CroCo_V2_ViTLarge_BaseDecoder.pth --output_dir xps/crocostereo/main_eth3d/
	
</details>

<details>
	<summary>Main model finetuned on Kitti</summary>
	
	# Download the model
	bash stereoflow/download_model.sh crocostereo_finetune_kitti.pth
	# Kitti submission 
	python stereoflow/test.py --model stereoflow_models/crocostereo_finetune_kitti.pth --dataset "Kitti15('test')" --save submission --tile_overlap 0.9
	# Training command that was used
	python -u stereoflow/train.py stereo --crop 352 1216 --criterion "LaplacianLossBounded2()" --dataset "Kitti12('train')+Kitti15('train')" --lr 3e-5 --batch_size 1 --accum_iter 6 --epochs 20 --pretrained pretrained_models/CroCo_V2_ViTLarge_BaseDecoder.pth --start_from stereoflow_models/crocostereo.pth --output_dir xps/crocostereo/finetune_kitti/ --save_every 5
</details>

<details>
	<summary>Main model finetuned on Spring</summary>
	
	# Download the model
	bash stereoflow/download_model.sh crocostereo_finetune_spring.pth
	# Spring submission 
	python stereoflow/test.py --model stereoflow_models/crocostereo_finetune_spring.pth --dataset "Spring('test')" --save submission --tile_overlap 0.9
	# Training command that was used
	python -u stereoflow/train.py stereo --criterion "LaplacianLossBounded2()" --dataset "Spring('train')" --lr 3e-5 --batch_size 6 --epochs 8 --pretrained pretrained_models/CroCo_V2_ViTLarge_BaseDecoder.pth --start_from stereoflow_models/crocostereo.pth --output_dir xps/crocostereo/finetune_spring/
</details>

<details>
	<summary>Smaller models</summary>
	To train CroCo-Stereo with smaller CroCo pretrained models, simply replace the <code>--pretrained</code> argument. To download the smaller CroCo-Stereo model based on CroCo v2 pretraining with a ViT-Base encoder and a Small decoder, use <code>bash stereoflow/download_model.sh crocostereo_subtrain_vitb_smalldecoder.pth</code>, and for the model with a ViT-Base encoder and a Base decoder, use <code>bash stereoflow/download_model.sh crocostereo_subtrain_vitb_basedecoder.pth</code>.
</details>
	

### CroCo-Flow

##### Main model

The main training of CroCo-Flow was performed on the FlyingThings, FlyingChairs, MPI-Sintel and TartanAir datasets.
It was used for our submission to the MPI-Sintel benchmark.

```
# Download the model 
bash stereoflow/download_model.sh crocoflow.pth
# Evaluation 
python stereoflow/test.py --model stereoflow_models/crocoflow.pth --dataset "MPISintel('subval_cleanpass')+MPISintel('subval_finalpass')" --save metrics --tile_overlap 0.9
# Sintel submission
python stereoflow/test.py --model stereoflow_models/crocoflow.pth --dataset "MPISintel('test_allpass')" --save submission --tile_overlap 0.9
# Training command that was used, with checkpoint-best.pth
python -u stereoflow/train.py flow --criterion "LaplacianLossBounded()" --dataset "40*MPISintel('subtrain_cleanpass')+40*MPISintel('subtrain_finalpass')+4*FlyingThings('train_allpass')+4*FlyingChairs('train')+TartanAir('train')" --val_dataset "MPISintel('subval_cleanpass')+MPISintel('subval_finalpass')" --lr 2e-5 --batch_size 8 --epochs 240 --img_per_epoch 30000 --pretrained pretrained_models/CroCo_V2_ViTLarge_BaseDecoder.pth --output_dir xps/crocoflow/main/
```
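
To get a sense of the length of this training run, the arithmetic can be sketched as follows. This assumes that `--img_per_epoch` is the number of training images drawn per epoch and that training runs on a single GPU without gradient accumulation; the helper name `training_schedule` is hypothetical.

```python
def training_schedule(img_per_epoch, batch_size, epochs):
    """Return (optimizer steps per epoch, total optimizer steps)."""
    steps_per_epoch = img_per_epoch // batch_size
    return steps_per_epoch, steps_per_epoch * epochs


# Values taken from the CroCo-Flow training command above.
steps_per_epoch, total_steps = training_schedule(30000, 8, 240)
print(steps_per_epoch, total_steps)  # 3750 900000
```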

##### Other models 

<details>
	<summary>Main model finetuned on Kitti</summary>
	
	# Download the model
	bash stereoflow/download_model.sh crocoflow_finetune_kitti.pth
	# Kitti submission 
	python stereoflow/test.py --model stereoflow_models/crocoflow_finetune_kitti.pth --dataset "Kitti15('test')" --save submission --tile_overlap 0.99
	# Training command that was used, with checkpoint-last.pth
	python -u stereoflow/train.py flow --crop 352 1216 --criterion "LaplacianLossBounded()" --dataset "Kitti15('train')+Kitti12('train')" --lr 2e-5 --batch_size 1 --accum_iter 8 --epochs 150 --save_every 5 --pretrained pretrained_models/CroCo_V2_ViTLarge_BaseDecoder.pth --start_from stereoflow_models/crocoflow.pth --output_dir xps/crocoflow/finetune_kitti/
</details>

<details>
	<summary>Main model finetuned on Spring</summary>
	
	# Download the model
	bash stereoflow/download_model.sh crocoflow_finetune_spring.pth
	# Spring submission 
	python stereoflow/test.py --model stereoflow_models/crocoflow_finetune_spring.pth --dataset "Spring('test')" --save submission --tile_overlap 0.9
	# Training command that was used, with checkpoint-last.pth
	python -u stereoflow/train.py flow --criterion "LaplacianLossBounded()" --dataset "Spring('train')" --lr 2e-5 --batch_size 8 --epochs 12 --pretrained pretrained_models/CroCo_V2_ViTLarge_BaseDecoder.pth --start_from stereoflow_models/crocoflow.pth --output_dir xps/crocoflow/finetune_spring/
</details>

<details>
	<summary>Smaller models</summary>
	To train CroCo-Flow with smaller CroCo pretrained models, simply replace the <code>--pretrained</code> argument. To download the smaller CroCo-Flow model based on CroCo v2 pretraining with a ViT-Base encoder and a Small decoder, use <code>bash stereoflow/download_model.sh crocoflow_vitb_smalldecoder.pth</code>, and for the model with a ViT-Base encoder and a Base decoder, use <code>bash stereoflow/download_model.sh crocoflow_vitb_basedecoder.pth</code>.
</details>