# **ZoeDepth: Combining relative and metric depth** (Official implementation) <!-- omit in toc -->
[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/isl-org/ZoeDepth)
[![Open in Spaces](https://huggingface.co/datasets/huggingface/badges/raw/main/open-in-hf-spaces-sm.svg)](https://huggingface.co/spaces/shariqfarooq/ZoeDepth)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT) ![PyTorch](https://img.shields.io/badge/PyTorch_v1.10.1-EE4C2C?&logo=pytorch&logoColor=white)
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/zoedepth-zero-shot-transfer-by-combining/monocular-depth-estimation-on-nyu-depth-v2)](https://paperswithcode.com/sota/monocular-depth-estimation-on-nyu-depth-v2?p=zoedepth-zero-shot-transfer-by-combining)
>#### [ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth](https://arxiv.org/abs/2302.12288)
> ##### [Shariq Farooq Bhat](https://shariqfarooq123.github.io), [Reiner Birkl](https://www.researchgate.net/profile/Reiner-Birkl), [Diana Wofk](https://dwofk.github.io/), [Peter Wonka](http://peterwonka.net/), [Matthias Müller](https://matthias.pw/)
[[Paper]](https://arxiv.org/abs/2302.12288)
![teaser](assets/zoedepth-teaser.png)
## **Table of Contents** <!-- omit in toc -->
- [**Usage**](#usage)
- [Using torch hub](#using-torch-hub)
- [Using local copy](#using-local-copy)
- [Using local torch hub](#using-local-torch-hub)
- [or load the models manually](#or-load-the-models-manually)
- [Using ZoeD models to predict depth](#using-zoed-models-to-predict-depth)
- [**Environment setup**](#environment-setup)
- [**Sanity checks** (Recommended)](#sanity-checks-recommended)
- [Model files](#model-files)
- [**Evaluation**](#evaluation)
  - [Evaluating official models](#evaluating-official-models)
- [Evaluating local checkpoint](#evaluating-local-checkpoint)
- [**Training**](#training)
- [**Gradio demo**](#gradio-demo)
- [**Citation**](#citation)
## **Usage**
It is recommended to fetch the latest [MiDaS repo](https://github.com/isl-org/MiDaS) via torch hub before proceeding:
```python
import torch
torch.hub.help("intel-isl/MiDaS", "DPT_BEiT_L_384", force_reload=True) # Triggers fresh download of MiDaS repo
```
### **ZoeDepth models** <!-- omit in toc -->
### Using torch hub
```python
import torch
repo = "isl-org/ZoeDepth"
# Zoe_N
model_zoe_n = torch.hub.load(repo, "ZoeD_N", pretrained=True)
# Zoe_K
model_zoe_k = torch.hub.load(repo, "ZoeD_K", pretrained=True)
# Zoe_NK
model_zoe_nk = torch.hub.load(repo, "ZoeD_NK", pretrained=True)
```
### Using local copy
Clone this repo:
```bash
git clone https://github.com/isl-org/ZoeDepth.git && cd ZoeDepth
```
#### Using local torch hub
You can point torch hub at the local source to load the ZoeDepth models, for example:
```python
import torch
# Zoe_N
model_zoe_n = torch.hub.load(".", "ZoeD_N", source="local", pretrained=True)
```
#### or load the models manually
```python
from zoedepth.models.builder import build_model
from zoedepth.utils.config import get_config
# ZoeD_N
conf = get_config("zoedepth", "infer")
model_zoe_n = build_model(conf)
# ZoeD_K
conf = get_config("zoedepth", "infer", config_version="kitti")
model_zoe_k = build_model(conf)
# ZoeD_NK
conf = get_config("zoedepth_nk", "infer")
model_zoe_nk = build_model(conf)
```
### Using ZoeD models to predict depth
```python
##### sample prediction
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
zoe = model_zoe_n.to(DEVICE)
# Local file
from PIL import Image
image = Image.open("/path/to/image.jpg").convert("RGB") # load
depth_numpy = zoe.infer_pil(image) # as numpy
depth_pil = zoe.infer_pil(image, output_type="pil") # as 16-bit PIL Image
depth_tensor = zoe.infer_pil(image, output_type="tensor") # as torch tensor
# Tensor
from zoedepth.utils.misc import pil_to_batched_tensor
X = pil_to_batched_tensor(image).to(DEVICE)
depth_tensor = zoe.infer(X)
# From URL
from zoedepth.utils.misc import get_image_from_url
# Example URL
URL = "https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcS4W8H_Nxk_rs3Vje_zj6mglPOH7bnPhQitBH8WkqjlqQVotdtDEG37BsnGofME3_u6lDk&usqp=CAU"
image = get_image_from_url(URL) # fetch
depth = zoe.infer_pil(image)
# Save raw
from zoedepth.utils.misc import save_raw_16bit
fpath = "/path/to/output.png"
save_raw_16bit(depth, fpath)
# Colorize output
from zoedepth.utils.misc import colorize
colored = colorize(depth)
# save colored output
fpath_colored = "/path/to/output_colored.png"
Image.fromarray(colored).save(fpath_colored)
```
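If you later need the saved raw depth back in metres, here is a minimal read-back sketch; the 256 scale factor is an assumption, so check `save_raw_16bit` in `zoedepth/utils/misc.py` before relying on it:
```python
import numpy as np
from PIL import Image

# Assumption: save_raw_16bit stores depth as uint16 = metres * 256.
# Verify the exact scale in zoedepth/utils/misc.py before relying on this.
depth_raw = np.asarray(Image.open("/path/to/output.png"))
depth_m = depth_raw.astype(np.float32) / 256.0
print(depth_m.min(), depth_m.max())
```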
## **Environment setup**
The project depends on:
- [pytorch](https://pytorch.org/) (Main framework)
- [timm](https://timm.fast.ai/) (Backbone helper for MiDaS)
- pillow, matplotlib, scipy, h5py, opencv (utilities)
Install the environment using `environment.yml`:
Using [mamba](https://github.com/mamba-org/mamba) (fastest):
```bash
mamba env create -n zoe --file environment.yml
mamba activate zoe
```
Using conda:
```bash
conda env create -n zoe --file environment.yml
conda activate zoe
```
## **Sanity checks** (Recommended)
Check if models can be loaded:
```bash
python sanity_hub.py
```
Try a demo prediction pipeline:
```bash
python sanity.py
```
This will save a file `pred.png` in the root folder, showing RGB and corresponding predicted depth side-by-side.
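If you want to create a similar side-by-side visualization for your own images, here is a minimal sketch built from the helpers shown in the Usage section (the image path and output filename are placeholders):
```python
import torch
from PIL import Image
from zoedepth.utils.misc import colorize

# Minimal sketch: build an RGB | predicted-depth side-by-side image.
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
zoe = torch.hub.load(".", "ZoeD_N", source="local", pretrained=True).to(DEVICE)

image = Image.open("/path/to/image.jpg").convert("RGB")        # placeholder path
depth = zoe.infer_pil(image)                                    # numpy depth map, same H x W as the input
depth_colored = Image.fromarray(colorize(depth)).convert("RGB") # colorized depth as a PIL image

side_by_side = Image.new("RGB", (image.width * 2, image.height))
side_by_side.paste(image, (0, 0))
side_by_side.paste(depth_colored, (image.width, 0))
side_by_side.save("pred_side_by_side.png")
```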
## Model files
Models are defined under the `models/` folder, with `models/<model_name>_<version>.py` containing model definitions and `models/config_<model_name>.json` containing configuration.
Single metric head models (Zoe_N and Zoe_K from the paper) share a common definition and are defined under `models/zoedepth`, whereas the multi-headed model (Zoe_NK) is defined under `models/zoedepth_nk`.
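If you want to see exactly which settings a variant resolves to, you can print the configuration returned by `get_config` (reusing the API shown in the Usage section):
```python
from zoedepth.utils.config import get_config

conf = get_config("zoedepth_nk", "infer")  # configuration for the multi-headed Zoe_NK variant in inference mode
print(conf)                                # inspect the merged settings before building the model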
## **Evaluation**
Download the required dataset and change the `DATASETS_CONFIG` dictionary in `utils/config.py` accordingly.
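For reference, each entry in `DATASETS_CONFIG` is a plain dictionary. The exact key names are defined in `utils/config.py`; the fields below are only an illustrative sketch with placeholder paths, so adapt them to what the file actually uses:
```python
# Illustrative sketch of an entry inside DATASETS_CONFIG in utils/config.py.
# Key names are assumptions based on common monocular-depth configs.
DATASETS_CONFIG["nyu"] = {
    "dataset": "nyu",
    "data_path": "/path/to/nyu/sync/",  # RGB images (placeholder path)
    "gt_path": "/path/to/nyu/sync/",    # ground-truth depth (placeholder path)
    "min_depth": 1e-3,
    "max_depth": 10,
}
```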
### Evaluating official models
On NYU-Depth-v2 for example:
For ZoeD_N:
```bash
python evaluate.py -m zoedepth -d nyu
```
For ZoeD_NK:
```bash
python evaluate.py -m zoedepth_nk -d nyu
```
### Evaluating local checkpoint
```bash
python evaluate.py -m zoedepth --pretrained_resource="local::/path/to/local/ckpt.pt" -d nyu
```
Pretrained resources are prefixed with `url::` to indicate that the weights should be fetched from a URL, or with `local::` to indicate that the path is a local checkpoint file. Refer to `models/model_io.py` for details.
The dataset name should match the corresponding key in `utils.config.DATASETS_CONFIG`.
## **Training**
Download the training datasets as per the instructions given [here](https://github.com/cleinc/bts/tree/master/pytorch#nyu-depvh-v2). Then, to train a single-head model on NYU-Depth-v2:
```bash
python train_mono.py -m zoedepth --pretrained_resource=""
```
For training the Zoe-NK model:
```bash
python train_mix.py -m zoedepth_nk --pretrained_resource=""
```
## **Gradio demo**
We provide a UI demo built using [gradio](https://gradio.app/). To get started, install UI requirements:
```bash
pip install -r ui/ui_requirements.txt
```
Then launch the gradio UI:
```bash
python -m ui.app
```
The UI is also hosted on HuggingFace 🤗 [here](https://huggingface.co/spaces/shariqfarooq/ZoeDepth).
## **Citation**
```
@misc{https://doi.org/10.48550/arxiv.2302.12288,
doi = {10.48550/ARXIV.2302.12288},
url = {https://arxiv.org/abs/2302.12288},
author = {Bhat, Shariq Farooq and Birkl, Reiner and Wofk, Diana and Wonka, Peter and Müller, Matthias},
keywords = {Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences, FOS: Computer and information sciences},
title = {ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth},
publisher = {arXiv},
year = {2023},
copyright = {arXiv.org perpetual, non-exclusive license}
}
```