Omnidata (Steerable Datasets)
A Scalable Pipeline for Making Multi-Task Mid-Level Vision Datasets from 3D Scans (ICCV 2021)
Project Website · Paper · >> [Github] << · Data · Pretrained Weights · Annotator
DPT-Hybrid trained for surface normal estimation or depth estimation
Vision Transformer (ViT) model trained using a DPT (Dense Prediction Transformer) decoder.
Intended uses & limitations
You can use this model for monocular surface normal estimation or depth estimation.
- Normal: estimates surface normals, i.e. a unit vector at each pixel that is perpendicular to the tangent plane of the surface.
- Depth: estimates normalized depth, i.e. relative depth rather than metric depth.
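The two output conventions above can be illustrated with a minimal NumPy sketch. It does not run the model: the arrays are synthetic stand-ins, the helper names `to_unit_normals` and `to_relative_depth` are hypothetical, and min-max scaling is only one assumed way to express "relative" depth (the card does not state which normalization the model uses).

```python
import numpy as np

def to_unit_normals(raw, eps=1e-8):
    """Rescale each per-pixel 3-vector in an (H, W, 3) array to unit length."""
    norm = np.linalg.norm(raw, axis=-1, keepdims=True)
    return raw / np.maximum(norm, eps)

def to_relative_depth(depth, eps=1e-8):
    """Min-max scale an (H, W) depth map to [0, 1]; absolute scale is discarded.

    Assumption: min-max scaling is used here purely for illustration.
    """
    d_min, d_max = depth.min(), depth.max()
    return (depth - d_min) / max(d_max - d_min, eps)

# Synthetic stand-ins for model outputs (hypothetical values, not real predictions).
rng = np.random.default_rng(0)
normals = to_unit_normals(rng.normal(size=(4, 4, 3)))
metric_depth = rng.uniform(0.5, 10.0, size=(4, 4))

# The same scene at two different metric scales maps to the same relative depth.
relative_a = to_relative_depth(metric_depth)
relative_b = to_relative_depth(metric_depth * 3.0)
```

Because the rescaled maps coincide, a relative depth map preserves the ordering and ratios of distances within an image but cannot be read as meters without extra information.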
Models
Models that estimate surface normals or depth from RGB images.
- Architecture: DPT
- Training resolutions: 384x384
- Training data: Omnidata dataset
- Input:
  - Dimensions: 384x384
  - Normalization: (normals: [0, 1], depth: [-1, 1])
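The input specification above could be applied as in the following sketch. It reads the normalization entry as the pixel value range expected by each model ([0, 1] for the normal model, [-1, 1] for the depth model), which is an assumption. The helpers `resize_nearest` and `preprocess` are hypothetical, nearest-neighbour resizing stands in for whatever interpolation the original pipeline uses, and the channels-first output layout is likewise assumed.

```python
import numpy as np

def resize_nearest(image, size=384):
    """Nearest-neighbour resize of an (H, W, 3) array to (size, size, 3)."""
    h, w = image.shape[:2]
    rows = np.arange(size) * h // size  # source row for each output row
    cols = np.arange(size) * w // size  # source column for each output column
    return image[rows[:, None], cols[None, :]]

def preprocess(image_uint8, task):
    """Return a (3, 384, 384) float32 array scaled for the given task.

    Assumption: 'normal' expects pixel values in [0, 1], 'depth' in [-1, 1].
    """
    if task not in ("normal", "depth"):
        raise ValueError(f"unknown task: {task!r}")
    x = resize_nearest(image_uint8).astype(np.float32) / 255.0  # now in [0, 1]
    if task == "depth":
        x = (x - 0.5) / 0.5  # shift and scale to [-1, 1]
    return np.transpose(x, (2, 0, 1))  # channels first (assumed layout)

# Synthetic 480x640 RGB image: red channel saturated, others zero.
image = np.zeros((480, 640, 3), dtype=np.uint8)
image[..., 0] = 255

normal_input = preprocess(image, "normal")
depth_input = preprocess(image, "depth")
```

Keeping the task as an explicit argument makes it harder to feed a depth model an input scaled for the normal model, or the reverse.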
BibTeX entry and citation info
@inproceedings{eftekhar2021omnidata,
title={Omnidata: A Scalable Pipeline for Making Multi-Task Mid-Level Vision Datasets From 3D Scans},
author={Eftekhar, Ainaz and Sax, Alexander and Malik, Jitendra and Zamir, Amir},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={10786--10796},
year={2021}
}
If you use our latest pretrained models, please also cite the following paper on 3D data augmentations:
@inproceedings{kar20223d,
title={3D Common Corruptions and Data Augmentation},
author={Kar, O{\u{g}}uzhan Fatih and Yeo, Teresa and Atanov, Andrei and Zamir, Amir},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={18963--18974},
year={2022}
}
...were you looking for the research paper or project website?