---
license: cc-by-nc-4.0
---

# Omnidata (Steerable Datasets)

**A Scalable Pipeline for Making Multi-Task Mid-Level Vision Datasets from 3D Scans (ICCV 2021)**

[`Project Website`](https://omnidata.vision) · [`Paper`](https://arxiv.org/abs/2110.04994) · [**`>> [Github] <<`**](https://github.com/EPFL-VILAB/omnidata#readme) · [`Data`](https://github.com/EPFL-VILAB/omnidata/tree/main/omnidata_tools/dataset#readme) · [`Pretrained Weights`](https://github.com/EPFL-VILAB/omnidata-tools/tree/main/omnidata_tools/torch#readme) · [`Annotator`](https://github.com/EPFL-VILAB/omnidata-tools/tree/main/omnidata_annotator#readme)

# DPT-Hybrid trained for surface normal estimation or depth estimation

A Vision Transformer (ViT) model trained with a DPT (Dense Prediction Transformer) decoder.

## Intended uses & limitations

You can use this model for monocular surface normal estimation or depth estimation.

* Normal: estimates surface normals, unit vectors perpendicular to the tangent plane of the surface at each pixel.
* Depth: estimates normalized depth, a relative depth rather than metric depth.

## Models

Models to estimate surface normals or depth from RGB images.

* Architecture: [DPT](https://github.com/isl-org/DPT)
* Training resolution: 384x384
* Training data: [Omnidata dataset](https://github.com/EPFL-VILAB/omnidata/tree/main)
* Input:
  * Dimensions: 384x384
  * Normalization: (normals: [0, 1], depth: [-1, 1])

### BibTeX entry and citation info

```bibtex
@inproceedings{eftekhar2021omnidata,
  title={Omnidata: A Scalable Pipeline for Making Multi-Task Mid-Level Vision Datasets From 3D Scans},
  author={Eftekhar, Ainaz and Sax, Alexander and Malik, Jitendra and Zamir, Amir},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={10786--10796},
  year={2021}
}
```

If you use our latest pretrained models, please also cite the following paper for 3D data augmentations:

```bibtex
@inproceedings{kar20223d,
  title={3D Common Corruptions and Data Augmentation},
  author={Kar, O{\u{g}}uzhan Fatih and Yeo, Teresa and Atanov, Andrei and Zamir, Amir},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={18963--18974},
  year={2022}
}
```

> ...were you looking for the [research paper](//omnidata.vision/#paper) or [project website](//omnidata.vision)?
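
As a rough illustration of the input normalization listed under Models, the sketch below maps 8-bit RGB values into the stated ranges. It assumes the ranges describe the pixel values fed to each model ([0, 1] for the normal model, [-1, 1] for the depth model). `scale_pixel` and `scale_image` are hypothetical helper names, not part of the Omnidata code, and resizing to 384x384 is left out.

```python
def scale_pixel(value, task):
    """Map one 8-bit channel value (0..255) to the input range listed for `task`.

    Assumption: "normal" inputs live in [0, 1] and "depth" inputs in [-1, 1],
    as read from the Normalization entry of the model card.
    """
    if not 0 <= value <= 255:
        raise ValueError("expected an 8-bit value in 0..255")
    unit = value / 255.0  # rescale to [0, 1]
    if task == "normal":
        return unit
    if task == "depth":
        return (unit - 0.5) / 0.5  # shift and scale [0, 1] to [-1, 1]
    raise ValueError(f"unknown task: {task!r}")


def scale_image(rows, task):
    """Apply scale_pixel to an H x W x 3 nested list of 8-bit RGB values."""
    return [[[scale_pixel(c, task) for c in pixel] for pixel in row] for row in rows]
```

For example, `scale_pixel(255, "depth")` gives `1.0` and `scale_pixel(0, "depth")` gives `-1.0`, while the same values for `"normal"` give `1.0` and `0.0`.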