---
title: Opdmulti Demo
emoji: 🌍
colorFrom: gray
colorTo: red
sdk: docker
app_port: 7860
pinned: false
license: mit
---
# OPDMulti: Openable Part Detection for Multiple Objects
[Xiaohao Sun*](https://sun-xh.github.io/), [Hanxiao Jiang*](https://jianghanxiao.github.io/), [Manolis Savva](https://msavva.github.io/), [Angel Xuan Chang](http://angelxuanchang.github.io/)
This repository deploys a demo of the [OPDMulti](https://github.com/3dlg-hcvc/OPDMulti) project.
Please refer to that repository for more information about the project and its implementation.
[arXiv](https://arxiv.org/abs/2303.14087)  [Website](https://3dlg-hcvc.github.io/OPDMulti/)
## Installation
### Requirements
For the Docker build, you only need Docker to build and run the container. For a local build, you will need:
* Python 3.10 (the demo does not work with 3.11, and earlier versions of Python may require downgrading some packages)
* git
* cmake
* libosmesa6-dev (for Open3D headless rendering)
A full list of other packages can be found in the Dockerfile, or in `Open3D/util/install_deps_ubuntu.sh`.
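Before starting a local build, it can help to confirm that the prerequisites listed above are present. The following is a minimal sketch using only the Python standard library; the helper names `missing_tools` and `python_version_ok` are hypothetical and not part of this repository, and the check treats anything other than Python 3.10 as unsupported, which is stricter than the note above about earlier versions.

```python
import shutil
import sys

# Executables named in the requirements list. libosmesa6-dev is a system
# library rather than an executable, so it is not checked here.
REQUIRED_TOOLS = ["git", "cmake"]


def missing_tools(tools):
    """Return the entries of `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def python_version_ok(version_info=sys.version_info):
    """Return True only for Python 3.10 (assumption: other versions are rejected)."""
    return tuple(version_info[:2]) == (3, 10)


if __name__ == "__main__":
    if not python_version_ok():
        found = f"{sys.version_info.major}.{sys.version_info.minor}"
        print(f"Python 3.10 expected, found {found}")
    for tool in missing_tools(REQUIRED_TOOLS):
        print(f"Missing required tool: {tool}")
```

Run the script with the interpreter you intend to use for the build, so that the version check reflects the right Python.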
The model file is currently hosted [on Hugging Face](https://huggingface.co/3dlg-hcvc/opdmulti-motion-state-rgb-model) and is
downloaded by the demo code when it runs.
### Docker Build (preferred)
To build the Docker image, run
```
docker build -f Dockerfile -t opdmulti-demo .
```
### Local Build
To set up the environment, run the following (a virtual environment is recommended):
```
# install base requirements
python3.10 -m pip install -r requirements.txt
# install detectron2 (must be done after some of the libraries in requirements.txt)
python3.10 -m pip install git+https://github.com/facebookresearch/detectron2.git@fc9c33b1f6e5d4c37bbb46dde19af41afc1ddb2a
# build library for model
cd mask2former/modeling/pixel_decoder/ops
python setup.py build install
# INSTALL OPEN3D
# --------------
# Option A: running locally only
pip install open3d==0.17.0
# Option B: running over ssh connection / headless environment
# in a separate folder
git clone https://github.com/isl-org/Open3D.git
cd Open3D/
mkdir build && cd build
cmake -DENABLE_HEADLESS_RENDERING=ON -DBUILD_GUI=OFF -DBUILD_WEBRTC=OFF -DUSE_SYSTEM_GLEW=OFF -DUSE_SYSTEM_GLFW=OFF ..
make -j$(nproc)
make install-pip-package
# to test custom build
cd ../examples/python/visualization/
python headless_rendering.py
```
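After the local build, a quick way to see whether the key packages ended up in the active environment is to look them up without importing them. This is a sketch under assumptions: the helper `missing_modules` is hypothetical, and the list covers only the packages this README names explicitly (the full dependency set lives in `requirements.txt`).

```python
import importlib.util

# Top-level import names of packages mentioned by name in this README.
# This is an illustrative subset, not the full dependency list.
EXPECTED_MODULES = ["detectron2", "open3d", "gradio"]


def missing_modules(names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(EXPECTED_MODULES)
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All expected modules were found.")
```

`find_spec` only locates a module and does not execute it, so this check does not confirm that a custom Open3D build renders correctly; the `headless_rendering.py` example remains the test for that.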
## Usage
### Docker (preferred)
To run the docker container, execute
```
docker run -d --network host -t opdmulti-demo
```
If you want to see the output of the container or interact with it,
* use `-it` to run in interactive mode, and remove the `-d` option
* add `bash` to the end of the command to open a shell instead of running the app directly
### Local
To start the application locally, run
```
gradio app.py
```
You can view the app on the specified port (usually 7860). To run over an SSH connection, set up port forwarding by
passing `-L 7860:localhost:7860` when you open the SSH connection. Note that this requires Open3D built with headless
rendering, as described above.
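If the page does not load, it can be useful to check whether anything is accepting connections on the port at all. The following is a minimal sketch, assuming the default port 7860 mentioned above; `port_is_open` is a hypothetical helper, not part of the demo. When using SSH port forwarding, run it on your local machine to test the forwarded port.

```python
import socket


def port_is_open(host="localhost", port=7860, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    if port_is_open():
        print("Something is listening on localhost:7860")
    else:
        print("Nothing reachable on localhost:7860")
```

A successful connection only shows that some process is listening on the port; it does not verify that the process is the demo app.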
## Citation
If you find this code useful, please consider citing:
```bibtex
@article{sun2023opdmulti,
title={OPDMulti: Openable Part Detection for Multiple Objects},
author={Sun, Xiaohao and Jiang, Hanxiao and Savva, Manolis and Chang, Angel Xuan},
journal={arXiv preprint arXiv:2303.14087},
year={2023}
}
@article{mao2022multiscan,
title={MultiScan: Scalable RGBD scanning for 3D environments with articulated objects},
author={Mao, Yongsen and Zhang, Yiming and Jiang, Hanxiao and Chang, Angel and Savva, Manolis},
journal={Advances in Neural Information Processing Systems},
volume={35},
pages={9058--9071},
year={2022}
}
@inproceedings{jiang2022opd,
title={OPD: Single-view 3D openable part detection},
author={Jiang, Hanxiao and Mao, Yongsen and Savva, Manolis and Chang, Angel X},
booktitle={Computer Vision--ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23--27, 2022, Proceedings, Part XXXIX},
pages={410--426},
year={2022},
organization={Springer}
}
@inproceedings{cheng2022masked,
title={Masked-attention mask transformer for universal image segmentation},
author={Cheng, Bowen and Misra, Ishan and Schwing, Alexander G and Kirillov, Alexander and Girdhar, Rohit},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={1290--1299},
year={2022}
}
```