RollingDepth: Video Depth without Video Models
This repository is the official implementation of the paper "Video Depth without Video Models".
Bingxin Ke¹, Dominik Narnhofer¹, Shengyu Huang¹, Lei Ke², Torben Peters¹, Katerina Fragkiadaki², Anton Obukhov¹, Konrad Schindler¹
¹ETH Zurich, ²Carnegie Mellon University
News
2024-11-28: Inference code is released.
Setup
The inference code was tested on: Debian 12, Python 3.12.7 (venv), CUDA 12.4, GeForce RTX 3090
Repository
git clone https://github.com/prs-eth/RollingDepth.git
cd RollingDepth
Python environment
Create a Python environment:
# with venv
python -m venv venv/rollingdepth
source venv/rollingdepth/bin/activate
# or with conda
conda create --name rollingdepth python=3.12
conda activate rollingdepth
Dependencies
Install dependencies:
pip install -r requirements.txt
# Install modified diffusers with cross-frame self-attention
bash script/install_diffusers_dev.sh
We use pyav for video I/O, which relies on ffmpeg.
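Before running inference, it can help to confirm that the video I/O stack is in place. The following is a minimal sketch, not part of the repository: the helper name `check_video_io` is hypothetical, and it only checks that the `av` module (the import name of PyAV) can be found and whether an `ffmpeg` binary is on `PATH`. Whether a system-wide `ffmpeg` binary is actually required depends on how PyAV was installed, since binary wheels typically ship their own FFmpeg libraries.

```python
import importlib.util
import shutil


def check_video_io() -> dict[str, bool]:
    """Report whether PyAV is importable and whether an ffmpeg binary is on PATH.

    Hypothetical helper for illustration; it is not part of RollingDepth.
    """
    return {
        # PyAV is installed as "av"; find_spec returns None if it is missing.
        "pyav_importable": importlib.util.find_spec("av") is not None,
        # A missing binary is not necessarily an error (see the note above).
        "ffmpeg_on_path": shutil.which("ffmpeg") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_video_io().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```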
Test on your videos
All scripts are designed to run from the project root directory.
Prepare input videos
Use sample videos:
bash script/download_sample_data.sh
Or place your videos in a directory, for example under data/samples.
Run with presets
python run_video.py \
-i data/samples \
-o output/samples_fast \
-p fast \
--save-npy true \
--verbose
- -p or --preset: preset options
  - fast for fast inference, with dilations [1, 25] (flexible), fp16, without refinement, at max. resolution 768.
  - fast1024 for fast inference at resolution 1024.
  - full for better details, with dilations [1, 10, 25] (flexible), fp16, with 10 refinement steps, at max. resolution 1024.
  - paper for reproducing paper numbers, with (fixed) dilations [1, 10, 25], fp32, with 10 refinement steps, at max. resolution 768.
- -i or --input-video: path to input data, can be a single video file, a text file with video paths, or a directory of videos.
- -o or --output-dir: output directory.
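Since the input can be a text file with video paths, such a file can be generated from a directory tree. The sketch below is an illustration under stated assumptions: the helper `write_video_list` and the extension set are hypothetical, and the one-path-per-line layout is assumed rather than confirmed by the inference script.

```python
from pathlib import Path

# Assumed set of video extensions; adjust to your data.
VIDEO_EXTENSIONS = frozenset({".mp4", ".mov", ".avi", ".mkv"})


def write_video_list(video_dir, list_path, extensions=VIDEO_EXTENSIONS) -> list[Path]:
    """Collect video files under video_dir and write them to list_path.

    Assumption: the list file holds one video path per line.
    """
    videos = sorted(
        p
        for p in Path(video_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in extensions
    )
    Path(list_path).write_text("".join(f"{p}\n" for p in videos), encoding="utf-8")
    return videos


if __name__ == "__main__":
    found = write_video_list("data/samples", "video_list.txt")
    print(f"Wrote {len(found)} path(s) to video_list.txt")
```

The resulting file would then be passed with -i video_list.txt.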
Passing the other arguments below may override the preset settings:
- Coming soon
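To run several presets or inputs in a row, the preset invocation can be assembled programmatically. This is a minimal sketch, assuming only the flags documented above (-i, -o, -p, --save-npy true, --verbose); the helper `build_command` is hypothetical and not part of the repository.

```python
import shlex
import sys

# Preset names as listed in this README.
PRESETS = ("fast", "fast1024", "full", "paper")


def build_command(input_path, output_dir, preset="fast", save_npy=True, verbose=False) -> list[str]:
    """Build the argument list for run_video.py using only documented flags."""
    if preset not in PRESETS:
        raise ValueError(f"Unknown preset {preset!r}; expected one of {PRESETS}")
    cmd = [
        sys.executable, "run_video.py",
        "-i", str(input_path),
        "-o", str(output_dir),
        "-p", preset,
    ]
    if save_npy:
        # Only the "true" form appears in the README, so nothing is emitted otherwise.
        cmd += ["--save-npy", "true"]
    if verbose:
        cmd.append("--verbose")
    return cmd


if __name__ == "__main__":
    # Print the commands instead of running them.
    for preset in ("fast", "full"):
        print(shlex.join(build_command("data/samples", f"output/samples_{preset}", preset)))
```

Each list can be handed to `subprocess.run(cmd, check=True)` from the project root directory.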
Checkpoint cache
By default, the checkpoint is stored in the Hugging Face cache. The HF_HOME environment variable defines its location and can be overridden, e.g.:
export HF_HOME=$(pwd)/cache
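To see where the checkpoint would end up for a given environment, the lookup order can be sketched in a few lines. This mirrors the defaults documented for `huggingface_hub` as we understand them (`HF_HOME` falls back to `$XDG_CACHE_HOME/huggingface` or `~/.cache/huggingface`, and the hub cache to `$HF_HOME/hub` unless `HF_HUB_CACHE` is set); the function `resolve_hf_cache` is hypothetical and does not call the library, which remains the authority on the actual behavior.

```python
import os
from pathlib import Path


def resolve_hf_cache(env=None) -> tuple[Path, Path]:
    """Return (hf_home, hub_cache) for a mapping of environment variables.

    Illustrative re-implementation of the assumed defaults; it does not
    import huggingface_hub.
    """
    env = os.environ if env is None else env
    # Base cache directory: XDG_CACHE_HOME if set, else ~/.cache.
    base = env.get("XDG_CACHE_HOME") or str(Path.home() / ".cache")
    hf_home = Path(env.get("HF_HOME") or Path(base) / "huggingface").expanduser()
    # Model files live in the hub cache, by default under HF_HOME.
    hub_cache = Path(env.get("HF_HUB_CACHE") or hf_home / "hub").expanduser()
    return hf_home, hub_cache


if __name__ == "__main__":
    home, hub = resolve_hf_cache()
    print(f"HF_HOME:   {home}")
    print(f"hub cache: {hub}")
```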
Alternatively, use the following script to download the checkpoint weights locally, then specify the checkpoint path with -c checkpoint/rollingdepth-v1-0:
bash script/download_weight.sh
Evaluation on test datasets
Coming soon
Acknowledgments
We thank Yue Pan, Shuchang Liu, Nando Metzger, and Nikolai Kalischek for fruitful discussions.
We are grateful to redmond.ai (robin@redmond.ai) for providing GPU resources.
License
The code of this work is licensed under the Apache License, Version 2.0 (as defined in the LICENSE).
The model is licensed under the RAIL++-M License (as defined in the LICENSE-MODEL).
By downloading and using the code and model, you agree to the terms in LICENSE and LICENSE-MODEL, respectively.