Human3R: Everyone Everywhere All at Once

Human3R is a unified, feed-forward framework for online 4D human-scene reconstruction in the world frame from casually captured monocular videos. It jointly recovers global multi-person SMPL-X bodies ("everyone"), dense 3D scene geometry ("everywhere"), and camera trajectories, all in a single forward pass ("all at once").

TL;DR: Inference with One model in One stage; Training in One day on One GPU
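Conceptually, the online, single-pass design means each incoming frame updates all three outputs at once. The sketch below illustrates that interface only; every name in it (the checkpoint loading, the model call signature, the output keys) is a hypothetical placeholder, not the repository's actual API:

# Illustrative sketch of the one-model, one-stage inference loop.
# All names below (the model call signature, the output keys) are
# hypothetical placeholders; see the GitHub repository for the real API.
import torch

model = torch.load("src/human3r.pth", map_location="cuda")  # assumption: checkpoint deserializes to a module
model.eval()

frames = [torch.rand(3, 512, 512) for _ in range(8)]  # stand-in for decoded video frames
state = None  # recurrent state carried across frames for online inference

for frame in frames:
    with torch.no_grad():
        outputs, state = model(frame.unsqueeze(0).cuda(), state)  # hypothetical signature
    bodies = outputs["smplx"]        # multi-person SMPL-X bodies ("everyone")
    points = outputs["pointmap"]     # dense 3D scene points ("everywhere")
    camera = outputs["camera_pose"]  # camera-to-world pose ("all at once" with the rest)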

Human3R Demo

Sample Usage

To run the inference demo, you can use the following command (assuming you have followed the installation steps from the GitHub repository):

# input can be a folder or a video
# the following script will run inference with Human3R and visualize the output with viser on port 8080
CUDA_VISIBLE_DEVICES=0 python demo.py --model_path MODEL_PATH --size 512 \
    --seq_path SEQ_PATH --output_dir OUT_DIR --subsample 1 --use_ttt3r \
    --vis_threshold 2 --downsample_factor 1 --reset_interval 100

# Example:
CUDA_VISIBLE_DEVICES=0 python demo.py --model_path src/human3r.pth --size 512 \
    --seq_path examples/GoodMornin1.mp4 --output_dir tmp --subsample 1 --use_ttt3r \
    --vis_threshold 2 --downsample_factor 1 --reset_interval 100

Results will be saved to the directory specified by --output_dir.
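Because --seq_path accepts either a video file or a folder of images, you can also pre-extract frames yourself, for example to trim or subsample a clip before reconstruction. Below is a minimal helper using OpenCV; the sorted, zero-padded frame naming is an assumption, so check the repository's data-loading code for the expected format:

# Extract frames from a video into a folder usable as --seq_path.
# Assumption: the demo reads ordinary image files from the folder in
# sorted order; verify the expected naming against the repository.
import os
import cv2

def video_to_frames(video_path: str, out_dir: str, every_nth: int = 1) -> None:
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    idx = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % every_nth == 0:
            cv2.imwrite(os.path.join(out_dir, f"{saved:06d}.jpg"), frame)
            saved += 1
        idx += 1
    cap.release()

# Keep every 2nd frame of the example clip, then pass the folder as --seq_path.
video_to_frames("examples/GoodMornin1.mp4", "examples/GoodMornin1_frames", every_nth=2)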

Citation

If you find our work useful, please cite:

@article{chen2025human3r,
    title={Human3R: Everyone Everywhere All at Once},
    author={Chen, Yue and Chen, Xingyu and Xue, Yuxuan and Chen, Anpei and Xiu, Yuliang and Pons-Moll, Gerard},
    journal={arXiv preprint arXiv:2510.06219},
    year={2025}
}