# Human3R: Everyone Everywhere All at Once
Human3R is a unified, feed-forward framework for online 4D human-scene reconstruction in the world frame from casually captured monocular videos. It jointly recovers global multi-person SMPL-X bodies ("everyone"), dense 3D scene geometry ("everywhere"), and camera trajectories in a single forward pass ("all at once").
TL;DR: Inference with One model, One stage; Training in One day using One GPU
- Paper: Human3R: Everyone Everywhere All at Once
- Project Page: https://fanegg.github.io/Human3R/
- Code: https://github.com/fanegg/Human3R
## Sample Usage
To run the inference demo, you can use the following command (assuming you have followed the installation steps from the GitHub repository):
```bash
# input can be a folder or a video
# the following script will run inference with Human3R and visualize the output with viser on port 8080
CUDA_VISIBLE_DEVICES=0 python demo.py --model_path MODEL_PATH --size 512 \
    --seq_path SEQ_PATH --output_dir OUT_DIR --subsample 1 --use_ttt3r \
    --vis_threshold 2 --downsample_factor 1 --reset_interval 100

# Example:
CUDA_VISIBLE_DEVICES=0 python demo.py --model_path src/human3r.pth --size 512 \
    --seq_path examples/GoodMornin1.mp4 --output_dir tmp --subsample 1 --use_ttt3r \
    --vis_threshold 2 --downsample_factor 1 --reset_interval 100
```
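If the checkpoint is not already on disk, it can be pulled from the Hugging Face Hub. The sketch below is a hedged example: the repo id `fanegg/Human3R` and the filename `human3r.pth` are assumptions, so verify them against this model page before running.

```bash
# Hypothetical sketch: fetch the Human3R checkpoint from the Hugging Face Hub.
# Repo id and filename are assumptions; check this model page for the actual names.
huggingface-cli download fanegg/Human3R human3r.pth --local-dir src
```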
Results will be saved to `output_dir`.
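To process several clips in one go, a plain shell loop over the demo script is enough. This is a minimal sketch that reuses the flags from the example above and assumes a folder `examples/` of `.mp4` files:

```bash
# Minimal batch-processing sketch: run the demo on every .mp4 in examples/
# and write each clip's results to its own subfolder under tmp/.
# Assumes --output_dir accepts a nested path (untested assumption).
# Note: the demo starts a viser viewer; if it blocks between clips, stop the
# viewer per clip or check the repository for a headless option.
for video in examples/*.mp4; do
    name=$(basename "$video" .mp4)
    CUDA_VISIBLE_DEVICES=0 python demo.py --model_path src/human3r.pth --size 512 \
        --seq_path "$video" --output_dir "tmp/$name" --subsample 1 --use_ttt3r \
        --vis_threshold 2 --downsample_factor 1 --reset_interval 100
done
```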
## Citation
If you find our work useful, please cite:
```bibtex
@article{chen2025human3r,
  title={Human3R: Everyone Everywhere All at Once},
  author={Chen, Yue and Chen, Xingyu and Xue, Yuxuan and Chen, Anpei and Xiu, Yuliang and Pons-Moll, Gerard},
  journal={arXiv preprint arXiv:2510.06219},
  year={2025}
}
```