Human3R: Everyone Everywhere All at Once

Human3R is a unified, feed-forward framework for online 4D human-scene reconstruction in the world frame from casually captured monocular videos. It jointly recovers global multi-person SMPL-X bodies ("everyone"), dense 3D scene geometry ("everywhere"), and camera trajectories, all in a single forward pass ("all at once").

TL;DR: Inference with One model in One stage; Training in One day on One GPU
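Conceptually, the online, single-pass design means each incoming frame updates all three outputs at once. The sketch below illustrates that interface only; every name in it (the checkpoint loading, the model call signature, the output keys) is a hypothetical placeholder, not the repository's actual API:

# Illustrative sketch of the one-model, one-stage inference loop.
# All names below (the model call signature, the output keys) are
# hypothetical placeholders; see the GitHub repository for the real API.
import torch

model = torch.load("src/human3r.pth", map_location="cuda")  # assumption: checkpoint deserializes to a module
model.eval()

frames = [torch.rand(3, 512, 512) for _ in range(8)]  # stand-in for decoded video frames
state = None  # recurrent state carried across frames for online inference

for frame in frames:
    with torch.no_grad():
        outputs, state = model(frame.unsqueeze(0).cuda(), state)  # hypothetical signature
    bodies = outputs["smplx"]        # multi-person SMPL-X bodies ("everyone")
    points = outputs["pointmap"]     # dense 3D scene points ("everywhere")
    camera = outputs["camera_pose"]  # camera-to-world pose ("all at once" with the rest)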

Human3R Demo

Sample Usage

To run the inference demo, you can use the following command (assuming you have followed the installation steps from the GitHub repository):

# input can be a folder or a video
# the following script will run inference with Human3R and visualize the output with viser on port 8080
CUDA_VISIBLE_DEVICES=0 python demo.py --model_path MODEL_PATH --size 512 \
    --seq_path SEQ_PATH --output_dir OUT_DIR --subsample 1 --use_ttt3r \
    --vis_threshold 2 --downsample_factor 1 --reset_interval 100

# Example:
CUDA_VISIBLE_DEVICES=0 python demo.py --model_path src/human3r.pth --size 512 \
    --seq_path examples/GoodMornin1.mp4 --output_dir tmp --subsample 1 --use_ttt3r \
    --vis_threshold 2 --downsample_factor 1 --reset_interval 100

Results will be saved to the directory specified by --output_dir.
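Because --seq_path accepts either a video file or a folder of images, you can also pre-extract frames yourself, for example to trim or subsample a clip before reconstruction. Below is a minimal helper using OpenCV; the sorted, zero-padded frame naming is an assumption, so check the repository's data-loading code for the expected format:

# Extract frames from a video into a folder usable as --seq_path.
# Assumption: the demo reads ordinary image files from the folder in
# sorted order; verify the expected naming against the repository.
import os
import cv2

def video_to_frames(video_path: str, out_dir: str, every_nth: int = 1) -> None:
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    idx = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % every_nth == 0:
            cv2.imwrite(os.path.join(out_dir, f"{saved:06d}.jpg"), frame)
            saved += 1
        idx += 1
    cap.release()

# Keep every 2nd frame of the example clip, then pass the folder as --seq_path.
video_to_frames("examples/GoodMornin1.mp4", "examples/GoodMornin1_frames", every_nth=2)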

Citation

If you find our work useful, please cite:

@article{chen2025human3r,
    title={Human3R: Everyone Everywhere All at Once},
    author={Chen, Yue and Chen, Xingyu and Xue, Yuxuan and Chen, Anpei and Xiu, Yuliang and Pons-Moll, Gerard},
    journal={arXiv preprint arXiv:2510.06219},
    year={2025}
}