Sculpt4D

Pretrained model for Sculpt4D: Generating 4D Shapes via Sparse-Attention Diffusion Transformers.

Given an image sequence of an animated object, Sculpt4D generates a temporally coherent sequence of 3D meshes. It integrates efficient temporal modeling into a pretrained 3D Diffusion Transformer (Hunyuan3D-2.1) via a Block Sparse Attention mechanism.

Checkpoint

This repository hosts the 8-frame block-mask model (20k steps) as a sharded bf16 checkpoint (~8 GB), under the blockmask_bf16/ subfolder:

blockmask_bf16/
├── pytorch_model-00001-of-00002.bin
├── pytorch_model-00002-of-00002.bin
└── pytorch_model.bin.index.json
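
Because the checkpoint is split across shards, it can help to confirm that every shard named in the index file is present before running inference. The following is a minimal sketch, not part of the Sculpt4D code. It assumes the index follows the common sharded-checkpoint layout, with a "weight_map" dictionary mapping each parameter name to a shard filename. The helper name missing_shards is hypothetical.

```python
import json
from pathlib import Path


def missing_shards(ckpt_dir):
    """Return shard filenames listed in the index but absent from ckpt_dir."""
    ckpt_dir = Path(ckpt_dir)
    index_path = ckpt_dir / "pytorch_model.bin.index.json"
    index = json.loads(index_path.read_text())
    # Assumption: "weight_map" maps parameter name -> shard filename.
    shard_names = sorted(set(index["weight_map"].values()))
    return [name for name in shard_names if not (ckpt_dir / name).is_file()]
```

Calling missing_shards("checkpoints/sculpt4d/blockmask_bf16") after the download should return an empty list if both shards arrived.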

Usage

Download the checkpoint:

huggingface-cli download TencentARC/Sculpt4D --include "blockmask_bf16/*" --local-dir checkpoints/sculpt4d
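
The same download can be scripted from Python. This is a sketch that assumes the huggingface_hub package is installed. It uses snapshot_download with an allow_patterns filter to mirror the --include flag above. The wrapper name download_checkpoint is hypothetical.

```python
REPO_ID = "TencentARC/Sculpt4D"
PATTERN = "blockmask_bf16/*"  # same filter as --include in the CLI command


def download_checkpoint(local_dir="checkpoints/sculpt4d"):
    """Fetch only the blockmask_bf16/ files and return the local directory path."""
    # Imported lazily so the module can be loaded without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=[PATTERN],
        local_dir=local_dir,
    )
```

Calling download_checkpoint() places the shards under checkpoints/sculpt4d/blockmask_bf16, matching the --ckpt_path used for inference.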

Run inference (see the code repository for full setup):

python inference_4d.py \
    --config configs/4d_config_8.yaml \
    --ckpt_path checkpoints/sculpt4d/blockmask_bf16 \
    --input_dir demos/door \
    --output_dir ./inference_output/door
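
To process several sequences, the command above can be wrapped in a small driver. This is a sketch under stated assumptions: it only uses the four flags shown above, it must be run from the root of the code repository, and it assumes each input sequence lives in its own subfolder of demos/ (only demos/door is confirmed here). The names build_command and run_all are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def build_command(input_dir,
                  output_root="./inference_output",
                  config="configs/4d_config_8.yaml",
                  ckpt_path="checkpoints/sculpt4d/blockmask_bf16"):
    """Build the inference_4d.py argument list for one input sequence."""
    input_dir = Path(input_dir)
    # Output goes to <output_root>/<sequence name>, as in the door example.
    output_dir = Path(output_root) / input_dir.name
    return [
        sys.executable, "inference_4d.py",
        "--config", config,
        "--ckpt_path", ckpt_path,
        "--input_dir", str(input_dir),
        "--output_dir", str(output_dir),
    ]


def run_all(demo_root="demos"):
    """Run inference once per subfolder of demo_root (assumed layout)."""
    for seq_dir in sorted(p for p in Path(demo_root).iterdir() if p.is_dir()):
        # check=True stops the loop if a run exits with a non-zero status.
        subprocess.run(build_command(seq_dir), check=True)
```

build_command("demos/door") reproduces the single-sequence command above, and run_all() repeats it for each subfolder found.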

Citation

@inproceedings{sculpt4d2026,
  title={Sculpt4D: Generating 4D Shapes via Sparse-Attention Diffusion Transformers},
  author={Yin, Minghao and Hu, Wenbo and Xu, Jiale and Shan, Ying and Han, Kai},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2026}
}