---
|
task_categories: |
|
- robotics |
|
datasets: |
|
- USC-GVL/Humanoid-X |
|
pipeline_tag: robotics |
|
--- |
|
|
|
<div align="center"> |
|
<h1> <img src="assets/icon.png" width="50" /> UH-1 </h1> |
|
</div> |
|
<h5 align="center"> |
|
<a href="https://usc-gvl.github.io/UH-1/">🏠 Homepage</a> | <a href="https://huggingface.co/datasets/USC-GVL/Humanoid-X">📊 Dataset</a> | <a href="https://huggingface.co/USC-GVL/UH-1">🤗 Models</a> | <a href="https://arxiv.org/abs/2412.14172">📑 Paper</a> | <a href="https://github.com/sihengz02/UH-1">💻 Code</a>
|
</h5> |
|
|
|
|
|
This repo contains the official model checkpoints for the paper "[Learning from Massive Human Videos for Universal Humanoid Pose Control](https://arxiv.org/abs/2412.14172)".
|
If you like our project, please give us a star ⭐ on GitHub for the latest updates.
|
|
|
Our release contains two checkpoints: a transformer model, `UH1_Transformer.pth`, and an action tokenizer, `UH1_Action_Tokenizer.pth`. For usage and inference with these models, please refer to the [code](https://github.com/sihengz02/UH-1).
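
For a quick sanity check that the downloaded files load correctly, here is a minimal sketch assuming PyTorch. The exact checkpoint contents and the model classes that consume them are defined in the linked repo; the file paths below are wherever you saved the downloads.

```python
import torch

# Load the released checkpoints onto CPU; the model/tokenizer classes that
# consume these state dicts live in the UH-1 GitHub repo.
transformer_ckpt = torch.load("UH1_Transformer.pth", map_location="cpu")
tokenizer_ckpt = torch.load("UH1_Action_Tokenizer.pth", map_location="cpu")

# Inspect the top-level keys of each checkpoint before wiring them into the
# inference scripts from the repo.
for name, ckpt in [("transformer", transformer_ckpt), ("tokenizer", tokenizer_ckpt)]:
    keys = list(ckpt.keys()) if isinstance(ckpt, dict) else [type(ckpt).__name__]
    print(name, keys[:5])
```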
|
|
|
![UH-1 teaser](assets/teaser.png)
|
|
|
# UH-1 Model Architecture |
|
|
|
![UH-1 model architecture](assets/model.png)
|
|
|
# UH-1 Real Robot Demo Results |
|
|
|
![UH-1 real robot demo results](assets/realbot.png)
|
|
|
# Citation |
|
|
|
If you find our work helpful, please cite us: |
|
|
|
```bibtex |
|
@article{mao2024learning,
  title={Learning from Massive Human Videos for Universal Humanoid Pose Control},
  author={Mao, Jiageng and Zhao, Siheng and Song, Siqi and Shi, Tianheng and Ye, Junjie and Zhang, Mingtong and Geng, Haoran and Malik, Jitendra and Guizilini, Vitor and Wang, Yue},
  journal={arXiv preprint arXiv:2412.14172},
  year={2024}
}
|
``` |