Xihui Liu's picture

Xihui Liu

XihuiLiu

·

https://xh-liu.github.io/

AI & ML interests

None yet

Organizations

upvoted 2 papers 3 months ago

Ground Slow, Move Fast: A Dual-System Foundation Model for Generalizable Vision-and-Language Navigation

Paper • 2512.08186 • Published Dec 9, 2025 • 23

Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance

Paper • 2512.08765 • Published Dec 9, 2025 • 133

upvoted 2 papers 5 months ago

Speculative Jacobi-Denoising Decoding for Accelerating Autoregressive Text-to-image Generation

Paper • 2510.08994 • Published Oct 10, 2025 • 4

CodePlot-CoT: Mathematical Visual Reasoning by Thinking with Code-Driven Images

Paper • 2510.11718 • Published Oct 13, 2025 • 14

upvoted a paper 8 months ago

OmniPart: Part-Aware 3D Generation with Semantic Decoupling and Structural Cohesion

Paper • 2507.06165 • Published Jul 8, 2025 • 60

upvoted a paper 10 months ago

GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning

Paper • 2505.17022 • Published May 22, 2025 • 27

upvoted 5 papers 11 months ago

GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation

Paper • 2504.08736 • Published Apr 11, 2025 • 46

HoloPart: Generative 3D Part Amodal Segmentation

Paper • 2504.07943 • Published Apr 10, 2025 • 28

Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1

Paper • 2503.24376 • Published Mar 31, 2025 • 38

Position: Interactive Generative Video as Next-Generation Game Engine

Paper • 2503.17359 • Published Mar 21, 2025 • 61

AccVideo: Accelerating Video Diffusion Model with Synthetic Dataset

Paper • 2503.19462 • Published Mar 25, 2025 • 10

upvoted 2 papers 12 months ago

Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation

Paper • 2503.16430 • Published Mar 20, 2025 • 34

GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing

Paper • 2503.10639 • Published Mar 13, 2025 • 53

upvoted 4 papers about 1 year ago

GameFactory: Creating New Games with Generative Interactive Videos

Paper • 2501.08325 • Published Jan 14, 2025 • 67

Parallelized Autoregressive Visual Generation

Paper • 2412.15119 • Published Dec 19, 2024 • 53

GenMAC: Compositional Text-to-Video Generation with Multi-Agent Collaboration

Paper • 2412.04440 • Published Dec 5, 2024 • 22

Moto: Latent Motion Token as the Bridging Language for Robot Manipulation

Paper • 2412.04445 • Published Dec 5, 2024 • 22

upvoted 3 papers over 1 year ago

WorldSimBench: Towards Video Generation Models as World Simulators

Paper • 2410.18072 • Published Oct 23, 2024 • 19

PUMA: Empowering Unified MLLM with Multi-granular Visual Generation

Paper • 2410.13861 • Published Oct 17, 2024 • 56

LVD-2M: A Long-take Video Dataset with Temporally Dense Captions

Paper • 2410.10816 • Published Oct 14, 2024 • 21