Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation Paper • 2412.04432 • Published 13 days ago • 13
Moto: Latent Motion Token as the Bridging Language for Robot Manipulation Paper • 2412.04445 • Published 13 days ago • 21
GenMAC: Compositional Text-to-Video Generation with Multi-Agent Collaboration Paper • 2412.04440 • Published 13 days ago • 18