Ming-Flash-Omni: A Sparse, Unified Architecture for Multimodal Perception and Generation • Paper • 2510.24821 • Published Oct 28, 2025
Video Virtual Try-on with Conditional Diffusion Transformer Inpainter • Paper • 2506.21270 • Published Jun 26, 2025
UniAlignment: Semantic Alignment for Unified Image Generation, Understanding, Manipulation and Perception • Paper • 2509.23760 • Published Sep 28, 2025
ARGenSeg: Image Segmentation with Autoregressive Image Generation Model • Paper • 2510.20803 • Published Oct 23, 2025
LumiSculpt: A Consistency Lighting Control Network for Video Generation • Paper • 2410.22979 • Published Oct 30, 2024
Mimir: Improving Video Diffusion Models for Precise Text Understanding • Paper • 2412.03085 • Published Dec 4, 2024
Animate-X: Universal Character Image Animation with Enhanced Motion Representation • Paper • 2410.10306 • Published Oct 14, 2024
Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction • Paper • 2505.02471 • Published May 5, 2025
Ming-UniVision: Joint Image Understanding and Generation with a Unified Continuous Tokenizer • Paper • 2510.06590 • Published Oct 8, 2025