Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations
Paper • 2506.18898
[NeurIPS 2025] Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations
Unified MLLM with Text-Aligned Representations