X-Prompt: Towards Universal In-Context Image Generation in Auto-Regressive Vision Language Foundation Models Paper • 2412.01824 • Published 10 days ago • 61
INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model Paper • 2407.16198 • Published Jul 23 • 13
Make-it-Real: Unleashing Large Multimodal Model's Ability for Painting 3D Objects with Realistic Materials Paper • 2404.16829 • Published Apr 25 • 5