IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models Paper • 2501.13920 • Published 18 days ago • 13
Unleashing the Potentials of Likelihood Composition for Multi-modal Language Models Paper • 2410.00363 • Published Oct 1, 2024 • 1
Causal-CoG: A Causal-Effect Look at Context Generation for Boosting Multi-modal Language Models Paper • 2312.06685 • Published Dec 9, 2023 • 1
Boosting Open-Domain Continual Learning via Leveraging Intra-domain Category-aware Prototype Paper • 2408.09984 • Published Aug 19, 2024 • 1
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions Paper • 2409.15278 • Published Sep 23, 2024 • 24
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining Paper • 2408.02657 • Published Aug 5, 2024 • 33
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models Paper • 2402.05935 • Published Feb 8, 2024 • 16