EVEv2: Improved Baselines for Encoder-Free Vision-Language Models Paper • 2502.06788 • Published 6 days ago • 11
Seeing Clearly, Answering Incorrectly: A Multimodal Robustness Benchmark for Evaluating MLLMs on Leading Questions Paper • 2406.10638 • Published Jun 15, 2024
MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval Paper • 2412.14475 • Published Dec 19, 2024 • 53