AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information? Paper • 2412.02611 • Published Dec 3, 2024
Evaluating Multiview Object Consistency in Humans and Image Models Paper • 2409.05862 • Published Sep 9, 2024