Video Understanding with Large Language Models: A Survey Paper • 2312.17432 • Published Dec 29, 2023
VidComposition: Can MLLMs Analyze Compositions in Compiled Videos? Paper • 2411.10979 • Published Nov 17, 2024
VERIFY: A Benchmark of Visual Explanation and Reasoning for Investigating Multimodal Reasoning Fidelity Paper • 2503.11557 • Published Mar 14, 2025