Can MLLMs Understand the Deep Implication Behind Chinese Images? Paper • 2410.13854 • Published Oct 17, 2024 • 10
WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models Paper • 2401.13919 • Published Jan 25, 2024 • 27