MobA: A Two-Level Agent System for Efficient Mobile Task Automation Paper • 2410.13757 • Published Oct 17 • 31 • 3
The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio Paper • 2410.12787 • Published Oct 16 • 30
Rejection Improves Reliability: Training LLMs to Refuse Unknown Questions Using RL from Knowledge Feedback Paper • 2403.18349 • Published Mar 27
META-GUI: Towards Multi-modal Conversational Agents on Mobile GUI Paper • 2205.11029 • Published May 23, 2022
Hierarchical Multimodal Pre-training for Visually Rich Webpage Understanding Paper • 2402.18262 • Published Feb 28
MULTI: Multimodal Understanding Leaderboard with Text and Images Paper • 2402.03173 • Published Feb 5 • 3