AAAR-1.0: Assessing AI's Potential to Assist Research • Paper • 2410.22394 • Published 23 days ago • 13
How Easily do Irrelevant Inputs Skew the Responses of Large Language Models? • Paper • 2404.03302 • Published Apr 4 • 2
TravelPlanner: A Benchmark for Real-World Planning with Language Agents • Paper • 2402.01622 • Published Feb 2 • 33
MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions • Paper • 2403.19651 • Published Mar 28 • 23
Adaptive Chameleon or Stubborn Sloth: Unraveling the Behavior of Large Language Models in Knowledge Clashes • Paper • 2305.13300 • Published May 22, 2023 • 2