TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks Paper • 2412.14161 • Published Dec 18, 2024 • 51
WebArena: A Realistic Web Environment for Building Autonomous Agents Paper • 2307.13854 • Published Jul 25, 2023 • 25