kaizuberbuehler's Collections
Agents
More Agents Is All You Need
Paper
•
2402.05120
•
Published
•
51
OS-Copilot: Towards Generalist Computer Agents with Self-Improvement
Paper
•
2402.07456
•
Published
•
41
Generative Agents: Interactive Simulacra of Human Behavior
Paper
•
2304.03442
•
Published
•
11
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models
Paper
•
2310.04406
•
Published
•
8
AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation
Paper
•
2312.13010
•
Published
•
4
GAIA: a benchmark for General AI Assistants
Paper
•
2311.12983
•
Published
•
184
LLM Agent Operating System
Paper
•
2403.16971
•
Published
•
65
Octopus v2: On-device language model for super agent
Paper
•
2404.01744
•
Published
•
57
AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation
Paper
•
2404.12753
•
Published
•
41
Scaling Instructable Agents Across Many Simulated Worlds
Paper
•
2404.10179
•
Published
•
26
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
Paper
•
2404.07972
•
Published
•
46
WILBUR: Adaptive In-Context Learning for Robust and Accurate Web Agents
Paper
•
2404.05902
•
Published
•
20
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs
Paper
•
2404.05719
•
Published
•
80
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent
Paper
•
2404.03648
•
Published
•
24
Voyager: An Open-Ended Embodied Agent with Large Language Models
Paper
•
2305.16291
•
Published
•
9
LASER: LLM Agent with State-Space Exploration for Web Navigation
Paper
•
2309.08172
•
Published
•
11
The Rise and Potential of Large Language Model Based Agents: A Survey
Paper
•
2309.07864
•
Published
•
7
Reflexion: Language Agents with Verbal Reinforcement Learning
Paper
•
2303.11366
•
Published
•
4
LEGENT: Open Platform for Embodied Agents
Paper
•
2404.18243
•
Published
•
21
Diffusion for World Modeling: Visual Details Matter in Atari
Paper
•
2405.12399
•
Published
•
27
OpenVLA: An Open-Source Vision-Language-Action Model
Paper
•
2406.09246
•
Published
•
36
SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks
Paper
•
2305.17390
•
Published
•
2
MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains
Paper
•
2407.18961
•
Published
•
39
AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents
Paper
•
2407.18901
•
Published
•
32
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
Paper
•
2407.21787
•
Published
•
3
OmniParser for Pure Vision Based GUI Agent
Paper
•
2408.00203
•
Published
•
23
WebArena: A Realistic Web Environment for Building Autonomous Agents
Paper
•
2307.13854
•
Published
•
23
Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning
Paper
•
2407.20798
•
Published
•
23
AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation
Paper
•
2408.00764
•
Published
•
1
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents
Paper
•
2408.07060
•
Published
•
40
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery
Paper
•
2408.06292
•
Published
•
116
SWE-bench-java: A GitHub Issue Resolving Benchmark for Java
Paper
•
2408.14354
•
Published
•
40
AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments
Paper
•
2405.07960
•
Published
•
1
On the limits of agency in agent-based models
Paper
•
2409.10568
•
Published
•
12
DSBench: How Far Are Data Science Agents to Becoming Data Science Experts?
Paper
•
2409.07703
•
Published
•
66
HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale
Paper
•
2409.16299
•
Published
•
10
The Dawn of GUI Agent: A Preliminary Case Study with Claude 3.5 Computer Use
Paper
•
2411.10323
•
Published
•
28
Generative World Explorer
Paper
•
2411.11844
•
Published
•
67