Cogito, Ergo Ludo: An Agent that Learns to Play by Reasoning and Planning
Abstract
CEL, a novel agent architecture using a Large Language Model, learns to master complex environments through explicit reasoning and planning, achieving success in diverse grid-world tasks with sparse rewards.
The pursuit of artificial agents that can learn to master complex environments has led to remarkable successes, yet prevailing deep reinforcement learning methods often rely on immense experience, encoding their knowledge opaquely within neural network weights. We propose a different paradigm, one in which an agent learns to play by reasoning and planning. We introduce Cogito, ergo ludo (CEL), a novel agent architecture that leverages a Large Language Model (LLM) to build an explicit, language-based understanding of its environment's mechanics and its own strategy. Starting from a tabula rasa state with no prior knowledge (except action set), CEL operates on a cycle of interaction and reflection. After each episode, the agent analyzes its complete trajectory to perform two concurrent learning processes: Rule Induction, where it refines its explicit model of the environment's dynamics, and Strategy and Playbook Summarization, where it distills experiences into an actionable strategic playbook. We evaluate CEL on diverse grid-world tasks (i.e., Minesweeper, Frozen Lake, and Sokoban), and show that the CEL agent successfully learns to master these games by autonomously discovering their rules and developing effective policies from sparse rewards. Ablation studies confirm that the iterative process is critical for sustained learning. Our work demonstrates a path toward more general and interpretable agents that not only act effectively but also build a transparent and improving model of their world through explicit reasoning on raw experience.
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Think in Games: Learning to Reason in Games via Reinforcement Learning with Large Language Models (2025)
- Communicating Plans, Not Percepts: Scalable Multi-Agent Coordination with Embodied World Models (2025)
- AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning (2025)
- DeepPHY: Benchmarking Agentic VLMs on Physical Reasoning (2025)
- Learning When to Plan: Efficiently Allocating Test-Time Compute for LLM Agents (2025)
- Beyond Policy Optimization: A Data Curation Flywheel for Sparse-Reward Long-Horizon Planning (2025)
- Reinforced Language Models for Sequential Decision Making (2025)
Please give a thumbs up to this comment if you found it helpful!
If you want recommendations for any Paper on Hugging Face checkout this Space
You can directly ask Librarian Bot for paper recommendations by tagging it in a comment:
@librarian-bot
recommend
Models citing this paper 0
No model linking this paper
Datasets citing this paper 0
No dataset linking this paper
Spaces citing this paper 0
No Space linking this paper
Collections including this paper 0
No Collection including this paper