- End-to-End Goal-Driven Web Navigation
  Paper • 1602.02261 • Published
- Learning Language Games through Interaction
  Paper • 1606.02447 • Published
- Naturalizing a Programming Language via Interactive Learning
  Paper • 1704.06956 • Published
- Reinforcement Learning on Web Interfaces Using Workflow-Guided Exploration
  Paper • 1802.08802 • Published • 1
Collections
Collections including paper arxiv:2508.10833
- Llama-3.1-FoundationAI-SecurityLLM-8B-Instruct Technical Report
  Paper • 2508.01059 • Published • 33
- Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens
  Paper • 2508.01191 • Published • 234
- On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification
  Paper • 2508.05629 • Published • 169
- GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
  Paper • 2508.06471 • Published • 171

- MobileGUI-RL: Advancing Mobile GUI Agent through Reinforcement Learning in Online Environment
  Paper • 2507.05720 • Published • 2
- GUI-G^2: Gaussian Reward Modeling for GUI Grounding
  Paper • 2507.15846 • Published • 131
- VeriGUI: Verifiable Long-Chain GUI Dataset
  Paper • 2508.04026 • Published • 157
- UI-Venus Technical Report: Building High-performance UI Agents with RFT
  Paper • 2508.10833 • Published • 41

- Gemini Robotics: Bringing AI into the Physical World
  Paper • 2503.20020 • Published • 28
- Magma: A Foundation Model for Multimodal AI Agents
  Paper • 2502.13130 • Published • 58
- LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
  Paper • 2311.05437 • Published • 51
- OS-ATLAS: A Foundation Action Model for Generalist GUI Agents
  Paper • 2410.23218 • Published • 51

- Pruning the Unsurprising: Efficient Code Reasoning via First-Token Surprisal
  Paper • 2508.05988 • Published • 19
- A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
  Paper • 2508.07407 • Published • 88
- Compressing Chain-of-Thought in LLMs via Step Entropy
  Paper • 2508.03346 • Published • 7
- Reinforcement Learning in Vision: A Survey
  Paper • 2508.08189 • Published • 27

- UI-Venus Technical Report: Building High-performance UI Agents with RFT
  Paper • 2508.10833 • Published • 41
- inclusionAI/UI-Venus-Ground-7B
  Image-Text-to-Text • 8B • Updated • 2.74k • 14
- inclusionAI/UI-Venus-Ground-72B
  Image-Text-to-Text • 73B • Updated • 336 • 10
- inclusionAI/UI-Venus-Navi-7B
  Image-Text-to-Text • 8B • Updated • 364 • 7

- lusxvr/nanoVLM-222M
  Image-Text-to-Text • 0.2B • Updated • 287 • 95
- Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
  Paper • 2503.09516 • Published • 35
- AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
  Paper • 2505.24863 • Published • 98
- QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
  Paper • 2505.17667 • Published • 89

- ReLearn: Unlearning via Learning for Large Language Models
  Paper • 2502.11190 • Published • 30
- Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
  Paper • 2502.11089 • Published • 165
- Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents
  Paper • 2502.11357 • Published • 10
- DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding
  Paper • 2503.12797 • Published • 32