DeepAnalyze: Agentic Large Language Models for Autonomous Data Science Paper • 2510.16872 • Published 7 days ago • 83
Search Self-play: Pushing the Frontier of Agent Capability without Supervision Paper • 2510.18821 • Published 5 days ago • 14
From Pixels to Words -- Towards Native Vision-Language Primitives at Scale Paper • 2510.14979 • Published 10 days ago • 64
R-Horizon: How Far Can Your Large Reasoning Model Really Go in Breadth and Depth? Paper • 2510.08189 • Published 17 days ago • 25
Imperceptible Jailbreaking against Large Language Models Paper • 2510.05025 • Published 20 days ago • 33
SwiReasoning: Switch-Thinking in Latent and Explicit for Pareto-Superior Reasoning LLMs Paper • 2510.05069 • Published 20 days ago • 12
Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards Paper • 2509.24981 • Published 27 days ago • 29
Quantile Advantage Estimation for Entropy-Safe Reasoning Paper • 2509.22611 • Published about 1 month ago • 117
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth Paper • 2509.03867 • Published Sep 4 • 208
Inverse IFEval: Can LLMs Unlearn Stubborn Training Conventions to Follow Real Instructions? Paper • 2509.04292 • Published Sep 4 • 57
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency Paper • 2508.18265 • Published Aug 25 • 201
InternVL3.5 Collection • This collection includes all released checkpoints of InternVL3.5, covering different training stages (e.g., Pretraining, SFT, MPO, Cascade RL). • 54 items • Updated 28 days ago • 99
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification Paper • 2508.05629 • Published Aug 7 • 177