Bridging Supervised Learning and Reinforcement Learning in Math Reasoning Paper • 2505.18116 • Published May 23 • 4
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models Paper • 2505.22617 • Published May 28 • 130
A Survey of Reinforcement Learning for Large Reasoning Models Paper • 2509.08827 • Published Sep 10 • 183
Large Scale Diffusion Distillation via Score-Regularized Continuous-Time Consistency Paper • 2510.08431 • Published Oct 2025 • 8
DiffusionNFT: Online Diffusion Reinforcement with Forward Process Paper • 2509.16117 • Published Sep 19 • 20
Contrastive Energy Prediction for Exact Energy-Guided Diffusion Sampling in Offline Reinforcement Learning Paper • 2304.12824 • Published Apr 25, 2023
Score Regularized Policy Optimization through Diffusion Behavior Paper • 2310.07297 • Published Oct 11, 2023 • 1
Noise Contrastive Alignment of Language Models with Explicit Rewards Paper • 2402.05369 • Published Feb 8, 2024 • 1
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation Paper • 2410.07864 • Published Oct 10, 2024 • 1
Aligning Diffusion Behaviors with Q-functions for Efficient Continuous Control Paper • 2407.09024 • Published Jul 12, 2024
Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment Paper • 2410.09347 • Published Oct 12, 2024 • 5