Entropy Ratio Clipping as a Soft Global Constraint for Stable Reinforcement Learning Paper • 2512.05591 • Published Dec 5, 2025 • 17
GoLongRL: Capability-Oriented Long Context Reinforcement Learning with Multitask Alignment Paper • 2605.19577 • Published 12 days ago • 58
ProRL: Effective Reinforcement Learning for Proactive Recommendation via Rectified Policy Gradient Estimation Paper • 2605.28293 • Published 4 days ago • 80
Open Multimodal Retrieval-Augmented Factual Image Generation Paper • 2510.22521 • Published Oct 26, 2025 • 31
GORACS: Group-level Optimal Transport-guided Coreset Selection for LLM-based Recommender Systems Paper • 2506.04015 • Published Jun 4, 2025 • 1