Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning • Paper • 2510.03259 • Published 19 days ago • 53
No Prompt Left Behind: Exploiting Zero-Variance Prompts in LLM Reinforcement Learning via Entropy-Guided Advantage Shaping • Paper • 2509.21880 • Published 20 days ago • 39
ReviewScore: Misinformed Peer Review Detection with Large Language Models • Paper • 2509.21679 • Published 20 days ago • 62