5 17 1

Yang Yue

Yang130

https://yueyang130.github.io

AI & ML interests

None yet

Recent Activity

upvoted a paper about 1 month ago

DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

upvoted a paper about 2 months ago

GenExam: A Multidisciplinary Text-to-Image Exam

upvoted a paper 4 months ago

The Invisible Leash: Why RLVR May Not Escape Its Origin

View all activity

Organizations

upvoted a paper about 1 month ago

DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

Paper • 2509.25454 • Published Sep 29 • 136

upvoted a paper about 2 months ago

GenExam: A Multidisciplinary Text-to-Image Exam

Paper • 2509.14232 • Published Sep 17 • 21

upvoted a paper 4 months ago

The Invisible Leash: Why RLVR May Not Escape Its Origin

Paper • 2507.14843 • Published Jul 20 • 84

upvoted an article 4 months ago

Article

Kimina-Prover: Applying Test-time RL Search on Large Formal Reasoning Models

and 18 others •

Jul 10

• 53

upvoted a paper 5 months ago

SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics

Paper • 2506.01844 • Published Jun 2 • 141

upvoted an article 5 months ago

Article

SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data

Jun 3

• 278

upvoted 3 papers 5 months ago

Taming LLMs by Scaling Learning Rates with Gradient Grouping

Paper • 2506.01049 • Published Jun 1 • 38

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2 • 185

Large Language Models for Data Synthesis

Paper • 2505.14752 • Published May 20 • 49

upvoted 2 papers 6 months ago

AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning

Paper • 2505.16400 • Published May 22 • 35

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6 • 185

upvoted 2 papers 7 months ago

CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning

Paper • 2504.13820 • Published Apr 18 • 16

Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Paper • 2504.13837 • Published Apr 18 • 135

upvoted 2 papers about 1 year ago

DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution

Paper • 2411.02359 • Published Nov 4, 2024 • 13

How Far is Video Generation from World Model: A Physical Law Perspective

Paper • 2411.02385 • Published Nov 4, 2024 • 34

upvoted a paper over 1 year ago

Model Surgery: Modulating LLM's Behavior Via Simple Parameter Editing

Paper • 2407.08770 • Published Jul 11, 2024 • 21