Yaorui SHI

yrshi

syr-cn

AI & ML interests

None yet

Recent Activity

upvoted a paper 6 days ago

V_{0.5}: Generalist Value Model as a Prior for Sparse RL Rollouts

Paper • 2603.10848 • Published Mar 11 • 16

upvoted 2 papers 7 days ago

VitaBench 2.0: Evaluating Personalized and Proactive Agents in Long-Term User Interactions

Paper • 2605.27141 • Published 8 days ago • 19

CUA-Gym: Scaling Verifiable Training Environments and Tasks for Computer-Use Agents

Paper • 2605.25624 • Published 9 days ago • 31

upvoted a paper 9 days ago

Orak: A Foundational Benchmark for Training and Evaluating LLM Agents on Diverse Video Games

Paper • 2506.03610 • Published Jun 4, 2025 • 10

upvoted a paper 10 days ago

DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards

Paper • 2605.21467 • Published 14 days ago • 204

upvoted a paper 12 days ago

SOD: Step-wise On-policy Distillation for Small Language Model Agents

Paper • 2605.07725 • Published 26 days ago • 25

upvoted 2 papers 14 days ago

GoLongRL: Capability-Oriented Long Context Reinforcement Learning with Multitask Alignment

Paper • 2605.19577 • Published 15 days ago • 58

SkillClaw: Let Skills Evolve Collectively with Agentic Evolver

Paper • 2604.08377 • Published Apr 9 • 291

upvoted 2 collections 14 days ago

Agent

Collection

114 items • Updated about 15 hours ago • 12

Papers

Collection

1 item • Updated 25 days ago • 1

upvoted a paper 19 days ago

Self-Distilled Agentic Reinforcement Learning

Paper • 2605.15155 • Published 20 days ago • 111

upvoted a paper 20 days ago

Training Long-Context Vision-Language Models Effectively with Generalization Beyond 128K Context

Paper • 2605.13831 • Published 21 days ago • 87

authored 3 papers 21 days ago

upvoted 2 papers 22 days ago

Auto-Rubric as Reward: From Implicit Preferences to Explicit Multimodal Generative Criteria

Paper • 2605.08354 • Published 26 days ago • 23

Rubric-based On-policy Distillation

Paper • 2605.07396 • Published 26 days ago • 41

upvoted a paper 26 days ago

Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning

Paper • 2605.06130 • Published 27 days ago • 111

upvoted a paper 27 days ago

Stream-R1: Reliability-Perplexity Aware Reward Distillation for Streaming Video Generation

Paper • 2605.03849 • Published 29 days ago • 126

upvoted a paper about 1 month ago

Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond

Paper • 2604.22748 • Published Apr 24 • 227
