Boyuan Zheng

boyuanzheng010

https://boyuanzheng010.github.io/

AI & ML interests

Language Agents, Multilinguality

Recent Activity

upvoted 2 papers 21 days ago

Agent Learning via Early Experience

Paper • 2510.08558 • Published 22 days ago • 254

The Alignment Waltz: Jointly Training Agents to Collaborate for Safety

Paper • 2510.08240 • Published 22 days ago • 40

upvoted a paper about 2 months ago

LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model

Paper • 2509.00676 • Published Aug 31 • 83

updated a dataset 3 months ago

osunlp/WebGuard

Viewer • Updated Jul 28 • 6k • 44

published a dataset 3 months ago

osunlp/WebGuard

Viewer • Updated Jul 28 • 6k • 44

updated a dataset 3 months ago

boyuanzheng010/webguard_test

Viewer • Updated Jul 24 • 6.49k • 5

published a dataset 3 months ago

boyuanzheng010/webguard_test

Viewer • Updated Jul 24 • 6.49k • 5

upvoted a paper 4 months ago

Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge

Paper • 2506.21506 • Published Jun 26 • 51

updated a dataset 6 months ago

boyuanzheng010/webguard

Viewer • Updated May 16 • 6.49k • 8 • 1

published a dataset 6 months ago

boyuanzheng010/webguard

Viewer • Updated May 16 • 6.49k • 8 • 1

upvoted a paper 7 months ago

AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories

Paper • 2504.08942 • Published Apr 11 • 28

liked a Space 7 months ago

Agent Reward Bench Demo

💻

Explore agent trajectories and judgments in web benchmarks

upvoted a paper 7 months ago

SkillWeaver: Web Agents can Self-Improve by Discovering and Honing Skills

Paper • 2504.07079 • Published Apr 9 • 12

commented on a paper 7 months ago

SkillWeaver: Web Agents can Self-Improve by Discovering and Honing Skills

Paper • 2504.07079 • Published Apr 9 • 12

published a model 7 months ago

boyuanzheng010/DeepSeek-R1-Distill-Qwen-1.5B-GRPO

Updated Apr 6

updated a model 7 months ago

boyuanzheng010/Qwen2.5-1.5B-Open-R1-Distill

Text Generation • 2B • Updated Apr 2 • 4

published a model 7 months ago

boyuanzheng010/Qwen2.5-1.5B-Open-R1-Distill

Text Generation • 2B • Updated Apr 2 • 4

liked a Space 7 months ago

Online-Mind2Web Leaderboard

🌐

Display and analyze evaluation results for agents

upvoted an article 8 months ago

Open R1: Update #3

Article • and 9 others • Mar 11 • 295

liked a Space 8 months ago

Safearena Leaderboard

🏃

SafeArena Leaderboard