Richard Zhuang PRO

RZ412

https://richardzhuang0412.github.io

AI & ML interests

LLM Routing, LLM + Games, Post-Training, Agents

Recent Activity

updated a dataset 11 days ago

DCAgent2/bfcl-parity

published a dataset 11 days ago

DCAgent2/bfcl-parity

updated a dataset 23 days ago

RZ412/PokerBench

upvoted 2 collections about 2 months ago

OpenThinker-Agent

5 items • Updated Dec 6, 2025 • 6

Olmo 3 Post-training

All artifacts for post-training Olmo 3. Datasets follow the model that resulted from training on them. • 32 items • Updated Dec 23, 2025 • 47

upvoted a paper about 2 months ago

DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle

Paper • 2512.04324 • Published Dec 3, 2025 • 154

upvoted a paper 4 months ago

DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

Paper • 2509.25454 • Published Sep 29, 2025 • 145

upvoted an article 6 months ago

SmolLM3: smol, multilingual, long-context reasoner

Article • Jul 8, 2025 • 752

upvoted 2 collections 7 months ago

Reasoning Datasets

50 items • Updated Jun 8, 2025 • 10

Reasoning Models

53 items • Updated Jun 8, 2025 • 1

upvoted an article 10 months ago

Reasoning Datasets Competition

Article • Apr 9, 2025 • 38

upvoted a paper about 1 year ago

PokerBench: Training Large Language Models to become Professional Poker Players

Paper • 2501.08328 • Published Jan 14, 2025 • 19

upvoted a paper over 1 year ago

EmbedLLM: Learning Compact Representations of Large Language Models

Paper • 2410.02223 • Published Oct 3, 2024 • 3