Shaobai Jiang
shaobaij
AI & ML interests
None yet
Recent Activity
upvoted a paper about 4 hours ago
Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward
upvoted a paper about 4 hours ago
Pretraining with hierarchical memories: separating long-tail and common knowledge
upvoted a paper about 4 hours ago
Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation
Organizations
None yet