Hugging Face
Tianjian Li
dogtooth
4 followers · 7 following
https://tianjianl.github.io
truthbutcher
AI & ML interests
None yet
Recent Activity
upvoted a paper 6 days ago
The Alignment Waltz: Jointly Training Agents to Collaborate for Safety
upvoted a paper 6 days ago
Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense
upvoted a paper 13 days ago
RESTRAIN: From Spurious Votes to Signals -- Self-Driven RL with Self-Penalization
Organizations
Papers: 3
arxiv:2509.02534
arxiv:2505.02363
arxiv:2310.00840
Models: 0 (none public yet)
Datasets: 219
dogtooth/divpo_llama_3.3_rho_0.3 • Viewer • Updated 26 days ago • 10k • 26
dogtooth/helpsteer2_binarized_filtered • Viewer • Updated Apr 5 • 2.51k • 5
dogtooth/Big-Math-RL-Verified • Viewer • Updated Apr 3 • 1.52M • 20
dogtooth/default_project_dev_test • Viewer • Updated Mar 26 • 4k • 7
dogtooth/Big-Math-Selected-500 • Viewer • Updated Mar 25 • 3.5k • 15
dogtooth/Big-Math-RL-Verified-Chinese • Viewer • Updated Mar 6 • 251k • 8
dogtooth/mmlu • Viewer • Updated Mar 5 • 14.2k • 286
dogtooth/boolq • Viewer • Updated Mar 5 • 3.27k • 9
dogtooth/gpqa • Viewer • Updated Mar 5 • 448 • 21
dogtooth/math_qa • Viewer • Updated Mar 5 • 2.99k • 7