arxiv:2412.16145
Huaijie Wang
jwhj
AI & ML interests
None yet
Recent Activity
Updated a dataset 4 days ago: jwhj/OREO-Qwen2.5-Math-1.5B-Train
Commented on a paper 4 days ago: Offline Reinforcement Learning for LLM Multi-Step Reasoning
Updated a model 7 days ago: jwhj/Qwen2.5-Math-1.5B-OREO-Value
Organizations
None yet