Zhenru committed on
Commit 0062179 · verified · 1 Parent(s): 9851cb5

Update README.md

Files changed (1)
  1. README.md +11 -5
README.md CHANGED
@@ -24,6 +24,10 @@ In addition to the mathematical Outcome Reward Model (ORM) Qwen2.5-Math-RM-72B,
  ![](http://qianwen-res.oss-accelerate-overseas.aliyuncs.com/Qwen2.5/Qwen2.5-Math-PRM/Qwen2.5-Math-PRM.png)
 
 
+ ## Model Details
+
+ For more details, please refer to our [paper](https://arxiv.org/pdf/2501.07301).
+
 
  ## Requirements
  * `transformers>=4.40.0` for Qwen2.5-Math models. The latest version is recommended.
@@ -122,10 +126,12 @@ print(step_reward) # [[0.9921875, 0.0047607421875, 0.32421875, 0.8203125]]
  If you find our work helpful, feel free to give us a citation.
 
  ```
- @article{yang2024qwen25mathtechnicalreportmathematical,
- title={Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement},
- author={An Yang and Beichen Zhang and Binyuan Hui and Bofei Gao and Bowen Yu and Chengpeng Li and Dayiheng Liu and Jianhong Tu and Jingren Zhou and Junyang Lin and Keming Lu and Mingfeng Xue and Runji Lin and Tianyu Liu and Xingzhang Ren and Zhenru Zhang},
- journal={arXiv preprint arXiv:2409.12122},
- year={2024}
+ @article{prmlessons,
+ title={The Lessons of Developing Process Reward Models in Mathematical Reasoning},
+ author={
+ Zhenru Zhang and Chujie Zheng and Yangzhen Wu and Beichen Zhang and Runji Lin and Bowen Yu and Dayiheng Liu and Jingren Zhou and Junyang Lin
+ },
+ journal={arXiv preprint arXiv:2501.07301},
+ year={2025}
  }
  ```