Ray2333/Gemma-2B-rewardmodel-baseline

Ray2333 committed on Jul 6

Commit ad12642 (1 parent: c8dbb02)

Update README.md

Files changed (1)

README.md +1 -1


@@ -16,7 +16,7 @@ We evaluate this reward model on the [reward model benchmark](https://huggingfac
 |       Model               | Average       |  Chat     |     Chat Hard      |     Safety      |     Reasoning     |
 |:-------------------------:|:-------------:|:---------:|:---------:|:--------:|:-----------:|
-|  [**Ray2333/GRM-Gemma-2B-sftreg**](https://huggingface.co/Ray2333/GRM-Gemma-2B-sftreg)(Ours, 2B) | 75.1    |   95.5  |  48.2 |   80.0 | 76.8     |
+|  [**Ray2333/GRM-Gemma-2B-sftreg**](https://huggingface.co/Ray2333/GRM-Gemma-2B-sftreg)(Ours, 2B) | 75.3    |   95.5  |  48.7 |   80.0 | 76.8     |
 |    berkeley-nest/Starling-RM-7B-alpha      (7B)                          |    74.6      |   98      |   43.4   |   88.6  |    74.6    |
 |  **Ray2333/Gemma-2B-rewardmodel-baseline**(Ours, 2B) | 73.7    |   94.1  |  46.1 |  79.6 |  75.0   |
 |     stabilityai/stablelm-zephyr-3b             (3B)                                 |    73.1      |   86.3   |   60.1   |   70.3  |    75.7     |
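
The change to the GRM-Gemma-2B-sftreg row (Chat Hard 48.2 to 48.7, Average 75.1 to 75.3) can be sanity-checked with a short sketch. It assumes that Average is the unweighted mean of the four category scores, rounded half-up to one decimal. That formula is an assumption, not something this README excerpt states, and it does not reproduce the Starling-RM-7B-alpha row, so that row's Average is evidently computed differently. The helper name `average_score` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def average_score(scores):
    """Unweighted mean of category scores, rounded half-up to one decimal.

    Assumption: this is how the "Average" column is derived for the rows
    checked below; the README excerpt does not confirm the formula.
    Scores are passed as strings so Decimal keeps them exact.
    """
    values = [Decimal(s) for s in scores]
    mean = sum(values) / Decimal(len(values))
    return mean.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


# Chat, Chat Hard, Safety, Reasoning for Ray2333/GRM-Gemma-2B-sftreg
old_row = ["95.5", "48.2", "80.0", "76.8"]  # before this commit
new_row = ["95.5", "48.7", "80.0", "76.8"]  # after this commit

print(average_score(old_row))  # 75.1
print(average_score(new_row))  # 75.3
```

Under that assumption, the 0.5-point Chat Hard correction accounts for the whole change in the Average column, so the two edited cells are consistent with each other.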