MinghaoYang committed
Commit • c4e1e6b
1 Parent(s): 0132940
Update README.md
README.md CHANGED
@@ -31,7 +31,7 @@ The evaluation follows the following rules:
 3. If the chosen answer is slightly better than rejected answer, set 'Magnitude' value $d$ to 1.

 After that, we train our model with the scaled BT loss. The scaled BT loss is defined as:
-$$\mathcal{L}_{Scaled-BT} = -\alpha*d*log(\sigma(r_{\theta}(x, y_{c})
+$$\mathcal{L}_{Scaled-BT} = -\alpha*d*log(\sigma(r_{\theta}(x, y_{c})-r_{\theta}(x, y_{r})))$$
 where $\alpha$ is the scaling factor. You can find more details about scaled BT loss here [1](https://arxiv.org/pdf/2410.01257).

 > Here we look at the performance gains of scaled BT loss from a different perspective than [1](https://arxiv.org/pdf/2410.01257). The scaled BT loss can be thought of as a form of cross-entropy, where the distribution of the difference of the logits produced by the model is sensitive to the distribution of the magnitude. Therefore, we improve the difference of the values in the 'Magnitude' column from 1, 2, 3 to 1, 3, 10 and finally get better performance.
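
For illustration only, the corrected formula in the `+` line of this hunk can be written out as a small per-example function. This is a minimal sketch, not the repository's training code: the names `log_sigmoid` and `scaled_bt_loss` are hypothetical, the rewards are passed in as plain floats standing in for $r_{\theta}(x, y_{c})$ and $r_{\theta}(x, y_{r})$, and the default `alpha=1.0` is an assumption, since the README does not state the value used.

```python
import math


def log_sigmoid(z: float) -> float:
    """Numerically stable log(sigmoid(z))."""
    # For z >= 0: log(sigma(z)) = -log(1 + exp(-z))
    # For z <  0: log(sigma(z)) = z - log(1 + exp(z))
    # Branching keeps the argument of exp() non-positive, avoiding overflow.
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def scaled_bt_loss(r_chosen: float, r_rejected: float,
                   magnitude: float, alpha: float = 1.0) -> float:
    """Scaled Bradley-Terry loss for a single preference pair.

    Implements -alpha * d * log(sigma(r_chosen - r_rejected)), where
    `magnitude` plays the role of d from the 'Magnitude' column.
    `alpha=1.0` is an assumed default, not a value taken from the README.
    """
    margin = r_chosen - r_rejected
    return -alpha * magnitude * log_sigmoid(margin)


# Usage: the same reward margin weighted by the magnitudes 1, 3, 10
# mentioned in the README note (reward values here are made up).
losses = [scaled_bt_loss(0.5, 0.0, d) for d in (1, 3, 10)]
```

Because $\frac{d}{dz}\log\sigma(z) = 1 - \sigma(z)$, the gradient of this loss with respect to the margin is $-\alpha \, d \, (1 - \sigma(\text{margin}))$, so $d$ acts as a per-pair weight: under this sketch, a pair labelled 10 pulls ten times as hard on the margin as a pair labelled 1 at the same margin.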