MinghaoYang committed on
Commit c4e1e6b
1 Parent(s): 0132940

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -31,7 +31,7 @@ The evaluation follows the following rules:
31    3. If the chosen answer is slightly better than rejected answer, set 'Magnitude' value $d$ to 1.
32
33    After that, we train our model with the scaled BT loss. The scaled BT loss is defined as:
34  - $$\mathcal{L}_{Scaled-BT} = -\alpha*d*log(\sigma(r_{\theta}(x, y_{c}))-\sigma(r_{\theta}(x, y_{r})))$$
34  + $$\mathcal{L}_{Scaled-BT} = -\alpha*d*log(\sigma(r_{\theta}(x, y_{c})-r_{\theta}(x, y_{r})))$$
35    where $\alpha$ is the scaling factor. You can find more details about scaled BT loss here [1](https://arxiv.org/pdf/2410.01257).
36
37    > Here we look at the performance gains of scaled BT loss from a different perspective than [1](https://arxiv.org/pdf/2410.01257). The scaled BT loss can be thought of as a form of cross-entropy, where the distribution of the difference of the logits produced by the model is sensitive to the distribution of the magnitude. Therefore, we improve the difference of the values in the 'Magnitude' column from 1, 2, 3 to 1, 3, 10 and finally get better performance.
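
The substance of this one-line fix: in the Bradley-Terry formulation, the sigmoid must wrap the *difference* of the two rewards, $\sigma(r_{\theta}(x, y_{c}) - r_{\theta}(x, y_{r}))$, not each reward separately. A minimal PyTorch sketch of the corrected scaled BT loss, assuming a reward model that returns per-example scalar rewards (the function name `scaled_bt_loss` and the tensor shapes are illustrative, not code from this repository):

```python
import torch
import torch.nn.functional as F

def scaled_bt_loss(chosen_rewards: torch.Tensor,
                   rejected_rewards: torch.Tensor,
                   magnitude: torch.Tensor,
                   alpha: float = 1.0) -> torch.Tensor:
    """Scaled BT loss: -alpha * d * log(sigmoid(r_c - r_r)), averaged over the batch.

    chosen_rewards, rejected_rewards: shape (batch,), the scalar rewards
        r_theta(x, y_c) and r_theta(x, y_r).
    magnitude: shape (batch,), the 'Magnitude' value d (e.g. 1, 3, or 10).
    alpha: the global scaling factor.
    """
    # F.logsigmoid(r_c - r_r) computes log(sigmoid(r_c - r_r)) stably;
    # note the sigmoid is applied to the reward *difference*, matching
    # the corrected formula on the '+' line above.
    margin = chosen_rewards - rejected_rewards
    return -(alpha * magnitude * F.logsigmoid(margin)).mean()
```

With `magnitude` drawn from {1, 3, 10} as the README describes, strongly preferred pairs contribute up to ten times the gradient of marginally preferred ones, which is the "scaled" part of the loss; using `F.logsigmoid` also avoids the numerical issues of composing `torch.log` with `torch.sigmoid` directly.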