nicholasKluge committed on
Commit
055c256
1 Parent(s): ccaa04e

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -29,7 +29,7 @@ co2_eq_emissions:
  ---
  # ToxicityModel
 
- The `ToxicityModel` is a fine-tuned version of [RoBERTa](https://huggingface.co/roberta-base) that can be used to score the toxicity of a sentence.
+ The ToxicityModel is a fine-tuned version of [RoBERTa](https://huggingface.co/roberta-base) that can be used to score the toxicity of a sentence.
 
  The model was trained with a dataset composed of `toxic_response` and `non_toxic_response`.
 
@@ -52,9 +52,9 @@ This repository has the [source code](https://github.com/Nkluge-correa/Aira) use
 
  ⚠️ THE EXAMPLES BELOW CONTAIN TOXIC/OFFENSIVE LANGUAGE ⚠️
 
- The `ToxicityModel` was trained as an auxiliary reward model for RLHF training (its logit outputs can be treated as penalizations/rewards). Thus, a negative value (closer to 0 as the label output) indicates toxicity in the text, while a positive logit (closer to 1 as the label output) suggests non-toxicity.
+ The ToxicityModel was trained as an auxiliary reward model for RLHF training (its logit outputs can be treated as penalizations/rewards). Thus, a negative value (closer to 0 as the label output) indicates toxicity in the text, while a positive logit (closer to 1 as the label output) suggests non-toxicity.
 
- Here's an example of how to use the `ToxicityModel` to score the toxicity of a text:
+ Here's an example of how to use the ToxicityModel to score the toxicity of a text:
 
  ```python
  from transformers import AutoTokenizer, AutoModelForSequenceClassification
@@ -138,4 +138,4 @@ Idiot, Dumbass, Moron, Stupid, Fool, Fuck Face. Score: -7.300
 
  ## License
 
- The `ToxicityModel` is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.
+ ToxicityModel is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.
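The Python example referenced in the second hunk is cut off at the hunk boundary, so only its first import line is visible above. Below is a minimal sketch of the kind of scoring the README describes, under assumptions this diff does not confirm: the Hub model ID `nicholaskluge/ToxicityModel`, a single-logit classification head, and standalone strings as input (the repository's full example may instead pair a prompt with a response).

```python
# Rough sketch, not the repository's full example.
# Assumed: Hub ID "nicholaskluge/ToxicityModel" and a single-logit output head.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "nicholaskluge/ToxicityModel"  # assumed Hub ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Per the README: a negative logit indicates toxic text, a positive logit
# suggests non-toxic text, so the raw logit can double as an RLHF penalty/reward.
texts = [
    "Thank you, that was a very helpful explanation.",      # illustrative non-toxic input
    "Idiot, Dumbass, Moron, Stupid, Fool, Fuck Face.",       # offensive example quoted from the README
]
for text in texts:
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        score = model(**inputs).logits[0].item()  # assumes a single logit per input
    print(f"{text} Score: {score:.3f}")
```

In this setup the sign of the logit carries the decision: it can be thresholded at zero for filtering, or passed through unchanged as the penalty/reward signal during RLHF, as the README's description suggests.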