Commit e0a9096 by dardem (1 parent: 5cbf5c5)

Update README.md

README.md +30 -0
@@ -1,3 +1,33 @@
  ---
  license: openrail++
+ datasets:
+ - textdetox/multilingual_toxicity_dataset
+ language:
+ - en
+ - ru
+ - uk
+ - es
+ - de
+ - am
+ - ar
+ - zh
+ - hi
+ metrics:
+ - f1
  ---
+ This is an instance of [xlm-roberta-large](https://huggingface.co/FacebookAI/xlm-roberta-large) fine-tuned for binary toxicity classification on our compiled dataset [textdetox/multilingual_toxicity_dataset](https://huggingface.co/datasets/textdetox/multilingual_toxicity_dataset).
+
+ First, we held out a balanced 20% test set to check the model's adequacy. The model was then fine-tuned on the full data. The results on the held-out test set are as follows:
+
+ | Language | Precision | Recall | F1     |
+ |----------|-----------|--------|--------|
+ | all_lang | 0.8713    | 0.8710 | 0.8710 |
+ | en       | 0.9650    | 0.9650 | 0.9650 |
+ | ru       | 0.9791    | 0.9790 | 0.9790 |
+ | uk       | 0.9267    | 0.9250 | 0.9251 |
+ | de       | 0.8791    | 0.8760 | 0.8758 |
+ | es       | 0.8700    | 0.8700 | 0.8700 |
+ | ar       | 0.7787    | 0.7780 | 0.7780 |
+ | am       | 0.7781    | 0.7780 | 0.7780 |
+ | hi       | 0.9360    | 0.9360 | 0.9360 |
+ | zh       | 0.7318    | 0.7320 | 0.7315 |
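
The evaluation protocol the README describes (a balanced 20% held-out test set scored with precision, recall and F1) can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the authors' code: the helper names `balanced_holdout` and `macro_prf` are hypothetical, "balanced" is read here as an equal number of test examples per class, and macro averaging is an assumption, since the README does not say how the per-class scores were averaged.

```python
import random
from collections import defaultdict


def balanced_holdout(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with an equal number of test items per class.

    Assumption: "balanced" means class-balanced; the README does not define it.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    # Per-class quota: about test_fraction of the data in total, but never
    # more items than the smallest class can supply.
    per_class = min(
        int(len(labels) * test_fraction / len(by_class)),
        min(len(ids) for ids in by_class.values()),
    )
    test_idx = []
    for ids in by_class.values():
        test_idx.extend(rng.sample(ids, per_class))
    held_out = set(test_idx)
    train_idx = [i for i in range(len(labels)) if i not in held_out]
    return train_idx, sorted(test_idx)


def macro_prf(y_true, y_pred, classes=(0, 1)):
    """Macro-averaged precision, recall and F1 (averaging mode is an assumption)."""
    precisions, recalls, f1s = [], [], []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(classes)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy data only: 1 = toxic, 0 = non-toxic (label encoding is an assumption).
toy_labels = [0] * 60 + [1] * 40
toy_train, toy_test = balanced_holdout(toy_labels, test_fraction=0.2, seed=0)
toy_true = [toy_labels[i] for i in toy_test]
toy_pred = list(toy_true)
toy_pred[0] = 1 - toy_pred[0]  # inject one error so the scores are not all 1.0
print(macro_prf(toy_true, toy_pred))
```

On a class-balanced test set, macro and weighted averages coincide, which would be consistent with the near-identical precision, recall and F1 values in the table, though the README does not confirm which average was reported.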