textdetox/xlmr-large-toxicity-classifier

Text Classification

dardem committed on Feb 14, 2024

Commit e0a9096 · verified · 1 Parent(s): 5cbf5c5

Update README.md

Files changed (1)

README.md +30 -0

@@ -1,3 +1,33 @@
 ---
 license: openrail++
+datasets:
+- textdetox/multilingual_toxicity_dataset
+language:
+- en
+- ru
+- uk
+- es
+- de
+- am
+- ar
+- zh
+- hi
+metrics:
+- f1
 ---
+This is an instance of [xlm-roberta-large](https://huggingface.co/FacebookAI/xlm-roberta-large) fine-tuned for binary toxicity classification on our compiled dataset [textdetox/multilingual_toxicity_dataset](https://huggingface.co/datasets/textdetox/multilingual_toxicity_dataset).
+First, we held out a balanced 20% test set to check the model's adequacy. Then, the model was fine-tuned on the full data. The results on the test set are as follows:
+|          | Precision | Recall | F1     |
+|----------|-----------|--------|--------|
+| all_lang | 0.8713    | 0.8710 | 0.8710 |
+| en       | 0.9650    | 0.9650 | 0.9650 |
+| ru       | 0.9791    | 0.9790 | 0.9790 |
+| uk       | 0.9267    | 0.9250 | 0.9251 |
+| de       | 0.8791    | 0.8760 | 0.8758 |
+| es       | 0.8700    | 0.8700 | 0.8700 |
+| ar       | 0.7787    | 0.7780 | 0.7780 |
+| am       | 0.7781    | 0.7780 | 0.7780 |
+| hi       | 0.9360    | 0.9360 | 0.9360 |
+| zh       | 0.7318    | 0.7320 | 0.7315 |
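
The evaluation protocol described in the README (hold out a balanced 20% test set, then report precision, recall and F1) could be reproduced roughly as follows. This is a minimal sketch, not the authors' code. It assumes that "balanced" means an equal number of examples per label in the test set and that the reported scores are macro-averaged, neither of which the README states. The function names `balanced_holdout` and `macro_prf` are hypothetical, and rows are assumed to be `(text, label)` pairs.

```python
import random
from collections import defaultdict


def balanced_holdout(rows, test_fraction=0.2, seed=0):
    """Split (text, label) rows so the test set has equal counts per label.

    Assumption: "balanced" means class-balanced; the README does not say.
    """
    if not rows:
        return [], []
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[1]].append(row)
    # Give each label an equal share of the test budget, capped by the rarest label.
    budget = int(len(rows) * test_fraction)
    quota = min(budget // len(by_label), min(len(v) for v in by_label.values()))
    train, test = [], []
    for label in sorted(by_label):
        items = list(by_label[label])
        rng.shuffle(items)
        test.extend(items[:quota])
        train.extend(items[quota:])
    return train, test


def macro_prf(y_true, y_pred):
    """Return macro-averaged (precision, recall, F1) over all observed labels.

    Assumption: macro averaging; the README does not name the averaging scheme.
    """
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0, 0.0, 0.0
    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n
```

To get a per-language breakdown like the table above, one would filter the held-out rows by language before calling `macro_prf`. On a class-balanced test set, macro-averaged precision, recall and F1 tend to sit close together, which is consistent with (though not proof of) the near-identical columns in the table.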