---
license: openrail++
datasets:
- textdetox/multilingual_toxicity_dataset
language:
- en
- ru
- uk
- es
- de
- am
- ar
- zh
- hi
metrics:
- f1
---

This is an instance of [xlm-roberta-large](https://huggingface.co/FacebookAI/xlm-roberta-large) fine-tuned for binary toxicity classification on our compiled dataset [textdetox/multilingual_toxicity_dataset](https://huggingface.co/datasets/textdetox/multilingual_toxicity_dataset).

First, we held out a balanced 20% test set to check the model's adequacy. The model was then fine-tuned on the full data. The results on the test set are the following:

|          | Precision | Recall | F1     |
|----------|-----------|--------|--------|
| all_lang | 0.8713    | 0.8710 | 0.8710 |
| en       | 0.9650    | 0.9650 | 0.9650 |
| ru       | 0.9791    | 0.9790 | 0.9790 |
| uk       | 0.9267    | 0.9250 | 0.9251 |
| de       | 0.8791    | 0.8760 | 0.8758 |
| es       | 0.8700    | 0.8700 | 0.8700 |
| ar       | 0.7787    | 0.7780 | 0.7780 |
| am       | 0.7781    | 0.7780 | 0.7780 |
| hi       | 0.9360    | 0.9360 | 0.9360 |
| zh       | 0.7318    | 0.7320 | 0.7315 |
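
The snippet below is a minimal inference sketch using the 🤗 Transformers `text-classification` pipeline. The checkpoint ID is a placeholder (this card does not state the repository name), and the exact label strings depend on the uploaded config, so treat both as assumptions.

```python
# Minimal usage sketch, assuming a standard sequence-classification checkpoint.
from transformers import pipeline

# Placeholder: replace with the actual repository ID of this fine-tuned model.
MODEL_ID = "path/to/this-toxicity-classifier"

classifier = pipeline("text-classification", model=MODEL_ID)

texts = [
    "Have a nice day!",
    "You are a complete idiot.",
]

# Each prediction is a dict with a binary label (toxic / non-toxic,
# depending on the config) and a confidence score.
for text, pred in zip(texts, classifier(texts)):
    print(f"{text!r} -> {pred['label']} ({pred['score']:.3f})")
```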