---
language: ["ru"]
tags:
- russian
- pretraining
license: mit
widget:
- text: "[CLS] привет [SEP] привет! [SEP] как дела? [RESPONSE_TOKEN] норм"
example_title: "Dialog example 1"
- text: "[CLS] привет [SEP] привет! [SEP] как дела? [RESPONSE_TOKEN] соси вола"
example_title: "Dialog example 2"
- text: "[CLS] здравствуйте товарищ [RESPONSE_TOKEN] что это за говно на тебе надето?))"
example_title: "Dialog example 3"
---
# dialog-inapropriate-messages-classifier
A [BERT classifier from Skoltech](https://huggingface.co/Skoltech/russian-inappropriate-messages), fine-tuned on contextual dialog data with four labels.
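As the widget examples above show, the model expects previous dialog turns joined by `[SEP]` and the message to classify placed after `[RESPONSE_TOKEN]`. A minimal sketch of a helper that assembles this input string (the function name is hypothetical, not part of the model's API):

```python
def build_dialog_input(context, response):
    """Join previous dialog turns with [SEP] and append the message
    to classify after [RESPONSE_TOKEN], matching the format shown
    in the widget examples above."""
    return "[CLS] " + " [SEP] ".join(context) + " [RESPONSE_TOKEN] " + response

print(build_dialog_input(["привет", "привет!", "как дела?"], "норм"))
# [CLS] привет [SEP] привет! [SEP] как дела? [RESPONSE_TOKEN] норм
```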
# Training
*Skoltech/russian-inappropriate-messages* was fine-tuned on a multiclass dataset with four classes:
1) OK label -- the message is acceptable in context and does not intend to offend or otherwise harm the reputation of the speaker.
2) Toxic label -- the message might be seen as offensive in the given context.
3) Severe toxic label -- the message is offensive, aggressive, and written to provoke a fight or other discomfort.
4) Risks label -- the message touches on sensitive topics (e.g. religion, politics) and can harm the reputation of the speaker.
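Assuming the classifier head emits one logit per class in the order listed above (an assumption -- the actual label order is not stated in this card), turning raw scores into one of the four labels is a plain argmax. A hedged sketch:

```python
# Label order is assumed to match the list above; verify against the
# model's config (id2label) before relying on it.
LABELS = ["OK", "Toxic", "Severe toxic", "Risks"]

def decode_label(logits):
    """Return the label whose logit is highest."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return LABELS[best]

print(decode_label([2.1, -0.3, 0.4, -1.2]))  # OK
```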
The model was finetuned on DATASET_LINK.
# Evaluation results
The model achieves the following results:

|                         | OK F1-score | TOXIC F1-score | SEVERE TOXIC F1-score | RISKS F1-score |
|-------------------------|-------------|----------------|-----------------------|----------------|
| DATASET_TWITTER val.csv | 0.896       | 0.348          | 0.490                 | 0.591          |
| DATASET_GENA val.csv    | 0.940       | 0.295          | 0.729                 | 0.460          |
The work was done during an internship at Tinkoff by [Nikita Stepanov](https://huggingface.co/nikitast).