---
language:
  - ru
tags:
  - russian
  - pretraining
license: mit
widget:
  - text: '[CLS] привет [SEP] привет! [SEP] как дела? [RESPONSE_TOKEN] норм'
    example_title: Dialog example 1
  - text: '[CLS] привет [SEP] привет! [SEP] как дела? [RESPONSE_TOKEN] соси вола'
    example_title: Dialog example 2
  - text: >-
      [CLS] здравствуйте товарищ [RESPONSE_TOKEN] что это за говно на тебе
      надето?))
    example_title: Dialog example 3
---

# dialog-inapropriate-messages-classifier

A BERT-based classifier from Skoltech, finetuned on contextual dialog data with four labels.

## Training

Skoltech/russian-inappropriate-messages was finetuned on multiclass data with four classes:

  1. **OK** -- the message is acceptable in context and does not intend to offend or otherwise harm the speaker's reputation.
  2. **Toxic** -- the message might be seen as offensive in the given context.
  3. **Severe toxic** -- the message is offensive, full of anger, and written to provoke a fight or other discomfort.
  4. **Risks** -- the message touches on sensitive topics (e.g. religion, politics) and can harm the speaker's reputation.

The model was finetuned on DATASET_LINK.
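The widget examples above show the input format the classifier expects: dialog context turns joined by `[SEP]`, with the reply to be classified placed after `[RESPONSE_TOKEN]`. A minimal sketch of building such an input (the helper name is hypothetical, and the commented `transformers` call is assumed usage, not confirmed by this repo):

```python
# Sketch: build the input string this classifier expects, following the
# widget examples above. build_input is a hypothetical helper, not part
# of this repository.

def build_input(context, reply):
    """Join dialog turns with [SEP] and append the reply after [RESPONSE_TOKEN]."""
    return "[CLS] " + " [SEP] ".join(context) + " [RESPONSE_TOKEN] " + reply

text = build_input(["привет", "привет!", "как дела?"], "норм")
print(text)
# The string can then be fed to the model, e.g. via transformers (assumed usage):
#   from transformers import pipeline
#   clf = pipeline("text-classification", model="<this model's repo id>")
#   clf(text)
```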

## Evaluation results

The model achieves the following results:

| Dataset | OK F1-score | TOXIC F1-score | SEVERE TOXIC F1-score | RISKS F1-score |
|---|---|---|---|---|
| DATASET_TWITTER val.csv | 0.896 | 0.348 | 0.490 | 0.591 |
| DATASET_GENA val.csv | 0.940 | 0.295 | 0.729 | 0.46 |

This work was done by Nikita Stepanov during an internship at Tinkoff.