metadata
language:
- vi
tags:
- classification
widget:
- text: Xấu vcl
example_title: Offensive
- text: Đồ ngu
example_title: Hate
- text: Xin chào chúc một ngày tốt lành
example_title: Normal
PhoBERT fine-tuned for Vietnamese hate speech detection
Datasets
- VLSP2019: Hate Speech Detection on Social Networks Dataset
- ViHSD: Vietnamese Hate Speech Detection dataset
Class names
- LABEL_0 : Normal
- LABEL_1 : OFFENSIVE
- LABEL_2 : HATE
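The pipeline reports the raw identifiers (LABEL_0, LABEL_1, LABEL_2), so a small lookup can make results easier to read. This is a minimal sketch built only from the class list above; the names `ID2NAME` and `readable_label` are hypothetical and not part of the model or of transformers.

```python
# Mapping copied from the class list above.
# ID2NAME and readable_label are illustrative names, not part of any library.
ID2NAME = {
    "LABEL_0": "Normal",
    "LABEL_1": "OFFENSIVE",
    "LABEL_2": "HATE",
}


def readable_label(raw_label: str) -> str:
    """Translate a raw label such as 'LABEL_1' into its class name.

    Unknown labels are returned unchanged rather than raising an error.
    """
    return ID2NAME.get(raw_label, raw_label)


print(readable_label("LABEL_2"))
```

Returning unknown labels unchanged is a design choice for this sketch; raising a `KeyError` instead may be preferable when a mismatch should fail loudly.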
Usage example with TextClassificationPipeline
from transformers import AutoModelForSequenceClassification, AutoTokenizer, TextClassificationPipeline
# Load the fine-tuned checkpoint and its tokenizer from the Hugging Face Hub
model = AutoModelForSequenceClassification.from_pretrained("tsdocode/phobert-finetune-hatespeech", num_labels=3)
tokenizer = AutoTokenizer.from_pretrained("tsdocode/phobert-finetune-hatespeech")
# return_all_scores=True returns a score for every class, not only the top one
pipe = TextClassificationPipeline(model=model, tokenizer=tokenizer, return_all_scores=True)
# Returns a nested list with one dict per class, e.g.
# [[{'label': 'LABEL_0', 'score': ...}, {'label': 'LABEL_1', 'score': ...}, {'label': 'LABEL_2', 'score': ...}]]
pipe("đồ ngu")
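Because the pipeline above returns a score for every class, the caller usually still has to pick the most likely one. The sketch below shows one way to do that on the nested list structure. The helper name `top_prediction` is hypothetical, and the scores in `example_output` are placeholders made up for illustration, not real model output.

```python
from typing import Dict, List, Union

Score = Dict[str, Union[str, float]]


def top_prediction(scores: List[Score]) -> Score:
    """Return the entry with the highest score from one input's class scores."""
    return max(scores, key=lambda item: item["score"])


# Placeholder values shaped like the pipeline result for a single input.
# These numbers are invented for illustration only.
example_output = [[
    {"label": "LABEL_0", "score": 0.1},
    {"label": "LABEL_1", "score": 0.2},
    {"label": "LABEL_2", "score": 0.7},
]]

# The outer list holds one entry per input text, so index 0 is the first text.
best = top_prediction(example_output[0])
print(best["label"], best["score"])
```

With real output, `example_output` would be replaced by the value returned from `pipe(...)`.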