Model description
This model is a fine-tuned version of the bert-base-uncased model to classify toxic comments.
How to use
You can use the model with the following code.
from transformers import BertForSequenceClassification, BertTokenizer, TextClassificationPipeline
model_path = "JungleLee/bert-toxic-comment-classification"
tokenizer = BertTokenizer.from_pretrained(model_path)
model = BertForSequenceClassification.from_pretrained(model_path, num_labels=2)
pipeline = TextClassificationPipeline(model=model, tokenizer=tokenizer)
print(pipeline("You're a fucking nerd."))
Training data
The training data comes from this Kaggle competition. We use 90% of the train.csv
data to train the model.
Evaluation results
The model achieves 0.95 AUC in a 1500 rows held-out test set.
- Downloads last month
- 28,430
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.