TLeonidas's picture
Update README.md
2eeebcb verified
|
raw
history blame
1.41 kB
---
title: Hate Speech Text Classifier
emoji: πŸ‘
colorFrom: green
colorTo: red
sdk: gradio
sdk_version: 4.21.0
app_file: app.py
pinned: false
---
# Monitoring Harmful Text in Online Platforms
## Overview
This repository hosts the RandomForest classifier model designed for detecting harmful text AGAINST GROUPS.
The model classifies text into one of three categories: "Offensive or Hateful", "Neutral or Ambiguous", and "Not Hate".
Achieving an accuracy of 92.5%, this model was developed through the combination of three distinct datasets, ensuring robustness and reliability in varied contexts.
It was presented at the prestigious annual Gulf Coast Conference & Expo on AI.
Model Details
Model Type: RandomForest Classifier
Accuracy: 92.5%
Labels:
0: Neutral or Ambiguous
1: Not Hate
2: Offensive or Hateful
Training Data: Augmented version of [this dataset](https://huggingface.co/datasets/TLeonidas/twitter-hate-speech-en-240ksamples) (279k+ rows)
## Dataset
The model was trained on a concatenated dataset from three distinct sources, curated to encompass a wide range of linguistic expressions and contexts.
The datasets were preprocessed and balanced to ensure fair representation across all categories.
## Performance
The model achieves an accuracy of 92.5%, evaluated on a held-out test set.
## Contributing
We welcome contributions to improve the model and extend its applicability.