metadata

license: apache-2.0
datasets:
  - ucberkeley-dlab/measuring-hate-speech
language:
  - en
metrics:
  - f1
  - accuracy
pipeline_tag: text-classification
widget:
  - text: Yellow peril.
    example_title: Race Hate Example
  - text: Nietzsche said 'God is dead'.
    example_title: Religion Hate Example
  - text: Go back to where you came from.
    example_title: Origin Hate Example
  - text: You're being emotional.
    example_title: Gender Hate Example
  - text: I identify as a sandwich.
    example_title: Sexuality Hate Example
  - text: Old fart.
    example_title: Age Hate Example
  - text: Confined to a wheelchair.
    example_title: Disability Hate Example

First posted on Kaggle.

I recently came across a comprehensive dataset on measuring hate speech from my alma mater, UC Berkeley. It aggregates social media comments from YouTube, Reddit, and Twitter.

What makes the dataset interesting is that it includes each annotator's profile, with attributes such as ideology, income, and race. Unfortunately, this is uncommon in social media datasets, which is why it intrigued me. I also find the paper's idea of data perspectivism interesting: it argues that disagreement among annotators about the attributes of hate speech is informative and should be kept rather than thrown away.

Although this model does not use the annotator information, I encourage you to explore the dataset and perhaps use that information to build variations of this classifier.

Here I built a multi-label hate speech classifier that independently predicts seven targets (race, religion, origin, gender, sexuality, age, and disability) by applying transfer learning to BERT with the UC Berkeley D-Lab's Hate Speech Dataset from the paper The Measuring Hate Speech Corpus: Leveraging Rasch Measurement Theory for Data Perspectivism.
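
Since the targets are independent, a single comment can be flagged for several of them at once. This usually means each target gets its own sigmoid score instead of sharing one softmax. Below is a minimal sketch of that decoding step. It assumes the model emits one raw logit per target in the order listed above and that a 0.5 cutoff is used. The helper `decode_multilabel`, the label order, the threshold, and the example logits are all illustrative assumptions, not details taken from the trained model.

```python
import math

# Target names come from the text above; their order in the model's
# output is an assumption made for this sketch.
TARGETS = ["race", "religion", "origin", "gender", "sexuality", "age", "disability"]


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in [0, 1] without overflowing."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_multilabel(logits, threshold=0.5):
    """Score each target independently and return (scores, predicted_targets).

    Hypothetical helper: the 0.5 threshold is an assumed default, not a
    value taken from the model.
    """
    if len(logits) != len(TARGETS):
        raise ValueError(f"expected {len(TARGETS)} logits, got {len(logits)}")
    scores = {name: sigmoid(z) for name, z in zip(TARGETS, logits)}
    predicted = [name for name, p in scores.items() if p >= threshold]
    return scores, predicted


# Made-up logits for illustration only; these are not real model output.
example_logits = [2.0, -3.0, 1.0, -1.5, -4.0, -2.0, -0.5]
scores, predicted = decode_multilabel(example_logits)
print(predicted)  # ['race', 'origin']
```

Because every score is computed on its own, the scores do not need to sum to 1, and raising or lowering the threshold for one target does not affect the others.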