roberta-tagalog-base-philippine-elections-2016-2022-hate-speech

This model is a fine-tuned version of jcblaise/roberta-tagalog-base for text classification, labeling tweets as either hate or non-hate.

The model was fine-tuned on the combined dataset mapsoriano/2016_2022_hate_speech_filipino, which merges the hate_speech_filipino dataset with a newly crawled dataset of hate speech tweets related to the 2022 Philippine presidential elections.
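
A minimal usage sketch with the Transformers pipeline API is shown below; the placeholder tweet and the returned label names (which depend on the model's id2label config, e.g. LABEL_0 / LABEL_1) are illustrative assumptions, not outputs taken from the study.

```python
from transformers import pipeline

# Load the fine-tuned model from the Hugging Face Hub as a text-classification pipeline.
classifier = pipeline(
    "text-classification",
    model="mapsoriano/roberta-tagalog-base-philippine-elections-2016-2022-hate-speech",
)

# Classify a tweet (placeholder text); the label string depends on the model's config.
result = classifier("Halimbawang tweet tungkol sa eleksyon")
print(result)  # e.g. [{'label': 'LABEL_1', 'score': 0.97}]
```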

It achieves the following results on the evaluation (validation) set:

  • Loss: 0.3574
  • Accuracy: 0.8743

It achieves the following results on the test set:

  • Accuracy: 0.8783
  • Precision: 0.8563
  • Recall: 0.9077
  • F1: 0.8813
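
As a hedged illustration, scores like these can be computed from test-set predictions with scikit-learn; the binary label encoding (0 = non-hate, 1 = hate) and the specific functions below are assumptions for illustration, not the study's actual evaluation code.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_test_metrics(y_true, y_pred):
    # y_true / y_pred are sequences of 0 (non-hate) and 1 (hate) labels for the test set.
    accuracy = accuracy_score(y_true, y_pred)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="binary"
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```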

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 2
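
A minimal sketch of how these hyperparameters map onto the Hugging Face Trainer API is shown below. The split names ("train" / "validation"), the "text" column name, and the output directory are assumptions for illustration, not taken from the actual training script.

```python
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

# Combined 2016 + 2022 hate speech dataset and the base Tagalog RoBERTa checkpoint.
dataset = load_dataset("mapsoriano/2016_2022_hate_speech_filipino")
tokenizer = AutoTokenizer.from_pretrained("jcblaise/roberta-tagalog-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "jcblaise/roberta-tagalog-base", num_labels=2
)

def tokenize(batch):
    # Assumes the tweet text lives in a "text" column; truncate to the model's max length.
    return tokenizer(batch["text"], truncation=True)

tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="roberta-tagalog-base-hate-speech",  # assumed output directory name
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=2,
    lr_scheduler_type="linear",
    seed=42,
    evaluation_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    tokenizer=tokenizer,  # enables dynamic padding via the default data collator
)
trainer.train()
```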

Training results

Training Loss | Epoch | Step | Validation Loss | Accuracy
0.3423        | 1.0   | 1361 | 0.3167          | 0.8693
0.2194        | 2.0   | 2722 | 0.3574          | 0.8743

Framework versions

  • Transformers 4.33.2
  • Pytorch 2.0.1+cu118
  • Datasets 2.14.5
  • Tokenizers 0.13.3

Citation Information

Research Title: Application of BERT in Detecting Online Hate

Published: 2023

Authors:

  • Castro, D.
  • Dizon, L. J.
  • Sarip, A. J.
  • Soriano, M. A.

Feel free to connect via LinkedIn for further information on this model or on the study in which it was used.
