---
license: cc-by-sa-4.0
base_model: jcblaise/roberta-tagalog-base
tags:
- generated_from_trainer
- tagalog
- filipino
- twitter
metrics:
- accuracy
- precision
- recall
- f1
model-index:
- name: roberta-tagalog-base-philippine-elections-2016-2022-hate-speech
results: []
datasets:
- hate_speech_filipino
- mapsoriano/2016_2022_hate_speech_filipino
language:
- tl
- en
---
# roberta-tagalog-base-philippine-elections-2016-2022-hate-speech
This model is a fine-tuned version of jcblaise/roberta-tagalog-base for text classification, classifying tweets as hate or non-hate speech.
It was fine-tuned on the combined dataset mapsoriano/2016_2022_hate_speech_filipino, which consists of the hate_speech_filipino dataset and a newly crawled dataset of hate speech tweets related to the 2022 Philippine presidential elections.
It achieves the following results on the evaluation (validation) set:
- Loss: 0.3574
- Accuracy: 0.8743
It achieves the following results on the test set:
- Accuracy: 0.8783
- Precision: 0.8563
- Recall: 0.9077
- F1: 0.8813
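
For quick experimentation, a minimal inference sketch using the Transformers `pipeline` API is shown below. The repository ID and the label names in the example output are assumptions (the card does not state them) and may differ from the hosted model.

```python
from transformers import pipeline

# Assumed repository ID for this fine-tuned model; adjust if it is hosted elsewhere.
MODEL_ID = "mapsoriano/roberta-tagalog-base-philippine-elections-2016-2022-hate-speech"

# Binary text classification: hate vs. non-hate tweets.
classifier = pipeline("text-classification", model=MODEL_ID)

print(classifier("Ang ganda ng kampanya nila!"))
# Example output (label names are placeholders): [{'label': 'LABEL_1', 'score': 0.98}]
```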
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a hedged Trainer sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2
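
A sketch of how these hyperparameters might map onto `TrainingArguments` and the `Trainer` API is shown below. The dataset column and split names (`text`, `label`, `train`, `validation`) and the output directory are assumptions, not details taken from the original training script.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Load the combined 2016/2022 hate speech dataset and the base checkpoint.
dataset = load_dataset("mapsoriano/2016_2022_hate_speech_filipino")
tokenizer = AutoTokenizer.from_pretrained("jcblaise/roberta-tagalog-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "jcblaise/roberta-tagalog-base", num_labels=2
)

def tokenize(batch):
    # The "text" column name is an assumption about the dataset schema.
    return tokenizer(batch["text"], truncation=True)

tokenized = dataset.map(tokenize, batched=True)

# Mirror the hyperparameters listed above.
args = TrainingArguments(
    output_dir="roberta-tagalog-hate-speech",  # assumed output directory
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=2,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    evaluation_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    tokenizer=tokenizer,  # enables dynamic padding via the default data collator
)
trainer.train()
```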
### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 0.3423        | 1.0   | 1361 | 0.3167          | 0.8693   |
| 0.2194        | 2.0   | 2722 | 0.3574          | 0.8743   |
### Framework versions
- Transformers 4.33.2
- Pytorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.13.3
## Citation Information
Research Title: Application of BERT in Detecting Online Hate
Published: 2023
Authors:
- Castro, D.
- Dizon, L. J.
- Sarip, A. J.
- Soriano, M. A.
Feel free to connect via LinkedIn for further information on this model or on the study in which it was used.