distilbert-phmtweets-sutd

This model is a fine-tuned version of distilbert-base-uncased for text classification, trained to identify public health events in tweets. The project is based on an Emory University study, Detection of Personal Health Mentions in Social Media, which worked with this custom dataset.

It achieves the following result on the test set:

Accuracy: 0.877

Usage

# Load the fine-tuned tokenizer and classification model from the Hugging Face Hub
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("dibsondivya/distilbert-phmtweets-sutd")
model = AutoModelForSequenceClassification.from_pretrained("dibsondivya/distilbert-phmtweets-sutd")
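
The snippet above only loads the tokenizer and model. As a minimal sketch of running a prediction with them, the helper below tokenizes a batch of tweets, applies a softmax to the logits, and returns the most likely label for each text. The function names `load_model` and `classify` are illustrative and not part of this model card, and PyTorch is assumed as the backend. The label names are read from the checkpoint's own `config.id2label`, since this card does not document them.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_ID = "dibsondivya/distilbert-phmtweets-sutd"


def load_model(model_id=MODEL_ID):
    """Download (or load from the local cache) the tokenizer and classifier."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()  # switch off dropout for inference
    return tokenizer, model


def classify(texts, tokenizer, model):
    """Return one (label, probability) pair per input text."""
    # Pad to the longest text in the batch and truncate over-long inputs
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():  # no gradients needed for prediction
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1)
    top_probs, top_ids = probs.max(dim=-1)
    # Label names come from the checkpoint config; they are not listed in this card
    id2label = model.config.id2label
    return [(id2label[i], p) for i, p in zip(top_ids.tolist(), top_probs.tolist())]
```

Calling `classify(["some tweet text"], *load_model())` would return a list with one `(label, probability)` tuple. If the checkpoint does not define custom label names, Transformers falls back to generic names such as `LABEL_0`, so the mapping from label to meaning should be checked against the dataset.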

Model Evaluation Results

With Validation Set

Accuracy: 0.8708661417322835

With Test Set

Accuracy: 0.8772961058045555
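
The card does not say how these figures were computed. Assuming the standard definition of accuracy (the fraction of examples whose predicted label matches the reference label), the calculation can be sketched as follows. The `accuracy` helper and the toy labels are hypothetical and are not taken from the actual dataset.

```python
def accuracy(predictions, references):
    """Fraction of predictions that equal the corresponding reference label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Hypothetical toy labels, for illustration only
preds = [1, 0, 1, 1]
gold = [1, 0, 0, 1]
print(accuracy(preds, gold))  # 3 of 4 correct -> 0.75
```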

Reference for distilbert-base-uncased Model

@article{Sanh2019DistilBERTAD,
  title={DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},
  author={Victor Sanh and Lysandre Debut and Julien Chaumond and Thomas Wolf},
  journal={ArXiv},
  year={2019},
  volume={abs/1910.01108}
}

Evaluation results

Accuracy on custom-phm-tweets (self-reported): 0.877