---
language:
- en
license: mit
datasets:
- cardiffnlp/x_sensitive
metrics:
- f1
widget:
- text: Call me today to earn some money mofos!
pipeline_tag: text-classification
---

# twitter-roberta-base-sensitive-multilabel

This is a RoBERTa-base model pre-trained on 154M tweets posted up to the end of December 2022 and fine-tuned to detect sensitive content (multilabel classification) on the [_X-Sensitive_](https://huggingface.co/datasets/cardiffnlp/x_sensitive) dataset.
The original Twitter-based RoBERTa model can be found [here](https://huggingface.co/cardiffnlp/twitter-roberta-base-2022-154m).


## Labels
```
"id2label": {
  "0": "conflictual",
  "1": "profanity",
  "2": "sex",
  "3": "drugs",
  "4": "selfharm",
  "5": "spam",
  "6": "not-sensitive"
}
```

## Full classification example

```python
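from transformers import pipeline

# return_all_scores=True yields a score for every label; without it the
# pipeline returns only the top-scoring label. Recent transformers releases
# deprecate this argument in favour of top_k=None.
pipe = pipeline(model='cardiffnlp/twitter-roberta-base-sensitive-multilabel', return_all_scores=True)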
text = "Call me today to earn some money mofos!"

pipe(text)
```
Output: 

```
[[{'label': 'conflictual', 'score': 0.07463070750236511},
  {'label': 'profanity', 'score': 0.9888035655021667},
  {'label': 'sex', 'score': 0.0032050721347332},
  {'label': 'drugs', 'score': 0.004522938746958971},
  {'label': 'selfharm', 'score': 0.0036733713932335377},
  {'label': 'spam', 'score': 0.007278479170054197},
  {'label': 'not-sensitive', 'score': 0.00972921121865511}]]
```
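
Because this is a multilabel model, each label gets its own independent score, and the scores do not sum to 1. A minimal sketch of turning the scores above into a set of predicted labels follows. The `select_labels` helper and the `0.5` threshold are assumptions for illustration only, not part of the model or of `transformers`; a different cut-off may suit your use case better.

```python
# Scores copied from the example output above (one inner list per input text).
output = [[
    {'label': 'conflictual', 'score': 0.07463070750236511},
    {'label': 'profanity', 'score': 0.9888035655021667},
    {'label': 'sex', 'score': 0.0032050721347332},
    {'label': 'drugs', 'score': 0.004522938746958971},
    {'label': 'selfharm', 'score': 0.0036733713932335377},
    {'label': 'spam', 'score': 0.007278479170054197},
    {'label': 'not-sensitive', 'score': 0.00972921121865511},
]]


def select_labels(scores, threshold=0.5):
    """Hypothetical helper: keep every label whose score reaches `threshold`.

    The 0.5 default is an assumed cut-off, not a value documented for this model.
    """
    return [item['label'] for item in scores if item['score'] >= threshold]


# One list of predicted labels per input text.
predicted = [select_labels(scores) for scores in output]
print(predicted)
```

For the example tweet, only `profanity` clears the assumed threshold.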


## BibTeX entry and citation info

```
@article{antypas2024sensitive,
  title={Sensitive Content Classification in Social Media: A Holistic Resource and Evaluation},
  author={Antypas, Dimosthenis and Sen, Indira and Perez-Almendros, Carla and Camacho-Collados, Jose and Barbieri, Francesco},
  journal={arXiv preprint arXiv:2411.19832},
  year={2024}
}
```