Model Card for identrics/wasper_propaganda_detection_bg
Model Description
- Developed by: Identrics
- Language: Bulgarian
- License: apache-2.0
- Finetuned from model: INSAIT-Institute/BgGPT-7B-Instruct-v0.2
- Context window: 8192 tokens
Model Description
This model is a fine-tuned version of BgGPT-7B-Instruct-v0.2 for propaganda detection. It is effectively a binary classifier that determines whether propaganda is present in the input string.
This model was created by Identrics within the scope of the WASPer project. The detailed taxonomy of the full pipeline can be found here.
Uses
The model is designed as a binary classifier that determines whether a traditional or social media comment contains propaganda.
Example
First install direct dependencies:
pip install transformers torch accelerate
Then the model can be downloaded and used for inference:
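from transformers import pipeline

# Map the raw class ids returned by the pipeline to readable names.
labels_map = {"LABEL_0": "No Propaganda", "LABEL_1": "Propaganda"}

# Load the fine-tuned classifier and its tokenizer from the Hugging Face Hub.
pipe = pipeline(
    "text-classification",
    model="identrics/wasper_propaganda_detection_bg",
    tokenizer="identrics/wasper_propaganda_detection_bg",
)

# Roughly: "Gas is cheap, American nuclear fuel is cheap, photovoltaics
# everywhere, and yet electricity is up 30%. Why?"
text = "Газа евтин, американското ядрено гориво евтино, пълно с фотоволтаици а пък тока с 30% нагоре. Защо ?"

prediction = pipe(text)
print(labels_map[prediction[0]["label"]])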
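The pipeline returns a list with one dictionary per input, each holding a `label` and a `score`. The following is a minimal sketch of how such output might be post-processed for several comments at once. The helper name `readable_predictions` is hypothetical, and the sample scores are hand-written placeholders rather than real model output.

```python
from typing import Dict, List

LABELS_MAP = {"LABEL_0": "No Propaganda", "LABEL_1": "Propaganda"}


def readable_predictions(
    raw: List[Dict], labels_map: Dict[str, str] = LABELS_MAP
) -> List[Dict]:
    """Map raw pipeline labels to readable names, keeping the rounded score."""
    return [
        {"label": labels_map[item["label"]], "score": round(float(item["score"]), 3)}
        for item in raw
    ]


# Hand-written stand-in for the list a text-classification pipeline returns.
# The scores are made up for illustration only.
sample_output = [
    {"label": "LABEL_1", "score": 0.9731},
    {"label": "LABEL_0", "score": 0.8812},
]

for row in readable_predictions(sample_output):
    print(f'{row["label"]}: {row["score"]}')
```

With the pipeline from the example loaded, `readable_predictions(pipe(list_of_texts))` should apply the same mapping to a batch of comments.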
Training Details
The training dataset for the model consists of a balanced collection of Bulgarian examples, including both propaganda and non-propaganda content. These examples were sourced from a variety of traditional media and social media platforms and manually annotated by domain experts. Additionally, the dataset is enriched with AI-generated samples.
The model achieved an F1 score of 0.836 during evaluation.
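The card does not state the evaluation split or the averaging method behind this figure. For reference, the F1 score of the positive class is the harmonic mean of precision and recall, which can be sketched as follows. The function name `f1_from_counts` and the counts are illustrative assumptions, not the project's evaluation code or data.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 for the positive class, computed from confusion-matrix counts.

    Equivalent to 2 * precision * recall / (precision + recall).
    """
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        # No positive examples and no positive predictions: define F1 as 0.
        return 0.0
    return 2 * tp / denominator


# Illustrative counts only; these are not the model's actual results.
print(f1_from_counts(tp=8, fp=2, fn=2))
```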
Compute Infrastructure
This model was fine-tuned on two NVIDIA Tesla V100 32GB GPUs.
Citation [this section is to be updated soon]
If you find our work useful, please consider citing WASPer:
@article{...2024wasper,
title={WASPer: Propaganda Detection in Bulgarian and English},
author={....},
journal={arXiv preprint arXiv:...},
year={2024}
}