---
base_model: INSAIT-Institute/BgGPT-7B-Instruct-v0.2
library_name: peft
license: apache-2.0
language:
- bg
tags:
- propaganda
---

# Model Card for identrics/BG_propaganda_detector




## Model Details

- **Developed by:** [`Identrics`](https://identrics.ai/)
- **Language:** Bulgarian
- **License:** apache-2.0
- **Finetuned from model:** [`INSAIT-Institute/BgGPT-7B-Instruct-v0.2`](https://huggingface.co/INSAIT-Institute/BgGPT-7B-Instruct-v0.2)
- **Context window:** 8192 tokens

## Model Description

This model is a fine-tuned version of INSAIT-Institute/BgGPT-7B-Instruct-v0.2 for propaganda detection. It is a binary classifier that determines whether propaganda is present in the input text.
It was created by [`Identrics`](https://identrics.ai/) as part of the Wasper project.


## Uses

The model is intended for use as a binary classifier that identifies whether propaganda is present in a string containing a comment from a social media site.

### Example

First install direct dependencies:
```
pip install transformers torch accelerate peft
```

Then the model can be downloaded and used for inference:
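```py
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the classifier and its tokenizer from the Hugging Face Hub.
model_id = "identrics/BG_propaganda_detector"
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Tokenize one comment and run it through the model without tracking gradients.
tokens = tokenizer("Our country is the most powerful country in the world!", return_tensors="pt")
with torch.no_grad():
    output = model(**tokens)

# Raw (unnormalized) scores for the two classes, shape (1, 2).
print(output.logits)
```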
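
The example prints raw logits, which are not probabilities. As a minimal sketch, they can be turned into a label and a confidence value with a softmax. The helper `logits_to_prediction` and the label order in `ASSUMED_LABELS` are assumptions made for illustration; this card does not document which index means "propaganda", so check `model.config.id2label` before relying on it.

```py
import math

# Assumption: index 0 = "no propaganda", index 1 = "propaganda".
# Verify against model.config.id2label for the actual checkpoint.
ASSUMED_LABELS = ["no propaganda", "propaganda"]


def logits_to_prediction(logits):
    """Map a list of two raw logits to (label, probability) via a stable softmax."""
    peak = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return ASSUMED_LABELS[best], probs[best]


# Made-up logits for illustration, not output from this model.
label, prob = logits_to_prediction([-1.2, 2.3])
print(label, round(prob, 3))
```

With the inference example above, the input would be `output.logits[0].detach().tolist()`.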


## Training Details



Trained on a corpus of 200 human-generated comments, augmented with 200 more synthetic comments...

Achieved an F1 score of x%
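
For reference, the F1 score is the harmonic mean of precision and recall. The sketch below computes it for the positive class of a binary classifier. It assumes label `1` means "propaganda" and uses made-up labels; the card does not state which F1 variant (binary, macro, or other) was used for the reported figure, and the helper `f1_score` here is illustrative only.

```py
def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for the `positive` class, computed from parallel label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels for illustration, not results from this model.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
print(f1_score(y_true, y_pred))
```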





### Framework versions

- PEFT 0.11.1