---
tags:
- propaganda
---

# Model Card for identrics/wasper_propaganda_classifier_bg

This model was created by [`Identrics`](https://identrics.ai/), in the scope of the WASPer project.

The propaganda techniques we want to identify are classified in 5 categories:

1. **Self-Identification Techniques**: These techniques exploit the audience's feelings of association (or desire to be associated) with a larger group. They suggest that the audience should feel united, motivated, or threatened by the same factors that unite, motivate, or threaten that group.
2. **Defamation Techniques**: These techniques represent direct or indirect attacks against an entity's reputation and worth.
3. **Legitimisation Techniques**: These techniques attempt to prove and legitimise the propagandist's statements by using arguments that cannot be falsified because they are based on moral values or personal experiences.
4. **Logical Fallacies**: These techniques appeal to the audience's reason and masquerade as objective and factual arguments, but in reality, they exploit distractions and flawed logic.
5. **Rhetorical Devices**: These techniques seek to influence the audience and control the conversation by using linguistic methods.

## Uses

To be used as a multilabel classifier that identifies whether a given Bulgarian text contains one or more of the five propaganda techniques listed above.

### Example

First install direct dependencies:
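
For a standard PyTorch-backed setup, something like the following should suffice (the exact package set is an assumption; adjust to your environment):

```sh
pip install transformers torch
```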

Then the model can be downloaded and used for inference:

```py
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("identrics/wasper_propaganda_classifier_bg", num_labels=5)
tokenizer = AutoTokenizer.from_pretrained("identrics/wasper_propaganda_classifier_bg")

# "Gas is cheap, American nuclear fuel is cheap, plenty of photovoltaics, and yet electricity is up 30%. Why?"
tokens = tokenizer("Газа евтин, американското ядрено гориво евтино, пълно с фотоволтаици а пък тока с 30% нагоре. Защо ?", return_tensors="pt")
output = model(**tokens)
print(output.logits)
```
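
In a multilabel setup the logits are independent per-class scores, so they are read with a sigmoid rather than a softmax. A minimal sketch of decoding them follows (the label order is an assumption; check the model's `config.id2label` for the real mapping):

```py
import torch

# Assumed label order -- verify against model.config.id2label.
LABELS = [
    "Self-Identification Techniques",
    "Defamation Techniques",
    "Legitimisation Techniques",
    "Logical Fallacies",
    "Rhetorical Devices",
]

probs = torch.sigmoid(output.logits)[0]  # independent per-class probabilities
detected = [label for label, p in zip(LABELS, probs) if p > 0.5]
print(detected)
```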

## Training Details
During the training stage, the objective was to develop a multilabel classifier that identifies different types of propaganda, using a dataset containing both real and artificially generated samples.

The data was carefully annotated by domain experts according to a predefined taxonomy covering the five primary categories. Some examples are assigned to a single category, while others are classified into multiple categories, reflecting the nuanced nature of propaganda, where several techniques can appear within a single text.
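
For intuition, multilabel annotations of this kind are conventionally encoded as multi-hot target vectors and trained with a per-class binary loss. The snippet below is a generic sketch of that setup, not the project's actual training code:

```py
import torch
import torch.nn as nn

# Hypothetical annotated sample tagged with two of the five categories.
annotation = {"Defamation Techniques", "Rhetorical Devices"}
LABELS = [
    "Self-Identification Techniques",
    "Defamation Techniques",
    "Legitimisation Techniques",
    "Logical Fallacies",
    "Rhetorical Devices",
]

# Multi-hot target: an independent 0/1 slot per category.
target = torch.tensor([[1.0 if label in annotation else 0.0 for label in LABELS]])

# Per-class binary cross-entropy over raw logits, the standard multilabel objective.
loss_fn = nn.BCEWithLogitsLoss()
logits = torch.randn(1, len(LABELS))  # stand-in for model output
loss = loss_fn(logits, target)
```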

The model reached a weighted F1 score of **0.538** during training.
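
Weighted F1 averages the per-label F1 scores, weighting each label by its support. For reference, a toy computation with scikit-learn (illustrative data, not the actual evaluation set):

```py
from sklearn.metrics import f1_score

# Toy multilabel ground truth and predictions, one column per category.
y_true = [[1, 0, 0, 1, 0], [0, 1, 0, 0, 0], [1, 1, 0, 0, 1]]
y_pred = [[1, 0, 0, 0, 0], [0, 1, 0, 0, 0], [1, 0, 0, 0, 1]]

# "weighted": per-label F1 averaged by label support.
print(f1_score(y_true, y_pred, average="weighted"))
```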

## Compute Infrastructure

This model was fine-tuned on **2x NVIDIA Tesla V100 (32 GB)** GPUs.

## Citation

*This section is to be updated soon.*

If you find our work useful, please consider citing WASPer: