---
license: cc-by-3.0
---
The state-of-the-art model for dissonance detection from the paper [Transfer and Active Learning for Dissonance Detection: Addressing the Rare-Class Challenge](https://arxiv.org/abs/2305.02459).
It is RoBERTa-base fine-tuned on the [Dissonance Twitter Dataset](https://github.com/humanlab/dissonance-twitter-dataset), which was built by annotating tweets for within-person dissonance.
## Dataset Annotation details
Tweets were parsed into discourse units, and each unit was marked as Belief (Thought or Action) or Other. Pairs of beliefs within the same tweet were then passed to annotators for Dissonance annotation.
![annotation process](./annotation_process.jpg)
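The pairing step described above can be illustrated with a minimal sketch. It assumes each tweet has already been split into labelled discourse units and that every unordered pair of Belief units is a candidate for annotation. The source does not say whether all pairs or only adjacent ones were used, so that choice is an assumption. The `DiscourseUnit` class, the `belief_pairs` helper, the label strings, and the example tweet are all hypothetical and are not taken from the dataset or the authors' code.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class DiscourseUnit:
    text: str
    label: str  # assumed label strings: "Belief" or "Other"


def belief_pairs(units):
    """Return every unordered pair of Belief units from a single tweet.

    Assumption: all pairs are candidates, not only adjacent units.
    """
    beliefs = [u for u in units if u.label == "Belief"]
    return list(combinations(beliefs, 2))


# Invented example tweet, already split into discourse units.
tweet_units = [
    DiscourseUnit("I know I should sleep early", "Belief"),
    DiscourseUnit("lol", "Other"),
    DiscourseUnit("but I keep scrolling until 3am", "Belief"),
]

pairs = belief_pairs(tweet_units)
for first, second in pairs:
    # Each pair would then be shown to an annotator for a
    # Dissonance / Consonance / Neither judgment.
    print(f"{first.text!r} <-> {second.text!r}")
```

Units marked Other are dropped before pairing, so a tweet with fewer than two Belief units yields no pairs for annotation.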
The annotations were conducted on a sheet in the following **dissonance-first** format.
![annotation format](./annotation_format.png)
The annotators used the following flowchart as a more detailed guide for choosing among the Dissonance, Consonance, and Neither/Other classes:
![annotation guidelines](./annotation_guidelines.jpg)
## Citation
If you use this model or the dataset, please cite the associated paper:
```
@inproceedings{varadarajan2023transfer,
title={Transfer and Active Learning for Dissonance Detection: Addressing the Rare-Class Challenge},
author={Varadarajan, Vasudha and Juhng, Swanie and Mahwish, Syeda and Liu, Xiaoran and Luby, Jonah and Luhmann, Christian and Schwartz, H Andrew},
booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Long Papers)",
month = jul,
year = "2023",
address = "Toronto, Canada",
publisher = "Association for Computational Linguistics",
abstract = "While transformer-based systems have enabled greater accuracies with fewer training examples, data acquisition obstacles still persist for rare-class tasks -- when the class label is very infrequent (e.g. < 5% of samples). Active learning has in general been proposed to alleviate such challenges, but choice of selection strategy, the criteria by which rare-class examples are chosen, has not been systematically evaluated. Further, transformers enable iterative transfer-learning approaches. We propose and investigate transfer- and active learning solutions to the rare class problem of dissonance detection through utilizing models trained on closely related tasks and the evaluation of acquisition strategies, including a proposed probability-of-rare-class (PRC) approach. We perform these experiments for a specific rare class problem: collecting language samples of cognitive dissonance from social media. We find that PRC is a simple and effective strategy to guide annotations and ultimately improve model accuracy while transfer-learning in a specific order can improve the cold-start performance of the learner but does not benefit iterations of active learning.",
}
```