Report for cardiffnlp/twitter-roberta-base-sentiment-latest
#5 by giskard-bot
Ethical issues (1)
Vulnerability | Level | Data slice | Metric | Transformation | Deviation | Description |
---|---|---|---|---|---|---|
Ethical | medium | — | Fail rate = 0.065 | Switch Religion | 28/433 tested samples (6.47%) changed prediction after perturbation | When feature “text” is perturbed with the transformation “Switch Religion”, the model changes its prediction in 6.47% of the cases. We expected the predictions not to be affected by this transformation. |
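
To make the "Switch Religion" perturbation concrete, here is a minimal sketch of a term-swapping transformation. The swap table and the `switch_religion` function are hypothetical illustrations; the report does not list the word list or the implementation that the scan actually used.

```python
import re

# Hypothetical swap table, for illustration only. The word list used by
# the actual scan is not given in the report.
RELIGION_SWAPS = {
    "christian": "muslim",
    "muslim": "jewish",
    "jewish": "hindu",
    "hindu": "buddhist",
    "buddhist": "christian",
}

# Match any listed term as a whole word, ignoring case.
_PATTERN = re.compile(
    r"\b(" + "|".join(RELIGION_SWAPS) + r")\b", re.IGNORECASE
)


def switch_religion(text: str) -> str:
    """Replace each listed religious term with another one from the table."""

    def _swap(match: re.Match) -> str:
        word = match.group(0)
        replacement = RELIGION_SWAPS[word.lower()]
        # Preserve a leading capital so the sentence still reads naturally.
        return replacement.capitalize() if word[0].isupper() else replacement

    return _PATTERN.sub(_swap, text)


print(switch_religion("My Christian neighbor is kind"))
```

A sentiment model that is invariant to this transformation would assign the same label to the original and the rewritten sentence.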
Robustness issues (5)
Vulnerability | Level | Data slice | Metric | Transformation | Deviation | Description |
---|---|---|---|---|---|---|
Robustness | major | — | Fail rate = 0.213 | Transform to uppercase | 213/1000 tested samples (21.3%) changed prediction after perturbation | When feature “text” is perturbed with the transformation “Transform to uppercase”, the model changes its prediction in 21.3% of the cases. We expected the predictions not to be affected by this transformation. |
Robustness | major | — | Fail rate = 0.132 | Add typos | 132/1000 tested samples (13.2%) changed prediction after perturbation | When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 13.2% of the cases. We expected the predictions not to be affected by this transformation. |
Robustness | major | — | Fail rate = 0.122 | Transform to title case | 122/1000 tested samples (12.2%) changed prediction after perturbation | When feature “text” is perturbed with the transformation “Transform to title case”, the model changes its prediction in 12.2% of the cases. We expected the predictions not to be affected by this transformation. |
Robustness | medium | — | Fail rate = 0.095 | Punctuation Removal | 95/1000 tested samples (9.5%) changed prediction after perturbation | When feature “text” is perturbed with the transformation “Punctuation Removal”, the model changes its prediction in 9.5% of the cases. We expected the predictions not to be affected by this transformation. |
Robustness | medium | — | Fail rate = 0.073 | Transform to lowercase | 73/1000 tested samples (7.3%) changed prediction after perturbation | When feature “text” is perturbed with the transformation “Transform to lowercase”, the model changes its prediction in 7.3% of the cases. We expected the predictions not to be affected by this transformation. |
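
The fail rates in these tables are the share of tested samples whose predicted label changes after the perturbation, for example 213/1000 = 0.213 for the uppercase transformation. The sketch below shows one way such a metric could be computed. The `fail_rate` helper and the `toy_predict` classifier are hypothetical stand-ins, not the scan's implementation or the real model, and skipping samples that the transformation leaves unchanged is an assumption.

```python
from typing import Callable, Sequence


def fail_rate(
    predict: Callable[[str], str],
    transform: Callable[[str], str],
    texts: Sequence[str],
) -> float:
    """Fraction of perturbed samples whose predicted label changes."""
    tested = 0
    changed = 0
    for text in texts:
        perturbed = transform(text)
        if perturbed == text:
            # Assumption: samples the transformation does not alter are
            # excluded from the tested set.
            continue
        tested += 1
        if predict(perturbed) != predict(text):
            changed += 1
    return changed / tested if tested else 0.0


def toy_predict(text: str) -> str:
    # Hypothetical case-sensitive classifier used only to exercise the metric.
    return "positive" if "good" in text else "neutral"


texts = ["good movie", "bad movie", "so good", "meh"]
rate = fail_rate(toy_predict, str.upper, texts)
print(rate)  # 0.5: two of the four predictions flip after uppercasing
```

To run the same check against the actual model, `toy_predict` would be replaced by a function that returns the model's predicted label for a given text.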