Report for soleimanian/financial-roberta-large-sentiment on financial_phrasebank (sentences_allagree, train set)
#2, opened by giskard-bot
Performance issues (1)
Vulnerability | Level | Data slice | Metric | Transformation | Deviation | Description |
---|---|---|---|---|---|---|
Performance | medium | avg_word_length(text) < 3.860 AND avg_word_length(text) >= 3.699 | Balanced Accuracy = 0.892 | — | -5.29% than global | For records in the dataset where avg_word_length(text) < 3.860 AND avg_word_length(text) >= 3.699, the Balanced Accuracy is 5.29% lower than the global Balanced Accuracy. |
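
The slice and metric in the row above can be reproduced by hand along the following lines. This is a minimal sketch, not the scanner's own code: it assumes `avg_word_length` is the mean character count of whitespace-separated tokens, and that balanced accuracy is the unweighted mean of per-class recall. The helper names `avg_word_length`, `in_slice`, and `balanced_accuracy` are illustrative.

```python
from collections import defaultdict


def avg_word_length(text: str) -> float:
    """Mean length of whitespace-separated tokens (assumed definition)."""
    words = text.split()
    return sum(len(w) for w in words) / len(words) if words else 0.0


def in_slice(text: str, low: float = 3.699, high: float = 3.860) -> bool:
    """True when the text falls in the slice reported in the table."""
    return low <= avg_word_length(text) < high


def balanced_accuracy(y_true, y_pred) -> float:
    """Unweighted mean of per-class recall."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)
```

To compare against the reported figure, one would filter the dataset with `in_slice`, run the model on the remaining records, and pass the labels and predictions to `balanced_accuracy`.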
Robustness issues (2)
Vulnerability | Level | Data slice | Metric | Transformation | Deviation | Description |
---|---|---|---|---|---|---|
Robustness | medium | — | Fail rate = 0.075 | Transform to uppercase | 75/1000 tested samples (7.5%) changed prediction after perturbation | When feature “text” is perturbed with the transformation “Transform to uppercase”, the model changes its prediction in 7.5% of the cases. We expected the predictions not to be affected by this transformation. |
Robustness | medium | — | Fail rate = 0.071 | Add typos | 71/1000 tested samples (7.1%) changed prediction after perturbation | When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 7.1% of the cases. We expected the predictions not to be affected by this transformation. |
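
The fail rate in these rows is the share of tested samples whose prediction changes after the perturbation (for example, 75/1000 = 0.075 for uppercase). The sketch below shows one way to measure it under stated assumptions: `predict` is any callable mapping a text to a label, samples left unchanged by the transformation are skipped, and `toy_predict` is a deliberately case-sensitive stand-in used only for illustration, not the model from this report.

```python
from typing import Callable, Sequence


def fail_rate(
    predict: Callable[[str], str],
    texts: Sequence[str],
    transform: Callable[[str], str],
) -> float:
    """Fraction of perturbed samples whose predicted label changes."""
    tested = 0
    changed = 0
    for text in texts:
        perturbed = transform(text)
        if perturbed == text:
            continue  # the transformation had no effect on this sample
        tested += 1
        if predict(text) != predict(perturbed):
            changed += 1
    return changed / tested if tested else 0.0


def toy_predict(text: str) -> str:
    """Hypothetical case-sensitive keyword classifier for demonstration."""
    if "Profit" in text:
        return "positive"
    if "LOSS" in text:
        return "negative"
    return "neutral"


samples = ["Profit rose", "LOSS widened", "sales flat"]
uppercase_fail_rate = fail_rate(toy_predict, samples, str.upper)
```

Swapping `str.upper` for a typo-injecting function gives the analogous measurement for the "Add typos" row.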