Report for soleimanian/financial-roberta-large-sentiment on financial_phrasebank (sentences_allagree, train set)

#2

by giskard-bot - opened Dec 1, 2023


Performance issues (1)

Vulnerability	Level	Data slice	Metric	Transformation	Deviation	Description
Performance	medium	`avg_word_length(text)` >= 3.699 AND `avg_word_length(text)` < 3.860	Balanced Accuracy = 0.892	—	5.29% lower than global	For records in the dataset where `avg_word_length(text)` is at least 3.699 and below 3.860, the Balanced Accuracy (0.892) is 5.29% lower than the global Balanced Accuracy.
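
To make the slice comparison above concrete, here is a minimal sketch of how a per-slice Balanced Accuracy and its deviation from the global value could be computed. It is not the scanner's own code. The helper names (`avg_word_length`, `balanced_accuracy`, `slice_deviation`) and the toy data are hypothetical, words are assumed to be whitespace-separated, and the deviation is assumed to be relative to the global score; the report does not state any of these details.

```python
from collections import defaultdict


def avg_word_length(text):
    """Mean number of characters per whitespace-separated word (assumed definition)."""
    words = text.split()
    return sum(len(w) for w in words) / len(words) if words else 0.0


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


def slice_deviation(texts, y_true, y_pred, low, high):
    """Return (slice score, global score, relative deviation) for low <= avg_word_length < high."""
    idx = [i for i, t in enumerate(texts) if low <= avg_word_length(t) < high]
    if not idx:
        raise ValueError("the slice is empty")
    global_score = balanced_accuracy(y_true, y_pred)
    slice_score = balanced_accuracy([y_true[i] for i in idx], [y_pred[i] for i in idx])
    # Assumption: deviation is expressed relative to the global score.
    return slice_score, global_score, (slice_score - global_score) / global_score


# Toy data, invented for illustration only.
texts = ["a bb ccc", "hello world", "hi yo", "longer wording here"]
y_true = ["pos", "neg", "neg", "pos"]
y_pred = ["pos", "neg", "pos", "pos"]

slice_score, global_score, deviation = slice_deviation(texts, y_true, y_pred, 1.5, 3.0)
print(f"slice={slice_score:.3f} global={global_score:.3f} deviation={deviation:+.2%}")
```

Under the same relative-deviation assumption, the reported slice score of 0.892 at 5.29% below global would correspond to a global Balanced Accuracy of roughly 0.892 / (1 - 0.0529) ≈ 0.942.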

Robustness issues (2)

Vulnerability	Level	Data slice	Metric	Transformation	Deviation	Description
Robustness	medium	—	Fail rate = 0.075	Transform to uppercase	75/1000 tested samples (7.5%) changed prediction after perturbation	When feature “text” is perturbed with the transformation “Transform to uppercase”, the model changes its prediction in 7.5% of the cases. We expected the predictions not to be affected by this transformation.
Robustness	medium	—	Fail rate = 0.071	Add typos	71/1000 tested samples (7.1%) changed prediction after perturbation	When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 7.1% of the cases. We expected the predictions not to be affected by this transformation.
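
The fail rates above are the share of samples whose predicted label changes after a perturbation. The sketch below illustrates that calculation under stated assumptions: `toy_predict` is a hypothetical case-sensitive keyword classifier standing in for the real model, and `add_typo` swaps two adjacent characters, which is only a stand-in for whatever the scanner's "Add typos" transformation actually does.

```python
import random


def to_uppercase(text):
    """Perturbation corresponding to "Transform to uppercase"."""
    return text.upper()


def add_typo(text, rng):
    """Swap two adjacent characters at a random position (hypothetical typo model)."""
    if len(text) < 2:
        return text
    i = rng.randrange(len(text) - 1)
    return text[:i] + text[i + 1] + text[i] + text[i + 2:]


def fail_rate(predict, texts, transform):
    """Fraction of texts whose prediction differs before and after the transform."""
    changed = sum(predict(t) != predict(transform(t)) for t in texts)
    return changed / len(texts)


def toy_predict(text):
    """Hypothetical case-sensitive keyword classifier, not the model in the report."""
    if "profit" in text:
        return "positive"
    if "loss" in text:
        return "negative"
    return "neutral"


# Toy samples, invented for illustration only.
samples = ["profit rose", "loss widened", "sales flat", "Net PROFIT up"]

upper_rate = fail_rate(toy_predict, samples, to_uppercase)
rng = random.Random(0)  # fixed seed so the typo perturbation is repeatable
typo_rate = fail_rate(toy_predict, samples, lambda t: add_typo(t, rng))
print(f"uppercase fail rate: {upper_rate:.3f}")
print(f"typo fail rate: {typo_rate:.3f}")
```

With the figures from the table, 75 changed predictions out of 1000 tested samples gives 75 / 1000 = 0.075, and 71 out of 1000 gives 0.071.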
