Report for TheBritishLibrary/bl-books-genre

#4
by giskard-bot - opened

Hi Team,

This is a report from Giskard Bot Scan 🐢.

We have identified 5 potential vulnerabilities in your model based on an automated scan.

This automated analysis evaluated the model on the dataset TheBritishLibrary/blbooksgenre (subset title_genre_classifiction, split train).

You can find the full scan report here.

👉Performance issues (3)

For records in the dataset where text contains "van", the Balanced Accuracy is 20.89% lower than the global Balanced Accuracy.

| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| major 🔴 | text contains "van" | Balanced Accuracy = 0.735 | -20.89% than global |

Taxonomy

avid-effect:performance:P0204
🔍✨Examples
| Index | text | label | Predicted label |
| --- | --- | --- | --- |
| 807 | Mohammed Christus. Vier Schetsen van eene reis in het Oosten, etc | Non-fiction | Fiction (p = 0.74) |
| 835 | Leeuwarden na en voor hare wording als stad, en in hare betrekking tot de Leppa. Eene bydrage tot de Geschiedkundige beschrijving, van W. Eekhoff | Non-fiction | Fiction (p = 0.54) |
| 855 | Burgerlijk en kerkelijk gedenkboek van Haamstede, deels ook van Burgh, etc | Fiction | Non-fiction (p = 0.97) |
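
To make the metric concrete: balanced accuracy is the unweighted mean of per-class recall, and the reported deviation appears to be measured relative to the global value. The following is a minimal sketch in plain Python. The helper names, the toy records, and the case-insensitive substring match used for the "van" slice are assumptions for illustration, not Giskard's implementation.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def relative_deviation(slice_score, global_score):
    """Signed deviation of the slice score as a fraction of the global score."""
    return (slice_score - global_score) / global_score


# Toy records (text, label, predicted label); invented for illustration only.
records = [
    ("reis van A naar B", "Non-fiction", "Fiction"),
    ("verhaal van een dorp", "Fiction", "Fiction"),
    ("a tale of two rivers", "Fiction", "Fiction"),
    ("a history of bridges", "Non-fiction", "Non-fiction"),
]

# Assumed slicing rule: plain substring match (this would also match "advance").
van_slice = [r for r in records if "van" in r[0].lower()]

global_ba = balanced_accuracy([r[1] for r in records], [r[2] for r in records])
slice_ba = balanced_accuracy([r[1] for r in van_slice], [r[2] for r in van_slice])
deviation = relative_deviation(slice_ba, global_ba)

print(f"global={global_ba:.3f} slice={slice_ba:.3f} deviation={deviation:+.2%}")
```

On real data, the same comparison would be run over the model's predictions for the full split and for the records matching the slice.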

For records in the dataset where text_length(text) >= 60.500 AND text_length(text) < 69.500, the Balanced Accuracy is 7.59% lower than the global Balanced Accuracy.

| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| medium 🟡 | text_length(text) >= 60.500 AND text_length(text) < 69.500 | Balanced Accuracy = 0.858 | -7.59% than global |

Taxonomy

avid-effect:performance:P0204
🔍✨Examples
| Index | text | text_length(text) | label | Predicted label |
| --- | --- | --- | --- | --- |
| 252 | Bush-Life in Queensland; or, John West's colonial experiences | 61 | Fiction | Non-fiction (p = 1.00) |
| 736 | Die Wanderungen der Kelten. Historisch-kritisch dargelegt, etc | 62 | Fiction | Non-fiction (p = 0.99) |
| 807 | Mohammed Christus. Vier Schetsen van eene reis in het Oosten, etc | 65 | Non-fiction | Fiction (p = 0.74) |
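
The text_length(text) values in the examples are consistent with a plain character count of the title. A small sketch of the slice condition under that assumption follows; the function name `in_length_slice` is hypothetical.

```python
def in_length_slice(text, low, high):
    """True when low <= len(text) < high (half-open bounds, as in the slice)."""
    return low <= len(text) < high


# Two titles taken from the examples above.
title_252 = "Bush-Life in Queensland; or, John West's colonial experiences"
title_736 = "Die Wanderungen der Kelten. Historisch-kritisch dargelegt, etc"

for title in (title_252, title_736):
    print(len(title), in_length_slice(title, 60.5, 69.5))
```

The fractional bounds (60.500, 69.500) simply sit between integer lengths, so this slice covers titles of 61 to 69 characters.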

For records in the dataset where text_length(text) >= 101.500 AND text_length(text) < 147.500, the Balanced Accuracy is 5.69% lower than the global Balanced Accuracy.

| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| medium 🟡 | text_length(text) >= 101.500 AND text_length(text) < 147.500 | Balanced Accuracy = 0.876 | -5.69% than global |

Taxonomy

avid-effect:performance:P0204
🔍✨Examples
| Index | text | text_length(text) | label | Predicted label |
| --- | --- | --- | --- | --- |
| 805 | Verdens Storbyer [Translated from 'Les Capitales du monde.'] Paa Dansk af P. Nansen. Med 322 Afbildninger | 105 | Non-fiction | Fiction (p = 0.66) |
| 813 | De stad Utrecht en hare geschiedenis, voorafgegaan door eene algemeene geschied- en aardrijkskundige beschouwing over de provincie Utrecht | 138 | Fiction | Non-fiction (p = 0.78) |
| 835 | Leeuwarden na en voor hare wording als stad, en in hare betrekking tot de Leppa. Eene bydrage tot de Geschiedkundige beschrijving, van W. Eekhoff | 145 | Non-fiction | Fiction (p = 0.54) |
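
The report does not state the global Balanced Accuracy itself. Assuming each deviation is relative (slice = global × (1 + deviation)), the three performance rows can be inverted to recover the implied baseline. This is a back-of-the-envelope check on the reported figures, not a value taken from the scan.

```python
# (slice Balanced Accuracy, reported deviation) for the three performance issues
reported = [
    (0.735, -0.2089),  # text contains "van"
    (0.858, -0.0759),  # 60.5 <= text_length < 69.5
    (0.876, -0.0569),  # 101.5 <= text_length < 147.5
]

# Invert slice = global * (1 + deviation) for each row.
implied_global = [score / (1 + dev) for score, dev in reported]

for value in implied_global:
    print(f"{value:.3f}")
```

All three rows land close to 0.93, which supports reading the deviations as relative rather than absolute percentage points. The small spread is expected because the slice scores are rounded to three decimals.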
👉Robustness issues (2)

When feature “text” is perturbed with the transformation “Transform to uppercase”, the model changes its prediction in 24.3% of the cases. We expected the predictions not to be affected by this transformation.

| Level | Metric | Transformation | Deviation |
| --- | --- | --- | --- |
| major 🔴 | Fail rate = 0.243 | Transform to uppercase | 243/1000 tested samples (24.3%) changed prediction after perturbation |

Taxonomy

avid-effect:performance:P0201
🔍✨Examples
| Index | text | Transform to uppercase(text) | Original prediction | Prediction after perturbation |
| --- | --- | --- | --- | --- |
| 657 | The Wish. A novel by Hermann Sudermann. Translated by Lily Henkel | THE WISH. A NOVEL BY HERMANN SUDERMANN. TRANSLATED BY LILY HENKEL | Fiction (p = 1.00) | Non-fiction (p = 0.81) |
| 289 | A Great Gulf Fixed. A tale | A GREAT GULF FIXED. A TALE | Fiction (p = 1.00) | Non-fiction (p = 0.96) |
| 572 | Original Sonnets, elegiac, ethic and erotic: with some miscellaneous productions and imitations | ORIGINAL SONNETS, ELEGIAC, ETHIC AND EROTIC: WITH SOME MISCELLANEOUS PRODUCTIONS AND IMITATIONS | Fiction (p = 1.00) | Non-fiction (p = 0.98) |

When feature “text” is perturbed with the transformation “Transform to title case”, the model changes its prediction in 5.0% of the cases. We expected the predictions not to be affected by this transformation.

| Level | Metric | Transformation | Deviation |
| --- | --- | --- | --- |
| medium 🟡 | Fail rate = 0.050 | Transform to title case | 50/1000 tested samples (5.0%) changed prediction after perturbation |

Taxonomy

avid-effect:performance:P0201
🔍✨Examples
| Index | text | Transform to title case(text) | Original prediction | Prediction after perturbation |
| --- | --- | --- | --- | --- |
| 1268 | Ça ira! or Danton in the French Revolution. A study | Ça Ira! Or Danton In The French Revolution. A Study | Non-fiction (p = 0.98) | Fiction (p = 1.00) |
| 1078 | Two Women in the Klondike. The story of a journey to the gold-fields of Alaska ... With 105 illustrations and map | Two Women In The Klondike. The Story Of A Journey To The Gold-Fields Of Alaska ... With 105 Illustrations And Map | Non-fiction (p = 1.00) | Fiction (p = 0.64) |
| 489 | The passionate pilgrim, or Eros and Anteros. By Henry J. Thurstan [pseudonym of F. T. Palgrave] | The Passionate Pilgrim, Or Eros And Anteros. By Henry J. Thurstan [Pseudonym Of F. T. Palgrave] | Non-fiction (p = 0.99) | Fiction (p = 0.87) |
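
Both robustness checks follow the same recipe: apply a transformation to each text, re-run the model, and count how often the predicted label changes. The sketch below illustrates that fail-rate calculation. `toy_predict` is a hypothetical, deliberately case-sensitive stand-in for the real classifier, and using Python's `str.upper` and `str.title` as the transformations is an assumption about how the perturbations behave.

```python
def fail_rate(predict, texts, transform):
    """Fraction of texts whose predicted label changes after the transform."""
    changed = sum(predict(t) != predict(transform(t)) for t in texts)
    return changed / len(texts)


def toy_predict(text):
    # Hypothetical classifier: keys on lowercase cue words, so it is case-sensitive.
    return "Fiction" if ("novel" in text or "tale" in text) else "Non-fiction"


# Small illustrative sample; the last two titles are invented.
texts = [
    "The Wish. A novel",
    "A Great Gulf Fixed. A tale",
    "A history of bridges",
    "Notes on geology",
]

upper_rate = fail_rate(toy_predict, texts, str.upper)
title_rate = fail_rate(toy_predict, texts, str.title)

print(f"uppercase fail rate:  {upper_rate:.3f}")
print(f"title-case fail rate: {title_rate:.3f}")
```

With a real model, `predict` would wrap the classifier's label output and `texts` would be the sampled records; the report's 243/1000 and 50/1000 figures are fail rates of this kind.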

Check out the Giskard Space and Giskard Documentation to learn more about how to test your model.

Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.
