Report for TheBritishLibrary/bl-books-genre

Hi Team,

This is a report from Giskard Bot Scan 🐢.

We have identified 5 potential vulnerabilities in your model based on an automated scan.

This automated analysis evaluated the model on the dataset TheBritishLibrary/blbooksgenre (subset title_genre_classifiction, split train).

You can find the full version of the scan report here.

👉Performance issues (3)

For records in the dataset where text contains "van", the Balanced Accuracy is 20.89% lower than the global Balanced Accuracy.

Level	Data slice	Metric	Deviation
major 🔴	`text` contains "van"	Balanced Accuracy = 0.735	-20.89% than global

Taxonomy

avid-effect:performance:P0204

🔍✨Examples

	text	label	Predicted `label`
807	Mohammed Christus. Vier Schetsen van eene reis in het Oosten, etc	Non-fiction	Fiction (p = 0.74)
835	Leeuwarden na en voor hare wording als stad, en in hare betrekking tot de Leppa. Eene bydrage tot de Geschiedkundige beschrijving, van W. Eekhoff	Non-fiction	Fiction (p = 0.54)
855	Burgerlijk en kerkelijk gedenkboek van Haamstede, deels ook van Burgh, etc	Fiction	Non-fiction (p = 0.97)
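As a reading aid, here is a minimal sketch of how a slice score and its deviation could be computed. It assumes Balanced Accuracy is the mean of per-class recall and that the reported deviation is relative to the global score (slice minus global, divided by global); the scan's exact implementation is not shown in this report. The function names and the toy labels are hypothetical.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall (assumed definition of Balanced Accuracy)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def relative_deviation(slice_score, global_score):
    """Signed deviation of a slice score, relative to the global score."""
    return (slice_score - global_score) / global_score


def implied_global(slice_score, deviation):
    """Global score implied by a slice score and its relative deviation."""
    return slice_score / (1 + deviation)


# Toy labels, invented for illustration only (not taken from the dataset).
y_true = ["Fiction", "Fiction", "Non-fiction", "Non-fiction"]
y_pred = ["Fiction", "Non-fiction", "Non-fiction", "Non-fiction"]
toy_score = balanced_accuracy(y_true, y_pred)

# Slice scores and deviations as reported in the three performance issues.
reported = [(0.735, -0.2089), (0.858, -0.0759), (0.876, -0.0569)]
implied = [implied_global(score, dev) for score, dev in reported]
```

Under the relative-deviation assumption, all three reported slices imply a global Balanced Accuracy of roughly 0.93, which suggests the reading is consistent across the report.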

For records in the dataset where text_length(text) >= 60.500 AND text_length(text) < 69.500, the Balanced Accuracy is 7.59% lower than the global Balanced Accuracy.

Level	Data slice	Metric	Deviation
medium 🟡	`text_length(text)` >= 60.500 AND `text_length(text)` < 69.500	Balanced Accuracy = 0.858	-7.59% than global

Taxonomy

avid-effect:performance:P0204

🔍✨Examples

	text	text_length(text)	label	Predicted `label`
252	Bush-Life in Queensland; or, John West's colonial experiences	61	Fiction	Non-fiction (p = 1.00)
736	Die Wanderungen der Kelten. Historisch-kritisch dargelegt, etc	62	Fiction	Non-fiction (p = 0.99)
807	Mohammed Christus. Vier Schetsen van eene reis in het Oosten, etc	65	Non-fiction	Fiction (p = 0.74)

For records in the dataset where text_length(text) >= 101.500 AND text_length(text) < 147.500, the Balanced Accuracy is 5.69% lower than the global Balanced Accuracy.

Level	Data slice	Metric	Deviation
medium 🟡	`text_length(text)` >= 101.500 AND `text_length(text)` < 147.500	Balanced Accuracy = 0.876	-5.69% than global

Taxonomy

avid-effect:performance:P0204

🔍✨Examples

	text	text_length(text)	label	Predicted `label`
805	Verdens Storbyer [Translated from 'Les Capitales du monde.'] Paa Dansk af P. Nansen. Med 322 Afbildninger	105	Non-fiction	Fiction (p = 0.66)
813	De stad Utrecht en hare geschiedenis, voorafgegaan door eene algemeene geschied- en aardrijkskundige beschouwing over de provincie Utrecht	138	Fiction	Non-fiction (p = 0.78)
835	Leeuwarden na en voor hare wording als stad, en in hare betrekking tot de Leppa. Eene bydrage tot de Geschiedkundige beschrijving, van W. Eekhoff	145	Non-fiction	Fiction (p = 0.54)
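The two length-based slices above appear to use the character count of the title: the example rows match Python's `len()` on the text. A minimal sketch of that slicing follows, assuming a half-open interval as written in the slice description. The helper name `in_length_slice` is hypothetical, and the third title is borrowed from another table in this report purely as a short counterexample.

```python
def in_length_slice(text, low, high):
    """True when the character count of text falls in [low, high)."""
    return low <= len(text) < high


titles = [
    "Bush-Life in Queensland; or, John West's colonial experiences",
    "Die Wanderungen der Kelten. Historisch-kritisch dargelegt, etc",
    "A Great Gulf Fixed. A tale",
]

# Bounds copied from the medium-severity slice reported above.
medium_slice = [t for t in titles if in_length_slice(t, 60.5, 69.5)]
lengths = [len(t) for t in titles]
```

The first two titles have 61 and 62 characters, as listed in the examples table, so they fall inside the slice; the third is far shorter and is excluded.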

👉Robustness issues (2)

When feature “text” is perturbed with the transformation “Transform to uppercase”, the model changes its prediction in 24.3% of the cases. We expected the predictions not to be affected by this transformation.

Level	Metric	Transformation	Deviation
major 🔴	Fail rate = 0.243	Transform to uppercase	243/1000 tested samples (24.3%) changed prediction after perturbation

Taxonomy

avid-effect:performance:P0201

🔍✨Examples

	text	Transform to uppercase(text)	Original prediction	Prediction after perturbation
657	The Wish. A novel by Hermann Sudermann. Translated by Lily Henkel	THE WISH. A NOVEL BY HERMANN SUDERMANN. TRANSLATED BY LILY HENKEL	Fiction (p = 1.00)	Non-fiction (p = 0.81)
289	A Great Gulf Fixed. A tale	A GREAT GULF FIXED. A TALE	Fiction (p = 1.00)	Non-fiction (p = 0.96)
572	Original Sonnets, elegiac, ethic and erotic: with some miscellaneous productions and imitations	ORIGINAL SONNETS, ELEGIAC, ETHIC AND EROTIC: WITH SOME MISCELLANEOUS PRODUCTIONS AND IMITATIONS	Fiction (p = 1.00)	Non-fiction (p = 0.98)

When feature “text” is perturbed with the transformation “Transform to title case”, the model changes its prediction in 5.0% of the cases. We expected the predictions not to be affected by this transformation.

Level	Metric	Transformation	Deviation
medium 🟡	Fail rate = 0.050	Transform to title case	50/1000 tested samples (5.0%) changed prediction after perturbation

Taxonomy

avid-effect:performance:P0201

🔍✨Examples

	text	Transform to title case(text)	Original prediction	Prediction after perturbation
1268	Ça ira! or Danton in the French Revolution. A study	Ça Ira! Or Danton In The French Revolution. A Study	Non-fiction (p = 0.98)	Fiction (p = 1.00)
1078	Two Women in the Klondike. The story of a journey to the gold-fields of Alaska ... With 105 illustrations and map	Two Women In The Klondike. The Story Of A Journey To The Gold-Fields Of Alaska ... With 105 Illustrations And Map	Non-fiction (p = 1.00)	Fiction (p = 0.64)
489	The passionate pilgrim, or Eros and Anteros. By Henry J. Thurstan [pseudonym of F. T. Palgrave]	The Passionate Pilgrim, Or Eros And Anteros. By Henry J. Thurstan [Pseudonym Of F. T. Palgrave]	Non-fiction (p = 0.99)	Fiction (p = 0.87)
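The fail rate in both robustness issues can be sketched as the share of samples whose predicted label changes after the transformation. The snippet below is an illustration only: `toy_predict` is a hypothetical, deliberately case-sensitive stand-in for the real classifier, and Python's `str.upper` and `str.title` are assumed to approximate the scan's transformations, which may differ in detail.

```python
def fail_rate(predict, texts, transform):
    """Fraction of texts whose predicted label changes after transform."""
    changed = sum(1 for t in texts if predict(t) != predict(transform(t)))
    return changed / len(texts)


def toy_predict(text):
    """Hypothetical case-sensitive classifier, not the scanned model."""
    if "novel" in text or "tale" in text:
        return "Fiction"
    return "Non-fiction"


def toy_predict_uncased(text):
    """Same toy rule applied to lower-cased input, so casing cannot matter."""
    return toy_predict(text.lower())


# Titles taken from the example tables in this report.
texts = [
    "The Wish. A novel by Hermann Sudermann. Translated by Lily Henkel",
    "A Great Gulf Fixed. A tale",
    "Die Wanderungen der Kelten. Historisch-kritisch dargelegt, etc",
    "Ça ira! or Danton in the French Revolution. A study",
]

upper_rate = fail_rate(toy_predict, texts, str.upper)
title_rate = fail_rate(toy_predict, texts, str.title)
uncased_rate = fail_rate(toy_predict_uncased, texts, str.upper)
```

With the toy classifier, two of the four titles flip under either transformation, while the lower-cased variant never flips. For the real scan, the reported figures correspond to 243/1000 and 50/1000 changed predictions.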

Check out the Giskard Space and Giskard Documentation to learn more about how to test your model.

Disclaimer: automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess their impact accordingly.