Report for TheBritishLibrary/bl-books-genre
Hi Team,
This is a report from Giskard Bot Scan 🐢.
We have identified 5 potential vulnerabilities in your model based on an automated scan.
This automated analysis evaluated the model on the dataset TheBritishLibrary/blbooksgenre (subset title_genre_classifiction, split train).
You can find the full version of the scan report here.
👉Performance issues (3)
For records in the dataset where text contains "van", the Balanced Accuracy is 20.89% lower than the global Balanced Accuracy.
Level | Data slice | Metric | Deviation |
---|---|---|---|
major 🔴 | text contains "van" | Balanced Accuracy = 0.735 | -20.89% than global |
Taxonomy
avid-effect:performance:P0204
🔍✨Examples
| | text | label | Predicted label |
---|---|---|---|
807 | Mohammed Christus. Vier Schetsen van eene reis in het Oosten, etc | Non-fiction | Fiction (p = 0.74) |
835 | Leeuwarden na en voor hare wording als stad, en in hare betrekking tot de Leppa. Eene bydrage tot de Geschiedkundige beschrijving, van W. Eekhoff | Non-fiction | Fiction (p = 0.54) |
855 | Burgerlijk en kerkelijk gedenkboek van Haamstede, deels ook van Burgh, etc | Fiction | Non-fiction (p = 0.97) |
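
As a rough illustration of how a slice-level figure like this can be checked: balanced accuracy is the mean of per-class recall, and the reported deviation appears to be relative to the global score (0.735 against a global value near 0.929 gives roughly -20.9%). The sketch below uses plain Python and made-up toy rows. The helper names and the case-insensitive substring rule for "van" are assumptions for illustration, not Giskard's implementation.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def relative_deviation(slice_score, global_score):
    """Relative change of the slice score with respect to the global score."""
    return (slice_score - global_score) / global_score


# Toy rows (title, true label, predicted label); invented for this sketch.
rows = [
    ("A tale of two cities", "Fiction", "Fiction"),
    ("History of England", "Non-fiction", "Non-fiction"),
    ("Gedenkboek van Haamstede", "Fiction", "Non-fiction"),
    ("Schetsen van eene reis", "Non-fiction", "Non-fiction"),
    ("Een roman van liefde", "Fiction", "Fiction"),
    ("Poems and ballads", "Fiction", "Fiction"),
]

# Assumed slicing rule: case-insensitive substring match on "van".
van_rows = [r for r in rows if "van" in r[0].lower()]

global_ba = balanced_accuracy([r[1] for r in rows], [r[2] for r in rows])
slice_ba = balanced_accuracy([r[1] for r in van_rows], [r[2] for r in van_rows])
deviation = relative_deviation(slice_ba, global_ba)

print(f"global={global_ba:.3f} slice={slice_ba:.3f} deviation={deviation:+.2%}")
```

A plain substring match also catches words such as "advantage", so a token-level match may be closer to what is intended when inspecting the slice by hand.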
For records in the dataset where text_length(text) >= 60.500 AND text_length(text) < 69.500, the Balanced Accuracy is 7.59% lower than the global Balanced Accuracy.
Level | Data slice | Metric | Deviation |
---|---|---|---|
medium 🟡 | text_length(text) >= 60.500 AND text_length(text) < 69.500 | Balanced Accuracy = 0.858 | -7.59% than global |
Taxonomy
avid-effect:performance:P0204
🔍✨Examples
| | text | text_length(text) | label | Predicted label |
---|---|---|---|---|
252 | Bush-Life in Queensland; or, John West's colonial experiences | 61 | Fiction | Non-fiction (p = 1.00) |
736 | Die Wanderungen der Kelten. Historisch-kritisch dargelegt, etc | 62 | Fiction | Non-fiction (p = 0.99) |
807 | Mohammed Christus. Vier Schetsen van eene reis in het Oosten, etc | 65 | Non-fiction | Fiction (p = 0.74) |
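
The text_length(text) values in the table match a plain character count of each title, so the slice condition can be sketched as a half-open interval on `len(text)`. This is a minimal sketch under that assumption. The helper names are hypothetical, and the three titles are copied from the examples above.

```python
def text_length(text: str) -> int:
    """Character count of the title (assumed to match the report's text_length)."""
    return len(text)


def in_length_slice(text: str, low: float, high: float) -> bool:
    """True when low <= text_length(text) < high."""
    return low <= text_length(text) < high


titles = [
    "Bush-Life in Queensland; or, John West's colonial experiences",
    "Die Wanderungen der Kelten. Historisch-kritisch dargelegt, etc",
    "Mohammed Christus. Vier Schetsen van eene reis in het Oosten, etc",
]

lengths = [text_length(t) for t in titles]
selected = [t for t in titles if in_length_slice(t, 60.5, 69.5)]

print(lengths)
print(len(selected), "of", len(titles), "titles fall in the slice")
```

The fractional bounds (60.500, 69.500) are midpoints between integer lengths, so the slice effectively covers titles of 61 to 69 characters.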
For records in the dataset where text_length(text) >= 101.500 AND text_length(text) < 147.500, the Balanced Accuracy is 5.69% lower than the global Balanced Accuracy.
Level | Data slice | Metric | Deviation |
---|---|---|---|
medium 🟡 | text_length(text) >= 101.500 AND text_length(text) < 147.500 | Balanced Accuracy = 0.876 | -5.69% than global |
Taxonomy
avid-effect:performance:P0204
🔍✨Examples
| | text | text_length(text) | label | Predicted label |
---|---|---|---|---|
805 | Verdens Storbyer [Translated from 'Les Capitales du monde.'] Paa Dansk af P. Nansen. Med 322 Afbildninger | 105 | Non-fiction | Fiction (p = 0.66) |
813 | De stad Utrecht en hare geschiedenis, voorafgegaan door eene algemeene geschied- en aardrijkskundige beschouwing over de provincie Utrecht | 138 | Fiction | Non-fiction (p = 0.78) |
835 | Leeuwarden na en voor hare wording als stad, en in hare betrekking tot de Leppa. Eene bydrage tot de Geschiedkundige beschrijving, van W. Eekhoff | 145 | Non-fiction | Fiction (p = 0.54) |
👉Robustness issues (2)
When feature “text” is perturbed with the transformation “Transform to uppercase”, the model changes its prediction in 24.3% of the cases. We expected the predictions not to be affected by this transformation.
Level | Metric | Transformation | Deviation |
---|---|---|---|
major 🔴 | Fail rate = 0.243 | Transform to uppercase | 243/1000 tested samples (24.3%) changed prediction after perturbation |
Taxonomy
avid-effect:performance:P0201
🔍✨Examples
| | text | Transform to uppercase(text) | Original prediction | Prediction after perturbation |
---|---|---|---|---|
657 | The Wish. A novel by Hermann Sudermann. Translated by Lily Henkel | THE WISH. A NOVEL BY HERMANN SUDERMANN. TRANSLATED BY LILY HENKEL | Fiction (p = 1.00) | Non-fiction (p = 0.81) |
289 | A Great Gulf Fixed. A tale | A GREAT GULF FIXED. A TALE | Fiction (p = 1.00) | Non-fiction (p = 0.96) |
572 | Original Sonnets, elegiac, ethic and erotic: with some miscellaneous productions and imitations | ORIGINAL SONNETS, ELEGIAC, ETHIC AND EROTIC: WITH SOME MISCELLANEOUS PRODUCTIONS AND IMITATIONS | Fiction (p = 1.00) | Non-fiction (p = 0.98) |
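
The fail rate reported for a perturbation is the share of tested samples whose predicted label changes after the transformation (243/1000 = 0.243 here). The sketch below shows that computation with a deliberately case-sensitive stand-in classifier. `toy_predict` and its cue words are hypothetical and are not the scanned model; the titles are taken from examples in this report.

```python
from typing import Callable, Sequence


def fail_rate(
    predict: Callable[[str], str],
    texts: Sequence[str],
    transform: Callable[[str], str],
) -> float:
    """Fraction of texts whose predicted label changes after `transform`."""
    changed = sum(1 for t in texts if predict(t) != predict(transform(t)))
    return changed / len(texts)


def toy_predict(text: str) -> str:
    """Hypothetical stand-in: matches lowercase cue words only."""
    cues = ("novel", "tale")
    return "Fiction" if any(cue in text for cue in cues) else "Non-fiction"


titles = [
    "The Wish. A novel by Hermann Sudermann. Translated by Lily Henkel",
    "A Great Gulf Fixed. A tale",
    "Die Wanderungen der Kelten. Historisch-kritisch dargelegt, etc",
    "Bush-Life in Queensland; or, John West's colonial experiences",
]

upper_fail_rate = fail_rate(toy_predict, titles, str.upper)
print(f"uppercase fail rate: {upper_fail_rate:.3f}")
```

Passing a different callable as `transform` (for example `str.title`) gives the corresponding figure for another perturbation.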
When feature “text” is perturbed with the transformation “Transform to title case”, the model changes its prediction in 5.0% of the cases. We expected the predictions not to be affected by this transformation.
Level | Metric | Transformation | Deviation |
---|---|---|---|
medium 🟡 | Fail rate = 0.050 | Transform to title case | 50/1000 tested samples (5.0%) changed prediction after perturbation |
Taxonomy
avid-effect:performance:P0201
🔍✨Examples
| | text | Transform to title case(text) | Original prediction | Prediction after perturbation |
---|---|---|---|---|
1268 | Ça ira! or Danton in the French Revolution. A study | Ça Ira! Or Danton In The French Revolution. A Study | Non-fiction (p = 0.98) | Fiction (p = 1.00) |
1078 | Two Women in the Klondike. The story of a journey to the gold-fields of Alaska ... With 105 illustrations and map | Two Women In The Klondike. The Story Of A Journey To The Gold-Fields Of Alaska ... With 105 Illustrations And Map | Non-fiction (p = 1.00) | Fiction (p = 0.64) |
489 | The passionate pilgrim, or Eros and Anteros. By Henry J. Thurstan [pseudonym of F. T. Palgrave] | The Passionate Pilgrim, Or Eros And Anteros. By Henry J. Thurstan [Pseudonym Of F. T. Palgrave] | Non-fiction (p = 0.99) | Fiction (p = 0.87) |
Check out the Giskard Space and Giskard Documentation to learn more about how to test your model.
Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.