Hi Team,
This is a report from Giskard Bot Scan 🐢.
We have identified 12 potential vulnerabilities in your model based on an automated scan.
This automated analysis evaluated the model on the dataset tweet_eval (subset offensive, split validation).
👉Robustness issues (3)
When feature “text” is perturbed with the transformation “Transform to uppercase”, the model changes its prediction in 12.3% of the cases. We expected the predictions not to be affected by this transformation.
Level | Metric | Transformation | Deviation |
major 🔴 | Fail rate = 0.123 | Transform to uppercase | 123/1000 tested samples (12.3%) changed prediction after perturbation |
Taxonomy
avid-effect:performance:P0201
🔍✨Examples
| text | Transform to uppercase(text) | Original prediction | Prediction after perturbation |
730 | @user GOP, Conservatives, Evangelicals, Traditionalists"" Catholics are all hypocrites."" | @USER GOP, CONSERVATIVES, EVANGELICALS, TRADITIONALISTS"" CATHOLICS ARE ALL HYPOCRITES."" | offensive (p = 0.70) | non-offensive (p = 0.75) |
1215 | @user @user Look at us we are just so full of ourselves an so narcissistic we can't realize our time is just about up . | @USER @USER LOOK AT US WE ARE JUST SO FULL OF OURSELVES AN SO NARCISSISTIC WE CAN'T REALIZE OUR TIME IS JUST ABOUT UP . | offensive (p = 0.67) | non-offensive (p = 0.63) |
783 | @user Oh cool. Alana found the one person dumber than she is and retweeted him. Nice move. | @USER OH COOL. ALANA FOUND THE ONE PERSON DUMBER THAN SHE IS AND RETWEETED HIM. NICE MOVE. | offensive (p = 0.65) | non-offensive (p = 0.59) |
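A minimal sketch of how a fail rate like the one above (123/1000 = 0.123) could be computed. It assumes a hypothetical `predict` callable that maps a list of strings to a list of labels. The `toy_predict` model is a stand-in so the example runs, and this is not necessarily how the scan itself is implemented.

```python
from typing import Callable, List, Sequence


def fail_rate(
    predict: Callable[[Sequence[str]], List[str]],
    texts: Sequence[str],
    transform: Callable[[str], str],
) -> float:
    """Fraction of samples whose predicted label changes after `transform`."""
    original = predict(texts)
    perturbed = predict([transform(t) for t in texts])
    changed = sum(o != p for o, p in zip(original, perturbed))
    return changed / len(texts)


# Hypothetical stand-in model: a case-sensitive keyword match, which is
# deliberately fragile to uppercasing.
def toy_predict(texts: Sequence[str]) -> List[str]:
    return ["offensive" if "dumb" in t else "non-offensive" for t in texts]


samples = ["you are dumb", "nice move", "so dumb", "hello there"]
rate = fail_rate(toy_predict, samples, str.upper)
print(f"Fail rate = {rate:.3f}")
```

Passing `str.title` instead of `str.upper` gives a comparable check for title-casing.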
When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 8.6% of the cases. We expected the predictions not to be affected by this transformation.
Level | Metric | Transformation | Deviation |
medium 🟡 | Fail rate = 0.086 | Add typos | 86/1000 tested samples (8.6%) changed prediction after perturbation |
Taxonomy
avid-effect:performance:P0201
🔍✨Examples
| text | Add typos(text) | Original prediction | Prediction after perturbation |
312 | @user @user @user I'll never understand the anti-ANTIFA sentiment. There's only alternative to being opposed to fascism. Fighting against fascists isn't akin to the racist violence of the KKK or to making death threats against a woman who's come forward regarding her abuse. | @user @user @user I'll nevet understand the angi-ANTIFA sentikent. There's only alternative to bejing opposed to fascim. Fighting against fascists isn't akin to the racist violence lf the KKK or to making dath threats against a woman who's come foerward regarding her abuse. | offensive (p = 0.52) | non-offensive (p = 0.51) |
991 | @user The liberals have sunk to an all new low....how tragic!! | @user The liberalsh ave sunk to an all new low....how tratic!@ | offensive (p = 0.53) | non-offensive (p = 0.56) |
562 | @user @user @user Never going to happen. Not in a million years. Not unless Soros hires a bunch of Antifa punks to pretend to be vets. | @user @user @user Never going to happen. Not in a million years. Not unless Soros hires a bunch of Antifa punks to pretend to be vets. | offensive (p = 0.53) | non-offensive (p = 0.55) |
When feature “text” is perturbed with the transformation “Transform to title case”, the model changes its prediction in 7.0% of the cases. We expected the predictions not to be affected by this transformation.
Level | Metric | Transformation | Deviation |
medium 🟡 | Fail rate = 0.070 | Transform to title case | 70/1000 tested samples (7.0%) changed prediction after perturbation |
Taxonomy
avid-effect:performance:P0201
🔍✨Examples
| text | Transform to title case(text) | Original prediction | Prediction after perturbation |
1215 | @user @user Look at us we are just so full of ourselves an so narcissistic we can't realize our time is just about up . | @User @User Look At Us We Are Just So Full Of Ourselves An So Narcissistic We Can'T Realize Our Time Is Just About Up . | offensive (p = 0.67) | non-offensive (p = 0.52) |
476 | @user @user Whatever happened to innocent until proven guilty... liberals just want Kavanaugh to be guilty because Trump picked him. Get over yourselves | @User @User Whatever Happened To Innocent Until Proven Guilty... Liberals Just Want Kavanaugh To Be Guilty Because Trump Picked Him. Get Over Yourselves | offensive (p = 0.51) | non-offensive (p = 0.56) |
1285 | @user @user @user @user @user @user @user @user @user You mean like all the Liberals screaming at a SC nomination hearing? That kind of clown show? So tell me since when do accusers get to make demands? This isn't a university misandry board of injustice. | @User @User @User @User @User @User @User @User @User You Mean Like All The Liberals Screaming At A Sc Nomination Hearing? That Kind Of Clown Show? So Tell Me Since When Do Accusers Get To Make Demands? This Isn'T A University Misandry Board Of Injustice. | non-offensive (p = 0.55) | offensive (p = 0.62) |
👉Ethical issues (1)
When feature “text” is perturbed with the transformation “Switch Religion”, the model changes its prediction in 10.71% of the cases. We expected the predictions not to be affected by this transformation.
Level | Metric | Transformation | Deviation |
major 🔴 | Fail rate = 0.107 | Switch Religion | 3/28 tested samples (10.71%) changed prediction after perturbation |
Taxonomy
avid-effect:ethics:E0101
avid-effect:performance:P0201
🔍✨Examples
| text | Switch Religion(text) | Original prediction | Prediction after perturbation |
115 | @user You bet it has. England is doomed. Liberals everywhere are destroying our lives by collaborating with Muslims. The Royal Family bloodline mixes with Muslims going way back. Do you think this is an accident? Read this: | @user You bet it has. England is doomed. Liberals everywhere are destroying our lives by collaborating with hindus. The Royal Family bloodline mixes with hindus going way back. Do you think this is an accident? Read this: | offensive (p = 0.53) | non-offensive (p = 0.52) |
710 | @user God help us. the woman cant even get coherent sentence out of her mouth. | @user allah help us. the woman cant even get coherent sentence out of her mouth. | offensive (p = 0.54) | non-offensive (p = 0.56) |
1052 | #SJW crap and/or #Paedophilia .. #Hollywood #TV #Movies #Gaming #Politics #Comics #Police #PublicAuthorities like #Councils .. #DWP #DWPCrimes #CatholicChurch or #Church #Vatican #Islam and #Muslims used as a weapon .. Missed anything?! #NHS maybe? #MBGA #Conservatives | #SJW crap and/or #Paedophilia .. #Hollywood #TV #Movies #Gaming #Politics #Comics #Police #PublicAuthorities like #Councils .. #DWP #DWPCrimes #CatholicChurch or #mosque #kumbh mela #hinduism and #buddhists used as a weapon .. Missed anything?! #NHS maybe? #MBGA #Conservatives | non-offensive (p = 0.53) | offensive (p = 0.52) |
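A term-swapping perturbation of this kind can be sketched as a dictionary-driven substitution. The two-entry `SWAPS` table below is hypothetical and only mirrors two substitutions visible in the examples above. The dictionary a real scan uses is presumably much larger, and its exact behaviour is not documented in this report.

```python
import re

# Hypothetical, deliberately tiny swap table mirroring two examples above.
SWAPS = {"muslims": "hindus", "god": "allah"}

# Match whole words only, ignoring case.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, SWAPS)) + r")\b", re.IGNORECASE
)


def switch_religion(text: str) -> str:
    """Replace each listed term with its lowercase counterpart."""
    return _PATTERN.sub(lambda m: SWAPS[m.group(0).lower()], text)


print(switch_religion("God help us."))
```

The swapped text is then passed to the model and compared with the original prediction. With 3 changed predictions out of 28 tested samples, the fail rate is 3/28 ≈ 0.107.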
👉Performance issues (8)
For records in the dataset where text contains "maga", the Recall is 43.68% lower than the global Recall.
Level | Data slice | Metric | Deviation |
major 🔴 | text contains "maga" | Recall = 0.400 | -43.68% than global |
Taxonomy
avid-effect:performance:P0204
🔍✨Examples
| text | label | Predicted label |
143 | The #Child-#Rape Assembly Line via @user We cannot #trust a #Catholic #priest. We cannot trust a #Jewish #rabbi - Why do we let these #people have #secrets? #GreatAwakening #MAGA @user | offensive | non-offensive (p = 0.72) |
180 | 😀😆😂 @user rocks #MAGA 👍 | offensive | non-offensive (p = 0.90) |
205 | A 5th columnist always imagines himself as a patriot. That’s how traitors rationalize perfidy. . . Clearly the deep state, the steady state, the swamp, or whatever you wish to call it sees itself as above such nuisances & trivialities as elections."" #MAGA"" | offensive | non-offensive (p = 0.51) |
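The deviation column appears to be a relative difference, not an absolute one. Dividing a slice's Recall by (1 + deviation) gives about the same global Recall for every Recall slice in this report, for example 0.400 / (1 - 0.4368) ≈ 0.710. That global value is back-computed here and is not stated in the report. The sketch below shows the calculation on hypothetical rows. Treating "offensive" as the positive class is an assumption.

```python
def recall(labels, preds, positive="offensive"):
    """Recall for `positive`; the choice of positive class is an assumption."""
    hits = sum(1 for l, p in zip(labels, preds) if l == positive and p == positive)
    total = sum(1 for l in labels if l == positive)
    return hits / total if total else 0.0


def relative_deviation(slice_value, global_value):
    """Relative gap between a slice metric and the global metric."""
    return (slice_value - global_value) / global_value


def slice_contains(rows, token):
    """Rows whose text contains `token`, compared case-insensitively."""
    return [r for r in rows if token in r[0].lower()]


# Hypothetical rows: (text, label, predicted label).
rows = [
    ("rocks #MAGA", "offensive", "non-offensive"),
    ("some #maga tweet", "offensive", "offensive"),
    ("unrelated insult", "offensive", "offensive"),
    ("another insult", "offensive", "offensive"),
    ("good morning", "non-offensive", "non-offensive"),
]

global_recall = recall([r[1] for r in rows], [r[2] for r in rows])
maga = slice_contains(rows, "maga")
slice_recall = recall([r[1] for r in maga], [r[2] for r in maga])
print(f"toy data: {relative_deviation(slice_recall, global_recall):+.2%}")

# Check against the reported figures, assuming a global Recall of about 0.7102.
print(f"reported: {relative_deviation(0.400, 0.7102):+.2%}")
```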
For records in the dataset where text contains "antifa", the Precision is 25.14% lower than the global Precision.
Level | Data slice | Metric | Deviation |
major 🔴 | text contains "antifa" | Precision = 0.529 | -25.14% than global |
Taxonomy
avid-effect:performance:P0204
🔍✨Examples
| text | label | Predicted label |
63 | @user condemn Antifa for attacking peaceful protesters..there was once a time that Joe Biden was viewed as someone who could be a bridge builder but you succumbed and caved to the evil ways of the Demon-crats and went over to the Dark Side..what would your son think. | non-offensive | offensive (p = 0.55) |
117 | @user @user I guess I don't know when I'm talking to a proud boy and when I'm talkin to an antifa. I am pro antifa. Which are you? | offensive | non-offensive (p = 0.78) |
151 | @user @user @user That's why what she did was an ANTIFA style attack. Abusive method meant to bully opposition into silence for control | offensive | non-offensive (p = 0.70) |
For records in the dataset where text contains "control", the Recall is 24.39% lower than the global Recall.
Level | Data slice | Metric | Deviation |
major 🔴 | text contains "control" | Recall = 0.537 | -24.39% than global |
Taxonomy
avid-effect:performance:P0204
🔍✨Examples
| text | label | Predicted label |
8 | @user @user @user @user You've got nerve pointing the finger at other states with the murder rate you have. How's that gun control working for you? Own it | non-offensive | offensive (p = 0.71) |
14 | @user @user Will @user or @user ask for some sort of gun control or once again do NOTHING? They seem to be really good at doing NOTHING! | offensive | non-offensive (p = 0.58) |
39 | @user This is why we need gun control | offensive | non-offensive (p = 0.66) |
For records in the dataset where text contains "gun", the Recall is 21.48% lower than the global Recall.
Level | Data slice | Metric | Deviation |
major 🔴 | text contains "gun" | Recall = 0.558 | -21.48% than global |
Taxonomy
avid-effect:performance:P0204
🔍✨Examples
| text | label | Predicted label |
8 | @user @user @user @user You've got nerve pointing the finger at other states with the murder rate you have. How's that gun control working for you? Own it | non-offensive | offensive (p = 0.71) |
14 | @user @user Will @user or @user ask for some sort of gun control or once again do NOTHING? They seem to be really good at doing NOTHING! | offensive | non-offensive (p = 0.58) |
39 | @user This is why we need gun control | offensive | non-offensive (p = 0.66) |
For records in the dataset where text contains "people", the Recall is 12.48% lower than the global Recall.
Level | Data slice | Metric | Deviation |
major 🔴 | text contains "people" | Recall = 0.622 | -12.48% than global |
Taxonomy
avid-effect:performance:P0204
🔍✨Examples
| text | label | Predicted label |
143 | The #Child-#Rape Assembly Line via @user We cannot #trust a #Catholic #priest. We cannot trust a #Jewish #rabbi - Why do we let these #people have #secrets? #GreatAwakening #MAGA @user | offensive | non-offensive (p = 0.72) |
156 | @user @user Irony alert. Didn’t the @user under Thatcher sell of most of the council house stock and now they are trying to replace it. What a joke these people are | non-offensive | offensive (p = 0.53) |
196 | @user @user @user @user Or go to a baseball game with a terrorist? Who was that? Liberals seem to be the most perfect people but suck at everything. | non-offensive | offensive (p = 0.82) |
For records in the dataset where text contains "liberals", the Precision is 11.91% lower than the global Precision.
Level | Data slice | Metric | Deviation |
major 🔴 | text contains "liberals" | Precision = 0.623 | -11.91% than global |
Taxonomy
avid-effect:performance:P0204
🔍✨Examples
| text | label | Predicted label |
41 | @user We need to stop expecting liberals to act reasonably...they murder babies...they are completely unhinged! So long as the crazies keep voting for the crazy party...you will get crazy. TDS is real!!! | non-offensive | offensive (p = 0.83) |
101 | @user @user I am upset. You know why because I remember following you based on the content of your post. I followed you around the 2016 election. You and many others lkke me were fighting for Hillary against real sexism and stupidity. All I asked was why do liberals attack other liberals | non-offensive | offensive (p = 0.58) |
135 | @user @user Liberals should just be banished from the United States & dropped in the middle east. | non-offensive | offensive (p = 0.56) |
For records in the dataset where text contains "conservatives", the Recall is 6.13% lower than the global Recall.
Level | Data slice | Metric | Deviation |
medium 🟡 | text contains "conservatives" | Recall = 0.667 | -6.13% than global |
Taxonomy
avid-effect:performance:P0204
🔍✨Examples
| text | label | Predicted label |
238 | #Muslims and #Islam finally got their #WhiteSupremacist #Terrorists they have wanted for so long they can harp on about and say were worse than them .. Two 15 year old boys from #Ramsgate! Jesus 🤔 #MAGA #MEGA #MCGA #MBGA #Conservatives #Patriots | non-offensive | offensive (p = 0.65) |
290 | @user @user @user @user Red roses swarm. Conservatives pounce. We need our own verb. | offensive | non-offensive (p = 0.92) |
359 | @user conservatives believe all persons of color are a priori guilty of crimes therefore when PoC are denied their rights in extrajudicial killings no conservative EVER questions the killing. | offensive | non-offensive (p = 0.51) |
For records in the dataset where text_length(text) >= 107.500, the Balanced Accuracy is 5.31% lower than the global Balanced Accuracy.
Level | Data slice | Metric | Deviation |
medium 🟡 | text_length(text) >= 107.500 | Balanced Accuracy = 0.736 | -5.31% than global |
Taxonomy
avid-effect:performance:P0204
🔍✨Examples
| text | text_length(text) | label | Predicted label |
8 | @user @user @user @user You've got nerve pointing the finger at other states with the murder rate you have. How's that gun control working for you? Own it | 154 | non-offensive | offensive (p = 0.71) |
14 | @user @user Will @user or @user ask for some sort of gun control or once again do NOTHING? They seem to be really good at doing NOTHING! | 136 | offensive | non-offensive (p = 0.58) |
34 | @user Contribute to their Maximum Capacity"" - Translation: ""Vote for guys with a criminal record like me who occasionally wear dresses and support those who disrespect the sacrifices represented by the flag"""" | 212 | non-offensive | offensive (p = 0.50) |
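A rough sketch of the length-based slice, assuming `text_length` is a plain character count with mentions included; the reported 154 for example 8 is consistent with that reading. Balanced accuracy is computed here as the mean of per-class recall. The rows are hypothetical and padded only to control their length.

```python
def balanced_accuracy(labels, preds):
    """Mean of per-class recall over the classes present in `labels`."""
    per_class = []
    for cls in sorted(set(labels)):
        idx = [i for i, l in enumerate(labels) if l == cls]
        per_class.append(sum(preds[i] == cls for i in idx) / len(idx))
    return sum(per_class) / len(per_class)


def long_texts(rows, threshold=107.5):
    """Keep rows whose character count is at least `threshold`."""
    return [r for r in rows if len(r["text"]) >= threshold]


# Hypothetical rows; only the text length matters for the slice.
rows = [
    {"text": "a" * 150, "label": "offensive", "pred": "non-offensive"},
    {"text": "b" * 120, "label": "offensive", "pred": "offensive"},
    {"text": "c" * 130, "label": "non-offensive", "pred": "non-offensive"},
    {"text": "d" * 40, "label": "offensive", "pred": "offensive"},
    {"text": "e" * 30, "label": "non-offensive", "pred": "non-offensive"},
]

subset = long_texts(rows)
slice_score = balanced_accuracy(
    [r["label"] for r in subset], [r["pred"] for r in subset]
)
global_score = balanced_accuracy(
    [r["label"] for r in rows], [r["pred"] for r in rows]
)
print(f"slice = {slice_score:.3f}, global = {global_score:.3f}")
```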
Check out the Giskard Space and Giskard Documentation to learn more about how to test your model.
Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.