Pitfalls in AI
Joachim Ganseman
20/03/2019
Smals Research
www.smalsresearch.be
• Innovation with new technologies
• Consultancy & expertise
• Internal & external knowledge transfer
• Support for going live
• Research topics
– Robotic Process Automation
– Conversational Interfaces
– Productivity in AI
– Web Scraping for Analytics
– Data Quality
– AI for Public Sector
– NewSQL Databases
– Advanced Cryptography
– Blockchain
Today’s menu
• From data to decision
– Data collection issues (bias vs. fairness)
– Data processing issues (confounding variables)
– Goal (mis)formulation
• Attacks against AI systems
– Data poisoning
– Adversarial examples
• Abuse of AI systems
– Spear phishing
– (personalized) disinformation
– The role of recommender systems
• Defense against the Dark Arts
– Transparency & explainability
– Digital Skepticism
– Policy
Screwing up your own AI
Someone screws with your AI
Someone’s AI screws with you
About data
• AI systems are trained on data
→ Garbage in, garbage out
• Training data is ideally
– Well-balanced
– Free from hidden correlations
– Independent and identically distributed (iid) over the domain
• In reality, this is rarely the case!
The curse of dimensionality
• Collecting data is expensive
• Any combination of every parameter … never finished
• Need lots of data quickly
→ Crowdsourcing?
... when controlled!
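A rough count shows why covering every combination of every parameter is never finished. The sketch below assumes, purely for illustration, that each parameter is discretized into the same number of values; the function name `grid_size` and the numbers are hypothetical.

```python
def grid_size(values_per_parameter: int, num_parameters: int) -> int:
    """Number of distinct parameter combinations on a full grid."""
    return values_per_parameter ** num_parameters


# Assumption: 10 possible values per parameter.
for num_parameters in (1, 2, 5, 10, 20):
    combinations = grid_size(10, num_parameters)
    print(num_parameters, "parameters ->", combinations, "combinations")
```

Each extra parameter multiplies the amount of data needed for full coverage, which is why sampling strategy matters more than raw volume.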
The danger of unbalanced datasets
• Q: What offer to make to a new F engineer?
• AI:
– F are mostly Junior
– Average wage for Junior F is 30000
Employee  Diploma   Gender  Experience  Wage
1         Business  M       Senior      50000
2         Theatre   F       Junior      20000
3         Theatre   F       Senior      25000
4         Business  F       Junior      40000
5         Engineer  M       Senior      50000
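The reasoning attributed to the AI can be reproduced on the five rows of the table. This is a minimal sketch, not a real model: the helper `average_wage` is a hypothetical name, and it simply averages the wages of all matching rows.

```python
from statistics import mean

# The five employees from the table.
employees = [
    {"id": 1, "diploma": "Business", "gender": "M", "experience": "Senior", "wage": 50000},
    {"id": 2, "diploma": "Theatre", "gender": "F", "experience": "Junior", "wage": 20000},
    {"id": 3, "diploma": "Theatre", "gender": "F", "experience": "Senior", "wage": 25000},
    {"id": 4, "diploma": "Business", "gender": "F", "experience": "Junior", "wage": 40000},
    {"id": 5, "diploma": "Engineer", "gender": "M", "experience": "Senior", "wage": 50000},
]


def average_wage(**criteria):
    """Mean wage of all employees matching the given column values, or None."""
    wages = [
        e["wage"]
        for e in employees
        if all(e[column] == value for column, value in criteria.items())
    ]
    return mean(wages) if wages else None


print(average_wage(gender="F", experience="Junior"))  # the AI's answer: 30000
print(average_wage(diploma="Engineer"))               # the only engineer: 50000
print(average_wage(diploma="Engineer", gender="F"))   # no such row: None
```

The 30000 figure is built entirely from Theatre and Business graduates; the dataset contains no female engineer at all, so the answer says nothing about the question that was asked.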
Hidden correlations
• We’ll fix it by not taking gender into account, right?
• … well…
– Men/women have different ways of speaking
– In CVs, men/women mention different things (hobbies…)
→ Gender as prominent confounding factor in Amazon’s model
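Dropping the gender column does not help if another feature carries the same information. The sketch below uses made-up CV records (the `hobby_a`/`hobby_b` labels and their counts are assumptions, not data from Amazon's model) and measures how well the remaining feature alone predicts the removed one.

```python
from collections import Counter, defaultdict

# Hypothetical records: (gender, hobby mentioned in the CV).
cvs = (
    [("M", "hobby_a")] * 4 + [("M", "hobby_b")]
    + [("F", "hobby_b")] * 4 + [("F", "hobby_a")]
)


def proxy_accuracy(rows):
    """Accuracy of guessing gender from the hobby alone (majority vote per hobby)."""
    counts = defaultdict(Counter)
    for gender, hobby in rows:
        counts[hobby][gender] += 1
    correct = sum(max(by_gender.values()) for by_gender in counts.values())
    return correct / len(rows)


print(proxy_accuracy(cvs))
```

With equally many men and women, an uninformative feature would score about 0.5. A clearly higher value means a model can still reconstruct gender from what is left in the data.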
The problem with confounding factors
• Semantic gap between
– What we want the AI to learn
– What the AI actually derives from the training data
AI does not necessarily learn what you think it’s learning!
• Mitigations:
– Better sampling of the training data
– Thorough (statistical) data analysis
Confounding factors
• Sometimes leads to surprising new insights!
Blood Meridian
(Cormac McCarthy)
Absalom, Absalom
(William Faulkner)
Data and bias
• Biased humans collect biased data
– https://en.wikipedia.org/wiki/List_of_cognitive_biases
– Often subconscious
• Biased data results in biased algorithms
Bias vs. Fairness
• Not all bias is unfair:
– Prostate cancer data is biased towards men
– Cervical cancer data is biased towards women
• Unfair bias can have serious consequences
– Security decisions (airport controls / inspections)
– Legal decisions (bail, parole)
– Economic decisions (insurance, mortgage)
→ Know your data!
Bias vs. Fairness
• Recommended reading:
Definition of objectives
• In many ML algorithms:
– Reward “success”
– Punish “failure”
– Goal: maximize the reward
• “Success” is hard to correctly define!
– AI exploits bugs
– AI exploits unexpected data properties
– AI gets stuck in endless loops
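A toy illustration of a misformulated goal, under invented assumptions: a cleaning robot is rewarded per unit of dirt picked up. The two behaviours and all numbers are hypothetical; the point is only that maximizing the written reward and meeting the intended goal can diverge.

```python
# Hypothetical outcomes of two behaviours a cleaning robot could learn.
behaviours = {
    "clean the room once": {
        "dirt_picked_up": 5,
        "room_clean_at_end": True,
    },
    "spill dirt and pick it up again, repeatedly": {
        "dirt_picked_up": 50,
        "room_clean_at_end": False,
    },
}


def reward(outcome):
    """Reward as the designer wrote it: one point per unit of dirt picked up."""
    return outcome["dirt_picked_up"]


def intended(outcome):
    """What the designer actually wanted: a clean room."""
    return outcome["room_clean_at_end"]


# The learner simply picks whatever maximizes the reward.
best = max(behaviours, key=lambda name: reward(behaviours[name]))
print(best, "| intended goal met:", intended(behaviours[best]))
```

The reward-maximizing behaviour is the one that never leaves the room clean, which combines "exploits bugs" and "gets stuck in endless loops" in a single case.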
Bad definition of objectives:
Producer vs. consumer objectives
• You: want to find good information
• YouTube: wants you to keep watching (ads)
→ promotes content that “pushes buttons”
– Conspiracy theories
– Sensationalism
– Disturbing content
– Extremism
– …
Takeaways
AI is only as reliable as the data it’s trained on.
AI maximizes its objective, nothing more.
Data Poisoning
• Inject false training data to compromise learning
– Intentionally mislabeled data
– Bogus data or noise
– E.g. Tay, Microsoft’s Twitter chatbot (2016), which users quickly taught to post offensive content
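How intentionally mislabeled training data can compromise learning can be sketched with a deliberately tiny model. The nearest-centroid classifier, the one-dimensional data and the poisoned points below are all illustrative assumptions, not a description of any real attack.

```python
from statistics import mean


def train(samples):
    """Nearest-centroid 'model': the mean feature value per label."""
    labels = {label for _, label in samples}
    return {
        label: mean([x for x, sample_label in samples if sample_label == label])
        for label in labels
    }


def predict(model, x):
    """Return the label whose centroid is closest to x."""
    return min(model, key=lambda label: abs(x - model[label]))


# Clean training data: (feature, label).
clean = [(1, 0), (2, 0), (3, 0), (7, 1), (8, 1), (9, 1)]
# Poison: points that look like class 1 but are labeled as class 0.
poison = [(9, 0), (10, 0), (11, 0), (12, 0)]

print(predict(train(clean), 6))           # 1
print(predict(train(clean + poison), 6))  # 0
```

Four injected points are enough to drag the class 0 centroid from 2 to about 6.9, so an input that was clearly class 1 is now assigned to class 0.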
Adversarial examples
• Minimal change to input → large change in output
Adversarial examples
• Problem in most AI methods,
regardless of data format
• Often robust
– Stickers on objects
– 2D/3D printed objects
• Potential causes
– Curse of dimensionality
– Overfitting / Limited generalization
• Adding one strong feature from another class is enough
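The link between high dimensionality and adversarial examples can be sketched on a linear classifier. The model below (100 features, alternating weights, a fixed bias) is an assumed toy setup; the perturbation moves each feature by at most `epsilon` along the sign of its weight, in the spirit of the fast gradient sign method, since for a linear score the gradient with respect to the input is the weight vector.

```python
# Assumed toy model: a linear classifier over 100 features.
weights = [1.0, -1.0] * 50
bias = -2.0


def score(x):
    """Linear score: weighted sum of the features plus the bias."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def classify(x):
    return 1 if score(x) > 0 else 0


def perturb(x, epsilon):
    """Shift every feature by epsilon in the direction that raises the score."""
    return [xi + epsilon * (1 if w > 0 else -1) for w, xi in zip(weights, x)]


x = [0.5] * 100
x_adv = perturb(x, 0.05)

print(classify(x), classify(x_adv))
print(max(abs(a - b) for a, b in zip(x, x_adv)))
```

No single feature moves by more than about 0.05, yet the score shifts by roughly `epsilon` times the sum of the absolute weights (here about 5), which is enough to flip the class. The more dimensions there are, the more such tiny changes add up.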
Takeaways
AI systems have (sometimes significant) error margins.
Spear phishing
• Fraudulent attempt to obtain sensitive information,
directed at a specific individual/company
• AI can personalize the message to the target
(as in advertising)
Disinformation (“fake news”)
• Definition (EC action plan against disinformation, 05/12/2018):
– Verifiably false or misleading information
– Disseminated for economic gain or to intentionally deceive
– May cause public harm
• It is not:
– (Extreme) political, scientific, ethical or moral viewpoints
– Unions, lobbying, advocacy, campaigning, …
– Selective presentation of information
– Satire, parody, …
– Religion
Can disinformation be generated?
• Image/video/audio: yes, kind of
cf. deepfakes:
Generating fake text
• Result of the latest research (OpenAI GPT-2, 14/02/2019):
– Grammar ✓
– Sticking to the theme ✓
– Internal consistency ✘
– Style ✘
→ Short texts might fool an inattentive human
Gimli was a tall and powerful man, and he had a beard and a moustache.
He was also a dwarf, and he had a strong build, and he was covered in
tattoos. He was not a man who looked like a hobbit.
Amplification through recommendation
• YouTube as the great radicalizer (Z. Tufekci)
– Videos about vegetarianism lead to veganism
– Videos about jogging lead to ultramarathons
• Similar on many other (free) content platforms
Amplification through recommendation
• Any content that is watched more
obtains a higher ranking in search results
Amplification through recommendation
• Inflammatory content is watched more, so it
obtains a higher ranking in search results
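The feedback loop between views and ranking can be sketched as a small deterministic simulation. Everything here is an assumption made for illustration: two items, new views handed out in proportion to past views times a click appeal, and inflammatory content being 20% more likely to be clicked. No real platform's ranking is being described.

```python
# Assumption: inflammatory content is 20% more likely to be clicked when shown.
appeal = {"calm": 1.0, "inflammatory": 1.2}


def simulate(rounds, new_views_per_round=1000.0):
    """Return the inflammatory item's share of all views after each round."""
    views = {"calm": 100.0, "inflammatory": 100.0}
    shares = []
    for _ in range(rounds):
        # Ranking weight = past views (popularity) times click appeal.
        weight = {item: views[item] * appeal[item] for item in views}
        total_weight = sum(weight.values())
        for item in views:
            views[item] += new_views_per_round * weight[item] / total_weight
        shares.append(views["inflammatory"] / sum(views.values()))
    return shares


shares = simulate(20)
print([round(s, 3) for s in shares[:5]], "...", round(shares[-1], 3))
```

Both items start with the same number of views, but the small difference in appeal is fed back into the ranking every round, so the inflammatory item's share keeps growing.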
Takeaways
It’s still “spam vs. spamfilter”, but on a higher level.
If it sounds too good to be true,
it’s probably too good to be true.
On the tech side
• Governance through FAT(E)
– Fairness
– Accountability
– Transparency
– ( Ethics )
Explainable AI
• Important factor in accountability
• Especially hard with deep learning:
– What did an AI learn?
– Why this outcome / decision?
→ Recent movement towards “Explainable AI”
• Still in its infancy
You against scams and disinformation
( images courtesy of https://heimdalsecurity.com/blog/fake-facebook-scams/ )
You against scams and disinformation
• Awareness
– You are being profiled
– What you see is not what someone else sees
– Anything you post can be used against you
• Rely on authoritative, transparent sources
– Peer-reviewed science
– Quality journalism
→ Encourage Digital Skepticism (without being paranoid)
→ Requires some Competences / Literacy
Legal protection against AI abuse
• GDPR (implemented in Belgium by the law of 30 July 2018)
For policymakers
• Awareness
– Own vulnerability to pre-selected information
– Reach and impact of social media
– Information warfare
• Stimulate
– Independent and quality media
– Innovation & research on the impact of innovation
– Culture of permanent learning
→ Fancy “strategy plans” are useless without actual €€€ !
What is happening in the EU?
• 03/2015: Stratcom Task Force (euvsdisinfo.eu)
• 03/2018: recommendations of the EU High-Level Expert Group (HLEG) on fake news and online disinformation
• 10/2018: EU code of practice on disinformation
– Signed by Google, Facebook, Twitter, Mozilla etc.
– (Initial) choice for industry self-regulation
• 12/2018: EU action plan on disinformation
Takeaways
digital skeptic > digital panic
Curiosity didn’t kill the cat, naivety did.
Further Reading
• The Malicious Use of Artificial Intelligence: Forecasting,
Prevention, and Mitigation (“Malicious AI report”, 02/2018)
• For a Meaningful Artificial Intelligence
(“Villani report”, 03/2018)
• Information Manipulation, A Challenge for
Our Democracies (CAPS & IRSEM, France, 08/2018)
• EU Action Plan against Disinformation (05/12/2018)
Smals, ICT for society
02 787 57 11
Fonsnylaan 20 / Avenue Fonsny 20
1060 Brussel / 1060 Bruxelles
Contact
Joachim Ganseman
joachim.ganseman@smals.be
Ukraine War presentation: KNOW THE BASICS
 

Joachim Ganseman - Pitfalls in AI - Infosecurity.be 2019

  • 1. Pitfalls in AI. Joachim Ganseman, 20/03/2019
  • 2. Smals Research (www.smalsresearch.be) • Robotic Process Automation • Conversational Interfaces • Productivity in AI • Web Scraping for Analytics • Data Quality • AI for Public Sector • NewSQL Databases • Advanced Cryptography • Blockchain • Innovation with new technologies • Consultancy & expertise • Internal & external knowledge transfer • Support for going live
  • 3. Today’s menu • From data to decision (screwing up your own AI) – Data collection issues (bias vs. fairness) – Data processing issues (confounding variables) – Goal (mis)formulation • Attacks against AI systems (someone screws with your AI) – Data poisoning – Adversarial examples • Abuse of AI systems (someone’s AI screws with you) – Spear phishing – (Personalized) disinformation – The role of recommender systems • Defense against the Dark Arts – Transparency & explainability – Digital skepticism – Policy
  • 4. About data • AI systems are trained on data → Garbage in, garbage out • Training data is ideally – Well-balanced – Free from hidden correlations – Independent and identically distributed (iid) over the domain • In reality, this is rarely the case!
  • 5. The curse of dimensionality • Collecting data is expensive • Covering every combination of every parameter … is never finished • Need lots of data quickly → Crowdsourcing? … only when controlled!
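To make the "never finished" point concrete, here is a minimal sketch. It assumes every parameter is discretised into the same number of values, so a full grid over d parameters needs k**d samples. The function name and the value k = 10 are illustrative assumptions, not figures from the talk.

```python
def combinations_to_cover(values_per_parameter: int, num_parameters: int) -> int:
    """Number of distinct parameter combinations in a full grid."""
    return values_per_parameter ** num_parameters


# Hypothetical setting: each parameter is discretised into 10 values.
for num_parameters in (1, 2, 5, 10, 20):
    total = combinations_to_cover(10, num_parameters)
    print(f"{num_parameters:>2} parameters -> {total:,} combinations")
```

Even at one sample per combination, the required dataset grows exponentially with the number of parameters, which is why exhaustive collection is rarely an option.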
  • 6. The danger of unbalanced datasets • Q: What offer to make to a new F engineer? • AI: – F are mostly Junior – Average wage for Junior F is 30000
       Employee  Diploma   Gender  Experience  Wage
       1         Business  M       Senior      50000
       2         Theatre   F       Junior      20000
       3         Theatre   F       Senior      25000
       4         Business  F       Junior      40000
       5         Engineer  M       Senior      50000
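The arithmetic behind this slide can be checked with a short sketch over the five rows of the table. The helper `average_wage` and the dictionary keys are names chosen here for illustration. The reading that a model falls back on gender and seniority because the data holds no F engineer is an interpretation of the slide, not a statement about any particular system.

```python
from statistics import mean

# The five rows from the slide's table.
employees = [
    {"id": 1, "diploma": "Business", "gender": "M", "experience": "Senior", "wage": 50000},
    {"id": 2, "diploma": "Theatre", "gender": "F", "experience": "Junior", "wage": 20000},
    {"id": 3, "diploma": "Theatre", "gender": "F", "experience": "Senior", "wage": 25000},
    {"id": 4, "diploma": "Business", "gender": "F", "experience": "Junior", "wage": 40000},
    {"id": 5, "diploma": "Engineer", "gender": "M", "experience": "Senior", "wage": 50000},
]


def average_wage(rows, **conditions):
    """Mean wage over the rows matching all conditions, or None if no row matches."""
    wages = [r["wage"] for r in rows if all(r[k] == v for k, v in conditions.items())]
    return mean(wages) if wages else None


junior_f = average_wage(employees, gender="F", experience="Junior")
f_engineers = average_wage(employees, gender="F", diploma="Engineer")
engineers = average_wage(employees, diploma="Engineer")

print(junior_f)     # 30000
print(f_engineers)  # None: the data contains no F engineer at all
print(engineers)    # 50000, based on a single M senior
```

With no matching example, any offer for an F engineer has to be extrapolated from other groups, and the only engineer wage in the data comes from one M senior.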
  • 7. Hidden correlations • We’ll fix it by not taking gender into account, right? • … well… – Men/women have different ways of speaking – In CVs, men/women mention different things (hobbies…) → Gender as prominent confounding factor in Amazon’s model
  • 8. The problem with confounding factors • Semantic gap between – What we want the AI to learn – What the AI actually derives from the training data • AI does not necessarily learn what you think it’s learning! • Mitigations: – Better sampling of the training data – Thorough (statistical) data analysis
  • 9. Confounding factors • Sometimes lead to surprising new insights! • Figure labels: Blood Meridian (Cormac McCarthy); Absalom, Absalom (William Faulkner)
  • 10. Data and bias • Biased humans collect biased data – https://en.wikipedia.org/wiki/List_of_cognitive_biases – Often subconscious • Biased data results in biased algorithms
  • 11. Bias vs. Fairness • Not all bias is unfair: – Prostate cancer data is biased towards men – Cervical cancer data is biased towards women • Unfair bias can have serious consequences – Security decisions (airport controls / inspections) – Legal decisions (bail, parole) – Economic decisions (insurance, mortgage) → Know your data!
  • 12. Bias vs. Fairness • Recommended reading:
  • 13. Definition of objectives • In many ML algorithms: – Reward “success” – Punish “failure” – Goal: maximize the reward • “Success” is hard to correctly define! – AI exploits bugs – AI exploits unexpected data properties – AI gets stuck in endless loops
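A minimal sketch of how a poorly defined reward can favour an endless loop. The scenario (a race with a respawning bonus), the action names, and the reward values are all hypothetical and chosen only to illustrate the point. They do not describe a specific system from the talk.

```python
def total_reward(plan):
    """Sum of per-step rewards under the (mis)specified reward function."""
    rewards = {"collect_bonus": 3, "cross_finish_line": 10}
    return sum(rewards[action] for action in plan)


horizon = 20  # hypothetical number of time steps in one episode

# Plan the designer had in mind: a few bonuses on the way, then finish the race.
intended = ["collect_bonus"] * 2 + ["cross_finish_line"]

# Plan a pure reward maximizer prefers: circle around a respawning bonus forever.
looping = ["collect_bonus"] * horizon

print(total_reward(intended))  # 16
print(total_reward(looping))   # 60
```

Nothing in the reward function says that finishing matters more than collecting, so the looping plan is the better one by the only measure the agent has.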
  • 14. Bad definition of objectives:
  • 15. Producer vs. consumer objectives • You: want to find good information • YouTube: wants you to keep watching (ads) → promotes content that “pushes buttons” – Conspiracy theories – Sensationalism – Disturbing content – Extremism – …
  • 16. Takeaways • AI is only as reliable as the data it’s trained on. • AI maximizes its objective, nothing more.
  • 18. Data Poisoning • Inject false training data to compromise learning – Intentionally mislabeled data – Bogus data or noise – E.g. Tay (Microsoft’s Twitter chatbot, 2016)
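The effect of intentionally mislabeled data can be sketched with a toy one-dimensional nearest-centroid classifier. The data points, labels, and function names below are made up for illustration. Real attacks target far more complex models, but the mechanism is the same: the injected points shift what the model learns.

```python
from statistics import mean


def train_centroids(samples):
    """Return the mean feature value per label (a 1-D nearest-centroid model)."""
    by_label = {}
    for value, label in samples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}


def predict(centroids, value):
    """Assign the label whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


# Hypothetical 1-D training set: (feature value, label).
clean = [(1.0, "benign"), (2.0, "benign"), (3.0, "benign"),
         (7.0, "malicious"), (8.0, "malicious"), (9.0, "malicious")]

# The attacker injects intentionally mislabeled points.
poison = [(9.0, "benign"), (10.0, "benign"), (11.0, "benign"), (12.0, "benign")]

clean_model = train_centroids(clean)
poisoned_model = train_centroids(clean + poison)

probe = 6.5
print(predict(clean_model, probe))     # malicious
print(predict(poisoned_model, probe))  # benign
```

Four poisoned points are enough to drag the "benign" centroid from 2.0 to about 6.9, so the same probe is classified differently.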
  • 19. Adversarial examples • Minimal change to input → large change in output
  • 20. Adversarial examples • Problem in most AI methods, regardless of data format • Often robust in the physical world – Stickers on objects – 2D/3D printed objects • Potential causes – Curse of dimensionality – Overfitting / Limited generalization • Adding one strong feature from another class is enough
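One of the potential causes listed on the slide, the curse of dimensionality, can be illustrated with a toy linear classifier. This is a sketch under stated assumptions: the weights, the input, and the value of `epsilon` are invented, and a linear score stands in for a real model. The point is only that many tiny per-feature changes add up.

```python
def score(weights, inputs):
    """Linear classifier score; positive means class B, otherwise class A."""
    return sum(w * x for w, x in zip(weights, inputs))


def sign(value):
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def perturb(weights, inputs, epsilon):
    """Shift every input by at most epsilon in the direction that raises the score."""
    return [x + epsilon * sign(w) for w, x in zip(weights, inputs)]


# Hypothetical model: 1000 features with weights alternating +1 / -1.
dimensions = 1000
weights = [1.0 if i % 2 == 0 else -1.0 for i in range(dimensions)]

# Hypothetical input, constructed so that the clean score is -5 (class A).
inputs = [0.0] * dimensions
inputs[0] = -5.0

epsilon = 0.01
adversarial = perturb(weights, inputs, epsilon)

print(score(weights, inputs))                 # -5.0
print(round(score(weights, adversarial), 6))  # 5.0
```

No single feature moves by more than 0.01, yet the score shifts by about `epsilon * dimensions = 10` and the predicted class flips. With more dimensions, an even smaller `epsilon` would suffice.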
  • 21. Takeaways • AI systems have (sometimes significant) error margins.
  • 23. Spear phishing • Fraudulent attempt to obtain sensitive information, directed at a specific individual/company • AI can personalize the message to the target (as in advertising)
  • 24. Disinformation (“fake news”) • Definition (EC action plan against disinformation, 05/12/2018): – Verifiably false or misleading information – Disseminated for economic gain or to intentionally deceive – May cause public harm • It is not: – (Extreme) political, scientific, ethical or moral viewpoints – Unions, lobbying, advocacy, campaigning, … – Selective presentation of information – Satire, parody, … – Religion
  • 25. Can disinformation be generated? • Image/video/audio: yes, kind of (cf. deepfakes)
  • 26. Generating fake text • Result of the latest research (OpenAI GPT-2, 14/02/2019): – Grammar ✓ – Sticking to the theme ✓ – Internal consistency ✘ – Style ✘ → Short texts might fool an inattentive human • Sample: “Gimli was a tall and powerful man, and he had a beard and a moustache. He was also a dwarf, and he had a strong build, and he was covered in tattoos. He was not a man who looked like a hobbit.”
  • 27. Amplification through recommendation • YouTube as the great radicalizer (Z. Tufekci) – Videos about vegetarianism lead to veganism – Videos about jogging lead to ultramarathons • Similar on many other (free) content platforms
  • 28. Amplification through recommendation • Any content that is watched more obtains a higher ranking in search results
  • 29. Amplification through recommendation • Inflammatory content is watched more, and so obtains a higher ranking in search results
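The feedback loop described on these slides can be sketched as a small deterministic simulation. Everything here is an assumption made for illustration: two videos, exposure proportional to accumulated views, and an inflammatory video that is clicked 20% more often when shown. It is not a model of any real platform's ranking.

```python
def simulate(views, appeal, rounds, audience=1000):
    """Each round, split the audience over items in proportion to views * appeal."""
    views = dict(views)  # work on a copy so the starting point is not modified
    for _ in range(rounds):
        weights = {name: views[name] * appeal[name] for name in views}
        total = sum(weights.values())
        for name in views:
            views[name] += audience * weights[name] / total
    return views


# Hypothetical starting point: both videos have the same number of views,
# but the inflammatory one is assumed to be clicked 20% more often when shown.
start = {"calm": 100.0, "inflammatory": 100.0}
appeal = {"calm": 1.0, "inflammatory": 1.2}

for rounds in (0, 1, 10, 100):
    result = simulate(start, appeal, rounds)
    share = result["inflammatory"] / sum(result.values())
    print(f"after {rounds:>3} rounds: inflammatory share of views = {share:.2f}")
```

Because views feed back into exposure, a small difference in appeal compounds: the inflammatory video's share of all views keeps growing from round to round instead of settling at a fixed advantage.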
  • 30. Takeaways • It’s still “spam vs. spamfilter”, but on a higher level. • If it sounds too good to be true, it’s probably too good to be true.
  • 32. On the tech side • Governance through FAT(E) – Fairness – Accountability – Transparency – (Ethics)
  • 33. Explainable AI • Important factor in accountability • Especially hard with deep learning: – What did an AI learn? – Why this outcome / decision? → Recent movement towards “Explainable AI” • Still in its infancy
  • 34. You against scams and disinformation (images courtesy of https://heimdalsecurity.com/blog/fake-facebook-scams/)
  • 35. You against scams and disinformation • Awareness – You are being profiled – What you see is not what someone else sees – Anything you post can be used against you • Rely on authoritative, transparent sources – Peer-reviewed science – Quality journalism → Encourage Digital Skepticism (without being paranoid) → Requires some Competences / Literacy
  • 36. Legal protection against AI abuse • GDPR (implemented in Belgium by the law of 30 July 2018)
  • 37. For policymakers • Awareness – Own vulnerability to pre-selected information – Reach and impact of social media – Information warfare • Stimulate – Independent and quality media – Innovation & research on the impact of innovation – Culture of permanent learning → Fancy “strategy plans” are useless without actual €€€ !
  • 38. What is happening in the EU? • 03/2015: Stratcom Task Force (euvsdisinfo.eu) • 03/2018: recommendations of EU HLEG • 10/2018: EU code of practice on disinformation – Signed by Google, Facebook, Twitter, Mozilla etc. – (Initial) choice for industry self-regulation • 12/2018: EU action plan on disinformation
  • 39. Takeaways • digital skeptic > digital panic • Curiosity didn’t kill the cat, naivety did.
  • 40. Further Reading • The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation (“Malicious AI report”, 02/2018) • For a Meaningful Artificial Intelligence (“Villani report”, 03/2018) • Information Manipulation, A Challenge for Our Democracies (CAPS & IRSEM, France, 08/2018) • EU Action Plan against Disinformation (05/12/2018)
  • 41. Smals, ICT for society • 02 787 57 11 • Fonsnylaan 20 / Avenue Fonsny 20, 1060 Brussel / 1060 Bruxelles • Contact: Joachim Ganseman, joachim.ganseman@smals.be