Animesh Singh
Svetlana Levitan
IBM Center for Open Data and AI Technologies
(CODAIT)
Defending deep learning
from adversarial attacks
Animesh Singh
STSM
Lead for IBM Watson and Cloud Platform
Member of IBM Academy of Technology
MS in Software Engineering from University of Texas, Dallas
@AnimeshSingh
Svetlana Levitan
Developer Advocate with IBM CODAIT
Software Engineer for SPSS analytic components (2000-2018)
IBM Representative to the Data Mining Group
PhD in Applied Math and MS in CS from University of Maryland, College Park
Originally from Moscow, Russia
@SvetaLevitan
Deep Learning
Adversarial Attacks
Research summary
Adversarial Robustness Toolbox
An Example
Digital Business Group / © 2019 IBM Corporation
Very brief introduction to Deep Learning
Perceptron 1957 by Frank Rosenblatt
Deep Learning models are now used in many areas
Can we trust them?
A scarier example (from https://arxiv.org/pdf/1707.08945.pdf)
Adversarial machine learning
Very active area of research since ~2013
Evasion attack: a very small change in input to cause misclassification
- White-box attack or black-box attack (may use a surrogate model and transferability)
Adversarial defence: model hardening and runtime detection of adversarial inputs
- Model hardening: augment training data with adversarial examples or preprocess inputs
Poisoning attacks – manipulated training data
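To make the idea of an evasion attack concrete, here is a minimal sketch of the Fast Gradient Sign Method applied to a tiny logistic-regression classifier. The weights, bias, input, and `eps` value are hypothetical toy numbers chosen for illustration, not anything from the talk or from ART. This is a white-box attack because it reads the model weights directly.

```python
import math


def predict_proba(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def fgsm(w, b, x, y, eps):
    """Fast Gradient Sign Method for logistic regression.

    For cross-entropy loss, the gradient with respect to the input is
    (p - y) * w. The attack shifts every feature by eps in the direction
    of the sign of that gradient, which increases the loss.
    """
    p = predict_proba(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * ((g > 0) - (g < 0)) for xi, g in zip(x, grad)]


# Hypothetical toy model and input (assumptions for illustration only).
w, b = [2.0, -3.0, 1.0], 0.1
x, y = [0.2, 0.1, 0.1], 1

x_adv = fgsm(w, b, x, y, eps=0.15)
print("clean prediction:      ", predict_proba(w, b, x))      # above 0.5
print("adversarial prediction:", predict_proba(w, b, x_adv))  # below 0.5
```

No feature moves by more than `eps`, yet the predicted class flips from 1 to 0.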
IBM Adversarial Robustness Toolbox
https://github.com/IBM/adversarial-robustness-toolbox
https://ibm.biz/Bd2fd8
Includes many attack and defense methods, plus methods for detecting adversarial samples or poisoning
Developed by an IBM Research group led by Irina Nicolae and Mathieu Sinn (Ireland)
Types of adversarial attacks in latest version (0.4.0)
DeepFool (Moosavi-Dezfooli et al., 2015)
Fast Gradient Method (Goodfellow et al., 2014)
Basic Iterative Method (Kurakin et al., 2016)
Projected Gradient Descent (Madry et al., 2017)
Jacobian Saliency Map (Papernot et al., 2016)
Universal Perturbation (Moosavi-Dezfooli et al., 2016)
Virtual Adversarial Method (Miyato et al., 2015)
C&W Attack (Carlini and Wagner, 2016)
NewtonFool (Jang et al., 2017)
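Several of the attacks listed above are iterative refinements of the Fast Gradient Method. As a rough sketch, the Basic Iterative Method takes several small signed-gradient steps and projects the result back into an `eps`-ball around the original input after each step. Projected Gradient Descent follows the same loop, typically starting from a random point inside the ball. The toy logistic-regression model and all numbers below are assumptions for illustration, not ART code.

```python
import math


def predict_proba(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def basic_iterative_method(w, b, x, y, eps, step, n_iter):
    """Repeat small signed-gradient steps, projecting back into the eps-ball."""
    x_adv = list(x)
    for _ in range(n_iter):
        p = predict_proba(w, b, x_adv)
        # Gradient of the cross-entropy loss with respect to the input.
        grad = [(p - y) * wi for wi in w]
        x_adv = [xa + step * ((g > 0) - (g < 0)) for xa, g in zip(x_adv, grad)]
        # Projection: keep every feature within eps of the original value.
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv


# Hypothetical toy model and input (assumptions for illustration only).
w, b = [2.0, -3.0, 1.0], 0.1
x, y = [0.2, 0.1, 0.1], 1

x_adv = basic_iterative_method(w, b, x, y, eps=0.15, step=0.05, n_iter=5)
print("clean prediction:      ", predict_proba(w, b, x))
print("adversarial prediction:", predict_proba(w, b, x_adv))
```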
Types of defense methods in ART
Feature squeezing (Xu et al., 2017)
Spatial smoothing (Xu et al., 2017)
Label smoothing (Warde-Farley and Goodfellow, 2016)
Adversarial training (Szegedy et al., 2013)
Virtual adversarial training (Miyato et al., 2015)
Gaussian data augmentation (Zantedeschi et al., 2017)
Thermometer encoding (Buckman et al., 2018)
Total variance minimization (Guo et al., 2018)
JPEG compression (Dziugaite et al., 2016)
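Of the defenses above, feature squeezing is the easiest to sketch in a few lines. One common form reduces the bit depth of each pixel, so that small perturbations get rounded away before the input reaches the model. The function name and the pixel values below are hypothetical. This is an illustration of the idea, not the ART implementation.

```python
import math


def squeeze_bit_depth(pixels, bit_depth):
    """Quantize pixel intensities in [0, 1] to 2**bit_depth levels."""
    levels = 2 ** bit_depth - 1
    # floor(v + 0.5) rounds half up, avoiding Python's banker's rounding.
    return [math.floor(p * levels + 0.5) / levels for p in pixels]


# Hypothetical clean pixels and a slightly perturbed copy.
clean = [0.10, 0.40, 0.80]
perturbed = [0.12, 0.38, 0.82]

print(squeeze_bit_depth(clean, bit_depth=3))
print(squeeze_bit_depth(perturbed, bit_depth=3))
```

With 3 bits per pixel, both inputs map to the same quantized values, so a model placed after the squeezer cannot tell them apart. A stronger perturbation that crosses a quantization boundary would survive, which is why the bit depth has to be tuned.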
Poisoning detection
• Detection based on clustering activations
• Proof of attack strategy
Evasion detection
• Detector based on inputs
• Detector based on activations
Robustness metrics
• CLEVER
• Empirical robustness
• Loss sensitivity
Unified model API
• Training
• Prediction
• Access to loss and prediction gradients
Evasion defenses
• Feature squeezing
• Spatial smoothing
• Label smoothing
• Adversarial training
• Virtual adversarial training
• Thermometer encoding
• Gaussian data augmentation
Evasion attacks
• FGSM
• JSMA
• BIM
• PGD
• Carlini & Wagner
• DeepFool
• NewtonFool
• Universal perturbation
Implementations of state-of-the-art methods for attacking and defending classifiers.
Jupyter notebook with an example
https://nbviewer.jupyter.org/github/IBM/adversarial-robustness-toolbox/blob/master/notebooks/attack_defense_imagenet.ipynb
ResNet50 -
Load an ImageNet example image
Apply a defence method
Now we can play with the demo
https://art-demo.mybluemix.net
In case there are network problems, here is what you could see
Continue playing with the demo: feature squeezing is less effective here unless it is set to high
Gaussian Noise gives correct prediction, but with lower confidence
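The Gaussian noise defense seen in the demo can be sketched as follows: add zero-mean Gaussian noise to each sample and clip the result to the valid pixel range. The version below returns the original samples together with one noisy copy of each, which is how the technique is commonly used for data augmentation. The function name, the `sigma` value, and the fixed seed are assumptions for illustration, not the demo's actual code.

```python
import random


def gaussian_augment(samples, sigma, seed=0, clip=(0.0, 1.0)):
    """Return the original samples followed by one noisy copy of each."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    lo, hi = clip
    noisy = [
        [min(max(v + rng.gauss(0.0, sigma), lo), hi) for v in sample]
        for sample in samples
    ]
    return samples + noisy


# Hypothetical two-sample "dataset" with pixel values in [0, 1].
samples = [[0.0, 0.5, 1.0], [0.2, 0.8, 0.4]]
augmented = gaussian_augment(samples, sigma=0.1)
print(len(augmented), "samples after augmentation")
```

The noise blurs the fine-grained perturbation an attacker relies on, but it also blurs the legitimate signal. That trade-off is one plausible reason the prediction can stay correct while its confidence drops.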
The code behind all this is quite simple
Watson Studio (formerly Data Science Experience)
ART is used in Watson Studio along with many other open source modules
Conclusions
Adversarial attacks present a serious threat
ART is an open source library of tools for protection from such attacks
Works with TensorFlow, Keras, PyTorch, and MXNet
Developed by IBM Research Ireland: Irina Nicolae, Mathieu Sinn
Current version 0.4.0
Links
https://github.com/IBM/adversarial-robustness-toolbox : https://ibm.biz/Bd2fd8
https://developer.ibm.com/code/open/projects/adversarial-robustness-toolbox/
https://ibm.biz/Bd2fdV
https://art-demo.mybluemix.net : https://ibm.biz/Bd2fdn
Example notebooks: https://ibm.biz/Bd2fnF
Email: singhan@us.ibm.com, slevitan@us.ibm.com
Twitter: @AnimeshSingh, @SvetaLevitan
Carlini and Wagner paper 2017
Thank you.

Editor's Notes

  1. Inspired by the brain. A perceptron can't do XOR, only linearly separable problems (Marvin Minsky and Seymour Papert, 1969). A multi-layer perceptron with a nonlinear activation function can approximate any continuous function (Hornik et al., 1989). Backpropagation works, but could be very slow for large networks. Recently deep networks became practical thanks to progress in hardware and algorithms. Convolutional networks are based on how the retina works; the picture is from Wikipedia. Modern networks include convolutional, pooling, and ReLU layers.
  2. Here we will consider image recognition models, but adversarial attacks also happen on other models, e.g. speech recognition.
  3. Open source Python library
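
The XOR limitation mentioned in note 1 can be reproduced with a few lines of plain Python. The sketch below trains a single perceptron with the classic perceptron learning rule on AND (linearly separable) and on XOR (not linearly separable). The function names, epoch count, and learning rate are arbitrary choices made for illustration.

```python
def step(w, b, x1, x2):
    """Threshold activation of a two-input perceptron."""
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0


def train_perceptron(data, epochs=50, lr=1.0):
    """Train with the perceptron learning rule and return (weights, bias)."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), target in data:
            err = target - step(w, b, x1, x2)
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b


def accuracy(w, b, data):
    """Fraction of samples the perceptron classifies correctly."""
    return sum(step(w, b, x1, x2) == t for (x1, x2), t in data) / len(data)


AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

print("AND accuracy:", accuracy(*train_perceptron(AND), AND))
print("XOR accuracy:", accuracy(*train_perceptron(XOR), XOR))
```

The perceptron fits AND exactly. On XOR it can never classify all four points correctly, because no single line separates the two classes.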