This document summarizes IBM's Adversarial Robustness Toolbox (ART), an open source library for defending deep learning models against adversarial attacks. ART includes methods for attacking models, such as the Fast Gradient Method, and for defending them with approaches like adversarial training. It supports TensorFlow, Keras, PyTorch, and MXNet. The document outlines the types of attacks and defenses in ART, points to an example Jupyter notebook, and notes that ART is used in IBM's Watson Studio platform. It closes with background notes on neural networks and adversarial machine learning.
2. Animesh Singh
STSM (Senior Technical Staff Member), Lead for IBM Watson and Cloud Platform
Member of IBM Academy of Technology
MS in Software Engineering from the University of Texas at Dallas
@AnimeshSingh
Svetlana Levitan
Developer Advocate with IBM CODAIT
Software Engineer for SPSS analytic components (2000-2018)
IBM Representative to the Data Mining Group
PhD in Applied Math and MS in CS from the University of Maryland, College Park
Originally from Moscow, Russia
@SvetaLevitan
7. Adversarial machine learning
Very active area of research since ~2013
Evasion attack: a very small change to an input causes misclassification (see the sketch below)
- White-box or black-box attack (a black-box attack may use a surrogate model and transferability)
Adversarial defense: model hardening and runtime detection of adversarial inputs
- Model hardening: augment training data with adversarial examples, or preprocess inputs
Poisoning attack: manipulated training data
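To make the evasion-attack idea concrete, here is a minimal sketch of the Fast Gradient Sign Method written in plain TensorFlow (not ART's implementation); `model` is assumed to be a TF2 Keras classifier, `x` a batch of images scaled to [0, 1], and `y` the one-hot labels:

```python
import tensorflow as tf

def fgsm_perturb(model, x, y, eps=0.1):
    """Return x plus a small perturbation that increases the loss.

    Assumes a Keras classifier `model`, inputs in [0, 1], and
    one-hot labels `y` (all placeholders for illustration).
    """
    loss_fn = tf.keras.losses.CategoricalCrossentropy()
    x = tf.convert_to_tensor(x)
    with tf.GradientTape() as tape:
        tape.watch(x)
        loss = loss_fn(y, model(x))
    grad = tape.gradient(loss, x)
    # One small step in the direction that maximally increases the
    # loss; clipping keeps the result a valid image
    return tf.clip_by_value(x + eps * tf.sign(grad), 0.0, 1.0)
```

Even a perturbation that is invisible to a human (e.g. eps = 0.01 on [0, 1] pixel values) can be enough to flip the predicted class.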
8. IBM Adversarial Robustness Toolbox
https://github.com/IBM/adversarial-robustness-toolbox
https://ibm.biz/Bd2fd8
Includes many attack and defense methods, as well as methods for detecting adversarial samples and poisoning
Developed by an IBM Research group led by Irina Nicolae and Mathieu Sinn (Ireland)
9. Types of adversarial attacks in the latest version (0.4.0)
DeepFool (Moosavi-Dezfooli et al., 2015)
Fast Gradient Method (Goodfellow et al., 2014)
Basic Iterative Method (Kurakin et al., 2016)
Projected Gradient Descent (Madry et al., 2017)
Jacobian Saliency Map (Papernot et al., 2016)
Universal Perturbation (Moosavi-Dezfooli et al., 2016)
Virtual Adversarial Method (Miyato et al., 2015)
C&W Attack (Carlini and Wagner, 2016)
NewtonFool (Jang et al., 2017)
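As a hedged sketch of how these attacks are invoked, the example below runs the Fast Gradient Method through ART. It uses the current ART 1.x module paths (`art.estimators.classification`, `art.attacks.evasion`); version 0.4.0 discussed here used slightly different ones (e.g. `art.attacks`, `art.classifiers`), so treat this as illustrative rather than exact for 0.4.0:

```python
import numpy as np
import tensorflow as tf

# ART's KerasClassifier expects graph mode when used with TF2
tf.compat.v1.disable_eager_execution()

from art.estimators.classification import KerasClassifier
from art.attacks.evasion import FastGradientMethod

# Small illustrative CNN on MNIST
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., None].astype("float32") / 255.0
x_test = x_test[..., None].astype("float32") / 255.0
y_train_oh = tf.keras.utils.to_categorical(y_train, 10)

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu",
                           input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train_oh, epochs=1, batch_size=128)

# Wrap the model for ART and craft adversarial test images
classifier = KerasClassifier(model=model, clip_values=(0.0, 1.0))
attack = FastGradientMethod(estimator=classifier, eps=0.1)
x_adv = attack.generate(x=x_test)

# Accuracy typically drops sharply on the adversarial inputs
acc_clean = np.mean(np.argmax(classifier.predict(x_test), 1) == y_test)
acc_adv = np.mean(np.argmax(classifier.predict(x_adv), 1) == y_test)
print(f"clean: {acc_clean:.3f}  adversarial: {acc_adv:.3f}")
```

The other attacks listed above follow the same pattern: construct the attack object around the wrapped classifier, then call `generate`.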
10. Types of defense methods in ART
Feature squeezing (Xu et al., 2017)
Spatial smoothing (Xu et al., 2017)
Label smoothing (Warde-Farley and Goodfellow, 2016)
Adversarial training (Szegedy et al., 2013)
Virtual adversarial training (Miyato et al., 2015)
Gaussian data augmentation (Zantedeschi et al., 2017)
Thermometer encoding (Buckman et al., 2018)
Total variance minimization (Guo et al., 2018)
JPEG compression (Dziugaite et al., 2016)
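Below is a hedged sketch of two of these defenses in ART, again using the ART 1.x module layout (`art.defences`), which differs from 0.4.0; it reuses `classifier`, `attack`, and the MNIST arrays from the attack sketch above:

```python
from art.defences.preprocessor import SpatialSmoothing
from art.defences.trainer import AdversarialTrainer

# Input preprocessing: spatial smoothing applies a local median
# filter, which often removes small adversarial perturbations
smoother = SpatialSmoothing(window_size=3)
x_adv_smoothed, _ = smoother(x_adv)
acc = np.mean(np.argmax(classifier.predict(x_adv_smoothed), 1) == y_test)
print("accuracy on smoothed adversarial inputs:", acc)

# Model hardening: adversarial training mixes attack-generated
# samples into the training batches (ratio = adversarial fraction)
trainer = AdversarialTrainer(classifier, attacks=attack, ratio=0.5)
trainer.fit(x_train, y_train_oh, nb_epochs=1, batch_size=128)
```

This illustrates the two defense families from slide 7: preprocessing the inputs versus hardening the model itself.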
11. Poisoning detection
• Detection based on clustering activations
• Proof of attack strategy

Evasion detection
• Detector based on inputs
• Detector based on activations

Robustness metrics
• CLEVER
• Empirical robustness
• Loss sensitivity

Unified model API
• Training
• Prediction
• Access to loss and prediction gradients

Evasion defenses
• Feature squeezing
• Spatial smoothing
• Label smoothing
• Adversarial training
• Virtual adversarial training
• Thermometer encoding
• Gaussian data augmentation

Evasion attacks
• FGSM
• JSMA
• BIM
• PGD
• Carlini & Wagner
• DeepFool
• NewtonFool
• Universal perturbation
Implementations of state-of-the-art methods for attacking and defending classifiers.
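The robustness metrics above can also be computed programmatically. The following is a hedged sketch using `art.metrics` as it appears in ART 1.x (exact names and signatures may differ in 0.4.0), reusing `classifier` and the MNIST test set from the attack sketch:

```python
from art.metrics import clever_u, empirical_robustness

# Untargeted CLEVER score: an attack-independent estimate of the
# minimal perturbation needed to change one sample's prediction
score = clever_u(classifier, x_test[0], nb_batches=10, batch_size=100,
                 radius=0.3, norm=2)
print("CLEVER score:", score)

# Empirical robustness: average perturbation size a concrete attack
# (here FGSM) needs to change the model's predictions
er = empirical_robustness(classifier, x_test[:100], attack_name="fgsm",
                          attack_params={"eps": 0.1})
print("empirical robustness (FGSM):", er)
```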
12. Jupyter notebook with an example
https://nbviewer.jupyter.org/github/IBM/adversarial-robustness-toolbox/blob/master/notebooks/attack_defense_imagenet.ipynb
24. Watson Studio (formerly Data Science Experience)
ART is used in Watson Studio along with many other open source modules
25. Conclusions
Adversarial attacks present a serious threat
ART is an open source library of tools for protection from such attacks
Works with TensorFlow, Keras, PyTorch, and MXNet
Developed by IBM Research Ireland: Irina Nicolae, Mathieu Sinn
Current version: 0.4.0
Inspired by the brain
A perceptron can't do XOR, only linearly separable problems (Marvin Minsky and Seymour Papert, 1969)
A multi-layer perceptron with a nonlinear activation function can approximate any function (Hornik et al., 1989)
Backpropagation works, but can be very slow for large networks
Recently, deep networks became practical thanks to progress in hardware and algorithms
Convolutional networks are based on how the retina works
Modern networks include convolutional, pooling, and ReLU layers
Here we consider image recognition models, but adversarial attacks happen on other models too, e.g. speech recognition
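To illustrate the XOR point above, here is a small hedged sketch (using scikit-learn, which is an assumption not made in the original) showing that a single perceptron fails on XOR while a small nonlinear MLP fits it:

```python
import numpy as np
from sklearn.linear_model import Perceptron
from sklearn.neural_network import MLPClassifier

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])  # XOR labels: not linearly separable

# A linear perceptron can classify at most 3 of the 4 points
lin = Perceptron(max_iter=1000).fit(X, y)
print("perceptron accuracy:", lin.score(X, y))  # <= 0.75

# One hidden layer with a nonlinear activation solves XOR
mlp = MLPClassifier(hidden_layer_sizes=(4,), activation="tanh",
                    solver="lbfgs", random_state=0).fit(X, y)
print("MLP accuracy:", mlp.score(X, y))  # typically 1.0
```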