UNIVERSAL ADVERSARIAL PERTURBATION
2018 SNU VL Seminar
Hyunwoo Kim
CONTENTS
1. Quick Intro to Adversarial Attacks
- DeepFool: a simple and accurate method to fool deep neural networks
- Explaining and Harnessing Adversarial Examples
2. Universal Adversarial Perturbation
Attacks?
Through the human eye
(illusion figure; Anderson & Winawer, 2005)
Through the machine's eye
(figure labels: iPod, Boat, Perturbation)
Adversarial Example or Attack
Intriguing Properties of Neural Networks
C. Szegedy et al. ICLR, 2014.
Explaining and Harnessing Adversarial Examples
I.J. Goodfellow et al. ICLR, 2015.
Fast Gradient Sign Method [FGSM]
!" = " + %
&
Fast Gradient Sign Method [FGSM]
η : the perturbation
∇_x J(θ, x, y) : gradient of the cost function w.r.t. the input
Fast Gradient Sign Method [FGSM]
Raw images vs. attacked images
with a Maxout network
Misclassification rate on MNIST: 89.4%
Fast Gradient Sign Method [FGSM]
Raw images vs. attacked images
with a convolutional Maxout network
Misclassification rate on CIFAR-10: 87.2%
Fast Gradient Sign Method [FGSM]
ε : intensity of the attack
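The FGSM update is a single gradient-sign step of size ε. A minimal NumPy sketch on a toy logistic-regression loss (the weights, input, and loss function here are illustrative, not from the talk):

```python
import numpy as np

def fgsm_perturb(x, grad_x, eps):
    """FGSM: x_adv = x + eps * sign(gradient of the cost w.r.t. the input x)."""
    return x + eps * np.sign(grad_x)

def logistic_loss(w, x, y):
    """J = log(1 + exp(-y * w.x)) for a linear model with label y in {-1, +1}."""
    return np.log1p(np.exp(-y * (w @ x)))

# Toy model and input (illustrative values).
w = np.array([1.0, -2.0, 0.5])
x = np.array([0.3, 0.1, -0.4])
y = 1.0

# Gradient of the loss w.r.t. the INPUT (not the weights):
# dJ/dx = -y * w / (1 + exp(y * w.x)).
grad_x = -y * w / (1.0 + np.exp(y * (w @ x)))

x_adv = fgsm_perturb(x, grad_x, eps=0.1)
# Every coordinate moves by exactly eps, yet the loss increases.
```

Each pixel changes by at most ε, so the attack is visually imperceptible for small ε while still raising the loss.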
Various methods of attack
cleverhans: a Python library for adversarial attacks (Ian Goodfellow)
https://github.com/tensorflow/cleverhans
“They say that the best weapon
is the one you never have to fire.
I respectfully disagree.
I prefer the weapon
you only have to fire ONCE.”
-Tony Stark-
Universal
Adversarial
Perturbations
S.M. Moosavi-Dezfooli et al.
CVPR, 2017.
How to make one
• μ : a distribution of images in ℝ^d
• k̂ : a classification function that outputs an estimated label k̂(x) for each image x ∈ ℝ^d
We seek a vector v such that
How to make one:
constraints for the perturbation
We seek a vector v such that
1. Satisfies the fooling rate:  P_{x∼μ}( k̂(x + v) ≠ k̂(x) ) ≥ 1 − δ
2. Small enough:  ‖v‖_p ≤ ξ
(figure: v built up from incremental updates Δv₁, Δv₂, …)
How to make one:
the algorithm
1. Satisfies the fooling rate
2. Has to be small enough
Projection onto the ℓ_p ball of radius ξ
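The projection step that keeps v small can be sketched in NumPy; `xi` stands for the radius ξ, and only p ∈ {2, ∞} are handled in this sketch:

```python
import numpy as np

def project_lp(v, xi, p):
    """Project v onto the l_p ball of radius xi (sketch for p = 2 or p = inf)."""
    if p == 2:
        norm = np.linalg.norm(v)
        # Inside the ball: leave v untouched; outside: rescale onto the sphere.
        return v if norm <= xi else v * (xi / norm)
    if p == np.inf:
        # Coordinate-wise clipping projects onto the l_inf ball.
        return np.clip(v, -xi, xi)
    raise ValueError("this sketch only supports p = 2 or p = inf")

v = np.array([3.0, -4.0])               # l2 norm 5
v2 = project_lp(v, xi=2.0, p=2)         # rescaled to norm 2
vinf = project_lp(v, xi=2.0, p=np.inf)  # clipped to [-2, 2] per coordinate
```

After every incremental update Δv, projecting the running perturbation back onto this ball enforces constraint 2 without undoing the fooling progress of constraint 1.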
From another paper by the author
DeepFool: makes use of the tangent plane
S.M. Moosavi-Dezfooli et al.
CVPR, 2016.
Perturbation
DeepFool: makes use of the tangent plane
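For an affine classifier f(x) = Wx + b the tangent plane is exact, so the DeepFool idea reduces to a closed form that is easy to sketch (names and values below are illustrative):

```python
import numpy as np

def deepfool_affine(x, W, b, overshoot=0.02):
    """Minimal perturbation pushing x across the nearest decision boundary
    of the affine classifier f(x) = W @ x + b (exact in the affine case)."""
    f = W @ x + b
    k = int(np.argmax(f))                # current predicted label
    best_r, best_dist = None, np.inf
    for j in range(len(f)):
        if j == k:
            continue
        w_diff = W[j] - W[k]
        f_diff = f[j] - f[k]             # <= 0 while class j is losing
        dist = abs(f_diff) / np.linalg.norm(w_diff)  # distance to j/k boundary
        if dist < best_dist:
            best_dist = dist
            best_r = (abs(f_diff) / np.linalg.norm(w_diff) ** 2) * w_diff
    return (1.0 + overshoot) * best_r    # tiny overshoot to actually cross

W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.zeros(3)
x = np.array([2.0, 1.0])                 # classified as class 0
r = deepfool_affine(x, W, b)
# x + r lands just past the boundary between class 0 and class 1.
```

For a deep network the paper iterates this step, linearizing the classifier around the current point at each iteration; the affine case above is the one-step building block.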
1. Satisfies the fooling rate
2. Has to be small enough
The one perturbation that
messes up all the classes
The Universal Perturbations
for each Network
(fooling rates per network: 93%, 94%, 79%, 78%, 78%, 84%)
Doubly-Universal on ImageNet Dataset
Cross-model Universality
Intriguing Properties of Neural Networks
C. Szegedy et al. ICLR, 2014.
Need many images for crafting?
500 is all you need!
How about fine-tuning?
Fine-tuning on the set of perturbations
(fooling rates across fine-tuning rounds: 94% → 76% → 80% → 80% → …)
The real fun part
1. The existence of dominant labels
(which has been overlooked)
Intriguing Properties of Neural Networks
C. Szegedy et al. ICLR, 2014.
The whole picture
(figure: graph of dominant labels; node labels include African Grey, Macaw)
Some examples
2. Comparison with
other perturbations
3. Captures the local geometry
Geometric properties of
Adversarial Perturbations
1. ∥ "($) ∥& measures the Euclidean distance
from x to the closest point on the decision boundary
2. The vector "($) is orthogonal to the
decision boundary of the classifier
3. Captures the local geometry
N = [ n₁  n₂  …  nₙ ]
(n normal vectors to the decision boundary, versus a matrix of n random vectors)
Difference in singular value decay:
perform SVD of the matrix N
The existence of large correlations and
redundancies in the decision boundary
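The singular-value argument can be illustrated with synthetic data: unit vectors drawn mostly from a low-dimensional subspace (standing in for correlated boundary normals) concentrate their spectral energy in far fewer singular directions than i.i.d. random vectors. The dimensions and noise level below are made up for the demo:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, d_sub = 200, 500, 20

# Matrix of i.i.d. random unit columns: singular values decay slowly.
R = rng.standard_normal((d, n))
R /= np.linalg.norm(R, axis=0)

# Correlated columns: mostly inside a d_sub-dimensional subspace plus
# small noise, mimicking redundancy among decision-boundary normals.
basis = np.linalg.qr(rng.standard_normal((d, d_sub)))[0]
C = basis @ rng.standard_normal((d_sub, n)) + 0.05 * rng.standard_normal((d, n))
C /= np.linalg.norm(C, axis=0)

s_rand = np.linalg.svd(R, compute_uv=False)
s_corr = np.linalg.svd(C, compute_uv=False)

# Fraction of spectral energy captured by the top d_sub singular values.
frac_rand = (s_rand[:d_sub] ** 2).sum() / (s_rand ** 2).sum()
frac_corr = (s_corr[:d_sub] ** 2).sum() / (s_corr ** 2).sum()
# frac_corr is close to 1; frac_rand is far smaller.
```

A fast singular-value decay for the matrix of boundary normals is exactly the signature of such a low-dimensional subspace.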
A subspace of low dimension d’
that contains most normal vectors to the
decision boundary of deep networks
Sample random vectors
from this subspace S
S : spanned by the first 100 singular vectors
Fooling rate 38% (random vectors from S) vs. 10% (purely random vectors)
There exists
a low-dimensional subspace
that captures the correlations
among different regions of the
decision boundary
Generalization of
Universal perturbations
The Universal Adversarial Perturbation
A silver bullet:
1 single vector
Image-agnostic
Network-agnostic
ILLUSIONS
Adversarial Attacks
Thank you
Hyunwoo Kim

Universal Adversarial Perturbation