Wild Patterns: A Half-day Tutorial on
Adversarial Machine Learning
Battista Biggio
Pattern Recognition
and Applications Lab
University of
Cagliari, Italy
ICMLC Tutorial – July 7, 2019 - Hotel Portopia, Kobe, Japan
* Slides from this talk are inspired by the tutorial I prepared with Fabio Roli on this topic.
https://www.pluribus-one.it/research/sec-ml/wild-patterns/
http://pralab.diee.unica.it @biggiobattista
A Question to Start…
What is the oldest survey article on machine learning
that you have ever read?
What is the publication year?
This Is Mine… Year 1966
Credits: Dr Gavin Brown for showing me this article
Applications in the Good Old Days…
What applications do you think that this paper dealt
with?
Popular Applications in the Sixties
OCR for bank cheque sorting
Aerial photo recognition
Detection of particle tracks in bubble chambers
Key Feature of these Apps
Specialised applications for professional users…
What about Today’s Applications?
Computer Vision for Self-Driving Cars
He et al., Mask R-CNN, ICCV ’17, https://arxiv.org/abs/1703.06870
Video from: https://www.youtube.com/watch?v=OOT3UIXZztE
Automatic Speech Recognition for Virtual Assistants
Amazon Alexa
Apple Siri Microsoft Cortana Google Assistant
Today’s Applications of Machine Learning
Key Features of Today’s Apps
Personal and consumer applications…
We Are Living in the Best of All Possible Worlds…
AI is going to transform industry and business
as electricity did about a century ago
(Andrew Ng, Jan. 2017)
All Right? All Good?
iPhone 5s and 6s with Fingerprint Reader…
Hacked a Few Days After Release…
But maybe this happens only for old,
shallow machine learning…
End-to-end deep learning is another story…
Adversarial School Bus
Szegedy et al., Intriguing properties of neural networks, ICLR 2014
Biggio, Roli et al., Evasion attacks against machine learning at test time, ECML-PKDD 2013
Adversarial Glasses
• M. Sharif et al. (ACM CCS 2016) attacked deep neural networks for face recognition with
carefully-fabricated eyeglass frames
• When worn by a 41-year-old white male (left image), the glasses mislead the deep network
into believing that the face belongs to the famous actress Milla Jovovich
But maybe this happens only for image
recognition…
Audio Adversarial Examples
Audio: “without the dataset the article is useless”
Transcription by Mozilla DeepSpeech: “okay google browse to evil dot com”
https://nicholas.carlini.com/code/audio_adversarial_examples/
Evasion of Deep Networks for EXE Malware Detection
• MalConv: convolutional deep network trained on raw bytes to detect EXE malware
• Our attack can evade it by adding a few padding bytes
[Kolosnjaji, Biggio, Roli et al., Adversarial Malware Binaries, EUSIPCO 2018]
[Demetrio, Biggio et al., Explaining Vulnerability of DL ..., ITASEC 2019]
Take-home Message
We are living in an exciting time for machine learning…
…Our work feeds a lot of consumer technologies for personal
applications...
This opens up great new possibilities, but also new security risks
Where Do These Security Risks Come From?
The Classical Statistical Model
Note these two implicit assumptions of the model:
1. The source of data is given, and it does not depend on the classifier
2. Noise affecting data is stochastic
[Figure: data source → acquisition/measurement (affected by stochastic noise) → raw data → feature vector (x1, x2, ..., xd) → learning algorithm → classifier. Example: OCR]
Can This Model Be Used Under Attack?
[Figure: the classical statistical model, from data source to classifier, with stochastic noise. Example: OCR]
An Example: Spam Filtering
From: spam@example.it
Buy Viagra !
Feature weights: buy = 1.0, viagra = 5.0
Linear classifier: total score = 6.0 > 5.0 (threshold) → Spam
• The famous SpamAssassin filter is really a linear classifier
– http://spamassassin.apache.org
Feature Space View
[Figure: feature space (X1, X2) with positive (+) and negative (-) samples and the linear decision boundary; the spam email “Buy Viagra!” (buy = 1.0, viagra = 5.0) falls on the spam side]
• Classifier’s weights are learned from training data
• The SpamAssassin filter uses the perceptron algorithm
But spam filtering is not a stationary classification
task; the data source is not neutral…
The Data Source Can Add “Good” Words
From: spam@example.it
Buy Viagra ! conference meeting
Feature weights: buy = 1.0, viagra = 5.0, conference = -2.0, meeting = -3.0
Linear classifier: total score = 1.0 < 5.0 (threshold) → Ham
• Adding good words is a typical spammer’s trick [Z. Jorgensen et al., JMLR 2008]
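The scoring in the two spam-filtering slides can be reproduced with a few lines of Python. This is only a sketch: the weights and the threshold are the ones shown on the slides, while the helper names (`score`, `classify`) and the simple tokenization are assumptions made for illustration, not the way SpamAssassin is implemented.

```python
# Weights and threshold come from the slides; everything else is illustrative.
WEIGHTS = {"buy": 1.0, "viagra": 5.0, "conference": -2.0, "meeting": -3.0}
THRESHOLD = 5.0


def score(message, weights=WEIGHTS):
    """Sum the weights of the known words that occur in the message."""
    # Binary bag-of-words: each distinct word counts once.
    words = {token.strip("!.,").lower() for token in message.split()}
    return sum(weights[word] for word in words if word in weights)


def classify(message):
    """Label a message as spam when its score exceeds the threshold."""
    return "spam" if score(message) > THRESHOLD else "ham"


original = "Buy Viagra !"
camouflaged = "Buy Viagra ! conference meeting"

for message in (original, camouflaged):
    print(f"{message!r}: score = {score(message)}, label = {classify(message)}")
```

The second message carries the same spam content, but the two added good words lower the total score from 6.0 to 1.0, below the threshold.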
Adding Good Words: Feature Space View
[Figure: in feature space (X1, X2), adding the good words “conference” (-2.0) and “meeting” (-3.0) moves the spam email “Buy Viagra!” (buy = 1.0, viagra = 5.0) across the decision boundary, among the legitimate (-) samples]
• Note that spammers corrupt patterns with noise that is not random.
Is This Model Good for Spam Filtering?
• The source of data is given, and it does not depend on the classifier
• Noise affecting data is stochastic (“random”)
[Figure: the classical statistical model, from data source to classifier, with stochastic noise. Example: OCR]
No, it is not…
Adversarial Machine Learning
1. The source of data is not neutral, it depends on the classifier
2. Noise is not stochastic, it is adversarial, crafted to maximize the classification error
[Figure: non-neutral data source → measurement (affected by adversarial noise) → raw data → feature vector (x1, x2, ..., xn) → learning algorithm → classifier. Example: the spam message “Buy Viagra” becomes the camouflaged message “Buy Vi@gra”]
Adversarial Noise vs. Stochastic Noise
• This distinction is not new...
• Shannon’s stochastic noise model: probabilistic model of the channel; the probability of occurrence of too many or too few errors is usually low
• Hamming’s adversarial noise model: the channel acts as an adversary that arbitrarily corrupts the code-word, subject to a bound on the total number of errors
The Classical Model Cannot Work
• Standard classification algorithms assume that
– the data-generating process is independent of the classifier
– training/test data follow the same distribution (i.i.d. samples)
• This is not the case for adversarial tasks!
• It is easy to see that classifier performance will degrade quickly if the adversarial noise is not taken into account
– Adversarial tasks are a mission impossible for the classical model
How Should We Design Pattern Classifiers
Under Attack?
Adversary-aware Machine Learning
Machine learning systems should be aware of the arms race with the adversary
[Biggio, Fumera, Roli. Security evaluation of pattern classifiers under attack, IEEE TKDE, 2014]
[Figure: arms-race cycle. Adversary: analyze system → devise and execute attack. System designer: analyze attack → develop countermeasure]
Arms Race: The Case of Image Spam
• In 2004 spammers invented a new trick for evading anti-spam filters…
– As filters did not analyze the content of attached images…
– Spammers embedded their messages into images… so evading filters…
Image-based Spam
From: Conrad Stern <rjlfm@berlin.de>
To: utente@emailserver.it
Your orological prescription appointment starts September 30th
bergstrom mustsquawbush try bimini , maine see
woodwind in con or patagonia or scrapbook but.
patriarchal and tasteful must advisory not thoroughgoing
the frowzy not ellwood da jargon and.
beresford ! arpeggio must stern try disastrous ! alone ,
wear da esophagi try autonomic da clyde and taskmaster,
tideland try cream see await must mort in.
Arms Race: The Case of Image Spam
• PRA Lab team proposed a countermeasure against image spam…
– G. Fumera, I. Pillai, F. Roli, Spam filtering based on the analysis of text information embedded into images, Journal of Machine Learning Research, Vol. 7, 2006
• Text embedded in images is read by Optical Character Recognition (OCR)
• OCRing image text and fusing it with other mail data allows discriminating spam/ham mails
Arms Race: The Case of Image Spam
• The OCR-based solution was deployed as a plug-in of the SpamAssassin filter (called Bayes OCR) and worked well for a while…
http://wiki.apache.org/spamassassin/CustomPlugins
Spammers’ Reaction
• Spammers reacted quickly with a countermeasure against OCR-based solutions (and against signature-based image spam detection)
• They applied content-obscuring techniques to images, as done in CAPTCHAs, to make OCR systems ineffective without compromising human readability
Arms Race: The Case of Image Spam
• PRA Lab made another countermove by devising features that detect the presence of spammers’ obfuscation techniques in text images
– A feature for detecting characters fragmented or mixed with small background components
– A feature for detecting characters connected through background components
– A feature for detecting non-uniform background, hidden text
• This solution was deployed as a new plug-in of the SpamAssassin filter (called Image Cerberus)
• You can find the complete story here: http://en.wikipedia.org/wiki/Image_spam
How Can We Design Adversary-aware Machine
Learning Systems?
Adversary-aware Machine Learning
Machine learning systems should be aware of the arms race with the adversary
[Biggio, Fumera, Roli. Security evaluation of pattern classifiers under attack, IEEE TKDE, 2014]
[Figure: proactive cycle carried out by the system designer: model adversary → simulate attack → evaluate attack’s impact → develop countermeasure]
The Three Golden Rules
1. Know your adversary
2. Be proactive
3. Protect your classifier
Know your adversary
If you know the enemy and know yourself, you need not
fear the result of a hundred battles
(Sun Tzu, The Art of War, 500 BC)
Adversary’s 3D Model
Adversary’s Knowledge Adversary’s Capability
Adversary’s Goal
Adversary’s Goal
• To cause a security violation...
– Integrity: misclassifications that do not compromise normal system operation
– Availability: misclassifications that compromise normal system operation (denial of service)
– Confidentiality / Privacy: querying strategies that reveal confidential information on the learning model or its users
[Barreno et al., Can Machine Learning Be Secure? ASIACCS ‘06]
Adversary’s Knowledge
• Perfect-knowledge (white-box) attacks
– upper bound on the performance degradation under attack
[Figure: components of the learning system that the adversary may know]
- Training data
- Feature representation
- Learning algorithm (e.g., SVM)
- Parameters (e.g., feature weights)
- Feedback on decisions
[B. Biggio, G. Fumera, F. Roli, IEEE TKDE 2014]
Adversary’s Knowledge
• Limited-knowledge attacks
– Ranging from gray-box to black-box attacks
[Figure: the same components (training data, feature representation, learning algorithm, parameters, feedback on decisions), here only partially known to the adversary]
[B. Biggio, G. Fumera, F. Roli, IEEE TKDE 2014]
Kerckhoffs’ Principle
• Kerckhoffs’ Principle (Kerckhoffs 1883) states that the security of a system should not rely on
unrealistic expectations of secrecy
– It’s the opposite of the principle of “security by obscurity”
• Secure systems should make minimal assumptions about what can realistically be kept secret
from a potential attacker
• For machine learning systems, one could assume that the adversary is aware of the learning
algorithm and can obtain some degree of information about the data used to train the learner
• But the best strategy is to assess system security under different levels of adversary’s
knowledge
[Joseph et al., Adversarial Machine Learning, Cambridge Univ. Press, 2017]
Adversary’s Capability
[B. Biggio, G. Fumera, F. Roli, IEEE TKDE 2014; M. Barreno et al., ML 2010]
• Attackers may manipulate training data and/or test data
– TRAINING: influence the model at training time to cause subsequent errors at test time (poisoning attacks, backdoors)
– TEST: manipulate malicious samples at test time to cause misclassifications (evasion attacks, adversarial examples)
A Deliberate Poisoning Attack?
Microsoft deployed Tay, an AI chatbot designed to talk to youngsters on Twitter, but after 16 hours the chatbot was shut down since it had started to post racist and offensive comments.
[http://exploringpossibilityspace.blogspot.it/2016/03/poor-software-qa-is-root-cause-of-tay.html]
Adversary’s Capability
• Luckily, the adversary is not omnipotent; she is constrained…
– Email messages must be understandable by human readers
– Malware must execute on a computer, usually exploiting a known vulnerability
[R. Lippmann, Dagstuhl Workshop, 2012]
Adversary’s Capability
• Constraints on data manipulation
– TRAINING: maximum number of samples that can be added to the training data
• the attacker usually controls only a small fraction of the training samples
– TEST: maximum amount of modifications
• application-specific constraints in feature space
• e.g., max. number of words that are modified in spam emails
$$d(x, x') \le d_{\max}$$
[Figure: feasible domain around x in feature space (x1, x2), containing the manipulated sample x', with the decision function f(x)]
[B. Biggio, G. Fumera, F. Roli, IEEE TKDE 2014]
Conservative Design
• The design and analysis of a system should avoid unnecessary or unreasonable assumptions
on the adversary’s capability
– worst-case security evaluation
• Conversely, analysing the capabilities of an omnipotent adversary reveals little about a
learning system’s behaviour against realistically-constrained attackers
• Again, the best strategy is to assess system security under different levels of adversary’s
capability
[Joseph et al., Adversarial Machine Learning, Cambridge Univ. Press, 2017]
Be Proactive
To know your enemy, you must become your enemy
(Sun Tzu, The Art of War, 500 BC)
Be Proactive
• Given a model of the adversary characterized by her:
– Goal
– Knowledge
– Capability
Try to anticipate the adversary!
• What is the optimal attack she can do?
• What is the expected performance decrease of your classifier?
Evasion of Linear Classifiers
• Problem: how to evade a linear (trained) classifier?
Classifier: $f(x) = \mathrm{sign}(w^T x)$

Original email (x): “Start 2007 with a bang! Make WBFS YOUR PORTFOLIO’s first winner of the year ...”
Manipulated email (x’): “St4rt 2007 with a b4ng! Make WBFS YOUR PORTFOLIO’s first winner of the year ... campus”

feature       w     x     x’
start        +2     1     0
bang         +1     1     0
portfolio    +1     1     1
winner       +1     1     1
year         +1     1     1
...          ...   ...   ...
university   -3     0     0
campus       -4     0     1

x: $w^T x$ = +6 > 0, SPAM (correctly classified)
x’: $w^T x'$ = +3 - 4 < 0, HAM (misclassified email)
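The worked example above can be checked numerically. The sketch below uses only the seven features that are visible on the slide (the elided “...” entries are left out, which is an assumption), and the greedy routine is an illustrative strategy for binary features, not necessarily the one used in the cited work.

```python
FEATURES = ["start", "bang", "portfolio", "winner", "year", "university", "campus"]
w = [2, 1, 1, 1, 1, -3, -4]          # weights shown on the slide
x = [1, 1, 1, 1, 1, 0, 0]            # original spam email
x_slide = [0, 0, 1, 1, 1, 0, 1]      # manipulated email shown on the slide


def score(w, x):
    """Linear discriminant w^T x; the email is labeled spam when it is positive."""
    return sum(wj * xj for wj, xj in zip(w, x))


def greedy_evasion(w, x, max_changes):
    """Flip, one at a time, the binary feature that lowers the score the most."""
    x_adv = list(x)
    for _ in range(max_changes):
        if score(w, x_adv) < 0:
            break
        # Removing a present word changes the score by -w_j, adding an absent one by +w_j.
        gains = [(-wj if xj == 1 else wj) for wj, xj in zip(w, x_adv)]
        j = min(range(len(w)), key=lambda i: gains[i])
        if gains[j] >= 0:
            break  # no single change can lower the score any further
        x_adv[j] = 1 - x_adv[j]
    return x_adv


x_greedy = greedy_evasion(w, x, max_changes=3)
changed = [name for name, a, b in zip(FEATURES, x, x_greedy) if a != b]
print("original score:", score(w, x))
print("slide manipulation score:", score(w, x_slide))
print("greedy manipulation:", changed, "score:", score(w, x_greedy))
```

With these weights the greedy strategy adds the two most negative words and stops, while the slide's manipulation obfuscates two spammy words and adds one good word; both end up on the ham side of the boundary.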
Evasion of Nonlinear Classifiers
• What if the classifier is nonlinear?
• Decision functions can be arbitrarily complicated, with no clear relationship between
features (x) and classifier parameters (w)
Detection of Malicious PDF Files
Srndic & Laskov, Detection of malicious PDF files based on hierarchical document structure, NDSS 2013
“The most aggressive evasion strategy we could conceive was successful for only 0.025% of malicious
examples tested against a nonlinear SVM classifier with the RBF kernel [...].
Currently, we do not have a rigorous mathematical explanation for such a surprising robustness. Our
intuition suggests that [...] the space of true features is “hidden behind” a complex nonlinear
transformation which is mathematically hard to invert.
[...] the same attack staged against the linear classifier [...] had a 50% success rate; hence, the robustness
of the RBF classifier must be rooted in its nonlinear transformation”
Evasion Attacks against Machine Learning at Test Time
Biggio, Corona, Maiorca, Nelson, Srndic, Laskov, Giacinto, Roli, ECML-PKDD 2013
• Goal: maximum-confidence evasion
• Knowledge: perfect (white-box attack)
• Classifier: $f(x) = \mathrm{sign}(g(x))$, with $f(x) = +1$ for malicious and $f(x) = -1$ for legitimate samples
• Attack strategy:
$$\min_{x'} g(x') \quad \text{s.t.} \quad \|x - x'\| \le d_{\max}$$
• Non-linear, constrained optimization
– Projected gradient descent: approximate solution for smooth functions
• Gradients of g(x) can be analytically computed in many cases
– SVMs, neural networks
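A minimal sketch of projected gradient descent for the attack strategy above, written in plain Python. The toy RBF discriminant (two support vectors, γ = 0.5), the normalized step, and all function names are assumptions chosen for illustration; this is not the implementation used in the cited paper.

```python
import math

GAMMA = 0.5
# Toy RBF discriminant: (support vector, alpha_i * y_i). All values are made up.
SUPPORT = [((1.0, 1.0), +1.0), ((-1.0, -1.0), -1.0)]
BIAS = 0.0


def rbf(x, xi):
    return math.exp(-GAMMA * sum((a - b) ** 2 for a, b in zip(x, xi)))


def g(x):
    """Discriminant function: g(x) > 0 means 'malicious'."""
    return sum(c * rbf(x, xi) for xi, c in SUPPORT) + BIAS


def grad_g(x):
    """Analytic gradient of g, using the RBF kernel gradient."""
    grad = [0.0] * len(x)
    for xi, c in SUPPORT:
        k = rbf(x, xi)
        for j in range(len(x)):
            grad[j] += c * (-2.0 * GAMMA * k * (x[j] - xi[j]))
    return grad


def project(x_adv, x0, d_max):
    """Project x_adv onto the Euclidean ball of radius d_max centered at x0."""
    diff = [a - b for a, b in zip(x_adv, x0)]
    dist = math.sqrt(sum(d * d for d in diff))
    if dist <= d_max:
        return list(x_adv)
    return [b + d * d_max / dist for b, d in zip(x0, diff)]


def pgd_evasion(x0, d_max, step=0.05, n_iter=100):
    """Minimize g(x') subject to ||x0 - x'|| <= d_max."""
    x_adv = list(x0)
    for _ in range(n_iter):
        grad = grad_g(x_adv)
        norm = math.sqrt(sum(v * v for v in grad))
        if norm == 0.0:
            break
        # Normalized descent step, followed by projection onto the feasible domain.
        x_adv = [a - step * v / norm for a, v in zip(x_adv, grad)]
        x_adv = project(x_adv, x0, d_max)
    return x_adv


x0 = [1.0, 1.0]
x_adv = pgd_evasion(x0, d_max=1.5)
print("g(x)  =", round(g(x0), 3))
print("g(x') =", round(g(x_adv), 3))
```

The projection step is what enforces the constraint on the adversary's capability: without it the attack would simply walk toward the legitimate support vector without bound.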
Computing Descent Directions
Support vector machines:
$$g(x) = \sum_i \alpha_i y_i k(x, x_i) + b, \qquad \nabla g(x) = \sum_i \alpha_i y_i \nabla k(x, x_i)$$
RBF kernel gradient:
$$\nabla k(x, x_i) = -2\gamma \exp\{-\gamma \|x - x_i\|^2\}(x - x_i)$$
Neural networks (inputs $x_1, \dots, x_d$; hidden units $\delta_1, \dots, \delta_m$ with input weights $v_{kf}$; output weights $w_1, \dots, w_m$):
$$g(x) = \left[1 + \exp\left(-\sum_{k=1}^{m} w_k \delta_k(x)\right)\right]^{-1}$$
$$\frac{\partial g(x)}{\partial x_f} = g(x)\,(1 - g(x)) \sum_{k=1}^{m} w_k\, \delta_k(x)\,(1 - \delta_k(x))\, v_{kf}$$
[Biggio, Roli et al., ECML PKDD 2013]
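A quick way to sanity-check an analytic descent direction is to compare it with finite differences. The sketch below does this for the RBF kernel gradient given above; the test point, the support vector and the value of γ are arbitrary illustrative values, and the function names are hypothetical.

```python
import math


def rbf(x, xi, gamma):
    """RBF kernel k(x, xi) = exp(-gamma * ||x - xi||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, xi)))


def rbf_grad(x, xi, gamma):
    """Analytic gradient with respect to x: -2 * gamma * k(x, xi) * (x - xi)."""
    k = rbf(x, xi, gamma)
    return [-2.0 * gamma * k * (a - b) for a, b in zip(x, xi)]


def numeric_grad(fun, x, h=1e-6):
    """Central finite differences, one coordinate at a time."""
    grad = []
    for j in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += h
        x_minus[j] -= h
        grad.append((fun(x_plus) - fun(x_minus)) / (2.0 * h))
    return grad


x, xi, gamma = [0.3, -0.2, 1.0], [0.0, 0.5, 0.8], 0.7
analytic = rbf_grad(x, xi, gamma)
numeric = numeric_grad(lambda z: rbf(z, xi, gamma), x)
max_err = max(abs(a - n) for a, n in zip(analytic, numeric))
print("analytic:", [round(v, 6) for v in analytic])
print("numeric: ", [round(v, 6) for v in numeric])
```

The same check applies to the neural-network derivative: replace `rbf` with the network output and compare against the closed-form expression.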
An Example on Handwritten Digits
• Nonlinear SVM (RBF kernel) to discriminate between ‘3’ and ‘7’
• Features: gray-level pixel values (28 x 28 image = 784 features)
Few modifications are enough to evade detection!
The first adversarial examples generated with gradient-based attacks date back to 2013! (one year before attacks on deep neural networks)
[Figure: the digit before the attack (3 vs 7); after the attack, at g(x)=0; after the attack, at the last iteration (misclassified as 7); and g(x) as a function of the number of iterations]
[Biggio, Roli et al., ECML PKDD 2013]
Bounding the Adversary’s Knowledge
Limited-knowledge (gray/black-box) attacks
• Only feature representation and (possibly) learning algorithm are known
• Surrogate data sampled from the same distribution as the classifier’s training data
• Classifier’s feedback to label surrogate data
[Figure: surrogate training data are drawn from the data distribution P_D(X,Y); queries are sent to the targeted classifier f(x) to get labels; a surrogate classifier f’(x) is then learned]
This is the same underlying idea behind substitute models and black-box attacks (transferability)
[N. Papernot et al., IEEE Euro S&P ’16; N. Papernot et al., ASIACCS ’17]
[Biggio, Roli et al., ECML PKDD 2013]
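The query-and-learn loop of a limited-knowledge attack can be sketched as follows. Everything here is a toy assumption: the hidden target is a made-up linear rule, the surrogate learner is a plain perceptron, and the surrogate data are uniform random points; the example only illustrates how labels obtained from queries suffice to learn a stand-in classifier.

```python
import random


def target(x):
    """Black-box classifier f(x); the attacker only observes its labels."""
    return 1 if x[0] - x[1] + 0.2 > 0 else -1


def train_perceptron(samples, labels, epochs=200):
    """Plain perceptron, used here as the surrogate learner f'(x)."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            if y * (w[0] * x[0] + w[1] * x[1] + b) <= 0:
                w[0] += y * x[0]
                w[1] += y * x[1]
                b += y
                mistakes += 1
        if mistakes == 0:
            break
    return w, b


rng = random.Random(0)
# Surrogate data, assumed to come from the same distribution as the training data.
surrogate_data = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(300)]
# Send queries, get labels.
labels = [target(x) for x in surrogate_data]
# Learn the surrogate classifier.
w, b = train_perceptron(surrogate_data, labels)


def surrogate(x):
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else -1


fresh = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(500)]
agreement = sum(surrogate(x) == target(x) for x in fresh) / len(fresh)
print(f"surrogate/target agreement on fresh points: {agreement:.2f}")
```

Once the surrogate agrees with the target on most inputs, a white-box attack computed against the surrogate is a reasonable candidate for transfer to the target.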
Recent Results on Android Malware Detection
• Drebin: Arp et al., NDSS 2014
– Android malware detection directly on the mobile phone
– Linear SVM trained on features extracted from static
code analysis
[Demontis, Biggio et al., IEEE TDSC 2017]
[Figure: an Android app (apk) is mapped to a binary feature vector x (0, 1, ..., 1, 0), which the classifier f(x) labels as malware or benign in feature space (x1, x2)]
Recent Results on Android Malware Detection
• Dataset (Drebin): 5,600 malware and 121,000 benign apps (TR: 30K, TS: 60K)
• Detection rate at FP=1% vs max. number of manipulated features (averaged on 10 runs)
– Perfect knowledge (PK) white-box attack; Limited knowledge (LK) black-box attack
[Demontis, Biggio et al., IEEE TDSC 2017]
Take-home Messages
• Linear and non-linear supervised classifiers can be highly vulnerable to well-crafted evasion
attacks
• Performance evaluation should always be performed as a function of the adversary’s knowledge and capability
– Security Evaluation Curves
$$\min_{x'} g(x') \quad \text{s.t.} \quad d(x, x') \le d_{\max}, \quad x \le x'$$
Why Is Machine Learning So Vulnerable?
• Learning algorithms tend to overemphasize some
features to discriminate among classes
• Large sensitivity to changes of such input features: $\nabla_x g(x)$
• Different classifiers tend to find the same set of relevant features
– that is why attacks can transfer across models!
[Melis et al., Explaining Black-box Android Malware Detection, EUSIPCO 2018]
[Figure: g(x) along the direction from x to x’, crossing from the positive to the negative class]
2014: Deep Learning Meets
Adversarial Machine Learning
The Discovery of Adversarial Examples
... we find that deep neural networks learn input-output mappings that
are fairly discontinuous to a significant extent. We can cause the
network to misclassify an image by applying a certain hardly
perceptible perturbation, which is found by maximizing the network’s
prediction error ...
[Szegedy, Goodfellow et al., Intriguing Properties of NNs, ICLR 2014, ArXiv 2013]
Adversarial Examples and Deep Learning
• C. Szegedy et al. (ICLR 2014) independently developed a gradient-based attack against deep
neural networks
– minimally-perturbed adversarial examples
[Szegedy, Goodfellow et al., Intriguing Properties of NNs, ICLR 2014, ArXiv 2013]
Creation of Adversarial Examples
[Figure: School Bus (x) + Adversarial Noise (r) → Ostrich, Struthio camelus, with $f(x) \neq l$]
The adversarial image x + r is visually hard to distinguish from x
Informally speaking, the solution x + r is the closest image to x classified as l by f
The solution is approximated using a box-constrained limited-memory BFGS
[Szegedy, Goodfellow et al., Intriguing Properties of NNs, ICLR 2014, ArXiv 2013]
Many Black Swans After 2014…
[Search https://arxiv.org with keywords “adversarial examples”]
• Several defenses have been proposed against adversarial
examples, and more powerful attacks have been developed
to show that they are ineffective. Remember the arms race?
• Most of these attacks are modifications to the optimization
problems reported for evasion attacks / adversarial
examples, using different gradient-based solution
algorithms, initializations and stopping conditions.
• Most popular attack algorithms: FGSM (Goodfellow et al.),
JSMA (Papernot et al.), CW (Carlini & Wagner, and follow-
up versions)
Why Are Adversarial Perturbations Imperceptible?
Why Are Adversarial Perturbations against Deep Networks
Imperceptible?
• Large sensitivity of g(x) to input changes
– i.e., the input gradient $\nabla_x g(x)$ has a large norm (scales with input dimensions!)
– Thus, even small modifications along that direction will cause large changes in the predictions
[Simon-Gabriel et al., Adversarial vulnerability of NNs increases with input dimension, arXiv 2018]
[Figure: g(x) along the direction of the input gradient $\nabla_x g(x)$, from x to x’]
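The dependence on input dimension can be illustrated with a linear model. The sketch below assumes a hypothetical discriminant g(x) = w·x whose weights all have magnitude 1/sqrt(d) (so that the Euclidean norm of w stays equal to 1), and perturbs every feature by the same small amount ε in the direction of the gradient sign; the output then changes by ε times the ℓ1 norm of w, which grows like sqrt(d).

```python
import math


def linear_output_change(d, eps):
    """Change in g(x) = w.x under a per-feature perturbation of size eps."""
    # Hypothetical weights with |w_j| = 1/sqrt(d), so that ||w||_2 = 1 for every d.
    w = [(1.0 if j % 2 == 0 else -1.0) / math.sqrt(d) for j in range(d)]
    x = [0.0] * d
    # Move each feature by eps in the direction of the gradient sign.
    x_adv = [xj + eps * math.copysign(1.0, wj) for xj, wj in zip(x, w)]

    def g(v):
        return sum(wj * vj for wj, vj in zip(w, v))

    return g(x_adv) - g(x)


for d in (16, 256, 4096):
    change = linear_output_change(d, eps=0.01)
    print(f"d = {d:5d}: change in g = {change:.4f}")
```

Each individual feature moves by only 0.01, yet the aggregate effect on the output keeps growing as more input dimensions are available to the attacker.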
Regularization, Input Gradients and Adversarial Robustness
• Regularization also impacts (reduces) the size of input gradients
– High regularization requires larger perturbations to mislead detection
– e.g., see manipulated digits 9 (classified as 8) against linear SVMs with different C values
[Figure: high regularization, large perturbation (C=0.001, eps=1.7); low regularization, imperceptible perturbation (C=1.0, eps=0.47)]
[Demontis, Biggio, et al., Why Do Adversarial Attacks Transfer? ... USENIX 2019]
Regularization, Input Gradients and Adversarial Robustness
[Figure: size of input gradients and test error (with no attack, and with ε = 0.3) as a function of regularization (weight decay), from high complexity to low complexity]
[Demontis, Biggio, et al., Why Do Adversarial Attacks Transfer? ... USENIX 2019]
Why Do Adversarial Attacks Transfer?
Regularization and Transferability
• Excerpt of our experimental results for evasion attacks on Android malware (DREBIN)
– ‘x’ for low-complexity (strongly-regularized) models
– ‘o’ for high-complexity (weakly-regularized) models
[Figure: three scatter plots for the SVM, logistic, ridge, SVM-RBF and NN classifiers: evasion rate (ε = 5) vs. size of input gradients (S); black-to-white-box error ratio (ε = 5) vs. gradient alignment (R), with P: 0.69, p-val: < 1e-10 and K: 0.48, p-val: < 1e-10; transfer rate (ε = 30) vs. variability of loss landscape (V)]
[Demontis, Biggio, et al., Why Do Adversarial Attacks Transfer? ... USENIX 2019]
Is Deep Learning Safe for Robot Vision?
Is Deep Learning Safe for Robot Vision?
• Evasion attacks against the iCub humanoid robot
– Deep Neural Network used for visual object recognition
[Melis, Biggio et al., Is Deep Learning Safe for Robot Vision? ICCVW ViPAR 2017]
iCubWorld28 Data Set: Example Images
[http://old.iit.it/projects/data-sets]
From Binary to Multiclass Evasion
• In multiclass problems, classification errors occur in different classes.
• Thus, the attacker may aim:
1. to have a sample misclassified as any class different from the true class (error-generic attacks)
2. to have a sample misclassified as a specific class (error-specific attacks)
[Figure: error-generic attacks push a sample toward any other class; error-specific attacks push it toward one chosen class (classes shown: cup, sponge, dish, detergent)]
[Melis, Biggio et al., Is Deep Learning Safe for Robot Vision? ICCVW ViPAR 2017]
Error-generic Evasion
• Error-generic evasion
– k is the true class (blue)
– l is the competing (closest) class in feature space (red)
• The attack minimizes the objective to have the sample misclassified as the closest class (could
be any!)
[Figure: indiscriminate evasion in a two-dimensional feature space]
[Melis, Biggio et al., Is Deep Learning Safe for Robot Vision? ICCVW ViPAR 2017]
Error-specific Evasion
• Error-specific evasion
– k is the target class (green)
– l is the competing class (initially, the blue class)
• The attack maximizes the objective to have the sample misclassified as the target class
[Figure: targeted evasion in a two-dimensional feature space]
[Melis, Biggio et al., Is Deep Learning Safe for Robot Vision? ICCVW ViPAR 2017]
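The objective itself did not survive in the extracted slide text. A form consistent with the descriptions of error-generic and error-specific evasion is the difference between the score of class k and the best score among the other classes; the sketch below uses that form as an assumption, with made-up class scores, to show why the same quantity is minimized in one case and maximized in the other.

```python
CLASSES = ["cup", "sponge", "dish", "detergent"]


def objective(scores, k):
    """Assumed objective: f_k(x) minus the best score among the competing classes."""
    return scores[k] - max(s for l, s in enumerate(scores) if l != k)


# Made-up classifier scores for a sample whose true class is 'detergent'.
before = [0.1, 0.2, 0.1, 0.6]
# Made-up scores after a successful manipulation toward 'cup'.
after = [0.5, 0.2, 0.1, 0.2]

true_class = CLASSES.index("detergent")
target_class = CLASSES.index("cup")

# Error-generic: k is the true class; the attack minimizes the objective,
# and the sample is misclassified once the value drops below zero.
print("error-generic :", objective(before, true_class), "->", objective(after, true_class))

# Error-specific: k is the target class; the attack maximizes the objective,
# and the sample is assigned to the target once the value rises above zero.
print("error-specific:", objective(before, target_class), "->", objective(after, target_class))
```

In both cases the sign of the objective tells whether the attack has succeeded, which is what makes it convenient for gradient-based optimization.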
Adversarial Examples against iCub – Gradient Computation
The given optimization problems can both be solved with gradient-based algorithms
The gradient of the objective can be computed using the chain rule:
$$\nabla f_i(x) = \frac{\partial f_i(z)}{\partial z} \frac{\partial z}{\partial x}$$
1. the gradient of the functions $f_i(z)$ can be computed if the chosen classifier is differentiable
2. ... and then backpropagated through the deep network with automatic differentiation
[Figure: deep network followed by the classifier outputs f1, f2, ..., fi, ..., fc]
Example of Adversarial Images against iCub
An adversarial example from class laundry-detergent, modified by
the proposed algorithm to be misclassified as cup
[Melis, Biggio et al., Is Deep Learning Safe for Robot Vision? ICCVW ViPAR 2017]
The “Sticker” Attack against iCub
Adversarial example generated
by manipulating only a
specific region, to simulate a
sticker that could be applied
to the real-world object.
This image is classified as cup.
[Melis, Biggio et al., Is Deep Learning Safe for Robot Vision? ICCVW ViPAR 2017]
Countering Evasion Attacks
What is the rule? The rule is protect yourself at all times
(from the movie “Million Dollar Baby”, 2004)
Security Measures against Evasion Attacks
1. Reduce sensitivity to input changes with robust optimization
– Adversarial Training / Regularization
$$\min_w \sum_i \max_{\|\delta_i\|_p \le \epsilon} \ell(y_i, f_w(x_i + \delta_i))$$
(bounded perturbation!)
2. Introduce rejection / detection of adversarial examples
[Figure: decision regions of an SVM-RBF with no reject option, and of an SVM-RBF with a higher rejection rate]
Reducing Input Sensitivity via Robust Optimization
• Robust optimization (a.k.a. adversarial training)
$$\min_w \max_{\|\delta_i\|_p \le \epsilon} \sum_i \ell(y_i, f_w(x_i + \delta_i))$$
(bounded perturbation!)
• Robustness and regularization (Xu et al., JMLR 2009)
– under linearity of $\ell$ and $f_w$, equivalent to robust optimization
$$\min_w \sum_i \ell(y_i, f_w(x_i)) + \lambda \|\nabla_x f\|_q$$
(dual norm of the perturbation)
$$\|\nabla_x f\|_q = \|w\|_q$$
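The link between the bounded perturbation and the dual-norm penalty can be verified on a small case. The sketch below takes a linear classifier with the hinge loss and an ℓ∞-bounded perturbation, whose dual norm is the ℓ1 norm of the weights, and compares a brute-force search over the corners of the perturbation box with the closed form. The weights, the sample and ε are made-up values; the hinge loss is chosen because it is monotone in the margin, so the identity holds exactly here.

```python
import itertools
import math


def f(w, b, x):
    """Linear decision function."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def hinge(y, score):
    return max(0.0, 1.0 - y * score)


def worst_case_loss_bruteforce(w, b, x, y, eps):
    """Maximize the hinge loss over the corners of the box ||delta||_inf <= eps."""
    worst = 0.0
    for signs in itertools.product((-1.0, 1.0), repeat=len(x)):
        x_adv = [xj + eps * s for xj, s in zip(x, signs)]
        worst = max(worst, hinge(y, f(w, b, x_adv)))
    return worst


def worst_case_loss_closed_form(w, b, x, y, eps):
    """Same quantity, with the perturbation replaced by eps times the l1 norm of w."""
    l1_norm = sum(abs(wj) for wj in w)
    return max(0.0, 1.0 - y * f(w, b, x) + eps * l1_norm)


w, b = [0.5, -1.0, 0.25, 2.0], 0.1
x, y, eps = [1.0, 0.5, -1.0, 0.2], 1, 0.1

brute = worst_case_loss_bruteforce(w, b, x, y, eps)
closed = worst_case_loss_closed_form(w, b, x, y, eps)
print("clean loss:      ", hinge(y, f(w, b, x)))
print("worst-case loss: ", brute)
print("closed form:     ", closed)
```

The brute-force search only needs the corners of the box because the hinge loss of a linear function is convex in the perturbation, so its maximum over the box is reached at a vertex.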
Experiments on Android Malware
• Infinity-norm regularization is the optimal regularizer against sparse evasion attacks
– Sparse evasion attacks penalize the ℓ1 norm of the perturbation, promoting the manipulation of only a few features
Sec-SVM:
$$\min_{w,b} \|w\|_\infty + C \sum_i \max(0, 1 - y_i f(x_i)), \qquad \|w\|_\infty = \max_{i=1,\dots,d} |w_i|$$
Why? It bounds the maximum absolute weight values!
Results on Adversarial Android Malware
[Figure: absolute weight values |w| in descending order]
[Demontis, Biggio et al., Yes, ML Can Be More Secure!..., IEEE TDSC 2017]
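Why bounding the largest absolute weight helps against sparse attacks can be seen on a toy example. The two weight vectors below are made up: both give the same score to the malware sample, but one concentrates the negative evidence on a single feature while the other spreads it evenly. The greedy attack only adds features, as in the feature-addition constraint used for Android malware; the function names are hypothetical.

```python
def score(w, x):
    """Linear discriminant; the app is flagged as malware when the score is >= 0."""
    return sum(wj * xj for wj, xj in zip(w, x))


def features_to_add(w, x):
    """Greedy sparse attack: add the most negative absent features until score < 0."""
    x_adv = list(x)
    changes = 0
    candidates = sorted(
        (j for j in range(len(x)) if x[j] == 0 and w[j] < 0),
        key=lambda j: w[j],
    )
    for j in candidates:
        if score(w, x_adv) < 0:
            break
        x_adv[j] = 1
        changes += 1
    return changes if score(w, x_adv) < 0 else None


x = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]                      # malware sample
w_concentrated = [1, 1, 1, 1, 1, -6, 0, 0, 0, 0]        # max |w_j| = 6
w_even = [1, 1, 1, 1, 1, -1.2, -1.2, -1.2, -1.2, -1.2]  # max |w_j| = 1.2

for name, w in (("concentrated", w_concentrated), ("evenly spread", w_even)):
    print(f"{name:14s} score = {score(w, x)}, "
          f"max |w| = {max(abs(v) for v in w)}, "
          f"features to add = {features_to_add(w, x)}")
```

With evenly spread weights no single feature carries enough evidence to flip the decision, so the attacker is forced to manipulate more features.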
Adversarial Training and Regularization
• Adversarial training can also be seen as a form of regularization, which penalizes the (dual) norm of the input gradients, $\epsilon \|\nabla_x \ell\|_q$
• Known as double backprop or gradient/Jacobian regularization
– see, e.g., Simon-Gabriel et al., Adversarial vulnerability of neural networks increases with input dimension, ArXiv 2018; and Lyu et al., A unified gradient regularization family for adversarial examples, ICDM 2015.
Take-home message: the net effect of these techniques is to make the prediction function of the classifier smoother (increasing the input margin)
[Figure: g(x) along the direction from x to x’, with adversarial training]
Ineffective Defenses: Obfuscated Gradients
• Work by Carlini & Wagner (S&P ’17) and Athalye et al. (ICML ’18) has shown that
– some recently-proposed defenses rely on obfuscated / masked gradients, and
– they can be circumvented
• Obfuscated gradients do not allow the correct execution of gradient-based attacks...
• ... but substitute models and/or smoothing can correctly reveal meaningful input gradients!
[Figure: g(x) between x and x’, with obfuscated gradients and after smoothing]
Detecting & Rejecting Adversarial Examples
• Adversarial examples tend to occur in blind spots
– Regions far from training data that are anyway assigned to ‘legitimate’ classes
[Figure: blind-spot evasion (not even required to mimic the target class), and rejection of adversarial examples through enclosing of legitimate classes]
Detecting & Rejecting Adversarial Examples
[Figure: results as a function of the input perturbation (Euclidean distance)]
[Melis, Biggio et al., Is Deep Learning Safe for Robot Vision? ICCVW ViPAR 2017]
Why Is Rejection (in Representation Space) Not Enough?
[S. Sabour et al., ICLR 2016]
Adversarial Examples and Security Evaluation
(Demo Session)
https://advx-secml.pluribus-one.it
secml: An open source Python library for ML Security
ml
- ML algorithms via sklearn
- DL algorithms and optimizers via PyTorch and TensorFlow
adv
- attacks (evasion, poisoning, ...) with custom/faster solvers
- defenses (advx rejection, adversarial training, ...)
expl
- Explanation methods based on influential features
- Explanation methods based on influential prototypes
others
- Parallel computation
- Support for dense/sparse data
- Advanced plotting functions (via matplotlib)
- Modular and easy to extend
First release scheduled for August 2019!
Marco Melis
Ambra Demontis
EU H2020 Project ALOHA
• ALOHA – software framework for runtime-Adaptive and secure deep Learning On
Heterogeneous Architectures
• Project goal: to facilitate implementation of deep learning algorithms on heterogeneous low-energy computing platforms
• Project website: www.aloha-h2020.eu
• Pluribus One is in charge of evaluating and improving security of deep learning algorithms
under attack
101
This project has received funding from the European Union’s Horizon 2020
Research and Innovation programme under Grant Agreement No. 780788
102
Poisoning Machine Learning
Poisoning Machine Learning
[Figure: training pipeline of a spam filter]
Training e-mail (SPAM): “Start 2007 with a bang! Make WBFS YOUR PORTFOLIO’s first winner of the year ...”
pre-processing and feature extraction: feature vector x = (x1, x2, ..., xd)
x: start 1, bang 1, portfolio 1, winner 1, year 1, ..., university 0, campus 0
classifier learning on training data (with labels): weight vector w
w: start +2, bang +1, portfolio +1, winner +1, year +1, ..., university -3, campus -4
classifier generalizes well on test data
Poisoning Machine Learning
[Figure: the same pipeline with poisoning data injected into the training set]
Poisoning e-mail (SPAM): “Start 2007 with a bang! Make WBFS YOUR PORTFOLIO’s first winner of the year ... university campus...”
pre-processing and feature extraction: feature vector x = (x1, x2, ..., xd)
x: start 1, bang 1, portfolio 1, winner 1, year 1, ..., university 1, campus 1
corrupted training data: classifier learning is compromised...
w: start +2, bang +1, portfolio +1, winner +1, year +1, ..., university +1, campus +1
... to maximize error on test data
Poisoning Attacks against Machine Learning
• Goal: to maximize classification error
• Knowledge: perfect / white-box attack
• Capability: injecting poisoning samples into TR
• Strategy: find an optimal attack point x_c in TR that maximizes classification error
[Figure: two panels showing the attack point x_c, labeled “classification error = 0.039” and “classification error = 0.022”; plot of classification error as a function of x_c]
[Biggio, Nelson, Laskov. Poisoning attacks against SVMs. ICML, 2012]
Poisoning is a Bilevel Optimization Problem
• Attacker’s objective
– to maximize generalization error on untainted data, w.r.t. poisoning point xc
• Poisoning problem against (linear) SVMs:
Loss estimated on validation data
(no attack points!)
Algorithm is trained on surrogate data
(including the attack point)
[Biggio, Nelson, Laskov. Poisoning attacks against SVMs. ICML, 2012]
[Xiao, Biggio, Roli et al., Is feature selection secure against training data poisoning? ICML, 2015]
[Munoz-Gonzalez, Biggio, Roli et al., Towards poisoning of deep learning..., AISec 2017]
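$\max_{x_c} \; L(D_{val}, f^*)$
s.t. $f^* = \arg\min_{f} \; \mathcal{L}(D_{tr} \cup \{(x_c, y_c)\}, f)$

$\max_{x_c} \; \sum_{j=1}^{m} \max\left(0, 1 - y_j f^*(x_j)\right)$
s.t. $f^* = \arg\min_{w,b} \; \frac{1}{2} w^\top w + C \sum_{i=1}^{n} \max\left(0, 1 - y_i f(x_i)\right) + C \max\left(0, 1 - y_c f(x_c)\right)$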
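The bilevel structure can be made concrete with a deliberately simplified sketch. Two substitutions are assumed here and are not part of the cited attacks: the inner learner is a one-dimensional ridge regressor through the origin (so the inner problem has a closed-form solution) instead of an SVM, and the outer maximization is an exhaustive search over candidate attack points instead of gradient ascent. All names and data are hypothetical.

```python
def train_ridge_1d(data, lam=0.1):
    """Inner problem: closed-form minimiser of sum_i (w*x_i - y_i)^2 + lam * w^2."""
    sxy = sum(x * y for x, y in data)
    sxx = sum(x * x for x, _ in data)
    return sxy / (sxx + lam)

def val_loss(w, val):
    """Outer objective: mean squared error on untainted validation data."""
    return sum((w * x - y) ** 2 for x, y in val) / len(val)

def best_poison_point(train, val, y_c, candidates, lam=0.1):
    """Outer problem: retrain with each candidate x_c and keep the most damaging one."""
    best = None
    for x_c in candidates:
        w = train_ridge_1d(train + [(x_c, y_c)], lam)  # training set includes the attack point
        loss = val_loss(w, val)                        # validation set does not
        if best is None or loss > best[1]:
            best = (x_c, loss)
    return best

train = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
val = [(1.5, 1.5), (2.5, 2.5)]
candidates = [i * 0.5 for i in range(-6, 7)]  # -3.0, -2.5, ..., 3.0
clean_loss = val_loss(train_ridge_1d(train), val)
x_c, poisoned_loss = best_poison_point(train, val, y_c=-3.0, candidates=candidates)
```

Every evaluation of the outer objective requires solving the inner learning problem again, which is what makes the attack expensive and motivates the gradient-based formulation.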
Gradient-based Poisoning Attacks
• Gradient is not easy to compute
– The training point affects the classification function
• Trick:
– Replace the inner learning problem with its equilibrium (KKT) conditions
– This enables computing the gradient in closed form
• Example for (kernelized) SVM
– similar derivation for Ridge, LASSO, Logistic Regression, etc.
[Figure: path of the attack point from its initial position x_c^(0) to the final x_c]
[Biggio, Nelson, Laskov. Poisoning attacks against SVMs. ICML, 2012]
[Xiao, Biggio, Roli et al., Is feature selection secure against training data poisoning? ICML, 2015]
Experiments on MNIST digits
Single-point attack
• Linear SVM; 784 features; TR: 100; VAL: 500; TS: about 2000
– ‘0’ is the malicious (attacking) class
– ‘4’ is the legitimate (attacked) one
[Figure: attack point before (x_c^(0)) and after (x_c) the attack]
[Biggio, Nelson, Laskov. Poisoning attacks against SVMs. ICML, 2012]
Experiments on MNIST digits
Multiple-point attack
• Linear SVM; 784 features; TR: 100; VAL: 500; TS: about 2000
– ‘0’ is the malicious (attacking) class
– ‘4’ is the legitimate (attacked) one
[Biggio, Nelson, Laskov. Poisoning attacks against SVMs. ICML, 2012]
How about Poisoning Deep Nets?
• The ICML 2017 Best Paper by Koh and Liang, “Understanding Black-box Predictions via Influence Functions”, derived adversarial training examples against a DNN
– they were constructed by attacking only the last layer (KKT-based attack against logistic regression), assuming the rest of the network to be “frozen”
Towards Poisoning Deep Neural Networks
• Solving the poisoning problem without exploiting KKT conditions (back-gradient)
– Muñoz-González, Biggio, Roli et al., AISec 2017 https://arxiv.org/abs/1708.08689
111
112
Countering Poisoning Attacks
What is the rule? The rule is protect yourself at all times
(from the movie “Million dollar baby”, 2004)
Security Measures against Poisoning
• Rationale: poisoning injects outlying training samples
• Two main strategies for countering this threat
1. Data sanitization: remove poisoning samples from training data
• Bagging for fighting poisoning attacks
• Reject-On-Negative-Impact (RONI) defense
2. Robust Learning: learning algorithms that are robust in the presence of poisoning samples
Robust Regression with TRIM
• TRIM learns the model by retaining only training points with the smallest residuals
$\arg\min_{w,b,I} \; L(w, b, I) = \frac{1}{|I|} \sum_{i \in I} \left(f(x_i) - y_i\right)^2 + \lambda \Omega(w)$
$N = (1 + \alpha)\, n, \quad I \subset [1, \ldots, N], \quad |I| = n$
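The trimmed objective is typically minimized by alternating two steps: fit the model on the current subset I, then rebuild I from the n points with the smallest residuals. The sketch below illustrates that alternation under simplifying assumptions that are not taken from the paper: a one-dimensional linear model, no regularizer (lambda = 0), and a fixed initial subset. Function names and data are hypothetical.

```python
def fit_line(points):
    """Ordinary least squares for y = w*x + b (no regulariser, i.e. lambda = 0)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    w = sxy / sxx
    return w, my - w * mx

def trim(points, n_keep, max_iter=50):
    """Alternate between fitting on subset I and re-selecting the n_keep smallest residuals."""
    subset = list(range(n_keep))  # arbitrary initial subset I
    for _ in range(max_iter):
        w, b = fit_line([points[i] for i in subset])
        by_residual = sorted(
            range(len(points)),
            key=lambda i: (points[i][1] - (w * points[i][0] + b)) ** 2,
        )
        new_subset = sorted(by_residual[:n_keep])
        if new_subset == subset:  # subset is stable: stop
            break
        subset = new_subset
    w, b = fit_line([points[i] for i in subset])
    return w, b, subset

# N = 10 points: n = 8 clean points on y = 2x + 1 plus 2 poisoning points (alpha = 0.25).
poison = [(3.0, 30.0), (4.0, 40.0)]
clean = [(float(x), 2.0 * x + 1.0) for x in range(8)]
w, b, kept = trim(poison + clean, n_keep=8)
```

In this toy run the initial subset contains both poisoning points, and the alternation discards them within a few iterations. Like any alternating scheme, the result can depend on the initial subset.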
[Jagielski, Biggio et al., IEEE Symp. Security and Privacy, 2018] 114
Experiments with TRIM (Loan Dataset)
• TRIM MSE is within 1% of original model MSE
[Figure; legend and annotations: No defense, Existing methods, Our defense, Better defense]
[Jagielski, Biggio et al., IEEE Symp. Security and Privacy, 2018]
116
Other Attacks against ML
Attacks against Machine Learning
Attacker’s Goal:
• Integrity: misclassifications that do not compromise normal system operation
• Availability: misclassifications that compromise normal system operation
• Privacy / Confidentiality: querying strategies that reveal confidential information on the learning model or its users
Attacker’s Capability:
• Test data
– Integrity: Evasion (a.k.a. adversarial examples)
– Availability: -
– Privacy / Confidentiality: Model extraction / stealing; Model inversion (hill-climbing); Membership inference attacks
• Training data
– Integrity: Poisoning (to allow subsequent intrusions), e.g., backdoors or neural network trojans
– Availability: Poisoning (to maximize classification error)
– Privacy / Confidentiality: -
[Biggio & Roli, Wild Patterns, 2018 https://arxiv.org/abs/1712.03141]
Attacker’s Knowledge:
• perfect-knowledge (PK) white-box attacks
• limited-knowledge (LK) black-box attacks (transferability with surrogate/substitute learning models)
117
Model Inversion Attacks
Privacy Attacks
• Goal: to extract users’ sensitive information
(e.g., face templates stored during user enrollment)
– Fredrikson, Jha, Ristenpart. Model inversion attacks that exploit
confidence information and basic countermeasures. ACM CCS, 2015
• Also known as hill-climbing attacks in the biometric community
– Adler. Vulnerabilities in biometric encryption systems.
5th Int’l Conf. AVBPA, 2005
– Galbally, McCool, Fierrez, Marcel, Ortega-Garcia. On the vulnerability of
face verification systems to hill-climbing attacks. Patt. Rec., 2010
• How: by repeatedly querying the target system and adjusting the
input sample to maximize its output score (e.g., a measure of the
similarity of the input sample with the user templates)
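The query-and-adjust loop described above can be sketched as a coordinate-wise hill climb. Everything here is a toy assumption made for illustration: the "target system" is a stand-in oracle that returns the negative squared distance to a hidden template, and the attacker only ever sees that score.

```python
def make_score_oracle(template):
    """Stand-in for the target system: returns a similarity score, never the template itself."""
    def score(x):
        return -sum((a - b) ** 2 for a, b in zip(x, template))
    return score

def hill_climb(score, dim, step=0.1, n_rounds=200):
    """Adjust one coordinate at a time, keeping any change that raises the output score."""
    x = [0.0] * dim
    best = score(x)
    for _ in range(n_rounds):
        improved = False
        for k in range(dim):
            for delta in (step, -step):
                candidate = list(x)
                candidate[k] += delta
                s = score(candidate)  # one query to the target system
                if s > best:
                    x, best, improved = candidate, s, True
        if not improved:  # no single step helps any more
            break
    return x, best

secret_template = [0.3, -0.5, 0.8]  # hypothetical enrolled template
oracle = make_score_oracle(secret_template)
reconstruction, final_score = hill_climb(oracle, dim=3)
```

The reconstruction ends up within half a step of the hidden template in every coordinate, using nothing but output scores.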
[Figure: reconstructed image vs. training image]
Membership Inference Attacks
Privacy Attacks (Shokri et al., IEEE Symp. SP 2017)
• Goal: to identify whether an input sample is part of the training set used to learn a deep
neural network based on the observed prediction scores for each class
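As a rough illustration of why prediction scores leak membership, the sketch below uses a plain confidence threshold. This is a swapped-in simplification, not the method of Shokri et al., who train attack models on the outputs of shadow models. The function name and the threshold value are assumptions.

```python
def infer_membership(prediction_scores, threshold=0.9):
    """Guess 'training member' when the model is very confident about its top class."""
    return max(prediction_scores) >= threshold

confident_guess = infer_membership([0.98, 0.01, 0.01])  # peaked scores: guess 'member'
uncertain_guess = infer_membership([0.50, 0.30, 0.20])  # flat scores: guess 'non-member'
```

The heuristic relies on overfitted models being more confident on samples they were trained on than on unseen ones.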
Backdoor Attacks
Poisoning Integrity Attacks
Backdoor / poisoning integrity attacks place mislabeled training points in a region of the feature space far from the rest of training data. The learning algorithm labels such region as desired, allowing for subsequent intrusions / misclassifications at test time
[Figure: training data (no poisoning) vs. training data (poisoned); backdoored stop sign (labeled as speedlimit)]
T. Gu, B. Dolan-Gavitt, and S. Garg. Badnets: Identifying vulnerabilities in
the machine learning model supply chain. In NIPS Workshop on Machine
Learning and Computer Security, 2017.
X. Chen, C. Liu, B. Li, K. Lu, and D. Song. Targeted backdoor attacks on
deep learning systems using data poisoning. ArXiv e-prints, 2017.
M. Barreno, B. Nelson, R. Sears, A. D. Joseph, and J. D. Tygar. Can machine
learning be secure? In Proc. ACM Symp. Information, Computer and
Comm. Sec., ASIACCS ’06, pages 16–25, New York, NY, USA, 2006. ACM.
M. Barreno, B. Nelson, A. Joseph, and J. Tygar. The security of machine
learning. Machine Learning, 81:121–148, 2010.
B. Biggio, B. Nelson, and P. Laskov. Poisoning attacks against support
vector machines. In J. Langford and J. Pineau, editors, 29th Int’l Conf. on
Machine Learning, pages 1807–1814. Omnipress, 2012.
B. Biggio, G. Fumera, and F. Roli. Security evaluation of pattern classifiers
under attack. IEEE Transactions on Knowledge and Data Engineering,
26(4):984–996, April 2014.
H. Xiao, B. Biggio, G. Brown, G. Fumera, C. Eckert, and F. Roli. Is feature
selection secure against training data poisoning? In F. Bach and D. Blei,
editors, JMLR W&CP - Proc. 32nd Int’l Conf. Mach. Learning (ICML),
volume 37, pages 1689–1698, 2015.
L. Munoz-Gonzalez, B. Biggio, A. Demontis, A. Paudice, V. Wongrassamee,
E. C. Lupu, and F. Roli. Towards poisoning of deep learning algorithms
with back-gradient optimization. In 10th ACM Workshop on Artificial
Intelligence and Security, AISec ’17, pp. 27–38, 2017. ACM.
B. Biggio and F. Roli. Wild patterns: Ten years after the rise of adversarial
machine learning. ArXiv e-prints, 2018.
M. Jagielski, A. Oprea, B. Biggio, C. Liu, C. Nita-Rotaru, and B. Li.
Manipulating machine learning: Poisoning attacks and countermeasures
for regression learning. In 39th IEEE Symp. on Security and Privacy, 2018.
Attack referred to as backdoor
Attack referred to as ‘poisoning integrity’
121
Are Adversarial Examples a Real Security Threat?
World Is Not Digital…
• …Previous cases of adversarial examples have common characteristic: the adversary is able to precisely control the digital representation of the input to the machine learning tools…
[M. Sharif et al., ACM CCS 2016]
[Figure: School Bus (x) + Adversarial Noise (r) = Ostrich, Struthio camelus]
123
Do Adversarial Examples Exist in the Physical
World?
Adversarial Images in the Physical World
• Do adversarial images fool deep networks even when they operate in the physical world, for example when images are taken with a cell-phone camera?
– Alexey Kurakin et al. (2016, 2017) explored the possibility of creating adversarial images for machine learning systems which operate in the physical world. They used images taken from a cell-phone camera as an input to an Inception v3 image classification neural network
– They showed that in such a set-up, a significant fraction of adversarial images crafted using the original network are misclassified even when fed to the classifier through the camera
[Alexey Kurakin et al., ICLR 2017]
Adversarial Glasses
The adversarial perturbation is applied only to the eyeglasses image region
[M. Sharif et al., ACM CCS 2016]
Should We Be Worried?
126
No, We Should Not…
127
In this paper, we show experiments that suggest that a trained neural
network classifies most of the pictures taken from different distances and
angles of a perturbed image correctly. We believe this is because the
adversarial property of the perturbation is sensitive to the scale at which
the perturbed picture is viewed, so (for example) an autonomous car will
misclassify a stop sign only from a small range of distances.
[arXiv:1707.03501; CVPR 2017]
Yes, We Should...
[Athalye et al., Synthesizing robust adversarial examples. ICLR, 2018]
Yes, We Should…
129
Adversarial Road Signs
[Evtimov et al., CVPR 2017] 130
Is This a Real Security Threat?
• Adversarial examples can exist in the physical world: we can fabricate concrete adversarial objects (glasses, road signs, etc.)
• But the effectiveness of attacks carried out with adversarial objects still has to be investigated with large-scale experiments in realistic security scenarios
• Gilmer et al. (2018) have recently discussed how realistic the security threat posed by adversarial examples is, pointing out that it should be carefully investigated
– Are indistinguishable adversarial examples a real security threat?
– In which real security scenarios are adversarial examples the best attack vector, i.e., better than attacking components outside the machine learning component?
– …
[Justin Gilmer et al., Motivating the Rules of the Game for Adversarial Example Research, https://arxiv.org/abs/1807.06732]
132
Are Indistinguishable Perturbations a Real Security
Threat?
Indistinguishable Adversarial Examples
[Figure: an image x and the adversarial image x + r, which the classifier mislabels]
The adversarial image x + r is visually hard to distinguish from x
…There is a torrent of work that views increased robustness to restricted perturbations as making these models more secure. While not all of this work requires completely indistinguishable modifications, many of the papers focus on specifically small modifications, and the language in many suggests or implies that the degree of perceptibility of the perturbations is an important aspect of their security risk…
[Justin Gilmer et al., Motivating the Rules of the Game for Adversarial Example Research, arXiv 2018]
Indistinguishable Adversarial Examples
• The attacker can benefit from a minimal perturbation of a legitimate input; e.g., she could use the attack for a longer period of time before it is detected
• But is minimal perturbation a necessary constraint for the attacker?
Attacks with Content Preservation
136
There are well known security applications where minimal perturbations and
indistinguishability of adversarial inputs are not required at all…
Are Indistinguishable Perturbations a Real Security Threat?
…At the time of writing, we were unable to find a compelling example that required indistinguishability…
To have the largest impact, we should both recast future adversarial example research as a contribution to core machine learning and develop new abstractions that capture realistic threat models.
[Justin Gilmer et al., Motivating the Rules of the Game for Adversarial Example Research, arXiv 2018]
To Conclude…
This is a recent research field…
138
Dagstuhl Perspectives Workshop on
“Machine Learning in Computer Security”
Schloss Dagstuhl, Germany, Sept. 9th-14th, 2012
Timeline of Learning Security
Tracks: Adversarial ML; Security of DNNs
• 2004-2005: pioneering work. Dalvi et al., KDD 2004; Lowd & Meek, KDD 2005
• 2006-2010: Barreno, Nelson, Rubinstein, Joseph, Tygar. The Security of Machine Learning
• 2013: Srndic & Laskov, NDSS. Claim nonlinear classifiers are secure
• 2013-2014: Biggio et al., ECML, IEEE TKDE. High-confidence & black-box evasion attacks to show vulnerability of nonlinear classifiers
• 2014: Srndic & Laskov, IEEE S&P. Shows vulnerability of nonlinear classifiers with our ECML ’13 gradient-based attack
• 2014: Szegedy et al., ICLR. Adversarial examples vs DL
• 2016: Papernot et al., IEEE S&P. Evasion attacks / adversarial examples
• 2017: Papernot et al., ASIACCS. Black-box evasion attacks
• 2017: Carlini & Wagner, IEEE S&P. High-confidence evasion attacks
• 2017: Grosse et al., ESORICS. Application to Android malware
• 2017: Demontis et al., IEEE TDSC. Secure learning for Android malware
Timeline of Learning Security
Tracks: Adversarial ML; Security of DNNs
• 2004-2005: pioneering work. Dalvi et al., KDD 2004; Lowd & Meek, KDD 2005
Main contributions:
- minimum-distance evasion of linear classifiers
- notion of adversary-aware classifiers
• 2006-2010: Barreno, Nelson, Rubinstein, Joseph, Tygar. The Security of Machine Learning (and references therein)
Main contributions:
- first consolidated view of the adversarial ML problem
- attack taxonomy
- exemplary attacks against some learning algorithms
• 2006: Globerson & Roweis, ICML; 2009: Kolcz et al., CEAS; 2010: Biggio et al., IJMLC
Main contributions:
- evasion attacks against linear classifiers in spam filtering
• 2013: Srndic & Laskov, NDSS
Main contributions:
- evasion of linear PDF malware detectors
- claims nonlinear classifiers can be more secure
• 2013: Biggio et al., ECML-PKDD - demonstrated vulnerability of nonlinear algorithms to gradient-based evasion attacks, also under limited knowledge
Main contributions:
1. gradient-based adversarial perturbations (against SVMs and neural nets)
2. projected gradient descent / iterative attack (also on discrete features from malware data)
transfer attack with surrogate/substitute model
3. maximum-confidence evasion (rather than minimum-distance evasion)
• 2014: Biggio et al., IEEE TKDE
Main contributions:
- framework for security evaluation of learning algorithms
- attacker’s model in terms of goal, knowledge, capability
• 2014: Srndic & Laskov, IEEE S&P - used Biggio et al.’s ECML-PKDD ’13 gradient-based evasion attack to demonstrate vulnerability of nonlinear PDF malware detectors
• 2014: Szegedy et al., ICLR - Independent discovery of (gradient-based) minimum-distance adversarial examples against deep nets; earlier implementation of adversarial training
• 2015: Goodfellow et al., ICLR - Maximin formulation of adversarial training, with adversarial examples generated iteratively in the inner loop
• 2016: Kurakin et al. - Basic iterative attack with projected gradient to generate adversarial examples (2: iterative attacks)
• 2016: Papernot et al., IEEE S&P - Framework for security evaluation of deep nets
• 2016: Papernot et al., Euro S&P - Distillation defense (gradient masking)
• 2017: Papernot et al., ASIACCS - Black-box evasion attacks with substitute models (breaks distillation with transfer attacks on a smoother surrogate classifier)
• 2017: Carlini & Wagner, IEEE S&P - Breaks distillation again with maximum-confidence evasion attacks (rather than using minimum-distance adversarial examples)
• 2017: Grosse et al., ESORICS - Adversarial examples for malware detection
• 2017: Demontis et al., IEEE TDSC - Yes, Machine Learning Can Be More Secure! A Case Study on Android Malware Detection
Main contributions:
- Secure SVM against adversarial examples in malware detection
• 2018: Madry et al., ICLR - Improves the basic iterative attack from Kurakin et al. by adding noise before running the attack; first successful use of adversarial training to generalize across many attack algorithms
Legend
- Pioneering work on adversarial machine learning
- Work on security evaluation of learning algorithms
- Work on evasion attacks (a.k.a. adversarial examples)
- ... in malware detection (PDF / Android)
Biggio and Roli, Wild Patterns: Ten Years After The Rise of Adversarial Machine Learning, Pattern Recognition, 2018
Black Swans to the Fore
[Szegedy et al., Intriguing properties of neural networks, 2014]
After this “black swan”, the issue of security of DNNs came to the fore…
Not only in specialist scientific journals…
The Safety Issue to the Fore…
The black box of AI
D. Castelvecchi, Nature, Vol. 538, 20, Oct 2016
Machine learning is becoming ubiquitous in basic research as well as in industry. But for scientists to trust it, they first need to understand what the machines are doing.
Ellie Dobson, director of data science at the big-data firm Arundo Analytics in Oslo: If something were to go wrong as a result of setting the UK interest rates, she says, “the Bank of England can’t say, ‘the black box made me do it’”.
Why So Much Interest?
Before the deep net “revolution”, people were not surprised when machine learning was wrong; they were more amazed when it worked well…
Now that it seems to work for real applications, people are disappointed, and worried, by errors that humans do not make…
Errors of Humans and Machines…
Machine learning decisions are affected by several sources of bias that cause “strange” errors
But we should keep in mind that humans are biased too…
The Bat and the Ball Problem
A bat and a ball together cost $1.10
The bat costs $1.00 more than the ball
How much does the ball cost?
Please give me the first answer that comes to your mind!
The Bat and the Ball Problem
The correct answer is $0.05 (5 cents)
The wrong answer ($0.10) is due to attribute substitution, a psychological process thought to underlie a number of cognitive biases
It occurs when an individual has to make a judgment (of a target attribute) that is computationally complex, and instead substitutes a more easily calculated heuristic attribute
$\begin{cases} \text{bat} + \text{ball} = \$1.10 \\ \text{bat} = \text{ball} + \$1.00 \end{cases}$
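For completeness, the system can be solved in two lines by substituting the second equation into the first (a worked version of the arithmetic, added here for illustration):

```latex
\begin{aligned}
(\text{ball} + 1.00) + \text{ball} &= 1.10 \\
2\,\text{ball} &= 0.10 \\
\text{ball} &= 0.05, \qquad \text{bat} = \text{ball} + 1.00 = 1.05
\end{aligned}
```

The intuitive answer fails the same check: a ball at $0.10 implies a bat at $1.10 and a total of $1.20.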
Trust in Humans or Machines?
Algorithms are biased, but so are humans…
When should you trust humans and when algorithms?
Learning Comes at a Price!
149
The introduction of novel
learning functionalities increases the
attack surface of computer systems
and produces new vulnerabilities
Safety of machine learning
will be more and more important in
future computer systems, as well as
accountability, transparency, and the
protection of fundamental human
values and rights
Thanks for listening.
Any questions?
If you know the enemy and know yourself, you need not fear
the result of a hundred battles
Sun Tzu, The art of war, 500 BC
Battista Biggio
battista.biggio@unica.it
@biggiobattista
References
• B. Biggio and F. Roli. Wild patterns: Ten years after the rise of adversarial machine learning. Pattern Recognition, 2018.
• B. Biggio, G. Fumera and F. Roli. Security evaluation of pattern classifiers under attack. IEEE Transactions on Knowledge and Data
Engineering, 26(4):984–996, April 2014.
• B. Biggio, I. Corona, D. Maiorca, B. Nelson, N. Srndic, P. Laskov, G. Giacinto and F. Roli. Evasion attacks against machine learning at test
time. ECML PKDD, 2013.
• F. Crecchi, D. Bacciu, and B. Biggio. Detecting adversarial examples through nonlinear dimensionality reduction. ESANN, 2019.
• B. Kolosnjaji, A. Demontis, B. Biggio, D. Maiorca, G. Giacinto, C. Eckert and F. Roli. Adversarial Malware Binaries: Evading Deep
Learning for Malware Detection in Executables. EUSIPCO 2018.
• A. Demontis, M. Melis, M. Pintor, M. Jagielski, B. Biggio, A. Oprea, C. Nita-Rotaru, and F. Roli. Why do adversarial attacks transfer?
Explaining transferability of evasion and poisoning attacks. In 28th USENIX Security Symposium (USENIX Security 19).
• A. Demontis, M. Melis, B. Biggio, D. Maiorca, D. Arp, K. Rieck, I. Corona, G. Giacinto, and F. Roli. Yes, machine learning can be more
secure! a case study on android malware detection. IEEE Trans. Dep. and Secure Computing, In press.
• M. Melis, A. Demontis, B. Biggio, G. Brown, G. Fumera, and F. Roli. Is deep learning safe for robot vision? Adversarial examples against
the iCub humanoid. In ICCV Workshop on Vision in Practice on Autonomous Robots (ViPAR), 2017.
• B. Biggio, B. Nelson, and P. Laskov. Poisoning attacks against support vector machines. In J. Langford and J. Pineau, editors, 29th Int’l
Conf. on Machine Learning, pages 1807–1814. Omnipress, 2012.
• H. Xiao, B. Biggio, G. Brown, G. Fumera, C. Eckert, and F. Roli. Is feature selection secure against training data poisoning? In F. Bach and
D. Blei, editors, JMLR W&CP - Proc. 32nd Int’l Conf. Mach. Learning (ICML), volume 37, pages 1689–1698, 2015.
• L. Munoz-Gonzalez, B. Biggio, A. Demontis, A. Paudice, V. Wongrassamee, E. C. Lupu, and F. Roli. Towards poisoning of deep learning
algorithms with back-gradient optimization. AISec ’17, pages 27–38, New York, NY, USA, 2017. ACM.
• M. Jagielski, A. Oprea, B. Biggio, C. Liu, C. Nita-Rotaru, and B. Li. Manipulating machine learning: Poisoning attacks and
countermeasures for regression learning. In 39th IEEE Symposium on Security and Privacy, 2018.
• A. Athalye, L. Engstrom, A. Ilyas, and K. Kwok. Synthesizing robust adversarial examples. In ICLR, 2018.
• C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus. Intriguing properties of neural networks. In ICLR,
2014.
151

More Related Content

Similar to Wild Patterns: A Half-day Tutorial on Adversarial Machine Learning. ICMLC 2019 - Kobe, Japan

Wild Patterns: A Half-day Tutorial on Adversarial Machine Learning - 2019 Int...
Wild Patterns: A Half-day Tutorial on Adversarial Machine Learning - 2019 Int...Wild Patterns: A Half-day Tutorial on Adversarial Machine Learning - 2019 Int...
Wild Patterns: A Half-day Tutorial on Adversarial Machine Learning - 2019 Int...Pluribus One
 
Machine Learning ICS 273A
Machine Learning ICS 273AMachine Learning ICS 273A
Machine Learning ICS 273Abutest
 
Machine Learning ICS 273A
Machine Learning ICS 273AMachine Learning ICS 273A
Machine Learning ICS 273Abutest
 
(Very) Recent AI advances for Chemical Engineering research and education
(Very) Recent AI advances for Chemical Engineering research and education(Very) Recent AI advances for Chemical Engineering research and education
(Very) Recent AI advances for Chemical Engineering research and educationRichard West
 
Digital Fabrication Studio 0.3 Fabbing and FabLabs
Digital Fabrication Studio 0.3 Fabbing and FabLabsDigital Fabrication Studio 0.3 Fabbing and FabLabs
Digital Fabrication Studio 0.3 Fabbing and FabLabsMassimo Menichinelli
 
When the AIs failures send us back to our own societal biases
When the AIs failures send us back to our own societal biasesWhen the AIs failures send us back to our own societal biases
When the AIs failures send us back to our own societal biasesClément DUFFAU
 
Digital Fabrication Studio v.0.2: Digital Fabrication and FabLab ecosystem
Digital Fabrication Studio v.0.2: Digital Fabrication and FabLab ecosystemDigital Fabrication Studio v.0.2: Digital Fabrication and FabLab ecosystem
Digital Fabrication Studio v.0.2: Digital Fabrication and FabLab ecosystemMassimo Menichinelli
 
Practical Machine Ethics @ SXSW2019
Practical Machine Ethics @ SXSW2019Practical Machine Ethics @ SXSW2019
Practical Machine Ethics @ SXSW2019Jesus Ramos
 
Deep Learning - a Path from Big Data Indexing to Robotic Applications
Deep Learning - a Path from Big Data Indexing to Robotic ApplicationsDeep Learning - a Path from Big Data Indexing to Robotic Applications
Deep Learning - a Path from Big Data Indexing to Robotic ApplicationsDarius Burschka
 
NeurIPS2023 Keynote: The Many Faces of Responsible AI.pdf
NeurIPS2023 Keynote: The Many Faces of Responsible AI.pdfNeurIPS2023 Keynote: The Many Faces of Responsible AI.pdf
NeurIPS2023 Keynote: The Many Faces of Responsible AI.pdfLora Aroyo
 
Deep Learning: Application Landscape - March 2018
Deep Learning: Application Landscape - March 2018Deep Learning: Application Landscape - March 2018
Deep Learning: Application Landscape - March 2018Grigory Sapunov
 
Dl applicationlandscape-mar2018-180405144127
Dl applicationlandscape-mar2018-180405144127Dl applicationlandscape-mar2018-180405144127
Dl applicationlandscape-mar2018-180405144127Aravindharamanan S
 
How do we detect malware? A step-by-step guide
How do we detect malware? A step-by-step guideHow do we detect malware? A step-by-step guide
How do we detect malware? A step-by-step guideMarcus Botacin
 
Deep Neural Networks for Machine Learning
Deep Neural Networks for Machine LearningDeep Neural Networks for Machine Learning
Deep Neural Networks for Machine LearningJustin Beirold
 
Future of AI - 2023 07 25.pptx
Future of AI - 2023 07 25.pptxFuture of AI - 2023 07 25.pptx
Future of AI - 2023 07 25.pptxGreg Makowski
 
Explore, Explain, and Debug aka Interpretable Machine Learning
Explore, Explain, and Debug aka Interpretable Machine LearningExplore, Explain, and Debug aka Interpretable Machine Learning
Explore, Explain, and Debug aka Interpretable Machine LearningPrzemek Biecek
 
Keepler Data Tech | Entendiendo tus propios modelos predictivos
Keepler Data Tech | Entendiendo tus propios modelos predictivosKeepler Data Tech | Entendiendo tus propios modelos predictivos
Keepler Data Tech | Entendiendo tus propios modelos predictivosKeepler Data Tech
 
Predictive analytics semi-supervised learning with GANs
Predictive analytics   semi-supervised learning with GANsPredictive analytics   semi-supervised learning with GANs
Predictive analytics semi-supervised learning with GANsterek47
 

Wild Patterns: A Half-day Tutorial on Adversarial Machine Learning. ICMLC 2019 - Kobe, Japan

  • 1. Wild Patterns: A Half-day Tutorial on Adversarial Machine Learning. Battista Biggio, Pattern Recognition and Applications Lab, University of Cagliari, Italy. ICMLC Tutorial – July 7, 2019 - Hotel Portopia, Kobe, Japan. * Slides from this talk are inspired by the tutorial I prepared with Fabio Roli on this topic. https://www.pluribus-one.it/research/sec-ml/wild-patterns/
  • 2. http://pralab.diee.unica.it @biggiobattista A Question to Start… What is the oldest survey article on machine learning that you have ever read? What is the publication year?
  • 3. This Is Mine… Year 1966. Credits: Dr Gavin Brown for showing me this article
  • 4. Applications in the Good Old Days… What applications do you think this paper dealt with?
  • 5. Popular Applications in the Sixties: OCR for bank cheque sorting; aerial photo recognition; detection of particle tracks in bubble chambers
  • 6. Key Feature of These Apps: specialised applications for professional users…
  • 7. What about Today's Applications?
  • 8. Computer Vision for Self-Driving Cars. He et al., Mask R-CNN, ICCV ’17, https://arxiv.org/abs/1703.06870 Video from: https://www.youtube.com/watch?v=OOT3UIXZztE
  • 9. Automatic Speech Recognition for Virtual Assistants: Amazon Alexa, Apple Siri, Microsoft Cortana, Google Assistant
  • 11. Key Features of Today's Apps: personal and consumer applications…
  • 12. We Are Living in the Best of the Worlds… AI is going to transform industry and business as electricity did about a century ago (Andrew Ng, Jan. 2017)
  • 14. iPhone 5s and 6s with Fingerprint Reader…
  • 16. But maybe this happens only for old, shallow machine learning… End-to-end deep learning is another story…
  • 17. Adversarial School Bus. Szegedy et al., Intriguing properties of neural networks, ICLR 2014; Biggio, Roli et al., Evasion attacks against machine learning at test time, ECML-PKDD 2013
  • 18. Adversarial Glasses • M. Sharif et al. (ACM CCS 2016) attacked deep neural networks for face recognition with carefully fabricated eyeglass frames • When worn by a 41-year-old white male (left image), the glasses mislead the deep network into believing that the face belongs to the famous actress Milla Jovovich
  • 19. But maybe this happens only for image recognition…
  • 20. Audio Adversarial Examples. Audio transcriptions by Mozilla DeepSpeech: “without the dataset the article is useless”; “okay google browse to evil dot com”. https://nicholas.carlini.com/code/audio_adversarial_examples/
  • 21. Evasion of Deep Networks for EXE Malware Detection • MalConv: convolutional deep network trained on raw bytes to detect EXE malware • Our attack can evade it by adding a few padding bytes [Kolosnjaji, Biggio, Roli et al., Adversarial Malware Binaries, EUSIPCO 2018] [Demetrio, Biggio et al., Explaining Vulnerability of DL ..., ITASEC 2019]
  • 22. Take-home Message: We are living in exciting times for machine learning… Our work feeds a lot of consumer technologies for personal applications... This opens up big new possibilities, but also new security risks
  • 23. Where Do These Security Risks Come From?
  • 24. The Classical Statistical Model. Note these two implicit assumptions of the model: 1. The source of data is given, and it does not depend on the classifier 2. Noise affecting data is stochastic. [Diagram: data source, acquisition/measurement, raw data, feature vector (x1, x2, ..., xd), learning algorithm, classifier; stochastic noise. Example: OCR]
  • 25. Can This Model Be Used Under Attack? [Diagram: data source, acquisition/measurement, raw data, feature vector (x1, x2, ..., xd), learning algorithm, classifier; stochastic noise. Example: OCR]
  • 26. An Example: Spam Filtering. Message: From: spam@example.it, “Buy Viagra !” Linear classifier with feature weights buy = 1.0, viagra = 5.0: total score = 6.0 > 5.0 (threshold), so the message is labeled Spam • The famous SpamAssassin filter is really a linear classifier: http://spamassassin.apache.org
  • 27. Feature Space View. [Plot: feature space with axes X1 and X2, samples marked + and -, and the label yc(x)] Feature weights: buy = 1.0, viagra = 5.0. Message: From: spam@example.it, “Buy Viagra!” • Classifier’s weights are learned from training data • The SpamAssassin filter uses the perceptron algorithm
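The scoring rule on slides 26 and 27 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two weights and the threshold are the ones shown on the slide, while the tokenizer and the function names (`tokenize`, `spam_score`, `classify`) are hypothetical and are not SpamAssassin's actual implementation.

```python
# Feature weights and threshold as shown on the slide; a real filter uses many more rules.
WEIGHTS = {"buy": 1.0, "viagra": 5.0}
THRESHOLD = 5.0


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (hypothetical tokenizer)."""
    cleaned = "".join(ch if ch.isalpha() else " " for ch in text.lower())
    return cleaned.split()


def spam_score(text, weights=WEIGHTS):
    """Linear score: sum the weights of the distinct known words in the message."""
    return sum(weights.get(tok, 0.0) for tok in set(tokenize(text)))


def classify(text, weights=WEIGHTS, threshold=THRESHOLD):
    """Label the message as spam when its score exceeds the threshold."""
    return "spam" if spam_score(text, weights) > threshold else "ham"


message = "Buy Viagra !"
print(spam_score(message), classify(message))
```

With the slide's weights, the message scores 1.0 + 5.0 = 6.0, which exceeds the 5.0 threshold.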
  • 28. But spam filtering is not a stationary classification task; the data source is not neutral…
  • 29. The Data Source Can Add “Good” Words. Message: From: spam@example.it, “Buy Viagra ! conference meeting” Linear classifier with feature weights buy = 1.0, viagra = 5.0, conference = -2.0, meeting = -3.0: total score = 1.0 < 5.0 (threshold), so the message is labeled Ham • Adding good words is a typical spammer's trick [Z. Jorgensen et al., JMLR 2008]
  • 30. Adding Good Words: Feature Space View. [Plot: feature space with axes X1 and X2, samples marked + and -, and the label yc(x)] Feature weights: buy = 1.0, viagra = 5.0, conference = -2.0, meeting = -3.0. Message: From: spam@example.it, “Buy Viagra! conference meeting” • Note that spammers corrupt patterns with noise that is not random
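A minimal sketch of the good-word trick from slides 29 and 30, written in Python. It assumes an attacker who knows the filter's weights, which is a simplification made here for illustration; the greedy strategy and the names `score` and `good_word_attack` are hypothetical and are not taken from the cited paper.

```python
# Feature weights and threshold as shown on the slide.
WEIGHTS = {"buy": 1.0, "viagra": 5.0, "conference": -2.0, "meeting": -3.0}
THRESHOLD = 5.0


def score(words, weights=WEIGHTS):
    """Linear score: sum the weights of the distinct known words."""
    return sum(weights.get(w, 0.0) for w in set(words))


def good_word_attack(words, weights=WEIGHTS, threshold=THRESHOLD):
    """Greedily append the most negative-weight words until the score
    no longer exceeds the threshold (assumes the weights are known)."""
    words = list(words)
    candidates = sorted(
        (w for w in weights if weights[w] < 0 and w not in words),
        key=lambda w: weights[w],
    )
    for w in candidates:
        if score(words, weights) <= threshold:
            break
        words.append(w)
    return words


spam = ["buy", "viagra"]
evasive = good_word_attack(spam)
print(score(spam), evasive, score(evasive))
```

The slide's own example adds both good words, giving 1.0 + 5.0 - 2.0 - 3.0 = 1.0. The greedy variant stops as soon as the score falls to the threshold or below, so it adds fewer words when one is enough.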
  • 31. Is This Model Good for Spam Filtering? • The source of data is given, and it does not depend on the classifier • Noise affecting data is stochastic (“random”) [Diagram: data source, acquisition/measurement, raw data, feature vector (x1, x2, ..., xd), learning algorithm, classifier; stochastic noise. Example: OCR]
  • 32. No, it is not…
  • 33. Adversarial Machine Learning 1. The source of data is not neutral, it depends on the classifier 2. Noise is not stochastic, it is adversarial, crafted to maximize the classification error. [Diagram: non-neutral data source, measurement, raw data, feature vector (x1, x2, ..., xn), learning algorithm, classifier; adversarial noise. Spam message: Buy Viagra. Camouflaged message: Buy Vi@gra Dublin University]
  • 34. Adversarial Noise vs. Stochastic Noise • This distinction is not new... • Hamming’s adversarial noise model: the channel acts as an adversary that arbitrarily corrupts the code-word subject to a bound on the total number of errors • Shannon’s stochastic noise model: probabilistic model of the channel, the probability of occurrence of too many or too few errors is usually low
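The difference between the two noise models on slide 34 can be illustrated with a toy experiment in Python. The 3-bit repetition code, the error budget of 20 flips, and all function names below are assumptions chosen for this sketch; they do not come from the slides. The adversary spends its bounded number of flips where they do the most damage, while the stochastic channel scatters a similar expected number of flips at random.

```python
import random


def encode(bits, n=3):
    """Repetition code: repeat each bit n times."""
    return [b for b in bits for _ in range(n)]


def decode(code, n=3):
    """Majority vote over each block of n bits."""
    return [int(sum(code[i:i + n]) > n // 2) for i in range(0, len(code), n)]


def stochastic_channel(code, p, rng):
    """Shannon-style noise: flip each bit independently with probability p."""
    return [b ^ (rng.random() < p) for b in code]


def adversarial_channel(code, budget, n=3):
    """Hamming-style noise: a bounded number of flips, placed to hurt most.
    Flipping a majority of one block (n // 2 + 1 bits) forces a decoding error."""
    code = list(code)
    per_block = n // 2 + 1
    for block in range(min(budget // per_block, len(code) // n)):
        for j in range(per_block):
            code[block * n + j] ^= 1
    return code


def count_errors(sent, received):
    return sum(a != b for a, b in zip(sent, received))


rng = random.Random(0)
message = [rng.randint(0, 1) for _ in range(100)]
code = encode(message)
budget = 20  # total flips allowed to the adversary

adv_errors = count_errors(message, decode(adversarial_channel(code, budget)))
sto_errors = count_errors(message, decode(stochastic_channel(code, budget / len(code), rng)))
print("adversarial:", adv_errors, "stochastic:", sto_errors)
```

In this setup the adversary corrupts exactly `budget // 2` decoded bits, because every pair of flips lands in the same block. Random flips only cause an error when two or more happen to fall in one block, which is much less likely at the same flip rate.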
  • 35. The Classical Model Cannot Work • Standard classification algorithms assume that – the data-generating process is independent of the classifier – training and test data follow the same distribution (i.i.d. samples) • This is not the case for adversarial tasks! • Classifier performance will degrade quickly if the adversarial noise is not taken into account – Adversarial tasks are a “mission impossible” for the classical model
  • 36. 36 How Should We Design Pattern Classifiers Under Attack?
  • 37. http://pralab.diee.unica.it @biggiobattista Adversary-aware Machine Learning 37 Machine learning systems should be aware of the arms race with the adversary [Biggio, Fumera, Roli. Security evaluation of pattern classifiers under attack, IEEE TKDE, 2014] Adversary System Designer Analyze system Devise and execute attack Analyze attack Develop countermeasure
  • 38. Arms Race: The Case of Image Spam • In 2004 spammers invented a new trick for evading anti-spam filters… – As filters did not analyze the content of attached images… – Spammers embedded their messages into images… so evading filters… • Image-based spam example: From: Conrad Stern <rjlfm@berlin.de> To: utente@emailserver.it. Your orological prescription appointment starts September 30th bergstrom mustsquawbush try bimini , maine see woodwind in con or patagonia or scrapbook but. patriarchal and tasteful must advisory not thoroughgoing the frowzy not ellwood da jargon and. beresford ! arpeggio must stern try disastrous ! alone , wear da esophagi try autonomic da clyde and taskmaster, tideland try cream see await must mort in.
  • 39. http://pralab.diee.unica.it @biggiobattista • PRA Lab team proposed a countermeasure against image spam… – G. Fumera, I. Pillai, F. Roli, Spam filtering based on the analysis of text information embedded into images, Journal of Machine Learning Research, Vol. 7, 2006 • Text embedded in images is read by Optical Character Recognition (OCR) • OCRing image text and fusing it with other mail data allows discriminating spam/ham mails Arms Race: The Case of Image Spam 39
  • 40. http://pralab.diee.unica.it @biggiobattista • The OCR-based solution was deployed as a plug-in of SpamAssassin filter (called Bayes OCR) and worked well for a while… http://wiki.apache.org/spamassassin/CustomPlugins Arms Race: The Case of Image Spam 40
  • 41. http://pralab.diee.unica.it @biggiobattista Spammers’ Reaction • Spammers reacted quickly with a countermeasure against OCR-based solutions (and against signature-based image spam detection) • They applied content obscuring techniques to images, like done in CAPTCHAs, to make OCR systems ineffective without compromising human readability 41
  • 42. Arms Race: The Case of Image Spam • PRA Lab made another countermove by devising features that detect the presence of spammers' obfuscation techniques in text images: – a feature for detecting characters fragmented or mixed with small background components – a feature for detecting characters connected through background components – a feature for detecting non-uniform background and hidden text • This solution was deployed as a new plug-in for the SpamAssassin filter (called Image Cerberus) • You can find the complete story here: http://en.wikipedia.org/wiki/Image_spam
  • 43. 43 How Can We Design Adversary-aware Machine Learning Systems?
  • 44. http://pralab.diee.unica.it @biggiobattista Adversary-aware Machine Learning 44 Machine learning systems should be aware of the arms race with the adversary [Biggio, Fumera, Roli. Security evaluation of pattern classifiers under attack, IEEE TKDE, 2014] Adversary System Designer Analyze system Devise and execute attack Analyze attack Develop countermeasure
  • 45. Adversary-aware Machine Learning: Machine learning systems should be aware of the arms race with the adversary [Biggio, Fumera, Roli. Security evaluation of pattern classifiers under attack, IEEE TKDE, 2014]. Cycle carried out entirely by the system designer: Model adversary → Simulate attack → Evaluate attack’s impact → Develop countermeasure
  • 46. http://pralab.diee.unica.it @biggiobattista The Three Golden Rules 1. Know your adversary 2. Be proactive 3. Protect your classifier 46
  • 47. 47 Know your adversary If you know the enemy and know yourself, you need not fear the result of a hundred battles (Sun Tzu, The art of war, 500 BC)
  • 48. http://pralab.diee.unica.it @biggiobattista Adversary’s 3D Model 48 Adversary’s Knowledge Adversary’s Capability Adversary’s Goal
  • 49. Adversary’s Goal • To cause a security violation... – Integrity: misclassifications that do not compromise normal system operation – Availability: misclassifications that compromise normal system operation (denial of service) – Confidentiality / Privacy: querying strategies that reveal confidential information on the learning model or its users [Barreno et al., Can Machine Learning Be Secure? ASIACCS ’06]
  • 50. Adversary’s Knowledge • Perfect-knowledge (white-box) attacks – upper bound on the performance degradation under attack • Components the adversary may know: training data; feature representation (x1, x2, ..., xd); learning algorithm (e.g., SVM); parameters (e.g., feature weights); feedback on decisions [B. Biggio, G. Fumera, F. Roli, IEEE TKDE 2014]
  • 51. Adversary’s Knowledge • Limited-knowledge attacks – ranging from gray-box to black-box attacks • Components considered: training data; feature representation (x1, x2, ..., xd); learning algorithm (e.g., SVM); parameters (e.g., feature weights); feedback on decisions [B. Biggio, G. Fumera, F. Roli, IEEE TKDE 2014]
  • 52. http://pralab.diee.unica.it @biggiobattista Kerckhoffs’ Principle • Kerckhoffs’ Principle (Kerckhoffs 1883) states that the security of a system should not rely on unrealistic expectations of secrecy – It’s the opposite of the principle of “security by obscurity” • Secure systems should make minimal assumptions about what can realistically be kept secret from a potential attacker • For machine learning systems, one could assume that the adversary is aware of the learning algorithm and can obtain some degree of information about the data used to train the learner • But the best strategy is to assess system security under different levels of adversary’s knowledge 52[Joseph et al., Adversarial Machine Learning, Cambridge Univ. Press, 2017]
  • 53. Adversary’s Capability • Attackers may manipulate training data and/or test data – TRAINING: influence the model at training time to cause subsequent errors at test time (poisoning attacks, backdoors) – TEST: manipulate malicious samples at test time to cause misclassifications (evasion attacks, adversarial examples) [B. Biggio, G. Fumera, F. Roli, IEEE TKDE 2014; M. Barreno et al., ML 2010]
  • 54. A Deliberate Poisoning Attack? Microsoft deployed Tay, an AI chatbot designed to talk to youngsters on Twitter, but after 16 hours the chatbot was shut down because it had started posting racist and offensive comments. [http://exploringpossibilityspace.blogspot.it/2016/03/poor-software-qa-is-root-cause-of-tay.html]
  • 55. http://pralab.diee.unica.it @biggiobattista Adversary’s Capability • Luckily, the adversary is not omnipotent, she is constrained… 55[R. Lippmann, Dagstuhl Workshop, 2012] Email messages must be understandable by human readers Malware must execute on a computer, usually exploiting a known vulnerability
  • 56. Adversary’s Capability • Constraints on data manipulation – TRAINING: maximum number of samples that can be added to the training data • the attacker usually controls only a small fraction of the training samples – TEST: maximum amount of modifications • application-specific constraints in feature space • e.g., max. number of words that are modified in spam emails • Feasible domain: $d(x, x') \le d_{\max}$, where x is the original sample, x' the manipulated one, and f(x) the classifier [B. Biggio, G. Fumera, F. Roli, IEEE TKDE 2014]
  • 57. http://pralab.diee.unica.it @biggiobattista Conservative Design • The design and analysis of a system should avoid unnecessary or unreasonable assumptions on the adversary’s capability – worst-case security evaluation • Conversely, analysing the capabilities of an omnipotent adversary reveals little about a learning system’s behaviour against realistically-constrained attackers • Again, the best strategy is to assess system security under different levels of adversary’s capability 57[Joseph et al., Adversarial Machine Learning, Cambridge Univ. Press, 2017]
  • 58. 58 Be Proactive To know your enemy, you must become your enemy (Sun Tzu, The art of war, 500 BC)
  • 59. http://pralab.diee.unica.it @biggiobattista Be Proactive • Given a model of the adversary characterized by her: – Goal – Knowledge – Capability Try to anticipate the adversary! • What is the optimal attack she can do? • What is the expected performance decrease of your classifier? 59
  • 60. Evasion of Linear Classifiers • Problem: how to evade a linear (trained) classifier f(x) = sign(w^T x)?
    Features: start, bang, portfolio, winner, year, ..., university, campus
    Weights w: +2, +1, +1, +1, +1, ..., -3, -4
    Original email: “Start 2007 with a bang! Make WBFS YOUR PORTFOLIO’s first winner of the year ...” gives x = (1, 1, 1, 1, 1, ..., 0, 0) and w^T x = +6 > 0: SPAM (correctly classified)
    Manipulated email: “St4rt 2007 with a b4ng! Make WBFS YOUR PORTFOLIO’s first winner of the year ... campus” gives x’ = (0, 0, 1, 1, 1, ..., 0, 1) and w^T x’ = +3 - 4 = -1 < 0: HAM (misclassified email)
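The evasion on this slide can be checked numerically. The sketch below uses only the seven features visible on the slide (the elided "..." features are left out, which is an assumption), and a simple regular-expression tokenizer that is also an assumption rather than the filter's real preprocessing.

```python
import re

# Features and weights visible on the slide; elided features are omitted.
VOCAB = ["start", "bang", "portfolio", "winner", "year", "university", "campus"]
W = [2, 1, 1, 1, 1, -3, -4]


def featurize(text):
    """Binary bag-of-words vector over VOCAB (hypothetical tokenizer)."""
    tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
    return [1 if word in tokens else 0 for word in VOCAB]


def score(x):
    """Linear discriminant w^T x; the classifier outputs sign(w^T x)."""
    return sum(w * xi for w, xi in zip(W, x))


original = "Start 2007 with a bang! Make WBFS YOUR PORTFOLIO's first winner of the year"
evasive = "St4rt 2007 with a b4ng! Make WBFS YOUR PORTFOLIO's first winner of the year campus"

for email in (original, evasive):
    s = score(featurize(email))
    print(s, "SPAM" if s > 0 else "HAM")
```

Obfuscating two high-weight spam words removes +3 from the score, and adding one strongly negative word subtracts another 4, which flips the sign of the discriminant.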
  • 61. Evasion of Nonlinear Classifiers • What if the classifier is nonlinear? • Decision functions can be arbitrarily complicated, with no clear relationship between features (x) and classifier parameters (w) [Figure: decision function of a nonlinear classifier]
  • 62. http://pralab.diee.unica.it @biggiobattista Detection of Malicious PDF Files Srndic & Laskov, Detection of malicious PDF files based on hierarchical document structure, NDSS 2013 “The most aggressive evasion strategy we could conceive was successful for only 0.025% of malicious examples tested against a nonlinear SVM classifier with the RBF kernel [...]. Currently, we do not have a rigorous mathematical explanation for such a surprising robustness. Our intuition suggests that [...] the space of true features is “hidden behind” a complex nonlinear transformation which is mathematically hard to invert. [...] the same attack staged against the linear classifier [...] had a 50% success rate; hence, the robustness of the RBF classifier must be rooted in its nonlinear transformation” 62
  • 63. Evasion Attacks against Machine Learning at Test Time [Biggio, Corona, Maiorca, Nelson, Srndic, Laskov, Giacinto, Roli, ECML-PKDD 2013] • Goal: maximum-confidence evasion • Knowledge: perfect (white-box attack) • Classifier: $f(x) = \mathrm{sign}(g(x))$, with $+1$ = malicious and $-1$ = legitimate • Attack strategy: $\min_{x'} g(x') \;\; \text{s.t.} \;\; \|x - x'\| \le d_{\max}$ • Non-linear, constrained optimization – Projected gradient descent: approximate solution for smooth functions • Gradients of g(x) can be analytically computed in many cases – SVMs, neural networks
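A minimal sketch of projected gradient descent for this attack follows. It assumes an L2 distance constraint, a fixed step size, and a toy linear discriminant g(x) = w·x + b standing in for the real classifier; all names and values are illustrative, not the authors' implementation.

```python
import math


def project_l2(x, center, radius):
    """Project x onto the L2 ball of the given radius around center."""
    delta = [a - c for a, c in zip(x, center)]
    norm = math.sqrt(sum(d * d for d in delta))
    if norm <= radius:
        return list(x)
    scale = radius / norm
    return [c + d * scale for c, d in zip(center, delta)]


def pgd_evasion(x0, grad_g, d_max, step=0.1, n_iter=50):
    """Minimize g(x') subject to ||x0 - x'||_2 <= d_max."""
    x_adv = list(x0)
    for _ in range(n_iter):
        grad = grad_g(x_adv)
        # Gradient step that decreases g, then projection onto the feasible domain.
        x_adv = [a - step * gr for a, gr in zip(x_adv, grad)]
        x_adv = project_l2(x_adv, x0, d_max)
    return x_adv


# Toy target (hypothetical): g(x) = w.x + b, positive means "malicious".
W, B = [1.0, 1.0], -1.0


def g(x):
    return sum(w * a for w, a in zip(W, x)) + B


def grad_g(x):
    return list(W)  # the gradient of a linear function is constant


x0 = [2.0, 2.0]
x_adv = pgd_evasion(x0, grad_g, d_max=2.5)
print(g(x0), g(x_adv))
```

With a larger `d_max` the attack reaches lower values of g, which is the "maximum-confidence" aspect: the sample is pushed as deep into the legitimate region as the distance budget allows, not merely across the boundary.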
  • 64. Computing Descent Directions [Biggio, Roli et al., ECML PKDD 2013]
    Support vector machines: $g(x) = \sum_i \alpha_i y_i k(x, x_i) + b$, $\nabla g(x) = \sum_i \alpha_i y_i \nabla k(x, x_i)$
    RBF kernel gradient: $\nabla k(x, x_i) = -2\gamma \exp\{-\gamma \|x - x_i\|^2\}(x - x_i)$
    Neural networks (inputs $x_1, \dots, x_d$, hidden units $\delta_1, \dots, \delta_m$, input-to-hidden weights $v_{kf}$, hidden-to-output weights $w_k$): $g(x) = \left[1 + \exp\left(-\sum_{k=1}^{m} w_k \delta_k(x)\right)\right]^{-1}$, $\frac{\partial g(x)}{\partial x_f} = g(x)\,(1 - g(x)) \sum_{k=1}^{m} w_k \delta_k(x)\,(1 - \delta_k(x))\, v_{kf}$
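The RBF-kernel SVM gradient can be sketched in plain Python and sanity-checked against a finite-difference approximation. The support vectors, coefficients (each standing for the product alpha_i * y_i), bias and gamma below are made-up values for illustration only.

```python
import math


def rbf_kernel(x, xi, gamma):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-gamma * sq_dist)


def g(x, support_vectors, coefs, b, gamma):
    """SVM discriminant: sum_i coef_i * k(x, x_i) + b, with coef_i = alpha_i * y_i."""
    return sum(c * rbf_kernel(x, sv, gamma) for c, sv in zip(coefs, support_vectors)) + b


def grad_g(x, support_vectors, coefs, gamma):
    """Analytic gradient, using grad k(x, x_i) = -2*gamma*k(x, x_i)*(x - x_i)."""
    grad = [0.0] * len(x)
    for c, sv in zip(coefs, support_vectors):
        k = rbf_kernel(x, sv, gamma)
        for j in range(len(x)):
            grad[j] += c * (-2.0 * gamma * k * (x[j] - sv[j]))
    return grad


# Hypothetical model parameters.
SVS = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.5]]
COEFS = [0.7, -1.2, 0.5]
BIAS, GAMMA = 0.1, 0.5

x = [0.8, 0.3]
analytic = grad_g(x, SVS, COEFS, GAMMA)

# Central finite differences as an independent check of the formula.
eps = 1e-6
numeric = []
for j in range(len(x)):
    hi = list(x); hi[j] += eps
    lo = list(x); lo[j] -= eps
    numeric.append((g(hi, SVS, COEFS, BIAS, GAMMA) - g(lo, SVS, COEFS, BIAS, GAMMA)) / (2 * eps))

print(analytic, numeric)
```

The agreement between the two gradients is a cheap way to catch sign or scaling mistakes before plugging the gradient into an attack loop.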
  • 65. An Example on Handwritten Digits • Nonlinear SVM (RBF kernel) to discriminate between ‘3’ and ‘7’ • Features: gray-level pixel values (28 x 28 image = 784 features) • Few modifications are enough to evade detection! • The first adversarial examples generated with gradient-based attacks date back to 2013 (one year before attacks on deep neural networks) [Figure panels: before attack (3 vs 7); after attack, g(x) = 0; after attack, last iteration (misclassified as 7); g(x) versus number of iterations] [Biggio, Roli et al., ECML PKDD 2013]
  • 66. Bounding the Adversary’s Knowledge: Limited-knowledge (gray/black-box) attacks • Only feature representation and (possibly) learning algorithm are known • Surrogate data sampled from the same distribution as the classifier’s training data • Classifier’s feedback to label surrogate data • Procedure: sample surrogate training data from P(X,Y) → send queries to the target f(x) → get labels → learn surrogate classifier f’(x) • This is the same underlying idea behind substitute models and black-box attacks (transferability) [N. Papernot et al., IEEE Euro S&P ’16; N. Papernot et al., ASIACCS ’17] [Biggio, Roli et al., ECML PKDD 2013]
  • 67. http://pralab.diee.unica.it @biggiobattista Recent Results on Android Malware Detection • Drebin: Arp et al., NDSS 2014 – Android malware detection directly on the mobile phone – Linear SVM trained on features extracted from static code analysis [Demontis, Biggio et al., IEEE TDSC 2017] x2 Classifier 0 1 ... 1 0 Android app (apk) malware benign x1 x f (x) 67
  • 68. http://pralab.diee.unica.it @biggiobattista Recent Results on Android Malware Detection • Dataset (Drebin): 5,600 malware and 121,000 benign apps (TR: 30K, TS: 60K) • Detection rate at FP=1% vs max. number of manipulated features (averaged on 10 runs) – Perfect knowledge (PK) white-box attack; Limited knowledge (LK) black-box attack 68[Demontis, Biggio et al., IEEE TDSC 2017]
  • 69. Take-home Messages • Linear and non-linear supervised classifiers can be highly vulnerable to well-crafted evasion attacks • Performance evaluation should always be performed as a function of the adversary’s knowledge and capability – Security Evaluation Curves • Attack formulation: $\min_{x'} g(x') \;\; \text{s.t.} \;\; d(x, x') \le d_{\max}, \; x \le x'$
  • 70. Why Is Machine Learning So Vulnerable? • Learning algorithms tend to overemphasize some features to discriminate among classes • Large sensitivity to changes of such input features: $\nabla_x g(x)$ • Different classifiers tend to find the same set of relevant features – that is why attacks can transfer across models! [Figure: g(x) along the direction from x to x’, crossing from the positive to the negative class] [Melis et al., Explaining Black-box Android Malware Detection, EUSIPCO 2018]
  • 71. 71 2014: Deep Learning Meets Adversarial Machine Learning
  • 72. http://pralab.diee.unica.it @biggiobattista The Discovery of Adversarial Examples 72 ... we find that deep neural networks learn input-output mappings that are fairly discontinuous to a significant extent. We can cause the network to misclassify an image by applying a certain hardly perceptible perturbation, which is found by maximizing the network’s prediction error ... [Szegedy, Goodfellow et al., Intriguing Properties of NNs, ICLR 2014, ArXiv 2013]
  • 73. http://pralab.diee.unica.it @biggiobattista Adversarial Examples and Deep Learning • C. Szegedy et al. (ICLR 2014) independently developed a gradient-based attack against deep neural networks – minimally-perturbed adversarial examples 73[Szegedy, Goodfellow et al., Intriguing Properties of NNs, ICLR 2014, ArXiv 2013]
  • 74. Creation of Adversarial Examples. Example: School Bus (x) + Adversarial Noise (r) = Ostrich, Struthio camelus. Here l is a label with $f(x) \ne l$. The adversarial image x + r is visually hard to distinguish from x. Informally speaking, the solution x + r is the closest image to x classified as l by f. The solution is approximated using a box-constrained limited-memory BFGS [Szegedy, Goodfellow et al., Intriguing Properties of NNs, ICLR 2014, ArXiv 2013]
  • 75. Many Black Swans After 2014… [Search https://arxiv.org with keywords “adversarial examples”] 75 • Several defenses have been proposed against adversarial examples, and more powerful attacks have been developed to show that they are ineffective. Remember the arms race? • Most of these attacks are modifications to the optimization problems reported for evasion attacks / adversarial examples, using different gradient-based solution algorithms, initializations and stopping conditions. • Most popular attack algorithms: FGSM (Goodfellow et al.), JSMA (Papernot et al.), CW (Carlini & Wagner, and follow- up versions)
  • 76. Why Are Adversarial Perturbations Imperceptible?
  • 77. Why Are Adversarial Perturbations against Deep Networks Imperceptible? • Large sensitivity of g(x) to input changes – i.e., the input gradient $\nabla_x g(x)$ has a large norm (it scales with the input dimension!) – Thus, even small modifications along that direction will cause large changes in the predictions [Figure: g(x) along the direction from x to x’] [Simon-Gabriel et al., Adversarial vulnerability of NNs increases with input dimension, arXiv 2018]
  • 78. http://pralab.diee.unica.it @biggiobattista • Regularization also impacts (reduces) the size of input gradients – High regularization requires larger perturbations to mislead detection – e.g., see manipulated digits 9 (classified as 8) against linear SVMs with different C values Regularization, Input Gradients and Adversarial Robustness 78[Demontis, Biggio, et al., Why Do Adversarial Attacks Transfer? ... USENIX 2019] high regularization large perturbation low regularization imperceptible perturbation C=0.001, eps=1.7 C=1.0, eps=0.47
  • 79. Regularization, Input Gradients and Adversarial Robustness [Figure: size of input gradients, and test error with no attack and under attack (ε = 0.3), plotted against regularization (weight decay) from 0.0 to 1.0, moving from high-complexity to low-complexity models] [Demontis, Biggio, et al., Why Do Adversarial Attacks Transfer? ... USENIX 2019]
  • 80. Why Do Adversarial Attacks Transfer? Regularization and Transferability • Excerpt of our experimental results for evasion attacks on Android malware (DREBIN) – ‘x’ for low-complexity (strongly-regularized) models – ‘o’ for high-complexity (weakly-regularized) models – models: SVM, logistic, ridge, SVM-RBF, NN [Figure panels: evasion rate (ε = 5) vs. size of input gradients (S); black-to-white-box error ratio (ε = 5) vs. gradient alignment (R), with P: 0.69, p-val: < 1e-10 and K: 0.48, p-val: < 1e-10; transfer rate (ε = 30) vs. variability of loss landscape (V)] [Demontis, Biggio, et al., Why Do Adversarial Attacks Transfer? ... USENIX 2019]
  • 81. 81 Is Deep Learning Safe for Robot Vision?
  • 82. http://pralab.diee.unica.it @biggiobattista Is Deep Learning Safe for Robot Vision? • Evasion attacks against the iCub humanoid robot – Deep Neural Network used for visual object recognition 82[Melis, Biggio et al., Is Deep Learning Safe for Robot Vision? ICCVW ViPAR 2017]
  • 84. http://pralab.diee.unica.it @biggiobattista From Binary to Multiclass Evasion • In multiclass problems, classification errors occur in different classes. • Thus, the attacker may aim: 1. to have a sample misclassified as any class different from the true class (error-generic attacks) 2. to have a sample misclassified as a specific class (error-specific attacks) 84 cup sponge dish detergent Error-generic attacks any class cup sponge dish detergent Error-specific attacks [Melis, Biggio et al., Is Deep Learning Safe for Robot Vision? ICCVW ViPAR 2017]
  • 85. Error-generic Evasion (indiscriminate evasion) • k is the true class (blue) • l is the competing (closest) class in feature space (red) • The attack minimizes the objective to have the sample misclassified as the closest class (could be any!) [Melis, Biggio et al., Is Deep Learning Safe for Robot Vision? ICCVW ViPAR 2017]
  • 86. Error-specific Evasion (targeted evasion) • k is the target class (green) • l is the competing class (initially, the blue class) • The attack maximizes the objective to have the sample misclassified as the target class [Melis, Biggio et al., Is Deep Learning Safe for Robot Vision? ICCVW ViPAR 2017]
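The objective itself did not survive in the transcript. A common way to write it, assumed here for illustration, is the margin between the score of class k and the best competing class l: Ω(x) = f_k(x) − max_{l≠k} f_l(x). The class names come from the iCub example on the previous slides; the scores are invented.

```python
CLASSES = ["cup", "sponge", "dish", "detergent"]


def evasion_objective(scores, k):
    """Margin of class k over the best competing class (assumed formulation)."""
    best_other = max(s for l, s in enumerate(scores) if l != k)
    return scores[k] - best_other


# Hypothetical classifier scores for an image of a detergent bottle.
scores = [0.1, 0.2, 0.1, 0.6]

# Error-generic: k is the true class; the attacker MINIMIZES the margin
# until it becomes negative (any other class wins).
true_class = CLASSES.index("detergent")
print(evasion_objective(scores, true_class))

# Error-specific: k is the target class; the attacker MAXIMIZES the margin
# until it becomes positive (the target class wins).
target_class = CLASSES.index("cup")
print(evasion_objective(scores, target_class))
```

The same function serves both attacks; only the choice of k and the direction of optimization change.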
  • 87. Adversarial Examples against iCub – Gradient Computation. The given optimization problems can both be solved with gradient-based algorithms. The gradient of the objective can be computed using the chain rule, $\nabla f_i(x) = \frac{\partial f_i(z)}{\partial z} \frac{\partial z}{\partial x}$, where the classifier outputs are $f_1, f_2, \dots, f_i, \dots, f_c$: 1. the gradient of the functions $f_i(z)$ can be computed if the chosen classifier is differentiable 2. ... and then backpropagated through the deep network with automatic differentiation
  • 88. http://pralab.diee.unica.it @biggiobattista Example of Adversarial Images against iCub An adversarial example from class laundry-detergent, modified by the proposed algorithm to be misclassified as cup 88[Melis, Biggio et al., Is Deep Learning Safe for Robot Vision? ICCVW ViPAR 2017]
  • 89. http://pralab.diee.unica.it @biggiobattista The “Sticker” Attack against iCub Adversarial example generated by manipulating only a specific region, to simulate a sticker that could be applied to the real-world object. This image is classified as cup. 89[Melis, Biggio et al., Is Deep Learning Safe for Robot Vision? ICCVW ViPAR 2017]
  • 90. 90 Countering Evasion Attacks What is the rule? The rule is protect yourself at all times (from the movie “Million dollar baby”, 2004)
  • 91. Security Measures against Evasion Attacks 1. Reduce sensitivity to input changes with robust optimization – Adversarial Training / Regularization: $\min_w \sum_i \max_{\|\delta_i\|_\infty \le \epsilon} \ell(y_i, f_w(x_i + \delta_i))$ (bounded perturbation!) 2. Introduce rejection / detection of adversarial examples [Figure panels: SVM-RBF (no reject); SVM-RBF (higher rejection rate)]
  • 92. Reducing Input Sensitivity via Robust Optimization • Robust optimization (a.k.a. adversarial training): $\min_w \max_{\|\delta_i\|_\infty \le \epsilon} \sum_i \ell(y_i, f_w(x_i + \delta_i))$ (bounded perturbation!) • Robustness and regularization (Xu et al., JMLR 2009) – under linearity of $\ell$ and $f_w$, equivalent to robust optimization: $\min_w \sum_i \ell(y_i, f_w(x_i)) + \epsilon \|\nabla_x f\|_1$, where the penalty is the dual norm of the perturbation, $\|\nabla_x f\|_1 = \|w\|_1$
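The equivalence can be made explicit with a short derivation. This is a sketch under stated assumptions: a linear model $f_w(x) = w^\top x + b$, labels $y \in \{-1, +1\}$, a linear loss $\ell(y, f) = -y f$, and an $\ell_\infty$-bounded perturbation.

```latex
\begin{aligned}
\max_{\|\delta\|_\infty \le \epsilon} \ell\bigl(y, f_w(x + \delta)\bigr)
  &= \max_{\|\delta\|_\infty \le \epsilon} \bigl[-y\,(w^\top x + b) - y\, w^\top \delta\bigr] \\
  &= \ell\bigl(y, f_w(x)\bigr) + \max_{\|\delta\|_\infty \le \epsilon} \bigl(-y\, w^\top \delta\bigr) \\
  &= \ell\bigl(y, f_w(x)\bigr) + \epsilon\, \|w\|_1 ,
\end{aligned}
```

The last step uses $\delta = -\epsilon\, y\, \mathrm{sign}(w)$, which attains the maximum because $\|\cdot\|_1$ is the dual norm of $\|\cdot\|_\infty$. For the hinge loss the same worst-case perturbation applies inside the hinge, giving $\max(0,\, 1 - y f_w(x) + \epsilon \|w\|_1)$, so the penalty is no longer a separate additive term.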
  • 93. Experiments on Android Malware: Results on Adversarial Android Malware • Infinity-norm regularization is the optimal regularizer against sparse evasion attacks – Sparse evasion attacks penalize the $\ell_1$ norm of the perturbation, promoting the manipulation of only few features • Why? It bounds the maximum absolute weight values! • Sec-SVM: $\min_{w,b} \|w\|_\infty + C \sum_i \max(0, 1 - y_i f(x_i))$, with $\|w\|_\infty = \max_{i=1,\dots,d} |w_i|$ [Figure: absolute weight values |w| in descending order] [Demontis, Biggio et al., Yes, ML Can Be More Secure!..., IEEE TDSC 2017]
  • 94. Adversarial Training and Regularization • Adversarial training can also be seen as a form of regularization, which penalizes the (dual) norm of the input gradients, $\epsilon \|\nabla_x \ell\|_q$ • Known as double backprop or gradient/Jacobian regularization – see, e.g., Simon-Gabriel et al., Adversarial vulnerability of neural networks increases with input dimension, ArXiv 2018; and Lyu et al., A unified gradient regularization family for adversarial examples, ICDM 2015. • Take-home message: the net effect of these techniques is to make the prediction function of the classifier smoother (increasing the input margin) [Figure: g(x) between x and x’, with adversarial training]
  • 95. Ineffective Defenses: Obfuscated Gradients • Work by Carlini & Wagner (SP ’17) and Athalye et al. (ICML ’18) has shown that – some recently-proposed defenses rely on obfuscated / masked gradients, and – they can be circumvented • Obfuscated gradients do not allow the correct execution of gradient-based attacks... • ... but substitute models and/or smoothing can correctly reveal meaningful input gradients! [Figures: g(x) between x and x’, with obfuscated gradients and with smoothing]
  • 96. http://pralab.diee.unica.it @biggiobattista Detecting & Rejecting Adversarial Examples • Adversarial examples tend to occur in blind spots – Regions far from training data that are anyway assigned to ‘legitimate’ classes 96 blind-spot evasion (not even required to mimic the target class) rejection of adversarial examples through enclosing of legitimate classes
  • 97. http://pralab.diee.unica.it @biggiobattista Detecting & Rejecting Adversarial Examples input perturbation (Euclidean distance) 97[Melis, Biggio et al., Is Deep Learning Safe for Robot Vision? ICCVW ViPAR 2017]
  • 98. Why Is Rejection (in Representation Space) Not Enough? [S. Sabour et al., ICLR 2016]
  • 99. 99 Adversarial Examples and Security Evaluation (Demo Session) https://advx-secml.pluribus-one.it
  • 100. secml: An open source Python library for ML Security (Marco Melis, Ambra Demontis)
    – ml: ML algorithms via sklearn; DL algorithms and optimizers via PyTorch and Tensorflow
    – adv: attacks (evasion, poisoning, ...) with custom/faster solvers; defenses (advx rejection, adversarial training, ...)
    – expl: explanation methods based on influential features; explanation methods based on influential prototypes
    – others: parallel computation; support for dense/sparse data; advanced plotting functions (via matplotlib); modular and easy to extend
    First release scheduled for August 2019!
  • 101. http://pralab.diee.unica.it @biggiobattista EU H2020 Project ALOHA • ALOHA – software framework for runtime-Adaptive and secure deep Learning On Heterogeneous Architectures • Project goal: to facilitate implementation of deep learning algorithms on heterogeneous low- energy computing platforms • Project website: www.aloha-h2020.eu • Pluribus One is in charge of evaluating and improving security of deep learning algorithms under attack 101 This project has received funding from the European Union’s Horizon 2020 Research and Innovation programme under Grant Agreement No. 780788
  • 103. Poisoning Machine Learning. Pipeline: training data (with labels) → pre-processing and feature extraction (x1, x2, ..., xd) → classifier learning. Example: the spam email “Start 2007 with a bang! Make WBFS YOUR PORTFOLIO’s first winner of the year ...” over the features (start, bang, portfolio, winner, year, ..., university, campus) gives x = (1, 1, 1, 1, 1, ..., 0, 0), labeled SPAM; learned weights w = (+2, +1, +1, +1, +1, ..., -3, -4). The classifier generalizes well on test data.
  • 104. Poisoning Machine Learning. Pipeline: corrupted training data (including poisoning data) → pre-processing and feature extraction (x1, x2, ..., xd) → classifier learning is compromised... to maximize error on test data. Example: the spam email “Start 2007 with a bang! Make WBFS YOUR PORTFOLIO’s first winner of the year ... university campus...” gives x = (1, 1, 1, 1, 1, ..., 1, 1), labeled SPAM; learned weights become w = (+2, +1, +1, +1, +1, ..., +1, +1).
  • 105. Poisoning Attacks against Machine Learning • Goal: to maximize classification error • Knowledge: perfect / white-box attack • Capability: injecting poisoning samples into TR • Strategy: find an optimal attack point xc in TR that maximizes classification error [Figure panels: two positions of xc, with classification error = 0.039 and classification error = 0.022; classification error as a function of xc] [Biggio, Nelson, Laskov. Poisoning attacks against SVMs. ICML, 2012]
  • 106. Poisoning is a Bilevel Optimization Problem • Attacker’s objective – to maximize generalization error on untainted data, w.r.t. poisoning point xc:
    $\max_{x_c} L(D_{\mathrm{val}}, w^*) \;\; \text{s.t.} \;\; w^* = \arg\min_{w} \mathcal{L}(D_{\mathrm{tr}} \cup \{(x_c, y_c)\}, w)$
    – Outer problem: loss estimated on validation data (no attack points!)
    – Inner problem: algorithm is trained on surrogate data (including the attack point)
    • Poisoning problem against (linear) SVMs:
    $\max_{x_c} \sum_{j=1}^{m} \max(0, 1 - y_j f_{w^*}(x_j)) \;\; \text{s.t.} \;\; w^* = \arg\min_{w,b} \tfrac{1}{2} w^\top w + C \sum_{i=1}^{n} \max(0, 1 - y_i f_w(x_i)) + C \max(0, 1 - y_c f_w(x_c))$
    [Biggio, Nelson, Laskov. Poisoning attacks against SVMs. ICML, 2012] [Xiao, Biggio, Roli et al., Is feature selection secure against training data poisoning? ICML, 2015] [Munoz-Gonzalez, Biggio, Roli et al., Towards poisoning of deep learning..., AISec 2017]
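The bilevel structure can be illustrated on a deliberately tiny problem. The sketch below swaps the SVM for one-dimensional least squares through the origin (so the inner problem has a closed form) and replaces the gradient-based outer solver with a grid search over a bounded range of candidate poisoning points; the data, the fixed poisoning label and the candidate grid are all made up.

```python
def fit_slope(points, lam=0.0):
    """Inner problem: w* = argmin_w sum (w*x - y)^2 + lam*w^2, in closed form."""
    num = sum(x * y for x, y in points)
    den = sum(x * x for x, _ in points) + lam
    return num / den


def validation_mse(w, val_points):
    """Outer objective: loss on untainted validation data."""
    return sum((w * x - y) ** 2 for x, y in val_points) / len(val_points)


def best_poisoning_point(train, val, y_c, candidates):
    """Outer problem: pick x_c that maximizes the validation loss of the retrained model."""
    best_xc, best_loss = None, float("-inf")
    for x_c in candidates:
        w_star = fit_slope(train + [(x_c, y_c)])  # retrain with the attack point
        loss = validation_mse(w_star, val)
        if loss > best_loss:
            best_xc, best_loss = x_c, loss
    return best_xc, best_loss


# Clean data generated by y = 2x (hypothetical).
train = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
val = [(1.5, 3.0), (2.5, 5.0)]
candidates = [i / 10 for i in range(31)]  # feasible domain: x_c in [0, 3]

clean_loss = validation_mse(fit_slope(train), val)
x_c, poisoned_loss = best_poisoning_point(train, val, y_c=-6.0, candidates=candidates)
print(clean_loss, x_c, poisoned_loss)
```

Every evaluation of the outer objective requires retraining the model, which is why closed-form or implicit gradients of the inner solution matter so much for realistic learners.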
  • 107. http://pralab.diee.unica.it @biggiobattista xc (0) xc Gradient-based Poisoning Attacks • Gradient is not easy to compute – The training point affects the classification function • Trick: – Replace the inner learning problem with its equilibrium (KKT) conditions – This enables computing gradient in closed form • Example for (kernelized) SVM – similar derivation for Ridge, LASSO, Logistic Regression, etc. 107 xc (0) xc [Biggio, Nelson, Laskov. Poisoning attacks against SVMs. ICML, 2012] [Xiao, Biggio, Roli et al., Is feature selection secure against training data poisoning? ICML, 2015]
  • 108. http://pralab.diee.unica.it @biggiobattista Experiments on MNIST digits Single-point attack • Linear SVM; 784 features; TR: 100; VAL: 500; TS: about 2000 – ‘0’ is the malicious (attacking) class – ‘4’ is the legitimate (attacked) one xc (0) xc 108[Biggio, Nelson, Laskov. Poisoning attacks against SVMs. ICML, 2012]
  • 109. http://pralab.diee.unica.it @biggiobattista Experiments on MNIST digits Multiple-point attack • Linear SVM; 784 features; TR: 100; VAL: 500; TS: about 2000 – ‘0’ is the malicious (attacking) class – ‘4’ is the legitimate (attacked) one 109[Biggio, Nelson, Laskov. Poisoning attacks against SVMs. ICML, 2012]
  • 110. http://pralab.diee.unica.it @biggiobattista How about Poisoning Deep Nets? • ICML 2017 Best Paper by Koh et al., “Understanding black-box predictions via Influence Functions” has derived adversarial training examples against a DNN – they have been constructed attacking only the last layer (KKT-based attack against logistic regression) and assuming the rest of the network to be ”frozen” 110
  • 111. http://pralab.diee.unica.it @biggiobattista Towards Poisoning Deep Neural Networks • Solving the poisoning problem without exploiting KKT conditions (back-gradient) – Muñoz-González, Biggio, Roli et al., AISec 2017 https://arxiv.org/abs/1708.08689 111
  • 112. 112 Countering Poisoning Attacks What is the rule? The rule is protect yourself at all times (from the movie “Million dollar baby”, 2004)
  • 113. http://pralab.diee.unica.it @biggiobattista Security Measures against Poisoning • Rationale: poisoning injects outlying training samples • Two main strategies for countering this threat 1. Data sanitization: remove poisoning samples from training data • Bagging for fighting poisoning attacks • Reject-On-Negative-Impact (RONI) defense 2. Robust Learning: learning algorithms that are robust in the presence of poisoning samples xc (0) xc xc (0) xc 113
  • 114. Robust Regression with TRIM • TRIM learns the model by retaining only training points with the smallest residuals: $\arg\min_{w, b, I} L(w, b, I) = \frac{1}{|I|} \sum_{i \in I} (f(x_i) - y_i)^2 + \lambda \Omega(w)$, with $N = (1 + \alpha) n$, $I \subset \{1, \dots, N\}$, $|I| = n$ [Jagielski, Biggio et al., IEEE Symp. Security and Privacy, 2018]
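A minimal sketch of the trimming idea follows, for one-dimensional linear regression. It alternates between fitting on the currently retained points and re-selecting the n points with the smallest squared residuals. The simplifications are assumptions made for brevity: no regularization term, initialization on the full training set, and toy data with two injected outliers.

```python
def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def trim_regression(points, n_keep, max_iter=100):
    """Alternate between fitting and keeping the n_keep smallest residuals."""
    kept = list(range(len(points)))  # assumption: start from all points
    for _ in range(max_iter):
        slope, intercept = fit_line([points[i] for i in kept])
        residuals = [(y - (slope * x + intercept)) ** 2 for x, y in points]
        ranked = sorted(range(len(points)), key=lambda i: residuals[i])
        new_kept = sorted(ranked[:n_keep])
        if new_kept == kept:
            break  # the retained subset is stable
        kept = new_kept
    return slope, intercept, kept


# Eight clean points on y = 2x + 1, plus two poisoning points (hypothetical).
clean = [(float(x), 2.0 * x + 1.0) for x in range(8)]
poison = [(3.0, 40.0), (4.0, 40.0)]
data = clean + poison

naive_slope, naive_intercept = fit_line(data)
slope, intercept, kept = trim_regression(data, n_keep=len(clean))
print(naive_intercept, intercept, kept)
```

On this toy data the untrimmed fit is pulled far off (its intercept is well above 1), while the trimmed fit discards the two poisoning points and recovers the clean line.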
  • 115. Experiments with TRIM (Loan Dataset) • TRIM MSE is within 1% of original model MSE [Figure legend: existing methods, our defense, no defense; annotation: better defense] [Jagielski, Biggio et al., IEEE Symp. Security and Privacy, 2018]
  • 117. Attacks against Machine Learning [Biggio & Roli, Wild Patterns, 2018 https://arxiv.org/abs/1712.03141]
      Attacker’s Goal (columns) by Attacker’s Capability (rows):
      – Integrity (misclassifications that do not compromise normal system operation) • Test data: evasion (a.k.a. adversarial examples) • Training data: poisoning (to allow subsequent intrusions), e.g., backdoors or neural network trojans
      – Availability (misclassifications that compromise normal system operation) • Test data: - • Training data: poisoning (to maximize classification error)
      – Privacy / Confidentiality (querying strategies that reveal confidential information on the learning model or its users) • Test data: model extraction / stealing; model inversion (hill-climbing); membership inference attacks • Training data: -
      Attacker’s Knowledge: • perfect-knowledge (PK) white-box attacks • limited-knowledge (LK) black-box attacks (transferability with surrogate/substitute learning models)
  • 118. Model Inversion Attacks (Privacy Attacks) • Goal: to extract users’ sensitive information (e.g., face templates stored during user enrollment) – Fredrikson, Jha, Ristenpart. Model inversion attacks that exploit confidence information and basic countermeasures. ACM CCS, 2015 • Also known as hill-climbing attacks in the biometric community – Adler. Vulnerabilities in biometric encryption systems. 5th Int’l Conf. AVBPA, 2005 – Galbally, McCool, Fierrez, Marcel, Ortega-Garcia. On the vulnerability of face verification systems to hill-climbing attacks. Patt. Rec., 2010 • How: by repeatedly querying the target system and adjusting the input sample to maximize its output score (e.g., a measure of the similarity of the input sample with the user templates) [Figure: reconstructed image vs. training image]
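The query-and-adjust loop described above can be sketched as a simple hill climb. The example is a toy under stated assumptions: `make_score_oracle` is a hypothetical stand-in for the target system (it returns the negative squared distance to a secret template), and the attacker only sees the returned score, never the template. Function names, step size, and query budget are illustrative choices, not taken from the cited papers.

```python
import random


def make_score_oracle(template):
    """Hypothetical target system: higher score means closer to the stored template."""
    def score(x):
        return -sum((a - b) ** 2 for a, b in zip(x, template))
    return score


def hill_climb(score, dim, n_queries=2000, step=0.1, seed=0):
    """Perturb one coordinate per query and keep the change only if the score rises."""
    rng = random.Random(seed)
    x = [0.0] * dim
    best = score(x)
    for _ in range(n_queries):
        candidate = list(x)
        i = rng.randrange(dim)
        candidate[i] += rng.choice((-step, step))
        s = score(candidate)  # one query to the target system
        if s > best:
            x, best = candidate, s
    return x, best
```

Because the loop needs nothing but output scores, coarsening or withholding those scores is a natural countermeasure to consider.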
  • 119. Membership Inference Attacks (Privacy Attacks) [Shokri et al., IEEE Symp. Security and Privacy, 2017] • Goal: to identify whether an input sample is part of the training set used to learn a deep neural network, based on the observed prediction scores for each class
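To make the role of prediction scores concrete, here is a deliberately simplified confidence-threshold attack. It is not the shadow-model method of Shokri et al.; it only illustrates the underlying intuition that models tend to be more confident on samples they were trained on. The function names and the idea of calibrating the threshold on samples with known membership are assumptions made for this sketch.

```python
def calibrate_threshold(member_conf, nonmember_conf):
    """Pick the confidence threshold that best separates known members from non-members."""
    candidates = sorted(set(member_conf) | set(nonmember_conf))
    total = len(member_conf) + len(nonmember_conf)
    best_t, best_acc = candidates[0], -1.0
    for t in candidates:
        # Rule under test: predict "member" when confidence >= t.
        correct = sum(c >= t for c in member_conf) + sum(c < t for c in nonmember_conf)
        acc = correct / total
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc


def infer_membership(prediction_scores, threshold):
    """Guess that a sample was in the training set if its top class score is high."""
    return max(prediction_scores) >= threshold
```

In use, the attacker would calibrate on data whose membership is known (for instance, from a model they trained themselves) and then apply `infer_membership` to score vectors returned by the target model.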
  • 120. Backdoor Attacks (Poisoning Integrity Attacks) • Backdoor / poisoning integrity attacks place mislabeled training points in a region of the feature space far from the rest of the training data. The learning algorithm labels such a region as desired, allowing for subsequent intrusions / misclassifications at test time [Figure: training data (no poisoning) vs. training data (poisoned); backdoored stop sign (labeled as speed limit)]
      Attack referred to as “backdoor”:
      – T. Gu, B. Dolan-Gavitt, and S. Garg. Badnets: Identifying vulnerabilities in the machine learning model supply chain. In NIPS Workshop on Machine Learning and Computer Security, 2017.
      – X. Chen, C. Liu, B. Li, K. Lu, and D. Song. Targeted backdoor attacks on deep learning systems using data poisoning. ArXiv e-prints, 2017.
      Attack referred to as “poisoning integrity”:
      – M. Barreno, B. Nelson, R. Sears, A. D. Joseph, and J. D. Tygar. Can machine learning be secure? In Proc. ACM Symp. Information, Computer and Comm. Sec., ASIACCS ’06, pages 16–25, New York, NY, USA, 2006. ACM.
      – M. Barreno, B. Nelson, A. Joseph, and J. Tygar. The security of machine learning. Machine Learning, 81:121–148, 2010.
      – B. Biggio, B. Nelson, and P. Laskov. Poisoning attacks against support vector machines. In J. Langford and J. Pineau, editors, 29th Int’l Conf. on Machine Learning, pages 1807–1814. Omnipress, 2012.
      – B. Biggio, G. Fumera, and F. Roli. Security evaluation of pattern classifiers under attack. IEEE Transactions on Knowledge and Data Engineering, 26(4):984–996, April 2014.
      – H. Xiao, B. Biggio, G. Brown, G. Fumera, C. Eckert, and F. Roli. Is feature selection secure against training data poisoning? In F. Bach and D. Blei, editors, JMLR W&CP - Proc. 32nd Int’l Conf. Mach. Learning (ICML), volume 37, pages 1689–1698, 2015.
      – L. Munoz-Gonzalez, B. Biggio, A. Demontis, A. Paudice, V. Wongrassamee, E. C. Lupu, and F. Roli. Towards poisoning of deep learning algorithms with back-gradient optimization. In 10th ACM Workshop on Artificial Intelligence and Security, AISec ’17, pp. 27–38, 2017. ACM.
      – B. Biggio and F. Roli. Wild patterns: Ten years after the rise of adversarial machine learning. ArXiv e-prints, 2018.
      – M. Jagielski, A. Oprea, B. Biggio, C. Liu, C. Nita-Rotaru, and B. Li. Manipulating machine learning: Poisoning attacks and countermeasures for regression learning. In 39th IEEE Symp. on Security and Privacy, 2018.
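The data-poisoning step of a backdoor attack can be sketched as stamping a small trigger pattern onto a fraction of the training images and relabeling them with the attacker's target class. The code below is an illustrative toy, not the procedure from the cited papers: images are plain nested lists, the trigger is a bright square in the bottom-right corner, and `add_trigger`, `poison_dataset`, and their parameters are hypothetical names.

```python
import random


def add_trigger(image, patch_value=1.0, patch_size=2):
    """Return a copy of the image with a square trigger in the bottom-right corner."""
    stamped = [row[:] for row in image]
    for r in range(-patch_size, 0):
        for c in range(-patch_size, 0):
            stamped[r][c] = patch_value
    return stamped


def poison_dataset(dataset, target_label, fraction=0.1, seed=0):
    """Stamp the trigger on a random fraction of samples and relabel them."""
    rng = random.Random(seed)
    n_poison = int(len(dataset) * fraction)
    chosen = set(rng.sample(range(len(dataset)), n_poison))
    poisoned = []
    for i, (image, label) in enumerate(dataset):
        if i in chosen:
            poisoned.append((add_trigger(image), target_label))
        else:
            poisoned.append((image, label))  # clean samples pass through unchanged
    return poisoned, sorted(chosen)
```

A model trained on the returned dataset may learn to associate the trigger with `target_label`, so that stamping the same pattern on a test input could steer its prediction while clean inputs behave normally.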
  • 121. Are Adversarial Examples a Real Security Threat?
  • 122. World Is Not Digital… • “…Previous cases of adversarial examples have common characteristic: the adversary is able to precisely control the digital representation of the input to the machine learning tools…” [M. Sharif et al., ACM CCS 2016] [Figure: school bus (x) plus adversarial noise (r), classified as ostrich (Struthio camelus)]
  • 123. Do Adversarial Examples Exist in the Physical World?
  • 124. Adversarial Images in the Physical World • Do adversarial images fool deep networks even when they operate in the physical world, for example when images are taken from a cell-phone camera? – Alexey Kurakin et al. (2016, 2017) explored the possibility of creating adversarial images for machine learning systems that operate in the physical world. They used images taken from a cell-phone camera as input to an Inception v3 image classification neural network – They showed that in such a set-up, a significant fraction of adversarial images crafted using the original network are misclassified even when fed to the classifier through the camera [Alexey Kurakin et al., ICLR 2017]
  • 125. Adversarial Glasses • The adversarial perturbation is applied only to the eyeglasses image region [M. Sharif et al., ACM CCS 2016]
  • 127. No, We Should Not… “In this paper, we show experiments that suggest that a trained neural network classifies most of the pictures taken from different distances and angles of a perturbed image correctly. We believe this is because the adversarial property of the perturbation is sensitive to the scale at which the perturbed picture is viewed, so (for example) an autonomous car will misclassify a stop sign only from a small range of distances.” [arXiv:1707.03501; CVPR 2017]
  • 128. Yes, We Should... [Athalye et al., Synthesizing robust adversarial examples. ICLR, 2018]
  • 130. Adversarial Road Signs [Evtimov et al., CVPR 2017]
  • 131. Is This a Real Security Threat? • Adversarial examples can exist in the physical world: we can fabricate concrete adversarial objects (glasses, road signs, etc.) • But the effectiveness of attacks carried out with adversarial objects is still to be investigated with large-scale experiments in realistic security scenarios • Gilmer et al. (2018) have recently discussed how realistic the security threat posed by adversarial examples is, pointing out that it should be carefully investigated – Are indistinguishable adversarial examples a real security threat? – For which real security scenarios are adversarial examples the best attack vector, better than attacking components outside the machine learning component? – … [Justin Gilmer et al., Motivating the Rules of the Game for Adversarial Example Research, https://arxiv.org/abs/1807.06732]
  • 132. Are Indistinguishable Perturbations a Real Security Threat?
  • 133. Indistinguishable Adversarial Examples • !(#) ≠ & • The adversarial image x + r is visually hard to distinguish from x • “…There is a torrent of work that views increased robustness to restricted perturbations as making these models more secure. While not all of this work requires completely indistinguishable modifications, many of the papers focus on specifically small modifications, and the language in many suggests or implies that the degree of perceptibility of the perturbations is an important aspect of their security risk…” [Justin Gilmer et al., Motivating the Rules of the Game for Adversarial Example Research, arXiv 2018]
  • 134. Indistinguishable Adversarial Examples • The attacker can benefit from minimally perturbing a legitimate input; e.g., she could use the attack for a longer period of time before it is detected • But is minimal perturbation a necessary constraint for the attacker?
  • 135. Indistinguishable Adversarial Examples • Is minimal perturbation a necessary constraint for the attacker?
  • 136. Attacks with Content Preservation • There are well-known security applications where minimal perturbations and indistinguishability of adversarial inputs are not required at all…
  • 137. Are Indistinguishable Perturbations a Real Security Threat? “…At the time of writing, we were unable to find a compelling example that required indistinguishability… To have the largest impact, we should both recast future adversarial example research as a contribution to core machine learning and develop new abstractions that capture realistic threat models.” [Justin Gilmer et al., Motivating the Rules of the Game for Adversarial Example Research, arXiv 2018]
  • 138. To Conclude… This is a recent research field… [Photo: Dagstuhl Perspectives Workshop on “Machine Learning in Computer Security”, Schloss Dagstuhl, Germany, Sept. 9th-14th, 2012]
  • 139. Timeline of Learning Security (tracks: Adversarial ML; Security of DNNs)
      – 2004-2005: pioneering work; Dalvi et al., KDD 2004; Lowd & Meek, KDD 2005
      – 2006-2010: Barreno, Nelson, Rubinstein, Joseph, Tygar, The Security of Machine Learning
      – 2013: Srndic & Laskov, NDSS: claim nonlinear classifiers are secure
      – 2013-2014: Biggio et al., ECML, IEEE TKDE: high-confidence & black-box evasion attacks to show vulnerability of nonlinear classifiers
      – 2014: Srndic & Laskov, IEEE S&P: shows vulnerability of nonlinear classifiers with our ECML ’13 gradient-based attack
      – 2014: Szegedy et al., ICLR: adversarial examples vs DL
      – 2016: Papernot et al., IEEE S&P: evasion attacks / adversarial examples
      – 2017: Papernot et al., ASIACCS: black-box evasion attacks
      – 2017: Carlini & Wagner, IEEE S&P: high-confidence evasion attacks
      – 2017: Grosse et al., ESORICS: application to Android malware
      – 2017: Demontis et al., IEEE TDSC: secure learning for Android malware
  • 141. Timeline of Learning Security (tracks: Adversarial ML; Security of DNNs) [Biggio and Roli, Wild Patterns: Ten Years After The Rise of Adversarial Machine Learning, Pattern Recognition, 2018]
      Legend: pioneering work on adversarial machine learning; work on security evaluation of learning algorithms; work on evasion attacks (a.k.a. adversarial examples); ... in malware detection (PDF / Android)
      – 2004-2005: pioneering work; Dalvi et al., KDD 2004; Lowd & Meek, KDD 2005. Main contributions: minimum-distance evasion of linear classifiers; notion of adversary-aware classifiers
      – 2006-2010: Barreno, Nelson, Rubinstein, Joseph, Tygar, The Security of Machine Learning (and references therein). Main contributions: first consolidated view of the adversarial ML problem; attack taxonomy; exemplary attacks against some learning algorithms
      – 2006: Globerson & Roweis, ICML; 2009: Kolcz et al., CEAS; 2010: Biggio et al., IJMLC. Main contributions: evasion attacks against linear classifiers in spam filtering
      – 2013: Srndic & Laskov, NDSS. Main contributions: evasion of linear PDF malware detectors; claims nonlinear classifiers can be more secure
      – 2013: Biggio et al., ECML-PKDD: demonstrated vulnerability of nonlinear algorithms to gradient-based evasion attacks, also under limited knowledge. Main contributions: 1. gradient-based adversarial perturbations (against SVMs and neural nets); 2. projected gradient descent / iterative attack (also on discrete features from malware data), transfer attack with surrogate/substitute model; 3. maximum-confidence evasion (rather than minimum-distance evasion)
      – 2014: Biggio et al., IEEE TKDE. Main contributions: framework for security evaluation of learning algorithms; attacker’s model in terms of goal, knowledge, capability
      – 2014: Srndic & Laskov, IEEE S&P: used Biggio et al.’s ECML-PKDD ’13 gradient-based evasion attack to demonstrate vulnerability of nonlinear PDF malware detectors
      – 2014: Szegedy et al., ICLR: independent discovery of (gradient-based) minimum-distance adversarial examples against deep nets; earlier implementation of adversarial training
      – 2015: Goodfellow et al., ICLR: maximin formulation of adversarial training, with adversarial examples generated iteratively in the inner loop
      – 2016: Kurakin et al.: basic iterative attack with projected gradient to generate adversarial examples
      – 2016: Papernot et al., IEEE S&P: framework for security evaluation of deep nets
      – 2016: Papernot et al., Euro S&P: distillation defense (gradient masking)
      – 2017: Papernot et al., ASIACCS: black-box evasion attacks with substitute models (breaks distillation with transfer attacks on a smoother surrogate classifier)
      – 2017: Carlini & Wagner, IEEE S&P: breaks distillation again with maximum-confidence evasion attacks (rather than using minimum-distance adversarial examples)
      – 2017: Grosse et al., ESORICS: adversarial examples for malware detection
      – 2017: Demontis et al., IEEE TDSC: Yes, Machine Learning Can Be More Secure! A Case Study on Android Malware Detection. Main contributions: secure SVM against adversarial examples in malware detection
      – 2018: Madry et al., ICLR: improves the basic iterative attack from Kurakin et al. by adding noise before running the attack; first successful use of adversarial training to generalize across many attack algorithms
  • 142. Black Swans to the Fore [Szegedy et al., Intriguing properties of neural networks, 2014] After this “black swan”, the issue of security of DNNs came to the fore… Not only in specialist scientific journals…
  • 143. The Safety Issue to the Fore… The black box of AI, D. Castelvecchi, Nature, Vol. 538, 20, Oct 2016. Machine learning is becoming ubiquitous in basic research as well as in industry. But for scientists to trust it, they first need to understand what the machines are doing. Ellie Dobson, director of data science at the big-data firm Arundo Analytics in Oslo: If something were to go wrong as a result of setting the UK interest rates, she says, “the Bank of England can’t say, ‘the black box made me do it’”.
  • 144. Why So Much Interest? Before the deep net “revolution”, people were not surprised when machine learning was wrong; they were more amazed when it worked well… Now that it seems to work for real applications, people are disappointed, and worried, by errors that humans do not make…
  • 145. Errors of Humans and Machines… Machine learning decisions are affected by several sources of bias that cause “strange” errors. But we should keep in mind that humans are biased too…
  • 146. The Bat and the Ball Problem A bat and a ball together cost $1.10. The bat costs $1.00 more than the ball. How much does the ball cost? Please give me the first answer that comes to your mind!
  • 147. The Bat and the Ball Problem The exact solution is $0.05 (5 cents), from the system $\text{bat} + \text{ball} = \$1.10$, $\text{bat} = \text{ball} + \$1.00$. The wrong solution ($0.10) is due to attribute substitution, a psychological process thought to underlie a number of cognitive biases. It occurs when an individual has to make a judgment (of a target attribute) that is computationally complex, and instead substitutes a more easily calculated heuristic attribute
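The 5-cent answer follows from substituting the second condition into the first. As a short worked derivation, writing $b$ for the price of the ball in dollars (a symbol introduced here only for illustration):

```latex
\begin{align*}
\underbrace{(b + 1.00)}_{\text{bat}} + \underbrace{b}_{\text{ball}} &= 1.10 \\
2b &= 0.10 \\
b &= 0.05
\end{align*}
```

The intuitive answer $b = 0.10$ would make the bat cost $1.10$ and the pair $1.20$, which contradicts the first condition.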
  • 148. Trust in Humans or Machines? Algorithms are biased, but humans are as well… When should you trust humans and when algorithms?
  • 149. Learning Comes at a Price! The introduction of novel learning functionalities increases the attack surface of computer systems and produces new vulnerabilities. Safety of machine learning will be more and more important in future computer systems, as well as accountability, transparency, and the protection of fundamental human values and rights
  • 150. Thanks for listening. Any questions? “If you know the enemy and know yourself, you need not fear the result of a hundred battles” (Sun Tzu, The Art of War, 500 BC) Battista Biggio, battista.biggio@unica.it, @biggiobattista
  • 151. References
      • B. Biggio and F. Roli. Wild patterns: Ten years after the rise of adversarial machine learning. Pattern Recognition, 2018.
      • B. Biggio, G. Fumera, and F. Roli. Security evaluation of pattern classifiers under attack. IEEE Transactions on Knowledge and Data Engineering, 26(4):984–996, April 2014.
      • B. Biggio, I. Corona, D. Maiorca, B. Nelson, N. Srndic, P. Laskov, G. Giacinto, and F. Roli. Evasion attacks against machine learning at test time. ECML PKDD, 2013.
      • F. Crecchi, D. Bacciu, and B. Biggio. Detecting adversarial examples through nonlinear dimensionality reduction. ESANN, 2019.
      • B. Kolosnjaji, A. Demontis, B. Biggio, D. Maiorca, G. Giacinto, C. Eckert, and F. Roli. Adversarial malware binaries: Evading deep learning for malware detection in executables. EUSIPCO, 2018.
      • A. Demontis, M. Melis, M. Pintor, M. Jagielski, B. Biggio, A. Oprea, C. Nita-Rotaru, and F. Roli. Why do adversarial attacks transfer? Explaining transferability of evasion and poisoning attacks. In 28th USENIX Security Symposium (USENIX Security 19).
      • A. Demontis, M. Melis, B. Biggio, D. Maiorca, D. Arp, K. Rieck, I. Corona, G. Giacinto, and F. Roli. Yes, machine learning can be more secure! A case study on Android malware detection. IEEE Trans. Dep. and Secure Computing, in press.
      • M. Melis, A. Demontis, B. Biggio, G. Brown, G. Fumera, and F. Roli. Is deep learning safe for robot vision? Adversarial examples against the iCub humanoid. In ICCV Workshop on Vision in Practice on Autonomous Robots (ViPAR), 2017.
      • B. Biggio, B. Nelson, and P. Laskov. Poisoning attacks against support vector machines. In J. Langford and J. Pineau, editors, 29th Int’l Conf. on Machine Learning, pages 1807–1814. Omnipress, 2012.
      • H. Xiao, B. Biggio, G. Brown, G. Fumera, C. Eckert, and F. Roli. Is feature selection secure against training data poisoning? In F. Bach and D. Blei, editors, JMLR W&CP - Proc. 32nd Int’l Conf. Mach. Learning (ICML), volume 37, pages 1689–1698, 2015.
      • L. Munoz-Gonzalez, B. Biggio, A. Demontis, A. Paudice, V. Wongrassamee, E. C. Lupu, and F. Roli. Towards poisoning of deep learning algorithms with back-gradient optimization. AISec ’17, pages 27–38, New York, NY, USA, 2017. ACM.
      • M. Jagielski, A. Oprea, B. Biggio, C. Liu, C. Nita-Rotaru, and B. Li. Manipulating machine learning: Poisoning attacks and countermeasures for regression learning. In 39th IEEE Symposium on Security and Privacy, 2018.
      • A. Athalye, L. Engstrom, A. Ilyas, and K. Kwok. Synthesizing robust adversarial examples. In ICLR, 2018.
      • C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus. Intriguing properties of neural networks. In ICLR, 2014.