This document summarizes research on developing more secure machine learning classifiers. It discusses how gradient-based and surrogate-model approaches can be used to evade existing classifiers. The researchers then propose several techniques for building more robust classifiers, including infinity-norm regularization, cost-sensitive learning, and kernel parameter modification. Experiments on handwritten digit recognition and PDF malware detection show the proposed approaches improve security against evasion attacks compared to standard support vector machines.
New Challenges for Machine Learning
• The use of machine learning opens up major new possibilities, but also new security risks
• Proliferation and sophistication of attacks and cyberthreats
– Skilled / economically-motivated attackers (e.g., ransomware)
• Several security systems use machine learning to detect attacks
– but… is machine learning secure enough?
Is Machine Learning Secure Enough?
• Problem: how to evade a linear (trained) classifier?
f(x) = sign(wᵀx)

Original email x: "Start 2007 with a bang! Make WBFS YOUR PORTFOLIO's first winner of the year ..."
Obfuscated email x': "St4rt 2007 with a b4ng! Make WBFS YOUR PORTFOLIO's first winner of the year ... campus"

Feature (word)    x     w     x'
start             1    +2     0
bang              1    +1     0
portfolio         1    +1     1
winner            1    +1     1
year              1    +1     1
...              ...   ...   ...
university        0    -3     0
campus            0    -4     1

wᵀx  = +6 > 0 → SPAM (correctly classified)
wᵀx' = +3 − 4 < 0 → HAM (misclassified email)
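This attack is easy to script once the weights are known. Below is a minimal sketch in Python; the vocabulary and weights are the toy values from the example above, and the greedy flipping strategy is an illustration, not the authors' implementation.

```python
import numpy as np

# Toy vocabulary and weights from the example above (illustrative values only)
vocab = ["start", "bang", "portfolio", "winner", "year", "university", "campus"]
w = np.array([+2, +1, +1, +1, +1, -3, -4])
x = np.array([1, 1, 1, 1, 1, 0, 0])   # original email: w @ x = +6 > 0 -> SPAM

def evade_linear(x, w, budget):
    """Greedily flip at most `budget` features, always choosing the flip that
    decreases w^T x the most (obfuscate a 'bad' word or add a 'good' word)."""
    x_adv = x.copy()
    for _ in range(budget):
        gain = w * (1 - 2 * x_adv)    # score change obtained by flipping each bit
        j = np.argmin(gain)
        if gain[j] >= 0:
            break                     # no remaining flip lowers the score
        x_adv[j] = 1 - x_adv[j]
    return x_adv

x_adv = evade_linear(x, w, budget=3)
print(x_adv, w @ x_adv)   # adds 'campus' and 'university', drops 'start': score -3 -> HAM
```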
But… what if the classifier is non-linear?
Gradient-based Evasion
• Goal: maximum-confidence evasion
• Attack strategy:

min_x' g(x')
s.t. d(x, x') ≤ dmax

• Non-linear, constrained optimization
– Gradient descent: approximate solution for smooth functions
• Gradients of g(x) can be analytically computed in many cases
– SVMs, neural networks

f(x) = sign(g(x)) = +1 (malicious), −1 (legitimate)

[Figure: feasible domain d(x, x') ≤ dmax around the original sample x and the resulting evasion point x'.]
[Biggio et al., ECML 2013]
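A minimal sketch of this attack strategy, assuming the discriminant gradient is available as a Python callable grad_g; the function name, step size, and iteration count are illustrative, not the settings used in the paper.

```python
import numpy as np

def gradient_evasion(x, grad_g, d_max, step=0.1, n_iter=100):
    """Maximum-confidence evasion sketch: descend on g(x') and project the
    perturbed sample back onto the feasible domain d(x, x') <= d_max,
    here taken to be an l2-ball centred on the original sample x."""
    x_adv = x.copy()
    for _ in range(n_iter):
        x_adv = x_adv - step * grad_g(x_adv)   # move towards lower g (more 'legitimate')
        delta = x_adv - x
        dist = np.linalg.norm(delta)
        if dist > d_max:                        # projection onto the feasible domain
            x_adv = x + delta * (d_max / dist)
    return x_adv
```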
Computing Descent Directions
Support vector machines:
g(x) = Σ_i α_i y_i k(x, x_i) + b,   ∇g(x) = Σ_i α_i y_i ∇k(x, x_i)
RBF kernel gradient: ∇k(x, x_i) = −2γ exp(−γ ||x − x_i||²) (x − x_i)

Neural networks (sigmoidal output over hidden units δ_1, ..., δ_m, with weights w_k and v_kf):
g(x) = [1 + exp(−Σ_{k=1..m} w_k δ_k(x))]⁻¹
∂g(x)/∂x_f = g(x) (1 − g(x)) Σ_{k=1..m} w_k δ_k(x) (1 − δ_k(x)) v_kf

[Figure: feed-forward network with inputs x_1, ..., x_d, hidden units δ_1, ..., δ_m, and output g(x).]
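For an RBF-kernel SVM, these formulas can be evaluated directly from the fitted dual variables. Below is a minimal sketch assuming a scikit-learn SVC; the helper function itself is illustrative, and gamma is passed explicitly and must match the value used to train the classifier.

```python
import numpy as np
from sklearn.svm import SVC

def rbf_svm_gradient(clf, x, gamma):
    """Gradient of g(x) = sum_i alpha_i y_i k(x, x_i) + b for a fitted RBF SVC.
    clf.dual_coef_ stores alpha_i * y_i for the support vectors."""
    sv = clf.support_vectors_                       # support vectors x_i
    coef = clf.dual_coef_.ravel()                   # alpha_i * y_i
    diff = x - sv                                   # (n_sv, d)
    k = np.exp(-gamma * np.sum(diff ** 2, axis=1))  # k(x, x_i)
    # grad k(x, x_i) = -2 * gamma * exp(-gamma * ||x - x_i||^2) * (x - x_i)
    return (-2.0 * gamma * coef * k) @ diff

# Usage (illustrative):
# clf = SVC(kernel="rbf", gamma=0.01).fit(X_train, y_train)
# g_grad = rbf_svm_gradient(clf, x_test[0], gamma=0.01)
```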
But… what if the classifier is non-differentiable?
[Biggio et al., ECML 2013]
Dense and Sparse Evasion Attacks
• L2-norm noise corresponds to dense evasion attacks
– All features are modified by a small amount

min_x' g(x')  s.t.  ||x − x'||₂ ≤ dmax

• L1-norm noise corresponds to sparse evasion attacks
– Few features are significantly modified

min_x' g(x')  s.t.  ||x − x'||₁ ≤ dmax
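The two constraints translate into different update and projection rules in the gradient attack sketched earlier. Below is a minimal illustration with hypothetical helper functions, not the authors' code.

```python
import numpy as np

def project_l2(x, x_adv, d_max):
    """Dense attack: rescale the perturbation so that ||x' - x||_2 <= d_max."""
    delta = x_adv - x
    dist = np.linalg.norm(delta)
    return x_adv if dist <= d_max else x + delta * (d_max / dist)

def sparse_step(x_adv, grad, step, k=1):
    """Sparse attack: modify only the k features with the largest gradient
    magnitude, so that few features change, each by a significant amount."""
    idx = np.argsort(-np.abs(grad))[:k]
    x_new = x_adv.copy()
    x_new[idx] -= step * np.sign(grad[idx])
    return x_new
```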
Goal of This Work
• Secure learning against evasion attacks exploits game-theoretical models, robust optimization, multiple classifiers, adversarial training, etc.
• Practical adoption of current secure learning algorithms is hindered by several factors:
– strong theoretical requirements
– complexity of implementation
– scalability issues (computational time and space for training)

Our goal: to develop secure kernel machines that are not computationally more demanding than their non-secure counterparts
Secure Linear Classifiers
• Intuition in previous work on spam filtering
[Kolcz and Teo, CEAS 2007; Biggio et al., IJMLC 2010]
– the attacker aims to modify few features
– features assigned the highest absolute weights are modified first
– heuristic methods to design secure linear classifiers with more evenly-distributed weights
• We now know that the aforementioned attack is sparse
– i.e., l1-norm constrained

Then, what does "more evenly-distributed weights" mean from a theoretical perspective?
Robustness and Regularization
[Xu et al., JMLR 2009]
• SVM learning is equivalent to a robust optimization problem
– regularization depends on the noise on training data!

Regularized SVM:
min_{w,b} ½ wᵀw + C Σ_i max(0, 1 − y_i f(x_i))

Robust optimization problem:
min_{w,b} max_{u_i ∈ U} Σ_i max(0, 1 − y_i f(x_i + u_i))

• l2-norm regularization is optimal against l2-norm noise!
• infinity-norm regularization is optimal against l1-norm noise!
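To make the second statement concrete, an infinity-norm regularized linear SVM can be trained by solving a linear program, since both ||w||_inf and the hinge losses are piecewise linear. Below is a minimal sketch using scipy.optimize.linprog; this is an illustrative formulation, not the training code used in the experiments.

```python
import numpy as np
from scipy.optimize import linprog

def train_linf_svm(X, y, C=1.0):
    """Infinity-norm regularized linear SVM as an LP (sketch).
    Variables: [w (d), b, t, xi (n)], minimizing t + C * sum(xi)
    with t >= |w_j| for all j and xi_i >= 1 - y_i (w.x_i + b)."""
    n, d = X.shape
    c = np.concatenate([np.zeros(d + 1), [1.0], C * np.ones(n)])

    # |w_j| <= t, written as  w_j - t <= 0  and  -w_j - t <= 0
    A_box = np.zeros((2 * d, d + 2 + n))
    A_box[:d, :d] = np.eye(d)
    A_box[d:, :d] = -np.eye(d)
    A_box[:, d + 1] = -1.0

    # hinge constraints: -y_i (w.x_i + b) - xi_i <= -1
    A_hinge = np.zeros((n, d + 2 + n))
    A_hinge[:, :d] = -y[:, None] * X
    A_hinge[:, d] = -y
    A_hinge[:, d + 2:] = -np.eye(n)

    A = np.vstack([A_box, A_hinge])
    b_ub = np.concatenate([np.zeros(2 * d), -np.ones(n)])
    bounds = [(None, None)] * (d + 1) + [(0, None)] + [(0, None)] * n

    res = linprog(c, A_ub=A, b_ub=b_ub, bounds=bounds, method="highs")
    w, b = res.x[:d], res.x[d]
    return w, b

# Usage (illustrative), with labels y in {-1, +1} as floats:
# w, b = train_linf_svm(X_train, y_train, C=1.0)
# pred = np.sign(X_test @ w + b)
```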
Cost-sensitive Learning
• Unbalancing the cost of classification errors to account for different levels of noise over the training classes [Katsumata and Takeda, AISTATS '15]
• Evasion attacks: higher amount of noise on malicious data
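With kernel machines this only amounts to assigning unbalanced error costs at training time. Below is a minimal sketch with scikit-learn; the class labels and cost values are illustrative, and weighting the malicious class more heavily is one way to account for the larger noise expected on it.

```python
from sklearn.svm import SVC

# Cost-sensitive SVM: errors on the malicious class (+1) are penalized more
# heavily than errors on the benign class (-1). Cost values are illustrative.
csvm = SVC(kernel="rbf", C=1.0, gamma=0.01, class_weight={-1: 1.0, +1: 10.0})
# csvm.fit(X_train, y_train)   # y_train in {-1, +1}
```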
Experiments on MNIST Handwritten Digits
• 8 vs 9, 28x28 images (784 features – grey-level pixel values)
• 500 training samples, 500 test samples, 5 repetitions
• Parameter tuning (max. detection rate at 1% FP)
[Figure: TP at FP=1% vs. dmax for the dense attack (left) and the sparse attack (right) on handwritten digits, comparing SVM, cSVM, I-SVM, and cI-SVM.]
Secure Kernel Machines
• Key Idea: to better enclose benign data (eliminate blind spots)
– Adversarial Training / Game-theoretical models
• We can achieve a similar effect by properly modifying the SVM parameters (classification costs and kernel parameters), as sketched below

[Figure: decision regions for the standard SVM, cost-sensitive learning, and kernel modification.]
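A minimal sketch of how these three configurations differ in scikit-learn terms; the parameter values are illustrative guesses, not the tuned values from the paper, and plot_decision_regions is a hypothetical plotting helper.

```python
from sklearn.svm import SVC

# Three configurations of the same RBF SVM. Increasing gamma makes the decision
# function more local, which (together with unbalanced costs) helps enclose the
# benign data and remove blind spots far from it. Values are illustrative.
configs = {
    "standard SVM":        dict(C=1.0, gamma=0.01, class_weight=None),
    "cost-sensitive":      dict(C=1.0, gamma=0.01, class_weight={-1: 1, +1: 10}),
    "kernel modification": dict(C=1.0, gamma=0.1,  class_weight={-1: 1, +1: 10}),
}
models = {name: SVC(kernel="rbf", **params) for name, params in configs.items()}
# for name, clf in models.items():
#     clf.fit(X_train, y_train)
#     plot_decision_regions(clf, X_train, y_train, title=name)   # hypothetical helper
```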
Experiments on PDF Malware Detection
• Lux0R [Corona et al., AISec '14]

[Figure: Lux0R architecture: runtime analysis of the JavaScript in PDF files, API reference extraction and selection of suspicious references, and a learning-based model classifying files as benign or malicious.]

• Adversary's capability
– adding up to dmax API calls (e.g., eval, isNaN, this.getURL, ...)
– removing API calls may compromise the embedded malware code

min_x' g(x')
s.t. d(x, x') ≤ dmax,  x ≤ x'
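The extra constraint x ≤ x' only changes the projection step of the gradient attack sketched earlier: the perturbation may only add API references, and at most dmax of them. Below is a minimal illustrative sketch; grad_g is the same gradient oracle as before, and the final rescaling is a simple heuristic rather than an exact l1 projection.

```python
import numpy as np

def evade_add_only(x, grad_g, d_max, step=1.0, n_iter=50):
    """Evasion under the PDF constraints: features may only grow (x' >= x),
    and at most d_max API references may be added in total."""
    x_adv = x.astype(float).copy()
    for _ in range(n_iter):
        x_adv = x_adv - step * grad_g(x_adv)
        x_adv = np.maximum(x_adv, x)          # additions only: enforce x' >= x
        added = x_adv - x                     # non-negative by construction
        if added.sum() > d_max:               # keep within the addition budget
            x_adv = x + added * (d_max / added.sum())
    return x_adv
```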
Experiments on PDF Malware Detection
• Lux0R Data: Benign / Malicious PDF files with JavaScript
– 5,000 training samples, 5,000 test samples, 5 repetitions
– 100 API calls selected from training data
• Detection rate (TP) at FP=1% vs max. number of added API calls
Conclusions and Future Work
• Classifier security can be significantly improved by properly tuning classifier parameters
– regularization terms
– cost-sensitive learning, kernel parameters
Future Work
• Security / complexity comparison against current adversarial approaches
– Adversarial Training / Game-theoretical models
• More theoretical insights on classifier / feature vulnerability