Adversarial Pattern Classification
Presentation of PhD Thesis by Battista Biggio

Adversarial Pattern Classification: Presentation Transcript

  • PhD in Electronic and Computer Engineering
    Adversarial Pattern Classification
    Battista Biggio, XXII cycle
    Advisor: Prof. Fabio Roli
    Department of Electrical and Electronic Engineering, University of Cagliari, Italy
  • Outline
    – Problem definition
    – Open issues
    – Contributions of this thesis, with experiments
    – Conclusions and future work
    05-03-2010, Adversarial Classification - B. Biggio
  • What is adversarial classification?
    – Pattern recognition in security applications: spam filtering, intrusion detection, biometrics
    – Malicious adversaries aim to mislead the system
    [Figure: two-class feature space (x1, x2) with decision function f(x); the spam message "Buy viagra!" is disguised as "Buy vi4gr@!" to cross into the legitimate region]
  • Open issues
    1. Vulnerability identification: potential vulnerabilities may be exploited by an adversary to mislead the system
    2. Performance evaluation under attack: standard performance evaluation does not provide information about the robustness of a classifier under attack
    3. Defence strategies for robust classifier design: classification algorithms were not originally designed to be robust against adversarial attacks
  • Main contributions of this thesis
    1. State of the art in adversarial classification, highlighting the need for a unifying view of the problem
    2. Robustness evaluation, to estimate the performance of a classifier under attack and to select a more appropriate classification model
    3. Defence strategies for robust classifier design, to improve the robustness of classifiers under attack
  • 1. State of the art
  • State of the art
    – Vulnerability identification: good word attacks in spam filtering [Wittel, Lowd, Graham-Cumming]; polymorphic and poisoning attacks in IDSs [Fogla, Lee, Kloft, Laskov]; possible attacks against biometric verification systems [Ratha, Jain]
    – Defence strategies against specific attacks: good word attacks in spam filtering [Jorgensen, Nelson]; polymorphic and poisoning attacks in IDSs [Perdisci, Cretu]; spoof attacks in biometrics [Rodrigues]
    – No general methodology exists to evaluate the performance of classifiers under attack
  • State of the art
    A clear and unifying view of the problem, as well as practical guidelines for the design of classifiers in adversarial environments, does not yet exist!
  • 2. Robustness evaluation
  • Standard performance evaluation
    [Diagram: collected data is split into a training set, used to build the classifier, and a testing set, used to measure accuracy on classes C1 and C2]
    Techniques: validation, cross-validation, bootstrap, …
    Performance measures: classification accuracy, ROC curve, Area Under the ROC curve (AUC), …
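The AUC measure listed above can be computed directly from classifier scores via its rank-statistic (Mann-Whitney) formulation. This is a minimal illustrative sketch, not code from the thesis; the function name is invented for the example:

```python
def auc(scores_pos, scores_neg):
    # AUC as the probability that a randomly drawn positive sample
    # receives a higher score than a randomly drawn negative one
    # (ties count as 0.5): the Mann-Whitney formulation.
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))
```

A perfect ranking gives `auc == 1.0`, while a classifier that scores both classes identically gives `0.5`.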
  • Problems
    – Standard performance evaluation is likely to provide an optimistic estimate of the performance [Kolcz]
    1. Collected data may not include attacks at all; for example, biometric systems are typically not tested against spoof attacks
  • Problems
    2. Collected data may contain attacks that were not targeted against the system being designed; attacks collected in spam filtering or IDSs may have targeted systems based on different features
  • Problems
    3. Collected data does not contain attacks of varying attack strength, e.g., the number of words modified in spam e-mails ("Buy viagra!" → "Buy vi4gr4!" → "Buy vi4gr4! Did you ever play that game when you were a kid?")
    It is therefore of interest to evaluate the robustness of classifiers under different attack strengths
  • Robustness evaluation
    – Result of our robustness evaluation: performance vs attack strength
    – Example: performance degradation of text classifiers in spam filtering under different numbers of modified words
    [Figure: accuracy of classifiers C1 and C2 plotted against attack strength, alongside the single point given by standard performance evaluation]
  • Robustness evaluation
    – Robustness evaluation is required for a more complete understanding of the classifier's performance; we need to figure out how an adversary may attack the classifier (security by design)
    – Designing real attacks may be very difficult: it requires in-depth knowledge of the specific application and is costly and time-consuming (e.g., fake fingerprints)
    – We thus propose to simulate the effect of attacks by modifying the feature values of malicious samples
  • Attack simulation
    – Biometric multi-modal verification system
    – Potential attacks: spoof attempts (face spoof, fingerprint spoof)
    [Diagram: face and fingerprint matchers produce scores s1 and s2 for a claimed identity; a fusion module combines them through f(x) to decide genuine / impostor; spoof attempts shift impostor scores toward the genuine region]
  • Attack simulation
    – Text classifiers in spam filtering: binary features (presence / absence of a word)
    – Potential attacks: bad word obfuscation (BWO) / good word insertion (GWI)
    "Buy viagra!" → "Buy vi4gr4! Did you ever play that game when you were a kid where the little plastic hippo tries to gobble up all your marbles?"
    x = [0 0 1 0 0 0 0 0 …] → x' = [0 0 0 0 1 0 0 1 …], with x' = A(x)
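The BWO/GWI manipulation on a binary word-presence vector can be sketched as below. This is a minimal illustration under the slide's feature model; the function name and index arguments are invented for the example:

```python
def gwi_bwo_attack(x, bad_idx, good_idx):
    # x: binary word-presence vector of a spam message.
    # Bad word obfuscation: clear the bits of "spammy" words
    # ("viagra" -> "vi4gr4"), so the filter no longer sees them.
    # Good word insertion: set the bits of legitimate-looking words.
    xp = list(x)
    for i in bad_idx:
        xp[i] = 0
    for i in good_idx:
        xp[i] = 1
    return xp
```

With `bad_idx=[2]` and `good_idx=[4, 7]` this reproduces the slide's transformation of x into x'.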
  • Attack strength
    – Distance in the feature space, chosen depending on the application and features
    – Example: text classifiers in spam filtering with binary features (presence / absence of a word)
    "Buy viagra! …" → "Buy vi@gr4! …"
    x = [0 0 1 0 1 …] → x' = [0 0 0 0 1 …]
    Hamming distance d(x, x') = 1, i.e. the number of words modified in the spam message
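The distance measure above is just the Hamming distance between the two binary vectors; a one-line sketch (illustrative, with an invented function name):

```python
def hamming(x, xp):
    # Attack strength as Hamming distance: the number of binary
    # feature values (word-presence bits) that differ between x and x'.
    return sum(a != b for a, b in zip(x, xp))
```

On the slide's example vectors this yields d(x, x') = 1, i.e. one word modified.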
  • Attack strategy A(x)
    [Figure: two candidate attack strategies A1(x) and A2(x) map "Buy viagra!" to modified samples such as "B-u-y viagra!" and "Buy vi4gr@!", each within distance d(x, x') ≤ D of the original, here with D = 1]
    A(x) depends on the adversary's knowledge about the classifier!
  • Worst case attack
    – To simulate attacks which exploit knowledge of the decision function of the classifier
    f(x) = sign(g(x)): +1 (malicious), −1 (legitimate); e.g., g(x) = Σi wi xi + w0
    A(x) = argmin_x' g(x')  s.t. d(x, x') ≤ D
    [Figure: with D = 1, "Buy viagra!" is moved across the decision boundary f(x), e.g., to "B-u-y viagra!" or "Buy vi4gr@!"]
  • Worst case attack
    – Linear classifiers / binary features
    [Figure: words ranked by absolute weight (viagra, buy, …, kid, game); "Buy viagra!" becomes "Buy vi4gr@!", then "B-u-y vi4gr@!", then "B-u-y vi4gr@! game" as the budget D grows]
    – Features which have been assigned the highest absolute weights are modified first
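For a linear classifier on binary features, the constrained minimisation of g(x') reduces to the greedy rule stated above: flip the features that decrease g the most, i.e. those with the largest absolute weights, up to D changes. A minimal sketch (illustrative, not the thesis code; the function name and list-of-weights representation are invented):

```python
def worst_case_attack(x, w, w0, D):
    # Greedy worst-case evasion of g(x) = sum_i w_i x_i + w0
    # on binary features, with Hamming-distance budget D.
    xp = list(x)
    gains = []
    for i, wi in enumerate(w):
        if xp[i] == 1 and wi > 0:
            gains.append((wi, i, 0))   # obfuscate a "spammy" word: g drops by w_i
        elif xp[i] == 0 and wi < 0:
            gains.append((-wi, i, 1))  # insert a "good" word: g drops by |w_i|
    gains.sort(reverse=True)           # largest decrease of g first
    for _, i, v in gains[:D]:
        xp[i] = v
    return xp
```

With weights [2.0, −1.0, 0.5] and D = 1, the highest-weight active feature is cleared first; raising D to 2 additionally inserts the most negatively weighted "good" word.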
  • Experiments on spam filtering: text classifiers (worst case)
    – TREC 2007 public data set; training set: 10K e-mails, testing set: 10K e-mails
    – Features: words (tokens)
    – Classifiers (using different numbers of features): Logistic Regression (LR), linear SVM
    – Performance measure: AUC10%, the area under the ROC curve for false positive rates in [0, 0.1]
    [Figure: AUC10% plotted against attack strength for each classifier]
  • Mimicry attack
    – To simulate attacks where no information on the classification function is exploited
    – Malicious samples are camouflaged to mimic legitimate samples, e.g., spoof attempts, polymorphic attacks
    A(x) = argmin_x' d(x', x̄)  s.t. d(x, x') ≤ D, where x̄ is a legitimate sample
    [Figure: with D = 2, "Buy viagra!" becomes "Buy viagra! funny game", mimicking the legitimate message "Yesterday I played a funny game…"]
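A simple way to realise the mimicry objective on binary features is to copy feature values from the nearest legitimate sample until the modification budget D is spent. This is an illustrative sketch of that idea, not the thesis implementation; the function name is invented:

```python
def mimicry_attack(x, legit_samples, D):
    # Move the malicious sample x toward the nearest legitimate sample,
    # changing at most D features; no knowledge of the classifier is used.
    def hamming(a, b):
        return sum(u != v for u, v in zip(a, b))
    target = min(legit_samples, key=lambda z: hamming(x, z))
    xp = list(x)
    changed = 0
    for i, (a, b) in enumerate(zip(x, target)):
        if changed == D:
            break
        if a != b:
            xp[i] = b   # copy the legitimate feature value
            changed += 1
    return xp
```

If D is at least the distance to the nearest legitimate sample, the attack reproduces it exactly; otherwise it gets as close as the budget allows.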
  • Experiments on spam filtering: text classifiers (mimicry)
    – TREC 2007 public data set; training set: 10K e-mails, testing set: 10K e-mails
    – Features: words (tokens)
    – Classifiers (using different numbers of features): Logistic Regression (LR), linear SVM, Bayesian text classifier (SpamAssassin), SVM with RBF kernel
    [Figure: performance plotted against attack strength for each classifier]
  • Experiments on intrusion detection (mimicry)
    – Data set of real network traffic (Georgia Tech, 2006); training set: 20K legitimate packets; testing set: 20K legitimate packets + 66 distinct HTTP attacks (205 packets)
    – Packets are classified separately; features: relative byte frequencies over the 256 byte values (PAYL) [Wang]
    – One-class classifiers: Mahalanobis Distance classifier (MD), SVM with RBF kernel
    – Attack strength: percentage of bytes modified in a packet
    [Figure: performance plotted against attack strength]
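The PAYL feature extraction mentioned above, relative byte frequencies of a packet payload, amounts to a length-normalised 256-bin histogram. A minimal sketch (illustrative; the function name is invented):

```python
def payl_features(payload: bytes):
    # PAYL-style features: a 256-bin histogram of byte values,
    # normalised by payload length to give relative frequencies.
    counts = [0] * 256
    for b in payload:          # iterating bytes yields ints 0..255
        counts[b] += 1
    n = len(payload) or 1      # guard against empty payloads
    return [c / n for c in counts]
```

The resulting vector sums to 1 for any non-empty payload, so packets of different lengths are directly comparable.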
  • To sum up
    1. The proposed methodology for robustness evaluation extends standard performance evaluation to adversarial applications
    2. Experiments showed how this methodology may give useful insights for the design of PR systems in adversarial tasks, e.g., LR outperforms the SpamAssassin Bayesian text classifier
  • 3. Robust classifiers
  • Defence strategies for robust classifier design
    – Rationale: the discriminant capability of features may change at the operating phase due to attacks; avoiding under- or over-emphasising features may increase robustness against attacks which exploit some knowledge of the decision function
    [Figure: two weight distributions over words (viagra, buy, …, kid, game); the more uniform distribution is harder to evade]
    – Feature weighting for improved classifier robustness [Kolcz]: algorithms for improving the robustness of linear classifiers; the underlying idea is to obtain a more uniform set of weights
  • Robust classifiers by MCSs
    – An ensemble of K linear classifiers fk(x) = Σi wik xi + w0k, built from the data via bagging or the random subspace method (RSM), is combined by averaging: f(x) = (1/K) Σk fk(x)
    – We investigated whether bagging and RSM can be exploited to design more robust linear classifiers
    – The underlying idea is again to obtain a more uniform set of weights
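Averaging K linear classifiers yields another linear classifier whose weight vector is the element-wise mean of the base weights, which is how the ensemble "flattens" the weight distribution. A small sketch under that observation (illustrative; the function name and (weights, bias) tuple representation are invented):

```python
def average_linear(classifiers):
    # classifiers: list of (w, w0) pairs for f_k(x) = w . x + w0.
    # Their average (1/K) * sum_k f_k(x) is itself linear, with
    # element-wise averaged weights and bias.
    K = len(classifiers)
    d = len(classifiers[0][0])
    w = [sum(c[0][i] for c in classifiers) / K for i in range(d)]
    w0 = sum(c[1] for c in classifiers) / K
    return w, w0
```

Two base classifiers that each concentrate all weight on a single feature average into one that spreads the weight uniformly across both.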
  • Robust training
    – Adding simulated attacks to the training set
    [Diagram: the biometric multi-modal verification system of the attack-simulation slide; retraining on simulated spoof attempts moves the decision function from f(x) to f'(x)]
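The augmentation step described above can be sketched as follows: each malicious training sample is attacked and re-added with its true label, so the retrained classifier has seen the attack distribution. This is a minimal illustration, not the thesis code; the function name, the `attack` callable, and the label convention are invented:

```python
def robust_training_set(X, y, attack, malicious_label=1):
    # Augment (X, y) with simulated attack samples: apply the attack
    # to every malicious sample and append the result, keeping the
    # malicious label so the classifier learns to catch attacked samples.
    X_aug, y_aug = list(X), list(y)
    for xi, yi in zip(X, y):
        if yi == malicious_label:
            X_aug.append(attack(xi))
            y_aug.append(malicious_label)
    return X_aug, y_aug
```

The classifier is then retrained on the augmented set, yielding the shifted decision function f'(x).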
  • Experiments on spam filtering: SpamAssassin
    – SpamAssassin: open-source spam filter; a linear classifier over binary features (tests), whose default weights are manually tuned by the designers to improve robustness
    [Diagram: tests such as header analysis, the URL filter, the keyword filter, and the text classifier are combined with weights w1 … wn into a score s; s ≥ th → spam, s < th → legitimate]
    – TREC 2007 public data set: first 10,000 e-mails to train the text classifier, second 10,000 to train the linear decision function, third 10,000 as the testing set
  • Experiments on spam filtering: SpamAssassin (worst case)
    – Attack strength: number of evaded tests
    – Robust training is used to defend against worst case attacks
    – Defence strategies are not effective against the mimicry attack
    – The strategies proposed by Kolcz exhibited results similar to RSM and bagging
    [Figure: performance plotted against attack strength]
  • Conclusions and future work
    – Adversarial pattern classification and its open issues
    – Contributions of this thesis: state of the art in adversarial classification, a methodology for robustness evaluation, and defence strategies for robust classifier design
    – Experimental results provide useful insights for the design of PR systems in adversarial environments
    – Future work: theoretical investigation of adversarial classification; robustness evaluation of biometric verification systems