
Evade Hard Multiple Classifier Systems


Multiple classifier systems are widely used in security applications like biometric personal authentication, spam filtering, and intrusion detection in computer networks. Several works experimentally showed their effectiveness in these tasks. However, their use in such applications is motivated only by intuitive and qualitative arguments. In this work we give a first possible formal explanation of why multiple classifier systems are harder to evade, and therefore more secure, than a system based on a single classifier. To this end, we exploit a theoretical framework recently proposed to model adversarial classification problems. A case study in spam filtering illustrates our theoretical findings.

Speaker notes
  • Make clear what W(x, x′) is, namely the cost of adding words, etc., and that it is a kind of similarity measure between patterns, so it equals 0 if and only if x = x′.
  • Introduce biometrics, then draw the parallel with spam and IDSs. In many security systems, hardness of evasion can be improved by combining several experts trained on redundant and heterogeneous features. MCSs provide a very natural architecture for this task. Our goal is to provide a more formal explanation of this phenomenon, using the framework described earlier.
  • Specify how we simulated the game: the adversary plays its optimal strategy, and the classifier adds modules.
  • Transcript

    • 1. P R A G – Pattern Recognition and Applications Group, Department of Electrical and Electronic Engineering, University of Cagliari, Italy. Evade Hard Multiple Classifier Systems. Battista Biggio, Giorgio Fumera, Fabio Roli. ECAI / SUEMA 2008, Patras, Greece, July 21st - 25th
    • 2. About me • Pattern Recognition and Applications Group (http://prag.diee.unica.it) – DIEE, University of Cagliari, Italy • Contact – Battista Biggio, Ph.D. student, battista.biggio@diee.unica.it
    • 3. Pattern Recognition and Applications Group (P R A G) • Research interests – Methodological issues: multiple classifier systems, classification reliability – Main applications: intrusion detection in computer networks; multimedia document categorization and spam filtering; biometric authentication (fingerprint, face); content-based image retrieval
    • 4. Why are we working on this topic? • MCSs are widely used in security applications, but… – theoretical motivation is lacking: only a few theoretical works address machine learning for adversarial classification • Goal of this (ongoing) work – to give some theoretical background to the use of MCSs in security applications
    • 5. Outline • Introducing the problem – Adversarial Classification • A study on MCSs for adversarial classification – MCS hardening strategy: adding classifiers trained on different features – A case study in spam filtering: SpamAssassin
    • 6. Adversarial Classification [Dalvi et al., Adversarial Classification, 10th ACM SIGKDD Int. Conf., 2004] • An intelligent, adaptive adversary modifies patterns to defeat the classifier – e.g., spam filtering, intrusion detection systems (IDSs) • Goals – How to design adversary-aware classifiers? – How to improve classifier hardness of evasion?
    • 7. Definitions [Dalvi et al., 2004] • Two-class problem – positive/malicious patterns (+) vs. negative/innocent patterns (−) • Instance space X = {X1, …, XN}, where each Xi is a feature; instances x ∈ X (e.g., emails) • Classifier C : X → {+, −}, with c ∈ C a concept class (e.g., a linear classifier) • Adversarial cost function W : X × X → ℝ (e.g., more legible spam is better); see the formal summary below [Figure: three plots of the (X1, X2) feature plane illustrating the instance space, the classifier's decision boundary, and the adversarial cost]
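      Collecting slide 7's definitions together with the speaker note on W(x, x′), the framework's ingredients can be summarised as follows; note that the non-negativity and identity properties in the second line come from the speaker note rather than the slide itself:

      ```latex
      \begin{gather*}
      X = \{X_1, \dots, X_N\}, \qquad
      C : X \to \{+,-\}, \qquad
      W : X \times X \to \mathbb{R} \\
      W(x, x') \ge 0, \qquad W(x, x') = 0 \iff x = x'
      \end{gather*}
      ```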
    • 8. Adversarial cost function • The cost is related to – the adversary's effort (e.g., using a different server for sending spam) – the attack's effectiveness (more legible spam is better!) • Example – Original spam message: BUY VIAGRA! – easily detected by the classifier – Slightly modified message: BU-Y V1@GR4! – can evade the classifier and still be effective – No-longer-legible (ineffective) message: B--Y V…! – can evade several systems, but who will still buy Viagra?
    • 9. A framework for adversarial classification [Dalvi et al., 2004] • Problem formulation – A two-player game: Classifier vs. Adversary • Utility and cost functions for each player • The Classifier chooses a decision function C(x) at each ply • The Adversary chooses a modification function A(x) to evade the classifier • Assumptions in Dalvi et al., 2004 – Perfect information • The Adversary knows the classifier's discriminant function C(x) • The Classifier knows the Adversary's strategy A(x) for modifying patterns – Actions • The Adversary can only modify malicious patterns at the operation phase (the training process is untainted)
    • 10. In a nutshell [Lowd & Meek, Adversarial Learning, 11th ACM SIGKDD Int. Conf., 2005] • Adversary's task – choose minimum-cost modifications to evade the classifier (formalised below) • Classifier's task – choose a new decision function to minimise the expected risk
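      In formulas, the adversary's side of the game amounts to a constrained minimisation. This is a condensed form: Dalvi et al.'s full strategy additionally checks that the utility gained by evading outweighs the modification cost, and leaves x unchanged otherwise:

      ```latex
      A(x) = \operatorname*{arg\,min}_{\,x' \in X \,:\, C(x') = -} W(x, x')
      ```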
    • 11. Adversary's strategy [Figure: in the (x1, x2) feature plane, split into regions C(x) = + and C(x) = −, the original spam x ("BUY VIAGRA!") lies on the + side; its minimum-cost camouflage x′ ("BUY VI@GRA!") just crosses the boundary into the − region, while too-high-cost camouflages such as "B--Y V…!" lie far beyond it]
    • 12. Classifier's strategy • The Classifier knows A(x) [perfect information] – an adversary-aware classifier • Dalvi et al. showed that an adversary-aware classifier can perform significantly better [Figure: the adversary-aware classifier shifts its decision boundary so that some camouflaged points x′ are now detected, while others still evade]
    • 13. Goals of this work • Analysis of a widely used strategy for hardening MCSs – using different sets of heterogeneous and redundant features [Giacinto et al. (2003), Perdisci et al. (2006)] • Only heuristic and qualitative motivations have been given so far • Using the described framework, we give a more formal explanation of the effectiveness of this strategy
    • 14. An example of the considered strategy • Biometric verification system [Figure: fingerprint, face, and voice matchers, together with the claimed identity, feed a decision rule that outputs genuine or impostor]
    • 15. Another example of the considered strategy • Spam filtering (http://spamassassin.apache.org) [Figure: modules such as header analysis, black/white list, URL filter, signature filter, and content analysis each produce a score; the scores are summed (Σ) and the class, legitimate or spam, is assigned]
    • 16. Applying the framework to the spam filtering case • Cost for the Adversary – Original message "BUY VIAGRA!": header analysis s1 = 0.2, black/white list s2 = 0, signature filter s3 = 0, text classifier s4 = 2.5, keyword filters sN = 3, giving a total score s = 5.7; since s ≥ 5 (the threshold), the message is labelled spam – Modified message "BUY VI@GR4!": the keyword filters no longer fire (sN = 0), so s = 2.7 < 5 and the message is labelled legitimate – Working assumption: changing "VIAGRA" to "VI@GR4" costs 3! (A toy implementation of this score-sum decision follows.)
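      A minimal sketch of the score-sum architecture on slides 15-16; the module names in the dictionaries are our labels, and the 5.0 threshold follows the slide (it is also SpamAssassin's default required score):

      ```python
      def filter_decision(scores, threshold=5.0):
          """Score-sum filter: each module contributes a score, and the
          mail is labelled spam iff the sum reaches the threshold."""
          return "spam" if sum(scores.values()) >= threshold else "legitimate"

      # Slide 16's numbers for "BUY VIAGRA!" and its obfuscation "BUY VI@GR4!"
      original = {"header": 0.2, "bw_list": 0.0, "signature": 0.0,
                  "text_classifier": 2.5, "keyword_filters": 3.0}
      obfuscated = dict(original, keyword_filters=0.0)  # "VI@GR4" evades the keyword filters

      print(filter_decision(original))    # spam       (s = 5.7)
      print(filter_decision(obfuscated))  # legitimate (s = 2.7)
      ```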
    • 17. Applying the framework to the spam filtering case • Image spam: the text ("AFM Continues to Climb. Big News On Horizon | UP 50 % This Week. Aerofoam Metals Inc. Symbol: AFML. Price: $0.10. UP AGAIN. Status: Strong Buy") is embedded into an image! – Evading the text classifier this way (its score drops from 2.5 to 0, at an evasion cost of 2.5) brings the total score from s = 5.7 down to 3.2 < 5 – Adding an image analysis module (sN+1 = 3, evasion cost 3.0) raises the total back to 6.2 ≥ 5 – Now both the text and image classifiers must be evaded to evade the filter! (See the evasion-cost sketch below.)
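      The adversary's side can be made concrete with a brute-force search over which modules to evade. This is a hypothetical sketch: the keyword, text, and image evasion costs come from slides 16-17, while the header module's cost is our assumption (set prohibitively high), and zero-score modules are omitted:

      ```python
      from itertools import combinations

      def min_evasion_cost(scores, costs, threshold=5.0):
          """Cheapest set of modules to evade (zeroing their scores) so
          that the remaining score sum falls below the threshold."""
          best = float("inf")
          for r in range(len(scores) + 1):
              for evaded in combinations(scores, r):
                  remaining = sum(s for m, s in scores.items() if m not in evaded)
                  if remaining < threshold:
                      best = min(best, sum(costs[m] for m in evaded))
          return best

      scores = {"header": 0.2, "text_classifier": 2.5, "keyword_filters": 3.0}
      costs  = {"header": 10.0,          # assumed prohibitively expensive to evade
                "text_classifier": 2.5,  # slide 17: embed the text in an image
                "keyword_filters": 3.0}  # slide 16: change VIAGRA to VI@GR4

      print(min_evasion_cost(scores, costs))  # 2.5: evading the text classifier suffices

      # Hardened filter (slide 17): an image analysis module also fires
      scores["image_analysis"] = 3.0
      costs["image_analysis"]  = 3.0
      print(min_evasion_cost(scores, costs))  # 5.5: no single evasion works any more
      ```

      With the adversary's maximum affordable cost fixed at 5, as in the experiments on slides 21-22, a minimum evasion cost of 5.5 means the optimal move is to give up, which is exactly the point of the next slide.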
    • 18. Forcing the adversary to surrender • Hardening the system by adding modules can make evasion too costly for the adversary – in the end, the adversary's optimal strategy becomes not to fight at all! "The ultimate warrior is one who wins the war by forcing the enemy to surrender without fighting any battles" – The Art of War, Sun Tzu, 500 BC
    • 19. Experimental setup • SpamAssassin – 619 tests – includes a text classifier (naive Bayes) • Data set: TREC 2007 spam track – 75,419 e-mails (25,220 ham, 50,199 spam) – we used the first 10K e-mails (taken in chronological order) to train the SpamAssassin naive Bayes classifier
    • 20. Experimental setup • Adversary – cost simulated at the score level: Manhattan distance between test scores (written out below) – maximum cost fixed; rationale: higher-cost modifications would make the spam message no longer effective/legible • Classifier – we did not take into account the computational cost of adding tests • Performance measure – expected utility
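      In our notation, with s_i(x) the score of the i-th SpamAssassin test on message x, the simulated cost of turning x into x′ is then:

      ```latex
      W(x, x') = \sum_{i} \left| s_i(x) - s_i(x') \right|,
      \qquad \text{subject to } W(x, x') \le W_{\max}
      ```

      where W_max is the fixed maximum cost (1 and 5 in the results that follow).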
    • 21. Experimental results [plot; maximum cost = 1]
    • 22. Experimental results [plot; maximum cost = 5]
    • 23. Will spammers give up? • Spammer economics – Goal: beat enough of the filters temporarily to get a small fraction of mails through and generate a quick profit – As filter accuracy increases, spammers simply send larger quantities of spam to keep the same amount of mail getting through • the cost of sending spam is negligible with respect to the achievable profit! • Is it feasible to push the accuracy of spam filters up to the point where only ineffective spam messages can get through? – Otherwise spammers won't give up!
    • 24. Future work • Theory of adversarial classification – extend the model to more realistic situations • Investigating other defence strategies – we are expanding the framework to model information hiding strategies [Barreno et al. (2006)] – possible implementation: randomising the placement of the decision boundary (see the sketch below) "Keep the adversary guessing. If your strategy is a mystery, it cannot be counteracted. This gives you a significant advantage" – The Art of War, Sun Tzu, 500 BC
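      One possible reading of "randomising the placement of the decision boundary", sketched as score-threshold jitter; the function, the jitter parameter, and the uniform noise are our assumptions, not part of the slides or of Barreno et al.:

      ```python
      import random

      def randomized_decision(score, threshold=5.0, jitter=0.5):
          """Hypothetical sketch: perturb the decision threshold on every
          query, so repeated probing cannot pin down the exact boundary."""
          effective_threshold = threshold + random.uniform(-jitter, jitter)
          return "spam" if score >= effective_threshold else "legitimate"
      ```

      This breaks the perfect-information assumption of slide 9: a minimum-cost camouflage computed against one draw of the boundary may still be caught under the next.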
    • 25. Thank you! • Contacts – roli@diee.unica.it – fumera@diee.unica.it – battista.biggio@diee.unica.it
