Adversarial Pattern Classification Using Multiple Classifiers and Randomisation


  1. Pattern Recognition and Applications (PRA) Group, University of Cagliari, Italy
     Department of Electrical and Electronic Engineering
     Adversarial pattern classification using multiple classifiers and randomization
     Battista Biggio, Giorgio Fumera, Fabio Roli
     S+SSPR 2008, Orlando, Florida, December 4th, 2008
  2. Standard pattern classification model
     [Diagram: a physical pattern (image, text document, ...) passes through an acquisition/measurement process, affected by random noise, and yields a feature vector (x1, x2, ..., xn) used by the learning algorithm and the classifier. Example: OCR.]
     But many security applications, such as spam filtering, do not fit this model well: the noise is not random, but adversarial, and the errors are malicious:
     - false negatives are not random; they are crafted to evade the classifier
     - training data can be "tainted" by the attacker
     - an important property of a classifier is its "hardness of evasion", that is, the effort the attacker must spend to evade it
  3. Adversarial pattern classification
     [Diagram: a pattern (e-mail, network packet, fingerprint, ...) passes through a measurement process, affected by adversarial noise, and yields a feature vector (x1, x2, ..., xn) used by the learning algorithm and the classifier. Example: spam e-mails, e.g. "CNBC Features MPRG on Power Lunch Today, Price Climbs 74%! The Motion Picture Group, Symbol: MPRG, Price: $0.33, UP 74%".]
     It is a game with two players: the classifier and the adversary.
     - The adversary camouflages illegitimate patterns to evade the classifier.
     - The classifier should be adversary-aware, in order to handle the adversarial noise and to implement defence strategies.
  4. An example of adversarial classification: spam filtering (1st round)
     E-mail: "From: spam@example.it / Buy Viagra!"
     Linear classifier with feature weights: buy = 1.0, viagra = 5.0.
     Total score = 6.0 > 5.0 (threshold), so the e-mail is labelled Spam.
     Note that the popular SpamAssassin filter is really a linear classifier; see http://spamassassin.apache.org
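A minimal sketch of such a linear text classifier, using the toy feature weights and threshold from the slide (illustrative code only, not SpamAssassin's implementation):

```python
import re

# Toy feature weights and decision threshold taken from the slide's example.
FEATURE_WEIGHTS = {"buy": 1.0, "viagra": 5.0}
THRESHOLD = 5.0

def score(message: str) -> float:
    """Sum the weights of the known features occurring in the message."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    return sum(w for feat, w in FEATURE_WEIGHTS.items() if feat in words)

def classify(message: str) -> str:
    return "Spam" if score(message) > THRESHOLD else "Ham"

print(score("Buy Viagra!"), classify("Buy Viagra!"))   # 6.0 Spam
```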
  5. A game in the feature space... (1st round)
     N. Dalvi et al., Adversarial classification, 10th ACM SIGKDD Int. Conf., 2004
     [Figure: the decision boundary yc(x) in the feature space (X1, X2) separates spam (+) from ham (-); the e-mail "From: spam@example.it / Buy Viagra!" falls on the spam side. Feature weights: buy = 1.0, viagra = 5.0.]
     The classifier's weights are learnt from an initial "untainted" training set. See, for example, the case of the SpamAssassin filter: http://spamassassin.apache.org/full/3.0.x/dist/masses/README.perceptron
  6. An example of adversarial classification: the spammer attacks by adding "good" words (2nd round)
     E-mail: "From: spam@example.it / Buy Viagra! Florida University Nanjing"
     Linear classifier with feature weights: buy = 1.0, viagra = 5.0, University = -2.0, Florida = -3.0.
     Total score = 1.0 < 5.0 (threshold), so the e-mail is labelled Ham.
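The same toy scorer shows how the added "good" words, which carry negative weights, drive the score below the threshold (weights and messages are the slide's example values):

```python
import re

# Weights learnt from the initial training set, where "good" words appear only in ham.
WEIGHTS = {"buy": 1.0, "viagra": 5.0, "university": -2.0, "florida": -3.0}
THRESHOLD = 5.0

def score(message: str) -> float:
    words = set(re.findall(r"[a-z]+", message.lower()))
    return sum(w for feat, w in WEIGHTS.items() if feat in words)

original = "Buy Viagra!"
attacked = "Buy Viagra! Florida University Nanjing"   # good-word attack

print(score(original))   # 6.0 > 5.0  -> Spam
print(score(attacked))   # 1.0 < 5.0  -> Ham: the filter is evaded
```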
  7. A game in the feature space... the spammer attacks by adding "good" words (2nd round)
     N. Dalvi et al., Adversarial classification, 10th ACM SIGKDD Int. Conf., 2004
     [Figure: the modified e-mail "Buy Viagra! Florida University Nanjing" is moved across the boundary yc(x) to the ham side. Feature weights: buy = 1.0, viagra = 5.0, University = -2.0, Florida = -3.0.]
     Adding good words is a typical trick used by spammers to evade a filter. The spammer's goal is to modify the mail so that the filter is evaded while the message remains understandable by humans.
  8. Modelling the spammer's attack strategy
     N. Dalvi et al., Adversarial classification, 10th ACM SIGKDD Int. Conf., 2004
     [Figure: in the feature space, a malicious pattern x on the spam side of yc(x) is transformed by A(x) into a pattern x' on the ham side.]
     The adversary uses a strategy function A(x) to select the malicious patterns that can be camouflaged as innocent with minimum cost W(x, x'):
        A(x) = argmax_{x' ∈ X} [ U_A(yc(x'), +) - W(x, x') ]
     The adversary's utility is higher when malicious patterns are misclassified: U_A(-, +) > U_A(+, +).
     For spammers, the cost W(x, x') is related to adding words, replacing words, etc. The adversary transforms a malicious pattern x into an innocent-looking pattern x' only if the camouflage cost W(x, x') is lower than the utility gain. In spam filtering, the adversary selects the spam mails that can be camouflaged as ham with a minimum number of modifications of the mail content.
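A sketch of this attack-strategy computation on the toy spam example; the candidate transformations, the word-count cost, and the utility values below are assumptions made for illustration:

```python
# Hypothetical instance of A(x) = argmax_{x' in X} [ U_A(yc(x'), +) - W(x, x') ].
U_A = {"-": 5.0, "+": 0.0}     # evading (label "-", ham) is worth more than being caught ("+")
WEIGHTS = {"buy": 1.0, "viagra": 5.0, "university": -2.0, "florida": -3.0}
THRESHOLD = 5.0

def yc(words):
    """Toy linear classifier: '+' = spam, '-' = ham."""
    return "+" if sum(WEIGHTS.get(w, 0.0) for w in words) > THRESHOLD else "-"

def W(x, x_prime):
    """Camouflage cost: here simply the number of added words (an assumption)."""
    return len(x_prime) - len(x)

def A(x, candidates):
    """Return the transformation maximizing utility minus cost; keep x if no move gains."""
    best, best_gain = x, U_A[yc(x)]          # doing nothing costs nothing
    for x_prime in candidates:
        gain = U_A[yc(x_prime)] - W(x, x_prime)
        if gain > best_gain:
            best, best_gain = x_prime, gain
    return best

x = ["buy", "viagra"]
candidates = [x + ["university"], x + ["university", "florida"]]
print(A(x, candidates))   # picks the cheapest camouflage that evades: ['buy', 'viagra', 'university']
```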
  9. An example of adversarial classification: classifier reaction by retraining (3rd round)
     E-mail: "From: spam@example.it / Buy Viagra! Florida University Nanjing"
     After retraining, the feature weights become: buy = 1.0, viagra = 5.0, University = -0.3, Florida = -0.3.
     Total score = 5.4 > 5.0 (threshold), so the e-mail is labelled Spam again.
  10. Modelling classifier reaction: retraining (3rd round)
     N. Dalvi et al., Adversarial classification, 10th ACM SIGKDD Int. Conf., 2004
     [Figure: the decision boundary yc(x) is moved so that the good-word e-mail "Buy Viagra! Florida University Nanjing" falls back on the spam side.]
     The classifier is adversary-aware: it takes into account the previous moves of the adversary. In real cases, this means that the filter's user provides the correct labels for the mislabelled mails. The classifier constructs a new decision boundary yc(x) if this move yields a utility higher than the cost of extracting features and re-training.
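A hypothetical sketch of this reaction: once the user's feedback relabels the good-word spam correctly, refitting a linear model weakens the negative weights of the abused "good" words (scikit-learn and the toy vocabulary are assumptions made for illustration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

VOCAB = ["buy", "viagra", "university", "florida"]

def features(msgs):
    # Binary bag-of-words over the toy vocabulary.
    return np.array([[1.0 if w in m.split() else 0.0 for w in VOCAB] for m in msgs])

ham     = ["university florida seminar", "florida university meeting"] * 20
spam    = ["buy viagra", "buy viagra now"] * 20
tainted = ["buy viagra florida university"] * 20   # good-word spam, relabelled by the user

def fit_weights(h, s):
    X, y = features(h + s), np.array([0] * len(h) + [1] * len(s))
    return dict(zip(VOCAB, LogisticRegression().fit(X, y).coef_[0].round(2)))

print("before feedback:", fit_weights(ham, spam))            # good words strongly negative
print("after feedback: ", fit_weights(ham, spam + tainted))  # their weights shrink toward zero
```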
  11. Adversary-aware classifier
     N. Dalvi et al., Adversarial classification, 10th ACM SIGKDD Int. Conf., 2004
     The results reported in this paper showed that classifier performance degrades significantly if the adversarial nature of the task is not taken into account, while an adversary-aware classifier can perform significantly better. By anticipating the adversary's strategy, we can defeat it.
     Real anti-spam filters should be adversary-aware, which means that they should adapt to and anticipate the adversary's moves: exploiting the feedback of the users, changing their operation, etc.
     "If you know the enemy and know yourself, you need not fear the result of a hundred battles" (Sun Tzu, 500 BC)
  12. Defence strategies in adversarial classification: beyond classifier retraining...
     Real anti-spam filters can be re-trained using the feedback of the users, who can provide the correct labels for the mislabelled mails. In the model of Dalvi et al., this corresponds to assuming perfect knowledge of the adversary's strategy function A(x).
     Beyond retraining, are there other defence strategies that we can implement?
     [Figure: minimum-cost camouflage(s) of a spam message ("BUY VI@GRA!") in the feature space (x1, x2), moving it from the region C(x) = + to the region C(x) = -.]
  13. A defence strategy: hiding information by randomization
     [Figure: two random realizations y1(x) and y2(x) of the boundary yc(x) in the feature space (X1, X2); the adversary, moving x to x', asks "Am I evading it?"]
     An intuitive strategy for making a classifier harder to evade is to hide information about it from the adversary. A possible implementation of this strategy is to introduce some randomness in the placement of the classification boundary.
     "Keep the adversary guessing. If your strategy is a mystery, it cannot be counteracted. This gives you a significant advantage" (Sun Tzu, 500 BC)
  14. A defence strategy: hiding information by randomization
     [Figure: two random realizations y1(x) and y2(x) of yc(x); the transformation A(x) = x' evades y2(x) but does not evade y1(x).]
     Consider a randomized classifier yc(x, T), where the random variable is the training set T.
     Example: assume that U_A(-, +) = 5, U_A(+, +) = 0, W(x, x') = 3.
     Case 1: the adversary knows the actual boundary y2(x). Its gain if the pattern x is changed into x' is U_A(-, +) - W(x, x') = 5 - 3 = 2 > 0, so the adversary performs the transformation and evades the classifier.
     Case 2: two random boundaries with P(y1(x)) = P(y2(x)) = 0.5. The expected gain is [U_A(y1(x'), +) * P(y1(x)) + U_A(y2(x'), +) * P(y2(x))] - W(x, x') = [0 * 0.5 + 5 * 0.5] - 3 = 2.5 - 3 < 0, so the adversary does not move, even though that move would evade one realization of the classifier.
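The slide's arithmetic, written out as a tiny sketch (the utilities, cost, and boundary probabilities are the example's assumed values):

```python
# Example values from the slide: U_A(-,+) = 5 (evasion), U_A(+,+) = 0, W(x, x') = 3.
U_EVADE, U_CAUGHT, COST = 5.0, 0.0, 3.0

# Case 1: the adversary knows it is facing y2(x), which x' evades.
gain_known = U_EVADE - COST
print(gain_known)        # 2.0 > 0  -> transform x into x' and evade

# Case 2: the boundary is one of y1(x), y2(x) with probability 0.5 each;
# x' evades y2 but not y1, so the adversary can only reason in expectation.
expected_gain = 0.5 * U_CAUGHT + 0.5 * U_EVADE - COST
print(expected_gain)     # -0.5 < 0 -> the adversary does not move
```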
  15. A defence strategy: hiding information by randomization
     [Figure: the same feature-space example; the adversary does not know which realization of the boundary it is facing.]
     Why is a randomized classifier harder to evade? Key points:
     - yc(x) becomes a random variable Y_C.
     - The adversary has to compute the expected value of A(x) by averaging over the possible realizations of yc(x):
          E_{Y_C}{A(x)} = argmax_{x' ∈ X} [ E_{Y_C}{U_A(Y_C(x'), +)} - W(x, x') ]
          E_{Y_C}{A(x)} ≠ A(x | yc(x)) = A_opt(x)
     In the Proceedings paper we show that the adversary's strategy A(x) becomes suboptimal: the adversary either does not camouflage malicious patterns that would have allowed evading the classifier, or camouflages malicious patterns that the classifier misclassifies anyway.
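A small sketch of this key point: under randomization the adversary can only maximize the expected gain over the realizations of Y_C, and in general this picks a different, suboptimal move than the strategy computed against the actual boundary. The two toy boundaries, the candidate camouflages, their scores, and their costs below are all hypothetical:

```python
U_A = {"-": 5.0, "+": 0.0}                           # evasion is worth 5, being caught 0
boundaries = [lambda s: "+" if s > 5.0 else "-",     # realization y1
              lambda s: "+" if s > 3.0 else "-"]     # realization y2
P = [0.5, 0.5]                                       # equiprobable realizations

# Candidate camouflages of a spam message x with score 6.0: (resulting score, cost W).
candidates = {"x (no change)": (6.0, 0.0),
              "x'  (few good words)": (4.0, 2.0),
              "x'' (many good words)": (2.0, 4.0)}

def expected_gain(score, cost):
    return sum(p * U_A[y(score)] for p, y in zip(P, boundaries)) - cost

def gain_against(y, score, cost):
    return U_A[y(score)] - cost

# Strategy under randomization: maximize the expected gain over Y_C.
print(max(candidates, key=lambda c: expected_gain(*candidates[c])))                 # x'' (over-camouflaged)
# Optimal strategy if the adversary knew the actual boundary were y1:
print(max(candidates, key=lambda c: gain_against(boundaries[0], *candidates[c])))   # x'
```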
  16. Evade-hard MCS with randomization
     [Figure: the SpamAssassin architecture: header analysis, black/white list, URL filter, signature filter, content analysis, ..., combined by a weighted sum Σ into the assigned class (legitimate / spam). http://spamassassin.apache.org]
     The defence strategy based on randomization can be implemented in several ways. We implemented it using the multiple classifier systems (MCS) approach, by randomizing the combination function. For our experiments we used the SpamAssassin filter, which is basically a linearly weighted combination of classifiers, and randomized the weights by training-set bootstrapping.
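A sketch of this weight-randomization idea on synthetic rule scores (the data, the use of scikit-learn's LinearSVC as the linear combiner, and the code structure are assumptions, not the actual SpamAssassin/TREC setup): each bootstrap replicate of the training set yields a different weight vector, and at filtering time one of them is drawn at random, so the adversary never knows which weights it is facing.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the rule scores of ham (label 0) and spam (label 1) mails.
X = np.vstack([rng.normal(0.0, 1.0, (500, 10)), rng.normal(1.0, 1.0, (500, 10))])
y = np.array([0] * 500 + [1] * 500)

# Train one linear combiner per bootstrap replicate of the training set.
combiners = []
for _ in range(100):
    idx = rng.integers(0, len(X), len(X))            # bootstrap sample with replacement
    combiners.append(LinearSVC(dual=False).fit(X[idx], y[idx]))

def filter_mail(rule_scores):
    """At filtering time, draw one of the 100 weight vectors at random."""
    clf = combiners[rng.integers(0, len(combiners))]
    return "spam" if clf.decision_function(rule_scores.reshape(1, -1))[0] > 0 else "legitimate"

print(filter_mail(X[0]), filter_mail(X[-1]))
```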
  17. Experiments with multiple classifiers and randomization
     E-mail data set: TREC 2007, 75,419 real e-mail messages received between April and July 2007 (25,220 ham, 50,199 spam).
     Attack model: assume that the adversary can make any modification which reduces the score of a rule. Key point: the adversary does not know the actual set of weights deployed for combining the multiple classifiers (filtering rules), so it can devise only a suboptimal strategy A(x).
     Experimental set-up: we used the SpamAssassin filter with a weighted sum as combination function (an SVM with linear kernel); the combination function was randomized by bootstrap, so that the adversary "sees" 100 different sets of weights with identical probability.
     Results (det = deterministic weights, rnd = randomized weights):
                  det      rnd
        U_A       0.98     0.56
        U_C       1.30     1.46
        FN (%)   19.55    11.21
     The average false negative rate decreases from 19.55% to 11.21% when the classifier uses randomization. This is confirmed by the decrease of the adversary's utility and the increase of the classifier's utility.
