Adaptive Intrusion Detection Using Learning Classifiers



This is an introduction to adaptive intrusion detection systems using rule-based learning classifiers. After listing the limitations of current clustering and supervised learning techniques, the presentation describes a class of learning algorithms used for detecting and preventing intrusions in computer networks and data centers. By combining genetic algorithms and reinforcement learning, security policies are continually upgraded or downgraded to adjust to an ever-changing IT environment, organization and regulations.

Published in: Technology, Education
1 Comment
1 Like
  • Excellent presentation. I have not read much about learning classifiers (LCS, or XCS) lately. Good use case. A suggestion, I find the description of the matching of actions (I believe it used a predictor for XCS from earlier paper) a bit confusing.


  1. Adaptive Intrusion Detection Using Learning Classifiers. Patrick Nicolas, June 21, 2013
  2. Introduction: The objective of this presentation is to review the different methods of implementing an adaptive intrusion detection system (IDS). The second part of the presentation covers the learning classifier class of algorithms, used to detect, evaluate and act upon a security breach or cyber attack. Patrick Nicolas © 2013
  3. Data Mining Techniques, Learning Classifiers Systems
  4. Context: The effectiveness of an intrusion detection system depends on its adaptability to:
       ● An ever-changing IT environment
       ● Evolving internal policies & regulations
       ● An agile organization & mobile workforce
  5. Data Mining: Overview. Data mining is becoming a popular method to extract knowledge from historical data. However, traditional data mining techniques fail to capture the evolutionary nature of an organization, its processes, rules and IT infrastructure.
  6. Data Mining: Clustering. Unsupervised learning methods such as clustering or spectral analysis have drawbacks:
       ● Poor classification of mixed variable types
       ● No descriptive representation
       ● Limited leverage of domain expertise
       ● High computational cost to update models
  7. Data Mining: Supervised Learning. Supervised learning methods can be effective on a large set of historical data but have the following limitations:
       ● Need for a large training set to alleviate over-fitting
       ● No descriptive representation
       ● Limited role for the domain expert
  8. Data Mining Techniques, Learning Classifiers Systems
  9. An evolutionary approach:
       1. An intrusion detection solution should learn from its suggestions through a process borrowed from human behavior: reward-based learning
       2. It should evolve with the system it monitors: Darwinian process
  10. Rule-based Learners: A class of algorithms known as learning classifier systems (LCS) or extended learning classifier systems (XCS) combines genetic algorithms and reinforcement learning to discover and evolve security policies and rules from real-time data.
  11. LCS/XCS Benefits:
       ● Rule-based representation allows security experts to monitor the evolving knowledge
       ● Learns from each security event, making it well suited to streamed data
       ● Supports various seeding schemes, such as an initial rules set, a training set or clustering
  12. Security rules: Security rules are used to represent the knowledge of a security expert.
       IF num. outbound ftp sessions > 5 THEN cost + 2 (source: KDD Cup Dataset 1999)
      Those rules are chained to support reasoning about a sequence of events in a data center.
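The IF/THEN rule on this slide can be sketched in Python as a small data structure. This is only an illustration: the `Rule` class, the `evaluate` helper and the feature name `num_outbound_ftp_sessions` are hypothetical names, not part of the presentation.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class Rule:
    """One IF op(feature, threshold) THEN cost + cost_delta rule (hypothetical layout)."""
    feature: str
    op: Callable[[float, float], bool]
    threshold: float
    cost_delta: int

    def matches(self, event: Dict[str, float]) -> bool:
        # A rule only applies when the event carries the feature it tests.
        return self.feature in event and self.op(event[self.feature], self.threshold)


def evaluate(rules: Iterable[Rule], event: Dict[str, float]) -> int:
    """Sum the cost contributions of every rule whose condition matches the event."""
    return sum(rule.cost_delta for rule in rules if rule.matches(event))


# The slide's example: IF num. outbound ftp sessions > 5 THEN cost + 2
ftp_rule = Rule("num_outbound_ftp_sessions", operator.gt, 5, 2)
```

Calling `evaluate([ftp_rule], {"num_outbound_ftp_sessions": 7})` adds 2 to the cost, while an event with 3 sessions leaves it at 0.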
  13. Rules Set Evolution: The rules set needs to adapt constantly to the ever-changing environment & objectives.
  14. Rule Encoding: In order to evolve, rules are represented as genes in a Genetic Algorithm. A gene is implemented as a binary vector structure in which the state or condition of the rule is expressed as op(x, value) (e.g. x > value).
       IF op(x, value) THEN f(cost) is translated into:
         op               010
         x                1000101
         value            0101101110
         cost or action   01101110100101010
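A minimal sketch of this encoding, assuming the field widths seen in the slide's example (3, 7, 10 and 17 bits). The slides do not say which operator or variable each bit pattern stands for, so the fields are handled as plain integers here; `FIELDS`, `encode` and `decode` are hypothetical names.

```python
# Field layout inferred from the slide's example bit string (an assumption).
FIELDS = (("op", 3), ("x", 7), ("value", 10), ("action", 17))
GENE_LENGTH = sum(width for _, width in FIELDS)


def encode(rule: dict) -> str:
    """Pack the integer fields of a rule into one fixed-width bit string."""
    parts = []
    for name, width in FIELDS:
        field = rule[name]
        if not 0 <= field < (1 << width):
            raise ValueError(f"{name}={field} does not fit in {width} bits")
        parts.append(format(field, f"0{width}b"))
    return "".join(parts)


def decode(bits: str) -> dict:
    """Unpack a bit string produced by encode() back into integer fields."""
    if len(bits) != GENE_LENGTH or set(bits) - {"0", "1"}:
        raise ValueError("not a valid gene")
    rule, pos = {}, 0
    for name, width in FIELDS:
        rule[name] = int(bits[pos:pos + width], 2)
        pos += width
    return rule


# The bit string shown on the slide, field by field.
slide_gene = "010" + "1000101" + "0101101110" + "01101110100101010"
```

Decoding `slide_gene` and encoding the result again returns the same 37-bit string, which is the property the genetic operators rely on.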
  15. Rules Chains & Chromosomes: As with any rule-based inference engine, encoded rules can be chained by aggregating their binary representations:
       IF op1(x1, v1) AND op2(x2, v2) THEN f(cost)
         &&               001
         op1              010
         x1               1000101
         v1               01011110
         op2              010
         x2               100101
         v2               0101101110
         cost or action   01101110100101010
      In terms of evolutionary algorithms, the firing of multiple rules is represented as a sequence of genes, or chromosome.
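Chaining can be sketched as concatenating condition genes in front of a shared action field. This simplified version assumes fixed-width condition genes (3 + 7 + 10 bits) and an implicit AND between them, so it omits the explicit `&&` field shown on the slide; all names are hypothetical.

```python
CONDITION_FIELDS = (("op", 3), ("x", 7), ("value", 10))  # assumed widths
CONDITION_WIDTH = sum(width for _, width in CONDITION_FIELDS)
ACTION_WIDTH = 17  # assumed width of the cost/action field


def build_chromosome(conditions: list, action: int) -> str:
    """Concatenate the encoded conditions (implicitly ANDed), then the action."""
    genes = []
    for cond in conditions:
        for name, width in CONDITION_FIELDS:
            if not 0 <= cond[name] < (1 << width):
                raise ValueError(f"{name} does not fit in {width} bits")
            genes.append(format(cond[name], f"0{width}b"))
    if not 0 <= action < (1 << ACTION_WIDTH):
        raise ValueError("action does not fit")
    return "".join(genes) + format(action, f"0{ACTION_WIDTH}b")


def split_chromosome(bits: str):
    """Recover the list of conditions and the action from a chromosome."""
    body_length = len(bits) - ACTION_WIDTH
    if body_length < 0 or body_length % CONDITION_WIDTH != 0:
        raise ValueError("unexpected chromosome length")
    conditions = []
    for start in range(0, body_length, CONDITION_WIDTH):
        cond, pos = {}, start
        for name, width in CONDITION_FIELDS:
            cond[name] = int(bits[pos:pos + width], 2)
            pos += width
        conditions.append(cond)
    return conditions, int(bits[body_length:], 2)
```

A chromosome with two conditions is 2 × 20 + 17 = 57 bits long under these assumptions, and `split_chromosome(build_chromosome(conds, action))` returns the original pair.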
  16. Rules Evolutionary Process: The rules set evolves through the genetic recombination of rules using cross-over, mutation and transposition operations.
        Operation       Parent rules                          Offspring rules
        Cross-over      0101101011101110101010111010100111    0101101011101110101010111010100111
                        1101010101110101001101010110101110    1101010101110110100111010110101110
        Mutation        0101101011101110101010111010100111    0101101011101110101010101010100011
        Transposition   0101101011101110101010111010100111    0101101011101110101010101010100011
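The three operators can be sketched on bit strings as follows. The slides do not define them precisely, so this assumes single-point cross-over, independent per-bit mutation, and transposition as moving a segment to another position; the function names are hypothetical.

```python
import random


def crossover(parent_a: str, parent_b: str, point: int):
    """Single-point cross-over: swap the tails of two equal-length parents."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])


def mutate(bits: str, rate: float, rng: random.Random) -> str:
    """Flip each bit independently with probability `rate`."""
    flipped = {"0": "1", "1": "0"}
    return "".join(flipped[b] if rng.random() < rate else b for b in bits)


def transpose(bits: str, start: int, length: int, dest: int) -> str:
    """Cut bits[start:start+length] out and re-insert it at index `dest`
    of the remaining string; the overall length is unchanged."""
    segment = bits[start:start + length]
    rest = bits[:start] + bits[start + length:]
    return rest[:dest] + segment + rest[dest:]
```

All three keep the chromosome length constant, so the offspring can still be decoded with the same field layout as the parents.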
  17. Rules Fitness: Rules are selected according to their fitness before being 'mated' and mutated. The fitness of a rule represents its contribution to the detection or prevention of an intrusion. Rules that are repeatedly invoked have the highest fitness values and thrive over time. Other rules slowly become irrelevant.
  18. Overview Genetic Algorithm: The rules set is constantly updated by the Genetic Algorithm so that it keeps identifying intrusions correctly.
       Initial rules set -> Encoding -> Initial chromosomes -> Fitness -> Selection -> Cross-over -> Mutation -> New chromosomes -> Decoding -> New rules set
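One generation of the selection, cross-over and mutation stages might look like the sketch below. It assumes fitness-proportionate (roulette-wheel) selection and single-point cross-over, which the slides do not specify; `next_generation` is a hypothetical name, and encoding and decoding are left out.

```python
import random


def next_generation(population, fitness, rng, mutation_rate=0.01):
    """Breed a new population of bit strings of the same size.

    population: equal-length bit strings (at least 2 bits each)
    fitness:    one non-negative weight per chromosome, not all zero
    """
    flipped = {"0": "1", "1": "0"}
    offspring = []
    while len(offspring) < len(population):
        # Selection: fitter chromosomes are more likely to be picked as parents.
        parent_a, parent_b = rng.choices(population, weights=fitness, k=2)
        # Cross-over: swap tails at a random interior point.
        point = rng.randrange(1, len(parent_a))
        children = (parent_a[:point] + parent_b[point:],
                    parent_b[:point] + parent_a[point:])
        # Mutation: flip each bit with a small probability.
        for child in children:
            offspring.append("".join(
                flipped[b] if rng.random() < mutation_rate else b for b in child))
    return offspring[:len(population)]
```

Passing a seeded `random.Random` makes runs repeatable, which helps when comparing how different fitness assignments shift the rules set.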
  19. Rule Fitness & Reward: The fitness criteria of one or multiple rules have to be updated according to the state of the infrastructure, organization & policies. The fitness function is updated to provide the best possible reward (or credit) to the rules that contribute to the detection of an intrusion.
  20. Reinforcement Learning: Reinforcement learning techniques are widely used in robotics. In the context of an IDS, reinforcement learning rewards (or punishes) rules for their contribution (or lack thereof) to identifying threats, taking into account changes in the organization, external accesses and IT infrastructure.
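The slides do not give the reward update formula. One common choice in classifier systems is an incremental update that moves a rule's fitness a fraction of the way toward the latest reward, sketched here with a hypothetical `update_fitness` function and an assumed learning rate.

```python
def update_fitness(fitness: float, reward: float, learning_rate: float = 0.2) -> float:
    """Move the fitness toward the observed reward by a fraction `learning_rate`.

    A positive reward (threat correctly identified) pulls the fitness up;
    a zero or negative reward (miss or false alarm) pulls it down.
    """
    if not 0.0 < learning_rate <= 1.0:
        raise ValueError("learning_rate must be in (0, 1]")
    return fitness + learning_rate * (reward - fitness)


def replay(rewards, initial_fitness: float = 0.0, learning_rate: float = 0.2) -> float:
    """Apply a sequence of rewards to a single rule and return its final fitness."""
    fitness = initial_fitness
    for reward in rewards:
        fitness = update_fitness(fitness, reward, learning_rate)
    return fitness
```

With a learning rate of 0.5, three consecutive rewards of 1.0 take a fitness of 0.0 to 0.5, 0.75 and then 0.875, so rules that keep contributing converge toward the reward value while others decay.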
  21. Evolutionary Security Rules:
       [Diagram: real-time data and state from the data center/cloud feed the IDS threats monitor. Labeled components: rules matching, new rule, threat predictor (threat level), reward, update fitness, genetic algorithm (evolution). The arrows are numbered 1 to 7 to match the steps below.]
       1. Process new data/event from the system
       2. Find the security-related rule(s) whose condition matches the event
       3. Create a new rule if none match (covering)
       4. Fire the fittest rules with the highest predicted outcome
  22. Evolutionary Security Rules (continued):
       [Same diagram as the previous slide.]
       5. Process the new state of the system
       6. Reward contributing/matching rules by updating the rule fitness
       7. The genetic algorithm updates the existing population of security rules through reproduction and mutation of rules
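Steps 1 to 6 can be put together in a compact sketch. It is a simplification under stated assumptions: conditions are a single "feature > threshold" test, covering creates a rule that just matches the event with a random action, and the reward comes from a caller-supplied function. `Classifier`, `step` and the action names are hypothetical, and step 7 (the genetic algorithm) is left out.

```python
import random
from dataclasses import dataclass


@dataclass
class Classifier:
    feature: str
    threshold: float
    action: str            # hypothetical actions: "alert" or "ignore"
    fitness: float = 0.1

    def matches(self, event: dict) -> bool:
        return event.get(self.feature, 0.0) > self.threshold


def step(population, event, reward_fn, rng, learning_rate=0.2):
    """Process one event (step 1) and return the classifier that fired."""
    # Step 2: find the rules whose condition matches the event.
    match_set = [c for c in population if c.matches(event)]
    # Step 3: covering, create a rule that matches if none does.
    if not match_set:
        feature, value = rng.choice(sorted(event.items()))
        new_rule = Classifier(feature, value - 1.0, rng.choice(["alert", "ignore"]))
        population.append(new_rule)
        match_set = [new_rule]
    # Step 4: fire the fittest matching rule.
    fired = max(match_set, key=lambda c: c.fitness)
    # Steps 5-6: observe the outcome and reward the rules that advocated it.
    reward = reward_fn(event, fired.action)
    for c in match_set:
        if c.action == fired.action:
            c.fitness += learning_rate * (reward - c.fitness)
    return fired
```

In a fuller system, the genetic algorithm of step 7 would run periodically on `population`, using the fitness values maintained here to select parents.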
  23. Conclusion: By combining evolutionary algorithms with reinforcement learning, rule-based learners such as learning classifier systems allow security policies and constraints to adapt to any change in the environment or data center, and therefore stay a step ahead of ever-changing threats.
  24. References:
       ● Genetic Programming: On the Programming of Computers by Means of Natural Selection - J. Koza
       ● Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning series) - R. Sutton, A. Barto
       ● Learning Classifier Systems in Data Mining - L. Bull, E. Bernadó-Mansilla, J. Holmes
       ● Hacking Smart Machines with Smarter Ones: How to Extract Meaningful Data from Machine Learning Classifiers - G. Ateniese, G. Felici, L. Mancini, D. Vitali, A. Spognardi
       ● Evaluation of anomaly-based IDS for mobile devices using machine learning classifiers - D. Damopoulos, S. Menesidou, G. Kambourakis, M. Papadaki, N. Clarke