Adaptive Intrusion Detection Using Learning Classifiers

Patrick Nicolas
June 21, 2013

patricknicolas.blogspot.com
www.slideshare.net/pnicolas
github.com/prnicolas

Introduction

The objective of this presentation is to
review the different methods for implementing
an adaptive intrusion detection system (IDS).
The second part of the presentation dives
into learning classifiers, a class of algorithms
that detect, evaluate and act upon a security
breach or cyber attack.

Patrick Nicolas © 2013 http://patricknicolas.blogspot.com

https://github.com/prnicolas

Data Mining Techniques
Learning Classifiers Systems

Context

The effectiveness of an intrusion detection
system depends on its adaptability to:
● Ever-changing IT environment
● Evolving internal policies & regulations
● Agile organization & mobile workforce



Data Mining: Overview

Data mining is becoming a popular
method to extract knowledge from
historical data.
However, traditional data mining techniques
fail to capture the evolutionary nature of
an organization, its process, rules and
IT infrastructure.



Data Mining: Clustering
Unsupervised learning methods such as
clustering or spectral analysis have drawbacks:
● Poor classification of mixed variable types
● No descriptive representation
● Limited leverage of domain expertise
● High computational cost to update models



Data Mining: Supervised Learning

Supervised learning methods can be effective
on a large set of historical data but have the
following limitations:
● Need for a large training set to alleviate
over-fitting
● No descriptive representation
● Limited role for the domain expert



An evolutionary approach

1. An intrusion detection solution should learn
from its suggestions through a process
borrowed from human behavior: reward-based learning
2. It should evolve with the system it
monitors: Darwinian process

Rule-based Learners

A class of algorithms known as learning
classifiers (LCS) or extended learning
classifiers (XCS) combines genetic
algorithms and reinforcement learning to
discover and evolve security policies and
rules from real-time data.



LCS/XCS Benefits

● Rule-based representation allows security
experts to monitor evolving knowledge
● Learn from each security event, making
them well suited for streamed data
● Support various seeding schemes such as
an initial rules set, a training set and
clustering.



Security rules

Security rules are used to represent the
knowledge of a security expert.
IF num. outbound ftp sessions > 5 THEN cost + 2
(source: KDD Cup Dataset 1999)
Those rules are chained to support reasoning
about a sequence of events in a data center.
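
A minimal sketch of how such a rule could be held in code, assuming a rule is a single comparison on one monitored variable plus a cost adjustment. The Rule class, the evaluate helper, the feature names and the second rule are hypothetical and only illustrate the IF/THEN form above.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Rule:
    feature: str                          # monitored variable (hypothetical name)
    op: Callable[[float, float], bool]    # comparison operator of the IF part
    threshold: float                      # value the variable is compared to
    cost_delta: int                       # THEN part: change applied to the cost

    def matches(self, event: Dict[str, float]) -> bool:
        """True when the event carries the feature and the condition holds."""
        return self.feature in event and self.op(event[self.feature], self.threshold)


def evaluate(rules: List[Rule], event: Dict[str, float]) -> int:
    """Sum the cost adjustments of every rule whose condition matches."""
    return sum(rule.cost_delta for rule in rules if rule.matches(event))


rules = [
    # IF num. outbound ftp sessions > 5 THEN cost + 2
    Rule("num_outbound_ftp_sessions", operator.gt, 5, 2),
    # Hypothetical second rule, added only to show several rules together
    Rule("num_failed_logins", operator.gt, 3, 1),
]

event = {"num_outbound_ftp_sessions": 7, "num_failed_logins": 1}
total_cost = evaluate(rules, event)
```

Here only the first rule matches the event, so the accumulated cost is the adjustment of that rule alone.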



Rules Set Evolution

The rules set needs to adapt constantly to the
ever changing environment & objectives.



Rule Encoding

In order to evolve, rules are represented as
genes in a genetic algorithm. A gene is
implemented as a binary vector in which the
condition of the rule is expressed as
op(x, value) (e.g. x > value)

IF op(x, value) THEN f(cost)

is translated into

010   1000101   0101101110   01101110100101010
op    x         value        cost or action
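
A minimal sketch of this encoding, assuming fixed-width fields whose widths (3, 7, 10 and 17 bits) simply mirror the example bit string above. The operator table, the variable identifiers and the function names are hypothetical.

```python
from typing import Tuple

# Hypothetical operator table; the index is stored in the 3-bit "op" field.
OPS = [">", "<", "==", ">=", "<=", "!="]
# Field widths taken from the example bit string: op, x, value, cost.
WIDTHS = (3, 7, 10, 17)


def encode_rule(op: str, var_id: int, value: int, cost: int) -> str:
    """Pack IF op(x, value) THEN f(cost) into a fixed-width bit string."""
    bits = ""
    for number, width in zip((OPS.index(op), var_id, value, cost), WIDTHS):
        if not 0 <= number < 2 ** width:
            raise ValueError(f"{number} does not fit in {width} bits")
        bits += format(number, f"0{width}b")
    return bits


def decode_rule(bits: str) -> Tuple[str, int, int, int]:
    """Unpack a gene back into (op, var_id, value, cost)."""
    if len(bits) != sum(WIDTHS):
        raise ValueError("unexpected gene length")
    fields, pos = [], 0
    for width in WIDTHS:
        fields.append(int(bits[pos:pos + width], 2))
        pos += width
    op_index, var_id, value, cost = fields
    if op_index >= len(OPS):
        # A mutated gene may carry an operator index with no meaning.
        raise ValueError("unknown operator index")
    return OPS[op_index], var_id, value, cost


# Variable 69 is an arbitrary identifier chosen for the example.
gene = encode_rule(">", 69, 5, 2)
decoded = decode_rule(gene)
```

Fixed-width fields keep every gene the same length, which is what lets cross-over and mutation operate on bit positions without knowing the rule semantics.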



Rules Chains & Chromosomes
As with any rules-based inference engine,
encoded rules can be chained by aggregating
binary representations:
IF op1(x1, v1) AND op2(x2, v2) THEN f(cost)

001  010  1000101  01011110  010  100101  0101101110  01101110100101010
&&   op1  x1       v1        op2  x2      v2          cost or action

In terms of evolutionary algorithms, the firing
of multiple rules is represented as a sequence
of genes, that is, a chromosome.



Rules Evolutionary Process
The rules set evolves through the genetic
recombination of rules using cross-over,
mutation and transposition operations.
Parent rules                          Offspring rules

Cross-over operation
0101101011101110101010111010100111    0101101011101110101010111010100111
1101010101110101001101010110101110    1101010101110110100111010110101110

Mutation operation
0101101011101110101010111010100111    0101101011101110101010101010100011

Transposition operation
0101101011101110101010111010100111    0101101011101110101010101010100011
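
The three operations can be sketched on bit strings as follows. This is an illustration under assumptions: single-point cross-over, independent bit flips for mutation, and a cut-and-reinsert move for transposition. The slides do not specify which variants are used, and the function names are hypothetical.

```python
import random
from typing import Tuple


def crossover(a: str, b: str, point: int) -> Tuple[str, str]:
    """Single-point cross-over: the parents swap their tails after `point`."""
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(bits: str, rate: float, rng: random.Random) -> str:
    """Flip each bit independently with probability `rate`."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < rate else bit
        for bit in bits
    )


def transpose(bits: str, start: int, length: int, target: int) -> str:
    """Cut bits[start:start+length] and reinsert it at index `target`
    of the remaining string."""
    segment = bits[start:start + length]
    rest = bits[:start] + bits[start + length:]
    return rest[:target] + segment + rest[target:]


rng = random.Random(42)  # seeded so that a run is repeatable
parent_1 = "0101101011101110101010111010100111"
parent_2 = "1101010101110101001101010110101110"

child_1, child_2 = crossover(parent_1, parent_2, point=14)
mutant = mutate(parent_1, rate=0.05, rng=rng)
moved = transpose(parent_1, start=0, length=3, target=10)
```

All three operations preserve the length of the chromosome, so the offspring can still be decoded with the same field layout as their parents.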



Rules Fitness

Rules are selected according to their fitness
before being ‘mated’ and mutated. The
fitness of a rule represents its contribution
to the detection or prevention of an intrusion.
Rules that are repeatedly invoked have the
highest fitness values and thrive over time.
Other rules slowly become irrelevant.
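
One common way to select rules according to their fitness is fitness-proportional (roulette-wheel) selection. The sketch below assumes that scheme, which the slides do not name. The function name and the rule identifiers are hypothetical.

```python
import random
from typing import Dict, List


def select_parents(fitness: Dict[str, float], k: int, rng: random.Random) -> List[str]:
    """Draw k rules with replacement, each with probability
    proportional to its fitness."""
    rules = list(fitness)
    weights = [fitness[rule] for rule in rules]
    if sum(weights) <= 0:
        raise ValueError("at least one rule needs a positive fitness")
    return rng.choices(rules, weights=weights, k=k)


# Hypothetical fitness values for three encoded rules.
fitness = {"rule_a": 0.9, "rule_b": 0.3, "rule_c": 0.05}
parents = select_parents(fitness, k=2, rng=random.Random(7))
```

Under this scheme a rule that is rarely rewarded is not deleted outright; it is simply drawn less and less often, which matches the idea of rules slowly becoming irrelevant.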



Overview Genetic Algorithm
The rules set is constantly updated by the
genetic algorithm so that it keeps
identifying intrusions correctly.
Initial rules set -> Encoding -> Initial chromosomes -> Fitness
-> Selection, Cross-over, Mutation
-> New chromosomes -> Decoding -> New rules set



Rule Fitness & Reward

The fitness criteria of one or multiple rules
have to be updated according to the state of
the infrastructure, organization & policies.
The fitness function is updated to provide
the best possible reward (or credit) to the
rules that contribute to the detection of an
intrusion.



Reinforcement Learning

Reinforcement learning techniques are
widely used in robotics. In the context of
IDS, reinforcement learning rewards (or
punishes) rules for their contribution (or
lack thereof) to identifying threats, taking
into account changes in the organization,
external accesses and IT infrastructure.
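
A minimal sketch of a reward-driven fitness update, assuming an incremental rule that moves the fitness a fraction of the way toward the latest reward. The update rule, the learning rate and the 0-to-1 reward scale are assumptions chosen for illustration and are not specified in the slides.

```python
def update_fitness(fitness: float, reward: float, learning_rate: float = 0.2) -> float:
    """Move the fitness toward the reward by a fraction `learning_rate`.

    reward = 1.0 credits a rule that helped identify a threat,
    reward = 0.0 punishes a rule that fired wrongly.
    """
    return fitness + learning_rate * (reward - fitness)


fitness = 0.5
rewarded = update_fitness(fitness, reward=1.0)   # rule contributed to a detection
punished = update_fitness(fitness, reward=0.0)   # rule raised a false alarm
```

Because each update only partly overwrites the previous value, recent rewards count more than old ones, so the fitness can track changes in the organization or infrastructure over time.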



Evolutionary Security Rules
[Diagram: Data Center / Cloud -> Real-time data (1) -> IDS Threats monitor:
Rules Matching (2), New rule (3), Threat predictor (4) -> Threat level;
State (5) -> Reward / Update Fitness (6) -> Genetic Algorithm Evolution (7)]


1. Process new data/event from the system
2. Find the security-related rule(s) whose condition
matches the event
3. Create a new rule if none match (Covering)
4. Fire the fittest rules with the highest predicted
outcome.




5. Process the new state of the system
6. Reward contributing/matching rules by updating
the rule fitness
7. The genetic algorithm updates the existing
population of security rules through reproduction
and mutation of rules.
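
The seven steps can be sketched as one small loop. This is a deliberately simplified illustration, not a full LCS/XCS implementation: it assumes that a rule is a single threshold test on one integer value, that the reward is 1 for a correct decision and 0 otherwise, and that evolution replaces the weakest rule with a mutated copy of the fittest one. All names are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Rule:
    threshold: int          # condition: event value > threshold
    action: str             # "alert" or "ignore"
    fitness: float = 0.5


def process_event(population: List[Rule], value: int, is_attack: bool,
                  rng: random.Random, beta: float = 0.2) -> str:
    # Steps 1-2: take the new event and find the rules whose condition matches.
    matched = [rule for rule in population if value > rule.threshold]
    # Step 3: covering, create a rule when nothing matches.
    if not matched:
        covering = Rule(threshold=value - 1, action=rng.choice(["alert", "ignore"]))
        population.append(covering)
        matched = [covering]
    # Step 4: fire the fittest matching rule.
    fired = max(matched, key=lambda rule: rule.fitness)
    # Steps 5-6: observe the outcome and reward the rules that backed the action.
    reward = 1.0 if (fired.action == "alert") == is_attack else 0.0
    for rule in matched:
        if rule.action == fired.action:
            rule.fitness += beta * (reward - rule.fitness)
    return fired.action


def evolve(population: List[Rule], rng: random.Random, max_size: int = 20) -> None:
    # Step 7: reproduce the fittest rule with a small mutation of its threshold,
    # dropping the weakest rule once the population is full.
    if not population:
        return
    parent = max(population, key=lambda rule: rule.fitness)
    child = Rule(threshold=max(0, parent.threshold + rng.choice([-1, 1])),
                 action=parent.action, fitness=parent.fitness)
    if len(population) >= max_size:
        population.remove(min(population, key=lambda rule: rule.fitness))
    population.append(child)


rng = random.Random(0)
population: List[Rule] = []
# Hypothetical stream of (value, is_attack) pairs standing in for real-time data.
for value, is_attack in [(9, True), (2, False), (8, True), (1, False)] * 5:
    process_event(population, value, is_attack, rng)
    evolve(population, rng)
```

Keeping the matching, reward and evolution steps in separate functions mirrors the numbered flow and makes it easy to swap in a richer rule encoding or a different reward signal.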



Conclusion

By combining evolutionary algorithms with
reinforcement learning, rule-based learners
such as learning classifiers systems allow
security policies and constraints to adapt to
any change in environment or data center
and therefore stay a step ahead of
ever-changing threats.



References

● Genetic Programming: On the Programming of Computers
by Means of Natural Selection - J. Koza
● Reinforcement Learning: An Introduction (Adaptive
Computation and Machine Learning) - R. Sutton, A. Barto
● Learning Classifier Systems in Data Mining -
L. Bull, E. Bernado-Mansilla, J. Holmes
● Hacking Smart Machines with Smarter Ones: How to
Extract Meaningful Data from Machine Learning
Classifiers - G. Ateniese, G. Felici, L. Mancini,
D. Vitali, A. Spognardi
● Evaluation of anomaly-based IDS for mobile devices using
machine learning classifiers - D. Damopoulos,
S. Menesidou, G. Kambourakis, M. Papadaki, N. Clarke
● http://patricknicolas.blogspot.com


