2. What is intrusion detection?
»Intrusion detection systems (IDSs) are software
or hardware systems that automate the process of
monitoring the events occurring in a computer
system or network, analyzing them for signs of
security problems.
3. What is intrusion detection?
»Intrusion detection is the process of monitoring
the events occurring in a computer system or
network and analyzing them for signs of intrusions,
defined as attempts to compromise the
confidentiality, integrity, or availability of a computer
or network, or to bypass its security mechanisms.
4. Why do we need intrusion detection?
»Intrusions are caused by attackers accessing the
systems from the Internet, authorized users of the
systems who attempt to gain additional privileges
for which they are not authorized, and authorized
users who misuse the privileges given them.
5. Classification of intrusion
detection systems
Generally speaking, there are two ways to
classify intrusion detection systems:
» By data source: host-based IDS and
network-based IDS.
» By analysis method: misuse detection and
anomaly detection.
6. Host-based and network-based
IDS
» Host-based systems base their decisions on
information obtained from a single host (usually
audit trails), while network-based intrusion
detection systems obtain data by monitoring the
traffic in the network to which the hosts are
connected.
7. Misuse Detection and Anomaly
Detection
» A misuse (signature) detection system identifies patterns of
traffic or application data presumed to be malicious,
while anomaly detection systems compare activities
against a “normal” baseline.
» Anomaly detection assumes that an intrusion will
always reflect some deviation from normal
patterns.
» Misuse detection is based on knowledge of
system vulnerabilities and known attack patterns.
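The contrast between the two analysis methods can be sketched in a few lines of Python. This is a toy illustration, not a real IDS: the signature strings, the requests-per-minute baseline and the 3-standard-deviation threshold are all assumptions made up for the example.

```python
from statistics import mean, stdev

# Hypothetical signatures: substrings presumed to indicate a known attack.
SIGNATURES = ["/etc/passwd", "' OR '1'='1"]

def misuse_detect(payload):
    """Misuse detection: flag a payload containing any known signature."""
    return any(sig in payload for sig in SIGNATURES)

def anomaly_detect(value, baseline, k=3.0):
    """Anomaly detection: flag a value more than k standard deviations
    away from the mean of the 'normal' baseline."""
    mu, sigma = mean(baseline), stdev(baseline)
    return abs(value - mu) > k * sigma

# Made-up baseline of normal requests per minute.
normal_rates = [10, 12, 11, 9, 10, 13, 11, 10]

print(misuse_detect("GET /index.php?file=../../etc/passwd"))  # known pattern
print(misuse_detect("GET /index.html"))                       # no signature
print(anomaly_detect(95, normal_rates))                       # far from baseline
print(anomaly_detect(12, normal_rates))                       # within baseline
```

The misuse detector can only flag what is in its signature list, while the anomaly detector flags anything far from the baseline, whether or not it is actually an attack.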
10. Misuse detection Advantages and
disadvantages
» The primary advantage of signature detection is
that known attacks can be detected fairly reliably
with a low false positive rate.
» The drawback of the signature detection
approach is that such systems typically require a
signature to be defined for every possible
attack that an attacker may launch against a
network.
11. Misuse detection Advantages and
disadvantages
» The main disadvantage of misuse detection
approaches is that they detect only the attacks
they have been trained to detect.
» Novel or unknown attacks, and even variants
of common attacks, often go undetected. At a time
when new security vulnerabilities in software are
discovered and exploited every day, the reactive
approach embodied by misuse detection methods is
not feasible for defeating malicious attacks.
12. Anomaly detection Advantages and
disadvantages
» Anomaly detection systems have two major advantages
over signature-based intrusion detection systems. The first
is their ability to detect unknown
attacks as well as “zero day” attacks.
» The second is that profiles of normal activity are customized
for every system, application and/or network, which makes it very
difficult for an attacker to know with certainty what activities
can be carried out without being detected.
13. Anomaly detection Advantages and
disadvantages
» A disadvantage of the anomaly detection
approach is that well-known attacks may not be
detected, particularly if they fit the established
profile of the user.
» If an attacker knows that a profile is being kept,
he can change his behavior gradually and so train the
system to treat the attack as normal
behavior.
14. Process model for Intrusion Detection
» Three fundamental functional components of an IDS:
» Information Sources – the different sources of event
information used to determine whether an intrusion has
taken place. These sources can be drawn from different
levels of the system, with network, host, and application
monitoring most common.
» Analysis – the part of intrusion detection systems that
actually organizes and makes sense of the events derived
from the information sources, deciding when those events
indicate that intrusions are occurring or have already taken
place.
» Response – sending an alarm to the administrator.
16. KDD Cup 99 dataset - A benchmark
» There are approximately 4,940,000 records in the
training dataset.
» The training data contain 23 types of attacks and the test
data contain 37, that is, 14 more types than the training
data.
» Each record (row) has 41 features plus one class
variable.
» The test data can therefore be used to assess the detection
capacity for unknown attacks.
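To make the record layout concrete, here is a minimal sketch of splitting one comma-separated record into its 41 features and class label. The `parse_record` helper and the sample record are hypothetical; the sketch also assumes that the class label is the last field and may end with a period, which should be checked against the actual data files.

```python
def parse_record(line):
    """Split one comma-separated record into (features, label)."""
    fields = line.strip().split(",")
    if len(fields) != 42:
        raise ValueError(f"expected 42 fields, got {len(fields)}")
    # Assumption: the class label is the last field and may end with a period.
    return fields[:41], fields[41].rstrip(".")

# Made-up record for illustration: 41 feature values followed by the class.
sample = ",".join(["0", "tcp", "http", "SF", "181", "5450"] + ["0"] * 35 + ["normal."])

features, label = parse_record(sample)
print(len(features), label)
```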
17. KDD Cup 99 dataset attacks
» Four types of attacks in the KDD Cup 99 dataset:
» Probe: Strictly speaking, probes are not
true attacks but a preparation step carried out by attackers before
launching attacks.
» DoS (Denial of Service): Such an attack may stop
server operation so that the server cannot
provide services. The attack usually exhausts the
server's system resources or its bandwidth
until the service stops.
18. KDD Cup 99 dataset attacks
(cont…)
» U2R (User to Root): In this attack, a local user
exploits a system vulnerability to gain
root (administrator) privileges.
» R2L (Remote to Local): an attacker sends packets
to a machine over a network, then exploits the
machine's vulnerability to illegally gain local
access as a user.
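The four categories are usually applied by mapping each raw attack label to its category before training. The sketch below uses the label names commonly documented for the KDD Cup 99 training data; treat the table as an assumption to verify against the dataset, and the `categorize` helper as hypothetical.

```python
# Commonly documented training-set labels grouped into the four categories.
CATEGORY = {
    "ipsweep": "Probe", "nmap": "Probe", "portsweep": "Probe", "satan": "Probe",
    "back": "DoS", "land": "DoS", "neptune": "DoS", "pod": "DoS",
    "smurf": "DoS", "teardrop": "DoS",
    "buffer_overflow": "U2R", "loadmodule": "U2R", "perl": "U2R", "rootkit": "U2R",
    "ftp_write": "R2L", "guess_passwd": "R2L", "imap": "R2L", "multihop": "R2L",
    "phf": "R2L", "spy": "R2L", "warezclient": "R2L", "warezmaster": "R2L",
}

def categorize(label):
    """Map a raw label such as 'smurf.' to Normal, Probe, DoS, U2R or R2L."""
    label = label.rstrip(".")
    if label == "normal":
        return "Normal"
    # Labels missing from the table (e.g. attacks seen only in test data)
    # are reported as Unknown instead of being guessed.
    return CATEGORY.get(label, "Unknown")

print(categorize("smurf."), categorize("normal."), categorize("some_new_attack"))
```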
20. Classification tree
» The classification tree, also called a decision tree, is
one of the main techniques used in data mining.
» Its main goal is to learn from class-labeled training tuples
in order to predict the classes of new or previously unseen data.
» Building a tree involves two steps: top-down tree
construction and bottom-up pruning.
» ID3 and C4.5, two common decision tree algorithms,
construct trees in a top-down manner.
21. Steps of Classification tree
1) Compute the information gain for each attribute.
2) Select the attribute with the highest information gain
as the splitting attribute.
3) If the selected attribute is discrete (categorical), branch the node
on all of its possible values. If the attribute is
continuous, select the cut point with the highest information
gain.
4) After splitting, check whether each new node
is a leaf (all of its data belong to the same class); otherwise,
the new node becomes the root of a sub-tree.
5) Repeat the steps above until all new nodes are
leaves.
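Steps 1 and 2 can be sketched as follows for discrete attributes. This is a minimal illustration using plain information gain as described in the steps above; C4.5 itself normalizes the gain into a gain ratio, which the sketch omits. The toy rows, labels and function names are made up for the example.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting on a discrete attribute."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def best_attribute(rows, labels, attrs):
    """Step 2: pick the attribute with the highest information gain."""
    return max(attrs, key=lambda a: information_gain(rows, labels, a))

# Toy data: four connection records with two discrete attributes.
rows = [
    {"protocol": "tcp", "flag": "SF"},
    {"protocol": "tcp", "flag": "S0"},
    {"protocol": "udp", "flag": "SF"},
    {"protocol": "icmp", "flag": "SF"},
]
labels = ["normal", "attack", "normal", "attack"]

for attr in ("protocol", "flag"):
    print(attr, round(information_gain(rows, labels, attr), 3))
print(best_attribute(rows, labels, ["protocol", "flag"]))
```

On this toy data, splitting on `protocol` leaves only the two `tcp` records mixed, so it yields the larger gain and is chosen as the splitting attribute.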
22. SVM – Support Vector Machine
» Figure: left, small distance between data and hyperplane; right, big distance
between data and hyperplane.
24. Preprocess of data
» The study samples the training dataset (the 10% subset,
kddcup.data_10_percent.gz) and the test dataset.
» For each proportion of normal records (10%, 20%,
30%, . . ., 90%), a group of 10,000 records with that
proportion is selected from both the training dataset and
the test dataset.
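One way to build such groups is sketched below. The slides do not say how the records were drawn, so random sampling without replacement, the fixed seed and the `sample_with_normal_ratio` helper are all assumptions; a tiny synthetic pool stands in for the real data.

```python
import random

def sample_with_normal_ratio(records, ratio, size=10_000, seed=0):
    """Draw `size` records of which a fraction `ratio` are labeled normal.

    Each record is a sequence whose last element is the class label.
    """
    rng = random.Random(seed)
    normal = [r for r in records if r[-1] == "normal"]
    attack = [r for r in records if r[-1] != "normal"]
    n_normal = round(size * ratio)
    # random.sample draws without replacement and raises ValueError
    # if a pool is too small for the requested count.
    picked = rng.sample(normal, n_normal) + rng.sample(attack, size - n_normal)
    rng.shuffle(picked)
    return picked

# Synthetic stand-in for the dataset: 100 normal and 100 attack records.
pool = [(i, "normal") for i in range(100)] + [(i, "smurf") for i in range(100)]

# One group per normal proportion: 10%, 20%, ..., 90% (size 50 for the demo).
groups = {p: sample_with_normal_ratio(pool, p / 100, size=50)
          for p in range(10, 100, 10)}

for p, group in groups.items():
    print(p, sum(1 for r in group if r[-1] == "normal"), len(group))
```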
28. Accuracy comparison between C4.5 and
SVM
» When the proportion of normal data is
large (>70%), the accuracy of the two is approximately equal,
though SVM is better.
» On average, C4.5 is slightly better
than SVM.
31. Comparison of Detection
Rate(cont…)
» The detection rate of C4.5 declines as the percentage
of normal data rises, whereas that of SVM follows no fixed trend.
» Overall, the C4.5 curve lies above that of
SVM, so its detection rate is better than that of
SVM.
34. False alarm rate comparison between C4.5 and SVM
(cont..)
» In false alarm rate, SVM is inferior
to C4.5 only when the proportion of normal
data is 30%, 50% or 60%; it is better
than C4.5 otherwise.
» On average, SVM is better than C4.5
in false alarm rate.
35. Comparison
» Comparing the results of C4.5 and SVM, we
find that C4.5 is superior to SVM in accuracy
and detection rate, but SVM is better in false
alarm rate.
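The three measures compared above can be computed from a binary confusion matrix (attack vs. normal). The slides do not give the study's exact formulas, so the sketch below uses the commonly used definitions as an assumption: detection rate = TP / (TP + FN) and false alarm rate = FP / (FP + TN). The labels and predictions are made up.

```python
def ids_metrics(y_true, y_pred, normal="normal"):
    """Accuracy, detection rate and false alarm rate for attack-vs-normal."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        is_attack, flagged = truth != normal, pred != normal
        if is_attack and flagged:
            tp += 1          # attack correctly detected
        elif is_attack:
            fn += 1          # attack missed
        elif flagged:
            fp += 1          # normal traffic raised an alarm
        else:
            tn += 1          # normal traffic passed
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "detection_rate": tp / (tp + fn) if tp + fn else 0.0,
        "false_alarm_rate": fp / (fp + tn) if fp + tn else 0.0,
    }

y_true = ["normal"] * 4 + ["attack"] * 4
y_pred = ["normal", "normal", "normal", "attack",
          "attack", "attack", "attack", "normal"]
print(ids_metrics(y_true, y_pred))
```

A classifier can score well on one measure and poorly on another, which is why the comparison above reports all three.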
36. Feature Selection
» In complex classification domains, features
may contain false correlations, which hinder
the process of detecting intrusions.
» Further, some features may be redundant
since the information they add is contained in
other features.
» Extra features can increase computation time,
and can have an impact on the accuracy of the
IDS.
37. Feature Selection(cont..)
» Empirical results indicate that selecting the significant input
features is important for designing an IDS that is lightweight,
efficient and effective for real-world detection systems.
» IDSs try to perform their task in real time. Some data may
not be useful to the IDS and thus can be eliminated before
processing.
» Feature selection can help to reduce the time needed to
construct a model.
41. Classification and Regression
Trees (CART)
» The Classification and Regression Trees (CART)
methodology is based on binary recursive partitioning.
» The process is binary because parent nodes are always
split into exactly two child nodes, and recursive because
the process is repeated by treating each child node as a
parent.
» For splitting, the Gini rule is used, which is essentially a
measure of how well the splitting rule separates the
classes contained in the parent node.
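The Gini rule can be illustrated with a short sketch. It computes the Gini impurity of a node and the impurity decrease of a binary split; the function names and toy label lists are assumptions made for the example, not code from the study.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0 for a pure node, larger for more mixed nodes."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def gini_split_gain(parent, left, right):
    """Impurity decrease when `parent` is split into `left` and `right`."""
    n = len(parent)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(parent) - weighted

parent = ["normal"] * 4 + ["attack"] * 4

# A split that separates the classes perfectly.
good = gini_split_gain(parent, ["normal"] * 4, ["attack"] * 4)

# A split that leaves both children as mixed as the parent.
mixed = ["normal", "normal", "attack", "attack"]
poor = gini_split_gain(parent, mixed, mixed)

print(good, poor)
```

At each node, the candidate binary split with the largest impurity decrease is the one that best separates the classes.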
42. Classification and Regression Trees
(CART)(cont…)
» Unlike other methods, CART does not stop in the
middle of the tree growing process, because there
might still be important information to be
discovered by drilling down several more levels.
» Once the maximal tree is grown and a set of
sub-trees is derived from it, CART determines the
best tree by testing for error rates or costs.
43. Classification and Regression Trees
(CART)(cont…)
» The best sub-tree is the one with the lowest or
near-lowest cost, which may be a relatively small
tree.
» The best variable selected at each node of the tree
is called the primary variable.
» Surrogate variables are defined as the variables
that most accurately predict the action of the
primary variable.
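The idea of preferring a small sub-tree whose cost is "lowest or near-lowest" can be sketched as a simple selection rule. The tolerance value, the candidate list and the `choose_subtree` helper are hypothetical; the slides do not state how near-lowest was defined.

```python
def choose_subtree(candidates, tolerance=0.01):
    """Pick the smallest sub-tree whose cost is within `tolerance`
    of the minimum cost.

    `candidates` is a list of (number_of_leaves, cost) pairs.
    """
    best_cost = min(cost for _, cost in candidates)
    eligible = [(size, cost) for size, cost in candidates
                if cost <= best_cost + tolerance]
    # Tuples compare by size first, so min() returns the smallest tree.
    return min(eligible)

# Made-up (size, cost) pairs for sub-trees pruned from a maximal tree.
candidates = [(3, 0.120), (7, 0.081), (15, 0.075), (40, 0.078)]
print(choose_subtree(candidates))
```

Here the 15-leaf tree has the lowest cost, but the 7-leaf tree is within the tolerance and is selected because it is smaller.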
44. Result of CART
» The KDD Cup 99 dataset has 41 features, which makes it
high-dimensional.
» Intrusion detection is a real-time task, so feature reduction
can help reduce the time needed to construct a model.
» Feature selection with CART resulted in a reduced 12-variable
dataset with C, E, F, L, W, X, Y, AB, AE, AF, AG and AI as
variables.
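The variable names above look like spreadsheet-style column labels (A, B, ..., Z, AA, ..., which would run to AO for 41 features). Assuming that is the convention, a small helper can convert them to 1-based column positions. Which feature sits at each position depends on the column order used in the study, so no feature names are given here.

```python
def column_index(letters):
    """Convert a spreadsheet-style column label (A, ..., Z, AA, ...) to a
    1-based column index."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index

SELECTED = ["C", "E", "F", "L", "W", "X", "Y", "AB", "AE", "AF", "AG", "AI"]
positions = [column_index(c) for c in SELECTED]
print(positions)  # [3, 5, 6, 12, 23, 24, 25, 28, 31, 32, 33, 35]
```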
48. Conclusion and future work
» Decision trees can help IDSs construct an
accurate model, but they do not perform well on R2L and U2R attacks.
» From the empirical results for the U2R and R2L classes, which
have little training data and for which the decision tree gives
better performance than SVM, we can say that decision
trees work well with small training sets.
» We found that reducing the number of features does
not necessarily reduce the test time. This depends
on the relationships among the dataset features rather
than on the number of features.
49. References
[1] M. Ektefa, S. Memar, F. Sidi, and L. S. Affendey,
"Intrusion Detection Using Data Mining Techniques," 2010
International Conference on Information Retrieval & Knowledge
Management (CAMP), 2010.
[2] B. M. Bidgoli, M. Analoui, M. H. Rezvani, and H. S.
Shahhoseini, "Performance Evaluation of Decision Tree for
Intrusion Detection Using Reduced Feature Spaces," Trends in
Intelligent Systems and Computer Engineering, 2008.
[3] S. Chebrolu, A. Abraham, and J. P. Thomas, "Feature
deduction and ensemble design of intrusion detection
systems," Computers & Security, 2005.