IDS [Intrusion Detection System]
Analysis of Decision Trees and SVM
S. V. Farrahi
H. Manzari
N. Kharazmi
Shiraz University of Technology
What is intrusion detection?
» Intrusion detection systems (IDSs) are software
or hardware systems that automate the process of
monitoring the events occurring in a computer
system or network, analyzing them for signs of
security problems.
What is intrusion detection?
» Intrusion detection is the process of monitoring
the events occurring in a computer system or
network and analyzing them for signs of intrusions,
defined as attempts to compromise the
confidentiality, integrity, or availability of a computer
or network, or to bypass its security mechanisms.
Why do we need intrusion detection?
»Intrusions are caused by attackers accessing the
systems from the Internet, authorized users of the
systems who attempt to gain additional privileges
for which they are not authorized, and authorized
users who misuse the privileges given them.
Classification of intrusion
detection systems
Generally speaking, intrusion detection systems are
classified in two ways:
» By data source, intrusion detection systems
are divided into host-based IDS and
network-based IDS.
» By analysis method, intrusion detection
systems are divided into Misuse Detection and
Anomaly Detection.
Host-based and network-based
IDS
» Host-based systems base their decisions on
information obtained from a single host (usually
audit trails), while network-based intrusion
detection systems obtain data by monitoring the
traffic in the network to which the hosts are
connected
Misuse Detection and Anomaly
Detection
» A signature detection system identifies patterns of
traffic or application data presumed to be malicious
while anomaly detection systems compare activities
against a "normal" baseline.
» Anomaly detection assumes that an intrusion will
always reflect some deviations from normal
patterns.
» Misuse detection is based on the knowledge of
system vulnerabilities and known attack patterns
Signature based
[Diagram: activities → pattern matching against intrusion patterns → intrusion]
Example: if (src_ip == dst_ip) then "land attack"
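The land-attack rule above can be written as a tiny signature matcher. This is only a sketch: the `Packet` class, the `SIGNATURES` table and the `match_signatures` function are hypothetical names introduced for illustration, and a real signature engine would inspect many more fields than the two addresses.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    # Hypothetical minimal packet record; real traffic has many more fields.
    src_ip: str
    dst_ip: str


# Each signature is a (name, predicate) pair. Only the rule from the slide
# is included; further rules would be appended to this list.
SIGNATURES = [
    ("land attack", lambda p: p.src_ip == p.dst_ip),
]


def match_signatures(packet):
    """Return the names of all signatures that the packet matches."""
    return [name for name, predicate in SIGNATURES if predicate(packet)]
```

For example, `match_signatures(Packet("10.0.0.1", "10.0.0.1"))` reports the land attack, while a packet with different source and destination addresses matches nothing.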
Anomaly based
[Diagram: activity measures → probable intrusion]
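One simple way to compare an activity measure against a normal baseline is a z-score test. The sketch below is an assumption chosen for illustration, not the method used in the slides: the baseline is the mean and standard deviation of a measure observed during normal operation, and the function names and the threshold of 3 standard deviations are hypothetical.

```python
from statistics import mean, stdev


def build_baseline(normal_values):
    """Summarize normal activity as (mean, standard deviation)."""
    return mean(normal_values), stdev(normal_values)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value that lies more than `threshold` standard deviations
    away from the baseline mean (a probable intrusion, not a certain one)."""
    mu, sigma = baseline
    if sigma == 0:
        # Constant baseline: any deviation at all is unusual.
        return value != mu
    return abs(value - mu) / sigma > threshold
```

With a baseline built from, say, logins per hour of `[10, 12, 11, 9, 10, 11]`, a value of 11 is treated as normal and a value of 100 is flagged.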
Misuse detection Advantages and
disadvantages
» The primary advantage of signature detection is
that known attacks can be detected fairly reliably
with a low false positive rate.
» The drawback of the signature detection
approach is that such systems typically require a
signature to be defined for all of the possible
attacks that an attacker may launch against a
network
Misuse detection Advantages and
disadvantages
» The main disadvantage of misuse detection
approaches is that they detect only the attacks
they were trained to detect.
» Novel attacks or unknown attacks or even variants
of common attacks often go undetected. At a time
when new security vulnerabilities in software are
discovered and exploited every day, the reactive
approach embodied by misuse detection methods is
not feasible for defeating malicious attacks
Anomaly detection Advantages and
disadvantages
» Anomaly detection systems have two major advantages
over signature-based intrusion detection systems. The first
is their ability to detect unknown attacks as well as
"zero day" attacks.
» The second is that profiles of normal activity are
customized for every system, application and/or network,
which makes it very difficult for an attacker to know with
certainty what activities can be carried out without being detected.
Anomaly detection Advantages and
disadvantages
» A disadvantage of the anomaly detection
approach is that well-known attacks may not be
detected, particularly if they fit the established
profile of the user.
» If an attacker knows that a profile is stored,
the attacker can change behavior gradually and train the
system in such a way that it comes to
consider the attack normal behavior.
Process model for Intrusion Detection
Three fundamental functional components of an IDS:
» Information Sources – the different sources of event
information used to determine whether an intrusion has
taken place. These sources can be drawn from different
levels of the system, with network, host, and application
monitoring most common.
» Analysis – the part of intrusion detection systems that
actually organizes and makes sense of the events derived
from the information sources, deciding when those events
indicate that intrusions are occurring or have already taken
place.
» Response – sending an alarm to the administrator.
Architecture
Architecture of an intrusion detection system
KDD Cup 99 dataset - A benchmark
» The full training dataset contains approximately 4,900,000
connection records (about 494,000 in the 10% subset)
» There are 23 types of attacks in the training
data and 37 types of attacks in the test
data, 14 more than in the training
data
» Each record (row) has 41 features plus one class
variable
» The test data can therefore be used to assess the detection
capacity for unknown attacks.
KDD Cup 99 dataset attacks
Four types of attacks in the KDD Cup 99 dataset:
» Probe: Strictly speaking, probing is not a true attack
but a preparation step that attackers take before
launching attacks.
» DoS (Denial of Service): Such an attack can stop
server operation so that the server cannot
provide services. It usually consumes all of the
server's system resources or its bandwidth,
which disables the system and halts
operation.
KDD Cup 99 dataset attacks
(cont.)
» U2R (User to Root): In this attack, a local user
exploits a system vulnerability to gain
administrator (root) privileges.
» R2L (remote to user): a class of
attack where an attacker sends packets
to a machine over a network, then exploits the
machine's vulnerability to illegally gain local
access as a user.
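To evaluate a classifier per attack type, the individual attack labels are usually grouped into the four categories above. The sketch below shows one such grouping for the labels of the KDD Cup 99 training data as they are commonly documented; the table and the `attack_category` function are illustrative helpers, not part of the original slides, and labels seen only in the test data fall through to "unknown".

```python
# Commonly documented grouping of the KDD Cup 99 training labels.
ATTACK_CATEGORIES = {
    "DoS": ["back", "land", "neptune", "pod", "smurf", "teardrop"],
    "Probe": ["ipsweep", "nmap", "portsweep", "satan"],
    "U2R": ["buffer_overflow", "loadmodule", "perl", "rootkit"],
    "R2L": ["ftp_write", "guess_passwd", "imap", "multihop",
            "phf", "spy", "warezclient", "warezmaster"],
}

# Invert the table so that a label can be looked up directly.
LABEL_TO_CATEGORY = {
    label: category
    for category, labels in ATTACK_CATEGORIES.items()
    for label in labels
}


def attack_category(label):
    """Map a raw label such as 'smurf.' to its category.

    The raw data files end each label with a dot, so it is stripped first.
    """
    label = label.strip().rstrip(".")
    if label == "normal":
        return "normal"
    return LABEL_TO_CATEGORY.get(label, "unknown")
```

For instance, `attack_category("smurf.")` gives `"DoS"` and `attack_category("guess_passwd.")` gives `"R2L"`.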
Evaluation steps
Classification tree
» Classification tree which is also called decision tree is
one of the main techniques used in data mining.
» Its main goal is to learn from class-labeled training tuples
for predicting classes of new or previously unseen data.
» A tree is built in two phases: top-down tree construction
and bottom-up pruning
» ID3 and C4.5, two common decision tree algorithms,
construct the tree in a top-down manner.
Steps of Classification tree
1) Compute the information gain for each attribute.
2) Select the attribute with the highest information gain
as the splitting attribute.
3) If the selected attribute is discrete (categorical), branch
the node on all of its possible values. If the attribute is
continuous, select the cut point with the highest
information gain.
4) After splitting, check whether each new node is a
leaf (all of its data belong to the same class); otherwise,
the new node is the root of a sub-tree.
5) Repeat the above steps until all new nodes are
leaves.
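Steps 1 and 2 can be sketched for categorical attributes as follows. This is a minimal illustration, assuming each record is a dictionary of attribute values; the function names are hypothetical, and the sketch uses plain information gain as described in the steps above, without pruning or continuous cut points.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting on a categorical attribute."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in partitions.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Step 2: pick the attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Toy records (made up for illustration, not taken from KDD Cup 99).
rows = [
    {"protocol": "tcp", "flag": "SF"},
    {"protocol": "tcp", "flag": "S0"},
    {"protocol": "udp", "flag": "SF"},
    {"protocol": "udp", "flag": "S0"},
]
labels = ["normal", "attack", "normal", "attack"]
print(best_attribute(rows, labels, ["protocol", "flag"]))
```

In this toy data, `flag` separates the two classes perfectly while `protocol` carries no information, so `flag` is chosen as the splitting attribute.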
SVM – Support Vector Machine
[Figure] Left: small distance between data and hyperplane; right: big distance
between data and hyperplane.
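The distance the figure refers to can be computed directly. As a sketch, assuming a hyperplane written as w·x + b = 0 (the names `w`, `b` and both functions are illustrative), the margin of a candidate hyperplane is the smallest distance from any data point to it; an SVM prefers the separating hyperplane for which this value is largest.

```python
from math import sqrt


def distance_to_hyperplane(point, w, b):
    """Geometric distance from a point to the hyperplane w.x + b = 0."""
    norm = sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, point)) + b) / norm


def margin(points, w, b):
    """Smallest distance between any data point and the hyperplane."""
    return min(distance_to_hyperplane(p, w, b) for p in points)
```

For the points `(0, 2)`, `(0, -2)` and `(3, 3)`, the horizontal line with `w = (0, 1)`, `b = 0` has a margin of 2, whereas the vertical line with `w = (1, 0)`, `b = 0` passes through two of the points and has a margin of 0.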
Percentage of various data
10% kddcup.data_10_percent.gz.
Preprocessing of data
» The research samples the training dataset (10%
kddcup.data_10_percent.gz) and the test dataset
» For each proportion of normal data (10%, 20%,
30%, . . ., 90%), a group of 10,000 records with
that proportion is selected from the training dataset
and from the test dataset
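The sampling step can be sketched as below. This is an assumption about how such groups might be drawn, not the authors' procedure: the function name, the fixed random seed and the use of simple random sampling without replacement are all illustrative choices.

```python
import random


def sample_group(normal_records, attack_records, normal_fraction,
                 size=10_000, seed=0):
    """Draw `size` records of which `normal_fraction` are normal traffic.

    Both inputs must be lists holding at least as many records as are
    requested from them, because sampling is without replacement.
    """
    rng = random.Random(seed)
    n_normal = round(size * normal_fraction)
    n_attack = size - n_normal
    group = (rng.sample(normal_records, n_normal)
             + rng.sample(attack_records, n_attack))
    rng.shuffle(group)  # avoid all normal records coming first
    return group
```

Calling the function once for each fraction in `[0.1, 0.2, ..., 0.9]` yields the nine groups described above.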
Comparison
Accuracy = (TP + TN) / (TP + TN + FP + FN) * 100%
False alarm rate = FP / (FP + TN) * 100%
Detection rate = TP / (TP + FN) * 100%
Precision = TP / (TP + FP) * 100%
Recall = TP / (TP + FN) * 100%
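The measures above translate directly into code. The sketch below assumes the usual convention that attacks are the positive class; the function name and the counts in the example are made up for illustration and are not results from the study.

```python
def evaluation_metrics(tp, tn, fp, fn):
    """Compute the comparison measures, in percent, from confusion counts."""
    return {
        "accuracy": 100.0 * (tp + tn) / (tp + tn + fp + fn),
        "false_alarm_rate": 100.0 * fp / (fp + tn),
        "detection_rate": 100.0 * tp / (tp + fn),
        "precision": 100.0 * tp / (tp + fp),
        "recall": 100.0 * tp / (tp + fn),  # same formula as detection rate
    }


# Hypothetical counts: 80 attacks caught, 20 missed,
# 90 normal records passed, 10 normal records flagged.
metrics = evaluation_metrics(tp=80, tn=90, fp=10, fn=20)
print(metrics["accuracy"], metrics["false_alarm_rate"])
```

Note that detection rate and recall share one formula, so they always take the same value.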
Accuracy comparison between C4.5 and
SVM
» When the proportion of normal data is
large (>70%), the accuracy of the two is close,
with SVM ahead
» On average, C4.5 is slightly better
than SVM
Detection rate comparison between C4.5 and
SVM
Comparison of Detection Rate (cont.)
» The detection rate of C4.5 declines as the percentage
of normal data rises, while that of SVM shows no steady trend.
» Overall, the curve of C4.5 lies above that of
SVM
» The detection rate of C4.5 is therefore better than that of
SVM
False alarm rate comparison between C4.5 and
SVM
False alarm rate comparison between C4.5 and SVM
(cont..)
» In false alarm rate, SVM is inferior
to C4.5 only when the proportion of normal
data is 30%, 50% and 60%; it is better
than C4.5 otherwise
» On average, SVM is better than C4.5
in false alarm rate.
Comparison
» Comparing the results of C4.5 and SVM, we
find that C4.5 is superior to SVM in accuracy
and detection rate, but SVM is better in
false alarm rate
Feature Selection
» In complex classification domains, features
may contain false correlations, which hinder
the process of detecting intrusions.
» Further, some features may be redundant
since the information they add is contained in
other features
» Extra features can increase computation time,
and can have an impact on the accuracy of the
IDS.
Feature Selection(cont..)
» Empirical results indicate that significant input feature
selection is important to design an IDS that is lightweight,
efficient and effective for real world detection systems
» IDSs try to perform their task in real time. Some data may
not be useful to the IDS and thus can be eliminated before
processing
» Feature selection can help to reduce the time needed to
construct a model
Correlation
coefficient(preprocessing)
Correlation coefficient of A and B is defined as follows :
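The formula on the original slide did not survive extraction. Assuming it is the standard Pearson correlation coefficient, r = Σ(a_i − mean(A))(b_i − mean(B)) / sqrt(Σ(a_i − mean(A))² · Σ(b_i − mean(B))²), it can be computed as in the sketch below; the function name is illustrative.

```python
from math import sqrt


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length value lists.

    Assumed definition: covariance of A and B divided by the product of
    their standard deviations (the 1/n factors cancel out).
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant feature")
    return cov / sqrt(var_a * var_b)
```

A value close to +1 or -1 indicates that one feature largely repeats the information in the other, which makes it a candidate for removal during preprocessing.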
Detection rate comparison between
C4.5 and SVM
Classification and Regression
Trees (CART)
» The Classification and Regression Trees (CART)
methodology is based on binary recursive partitioning
» The process is binary because parent nodes are always
split into exactly two child nodes and recursive because
the process is repeated by treating each child node as a
parent
» For splitting, the Gini rule is used which essentially is a
measure of how well the splitting rule separates the
classes contained in the parent node
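A binary split scored with the Gini rule can be sketched as follows for a single numeric feature. The function names are illustrative, and the sketch covers only the choice of one split, not the recursive growing and pruning of a full CART tree.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0 for a pure node, larger for mixed classes."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def best_binary_split(values, labels):
    """Find the threshold t so that splitting into `value <= t` and
    `value > t` gives the lowest weighted Gini impurity.

    Returns (threshold, weighted_gini), or None if all values are equal.
    """
    best = None
    # The largest value is skipped: it would leave the right child empty.
    for threshold in sorted(set(values))[:-1]:
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left)
                 + len(right) * gini(right)) / len(labels)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best
```

For values `[1, 2, 8, 9]` labeled normal, normal, attack, attack, the best threshold is 2, which produces two pure child nodes with a weighted impurity of 0.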
Classification and Regression Trees
(CART)(cont…)
» Unlike other methods, CART does not stop in the
middle of the tree growing process, because there
might still be important information to be
discovered by drilling down several more levels.
» Once the maximal tree is grown and a set of
sub-trees is derived from it, CART determines the
best tree by testing for error rates or costs
Classification and Regression Trees
(CART)(cont…)
» The best sub-tree is the one with the lowest or
near-lowest cost, which may be a relatively small
tree
» The best variable selected at each node of the tree
is called the primary variable
» Surrogate variables are defined as the variables
that most accurately predict the action of the
primary variable
Result of CART
» The KDD Cup 99 dataset has 41 features, which makes it
high-dimensional
» Intrusion detection is a real-time task, thus feature reduction
can help reduce the time of constructing a model
» Feature selection with CART resulted in a reduced 12-variable
data set with C, E, F, L, W, X, Y, AB, AE, AF, AG and AI as
variables
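The variable names above look like spreadsheet-style column letters. Assuming that A stands for the first of the 41 features, B for the second, and so on (an assumption, since the slides do not define the lettering), the letters can be converted to 1-based feature positions with a small helper whose name is hypothetical:

```python
def column_letter_to_index(letters):
    """Convert a spreadsheet-style column name (A, B, ..., Z, AA, AB, ...)
    to a 1-based position, assuming A is the first feature."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


selected = ["C", "E", "F", "L", "W", "X", "Y", "AB", "AE", "AF", "AG", "AI"]
print([column_letter_to_index(name) for name in selected])
```

Under that assumption, the twelve selected variables are features 3, 5, 6, 12, 23, 24, 25, 28, 31, 32, 33 and 35.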
Performance of CART
Experimental Result
Conclusion and future work
» Decision trees can help IDSs construct an
accurate model, but they do not do well on R2L and U2R attacks
» From empirical results of U2R and R2L classes which
have small training data and for which decision tree gives
better performance than SVM, we can say that decision
tree works well with small training data
» We found that reducing the number of features will
not necessarily reduce the test time. This quite depends
on the existing relationship between dataset features,
not on the number of features.
References
[1] M. Ektefa, S. Memar, F. Sidi, and L. S. Affendey,
"Intrusion Detection Using Data Mining Techniques," 2010
International Conference on Information Retrieval & Knowledge
Management (CAMP), 2010.
[2] B. M. Bidgoli, M. Analoui, M. H. Rezvani, and H. S.
Shahhoseini, "Performance Evaluation of Decision Tree for
Intrusion Detection Using Reduced Feature Spaces," Trends in
Intelligent Systems and Computer Engineering, 2008.
[3] S. Chebrolu, A. Abraham, and J. P. Thomas, "Feature
deduction and ensemble design of intrusion detection
systems," Computers & Security, 2005.

 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 
power system scada applications and uses
power system scada applications and usespower system scada applications and uses
power system scada applications and uses
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfid
 

IDS - Analysis of SVM and decision trees

  • 1. IDS [Intrusion Detection System] Analysis of Decision Trees and SVM S. V. Farrahi H. Manzari N. Kharazmi Shiraz University of Technology
  • 2. What is intrusion detection? »Intrusion detection systems (IDSs) are software or hardware systems that automate the process of monitoring the events occurring in a computer system or network, analyzing them for signs of security problems.
  • 3. What is intrusion detection? »Intrusion detection is the process of monitoring the events occurring in a computer system or network and analyzing them for signs of intrusions, defined as attempts to compromise the confidentiality, integrity, availability, or to bypass the security mechanisms of a computer or network
  • 4. Why need intrusion detection? »Intrusions are caused by attackers accessing the systems from the Internet, authorized users of the systems who attempt to gain additional privileges for which they are not authorized, and authorized users who misuse the privileges given them.
  • 5. Classification of intrusion detection system Generally speaking, there are two kinds of classification methods for intrusion detection system: » According to different data sources, intrusion detection system includes host-based IDS and network-based IDS. » According to different analysis methods, intrusion detection system includes Misuse Detection and Anomaly Detection.
  • 6. host-based and network-based IDS » Host-based systems base their decisions on information obtained from a single host (usually audit trails), while network-based intrusion detection systems obtain data by monitoring the traffic in the network to which the hosts are connected
  • 7. Misuse Detection and Anomaly Detection » A signature detection system identifies patterns of traffic or application data presumed to be malicious, while anomaly detection systems compare activities against a "normal" baseline. » Anomaly detection assumes that an intrusion will always reflect some deviation from normal patterns. » Misuse detection is based on knowledge of system vulnerabilities and known attack patterns.
  • 10. Misuse detection Advantages and disadvantages » The primary advantage of signature detection is that known attacks can be detected fairly reliably with a low false positive rate. » The drawback of the signature detection approach is that such systems typically require a signature to be defined for all of the possible attacks that an attacker may launch against a network
  • 11. Misuse detection Advantages and disadvantages » The main disadvantage of misuse detection approaches is that they detect only the attacks they have been trained to detect. » Novel or unknown attacks, and even variants of common attacks, often go undetected. At a time when new security vulnerabilities in software are discovered and exploited every day, the reactive approach embodied by misuse detection methods is not feasible for defeating malicious attacks.
  • 12. Anomaly detection Advantages and disadvantages » Anomaly detection systems have two major advantages over signature-based intrusion detection systems. The first is their ability to detect unknown attacks as well as "zero day" attacks. » The second is that profiles of normal activity are customized for every system, application, and/or network, which makes it very difficult for an attacker to know with certainty which activities can be carried out without being detected.
  • 13. Anomaly detection Advantages and disadvantages » A disadvantage of the anomaly detection approach is that well-known attacks may not be detected, particularly if they fit the established profile of the user. » If an attacker knows that his profile is being stored, he can change his behavior slightly over time and train the system to treat the attack as normal behavior.
  • 14. Process model for Intrusion Detection » Three fundamental functional components of an IDS: Information Sources – the different sources of event information used to determine whether an intrusion has taken place. These sources can be drawn from different levels of the system, with network, host, and application monitoring most common. » Analysis – the part of intrusion detection systems that actually organizes and makes sense of the events derived from the information sources, deciding when those events indicate that intrusions are occurring or have already taken place » Response – Send alarm to the administrator
  • 15. Architecture Architecture of an intrusion detection system
  • 16. KDD Cup 99 dataset: a benchmark » The training dataset contains approximately 4,940,000 records. » The training data contains 23 types of attacks and the test data contains 37, that is, 14 attack types more than the training data. » Each record (row) has 41 features plus one class variable. » The test data can therefore be used to assess the detection capacity for unknown attacks.
  • 17. KDD Cup 99 dataset attacks » There are four types of attacks in KDD Cup 99. Probe: strictly speaking, a probe is not a true attack but a preparation step that attackers take before launching attacks. » DoS (denial of service): such an attack may halt server operation so that the server cannot provide services. The attack usually consumes all of the server's system resources or its bandwidth, which disables the system and stops operation.
  • 18. KDD Cup 99 dataset attacks (cont.) » U2R (user to root): the attacker exploits system vulnerabilities to obtain the privileges of a legitimate user or of the administrator. » R2L (remote to user): an attacker sends packets to a machine over a network, then exploits the machine's vulnerability to illegally gain local access as a user.
  • 20. Classification tree » A classification tree, also called a decision tree, is one of the main techniques used in data mining. » Its main goal is to learn from class-labeled training tuples in order to predict the classes of new or previously unseen data. » The two methods for building a tree are top-down tree construction and bottom-up pruning. » ID3 and C4.5, two common decision tree algorithms, construct the tree in a top-down manner.
  • 21. Steps of Classification tree 1) Computing the information gain for each attribute. 2) The attribute with the highest information gain, is selected as a splitting attribute. 3) If the selected attribute is discrete (categorical), the node is branched with all possible values. If the attribute is continuous, a cut point with the highest information gain is selected. 4) After splitting, consider whether or not these new nodes are leaves (their data belong to the same type); otherwise, new nodes are the root of the sub-trees. 5) Repeating all the above steps, until all new nodes are leaves.
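Steps 1 and 2 above can be illustrated with a minimal Python sketch that computes the information gain of categorical attributes and picks the best splitting attribute. The record layout (dictionaries keyed by attribute name) and the toy values are assumptions made for illustration; they are not taken from the slides or from the KDD Cup 99 files, and continuous attributes (step 3) are not handled here.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting on one categorical attribute."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    # Weighted entropy of the partitions produced by the split.
    remainder = sum(len(part) / total * entropy(part)
                    for part in partitions.values())
    return entropy(labels) - remainder

def best_attribute(rows, labels, attributes):
    """Step 2: pick the attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))

# Hypothetical toy records (illustrative only).
rows = [
    {"protocol": "tcp", "flag": "SF"},
    {"protocol": "tcp", "flag": "S0"},
    {"protocol": "udp", "flag": "SF"},
    {"protocol": "icmp", "flag": "SF"},
]
labels = ["normal", "attack", "normal", "attack"]

print(best_attribute(rows, labels, ["protocol", "flag"]))  # protocol
```

A full tree builder would apply `best_attribute` recursively to each partition until every node is a leaf, as in steps 4 and 5.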
  • 22. SVM – Support Vector Machine [Figure: left, small distance between data and hyperplane; right, large distance between data and hyperplane.]
  • 23. Percentage of various data 10% kddcup.data_10_percent.gz.
  • 24. Preprocessing of data » The study samples the training dataset (the 10% subset, kddcup.data_10_percent.gz) and the test dataset. » From each of the training and test datasets, groups of 10,000 records are selected so that the proportion of normal records is 10%, 20%, 30%, ..., 90%.
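One possible way to build such groups is sketched below in Python. The slides do not say how the sampling was done, so everything here is an assumption: records are dictionaries with a hypothetical `"label"` key whose value is `"normal"` for normal traffic, sampling is without replacement, and the function name is made up for illustration.

```python
import random

def sample_by_normal_proportion(records, normal_fraction, size=10_000, seed=0):
    """Draw `size` records so that `normal_fraction` of them are normal.

    Assumes each record is a dict with a "label" key (hypothetical layout).
    Raises ValueError if there are too few normal or attack records.
    """
    rng = random.Random(seed)
    normal = [r for r in records if r["label"] == "normal"]
    attack = [r for r in records if r["label"] != "normal"]
    n_normal = round(size * normal_fraction)
    # Sample each class separately, then mix the two parts.
    sample = rng.sample(normal, n_normal) + rng.sample(attack, size - n_normal)
    rng.shuffle(sample)
    return sample

# Illustrative data: 200 normal and 200 attack records.
records = ([{"label": "normal"} for _ in range(200)]
           + [{"label": "smurf"} for _ in range(200)])

# One group per normal proportion 10%, 20%, ..., 90% (group size 100 here).
groups = {p: sample_by_normal_proportion(records, p / 100, size=100)
          for p in range(10, 100, 10)}
```

Fixing the seed keeps the groups reproducible, which matters when two classifiers are compared on the same samples.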
  • 25. Comparison Accuracy = (TP + TN)/(TP + TN + FP + FN) * 100% False alarm rate = FP/(FP + TN) * 100% Detection rate = TP/(TP + FN) * 100% Precision = TP/(TP + FP) * 100% Recall = TP/(TP + FN) * 100%
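As a worked illustration, the measures on this slide can be computed from confusion-matrix counts as in the Python sketch below. Accuracy is read here as (TP + TN) divided by all records, which is the standard definition. The function name and the example counts are hypothetical and are not results from the study.

```python
def ids_metrics(tp, tn, fp, fn):
    """Return the slide's evaluation measures as percentages.

    tp: attacks flagged as attacks      fn: attacks missed
    tn: normal records passed as normal fp: normal records flagged as attacks
    """
    return {
        "accuracy": 100.0 * (tp + tn) / (tp + tn + fp + fn),
        "false_alarm_rate": 100.0 * fp / (fp + tn),
        "detection_rate": 100.0 * tp / (tp + fn),
        "precision": 100.0 * tp / (tp + fp),
        "recall": 100.0 * tp / (tp + fn),
    }

# Made-up counts for illustration only.
metrics = ids_metrics(tp=80, tn=90, fp=10, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}%")
```

Detection rate and recall share the same formula, so they always take the same value.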
  • 28. Accuracy comparison between C4.5 and SVM » When the proportion of normal data is large (>70%), their accuracy is approximately equal, but SVM is much better. » On average, C4.5 is slightly better than SVM.
  • 29. Detection rate comparison between C4.5 and SVM
  • 30. Comparison of Detection Rate(cont..)
  • 31. Comparison of Detection Rate (cont.) » The detection rate of C4.5 declines as the percentage of normal data rises, while that of SVM shows no steady trend. » Overall, the curve of C4.5 lies above that of SVM. » The detection rate of C4.5 is therefore better than that of SVM.
  • 32. False alarm rate comparison between C4.5 and SVM
  • 33. False alarm rate comparison between C4.5 and SVM
  • 34. False alarm rate comparison between C4.5 and SVM (cont.) » In false alarm rate, SVM is inferior to C4.5 only when the proportion of normal data is 30%, 50%, or 60%; otherwise it is better than C4.5. » On average, SVM is better than C4.5 in false alarm rate.
  • 35. Comparison » Comparing the results of C4.5 and SVM, we find that C4.5 is superior to SVM in accuracy and detection rate, but SVM is better in false alarm rate.
  • 36. Feature Selection » In complex classification domains, features may contain false correlations, which hinder the process of detecting intrusions. » Further, some features may be redundant since the information they add is contained in other features » Extra features can increase computation time, and can have an impact on the accuracy of the IDS.
  • 37. Feature Selection (cont.) » Empirical results indicate that selecting the significant input features is important for designing an IDS that is lightweight, efficient, and effective for real-world detection systems. » IDSs try to perform their task in real time. Some data may not be useful to the IDS and can thus be eliminated before processing. » Feature selection can help reduce the time needed to construct a model.
  • 40. Detection rate comparison between C4.5 and SVM
  • 41. Classification and Regression Trees (CART) » The Classification and Regression Trees (CART) methodology is based on binary recursive partitioning » The process is binary because parent nodes are always split into exactly two child nodes and recursive because the process is repeated by treating each child node as a parent » For splitting, the Gini rule is used which essentially is a measure of how well the splitting rule separates the classes contained in the parent node
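The Gini rule can be made concrete with a short Python sketch that scores binary splits of one numeric feature by weighted Gini impurity, where lower means the two child nodes are purer. This is a simplified illustration under stated assumptions, not the CART implementation used in the study: candidate thresholds are taken as midpoints between sorted distinct values, and the feature values and labels are invented.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a node: 1 minus the sum of squared class proportions."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def gini_of_split(left, right):
    """Impurity of a binary split, weighted by child node size."""
    total = len(left) + len(right)
    return len(left) / total * gini(left) + len(right) / total * gini(right)

def best_threshold(values, labels):
    """Return (threshold, weighted_gini) of the best binary split.

    Returns None when the feature has a single distinct value.
    """
    best = None
    distinct = sorted(set(values))
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = gini_of_split(left, right)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical values of one numeric feature and their class labels.
values = [1, 2, 10, 12]
labels = ["normal", "normal", "attack", "attack"]
print(best_threshold(values, labels))  # (6.0, 0.0)
```

Applying `best_threshold` to every feature at a node and then recursing on the two children gives the binary recursive partitioning described above.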
  • 42. Classification and Regression Trees (CART) (cont.) » Unlike other methods, CART does not stop in the middle of the tree-growing process, because there might still be important information to be discovered by drilling down several more levels. » Once the maximal tree is grown and a set of sub-trees is derived from it, CART determines the best tree by testing for error rates or costs.
  • 43. Classification and Regression Trees (CART) (cont.) » The best sub-tree is the one with the lowest or near-lowest cost, which may be a relatively small tree. » The best variable selected at each node of the tree is called the primary variable. » Surrogate variables are defined as the variables that most accurately predict the action of the primary variable.
  • 44. Result of CART » The KDD Cup 99 dataset has 41 features, which makes it high-dimensional. » Intrusion detection is a real-time task, so feature reduction can help reduce the time needed to construct a model. » Feature selection resulted in a reduced 12-variable dataset with C, E, F, L, W, X, Y, AB, AE, AF, AG, and AI as variables.
  • 48. Conclusion and future work » Decision trees can help IDSs by constructing an accurate model, but they do not do well on R2L and U2R attacks. » The U2R and R2L classes have little training data, and for them the decision tree gives better performance than SVM, so we can say that the decision tree works well with small training data. » We found that reducing the number of features will not necessarily reduce the test time. This depends on the relationships among the dataset features, not on the number of features.