1. Comparison of Genetic Algorithm Optimization on
Artificial Neural Network and Support Vector Machine
Case Study: Intrusion Detection System
Presented by : Amin Dastanpour
PhD Candidate of Network Security
Advanced Informatics School, University Technology Malaysia, Kuala Lumpur
2. Table of Contents
Introduction Slide 3
Problem of IDS Slide 4
Solution Slide 5
Related Work Slide 6
Artificial Neural Network Slide 7
Support Vector Machine Slide 8
Genetic Algorithm Slide 9
Methodology Slide 10
Data Set Slide 11
Result Slide 12
Conclusion Slide 15
4. Problem of IDS
An IDS can detect only known attacks, so its
attack definitions must be updated frequently.
The network traffic to be processed is very large,
and the data distribution is highly imbalanced.
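To make the imbalance point concrete, the short sketch below computes the share of each class in a labeled sample. The label counts are made up for illustration and are not taken from the slides; the function names are hypothetical. It also shows why plain accuracy can mislead: always predicting the majority class already scores as high as that class's share.

```python
from collections import Counter


def class_distribution(labels):
    """Return the fraction of records that belong to each class."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def majority_baseline_accuracy(labels):
    """Accuracy of a trivial detector that always predicts the largest class."""
    return max(class_distribution(labels).values())


# Hypothetical, made-up sample: 90 DoS records, 9 normal, 1 U2R.
sample_labels = ["dos"] * 90 + ["normal"] * 9 + ["u2r"] * 1

distribution = class_distribution(sample_labels)
baseline = majority_baseline_accuracy(sample_labels)
```

With such a sample, a detector that never reports the rare class would still look accurate, which is one reason detection rate per attack type is worth tracking separately.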
5. Solution
Machine learning discovers and learns patterns,
then adapts as the situation changes over time.
In an IDS, the algorithms are applied to previously
unseen input attacks to perform the actual
detection.
This makes it possible to recognize new attacks.
The number of key features and the detection
process are optimized.
6. Related work
Author | Method | Objective
Bin Luo et al. | Four-angle-star based visualized feature generation approach (FASVFG) | Evaluate the distance between samples in a 5-class classification problem
Abraham et al. | Fuzzy rule based classifiers | Framework for Distributed Intrusion Detection Systems (DIDS)
Amiri et al. | Forward feature selection algorithm (FFSA); linear correlation feature selection (LCFS); modified mutual information feature selection (MMIFS) | Propose a feature selection phase that can generally be implemented on any intrusion detection system
Li et al. | Ant colony algorithm and support vector machine (SVM) | Propose a desirable IDS model with high efficiency and accuracy
Dastanpour et al. | Feature selection based on the Genetic Algorithm (GA) and Support Vector Machine (SVM) | Improve the detection rate with fewer features
Dastanpour et al. | Genetic Algorithm (GA) with an Artificial Neural Network (ANN) classifier to detect attacks in the network | Increase accuracy with the optimal number of features
7. Artificial Neural Network (ANN)
An Artificial Neural Network (ANN) is used to solve
regression and classification problems and is able
to recognize patterns.
It can recognize new attacks or data based on the
previous ones.
Problem of ANN
For classification and recognition, an ANN requires
a large data set, both for optimizing on this type
of data and for generating a feature or pattern.
8. Support Vector Machine (SVM)
A support vector machine (SVM) is used for solving
classification problems, including non-linear
classification.
Problem of SVM
SVM needs a large set of data.
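The slide does not say how the SVM handles non-linear classification. A common way is a kernel function, so the sketch below assumes an RBF kernel and evaluates a kernel decision function on the XOR pattern, which no straight line can separate. The support vectors, coefficients, and function names are hand-picked for illustration; nothing here is trained, and none of it comes from the presentation.

```python
import math


def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def decision_value(x, support_vectors, labels, alphas, bias=0.0, gamma=1.0):
    """Kernel decision function: sum_i alpha_i * y_i * K(sv_i, x) + bias."""
    total = sum(
        alpha * label * rbf_kernel(vector, x, gamma)
        for vector, label, alpha in zip(support_vectors, labels, alphas)
    )
    return total + bias


def predict(x, support_vectors, labels, alphas, bias=0.0, gamma=1.0):
    """Return +1 or -1 according to the sign of the decision value."""
    value = decision_value(x, support_vectors, labels, alphas, bias, gamma)
    return 1 if value >= 0 else -1


# XOR-style toy data: opposite corners share a class.
# The coefficients below are assumed values, not the result of training.
xor_points = [(0, 0), (1, 1), (0, 1), (1, 0)]
xor_labels = [-1, -1, 1, 1]
xor_alphas = [1.0, 1.0, 1.0, 1.0]

predictions = [predict(p, xor_points, xor_labels, xor_alphas) for p in xor_points]
```

A real SVM would obtain the coefficients and bias by solving its optimization problem on training data; the point here is only that a kernel lets the decision boundary bend.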
9. Genetic Algorithm (GA)
A genetic algorithm (GA) is an adaptive, exploratory
search algorithm based on the evolutionary ideas of
natural genetics.
A GA is capable of proposing a single solution with
an optimal value.
In this research, GA is used to support ANN and SVM.
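The slides do not give the GA configuration, so the following is only a minimal sketch of how a GA can search for a feature subset. It assumes a binary mask per chromosome (1 keeps a feature, 0 drops it), tournament selection, one-point crossover, bit-flip mutation, and elitism. The fitness function is a toy stand-in with a hypothetical set of informative features; in a setup like the one described here, fitness would more plausibly come from evaluating an ANN or SVM classifier on the selected features.

```python
import random


def tournament(population, fitness, rng):
    """Pick two distinct individuals at random and return the fitter one."""
    first, second = rng.sample(population, 2)
    return first if fitness(first) >= fitness(second) else second


def run_ga(fitness, n_features, pop_size=20, generations=30,
           crossover_rate=0.8, mutation_rate=0.02, seed=0):
    """Evolve a 0/1 feature mask that maximizes `fitness`."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    history = [fitness(best)]
    for _ in range(generations):
        next_population = [best[:]]  # elitism: keep the best mask unchanged
        while len(next_population) < pop_size:
            parent_a = tournament(population, fitness, rng)
            parent_b = tournament(population, fitness, rng)
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, n_features)  # one-point crossover
                child = parent_a[:cut] + parent_b[cut:]
            else:
                child = parent_a[:]
            # Bit-flip mutation: each gene flips with a small probability.
            child = [1 - gene if rng.random() < mutation_rate else gene
                     for gene in child]
            next_population.append(child)
        population = next_population
        best = max(population, key=fitness)
        history.append(fitness(best))
    return best, history


# Hypothetical stand-in for classifier performance: reward a made-up set of
# "informative" feature indices and charge a small cost per selected feature.
INFORMATIVE = {0, 3, 5, 8}


def toy_fitness(mask):
    hits = sum(1 for index in INFORMATIVE if mask[index])
    return hits - 0.1 * sum(mask)


best_mask, fitness_history = run_ga(toy_fitness, n_features=12)
```

Because the best mask is copied into every new generation, the best fitness never decreases from one generation to the next. The size penalty in the toy fitness mirrors the goal of reaching a good detection rate with fewer features.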
11. DataSet
The Knowledge Discovery and Data Mining (KDD CUP
1999) data set is used.
It contains 494,020 single connection vectors, each
of which has 41 features and is labeled as either
normal or exactly one specific attack type.
The attack categories are:
Probing
U2R
R2L
DOS
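As an illustration of the record layout, the sketch below splits one connection vector into its 41 features and its label, then maps the label to one of the categories listed above. It assumes the comma-separated raw-file format in which the label is the 42nd field and ends with a period. The mapping covers only a few well-known attack names and is deliberately incomplete; the function and variable names are hypothetical.

```python
# Partial mapping from specific attack names to the categories above.
# Only a handful of labels are listed; anything else maps to "unknown".
ATTACK_CATEGORY = {
    "normal": "normal",
    "smurf": "DOS",
    "neptune": "DOS",
    "portsweep": "Probing",
    "ipsweep": "Probing",
    "buffer_overflow": "U2R",
    "rootkit": "U2R",
    "guess_passwd": "R2L",
    "warezclient": "R2L",
}

N_FEATURES = 41


def parse_record(line):
    """Split one raw record into (features, label, category)."""
    fields = line.strip().split(",")
    if len(fields) != N_FEATURES + 1:
        raise ValueError(f"expected {N_FEATURES + 1} fields, got {len(fields)}")
    features = fields[:N_FEATURES]
    label = fields[N_FEATURES].rstrip(".")  # assumed trailing period on labels
    return features, label, ATTACK_CATEGORY.get(label, "unknown")


# Illustrative record in the assumed format: 41 feature values plus a label.
SAMPLE_RECORD = (
    "0,tcp,http,SF,181,5450,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,"
    "8,8,0.00,0.00,0.00,0.00,1.00,0.00,0.00,"
    "9,9,1.00,0.00,0.11,0.00,0.00,0.00,0.00,0.00,normal."
)

features, label, category = parse_record(SAMPLE_RECORD)
```

Symbolic fields such as the protocol and service stay as strings here; a classifier would need them encoded numerically first.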
14. COMPARISON WITH OTHER ALGORITHMS
COMPARISON OF GA-ANN AND GA-SVM WITH THE OTHER
ALGORITHMS MENTIONED IN THE RELATED WORK.
Name of algorithm | Detection rate | Number of features
LCFS | 100 % | 21
FFSA | 100 % | 31
MMIFS | 100 % | 24
Fuzzy rule based | 100 % | 41
FASVFG | 94 % | 20
SVM with GA | 100 % | 24
ANN with GA | 100 % | 18
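The comparison can be read as a two-step ranking: higher detection rate first, then fewer features. The sketch below applies that ordering to the values from the table; the ranking rule itself is an assumption about how to read the table, not something the slides state.

```python
# (algorithm, detection rate in percent, number of features), from the table.
results = [
    ("LCFS", 100, 21),
    ("FFSA", 100, 31),
    ("MMIFS", 100, 24),
    ("Fuzzy rule based", 100, 41),
    ("FASVFG", 94, 20),
    ("SVM with GA", 100, 24),
    ("ANN with GA", 100, 18),
]

# Sort by detection rate (descending), then by feature count (ascending).
ranked = sorted(results, key=lambda row: (-row[1], row[2]))

# Among the methods that reach 100 %, find the smallest feature count.
fewest_features_at_full_detection = min(
    row[2] for row in results if row[1] == 100
)
```

Under this ordering ANN with GA comes first, since it reaches 100 % with 18 features, and FASVFG comes last because its detection rate is 94 % despite using only 20 features.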
15. Conclusion
In this study, GA is proposed for producing the
detection features. SVM and ANN are then used as
the detection system classifiers and compared with
each other to show the effectiveness of GA on
these methods.
Compared with the other methods, the proposed
approaches reach the highest detection rate: GA
with SVM requires 24 features and GA with ANN
needs 18 to achieve 100% detection.
16. References
1) F. Amiri, M. Rezaei Yousefi, C. Lucas, A. Shakery, and N. Yazdani, "Mutual information-
based feature selection for intrusion detection systems," Journal of Network and
Computer Applications, vol. 34, pp. 1184-1199, 2011.
2) A. Abraham, R. Jain, J. Thomas, and S. Y. Han, "D-SCIDS: Distributed soft computing
intrusion detection system," Journal of Network and Computer Applications, vol. 30,
pp. 81-98, 2007.
3) A. Dastanpour and R. A. R. Mahmood, "Feature Selection Based on Genetic Algorithm
and Support Vector Machine for Intrusion Detection System," in The Second
International Conference on Informatics Engineering & Information Science
(ICIEIS2013), 2013, pp. 169-181.
4) A. Dastanpour, S. Ibrahim, and R. Mashinchi, "Using Genetic Algorithm to Supporting
Artificial Neural Network for Intrusion Detection System," in The International
Conference on Computer Security and Digital Investigation (ComSec2014), 2014, pp.
1-13.
5) …