A STUDY OF FEATURE SELECTION METHODS IN INTRUSION
DETECTION SYSTEM: A SURVEY
AMRITA & P AHMED
Department of CSE, Sharda University, Greater Noida, India
ABSTRACT
Detection of security threats, commonly referred to as intrusions, has become a critical issue in
network, data and information security. An intrusion detection system (IDS) is therefore an essential
component of computer and network security. Prevention of such intrusions depends entirely on the
detection capability of the IDS. As network speeds increase, there is an emerging need for IDSs to be
lightweight while maintaining high detection rates. Many feature selection methods have therefore been
proposed in the literature. Approaches for selecting a good feature subset fall into three broad
categories: filter, wrapper and hybrid. The aim of this paper is to present a survey of feature selection
methods for IDS on the KDD CUP’99 benchmark dataset, organized by these three categories and by
different evaluation criteria.
KEYWORDS : Feature selection, intrusion detection systems, filter method, wrapper method, hybrid
method.
INTRODUCTION
In the last three decades, computer networks have grown drastically in size and complexity. This
growth has posed challenging issues in network and information security, and detection of
security threats, commonly referred to as intrusions, has become a critical issue in
network, data and information security. Security attacks can cause severe disruption to data and
networks. An Intrusion Detection System (IDS) has therefore become an important part of every computer
or network system. An IDS monitors computer or network traffic, identifies malicious activities that
compromise the integrity, confidentiality or availability of information resources, and alerts the system
or network administrator to malicious attacks. Even for a small network, an IDS needs to examine a very
large volume of high-dimensional data. As a result, an IDS has to overcome the challenges of low
detection rate and heavy computation. Feature selection therefore plays a key role in achieving maximal
performance in intrusion detection. It is one of the most important and frequently used data
preprocessing techniques for selecting a subset of relevant features to build a robust IDS.
Feature selection is the selection of the minimal-cardinality subset of the original feature set that
retains the high detection accuracy of the original feature set [1]. An efficient feature subset reduces
training and testing time, which helps to build a lightweight IDS that guarantees high detection rates
and is suitable for real-time and on-line detection of attacks.
International Journal of Computer Science Engineering
and Information Technology Research (IJCSEITR)
ISSN 2249-6831
Vol.2, Issue 3, Sep 2012 1-25
© TJPRC Pvt. Ltd.,
This survey paper categorizes the feature selection algorithms that have been developed for IDS
building, critically evaluates their usefulness, and recommends ways of enhancing the quality of feature
selection algorithms.
The paper is organized into the following sections. Intrusion Detection Systems are reviewed in
Section 2. Section 3 gives the details of the datasets and performance evaluation measures used in this
survey. In Section 4, different methodologies of feature selection in IDSs are discussed. Related
research in the literature on feature selection methods, together with their performance, is addressed in
Section 5. Section 6 summarizes the different results reported in the literature in tabular form.
Section 7 concludes and discusses future research.
INTRUSION DETECTION SYSTEM
An intrusion is defined as an attempt to compromise the confidentiality, integrity or availability of
resources, to make unauthorized use of them, or to bypass the security mechanisms of a computer system
or network. James P. Anderson introduced Intrusion Detection (ID) in the early 1980s [2]. Dorothy
Denning proposed several models for IDS in 1987 [3]. Ideally, Intrusion Detection (ID) should be an
intelligent process of monitoring the events occurring in a system and analyzing them for violations of
security policy. An IDS is required to have a high attack Detection Rate (DR) with a low False Alarm
Rate (FAR). Refer to [4] for the organization of a generalized IDS.
Based on the detection approach, IDSs are classified as anomaly based or misuse based. In the anomaly
based intrusion detection approach [5], the system first learns the normal behavior or activity of the
system or network and detects intrusions as deviations from it. In the misuse or signature based
intrusion detection approach [6], the system first defines each attack and the characteristics that
distinguish it from normal data or traffic, and uses these definitions to detect intrusions. Based on
the location of monitoring, IDSs are classified as Network-based intrusion detection systems (NIDS) [7]
and Host-based intrusion detection systems (HIDS) [8]. A NIDS detects intrusions by monitoring network
traffic in terms of IP packets. A HIDS is installed locally on a host machine and detects intrusions by
examining system calls, application logs, file system modifications and other host activities made by
each user on that machine.
DATASETS AND PERFORMANCE EVALUATION
This section summarizes the popular benchmark datasets and performance evaluation measures
used in the intrusion detection domain to evaluate different feature selection methods for intrusion
detection systems.
DATASETS
The KDD CUP 1999 [9] benchmark datasets are used to evaluate different feature selection
methods for IDS. They consist of 4,940,000 connection records in the training data set and 311,029
connection records in the test data set. The training set contains 24 attack types and the test set
contains 38. Since the full training and test sets are prohibitively large, a 10% subset of the KDD
Cup’99 dataset is frequently used [9]. Each connection is labeled either as normal or as exactly one
specific attack type, and each attack type falls into one of four attack categories [10]: Denial of
Service Attack (DoS), User to Root Attack (U2R), Remote to Local Attack (R2L) and Probing Attack. Each
connection record consists of 41 features, numbered 1 to 41 in order (Table 1), which fall into the
following four categories:
Category 1 (1-9) : Basic features of individual TCP connections
Category 2 (10-22) : Content features within a connection suggested by domain knowledge
Category 3 (23-31) : Traffic features computed using a two-second time window
Category 4 (32-41) : Traffic features computed using a two-second time window from destination to host
Table 1: List of features in the KDD Cup 99 dataset
Feature #  Feature Name         Feature #  Feature Name         Feature #  Feature Name
1          Duration             15         Su-attempted         29         Same-srv-rate
2          Protocol-type        16         Num-root             30         Diff-srv-rate
3          Service              17         Num-file-creations   31         Srv-diff-host-rate
4          Flag                 18         Num-shells           32         Dst-host-count
5          Src-bytes            19         Num-access-files     33         Dst-host-srv-count
6          Dst-bytes            20         Num-outbound-cmds    34         Dst-host-same-srv-rate
7          Land                 21         Is-host-login        35         Dst-host-diff-srv-rate
8          Wrong-fragment       22         Is-guest-login       36         Dst-host-same-src-port-rate
9          Urgent               23         Count                37         Dst-host-srv-diff-host-rate
10         Hot                  24         Srv-count            38         Dst-host-serror-rate
11         Num-failed-logins    25         Serror-rate          39         Dst-host-srv-serror-rate
12         Logged-in            26         Srv-serror-rate      40         Dst-host-rerror-rate
13         Num-compromised      27         Rerror-rate          41         Dst-host-srv-rerror-rate
14         Root-shell           28         Srv-rerror-rate
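Because the methods surveyed later report their results as lists of feature numbers, it can help to map those numbers back to the four categories above. The following is a minimal sketch of such a lookup; the category labels (`basic`, `content`, `time_traffic`, `host_traffic`) and the example subset are illustrative names chosen here, not part of the dataset.

```python
# Feature-number ranges of the four KDD Cup'99 feature categories.
# The dictionary keys are illustrative labels, not official names.
CATEGORIES = {
    "basic": range(1, 10),           # features 1-9
    "content": range(10, 23),        # features 10-22
    "time_traffic": range(23, 32),   # features 23-31
    "host_traffic": range(32, 42),   # features 32-41
}


def category_of(feature_no):
    """Return the category label of a feature number in 1..41."""
    for name, numbers in CATEGORIES.items():
        if feature_no in numbers:
            return name
    raise ValueError(f"feature number out of range: {feature_no}")


def group_by_category(feature_numbers):
    """Group a selected feature subset by category."""
    groups = {name: [] for name in CATEGORIES}
    for n in sorted(feature_numbers):
        groups[category_of(n)].append(n)
    return groups


# Hypothetical subset, used only to show the call.
print(group_by_category([3, 5, 6, 12, 23, 32]))
```

Grouping a reported subset in this way shows at a glance whether a method relies mainly on basic, content or traffic features.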
Performance Evaluation
The effectiveness of an IDS is evaluated by its ability to make correct predictions. According to
the real nature of a given event compared to the prediction from the IDS, four possible outcomes are
shown in Table 2, known as the confusion matrix [4]. True Positive Rate(TPR) or Detection Rate(DR),
True Negative Rate(TNR), False Positive Rate (FPR) or False Alarm Rate (FAR) and False Negative
Rate(FNR) are measures that can be applied to quantify the performance of IDSs [4] based on the above
confusion matrix.
Table 2. Confusion Matrix
                           Predicted
Actual                     Negative Class (Normal)   Positive Class (Attack)
Negative Class (Normal)    True Negative (TN)        False Positive (FP)
Positive Class (Attack)    False Negative (FN)       True Positive (TP)
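The four rates follow directly from the cells of the confusion matrix. The sketch below uses the standard definitions (each rate is a cell divided by the number of actual records of that class); the function name and the counts are hypothetical and serve only as an illustration.

```python
def ids_rates(tp, tn, fp, fn):
    """Compute DR, FNR, TNR and FAR from confusion-matrix counts."""
    attacks = tp + fn    # all records that are actually attacks
    normals = tn + fp    # all records that are actually normal
    return {
        "DR": tp / attacks,    # True Positive Rate / Detection Rate
        "FNR": fn / attacks,   # False Negative Rate
        "TNR": tn / normals,   # True Negative Rate
        "FAR": fp / normals,   # False Positive Rate / False Alarm Rate
    }


# Hypothetical counts: 100 attack records and 1000 normal records.
rates = ids_rates(tp=90, tn=990, fp=10, fn=10)
for name, value in rates.items():
    print(f"{name}: {value:.2%}")
```

By construction DR + FNR = 1 and TNR + FAR = 1, so reporting DR and FAR is enough to recover the other two rates.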
FEATURE SELECTION
Real-time intrusion detection is nearly impossible due to the huge amount of data flowing on
the Internet. Feature selection can reduce the computation and model complexity. Research on feature
selection started in the early 1960s [11]. Feature selection is a technique for selecting a subset of
relevant features by removing most irrelevant and redundant features [12] from the data in order to
build robust learning models [13].
Process of Feature Selection
A typical feature selection method involves four basic steps [13], shown in Figure 1: a generation
procedure to generate the next candidate subset; an evaluation function to evaluate the subset under
examination; a stopping criterion to decide when to stop; and a validation procedure to check whether
the subset is valid. Figure 1 illustrates how the feature selection process determines and validates
the best feature subset.
Figure 1 : Feature selection process with validation [13].
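The four steps can be sketched as a small loop. The example below uses sequential forward selection as the generation procedure, which is only one possible choice; the `evaluate` and `validate` callables, the merit values and the size penalty are hypothetical placeholders for whatever criterion a concrete method uses.

```python
def forward_selection(features, evaluate, validate, min_gain=1e-9):
    """Greedy forward search following the four-step scheme."""
    selected, best = [], evaluate([])
    while True:
        # 1. Generation: extend the current subset by one unused feature.
        candidates = [selected + [f] for f in features if f not in selected]
        if not candidates:
            break
        # 2. Evaluation: score every candidate subset and keep the best.
        score, subset = max((evaluate(c), c) for c in candidates)
        # 3. Stopping criterion: stop when no candidate improves the score.
        if score - best < min_gain:
            break
        selected, best = subset, score
    # 4. Validation: check the final subset with an independent test.
    return selected, best, validate(selected)


# Toy evaluation: hypothetical per-feature merit minus a size penalty.
MERIT = {1: 0.5, 2: 0.3, 3: 0.05, 4: 0.0}


def toy_evaluate(subset):
    return sum(MERIT[f] for f in subset) - 0.1 * len(subset)


subset, score, ok = forward_selection([1, 2, 3, 4], toy_evaluate, lambda s: len(s) > 0)
print(subset, round(score, 2), ok)
```

In a real IDS setting the evaluation function would be a filter criterion or a classifier's accuracy, and validation would typically use held-out data.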
METHODS FOR FEATURE SELECTION
Blum and Langley [14] divide feature selection methods into three categories: filter,
wrapper and hybrid (embedded) methods. These methods are currently used in intrusion detection. The
filter method [15][16] selects feature subsets based on the general characteristics of the data and is
independent of the classification algorithm. A filter algorithm [18] uses an external learning algorithm
to evaluate the performance of the selected features. The wrapper method [19] “wraps around” the
learning algorithm: it uses one predetermined classifier to evaluate features or feature subsets. A
wrapper algorithm [18] uses a search algorithm to search through the space of possible features and
evaluates each subset by running a model on it. Many feature subsets are evaluated based on
classification performance and the best one is selected. This method is more computationally expensive
than the filter method [17][19]. The hybrid method [17][20] combines the wrapper and filter approaches
to achieve the best possible performance with a particular learning algorithm. For feature selection on
high-dimensional data, a hybrid algorithm [18] needs more efficient search strategies and evaluation
criteria to achieve a time complexity similar to that of filter algorithms. These methods are discussed
in detail in Section 5 and summarized in Section 6.
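The difference between the filter and wrapper criteria can be made concrete with two scoring functions on the same data. In this sketch the filter score is the absolute Pearson correlation between a feature and the class label, and the wrapper score is the leave-one-out accuracy of a 1-nearest-neighbour classifier. Both are illustrative choices rather than the criteria of any particular surveyed method, and the data set is a hypothetical toy.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def filter_score(X, y, j):
    """Filter criterion: relevance of feature j computed from the data alone."""
    return abs(pearson([row[j] for row in X], y))


def wrapper_score(X, y, subset):
    """Wrapper criterion: leave-one-out accuracy of 1-NN restricted to `subset`."""
    correct = 0
    for i, row in enumerate(X):
        nearest = min(
            (k for k in range(len(X)) if k != i),
            key=lambda k: sum((row[j] - X[k][j]) ** 2 for j in subset),
        )
        correct += y[nearest] == y[i]
    return correct / len(X)


# Hypothetical toy data: column 0 separates the classes, column 1 does not.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 4.0], [0.9, 2.0], [1.0, 5.0], [1.1, 1.0]]
y = [0, 0, 0, 1, 1, 1]
print(filter_score(X, y, 0), filter_score(X, y, 1))
print(wrapper_score(X, y, [0]), wrapper_score(X, y, [1]))
```

The filter score needs one pass over the data per feature, whereas the wrapper score re-runs the classifier for every candidate subset, which is the source of the extra cost noted above.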
RELATED WORKS
In this section, we discuss the different feature selection methods used in intrusion
detection, grouped into filter, wrapper and hybrid methods. For each method we report the number of
features selected, the feature numbers (according to Table 1), the performance on the KDD Cup’99
dataset, and the strengths, limitations and future work reported in the literature.
Filter Method
A feature selection algorithm, FSMDB, based on the DB index criterion is proposed in [21] (Zhang
et al., 2004). The criterion function is constructed according to the characteristics of the DB index
criterion. 24 features {feature no. : 6, 5, 1, 34, 33, 36, 32, 8, 27, 29, 28, 30, 26, 38, 39, 35, 13,
24, 23, 11, 3, 10, 12 and 4} are selected and tested using two classifiers, a BP network and an SVM.
The classification accuracies of the FSMDB algorithm with the BP network and SVM classifiers are
reported as 0.1017 and 0.056 respectively. This method can be used for supervised or unsupervised
classification problems but has high computational complexity in unsupervised learning mode. Future
Work: To find a better approach to reduce the high computational complexity in unsupervised learning
mode.
Two neural network methods, (1) neural network principal component analysis (NNPCA) and
(2) nonlinear component analysis (NLCA), are presented in [22] (Kuchimanchi et al., 2004). The numbers
of significant features extracted by PCA, NNPCA and NLCA are 19, 19 and 12 respectively. The first 19
features selected based on the results of the Scree test and the critical eigenvalues test are {feature
no. : 5, 6, 1, 22, 21, 31, 30, 3, 4, 2, 16, 10, 13, 34, 32, 27, 24, 37, 23 and 36}. The performance of
the Non-linear classifier (NC) and the CART decision tree classifier (DC) is tested on four datasets
(Table 3). DC has
relatively high detection accuracies and low false positive rates. Future Work: This work can be extended
on quantitative measures to find optimal combinations of classifiers and feature extractors for IDS.
Table 3 : False Positive Rates (FPR) and Detection Accuracies (DA) for NC and DC on the four Datasets
Dataset       #Features   FPR (NC)   FPR (DC)   DA (NC)    DA (DC)
ORIGDATA      41          8.2821     0.2268     99.0198    99.9428
PCADATA       19          29.4105    0.2609     99.1161    99.9167
NNPCADATA     19          50.5463    0.4922     98.8206    99.7516
NLDATA        12          51.2756    0.8227     97.2306    99.6359
RICGA (ReliefF Immune Clonal Genetic Algorithm), a combined feature subset selection
method based on the ReliefF algorithm, the Immune Clonal selection algorithm and GA, is proposed in [23]
(Zhu et al., 2005). A BP network is used as the classifier. RICGA has higher classification accuracy
(86.47%) for small feature subsets (8 features) than ReliefF-GA. The selected features are not listed in
the paper.
Zainal et al. (2006) [24] investigated the effectiveness of Rough Set (RS) theory both for
identifying important features and as a classifier. The 6 significant features obtained are {feature
no.: 41, 32, 24, 4, 5 and 3}. Classification results obtained by Rough Set are compared with Multivariate
Adaptive Regression Splines (MARS), Support Vector Decision Function (SVDF) and Linear Genetic
Programming (LGP). The classification accuracy of RS ranked second for the normal category and was
almost the same as that of MARS and SVDF for the attack category. Future Work: This work can be extended
in terms of accuracy by focusing on fusion of classifiers after an optimum feature subset is obtained.
Wong and Lai (2006) [25] combined Discriminant Analysis (DA) and Support Vector Machine
(SVM) to detect intrusion for anomaly-based network IDS. Nine features (feature no. : 12, 23, 32, 2, 24,
36, 31, 29 and 39) are extracted by Discriminant Analysis and evaluated by SVM. The TN (%), FP(%),
FN(%) and TP(%) of the proposed method are 99.58%, 0.42%, 9.93% and 90.07% respectively. Future
Work: Multiple Discriminant Analysis (MDA) can be applied to find the optimal feature set for each
type of attack.
Li et al. (2006) [26] proposed a lightweight intrusion detection model. Information Gain and
Chi-Square approach are used to extract important features and Classic Maximum Entropy (ME) model
is used to learn and detect intrusions. The top 12 important features selected by both methods are
{feature no.: 3, 5, 6, 10, 13, 23, 24, 27, 28, 37, 40 and 41}. Experimental results are shown in Table 4.
Future Work: This model can be applied in realistic environment to verify its real-time performance and
effectiveness.
Table 4. Detection Results
          All 41 features               Selected features
Class     Testing Time(s)   Acc.(%)     Testing Time(s)   Acc.(%)
Normal    1.28              99.75       0.78              99.73
Probe     2.09              99.8        1.25              99.76
DoS       1.93              100         1.03              100
U2R       1.05              99.89       0.7               99.87
R2L       1.02              99.78       0.68              99.75
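Information Gain, one of the two ranking criteria used in [26], measures how much knowing a feature's value reduces the entropy of the class label. The sketch below computes it for discrete features; the function names and the four-record sample are hypothetical, and continuous KDD features would first need to be discretized, a step not shown here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """IG = H(class) - H(class | feature) for one discrete feature."""
    n = len(labels)
    by_value = {}
    for v, label in zip(values, labels):
        by_value.setdefault(v, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in by_value.values())
    return entropy(labels) - conditional


# Hypothetical records: protocol predicts the class, flag does not.
labels = ["normal", "normal", "attack", "attack"]
features = {
    "protocol": ["tcp", "tcp", "icmp", "icmp"],
    "flag": ["SF", "REJ", "SF", "REJ"],
}
ranking = sorted(features, key=lambda f: information_gain(features[f], labels), reverse=True)
print(ranking)
```

A filter method then keeps the top-ranked features, for example the top 12 as in [26].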
Tamilarasan et al. (2006) [27] performed different feature selection and ranking methods on the
KDD Cup’99 dataset. Chi-Square analysis, logistic regression, normal distribution and beta distribution
experiments are performed for feature selection. The 25 most significant features ranked by Chi-square
test are {feature no.: 35, 27, 41, 28, 40, 30, 34, 3, 33, 12, 37, 24, 29, 2, 13, 8, 36, 10, 26, 39, 22, 25, 5, 1,
38}. Experiments are performed for normal, probe, DoS, U2R, and R2L using resilient back propagation
neural network. The overall accuracy of the classification is 97.04% with a FPR of 2.76% and FNR of
0.20%.
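The Chi-square test ranks a feature by how far the observed feature-value/class counts deviate from the counts expected if the two were independent. The following is a minimal sketch for discrete features; the function name and the sample are hypothetical and are not taken from [27].

```python
from collections import Counter


def chi_square(values, labels):
    """Pearson chi-square statistic between a discrete feature and the class."""
    n = len(labels)
    observed = Counter(zip(values, labels))
    value_totals = Counter(values)
    label_totals = Counter(labels)
    stat = 0.0
    for v, v_total in value_totals.items():
        for c, c_total in label_totals.items():
            expected = v_total * c_total / n   # count expected under independence
            stat += (observed[(v, c)] - expected) ** 2 / expected
    return stat


# Hypothetical records: logged_in separates the classes, land does not.
labels = ["normal", "normal", "attack", "attack"]
logged_in = [1, 1, 0, 0]
land = [0, 1, 0, 1]
print(chi_square(logged_in, labels), chi_square(land, labels))
```

A larger statistic indicates a stronger dependence between the feature and the class, so features are ranked in decreasing order of the statistic.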
Fadaeieslam et al. (2007) [28] proposed a feature selection method based on Decision
Dependent Correlation (DDC). The mutual information between each feature and the decision is
calculated, and the top 20 important features {feature no.: 3, 5, 40, 24, 2, 10, 41, 36, 8, 13, 27, 28,
22, 11, 14, 17, 18, 7, 9 and 15} are selected and evaluated by an SVM classifier. The classification
result is 93.46%, which outperforms Principal Component Analysis (PCA).
Shina Sheen and R Rajesh (2008) [29] considered different methods: Chi square, Information
Gain and ReliefF for feature selection. Top 20 features {feature no.: 2, 3, 4, 5, 12, 22, 23, 24, 27, 28, 30,
31, 32, 33, 34, 35, 37, 38, 40 and 41} are selected and evaluated using decision tree (C4.5). The
Classification accuracy of Chi Square, Info Gain and ReliefF are 95.8506%, 95.8506% and 95.6432%
respectively.
In [30] (Kiziloren and German, 2009), Principal Component Analysis (PCA) is used for feature
selection to increase the quality of the extracted feature vectors, and a Self-Organizing Map (SOM) is
used as a classifier to detect network anomalies. The highest success rate of the system, 98.83%, is
obtained when the feature vector size equals 10. The selected features are not listed in the paper. The
average success rate of the system without PCA is 97.76%. PCA provides faster classification, which is
important for a real-time system.
Suebsing and Hiransakolwong (2009) [31] proposed a combination of Euclidean Distance and
Cosine Similarity to select robust features subsets with smaller size. Euclidean Distance is used to select
the features to detect the known attacks and Cosine Similarity is used to select the features to detect the
unknown attacks to build a model. The known detection method extracts 30 important features {feature
no. : 1, 2, 12, 25, 26, 27, 28, 30, 31, 35, 37, 38, 39, 40, 41, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, 19
and 22}. The unknown detection method extracts 24 important features {feature no. : 1, 2, 12, 25, 26, 27,
28, 30, 31, 35, 37, 38, 39, 40, 41, 3, 4, 23, 24, 29, 32, 33, 34 and 36}. 15 features {feature no.: 1, 2, 12,
25, 26, 27, 28, 30, 31, 35, 37, 38, 39, 40 and 41} are selected by both methods. The C5.0 method is used
as a classifier. The experimental results are shown in Table 5.
Table 5: Results for known and unknown attack
                          Known attack                                   Unknown attack
Parameter                 Full Set (41)   Known detection method (30)    Full Set (41)   Unknown detection method (24)
Overall TP %              97.95           98.12                          53.31           68.28
Overall FP %              2.04            1.87                           46.69           31.72
Time to Build Model(s)    75              51                             75              45
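The relationship between the two feature sets reported for [31] can be checked with plain set operations. The sets below are copied from the feature lists given above; only the variable names are introduced here.

```python
# Features reported for the known-attack (30) and unknown-attack (24) methods.
known = {1, 2, 12, 25, 26, 27, 28, 30, 31, 35, 37, 38, 39, 40, 41,
         5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, 19, 22}
unknown = {1, 2, 12, 25, 26, 27, 28, 30, 31, 35, 37, 38, 39, 40, 41,
           3, 4, 23, 24, 29, 32, 33, 34, 36}

both = known & unknown                       # selected by both methods
never = set(range(1, 42)) - (known | unknown)  # selected by neither method

print(len(known), len(unknown), len(both))
print(sorted(both))
print(sorted(never))
```

The intersection reproduces the 15 shared features listed above, and the union covers 39 of the 41 features, leaving out only features 20 and 21.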
A new approach named Quantitative Intrusion Intensity Assessment (QIIA) is proposed in
[32] (Lee et al., 2009). QIIA evaluates the proximity of each instance of audit data using proximity
metrics based on Random Forests (RF), and selects important features using the numerical feature
importance provided by RF. Two approaches, QIIA1 and QIIA2, are proposed to determine the values of the
threshold parameters. The top 5 important features selected are {feature no.: 23, 32, 10, 6 and 3}.
Only DoS attacks are used, since the other attack types have a very small number of instances.
The experimental results show that the detection rates (DR) of QIIA1 and QIIA2 are 97.94 and 99.37
respectively.
An entropy-based traffic profiling scheme for detecting security attacks is presented in [33] (Lee
and He, 2009). The paper focuses only on denial-of-service (DoS) attacks. The top six features ranked
by accuracy are {feature no.: 5, 6, 31, 32, 36 and 37}. The true positive rate (TPR) of this scheme is
91%.
Xiao et al. (2009) [34] presented a two-step feature selection algorithm. It eliminates two kinds of
features: irrelevant features in the first step and redundant features in the second step. 21 features
{feature no.: 1, 3, 4, 5, 6, 8, 11, 12, 13, 23, 25, 26, 27, 28, 29, 30, 32, 33, 34, 36 and 39} are
selected and evaluated using the C4.5 algorithm and Support Vector Machine (SVM). The Detection Rate (%),
False Alarm Rate (%) and Processing Time with the selected features (all features) are 86.3 (87.0),
1.89 (1.85) and 15.163 sec (21.891 sec) respectively.
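The two-step idea (first drop irrelevant features, then drop redundant ones) can be sketched as follows. The relevance and redundancy measures actually used in [34] are not described here, so this sketch substitutes absolute Pearson correlation for both; the thresholds, function names and toy columns are likewise assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def two_step_select(columns, labels, relevance_min=0.3, redundancy_max=0.9):
    """columns maps a feature number to its list of values."""
    # Step 1: remove irrelevant features (weakly related to the class label).
    relevance = {f: abs(pearson(col, labels)) for f, col in columns.items()}
    relevant = [f for f in columns if relevance[f] >= relevance_min]
    # Step 2: remove redundant features, visiting the most relevant first and
    # skipping any feature strongly correlated with one that is already kept.
    kept = []
    for f in sorted(relevant, key=lambda f: (-relevance[f], f)):
        if all(abs(pearson(columns[f], columns[k])) < redundancy_max for k in kept):
            kept.append(f)
    return sorted(kept)


# Hypothetical toy columns: 6 duplicates 5 (scaled), 23 is unrelated to the label.
labels = [0, 0, 0, 1, 1, 1]
columns = {
    5: [1, 2, 1, 8, 9, 8],
    6: [2, 4, 2, 16, 18, 16],
    23: [3, 7, 5, 4, 6, 5],
}
print(two_step_select(columns, labels))
```

Here feature 23 is removed in the first step as irrelevant and feature 6 in the second step as redundant with feature 5.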
A novel approach for selecting features and comparing the performance of various BN
classifiers is proposed in [35] (Khor et al., 2009). Two feature selection algorithms, Correlation-based
Feature Selection Subset Evaluator (CFSE) and Consistency Subset Evaluator (CSE), together with domain
experts, are used to form the proposed feature set. This feature set contains 7 features: {feature no.:
3, 6, 12, 23, 32*, 14* and 40*}. A Bayesian Network (BN) is employed as the classifier. The
classification accuracies of the BN for the Normal, DoS, Probe, R2L and U2R types are 99.8, 99.9, 89.4,
91.5 and 69.2% respectively. *: Features that were selected based on domain knowledge.
Bahrololum et al. (2009) [36] used three machine learning methods : Decision Tree(DT),
Flexible Neural Tree (FNT) and Particle Swarm Optimization (PSO) for feature selection. The five
important features {feature no.: 10, 17, 14, 13 and 11} are selected depending on the contribution of the
variables for the construction of the decision tree. The experimental results are shown in (Table 6).
Table 6 : Detection Performance using DT, FNT and PSO Methods
Attack Class   DT        FNT       PSO
Normal         9.96%     99.19%    95.69%
DoS            100%      98.75%    90.41%
R2L            99.02%    99.09%    98.10%
U2R            88.33%    99.70%    100%
Probe          99.66%    98.39%    95.53%
An automatic feature selection method based on filter method is proposed by Nguyen et al.
(2010) [37]. The globally optimal subset of relevant features is found by the Correlation Feature
Selection (CFS) and evaluated by C4.5 and BayesNet. The selected features for Normal&DoS are 3 {5, 6
and 12}; for Normal&Probe are 6 {5, 6, 12, 29, 37 and 41}; for Normal&U2R is 1 {14}; for
Normal&R2L are 2 {10 and 22}. Average classification accuracies of C4.5 and BayesNet are 99.41%
and 98.82% respectively.
Chen et al. [38] (2010) proposed a novel inconsistency-based feature selection method. Data
consistency is applied to find the optimal features and evaluated by decision tree method (C4.5). The
proposed method is compared with CFS (Table 7).
Table 7 : Performance Comparison (CC: Classification Correctness)
              All features       Proposed Method                              CFS Method
Attack Type   CC(%)    Time(s)   Features                   CC(%)    Time(s)  Features                           CC(%)    Time(s)
Probe         99.85    0.66      4(3,5,35,36)               99.77    0.16     4(5,6,25,37)                       94.35    0.27
DoS           99.94    1.08      4(3,4,10,23)               99.81    0.22     4(2,5,16,22)                       99.32    0.33
U2R           100      0.11      2(3,41)                    100      0.09     9(3,10,24,29,31,32,33,34,40)       100      0.08
R2U           98.99    0.22      5(3,5,12,32,35)            99.13    9.13     5(3,5,10,24,33)                    98.05    0.11
All           99.5     3.72      8(1,3,5,25,32,34,36,40)    99.45    0.48     11(2,3,4,5,6,10,23,24,25,36,37)    99.67    6.28
varGDLF, a novel unsupervised statistical approach based on a variational framework for the GD
mixture model with localized feature selection (GDLF), is proposed in [39] (Fan et al., 2011) for
detecting network-based attacks. Eleven features {feature no.: 1, 5, 12, 15, 18, 21, 22, 29, 33, 38 and
41} are selected. The performance of the varGDLF approach is compared with four other variational
mixture models
and it outperforms with the highest accuracy rate (85.2%), the lowest FP rate (7.3%) and the most
accurately detected number of components (4.95). Accuracy rate for Normal, DOS, R2L, U2R and
Probing is 99.5, 96.5, 75.4, 69.6 and 85.1%, respectively. FP rate is 11.5, 0.8, 1.4, 11.5 and 11.3%,
respectively.
An improved information gain (IIG) algorithm based on feature redundancy is proposed in [40]
(Xian et al., 2011). Twenty-two features are selected after applying the Information Gain (IG)
algorithm, and then 12 features {feature no.: 2, 3, 5, 6, 8, 10, 12, 23, 25, 36, 37 and 38} are selected
after applying IIG. Naive Bayes (NB) is used to carry out the experiment on three feature sets: the
original feature set (41 features), feature subset 1 (22 features) and feature subset 2 (12 features).
The processing times (s) for the three feature sets are 8.34, 4.16 and 2.08; the Detection Rates (DR)
(%) are 96.187, 96.407 and 96.801; and the False Positive Rates (FPR) (%) are 5.22, 2.58 and 1.02
respectively.
Wrapper Method
In [41] (Middlemiss and Dick, 2003), a simple Genetic Algorithm (GA) is used to evolve
weights for the features, and a k-nearest neighbour (KNN) classifier is used both as the fitness
function of the GA and as the classifier. The top five ranked features for each class are selected
{DoS-23,29,1,11,24; R2U-24,3,12,23,36; U2R-24,6,31,41,17; Probe-2,37,30,3,6}. The reported results
indicate an increase in intrusion detection accuracy.
Mukkamala and Sung (2003) [42] presented two methods to rank the important features:
(1) Performance-Based Ranking Method (PBRM) and (2) Support Vector Decision Function Ranking
Method (SVDFRM). Thirty-one features are selected as the union of the important features for each of
the 5 classes ranked by PBRM. In SVDFRM, the union of the important features for each of the 5 classes
contains 23 features. The 8 important features identified by both ranking methods are {feature no.: 1,
3, 5, 6, 23, 24, 32 and 33}. Experiments are performed for both methods with an SVM classifier
(Table 8). Future Work: Ongoing experiments include 23-class (22 attack classes plus normal) feature
identification using SVMs.
Table 8 : Performance of SVMs
          Ranked by PBRM (31)                                Ranked by SVDFRM (23)
Class     Training Time(s)   Testing Time(s)   Acc.(%)       Training Time(s)   Testing Time(s)   Acc.(%)
Normal    7.67               1.02              99.51         4.85               0.82              99.55
Probe     44.38              2.07              99.67         36.23              1.4               99.71
DOS       18.64              1.41              99.22         7.77               1.32              99.2
U2R       3.23               0.98              99.87         1.72               0.75              99.87
R2L       9.81               1.01              99.78         5.91               0.88              99.78
An Ant Colony Optimization (ACO) based intrusion feature selection algorithm is proposed in
[43] (Gao et al., 2005). The Fisher discrimination rate is adopted as the heuristic information for the
ants’ traversal. A Least Square based SVM classifier is adopted as the base classifier to evaluate the
generated feature subsets. The number of features selected by the ACO-SVM method is 11 for
Probe, 9 for DoS, and 14 for U2R & R2L. The selected features are not listed in the paper. Table 9
shows the experimental results.
Table 9: Performance of ACO-SVM
Type       #Feature   Correct Classification Rates   False Positive Rates   Average Detection Time
Probe      11         99.40%                         0.35%                  0.074
DoS        9          95.20%                         3.24%                  0.031
U2R&R2L    14         98.70%                         1.60%                  0.078
Banković et al. (2007) [44] investigated the possibility of increasing the detection rate
(DR) of U2R attacks in misuse detection. The features extracted using Principal Component
Analysis (PCA) and Multi Expression Programming (MEP) are {U2R-14, 33; DoS- 1, 5, 39; Normal- 3,
10, 12}. A genetic algorithm is employed to implement rules for detecting various types of attacks.
Two additional rule sets are deployed to re-check the decision of the rule set for detecting U2R
attacks. The experiments show (Table 10) that this system outperforms the best-performing model
reported in the literature.
Table 10. Performance of the System
#Rules   DR (Total System)   DR (U2R Rule System)   FPR (Total System)   FPR (U2R Rule System)
50       50                  46.3                   0.0055               0.007
75       77.8                77.8                   7.2                  10.2
100      100                 100                    16.54                27.4
Chen et al. (2007) [45] presented a wrapper based feature selection method. A random search
method named modified random mutation hill climbing (MRMHC) is introduced as search strategy to
select features subsets and Support Vector Machines (SVMs) as classifier. The experiments are shown in
Table 11. Future Work: This method can be improved on search strategy and evaluation criterion.
Table 11: Selected feature subsets, time for selecting process for different feature
selection algorithm, average time of building and testing process for ALL
Attacks, DOS, PROBE, R2L and U2R
Attack Type                                     ALL            DOS          PROBE         R2L      U2R
#Features                                       5              4            5             3        5
Selected features                               3,5,23,33,34   5,12,23,34   1,3,5,23,37   1,5,6    1,3,6,14,33
Time of Selecting Process(h)      GA-SVMs       1.3            0.5          4             1.5      1.5
                                  MRMHC-SVMs    0.4            0.2          2.2           0.8      0.6
Avg. Time to Build Process(s)     All           78             136          245           317      193
                                  Selected      30             31           96            24       78
Avg. Time to Test Process(s)      All           18             22           49            55       50
                                  Selected      6              5            17            7        15
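The search strategy behind such wrappers can be illustrated with plain random mutation hill climbing (RMHC): flip one randomly chosen feature on or off and keep the change if the score does not get worse. This is a sketch of the basic scheme only, not of the modified variant (MRMHC) proposed in [45]; the toy scoring function stands in for the cross-validated SVM accuracy that a real wrapper would compute, and all names are hypothetical.

```python
import random


def rmhc(n_features, evaluate, iterations=200, seed=0):
    """Random mutation hill climbing over feature masks (lists of booleans)."""
    rng = random.Random(seed)
    current = [rng.random() < 0.5 for _ in range(n_features)]
    best_score = evaluate(current)
    for _ in range(iterations):
        candidate = current[:]
        i = rng.randrange(n_features)
        candidate[i] = not candidate[i]   # mutate one randomly chosen bit
        score = evaluate(candidate)
        if score >= best_score:           # accept if the subset is not worse
            current, best_score = candidate, score
    return current, best_score


# Toy evaluation: positions 0 and 2 are "useful"; each selected feature costs 0.25.
USEFUL = {0, 2}


def toy_score(mask):
    chosen = {i for i, on in enumerate(mask) if on}
    return len(chosen & USEFUL) - 0.25 * len(chosen)


mask, score = rmhc(5, toy_score)
print([i for i, on in enumerate(mask) if on], score)
```

Each evaluation in a real wrapper requires training and testing a classifier, which explains the selection times of hours reported in Table 11.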
A multi-objective genetic fuzzy intrusion detection system (MOGFIDS) is proposed by Tsang et
al. (2007) [46]. The MOGFIDS is used as a genetic wrapper to search for a near-optimal feature subset.
The 27 features selected by MOGFIDS are {feature no.: 2 (tcp, udp, icmp), 5, 6, 7, 8, 9, 11, 12, 13, 14,
17, 18, 22, 23, 25, 30, 32, 33, 34, 35, 36, 37, 38, 39 and 40}. The MOGFIDS has second highest ACC
(99.24%) and lowest FPR (1.1%) among the wrappers in the paper. Future Work: This can be applied to
other complex problem domains such as face recognition and DNA computing.
This paper [47] (Wang and Gombault, 2008) proposed a system that extracts important features
from raw network traffic only for DDoS attacks in real computer networks. The first 9 important features
{feature no.: 23, 32, 37, 33, 5, 24, 31, 39 and 3} based on rank are selected by Information Gain and Chi-
square method and evaluated by Bayesian Networks and decision trees (C4.5) shown in Table 12. Future
Work: A practical real-time system for fast detection of DDoS attacks can be developed.
Table 12: Detection Rate, False Positive Rate and Construction Time Results
Evaluation Criterion             Method   9 features   41 features
DR                               C4.5     99.8         99.8
DR                               BN       99.6         99.0
FPR                              C4.5     0.3          0.3
FPR                              BN       1.6          1.5
Features Construction Time (s)   -        237          2043
Training Time (s)                C4.5     1.7          15.3
Training Time (s)                BN       0.7          4.4
Testing Time (s)                 C4.5     0.2          0.9
Testing Time (s)                 BN       0.2          0.9
Li et al. (2009) [48] proposed a wrapper-based feature selection method to build a lightweight
intrusion detection system. A modified Random Mutation Hill Climbing (RMHC) method is applied as the
search strategy to find a candidate feature subset, and a modified linear Support Vector Machine (SVM)
is used to evaluate the candidate feature subset. A classification algorithm based on a decision tree
whose nodes consist of linear SVMs is used to build the IDS from the selected feature subsets. The
experiments show
that the systems have higher ROC (Receiver Operating Characteristic) scores than all 41 features in
terms of detecting known attacks, new attacks and computational cost (Table 13).
Table 13 – Selected feature subsets, Average time of building and
testing processes with all and selected features for ALL attacks,
DOS, PROBE, R2L and U2R
                                 Building time(s)                    Testing time(s)
Attack Type   Features           All features   Selected features    All features   Selected features
ALL           4(3,5,23,32)       78             36                   18             8
DOS           4(2,5,23,34)       136            41                   22             9
PROBE         6(1,3,5,6,23,35)   245            123                  49             29
R2L           3(1,3,5)           317            35                   55             8
U2R           5(1,3,5,14,32)     193            85                   50             18
Ali et al. (2010) [49] improved the accuracy of the Signature Detection Classification
(SDC) model by applying feature extraction based on customized features. Features are extracted from
the customized features using GA (Genetic Algorithm), two-second-time and Hidden Markov. Eleven
features {feature no.: 5, 6, 13, 23, 24, 25, 26, 33, 36, 37 and 38} are extracted, and the best
signature detection classification model is developed using JRip, Ridor, PART and Decision tree. The
extracted features increased the detection rates by between 0.4% and 9% and reduced the false alarm
rates by between 0.17% and 0.5%.
Gong et al. (2011) [50] proposed a novel approach for feature selection based on Genetic
Quantum Particle Swarm Optimization (GQPSO) for network intrusion detection. A Support Vector
Machine (SVM) is used as the classification algorithm. The selected features and experimental results
are shown in Table 14.
Table 14 : Selected Feature and performance of SVM with GQPSO Algorithm
Attack Type   Features                                    Training Time(ms)   Detecting Time(ms)   DR      Error Report Rate(%)
DoS           10 (2, 6, 3, 12, 21, 22, 31, 26, 28, 30)    0.0627              0.0581               99.98   0
Probe         5 (5, 12, 26, 32, 34)                       0.0431              0.0478               91.77   0.001
R2L           7 (10, 23, 25, 29, 26, 33, 35)              0.053               0.014                98.26   0
U2R           5 (2, 3, 17, 32, 36)                        0.0006              0.0016               100     0.0003
Li et al. (2012) [51] proposed an effective wrapper-based feature reduction method called the
gradually feature removal (GFR) method. The GFR method extracted 19 critical features {feature no.: 2,
4, 8, 10, 14, 15, 19, 25, 27, 29, 31, 32, 33, 34, 35, 36, 37, 38 and 40}. The SVM classifier achieves an
accuracy of 98.6249% and an MCC (Matthews correlation coefficient) of 0.861161. The training and
testing time of the SVM classifier is greatly reduced.
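The Matthews correlation coefficient reported for GFR can be computed from the confusion matrix of a binary classifier. The sketch below uses the standard two-class definition; the function name `mcc` and the example counts are illustrative assumptions, and the way [51] averages MCC over classes is not reproduced here.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


# Hypothetical counts: 90 attacks detected, 10 missed, 95 normal kept, 5 false alarms.
print(round(mcc(tp=90, tn=95, fp=5, fn=10), 3))
```

Unlike plain accuracy, MCC stays informative when the classes are unbalanced, which matters for KDD Cup'99 where some attack classes have very few instances.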
An advanced intelligent system using ensemble soft computing techniques is proposed by
Sindhu et al. (2012) [52] for a lightweight IDS to detect anomalies in networks. GA (Genetic Algorithm)
is used to extract the feature subset and a neurotree paradigm is proposed as the classifier. This method
extracts 16 features {feature no.: 2, 3, 4, 5, 6, 8, 10, 12, 24, 25, 29, 35, 36, 37, 38 and 40}.
The detection rate is 98.4%, which is reported to be superior to the other methods compared.
HYBRID METHOD
Ng et al. (2003) [53] presented a feature importance ranking methodology based on the
stochastic radial basis function neural network output sensitivity measure (RBFNN-SM).
RBFNN-SM is used to evaluate the features for the normal class and six classes of denial of service
(DOS) attack only. The experiments show that the 8 most sensitive features {feature no.: 2, 24, 23, 29,
32, 34, 33 and 36} are enough to classify normal and DOS attacks. The computation time is reduced
from 23 seconds to 9 seconds. The classification accuracies for normal and DOS
attacks are 99.77% and 99.06%; the FAR for 8 (41) features are 0.18% (0.01%) and 0.27% (0.03%); the
FPR are 0.93% (0.70%); and training and testing are 0.94% and (0.71%) respectively.
Shazzad and Park (2005) [54] proposed a fast hybrid feature selection method to determine an
optimal feature set. This method is a fusion of Correlation-based Feature Selection (CFS), Support
Vector Machine (SVM) and Genetic Algorithm (GA). Subsets of features are generated by the Genetic
Algorithm and evaluated by CFS and SVM. The 12 selected features are {feature no.: 1, 6, 12, 14, 23,
24, 25, 31, 32, 37, 40 and 41}. The optimal subset yields an average DR of 99.56% and an average FPR of 37.5%.
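Correlation-based Feature Selection scores a subset by rewarding correlation with the class and penalising correlation among the features themselves. The sketch below implements the usual CFS merit heuristic, k * r_cf / sqrt(k + k(k-1) * r_ff), where r_cf is the mean feature-class correlation and r_ff the mean pairwise feature-feature correlation. The function name `cfs_merit` and its input layout are assumptions, and this is not the full CFS/SVM/GA fusion of [54].

```python
import math


def cfs_merit(feature_class_corr, feature_feature_corr):
    """CFS merit of a subset.

    feature_class_corr: one correlation with the class per selected feature.
    feature_feature_corr: one correlation per unordered pair of selected features.
    """
    k = len(feature_class_corr)
    r_cf = sum(feature_class_corr) / k
    r_ff = (sum(feature_feature_corr) / len(feature_feature_corr)
            if feature_feature_corr else 0.0)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


# Two equally predictive features: independent versus fully redundant.
print(cfs_merit([0.6, 0.6], [0.0]))  # higher merit, the features complement each other
print(cfs_merit([0.6, 0.6], [1.0]))  # lower merit, the second feature adds nothing
```

In a GA-driven search such as the one described above, a merit of this kind could serve as a cheap filter-style fitness before the more expensive classifier-based evaluation.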
Chebrolu, Abraham and Thomas (2005) [7] investigated the performance of two feature
selection techniques, Bayesian Networks (BN) and Classification and Regression Trees (CART), and
developed an ensemble classifier of both techniques for building an IDS, which is best at classifying
R2L and DoS. Seventeen important features {feature no.: 1, 2, 3, 5, 7, 8, 11, 12, 14, 17, 22, 23, 24, 25,
26, 30 and 32} are selected by the Markov blanket model, and a classifier is constructed using BN and
tested. Twelve features {feature no.: 3, 5, 6, 12, 23, 24, 25, 28, 31, 32, 33 and 35} are selected by the
decision tree, and a classifier using CART is constructed and tested. The Normal class is classified
100% correctly, and the accuracies for classes U2R and R2L increase when the 12-variable reduced
data set is used. It is observed that CART classifies accurately on smaller data sets. In the ensemble
approach, the BN and CART models are first constructed individually; the ensemble approach is then
applied to the 12-, 17- and 41-variable data sets. With the ensemble model, Normal, Probe and DOS are
detected with 100% accuracy, and U2R and R2L with 84% and 99.47% accuracy, respectively.
Chen et al. (2007) [55] proposed a new hybrid approach named C4.5-PCA-C4.5.
It uses PCA (Principal Component Analysis) and the decision tree classifier C4.5 for feature
selection and C4.5 as the classifier. The important features extracted are {feature no.: 33, 34, 4, 1,
3, 10 and 22}. The performance of C4.5-PCA-C4.5 is compared with four other systems: C4.5-ALL,
C4.5-PCA, SVM-CFS and SVM-CFS-SVM. The experimental results show that C4.5-PCA-C4.5 has
a lower testing time, fast training and testing processes, the highest TPR and the lowest FPR. The
average building time for C4.5-PCA-C4.5 is 6 sec.
Lee et al. (2007) [56] used two machine learning algorithms: Random Forests (RF) for feature
selection and the Minimax Probability Machine (MPM) for intrusion detection. The top 5 important
features {feature no.: 23, 6, 29, 3 and 5} are selected. Only Denial of Service (DoS) attacks are used. The
detection rate is 99.84% and the average simulation time is 0.1039 sec.
Wei Wang et al. (2008) [57] used a combined filter and wrapper scheme for feature selection. An
information gain (IG) based filter model and a wrapper model based on Bayesian networks (BN) and
decision trees (C4.5) are employed to select features for network intrusion detection, with BN and
C4.5 also used as classifiers. The experimental results and the 10 features selected for each class are
shown in Table 15.
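Table 15. Results comparison using 41 features and 10 features

| Attacks | Features Selected | Method | DR (41 features) | FPR (41 features) | Training Time (s) (41 features) | Test Time (s) (41 features) | DR (10 features) | FPR (10 features) | Training Time (s) (10 features) | Test Time (s) (10 features) |
|---|---|---|---|---|---|---|---|---|---|---|
| DoS | 3, 4, 5, 6, 8, 10, 13, 23, 24, 37 | BN | 98.73 | 0.08 | 4.7 | 2.1 | 100 | 0 | 0.8 | 0.6 |
| DoS | 3, 4, 5, 6, 8, 10, 13, 23, 24, 37 | C4.5 | 99.96 | 0.15 | 16.3 | 1.2 | 100 | 0.14 | 4.6 | 0.5 |
| DDoS | 3, 4, 5, 6, 8, 10, 13, 23, 24, 37 | BN | 99.03 | 1.53 | - | - | 99 | 1.92 | - | - |
| DDoS | 3, 4, 5, 6, 8, 10, 13, 23, 24, 37 | C4.5 | 99.8 | 0.26 | - | - | 100 | 0.34 | - | - |
| Probe | 3, 4, 5, 6, 29, 30, 32, 35, 39, 40 | BN | 92.89 | 6.08 | 3.1 | 2.8 | 83 | 3.06 | 0.5 | 0.4 |
| Probe | 3, 4, 5, 6, 29, 30, 32, 35, 39, 40 | C4.5 | 82.59 | 0.04 | 14.5 | 1.1 | 83 | 0.05 | 1.2 | 0.3 |
| R2L | 1, 3, 5, 6, 12, 22, 23, 31, 32, 33 | BN | 92.22 | 0.33 | 2.6 | 1.8 | 89 | 0.32 | 0.5 | 0.4 |
| R2L | 1, 3, 5, 6, 12, 22, 23, 31, 32, 33 | C4.5 | 80.29 | 0.02 | 10.5 | 0.8 | 87 | 0.01 | 0.5 | 0.2 |
| U2R | 1, 2, 3, 5, 10, 13, 14, 32, 33, 36 | BN | 75.86 | 0.29 | 2.6 | 1.8 | 66 | 0.12 | 0.4 | 0.4 |
| U2R | 1, 2, 3, 5, 10, 13, 14, 32, 33, 36 | C4.5 | 24.14 | 0 | 9.9 | 0.7 | 24 | 0 | 0.6 | 0.2 |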
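The information gain filter used as the first stage in [57] ranks each feature by how much knowing its value reduces the entropy of the class label. The sketch below is a minimal version for discrete features; the function names and the four-record toy sample are assumptions, and continuous KDD Cup'99 features would need to be discretised first.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their expected entropy after splitting on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy sample (assumption): protocol separates the classes, flag does not.
labels = ["normal", "normal", "attack", "attack"]
protocol = ["tcp", "tcp", "icmp", "icmp"]
flag = ["SF", "REJ", "SF", "REJ"]
print(information_gain(protocol, labels), information_gain(flag, labels))
```

A filter stage would compute this score for all 41 features and pass only the top-ranked ones to the more expensive wrapper stage.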
Hong and Haibo (2009) [58] proposed a new hybrid selection algorithm to build a lightweight
network IDS. Chi-Square and an enhanced C4.5 algorithm are used for feature selection in the
preprocessing phase. The top fifteen features extracted by the Chi-Square algorithm are
{feature no.: 5, 3, 23, 35, 4, 8, 30, 34, 36, 6, 33, 38, 24, 25 and 2}. The top five features extracted by
the C4.5 and C4.5-Chi2 methods are {feature no.: 25, 4, 2, 5 and 29} and {feature no.: 5, 3, 4, 8 and 25}
respectively. The experimental results are shown in Table 16.
Table 16: Detection & False Positive Rate Results based on C4.5-CHI2

| Attack Type | DR | FPR |
|---|---|---|
| Normal | 99.9 | 1.6 |
| DOS | 99.3 | 1.48 |
| Probe | 93.87 | 1.82 |
| U2R | 50.01 | 28.32 |
| R2L | 61.55 | 12.17 |

Training Time: 0.02 sec. Testing Time: 0.03 sec. (each reported once for the whole table)
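The Chi-Square ranking used in the preprocessing phase of [58] measures how far the observed feature/class counts deviate from the counts expected under independence. The sketch below computes the Pearson chi-square statistic for one discrete feature; the function name and toy records are assumptions, and the enhanced C4.5 step of the surveyed method is not shown.

```python
from collections import Counter


def chi_square(feature_values, labels):
    """Pearson chi-square statistic between a discrete feature and the class label."""
    n = len(labels)
    observed = Counter(zip(feature_values, labels))
    feature_totals = Counter(feature_values)
    class_totals = Counter(labels)
    stat = 0.0
    for value, f_count in feature_totals.items():
        for cls, c_count in class_totals.items():
            expected = f_count * c_count / n  # count expected if feature and class were independent
            stat += (observed[(value, cls)] - expected) ** 2 / expected
    return stat


# Toy sample (assumption): service is tied to the class, flag is independent of it.
labels = ["normal", "normal", "attack", "attack"]
service = ["http", "http", "ecr_i", "ecr_i"]
flag = ["SF", "REJ", "SF", "REJ"]
print(chi_square(service, labels), chi_square(flag, labels))
```

Features are then sorted by this statistic in descending order, and the top entries (fifteen in [58]) are kept.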
Xiang et al. (2009) [59] presented a hybrid method named Robust Artificial Intelligence
Selection Algorithm (RAIS). Mutual information and an artificial intelligence method are used
for feature subset selection, and SVMs are used as the classifier. The selected features are not
mentioned in the paper. The experimental results show that the RAIS algorithm has the lowest false
alarm rate, 3.49%, the highest accuracy, 99.01%, and the highest detection rate, 99.27%.
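The evaluation criteria quoted throughout this survey (detection rate, false alarm rate and accuracy) all derive from the same binary confusion matrix. The sketch below assumes the common definitions, in which the false alarm rate coincides with the false positive rate; individual papers may define these terms differently, and the function name and counts are illustrative.

```python
def ids_metrics(tp, tn, fp, fn):
    """Common IDS evaluation criteria from binary confusion-matrix counts.

    tp: attacks flagged as attacks      fn: attacks missed
    tn: normal traffic left alone       fp: normal traffic flagged as attack
    """
    return {
        "detection_rate": tp / (tp + fn),      # share of attacks that were detected
        "false_alarm_rate": fp / (fp + tn),    # share of normal records raised as alarms
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Hypothetical counts for 100 attack and 100 normal records.
print(ids_metrics(tp=90, tn=95, fp=5, fn=10))
```

Reporting all three together is useful because a high accuracy can hide a poor detection rate on rare classes such as U2R.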
Zaman and Karray (2009) [60] proposed a novel and simple feature selection method named Enhanced
Support Vector Decision Function (ESVDF). This method uses the Support Vector
Machines (SVMs) approach based on Forward Selection Ranking (FSR) and Backward Elimination
Ranking (BER) algorithms. The ESVDF (SVDF/FSR or SVDF/BER) method applies SVDF within the FSR
and BER approaches to select the most effective feature set. Two classifiers, Neural Networks (NNs)
and SVMs, are used to evaluate the features. The experimental results are shown in Table 17. The names
of the selected features are not mentioned.
Table 17: Comparison of ESVDF/FSR, ESVDF/BER, and All 41 Features using NNs and SVMs classifiers.

| Classifier | Algorithm | #Features | Accuracy | FPR | Training Time | Testing Time |
|---|---|---|---|---|---|---|
| NN | ESVDF/FSR | 6 | 99.55% | 0.0032 | 217.57 | 0.047 |
| NN | ESVDF/BER | 9 | 99.57% | 0.003 | 255.047 | 0.053 |
| NN | None | 41 | 99.65% | 0.0036 | 911.68 | 0.075 |
| SVM | ESVDF/FSR | 6 | 99.46% | 0.0033 | 2.039 | 0.052 |
| SVM | ESVDF/BER | 9 | 99.58% | 0.0031 | 2.1 | 0.046 |
| SVM | None | 41 | 99.71% | 0.0032 | 5.182 | 0.17 |
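Forward selection, one of the two ranking directions used by ESVDF, grows the feature set one feature at a time and stops when no remaining feature improves the score. The sketch below is a generic greedy forward selection, not the SVDF-based procedure of [60]: `forward_select`, the toy weights and the additive scoring function are all assumptions standing in for a classifier-based evaluation.

```python
def forward_select(candidates, score, max_features):
    """Greedy forward selection: repeatedly add the feature that raises the score most."""
    selected = []
    best = score(selected)
    while len(selected) < max_features:
        trials = [(score(selected + [f]), f) for f in candidates if f not in selected]
        if not trials:
            break
        top_score, top_feature = max(trials)
        if top_score <= best:  # stop as soon as no feature helps
            break
        selected.append(top_feature)
        best = top_score
    return selected, best


# Toy scoring (assumption): each feature contributes a fixed amount; feature 6 hurts.
WEIGHTS = {5: 0.5, 3: 0.3, 23: 0.15, 6: -0.05}


def toy_score(features):
    return sum(WEIGHTS[f] for f in features)


print(forward_select([3, 5, 6, 23], toy_score, max_features=10))
```

Backward elimination works the other way round, starting from all features and removing the one whose removal costs the least, which is more expensive at the start but can keep features that are only useful in combination.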
Ming-Yang Su (2011) [61] proposed a feature selection method to detect DoS/DDoS attacks
in real time for designing an anomaly-based NIDS. A genetic algorithm (GA) combined with KNN
(k-nearest-neighbor) is used for feature selection and weighting. The result of KNN classification is used
as the fitness function in the genetic algorithm to evolve the weight vectors of the features. An initial 35
features are weighted in the training phase. The top 19 features are considered for known attacks and the
top 28 features for unknown attacks. The extracted features are not mentioned in the paper. An overall
accuracy rate of 97.42% is obtained for known attacks and 78% for unknown attacks.
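The fitness function described for [61] can be sketched as the accuracy of a k-nearest-neighbour classifier whose distance is scaled by a candidate weight vector. The code below shows only that fitness evaluation, under stated assumptions: the function names, the weighted Euclidean distance and the two-feature toy data are hypothetical, and the GA loop that would evolve the weights is omitted.

```python
import math
from collections import Counter


def weighted_distance(a, b, weights):
    """Euclidean distance with one non-negative weight per feature."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def knn_predict(train, query, weights, k=3):
    """Majority label among the k training records closest to the query."""
    nearest = sorted(train, key=lambda rec: weighted_distance(rec[0], query, weights))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def fitness(weights, train, validation, k=3):
    """Fitness of a weight vector: KNN accuracy on held-out records."""
    hits = sum(knn_predict(train, x, weights, k) == y for x, y in validation)
    return hits / len(validation)


# Toy data (assumption): feature 0 separates the classes, feature 1 is noise.
train = [((0.0, 9.0), "normal"), ((0.1, 1.0), "normal"), ((0.2, 5.0), "normal"),
         ((1.0, 9.1), "attack"), ((0.9, 1.1), "attack"), ((1.1, 5.1), "attack")]
validation = [((0.05, 5.0), "normal"), ((0.95, 1.0), "attack")]

print(fitness((1.0, 0.0), train, validation))  # weight on the informative feature
print(fitness((0.0, 1.0), train, validation))  # weight on the noisy feature
```

A GA would treat each weight vector as a chromosome and favour those with higher fitness, so that uninformative features drift towards low weights and can be dropped from the final ranking.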
A SYSTEMATIC REVIEW OF RELATED WORK
The aforementioned work on feature selection is summarized systematically by approach: filter
methods in Table 18, wrapper methods in Table 19 and hybrid methods in Table 20. Each table lists the
literature reference, the name of the proposed method, the number of features selected, the feature
numbers according to Table 1, the classifier used to evaluate the proposed method, the evaluation
criteria and the results of the proposed method.
Table 18: Summary of Filter Method

| Lit. Ref. | Method Name | No. of Features | Feature No. | Classifier Used | Evaluation Criteria | Result |
|---|---|---|---|---|---|---|
| [21] 2004 | FSMDB | 24 | 6, 5, 1, 34, 33, 36, 32, 8, 27, 29, 28, 30, 26, 38, 39, 35, 13, 24, 23, 11, 3, 10, 12, 4 | BP Network, SVM | Classification Accuracy | BP-0.1017; SVM-0.056 |
| [22] 2004 | NNPCA & NLCA | 19; 12 | 5, 6, 1, 22, 21, 31, 30, 3, 4, 2, 16, 10, 13, 34, 32, 27, 24, 37, 23 | NC & DC | FPR; Detection Accuracies | Table 3 |
| [23] 2005 | RICGA | 12 | Not mentioned | BP Network | Classification Accuracy | 88.15% |
| [24] 2006 | Rough Set | 6 | 41, 32, 24, 4, 5, 3 | Rough Set | Classification Accuracy | 99.743 |
| [25] 2006 | Combined DA and SVM | 9 | 12, 23, 32, 2, 24, 36, 31, 29, 39 | SVM | TN (%); FP (%); FN (%); TP (%) | 99.58%; 00.42%; 09.93%; 90.07% |
| [26] 2006 | Information Gain and Chi-Square approach | 12 | 3, 5, 6, 10, 13, 23, 24, 27, 28, 37, 40, 41 | ME | Accuracy; Testing Time | Table 4 |
| [27] 2006 | Artificial Neural Networks and Statistical Methods | 25 | 35, 27, 41, 28, 40, 30, 34, 3, 33, 12, 37, 24, 29, 2, 13, 8, 36, 10, 26, 39, 22, 25, 5, 1, 38 | RBP Neural Network | Accuracy; FPR; FNR | 97.04%; 2.76%; 0.20% |
| [28] 2007 | Decision Dependent Correlation (DDC) | 20 | 3, 5, 40, 24, 2, 10, 41, 36, 8, 13, 27, 28, 22, 11, 14, 17, 18, 7, 9, 15 | SVM | Classification Accuracy | 93.46% |
| [29] 2008 | Chi Square, Info Gain and ReliefF | 20 | 2, 3, 4, 5, 12, 22, 23, 24, 27, 28, 30, 31, 32, 33, 34, 35, 37, 38, 40, 41 | Decision Tree (C4.5) | Classification Accuracy | 95.8506%; 95.8506%; 95.6432% |
| [30] 2009 | PCA-SOM | 10 | Not mentioned | SOM | Avg. Success Rate | 98.83% |
| [31] 2009 | Euclidean Distance & Cosine Similarity | 15 | 1, 2, 12, 25, 26, 27, 28, 30, 31, 35, 37, 38, 39, 40, 41 | C5.0 | Table 5 | Table 5 |
| [32] 2009 | (1) QIIA1 (Max value); (2) QIIA2 (Center Data) | 5 | 23, 32, 10, 6, 3 | (1); (2) | DR | (1) 97.94; (2) 99.37 |
| [33] 2009 | Entropy-Based Scheme with Chi-Square | 5 | 5, 6, 31, 32, 36, 37 | Chi-Square Test | TPR | 91% |
| [34] 2009 | Mutual Information based Algorithm | 21 | 1, 3, 4, 5, 6, 8, 11, 12, 13, 23, 25, 26, 27, 28, 29, 30, 32, 33, 34, 36, 39 | C4.5 & SVM | DR; FAR; Process. Time | 86.3; 1.89; 15.163 s |
| [35] 2009 | Proposed feature set using CFSE and CSE | 7 | 3, 6, 12, 23, 32*, 14*, 40* | BN | Classification Accuracy (%) | Normal-99.8; DoS-99.9; Probe-89.4; R2L-91.5; U2R-69.2 |
| [36] 2009 | Based on DT, FNT and PSO | 5 | 10, 17, 14, 13, 11 | DT, FNT and PSO | Detection Accuracy | Table 6 |
| [37] 2010 | M01LP from CFS | 3; 6; 1; 2 | Normal&Dos-5, 6, 12; Normal&Probe-5, 6, 12, 29, 37, 41; Normal&U2R-14; Normal&R2L-10, 22 | C4.5; BayesNet | Classification Accuracy | 99.41%; 98.82% |
| [38] 2010 | Inconsistency-based feature selection method | Table 7 | Table 7 | C4.5 | Classification Correctness; Time (s) | Table 7 |
| [39] 2011 | varGDLF | 11 | 1, 5, 12, 15, 18, 21, 22, 29, 33, 38, 41 | varGDLF | Accuracy Rate; FPR; No of Comp. | 85.2%; 7.3%; 4.95 |
| [40] 2011 | IIG (Improved Information Gain) | 12 | 2, 3, 5, 6, 8, 10, 12, 23, 25, 36, 37, 38 | NB | DR; FPR; Processing Time | 96.801; 1.02; 2.08 s |

*: Features that were selected based on domain knowledge.
Table 19: Summary of Wrapper Method

| Lit. Ref. | Method Name | No. of Features | Feature No. | Classifier Used | Evaluation Criteria | Result |
|---|---|---|---|---|---|---|
| [41] 2003 | GA combination with a k-nearest neighbour classifier | 5 for each class | DoS-23, 29, 1, 11, 24; R2U-24, 3, 12, 23, 36; U2R-24, 6, 31, 41, 17; Probe-2, 37, 30, 3, 6 | KNN | Detection Accuracy | Increase in ID Accuracy |
| [42] 2003 | PBRM and SVDFRM | 8 | 1, 3, 5, 6, 23, 24, 32, 33 | SVM | Table 8 | Table 8 |
| [43] 2005 | ACO-SVM | Table 9 | Not mentioned | SVM | Table 9 | Table 9 |
| [44] 2007 | PCA & MEP | 8 | 14, 33, 1, 5, 39, 3, 10, 12 | GA | DR; FPR | Table 10 |
| [45] 2007 | MRMHC-SVMs | Table 11 | Table 11 | SVM | Table 11 | Table 11 |
| [46] 2007 | MOGFIDS | 27 | 2 (tcp, udp, icmp), 5, 6, 7, 8, 9, 11, 12, 13, 14, 17, 18, 22, 23, 25, 30, 32, 33, 34, 35, 36, 37, 38, 39, 40 | MOGFIDS | Accuracy; FPR | 99.24%; 1.1% |
| [47] 2008 | Information Gain and Chi-square | 9 | 23, 32, 37, 33, 5, 24, 31, 39, 3 | C4.5 & BN | Table 12 | Table 12 |
| [48] 2009 | Modified RMHC and modified linear SVM | Table 13 | Table 13 | Decision Tree | Table 13 | Table 13 |
| [49] 2010 | Features Selection based on Customized Features | 11 | 5, 6, 13, 23, 24, 25, 26, 33, 36, 37, 38 | JRip, Ridor, PART & Decision tree | DR; FAR | Increased; Decreased |
| [50] 2011 | GQPSO | Table 14 | Table 14 | SVM | Table 14 | Table 14 |
| [51] 2012 | GFR (Gradually Feature Removal) | 19 | 2, 4, 8, 10, 14, 15, 19, 25, 27, 29, 31, 32, 33, 34, 35, 36, 37, 38, 40 | SVM | Training time (s); Testing time (s); Accuracy (%); MCCavg | 0.118356; 4.63227; 98.6249; 0.861161 |
| [52] 2012 | A combined GA and neurotree method | 16 | 2, 3, 4, 5, 6, 8, 10, 12, 24, 25, 29, 35, 36, 37, 38, 40 | Neurotree | DR | 98.38 |
Table 20: Summary of Hybrid Method

| Lit. Ref. | Method Name | No. of Features | Feature No. | Classifier Used | Evaluation Criteria | Result |
|---|---|---|---|---|---|---|
| [53] 2003 | RBFNN-SM | 8 | 2, 24, 23, 29, 32, 34, 33, 36 | RBFNN | Class. Acc.; FAR; FPR | 99.415%; 0.065%; 0.935% |
| [54] 2005 | A fusion of CFS, SVM & GA | 12 | 1, 6, 12, 14, 23, 24, 25, 31, 32, 37, 40, 41 | SVM | DR; FPR | 99.56%; 37.5% |
| [7] 2005 | Markov blanket model and Decision Tree for feature selection | 17-BN; 12-CART | {1, 2, 3, 5, 7, 8, 11, 12, 14, 17, 22, 23, 24, 25, 26, 30, 32}; {3, 5, 6, 12, 23, 24, 25, 28, 31, 32, 33, 35} | Ensemble of BN and CART | Accuracy (%) | 100% - Normal, DoS, Probe; 84% - U2R; 99.47 - R2L |
| [55] 2007 | C4.5-PCA-C4.5 | 5 | 33, 34, 4, 1, 3, 10, 22 | C4.5 | Testing Time; TPR; FPR | 6 sec; -; - |
| [56] 2007 | RF | 5 | 23, 6, 29, 3, 5 | MPM | DR; Avg Sim. Time | 99.84%; 0.1039 s |
| [57] 2008 | Information gain & BN and C4.5 | 10 | Table 15 | BN & C4.5 | DR; FPR | Table 15 |
| [58] 2009 | C4.5-Chi2 | 5 | 5, 3, 4, 8, 25 | Enhanced C4.5 | Table 16 | Table 16 |
| [59] 2009 | RAIS | - | Not mentioned | SVM | DR; FAR; Accuracy | 99.17%; 3.49%; 98.60% |
| [60] 2009 | ESVDF/FSR; ESVDF/BER | 6; 9 | Not mentioned | NN; SVM | Table 17 | Table 17 |
| [61] 2011 | GA/KNN Hybrid | 19; 28 | Not mentioned | GA/KNN | Accuracy Rate | 97.42%; 78.00% |
CONCLUSIONS & FUTURE RESEARCH DIRECTIONS
Intrusion Detection Systems (IDS) have become a vital and necessary component of almost
every computer and network security system. As network speeds increase, there is an emerging need
for IDS to be lightweight, efficient and accurate, with high detection rates (DR) and low false alarm rates
(FAR). Other difficulties faced by intrusion detection systems are the curse of feature dimensionality and
emerging data complexities. Feature selection has therefore become a very important part of intrusion
detection systems. It selects a subset of relevant features and removes irrelevant and redundant features
from the dataset in order to build a robust, efficient, accurate and lightweight intrusion detection system
that meets real-time requirements.
Plenty of feature selection methods have been proposed by researchers in intrusion detection
to deal with these problems. This paper surveys this fast-developing field and addresses the main
contributions of feature selection research proposed for intrusion detection. We showed why feature
selection is vital in IDS. We surveyed the existing feature selection methods for IDS, categorised as
filter, wrapper and hybrid. We also presented the performance of these methods based on different
metrics on the KDD Cup’99 dataset, and mentioned the extracted feature sets, the classifiers used
to evaluate them, and the strengths, limitations and future work of the proposed methods in
sections 5 and 6. The following are useful future research issues:
FUTURE RESEARCH
A single classifier for evaluating the extracted feature set may no longer be a good solution for
building a robust IDS. Therefore, designing more sophisticated classifiers by combining multiple
classifiers, or by combining ensemble [7] and hybrid classifiers, may enhance the robustness and
performance of IDS.
After comparing the existing feature selection methods in intrusion detection, we found that
finding an optimal feature set still needs further research.
Feature selection algorithms still need improvement in search strategy and evaluation criterion for
building efficient and lightweight intrusion detection systems.
The robustness of the extracted features can be enhanced by using an ensemble of feature selection
methods, combined with appropriate evaluation criteria.
After surveying these feature selection methods, we cannot say which method performs
best under which classifier for intrusion detection (to the best of our knowledge).
Most of the proposed methods work on two-class classification (normal and attack) (to the best
of our knowledge). Very little work has been done on multiple-class classification (five classes: four
classes of attack and one class of normal) [62][63]. Therefore, the research in many papers can be
extended in the future to multiple-class classification.
Classes in KDD Cup’99 are unbalanced in both the training and test sets, as can be seen in Table 1.
The Normal and DoS classes have enough instances, whereas Probe and R2L, and particularly U2R,
have few instances. These classes (Probe, R2L, U2R) have poor classification rates because of the small
number of instances in the training set [56][31][39]. Developing methods, combined with appropriate
evaluation criteria, to alleviate the small number of instances in the dataset is therefore a topic for
future research.
We can conclude that there are features that are really significant in classifying normal traffic and
attack types, as reported in the literature. Also, there is no single generic classifier that can best classify
all the attack types, as seen in this survey. Different researchers use different classifiers to evaluate the
feature set. This paper systematically summarized the contributions of each researcher and also outlined
a number of significant research problems in this field. We hope that this survey will provide readers with
useful insights, a broad overview and new research directions in this field.
REFERENCES
[1] Mitra, P. et al. (2002). Unsupervised Feature Selection Using Feature Similarity. IEEE
Transactions on Pattern Analysis and Machine Intelligence, 24, 301–312
[2] Anderson, J. P. (1980). Computer security threat monitoring and surveillance. Technical Report
98-17, James P. Anderson Co., Fort Washington, Pennsylvania, USA
[3] Denning, D. E. (1987). An intrusion detection model. IEEE Transactions on Software
Engineering, 13(2), 222-232
[4] Wu, S.X. & Banzhaf, W. (2010). The use of computational intelligence in intrusion detection
systems: A review. Applied Soft Computing Journal, 10, 1–35
[5] Lazarevic, A., Ertoz, L., Kumar V., Ozgur A. & Srivastava J. (2003). A comparative study of
anomaly detection schemes in network intrusion detection. In Proc. of the SIAM Conference on Data
Mining
[6] Kumar, S. & Spafford, E. H. (1994). A pattern matching model for misuse intrusion detection. In
Proceedings of the 17th National Computer Security Conference, 11-21
[7] Chebrolu, S. et al. (2005). Feature deduction and ensemble design of intrusion detection systems.
Computers & Security, 24(4), 295–307
[8] Yeung, D.Y. & Ding, Y. (2003). Host-based intrusion detection using dynamic and static
behavioral models. Pattern Recognition, 36, 229-243
[9] KDD Cup 1999 Intrusion detection dataset:
http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html
[10] Mukkamala, S. et al. (2005). Intrusion detection using an ensemble of intelligent paradigms.
Journal of Network and Computer Applications, 28(2), 167–82
[11] Lewis, P. M. (1962). The characteristic selection problem in recognition system. IRE
Transaction on Information Theory, 8, 171-178
[12] John, G.H. et al. (1994). Irrelevant Features and the Subset Selection Problem. Proc. of the 11th
Int. Conf. on Machine Learning, Morgan Kaufmann Publishers, 121-129
[13] Dash, M. & Liu, H. (1997). Feature Selection for Classification. Intelligent Data Analysis, 1(3),
131–56
[14] Blum, Avrim L. & Pat Langley (1997). Selection of relevant features and examples in machine
learning. Artificial Intelligence, 97(1-2), 245–271
[15] Dash, M. et al. (2002). Feature Selection for Clustering-a Filter Solution. Proc. 2nd Int’l Conf.
Data Mining, 115-122
[16] Włodzisław, W. Tomasz et al. (2003). Feature Selection and Ranking Filters.
[17] Das, S. (2001). Filters, Wrappers and a Boosting-Based Hybrid for Feature Selection. Proc. 18th
Int’l Conf. Machine Learning, 74-81
[18] Liu, H. & Yu, L. (2005). Towards integrating feature selection algorithms for classification and
clustering. IEEE Transactions on Knowledge and Data Engineering, 17(4), 491-502
[19] R. Kohavi and G.H. John (1997). Wrappers for Feature Subset Selection. Artificial Intelligence.
97 (1-2), 273-324
[20] Xing, E. et al. (2001). Feature Selection for High-Dimensional Genomic Microarray Data. Proc.
15th Int’l Conf. Machine Learning, 601-608
[21] Zhang, L. et al. (2004). Feature Selection for Pattern Classification Problems. Proceedings of
the Fourth International Conference on Computer and Information Technology (CIT’04)
[22] Kuchimanchi, Gopi K. et al. (2004). Dimension Reduction Using Feature Extraction Methods
for Real-time Misuse Detection Systems. Proceedings of the 2004 IEEE Workshop on Information
Assurance and Security United States Military Academy, West Point, NY, 195-202
[23] Zhu, Y. et al. (2005). Modified Genetic Algorithm based Feature Subset Selection in Intrusion
Detection System. Proceedings of ISCIT 2005, 9-12
[24] Zainal, A. et al. (2006). Feature selection using rough set in intrusion detection. In Proc. IEEE
TENCON, 1-4
[25] Wong, Wai-Tak & Lai, Cheng-Yang (2006). Identifying Important Features for Intrusion
Detection Using Discriminant Analysis and Support Vector Machine. Proceedings of the Fifth
International Conference on Machine Learning and Cybernetics, Dalian, 3563-3567
[26] Yang, L. et al. (2006). A Lightweight Intrusion Detection Model Based on Feature Selection and
Maximum Entropy Model. International Conference on Communication Technology (ICCT '06), 1-4
[27] Tamilarasan, A. et al. (2006). Feature Ranking and Selection for Intrusion Detection Using
Artificial Neural Networks and Statistical Methods. Int’l Joint Conf. on Neural Networks (IJCNN’06),
4754-4761
[28] Fadaeieslam, M. J. et al. (2007). Comparison of two feature selection methods in Intrusion
Detection Systems. Seventh International Conference on Computer and Information Technology, 83-86
[29] Sheen, Shina & Rajesh, R. (2008). Network Intrusion Detection using Feature Selection and
Decision tree classifier. IEEE Region 10 Conference, TENCON 2008, 1-4.
[30] Kiziloren, T. & Germen, E. (2009). Anomaly Detection with Self-Organizing Maps and Effects
of Principal Component Analysis on Feature Vectors. Fifth Int’l Conf. on Natural Computation, 509-513
[31] Suebsing, A. & Hiransakolwong, N. (2009). Feature Selection Using Euclidean Distance and
Cosine Similarity for Intrusion Detection Model. Asian Conf. on Intelligent Info. and Database Systems,
86-91
[32] Lee, S. M. et al. (2009). Quantitative Intrusion Intensity Assessment using Important Feature
Selection and Proximity Metrics. 15th IEEE Pacific Rim Int’l Symposium on Dependable Computing,
127-134
[33] Lee, Tsern-Huei & He, Jyun-De (2009). Entropy-Based Profiling of Network Traffic for
Detection of Security Attack. TENCON, 1-5
[34] Xiao, L. et al. (2009). A Two-step Feature Selection Algorithm Adapting to Intrusion Detection.
International Joint Conference on Artificial Intelligence, 618-622
[35] Kok-Chin Khor et al. (2009). From Feature Selection to Building of Bayesian Classifiers: A
Network Intrusion Detection Perspective. American Journal of Applied Sciences, 6 (11), 1948-1959
[36] Bahrololum, M. et al. (2009). Machine Learning Techniques for Feature Reduction in Intrusion
Detection Systems: A Comparison. Fourth International Conference on Computer Sciences and
Convergence Information Technology (ICCIT), 2009, Pp. 1091-1095.
[37] Nguyen, H. et al. (2010). Improving Effectiveness of Intrusion Detection by Correlation Feature
Selection. 2010 International Conference on Availability, Reliability and Security, 17-24
[38] Chen, T. et al. (2010). A Naive Feature Selection Method and Its Application in Network
Intrusion Detection. 2010 International Conference on Computational Intelligence and Security (CIS),
416-420.
[39] Fan, W. et al. (2011). Unsupervised Anomaly Intrusion Detection via Localized Bayesian
Feature Selection. 2011 11th IEEE International Conference on Data Mining, 1032-1937
[40] Xian, J. et al. (2011). An Algorithm Application in Intrusion Forensics Based on Improved
Information Gain. Web Society (SWS), 3rd Symposium on Date of Conference, 100-104
[41] Middlemiss, Melanie J. & Dick, G. (2003). Weighted Feature Extraction using a Genetic
Algorithm for Intrusion Detection, IEEE, 1669- 1675
[42] Mukkamala, S. & Sung, A. H. (2003). Feature Selection for Intrusion Detection Using Neural
Networks and Support Vector Machines. Journal of the Transportation Research Board of the National
Academics, Transportation Research Record No 1822, 33-39
[43] Gao, Hai-Hua et al. (2005). Ant Colony Optimization based network intrusion feature selection
and detection. Proc. of the Fourth Int’l Conf. on Machine Learning and Cybernetics, Guangzhou, 3871-
75
[44] Banković, Z. et al. (2007). Increasing Detection Rate of User-to-Root Attacks Using Genetic
Algorithms. Int’l Conf. on Emerging Security Information, Systems and Technologies, 48-53
[45] Chen, Y. et al. (2007). Toward Building Lightweight Intrusion Detection System Through
Modified RMHC and SVM. ICON, 83-88
[46] Chi-Ho Tsang et al. (2007). Genetic-fuzzy rule mining approach and evaluation of feature
selection techniques for anomaly intrusion detection. Pattern Recognition, 40, 2373-2391.
[47] Wang, W. & Gombault, S. (2008). Efficient Detection of DDoS Attacks with Important
Attributes. Third International Conference on Risks and Security of Internet and Systems: CRiSIS’2008,
61-67
[48] Li, Y. et al. (2009). Building lightweight intrusion detection system using wrapper-based feature
selection mechanisms. Computers and security, 28(6), 466–75
[49] Zulaiha, A.O. et al. (2010). Improving Signature Detection Classification Model Using Features
Selection based on Customized Features. 10th Int’l Conf. on Intelligent Systems Design and
Applications, 1026-31
[50] Gong, S. (2011). Feature Selection Method for Network Intrusion Based on GQPSO Attribute
Reduction. International Conference on Multimedia Technology (ICMT), 6365 - 6368
[51] Li, Y. et al. (2012). An efficient intrusion detection system based on support vector machines
and gradually feature removal method. Expert Systems with Applications, 39, 424–430
[52] Sindhu, Siva S. et al. (2012). Decision tree based light weight intrusion detection using a
wrapper approach. Expert Systems with Applications, 39, 129–141
[53] Wing, W.Y. NG et al. (2003). Dimensionality Reduction for Denial of Service Detection
Problems using RBFNN Output Sensitivity. Proc. of 2nd Int’l Conf. on Machine Learning and
Cybernetics, Wan, 1293-98
[54] Shazzad, K. M. & Park, J. S. (2005). Optimization of Intrusion Detection through Fast Hybrid
Feature Selection. Proc. of 6th Int’l Conf. on Parallel and Distributed Computing, Applications and
Technologies
[55] Chen, Y. et al. (2007). Building Lightweight Intrusion Detection System Based on Principal
Component Analysis and C4.5 Algorithm. ICACT2007, 2109-2112
[56] Lee, S. M. et al. (2007). A Hybrid Approach for Real-Time Network Intrusion Detection
Systems. International Conference on Computational Intelligence and Security, 712-715
[57] Wang, W. et al. (2008). Towards fast detecting intrusions: using key attributes of network traffic.
The Third International Conference on Internet Monitoring and Protection, 86-91
[58] Hong, D. & Haibo, L. (2009). A Lightweight Network Intrusion Detection Model Based on
Feature Selection. 15th IEEE Pacific Rim International Symposium on Dependable Computing, 165-168
[59] Xiang, C. et al. (2009). Robust Observation Selection for Intrusion detection. Sixth
International Conference on Fuzzy Systems and Knowledge Discovery, 269-272
[60] Zaman, S. & Karray, F. (2009). Features Selection for Intrusion Detection Systems Based on
Support Vector Machines. 6th IEEE Consumer Communications and Networking Conference (CCNC),
1- 8
[61] Ming-Yang Su (2011). Real-time anomaly detection systems for Denial-of-Service attacks by
weighted k-nearest-neighbor classifiers. Expert Systems with Applications, 38, 3492–3498
[62] Bruzzone, L. & Serpico, S. B. (2000). A technique for feature selection in multiclass problems.
International Journal of Remote Sensing, 21(3), 549–563
[63] Chidlovskii, B. & Lecerf, L. (2008). Scalable feature selection for multiclass problems. In Proc. of
the European conf. on machine learning and knowledge discovery in databases (ECML PKDD’08), 227

Feature Selection Methods in Intrusion Detection Systems Surveyed

A STUDY OF FEATURE SELECTION METHODS IN INTRUSION DETECTION SYSTEM: A SURVEY
AMRITA & P AHMED
Department of CSE, Sharda University, Greater Noida, India

ABSTRACT
Nowadays, detection of security threats, commonly referred to as intrusions, has become a critical issue in network, data and information security. An intrusion detection system (IDS) has therefore become an essential component of computer and network security. Prevention of such intrusions depends entirely on the detection capability of the IDS. As network speeds increase, there is an emerging need for IDSs to be lightweight while maintaining high detection rates. Therefore, many feature selection methods have been proposed in the literature. Approaches for selecting a good feature subset fall into three broad categories: filter, wrapper and hybrid. The aim of this paper is to present a survey of various feature selection methods for IDS on the KDD CUP’99 benchmark dataset, based on these three categories and different evaluation criteria.

KEYWORDS : Feature selection, intrusion detection systems, filter method, wrapper method, hybrid method.

INTRODUCTION
In the last three decades computer networks have grown drastically in size and complexity. This tremendous growth has posed challenging issues in network and information security, and detection of security threats, commonly referred to as intrusions, has become a critical issue. Security attacks can cause severe disruption to data and networks. Therefore, an Intrusion Detection System (IDS) has become an important part of every computer or network system. An IDS monitors computer or network traffic, identifies malicious activities that compromise the integrity, confidentiality, and availability of information resources, and alerts the system or network administrator to malicious attacks.

An IDS needs to examine very large volumes of high-dimensional data, even for a small network. As a result, an IDS faces the challenges of a low detection rate and heavy computation. Feature selection therefore plays a key role in achieving maximal performance in intrusion detection. It is one of the most important and frequently used data preprocessing techniques for selecting a subset of relevant features with which to build a robust IDS. Feature selection is the selection of the minimal-cardinality subset of the original feature set that retains the high detection accuracy of the original feature set [1]. An efficient feature subset can reduce training and testing time, which helps to build a lightweight IDS that guarantees high detection rates and is suitable for real-time and on-line detection of attacks.

International Journal of Computer Science Engineering and Information Technology Research (IJCSEITR) ISSN 2249-6831 Vol.2, Issue 3, Sep 2012 1-25 © TJPRC Pvt. Ltd.,
This survey paper categorizes the feature selection algorithms that have been developed for building IDSs, critically evaluates their usefulness, and recommends ways of enhancing the quality of feature selection algorithms. The paper is organized as follows. Intrusion detection systems are reviewed in Section 2. Section 3 gives the details of the datasets and performance evaluation used in this survey. In Section 4, different methodologies of feature selection in IDSs are discussed. Related research on feature selection methods, together with their performance, is addressed in Section 5. Section 6 summarizes the results reported in the literature in tabular form. Section 7 concludes and discusses future research.

INTRUSION DETECTION SYSTEM
An intrusion is defined as an attempt to compromise the confidentiality, integrity or availability of a computer system or network, to make unauthorized use of its resources, or to bypass its security mechanisms. James P. Anderson introduced Intrusion Detection (ID) in the early 1980s [2], and Dorothy Denning proposed several models for IDS in 1987 [3]. Ideally, intrusion detection should be an intelligent process of monitoring events occurring in a system and analyzing them for violations of security policy. An IDS is required to have a high attack Detection Rate (DR) with a low False Alarm Rate (FAR). Refer to [4] for the organization of a generalized IDS.

Based on the detection approach, IDSs are classified as anomaly based or misuse based. In the anomaly-based approach [5], the system first learns the normal behavior or activity of the system or network and then flags deviations from it as intrusions. In the misuse- or signature-based approach [6], the system first defines the attack and the characteristics that distinguish it from normal data or traffic, and uses them to detect the intrusion.

Based on the location of monitoring, IDSs are classified as Network-based Intrusion Detection Systems (NIDS) [7] and Host-based Intrusion Detection Systems (HIDS) [8]. A NIDS detects intrusions by monitoring network traffic in terms of IP packets. A HIDS is installed locally on a host machine and detects intrusions by examining system calls, application logs, file system modifications and other host activities made by each user on that machine.

DATASETS AND PERFORMANCE EVALUATION
This section summarizes the popular benchmark datasets and performance evaluation measures used in the intrusion detection domain to evaluate different feature selection methods for intrusion detection systems.
DATASETS
The KDD CUP 1999 [9] benchmark datasets are used to evaluate different feature selection methods for IDS. They consist of 4,940,000 connection records in the training set and 311,029 connection records in the test set. The training set contains 24 attack types and the test set contains 38 attack types. Since the training and test sets are prohibitively large, a 10% subset of the KDD Cup’99 dataset is frequently used instead [9]. Each connection is labeled either as normal or as exactly one specific attack type, which falls into one of four attack categories [10]: Denial of Service (DoS), User to Root (U2R), Remote to Local (R2L) and Probing. Each connection record consists of 41 features, numbered in order 1, 2, 3, ..., 41 and listed in Table 1, which fall into four categories:

Category 1 (1-9) : Basic features of individual TCP connections
Category 2 (10-22) : Content features within a connection suggested by domain knowledge
Category 3 (23-31) : Traffic features computed using a two-second time window
Category 4 (32-41) : Traffic features computed using a two-second time window from destination to host

Table 1: List of features in the KDD Cup 99 dataset

Feature #  Feature Name        Feature #  Feature Name        Feature #  Feature Name
1          Duration            15         Su-attempted        29         Same-srv-rate
2          Protocol-type       16         Num-root            30         Diff-srv-rate
3          Service             17         Num-file-creations  31         Srv-diff-host-rate
4          Flag                18         Num-shells          32         Dst-host-count
5          Src-bytes           19         Num-access-files    33         Dst-host-srv-count
6          Dst-bytes           20         Num-outbound-cmds   34         Dst-host-same-srv-rate
7          Land                21         Is-host-login       35         Dst-host-diff-srv-rate
8          Wrong-fragment      22         Is-guest-login      36         Dst-host-same-src-port-rate
9          Urgent              23         Count               37         Dst-host-srv-diff-host-rate
10         Hot                 24         Srv-count           38         Dst-host-serror-rate
11         Num-failed-logins   25         Serror-rate         39         Dst-host-srv-serror-rate
12         Logged-in           26         Srv-serror-rate     40         Dst-host-rerror-rate
13         Num-compromised     27         Rerror-rate         41         Dst-host-srv-rerror-rate
14         Root-shell          28         Srv-rerror-rate

Performance Evaluation
The effectiveness of an IDS is evaluated by its ability to make correct predictions. Comparing the real nature of a given event with the prediction from the IDS gives four possible outcomes, shown in Table 2, known as the confusion matrix [4]. True Positive Rate (TPR) or Detection Rate (DR), True Negative Rate (TNR), False Positive Rate (FPR) or False Alarm Rate (FAR), and False Negative Rate (FNR) are measures that can be applied to quantify the performance of IDSs [4] based on this confusion matrix. In terms of the entries of Table 2, DR = TP / (TP + FN), TNR = TN / (TN + FP), FAR = FP / (FP + TN) and FNR = FN / (FN + TP).

Table 2. Confusion Matrix

                                   Predicted
                                   Negative Class (Normal)   Positive Class (Attack)
Actual   Negative Class (Normal)   True Negative (TN)        False Positive (FP)
         Positive Class (Attack)   False Negative (FN)       True Positive (TP)

FEATURE SELECTION
Real-time intrusion detection is nearly impossible due to the huge amount of data flowing on the Internet. Feature selection can reduce the computation and model complexity. Research on feature selection started in the early 1960s [11]. Feature selection is a technique for selecting a subset of relevant features by removing most irrelevant and redundant features [12] from the data in order to build robust learning models [13].

Process of Feature Selection
A typical feature selection method involves four basic steps [13], shown in Figure 1: a generation procedure to generate the next candidate subset; an evaluation function to evaluate the subset under examination; a stopping criterion to decide when to stop; and a validation procedure to check whether the subset is valid. Figure 1 illustrates the feature selection process used to determine and validate the best feature subset.

Figure 1 : Feature selection process with validation [13].
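
The four steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a method taken from any of the surveyed papers: it assumes greedy sequential forward selection as the generation procedure, and the names forward_select, toy_score and min_gain are hypothetical. The evaluation function is passed in as a parameter, so it could be a filter-style statistic or a wrapper-style classifier score; the toy function here simply pretends that features 3, 5 and 23 of Table 1 are the useful ones.

```python
from typing import Callable, FrozenSet, Iterable, Tuple


def forward_select(
    features: Iterable[int],
    evaluate: Callable[[FrozenSet[int]], float],
    min_gain: float = 1e-9,
) -> Tuple[FrozenSet[int], float]:
    """Greedy sequential forward selection over feature numbers."""
    remaining = set(features)
    selected: FrozenSet[int] = frozenset()
    best_score = evaluate(selected)
    while remaining:
        # Generation: each candidate adds one unused feature to the current subset.
        candidates = [selected | {f} for f in sorted(remaining)]
        # Evaluation: score every candidate subset.
        scored = [(evaluate(c), c) for c in candidates]
        top_score, top_subset = max(scored, key=lambda pair: pair[0])
        # Stopping criterion: stop when no candidate improves the score enough.
        if top_score - best_score < min_gain:
            break
        remaining -= top_subset - selected
        selected, best_score = top_subset, top_score
    return selected, best_score


def toy_score(subset: FrozenSet[int]) -> float:
    """Hypothetical evaluation: rewards features 3, 5 and 23, penalizes subset size."""
    useful = {3: 0.5, 5: 0.3, 23: 0.1}
    return sum(useful.get(f, 0.0) for f in subset) - 0.01 * len(subset)


if __name__ == "__main__":
    subset, score = forward_select(range(1, 42), toy_score)
    # Validation is not part of the loop: the chosen subset would normally be
    # checked by training and testing a classifier on held-out data.
    print(sorted(subset), round(score, 2))
```

Passing the evaluation function as an argument keeps the search loop independent of how subsets are scored, which is the main design difference between the filter and wrapper styles of evaluation.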
METHODS FOR FEATURE SELECTION
Blum and Langley [14] divide feature selection methods into three categories: filter, wrapper and hybrid (embedded) methods. These methods are currently used in intrusion detection. The filter method [15][16] selects feature subsets based on the general characteristics of the data and is independent of the classification algorithm. A filter algorithm [18] uses an external learning algorithm to evaluate the performance of the selected features. The wrapper method [19] “wraps around” the learning algorithm: it uses one predetermined classifier to evaluate features or feature subsets. A wrapper algorithm [18] uses a search algorithm to search through the space of possible features and evaluates each subset by running a model on it. Many feature subsets are evaluated based on classification performance and the best one is selected. This method is more computationally expensive than the filter method [17][19]. The hybrid method [17][20] combines the wrapper and filter approaches to achieve the best possible performance with a particular learning algorithm. In a hybrid algorithm [18], more efficient search strategies and evaluation criteria are needed for feature selection with large dimensionality to achieve a time complexity similar to that of filter algorithms. These methods are discussed in detail in Section 5 and summarized in Section 6.

RELATED WORKS
In this section, we thoroughly discuss the different feature selection methods used in intrusion detection, grouped into filter, wrapper and hybrid methods. For each method we report the number of features selected, the feature numbers (according to Table 1), the performance on the KDD Cup’99 dataset, and the strengths, limitations and future work reported in the literature.

Filter Method
A feature selection algorithm, FSMDB, based on the DB index criterion is proposed in [21] (Zhang et al., 2004). 
The criterion function is constructed according to the characteristics of the DB index criterion. 24 features {feature no.: 6, 5, 1, 34, 33, 36, 32, 8, 27, 29, 28, 30, 26, 38, 39, 35, 13, 24, 23, 11, 3, 10, 12 and 4} are selected and tested using two classifiers, a BP network and an SVM. The classification accuracies reported for the FSMDB algorithm with the BP network and the SVM are 0.1017 and 0.056 respectively. This method can be used for supervised or unsupervised classification problems but has high computational complexity in unsupervised learning mode. Future Work: To find a better approach to reduce the high computational complexity in unsupervised learning mode.

Two neural network methods, (1) neural network principal component analysis (NNPCA) and (2) nonlinear component analysis (NLCA), are presented in [22] (Kuchimanchi et al., 2004). The numbers of significant features extracted by PCA, NNPCA and NLCA are 19, 19 and 12 respectively. The first 19 selected features based on the results of the Scree test and the critical eigenvalues test are {feature no.: 5, 6, 1, 22, 21, 31, 30, 3, 4, 2, 16, 10, 13, 34, 32, 27, 24, 37, 23 and 36}. The performance of the non-linear classifier (NC) and the CART decision tree classifier (DC) is tested on four datasets (Table 3). DC has
  • 6. relatively high detection accuracies and low false positive rates. Future Work: This work can be extended with quantitative measures to find optimal combinations of classifiers and feature extractors for IDS.

Table 3 : False Positive Rates (FPR) and Detection Accuracies (DA) for NC and DC on the four datasets
Dataset | #Features | FPR (NC) | FPR (DC) | DA (NC) | DA (DC)
ORIGDATA | 41 | 8.2821 | 0.2268 | 99.0198 | 99.9428
PCADATA | 19 | 29.4105 | 0.2609 | 99.1161 | 99.9167
NNPCADATA | 19 | 50.5463 | 0.4922 | 98.8206 | 99.7516
NLDATA | 12 | 51.2756 | 0.8227 | 97.2306 | 99.6359

RICGA (ReliefF Immune Clonal Genetic Algorithm), a combined feature subset selection method based on the ReliefF algorithm, the Immune Clonal selection algorithm and GA, is proposed in [23] (Zhu et al., 2005). A BP network is used as the classifier. RICGA has higher classification accuracy (86.47%) for small feature subsets (8) than ReliefF-GA. The selected features are not mentioned in the paper.

[24] (Zainal et al., 2006) investigated the effectiveness of Rough Set (RS) theory in identifying important features and also used it as a classifier. The 6 significant features obtained are {feature no.: 41, 32, 24, 4, 5 and 3}. Classification results obtained by Rough Set are compared with Multivariate Adaptive Regression Splines (MARS), Support Vector Decision Function (SVDF) and Linear Genetic Programming (LGP). The classification accuracy of RS ranked second for the normal category and was almost the same as that of MARS and SVDF for the attack category. Future Work: This work can be extended in terms of accuracy by focusing on fusion of classifiers after an optimum feature subset is obtained.

Wong and Lai (2006) [25] combined Discriminant Analysis (DA) and Support Vector Machine (SVM) to detect intrusions for an anomaly-based network IDS. Nine features (feature no.: 12, 23, 32, 2, 24, 36, 31, 29 and 39) are extracted by Discriminant Analysis and evaluated by SVM.
The TN (%), FP (%), FN (%) and TP (%) of the proposed method are 99.58%, 0.42%, 9.93% and 90.07% respectively. Future Work: Multiple Discriminant Analysis (MDA) can be applied to find the optimal feature set for each type of attack.

Li et al. (2006) [26] proposed a lightweight intrusion detection model. Information Gain and Chi-Square approaches are used to extract important features, and a classic Maximum Entropy (ME) model is used to learn and detect intrusions. The top 12 important features selected by both methods are {feature no.: 3, 5, 6, 10, 13, 23, 24, 27, 28, 37, 40 and 41}. Experimental results are shown in Table 4. Future Work: This model can be applied in a realistic environment to verify its real-time performance and effectiveness.
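To make the Information Gain ranking used by filter methods such as [26] concrete, the sketch below computes the information gain of discrete features with respect to the class label and ranks them. It is an assumption-laden illustration: the four toy records, the feature names and the helper names (`entropy`, `information_gain`, `rank_by_information_gain`) are made up, and the surveyed papers may discretise and rank features differently. Chi-Square ranking is not shown.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(values: Sequence[Hashable],
                     labels: Sequence[Hashable]) -> float:
    """Entropy of the labels minus their entropy conditioned on one feature."""
    groups: Dict[Hashable, List[Hashable]] = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    total = len(labels)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_by_information_gain(columns: Dict[str, Sequence[Hashable]],
                             labels: Sequence[Hashable]) -> List[str]:
    """Feature names sorted from most to least informative."""
    return sorted(columns,
                  key=lambda name: information_gain(columns[name], labels),
                  reverse=True)


# Toy records, made up for illustration (not taken from KDD Cup'99).
labels = ["normal", "normal", "attack", "attack"]
columns = {
    "flag": ["SF", "S0", "SF", "S0"],                # unrelated to the label
    "protocol_type": ["tcp", "tcp", "icmp", "icmp"]  # separates the classes
}

print(rank_by_information_gain(columns, labels))
```

A filter method would keep the top-ranked features (12 in [26]) and pass only those to the classifier.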
  • 7. Table 4. Detection Results
Class | Testing Time (s), all 41 features | Acc. (%), all 41 features | Testing Time (s), selected features | Acc. (%), selected features
Normal | 1.28 | 99.75 | 0.78 | 99.73
Probe | 2.09 | 99.8 | 1.25 | 99.76
DoS | 1.93 | 100 | 1.03 | 100
U2R | 1.05 | 99.89 | 0.7 | 99.87
R2L | 1.02 | 99.78 | 0.68 | 99.75

Tamilarasan et al. (2006) [27] applied different feature selection and ranking methods to the KDD Cup’99 dataset. Chi-Square analysis, logistic regression, normal distribution and beta distribution experiments are performed for feature selection. The 25 most significant features ranked by the Chi-Square test are {feature no.: 35, 27, 41, 28, 40, 30, 34, 3, 33, 12, 37, 24, 29, 2, 13, 8, 36, 10, 26, 39, 22, 25, 5, 1, 38}. Experiments are performed for normal, probe, DoS, U2R and R2L using a resilient back propagation neural network. The overall classification accuracy is 97.04% with an FPR of 2.76% and an FNR of 0.20%.

Fadaeieslam et al. (2007) [28] proposed a feature selection method based on Decision Dependent Correlation (DDC). The mutual information between each feature and the decision is calculated, and the top 20 important features {feature no.: 3, 5, 40, 24, 2, 10, 41, 36, 8, 13, 27, 28, 22, 11, 14, 17, 18, 7, 9 and 15} are selected and evaluated by an SVM classifier. The classification accuracy is 93.46%, which outperforms Principal Component Analysis (PCA).

Shina Sheen and R Rajesh (2008) [29] considered three methods for feature selection: Chi Square, Information Gain and ReliefF. The top 20 features {feature no.: 2, 3, 4, 5, 12, 22, 23, 24, 27, 28, 30, 31, 32, 33, 34, 35, 37, 38, 40 and 41} are selected and evaluated using a decision tree (C4.5). The classification accuracies of Chi Square, Info Gain and ReliefF are 95.8506%, 95.8506% and 95.6432% respectively.
In [30] (Kiziloren and German, 2009), Principal Component Analysis (PCA) is used for feature selection to increase the quality of the extracted feature vectors, and a Self-Organizing Map (SOM) is used as a classifier to detect network anomalies. The highest success rate of the system, 98.83%, is obtained when the feature vector size equals 10. The selected features are not mentioned in the paper. The average success rate of the system without PCA is 97.76%. PCA provides faster classification, which is important for a real-time system.

Suebsing and Hiransakolwong (2009) [31] proposed a combination of Euclidean Distance and Cosine Similarity to select robust feature subsets of smaller size. To build a model, Euclidean Distance is used to select the features for detecting known attacks and Cosine Similarity is used to select the features for detecting unknown attacks. The known detection method extracts 30 important features {feature no. : 1, 2, 12, 25, 26, 27, 28, 30, 31, 35, 37, 38, 39, 40, 41, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, 19
  • 8. and 22}. The unknown detection method extracts 24 important features {feature no.: 1, 2, 12, 25, 26, 27, 28, 30, 31, 35, 37, 38, 39, 40, 41, 3, 4, 23, 24, 29, 32, 33, 34 and 36}. 15 features {feature no.: 1, 2, 12, 25, 26, 27, 28, 30, 31, 35, 37, 38, 39, 40 and 41} are selected by both methods. The C5.0 method is used as the classifier. The experimental results are shown in Table 5.

Table 5: Results for known and unknown attacks
Parameter | Known attack, full set (41) | Known attack, known detection method (30) | Unknown attack, full set (41) | Unknown attack, unknown detection method (24)
Overall TP % | 97.95 | 98.12 | 53.31 | 68.28
Overall FP % | 2.04 | 1.87 | 46.69 | 31.72
Time to Build Model (s) | 75 | 51 | 75 | 45

A new approach named Quantitative Intrusion Intensity Assessment (QIIA) is proposed in [32] (Lee et al., 2009). QIIA evaluates the proximity of each instance of audit data using proximity metrics based on Random Forests (RF), and selects important features using the numerical feature importance of RF. Two approaches, QIIA1 and QIIA2, are proposed to determine the values of the threshold parameters. The top 5 important features selected are {feature no.: 23, 32, 10, 6 and 3}. Only DoS attacks are used, since the other attack types have very few instances. The experimental results show that the detection rates (DR) of QIIA1 and QIIA2 are 97.94 and 99.37 respectively.

An entropy-based traffic profiling scheme for detecting security attacks is presented in [33] (Lee and He, 2009). The paper focuses only on denial-of-service (DoS) attacks. The top six features ranked by accuracy are {feature no.: 5, 6, 31, 32, 36 and 37}. The true positive rate (TPR) of this scheme is 91%.

[34] (Xiao et al., 2009) presented a two-step feature selection algorithm. It eliminates two kinds of features: irrelevant features in the first step and redundant features in the second step.
21 features {feature no.: 1, 3, 4, 5, 6, 8, 11, 12, 13, 23, 25, 26, 27, 28, 29, 30, 32, 33, 34, 36 and 39} are selected and evaluated using the C4.5 algorithm and Support Vector Machine (SVM). The Detection Rate (%), False Alarm Rate (%) and Processing Time with the selected features (all features) are 86.3 (87.0), 1.89 (1.85) and 15.163 sec (21.891 sec) respectively.

A novel approach for selecting features and comparing the performance of various BN classifiers is proposed in [35] (Khor et al., 2009). Two feature selection algorithms, Correlation-based Feature Selection Subset Evaluator (CFSE) and Consistency Subset Evaluator (CSE), together with domain experts, are used to form the proposed feature set. This feature set contains 7 features: {feature no.: 3, 6, 12, 23, 32*, 14* and 40*}. A Bayesian Network (BN) is employed as the classifier. The classification accuracies of the BN for the Normal, DoS, Probe, R2L and U2R types are 99.8%, 99.9%, 89.4%, 91.5% and 69.2% respectively.

*: Features that were selected based on domain knowledge.
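For readers unfamiliar with correlation-based feature selection, the sketch below shows the subset merit heuristic commonly associated with CFS (Hall's formulation), which rewards features that correlate with the class and penalises features that correlate with each other. It is given as background only: the survey does not state which exact variant [35] or [37] used, and the function name `cfs_merit` and the example correlation values are assumptions.

```python
import math


def cfs_merit(k: int,
              mean_feature_class_corr: float,
              mean_feature_feature_corr: float) -> float:
    """CFS-style merit of a subset of k features.

    merit = k * r_cf / sqrt(k + k * (k - 1) * r_ff)
    where r_cf is the mean feature-class correlation and r_ff is the
    mean feature-feature correlation within the subset.
    """
    if k < 1:
        raise ValueError("the subset must contain at least one feature")
    denominator = math.sqrt(k + k * (k - 1) * mean_feature_feature_corr)
    return k * mean_feature_class_corr / denominator


# Hypothetical correlation values, chosen only to show the effect of redundancy.
independent = cfs_merit(4, 0.5, 0.0)  # four uncorrelated features
redundant = cfs_merit(4, 0.5, 1.0)    # four copies of the same feature
print(independent, redundant)
```

With equal relevance, the redundant subset scores no better than a single one of its features, which is why a CFS search tends to return small subsets.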
  • 9. Bahrololum et al. (2009) [36] used three machine learning methods for feature selection: Decision Tree (DT), Flexible Neural Tree (FNT) and Particle Swarm Optimization (PSO). The five important features {feature no.: 10, 17, 14, 13 and 11} are selected according to the contribution of the variables to the construction of the decision tree. The experimental results are shown in Table 6.

Table 6 : Detection Performance using DT, FNT and PSO Methods
Attack Class | DT | FNT | PSO
Normal | 9.96% | 99.19% | 95.69%
DoS | 100% | 98.75% | 90.41%
R2L | 99.02% | 99.09% | 98.10%
U2R | 88.33% | 99.70% | 100%
Probe | 99.66% | 98.39% | 95.53%

An automatic feature selection method based on the filter approach is proposed by Nguyen et al. (2010) [37]. The globally optimal subset of relevant features is found by Correlation Feature Selection (CFS) and evaluated by C4.5 and BayesNet. The numbers of selected features are 3 for Normal&DoS {5, 6 and 12}; 6 for Normal&Probe {5, 6, 12, 29, 37 and 41}; 1 for Normal&U2R {14}; and 2 for Normal&R2L {10 and 22}. The average classification accuracies of C4.5 and BayesNet are 99.41% and 98.82% respectively.

Chen et al. (2010) [38] proposed a novel inconsistency-based feature selection method. Data consistency is applied to find the optimal features, which are evaluated by the decision tree method (C4.5). The proposed method is compared with CFS (Table 7).
Table 7 : Performance Comparison (CC: Classification Correctness)
Attack Type | All features, CC (%) | All features, Time (s) | Proposed method, Features | Proposed method, CC (%) | Proposed method, Time (s) | CFS method, Features | CFS method, CC (%) | CFS method, Time (s)
Probe | 99.85 | 0.66 | 4 (3,5,35,36) | 99.77 | 0.16 | 4 (5,6,25,37) | 94.35 | 0.27
DoS | 99.94 | 1.08 | 4 (3,4,10,23) | 99.81 | 0.22 | 4 (2,5,16,22) | 99.32 | 0.33
U2R | 100 | 0.11 | 2 (3,41) | 100 | 0.09 | 9 (3,10,24,29,31,32,33,34,40) | 100 | 0.08
R2L | 98.99 | 0.22 | 5 (3,5,12,32,35) | 99.13 | 9.13 | 5 (3,5,10,24,33) | 98.05 | 0.11
All | 99.5 | 3.72 | 8 (1,3,5,25,32,34,36,40) | 99.45 | 0.48 | 11 (2,3,4,5,6,10,23,24,25,36,37) | 99.67 | 6.28

A novel unsupervised statistical approach, varGDLF, a variational framework for the GD mixture model with localized feature selection (GDLF), is proposed in [39] (Fan et al., 2011) for detecting network-based attacks. Eleven features {feature no.: 1, 5, 12, 15, 18, 21, 22, 29, 33, 38 and 41} are selected. The performance of the varGDLF approach is compared with four other variational mixture models
  • 10. and it outperforms them, with the highest accuracy rate (85.2%), the lowest FP rate (7.3%) and the most accurately detected number of components (4.95). The accuracy rates for Normal, DoS, R2L, U2R and Probing are 99.5%, 96.5%, 75.4%, 69.6% and 85.1% respectively. The FP rates are 11.5%, 0.8%, 1.4%, 11.5% and 11.3% respectively.

An improved information gain (IIG) algorithm based on feature redundancy is proposed in [40] (Xian et al., 2011). Twenty-two features are selected by the Information Gain (IG) algorithm, and 12 features {feature no.: 2, 3, 5, 6, 8, 10, 12, 23, 25, 36, 37 and 38} are then selected by IIG. Naive Bayes (NB) is used to carry out the experiment on three feature sets: the original feature set (41 features), feature subset 1 (22 features) and feature subset 2 (12 features). The processing times (s) for the three feature sets are 8.34, 4.16 and 2.08; the Detection Rates (DR) (%) are 96.187, 96.407 and 96.801; and the False Positive Rates (FPR) (%) are 5.22, 2.58 and 1.02 respectively.

WRAPPER METHOD

In [41] (Middlemiss and Dick, 2003), a simple Genetic Algorithm (GA) is used to evolve weights for the features, and a k-nearest neighbour (KNN) classifier is used both as the fitness function of the GA and as the classifier. The top five ranked features for each class are selected {DoS: 23, 29, 1, 11, 24; R2L: 24, 3, 12, 23, 36; U2R: 24, 6, 31, 41, 17; Probe: 2, 37, 30, 3, 6}. The results indicate an increase in intrusion detection accuracy.

Mukkamala and Sung (2003) [42] presented two methods to rank the important features: (1) the Performance-Based Ranking Method (PBRM) and (2) the Support Vector Decision Function Ranking Method (SVDFRM). With PBRM, the union of the important features for each of the 5 classes contains 31 features. With SVDFRM, the union of the important features for each of the 5 classes contains 23 features. The 8 important features identified by both ranking methods are {feature no.: 1, 3, 5, 6, 23, 24, 32 and 33}.
Experiments are performed with both methods using an SVM classifier (Table 8). Future Work: Ongoing experiments include 23-class (22 attack classes plus normal) feature identification using SVMs.

Table 8 : Performance of SVMs
Class | PBRM (31), Training Time (s) | PBRM (31), Testing Time (s) | PBRM (31), Acc. (%) | SVDFRM (23), Training Time (s) | SVDFRM (23), Testing Time (s) | SVDFRM (23), Acc. (%)
Normal | 7.67 | 1.02 | 99.51 | 4.85 | 0.82 | 99.55
Probe | 44.38 | 2.07 | 99.67 | 36.23 | 1.4 | 99.71
DoS | 18.64 | 1.41 | 99.22 | 7.77 | 1.32 | 99.2
U2R | 3.23 | 0.98 | 99.87 | 1.72 | 0.75 | 99.87
R2L | 9.81 | 1.01 | 99.78 | 5.91 | 0.88 | 99.78
  • 11. An Ant Colony Optimization (ACO) based intrusion feature selection algorithm is proposed in [43] (Gao et al., 2005). The Fisher discrimination rate is adopted as the heuristic information for the ants’ traversal. A least-squares SVM classifier is adopted as the base classifier to evaluate the generated feature subsets. The number of features selected by the ACO-SVM method is 11 for Probe, 9 for DoS, and 14 for U2R & R2L. The feature names are not mentioned in the paper. Table 9 shows the experimental results.

Table 9: Performance of ACO-SVM
Type | #Features | Correct Classification Rate | False Positive Rate | Average Detection Time
Probe | 11 | 99.40% | 0.35% | 0.074
DoS | 9 | 95.20% | 3.24% | 0.031
U2R&R2L | 14 | 98.70% | 1.60% | 0.078

[44] (Banković et al., 2007) investigated the possibility of increasing the detection rate (DR) of U2R attacks in misuse detection. The features extracted using Principal Component Analysis (PCA) and Multi Expression Programming (MEP) are {U2R: 14, 33; DoS: 1, 5, 39; Normal: 3, 10, 12}. A genetic algorithm is employed to implement rules for detecting various types of attacks. Two additional rule sets are deployed to re-check the decision of the rule set for detecting U2R attacks. The experiments (Table 10) show that this system outperforms the best-performing model reported in the literature.

Table 10. Performance of the System
#Rules | DR, Total System | DR, U2R Rule System | FPR, Total System | FPR, U2R Rule System
50 | 50 | 46.3 | 0.0055 | 0.007
75 | 77.8 | 77.8 | 7.2 | 10.2
100 | 100 | 100 | 16.54 | 27.4

Chen et al. (2007) [45] presented a wrapper-based feature selection method. A random search method named modified random mutation hill climbing (MRMHC) is introduced as the search strategy to select feature subsets, with Support Vector Machines (SVMs) as the classifier. The experimental results are shown in Table 11. Future Work: The search strategy and evaluation criterion of this method can be improved.
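The wrapper search idea behind [45] can be illustrated with plain random mutation hill climbing over feature subsets. This is a sketch under stated assumptions, not the modified variant (MRMHC) of the paper, whose details the survey does not give. The names `rmhc_select`, `toy_fitness` and `USEFUL` are hypothetical, and `toy_fitness` stands in for what a real wrapper would use: the accuracy of a classifier (an SVM in [45]) trained on the candidate subset.

```python
import random
from typing import Callable, FrozenSet, Sequence, Tuple

Subset = FrozenSet[int]


def rmhc_select(features: Sequence[int],
                fitness: Callable[[Subset], float],
                iterations: int = 200,
                start: Subset = frozenset(),
                seed: int = 0) -> Tuple[Subset, float]:
    """Plain random mutation hill climbing over feature subsets."""
    rng = random.Random(seed)
    pool = list(features)
    current, current_score = start, fitness(start)
    for _ in range(iterations):
        # Mutation: flip the membership of one randomly chosen feature.
        flipped = rng.choice(pool)
        candidate = current.symmetric_difference({flipped})
        candidate_score = fitness(candidate)
        # Hill climbing: keep the mutant only if it is at least as good.
        if candidate_score >= current_score:
            current, current_score = candidate, candidate_score
    return current, current_score


# Hypothetical set of informative features, made up for illustration.
USEFUL = frozenset({3, 5, 23})


def toy_fitness(subset: Subset) -> float:
    # Stand-in for classifier accuracy: reward informative features,
    # penalise uninformative ones.
    return len(subset & USEFUL) - 0.1 * len(subset - USEFUL)


subset, score = rmhc_select([3, 5, 23, 33, 34], toy_fitness)
print(sorted(subset), score)
```

Every iteration costs one fitness evaluation, which in a real wrapper means training and testing a classifier. That cost is why the selection times reported for wrapper methods are measured in hours while filter rankings take seconds.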
  • 12. Table 11: Selected feature subsets, time of the selecting process for different feature selection algorithms, and average time of the building and testing processes for ALL attacks, DOS, PROBE, R2L and U2R
Attack Type | ALL | DOS | PROBE | R2L | U2R
#Features | 5 | 4 | 5 | 3 | 5
Selected features | 3,5,23,33,34 | 5,12,23,34 | 1,3,5,23,37 | 1,5,6 | 1,3,6,14,33
Time of selecting process (h), GA-SVMs | 1.3 | 0.5 | 4 | 1.5 | 1.5
Time of selecting process (h), MRMHC-SVMs | 0.4 | 0.2 | 2.2 | 0.8 | 0.6
Avg. time of building process (s), all features | 78 | 136 | 245 | 317 | 193
Avg. time of building process (s), selected features | 30 | 31 | 96 | 24 | 78
Avg. time of testing process (s), all features | 18 | 22 | 49 | 55 | 50
Avg. time of testing process (s), selected features | 6 | 5 | 17 | 7 | 15

A multi-objective genetic fuzzy intrusion detection system (MOGFIDS) is proposed by Tsang et al. (2007) [46]. MOGFIDS is used as a genetic wrapper to search for a near-optimal feature subset. The 27 features selected by MOGFIDS are {feature no.: 2 (tcp, udp, icmp), 5, 6, 7, 8, 9, 11, 12, 13, 14, 17, 18, 22, 23, 25, 30, 32, 33, 34, 35, 36, 37, 38, 39 and 40}. MOGFIDS has the second highest accuracy (99.24%) and the lowest FPR (1.1%) among the wrappers compared in the paper. Future Work: The approach can be applied to other complex problem domains such as face recognition and DNA computing.

[47] (Wang and Gombault, 2008) proposed a system that extracts important features from raw network traffic for DDoS attacks only, in real computer networks. The first 9 important features by rank {feature no.: 23, 32, 37, 33, 5, 24, 31, 39 and 3} are selected by the Information Gain and Chi-Square methods and evaluated by Bayesian Networks (BN) and decision trees (C4.5), as shown in Table 12. Future Work: A practical real-time system for fast detection of DDoS attacks can be developed.

Table 12: Detection Rate, False Positive Rate and Construction Time Results
Evaluation Criteria | Method | 9 features | 41 features
DR | C4.5 | 99.8 | 99.8
DR | BN | 99.6 | 99.0
FPR | C4.5 | 0.3 | 0.3
FPR | BN | 1.6 | 1.5
Features Construction Time | - | 237 (s) | 2043 (s)
Training Time (s) | C4.5 | 1.7 | 15.3
Training Time (s) | BN | 0.7 | 4.4
Testing Time (s) | C4.5 | 0.2 | 0.9
Testing Time (s) | BN | 0.2 | 0.9

Li et al. (2009) [48] proposed a wrapper-based feature selection method to build a lightweight intrusion detection system. A modified Random Mutation Hill Climbing (RMHC) method is applied as the search strategy to find a candidate feature subset, and a modified linear Support Vector Machine (SVM) is used to evaluate the candidate feature subset. A classification algorithm based on a decision tree whose nodes consist of linear SVMs is used to build the IDS from the selected feature subsets. The experiments show
  • 13. that the systems built with the selected features have higher ROC (Receiver Operating Characteristic) scores than those built with all 41 features in terms of detecting known attacks, detecting new attacks and computational cost (Table 13).

Table 13 : Selected feature subsets and average time of the building and testing processes with all and selected features for ALL attacks, DOS, PROBE, R2L and U2R
Attack Type | Features | Building time (s), all features | Building time (s), selected features | Testing time (s), all features | Testing time (s), selected features
ALL | 4 (3,5,23,32) | 78 | 36 | 18 | 8
DOS | 4 (2,5,23,34) | 136 | 41 | 22 | 9
PROBE | 6 (1,3,5,6,23,35) | 245 | 123 | 49 | 29
R2L | 3 (1,3,5) | 317 | 35 | 55 | 8
U2R | 5 (1,3,5,14,32) | 193 | 85 | 50 | 18

[49] (Ali et al., 2010) improves the accuracy of the Signature Detection Classification (SDC) model by applying feature extraction based on customized features. Features are extracted from the customized features using GA (Genetic Algorithm), two-second-time and Hidden Markov. Eleven features {feature no.: 5, 6, 13, 23, 24, 25, 26, 33, 36, 37 and 38} are extracted, and the best signature detection classification model is developed using JRip, Ridor, PART and Decision tree. The extracted features increased the detection rates by between 0.4% and 9% and reduced the false alarm rates by between 0.17% and 0.5%.

Gong et al. (2011) [50] proposed a novel approach for feature selection based on Genetic Quantum Particle Swarm Optimization (GQPSO) for network intrusion detection. Support Vector Machine (SVM) is used as the classification algorithm. The selected features and experimental results are shown in Table 14.

Table 14 : Selected Features and Performance of SVM with the GQPSO Algorithm
Attack Type | Features | Training Time (ms) | Detecting Time (ms) | DR | Error Report Rate (%)
DoS | 10 (2, 6, 3, 12, 21, 22, 31, 26, 28, 30) | 0.0627 | 0.0581 | 99.98 | 0
Probe | 5 (5, 12, 26, 32, 34) | 0.0431 | 0.0478 | 91.77 | 0.001
R2L | 7 (10, 23, 25, 29, 26, 33, 35) | 0.053 | 0.014 | 98.26 | 0
U2R | 5 (2, 3, 17, 32, 36) | 0.0006 | 0.0016 | 100 | 0.0003
  • 14. Li et al. (2012) [51] proposed an effective wrapper-based feature reduction method called the gradually feature removal (GFR) method. The GFR method extracted 19 critical features {feature no.: 2, 4, 8, 10, 14, 15, 19, 25, 27, 29, 31, 32, 33, 34, 35, 36, 37, 38 and 40}. The SVM classifier achieves an accuracy of 98.6249% and an MCC (Matthews correlation coefficient) of 0.861161. The training and testing times of the SVM classifier are greatly reduced.

An advanced intelligent system using ensemble soft computing techniques is proposed by Sindhu et al. (2012) [52] for a lightweight IDS to detect anomalies in networks. GA (Genetic Algorithm) is used to extract the feature subset and a neurotree paradigm is proposed as the classifier. This method extracts 16 features {feature no.: 2, 3, 4, 5, 6, 8, 10, 12, 24, 25, 29, 35, 36, 37, 38 and 40}. The detection rate is 98.4%, which is reported as superior to the other methods compared.

HYBRID METHOD

In [53] (Ng et al., 2003), a feature importance ranking methodology based on the stochastic radial basis function neural network output sensitivity measure (RBFNN-SM) is presented. RBFNN-SM is used to evaluate the features for only the normal class and six classes of denial of service (DoS) attack. The experiments show that the 8 most significant sensitive features {feature no.: 2, 24, 23, 29, 32, 34, 33 and 36} are enough to classify normal and DoS attacks. The computation time is reduced from 23 seconds to 9 seconds. The classification accuracy for normal and DOS attacks are 99.77% and 99.06%; the FAR for 8 (41) features are 0.18% (0.01%) and 0.27% (0.03%); the FPR are 0.93% (0.70%); and training and testing are 0.94% and (0.71%) respectively.

Shazzad and Park (2005) [54] proposed a fast hybrid feature selection method to determine an optimal feature set. This method is a fusion of Correlation-based Feature Selection (CFS), Support Vector Machine (SVM) and Genetic Algorithm (GA).
Subsets of features are generated by the Genetic Algorithm and evaluated by CFS and SVM. The 12 selected features are {feature no.: 1, 6, 12, 14, 23, 24, 25, 31, 32, 37, 40 and 41}. The optimal subset has a DR of 99.56% and an FPR of 37.5% on average.

Chebrolu, Abraham and Thomas (2005) [7] investigated the performance of two feature selection techniques, Bayesian Networks (BN) and Classification and Regression Trees (CART), and developed an ensemble classifier of both techniques for building an IDS, reported as best at classifying R2L and DoS. Seventeen important features {feature no.: 1, 2, 3, 5, 7, 8, 11, 12, 14, 17, 22, 23, 24, 25, 26, 30 and 32} are selected by the Markov blanket model, and a classifier is constructed using BN and tested. Twelve features {feature no.: 3, 5, 6, 12, 23, 24, 25, 28, 31, 32, 33 and 35} are selected by the decision tree, and a classifier using CART is constructed and tested. The Normal class is classified 100% correctly, and the accuracies for the U2R and R2L classes increase with the 12-variable reduced data set. It is observed that CART classifies accurately on smaller data sets. In the ensemble approach, the BN classifier and the CART models are first constructed individually. The ensemble approach is then used for the 12-, 17- and 41-variable data sets. With the ensemble model, Normal, Probe and DoS could be detected with 100% accuracy, and U2R and R2L with 84% and 99.47% accuracy respectively.
  • 15. In [55] (Chen et al., 2007), a new hybrid approach named C4.5-PCA-C4.5 is proposed. It uses PCA (Principal Component Analysis) and the decision tree classifier C4.5 as the feature selection method and C4.5 as the classifier. The important features extracted are {feature no.: 33, 34, 4, 1, 3, 10 and 22}. The performance of C4.5-PCA-C4.5 is compared with four other systems: C4.5-ALL, C4.5-PCA, SVM-CFS and SVM-CFS-SVM. The experimental results show that C4.5-PCA-C4.5 has fast training and testing processes, the highest TPR and the lowest FPR. The average time of the building process for C4.5-PCA-C4.5 is 6 sec.

Lee et al. (2007) [56] used two machine learning algorithms: Random Forests (RF) for feature selection and the Minimax Probability Machine (MPM) for intrusion detection. The top 5 important features {feature no.: 23, 6, 29, 3 and 5} are selected. Only Denial of Service (DoS) attacks are used. The detection rate is 99.84% and the average simulation time is 0.1039 sec.

Wei Wang et al. (2008) [57] used a filter and wrapper scheme for feature selection. An information gain (IG) based filter model and a wrapper model based on Bayesian networks (BN) and decision trees (C4.5) are employed to select features for network intrusion detection, with BN and C4.5 also used as classifiers. The experimental results and the 10 features selected for each class are shown in Table 15.

Table 15. Results comparison using 41 features and 10 features
Attack | Features Selected | Method | 41 features: DR | FPR | Training Time (s) | Test Time (s) | 10 features: DR | FPR | Training Time (s) | Test Time (s)
DoS | 3, 4, 5, 6, 8, 10, 13, 23, 24, 37 | BN | 98.73 | 0.08 | 4.7 | 2.1 | 100 | 0 | 0.8 | 0.6
DoS | 3, 4, 5, 6, 8, 10, 13, 23, 24, 37 | C4.5 | 99.96 | 0.15 | 16.3 | 1.2 | 100 | 0.14 | 4.6 | 0.5
DDoS | 3, 4, 5, 6, 8, 10, 13, 23, 24, 37 | BN | 99.03 | 1.53 | - | - | 99 | 1.92 | - | -
DDoS | 3, 4, 5, 6, 8, 10, 13, 23, 24, 37 | C4.5 | 99.8 | 0.26 | - | - | 100 | 0.34 | - | -
Probe | 3, 4, 5, 6, 29, 30, 32, 35, 39, 40 | BN | 92.89 | 6.08 | 3.1 | 2.8 | 83 | 3.06 | 0.5 | 0.4
Probe | 3, 4, 5, 6, 29, 30, 32, 35, 39, 40 | C4.5 | 82.59 | 0.04 | 14.5 | 1.1 | 83 | 0.05 | 1.2 | 0.3
R2L | 1, 3, 5, 6, 12, 22, 23, 31, 32, 33 | BN | 92.22 | 0.33 | 2.6 | 1.8 | 89 | 0.32 | 0.5 | 0.4
R2L | 1, 3, 5, 6, 12, 22, 23, 31, 32, 33 | C4.5 | 80.29 | 0.02 | 10.5 | 0.8 | 87 | 0.01 | 0.5 | 0.2
U2R | 1, 2, 3, 5, 10, 13, 14, 32, 33, 36 | BN | 75.86 | 0.29 | 2.6 | 1.8 | 66 | 0.12 | 0.4 | 0.4
U2R | 1, 2, 3, 5, 10, 13, 14, 32, 33, 36 | C4.5 | 24.14 | 0 | 9.9 | 0.7 | 24 | 0 | 0.6 | 0.2

Hong and Haibo (2009) [58] proposed a new hybrid selection algorithm to build a lightweight network IDS. Chi-Square and an enhanced C4.5 algorithm are used for feature selection in the preprocessing phase. The top fifteen most important features extracted by the Chi-Square algorithm are
  • 16. 16 Amrita & P Ahmed {feature no.: 5, 3, 23, 35, 4, 8, 30, 34, 36, 6, 33, 38, 24, 25 and 2}. The top five features extracted by C4.5 and C4.5-Chi2 methods are {feature no.:25, 4, 2, 5 and 29} and {feature no.: 5, 3, 4, 8 and 25} respectively. The experimental results are shown in Table 16. Table 16: Detection & False Positive Rate Results based on C4.5- CHI2 Attack Type Evaluation Criteria DR FPR Training Time Testing Time Normal 99.9 1.6 0.02 Sec 0.03 Sec. DOS 99.3 1.48 Probe 93.87 1.82 U2R 50.01 28.32 R2L 61.55 12.17 In this paper [59] (Xiang et al., 2009), a hybrid method named Robust Artificial Intelligence Selection Algorithm (RAIS) is presented. Mutual information and artificial intelligence method are used for feature subsets selection and SVMs as classifier. Selected features are not mentioned in this paper. The experimental results show that the RAIS algorithm has the lowest false alarm rate, 3.49%, the highest rate of accuracy, 99.01%, and detection rate, 99.27%. Zaman and Karray (2009) [60] proposed a novel and simple method named Enhanced Support Vector Decision Function (ESVDF) for features selection. This method utilizes the Support Vector Machines (SVMs) approach based on Forward Selection Ranking (FSR) and Backward Elimination Ranking (BER) algorithms. The ESVDF (SVDF/FSR or SVDF/BER) method applies SVDF in the FSR and BER approaches to select the most effective features set. Two classifiers: Neural Networks (NNs) and SVMs are used to evaluate features. The experimental results are shown in Table 17. Feature’s name is not mentioned. Table 17 : Comparison of ESVDF/FSR, ESVDF/BER, and All 41 Features using NNs and SVMs classifiers. 
Classifier Algorithm #Features Accuarcy FPR Training Time Testing Time NN ESVDF/FSR 6 99.55% 0.0032 217.57 0.047 ESVDF/BER 9 99.57% 0.003 255.047 0.053 Non 41 99.65% 0.0036 911.68 0.075 SVM ESVDF/FSR 6 99.46% 0.0033 2.039 0.052 ESVDF/BER 9 99.58% 0.0031 2.1 0.046 Non 41 99.71% 0.0032 5.182 0.17 Ming-Yang Su (2011) [61] proposed a method for feature selection to detect DoS/DDoS attacks in real time for designing an anomaly-based NIDS. Genetic algorithm (GA) combined with KNN (k-
  • 17. A Study of Feature Selection Methods in Intrusion Detection System: A Survey 17 nearest-neighbor) are used for feature selection and weighting. The result of KNN classification is used as the fitness function in a genetic algorithm to evolve the weight vectors of features. Initial 35 features in the training phase are weighted. The top 19 features are considered for known attacks and the top 28 features for unknown attacks. Extracted features are not mentioned in the paper. An overall accuracy rate of 97.42% is obtained for known attacks and 78% for unknown attacks. A SYSTEMATIC REVIEW OF RELATED WORK The afore-mentioned work of feature selection is summarized in a systematic way according to approach as filter in Table 18, wrapper in Table 19 and hybrid in Table 20. These tables consist of literature reference, proposed method name, number of features selected by paper, feature number according to Table 1, classifier used to evaluate the proposed method, evaluation criteria and results of proposed method. Table 18: Summary of Filter Method Lit. Ref. Method Name No of Feature Feature No Classifier Used Evaluation Criteria Result FILTERMETHOD [21] 2004 FSMDB 24 6,5,1,34,33,36,32,8,27,29,2 8,30,26,38,39,35,13,24,23,1 1,3,10,12,4 BP Network, SVM Classification Accuracy BP-0.1017 SVM- 0.056 [22] 2004 NNPCA & NLCA 19 12 5, 6, 1, 22, 21, 31, 30, 3, 4, 2, 16, 10, 13, 34, 32, 27, 24, 37, 23 NC & DC FPR Detection Accuracies Table 3 [23] 2005 RICGA 12 Not Mentioned BP Network Classification Accuracy 88.15%. 
[24] 2006 Rough Set 6 41, 32, 24, 4, 5, 3 Rough Set Classification Accuracy 99.743 [25] 2006 Combined DA and SVM 9 12, 23, 32, 2, 24, 36, 31, 29, 39 SVM TN (%) FP ( %) FN (%) TP (%) 99.58% 00.42% 09.93% 90.07% [26] 2006 Information Gain and Chi-Square approach 12 3,5,6,10,13,23,24,27,28,37, 40,41 ME Accuracy Testing Time Table 4 [27] 2006 Artificial Neural Networks and Statistical Methods 25 35,27,41,28,40,30,34,3,33,1 2,37, 24,29, 2, 13,8,36,10, 26,39,22, 25,5,1,38 RBP Neural Network Accuracy FPR FNR 97.04% 2.76% 0.20% [28] 2007 Decision Dependent Correlation(DDC) 20 3,5,40,24,2,10,41,36,8,13,2 7,28,22,11,14,17,18,7,9,15 SVM Classification Accuracy 93.46% [29] 2008 Chi Square, Info Gain and ReliefF 20 2,3,4,5,12,22,23,24,27,28, 30,31,32,33,34, 35,37,38, 40,41 Decision Tree(C4.5) Classification Accuracy 95.8506% 95.8506% 95.6432% [30] 2009 PCA-SOM 10 Not mentioned SOM Avg. Success Rate 98.83% [31] 2009 Euclidean Distance & Cosine Similarity 15 1, 2, 12, 25, 26, 27, 28, 30, 31, 35, 37, 38, 39, 40 41 C5.0 Table 5 Table 5 [32] 2009 (1) QIIA1(Max value) (2)QIIA2(Center Data) 5 23, 32, 10, 6 , 3 (1) (2) DR (1) 97.94 (2) 99.37 [33] 2009 Entropy-Based Scheme with Chi-Square 5 5, 6, 31, 32, 36, 37 Chi-Square Test TPR 91% [34] 2009 Mutual Information based Algorithm 21 1, 3, 4, 5, 6, 8, 11, 12, 13, 23, 25, 26, 27, 28, 29, 30, 32, 33, 34, 36, 39 C4.5 & SVM DR FAR Process. Time 86.3 1.89 15.163s [35] 2009 Proposed feature set using CFSE and CSE 7 3, 6, 12, 23, 32*, 14*, 40* BN Classification Accuracy (%) Normal- 99.8 DoS-99.9 Probe-89.4 R2L-91.5 U2R-69.2 [36] 2009 Based on DT, FNT and PSO 5 10,17,14,13, 11 DT, FNT and PSO Detection Accuracy Table 6 max ^ xP TxP ^
[37] 2010 | M01LP from CFS | 3; 6; 1; 2 | Normal&Dos-5,6,12; Normal&Probe-5,6,12,29,37,41; Normal&U2R-14; Normal&R2L-10,22 | C4.5; BayesNet | Classification Accuracy | 99.41%; 98.82%
[38] 2010 | Inconsistency-based feature selection method | Table 7 | Table 7 | C4.5 | Classification Correctness; Time (s) | Table 7
[39] 2011 | varGDLF | 11 | 1, 5, 12, 15, 18, 21, 22, 29, 33, 38, 41 | varGDLF | Accuracy Rate; FPR; No of Comp. | 85.2%; 7.3%; 4.95
[40] 2011 | IIG (Improved Information Gain) | 12 | 2, 3, 5, 6, 8, 10, 12, 23, 25, 36, 37, 38 | NB | DR; FPR; Processing Time | 96.801; 1.02; 2.08 s
*: Features that were selected based on domain knowledge.
Table 19: Summary of Wrapper Method
Lit. Ref. | Method Name | No of Feature | Feature No | Classifier Used | Evaluation Criteria | Result
[41] 2003 | GA combination with a k-nearest neighbour classifier | 5 for each class | DoS-23,29,1,11,24; R2U-24,3,12,23,36; U2R-24,6,31,41,17; Probe-2,37,30,3,6 | KNN | Detection Accuracy | Increase in ID Accuracy
[42] 2003 | PBRM and SVDFRM | 8 | 1, 3, 5, 6, 23, 24, 32, 33 | SVM | Table 8 | Table 8
[43] 2005 | ACO-SVM | Table 9 | Not mentioned | SVM | Table 9 | Table 9
[44] 2007 | PCA & MEP | 8 | 14, 33, 1, 5, 39, 3, 10, 12 | GA | DR; FPR | Table 10
[45] 2007 | MRMHC-SVMs | Table 11 | Table 11 | SVM | Table 11 | Table 11
[46] 2007 | MOGFIDS | 27 | 2(tcp,udp,icmp), 5, 6, 7, 8, 9, 11, 12, 13, 14, 17, 18, 22, 23, 25, 30, 32, 33, 34, 35, 36, 37, 38, 39, 40 | MOGFIDS | Accuracy; FPR | 99.24%; 1.1%
[47] 2008 | Information Gain and Chi-square | 9 | 23, 32, 37, 33, 5, 24, 31, 39, 3 | C4.5 & BN | Table 12 | Table 12
[48] 2009 | Modified RMHC and modified linear SVM | Table 13 | Table 13 | Decision Tree | Table 13 | Table 13
[49] 2010 | Features Selection based on Customized Features | 11 | 5, 6, 13, 23, 24, 25, 26, 33, 36, 37, 38 | JRip, Ridor, PART & Decision tree | DR; FAR | Increased; Decreased
[50] 2011 | GQPSO | Table 14 | Table 14 | SVM | Table 14 | Table 14
[51] 2012 | GFR (Gradually Feature Removal) | 19 | 2, 4, 8, 10, 14, 15, 19, 25, 27, 29, 31, 32, 33, 34, 35, 36, 37, 38, 40 | SVM | Training time (s); Testing time (s); Accuracy (%); MCCavg | 0.118356; 4.63227; 98.6249; 0.861161
[52] 2012 | A combined GA and neurotree method | 16 | 2, 3, 4, 5, 6, 8, 10, 12, 24, 25, 29, 35, 36, 37, 38, 40 | Neurotree | DR | 98.38
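The wrapper methods summarized in Table 19 share one idea: a candidate feature subset is scored by the accuracy of a classifier trained on that subset. The following is a minimal sketch of that idea only, not the procedure of any surveyed paper. It assumes a greedy forward search, a 1-nearest-neighbour classifier scored by leave-one-out accuracy, and a small made-up dataset; the names `loo_accuracy`, `forward_select` and `toy` are hypothetical.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (feature vector, class label)


def loo_accuracy(data: List[Sample], features: List[int]) -> float:
    """Leave-one-out accuracy of a 1-NN classifier restricted to `features`."""
    if not features:
        return 0.0
    correct = 0
    for i, (x, label) in enumerate(data):
        best_dist, best_label = float("inf"), None
        for j, (y, other_label) in enumerate(data):
            if i == j:
                continue  # never compare a sample with itself
            dist = sum((x[f] - y[f]) ** 2 for f in features)
            if dist < best_dist:
                best_dist, best_label = dist, other_label
        correct += int(best_label == label)
    return correct / len(data)


def forward_select(data: List[Sample], n_features: int) -> Tuple[List[int], float]:
    """Greedy forward search: keep adding the feature that improves accuracy most."""
    selected: List[int] = []
    best_score = 0.0
    while True:
        candidates = [f for f in range(n_features) if f not in selected]
        if not candidates:
            break
        scored = [(loo_accuracy(data, selected + [f]), f) for f in candidates]
        # Highest score wins; ties go to the lowest feature index.
        score, feature = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best_score:
            break  # no candidate improves the classifier, so stop
        selected.append(feature)
        best_score = score
    return selected, best_score


# Made-up data: feature 0 separates the classes, features 1 and 2 are noise.
toy: List[Sample] = [
    ([0.1, 5.0, 1.0], "normal"),
    ([0.2, 1.0, 9.0], "normal"),
    ([0.3, 8.0, 4.0], "normal"),
    ([0.9, 5.1, 1.1], "attack"),
    ([1.0, 1.1, 9.1], "attack"),
    ([1.1, 8.1, 4.1], "attack"),
]

subset, accuracy = forward_select(toy, n_features=3)
print(subset, accuracy)
```

On this toy data the search keeps only feature 0, because adding either noise feature lowers the leave-one-out accuracy. The surveyed wrapper methods replace the greedy search with strategies such as genetic algorithms or ant colony optimization, and the 1-NN scorer with classifiers such as SVM or C4.5.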
Table 20: Summary of Hybrid Method
Lit. Ref. | Method Name | No of Feature | Feature No | Classifier Used | Evaluation Criteria | Result
[53] 2003 | RBFNN-SM | 8 | 2, 24, 23, 29, 32, 34, 33, 36 | RBFNN | Class. Acc.; FAR; FPR | 99.415%; 0.065%; 0.935%
[54] 2005 | A fusion of CFS, SVM & GA | 12 | 1, 6, 12, 14, 23, 24, 25, 31, 32, 37, 40, 41 | SVM | DR; FPR | 99.56%; 37.5%
[7] 2005 | Markov blanket model and Decision Tree for feature selection | 17-BN; 12-CART | {1,2,3,5,7,8,11,12,14,17,22,23,24,25,26,30,32}; {3,5,6,12,23,24,25,28,31,32,33,35} | Ensemble of BN and CART | Accuracy (%) | 100% - Normal, DoS, Probe; 84% - U2R; 99.47 - R2L
[55] 2007 | C4.5-PCA-C4.5 | 5 | 33, 34, 4, 1, 3, 10, 22 | C4.5 | Testing Time; TPR; FPR | 6 sec; -; -
[56] 2007 | RF | 5 | 23, 6, 29, 3, 5 | MPM | DR; Avg Sim. Time | 99.84%; 0.1039 s
[57] 2008 | Information gain & BN and C4.5 | 10 | Table 15 | BN & C4.5 | DR; FPR | Table 15
[58] 2009 | C4.5-Chi2 | 5 | 5, 3, 4, 8, 25 | Enhanced C4.5 | Table 16 | Table 16
[59] 2009 | RAIS | - | Not mentioned | SVM | DR; FAR; Accuracy | 99.17%; 3.49%; 98.60%
[60] 2009 | ESVDF/FSR; ESVDF/BER | 6; 9 | Not mentioned | NN; SVM | Table 17 | Table 17
[61] 2011 | GA/KNN Hybrid | 19; 28 | Not mentioned | GA/KNN | Accuracy Rate | 97.42%; 78.00%
CONCLUSIONS & FUTURE RESEARCH DIRECTIONS
Intrusion Detection Systems (IDS) have become a vital component of almost every computer and network security architecture. As network speeds increase, there is an emerging need for IDS to be lightweight, efficient and accurate, with high detection rates (DR) and low false alarm rates (FAR). Intrusion detection systems also face the curse of feature dimensionality and growing data complexity, which is why feature selection has become a very important part of intrusion detection systems.
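The detection rate (DR) and false alarm rate (FAR) named above can be made concrete with a short sketch. It assumes the usual confusion-matrix definitions, DR = TP / (TP + FN) and FAR = FP / (FP + TN); individual surveyed papers may define their metrics differently. The function names and the counts are made up for illustration.

```python
def detection_rate(tp: int, fn: int) -> float:
    """Share of attack records that were flagged as attacks."""
    return tp / (tp + fn)


def false_alarm_rate(fp: int, tn: int) -> float:
    """Share of normal records that were wrongly flagged as attacks."""
    return fp / (fp + tn)


def accuracy(tp: int, fn: int, fp: int, tn: int) -> float:
    """Share of all records that were classified correctly."""
    return (tp + tn) / (tp + fn + fp + tn)


# Made-up counts: 100 attack records and 100 normal records.
tp, fn, fp, tn = 90, 10, 5, 95

dr = detection_rate(tp, fn)
far = false_alarm_rate(fp, tn)
acc = accuracy(tp, fn, fp, tn)
print(f"DR={dr:.3f} FAR={far:.3f} accuracy={acc:.3f}")
```

Reporting DR and FAR together matters on unbalanced data, where a high overall accuracy can hide a low detection rate for a rare attack class.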
Feature selection selects a subset of relevant features and removes irrelevant and redundant features from the dataset, in order to build a robust, efficient, accurate and lightweight intrusion detection system that meets real-time requirements. Researchers have proposed many feature selection methods for intrusion detection systems to deal with these problems. This paper surveys this fast-developing field and describes the main contributions of feature selection research for intrusion detection. We showed why feature selection is vital in IDS. We surveyed the existing feature selection methods for IDS, categorised as filter, wrapper and hybrid. We also presented the performance of these methods on the KDD Cup’99 dataset under different metrics, and noted the extracted feature sets, the classifiers used to evaluate them, and the strengths, limitations and future work of the proposed methods in sections 5 and 6. The following are useful future research issues:
FUTURE RESEARCH
A single classifier for evaluating the extracted feature set may no longer be a good solution for building a robust IDS. Therefore, designing more sophisticated classifiers by combining multiple classifiers, or by combining ensemble [7] and hybrid classifiers, may enhance the robustness and performance of IDS.
After comparing the existing feature selection methods in intrusion detection, we found that finding an optimal feature set still needs further research. Feature selection algorithms still need improvement in search strategy and evaluation criterion in order to build efficient and lightweight intrusion detection systems.
The robustness of the extracted features can be enhanced by using an ensemble of feature selection methods, combined with appropriate evaluation criteria.
After surveying these many feature selection methods, we cannot say (to the best of our knowledge) which method performs best under which classifier for intrusion detection.
Most of the proposed methods work on two-class classification (normal and attack) (to the best of our knowledge). Very little work has been done on multiple-class classification (five classes: four classes of attack and one class of normal) [62][63]. Therefore, the research in many papers can be extended in the future to multiple-class classification.
Classes in KDD Cup’99 are unbalanced in both the training and test sets, as can be seen in Table 1. The Normal and DoS classes have enough instances, whereas Probe and R2L have few instances, and U2R has particularly few. These classes (Probe, R2L, U2R) have poor classification rates because of the small number of instances in the training set [56][31][39].
Developing methods, combined with appropriate evaluation criteria, that alleviate the small number of instances in these classes is therefore a topic for future research.
We can conclude that, as reported in the literature, some features are really significant in classifying normal traffic and attack types. Also, as seen in this survey, no single generic classifier best classifies all the attack types. Different researchers use different classifiers to evaluate the feature set. This paper systematically summarized the contributions of each researcher and also outlined a number of significant research problems in this field. We hope that this survey will provide readers with useful insights, a broad overview and new research directions in this field.
REFERENCES
[1] Mitra, P. et al. (2002). Unsupervised Feature Selection Using Feature Similarity. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24, 301–312
[2] Anderson, J. P. (1980). Computer security threat monitoring and surveillance. Technical Report 98-17, James P. Anderson Co., Fort Washington, Pennsylvania, USA
[3] Denning, D. E. (1987). An intrusion detection model. IEEE Transactions on Software Engineering, 13(2), 222-232
[4] Wu, S.X. & Banzhaf, W. (2010). The use of computational intelligence in intrusion detection systems: A review. Applied Soft Computing Journal, 10, 1–35
[5] Lazarevic, A., Ertoz, L., Kumar, V., Ozgur, A. & Srivastava, J. (2003). A comparative study of anomaly detection schemes in network intrusion detection. In Proc. of the SIAM Conference on Data Mining
[6] Kumar, S. & Spafford, E. H. (1994). A pattern matching model for misuse intrusion detection. In Proceedings of the 17th National Computer Security Conference, 11-21
[7] Chebrolu, S. et al. (2005). Feature deduction and ensemble design of intrusion detection systems. Computers & Security, 24(4), 295–307
[8] Yeung, D.Y. & Ding, Y. (2003). Host-based intrusion detection using dynamic and static behavioral models. Pattern Recognition, 36, 229-243
[9] KDD Cup 1999 Intrusion detection dataset: http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html
[10] Mukkamala, S. et al. (2005). Intrusion detection using an ensemble of intelligent paradigms. Journal of Network and Computer Applications, 28(2), 167–82
[11] Lewis, P. M. (1962). The characteristic selection problem in recognition system. IRE Transaction on Information Theory, 8, 171-178
[12] John, G.H. et al. (1994). Irrelevant Features and the Subset Selection Problem. Proc. of the 11th Int. Conf. on Machine Learning, Morgan Kaufmann Publishers, 121-129
[13] Dash, M. & Liu, H. (1997). Feature Selection for Classification. Intelligent Data Analysis, 1(3), 131–56
[14] Blum, A. L. & Langley, P. (1997). Selection of relevant features and examples in machine learning. Artificial Intelligence, 97(1-2), 245–271
[15] Dash, M. et al. (2002). Feature Selection for Clustering-a Filter Solution. Proc. 2nd Int’l Conf. Data Mining, 115-122
[16] Włodzisław, W. Tomasz et al. (2003). Feature Selection and Ranking Filters.
[17] Das, S. (2001). Filters, Wrappers and a Boosting-Based Hybrid for Feature Selection. Proc. 18th Int’l Conf. Machine Learning, 74-81
[18] Liu, H. & Yu, L. (2005). Towards integrating feature selection algorithms for classification and clustering. IEEE Transactions on Knowledge and Data Engineering, 17(4), 491-502
[19] Kohavi, R. & John, G. H. (1997). Wrappers for Feature Subset Selection. Artificial Intelligence, 97(1-2), 273-324
[20] Xing, E. et al. (2001). Feature Selection for High-Dimensional Genomic Microarray Data. Proc. 15th Int’l Conf. Machine Learning, 601-608
[21] Zhang, L. et al. (2004). Feature Selection for Pattern Classification Problems. Proceedings of the Fourth International Conference on Computer and Information Technology (CIT’04)
[22] Kuchimanchi, Gopi K. et al. (2004). Dimension Reduction Using Feature Extraction Methods for Real-time Misuse Detection Systems. Proceedings of the 2004 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 195-202
[23] Zhu, Y. et al. (2005). Modified Genetic Algorithm based Feature Subset Selection in Intrusion Detection System. Proceedings of ISCIT 2005, 9-12
[24] Zainal, A. et al. (2006). Feature selection using rough set in intrusion detection. In Proc. IEEE TENCON, 1-4
[25] Wong, Wai-Tak & Lai, Cheng-Yang (2006). Identifying Important Features for Intrusion Detection Using Discriminant Analysis and Support Vector Machine. Proceedings of the Fifth International Conference on Machine Learning and Cybernetics, Dalian, 3563-3567
[26] Yang, L. et al. (2006). A Lightweight Intrusion Detection Model Based on Feature Selection and Maximum Entropy Model. International Conference on Communication Technology (ICCT '06), 1-4
[27] Tamilarasan, A. et al. (2006). Feature Ranking and Selection for Intrusion Detection Using Artificial Neural Networks and Statistical Methods. Int’l Joint Conf. on Neural Networks (IJCNN’06), 4754-4761
[28] Fadaeieslam, M. J. et al. (2007). Comparison of two feature selection methods in Intrusion Detection Systems. Seventh International Conference on Computer and Information Technology, 83-86
[29] Sheen, Shina & Rajesh, R. (2008). Network Intrusion Detection using Feature Selection and Decision tree classifier. IEEE Region 10 Conference, TENCON 2008, 1-4
[30] Kiziloren, T. & Germen, E. (2009). Anomaly Detection with Self-Organizing Maps and Effects of Principal Component Analysis on Feature Vectors. Fifth Int’l Conf. on Natural Computation, 509-513
[31] Suebsing, A. & Hiransakolwong, N. (2009). Feature Selection Using Euclidean Distance and Cosine Similarity for Intrusion Detection Model. Asian Conf. on Intelligent Info. and Database Systems, 86-91
[32] Lee, S. M. et al. (2009). Quantitative Intrusion Intensity Assessment using Important Feature Selection and Proximity Metrics. 15th IEEE Pacific Rim Int’l Symposium on Dependable Computing, 127-134
[33] Lee, Tsern-Huei & He, Jyun-De (2009). Entropy-Based Profiling of Network Traffic for Detection of Security Attack. TENCON, 1-5
[34] Xiao, L. et al. (2009). A Two-step Feature Selection Algorithm Adapting to Intrusion Detection. International Joint Conference on Artificial Intelligence, 618-622
[35] Khor, Kok-Chin et al. (2009). From Feature Selection to Building of Bayesian Classifiers: A Network Intrusion Detection Perspective. American Journal of Applied Sciences, 6(11), 1948-1959
[36] Bahrololum, M. et al. (2009). Machine Learning Techniques for Feature Reduction in Intrusion Detection Systems: A Comparison. Fourth International Conference on Computer Sciences and Convergence Information Technology (ICCIT), 1091-1095
[37] Nguyen, H. et al. (2010). Improving Effectiveness of Intrusion Detection by Correlation Feature Selection. 2010 International Conference on Availability, Reliability and Security, 17-24
[38] Chen, T. et al. (2010). A Naive Feature Selection Method and Its Application in Network Intrusion Detection. 2010 International Conference on Computational Intelligence and Security (CIS), 416-420
[39] Fan, W. et al. (2011). Unsupervised Anomaly Intrusion Detection via Localized Bayesian Feature Selection. 2011 11th IEEE International Conference on Data Mining, 1032-1037
[40] Xian, J. et al. (2011). An Algorithm Application in Intrusion Forensics Based on Improved Information Gain. 3rd Symposium on Web Society (SWS), 100-104
[41] Middlemiss, Melanie J. & Dick, G. (2003). Weighted Feature Extraction using a Genetic Algorithm for Intrusion Detection. IEEE, 1669-1675
[42] Mukkamala, S. & Sung, A. H. (2003). Feature Selection for Intrusion Detection Using Neural Networks and Support Vector Machines. Journal of the Transportation Research Board of the National Academies, Transportation Research Record No 1822, 33-39
[43] Gao, Hai-Hua et al. (2005). Ant Colony Optimization based network intrusion feature selection and detection. Proc. of the Fourth Int’l Conf. on Machine Learning and Cybernetics, Guangzhou, 3871-75
[44] Banković, Z. et al. (2007). Increasing Detection Rate of User-to-Root Attacks Using Genetic Algorithms. Int’l Conf. on Emerging Security Information, Systems and Technologies, 48-53
[45] Chen, Y. et al. (2007). Toward Building Lightweight Intrusion Detection System Through Modified RMHC and SVM. ICON, 83-88
[46] Tsang, Chi-Ho et al. (2007). Genetic-fuzzy rule mining approach and evaluation of feature selection techniques for anomaly intrusion detection. Pattern Recognition, 40, 2373-2391
[47] Wang, W. & Gombault, S. (2008). Efficient Detection of DDoS Attacks with Important Attributes. Third International Conference on Risks and Security of Internet and Systems: CRiSIS’2008, 61-67
[48] Li, Y. et al. (2009). Building lightweight intrusion detection system using wrapper-based feature selection mechanisms. Computers & Security, 28(6), 466–75
[49] Zulaiha, A.O. et al. (2010). Improving Signature Detection Classification Model Using Features Selection based on Customized Features. 10th Int’l Conf. on Intelligent Systems Design and Applications, 1026-31
[50] Gong, S. (2011). Feature Selection Method for Network Intrusion Based on GQPSO Attribute Reduction. International Conference on Multimedia Technology (ICMT), 6365-6368
[51] Li, Y. et al. (2012). An efficient intrusion detection system based on support vector machines and gradually feature removal method. Expert Systems with Applications, 39, 424–430
[52] Sindhu, Siva S. et al. (2012). Decision tree based light weight intrusion detection using a wrapper approach. Expert Systems with Applications, 39, 129–141
[53] Wing, W.Y. Ng et al. (2003). Dimensionality Reduction for Denial of Service Detection Problems using RBFNN Output Sensitivity. Proc. of 2nd Int’l Conf. on Machine Learning and Cybernetics, Wan, 1293-98
[54] Shazzad, K. M. & Park, J. S. (2005). Optimization of Intrusion Detection through Fast Hybrid Feature Selection. Proc. of 6th Int’l Conf. on Parallel and Distributed Computing, Applications and Technologies
[55] Chen, Y. et al. (2007). Building Lightweight Intrusion Detection System Based on Principal Component Analysis and C4.5 Algorithm. ICACT2007, 2109-2112
[56] Lee, S. M. et al. (2007). A Hybrid Approach for Real-Time Network Intrusion Detection Systems. International Conference on Computational Intelligence and Security, 712-715
[57] Wang, W. et al. (2008). Towards fast detecting intrusions: using key attributes of network traffic. The Third International Conference on Internet Monitoring and Protection, 86-91
[58] Hong, D. & Haibo, L. (2009). A Lightweight Network Intrusion Detection Model Based on Feature Selection. 15th IEEE Pacific Rim International Symposium on Dependable Computing, 165-168
[59] Xiang, C. et al. (2009). Robust Observation Selection for Intrusion detection. Sixth International Conference on Fuzzy Systems and Knowledge Discovery, 269-272
[60] Zaman, S. & Karray, F. (2009). Features Selection for Intrusion Detection Systems Based on Support Vector Machines. 6th IEEE Consumer Communications and Networking Conference (CCNC), 1-8
[61] Su, Ming-Yang (2011). Real-time anomaly detection systems for Denial-of-Service attacks by weighted k-nearest-neighbor classifiers. Expert Systems with Applications, 38, 3492–3498
[62] Bruzzone, L. & Serpico, S. B. (2000). A technique for feature selection in multiclass problems. International Journal of Remote Sensing, 21(3), 549–563
[63] Chidlovskii, B. & Lecerf, L. (2008). Scalable feature selection for multiclass problems. In Proc. of the European Conf. on Machine Learning and Knowledge Discovery in Databases (ECML PKDD’08), 227