International Journal of Network Security & Its Applications (IJNSA) is a bimonthly, open-access, peer-reviewed journal that publishes articles contributing new results in all areas of computer network security and its applications. The journal focuses on all technical and practical aspects of security and its applications for wired and wireless networks. Its goal is to bring together researchers and practitioners from academia and industry to focus on understanding modern security threats and countermeasures, and on establishing new collaborations in these areas.
A SURVEY ON THE USE OF DATA CLUSTERING FOR INTRUSION DETECTION SYSTEM IN CYBE... (IJNSA Journal)
Today it is difficult to imagine a computing application that runs on a standalone device without a network connection. A large amount of data is transferred over the network from one device to another, and as networking expands, security is becoming a major concern. It has therefore become important to maintain a high level of security so that safe and secure connections are established among devices. An intrusion detection system (IDS) is used to differentiate between legitimate and illegitimate activities on the system, and different techniques are used to detect intrusions. This paper presents the clustering techniques that different researchers have implemented in their articles. The survey covers 30 papers and reports which datasets the researchers used and which evaluation metrics they used to assess IDS performance. It also highlights the pros and cons of each clustering technique used for IDS, which can serve as a basis for future work.
CLASSIFICATION PROCEDURES FOR INTRUSION DETECTION BASED ON KDD CUP 99 DATA SET (IJNSA Journal)
In a network security framework, intrusion detection is a core component and a fundamental way to protect computers from many threats. The main problem in intrusion detection is the large number of false alerts, which has motivated several researchers to look for ways of reducing them with data mining, an analysis procedure suited to large datasets such as KDD CUP 99. This paper reviews various data mining classification procedures for handling false alerts in intrusion detection. Tests of many data mining procedures on KDD CUP 99 show that no single procedure can detect every attack class with high accuracy and without false alerts. The best accuracy, 92%, is achieved by the Multilayer Perceptron, while the best training time, 4 seconds, is achieved by the rule-based model. It is concluded that several procedures should be used together to handle the various kinds of network attacks.
Preemptive modelling towards classifying vulnerability of DDoS attack in SDN ... (IJECEIAES)
Software-Defined Networking (SDN) has become an essential networking concept for scaling up the networking capabilities demanded by the future internet, which is highly distributed in nature. Because it is a novel concept in networking, it still suffers from security problems, and the Distributed Denial-of-Service (DDoS) attack is one of the most prominent of them in the SDN environment. A review of existing research on resisting DDoS attacks in SDN shows that many open issues remain. This paper identifies these issues and addresses them in the form of a preemptive security model. Unlike existing approaches, the model is capable of identifying malicious activity that leads to a DDoS attack by classifying the attack strategy with a machine learning approach. The paper also discusses the applicability of the best machine learning classifiers that are effective against DDoS attacks.
An intrusion detection system for packet and flow based networks using deep n... (IJECEIAES)
Research on deep neural networks and on big data is now converging in several respects to enhance the capabilities of intrusion detection systems (IDS). Many IDS models have been introduced to provide security over big data. This study focuses on intrusion detection in computer networks using big datasets. The advent of big data has stimulated comprehensive support for cyber security by bringing forward a range of rich algorithms that classify and analyse patterns and make better predictions more efficiently. In this study, a detection model based on deep neural networks is proposed for detecting intrusions. We applied the model to the latest dataset available online, which contains packet-based data, flow-based data and some additional metadata. The dataset is labeled and imbalanced, with 79 attributes and some classes having far fewer training samples than others. The proposed model is built using Keras and the Google TensorFlow deep learning environment. Experimental results show that intrusions are detected with an accuracy of over 99% for both binary and multiclass classification with the selected best features. The average score of the receiver operating characteristic (ROC) and precision-recall curves is also 1. The outcome implies that deep neural networks offer a novel research model with high accuracy for intrusion detection, better than some models presented in the literature.
COMBINING NAIVE BAYES AND DECISION TREE FOR ADAPTIVE INTRUSION DETECTION (IJNSA Journal)
This paper presents a new learning algorithm for adaptive network intrusion detection using a naive Bayesian classifier and a decision tree. The algorithm performs balanced detection, keeps false positives at an acceptable level for different types of network attacks, and eliminates redundant attributes as well as contradictory examples from the training data that make the detection model complex. It also addresses some difficulties of data mining, such as handling continuous attributes, dealing with missing attribute values, and reducing noise in training data. Because of the large volumes of security audit data and the complex, dynamic properties of intrusion behaviours, several data mining-based intrusion detection techniques have been applied to network-based traffic data and host-based data in recent decades. However, various issues in current intrusion detection systems (IDS) still need to be examined. We compared the performance of the proposed algorithm with existing learning algorithms on the KDD99 benchmark intrusion detection dataset. The experimental results show that the proposed algorithm achieved high detection rates (DR) and significantly reduced false positives (FP) for different types of network intrusions using limited computational resources.
Evaluation of network intrusion detection using markov chain (IJCI JOURNAL)
Internet threats have increased significantly in day-to-day life, and there is a need to develop models that maintain system security. One of the most effective techniques is the Intrusion Detection System (IDS), whose purpose is to detect intrusions through security devices and deal with them. In this paper, a mathematical approach is used to predict and detect intrusions in the network. We discuss a method that combines two algorithms, 'K-Means + Apriori', to classify normal and abnormal activities in a computer network. The K-Means step partitions the training set into K clusters using Euclidean distance and introduces an outlier factor; the Apriori algorithm is then built to prune the data by removing infrequent items from the database. Based on the defined states, the degree of incoming data is evaluated in an experiment on a sample of the DARPA2000 dataset, and the method achieves high detection performance in identifying the level of attack in stages.
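The K-Means step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not define its outlier factor, so the hypothetical `outlier_score` below simply uses the Euclidean distance to the nearest centroid as a stand-in, and the Apriori pruning stage is not shown.

```python
import math
import random

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iterations=20, seed=0):
    """Partition points into k clusters; returns the final centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: euclidean(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(col) / len(cluster) for col in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids

def outlier_score(point, centroids):
    """Assumed stand-in for an outlier factor: distance to nearest centroid."""
    return min(euclidean(point, c) for c in centroids)

if __name__ == "__main__":
    traffic = [(0, 0), (0, 1), (1, 0), (1, 1),
               (10, 10), (10, 11), (11, 10), (11, 11)]
    centers = kmeans(traffic, k=2)
    print(sorted(centers))
    print(outlier_score((30, 30), centers))
```

A record whose score exceeds a chosen threshold would be treated as abnormal; how that threshold is set is left open here because the abstract does not say.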
Multi Stage Filter Using Enhanced Adaboost for Network Intrusion Detection (IJNSA Journal)
Based on an analysis of the distribution of network attacks in the KDDCup99 dataset and in real-time traffic, this paper proposes a multi-stage filter design that is an efficient and effective approach to dealing with various categories of network attacks. The first stage of the filter uses Enhanced Adaboost with a decision tree algorithm to detect the frequent attacks that occur in the network, and the second stage uses Enhanced Adaboost with the Naïve Bayes algorithm to detect the moderately frequent attacks. The final stage, which detects the infrequent attacks, uses the Enhanced Adaboost algorithm with Naïve Bayes as a base learner. The design is tested on the KDDCup99 dataset and is shown to have a high detection rate with low false alarm rates.
Visualize network anomaly detection by using k means clustering algorithm (IJCNCJournal)
With the ever-increasing number of new attacks, the amount of data will keep growing, and because of the base-rate fallacy the number of false alarms will grow as well. Another problem is that attacks are usually not detected until after they have taken place, which makes defending against them hard and can easily lead to disclosure of sensitive information.
In this paper we use the K-means algorithm with the KDD Cup 1999 network dataset to evaluate the performance of an unsupervised learning method for anomaly detection. The evaluation showed that a high detection rate can be achieved while maintaining a low false alarm rate. The paper presents the results of k-means clustering performed with the Cluster 3.0 tool and visualizes them with the TreeView visualization tool.
STATISTICAL QUALITY CONTROL APPROACHES TO NETWORK INTRUSION DETECTION (IJNSA Journal)
In the study of network intrusion, much attention has been drawn to the timely detection of intrusions in order to safeguard public and private interests and to capture the law-breakers. Although various methods can be found in the literature, some situations require network intrusions to be detected in real time, as and when they occur, to prevent further harm to the computer network. Such an approach helps detect the intrusion and has greater potential to apprehend the law-breaker. The purpose of this article is to formulate a method to this effect, based on the statistical quality control techniques widely used in manufacturing and production processes.
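To make the quality-control idea concrete, here is a minimal sketch of one such technique. The abstract does not name a specific chart, so this assumes a simple Shewhart-style chart with three-sigma limits estimated from a baseline of normal traffic; the function names and the sample counts are hypothetical.

```python
import statistics

def control_limits(baseline, sigmas=3.0):
    """Lower and upper control limits from in-control baseline measurements."""
    mean = statistics.mean(baseline)
    sd = statistics.stdev(baseline)  # sample standard deviation
    # A count cannot be negative, so the lower limit is clipped at zero.
    return max(0.0, mean - sigmas * sd), mean + sigmas * sd

def out_of_control(observations, limits):
    """Indices of observations that fall outside the control limits."""
    lower, upper = limits
    return [i for i, x in enumerate(observations) if x < lower or x > upper]

if __name__ == "__main__":
    # Hypothetical connections-per-minute counts observed during normal operation.
    baseline = [100, 102, 98, 101, 99, 100, 103, 97]
    limits = control_limits(baseline)
    print(limits)
    print(out_of_control([101, 99, 140, 100, 93], limits))
```

Points outside the limits would be flagged for inspection as they arrive, which is what makes the approach usable in real time.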
International Journal of Engineering and Science Invention (IJESI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJESI publishes research articles and reviews across the whole field of engineering science and technology, including new teaching methods, assessment, validation and the impact of new technologies, and it will continue to provide information on the latest trends and developments in this ever-expanding subject. Papers are selected for publication through double peer review to ensure originality, relevance, and readability. The articles published in the journal can be accessed online.
Network Security: Experiment of Network Health Analysis At An ISP (CSCJournals)
This paper presents the findings of an analysis performed at an Internet Service Provider (ISP). Based on netflow data collected and analyzed using nfdump, the analysis helped assess how healthy the ISP's network is. The findings have been instrumental in reflecting on how to reshape the network architecture, and they have also demonstrated the need for a consistent monitoring system.
An approach for ids by combining svm and ant colony algorithm (eSAT Journals)
Abstract: This work studies the intrusion detection problem in network security; the primary task is to classify network behavior as normal or abnormal while reducing misclassification. In this paper, two efficient data mining algorithms are combined to detect network intrusions. Combining SVM and Ant Colony (CSVAC) is used for well-organized data classification; the technique takes advantage of both algorithms while avoiding their weaknesses. The algorithm is implemented and evaluated using the standard benchmark KDDCUP99 dataset. Experimental results show that it outperforms the other algorithms in terms of accuracy rate and run-time efficiency, and that it is able to detect new types of attacks. Keywords: Intrusion Detection; Support Vector Machine; Ant colony; Combined Support vector with ant colony
IJRET: International Journal of Research in Engineering and Technology is an international, peer-reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of engineering and technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of engineering and technology. We bring together scientists, academicians, field engineers, scholars and students of related fields of engineering and technology.
The main goal of Intrusion Detection Systems (IDSs) is to detect intrusions. This kind of detection system is a significant tool for ensuring cyber security in traditional computer-based systems. An IDS model can be faster and reach more accurate detection rates by selecting the most relevant features from the input dataset. Feature selection is an important stage of any IDS: it selects the optimal subset of features that makes model training faster and reduces complexity while preserving or enhancing the performance of the system. In this paper, we propose a method based on dividing the input dataset into different subsets according to each attack. We then perform feature selection on each subset using an information gain filter, and generate the optimal feature set by combining the feature sets obtained for each attack. Experimental results on the NSL-KDD dataset show that the proposed feature selection method, with fewer features, improves system accuracy while decreasing complexity. Moreover, a comparative study of the efficiency of the feature selection technique is performed using different classification methods. To enhance the overall performance, a further stage is conducted using Random Forest and PART in a voting learning algorithm. The results indicate that the best accuracy is achieved when using the product probability rule.
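The per-attack feature selection described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: it assumes discrete feature values, assumes each subset pairs one attack class with the normal records, and keeps a hypothetical `top_n` features per subset before taking the union. The toy records are invented for the example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, labels, feature):
    """Reduction in label entropy obtained by splitting on one feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def select_features(rows, labels, attacks, top_n=1, normal="normal"):
    """Union of the top_n features ranked by information gain per attack subset."""
    features = list(rows[0].keys())
    selected = set()
    for attack in attacks:
        # Assumed subset construction: this attack's records plus normal ones.
        keep = [i for i, lab in enumerate(labels) if lab in (normal, attack)]
        sub_rows = [rows[i] for i in keep]
        sub_labels = [labels[i] for i in keep]
        ranked = sorted(
            features,
            key=lambda f: information_gain(sub_rows, sub_labels, f),
            reverse=True,
        )
        selected.update(ranked[:top_n])
    return selected

if __name__ == "__main__":
    rows = [
        {"flag": "SF", "service": "http", "noise": "a"},
        {"flag": "SF", "service": "http", "noise": "b"},
        {"flag": "S0", "service": "http", "noise": "a"},
        {"flag": "S0", "service": "http", "noise": "b"},
        {"flag": "SF", "service": "other", "noise": "a"},
        {"flag": "SF", "service": "other", "noise": "b"},
    ]
    labels = ["normal", "normal", "dos", "dos", "probe", "probe"]
    print(select_features(rows, labels, attacks=["dos", "probe"]))
```

In the toy data, `flag` separates the `dos` records from normal traffic and `service` separates the `probe` records, so each attack contributes a different feature to the combined set while the uninformative `noise` column is dropped.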
Survey on classification techniques for intrusion detection (csandit)
Intrusion detection is the most essential component in network security. Traditional intrusion detection methods are based on extensive knowledge of signatures of known attacks. Signature-based methods require manual encoding of attacks by human experts. Data mining is one of the techniques applied to intrusion detection that provides higher automation capabilities than signature-based methods. Data mining techniques such as classification, clustering and association rules are used in intrusion detection. In this paper, we present an overview of intrusion detection, the KDD Cup 1999 dataset and a detailed analysis of different classification techniques, namely Support Vector Machine, Decision Tree, Naïve Bayes and Neural Networks, used in intrusion detection.
A survey of Network Intrusion Detection using soft computing Technique (ijsrd.com)
With the arrival of the internet era, network security has become a key foundation for many financial and business applications. Intrusion detection is one of the approaches to solving the problem of network security. An Intrusion Detection System (IDS) is a program that analyses what happens or has happened during an execution and tries to find indications that the computer has been misused. Here we propose a new approach that uses neuro-fuzzy and support vector machine techniques with a fuzzy genetic algorithm to achieve a higher detection rate.
Network forensics is a scientifically proven technique to accumulate, perceive, identify, examine, associate, analyse and document digital evidence from multiple systems, for the purpose of uncovering the facts of attacks and other problem incidents as well as taking action to recover from the attack. Many systems have been proposed for designing network forensic systems. In this paper we present a comparative analysis of various models based on different techniques.
INTRUSION DETECTION USING FEATURE SELECTION AND MACHINE LEARNING ALGORITHM WI... (ijcsit)
Detecting intrusions over the network is a critical issue in preventing illegitimate use by intruders. An intruder may enter a network, system or server by injecting malicious packets in order to steal, sniff, manipulate or corrupt useful and secret information; this process is referred to as intrusion, whereas the transmission of packets by an intruder over the network for any purpose of intrusion is referred to as an attack. With expanding networking technology, millions of servers communicate with each other, and this expansion continues every day. As a result, more and more intruders are attracted, and a smart intrusion detection model has become a primary requirement.
The essential features of the NSL-KDD dataset are identified by analyzing feature selection methods. A hybrid algorithm is then built by using the selected features with a machine learning approach and by analyzing the basic network features over the dataset. Finally, a model containing the rules for the network features is produced from the algorithm.
A hybrid misuse intrusion detection model is built to find attacks on the system and improve intrusion detection. Based on prior features, intrusions on the system can be detected without any previous learning. The model combines the advantages of feature selection and machine learning techniques with misuse detection.
INTRUSION DETECTION SYSTEM CLASSIFICATION USING DIFFERENT MACHINE LEARNING AL... (ijcsit)
Intrusion Detection Systems (IDS) have been an effective way to achieve higher security in detecting malicious activities for the past couple of years. Anomaly detection is one approach to intrusion detection. Current anomaly detection is often associated with high false alarm rates and only moderate accuracy and detection rates because it is unable to detect all types of attacks correctly. An experiment is carried out to evaluate the performance of different machine learning algorithms using the KDD-99 Cup and NSL-KDD datasets. The results show which approach performed better in terms of accuracy and detection rate with a reasonable false alarm rate.
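The three metrics named above can be computed from a labeled test set as in the sketch below. It uses the common definitions (detection rate as the fraction of attacks that are flagged, false alarm rate as the fraction of normal records that are flagged); the abstract does not spell out its own definitions, so treat these as assumptions, and the function name is hypothetical.

```python
def ids_metrics(y_true, y_pred, attack=1):
    """Accuracy, detection rate and false alarm rate for binary IDS output."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == attack and p == attack)  # attacks caught
    fn = sum(1 for t, p in pairs if t == attack and p != attack)  # attacks missed
    fp = sum(1 for t, p in pairs if t != attack and p == attack)  # false alarms
    tn = sum(1 for t, p in pairs if t != attack and p != attack)  # normal passed

    def ratio(num, den):
        # Avoid division by zero when a class is absent from the test set.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "detection_rate": ratio(tp, tp + fn),
        "false_alarm_rate": ratio(fp, fp + tn),
    }

if __name__ == "__main__":
    truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
    print(ids_metrics(truth, predicted))
```

Reporting the false alarm rate alongside accuracy matters on imbalanced data, where a detector can score high accuracy while still missing most attacks.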
Evaluation of network intrusion detection using markov chainIJCI JOURNAL
Day today life internet threat has been increased significantly. There is a need to develop model in order to
maintain security of system. The most effective techniques are Intrusion Detection System (IDS).The
purpose of intrusion system through the security devices detect and deal with it. In this paper, a
mathematical approach is used effectively to predict and detect intrusion in the network. Here we discuss
about two algorithms ‘K-Means + Apriori’, a method which classify normal and abnormal activities in
computer network. In K-Means process, it partitions the training set into K-clusters using Euclidean
distance and introduce an outlier factor, then it build Apriori Algorithm to prune the data by removing
infrequent data in the database. Based on defined state the degree of incoming data is evaluated through
the experiment using sample DARPA2000 dataset, and achieves high detection performance in level of
attack in stages.
Multi Stage Filter Using Enhanced Adaboost for Network Intrusion DetectionIJNSA Journal
Based on the analysis and distribution of network attacks in KDDCup99 dataset and real time traffic, this paper proposes a design of multi stage filter which is an efficient and effective approach in dealing with various categories of attacks in networks. The first stage of the filter is designed using Enhanced Adaboost with Decision tree algorithm to detect the frequent attacks occurs in the network and the second stage of the filter is designed using enhanced Adaboost with Naïve Byes algorithm to detect the moderate attacks occurs in the network. The final stage of the filter is used to detect the infrequent
attack which is designed using the enhanced Adaboost algorithm with Naïve Bayes as a base learner. Performance of this design is tested with the KDDCup99 dataset and is shown to have high detection rate with low false alarm rates.
Visualize network anomaly detection by using k means clustering algorithmIJCNCJournal
With the ever increasing amount of new attacks in today’s world the amount of data will keep increasing,
and because of the base-rate fallacy the amount of false alarms will also increase. Another problem with
detection of attacks is that they usually isn’t detected until after the attack has taken place, this makes
defending against attacks hard and can easily lead to disclosure of sensitive information.
In this paper we choose K-means algorithm with the Kdd Cup 1999 network data set to evaluate the
performance of an unsupervised learning method for anomaly detection. The results of the evaluation
showed that a high detection rate can be achieve while maintaining a low false alarm rate .This paper
presents the result of using k-means clustering by applying Cluster 3.0 tool and visualized this result by
using TreeView visualization tool .
STATISTICAL QUALITY CONTROL APPROACHES TO NETWORK INTRUSION DETECTIONIJNSA Journal
In the study of network intrusion, much attention has been drawn to on-time detection of intrusion to safeguard public and private interest and to capture the law-breakers. Even though various methods have been found in literature, some situations warrant us to determine intrusions of network in real-time to prevent further undue harm to the computer network as and when they occur. This approach helps detect the intrusion and has a greater potential to apprehend the law-breaker. The purpose of this article is to formulate a method to this effect that is based on the statistical quality control techniques widely used in the manufacturing and production processes.
International Journal of Engineering and Science Invention (IJESI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJESI publishes research articles and reviews within the whole field Engineering Science and Technology, new teaching methods, assessment, validation and the impact of new technologies and it will continue to provide information on the latest trends and developments in this ever-expanding subject. The publications of papers are selected through double peer reviewed to ensure originality, relevance, and readability. The articles published in our journal can be accessed online.
Network Security: Experiment of Network Health Analysis At An ISPCSCJournals
This paper presents the findings of an analysis performed at an internet service provider. Based on netflow data collected and analyzed using nfdump, it helped assess how healthy is the network of an Internet Service Providers (ISP). The findings have been instrumental in reflection about reshaping the network architecture. And they have also demonstrated the need for consistent monitoring system.
An approach for ids by combining svm and ant colony algorithmeSAT Journals
Abstract This piece of work researches the intrusion detection problem of the network sanctuary; the primary task is to classify network behavior as normal or abnormal while reducing misclassification. In this paper, two efficient data mining algorithms are combined together to detect the network intrusion. Combining SVM and Ant colony (CSVAC) used for well-organized data classification, this technique takes the advantage of both the algorithm while avoiding their weaknesses. This algorithm is implemented and evaluated using standard benchmark KDDCUP99 data set. Experimental results drastically well produce superior results than the other algorithm in terms of accuracy rate and run time efficiency, and this algorithm able to detect the new types of attacks Keywords: Intrusion Detection; Support Vector Machine; Ant colony; Combined Support vector with ant colony
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
The main goal of Intrusion Detection Systems (IDSs) is to detect intrusions. This kind of detection system is a significant tool for ensuring cyber security in traditional computer-based systems. An IDS model can be faster and reach more accurate detection rates by selecting the most relevant features from the input dataset. Feature selection is an important stage of any IDS: it selects the optimal subset of features so that training becomes faster and less complex while the performance of the system is preserved or enhanced. In this paper, we propose a method based on dividing the input dataset into different subsets according to each attack. We then perform feature selection using an information gain filter on each subset. The optimal feature set is generated by combining the feature sets obtained for each attack. Experimental results on the NSL-KDD dataset show that the proposed feature selection method, with fewer features, improves system accuracy while decreasing complexity. Moreover, a comparative study of the efficiency of the feature selection technique is performed using different classification methods. To enhance the overall performance, another stage is conducted using Random Forest and PART in a voting learning algorithm. The results indicate that the best accuracy is achieved when using the product of probabilities rule.
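The per-attack selection described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy records, the function names, the choice to pair each attack with normal traffic when forming a subset, and the `top_k` cutoff are all assumptions made for illustration, and only discrete features are handled.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the records on one discrete feature."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[feature]].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_features(rows, labels, attacks, top_k=1, normal="normal"):
    """Rank features separately for each attack subset, then union the top_k of each."""
    features = list(rows[0].keys())
    selected = set()
    for attack in attacks:
        # Assumption: a subset holds the records of one attack plus normal traffic.
        pairs = [(r, l) for r, l in zip(rows, labels) if l in (attack, normal)]
        sub_rows = [r for r, _ in pairs]
        sub_labels = [l for _, l in pairs]
        ranked = sorted(
            features,
            key=lambda f: information_gain(sub_rows, sub_labels, f),
            reverse=True,
        )
        selected.update(ranked[:top_k])
    return sorted(selected)


# Hypothetical toy records; the field names only loosely echo NSL-KDD attributes.
rows = [
    {"protocol": "tcp", "flag": "S0", "service": "http"},
    {"protocol": "tcp", "flag": "S0", "service": "ftp"},
    {"protocol": "tcp", "flag": "SF", "service": "http"},
    {"protocol": "udp", "flag": "SF", "service": "ftp"},
    {"protocol": "icmp", "flag": "SF", "service": "http"},
    {"protocol": "icmp", "flag": "SF", "service": "ftp"},
]
labels = ["dos", "dos", "normal", "normal", "probe", "probe"]

combined = select_features(rows, labels, attacks=["dos", "probe"])
print(combined)  # ['flag', 'protocol']
```

In this toy data `flag` separates the DoS records from normal traffic and `protocol` separates the probe records, so each attack contributes a different feature to the combined set. A single ranking over the whole dataset could miss a feature that matters for only one attack.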
Survey on classification techniques for intrusion detectioncsandit
Intrusion detection is the most essential component in network security. Traditional Intrusion Detection methods are based on extensive knowledge of signatures of known attacks. Signature-based methods require manual encoding of attacks by human experts. Data mining is one of the techniques applied to Intrusion Detection that provides higher automation capabilities than signature-based methods. Data mining techniques such as classification, clustering and association rules are used in intrusion detection. In this paper, we present an overview of intrusion detection, KDD Cup 1999 dataset and detailed analysis of different classification techniques namely Support vector Machine, Decision tree, Naïve Bayes and Neural Networks used in intrusion detection.
A survey of Network Intrusion Detection using soft computing Techniqueijsrd.com
With the advent of the Internet era, network security has become a key foundation for many financial and business applications. Intrusion detection is one of the approaches to solving the problem of network security. An Intrusion Detection System (IDS) is a program that analyses what happens or has happened during an execution and tries to find indications that the computer has been misused. Here we propose a new approach that combines neuro-fuzzy and support vector machine techniques with a fuzzy genetic algorithm for a higher detection rate.
Network forensics is a scientifically proven technique to accumulate, perceive, identify, examine, associate, analyse and document digital evidence from multiple systems for the purpose of uncovering the facts of attacks and other problem incidents, as well as performing the actions needed to recover from an attack. Many systems have been proposed for designing network forensic systems. In this paper we present a comparative analysis of various models based on different techniques.
INTRUSION DETECTION USING FEATURE SELECTION AND MACHINE LEARNING ALGORITHM WI...ijcsit
Intrusion detection over the network is a critical issue in preventing illegitimate use by intruders. An intruder may enter any network, system or server by injecting malicious packets in order to steal, sniff, manipulate or corrupt useful and secret information; this process is referred to as intrusion, whereas the transmission of packets by an intruder over the network for the purpose of intrusion is referred to as an attack. With expanding networking technology, millions of servers communicate with each other, and this expansion continues every day. As a result, more and more intruders are attracted, and so a smart intrusion detection model is a primary requirement.
By analyzing feature selection methods, the essential features of the NSL-KDD data set are identified; then, using the selected features, a machine learning approach and an analysis of the basic features of networks over the data set, a hybrid algorithm is built. Finally, a model containing the rules for the network features is produced from the algorithm.
A hybrid misuse intrusion detection model is built to find attacks on the system and improve intrusion detection. Based on prior features, intrusions on the system can be detected without any previous learning. This model combines the advantages of feature selection and machine learning techniques with misuse detection.
INTRUSION DETECTION SYSTEM CLASSIFICATION USING DIFFERENT MACHINE LEARNING AL...ijcsit
Intrusion Detection Systems (IDS) have been an effective way to achieve higher security in detecting malicious activities for the past couple of years. Anomaly detection is one approach to intrusion detection. Current anomaly detection is often associated with high false alarm rates and only moderate accuracy and detection rates because it is unable to detect all types of attacks correctly. An experiment is carried out to evaluate the performance of different machine learning algorithms using the KDD-99 Cup and NSL-KDD datasets. Results show which approach performed better in terms of accuracy and detection rate with a reasonable false alarm rate.
An Intrusion Detection based on Data mining technique and its intended import...Editor IJMTER
Intrusion detection is a pivotal and essential requirement of today's era. There are two major sides of intrusion detection, namely host-based intrusion detection and network-based intrusion detection. A host-based intrusion detection system monitors the information arriving at a particular machine or node, while a network-based intrusion detection system monitors and analyzes the whole traffic of the network. Data mining introduces the latest technology and methods to handle and categorize types of attacks using different classification algorithms and by matching the patterns of malicious behavior. By using this data mining technology, developers extract and analyze the types of attack in the network.
In addition, there are two major approaches to intrusion detection. In the anomaly-based approach, attacks are found with a high false alarm rate. In the signature-based approach, however, the false alarm rate is low but novel attacks are not handled. Most researchers base their research on signature-based intrusion detection with the purpose of increasing the detection rate. The major advantages of this system are that the IDS does not require biased assessment, is able to identify massive patterns of attacks, and has the capacity to handle large numbers of network connection records. In this paper we try to discover the features of intrusion detection based on data mining techniques.
AN IMPLEMENTATION OF INTRUSION DETECTION SYSTEM USING GENETIC ALGORITHMIJNSA Journal
Nowadays it is very important to maintain a high level of security to ensure safe and trusted communication of information between various organizations. But secured data communication over the Internet and any other network is always under threat of intrusions and misuse. So Intrusion Detection Systems have become a necessary component of computer and network security. There are various approaches being utilized in intrusion detection, but unfortunately none of the systems so far is completely flawless. So, the quest for betterment continues. In this progression, here we present an Intrusion Detection System (IDS) that applies a genetic algorithm (GA) to efficiently detect various types of network intrusions. Parameters and evolution processes for the GA are discussed in detail and implemented. This approach applies evolution theory to information evolution in order to filter the traffic data and thus reduce the complexity. To implement and measure the performance of our system we used the KDD99 benchmark dataset and obtained a reasonable detection rate.
COPYRIGHTThis thesis is copyright materials protected under the .docxvoversbyobersby
COPYRIGHT
This thesis is copyright material protected under the Berne Convention, the Copyright Act 1999 and other international and national enactments in that behalf, on intellectual property. It may not be reproduced by any means in full or in part except for short extracts in fair dealing for research or private study, critical scholarly review or discourse with acknowledgment, with written permission of the Dean School of Graduate Studies on behalf of both the author and XXX XXX University.
ABSTRACT
With the fast-growing Internet, the risk of intrusion has also increased; as a result, the Intrusion Detection System (IDS) is a popular key research field. IDSs are used to identify any suspicious activity or patterns in the network or machine that attempt to circumvent the security features or compromise the machine. IDSs mostly use all the features of the data. It is a keen observation that not all features are of equal relevance for the detection of attacks. Moreover, not every feature contributes significantly to enhancing system performance. The main aim of the work is to develop an efficient denial of service network intrusion classification model. The specific objectives included: to analyse existing literature on intrusion detection systems (the techniques used to model IDS, types of network attacks, performance of various machine learning tools, and how network intrusion detection systems are assessed); to find the top network traffic attributes that can be used to model denial of service intrusion detection; and to develop a machine learning model for detection of denial of service network intrusion. Methods: The research design was experimental and data was collected by simulation using the NSL-KDD dataset. By implementing the Correlation Feature Selection (CFS) mechanism using three search algorithms, a smallest set of features is selected with all the features that are selected very frequently. Findings: The smallest subset of features chosen is the most nominal among all the feature subsets found. Further, the performance of Artificial Neural Network (ANN), decision tree, Support Vector Machine (SVM) and K-Nearest Neighbour (KNN) classifiers is compared for 7 subsets found by the filter model and 41 attributes. Results: The outcome indicates a remarkable improvement in the performance metrics used for comparison of the two classifiers.
The results show that using 17/18 selected features improves DOS types classification accuracies as compared to using the 41 features in the NSL-KDD dataset. It was further observed that using an ensemble of three classifiers with decision fusion performs better as compared to using a single classifier for DOS type’s classification. Among machine learning tools experimented, ANN achieved best classification accuracies followed by SVM and DT. KNN registered the lowest classification accuracies. Application: The proposed work with such an improved detection rate and lesser classification time and lar.
CLASSIFICATION PROCEDURES FOR INTRUSION DETECTION BASED ON KDD CUP 99 DATA SETIJNSA Journal
In a network security framework, intrusion detection is a benchmark component and a fundamental way to protect PCs from many threats. A major issue in intrusion detection is the large number of false alerts; this issue has motivated several experts to seek ways of minimizing false alerts using data mining, which is regarded as an analysis procedure for large data sets such as KDD CUP 99. This paper reviews various data mining classification procedures for handling false alerts in intrusion detection. Tests of many data mining procedures on KDD CUP 99 show that no individual procedure can reveal all attack classes with high accuracy and without false alerts. The best accuracy, 92%, is obtained with the Multilayer Perceptron; however, the best training time, 4 seconds, is obtained with the rule-based model. It is concluded that various procedures should be used to handle different kinds of network attacks.
ATTACK DETECTION AVAILING FEATURE DISCRETION USING RANDOM FOREST CLASSIFIERCSEIJJournal
The widespread use of the Internet has the adverse effect of making systems vulnerable to cyber attacks. Defensive mechanisms like firewalls and IDSs have evolved with a lot of research contributions happening in these areas. Machine learning techniques have been successfully used in these defense mechanisms, especially IDSs. Although they are effective to some extent in identifying new patterns and variants of existing malicious patterns, many attacks are still left undetected. The objective is to develop an algorithm for detecting malicious domains based on passive traffic measurements. In this paper, an anomaly-based intrusion detection system based on an ensemble-based machine learning classifier called Random Forest with gradient boosting is deployed. The NSL-KDD cup dataset is used for analysis and, out of 41 features, 32 features were identified as significant using feature discretion. Our observations confirm the conjecture that both feature selection and stochastic-based genetic operators improve accuracy and effectiveness. The training time is shown to be reduced by 98.59% and accuracy improved to 98.75%.
Machine learning in network security using knime analyticsIJNSA Journal
Machine learning has more and more effect on our everyday life. This field keeps growing and expanding into new areas. Machine learning is based on the implementation of artificial intelligence that gives systems the capability to automatically learn and improve from experience without being explicitly programmed. Machine learning algorithms apply mathematical equations to analyze datasets and predict values based on the dataset. In the field of cybersecurity, machine learning algorithms can be utilized to train and analyze Intrusion Detection Systems (IDSs) on security-related datasets. In this paper, we tested different machine learning algorithms to analyze the NSL-KDD dataset using KNIME analytics.
Articles - International Journal of Network Security & Its Applications (IJNSA)IJNSA Journal
International Journal of Network Security & Its Applications (IJNSA) is a bi monthly open access peer-reviewed journal that publishes articles which contribute new results in all areas of the computer Network Security & its applications. The journal focuses on all technical and practical aspects of security and its applications for wired and wireless networks. The goal of this journal is to bring together researchers and practitioners from academia and industry to focus on understanding Modern security threats and countermeasures, and establishing new collaborations in these areas.
A Survey: Comparative Analysis of Classifier Algorithms for DOS Attack Detectionijsrd.com
In today's interconnected world, one pervasive issue is how to protect systems from intrusion-based security attacks. Detecting intrusion attacks is important for the security of network communication. Denial of Service (DoS) attacks are evolving continuously. These attacks make network resources unavailable to legitimate users, which results in massive loss of data, resources and money. The significance of Intrusion Detection Systems (IDSs) in computer network security is well proven. IDSs have become an efficient defense tool against network attacks since they allow network administrators to detect policy violations. Data mining approaches can play a very important role in developing intrusion detection systems. Classification is identified as an important technique of data mining. This paper evaluates the performance of well-known classification algorithms for attack classification. The key idea is to use data mining techniques efficiently for intrusion attack classification. To implement and measure the performance of our system we used the KDD99 benchmark dataset and obtained a reasonable detection rate.
NETWORK INTRUSION DETECTION AND COUNTERMEASURE SELECTION IN VIRTUAL NETWORK (...ijsptm
Intrusion in a network or a system is a problem today as the trend of successful network attacks continues to rise. Intruders can exploit vulnerabilities of a network system to gain access in order to deploy a virus or malware, or to launch an attack such as Denial of Service (DOS). In this work, a frequency-based Intrusion Detection System (IDS) is proposed to detect DOS attacks. The frequency data is extracted from the time-series data created by the traffic flow using the Discrete Fourier Transform (DFT). An algorithm is developed for anomaly-based intrusion detection with fewer false alarms, which further detects known and unknown attack signatures in a network. The frequency of the traffic data of the virus or malware would be inconsistent with the frequency of the legitimate traffic data. A Centralized Traffic Analyzer Intrusion Detection System called CTA-IDS is introduced to further detect inside attackers in a network. The strategy is effective in detecting abnormal content in the traffic data during information passing from one node to another and also detects known attack signatures and unknown attacks. This approach is tested by running artificial network intrusion data in simulated networks using the Network Simulator2 (NS2) software.
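The idea of extracting frequency data from a traffic time series can be sketched as follows. This is only an illustration under stated assumptions: the packet counts are synthetic, the function names are hypothetical, and comparing dominant frequency bins is a stand-in for whatever decision rule CTA-IDS actually uses, which the abstract does not specify. A naive DFT is used to keep the example dependency-free.

```python
import cmath
import math


def dft_magnitudes(series):
    """Magnitudes of DFT bins 0..n/2 of a real series (naive O(n^2) transform)."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # remove the constant (DC) component
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(centered)))
        for k in range(n // 2 + 1)
    ]


def dominant_bin(series):
    """Index of the strongest non-DC frequency bin."""
    mags = dft_magnitudes(series)
    return max(range(1, len(mags)), key=mags.__getitem__)


# Synthetic packets-per-interval counts (assumed data, 32 sampling intervals).
legitimate = [100 + 20 * math.sin(2 * math.pi * t / 8) for t in range(32)]
flood = [100 + (200 if t % 2 == 0 else 0) for t in range(32)]

baseline = dominant_bin(legitimate)  # load cycle of 8 intervals -> bin 32/8 = 4
observed = dominant_bin(flood)       # burst every 2 intervals -> bin 32/2 = 16
is_suspicious = observed != baseline
print(baseline, observed, is_suspicious)  # 4 16 True
```

For real traces, `numpy.fft.rfft` would normally replace the quadratic-time loop. A practical detector would also compare whole spectra against a tolerance, since exact bin equality is too strict for noisy traffic.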
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
DETECTING NETWORK ANOMALIES USING CUSUM and FCMEditor IJMTER
Network intrusion detection techniques are important for protecting our systems and networks from malicious behavior. However, traditional network intrusion prevention measures such as firewalls, user authentication and data encryption have failed to completely protect networks and systems from increasingly sophisticated attacks and malware. Two anomaly detection techniques, CUSUM and clustering, are used to find network anomalies. CUSUM detects changes based on the cumulative effect of the changes made in the random sequence instead of using a single threshold to check every variable. It involves calculating a cumulative sum and determining whether a packet is normal or not. The FCM algorithm employs fuzzy partitioning such that a data point can belong to all groups with different membership grades. Together, CUSUM and FCM become a good technique for detecting network anomalies with a very low false alarm rate.
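A minimal sketch of the cumulative-sum idea follows. It shows a one-sided (upper) CUSUM over a sequence of measurements. The parameter names `target`, `slack` and `threshold`, the reset-after-alarm policy, and the sample values are assumptions for illustration and are not taken from the paper.

```python
def cusum_alarms(values, target, slack, threshold):
    """One-sided upper CUSUM: indices where the cumulative statistic exceeds threshold.

    target    -- expected level of the measurements under normal conditions
    slack     -- allowance subtracted each step so small fluctuations decay to zero
    threshold -- decision limit on the accumulated deviation
    """
    statistic = 0.0
    alarms = []
    for index, value in enumerate(values):
        # Accumulate deviations above target + slack; never drop below zero.
        statistic = max(0.0, statistic + (value - target - slack))
        if statistic > threshold:
            alarms.append(index)
            statistic = 0.0  # assumed policy: restart accumulation after an alarm
    return alarms


# Hypothetical per-interval packet rates with a sustained moderate rise.
rates = [10, 11, 9, 10, 14, 15, 16, 15, 10]
alarms = cusum_alarms(rates, target=10, slack=1, threshold=8)
print(alarms)  # [6]
```

No single value in `rates` is extreme, so a per-sample threshold of, say, 20 would stay silent. The alarm at index 6 comes from the accumulated deviations 3 + 4 + 5 = 12 over three consecutive intervals, which is the cumulative effect the abstract refers to.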
MULTI-LAYER CLASSIFIER FOR MINIMIZING FALSE INTRUSIONIJNSA Journal
Intrusion detection is one of the standard stages of protecting computers from several attacks in a network security framework. The false alarm problem is critical in intrusion detection, which motivates many researchers to discover methods to minimize false alarms. This paper proposes a procedure for classifying the type of intrusion using multiple operations and a multi-layer classifier for handling false alarms in intrusion detection. The proposed system is tested on the KDDcup99 benchmark. The performance showed that results obtained from three consecutive classifiers are better than those of a single classifier. The accuracy reached 98% based on 25 features instead of using all features of the KDDCup99 dataset.
FORTIFICATION OF HYBRID INTRUSION DETECTION SYSTEM USING VARIANTS OF NEURAL ...IJNSA Journal
Intrusion Detection Systems (IDS) form a key part of system defence, where they identify abnormal activities happening in a computer system. In recent years different soft computing based techniques have been proposed for the development of IDS. On the other hand, intrusion detection is not yet a perfect technology. This has provided an opportunity for data mining to make quite a lot of important contributions in the field of intrusion detection. In this paper we propose a new hybrid technique that uses data mining techniques such as fuzzy C-means clustering, fuzzy neural network / neuro-fuzzy and radial basis function (RBF) SVM for fortification of the intrusion detection system. The proposed technique has five major steps: the first step is to perform the relevance analysis, and then the input data is clustered using fuzzy C-means clustering. After that, the neuro-fuzzy classifiers are trained, such that each data point is used to train the neuro-fuzzy classifier associated with its cluster. Subsequently, a vector for SVM classification is formed and, in the last step, classification using RBF-SVM is performed to detect whether an intrusion has happened or not. The data set used is the KDD cup 1999 dataset and we have used precision, recall, F-measure and accuracy as the evaluation metrics. Our technique could achieve better accuracy for all types of intrusions. The results of the proposed technique are compared with other existing techniques. These comparisons proved the effectiveness of our technique.
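The fuzzy C-means clustering step named above assigns each data point a graded membership in every cluster rather than a single label. The sketch below shows only the standard membership update for fixed cluster centers. The centers, the sample point and the fuzzifier value `m = 2` are assumptions for illustration, and the paper's full pipeline (relevance analysis, neuro-fuzzy training, RBF-SVM) is not reproduced.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Membership grade of one point in each cluster, for fixed cluster centers.

    Uses the standard fuzzy C-means update
        u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)),
    where d_i is the Euclidean distance from the point to center i and m > 1
    is the fuzzifier.
    """
    distances = [math.dist(point, center) for center in centers]
    if 0.0 in distances:
        # The point sits exactly on one or more centers: share full membership there.
        hits = distances.count(0.0)
        return [1.0 / hits if d == 0.0 else 0.0 for d in distances]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in distances) for d_i in distances]


# Hypothetical two-cluster example: the point is three times closer to the first center.
memberships = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (4.0, 0.0)])
print([round(u, 3) for u in memberships])  # [0.9, 0.1]
```

The grades always sum to 1, so a record lying between a "normal" cluster and an "attack" cluster contributes to both. A complete FCM run would alternate this update with recomputing each center as the membership-weighted mean of the data until the memberships stop changing.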
Current issues - International Journal of Network Security & Its Applications (IJNSA)
1. International Journal of Network Security & Its Applications (IJNSA) Vol. 12, No.1, January 2020
DOI: 10.5121/ijnsa.2020.12101
A SURVEY ON THE USE OF DATA CLUSTERING FOR INTRUSION DETECTION SYSTEM IN CYBERSECURITY
Binita Bohara1, Jay Bhuyan2, Fan Wu3, and Junhua Ding4
1,2,3 Dept. of Computer Science, Tuskegee University, Tuskegee, AL, USA
4 Dept. of Information Science, University of North Texas, Texas, USA
ABSTRACT
In the present world, it is difficult to realize any computing application working on a standalone computing
device without connecting it to the network. A large amount of data is transferred over the network from
one device to another. As networking is expanding, security is becoming a major concern. Therefore, it
has become important to maintain a high level of security to ensure that a safe and secure connection is
established among the devices. An intrusion detection system (IDS) is therefore used to differentiate
between the legitimate and illegitimate activities on the system. Different techniques are used for
detecting intrusions in an intrusion detection system. This paper presents the different clustering
techniques that have been implemented by different researchers in their relevant articles. This survey was
carried out on 30 papers and it presents what different datasets were used by different researchers and
what evaluation metrics were used to evaluate the performance of IDS. This paper also highlights the pros
and cons of each clustering technique used for IDS, which can be used as a basis for future work.
KEYWORDS
Intrusion detection system, clustering technique, network security
1. INTRODUCTION
Due to the rapid growth in computer network usage, network security is becoming an important
issue. With the evolution of new technology, there is a rapid increase in incidents of hacking
and intrusion [1]. There is no doubt that all computers suffer from security vulnerabilities
that are both technically challenging and costly for manufacturers to fix. Any malicious activity
that exploits vulnerabilities in networks or computers can seriously affect the system and breach
its confidentiality, availability, and integrity. Therefore, the intrusion detection system has
become an integral part of network security architectures.
An intrusion detection system has the ability to locate and identify malicious or anomalous
activities in the computer network by examining the network traffic in real time [2]. Intrusion
detection systems are classified into two groups: misuse and anomaly detection systems. Misuse
detection is usually used for commercial purposes because of its predictability and high accuracy,
while anomaly detection is favored in research studies. Intrusion detection systems can also be
classified as host-based or network-based, depending on where detection takes place. Many data
mining techniques are available for detecting network intrusions.
Data mining is becoming an increasingly popular technique in the network security environment
for finding regularities and irregularities in datasets. Data mining can be defined as the process of
using efficient techniques to extract useful and unexpected patterns from huge datasets [3].
Different supervised techniques have been used to detect intrusions. However, this approach
depends on labeled data and requires the system to be trained on known data. The problem
with this type of technique is its high dependency on training data and its inability to detect
new types of attacks. To overcome this limitation of supervised techniques, an unsupervised
approach, which can work with unlabeled data, can be used for intrusion detection. Clustering is
one of the commonly used unsupervised techniques for grouping huge datasets and detecting
intrusions. Clustering is a data mining technique that is used to draw inferences from
unlabeled data and find hidden patterns in datasets [3]. In other words, clustering groups
similar data into one cluster and dissimilar data into different clusters.
Figure 1: A schematic of clustering process showing three clusters (one red triangle, one black circle, and
one pink plus cluster)
The clustering approach is based on two assumptions [4]. The first assumption is that the number
of normal connections is larger than the number of abnormal connections, and the second is that
the features of abnormal connections differ from those of normal connections. Based on these
assumptions, many clustering techniques can be used for detecting intrusions and attacks, such
as hierarchical, partitional, grid-based, and density-based clustering techniques.
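The first assumption suggests a simple way to turn clusters into labels: large clusters are treated as normal traffic and small ones as suspicious. The following is a minimal sketch of that idea, not a method taken from any of the surveyed papers. The function name and the `normal_fraction` threshold are hypothetical and chosen only for illustration.

```python
from collections import Counter


def label_clusters_by_size(assignments, normal_fraction=0.1):
    """Label each cluster 'normal' or 'anomalous' from its share of the data.

    assignments: list of cluster ids, one per connection record.
    normal_fraction: hypothetical threshold; a cluster holding at least this
    share of all records is treated as normal traffic.
    """
    counts = Counter(assignments)
    total = len(assignments)
    return {
        cluster: "normal" if size / total >= normal_fraction else "anomalous"
        for cluster, size in counts.items()
    }


# Toy example: cluster 0 dominates, clusters 1 and 2 are small.
assignments = [0] * 95 + [1] * 3 + [2] * 2
labels = label_clusters_by_size(assignments)
print(labels)
```

In practice the threshold would have to be tuned per dataset, and the second assumption (abnormal features differ from normal ones) is what makes the small clusters meaningful in the first place.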
Hierarchical clustering constructs a hierarchy of clusters and groups objects based on their
distance. This technique relies on a cluster proximity measure, of which there are three:
single-linkage, average-linkage, and complete-linkage. Partitional clustering splits the data
points into a number of separate partitions, where each partition is known as a cluster. K-means
is one of the simplest and most efficient partitional clustering algorithms used for detecting
intrusions in a computer system. Meng et al. [5] used the k-means algorithm on the KDD99 dataset
to detect unknown attacks in different settings. Density-based clustering algorithms cluster data
points based on the density of the points in a region. DBSCAN is one of the best-known
density-based algorithms. Yang Jian et al. [6] proposed an improved intrusion detection method
based on DBSCAN that generates clusters using the density-based approach.
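The three proximity measures named above differ only in how they summarize the pairwise distances between two clusters. A small sketch, assuming Euclidean distance and clusters given as lists of numeric tuples (both are assumptions for illustration, not details from the surveyed papers):

```python
import math
from itertools import product


def pairwise_distances(cluster_a, cluster_b):
    """All Euclidean distances between a point of cluster_a and a point of cluster_b."""
    return [math.dist(p, q) for p, q in product(cluster_a, cluster_b)]


def single_linkage(cluster_a, cluster_b):
    # Distance between the two closest points.
    return min(pairwise_distances(cluster_a, cluster_b))


def complete_linkage(cluster_a, cluster_b):
    # Distance between the two farthest points.
    return max(pairwise_distances(cluster_a, cluster_b))


def average_linkage(cluster_a, cluster_b):
    # Mean of all pairwise distances.
    distances = pairwise_distances(cluster_a, cluster_b)
    return sum(distances) / len(distances)


# Hypothetical toy clusters.
a = [(0.0, 0.0), (0.0, 1.0)]
b = [(3.0, 0.0), (4.0, 0.0)]
print(single_linkage(a, b), average_linkage(a, b), complete_linkage(a, b))
```

An agglomerative hierarchical algorithm repeatedly merges the two clusters with the smallest value of the chosen measure, so the choice of linkage directly shapes the resulting hierarchy.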
This paper presents a literature review of clustering techniques applied in the intrusion detection
system literature. Articles that used clustering techniques for intrusion detection were carefully
reviewed, and the types of attack that different clustering techniques can detect were analyzed. The
different clustering techniques were compared in terms of the evaluation metrics used, the
datasets used, and their strengths and weaknesses.
The remainder of the paper is organized as follows. Section 2 describes the datasets used in
intrusion detection research. Section 3 defines the terms used in this paper and in the
clustering technique literature. Section 4 reviews related work on clustering techniques used for
intrusion detection. Section 5 lists the research questions addressed in this research. Section 6
presents the results and discussion. Section 7 concludes the paper.
2. DESCRIPTION OF DATASET
The KDDCup99 dataset, which is publicly available, is the most commonly used dataset for
evaluating intrusion detection systems [7]. This dataset was used in the KDDCup99
competition and is based on the DARPA98 IDS evaluation dataset [8]. It comprises 4 GB of
compressed TCP dump data from nearly 7 weeks of network traffic. The KDDCup99 dataset
is divided into training and test sets, where the training set consists of nearly 5 million
records and the test set consists of 3 million records [9]. Each record has 41 features and
one of two classes: normal or attack. There are 24 different types of attacks in this dataset,
which belong to four major categories:
Denial of Service (DOS): This is an attack where attackers make the system too busy for a
legitimate user to be able to use the system or resources.
User to Root Attack (U2R): U2R starts with the attacker getting access to a normal user account
on a system and exploits its vulnerability so as to get root access to the system.
Remote to Local Attack (R2L): R2L is an attack where an attacker sends packets to the system
over the network and exploits its vulnerability so as to gain access to the system as a normal user.
Probing Attack: This is an attack where an attacker scans the system or network to collect
information so as to identify the potential vulnerability of the system.
Table 1: Different types of attack in the KDDCup99 dataset [1]
DOS        Probe       R2L            U2R
Back       Nmap        Spy            Buffer overflow
Land       Portsweep   Phf            Rootkit
Neptune    Ipsweep     Multihop       Loadmodule
Pod        Satan       Ftp write      Perl
Smurf      -           Imap           -
Teardrop   -           Warezmaster    -
-          -           Guess passwd   -
A newer, publicly available version of the KDD99 dataset is NSL-KDD, which was proposed to
solve the inherent problems of KDD-99. It contains only selected records from the KDD dataset.
This new dataset overcame shortcomings of the KDD dataset: there are no redundant records in
the training set, which removes the bias of learning algorithms towards frequent records.
Similarly, duplicate records were also removed from the test set.
NSL-KDD consists of 21 types of attacks in the training set and 37 different types of attacks
in the test set. The attacks are classified into four different types: DOS, Probe, U2R, and R2L
[10]. The DARPA 98 dataset is another popular dataset used in intrusion detection research
to evaluate detection rate and false alarm rate in network traffic. This dataset covers four
major attack types: DOS, Probing, U2R, and R2L. However, it has faced a lot of
criticism because the model that was used to generate the traffic was too simple [10].
Neither KDD-99 nor NSL-KDD reflects real data flow in a computer network, as they were
generated on simulated virtual networks. In contrast, the Kyoto 2006+ dataset consists of real
traffic data collected over 3 years, from November 2006 to August 2009. This dataset comprises
14 statistical features derived from KDD along with 10 additional features for the analysis and
evaluation of network intrusion detection systems. The data were captured using honeypots,
darknet sensors, an email server, and a web crawler [10].
Another dataset used in the literature reviewed in this research is the ISCX-2012 intrusion
evaluation dataset. This dataset consists of 1512000 packets with 20 features and was obtained
by observing network traffic for seven days. The training set contained 75372 normal traces and
2154 attack traces, while the testing set contained 19202 normal traces and 37159 attack
traces [11].
3. RELATED TERMINOLOGY
• Detection Rate (DR): The number of attacks detected by the system divided by the total
number of intrusions in the dataset. DR can be calculated as:
DR = True Positive (TP) / (Total number of intrusions in dataset) [12].
• False Alarm Rate (FAR): The number of normal instances classified as attacks divided by
the number of normal instances in the dataset. FAR can be calculated as:
FAR = False Positive (FP) / (FP + True Negative (TN)) [12].
• False Negative Rate (FNR): The proportion of intrusions that were wrongly identified as
normal [12].
• True Positive Rate (TPR): The proportion of malicious network activity that the IDS
correctly identifies as an attack [12].
• True Negative Rate (TNR): The proportion of normal instances that are correctly detected
as normal [12].
• Accuracy: The percentage of correctly classified instances in the dataset. Accuracy can be
calculated as: Accuracy = (TP + TN) / (TP + TN + FP + False Negative (FN)) [12].
• Sum of Square Error (SSE): The sum of the squared differences between each data point and
its cluster mean.
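The definitions above can be written out as a short sketch. It assumes that the total number of intrusions equals TP + FN and that distances are Euclidean. The function names and the toy counts are hypothetical and are not results from any surveyed paper.

```python
import math


def ids_metrics(tp, tn, fp, fn):
    """Detection rate, false alarm rate and accuracy from confusion-matrix counts."""
    return {
        "DR": tp / (tp + fn),            # attacks detected / all intrusions
        "FAR": fp / (fp + tn),           # normal flagged as attack / all normal
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


def sum_of_square_error(points, assignments, centroids):
    """SSE: squared distance from each point to the mean of its assigned cluster."""
    return sum(
        math.dist(point, centroids[cluster]) ** 2
        for point, cluster in zip(points, assignments)
    )


# Hypothetical counts: 100 attacks (90 detected) and 100 normal records (5 flagged).
metrics = ids_metrics(tp=90, tn=95, fp=5, fn=10)
print(metrics)

# Hypothetical 2-cluster example: two points around (1, 0), one point at (10, 0).
sse = sum_of_square_error(
    points=[(0.0, 0.0), (2.0, 0.0), (10.0, 0.0)],
    assignments=[0, 0, 1],
    centroids=[(1.0, 0.0), (10.0, 0.0)],
)
print(sse)
```

Note that DR and FAR have to be read together: a detector that flags everything reaches a DR of 1 but also a FAR of 1.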
4. RELATED WORK
The intrusion detection system plays an important role in detecting any malicious activities on a
computer network. Various clustering techniques have been designed and implemented for
detecting the intrusions.
Leung et al. [13] obtained a high detection rate, at the cost of a high false positive
rate, using fpMAFIA. The performance value of fpMAFIA was 0.867, which was around 3%-12%
worse than other algorithms. However, this density and grid-based clustering had the
disadvantage of not evaluating each and every point, so the anomalies were clustered into a
set of small clusters, which made their identification more straightforward.
Leonid Portnoy [14] presented an algorithm that could automatically detect both known and
unknown intrusions. The author used a single-linkage hierarchical clustering method to distinguish
abnormal activities from normal activities. In this algorithm, a set of empty clusters was first
formed, and the distance from each instance to the cluster centers was computed. The shortest
distance was then compared with the predefined cluster width (W); if that distance was shorter
than W, the record was assigned to that cluster. Otherwise, a new cluster was formed, to which
other instances could then be assigned. The cluster centers were then updated to the mean of
their members. This cycle of updating and reassignment was repeated until the cluster centers
no longer changed. With this algorithm, the average detection rate was around 44-55% and the
false positive rate was 1.3-2.3. Even though this algorithm overcame the shortcoming of K-means
clustering, which needs the number of clusters to be specified in advance, it still had some
limitations. One limitation was the cluster width, which has to be determined manually.
Therefore, normal instances could be mislabeled as abnormal, and vice versa, if W was not
chosen properly.
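The width-based assignment described above can be sketched as follows. This is a simplified single-pass illustration under stated assumptions, not the author's implementation: it uses Euclidean distance, updates each center incrementally as the running mean of its members, and omits the repeated reassignment until convergence. The function name and the toy data are hypothetical.

```python
import math


def fixed_width_clustering(points, width):
    """Single-pass fixed-width clustering sketch.

    points: list of equal-length numeric tuples (hypothetical feature vectors).
    width:  the cluster width W, which has to be chosen by hand.
    Returns (centers, members), where members[j] lists the indices of the
    points assigned to cluster j.
    """
    centers = []
    members = []
    for index, point in enumerate(points):
        # Find the closest existing cluster center, if there is one.
        best, best_dist = None, float("inf")
        for j, center in enumerate(centers):
            d = math.dist(point, center)
            if d < best_dist:
                best, best_dist = j, d
        if best is not None and best_dist < width:
            # Close enough: join the cluster and move its center to the new mean.
            members[best].append(index)
            n = len(members[best])
            centers[best] = tuple(
                c + (x - c) / n for c, x in zip(centers[best], point)
            )
        else:
            # Too far from every center: start a new cluster at this point.
            centers.append(tuple(point))
            members.append([index])
    return centers, members


points = [(0.0, 0.0), (0.5, 0.0), (10.0, 10.0), (0.2, 0.1), (10.3, 9.8)]
centers, members = fixed_width_clustering(points, width=2.0)
print(members)
```

The sensitivity to W is easy to see with this sketch: a very small width puts every point in its own cluster, while a very large width merges everything into one.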
Meng et al. [5] used the K-means clustering algorithm to detect unknown intrusions in a
computer network. The algorithm was applied to the KDD Cup 1999 dataset, with the number of
clusters set to 5. Across different settings, the DR with this method was always above 96%,
the FAR was below 4%, and the time complexity was low. This showed that the K-means algorithm
was an effective method for intrusion detection. However, with the K-means algorithm, the
number of clusters needs to be defined correctly in order to detect intrusions correctly.
In addition, it is sensitive to categorical attributes, and its results are usually not
stable. In other words, the K-means clustering result can vary, even with the same input
parameters. Similarly, Gerhard et al. [15] also applied a K-means clustering technique for
intrusion detection. The technique was used to classify normal and abnormal network traffic
flows, using cluster centroids as patterns to detect intrusions.
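The instability mentioned above comes from the random choice of initial centroids. The sketch below is a plain implementation of the standard K-means iteration (Lloyd's algorithm), not code from the surveyed papers. It runs the algorithm with several seeds and keeps the run with the lowest SSE, which is one common way to reduce that instability. The toy points, the seed values, and the function name are assumptions for illustration.

```python
import math
import random


def kmeans(points, k, seed, iterations=100):
    """Basic K-means; returns (assignments, centroids, sse)."""
    rng = random.Random(seed)
    # Initial centroids are k distinct data points picked at random.
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            cluster = [p for p, a in zip(points, assignments) if a == j]
            if cluster:
                centroids[j] = tuple(sum(c) / len(cluster) for c in zip(*cluster))
    sse = sum(math.dist(p, centroids[a]) ** 2 for p, a in zip(points, assignments))
    return assignments, centroids, sse


# Hypothetical 2-D toy data standing in for connection features.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
    (5.0, 5.0),
]
runs = [kmeans(points, k=2, seed=s) for s in range(5)]
best_assignments, best_centroids, best_sse = min(runs, key=lambda r: r[2])
print(best_assignments, best_sse)
```

Categorical attributes, which the text also flags as a weakness, would need to be encoded numerically before a distance-based method like this can be applied.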
Guan et al. [16] used the Y-means clustering algorithm to form a number of ‘normal’ and ‘abnormal’
clusters. This algorithm is similar to K-means: the dataset was first partitioned into k
clusters, with k ranging from 1 to n (the total number of instances). The second step was to
detect empty clusters, where new clusters were created to replace empty clusters, followed by
re-assigning instances to existing centers. This process was iterated until there were no empty
clusters, which eventually led to the removal of outliers (splitting) to form new clusters
(merging). The last step in this process was to label the clusters based on their population:
if the population of a cluster was higher than a given threshold of 2.32σ (σ is the standard
deviation of the data), the cluster was classified as normal; otherwise, it was labeled as an
intrusion. The H-means+ algorithm was used for clustering the KDD-99 dataset with different
initial values of k, and showed that the decline of SSE (Sum of Square Errors) was fast when the
value of k was very small. After the turning point of k (k = 20, DR = 79% and FAR = 1%), the
decline of SSE was very slow and the dataset was partitioned into small clusters that were close
to each other, so there was no further decrease in the value of SSE. On average, this method
detected 86.63% of intrusions with a false alarm rate of 1.53%. The Y-means algorithm was also
trained on 12,000 unlabeled KDD-99 instances. A test of the trained system on 10,000 labeled
instances gave 82.32% DR and 2.5% FAR, indicating that Y-means is one of the more promising
clustering methods for intrusion detection.
Zhou et al. [4] proposed a graph-based clustering algorithm for intrusion detection to
differentiate between regular and irregular connections. The graph-based algorithm can identify
clusters of any shape; it uses only one parameter and does not require the number of clusters
to be defined. This algorithm used an outlier detection method based on the local deviation
coefficient, with different values of k (5, 8, 11, 15). Although DR and FPR did not differ much
across values of k, the results showed that DR was highest (95.3%) and FPR was lowest (2.08%)
for k equal to 8.
Wei Jiang et al. [17] used an improved fuzzy clustering algorithm for intrusion detection. The
results showed a much higher detection rate with a lower false-positive rate on the KDD-99
dataset. In this procedure, the KDD-99 dataset was randomly divided into 5 groups with 10
thousand records in each group. The average detection rate was 78.66% and the false positive
rate was 0.704. The proposed method overcomes the disadvantage of fuzzy clustering by adding a
weighted value to each data object’s membership degree, and optimizes the number of clusters by
introducing a validity function.
M. Jianliang et al. [5] used a K-means clustering algorithm for intrusion detection to detect
unknown attacks. The algorithm was applied to the KDD-99 dataset, with the number of clusters
(k) set to 5. The experiment was carried out in different settings, and in every setting the
detection rate was greater than 96% and the false alarm rate was below 4%. Even though the
K-means clustering technique can detect unseen attacks and partition large datasets, it has the
disadvantages of degeneracy and cluster dependency.
Li Xue-Yong and Gao Guo [18] proposed an improved density-based clustering algorithm, IIDBC,
for intrusion detection. It addresses the drawbacks of the density-based clustering algorithm
(DBSCAN) by using a rational method for merging clusters and calculating distance. The results
showed that IIDBC improved the performance of DBSCAN, increased the detection rate, and
decreased the false-negative rate. The average detection rate was around 92.33%.
Z. Muda et al. [19] proposed a new intrusion detection algorithm that combines clustering and
classification techniques. This hybrid learning approach is based on a combination of K-means
clustering and OneR classification, and was evaluated on the KDDCup 99 dataset. The main
objective of this algorithm was to separate potential attacks from normal instances into
different clusters. This hybrid intrusion detection algorithm had accuracy and detection rate
above 99% and a false alarm rate below 2.75%. The performance of the hybrid classifier was
higher than that of a single classifier. This algorithm was capable of classifying most of
the instances correctly; however, it could not classify U2R and R2L attacks.
Chandrashekhar et al. [20] proposed a new approach based on K-means, fuzzy neural networks,
and SVM classifiers as an intrusion detection technique. The experiments used the KDDCup99
dataset. The proposed technique achieved an accuracy of 97.78% and was effective for
low-frequency attacks such as U2R and R2L.
Ravi Ranjan and G. Sahoo [12] presented a new clustering technique for detecting anomalous
intrusions and attacks. They used the K-medoids method for clustering, and the proposed
algorithm achieved a high detection rate and overcame defects of the K-means algorithm, such as
its dependency on initial centroids and on the number of clusters, and the creation of
irrelevant clusters. The detection rate for this proposed approach was 91.2% and the false
alarm rate was 3.2%. However, the detection rate of the proposed algorithm was low for probing
attacks (70.51%) and user to root attacks (70.13%).
Chitrakar et al. [21] used a hybrid approach for anomaly-based intrusion detection by combining
Naïve Bayes classification and k-Medoids clustering. This hybrid approach showed better
performance than the hybrid of k-means and Naïve Bayes. The algorithm was evaluated on the
Kyoto 2006+ dataset; it improved accuracy and detection rate by 4% and reduced the false alarm
rate by 1%.
Zhiengje et al. [22] combined particle swarm optimization with the K-means clustering method
for intrusion detection on the KDD Cup 1999 dataset. The best performance of this method was a
false detection rate of 2.8% with a detection rate of 82%. The intrusions detected by this
technique can be broadly categorized as Probing, DoS, U2R, and R2L. This method detected up to
75.82% of known attacks and 60.8% of unknown attacks. It had a relatively low detection rate
for DoS, which was due to the mislabeling of abnormal data as normal.
Similarly, Lizhong et al. [23] also used K-means clustering with particle swarm optimization for
anomaly intrusion detection. The algorithm was applied to the KDD-99 dataset, and the
experimental results for PSO-KM showed a detection rate of 86% and a false positive rate of
2.8%. The experiment showed that the accuracy was very good for attack types U2R (78%) and DOS
(94%). However, the accuracy was very low for R2L, at 22%.
Fatma et al. [24] used a two-stage detection technique in order to improve the detection rate on
the DARPA dataset. The first stage used a self-organizing map (SOM) with the K-means algorithm
and neural gas with the fuzzy c-means algorithm. The second stage included SOM with the K-means
algorithm, a support vector machine, and decision trees. The results showed that for the first
stage, 69% of the alerts were false attacks and only 31% were real threats, and for the second
stage, 88.77% were false alerts and only 21.23% were real threats. In the first stage, the
neural gas technique provided better results, since the main objective of the first stage was
to cluster low-level alerts into meaningful partitions. In the second stage, the SVM algorithm
provided good results in reducing the rate of false attacks.
Zhong et al. [25] used Kyoto 2006+ and KDD Cup 1999 dataset to evaluate the intrusion
detection by grid-based clustering technique. The results showed that the performance was
insensitive to the variation of the convergence criterion of clustering, attack, and normal
condition in labeling. With the variation on stop condition of cell split, the FPR was 3.25 and DR
was 58.51% for 0.001 value of and FPR and DR value were 5.14% and 62.29%, respectively for
equivalent to 500.
Horng et al. [26] used the KDD Cup 1999 training set for detecting different attacks, which were
broadly classified as DoS, U2R, R2L, and Probe. An SVM-based intrusion detection system was
combined with a hierarchical clustering algorithm, and it achieved an accuracy of 95.72 with
a false-positive rate of only 0.73. The detection rates for DoS and Probe were 99.5 and 97.5,
respectively, which were higher than those of ESC-IDS [27], the KDD’99 winner [28], the KDD’99
runner-up [29], Multi-classifier [30], and Association rule [31].
Lin et al. [32] used the KDD-Cup 99 dataset for intrusion detection, where records were classified
as normal, probing, denial of service (DoS), remote to local (R2L), or user to root (U2R). They
introduced a cluster center and k-nearest neighbor approach known as CANN. In this work,
evaluation metrics such as accuracy, detection rate, and false alarm rate were used to
evaluate the performance of intrusion detection, and they were calculated using a confusion
matrix. The experiment was carried out on 6-dimensional and 19-dimensional KDD datasets [32].
The results showed that the CANN algorithm had an accuracy of 99.76%, a detection rate of 99.9%,
and a false alarm rate of about 0.003 for the 6-dimensional dataset. Even though this was the
best overall performance, CANN misclassified all U2R and R2L records as normal. On the
19-dimensional KDD dataset, CANN was able to classify U2R and R2L records as attacks;
however, the accuracy for these classes was very low (3.846 and 57.016).
Yongguo et al. [33] proposed a new detection algorithm known as IDBGC (Intrusion Detection
Based on Genetic Clustering). The algorithm combines two stages: nearest neighbor clustering
and genetic optimization. The average detection rate of this algorithm was around 60% and the
false-positive rate was only 0.4%. The results showed that the IDBGC algorithm is feasible and
effective in detecting unknown intrusions. More than 50% of unknown DoS, R2L, and U2R attacks
were detected. However, for PROBE attacks, the detection rate was less than 50%, which is
relatively low.
Sanjay et al. [34] proposed K-means clustering via naïve Bayes classification for detecting
anomaly-based network intrusions. This algorithm performed better on the KDD Cup 99 dataset
than naïve Bayes classification alone in terms of detection rate. The performance of the
algorithms was evaluated based on accuracy, detection rate, and false alarm rate. The detection
rate was 99% for K-means clustering via naïve Bayes, whereas the false alarm rate was 4. The
algorithm was shown to be effective in detecting network intrusions; however, this approach had
a high false-positive rate.
Amuthan et al. [35] proposed a network anomaly detection method based on K-means clustering and
the C4.5 decision tree algorithm. K-means clustering was first used to partition the training
data into k clusters, and then a C4.5 decision tree was built on each cluster. The experiment
was carried out on the KDD99 dataset, and the performance was evaluated using metrics such as
true positive rate (detection rate), false-positive rate, precision, accuracy, and F-measure,
in percentages. The true positive rate obtained was 99.6%, the false-positive rate was 0.1%,
precision was 95.6%, accuracy was 95.8%, and F-measure was 94.0. The proposed algorithm gave a
notable detection rate.
Reda et al. [36] proposed a hybrid network intrusion detection system based on random forests
and weighted K-means. The proposed algorithm was evaluated on different percentages of the
KDD-99 dataset. For anomaly detection, it achieved a detection rate of 99% with a
false-positive rate of 12.6; for misuse detection, the detection rate was 92.73% with a
false positive rate of 0.54%.
Chih-Fong et al. [37] introduced a triangle area based nearest neighbor (TANN) approach to
detect intrusions in a network. The TANN approach is a combination of K-means clustering and a
k-NN classifier. The approach was evaluated on the KDD99 dataset with 10-fold cross-validation,
and the results showed that TANN can detect intrusions with higher accuracy and detection rates
than support vector machines, k-NN, and the hybrid of K-means and k-NN. The accuracy of the
proposed algorithm was 99.01%, the detection rate was very high at 99.27, and the
false-positive rate was 2.99%.
Li Tian et al. [38] introduced an improved K-means based on the k-medoids cyclic method and the
improved triangle trilateral relations theorem for network intrusion detection. This clustering
algorithm was evaluated on the KDDCup 99 dataset. The dataset was divided into 5 subsets; the
average detection rate was 89.5%, whereas the false alarm rate was 4.896%, which is relatively
high.
Witcha et al. [39] proposed a fuzzy rough c-means clustering algorithm for detecting new
intrusions so as to improve the detection rate and reduce the false-positive rate in intrusion
detection systems. The fuzzy clustering algorithm was used to predict normal and suspicious
behavior. The performance of the algorithm was measured based on the detection rate, accuracy,
false alarm rate, and correlation. The detection rate for this algorithm was 91.45, accuracy
was 82.46, false alarm rate was 24.8, and correlation was 0.556.
Warusia et al. [11] also integrated K-means clustering and naïve Bayes classification
(KMC+NBC) for anomaly-based intrusion detection. The algorithm was assessed using the ISCX 2012
dataset. The dataset was split into training and testing sets, and the KMC+NBC algorithm was
shown to improve on the accuracy and detection rate of NBC and to reduce its false alarm rate.
On the training data, the accuracy was 99.8%, the detection rate was 95.4%, and the false alarm
rate was 0.13; on the testing data, the accuracy was 99%, the detection rate was 98.8%, and the
false alarm rate was 2.2%.
Vipin Kumar et al. [40] used the NSL-KDD dataset to cluster the data into normal traffic and
four different types of attacks, i.e., DoS, R2L, U2R, and probe. The experiment was carried out
using the WEKA software. The simple K-means algorithm was used to cluster the data into four
different groups of attacks. The algorithm proved to be very useful for analyzing a large
amount of unlabeled data, i.e., for detecting new types of attacks.
Abhaya et al. [41] proposed an efficient intrusion detection method for detecting normal and
abnormal instances based on the fuzzy c-means (FCM) clustering technique and a support vector
machine (SVM). They used the KDD99 dataset and compared the performance with K-means combined
with SVM, K-means combined with naïve Bayes, and FCM combined with naïve Bayes. In this
approach, the data are first passed to the clustering technique and then to the classification
technique, and finally the performance is evaluated using accuracy. The accuracy of FCM+SVM was
99.74%, compared with 95.63%, 99.37%, and 95.32% for NB+FCM, SVM+k-means, and NB+k-means,
respectively.
Hari Om and Aritra Kundu [42] proposed a hybrid intrusion detection method that combines K-means
with two classifiers, naïve Bayes and k-nearest neighbor, to detect anomalies. This algorithm
used the KDD-Cup99 dataset to detect intrusions and classify them into four categories: DoS,
U2R, R2L, and probe. The main objective of the proposed algorithm was to reduce the false alarm
rate. The detection rate of the proposed approach was 99.35%, the false alarm rate was 1.394%,
and the accuracy was 98.20%.
Partha et al. [43] evaluated fuzzy c-means clustering and K-means clustering over the NSL-KDD
dataset to detect the four types of attacks. It was observed that only 45.95% of attacks were
detected by the fuzzy c-means clustering technique, whereas the detection rate was 44.72% using
the K-means clustering technique.
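The difference between the two techniques compared above can be sketched as follows: K-means assigns a record to exactly one cluster, while fuzzy c-means gives it a degree of membership in every cluster. The snippet uses the standard fuzzy c-means membership formula with fuzzifier m; the function name, the cluster centers, and the sample point are hypothetical and are not taken from [43].

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Standard fuzzy c-means membership of `point` in each cluster.

    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), with d_i the distance
    from the point to center i and m > 1 the fuzzifier.
    """
    dists = [math.dist(point, c) for c in centers]
    # A point sitting exactly on a center belongs fully to that center.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]


# Hypothetical cluster centers and record (two features each).
centers = [(0.0, 0.0), (6.0, 0.0)]
memberships = fcm_memberships((2.0, 0.0), centers)

# K-means would make a hard choice: the nearest center only.
hard_cluster = min(range(len(centers)), key=lambda i: math.dist((2.0, 0.0), centers[i]))
print(memberships, hard_cluster)
```

The memberships always sum to 1, so a record near a cluster boundary is shared between clusters rather than forced into one.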
5. RESEARCH QUESTIONS
The main objective of this research is to analyze the clustering techniques used in the intrusion
detection system literature over 10 years. To achieve this objective, the following questions
were established:
1. What datasets have been used in IDS?
2. What clustering techniques have been used in intrusion detection system research?
3. What evaluation metrics are used to measure the performance of clustering techniques?
4. What are the strengths and weaknesses of the clustering techniques that have been used in IDS?
6. RESULT AND DISCUSSION
The results of this research were analyzed against the research questions posed above. The four
questions are addressed separately as follows:
1. RQ1: What datasets have been used in intrusion detection systems?
Different learning algorithms, such as classification, clustering, and regression, use different
datasets to detect intrusions. In this survey, 30 papers on clustering techniques were reviewed.
The most common dataset used with clustering techniques is KDD Cup 99. The number of papers
using the KDD-99 dataset peaked in 2011. Of the 30 papers, 24 used the KDD-99 dataset,
accounting for 80% of the datasets used; among these, Zhong et al. used KDD-99 together with the
Kyoto 2006+ dataset. The KDD-99 dataset is derived from the DARPA dataset, and it is one of the
datasets used by researchers for clustering algorithms. In addition, only 2 papers used the
NSL-KDD dataset and 2 papers used Kyoto 2006+, each accounting for about 7% of the datasets
used in this research, as shown in Figure 2. Similarly, only one paper each used the DARPA 98
and ISCX 2012 datasets.
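The percentages quoted above follow directly from the paper counts. The short sketch below reproduces the arithmetic; the counts are the ones stated in the text, while the variable and function names are illustrative. A paper that used two datasets is counted under each, so the counts are not necessarily disjoint.

```python
# Paper counts per dataset as stated in the text (30 papers reviewed).
TOTAL_PAPERS = 30
paper_counts = {
    "KDD-99": 24,
    "NSL-KDD": 2,
    "Kyoto 2006+": 2,
    "DARPA 98": 1,
    "ISCX 2012": 1,
}


def share_percent(count, total=TOTAL_PAPERS):
    """Share of `count` out of `total`, expressed as a percentage."""
    return 100.0 * count / total


shares = {name: share_percent(n) for name, n in paper_counts.items()}
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

With these counts, KDD-99 comes to 80% and NSL-KDD and Kyoto 2006+ each come to roughly 6.7%, which rounds to the 7% reported.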
Figure 2: A schematic view of datasets used in clustering technique
Figure 3: A schematic view of distribution of datasets over the years
2. RQ2: What clustering techniques have been used in intrusion detection system research?
One of the major questions of this research is to identify the most common clustering
technique used in intrusion detection system research. Out of the 30 papers studied, the most
common clustering technique used for intrusion detection was K-means, which was used 14 times.
It was used as a single technique three times; in the other cases it was used in a hybrid with
techniques such as hierarchical clustering, OneR, fuzzy and SVM, naïve Bayes, PSO,
fuzzy c-means, SOM and neural gas, and the C4.5 decision tree. Besides these, graph-based,
IIDBC, K-medoids, grid-based, SVM + hierarchical, CANN, IDBGC, Random forest + weighted
K-means, and TANN techniques were used.
3. RQ3: What are the evaluation metrics used to measure the performance of the clustering
technique?
Clustering techniques used for intrusion detection in computer networks can be assessed using
various evaluation metrics. The detection rate, false alarm rate, false-positive rate, and
false-negative rate were the most common metrics used to measure the performance of clustering
techniques in the intrusion detection literature. All 30 papers reviewed used these metrics to
measure the performance of the clustering technique in the intrusion detection system. In
addition, Leung et al. [13] and Zhiengje et al. [22] used the ROC curve when evaluating their
techniques; similarly, Guan et al. [16] considered the sum of squared errors along with the
detection rate and false-positive rate. Chandrashekar et al. [20] also looked at specificity and
sensitivity in addition to TPR, FPR, DR, and FNR. The performance of a clustering technique is
considered high if the detection rate and accuracy are high and the false alarm rate is low.
These evaluation metrics were also used to compare the proposed algorithms with existing
algorithms.
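As an illustration of how these metrics relate to one another, the sketch below computes detection rate, false alarm rate, and accuracy from confusion-matrix counts. It assumes the conventional definitions (DR = TP / (TP + FN), FAR = FP / (FP + TN), accuracy = (TP + TN) / all records); individual surveyed papers may define them slightly differently. The class name and the counts are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # attack records flagged as attacks
    fn: int  # attack records missed (labeled normal)
    fp: int  # normal records flagged as attacks
    tn: int  # normal records labeled normal


def detection_rate(c):
    """Fraction of attack records that were detected (also called TPR)."""
    return c.tp / (c.tp + c.fn)


def false_alarm_rate(c):
    """Fraction of normal records wrongly flagged as attacks (also called FPR)."""
    return c.fp / (c.fp + c.tn)


def accuracy(c):
    """Fraction of all records that were labeled correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.fp + c.tn)


# Hypothetical counts: 100 attack records and 100 normal records.
counts = ConfusionCounts(tp=90, fn=10, fp=5, tn=95)
print(detection_rate(counts), false_alarm_rate(counts), accuracy(counts))
```

A technique can score well on one metric and poorly on another, which is why the surveyed papers typically report detection rate and false alarm rate together.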
Table 2: Strengths and weaknesses of clustering techniques used in the intrusion detection literature
| Clustering Technique | Strength | Weakness | Ref |
|---|---|---|---|
| K-means | DR was high, FAR was below 4%, and time complexity was low | Cluster number needs to be defined | Meng et al., Gerhard et al., Vipin et al. |
| Hierarchical clustering | Overcame the shortcoming of K-means clustering to predict the number of clusters | Determining the cluster width manually; there was a chance of mislabeling a normal instance as abnormal and vice versa if W was not determined properly | Leonid et al. |
| Y-means | Number-of-cluster dependency and degeneracy of K-means were overcome | High false positive compared to other algorithms | Guan et al. |
| Graph-based | Identifies clusters of any shape, uses only one parameter, and does not require defining any cluster number | Computation increased as number of records increased | Zhou et al. |
| K-medoids | Has advantages over the existing algorithm with respect to dependency on initial centroids, cluster number, and irrelevant clusters | The detection rate for the proposed algorithm was low for probing attack (70.51) and user-to-root attack (70.13) | Ravi et al. |
| IFCA | Introduced a validity function for choosing the number of clusters | - | Wei et al. |
| IIDBC | Detection rate was high and the performance of DBSCAN was improved | Selection of parameters | Li-XUE et al. |
| Grid-based | The performance was insensitive to the variation of the convergence criterion of clustering, attack and normal condition | Low detection rate and high false positive | Zhong et al. |
| CANN | Detection rate was high for 6-dimensional KDD datasets | Misclassified U2R and R2L as normal in case of 6-dimensional KDD datasets | Lin et al. |
| TANN | The accuracy rate and detection rate were high | - | Chih-Fong et al. |
| Improved K-means | - | False alarm rate was relatively very high | Li Tian et al. |
| Fuzzy C-means | Achieved good performance compared to K-means methods | - | Witcha et al. |
| Density+Grid based | High detection rate | High false positive rate | Leung et al. |
| K-means+OneR classification | Detection rate above 99.0 and false alarm rate below 2.75; the performance of the hybrid classifiers was higher than that of the single classifier | Couldn't classify U2R and R2L attacks | Z. Muda et al. |
| K-medoids+Naive Bayes | Showed better performance compared to the K-means and Naive Bayes hybrid algorithm | - | Chitrakar et al. |
| PSO+K-means | Accuracy was very good for U2R and DoS | Low detection rate and high false positive | Lizhong et al., Zhiengje et al. |
| SVM+HC | Detection rate was high for all four types of attack | - | Horng et al. |
| K-means+Naive Bayes | The algorithm was shown to be efficient in detecting network intrusion | This approach had a high false-positive rate | Sanjay et al., Warusia et al. |
| K-means+C4.5 | The proposed algorithm gave a notable detection rate | - | Amuthan et al. |
| Random Forest+Weighted K-means | - | High false-positive rate | Reda et al. |
| Fuzzy C-means+SVM | The accuracy value was slightly better than existing K-Medoids+SVM, it was far better than SVM alone, and it was shown to be stable over other | - | Abhaya et al. |
| SOM+K-means+Fuzzy C-means | Shown to outperform other competitor methods by reducing FAR | - | Fatma et al. |
| K-means+Fuzzy+SVM | Effective for low-frequency attacks such as U2R and R2L | - | Chandrashekar et al. |
| K-means+KNN+NB | - | May sometimes misclassify the records | Hari Om et al. |
| Fuzzy C-means+K-means | - | Very low detection rate | Partha et al. |
4. RQ4: What are the strengths and weaknesses of the clustering techniques that have been used
in IDS?
After reviewing the 30 papers in this research, we also evaluated the strengths and weaknesses of
the clustering techniques used. Techniques used in common by different researchers were grouped
and analyzed for strengths and weaknesses. The results are summarized in Table 2.
From Table 2, it can be seen that every clustering technique has its own pros and cons for
intrusion detection.
7. CONCLUSION
An intrusion detection system analyzes network traffic so that unwanted packets or malicious
activities on the system are detected and prevented. In this paper, 30 papers on clustering
techniques for intrusion detection were studied and evaluated. It was found that 27 different
types of clustering techniques were applied to detect intrusions. The most common clustering
technique among the 30 papers was K-means.
Apart from single clustering techniques, hybrid techniques that combine clustering with
different classification techniques were also used to detect intrusions in a system. These
clustering techniques were applied to different datasets; however, the most common was the
KDD-99 dataset. The KDD Cup 99 dataset comprises millions of connections with different
features and different types of attacks. Similarly, the performance of these techniques was
evaluated by different researchers using different evaluation metrics. Detection rate and false
alarm rate were the metrics most commonly used by researchers to check the effectiveness of
their procedures for intrusion detection. Finally, the strengths and weaknesses of the different
clustering techniques used to detect attacks were also analyzed.
As future work, we are investigating the effectiveness of user-oriented clustering for enhancing
an intrusion detection system. In this method, clusters of activities in the log files are formed
based on expert feedback on whether or not the activities are intrusions. Each cluster is defined
by the activities it contains. A future activity is then classified as an intrusion or not based
on its similarity to a cluster.
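One possible reading of this idea is sketched below. The paper does not specify the features, the similarity measure, or how clusters are built, so everything here is an assumption: activities are numeric feature vectors, one cluster is formed per expert label, each cluster is summarized by its centroid, and cosine similarity decides which cluster a new activity belongs to.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def build_clusters(labeled_activities):
    """Group expert-labeled activities by label and summarize each group by its centroid."""
    groups = {}
    for vector, is_intrusion in labeled_activities:
        groups.setdefault(is_intrusion, []).append(vector)
    return {label: centroid(vectors) for label, vectors in groups.items()}


def classify(activity, clusters):
    """Return the label of the cluster most similar to the new activity."""
    return max(clusters, key=lambda label: cosine_similarity(activity, clusters[label]))


# Hypothetical log activities: (feature vector, expert says intrusion?).
expert_feedback = [
    ([5.0, 0.0, 1.0], True),
    ([4.0, 1.0, 0.0], True),
    ([0.0, 3.0, 4.0], False),
    ([1.0, 4.0, 5.0], False),
]
clusters = build_clusters(expert_feedback)
print(classify([6.0, 1.0, 0.0], clusters), classify([0.0, 2.0, 3.0], clusters))
```

A fuller design would likely keep several clusters per label and update them as new expert feedback arrives; the single-centroid version is only meant to show the similarity-based decision.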
8. ACKNOWLEDGEMENTS
This research was supported in part by National Science Foundation grants #1761735, #1723586,
and #1663350, National Institutes of Health grant NIH TU CBR/RCMI #U54MD007585, and a grant
from Rockwell Collins, USA.
REFERENCES
[1] S. Revathi and A. Malathi, “A detailed analysis on nsl-kdd dataset using various machine learning
techniques for intrusion detection,” International Journal of Engineering Research & Technology
(IJERT), vol. 2, no. 12, pp. 1848–1853, 2013.
[2] F. Salo, M. Injadat, A. B. Nassif, A. Shami, and A. Essex, “Data mining techniques in intrusion
detection systems: A systematic literature review,” IEEE Access, vol. 6, pp. 56046–56058, 2018.
[3] M. K. Siddiqui and S. Naahid, “Analysis of kdd cup 99 dataset using clustering based data mining,”
International Journal of Database Theory and Application, vol. 6, no. 5, pp. 23–34, 2013.
[4] Z. Mingqiang, H. Hui, and W. Qian, “A graph-based clustering algorithm for anomaly intrusion
detection,” in 2012 7th International Conference on Computer Science & Education (ICCSE).
IEEE, 2012, pp. 1311–1314.
[5] M. Jianliang, S. Haikun, and B. Ling, “The application on intrusion detection based on k-means
cluster algorithm,” in 2009 International Forum on Information Technology and Applications, vol. 1.
IEEE, 2009, pp. 150–152.
[6] Y. Jian, “An improved intrusion detection algorithm based on dbscan [j],” Microcomputer
Information, vol. 25, no. 13, 2009.
[7] K. Cup, “Dataset,” available at the following website
http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html, vol. 72, 1999.
[8] R. P. Lippmann, D. J. Fried, I. Graf, J. W. Haines, K. R. Kendall, D. McClung, D. Weber,
S. E. Webster, D. Wyschogrod, R. K. Cunningham et al., “Evaluating intrusion detection systems:
The 1998 darpa off-line intrusion detection evaluation,” in Proceedings DARPA Information
Survivability Conference and Exposition. DISCEX’00, vol. 2. IEEE, 2000, pp. 12–26.
[9] M. Tavallaee, E. Bagheri, W. Lu, and A. A. Ghorbani, “A detailed analysis of the kdd cup 99 data
set,” in 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications.
IEEE, 2009, pp. 1–6.
[10] D. D. Protić, “Review of kdd cup’99, nsl-kdd and kyoto 2006+ datasets,” Vojnotehnički
glasnik, vol. 66, no. 3, pp. 580–596, 2018.
[11] W. Yassin, N. I. Udzir, Z. Muda, M. N. Sulaiman et al., “Anomaly-based intrusion detection through
k-means clustering and naives bayes classification,” in Proc. 4th Int. Conf. Comput. Informatics,
ICOCI, no. 49, 2013, pp. 298–303.
[12] R. Ranjan and G. Sahoo, “A new clustering approach for anomaly intrusion detection,” arXiv preprint
arXiv:1404.2772, 2014.
[13] K. Leung and C. Leckie, “Unsupervised anomaly detection in network intrusion detection using
clusters,” in Proceedings of the Twenty-eighth Australasian conference on Computer Science-Volume
38. Australian Computer Society, Inc., 2005, pp. 333–342.
[14] L. Portnoy, “Intrusion detection with unlabeled data using clustering,” Ph.D. dissertation, Columbia
University, 2000.
[15] G. Münz, S. Li, and G. Carle, “Traffic anomaly detection using k-means clustering,” in
GI/ITG Workshop MMBnet, 2007, pp. 13–14.
[16] Y. Guan, A. A. Ghorbani, and N. Belacel, “Y-means: A clustering method for intrusion detection,” in
CCECE 2003-Canadian Conference on Electrical and Computer Engineering. Toward a Caring and
Humane Technology (Cat. No. 03CH37436), vol. 2. IEEE, 2003, pp. 1083–1086.
[17] W. Jiang, M. Yao, and J. Yan, “Intrusion detection based on improved fuzzy c-means algorithm,” in
2008 International Symposium on Information Science and Engineering, vol. 2. IEEE, 2008,
pp. 326–329.
[18] L. Xue-Yong, G. Guo-hong, and S. Jia-Xia, “A new intrusion detection method based on improved
dbscan,” in 2010 WASE International Conference on Information Engineering, vol. 2. IEEE,
2010, pp. 117–120.
[19] Z. Muda, W. Yassin, M. N. Sulaiman, and N. I. Udzir, “Intrusion detection based on k-means
clustering and oner classification,” in 2011 7th International Conference on Information Assurance
and Security (IAS). IEEE, 2011, pp. 192–197.
[20] A. Chandrasekhar and K. Raghuveer, “Intrusion detection technique by using k-means, fuzzy neural
network and svm classifiers,” in 2013 International Conference on Computer Communication and
Informatics. IEEE, 2013, pp. 1–7.
[21] R. Chitrakar and C. Huang, “Anomaly based intrusion detection using hybrid learning approach of
combining k-medoids clustering and naive bayes classification,” in 2012 8th International
Conference on Wireless Communications, Networking and Mobile Computing. IEEE, 2012, pp. 1–5.
[22] Z. Li, Y. Li, and L. Xu, “Anomaly intrusion detection method based on k-means clustering algorithm
with particle swarm optimization,” in 2011 International Conference of Information Technology,
Computer Engineering and Management Sciences, vol. 2. IEEE, 2011, pp. 157–161.
[23] L. Xiao, Z. Shao, and G. Liu, “K-means algorithm based on particle swarm optimization algorithm
for anomaly intrusion detection,” in 2006 6th World Congress on Intelligent Control and Automation,
vol. 2. IEEE, 2006, pp. 5854–5858.
[24] H. Fatma and L. Mohamed, “A two-stage technique to improve intrusion detection systems based on
data mining algorithms,” in 2013 5th International Conference on Modeling, Simulation and
Applied Optimization (ICMSAO). IEEE, 2013, pp. 1–6.
[25] Y. Zhong, H. Yamaki, and H. Takakura, “A grid-based clustering for low-overhead anomaly intrusion
detection,” in 2011 5th International Conference on Network and System Security. IEEE, 2011, pp.
17–24.
[26] S.-J. Horng, M.-Y. Su, Y.-H. Chen, T.-W. Kao, R.-J. Chen, J.-L. Lai, and C. D. Perkasa, “A novel
intrusion detection system based on hierarchical clustering and support vector machines,” Expert
systems with Applications, vol. 38, no. 1, pp. 306–313, 2011.
[27] A. N. Toosi and M. Kahani, “A new approach to intrusion detection based on an evolutionary soft
computing model using neuro-fuzzy classifiers,” Computer communications, vol. 30, no. 10, pp.
2201–2212, 2007.
[28] B. Pfahringer, “Winning the kdd99 classification cup: bagged boosting,” SIGKDD explorations, vol.
1, no. 2, pp. 65–66, 2000.
[29] I. Levin, “Kdd-99 classifier learning contest: Llsoft’s results overview,” SIGKDD explorations, vol.
1, no. 2, pp. 67–75, 2000.
[30] M. Sabhnani and G. Serpen, “Application of machine learning algorithms to kdd intrusion detection
dataset within misuse detection context.” in MLMTA, 2003, pp. 209–215.
[31] W. Xuren, H. Famei, and X. Rongsheng, “Modeling intrusion detection system by discovering
association rule in rough set theory framework,” in 2006 International Conference on
Computational Intelligence for Modelling Control and Automation and International Conference on
Intelligent Agents Web Technologies and International Commerce (CIMCA’06). IEEE, 2006, pp. 24–24.
[32] W.-C. Lin, S.-W. Ke, and C.-F. Tsai, “Cann: An intrusion detection system based on combining
cluster centers and nearest neighbors,” Knowledge-based systems, vol. 78, pp. 13–21, 2015.
[33] Y. Liu, K. Chen, X. Liao, and W. Zhang, “A genetic clustering method for intrusion detection,”
Pattern Recognition, vol. 37, no. 5, pp. 927–942, 2004.
[34] S. K. Sharma, P. Pandey, S. K. Tiwari, and M. S. Sisodia, “An improved network intrusion
detection technique based on k-means clustering via naïve bayes classification,” in IEEE-
International Conference On Advances In Engineering, Science And Management (ICAESM-2012).
IEEE, 2012, pp. 417–422.
[35] A. P. Muniyandi, R. Rajeswari, and R. Rajaram, “Network anomaly detection by cascading k-means
clustering and c4.5 decision tree algorithm,” Procedia Engineering, vol. 30, pp. 174–182, 2012.
[36] R. M. Elbasiony, E. A. Sallam, T. E. Eltobely, and M. M. Fahmy, “A hybrid network intrusion
detection framework based on random forests and weighted k-means,” Ain Shams Engineering
Journal, vol. 4, no. 4, pp. 753–762, 2013.
[37] C.-F. Tsai and C.-Y. Lin, “A triangle area based nearest neighbors approach to intrusion detection,”
Pattern recognition, vol. 43, no. 1, pp. 222–229, 2010.
[38] L. Tian and W. Jianwen, “Research on network intrusion detection system based on improved
k-means clustering algorithm,” in 2009 International Forum on Computer Science-Technology and
Applications, vol. 1. IEEE, 2009, pp. 76–79.
[39] W. Chimphlee, A. H. Abdullah, M. N. M. Sap, S. Srinoy, and S. Chimphlee, “Anomaly-based
intrusion detection using fuzzy rough clustering,” in 2006 International Conference on Hybrid
Information Technology, vol. 1. IEEE, 2006, pp. 329–334.
[40] V. Kumar, H. Chauhan, and D. Panwar, “K-means clustering approach to analyze nsl-kdd
intrusion detection dataset,” International Journal of Soft Computing and Engineering (IJSCE), 2013.
[41] K. Kumar et al., “An efficient network intrusion detection system based on fuzzy c-means and
support vector machine,” in 2016 International Conference on Computer, Electrical &
Communication Engineering (ICCECE). IEEE, 2016, pp. 1–6.
[42] H. Om and A. Kundu, “A hybrid system for reducing the false alarm rate of anomaly intrusion
detection system,” in 2012 1st International Conference on Recent Advances in Information
Technology (RAIT). IEEE, 2012, pp. 131–136.
[43] P. S. Bhattacharjee, A. K. M. Fujail, and S. A. Begum, “A comparison of intrusion detection by k-
means and fuzzy c-means clustering algorithm over the nsl-kdd dataset,” in 2017 IEEE International
Conference on Computational Intelligence and Computing Research (ICCIC). IEEE, 2017, pp. 1–6.
AUTHORS
Binita Bohara is a graduate student of Information System and Security Management at
Tuskegee University. She completed her Accounting degree in Australia. She is
currently conducting research on clustering techniques for intrusion detection for network
security under Professor Dr. Jay Bhuyan. She is also working as a teaching assistant for
undergraduate students.
Jay Bhuyan is a professor in the Department of Computer Science at Tuskegee
University. His research and teaching interests include Telecom Software Architecture
and Development, Software and Network Security, Big Data Analytics, and Machine
Learning. He received a Ph.D. in Computer Science from the University of
Louisiana at Lafayette. He has over 25 years of full-time and part-time teaching
experience as well as over 15 years of research and development experience in the
Telecom industry. He is a member of ACM, IEEE, and the IEEE Computer and IEEE Communications Societies.
Fan Wu is a professor of Computer Science and the director of the Tuskegee University
Center of Academic Excellence in Information Assurance Education. He received his
Ph.D. degree in Computer Science from Worcester Polytechnic Institute (WPI) in 2008.
His teaching and research interests lie primarily in the areas of Information Assurance,
Software Security, Mobile Security, High Performance Computing, and Artificial
Intelligence. He has led several cybersecurity-related projects funded by NSF, DHS, and
DoD. He has served as the editor-in-chief of the International Journal of Mobile Devices,
Wearable Technology, and Flexible Electronics (IJMDWTFE). He has involved students in research and
published a number of research papers in cybersecurity, software security, mobile security, and
information assurance.
Junhua Ding is a Professor of Data Science and the Director of the Data Science program in
the Department of Information Science at the University of North Texas. His research
interests include data analytics, machine learning, software security, and software
engineering. He received a Ph.D., an M.S., and a B.S. in Computer Science from Florida
International University, Nanjing University, and China University of Geosciences,
respectively.