Abstract: Classical machine learning techniques have been employed repeatedly in intrusion detection. However, owing to the rising number and sophistication of attacks, more advanced techniques, including ensemble-based methods, neural networks and deep learning, have been applied. There is still a need for an improved machine learning approach that detects attacks more effectively and efficiently. The stacked generalization approach has been shown to be capable of learning from features and meta-features, but it has been limited by the deficiencies of base classifiers and by the lack of optimization in the choice of meta-feature combination. This paper therefore proposes a stacked generalization ensemble approach based on a two-tier meta-learner, in which the outputs of a classical stacked ensemble are passed to a multi-feature-based stacked ensemble, which is optimized using a grid-search approach. Nine data features and four meta-features derived from Logistic Regression, Support Vector Machine, Naïve Bayes, and Multilayer Perceptron neural network classifiers are used for the classification task. By applying neural networks as the meta-learner for the classification of NSL-KDD data, accuracy, precision, recall and F-measure of 0.97, 0.98, 0.98 and 0.98, respectively, are achieved.
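The stacking idea in the abstract can be illustrated with a minimal scikit-learn sketch. It is a single-tier stack, not the paper's two-tier design, and it assumes synthetic data from `make_classification` in place of NSL-KDD; the grid values, layer sizes and variable names are illustrative assumptions only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for NSL-KDD (assumption): 9 features, binary label.
X, y = make_classification(n_samples=400, n_features=9, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

# Four base classifiers named in the abstract; their out-of-fold scores
# become the meta-features.
base_learners = [
    ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=500))),
    ("svm", make_pipeline(StandardScaler(), SVC())),
    ("nb", GaussianNB()),
    ("mlp", make_pipeline(StandardScaler(),
                          MLPClassifier(hidden_layer_sizes=(8,), max_iter=500,
                                        random_state=0))),
]

# Neural-network meta-learner; passthrough=True feeds it the original
# features together with the base-learner outputs.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=MLPClassifier(max_iter=500, random_state=0),
    passthrough=True,
    cv=3,
)

# Grid search over one meta-learner hyperparameter (illustrative grid).
param_grid = {"final_estimator__hidden_layer_sizes": [(4,), (16,)]}
search = GridSearchCV(stack, param_grid, cv=3, scoring="f1")
search.fit(X_train, y_train)

test_f1 = search.score(X_test, y_test)  # F-measure on held-out data
print(search.best_params_, round(test_f1, 3))
```

The grid here tunes only the meta-learner; the paper's optimization of the meta-feature combination would need a wider search space than this sketch shows.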
International Journal of Computer Science and Information Security (IJCSIS), ISSN 1947-5500, Pittsburgh, PA, USA
Email: ijcsiseditor@gmail.com
http://sites.google.com/site/ijcsis/
https://google.academia.edu/JournalofComputerScience
https://www.linkedin.com/in/ijcsis-research-publications-8b916516/
http://www.researcherid.com/rid/E-1319-2016
A Novel Classification via Clustering Method for Anomaly Based Network Intrus... IDES Editor
Intrusion detection on the internet is an active
area of research. Intruders can be classified into two
types: external intruders, who are unauthorized
users of the computers they attack, and internal
intruders, who have permission to access the system but
with some restrictions. The aim of this paper is to present
a methodology to recognize attacks during the normal
activities in a system. A novel classification via sequential
information bottleneck (sIB) clustering algorithm has
been proposed to build an efficient anomaly based
network intrusion detection model. We have compared
our proposed method with other clustering algorithms,
namely X-Means, Farthest First, Filtered clusters, DBSCAN,
K-Means, and EM (Expectation-Maximization)
clustering, in order to assess the suitability of our proposed
algorithm. A subset of the KDDCup 1999 intrusion detection
benchmark dataset has been used for the experiment.
Results show that the proposed method is efficient in
terms of detection accuracy and low false positive rate in
comparison to the other existing methods.
Intrusion detection with Parameterized Methods for Wireless Sensor Networks rahulmonikasharma
Current network intrusion detection systems lack adaptability to frequently changing network environments. Furthermore, intrusion detection in the new distributed architectures is now a major requirement. In this paper, we propose two AdaBoost-based intrusion detection algorithms. In the first algorithm, a traditional online AdaBoost process is used with decision stumps as weak classifiers. In the second algorithm, an improved online AdaBoost process is proposed, and online Gaussian mixture models (GMMs) are used as weak classifiers. We further propose a distributed intrusion detection framework, in which a local parameterized detection model is constructed in each node using the online AdaBoost algorithm. A global detection model is constructed in each node by combining the local parametric models using a small number of samples in the node. This combination is achieved using an algorithm based on particle swarm optimization (PSO) and support vector machines (SVMs). The global model in each node is used to detect intrusions. Experimental results show that the improved online AdaBoost process with GMMs obtains a higher detection rate and a lower false alarm rate than the traditional online AdaBoost process that uses decision stumps. Both algorithms outperform existing intrusion detection algorithms. It is also shown that our PSO- and SVM-based algorithm effectively combines the local detection models into the global model in each node; the global model in a node can handle the intrusion types that are found in other nodes, without sharing the samples of these intrusion types.
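For readers unfamiliar with decision stumps as weak classifiers, the following is a minimal batch AdaBoost sketch in plain Python. It is not the online variant the abstract describes, and the toy data, function names and number of rounds are assumptions made for illustration.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for pol in (1, -1):
                preds = [pol if row[j] >= thr else -pol for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best

def stump_predict(stump, row):
    _, j, thr, pol = stump
    return pol if row[j] >= thr else -pol

def adaboost(X, y, rounds=3):
    """Batch AdaBoost with labels in {-1, +1}; returns [(alpha, stump), ...]."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        # Clamp the error so the logarithm below stays finite.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, then renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Toy data (assumption): +1 = attack, -1 = normal.
X = [[0.1, 1.0], [0.2, 0.9], [0.8, 0.2], [0.9, 0.1], [0.3, 0.8], [0.7, 0.3]]
y = [-1, -1, 1, 1, -1, 1]
model = adaboost(X, y)
print([predict(model, row) for row in X])
```

An online version would update the stump statistics and sample weights one record at a time instead of scanning the full training set in each round.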
A new clustering approach for anomaly intrusion detection IJDKP
Recent advances in technology have made our work easier compared to earlier times. Computer networks are
growing day by day, but the security of computers and networks has always been a
major concern for organizations, from small to large enterprises. Organizations
are aware of the possible threats and attacks and prepare for them, but because of some
loopholes attackers are still able to mount attacks.
Intrusion detection is one of the major fields of research, and researchers are trying to find new algorithms
for detecting intrusions. Clustering techniques from data mining are an interesting area of research for detecting
possible intrusions and attacks. This paper presents a new clustering approach for anomaly intrusion
detection based on the K-medoids method of clustering with certain modifications. The
proposed algorithm is able to achieve a high detection rate and overcomes the disadvantages of the K-means
algorithm.
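As a rough illustration of the K-medoids idea mentioned above, here is a plain-Python sketch of the basic alternating procedure. The paper's modifications are not described in this abstract and are not reproduced; the initialisation, data and names are assumptions.

```python
def k_medoids(points, k, max_iter=100):
    """Basic alternating K-medoids; returns (medoids, clusters)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    medoids = list(points[:k])  # naive initialisation (assumption)
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest medoid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, medoids[i]))
            clusters[nearest].append(p)
        # Update step: the new medoid is the cluster member with the
        # smallest total distance to the other members.
        new_medoids = [
            min(c, key=lambda cand: sum(dist(cand, q) for q in c)) if c else medoids[i]
            for i, c in enumerate(clusters)
        ]
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, clusters

# Toy data (assumption): two groups plus one far outlier.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9), (50, 50)]
medoids, clusters = k_medoids(points, k=2)
print(medoids)
```

Because a medoid must be an actual data point, the outlier at (50, 50) cannot drag the cluster representative away from the dense group, which is the usual argument for K-medoids over K-means.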
A novel ensemble modeling for intrusion detection system IJECEIAES
The vast increase in data through internet services has made computer systems more vulnerable and more difficult to protect from malicious attacks. Intrusion detection systems (IDSs) must become more effective at monitoring intrusions. Therefore, an effective IDS architecture is built that employs a simple classification model and achieves low false alarm rates and high accuracy. Notably, IDSs handle enormous amounts of data traffic containing redundant and irrelevant features, which affect IDS performance negatively. Good feature selection approaches reduce unrelated and redundant features and attain better classification accuracy in IDSs. This paper proposes a novel ensemble model for IDSs based on two algorithms: Fuzzy Ensemble Feature Selection (FEFS) and Fusion of Multiple Classifiers (FMC). FEFS is a unification of five feature scores, which are obtained using feature-class distance functions and aggregated using the fuzzy union operation. FMC, on the other hand, is the fusion of three classifiers and works on the basis of an ensemble decision function. Experiments on the KDD Cup 99 data set show that the proposed system outperforms well-known methods such as Support Vector Machines (SVMs), K-Nearest Neighbor (KNN) and Artificial Neural Networks (ANNs). Our experiments clearly confirm the value of the ensemble methodology for modeling IDSs, and hence our system is robust and efficient.
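The aggregation step can be sketched as follows, assuming the standard fuzzy union (element-wise maximum) applied to min-max normalised score lists. The three score lists are made-up placeholders; the paper's five feature-class distance functions are not reproduced here.

```python
def min_max(scores):
    """Scale a list of scores to [0, 1] so they can act as fuzzy memberships."""
    lo, hi = min(scores), max(scores)
    return [0.0 if hi == lo else (s - lo) / (hi - lo) for s in scores]

def fuzzy_union(score_lists):
    """Standard fuzzy union: per-feature maximum over all scoring methods."""
    normalised = [min_max(s) for s in score_lists]
    return [max(col) for col in zip(*normalised)]

# Hypothetical scores for three features from three scoring methods.
scores = [
    [0.2, 0.9, 0.4],
    [10, 30, 50],
    [1, 1, 3],
]
fused = fuzzy_union(scores)
ranking = sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)
print(fused, ranking)
```

With the maximum as the union operator, a feature is kept as soon as any one scoring method rates it highly, which favours recall of useful features over aggressive pruning.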
Data mining is knowledge discovery in databases, and the goal is to extract patterns and knowledge from
large amounts of data. An important area of data mining is text mining. Text mining extracts high-quality
information from text. Statistical pattern learning is used to obtain high-quality information. High quality in
text mining refers to a combination of relevance, novelty and interestingness. Tasks in text mining include text
categorization, text clustering, entity extraction and sentiment analysis. Applications of natural language
processing and analytical methods are highly preferred to turn
Using Cisco Network Components to Improve NIDPS Performance csandit
Network Intrusion Detection and Prevention Systems (NIDPSs) are used to detect, prevent and
report evidence of attacks and malicious traffic. Our paper presents a study where we used open
source NIDPS software. We show that NIDPS detection performance can be weak in the face of
high-speed and high-load traffic in terms of missed alerts and missed logs. To counteract this
problem, we have proposed and evaluated a solution that utilizes QoS, queues and parallel
technologies in a multi-layer Cisco Catalyst Switch to increase NIDPS detection performance.
Our approach designs a novel QoS architecture to organise and improve throughput-forwardplan
traffic in a layer 3 switch in order to improve NIDPS performance.
An Iterative Improved k-means Clustering IDES Editor
Clustering is a data mining (machine learning),
unsupervised learning technique used to place data elements
into related groups without advance knowledge of the group
definitions. One of the most popular and widely studied
clustering methods that minimize the clustering error for
points in Euclidean space is k-means clustering.
However, the k-means method converges to one of many local
minima, and it is known that the final results depend on the
initial starting points (means). In this research paper, we have
introduced and tested an improved algorithm to start k-means
with good starting points (means). Good initial
starting points allow k-means to converge to a better local
minimum; the number of iterations over the full dataset
is also decreased. Experimental results show that good initial
starting points lead to a good solution while reducing the number of
iterations needed to form a cluster.
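The dependence on starting points can be seen in a small one-dimensional sketch of Lloyd's k-means iteration. This only illustrates the sensitivity; it is not the paper's improved initialisation, and the data and starting means are assumptions.

```python
def kmeans_1d(data, centers, max_iter=100):
    """Plain Lloyd iteration on 1-D data; returns (centers, sum of squared errors)."""
    centers = list(centers)
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for x in data:
            idx = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
            groups[idx].append(x)
        # Empty groups keep their previous center.
        new_centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(min((x - c) ** 2 for c in centers) for x in data)
    return centers, sse

data = [0, 1, 2, 10, 11, 12, 20, 21, 22]

# Poor start: all three means inside the first group.
bad_centers, bad_sse = kmeans_1d(data, [0, 1, 2])
# Good start: one mean inside each group.
good_centers, good_sse = kmeans_1d(data, [1, 11, 21])
print(bad_sse, good_sse)
```

Both runs converge, but the poor start settles in a local minimum that merges two of the three groups, giving a much larger clustering error.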
LSTM deep learning method for network intrusion detection system IJECEIAES
The security of the network has become a primary concern for organizations. Attackers use different means to disrupt services, and the variety of these attacks calls for a new way to block them all in one manner. In addition, these intrusions can change and penetrate security devices. To solve these issues, we propose in this paper a new Network Intrusion Detection System (NIDS) based on Long Short-Term Memory (LSTM) that recognizes threats and retains a long-term memory of them, in order to stop new attacks that resemble existing ones and, at the same time, to have a single means of blocking intrusions. In our detection experiments, the accuracy reaches up to 99.98% and 99.93% for two-class and multi-class classification, respectively, and the False Positive Rate (FPR) is only 0.068% and 0.023% for two-class and multi-class classification, respectively. This shows that the proposed model is effective: it has a great ability to memorize and differentiate between normal traffic and attacks, and its identification is more accurate than that of other machine learning classifiers.
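The accuracy and FPR figures quoted above follow from confusion-matrix counts. A small helper, with made-up counts that are not taken from the paper, shows how the usual binary metrics are computed.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # also called detection rate
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f_measure": 2 * precision * recall / (precision + recall),
        "fpr": fp / (fp + tn),       # share of normal traffic flagged as attack
    }

# Hypothetical counts: attack = positive class, normal = negative class.
metrics = binary_metrics(tp=90, fp=10, tn=880, fn=20)
print(metrics)
```

On imbalanced traffic the accuracy can look high while recall on the attack class is modest, which is why FPR and F-measure are normally reported alongside it.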
A survey of Network Intrusion Detection using soft computing Technique ijsrd.com
With the arrival of the internet era, network security has become the key foundation for many financial and business applications. Intrusion detection is one of the approaches to resolving the problem of network security. An Intrusion Detection System (IDS) is a program that analyses what happens or has happened during an execution and tries to find indications that the computer has been misused. Here we propose a new approach that uses neuro-fuzzy techniques and a support vector machine with a fuzzy genetic algorithm to achieve a higher detection rate.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Pattern Recognition using Artificial Neural Network Editor IJCATR
An artificial neural network (ANN), usually called a neural network (NN), can be considered a paradigm
inspired by the biological nervous system. In the network, signals are transmitted by means of connection links. Each link
possesses an associated weight, which is multiplied by the incoming signal. The output signal is obtained by applying an activation function to
the net input. NNs are one of the most exciting and challenging research areas. As ANNs mature into commercial systems, they are likely
to be implemented in hardware. Their fault tolerance and reliability are therefore vital to the functioning of the system in which they
are embedded. The pattern recognition system is implemented with a back-propagation network and a Hopfield network to remove the
distortion from the input. The Hopfield network has high fault tolerance, which helps this system produce accurate output.
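How a Hopfield network removes distortion can be sketched in a few lines of plain Python. The stored pattern, the Hebbian training rule with a zero diagonal and the fixed number of update sweeps are illustrative assumptions rather than details from the paper.

```python
def train_hopfield(patterns):
    """Hebbian weights for bipolar (+1/-1) patterns, with no self-connections."""
    n = len(patterns[0])
    W = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    W[i][j] += p[i] * p[j]
    return W

def recall(W, state, sweeps=5):
    """Asynchronous updates: each unit takes the sign of its weighted net input."""
    state = list(state)
    for _ in range(sweeps):
        for i in range(len(state)):
            net = sum(W[i][j] * state[j] for j in range(len(state)))
            state[i] = 1 if net >= 0 else -1
    return state

pattern = [1, -1, 1, -1, 1, -1, 1, -1]
W = train_hopfield([pattern])

distorted = list(pattern)
distorted[1] = 1                      # flip one element to simulate distortion
restored = recall(W, distorted)
print(restored == pattern)
```

Each unit's net input is the weighted sum of the other units' signals, and the sign activation pulls the state back towards the stored pattern.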
A SURVEY ON DIFFERENT MACHINE LEARNING ALGORITHMS AND WEAK CLASSIFIERS BASED ... gerogepatton
Network intrusion detection often has difficulty creating classifiers that can handle unequally distributed attack categories. Generally, attacks such as Remote to Local (R2L) and User to Root (U2R) are very rare; even in the KDD dataset, these attacks make up only 2% of the overall dataset. As a result, the model cannot efficiently learn the characteristics of rare categories, which leads to poor detection rates for rare attack categories like R2L and U2R. We also compared the accuracy on the KDD and NSL-KDD datasets using different classifiers in WEKA.
A SURVEY ON DIFFERENT MACHINE LEARNING ALGORITHMS AND WEAK CLASSIFIERS BASED ... ijaia
Network intrusion detection often has difficulty creating classifiers that can handle unequally distributed attack categories. Generally, attacks such as Remote to Local (R2L) and User to Root (U2R) are very rare; even in the KDD dataset, these attacks make up only 2% of the overall dataset. As a result, the model cannot efficiently learn the characteristics of rare categories, which leads to poor detection rates for rare attack categories like R2L and U2R. We also compared the accuracy on the KDD and NSL-KDD datasets using different classifiers in WEKA.
This session talks about how to define a problem as a machine learning one and covers the steps toward reaching a satisfying solution: data preparation, feature engineering, evaluating suitable algorithms, releasing the model and putting it into practice. It presents a case study and goes through some algorithms, mostly implemented in Python.
By Hussein Natsheh - Data Mining entrepreneur, scholar, and founder of CiApple
YouTube video: https://youtu.be/NGbyeX4kpU4
An Extensive Review on Generative Adversarial Networks GAN’s ijtsrd
This paper provides a high-level understanding of Generative Adversarial Networks (GANs). It covers the working of GANs by explaining the background idea of the framework, the types of GANs used in industry, their advantages and disadvantages, the history of how GANs have been developed and enhanced over time, and some applications where GANs perform particularly well. Atharva Chitnavis | Yogeshchandra Puranik "An Extensive Review on Generative Adversarial Networks (GAN’s)" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-5 | Issue-4 , June 2021, URL: https://www.ijtsrd.compapers/ijtsrd42357.pdf Paper URL: https://www.ijtsrd.comcomputer-science/artificial-intelligence/42357/an-extensive-review-on-generative-adversarial-networks-gan’s/atharva-chitnavis
Improving IF Algorithm for Data Aggregation Techniques in Wireless Sensor Net... IJECEIAES
In a Wireless Sensor Network (WSN), data from different sensor nodes is collected at an aggregating node, which typically uses simple procedures such as averaging because of limited computational power and energy resources. However, such aggregation is known to be highly susceptible to node compromising attacks. WSNs are especially prone to these attacks because they typically lack tamper-resistant hardware. Thus, ascertaining the trustworthiness of data and the reputation of sensor nodes is critical for wireless sensor networks. As the performance of very low power processors improves dramatically, future aggregator nodes will be capable of running more sophisticated data aggregation algorithms, making WSNs less vulnerable. Iterative filtering algorithms hold great promise for this purpose. Such algorithms simultaneously aggregate data from several sources and provide a trust assessment of these sources, usually in the form of corresponding weight factors assigned to the data delivered by each source. Although significantly more robust against collusion attacks than simple averaging techniques, they are still vulnerable to a different, more sophisticated attack. This paper surveys the existing literature on iterative filtering techniques and provides a detailed comparison. Finally, a new improved iterative filtering technique is proposed on the basis of the literature survey and the drawbacks found in the literature.
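A generic iterative filtering loop can be sketched as follows. It assumes reciprocal-distance weights, one common choice in this family of algorithms, and made-up readings; it is not the improved technique the paper proposes.

```python
def iterative_filter(readings, iterations=20, eps=1e-9):
    """readings[s][t] is the value reported by sensor s at time t.

    Returns (estimate per time step, trust weight per sensor).
    """
    n_sensors, n_times = len(readings), len(readings[0])
    weights = [1.0 / n_sensors] * n_sensors      # start from a simple average
    estimate = []
    for _ in range(iterations):
        # Aggregate: weighted average of all sensors at each time step.
        estimate = [sum(w * r[t] for w, r in zip(weights, readings))
                    for t in range(n_times)]
        # Re-weight: sensors far from the current estimate lose trust.
        dists = [sum((r[t] - estimate[t]) ** 2 for t in range(n_times)) / n_times
                 for r in readings]
        inv = [1.0 / (d + eps) for d in dists]
        total = sum(inv)
        weights = [v / total for v in inv]
    return estimate, weights

# Four honest sensors near 20 and one compromised sensor reporting 35.
readings = [
    [20.1, 19.9, 20.0],
    [19.8, 20.2, 20.1],
    [20.0, 20.1, 19.9],
    [20.2, 19.8, 20.0],
    [35.0, 35.0, 35.0],
]
estimate, weights = iterative_filter(readings)
print([round(e, 2) for e in estimate], [round(w, 3) for w in weights])
```

A plain average of these readings would be pulled up to about 23, whereas the filtered estimate stays near 20. With reciprocal weights the trust can concentrate on very few sensors, which is one reason more careful variants are studied.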
COPYRIGHT This thesis is copyright materials protected under the .docx voversbyobersby
COPYRIGHT
This thesis is copyright material protected under the Berne Convention, the Copyright Act 1999 and other international and national enactments in that behalf on intellectual property. It may not be reproduced by any means, in full or in part, except for short extracts in fair dealing for research or private study, critical scholarly review or discourse with acknowledgment, with written permission of the Dean, School of Graduate Studies, on behalf of both the author and XXX XXX University.
ABSTRACT
With the fast-growing internet, the risk of intrusion has also increased; as a result, the Intrusion Detection System (IDS) is a key research field. IDSs are used to identify any suspicious activity or patterns in the network or machine that threaten the security features or compromise the machine. IDSs mostly use all the features of the data. However, not all features are of equal relevance for the detection of attacks, and not every feature contributes significantly to system performance. The main aim of this work is to develop an efficient denial of service network intrusion classification model. The specific objectives were: to analyse existing literature on intrusion detection systems (the techniques used to model IDSs, types of network attacks, performance of various machine learning tools, and how network intrusion detection systems are assessed); to find the top network traffic attributes that can be used to model denial of service intrusion detection; and to develop a machine learning model for detection of denial of service network intrusion. Methods: The research design was experimental and data was collected by simulation using the NSL-KDD dataset. By implementing the Correlation Feature Selection (CFS) mechanism with three search algorithms, a smallest set of features is selected, consisting of the features that are selected very frequently. Findings: The smallest subset of features chosen is the most nominal among all the feature subsets found. Further, the performance of Artificial Neural Network (ANN), decision tree (DT), Support Vector Machine (SVM) and K-Nearest Neighbour (KNN) classifiers is compared for the 7 subsets found by the filter model and for all 41 attributes. Results: The outcome indicates a remarkable improvement in the performance metrics used for comparison of the two classifiers.
The results show that using 17/18 selected features improves DoS type classification accuracy compared with using all 41 features in the NSL-KDD dataset. It was further observed that an ensemble of three classifiers with decision fusion performs better than a single classifier for DoS type classification. Among the machine learning tools tested, ANN achieved the best classification accuracy, followed by SVM and DT. KNN registered the lowest classification accuracy. Application: The proposed work with such an improved detection rate and lesser classification time and lar.
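For context on the Correlation Feature Selection step, the standard CFS merit heuristic scores a subset of k features from its mean feature-class correlation and mean feature-feature correlation. The sketch below evaluates that formula for hypothetical correlation values; it does not reproduce the thesis's search algorithms or data.

```python
import math

def cfs_merit(k, mean_feature_class_corr, mean_feature_feature_corr):
    """CFS merit of a k-feature subset: k*r_cf / sqrt(k + k*(k-1)*r_ff)."""
    return (k * mean_feature_class_corr) / math.sqrt(
        k + k * (k - 1) * mean_feature_feature_corr)

# Two hypothetical 4-feature subsets with the same relevance to the class
# but different redundancy among the features.
redundant = cfs_merit(4, 0.4, 0.9)
diverse = cfs_merit(4, 0.4, 0.1)
print(round(redundant, 3), round(diverse, 3))
```

The subset whose features are less correlated with one another scores higher, which is how CFS favours small subsets of relevant but non-redundant attributes.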
Improving IF Algorithm for Data Aggregation Techniques in Wireless Sensor Net...IJECEIAES
In Wireless Sensor Network (WSN), fact from different sensor nodes is collected at assembling node, which is typically complete via modest procedures such as averaging as inadequate computational power and energy resources. Though such collections is identified to be extremely susceptible to node compromising attacks. These approaches are extremely prone to attacks as WSN are typically lacking interfere resilient hardware. Thus, purpose of veracity of facts and prestige of sensor nodes is critical for wireless sensor networks. Therefore, imminent gatherer nodes will be proficient of accomplishment additional cultivated data aggregation algorithms, so creating WSN little unresisting, as the performance of actual low power processors affectedly increases. Iterative filtering algorithms embrace inordinate capacity for such a resolution. The way of allocated the matching mass elements to information delivered by each source, such iterative algorithms concurrently assemble facts from several roots and deliver entrust valuation of these roots. Though suggestively extra substantial against collusion attacks beside the modest averaging techniques, are quiet vulnerable to a different cultivated attack familiarize. The existing literature is surveyed in this paper to have a study of iterative filtering techniques and a detailed comparison is provided. At the end of this paper new technique of improved iterative filtering is proposed with the help of literature survey and drawbacks found in the literature.
COPYRIGHTThis thesis is copyright materials protected under the .docxvoversbyobersby
COPYRIGHT
This thesis is copyright materials protected under the Berne Convection, the copyright Act 1999 and other international and national enactments in that behalf, on intellectual property. It may not be reproduced by any means in full or in part except for short extracts in fair dealing so for research or private study, critical scholarly review or discourse with acknowledgment, with written permission of the Dean School of Graduate Studies on behalf of both the author and XXX XXX University.ABSTRACT
With Fast growing internet world the risk of intrusion has also increased, as a result Intrusion Detection System (IDS) is the admired key research field. IDS are used to identify any suspicious activity or patterns in the network or machine, which endeavors the security features or compromise the machine. IDS majorly use all the features of the data. It is a keen observation that all the features are not of equal relevance for the detection of attacks. Moreover every feature does not contribute in enhancing the system performance significantly. The main aim of the work done is to develop an efficient denial of service network intrusion classification model. The specific objectives included: to analyse existing literature in intrusion detection systems; what are the techniques used to model IDS, types of network attacks, performance of various machine learning tools, how are network intrusion detection systems assessed; to find out top network traffic attributes that can be used to model denial of service intrusion detection; to develop a machine learning model for detection of denial of service network intrusion.Methods: The research design was experimental and data was collected by simulation using NSL-KDD dataset. By implementing Correlation Feature Selection (CFS) mechanism using three search algorithms, a smallest set of features is selected with all the features that are selected very frequently. Findings: The smallest subset of features chosen is the most nominal among all the feature subset found. Further, the performances using Artificial neural networks(ANN), decision trees, Support Vector Machines (SVM) and K-Nearest Neighbour (KNN) classifiers is compared for 7 subsets found by filter model and 41 attributes. Results: The outcome indicates a remarkable improvement in the performance metrics used for comparison of the two classifiers. 
The results show that using 17/18 selected features improves DOS types classification accuracies as compared to using the 41 features in the NSL-KDD dataset. It was further observed that using an ensemble of three classifiers with decision fusion performs better as compared to using a single classifier for DOS type’s classification. Among machine learning tools experimented, ANN achieved best classification accuracies followed by SVM and DT. KNN registered the lowest classification accuracies. Application: The proposed work with such an improved detection rate and lesser classification time and lar.
An intrusion detection system for packet and flow based networks using deep n...IJECEIAES
Study on deep neural networks and big data is merging now by several aspects to enhance the capabilities of intrusion detection system (IDS). Many IDS models has been introduced to provide security over big data. This study focuses on the intrusion detection in computer networks using big datasets. The advent of big data has agitated the comprehensive assistance in cyber security by forwarding a brunch of affluent algorithms to classify and analysis patterns and making a better prediction more efficiently. In this study, to detect intrusion a detection model has been propounded applying deep neural networks. We applied the suggested model on the latest dataset available at online, formatted with packet based, flow based data and some additional metadata. The dataset is labeled and imbalanced with 79 attributes and some classes having much less training samples compared to other classes. The proposed model is build using Keras and Google Tensorflow deep learning environment. Experimental result shows that intrusions are detected with the accuracy over 99% for both binary and multiclass classification with selected best features. Receiver operating characteristics (ROC) and precision-recall curve average score is also 1. The outcome implies that Deep Neural Networks offers a novel research model with great accuracy for intrusion detection model, better than some models presented in the literature.
The main goal of Intrusion Detection Systems (IDSs) is
to detect intrusions. This kind of detection system represents a
significant tool in traditional computer based systems for ensuring
cyber security. IDS model can be faster and reach more accurate
detection rates, by selecting the most related features from the
input dataset. Feature selection is an important stage of any IDs to
select the optimal subset of features that enhance the process of the
training model to become faster and reduce the complexity while
preserving or enhancing the performance of the system. In this
paper, we proposed a method that based on dividing the input
dataset into different subsets according to each attack. Then we
performed a feature selection technique using information gain
filter for each subset. Then the optimal features set is generated by
combining the list of features sets that obtained for each attack.
Experimental results that conducted on NSL-KDD dataset shows
that the proposed method for feature selection with fewer features,
make an improvement to the system accuracy while decreasing the
complexity. Moreover, a comparative study is performed to the
efficiency of technique for feature selection using different
classification methods. To enhance the overall performance,
another stage is conducted using Random Forest and PART on
voting learning algorithm. The results indicate that the best
accuracy is achieved when using the product probability rule.
An Efficient Intrusion Detection System with Custom Features using FPA-Gradie...IJCNCJournal
An efficient Intrusion Detection System has to be given high priority while connecting systems with a network to prevent the system before an attack happens. It is a big challenge to the network security group to prevent the system from a variable types of new attacks as technology is growing in parallel. In this paper, an efficient model to detect Intrusion is proposed to predict attacks with high accuracy and less false-negative rate by deriving custom features UNSW-CF by using the benchmark intrusion dataset UNSW-NB15. To reduce the learning complexity, Custom Features are derived and then Significant Features are constructed by applying meta-heuristic FPA (Flower Pollination algorithm) and MRMR (Minimal Redundancy and Maximum Redundancy) which reduces learning time and also increases prediction accuracy. ENC (ElasicNet Classifier), KRRC (Kernel Ridge Regression Classifier), IGBC (Improved Gradient Boosting Classifier) is employed to classify the attacks in the datasets UNSW-CF, UNSW and recorded that UNSW-CF with derived custom features using IGBC integrated with FPA provided high accuracy of 97.38% and a low error rate of 2.16%. Also, the sensitivity and specificity rate for IGB attains a high rate of 97.32% and 97.50% respectively.
AN EFFICIENT INTRUSION DETECTION SYSTEM WITH CUSTOM FEATURES USING FPA-GRADIE...IJCNCJournal
An efficient Intrusion Detection System has to be given high priority while connecting systems with a network to prevent the system before an attack happens. It is a big challenge to the network security group to prevent the system from a variable types of new attacks as technology is growing in parallel. In this paper, an efficient model to detect Intrusion is proposed to predict attacks with high accuracy and less false-negative rate by deriving custom features UNSW-CF by using the benchmark intrusion dataset UNSW-NB15. To reduce the learning complexity, Custom Features are derived and then Significant Features are constructed by applying meta-heuristic FPA (Flower Pollination algorithm) and MRMR (Minimal Redundancy and Maximum Redundancy) which reduces learning time and also increases prediction accuracy. ENC (ElasicNet Classifier), KRRC (Kernel Ridge Regression Classifier), IGBC (Improved Gradient Boosting Classifier) is employed to classify the attacks in the datasets UNSW-CF, UNSW and recorded that UNSW-CF with derived custom features using IGBC integrated with FPA provided high accuracy of 97.38% and a low error rate of 2.16%. Also, the sensitivity and specificity rate for IGB attains a high rate of 97.32% and 97.50% respectively.
ATTACK DETECTION AVAILING FEATURE DISCRETION USING RANDOM FOREST CLASSIFIERCSEIJJournal
The widespread use of the Internet has an adverse effect of being vulnerable to cyber attacks. Defensive
mechanisms like firewalls and IDSs have evolved with a lot of research contributions happening in these
areas. Machine learning techniques have been successfully used in these defense mechanisms especially
IDSs. Although they are effective to some extent in identifying new patterns and variants of existing
malicious patterns, many attacks are still left as undetected. The objective is to develop an algorithm for
detecting malicious domains based on passive traffic measurements. In this paper, an anomaly-based
intrusion detection system based on an ensemble based machine learning classifier called Random Forest
with gradient boosting is deployed. NSL-KDD cup dataset is used for analysis and out of 41 features, 32
features were identified as significant using feature discretion. Our observations confirm the conjecture
that both the feature selection and stochastic based genetic operators improves the accuracy and the
effectiveness. The training time is shown to be reduced tremendously by 98.59% and accuracy improved to
98.75%.
Attack Detection Availing Feature Discretion using Random Forest ClassifierCSEIJJournal
The widespread use of the Internet has an adverse effect of being vulnerable to cyber attacks. Defensive
mechanisms like firewalls and IDSs have evolved with a lot of research contributions happening in these
areas. Machine learning techniques have been successfully used in these defense mechanisms especially
IDSs. Although they are effective to some extent in identifying new patterns and variants of existing
malicious patterns, many attacks are still left as undetected. The objective is to develop an algorithm for
detecting malicious domains based on passive traffic measurements. In this paper, an anomaly-based
intrusion detection system based on an ensemble based machine learning classifier called Random Forest
with gradient boosting is deployed. NSL-KDD cup dataset is used for analysis and out of 41 features, 32
features were identified as significant using feature discretion.
Progress of Machine Learning in the Field of Intrusion Detection Systemsijcisjournal
With the growth in the use of the Internet and local area networks, malicious attacks and intrusions into
computer systems are increasing. Implementing intrusion detection systems have become extremely
important to help maintain good network security. Support vector machines (SVMs), a classic pattern
recognition tool, have been widely used in intrusion detection. They can handle very large data with high
efficiency, are easy to use, and exhibit good prediction behavior. This paper presents a new SVM model
enriched with a Gaussian kernel function based on the features of the training data for intrusion detection.
The new model is tested with the CICIDS2017 dataset. The test proves better results in terms of detection
efficiency and false alarm rate, which can give better coverage and make detection more efficient.
11421ijcPROGRESS OF MACHINE LEARNING IN THE FIELD OF INTRUSION DETECTION SYST...ijcisjournal
With the growth in the use of the Internet and local area networks, malicious attacks and intrusions into computer systems are increasing. Implementing intrusion detection systems have become extremely important to help maintain good network security. Support vector machines (SVMs), a classic pattern recognition tool, have been widely used in intrusion detection. They can handle very large data with high efficiency, are easy to use, and exhibit good prediction behavior. This paper presents a new SVM model enriched with a Gaussian kernel function based on the features of the training data for intrusion detection. The new model is tested with the CICIDS2017 dataset. The test proves better results in terms of detection efficiency and false alarm rate, which can give better coverage and make detection more efficient.
Classification Rule Discovery Using Ant-Miner Algorithm: An Application Of N...IJMER
Enormous studies on intrusion detection have widely applied data mining techniques to
finding out the useful knowledge automatically from large amount of databases, while few studies have
proposed classification data mining approaches. In an actual risk assessment process, the discovery of
intrusion detection prediction knowledge from experts is still regarded as an important task because
experts’ predictions depend on their subjectivity. Traditional statistical techniques and artificial
intelligence techniques are commonly used to solve this classification decision making. This paper
proposes an ant-miner based data mining method for discovering network intrusion detection rules from
large dataset. The obtained result of this experiment shows that clearly the ant-miner is superior than
ID3, J48, ADtree, BFtree, Simple cart. Although different classification models have been developed for
network intrusion detection, each of them has its strength and weakness, including the most commonly
applied Support Vector Machine(SVM)method and the clustering based on Self Organized Ant Colony
Network (CSOACN).Our algorithm is implemented and evaluated using a standard bench mark KDD99
dataset. Experiments show that ant-miner algorithm out performs than other methods in terms of both
classification rate and accuracy
Real Time Intrusion Detection System Using Computational Intelligence and Neu...ijtsrd
Today, Intrusion detection system using neural network is interested and measurable area for the researchers. The computational intelligence describe based on following parameters such as computational speed, adaptation, error resilience and fault tolerance. A good intrusion detection system must be satisfied adaptable as requirements. The objective of this paper, provide an outline of the research progress via computational intelligence and neural network over the intrusion detection. In this paper focused, existing research challenges, review analysis, research suggestion regarding Intrusion detection system. Dr. Prabha Shreeraj Nair"Real Time Intrusion Detection System Using Computational Intelligence and Neural Network: A Review" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-1 | Issue-6 , October 2017, URL: http://www.ijtsrd.com/papers/ijtsrd5781.pdf http://www.ijtsrd.com/engineering/computer-engineering/5781/real-time-intrusion-detection-system-using-computational-intelligence-and-neural-network-a-review/dr-prabha-shreeraj-nair
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
Similar to A Stacked Generalization Ensemble Approach for Improved Intrusion Detection (20)
A Stacked Generalization Ensemble Approach for
Improved Intrusion Detection
Oluwafemi Oriola
Department of Computer Science
Adekunle Ajasin University
Akungba Akoko, Nigeria
oluwafemi.oriola@aaua.edu.ng
Abstract—Classical machine learning techniques have been
widely employed in intrusion detection. However, owing to the
rising number and sophistication of attacks, more advanced machine
learning techniques, including ensemble-based methods, neural
networks and deep learning, have been applied.
Nevertheless, there is still a need for an improved machine learning
approach that detects attacks more effectively and efficiently.
The stacked generalization approach has been shown to be capable of
learning from features and meta-features, but it has been limited by
the deficiencies of base classifiers and the lack of optimization in the
choice of meta-feature combination. This paper therefore
proposes a stacked generalization ensemble approach based on a
two-tier meta-learner, in which the outputs of a classical stacked
ensemble are passed to a multi-feature-based stacked ensemble,
which is optimized. A grid-search approach is used for the
optimization. Nine data features and four meta-features derived
from Logistic Regression, Support Vector Machine, Naïve Bayes,
and Multilayer Perceptron neural network are used for the
machine learning classification task. By applying neural
networks as the meta-learner for the classification of NSL-KDD
data, improved performance in terms of accuracy, precision,
recall and F-measure of 0.97, 0.98, 0.98 and 0.98, respectively, is
achieved.
Keywords-Intrusion detection system; machine learning;
ensemble method; stacked generalization; two-tier meta-learner
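The stacking idea summarized in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn and uses a synthetic nine-feature dataset as a stand-in for NSL-KDD. It shows a single stacking tier whose neural-network meta-learner sees both the data features and the base-classifier outputs, tuned with a small grid search. It is not the paper's two-tier architecture, and the hyperparameter grid, sample size and variable names are illustrative assumptions only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: synthetic binary data with nine features stands in for NSL-KDD.
X, y = make_classification(n_samples=300, n_features=9, n_informative=5,
                           n_redundant=2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Four base classifiers named in the abstract; each contributes one meta-feature
# per sample in the binary case.
base_learners = [
    ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ("svm", make_pipeline(StandardScaler(), SVC())),
    ("nb", GaussianNB()),
    ("mlp", make_pipeline(StandardScaler(),
                          MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                                        random_state=0))),
]

# passthrough=True feeds the original features to the meta-learner alongside
# the base-classifier outputs (9 data features + 4 meta-features).
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=MLPClassifier(max_iter=500, random_state=0),
    passthrough=True,
    cv=3,
)

# Illustrative grid over the meta-learner's hidden layer size (an assumption).
search = GridSearchCV(
    stack,
    param_grid={"final_estimator__hidden_layer_sizes": [(8,), (16,)]},
    cv=3,
    scoring="f1",
)
search.fit(X_train, y_train)

y_pred = search.predict(X_test)
scores = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}
print(search.best_params_, scores)
```

The scores printed here come from synthetic data and say nothing about the results reported for NSL-KDD; the sketch only shows how features and meta-features can be combined and tuned.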
I. INTRODUCTION
The field of Artificial Intelligence, especially machine
learning, has been very beneficial to numerous sectors such as
health care, education, transport and logistics,
pharmaceuticals, finance, energy, manufacturing and public
service[1]. Machine learning refers to an artificial intelligence
technology that allows systems to learn directly from
examples, data, and experience without having to be explicitly
programmed. Machine learning techniques include
supervised, unsupervised and reinforcement learning
techniques[2]. Supervised learning is the most
common technique; it learns from a set of labelled data
and predicts the classes of unlabelled data. In the manufacturing
sector, supervised learning techniques have been applied to
improve the effectiveness of Intrusion Detection Systems[3].
Intrusion Detection Systems (IDS) are used to monitor
network activities and detect incidents of attacks[3]. Various
types of IDS exist such as signature-based IDS and anomaly-
based IDS. The signature-based IDS rely on repository of
previous attacks to detect new attacks, while anomaly-based
IDS rely on normal behaviour of systems and networks to
detect incidents of attacks. There are also Hybrid IDS, which
combine the characteristics of signature and anomaly-based
IDS. These different types of IDS can operate either at the host
or the network level in a network. The commercial and open
source IDS such as SNORT, Bro, Prelude, Ethereal and
OSSEC are still largely signature-based. In the research
community, however, anomaly-based and hybrid IDS have been designed
using machine learning.
Presently, the capabilities of traditional IDS cannot
match the capacity of network attacks because of the
sophistication of network threats and the availability of high-end
computers and network systems. Thus, several innovations have
been devised to improve the capacity of IDS, among which is
ensemble machine learning. An ensemble method is a way of
combining the same or different approaches to solve a particular
problem; weaker learners are combined to form a stronger
learner[4][5]. It usually involves combining the outputs of
classifiers. The major advantage of ensemble algorithms over
hybrid ones is their modularity, which allows a poorly performing
algorithm to be replaced with a better one. The bagging,
boosting and stacking methods[6][7] have been popularly used
for the mix-of-experts functions. Voting selection
methods, including majority voting, weighted voting, rule-based
voting, probability voting and average voting, have
also been used. Except for stacking methods, the ensemble
algorithms focus on homogeneous sets of features, which
might not be effective for class-imbalanced contexts such as
intrusion detection.
However, the existing performances of deep learning
algorithms, which have high running costs, have far surpassed the
performances of classical ensemble approaches, including
stacking. Therefore, this paper focuses on intrusion detection
using an improved stacking ensemble machine learning approach,
whose results are better than those of both existing ensemble and
artificial neural network algorithms.
II. ENSEMBLE AND DEEP NEURAL NETWORKS-BASED
INTRUSION DETECTION
Several works have been carried out on intrusion detection
system. The subsections below present the ensemble and deep
neural network-based intrusion detection, respectively.
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 18, No. 5, May 2020
62 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
A. Ensemble-based Intrusion Detection
The most closely related work, which applied meta-classification
enabled by stacked generalization, was proposed in [8]. The
authors combined Random Forest, K-Nearest Neighbour and
Logistic Regression to predict multiclass intrusion classes in
UNSW-NB15 dataset. Multi-feature-based stacking, with
single-tier meta-learner was applied. Support Vector Machine
was used as the meta-learner. It achieved a cross-validated
accuracy of 0.94 and F-Measure of 0.95. The work however
was limited by the deficiency of the base classifier and lack of
optimization in the choice of meta-feature combination.
Gaikwad and Thool [9] improved the accuracy of IDS by
using a Genetic Algorithm to select the fifteen most relevant
features from NSL-KDD. They applied a bagging ensemble
approach to each of the Naive Bayes, PART and C4.5 algorithms.
The results showed that bagging led to an increase in
accuracy. A hybrid intelligent intrusion detection system
composed of pre-processing phase, feature reduction phase,
classification phase and combining phase was proposed in
[10]. The work investigated the performance of hybrid Radial
Basis Function (RBF) and Support Vector Machine (SVM) for
IDS. The 41 attributes of NSL-KDD dataset were used, while
approximately 20% of the original dataset were used for
classification. The best training sets were selected based on
Adaptive Resampling and Combining (ARCING). The results
showed a classification Accuracy of 0.8519 which was higher
than the Accuracy of RBF and SVM base classifiers. Gao et
al. [11] designed an ensemble adaptive voting algorithm for
classification of NSL-KDD intrusion dataset. The adaptive
voting algorithm recorded the highest accuracy of 0.852.
Multi-Tree algorithm outperformed decision tree, random
forest, k-Nearest Neighbour and Deep Neural Networks.
B. Neural Networks-based Intrusion Detection
The authors in [12] proposed artificial neural network
architecture based on Multilayer Perceptron (MLP) for
detection of intrusion in a typical benign network traffic data.
They obtained an accuracy of 0.98, relative operating
characteristics of 0.98, and false positive rate of less than 2%
using 10-fold cross-validation. A neural network ensemble
comprising an autoencoder, a deep belief neural network,
a deep neural network and an extreme learning machine, all of
which are computationally expensive, was
proposed in [13]. Using the
NSL-KDD dataset, the testing accuracy was 0.92, while the
F-score was 0.93. Yang et al. [14] worked on improving the
performance of IDS using the NSL-KDD and UNSW-NB15
datasets; they used a modified density peak clustering algorithm
(MDPCA) to solve class imbalance problems by dividing the
training set into several subsets with similar sets of attributes.
Deep belief networks (DBNs) were used to reduce the high
dimensionality and perform classification. The outputs of the
DBNs were aggregated using Fuzzy Membership Weights.
The results showed that the accuracy and F-score were 0.82
and 0.81, respectively for NSL-KDD, while they were 0.90
and 0.91 for UNSW-NB15.
This paper focuses on the development of an ensemble machine
learning approach based on stacked generalization, with feature
selection, optimal meta-feature combination and artificial
neural networks, for the purpose of efficient and effective
intrusion detection.
III. STACKING
Stacking (or stacked generalization) is an ensemble technique
for combining multiple classifiers [8]. Unlike bagging and
boosting, stacking is usually used to combine different
classifiers. Stacking consists of two levels: the base
learner and the stacking model learner. The base learner uses many
different models to learn from a dataset. The outputs of each
of the models are collected to create a new dataset. In the new
dataset, each instance is related to the real value that is
supposed to be predicted. The new dataset is then used by the stacking
model learner to provide the final output. For example, the
predicted classifications from the base classifiers such as naïve
bayes, decision tree and support vector machine can be used as
input variables into a k-nearest neighbour classifier as a
stacking model learner, which will attempt to learn from the
data how to combine the predictions from the different models
to achieve the best classification accuracy.
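The example in the paragraph above can be sketched with scikit-learn's StackingClassifier. This is only an illustrative sketch on synthetic data, not the configuration used in this paper: the data generator, the split sizes and all hyperparameters are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption: this is NOT the NSL-KDD data).
X, y = make_classification(n_samples=400, n_features=9, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

# Base learners: naive Bayes, decision tree and support vector machine.
base_learners = [
    ("nb", GaussianNB()),
    ("dt", DecisionTreeClassifier(random_state=0)),
    ("svm", SVC(random_state=0)),
]

# The k-nearest neighbour classifier is the stacking model learner. It is
# trained on the cross-validated class predictions of the base learners.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=KNeighborsClassifier(n_neighbors=5),
    stack_method="predict",
    cv=5,
)
stack.fit(X_train, y_train)

# One column of predicted labels per base learner: the "new dataset".
meta_features = stack.transform(X_test)
stack_accuracy = stack.score(X_test, y_test)
print(f"meta-feature shape: {meta_features.shape}, accuracy: {stack_accuracy:.2f}")
```

With stack_method="predict", each base learner contributes a single column of predicted labels, which matches the description of collecting model outputs into a new dataset.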
The popular stacked generalization techniques include the
classical stacked method[15] and the multi-feature-based stacked
generalization method[16], both of which involve a single meta-learner.
Classical stacking involves a single set of features and a
meta-learner, while multi-feature-based stacking involves
multiple sets of features and a meta-learner. The general
algorithm for the meta-classification is presented as follows:
Meta-learner ( )
Input: Labels L predicted by Base Classifiers, S
Output: Labels Lp predicted by E
// Initialize predictions for each feature in horizontal axis
Do for D = 1 to d
    // Initialize predictions for each feature in vertical axis
    Do for M = 1 to m
        If D != M Then
            Construct a new dataset T(d,m)
            LT ← Train(T(d,m))
            Lp ← Test(LT, Ltest)
        Endif
    End
End
IV. PROPOSED APPROACH
The objective of the proposed approach is to obtain
improved predictions by using an ensemble technique called
stacking. Therefore, a two-tier stacking approach was
developed.
The algorithm and other existing algorithms were applied
to a popular and reliable intrusion detection evaluation
dataset known as NSL-KDD [17], which has been used in
previous works. NSL-KDD is an upgraded version of KDD’99
developed by the Canadian Institute of Cybersecurity, University
of New Brunswick. It was designed as a solution to the
limitations of KDD’99 data. The dataset consists of 41
network features presented in Table I.
There is no sponsor for the work.
TABLE I. NSL-KDD NETWORK FEATURES WITH TYPES
S/N Feature Feature type
1 duration continuous
2 protocol_type symbolic
3 service symbolic
4 flag symbolic
5 src_bytes continuous
6 dst_bytes continuous
7 land symbolic
8 wrong_fragment continuous
9 urgent continuous
10 hot continuous
11 num_failed_logins continuous
12 logged_in symbolic
13 num_compromised continuous
14 root_shell continuous
15 su_attempted continuous
16 num_root continuous
17 num_file_creations continuous
18 num_shells continuous
19 num_access_files continuous
20 num_outbound_cmds continuous
21 is_host_login symbolic
22 is_guest_login symbolic
23 count continuous
24 srv_count continuous
25 serror_rate continuous
26 srv_serror_rate continuous
27 rerror_rate continuous
28 srv_rerror_rate continuous
29 same_srv_rate continuous
30 diff_srv_rate continuous
31 srv_diff_host_rate continuous
32 dst_host_count continuous
33 dst_host_srv_count continuous
34 dst_host_same_srv_rate continuous
35 dst_host_diff_srv_rate continuous
36 dst_host_same_src_port_rate continuous
37 dst_host_srv_diff_host_rate continuous
38 dst_host_serror_rate continuous
39 dst_host_srv_serror_rate continuous
40 dst_host_rerror_rate continuous
41 dst_host_srv_rerror_rate continuous
The attack classes, which have been categorized as probe,
user-to-root (U2R), remote-to-local (R2L) or denial of service (DoS),
are presented with their corresponding attack types
in Table II.
TABLE II. ATTACK CLASSES AND TYPES
S/N Attack Class Attack types
1 DoS Back, land, neptune, pod, smurf, teardrop, mailbomb, apache2, processtable, udpstorm
2 Probe Ipsweep, nmap, portsweep, satan, mscan, saint
3 R2L Ftp write, guess passwd, imap, multihop, phf, spy, warezclient, warezmaster, sendmail, named, snmpgetattack, snmpguess, xlock, xsnoop, worm
4 U2R Buffer overflow, loadmodule, perl, rootkit, httptunnel, ps, sqlattack, xterm
The data is composed of two datasets: a
training dataset and a testing dataset. The training dataset contains
125,973 instances, while the testing dataset consists of 22,544
instances. The distribution of the classes in both datasets is
imbalanced. Table III presents the distribution of the
NSL-KDD datasets.
TABLE III. THE DISTRIBUTION OF NSL-KDD ATTACKS
Category Normal DoS Probe U2R R2L Total
Training 67,343 45,927 11,656 52 995 125,973
Testing 9,711 7,458 2,421 200 2,754 22,544
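To make the imbalance concrete, the share of each class can be computed from the per-class counts in Table III. The snippet below is a small illustrative sketch; the dictionary and function names are hypothetical and not part of the original implementation.

```python
# Per-class instance counts copied from Table III.
train_counts = {"Normal": 67343, "DoS": 45927, "Probe": 11656, "U2R": 52, "R2L": 995}
test_counts = {"Normal": 9711, "DoS": 7458, "Probe": 2421, "U2R": 200, "R2L": 2754}


def class_shares(counts):
    """Return each class's fraction of the total number of instances."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


train_shares = class_shares(train_counts)
test_shares = class_shares(test_counts)

# Print the shares side by side as percentages.
for label in train_counts:
    print(f"{label:>6}: train {train_shares[label]:.4%}  test {test_shares[label]:.4%}")
```

In the training data, U2R accounts for well under 0.1% of the instances, which is why accuracy alone can hide poor detection of the rare classes.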
Three preprocessing steps were carried out on the
training and testing datasets:
• Removal of redundant features
• Transformation of categorical features to numerical
features
• Normalization of the features using the min-max
normalization presented in (5), where x is a feature value and
x_min and x_max are the minimum and maximum values of that feature.
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \qquad (5)$$
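The last two preprocessing steps can be sketched as follows. This is a minimal illustration on toy values, not the preprocessing code used in this work: the integer encoding scheme and the helper names encode_symbolic and min_max_normalize are assumptions, since the paper does not state how categorical features were encoded.

```python
import numpy as np


def encode_symbolic(values):
    """Map each distinct symbolic value to an integer code (sorted order)."""
    categories = sorted(set(values))
    lookup = {cat: idx for idx, cat in enumerate(categories)}
    return np.array([lookup[v] for v in values], dtype=float)


def min_max_normalize(column):
    """Scale a numeric column to [0, 1]; a constant column maps to zeros."""
    column = np.asarray(column, dtype=float)
    span = column.max() - column.min()
    if span == 0:
        return np.zeros_like(column)
    return (column - column.min()) / span


# Toy values for one symbolic and one continuous feature (illustrative only).
protocol_type = ["tcp", "udp", "icmp", "tcp"]
src_bytes = [0, 491, 146, 982]

protocol_codes = encode_symbolic(protocol_type)  # icmp -> 0, tcp -> 1, udp -> 2
features = np.column_stack([
    min_max_normalize(protocol_codes),
    min_max_normalize(src_bytes),
])
print(features)
```

In practice the minimum and maximum should be taken from the training set and reused for the testing set, so that both are scaled consistently.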
In Figure 1, the framework for the proposed stacking
approach is presented. Based on the objective of detecting
intrusions efficiently, three candidate feature sets
of 21, 11 and 9 features were extracted using the scikit-learn
Chi-Square feature selector [18]. Of these, the 9-feature set consistently showed better
performance in terms of efficiency and effectiveness during
training. Therefore, nine features were used.
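Chi-Square feature selection of this kind can be sketched with scikit-learn's SelectKBest and chi2. The sketch below runs on synthetic data and assumes that the selection was applied to min-max-scaled features (chi2 requires non-negative inputs); the exact selection procedure of this paper is not specified, so this is only one plausible reading.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in with 41 features (assumption: NOT the NSL-KDD data).
X_raw, y = make_classification(n_samples=300, n_features=41,
                               n_informative=10, random_state=0)

# chi2 needs non-negative inputs, so scale every feature to [0, 1] first.
X = MinMaxScaler().fit_transform(X_raw)

# Build the three candidate feature sets of 21, 11 and 9 features.
reduced = {}
for k in (21, 11, 9):
    selector = SelectKBest(score_func=chi2, k=k).fit(X, y)
    reduced[k] = selector.transform(X)
    print(k, "features kept, column indices:",
          np.flatnonzero(selector.get_support()))
```

Each candidate set can then be passed to the same classifiers so that accuracy and training time can be compared across the three sizes.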
Thereafter, four base classifiers, namely Support Vector
Machine (SVM), Logistic Regression (LogReg), Naïve Bayes
(NB) and Multilayer Perceptron Artificial Neural Networks
(MLP-ANN), were trained. The prediction outputs of each
classifier form the meta-features, which were combined and
classified by the classical meta-learner (stage 1). Furthermore,
a multi-feature-based stacked ensemble was applied to the
outputs of the classical meta-learner as a second-tier
meta-learner (stage 2). An artificial neural network[12] was employed
as the meta-learner in both cases.
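The two stages described above can be sketched as follows. This is a hedged illustration on synthetic data, not the authors' implementation. In particular, it assumes that out-of-fold predictions are used as meta-features and that the second-tier learner receives the stage 1 output together with the four base meta-features; the paper does not specify either detail.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Synthetic stand-in with nine features (assumption: NOT the NSL-KDD data).
X, y = make_classification(n_samples=400, n_features=9, n_informative=5,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          random_state=0, stratify=y)

base = {
    "svm": SVC(random_state=0),
    "logreg": LogisticRegression(max_iter=1000),
    "nb": GaussianNB(),
    "mlp": MLPClassifier(max_iter=500, random_state=0),
}

# Meta-features: out-of-fold predictions on the training set, and
# predictions of the fully fitted base classifiers on the test set.
meta_tr = np.column_stack(
    [cross_val_predict(m, X_tr, y_tr, cv=5) for m in base.values()])
meta_te = np.column_stack(
    [m.fit(X_tr, y_tr).predict(X_te) for m in base.values()])

# Stage 1: classical meta-learner over all four meta-features.
stage1 = MLPClassifier(max_iter=500, random_state=0)
s1_tr = cross_val_predict(stage1, meta_tr, y_tr, cv=5)
s1_te = stage1.fit(meta_tr, y_tr).predict(meta_te)

# Stage 2 (assumed wiring): second-tier meta-learner over the stage 1
# output joined with the base meta-features.
stage2 = MLPClassifier(max_iter=500, random_state=0)
stage2.fit(np.column_stack([meta_tr, s1_tr]), y_tr)
final_pred = stage2.predict(np.column_stack([meta_te, s1_te]))

two_tier_accuracy = float(np.mean(final_pred == y_te))
print(f"two-tier accuracy on synthetic data: {two_tier_accuracy:.2f}")
```

Using out-of-fold predictions keeps the meta-learners from being trained on labels the base classifiers have already seen, which would otherwise inflate the apparent benefit of stacking.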
The multi-feature-based stacking involved multilevel
combination of features, with levels ranging from 2 to 4
and meta-features ranging from 2 to 4 in each level. The
best combination was selected using Pipeline Grid-search
optimization based on parameter settings, which was
implemented with Python Scikit-Learn [18].
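One way to search over meta-feature combinations with a scikit-learn Pipeline and GridSearchCV is sketched below. It is an assumption-laden illustration rather than the authors' code: the ColumnSubset transformer is a hypothetical helper, the toy meta-features are randomly generated, and only the choice of 2 to 4 meta-features is searched (the meta-levels are not modelled).

```python
from itertools import combinations

import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline


class ColumnSubset(TransformerMixin, BaseEstimator):
    """Keep only the meta-feature columns named in `cols` (hypothetical helper).

    `cols` is a comma-separated string such as "0,2,3" so that every grid
    value is a plain scalar.
    """

    def __init__(self, cols="0,1"):
        self.cols = cols

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        idx = [int(c) for c in self.cols.split(",")]
        return np.asarray(X)[:, idx]


# Toy meta-features: four noisy copies of the label with different error
# rates, standing in for the predictions of four base classifiers.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
flip = rng.random((200, 4)) < np.array([0.05, 0.20, 0.35, 0.50])
meta = np.where(flip, 1 - y[:, None], y[:, None])

# Every combination of 2, 3 or 4 of the four meta-features.
subsets = [",".join(map(str, c))
           for size in (2, 3, 4)
           for c in combinations(range(4), size)]

pipe = Pipeline([
    ("select", ColumnSubset()),
    ("meta", MLPClassifier(max_iter=300, random_state=0)),
])
search = GridSearchCV(pipe, {"select__cols": subsets}, scoring="accuracy", cv=3)
search.fit(meta, y)

best_cols = search.best_params_["select__cols"]
print("best meta-feature columns:", best_cols,
      "cv accuracy:", round(search.best_score_, 3))
```

Treating the column subset as a pipeline hyperparameter lets the same cross-validation folds score every combination, so the comparison between combinations is like for like.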
The performance was compared with the existing stacking
ensembles as well as state-of-the-art methods based on accuracy,
precision, recall and F-score. The performance metrics are
estimated as presented in equations (1) to (4). The equations
rely on the true positive (TP), which is the number of instances
of a class that are correctly predicted; true negative (TN),
which is the number of instances of other classes that are
correctly predicted as not belonging to the class; false positive (FP), which is the number
of instances of other classes that have been incorrectly
predicted as belonging to the class; and false negative (FN), which is
the number of instances of the class that have been incorrectly
predicted as belonging to another class.
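$$\text{Accuracy (Acc)} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (1)$$
$$\text{Precision (P)} = \frac{TP}{TP + FP} \qquad (2)$$
$$\text{Recall (R)} = \frac{TP}{TP + FN} \qquad (3)$$
$$\text{F-Score (F1)} = \frac{2 \times P \times R}{P + R} \qquad (4)$$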
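The four metrics can be computed directly from the TP, TN, FP and FN counts of a class, using the standard definitions of accuracy, precision, recall and F-score. The function name and the example counts below are illustrative only.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, precision, recall, f1) from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Guard against empty denominators, e.g. a class that is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, precision, recall, f1


# Illustrative counts (not taken from the experiments in this paper).
acc, prec, rec, f1 = classification_metrics(tp=90, tn=80, fp=10, fn=20)
print(f"Acc={acc:.3f} P={prec:.3f} R={rec:.3f} F1={f1:.3f}")
# prints: Acc=0.850 P=0.900 R=0.818 F1=0.857
```

For a multiclass problem these values are computed per class and then averaged; the averaging scheme is not stated in this paper.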
Figure 1. Framework for the proposed Stacking Approach
Given data instances, I = (1, . . ., h), classifiers, S= (1, ... , m)
with parameters, R = (1, .. ., r) and values, V = (v11, ... , vmp), the
prediction scores (Lp) and final predictions (F) were evaluated
using the algorithmic methods presented below:
Base Classifier ( )
Input: Values V = (v11, . . ., vmp)
Output: Score P for each instance I
Do for S = 1 to m
    Do for R = 1 to r
        Do for I = 1 to h
            Model ← S(train, R, V) // for train and test set
            Score ← S(test, Model)
        End
    End
End
Meta-Learner 1 ( )
Input: Labels Lp predicted by Base Classifiers, S
Output: Labels Lp predicted by E
// Initialize predictions for each feature in horizontal axis
Do for D = 1 to d
    // Initialize predictions for each feature in vertical axis
    Do for M = 1 to m
        If D != M Then
            Construct a new dataset T(d,m)
            LT ← Train(T(d,m))
            Lp ← Test(LT, Ltest)
        Endif
    End
End
Meta-learner 2 ( )
Input: The predictions P corresponding to L by the meta-learner 1.
Output: The labels F predicted by the meta-learner G.
Do for R = 2 to 4 // meta-levels
    Do for U = 2 to 4 // meta-features
        B[X] ← Instant(R, U)
        Z ← Meta-learner(B[X], P)
        If Zru > Zr+1,u+1
            F ← Zru (Label corresponding to best prediction for G)
        Endif
    End
End
V. RESULTS
The proposed stacked ensemble and previous stacked
ensembles were implemented in Python Scikit-Learn [18] with
default settings.
The accuracy, precision, recall and F-score of the base
classifiers, the proposed stacked ensemble, the multi-feature-
based stacked ensemble, and the classical stacked ensemble are
presented in Table IV. The comparison of the proposed stacked
ensemble and existing ensemble methods is presented in
Table V. Figure 2 shows the bar chart for the comparative
analysis of the proposed stacked ensemble and
state-of-the-art methods based on accuracy.
TABLE IV. PERFORMANCE OF THE PROPOSED STACKED
ENSEMBLE, BASE CLASSIFIER AND PREVIOUS STACKED ENSEMBLES
Classifier Accuracy Precision Recall F-score
LogReg 0.67 0.86 0.68 0.74
SVM 0.80 0.91 0.81 0.84
MLP-ANN 0.95 0.97 0.96 0.96
NB 0.45 0.80 0.45 0.50
Classical Stacked Ensemble 0.80 0.91 0.81 0.85
Multi-feature Stacked Ensemble 0.87 0.93 0.88 0.90
Proposed Stacked Ensemble 0.97 0.98 0.98 0.98
The results in Table IV show that the proposed stacked
ensemble method outperformed the LogReg, SVM, NB and
MLP-ANN base classifiers by recording an accuracy of 0.97, precision
of 0.98, recall of 0.98 and F-score of 0.98. It also performed
better than both the classical stacking method, with an accuracy of
0.80, precision of 0.91, recall of 0.81 and F-score of 0.85, and the
multi-feature-based stacking method, with an accuracy of 0.87,
precision of 0.93, recall of 0.88 and F-score of 0.90. The
performance of the multi-feature-based stacking method was,
however, better than the performance of the classical stacking
method. Apart from the proposed stacked method, MLP-ANN
outperformed all the other base classifiers and stacking methods,
which is consistent with the neural network results reported in [12] and [13].
TABLE V. PERFORMANCE OF THE PROPOSED STACKED ENSEMBLE
AND OTHER ENSEMBLE METHODS
Method Accuracy Precision Recall F-score
Majority Voting Ensemble 0.92 0.97 0.92 0.92
Weighted Voting Ensemble 0.95 0.97 0.96 0.96
Classical Stacked Ensemble 0.80 0.91 0.81 0.85
Multi-feature Stacked Ensemble 0.87 0.93 0.88 0.90
Proposed Stacked Ensemble 0.97 0.98 0.98 0.98
The results in Table V show that the proposed stacked
ensemble method outperformed the majority voting ensemble
method, with an accuracy of 0.92, precision of 0.97, recall of 0.92
and F-score of 0.92, and the weighted voting ensemble method, with
an accuracy of 0.95, precision of 0.97, recall of 0.96 and F-score
of 0.96. However, both voting methods performed better than
the classical stacked and multi-feature-based
stacked methods. The performance of the weighted voting
ensemble method was better than the performance of the majority
voting method.
The bar chart in Fig. 2 shows that the performance of the
proposed stacked ensemble method was better than the
performances of state-of-the-art methods in the evaluation of
NSL-KDD data in terms of accuracy and F-score. The chart shows
that the method of Ludwig [13], which relied on deep
neural networks, performed better than the other state-of-the-art
methods but clearly worse than the proposed approach.
Fig. 2. Comparison of the Proposed Stacked Ensemble Method and the
State-of-the-arts
VI. CONCLUSION
This paper has proposed a stacked generalization ensemble
approach, with two meta-learners. The first meta-learner was
based on the classical stacked ensemble, while the second
meta-learner was based on the multi-feature-based stacked
ensemble. The second meta-learner was optimized to obtain
the best combination of meta-features.
By applying the algorithm to NSL-KDD intrusion
evaluation data, an accuracy of 0.97, precision of 0.98, recall
of 0.98 and F-score of 0.98 were achieved. The comparison of
the method with base classifiers, ensemble methods and
state-of-the-art methods showed that the proposed stacked generalization
approach is better. Therefore, the stacked ensemble approach
provides a more effective and efficient way of detecting intrusions
than computationally expensive
deep learning methods [13].
In future, more optimization algorithms and datasets will
be evaluated.
REFERENCES
[1] RS, Machine learning: the power and promise of
computers that learn by example. The Royal Society,
2017.
[2] S. Das, A. Dey, A. Pal, and N. Roy, “Applications of
Artificial Intelligence in Machine Learning: Review
and Prospect,” Int. J. Comput. Appl., 2015.
[3] C. F. Tsai, Y. F. Hsu, C. Y. Lin, and W. Y. Lin,
“Intrusion detection by machine learning: A review,”
Expert Systems with Applications. 2009.
[4] L. K. Hansen and P. Salamon, “Neural Network
Ensembles,” IEEE Trans. Pattern Anal. Mach. Intell.,
1990.
[5] R. E. Schapire, “The Strength of Weak Learnability,”
Mach. Learn., 1990.
[6] I. Syarif, E. Zaluska, A. Prugel-Bennett, and G. Wills,
“Application of Bagging, Boosting and Stacking,” pp.
593–602, 2012.
[7] S. S. Roy and V. Krishna, “Analyzing Intrusion
Detection System: An Ensemble based Stacking
Approach,” pp. 307–309, 2014.
[8] S. Rajagopal, P. P. Kundapur, and K. S. Hareesha, “A
Stacking Ensemble for Network Intrusion Detection
Using Heterogeneous Datasets,” vol. 2020, 2020.
[9] D. Gaikwad and R. Thool, “Intrusion Detection
System using Bagging with Partial Decision Treebase
Classifier,” Procedia Comput. Sci., vol. 49, pp. 92–98,
2015.
[10] M. Govindarajan and R. Chandrasekaran, “Intrusion
Detection using an Ensemble of Classification
Methods,” Proc. World Congr. Eng. Comput. Sci., vol.
I, no. October, 2012.
[11] X. Gao, C. Shan, C. Hu, Z. Niu, and Z. Liu, “An
Adaptive Ensemble Machine Learning Model for
Intrusion Detection,” IEEE Access, vol. 7, pp. 82512–
82521, 2019.
[12] A. Shenfield, D. Day, and A. Ayesh, “Intelligent
intrusion detection systems using artificial neural
networks,” ICT Express, vol. 4, no. 2, pp. 95–99,
2018.
[13] S. A. Ludwig, “Applying A Neural Network Ensemble
To Intrusion Detection,” vol. 9, no. 3, pp. 177–188,
2019.
[14] Y. Yang, K. Zheng, C. Wu, X. Niu, and Y. Yang,
“Building an Effective Intrusion
Detection System Using the Modified Density Peak
Clustering Algorithm and Deep Belief Networks,”
Applied Sciences, 2019.
[15] S. Raschka, Mlxtend 0.9.0. 2017.
[16] S. Malmasi and M. Zampieri, “Challenges in
discriminating profanity from hate speech,” J. Exp.
Theor. Artif. Intell., vol. 3079, pp. 1–16, 2018.
[17] CICS, “NSL-KDD,” Canadian Institute of
Cybersecurity, University of New Brunswick, 2019.
[Online]. Available:
https://www.unb.ca/cic/datasets/nsl.html. [Accessed:
05-Apr-2019].
[18] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel,
B. Thirion, et al., “Scikit-learn: Machine
Learning in Python,” J. Mach. Learn. Res., vol. 12, pp.
2825–2830, 2011.