The document discusses using machine learning algorithms like Random Forest and k-Nearest Neighbors for intrusion detection. It analyzes the KDD Cup 1999 intrusion detection dataset to classify network traffic as normal or different types of attacks. The proposed model uses Random Forest for feature selection and k-Nearest Neighbors for classification to more accurately detect known and unknown attacks. Experimental results show the combined approach achieves better detection rates than other algorithms alone, especially for novel attacks not in the training data. Further combining the algorithms into a two-stage model is suggested to improve performance.
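As a rough illustration of the second stage described above, the sketch below runs k-Nearest Neighbors on a reduced feature set. It is a minimal sketch, not the authors' implementation: the Random Forest selection step is not reproduced, and the `selected` list of feature indices is a hypothetical stand-in for its output. The function name, the Euclidean distance and the default `k=3` are also assumptions.

```python
import math
from collections import Counter

def knn_predict(train_rows, train_labels, query, selected, k=3):
    """Classify `query` by majority vote among its k nearest training rows.

    `selected` is a hypothetical list of feature indices, standing in for
    the output of a feature-selection stage (Random Forest in the summary).
    """
    def dist(a, b):
        # Euclidean distance computed over the selected features only
        return math.sqrt(sum((a[i] - b[i]) ** 2 for i in selected))

    # Sort training rows by distance to the query and keep the k closest
    nearest = sorted(zip(train_rows, train_labels),
                     key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

Restricting the distance to the selected indices keeps uninformative features from dominating the neighbor search, which is the usual motivation for selecting features before a distance-based classifier.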
Current issues - International Journal of Network Security & Its Applications...IJNSA Journal
International Journal of Network Security & Its Applications (IJNSA) is a bimonthly open access peer-reviewed journal that publishes articles which contribute new results in all areas of computer network security and its applications. The journal focuses on all technical and practical aspects of security and its applications for wired and wireless networks. The goal of this journal is to bring together researchers and practitioners from academia and industry to focus on understanding modern security threats and countermeasures, and establishing new collaborations in these areas.
A SURVEY ON THE USE OF DATA CLUSTERING FOR INTRUSION DETECTION SYSTEM IN CYBE...IJNSA Journal
In the present world, it is difficult to realize any computing application working on a standalone computing device without connecting it to the network. A large amount of data is transferred over the network from one device to another. As networking is expanding, security is becoming a major concern. Therefore, it has become important to maintain a high level of security to ensure that a safe and secure connection is established among the devices. An intrusion detection system (IDS) is therefore used to differentiate between the legitimate and illegitimate activities on the system. Different techniques are used for detecting intrusions in an intrusion detection system. This paper presents the different clustering techniques that have been implemented by different researchers in their relevant articles. The survey covers 30 papers and reports which datasets the researchers used and which evaluation metrics they used to evaluate IDS performance. This paper also highlights the pros and cons of each clustering technique used for IDS, which can be used as a basis for future work.
Evaluation of network intrusion detection using markov chainIJCI JOURNAL
Internet threats have increased significantly in day-to-day life. There is a need to develop a model in
order to maintain the security of systems. One of the most effective techniques is the Intrusion Detection
System (IDS). The purpose of an intrusion detection system is to detect intrusions through the security
devices and deal with them. In this paper, a mathematical approach is used to predict and detect
intrusions in the network. Here we discuss the combination of two algorithms, ‘K-Means + Apriori’, a
method which classifies normal and abnormal activities in a computer network. The K-Means process
partitions the training set into K clusters using Euclidean distance and introduces an outlier factor; the
Apriori algorithm is then built to prune the data by removing infrequent data from the database. Based
on the defined states, the incoming data is evaluated experimentally using a sample of the DARPA2000
dataset, and the method achieves high detection performance across the stages of an attack.
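The clustering step in the abstract above can be sketched as follows. This is a minimal sketch under stated assumptions: the abstract does not define its outlier factor, so the ratio used here (a point's distance to its centroid divided by the mean distance within its cluster) is a hypothetical choice, as is the deterministic initialisation from the first `k` points. The Apriori pruning step is not shown.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iterations=20):
    """Partition `points` into k clusters using Euclidean distance."""
    # Assumption: deterministic initialisation from the first k points
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid
        assignment = [min(range(k), key=lambda c: euclidean(p, centroids[c]))
                      for p in points]
        # Move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignment

def outlier_factors(points, centroids, assignment):
    """Hypothetical outlier factor: distance to own centroid divided by
    the mean such distance within the same cluster."""
    dists = [euclidean(p, centroids[a]) for p, a in zip(points, assignment)]
    factors = []
    for d, a in zip(dists, assignment):
        cluster = [x for x, b in zip(dists, assignment) if b == a]
        mean = sum(cluster) / len(cluster)
        factors.append(d / mean if mean > 0 else 0.0)
    return factors
```

Under this definition a factor well above 1 marks a point that sits unusually far from its cluster centre, which is one simple way to flag candidate abnormal records.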
COMBINING NAIVE BAYES AND DECISION TREE FOR ADAPTIVE INTRUSION DETECTIONIJNSA Journal
In this paper, a new learning algorithm for adaptive network intrusion detection using a naive Bayesian classifier and a decision tree is presented. It performs balanced detection, keeps false positives at an acceptable level for different types of network attacks, and eliminates redundant attributes as well as contradictory examples from training data that make the detection model complex. The proposed algorithm also addresses some difficulties of data mining such as handling continuous attributes, dealing with missing attribute values, and reducing noise in training data. Due to the large volumes of security audit data as well as the complex and dynamic properties of intrusion behaviours, several data mining-based intrusion detection techniques have been applied to network-based traffic data and host-based data in the last decades. However, various issues in current intrusion detection systems (IDS) still need to be examined. We compared the performance of our proposed algorithm with existing learning algorithms on the KDD99 benchmark intrusion detection dataset. The experimental results show that the proposed algorithm achieved high detection rates (DR) and significantly reduced false positives (FP) for different types of network intrusions using limited computational resources.
CLASSIFICATION PROCEDURES FOR INTRUSION DETECTION BASED ON KDD CUP 99 DATA SETIJNSA Journal
In a network security framework, intrusion detection is one of the core components and a fundamental way to protect a PC from many threats. A major issue in intrusion detection is the large number of false alerts; this motivates many experts to look for ways of minimizing false alerts using data mining, an analysis procedure applied to large data sets such as KDD CUP 99. This paper reviews various data mining classification procedures for handling false alerts in intrusion detection. Tests of many data mining procedures on KDD CUP 99 show that no individual procedure can reveal all attack classes with high accuracy and without false alerts. The best accuracy, with Multilayer Perceptron, is 92%; however, the best training time, with the rule-based model, is 4 seconds. It is concluded that various procedures should be used to handle the several kinds of network attacks.
An intrusion detection system for packet and flow based networks using deep n...IJECEIAES
Research on deep neural networks and big data is now converging in several respects to enhance the capabilities of intrusion detection systems (IDS). Many IDS models have been introduced to provide security over big data. This study focuses on intrusion detection in computer networks using big datasets. The advent of big data has spurred comprehensive support for cyber security by bringing forward a range of powerful algorithms to classify and analyze patterns and make better predictions more efficiently. In this study, a detection model applying deep neural networks is proposed to detect intrusions. We applied the suggested model to the latest dataset available online, formatted with packet-based and flow-based data and some additional metadata. The dataset is labeled and imbalanced, with 79 attributes and some classes having far fewer training samples than other classes. The proposed model is built using the Keras and Google TensorFlow deep learning environment. Experimental results show that intrusions are detected with an accuracy over 99% for both binary and multiclass classification with the selected best features. The receiver operating characteristic (ROC) and precision-recall curve average score is also 1. The outcome implies that deep neural networks offer a novel research model with great accuracy for intrusion detection, better than some models presented in the literature.
The main goal of Intrusion Detection Systems (IDSs) is
to detect intrusions. This kind of detection system is a
significant tool for ensuring cyber security in traditional
computer-based systems. An IDS model can be faster and reach
more accurate detection rates by selecting the most relevant
features from the input dataset. Feature selection is an important
stage of any IDS: it selects the optimal subset of features so that
training becomes faster and less complex while the performance of
the system is preserved or enhanced. In this paper, we propose a
method based on dividing the input dataset into different subsets
according to each attack. We then perform feature selection using
an information gain filter for each subset. The optimal feature set
is generated by combining the feature sets obtained for each attack.
Experimental results on the NSL-KDD dataset show that the proposed
feature selection method, with fewer features, improves the system
accuracy while decreasing the complexity. Moreover, a comparative
study of the efficiency of the feature selection technique is
performed using different classification methods. To enhance the
overall performance, another stage is conducted using Random Forest
and PART in a voting learning algorithm. The results indicate that
the best accuracy is achieved when using the product probability rule.
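The per-attack selection described in this abstract can be sketched as below. It is a minimal sketch under stated assumptions, not the paper's procedure: features are treated as categorical (continuous NSL-KDD attributes would first need discretising), each subset is assumed to contain one attack's records plus the normal records, and the names `select_features`, `normal` and `top_n` are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting on one categorical feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def select_features(rows, labels, normal="normal", top_n=1):
    """Rank features per attack subset and return the union of the top ones."""
    selected = set()
    for attack in sorted(set(labels) - {normal}):
        # Assumption: a subset holds this attack's records plus normal traffic
        subset = [(r, l) for r, l in zip(rows, labels) if l in (attack, normal)]
        sub_rows = [r for r, _ in subset]
        sub_labels = [l for _, l in subset]
        ranked = sorted(range(len(rows[0])),
                        key=lambda f: information_gain(sub_rows, sub_labels, f),
                        reverse=True)
        selected.update(ranked[:top_n])
    return sorted(selected)
```

Taking the union means a feature that matters for only one rare attack still reaches the final set, even if it ranks poorly on the dataset as a whole.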
Visualize network anomaly detection by using k means clustering algorithmIJCNCJournal
With the ever-increasing number of new attacks in today’s world, the amount of data will keep increasing,
and because of the base-rate fallacy the number of false alarms will also increase. Another problem with
the detection of attacks is that they usually aren’t detected until after the attack has taken place; this
makes defending against attacks hard and can easily lead to disclosure of sensitive information.
In this paper we choose the K-means algorithm with the KDD Cup 1999 network data set to evaluate the
performance of an unsupervised learning method for anomaly detection. The results of the evaluation
show that a high detection rate can be achieved while maintaining a low false alarm rate. This paper
presents the result of k-means clustering performed with the Cluster 3.0 tool and visualizes this result
using the TreeView visualization tool.
INTRUSION DETECTION SYSTEM CLASSIFICATION USING DIFFERENT MACHINE LEARNING AL...ijcsit
Intrusion Detection Systems (IDS) have been an effective way to achieve higher security in detecting malicious activities for the past couple of years. Anomaly detection is one type of intrusion detection. Current anomaly detection is often associated with high false alarm rates and only moderate accuracy and detection rates because it is unable to detect all types of attacks correctly. An experiment is carried out to evaluate the performance of different machine learning algorithms using the KDD-99 Cup and NSL-KDD datasets. Results show which approach performed better in terms of accuracy and detection rate with a reasonable false alarm rate.
INTRUSION DETECTION USING FEATURE SELECTION AND MACHINE LEARNING ALGORITHM WI...ijcsit
To prevent illegitimate use by intruders, intrusion detection over the network is a critical issue.
An intruder may enter any network, system or server by injecting malicious packets into the
system in order to steal, sniff, manipulate or corrupt useful and secret information. This process is
referred to as intrusion, whereas the transmission of packets by an intruder over the network for any
purpose of intrusion is referred to as an attack. With expanding networking technology, millions of
servers communicate with each other, and this expansion continues every day. Because of this, more
and more intruders are attracted, and so a smart intrusion detection model is a
primary requirement.
The essential features of the NSL-KDD data set are identified by analyzing feature selection methods.
Then, using the selected features and a machine learning approach, and analyzing the basic features of
networks over the data set, a hybrid algorithm is built. Finally, a model containing the rules for the
network features is produced from the algorithm.
A hybrid misuse intrusion detection model is built to find attacks on the system and improve intrusion
detection. Based on prior features, intrusions on the system can be detected without any previous learning.
This model combines the advantages of feature selection and machine learning techniques with misuse
detection.
Survey of network anomaly detection using markov chainijcseit
Internet threats have recently increased. Our aim is to detect intrusions in the network concisely.
Real-time issues such as DoS attacks on banks, companies, industries and organizations have
increased significantly. IDSs have been used on both the server and host side. The major challenge is to
effectively predict the periods of threats and protect the server from unauthorized users. In this study, a
novel probabilistic approach is proposed to detect network intrusions effectively. It uses a Markov chain
for probabilistic modelling of abnormal events in network systems. The degree of abnormality of the
incoming data is evaluated on the basis of the network states.
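A Markov-chain abnormality score of the kind the abstract mentions can be sketched as follows. This is a minimal sketch under stated assumptions: the abstract does not say how the degree of abnormality is computed, so the average negative log-likelihood of the observed state transitions is a hypothetical choice, as are the additive smoothing and the state names used in the usage example.

```python
import math
from collections import defaultdict

def fit_transitions(sequence, states, smoothing=1.0):
    """Estimate transition probabilities from a sequence of normal states."""
    counts = defaultdict(lambda: defaultdict(float))
    for current, nxt in zip(sequence, sequence[1:]):
        counts[current][nxt] += 1
    probs = {}
    for s in states:
        # Additive smoothing so that unseen transitions keep a non-zero probability
        total = sum(counts[s].values()) + smoothing * len(states)
        probs[s] = {t: (counts[s][t] + smoothing) / total for t in states}
    return probs

def abnormality(sequence, probs):
    """Hypothetical score: mean negative log-likelihood of the transitions.
    Higher values mean the sequence is less like the training traffic."""
    steps = list(zip(sequence, sequence[1:]))
    return -sum(math.log(probs[a][b]) for a, b in steps) / len(steps)
```

For example, after fitting on a repeating `idle`, `connect`, `transfer` pattern, a sequence that walks the same states in reverse order receives a higher score than one that follows the learned order.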
Hybrid Technique for Detection of Denial of Service (DOS) Attack in Wireless ...Eswar Publications
Wireless Sensor Networks (WSNs) are deployed in hostile environments where they are vulnerable to various security attacks such as Wormhole, Denial of Service and Sybil attacks. Various intrusion detection techniques are used to identify attacks in a network with a high level of accuracy. This paper focuses on the Denial of Service attack, since it is the most common attack and affects the environment severely. A new hybrid technique combining the Hidden Markov Model with Ant Colony Optimization (HMM+ACO) is therefore proposed, which gives better performance than the other techniques.
An approach for ids by combining svm and ant colony algorithmeSAT Journals
Abstract: This work studies the intrusion detection problem in network security; the primary task is to classify network behavior as normal or abnormal while reducing misclassification. In this paper, two efficient data mining algorithms are combined to detect network intrusions. Combining SVM and Ant Colony (CSVAC) is used for well-organized data classification; the technique takes advantage of both algorithms while avoiding their weaknesses. The algorithm is implemented and evaluated using the standard benchmark KDDCUP99 data set. Experimental results show that it produces better results than the other algorithms in terms of accuracy rate and run-time efficiency, and that it is able to detect new types of attacks. Keywords: Intrusion Detection; Support Vector Machine; Ant colony; Combined Support vector with ant colony
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academicians, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
Classification Rule Discovery Using Ant-Miner Algorithm: An Application Of N...IJMER
Numerous studies on intrusion detection have applied data mining techniques to
automatically find useful knowledge in large databases, while few studies have
proposed classification data mining approaches. In an actual risk assessment process, the discovery of
intrusion detection prediction knowledge from experts is still regarded as an important task because
experts’ predictions depend on their subjectivity. Traditional statistical techniques and artificial
intelligence techniques are commonly used to solve this classification decision-making problem. This
paper proposes an ant-miner based data mining method for discovering network intrusion detection
rules from large datasets. The result of this experiment shows clearly that the ant-miner is superior to
ID3, J48, ADtree, BFtree and Simple cart. Although different classification models have been developed
for network intrusion detection, each of them has its strengths and weaknesses, including the most
commonly applied Support Vector Machine (SVM) method and the clustering based on Self Organized
Ant Colony Network (CSOACN). Our algorithm is implemented and evaluated using the standard
benchmark KDD99 dataset. Experiments show that the ant-miner algorithm outperforms the other
methods in terms of both classification rate and accuracy.
A SURVEY ON DIFFERENT MACHINE LEARNING ALGORITHMS AND WEAK CLASSIFIERS BASED ...ijaia
Network intrusion detection often has difficulty creating classifiers that can handle unequally distributed attack categories. Generally, attacks such as Remote to Local (R2L) and User to Root (U2R) are very rare, and even in the KDD dataset these attacks make up only 2% of the overall data. As a result, the model is not able to efficiently learn the characteristics of rare categories, which leads to poor detection rates for rare attack categories like R2L and U2R. We also compared the accuracy on the KDD and NSL-KDD datasets using different classifiers in WEKA.
A SURVEY ON DIFFERENT MACHINE LEARNING ALGORITHMS AND WEAK CLASSIFIERS BASED ...gerogepatton
Network intrusion detection often finds a difficulty in creating classifiers that could handle unequal distributed attack categories. Generally, attacks such as Remote to Local (R2L) and User to Root (U2R) attacks are very rare attacks and even in KDD dataset, these attacks are only 2% of overall datasets. So,these result in model not able to efficiently learn the characteristics of rare categories and this will result in
poor detection rates of rare attack categories like R2L and U2R attacks. We even compared the accuracy of KDD and NSL-KDD datasets using different classifiers in WEKA.
ISSN: 2278 – 1323
International Journal of Advanced Research in Computer Engineering & Technology (IJARCET)
Volume 2, No 5, May 2013, pp. 1850-1854
www.ijarcet.org
Abstract: As Internet use grows, ever more information is
exchanged over it. The methods used as intrusion detection
tools to protect network systems against diverse attacks have
therefore become very important. The available IDSs are becoming
more powerful, but modern intrusion detection systems still face
complex problems: they must be reliable, extensible, and easy to
manage, and must have a low maintenance cost.
In recent years, data mining-based intrusion detection
systems (IDSs) have demonstrated high accuracy, good
generalization to novel types of intrusion, and robust
behavior in a changing environment. In the proposed
model, we focus on two of the best data mining algorithms for
intrusion detection. k-Nearest Neighbor (k-NN), a classical
pattern recognition tool, has been widely used for intruder
detection. The features used in building an Intrusion
Detection System have different characteristics, which
conventional k-NN does not take into account. Our enhanced
model combines Random Forest (RDF) and k-Nearest Neighbor
(k-NN) to perform the classification task more accurately:
RDF selects the more important features, and k-NN then
classifies more precisely than the conventional system.
Experiments and comparisons are conducted on an intrusion
dataset, the KDD Cup 1999 dataset.
Index Terms: Intrusion Detection System, Random Forest,
k-Nearest Neighbor, KDD Cup, unknown (novel) attack.
I. INTRODUCTION
Intrusion Detection Systems (IDSs) have become a
major focus of computer scientists and practitioners as
computer attacks have become an increasing threat to
commercial business as well as our daily lives. Researchers
have developed two main approaches for intrusion detection:
misuse and anomaly intrusion detection. Misuse detection consists of
representing the specific patterns of intrusions that exploit
known system vulnerabilities or violate system security
policies.
Manuscript received May, 2013.
Phyu Thi Htun, Faculty of Information and Communication
Technology, University of Technology, Yatanarpon Cyber City,
Myanmar. Phone/Mobile No: +959402560402
Kyaw Thet Khaing, Computer Hardware Department, University of
Computer Studies, Yangon, Myanmar.
On the other hand, anomaly detection assumes that all
intrusive activities are necessarily anomalous. This means that
if we could establish a normal activity profile for a system, we
could, in theory, flag all system states varying from the
established profile as intrusion attempts. These two kinds of
systems have their own strengths and weaknesses.
The former can detect known attacks with very high
accuracy via pattern matching on known signatures, but
cannot detect novel attacks because their signatures are not
yet available for pattern matching. The latter can detect novel
attacks but, in most existing systems, has a high false alarm
rate because it is difficult to generate practical normal
behavior profiles for protected systems.
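The idea of a normal activity profile can be illustrated with a minimal sketch. It is not the system proposed in this paper: it assumes a single hypothetical measurement (connections per second), summarizes normal behavior by its mean and standard deviation, and flags any value more than three standard deviations away. The function names, the sample values, and the threshold are all assumptions made for illustration.

```python
from statistics import mean, stdev

def build_profile(normal_values):
    """Summarize normal activity as (mean, standard deviation)."""
    return mean(normal_values), stdev(normal_values)

def is_anomalous(value, profile, threshold=3.0):
    """Flag a value lying more than `threshold` standard deviations from the mean."""
    mu, sigma = profile
    if sigma == 0:
        # Degenerate profile: anything different from the single seen value is flagged.
        return value != mu
    return abs(value - mu) / sigma > threshold

# Hypothetical training data: connections per second seen during normal operation.
normal_rates = [10, 12, 11, 9, 13, 10, 11, 12]
profile = build_profile(normal_rates)

print(is_anomalous(11, profile))   # a typical rate
print(is_anomalous(500, profile))  # a flood-like burst
print(is_anomalous(16, profile))   # a busy but possibly legitimate moment
```

With such a narrow profile, a moderately busy but legitimate moment can also exceed the threshold, which is one way the high false alarm rate of anomaly detection arises.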
In this paper, we consider only anomaly detection systems.
We extend the definition of anomaly detection to take into
account not only normal profiles but also known attacks,
and we explore supervised machine learning techniques. These
techniques have proven their efficiency in predicting the
different classes of the unlabeled data in the test data set for
the KDD’99 intrusion detection contest.
The rest of the paper is organized as follows. Section 2
presents related work using the corresponding machine
learning algorithms. Section 3 describes the KDD 99
intrusion detection cup dataset. Section 4 introduces the
machine learning techniques, Random Forests and k-Nearest
Neighbor, and presents the proposed system model. Section 5
describes the experimental results obtained by applying these
algorithms in the proposed system with the WEKA tool [15].
The results obtained with the algorithms over KDD99 do not
correspond to what we expect; this is due, in reality, to the
transformation of DARPA 98 to KDD 99. Section 6 concludes
our research based on the results obtained with these machine
learning algorithms.
II. RELATED WORK
IDDM (Intrusion Detection using Data Mining
Techniques) [24] is a real-time NIDS for misuse and anomaly
detection. It applies association rules, meta rules, and
characteristic rules. Jiong Zhang and Mohammad Zulkernine
[21] employ random forests for intrusion detection.
The random forests algorithm is more accurate and efficient on
large datasets such as network traffic. We also use this data mining
technique to select features and to handle the imbalanced intrusion
problem. The work most closely related to ours is also by them
[19]. They use the random forests algorithm over rule-based
NIDSs; thus, novel attacks can be detected by this network
intrusion detection system.
Important Roles Of Data Mining Techniques
For Anomaly Intrusion Detection System
Phyu Thi Htun and Kyaw Thet Khaing
In contrast to the previously proposed data mining based
IDSs, we employ random forests for anomaly intrusion
detection. The random forests algorithm is more accurate and
efficient on large datasets such as network traffic. We also use
data mining techniques to select features and to handle the
imbalanced intrusion problem [16].
Random Forest (RDF) is also intended to handle new instances
that are not considered by current supervised machine
learning techniques [21]. The k-Nearest Neighbor (k-NN)
algorithm is one of those algorithms that are very simple to
understand but work incredibly well in practice. The k-NN
method has been used as a supporting method for multi-class
classification [22][25].
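To show why k-NN is considered simple to understand, the following is a minimal sketch of plain k-NN with Euclidean distance and majority voting. It is the conventional algorithm only, not the enhanced RDF and k-NN model of this paper, and the two features and all sample records are made up for illustration.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs.
    """
    # Sort training points by Euclidean distance to the query and keep the k closest.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Made-up two-feature records: (duration, failed_logins) -> class label.
train = [
    ((1.0, 0.0), "normal"),
    ((2.0, 0.0), "normal"),
    ((1.5, 1.0), "normal"),
    ((9.0, 6.0), "attack"),
    ((8.0, 7.0), "attack"),
    ((9.5, 5.0), "attack"),
]

print(knn_predict(train, (1.2, 0.5)))
print(knn_predict(train, (8.5, 6.5)))
```

Because every feature contributes equally to the distance, irrelevant features can distort the vote, which is the motivation for selecting important features before k-NN is applied.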
III. DATASET DESCRIPTIONS
Since 1999, KDD’99 [12] has been the most widely used
data set for the evaluation of anomaly detection methods. This
data set is built from the data captured in the DARPA’98 IDS
evaluation program [8]. DARPA’98 consists of about 4 gigabytes of
compressed raw (binary) tcpdump data from 7 weeks of network
traffic. The two weeks of test data contain around 2 million
connection records. The KDD training dataset consists of
approximately 4,900,000 single connection vectors, each of
which contains 41 features and is labeled as either normal or
an attack, with exactly one specific attack type. The simulated
attacks fall into one of the following four categories:
(1) Denial of Service Attack (DoS): is an attack in which
the attacker makes some computing or memory resource too
busy or too full to handle legitimate requests, or denies
legitimate users access to a machine.
(2) User to Root Attack (U2R): is a class of exploit in
which the attacker starts out with access to a normal user
account on the system (perhaps gained by sniffing passwords,
a dictionary attack, or social engineering) and is able to
exploit some vulnerability to gain root access to the system.
(3) Remote to Local Attack (R2L): occurs when an attacker
who has the ability to send packets to a machine over a
network but who does not have an account on that machine
exploits some vulnerability to gain local access as a user of
that machine.
(4) Probing Attack: is an attempt to gather information
about a network of computers for the apparent purpose of
circumventing its security controls. Table I shows the four
categories and the attacks that belong to each category.
Table I. Classification of attacks on KDD dataset
Classification of Attacks   Attack Name
DoS                         smurf, land, pod, teardrop, neptune, back
R2L                         ftp_write, guess_passwd, imap, multihop, phf, spy, warezmaster, warezclient
U2R                         perl, buffer_overflow, rootkit, loadmodule
Probe                       ipsweep, nmap, satan, portsweep
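The grouping in Table I can be expressed as a simple lookup from attack name to category. The following is a minimal sketch; the names ATTACK_CATEGORIES, ATTACK_TO_CATEGORY and categorize are illustrative assumptions, and the handling of a trailing dot assumes the raw KDD label format.

```python
# Attack names are copied from Table I; all identifiers are illustrative.
ATTACK_CATEGORIES = {
    "DoS": ["smurf", "land", "pod", "teardrop", "neptune", "back"],
    "R2L": ["ftp_write", "guess_passwd", "imap", "multihop", "phf", "spy",
            "warezmaster", "warezclient"],
    "U2R": ["perl", "buffer_overflow", "rootkit", "loadmodule"],
    "Probe": ["ipsweep", "nmap", "satan", "portsweep"],
}

# Invert the table so that each attack name points at its category.
ATTACK_TO_CATEGORY = {
    name: category
    for category, names in ATTACK_CATEGORIES.items()
    for name in names
}


def categorize(label):
    """Map a raw connection label to Normal, a Table I category, or Unknown."""
    # Assumption: raw labels may carry a trailing dot, e.g. "smurf."
    label = label.strip().rstrip(".").lower()
    if label == "normal":
        return "Normal"
    return ATTACK_TO_CATEGORY.get(label, "Unknown")


print(categorize("smurf."), categorize("normal"), categorize("rootkit"))
```

Labels that are not listed in Table I fall through to "Unknown", which is one simple way to flag attack names that were never seen during training.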
It is important to note that the test data is not drawn from
the same probability distribution as the training data, and it
includes specific attack types that are not in the training
data, which makes the task more realistic. Some intrusion
experts believe that most novel attacks are variants of known
attacks and that the signatures of known attacks can be
sufficient to catch novel variants.
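One way to make "attack types not in the training data" concrete is to compare the label sets of the training and test files. The sketch below is a minimal illustration; the function name and the toy label lists are assumptions made for the example and are not taken from the experiments.

```python
def novel_attack_types(train_labels, test_labels):
    """Return attack labels that occur in the test data but never in training."""
    return sorted(set(test_labels) - set(train_labels) - {"normal"})


# Toy label lists: "mscan" and "saint" stand in for attacks unseen in training.
train = ["normal", "smurf", "neptune", "ipsweep", "normal"]
test = ["normal", "smurf", "mscan", "saint", "ipsweep"]

print(novel_attack_types(train, test))  # ['mscan', 'saint']
```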
The KDD CUP data is shared as four dataset files: Train+,
Train+_20Percent, Test+ and Test-21. The first two files are
training datasets and contain the general attacks. The other
two files are testing datasets and contain not only general
attacks but also unknown (novel) attacks.
The number of connections for each attack type is shown in
Table II.
Table II. Number of connections in each attack type
Datasets           Normal   DoS     U2R    R2L   Probe   Total
Train+             67343    45927   993    54    11656   125973
Train+_20Percent   13449    9234    206    12    2289    25190
Test+              9711     7458    2421   533   2421    22544
Test-21            2152     4342    2421   533   2402    11850
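The class imbalance in Table II is easier to see as proportions. The following sketch recomputes each row total and the share of every class from the counts exactly as printed in Table II; the variable and function names are illustrative assumptions.

```python
CLASSES = ["Normal", "DoS", "U2R", "R2L", "Probe"]

# Counts and totals exactly as printed in Table II.
TABLE_II = {
    "Train+":           ([67343, 45927, 993, 54, 11656], 125973),
    "Train+_20Percent": ([13449, 9234, 206, 12, 2289], 25190),
    "Test+":            ([9711, 7458, 2421, 533, 2421], 22544),
    "Test-21":          ([2152, 4342, 2421, 533, 2402], 11850),
}


def class_shares(counts):
    """Fraction of connections that fall in each class."""
    total = sum(counts)
    return {name: count / total for name, count in zip(CLASSES, counts)}


for dataset, (counts, reported_total) in TABLE_II.items():
    # Sanity check: the per-class counts add up to the reported total.
    assert sum(counts) == reported_total
    shares = class_shares(counts)
    print(dataset, {name: f"{share:.2%}" for name, share in shares.items()})
```

With these figures, the R2L column accounts for well under 0.1% of Train+, which illustrates why the imbalanced intrusion problem needs explicit handling.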
IV. MACHINE LEARNING ALGORITHMS
To overcome the limitations of the rule-based systems, a
number of IDSs employ data mining techniques. Data mining
is the analysis of (often large) observational data sets to find
patterns or models that are both understandable and useful to
the data owner [17][23]. Data mining can efficiently extract
patterns of intrusions for misuse detection, establish profiles
of normal network activities for anomaly detection, and build
classifiers to detect attacks, especially for the vast amount of
audit data. Data mining-based systems are more flexible and
deployable.
Over the past several years, a growing number of
research projects have applied data mining to intrusion
detection with different algorithms. We propose an approach
that uses random forests and k-Nearest Neighbor for intrusion
detection. These algorithms have been applied, for instance,
to prediction, probability estimation, and pattern analysis
in multimedia information retrieval and bioinformatics.
Unfortunately, to the best of our knowledge, the Random
Forests algorithm has not been fully applied to the detection
of novel (unknown) attacks in automatic intrusion detection.
Fortunately, we can take advantage of k-NN, an important
pattern recognition method based on representative points
that can classify more precisely [2].
A. Random Forest
Random Forests [4] is an ensemble of unpruned
classification or regression trees. A random forest generates
many classification trees. Each tree is constructed from a
different bootstrap sample of the original data using a tree
classification algorithm. After the forest is formed, a new
object that needs to be classified is passed down each of the
trees in the forest. Each tree gives a vote that indicates its
decision about the class of the object. The forest chooses
the class with the most votes for the object.
The main features of the random forests algorithm are
listed as follows:
• It runs efficiently on large data sets with many features.
• It can give the estimates of what features are important.
• It handles nominal data without difficulty and is resistant to over-fitting.
• It can handle unbalanced data sets.
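The bootstrap-and-vote procedure described above can be sketched in a few lines. This is a simplified illustration and not the WEKA implementation used in the experiments: one-level trees (decision stumps) stand in for full unpruned trees to keep the example short, and every name, parameter and data point below is an assumption.

```python
import random
from collections import Counter


def majority(labels, default=None):
    """Most common label, or `default` for an empty list."""
    return Counter(labels).most_common(1)[0][0] if labels else default


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement."""
    return [rng.choice(rows) for _ in rows]


def train_stump(rows, n_candidate_features, rng):
    """Fit a one-level tree using a random subset of the features."""
    n_features = len(rows[0][0])
    best = None
    for f in rng.sample(range(n_features), n_candidate_features):
        values = sorted({x[f] for x, _ in rows})
        # Candidate thresholds: midpoints between distinct values,
        # or the single value itself when the feature is constant.
        thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])] or values
        for t in thresholds:
            left = [y for x, y in rows if x[f] <= t]
            right = [y for x, y in rows if x[f] > t]
            left_label = majority(left)
            right_label = majority(right, default=left_label)
            hits = left.count(left_label) + right.count(right_label)
            if best is None or hits > best[0]:
                best = (hits, f, t, left_label, right_label)
    _, f, t, left_label, right_label = best
    return lambda x: left_label if x[f] <= t else right_label


def train_forest(rows, n_trees=10, n_candidate_features=1, seed=0):
    """Train each tree on its own bootstrap sample."""
    rng = random.Random(seed)
    return [train_stump(bootstrap_sample(rows, rng), n_candidate_features, rng)
            for _ in range(n_trees)]


def forest_predict(forest, x):
    """Each tree votes; the forest returns the class with the most votes."""
    return Counter(tree(x) for tree in forest).most_common(1)[0][0]


# Toy two-feature data, made up for illustration.
ROWS = [((0.1, 1.0), "normal"), ((0.2, 1.1), "normal"),
        ((0.3, 0.9), "normal"), ((0.4, 1.2), "normal"),
        ((5.0, 9.0), "attack"), ((5.1, 9.2), "attack"),
        ((5.2, 8.8), "attack"), ((5.4, 9.1), "attack")]

forest = train_forest(ROWS)
print(forest_predict(forest, (0.25, 1.0)), forest_predict(forest, (5.3, 9.0)))
```

Choosing the split feature from a random subset is what decorrelates the trees; the vote then averages out the mistakes of individual trees.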
B. k-Nearest Neighbor
k-NN classification is an easy-to-understand and
easy-to-implement classification technique [22]. Despite its
simplicity, it can perform well in many situations. k-NN is
particularly well suited for multi-modal classes as well as
applications in which an object can have many class labels.
For example, for the assignment of functions to genes based
on expression profiles, some researchers found that k-NN
outperformed SVM, which is a much more sophisticated
classification scheme [2].
The 1-Nearest Neighbor (1NN) classifier is an important
pattern recognition method based on representative points
[23]. In the 1NN algorithm, all training samples are taken as
representative points and the distances from each test sample
to every representative point are computed. Each test sample
is assigned the class label of the representative point
nearest to it. k-NN is an extension of 1NN that classifies a
test sample by finding its k nearest neighbors.
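The difference between 1NN and k-NN can be sketched as follows. This is a minimal illustration that assumes Euclidean distance and a majority vote among the k neighbors, which is the usual rule but is not spelled out in the text; the data points are made up.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda row: euclidean(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy training set: one isolated "attack" point surrounded by "normal" points.
TRAIN = [((0.0, 0.0), "normal"), ((0.0, 1.0), "normal"),
         ((1.0, 0.0), "normal"), ((0.6, 0.6), "attack")]
QUERY = (0.5, 0.5)

print(knn_predict(TRAIN, QUERY, k=1))  # attack (1NN follows the single nearest point)
print(knn_predict(TRAIN, QUERY, k=3))  # normal (two of the three neighbors are normal)
```

With k = 1 the classifier follows the single nearest representative point, while a larger k smooths the decision over the neighborhood.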
C. Intrusion Detection System
In this section, we describe the methods employed in the
proposed system and illustrate how to apply these methods to
detect novel attacks with a high true positive rate and a low
false positive rate for network intrusion detection.
Fig 1. The proposed System
The system identifies abnormal and normal instances in two
phases. The first is the training phase, which removes the
irrelevant features. The second is the detection phase. The
system is shown in Figure 1.
Since the operations of normal instances are specified and
show expected behavior, knowledge based (misuse) IDS
detection can be used for them. Unexpected activity
(presumably an intrusion would be unusual) is continually
being designed and evolved and cannot be captured as a
knowledge based attack; therefore, anomaly IDS detection is
performed for novel attacks.
We also report our experimental results over the KDD’99
datasets. The results show that the proposed approach
provides better performance compared to the best results from
the KDD’99 contest.
V. EXPERIMENTS AND RESULTS
In this section, we summarize our experimental results on
detecting unknown attacks over the KDD’99 datasets.
Experimental results are presented in terms of the classes
that achieved a good level of discrimination from the others
in the training set.
First, the proposed system reduces the features in the
dataset by applying the Random Forest algorithm to each
connection. The system then tries to detect various anomaly
attacks using the corrected KDD dataset. The proposed system
reduces training time and increases classification accuracy.
The experimental results were obtained using the WEKA tool
[15].
In the experiments, the system uses 10 trees and the
reduced features (default 6 in WEKA) for classification. The
accuracy of the system is higher than that of the other
systems, as shown in Figure 2, and the detection rate of the
proposed method on each attack type is shown in Figure 3.
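The value 6 is consistent with the rule that WEKA documents for its Random Forest default, int(log2(M) + 1) features per split for M predictors, applied to the 41 features of the KDD data. The short check below assumes that rule; the function name is illustrative.

```python
import math


def default_features_per_split(n_features):
    """Assumed WEKA RandomForest default: int(log2(M) + 1)."""
    return int(math.log2(n_features) + 1)


print(default_features_per_split(41))  # 6
```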
Since the test datasets “Test+” and “Test-21” have
different statistical distributions than either “Train+” or
“Train+_20Percent”, the accuracy is lower than the cross
validation results on those training files. For detecting
unknown attacks, however, on the test file that contains more
unknown (novel) attack types than the other datasets, Random
Forest achieves a higher detection rate than the other
methods, as shown in Figure 2.
Fig. 2 Comparison of accuracy results among the machine
learning algorithms Random Forest, k-NN and Naive Bayes.
VI. CONCLUSIONS AND FURTHER EXTENSION
Recent research has employed decision trees, artificial neural
networks and a probabilistic classifier and reported results
in terms of detection and false alarm rates, but false
positives and irrelevant alerts remained high in the detection
of novel attacks.
This paper has presented a survey of the various data
mining techniques that have been proposed for the
enhancement of anomaly intrusion detection systems. We also
applied classification methods to classify the attacks
(intrusions) in the DARPA dataset. The results show that
the performance of Random Forest is better than that of the
other classifiers. However, Random Forest takes more time than
the other classifiers. On the other hand, k-Nearest Neighbor
is also a good modeling algorithm in our experiments. The
reason is that Random Forest does not address pattern
recognition, whereas k-NN is a good pattern recognition
method that has been used in many studies [3][21][22]. Thus,
we can extend this experiment by combining the two
algorithms; the combined system may be expected to achieve
higher accuracy and a higher detection rate for intrusions.
Random Forest will operate in the filtering stage and k-NN
will be used as the classifier.
According to the experimental results and conclusion, we
propose a new model for higher accuracy and detection rate,
as shown in Figure 3, using the Knowledge Flow process in the
WEKA tools.
In this proposed model, as mentioned in the conclusion,
Random Forest is used for feature ranking and selection in
most research, so we will use it in the filtering process of
the preprocessing stage, where it will construct the trees and
select the random features. After the preprocessing stage, we
will use the k-NN algorithm, a pattern recognition method, in
the classification stage to detect incoming attacks.
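The two-stage idea can be sketched as a filter followed by a classifier. This is only an illustration of the data flow and not the WEKA Knowledge Flow model: the importance scores are hypothetical placeholders for what the Random Forest stage would produce, and all names and data points are assumptions.

```python
from collections import Counter


def top_features(importances, m):
    """Indices of the m highest-scoring features, returned in index order."""
    ranked = sorted(range(len(importances)),
                    key=lambda i: importances[i], reverse=True)
    return sorted(ranked[:m])


def project(x, keep):
    """Keep only the selected feature positions of one connection vector."""
    return tuple(x[i] for i in keep)


def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def knn_predict(train, query, k):
    """Majority vote among the k nearest training points."""
    nearest = sorted(train, key=lambda row: squared_distance(row[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def two_stage_predict(train, importances, query, m=2, k=3):
    keep = top_features(importances, m)                    # stage 1: filtering
    reduced = [(project(x, keep), y) for x, y in train]
    return knn_predict(reduced, project(query, keep), k)   # stage 2: k-NN


# Hypothetical importance scores for four features (made up for the example).
IMPORTANCES = [0.45, 0.05, 0.40, 0.10]
TRAIN = [((0.1, 9.0, 0.2, 3.0), "normal"), ((0.2, 1.0, 0.1, 8.0), "normal"),
         ((0.3, 5.0, 0.3, 1.0), "normal"), ((0.9, 8.5, 0.8, 2.5), "attack"),
         ((0.8, 1.5, 0.9, 7.0), "attack"), ((1.0, 4.0, 1.0, 0.5), "attack")]
QUERY = (0.85, 9.0, 0.9, 3.0)

print(top_features(IMPORTANCES, 2))                   # [0, 2]
print(two_stage_predict(TRAIN, IMPORTANCES, QUERY))   # attack
```

Dropping low-ranked features before the distance computation matters for k-NN in particular, because every retained feature contributes to the distance regardless of its relevance.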
Finally, we will report the results as text expressing the
True Positive Rate, False Positive Rate, Precision and Recall,
along with the confusion matrix that we can extract.
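The measures named above follow directly from the confusion matrix. The sketch below uses the standard definitions for a binary attack/normal split; the function names and the toy label lists are illustrative assumptions and are not results from the experiments.

```python
def confusion_counts(y_true, y_pred, positive="attack"):
    """Return (tp, fp, tn, fn), treating `positive` as the attack class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def binary_metrics(tp, fp, tn, fn):
    """Standard rates; assumes every denominator is non-zero."""
    return {
        "true_positive_rate": tp / (tp + fn),   # identical to recall
        "false_positive_rate": fp / (fp + tn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


# Toy labels, made up for illustration.
Y_TRUE = ["attack"] * 4 + ["normal"] * 6
Y_PRED = ["attack", "attack", "attack", "normal",
          "normal", "normal", "normal", "normal", "attack", "normal"]

counts = confusion_counts(Y_TRUE, Y_PRED)
print(counts)                   # (3, 1, 5, 1)
print(binary_metrics(*counts))
```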
Fig. 3 The proposed Model
REFERENCES
[1] W. Lee and S. J. Stolfo, “Data Mining Approaches for Intrusion
Detection”, the 7th USENIX Security Symposium, San Antonio, TX,
January 1998.
[2] K.T. Khaing and T.T. Naing, “Enhanced Feature Ranking and
Selection using Recursive Feature Elimination and k-Nearest
Neighbor Algorithms in SVM for IDS”, International Journal of
Network and Mobile Technology (IJNMT), No.1, Vol 1. 2010.
[3] M. Bahrololum, E. Salahi and M. Khaleghi, "Anomaly Intrusion
Detection Design using Hybrid of Unsupervised and Supervised
Neural Network", International Journal of Computer Network &
Communications(IJCNC), Vol.1, No.2, July 2009.
[4] L. Breiman, “Random Forests”, Machine Learning 45(1):5–32, 2001.
[5] V. Marinova-Boncheva, "A Short Survey of Intrusion Detection
System" , 2007.
[6] Tamas Abraham, “IDDM: Intrusion Detection Using Data Mining
Techniques”, DSTO Electronics and Surveillance Research
Laboratory, Salisbury, Australia, May 2001.
[7] M. Mahoney and P. Chan, “An Analysis of the 1999 DARPA/Lincoln
Laboratory Evaluation Data for Network Anomaly Detection”,
Proceeding of Recent Advances in Intrusion Detection (RAID)-2003,
Pittsburgh, USA, September 2003.
[8] KDD’99 datasets, The UCI KDD Archive,
http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html , Irvine,
CA, USA, 1999.
[9] KDD Cup 1999. Available on:
http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html,
December 2009.
[10] Lan Guo, Yan Ma, Bojan Cukic, and Harshinder Singh, “Robust
Prediction of Fault-Proneness by Random Forests”, Proceedings of
the 15th International Symposium on Software Reliability
Engineering (ISSRE'04), pp. 417-428, Brittany, France, November
2004.
[11] Ting-Fan Wu, Chih-Jen Lin, and Ruby C. Weng, “Probability
Estimates for Multi-class Classification by Pairwise Coupling”, The
Journal of Machine Learning Research, Volume 5, December 2004.
[12] Yimin Wu, High-dimensional Pattern Analysis in Multimedia
Information Retrieval and Bioinformatics, Doctoral Thesis, State
University of New York, January 2004.
[13] Bogdan E. Popescu, and Jerome H. Friedman, Ensemble Learning for
Prediction, Doctoral Thesis, Stanford University, January 2004.
[14] Eleazar Eskin, Andrew Arnold, Michael Prerau, Leonid Portnoy, and
Salvatore Stolfo. “A Geometric Framework for Unsupervised
Anomaly Detection: Detecting Intrusions in Unlabeled Data.”
Applications of Data Mining in Computer Security, 2002.
[15] WEKA software, Machine Learning,
http://www.cs.waikato.ac.nz/ml/weka/, The University of Waikato,
Hamilton, New Zealand.
[16] Leo Breiman and Adele Cutler, Random forests,
http://statwww.berkeley.edu/users/breiman/RandomForests/cc_hom
e.htm, University of California, Berkeley, CA, USA.
[17] David J. Hand, Heikki Mannila, and Padhraic Smyth, Principles of
Data Mining, The MIT Press, August, 2001.
[18] MIT Lincoln Laboratory, DARPA Intrusion Detection Evaluation,
http://www.ll.mit.edu/IST/ideval/,MA, USA.
[19] J. Zhang and M. Zulkernine, “Network Intrusion Detection using
Random Forests”, 2011.
[20] T. Lappas and K. Pelechrinis, “Data Mining Techniques for (Network)
Intrusion Detection Systems”.
[21] J. Zhang and M. Zulkernine, ”Anomaly Based Network Intrusion
Detection with Unsupervised Outlier Detection”, Symposium on
Network Security and Information Assurance Proc. of the IEEE
International Conference on Communications (ICC), 6 pages,
Istanbul, Turkey, June 2006.
[22] S. Thirumuruganathan , “A Detailed Introduction to K-Nearest
Neighbor (KNN) Algorithm”, World Press, May 17, 2010.
[23] X Wu, V Kumar, J Ross Quinlan, J Ghosh, “Top 10 Data mining
Algorithm”, Knowledge and Information Systems, Volume 14, Issue
1, pp 1-37 ,2008 – Springer
[24] S. Mukkamala, A.H. Sung and A. Abraham, “Intrusion Detection
Using an Ensemble of Intelligent Paradigms.” Journal of Network
and Computer Applications, Vol. 28(2005), 167-182.
[25] S. Chebrolu, , A. Abraham, and J.P. Thomas, “Feature Deduction and
Ensemble Design of Intrusion Detection Systems.” International
Journal of Computers and Security, Vol 24, Issue 4,(June 2005),
295-307
[26] A.H. Sung and S. Mukkamala, “The Feature Selection and Intrusion
Detection Problems.” Proceedings of Advances in Computer Science
- ASIAN 2004: Higher- Level Decision Making. 9th Asian
Computing Science Conference. Vol. 321(2004) , 468-482.
[27] S. Mukkamala, A.H. Sung and A. Abraham, “Modeling Intrusion
Detection Systems Using Linear Genetic Programming Approach.”
LNCS 3029, Springer Heidelberg, 2004, pp. 633-642.
[28] A. Abraham and R. Jain, “Soft Computing Models for Network
Intrusion Detection Systems.” Soft Computing in Knowledge
Discovery: Methods and Applications, Springer Chap 16, 2004,
20pp.
[29] A. Abraham, C. Grosan, and C.M. Vide, “Evolutionary Design of
Intrusion Detection Programs.” International Journal of Network
Security, Vol. 4, No. 3, 2007, pp. 328-339.
[30] Shannon C.E., Weaver W., The Mathematical Theory of
Communication. Urbana, IL, University of Illinois Press, 1949.
[31] Setiono R., Liu H., “Improving backpropagation learning with feature
selection”. Applied Intelligence: The International Journal of Artificial
Intelligence, Neural Networks, and Complex Problem-Solving
Technologies, vol. 6, pp. 129-139, 1996.
[32] Press, W.H., Flannery B.P., Teukolsky S.A. and Vetterling W.T.,
“Numerical Recipes in C”. Cambridge University Press, Cambridge.
[33] Chi J., “Entropy based feature evaluation and selection technique”,
Proc. of 4th Australian Conf. on Neural Networks (ACNN'93), pp.
181-196, 1993.
Phyu Thi Htun received the B.E. degree in Information and
Technology Engineering from Government Technological College,
Thanlyin, Myanmar, in 2006, and the M.E. degree in Information
and Technology Engineering from Westen Yangon Technology
University, Yangon, Myanmar, in 2010. She has been doing
research for the Ph.D. (IT) at the University of Technology,
Yadanapon Cyber City, Myanmar, since November 2010. Her
research interests are in network security, especially
intrusion detection systems, and survivability of cloud
computing.
Kyaw Thet Khaing received the Master of Computer Technology
from University of Computer Studies, Mandalay, Myanmar, in
2004, and the Ph.D. degree in Information Technology from
University of Computer Studies, Yangon, Myanmar, in 2010. His
research interests are in network security, especially
intrusion detection systems, and survivability of cloud
computing.