Network Intrusion Detection using
Supervised Machine Learning Technique
with Feature Selection
SEMINAR REPORT
Submitted by
JOWIN JOHN CHEMBAN
in partial fulfillment for the award of the degree
of
Bachelor of Technology
in
COMPUTER SCIENCE AND ENGINEERING
of
APJ ABDUL KALAM TECHNOLOGICAL UNIVERSITY
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
HOLY GRACE ACADEMY OF ENGINEERING
MALA 680 735
NOVEMBER 2019
CERTIFICATE
This is to certify that the seminar report entitled “Network Intrusion
Detection using Supervised Machine Learning Technique with Feature
Selection” is a bona fide record of the work done by Mr. JOWIN JOHN
CHEMBAN, Register No. HGW16CS022, under our supervision, in partial
fulfillment of the requirements for the award of the Degree of Bachelor of
Technology in Computer Science & Engineering from APJ Abdul Kalam
Technological University, Trivandrum, for the years 2016-2020.
Ms. SUJITHA B CHERKOTTU
Asst Professor, Dept. of CSE
Seminar Coordinator
Ms. VIDHU VALSAN A
Asst Professor, Dept. of CSE
Seminar Guide
Ms. SANAM ANTO
Head of Department, Dept. of CSE
Date :
ACKNOWLEDGEMENT
An endeavor over a long period may be successful only with the advice and
guidance of many well-wishers. I take this opportunity to express my gratitude to all who
encouraged me to complete this seminar. I would like to express my deep sense of
gratitude to my respected Principal Dr. THRESIAMMA PHILIP for her inspiration
and for creating an atmosphere in the college to do the seminar.
I would like to thank Ms. SANAM ANTO, Head of the Department of Computer
Science and Engineering, for providing permission and facilities to conduct the seminar
in a systematic way, and for guiding me and giving timely advice, suggestions and
whole-hearted moral support in the successful completion of this seminar.
My sincere thanks to the seminar coordinator Ms. SUJITHA B CHERKOTTU,
Assistant Professor in the Department of Computer Science and Engineering, for her
wholehearted moral support in the completion of this seminar.
My sincere thanks to my seminar guide Ms. VIDHU VALSAN A, Assistant
Professor in the Department of Computer Science and Engineering, for her wholehearted
moral support in the completion of this seminar.
Last but not least, I would like to thank all the lecturers and non-teaching staff
and my friends who have helped me in every possible way in the completion of my
seminar.
Date : JOWIN JOHN CHEMBAN
NETWORK INTRUSION DETECTION USING SUPERVISED MACHINE LEARNING TECHNIQUE WITH FEATURE SELECTION
HGAE DEPARTMENT OF CSE
ABSTRACT
A novel supervised machine learning system is developed to classify network traffic
as malicious or benign. To find the best model in terms of detection success rate,
combinations of supervised learning algorithms and feature selection methods have
been used. This study finds that an Artificial Neural Network (ANN) based machine
learning model with wrapper feature selection outperforms the support vector machine
(SVM) technique when classifying network traffic. To evaluate performance, the
NSL-KDD dataset is used to classify network traffic with the SVM and ANN supervised
machine learning techniques. A comparative study shows that the proposed model is
more efficient than other existing models with respect to intrusion detection success rate.
TABLE OF CONTENTS
CHAPTER NO. TITLE
1 INTRODUCTION
2 LITERATURE SURVEY
2.1 IMPORTANCE OF INTRUSION DETECTION SYSTEM (IDS)
2.2 MACHINE LEARNING TECHNIQUES FOR INTRUSION DETECTION
2.3 ANOMALY-BASED NETWORK INTRUSION DETECTION: TECHNIQUES, SYSTEMS AND CHALLENGES
2.4 INCREMENTAL ANOMALY-BASED INTRUSION DETECTION SYSTEM USING LIMITED LABELED DATA
2.5 A DEEP LEARNING APPROACH FOR NETWORK INTRUSION DETECTION SYSTEM
3 PROPOSED SYSTEM
4 APPLICATIONS
5 CONCLUSION
REFERENCES
LIST OF ABBREVIATIONS
NIDS Network Intrusion Detection System
IDS Intrusion Detection System
UTM Unified Threat Management
IPS Intrusion Prevention System
SVM Support Vector Machine
ANN Artificial Neural Network
DIDS Distributed Intrusion Detection System
CMDS Computer Misuse Detection System
ASIM Automated Security Measurement System
AFCERT Air Force’s Computer Emergency Response Team
TCP Transmission Control Protocol
IP Internet Protocol
HIDS Host based Intrusion Detection System
AI Artificial Intelligence
CI Computational Intelligence
ML Machine Learning
kNN k-Nearest Neighbor
MLP Multi-Layer Perceptron
UDP User Datagram Protocol
GA Genetic Algorithms
KDD Knowledge Discovery in Databases
RBF Radial Basis Function
DoS Denial of Service
R2L Remote to Local
U2R User to Root
PRB Probing
AIS Artificial Immune System
NSA Negative Selection Algorithm
CIDF Common Intrusion Detection Framework
IDWG Intrusion Detection Working Group
IDXP Intrusion Detection eXchange Protocol
IDMEF Intrusion Detection MEssage Format
OS Operating System
A-NIDS Anomaly based Network Intrusion Detection System
TP True Positive
FP False Positive
TN True Negative
FN False Negative
LAN Local Area Network
SC Service Classifier
ITI Incremental Tree Inductive
NADAL Network Anomaly Detection using Active Learning
STL Self-Taught Learning
SNIDS Signature (misuse) based Network Intrusion Detection System
ADNIDS Anomaly Detection based Network Intrusion Detection System
NB Naïve Bayesian
RF Random Forests
SOM Self-Organized Maps
DMNB Discriminative Multinomial Naïve Bayes
END Ensembles of Balanced Nested Dichotomies
OPF Optimum Path Forest
DBN Deep Belief Network
UFL Unsupervised Feature Learning
RBM Restricted Boltzmann Machine
SMR Soft Max Regression
CPU Central Processing Unit
LIST OF FIGURES
NO. TITLE
2.1.1 Number of incidents reported
2.1.2 Vulnerabilities reported
2.1.3 Layered Security approach for reducing risk
2.2.1 Average of detection rates for methods evaluated in Pavel Laskov, Patrick Düssel, Christin Schäfer, and Konrad Rieck. Learning intrusion detection: Supervised or unsupervised?
2.3.1 General CIDF architecture for IDS systems
2.3.2 Generic A-NIDS functional architecture
2.3.3 Classification of the anomaly detection techniques according to the nature of the processing involved in the ‘‘behavioural’’ model considered.
2.4.1 The proposed model called NADAL
2.5.1 The two-stage process of self-taught learning: a) Unsupervised Feature Learning on unlabeled data. b) Classification on labeled data.
2.5.2 Various steps involved in our NIDS implementation
2.5.3 Classification accuracy using self-taught learning and soft-max regression for 2-Class, 5-Class, and 23-Class when applied to training data
2.5.4 Precision, Recall, and F-Measure values using self-taught learning and soft-max regression for 2-Class when applied to training data
2.5.5 Classification accuracy using self-taught learning and soft-max regression for 2-class and 5-class when applied to test data
2.5.6 Precision, Recall, and F-Measure values using self-taught learning and soft-max regression for 2-class when applied to test data
2.5.7 Precision, Recall, and F-Measure values using self-taught learning and soft-max regression for 5-class when applied to test data
3.1 Proposed supervised machine learning classifier system
3.2 SVM classifier in two-dimensional problem spaces
3.3 Artificial neural network showing the input, output and hidden layers
LIST OF TABLES
NO. TITLE
2.3.1 Fundamentals of the A-NIDS techniques
2.4.1 ACCURACY AND KAPPA FOR TEN RANDOMIZATIONS: NADAL VS. INCREMENTAL NAIVE BAYESIAN CLASSIFIER
2.5.1 Traffic records distribution in the training
3.1 RESULT OF FEATURE SELECTION
3.2 RESULT OF CLASSIFICATION
3.3 PERFORMANCE COMPARISON WITH EXISTING MODELS
CHAPTER 1
INTRODUCTION
1.1 NIDS using Supervised Machine Learning with Feature Selection
With the widespread use of the internet and growing access to online content,
cybercrime is also occurring at an increasing rate. An intrusion is an attempt by an
attacker, sometimes called a hacker or cracker, to break into or misuse a system or network.
Intrusion detection is the first step in preventing a security attack. Hence security
solutions such as firewalls, Intrusion Detection Systems (IDS), Unified Threat Management
(UTM) and Intrusion Prevention Systems (IPS) are receiving much attention in studies.
An IDS detects attacks by collecting information from a variety of system and network
sources and then analyzing that information for possible security breaches.
An IDS installed on a network or system serves much the same purpose as a burglar
alarm installed in a house. Through various methods, both detect when an intruder is
present, and both subsequently issue some type of warning or alert. A network-based
IDS analyzes the data packets that travel over a network, and this analysis is carried
out in two ways: signature-based detection and anomaly-based detection. To date,
anomaly-based detection lags far behind signature-based detection, and hence it
remains a major area of research. The challenge with anomaly-based intrusion
detection is that it must deal with novel attacks for which there is no prior
knowledge to identify the anomaly. The system therefore needs the intelligence to
separate harmless traffic from malicious or anomalous traffic, and for that purpose
researchers have been exploring machine learning techniques over the last few years.
An IDS, however, is not an answer to all security-related problems. For example, an
IDS cannot compensate for weak identification and authentication mechanisms or for
weaknesses in network protocols.
Research in the field of intrusion detection started in 1980, and the first such
model was published in 1987. Despite huge commercial investment and substantial
research over the last few decades, intrusion detection technology is still
immature and hence not fully effective. While signature-based network IDS have seen
commercial success and widespread adoption by technology-based organizations
throughout the globe, anomaly-based network IDS have not succeeded on the same
scale. For that reason, anomaly-based detection is currently a major focus of research
and development in the field of IDS, and key issues remain to be solved before any
wide-scale deployment of anomaly-based intrusion detection systems. The literature
today is also limited when it comes to comparing how intrusion detection performs
with different supervised machine learning techniques.
Anomaly-based network IDS is a valuable technology for protecting target systems and
networks against malicious activities. Despite the variety of anomaly-based network
intrusion detection techniques described in the literature in recent years, security
tools with anomaly detection functionality are only beginning to appear, and some
important problems remain to be solved. Several anomaly-based techniques have been
proposed, including Linear Regression, Support Vector Machines (SVM), Genetic
Algorithms, Gaussian mixture models, the k-nearest neighbor algorithm, the Naive Bayes
classifier and Decision Trees. Among them the most widely used learning algorithm is
SVM, as it has already established itself on different types of problems. One major
issue with anomaly-based detection is that, although all these techniques can detect
novel attacks, they generally suffer from a high false alarm rate. The cause is the
complexity of generating profiles of practical normal behaviour by learning from the
training data sets. Today Artificial Neural Networks (ANN) are often trained by the
back-propagation algorithm, which has been around since 1970 as the reverse mode of
automatic differentiation.
A major challenge in evaluating the performance of network IDS is the
unavailability of a comprehensive network-based data set. Most of the anomaly-based
techniques proposed in the literature were evaluated using the KDD CUP 99
dataset. In this paper we used two machine learning techniques, SVM and ANN, on
NSL-KDD, which is a popular benchmark dataset for network intrusion.
The promise and the contributions of machine learning to date are fascinating.
Many real-life applications in use today are powered by machine learning, and its
influence is likely to keep growing. Hence, we hypothesize that the challenge of
identifying new or zero-day attacks faced by technology-enabled organizations today
can be overcome using machine learning techniques. Here we developed a supervised
machine learning model that can classify unseen network traffic based on what is
learnt from seen traffic. We used both the SVM and ANN learning algorithms to find
the best classifier with the higher accuracy and success rate.
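The idea of wrapping a feature selection search around a supervised classifier can be
illustrated with a minimal sketch. The code below is not the SVM or ANN model of this
report: it assumes a simple nearest-centroid classifier as a stand-in, a greedy forward
search as the wrapper strategy, and made-up two-feature records in place of NSL-KDD.
All function names and data values are hypothetical.

```python
from statistics import mean


def centroid_fit(X, y):
    """Return {label: centroid} for feature vectors X with labels y."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return {label: [mean(col) for col in zip(*rows)] for label, rows in groups.items()}


def centroid_predict(model, row):
    """Predict the label whose centroid is nearest (squared Euclidean distance)."""
    return min(model, key=lambda lab: sum((a - b) ** 2 for a, b in zip(row, model[lab])))


def accuracy(X_train, y_train, X_val, y_val, features):
    """Train on the chosen feature columns only and score on the validation set."""
    def project(X):
        return [[row[i] for i in features] for row in X]

    model = centroid_fit(project(X_train), y_train)
    hits = sum(centroid_predict(model, r) == t for r, t in zip(project(X_val), y_val))
    return hits / len(y_val)


def forward_select(X_train, y_train, X_val, y_val):
    """Greedy wrapper search: keep adding the feature that most improves accuracy."""
    selected, best = [], 0.0
    remaining = list(range(len(X_train[0])))
    while remaining:
        scored = [(accuracy(X_train, y_train, X_val, y_val, selected + [f]), f)
                  for f in remaining]
        score, feat = max(scored)  # ties go to the higher feature index
        if score <= best:          # stop when no candidate improves the score
            break
        selected.append(feat)
        remaining.remove(feat)
        best = score
    return selected, best


# Illustrative data only: feature 0 separates the classes, feature 1 is noise.
X_train = [[0.1, 5.0], [0.2, 1.0], [0.9, 4.0], [1.0, 2.0]]
y_train = ["normal", "normal", "attack", "attack"]
X_val = [[0.15, 3.0], [0.95, 3.0]]
y_val = ["normal", "attack"]

selected, score = forward_select(X_train, y_train, X_val, y_val)
print(selected, score)
```

In this toy run the search keeps only the informative feature, because adding the noisy
one does not raise validation accuracy. A wrapper method is defined by this loop of
retraining and rescoring the classifier for each candidate subset; the classifier inside
the loop could be swapped for any other supervised model.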
CHAPTER 2
LITERATURE SURVEY
2.1 IMPORTANCE OF INTRUSION DETECTION SYSTEM (IDS)
Intruders spread across the Internet have become a major threat. Researchers have
proposed a number of techniques, such as firewalls and encryption, to prevent
penetration and protect computer infrastructure, yet intruders still manage to
penetrate computers. IDS has therefore attracted much attention from researchers. An
IDS monitors computer resources and reports any anomalous or strange patterns of
activity. The aim of this paper is to explain the stages in the evolution of the idea
of IDS and its importance to researchers, research centres, security and the military,
and to examine the importance of intrusion detection systems, their categories and
classifications, and where an IDS can be placed to reduce the risk to the network.
Security is an important issue for the networks of all companies and institutions
at present. Intruders try many ways to gain access to the data networks and Web
services of these companies, despite the development of multiple ways to prevent
intrusion into the network infrastructure via the Internet, such as firewalls and
encryption.
IDS is a relatively new technology among the intrusion detection methods that have
emerged in recent years. The main role of an intrusion detection system in a network
is to help computer systems prepare for and deal with network attacks.
Intrusion detection functions include:
• Monitoring and analyzing both user and system activities
• Analyzing system configurations and vulnerabilities
• Assessing system and file integrity
• Ability to recognize patterns typical of attacks
• Analysis of abnormal activity patterns
• Tracking user policy violations
The purpose of an IDS is to help computer systems deal with attacks. To do so,
an IDS collects information from several different sources within the computer
systems and networks and compares this information with pre-existing patterns to
determine whether there are attacks or weaknesses.
2.1.1 INTRUSION DETECTION SYSTEMS: A BRIEF HISTORY
The goal of intrusion detection is to monitor network assets to detect anomalous
behaviour and misuse in the network. This concept has been around for nearly twenty
years, but only recently has it seen a dramatic rise in popularity and incorporation
into the overall information security infrastructure. Intrusion detection was born in
1980 with James Anderson's paper, Computer Security Threat Monitoring and
Surveillance. Since then, several pivotal events in IDS technology have advanced
intrusion detection to its current state.
James Anderson's seminal paper, written for a government organization,
introduced the notion that audit trails contain vital information that could be valuable
in tracking misuse and understanding user behaviour. With the release of this paper,
the concept of "detecting" misuse and specific user events emerged. His insight into
audit data and its importance led to tremendous improvements in the auditing
subsystems of virtually every operating system. Anderson's hypotheses also provided
the foundation for future intrusion detection system design and development. His work
was the start of host-based intrusion detection and IDS in general.
In 1983, SRI International, and Dr. Dorothy Denning, began working on a
government project that launched a new effort into intrusion detection system
development. Their goal was to analyze audit trails from government mainframe
computers and create profiles of users based upon their activities. One year later, Dr.
Denning helped to develop the first model for intrusion detection, the Intrusion
Detection Expert System (IDES), which provided the foundation for the IDS
technology development that was soon to follow.
In 1984, SRI also developed a means of tracking and analyzing audit data
containing authentication information of users on ARPANET, the original Internet.
Soon after, SRI completed a Navy SPAWAR contract with the realization of the first
functional intrusion detection system, IDES. Using her research and development work
at SRI, Dr. Denning published the decisive work, An Intrusion Detection Model, which
revealed the necessary information for commercial intrusion detection system
development. The subsequent iteration of this tool was called the Distributed Intrusion
Detection System (DIDS). DIDS augmented the existing solution by tracking client
machines as well as the servers it originally monitored. Finally, in 1989, the developers
from the Haystack project formed the commercial company, Haystack Labs, and
released the last generation of the technology, Stalker. Crosby Marks says that "Stalker
was a host-based, pattern matching system that included robust search capabilities to
manually and automatically query the audit data." The Haystack advances, coupled
with the work of SRI and Denning, greatly advanced the development of host-based
intrusion detection technologies.
Commercial development of intrusion detection technologies began in the early
1990s. Haystack Labs was the first commercial vendor of IDS tools, with its Stalker
line of host-based products. SAIC was also developing a form of host-based intrusion
detection, called Computer Misuse Detection System (CMDS). Simultaneously, the Air
Force's Cryptologic Support Center developed the Automated Security Measurement
System (ASIM) to monitor network traffic on the US Air Force's network. ASIM made
considerable progress in overcoming scalability and portability issues that previously
plagued NID products. Additionally, ASIM was the first solution to incorporate both a
hardware and software solution to network intrusion detection. ASIM is still currently
in use and managed by the Air Force's Computer Emergency Response Team
(AFCERT) at locations all over the world. As often happened, the development group
on the ASIM project formed a commercial company in 1994, the Wheel Group. Their
product, NetRanger, was the first commercially viable network intrusion detection
device.
The intrusion detection market began to gain in popularity and truly generate
revenues around 1997. In that year, the security market leader, ISS, developed a
network intrusion detection system called RealSecure. A year later, Cisco recognized
the importance of network intrusion detection and purchased the Wheel Group,
attaining a security solution they could provide to their customers. Similarly, the first
visible host-based intrusion detection company, Centrax Corporation, emerged as a
result of a merger of the development staff from Haystack Labs and the departure of the
CMDS team from SAIC. From there, the commercial IDS world expanded its market-
base and a roller coaster ride of start-up companies, mergers, and acquisitions ensued.
Figure 2.1.1 : Number of incidents reported
Network intrusion detection deals with information passing on the wire
between hosts. Typically referred to as "packet-sniffers," network intrusion detection
devices intercept packets travelling in and out of the network along various
communication mediums and protocols, usually TCP/IP. Once captured, the packets are
analyzed in a number of different ways. Some IDS devices simply compare the packet to
a signature database consisting of known attacks and malicious packet "fingerprints",
while others look for anomalous packet activity that might indicate malicious
behaviour.
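The comparison of a captured packet against a signature database can be sketched as
follows. This is a minimal illustration, assuming that signatures are plain byte
patterns searched for in the packet payload; the signature names and patterns are
hypothetical examples, not entries from any real IDS rule set.

```python
# Hypothetical signature database: name -> byte pattern (illustrative values only).
SIGNATURES = {
    "directory-traversal": b"../../",
    "sql-injection-probe": b"' OR '1'='1",
}


def match_signatures(payload: bytes, signatures=SIGNATURES):
    """Return the sorted names of all signatures whose pattern occurs in the payload."""
    return sorted(name for name, pattern in signatures.items() if pattern in payload)


print(match_signatures(b"GET /../../etc/passwd HTTP/1.1"))  # one signature matches
print(match_signatures(b"GET /index.html HTTP/1.1"))        # no signature matches
```

A payload that contains none of the stored patterns passes unnoticed, which is why a
purely signature-based device cannot flag attacks that are missing from its database.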
An IDS monitors network traffic for activity that is banned on the network,
comparing what it observes with predefined blueprints of known attacks and
vulnerabilities. Its main job is to alert network administrators so that they can take
corrective action, such as blocking access to vulnerable ports, denying access to
specific IP addresses or shutting down services used to carry out attacks. The IDS is
thus a front-line weapon in the network administrators' war against hackers.
2.1.2 CATEGORIES OF INTRUSION DETECTION SYSTEM
Intrusion detection systems are classified into three categories: signature-based
detection systems, anomaly-based detection systems and specification-based detection
systems.
1) Signature based Detection System
A signature based detection system (also called misuse based) is very effective
against known attacks. It depends on receiving regular updates of attack
patterns and is unable to detect previously unknown threats or new variants.
2) Anomaly based Detection System
This type of detection classifies network activity as normal or anomalous. The
classification is based on rules or heuristics rather than patterns or signatures,
so to implement such a system we first need to know the normal behaviour of
the network.
Unlike a misuse-based detection system, an anomaly based detection system
can detect previously unknown threats, but it is more likely to raise false
positives.
Figure 2.1.2 : Vulnerabilities reported
3) Specification based Detection System
This type of detection system monitors processes and matches their actual
behaviour against a specification of the program. Any abnormal behaviour
triggers an alert. The specifications must be maintained and updated whenever
the monitored programs change. This approach can detect previously unknown
attacks, and its number of false positives can be lower than that of the
anomaly detection approach.
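The anomaly-based category in the list above relies on first learning what normal
behaviour looks like. A minimal sketch of that idea follows, assuming a single numeric
metric (connections per minute) and a threshold of three standard deviations; both the
metric and the threshold are illustrative assumptions rather than values taken from the
surveyed work.

```python
from statistics import mean, stdev


def fit_baseline(normal_values):
    """Learn the mean and standard deviation of a metric from attack-free traffic."""
    return mean(normal_values), stdev(normal_values)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value that lies more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    return abs(value - mu) > threshold * sigma


# Hypothetical connections-per-minute samples observed during normal operation.
baseline = fit_baseline([20, 22, 19, 21, 18])

print(is_anomalous(23, baseline))  # small deviation from the learned profile
print(is_anomalous(90, baseline))  # large deviation from the learned profile
```

Because any sufficiently unusual value is flagged, this style of detector needs no
prior knowledge of an attack, but a legitimate burst of traffic is flagged in exactly
the same way, which is the source of the false positives mentioned above.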
2.1.3 CLASSIFICATION OF INTRUSION DETECTION SYSTEM
Intrusion detection systems are classified into three types:
1) Host based IDS (HIDS)
This type is placed on a single device such as a server or workstation, where
data collected from different sources is analyzed locally on the machine.
A HIDS can use both anomaly and misuse detection.
2) Network based IDS (NIDS)
NIDS are deployed at strategic points in the network infrastructure. A NIDS
can capture and analyze data to detect known attacks by comparing it against
patterns or signatures in a database, or detect illegal activities by scanning
traffic for anomalous activity. NIDS are also referred to as “packet-sniffers”
because they capture the packets passing through the communication
mediums.
3) Hybrid based IDS
A hybrid IDS combines the management of, and alerting from, both network
and host-based intrusion detection devices. It provides the logical complement
to NIDS and HIDS: central intrusion detection management.
Figure 2.1.3 : Layered Security approach for reducing risk
2.1.4 CONCLUSION
An intrusion detection system is a part of the defensive operations that
complements defenses such as firewalls and UTM. The intrusion detection system
basically detects signs of attack and then raises alerts. According to the detection
methodology, intrusion detection systems are typically categorized as misuse detection
and anomaly detection systems. From the deployment perspective, they are classified as
network-based or host-based IDS. Current intrusion detection systems collect
information from both network and host resources. In terms of performance, an
intrusion detection system becomes more accurate as it detects more attacks and raises
fewer false positive alarms.
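The two quantities in that last statement, attacks detected and false positive alarms
raised, can be computed from the TP, FP, TN and FN counts. The sketch below uses the
standard definitions of detection rate (true positive rate) and false positive rate;
the label names and the sample predictions are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive="attack"):
    """Count true positives, false positives, true negatives and false negatives."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def detection_rate(tp, fn):
    """Fraction of real attacks that were flagged: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def false_positive_rate(fp, tn):
    """Fraction of benign records that were flagged: FP / (FP + TN)."""
    return fp / (fp + tn) if (fp + tn) else 0.0


# Illustrative labels only.
y_true = ["attack", "attack", "normal", "normal", "normal", "attack", "normal", "normal"]
y_pred = ["attack", "normal", "normal", "attack", "normal", "attack", "normal", "normal"]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(tp, fp, tn, fn)
print(detection_rate(tp, fn), false_positive_rate(fp, tn))
```

Reporting both rates matters because a detector that flags everything reaches a perfect
detection rate while its false positive rate makes it unusable.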
2.2 MACHINE LEARNING TECHNIQUES FOR INTRUSION
DETECTION
An Intrusion Detection System (IDS) is software that monitors a single computer or
a network of computers for malicious activities (attacks) that are aimed at stealing or
censoring information or corrupting network protocols. Most techniques used in
today’s IDS are not able to deal with the dynamic and complex nature of cyber-attacks
on computer networks. Hence, efficient adaptive methods such as machine learning
techniques can result in higher detection rates, lower false alarm rates and
reasonable computation and communication costs. In this paper, we study several such
schemes and compare their performance. We divide the schemes into methods based on
classical artificial intelligence (AI) and methods based on computational intelligence
(CI). We explain how various characteristics of CI techniques can be used to build
efficient IDS.
Today, political and commercial entities are increasingly engaging in
sophisticated cyber-warfare to damage, disrupt, or censor information content in
computer networks. In designing network protocols, there is a need to ensure reliability
against intrusions of powerful attackers that can even control a fraction of parties in the
network. The controlled parties can launch both passive (e.g., eavesdropping,
nonparticipation) and active attacks (e.g., jamming, message dropping, corruption, and
forging).
Intrusion detection is the process of dynamically monitoring events occurring in
a computer system or network, analyzing them for signs of possible incidents and often
interdicting the unauthorized access. This is typically accomplished by automatically
collecting information from a variety of systems and network sources, and then
analyzing the information for possible security problems.
Motivation
Traditional intrusion detection and prevention techniques, like firewalls, access control
mechanisms, and encryption, have several limitations in fully protecting networks and
systems from increasingly sophisticated attacks like denial of service. Moreover, most
systems built on such techniques suffer from high false positive and false
negative detection rates and an inability to adapt continuously to changing malicious
behaviours. In the past decade, however, several Machine Learning (ML) techniques
have been applied to the problem of intrusion detection with the hope of improving
detection rates and adaptability. These techniques are often used to keep the attack
knowledge bases up-to-date and comprehensive.
Study Approach
In this paper, we study several papers that use ML methods for detecting
malicious behaviour in distributed computer systems. There is a huge body of work in
this area; thus, we decided to carefully select a few papers based on two factors:
diversity and citation count. By diversity we mean that most ML techniques for IDS are
covered but only one paper is picked from the set of papers that use the same technique.
The papers are also chosen based on their citation count, as this factor indicates
how much the corresponding work has influenced the community. All non-survey
papers studied here are cited at least 100 times.
2.2.1 CHALLENGES AND APPROACHES
An IDS generally has to deal with problems such as large network traffic
volumes, highly uneven data distribution, the difficulty to realize decision boundaries
between normal and abnormal behaviour, and a requirement for continuous adaptation
to a constantly changing environment. In general, the challenge is to efficiently capture
and classify various behaviours in a computer network. Strategies for classification of
network behaviours are typically divided into two categories: misuse detection and
anomaly detection.
Misuse detection techniques examine both network and system activity for
known instances of misuse using signature matching algorithms. This technique is
effective at detecting attacks that are already known. However, novel attacks are often
missed, giving rise to false negatives. Alerts may be generated by the IDS, but reacting
to every alert wastes time and resources, leading to instability of the system. To
overcome this problem, an IDS should not start an elimination procedure as soon as the
first symptom has been detected; rather, it should be patient enough to collect alerts
and decide based on their correlation.
Anomaly detection systems rely on constructing a model of user behaviour that
is considered normal. This is achieved by using a combination of statistical or machine
learning methods to examine network traffic or system calls and processes. The
detection of novel attacks is more successful using the anomaly detection approach as
any deviant behaviour is classified as an intrusion. However, normal behaviour in a
large and dynamic system is not well defined and it changes over the time. This often
results in a substantial number of false alarms known as false positives. A network-
based IDS looks at the incoming network traffic for patterns that can signify whether a
person is probing the network for vulnerable computers. Since responding to each alert
consumes relatively large amounts of time and resources, IDS should not respond to
every alert it generates. Disregarding this fact may result in a self-inflicted denial-of-
service. To overcome this problem, alerts should be aggregated and correlated in order
to produce fewer but more expressive and remarkable alerts.
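One simple way to aggregate alerts is sketched below. It assumes that each alert is a
(timestamp in seconds, source address, alert kind) tuple and that alerts sharing a
source and kind within a fixed time window of the first one belong together. The window
length, the field names and the sample alerts are illustrative assumptions; the surveyed
papers do not prescribe this particular scheme.

```python
def aggregate_alerts(alerts, window=60):
    """Merge alerts with the same (source, kind) that occur within `window`
    seconds of the first alert in their group."""
    merged = []       # aggregated alerts, in order of first occurrence
    open_groups = {}  # (source, kind) -> index of that key's latest group in `merged`
    for ts, source, kind in sorted(alerts):
        key = (source, kind)
        idx = open_groups.get(key)
        if idx is not None and ts - merged[idx]["first"] <= window:
            merged[idx]["last"] = ts
            merged[idx]["count"] += 1
        else:
            open_groups[key] = len(merged)
            merged.append({"source": source, "kind": kind,
                           "first": ts, "last": ts, "count": 1})
    return merged


# Hypothetical raw alerts: (timestamp in seconds, source address, alert kind).
raw = [
    (0, "10.0.0.5", "port-scan"),
    (10, "10.0.0.5", "port-scan"),
    (15, "10.0.0.9", "login-failure"),
    (50, "10.0.0.5", "port-scan"),
    (200, "10.0.0.5", "port-scan"),
]

for group in aggregate_alerts(raw):
    print(group)
```

Here five raw alerts collapse into three aggregated ones, each carrying a count and a
time span, so an operator or an automated response can act once per group instead of
once per alert.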
2.2.1.1 MACHINE LEARNING APPROACHES
We divide the ML-based approaches to intrusion detection into two categories:
approaches based on Artificial Intelligence (AI) techniques and approaches based on
Computational Intelligence (CI) methods. AI techniques refer to the methods from the
domain of classical AI like statistical modeling and while CI techniques refer to nature-
inspired methods that are used to deal with complex problems that classical methods
are unable to solve. Important CI methodologies are evolutionary computation, fuzzy
logic, artificial neural networks, and artificial immune systems. CI is different from the
well-known field of AI. AI handles symbolic knowledge representation, while CI
handles numeric representation of information. Although the boundary between these
two categories is not always clear and many hybrid methods have been proposed in the
literature, most previous work is based mainly on one or the other of these categories.
Moreover, it would be quite useful to understand how well nature-based techniques
perform in contrast to classical methods.
1) AI-BASED TECHNIQUES
Laskov et al. develop an experimental framework for comparative analysis of
supervised (classification) and unsupervised learning (clustering) techniques for
detecting malicious activities. The supervised methods evaluated in this work
include decision trees, k-Nearest Neighbor (kNN), Multi-Layer Perceptron (MLP),
and Support Vector Machines (SVM). The unsupervised algorithms include γ-
algorithm, k-means clustering, and single linkage clustering. They define two
scenarios for evaluating the aforementioned learning algorithms from both
categories. In the first scenario, they assume that training and test data come from
the same unknown distribution. In the second scenario, they consider the case
where the test data comes from new (i.e., unseen) attack patterns. This scenario
helps us understand how well an IDS can generalize its knowledge to new
malicious patterns, which is essential for an IDS, since today’s sophisticated
adversaries tend to use several intrusion patterns to evade modern IDS.
The results show that the supervised algorithms in general show better
classification accuracy on the data with known attacks (the first scenario). Among
these algorithms, the decision tree algorithm has achieved the best results (95% true
positive rate, 1% false-positive rate). The next two best algorithms are the MLP and
the SVM, followed by the k-nearest neighbor algorithm. However, if there are
unseen attacks in the test data, then the detection rate of supervised methods
decreases significantly. This is where the unsupervised techniques perform better as
they do not show significant difference in accuracy for seen and unseen attacks.
Figure 2.2.1 shows the average true/false positive rates of all methods evaluated. As
the plots show, the supervised techniques generally perform better although
unsupervised methods give more robust results in both scenarios.
Zanero and Savaresi introduce a two-tier anomaly-based architecture for
IDS in TCP/IP networks based on unsupervised learning. The first tier is an
unsupervised clustering algorithm, which builds small-size patterns from the
payload of network packets. In other words, TCP or UDP packets are assigned to two
clusters representing normal and abnormal traffic. The second tier is an optimized
traditional anomaly detection algorithm improved by the availability of data on the
packet payload content. The motivation behind the work is that unsupervised
learning methods are usually more powerful than supervised methods in
generalizing attack patterns; thus, there is hope that such an architecture can resist
polymorphic attacks more efficiently.
Figure 2.2.1 : Average detection rates for the methods evaluated in Pavel Laskov, Patrick
Düssel, Christin Schäfer, and Konrad Rieck, Learning intrusion detection: Supervised or
unsupervised? In Image Analysis and Processing ICIAP 2005, volume 3617 of Lecture
Notes in Computer Science, pages 50–57. Springer Berlin Heidelberg, 2005, in two
scenarios: test data contains only known attacks (left) and test data contains unknown
attacks (right).
Lee and Stolfo build a classifier to detect anomalies in networks using data
mining techniques. They implement two general data mining algorithms that are
essential in describing normal behaviour of a program or user. They propose an
agent-based architecture for intrusion detection systems, where the learning agents
continuously compute and provide the updated detection models to the detection agents. They
conduct experiments on Sendmail system call data and network tcpdump data to
demonstrate the effectiveness of their classification models in detecting anomalies.
They finally argue that the most important challenge of using data mining
approaches in intrusion detection is that they require a large amount of audit data in
order to compute the profile rule sets.
Sommer and Paxson study the imbalance between the extensive amount of
research on ML-based intrusion detection versus the lack of operational
deployments of such systems. They identify challenges particular to network
intrusion detection and provide a set of guidelines for fortifying future research on
ML-based intrusion detection. More specifically, they argue that an anomaly-based
IDS requires outlier detection while the classic application of ML is a classification
problem that deals with finding similarities between activities. It is true that in some
cases, an outlier detection problem can be modeled as a classification problem in
which there are two classes: normal and abnormal. In machine learning, one needs
to train a system with training patterns of all classes while in anomaly detection one
can only train on normal patterns. This means that ML is better suited to
finding variations of known attacks than previously unknown malicious
activity. This is why ML methods have been applied to spam detection more
effectively than to intrusion detection.
2) CI-BASED TECHNIQUES
In this section, we review several algorithms based on the four core
techniques of computational intelligence.
• Genetic Algorithms (GA)
Genetic algorithms are aimed at finding optimal solutions to problems. Each
potential solution to a problem is represented as a sequence of bits (genes)
called a genome or chromosome. A genetic algorithm begins with a set of
genomes (population) and an evaluation function called fitness function that
measures the quality (goodness) of each genome. The algorithm uses two
reproduction operators called crossover and mutation to create new
descendants (solutions), which are then evaluated. Crossover determines
how various properties of the parents in a population are inherited by the
descendants. Mutation is the spontaneous alteration of a single gene.
Sinclair et al. use genetic algorithms and decision trees to create rules for an
intrusion detection expert system, which supports the analyst’s job in
differentiating anomalous network activity from normal network traffic. In
this work, GA is used to evolve simple rules for network traffic. Each rule is
represented by a genome and the initial population of genomes is a set of
random rules. Each genome is comprised of 29 genes: 8 for source IP, 8 for
destination IP, 6 for source port, 6 for destination port, and 1 for protocol.
The fitness function is based on the actual performance of each rule on a
preclassified data set. An analyst marks a data set comprised of connections
as either normal or abnormal. The system uses analyst-created training sets
for rule development and analyst decision support. If a rule completely
matches an abnormal connection, then it is awarded a bonus, and if it
matches a normal connection it is penalized. Hence, the generations are
biased toward rules that match intrusive connections only. Once the genetic
algorithm reaches a certain number of generations, it stops and the best
genomes (i.e., rules) are selected. The generated rule set can be used as
knowledge inside the IDS for judging whether the network connection and
related behaviours are potential intrusions.
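The loop described above (a population of rule genomes, a fitness function scored on pre-classified connections, crossover and mutation) can be sketched as follows. This is a simplified illustration, not the encoding of Sinclair et al.: it assumes three integer genes per rule instead of 29, a wildcard gene that matches anything, and a fitness of +1 per matched abnormal connection and -1 per matched normal one. All names and parameter values are hypothetical.

```python
import random

WILDCARD = None  # a gene that matches any value (assumption of this sketch)


def matches(rule, connection):
    """A rule matches when every non-wildcard gene equals the connection field."""
    return all(g is WILDCARD or g == c for g, c in zip(rule, connection))


def fitness(rule, labelled):
    """Reward matches on abnormal connections, penalize matches on normal ones."""
    score = 0
    for connection, is_abnormal in labelled:
        if matches(rule, connection):
            score += 1 if is_abnormal else -1
    return score


def crossover(a, b, rng):
    """One-point crossover: the child inherits a prefix of a and a suffix of b."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(rule, rng, values, rate=0.1):
    """Replace each gene with a random value (or wildcard) with probability rate."""
    return [rng.choice(values + [WILDCARD]) if rng.random() < rate else g
            for g in rule]


def evolve(labelled, n_genes, values, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    population = [[rng.choice(values + [WILDCARD]) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda r: fitness(r, labelled), reverse=True)
        parents = population[: pop_size // 2]  # the fitter half survives unchanged
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents), rng),
                           rng, values)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=lambda r: fitness(r, labelled))


# Toy pre-classified data: (source host id, destination port, protocol id).
labelled = [
    ((1, 23, 0), True),   # marked abnormal by the analyst
    ((2, 23, 0), True),
    ((1, 80, 0), False),  # marked normal
    ((3, 80, 0), False),
]
best = evolve(labelled, n_genes=3, values=[0, 1, 2, 3, 23, 80])
print(best, fitness(best, labelled))
```

Because the fitter half of each generation is carried over unchanged, the best fitness in the population never decreases from one generation to the next.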
The traditional GA tends to converge to a single best solution, called the global
maximum. Since the algorithm requires a group of unique best rules, a
nature-inspired technique called niching is used, which attempts to create
subpopulations that converge on local maxima.
Li describes a few disadvantages of the algorithm proposed and defines a
new technique for defining IDS rules. Li argues that, in order to detect
intrusive behaviours for a local network, network connections should be
used to define normal and abnormal behaviours. An attack can sometimes
be as simple as scanning for available ports in a server or a password-
guessing scheme. Typically, however, attacks are complex and are generated by
automated tools. So, one needs to use temporal and spatial information of
network connections to define IDS rules that can classify complex
anomalous activities using an efficient genetic algorithm.
• Artificial Neural Networks (ANN)
A neural network consists of a collection of processing units called neurons
that are highly interconnected according to a given topology. ANN have the
ability to learn by example and generalize from limited, noisy, and
incomplete data. They have been successfully employed in a broad spectrum
of data-intensive applications.
Mukkamala et al. describe approaches to intrusion detection using neural
networks and Support Vector Machines (SVM). Their goal is to discover
patterns or features that describe user behaviour to build classifiers for
recognizing anomalies. SVM are supervised learning machines that
represent the training vectors in a high-dimensional feature space and label
each vector by its class. SVM maximize the margin
(separation) between different classes to minimize the generalization error,
which is the amount of error in classification of unknown vectors. SVM
classify data by determining a subset of the training data, called support vectors, that
defines a separating hyperplane in feature space.
Mukkamala et al. use an SVM for non-linear classification of feature vectors
in an IDS. The SVM is trained with 7312 data points and tested with 6980 test
points from KDD. Each point lies in a 41-dimensional space and the
training is done using a radial basis function (RBF) kernel. The RBF kernel is used to
approximate the non-linear decision boundary that separates the normal and
abnormal classes. Using this SVM, they reach an accuracy of 99.5% in
classification of test points. They also use three multilayer feed-forward
ANN to classify the same test points. The ANN are trained using the same
7312-point training set. The best result from experimenting with the different
ANN architectures is a detection rate of 99.25%. The authors conclude that
although their SVM IDS shows higher detection rates than their ANN, SVM
can only be used for binary classification, which is a big limitation for IDS
that require multiple classes.
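For reference, the RBF kernel and the resulting SVM decision function are commonly written as below. The notation is standard textbook notation rather than that of Mukkamala et al.: the kernel width γ, the multipliers α_i, the labels y_i in {-1, +1}, the bias b, and the support-vector index set SV are symbols assumed here for illustration.

```latex
K(\mathbf{x}, \mathbf{x}') = \exp\!\left(-\gamma \,\lVert \mathbf{x} - \mathbf{x}' \rVert^{2}\right),
\qquad \gamma > 0

f(\mathbf{x}) = \operatorname{sign}\!\left( \sum_{i \in SV} \alpha_i \, y_i \, K(\mathbf{x}_i, \mathbf{x}) + b \right)
```

A point is labelled normal or abnormal according to the sign of f; only the support vectors contribute to the sum, which is why they determine the decision boundary.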
• Fuzzy Logic
Fuzzy logic is an approach to computing based on degrees of truth rather than
the usual true-or-false Boolean logic on which modern computers are
based. With fuzzy spaces, fuzzy logic allows an object to belong to different
classes at the same time. This makes fuzzy logic a good fit for intrusion
detection, because security itself involves fuzziness and the boundary
between normal and anomalous behaviour is not well defined. Moreover, the
intrusion detection problem involves many numeric attributes in collected
data, and various derived statistical measures. Building models directly on
numeric data usually causes high detection errors. A behaviour that deviates
only slightly from a model may not be detected or a small change in normal
behaviour may cause a false positive. With fuzzy logic, it is possible to
model these small deviations to keep the false positive/negative rates small.
Every fuzzy rule has the following general form,
IF condition THEN conclusion [weight],
where condition is a fuzzy expression defined using fuzzy logic operators
such as fuzzy AND and fuzzy OR, conclusion is an atomic expression, and
weight is a real number in [0,1] that shows the confidence of the rule.
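A rule of this form can be evaluated as in the minimal sketch below. It assumes the common min/max definitions of fuzzy AND and OR, linear ramp membership functions, and that the rule's activation is scaled by its weight. The feature names, thresholds, and weight are hypothetical and chosen only for illustration.

```python
def fuzzy_and(a, b):
    """Fuzzy AND as the minimum of two truth degrees (a common choice)."""
    return min(a, b)


def fuzzy_or(a, b):
    """Fuzzy OR as the maximum of two truth degrees (a common choice)."""
    return max(a, b)


def ramp_up(x, low, high):
    """Membership degree rising linearly from 0 at `low` to 1 at `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def rule_abnormal(conn_rate, failed_logins, weight=0.8):
    """IF conn_rate is HIGH AND failed_logins is MANY THEN abnormal [weight].

    Returns the degree, in [0, weight], to which the rule fires.
    """
    high_rate = ramp_up(conn_rate, 50, 200)          # hypothetical thresholds
    many_failures = ramp_up(failed_logins, 2, 10)    # hypothetical thresholds
    return weight * fuzzy_and(high_rate, many_failures)


print(rule_abnormal(conn_rate=125, failed_logins=10))
print(rule_abnormal(conn_rate=10, failed_logins=10))
```

A connection rate halfway up the ramp fires the rule at half strength instead of flipping a hard threshold, which is the behaviour the text credits with keeping false positive and false negative rates small.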
Gomez and Dasgupta show that with fuzzy logic, the false alarm rate in
determining intrusive activities can be reduced. They define a set of fuzzy
rules to define the normal and abnormal behaviour in a computer network,
and a fuzzy inference engine to determine intrusions. They use a genetic
algorithm to generate fuzzy classifiers, each of which is a set of fuzzy rules in the
form defined above. Each fuzzy rule is represented by a genome and the GA
is used to find the best genomes (fuzzy rules) to be added to the fuzzy
classifier. The authors conducted experiments using the KDD evaluation
data to classify 22 different types of attacks into 4 intrusion classes: denial
of service (DoS), unauthorized access from a remote machine (R2L),
unauthorized access to local superuser (root) privileges (U2R), and probing
(PRB). The results show that their algorithm achieves an overall true
positive rate of 98.95% and a false positive rate of 7%.
• Artificial Immune Systems (AIS)
Natural immune systems consist of molecules, cells, and tissues that
establish the body’s resistance to infections caused by pathogens like bacteria,
viruses, and parasites. They distinguish pathogens from self-cells and
eliminate the pathogens. This provides a great source of inspiration for
computer security systems, especially IDS. An artificial immune system is a
computationally intelligent system based on the behaviour of natural
immune systems.
The first immune-inspired model applicable to various computer security
problems was proposed by Hofmeyr and Forrest. Their model is specialized
to detect intrusions in local area networks based on TCP/IP. They build a
database containing normal sequences of system calls that act as the self-
definition of the normal behaviour of a program, and as the basis to detect
anomalies. Each TCP connection is modeled by a triple, which encodes
address of sender, address of receiver and port number of the receiver.
Detectors are generated randomly through a negative selection algorithm
(NSA). In addition to the NSA, which results in a signal to stimulate or tolerate the
immune response, they use a second signal (called co-stimulation) to
confirm the anomaly that was detected through the NSA procedure. In this
system, a human is required to generate this signal manually in order to
reduce false alarms (autoimmunity) of the system.
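The negative selection step can be sketched as follows: random candidate detectors are generated, and any candidate that matches a "self" (normal) pattern is discarded, so the surviving detectors can only match non-self patterns. The sketch assumes bit-string patterns and the r-contiguous-bits matching rule, a common choice in the negative-selection literature. The string length, the value of r, and the function names are illustrative assumptions.

```python
import random


def r_contiguous_match(a, b, r):
    """True if strings a and b agree on at least r contiguous positions."""
    run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        if run >= r:
            return True
    return False


def generate_detectors(self_set, n_detectors, length, r, seed=0, max_tries=100000):
    """Negative selection: keep only random candidates that match no self string."""
    rng = random.Random(seed)
    detectors = []
    for _ in range(max_tries):
        if len(detectors) == n_detectors:
            break
        candidate = "".join(rng.choice("01") for _ in range(length))
        if not any(r_contiguous_match(candidate, s, r) for s in self_set):
            detectors.append(candidate)
    return detectors


def is_anomalous(sample, detectors, r):
    """A sample is flagged when any detector matches it."""
    return any(r_contiguous_match(sample, d, r) for d in detectors)


self_set = {"00000000", "00001111"}  # toy encodings of normal connections
detectors = generate_detectors(self_set, n_detectors=5, length=8, r=6)
print(detectors)
print([is_anomalous(s, detectors, r=6) for s in sorted(self_set)])
```

By construction no detector matches a self string, so normal patterns seen during training never raise an alarm. Whether a given non-self pattern is caught depends on how well the random detectors cover the remaining space.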
Kim et al. provide an introduction and analysis of the key developments
within the field of immune-inspired computer security as well as
suggestions for future research. They summarize six immune features that
are desirable for an effective IDS: distributed, multi-layered, self-organized,
lightweight, diverse and disposable. They explain that the human immune
system is distributed through immune networks and it generates unique
antibody sets to provide the first four requirements. It is self-organized
through gene library evolution, negative selection, and clonal selection. Finally, it is
lightweight through approximate binding, memory cells, and gene
expression to increase efficiency.
Zamani et al. describe an artificial immune algorithm for intrusion detection
in distributed systems based on danger theory, an immunological model
based on the idea that the immune system does not discriminate between self
and non-self, but rather responds to events that cause damage. The authors
propose a multi-agent environment that computationally emulates the
behaviour of natural immune systems and argue that it is effective in reducing false positive
rates. They show the effectiveness of their model in practice by performing
a case study on the problem of detecting distributed denial-of-service attacks
in wireless sensor networks.
Dasgupta proposes a multi-agent IDS based on AIS. He defines three types
of agents: monitoring agents that roam around the network and monitor
various parameters simultaneously at multiple levels (user to packet level),
communicator agents that play the role of lymphokines, the signals exchanged between
immune cells, and decision/action agents that make
decisions based on collected local warning signals. The role of each type of
agent is unique, though they may work in collaboration. This work
unfortunately does not provide any experimental results making it difficult
for the reader to compare the performance of the proposed system with other
ML-based IDS.
2.2.2 CONCLUSION
We reviewed several influential algorithms for intrusion detection based on
various machine learning techniques. The characteristics of ML techniques make it
possible to design IDS that have high detection rates and low false positive rates while
the system quickly adapts itself to changing malicious behaviours. We divided these
algorithms into two types of ML-based schemes: Artificial Intelligence (AI) and
Computational Intelligence (CI). Although these two categories of algorithms share
many similarities, several features of CI-based techniques, such as adaptation, fault
tolerance, high computational speed and error resilience in the face of noisy
information, conform to the requirements of building efficient intrusion detection systems.
2.3 ANOMALY-BASED NETWORK INTRUSION DETECTION:
TECHNIQUES, SYSTEMS AND CHALLENGES
The Internet and computer networks are exposed to an increasing number of
security threats. With new types of attacks appearing continually, developing flexible
and adaptive security-oriented approaches is a severe challenge. In this context,
anomaly-based network intrusion detection techniques are a valuable technology to
protect target systems and networks against malicious activities. However, despite the
variety of such methods described in the literature in recent years, security tools
incorporating anomaly detection functionalities are just starting to appear, and several
important problems remain to be solved.
Noteworthy work has been carried out by CIDF (“Common Intrusion Detection
Framework”), a working group created by DARPA in 1998 mainly oriented towards
coordinating and defining a common framework in the IDS field. Integrated within
IETF in 2000, and having adopted the new acronym IDWG (“Intrusion Detection
Working Group”), the group defined a general IDS architecture based on the
consideration of four types of functional modules (Figure 2.3.1):
• E blocks (“Event-boxes”): This kind of block is composed of sensor
elements that monitor the target system, thus acquiring information events
to be analyzed by other blocks.
• D blocks (“Database-boxes”): These are elements intended to store
information from E blocks for subsequent processing by A and R boxes.
• A blocks (“Analysis-boxes”): Processing modules for analyzing events
and detecting potential hostile behaviour, so that some kind of alarm will be
generated if necessary.
• R blocks (“Response-boxes”): The main function of this type of block is
the execution, if any intrusion occurs, of a response to thwart the detected
menace.
Other key contributions in the IDS field concern the definition of protocols for
data exchange between components (e.g. IDXP, “Intrusion Detection eXchange
Protocol”, RFC 4767), and the format considered for this (e.g. IDMEF, “Intrusion
Detection MEssage Format”, RFC 4765).
Depending on the information source considered (E boxes in Figure 2.3.1), an
IDS may be either host or network-based. A host-based IDS analyzes events such as
process identifiers and system calls, mainly related to OS information. On the other
hand, a network-based IDS analyzes network related events: traffic volume, IP
addresses, service ports, protocol usage, etc. This paper focuses on the latter type of
IDS.
Figure 2.3.1 : General CIDF architecture for IDS systems
Depending on the type of analysis carried out (A blocks in Figure 2.3.1),
intrusion detection systems are classified as either signature-based or anomaly-based.
Signature-based schemes (also denoted as misuse-based) seek defined patterns, or
signatures, within the analyzed data. For this purpose, a signature database
corresponding to known attacks is specified a priori. On the other hand, anomaly-based
detectors attempt to estimate the “normal” behaviour of the system to be protected,
and generate an anomaly alarm whenever the deviation between a given observation at
an instant and the normal behaviour exceeds a predefined threshold. Another possibility
is to model the “abnormal” behaviour of the system and to raise an alarm when the
difference between the observed behaviour and the expected one falls below a given
limit.
Signature and anomaly-based systems are similar in terms of conceptual
operation and composition. The main differences between these methodologies are
inherent in the concepts of “attack” and “anomaly”. An attack can be defined as “a
sequence of operations that puts the security of a system at risk”. An anomaly is just
“an event that is suspicious from the perspective of security”. Based on this
distinction, the main advantages and disadvantages of each IDS type can be pointed
out.
2.3.1 A-NIDS Techniques
Although different A-NIDS approaches exist (Estévez-Tapiador et al., 2004), in
general terms all of them consist of the following basic modules or stages (Figure 2.3.2):
• Parameterization: In this stage, the observed instances of the target system are
represented in a pre-established form.
• Training stage: The normal (or abnormal) behaviour of the system is
characterized and a corresponding model is built. This can be done in very
different ways, automatically or manually, depending on the type of A-NIDS
considered.
• Detection stage: Once the model for the system is available, it is compared with
the (parameterized) observed traffic. If the deviation found exceeds (or is
below, in the case of abnormality models) a given threshold, an alarm will be
triggered (Estévez-Tapiador et al., 2004).
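The three stages above can be sketched end to end. The sketch assumes a very simple normality model: each traffic window (a hypothetical list of packet sizes) is parameterized as a packet count and a byte count, training records the mean and standard deviation of each feature over normal windows, and detection raises an alarm when the largest deviation, measured in standard deviations, exceeds a threshold. All names and the threshold value are illustrative.

```python
from statistics import mean, pstdev


def parameterize(window):
    """Parameterization: represent a window of packet sizes in a fixed form."""
    return {"packets": len(window), "bytes": sum(window)}


def train(normal_windows):
    """Training: characterize normal behaviour as (mean, std) per feature."""
    feats = [parameterize(w) for w in normal_windows]
    model = {}
    for name in feats[0]:
        values = [f[name] for f in feats]
        # Fall back to 1.0 when a feature never varies, to avoid dividing by zero.
        model[name] = (mean(values), pstdev(values) or 1.0)
    return model


def detect(model, window, threshold=3.0):
    """Detection: compare the parameterized observation with the model."""
    feats = parameterize(window)
    deviation = max(abs(feats[name] - mu) / sigma
                    for name, (mu, sigma) in model.items())
    return deviation > threshold, deviation


normal = [[100] * 10, [100] * 12, [100] * 8, [100] * 10]
model = train(normal)
print(detect(model, [100] * 10))   # typical window
print(detect(model, [100] * 50))   # burst of traffic
```

For an abnormality model the comparison would be reversed, with an alarm raised when the deviation from the modelled attack behaviour falls below the threshold.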
Figure 2.3.2 : Generic A-NIDS functional architecture.
According to the type of processing related to the “behavioural” model of the
target system, anomaly detection techniques can be classified into three main categories
(Lazarevic et al., 2005) (see Figure 2.3.3): statistical-based, knowledge-based, and
machine learning-based. In the statistical-based case, the behaviour of the system is
represented from a random viewpoint. On the other hand, knowledge-based A-NIDS
techniques try to capture the claimed behaviour from available system data (protocol
specifications, network traffic instances, etc.). Finally, machine learning A-NIDS
schemes are based on the establishment of an explicit or implicit model that allows the
patterns analyzed to be categorized.
Figure 2.3.3 : Classification of the anomaly detection techniques
according to the nature of the processing involved in the
“behavioural” model considered.
Two key aspects concern the evaluation, and thus the comparison, of the
performance of alternative intrusion detection approaches: these are the efficiency of
the detection process, and the cost involved in the operation. Without underestimating
the importance of the cost, at this point the efficiency aspect must be emphasized. Four
situations exist in this context, corresponding to the relation between the result of the
detection for an analyzed event (“normal” vs. “intrusion”) and its actual nature
(“innocuous” vs. “malicious”). These situations are: false positive (FP), if the
analyzed event is innocuous (or “clean”) from the perspective of security, but it is
classified as malicious; true positive (TP), if the analyzed event is correctly classified as
intrusion/malicious; false negative (FN), if the analyzed event is malicious but it is
classified as normal/innocuous; and true negative (TN), if the analyzed event is
correctly classified as normal/innocuous. It is clear that low FP and FN rates, together
with high TP and TN rates, will result in good efficiency values.
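The four outcomes can be counted directly from paired detector outputs and ground-truth labels, as in the minimal sketch below. It assumes both are given as sequences of booleans, where `True` means intrusion/malicious. The function name and the derived rates (true positive rate and false positive rate, in their usual definitions) are additions for illustration.

```python
def detection_rates(predicted, actual):
    """Count TP, FP, TN, FN and derive the true/false positive rates.

    predicted[i] is the detector's verdict and actual[i] the real nature
    of event i; True means intrusion/malicious in both sequences.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    tn = sum(1 for p, a in pairs if not p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    return {
        "TP": tp, "FP": fp, "TN": tn, "FN": fn,
        # Share of malicious events that were detected.
        "TPR": tp / (tp + fn) if tp + fn else 0.0,
        # Share of innocuous events that raised a false alarm.
        "FPR": fp / (fp + tn) if fp + tn else 0.0,
    }


predicted = [True, True, False, False, True]
actual = [True, False, False, True, True]
rates = detection_rates(predicted, actual)
print(rates)
```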
The fundamentals for statistical, knowledge and machine learning-based A-
NIDS, as well as the principal subtypes of each, are described below. The main features
of all are summarized in Table 2.3.1. Above and beyond other possibilities, the
question of efficiency should be a prime consideration in selecting and implementing
A-NIDS methodologies.
1) Statistical-based A-NIDS techniques
In statistical-based techniques, the network traffic activity is captured and a
profile representing its stochastic behaviour is created. This profile is based on
metrics such as the traffic rate, the number of packets for each protocol, the rate
of connections, the number of different IP addresses, etc. Two datasets of
network traffic are considered during the anomaly detection process: one
corresponds to the currently observed profile over time, and the other is for the
previously trained statistical profile.
Apart from their inherent features for use as anomaly-based techniques,
statistical A-NIDS approaches have a number of virtues. Firstly, they do not
require prior knowledge about the normal activity of the target system; instead,
they have the ability to learn the expected behaviour of the system from
observations. Secondly, statistical methods can provide accurate notification of
malicious activities occurring over long periods of time.
However, some drawbacks should also be pointed out. First, this kind of A-
NIDS is susceptible to being trained by an attacker in such a way that the network
traffic generated during the attack is considered as normal. Second, setting the
values of the different parameters/metrics is a difficult task, especially because
the balance between false positives and false negatives is affected.
Table 2.3.1 : Fundamentals of the A-NIDS techniques
2) Knowledge-based techniques
The so-called expert system approach is one of the most widely used
knowledge-based IDS schemes. However, like other A-NIDS methodologies,
expert systems can also be classified into other, different categories. Expert
systems are intended to classify the audit data according to a set of rules,
involving three steps. First, different attributes and classes are identified from
the training data. Second, a set of classification rules, parameters or procedures
are deduced. Third, the audit data are classified accordingly.
More restrictive/particular in some senses are specification-based anomaly
methods, for which the desired model is manually constructed by a human
expert, in terms of a set of rules (the specifications) that seek to determine
legitimate system behaviour. If the specifications are complete enough, the
model will be able to detect illegitimate behavioural patterns. Moreover, the
number of false positives is reduced, mainly because this kind of system avoids
the problem of harmless activities, not previously observed, being reported as
intrusions. Specifications could also be developed by using some kind of formal
tool.
3) Machine learning-based A-NIDS schemes
Machine learning techniques are based on establishing an explicit or implicit
model that enables the patterns analyzed to be categorized. A singular
characteristic of these schemes is the need for labelled data to train the
behavioural model, a procedure that places severe demands on resources.
In many cases, the applicability of machine learning principles coincides with
that for the statistical techniques, although the former is focused on building a
model that improves its performance on the basis of previous results. Hence, a
machine learning A-NIDS has the ability to change its execution strategy as it
acquires new information. Although this feature could make it desirable to use
such schemes for all situations, the major drawback is their resource-expensive
nature.
Several machine learning-based schemes have been applied to A-NIDS. Some
of the most important are cited below, and their main advantages and drawbacks
are identified.
• Bayesian networks
A Bayesian network is a model that encodes probabilistic relationships
among variables of interest. This technique is generally used for intrusion
detection in combination with statistical schemes, a procedure that yields
several advantages, including the capability of encoding interdependencies
between variables and of predicting events, as well as the ability to
incorporate both prior knowledge and data.
However, a serious disadvantage of using Bayesian networks is that their
results are similar to those derived from threshold-based systems, while
considerably higher computational effort is required.
Although the use of Bayesian networks has proved to be effective in certain
situations, the results obtained are highly dependent on the assumptions
about the behaviour of the target system, and so a deviation in these
hypotheses leads to detection errors, attributable to the model considered.
• Markov models
A Markov chain is a set of states that are interconnected through certain
transition probabilities, which determine the topology and the capabilities of
the model. During a first training phase, the probabilities associated with the
transitions are estimated from the normal behaviour of the target system.
The detection of anomalies is then carried out by comparing the anomaly
score obtained for the observed sequences with a fixed threshold.
Markov-based techniques have been extensively used in the context of host
IDS, normally applied to system calls. In all cases, the model derived for the
target system has provided a good approximation of the claimed profile, while,
as in Bayesian networks, the results are highly dependent on the
assumptions about the behaviour accepted for the system.
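The training and detection steps for a Markov chain can be sketched as follows. The sketch assumes first-order transitions between observed symbols (for example system call names), estimates transition probabilities by simple counting over normal sequences, and uses the average negative log-probability of the observed transitions as the anomaly score. The floor probability for unseen transitions and the threshold are hypothetical values.

```python
import math
from collections import Counter, defaultdict


def train(sequences):
    """Estimate transition probabilities from sequences of normal behaviour."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    return {cur: {nxt: c / sum(nxts.values()) for nxt, c in nxts.items()}
            for cur, nxts in counts.items()}


def anomaly_score(model, seq, floor=1e-6):
    """Average negative log-probability of the transitions observed in seq."""
    transitions = list(zip(seq, seq[1:]))
    if not transitions:
        return 0.0
    total = 0.0
    for cur, nxt in transitions:
        # Transitions never seen in training get a small floor probability.
        p = model.get(cur, {}).get(nxt, floor)
        total += -math.log(p)
    return total / len(transitions)


def is_anomalous(model, seq, threshold=2.0):
    """Compare the anomaly score with a fixed threshold."""
    return anomaly_score(model, seq) > threshold


normal = [
    ["open", "read", "read", "close"],
    ["open", "read", "close"],
    ["open", "write", "close"],
]
model = train(normal)
print(anomaly_score(model, ["open", "read", "close"]))
print(anomaly_score(model, ["open", "close", "open"]))
```

The second sequence contains only transitions that never occurred in training, so its score is far above that of the first.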
• Neural networks
With the aim of simulating the operation of the human brain, neural
networks have been adopted in the field of anomaly intrusion detection,
mainly because of their flexibility and adaptability to environmental
changes. However, a common characteristic in the proposed variants, from
recurrent neural networks to self-organizing maps (Ramadas et al., 2003), is
that they do not provide a descriptive model that explains why a particular
detection decision has been taken.
• Fuzzy logic techniques
Fuzzy logic is derived from fuzzy set theory under which reasoning is
approximate rather than precisely deduced from classical predicate logic.
Fuzzy techniques are thus used in the field of anomaly detection mainly
because the features to be considered can be seen as fuzzy variables. This
kind of processing scheme considers an observation as normal if it lies
within a given interval.
Although fuzzy logic has proved to be effective, especially against port
scans and probes, its main disadvantage is the high resource consumption
involved. On the other hand, it should also be noticed that fuzzy logic is
controversial in some circles, and it has been rejected by some engineers
and by most statisticians, who hold that probability is the only rigorous
mathematical description of uncertainty.
• Genetic algorithms
Genetic algorithms are categorized as global search heuristics, and are a
particular class of evolutionary algorithms that use techniques inspired by
evolutionary biology such as inheritance, mutation, selection and
recombination. Thus, genetic algorithms constitute another type of machine
learning-based technique, capable of deriving classification rules and/or
selecting appropriate features or optimal parameters for the detection
process. The main advantage of this subtype of machine learning A-NIDS is
the use of a flexible and robust global search method that converges to a
solution from multiple directions, whilst no prior knowledge about the
system behaviour is assumed. Its main disadvantage is the high resource
consumption involved.
• Clustering and outlier detection
Clustering techniques work by grouping the observed data into clusters,
according to a given similarity or distance measure. The procedure most
commonly used for this consists in selecting a representative point for each
cluster. Then, each new data point is classified as belonging to a given
cluster according to the proximity to the corresponding representative point.
Some points may not belong to any cluster; these are named outliers and
represent the anomalies in the detection process.
Clustering techniques determine the occurrence of intrusion events only
from the raw audit data, and so the effort required to tune the IDS is
reduced.
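The clustering procedure described above can be illustrated with a minimal sketch. It assumes that the representative points have already been computed and that a fixed cut-off distance (`radius`, a hypothetical parameter not taken from the surveyed work) separates cluster members from outliers; the feature values are invented for illustration.

```python
import math

def nearest_centroid(point, centroids):
    """Return (index, distance) of the closest representative point."""
    distances = [math.dist(point, c) for c in centroids]
    idx = min(range(len(centroids)), key=distances.__getitem__)
    return idx, distances[idx]

def classify(point, centroids, radius):
    """Return the cluster index, or None when the point is an outlier."""
    idx, dist = nearest_centroid(point, centroids)
    return idx if dist <= radius else None

# Hypothetical representative points for two clusters of normal traffic,
# e.g. two scaled features per connection record.
centroids = [(0.0, 0.0), (10.0, 10.0)]
radius = 2.0  # assumed cut-off distance

print(classify((0.5, -0.5), centroids, radius))  # near cluster 0
print(classify((9.0, 10.5), centroids, radius))  # near cluster 1
print(classify((5.0, 5.0), centroids, radius))   # far from both: outlier (None)
```

In practice the cut-off would be tuned on the audit data rather than fixed by hand.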
4) Additional considerations on A-NIDS processing.
KDD and data mining
In addition to the above described A-NIDS techniques, there are others that may
help in the task of dealing with the amount of information contained within a
dataset. Two of these techniques are principal component analysis (PCA) and
association rule discovery.
PCA is a technique that is used to reduce the complexity of a dataset. It is not a
detection scheme itself but an auxiliary one. A given data collection (or dataset),
obtained by means of the different sensors in the target environment, becomes
more and more extensive and complex as the number of different services and
speed of the networks grow. To simplify the dataset, PCA performs a change of
basis on the n correlated variables, reducing them to d < n new variables that
are both uncorrelated and linear
combinations of the original ones. This makes it possible to express the data in a
reduced form, thus facilitating the detection process.
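As a small worked illustration of this reduction, the sketch below handles only the simplest case (n = 2, d = 1), where the direction of largest variance has a closed form. The function name and the sample points are assumptions made for the example; a real A-NIDS dataset would need a general eigen-decomposition or SVD.

```python
import math

def pca_2d_to_1d(points):
    """Reduce 2-D points to 1-D scores along the first principal component."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    var_x = sum(x * x for x, _ in centered) / (n - 1)
    var_y = sum(y * y for _, y in centered) / (n - 1)
    cov_xy = sum(x * y for x, y in centered) / (n - 1)
    # Closed-form angle of the major axis of a 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2 * cov_xy, var_x - var_y)
    axis = (math.cos(theta), math.sin(theta))
    # Each score is a linear combination of the two original variables.
    scores = [x * axis[0] + y * axis[1] for x, y in centered]
    return axis, scores

# Two perfectly correlated variables (y = 2x): one component captures everything.
points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
axis, scores = pca_2d_to_1d(points)
print(axis)    # direction proportional to (1, 2)
print(scores)  # the data expressed with a single variable
```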
To conclude the present section, an important point about A-NIDS techniques
should be noted. During recent decades several scientific communities have
contributed to analyzing information from high-volume databases. However, in
the 1990s, KDD (“Knowledge Discovery in Databases”) burst onto the scene,
to “identify new, valid, potentially useful and comprehensible patterns for
data”.
2.3.2 AVAILABLE A-NIDS SYSTEMS
This section describes several reported endeavours in the development and
deployment of A-NIDS platforms in real network environments. The analysis is split
into two categories: available platforms, commercial or freeware, and research systems.
Commercial systems tend to use well proven techniques, and so they do not usually
consider the A-NIDS techniques most recently proposed in the specialized literature. In
fact, most of them include a signature-based detection module as the core of the
detection platform.
2.3.2.1 A-NIDS platforms
In recent years, a number of important actions have focused on implementing
A-NIDS techniques in real security platforms. Currently available IDS software tools in
this line include Snort (www.snort.org), Prelude (www.prelude-ids.org), and N@G
(www.ncb.ernet.in/nag).
Although anomaly-based detection techniques are not yet mature, they are
beginning to appear in commercial and open source products. Furthermore, in recent
years, some pioneering systems and businesses in the A-NIDS field have been acquired
by bigger companies, and their products incorporated into more general and integral
network security platforms.
More recent systems make use of a distributed architecture for intrusion
detection by incorporating agents (or sensors), and a central console to supervise the
overall detection process. This is the case of the SecurityFocus DeepSight Threat
Management System, now part of DeepNines BBX Intrusion Prevention, which uses a
statistical approach to detect potential Internet threats. Data are collected by distributed
sensors, which include intrusion detection capabilities. The sensors report current
network scans and attacks to the controller, providing a global detection capability.
Most of the platforms perform further analysis on the monitored data, related to
audit, tracing and forensic capabilities. Additionally, they may trigger some kind of
response to detected attacks, namely an interaction with firewalls, the reset of TCP
connections, the use of honey systems, etc.
More advanced platforms include the Protocol Anomaly Detection (PAD)
technique, which is based on the detection of anomalies in the use of protocols. This
kind of analysis is adopted in BarbedWire IDS, DeepNines BBX, N@G, and Strata
Guard. PAD combines specification-based and statistical characterization A-NIDS
techniques to model the behaviour of a given protocol. This can be complemented by
using additional A-NIDS techniques.
2.3.2.2 A-NIDS research-related environments
Although some of the above-mentioned A-NIDS platforms are also usable for
research purposes, others have been specifically developed for this. Unlike
“commercial” A-NIDS systems, research-oriented environments include more
innovative anomaly detection techniques. Conceived as research platforms, these
systems enable the integration of contributed modules performing additional detection
techniques. This is also the case of Snort and Prelude, two of the most widely deployed
NIDS tools today.
Another observed tendency is the consideration of intrusion prevention
procedures or IPS (Intrusion Prevention System), that is, inline IDS schemes that filter
and analyze all the network traffic accessing the target environment. This has two main
consequences. On one hand, most projects have a structured architecture in which
various detectors can work jointly, typically in a distributed way (e.g. EMERALD,
AAFID, GIDRE). On the other hand, as the detectors are now “pluggable” modules, a
specialization of their functions and capabilities can be observed. Thus, individual
detectors are designed to monitor only a specific protocol or behaviour (e.g. Anagram
targets HTTP payloads), and the global detection capabilities of the platform result
from combining and correlating the information from different detectors.
2.3.3 OPEN ISSUES AND CHALLENGES
Intrusion detection techniques are continuously evolving, with the goal of
improving the security and protection of networks and computer infrastructures.
Despite the promising nature of anomaly-based IDS, as well as its relatively long
existence, there still exist several open issues regarding these systems. Some of the
most significant challenges in the area are:
• Low detection efficiency, especially due to the high false positive rate
usually obtained (Axelsson, 2000). This aspect is generally explained as
arising from the lack of good studies on the nature of the intrusion events.
The problem calls for the exploration and development of new, accurate
processing schemes, as well as better structured approaches to modelling
network systems.
• Low throughput and high cost, mainly due to the high data rates (Gbps)
that characterize current wideband transmission technologies (Kruegel et al.,
2002). Some proposals intended to optimize intrusion detection are
concerned with grid techniques and distributed detection paradigms.
• The absence of appropriate metrics and assessment methodologies, as
well as a general framework for evaluating and comparing alternative IDS
techniques (Stolfo and Fan, 2000; Gaffney and Ulvila, 2001). Due to the
importance of this issue, it is analyzed in greater depth below.
• The analysis of ciphered data, although this is also a general problem
faced by all intrusion detection platforms. Moreover, this problem could be
dealt with by simply locating the detection agents at those functional points
in the system where data are available in “plaintext” format and where
the corresponding detection analysis can be carried out without special
restrictions.
A-NIDS assessment
One of the main challenges that researchers must face, when trying to
implement and validate a new intrusion detection method, is to assess it and
compare its performance with that of other available approaches. It is noticeable
that this task is not restricted to A-NIDS, but is also applicable to NIDS in
general. The need for test-beds that provide robust and reliable metrics to
quantify NIDS has been suggested. Although some authors defend a testing
methodology in real environments, most of them advocate an evaluation
procedure in experimental environments.
An advantage of assessment in real environments is that the traffic is
sufficiently realistic; however, this approach is subject to:
(a) The risk of potential attacks
(b) The possible interruption of the system operation due to simulated attacks
On the other hand, the evaluation of NIDS methodologies in experimental
environments involves the generation of synthetic traffic as well as background
traffic representing legal users, which is far from being a trivial undertaking.
2.3.4 SUMMARY
The present paper discusses the foundations of the main A-NIDS technologies,
together with their general operational architecture, and provides a classification for
them according to the type of processing related to the “behavioural” model for the
target system. Another valuable aspect of this study is that it describes, in a concise
way, the main features of several currently available IDS systems/platforms. Finally,
the most significant open issues regarding A-NIDS are identified, among which that of
assessment is given particular emphasis.
2.4 INCREMENTAL ANOMALY-BASED INTRUSION
DETECTION SYSTEM USING LIMITED LABELED DATA
With the proliferation of the internet and increased global access to online
media, cybercrime is also occurring at an increasing rate. Currently, both personal users
and companies are vulnerable to cybercrime. A number of tools including firewalls and
Intrusion Detection Systems (IDS) can be used as defense mechanisms. A firewall acts
as a checkpoint which allows packets to pass through according to predetermined
conditions. In extreme cases, it may even disconnect all network traffic. An IDS, on the
other hand, automates the monitoring process in computer networks. The streaming
nature of data in computer networks poses a significant challenge in building IDS. In
this paper, a method is proposed to overcome this problem by performing online
classification on datasets. In doing so, an incremental naive Bayesian classifier is
employed. Furthermore, active learning enables solving the problem using a small set
of labeled data points which are often very expensive to acquire. The proposed method
includes two groups of actions i.e. offline and online. The former involves data
preprocessing while the latter introduces the NADAL online method. The proposed
method is compared to the incremental naive Bayesian classifier using the NSL-KDD
standard dataset.
There are three advantages with the proposed method:
(1) overcoming the streaming data challenge;
(2) reducing the high cost associated with instance labeling; and
(3) improved accuracy and Kappa compared to the incremental naive Bayesian
approach.
Thus, the method is well-suited to IDS applications.
An attack refers to a set of actions that compromise the confidentiality,
integrity, and availability of resources. A system is known to be secure if it can
guarantee these three criteria. Attacks must be identified before doing any harm to the
organization. Even Local Area Networks (LAN) need to be able to withstand such
attacks since network performance is important in terms of bandwidth and other
resources. The most common means of defense against potential attacks involves a
two-layered system. The first layer comprises a firewall which controls access to the
network while the second layer is configured to detect threats that somehow manage to
pass through the firewall and take appropriate action to defend the network. This
second layer is known as an Intrusion Detection System (IDS) which is able to identify
intrusion attempts by monitoring and analyzing network packets and logs. In case an
intrusion is detected, the system alerts the network administrator.
With respect to information source, IDS are divided into two categories: host-
based and network-based. Host-based methods tend to monitor and analyze internal
computer operations, for instance by determining the resources that are allowed for
each host as well as illegal access attempts. Network-based systems, in contrast, deal
with intrusion at the network level. Anomalies at this level are often caused by external
attackers whose aim is to gain unauthorized network access, steal information, and
disrupt the network. There are certain challenges for anomaly detection systems. Unlike traditional
data packets which are inherently static, data streams are continuous flows of data
which cannot be stored; they must be analyzed as one unit.
2.4.1 RELATED WORK
Anomaly-based IDS have been extensively studied; however, few studies
present an incremental approach. Incremental methods may be supervised, semi-
supervised, and unsupervised. In this paper, supervised methods are considered which
model the normality of the data. Here, the problem of anomaly detection is converted
into one of classification.
• W.-Y. Yu and H.-M. Lee propose an incremental learning method by cascading
a Service Classifier (SC) using Incremental Tree Inductive (ITI) learning. The
cascading approach includes three steps:
(1) training;
(2) test;
(3) incremental learning.
• In another study, a novel anomaly detection system is proposed by Ren et al.
that dynamically updates normal usage profiles. Upon encountering new
behavior, density-based incremental clustering is used to insert the new
behavior into old profiles. The authors report less sensitivity to data disruptions
compared to Anomaly Detection with Fast Incremental Clustering (ADWICE)
profiles. The approach also improves cluster quality and reduces false alarms;
nevertheless, the method displays poor performance in working with large
datasets.
• Other authors propose the Reserved Set-Incremental Support Vector Machine
(RS-ISVM), which is an improved incremental SVM for intrusion detection. In order
to reduce the noise caused by large differences between feature values, the
authors propose a modified kernel function known as U-RBF which embeds
feature means and root square mean differences in the RBF kernel. The authors
claim that RS-ISVM eases the fluctuation phenomenon in the learning
process while providing better and more reliable performance. However, it
suffers from low detection rates for U2R and R2L attacks and requires a large
number of parameters.
Many modern intrusion detection methods focus on feature selection or reduction. This
is because many features may be irrelevant or redundant and may inhibit system
performance. Efficient naive Bayesian classifiers are applied to the reduced dataset to
detect possible intrusions. Experimental results show that the selected features are more
appropriate for designing IDS and result in more effective intrusion detection.
In this paper, the naive Bayesian algorithm is evaluated using the NSL-KDD
dataset to detect four types of attacks: Probe, DoS, U2R, and R2L. Feature reduction
may use three standard feature selection methods: correlation, information gain, or gain
ratio. The proposed method in this study employs feature vitality-based reduction. The
results indicate that the proposed model provides better performance.
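Two of the standard feature selection measures named above, information gain and gain ratio, can be sketched for categorical features as follows. This is an illustrative sketch only: the function names and the four toy records are assumptions, and the feature vitality-based reduction used by the cited study is not reproduced here.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())

def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on the feature."""
    total = len(labels)
    groups = {}
    for v, y in zip(feature_values, labels):
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def gain_ratio(feature_values, labels):
    """Information gain normalized by the entropy of the feature itself."""
    split_info = entropy(feature_values)
    if split_info == 0:
        return 0.0
    return information_gain(feature_values, labels) / split_info

# Toy records (invented): one feature separates the classes, the other does not.
labels = ["attack", "attack", "normal", "normal"]
protocol = ["icmp", "icmp", "tcp", "tcp"]
flag = ["SF", "REJ", "SF", "REJ"]

print(information_gain(protocol, labels))  # informative feature
print(information_gain(flag, labels))      # uninformative feature
print(gain_ratio(protocol, labels))
```

Features would then be ranked by either score and the lowest-ranked ones dropped before training the classifier.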
2.4.2 NAIVE BAYESIAN CLASSIFICATION
Naive Bayesian classification is a popular method for stream mining. The
popularity of the method is due to the fact that the model can be updated with new data
streams very easily. The method is inherently incremental since new data points are
updated as they arrive. Given this incremental nature, the algorithm is very suitable to
stream mining.
Assuming m classes, namely $C_1, C_2, \ldots, C_m$, for tuple X, the classifier seeks the
class with the highest posterior probability conditioned on X. That is, the classifier
predicts that tuple X belongs to class $C_i$ if and only if:
$P(C_i \mid X) > P(C_j \mid X) \quad \text{for } 1 \le j \le m,\ j \ne i$ (1)
By Bayes' theorem, $P(C_i \mid X) = P(X \mid C_i)\,P(C_i) / P(X)$. Since $P(X)$ remains
constant for all classes, one must determine the class that maximizes
$P(X \mid C_i)\,P(C_i)$. If prior probabilities are unknown, they are commonly
regarded as being equal, i.e. $P(C_1) = P(C_2) = \ldots = P(C_m)$; hence, only $P(X \mid C_i)$ must be
maximized. Otherwise, the prior probabilities may be estimated using
$P(C_i) = |C_{i,D}| / |D|$, where $|C_{i,D}|$ is the number of training tuples with the label $C_i$
and $|D|$ is the total number of training tuples.
Datasets with large numbers of features impose a high calculation cost for $P(X \mid C_i)$. To
reduce the calculations, the features are assumed to be conditionally independent
given the class. Thus, the following holds:
$P(X \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i)$ (2)
Using the training tuples, the individual probabilities $P(x_1 \mid C_i), P(x_2 \mid C_i), \ldots, P(x_n \mid C_i)$ may
be estimated.
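The incremental character of the classifier can be made concrete with a short sketch. It assumes categorical features and adds Laplace smoothing, which the text does not mention, so that unseen feature values do not produce zero probabilities; the class name and the toy stream are hypothetical and this is not the MOA implementation.

```python
from collections import defaultdict

class IncrementalNaiveBayes:
    """Categorical naive Bayes whose counts are updated one instance at a time."""

    def __init__(self):
        self.total = 0
        self.class_counts = defaultdict(int)
        self.feature_counts = defaultdict(int)  # key: (class, feature index, value)
        self.feature_values = defaultdict(set)  # key: feature index

    def update(self, x, label):
        """Absorb one labeled instance; no earlier instance needs to be stored."""
        self.total += 1
        self.class_counts[label] += 1
        for k, value in enumerate(x):
            self.feature_counts[(label, k, value)] += 1
            self.feature_values[k].add(value)

    def posteriors(self, x):
        """Return normalized P(C_i | X) for every class seen so far."""
        scores = {}
        for label, n_c in self.class_counts.items():
            score = n_c / self.total  # prior |C_i,D| / |D|
            for k, value in enumerate(x):
                count = self.feature_counts.get((label, k, value), 0)
                # Laplace smoothing: an assumption added for this sketch.
                score *= (count + 1) / (n_c + len(self.feature_values[k]))
            scores[label] = score
        norm = sum(scores.values())
        return {label: s / norm for label, s in scores.items()}

    def predict(self, x):
        post = self.posteriors(x)
        return max(post, key=post.get)

# Toy stream of (protocol, flag) records with invented labels.
model = IncrementalNaiveBayes()
stream = [(("tcp", "SF"), "normal"), (("tcp", "SF"), "normal"),
          (("icmp", "REJ"), "attack"), (("icmp", "SF"), "attack")]
for x, y in stream:
    model.update(x, y)

print(model.predict(("icmp", "REJ")))
print(model.predict(("tcp", "SF")))
```

Because only counts are kept, each update costs time proportional to the number of features, which is what makes the method attractive for stream mining.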
2.4.3 ACTIVE LEARNING
Instead of inquiring about the correct labels for all instances, active learning
determines how input instances are selectively labeled. Quite often, this approach
requires considerably fewer instances to learn a concept, compared to typical
supervised methods. In active learning, once an instance is scanned, depending on the
selected strategy, the algorithm searches for the correct label and the predictive model
is trained with the new instance.
In the following, we briefly explain four active learning strategies:
• Random Strategy: Incoming instances are selected for labeling at random,
regardless of the classifier's confidence.
• Fixed Uncertainty Strategy: The instances for which the current classifier has
minimum confidence are labeled. A constant threshold is considered. Only
those instances are labeled for which the maximum posterior probability as
estimated by the classifier does not exceed the threshold.
• Variable Uncertainty Strategy: The least certain instances within a time
interval are labeled; the threshold varies with time so that the labeling budget
is spent in a uniform fashion over time.
• Uncertainty Strategy with Randomization: The threshold is randomized
and the labels for instances near the threshold are queried.
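The four strategies can be sketched as simple decision functions that answer one question per instance: should its true label be requested? The code below is an illustrative reading of the descriptions above, not the MOA implementation; the parameter names (`budget`, `step`, `spread`) and the multiplicative threshold updates are assumptions.

```python
import random

def random_strategy(budget, rng):
    """Query the label with probability equal to the labeling budget."""
    return rng.random() < budget

def fixed_uncertainty(max_posterior, threshold):
    """Query when the maximum posterior does not exceed a constant threshold."""
    return max_posterior <= threshold

class VariableUncertainty:
    """Threshold shrinks after a query and grows otherwise (assumed update rule)."""

    def __init__(self, threshold=1.0, step=0.01):
        self.threshold = threshold
        self.step = step

    def should_query(self, max_posterior):
        if max_posterior <= self.threshold:
            self.threshold *= 1 - self.step  # labeling too often: be stricter
            return True
        self.threshold *= 1 + self.step      # no query: relax the threshold
        return False

def randomized_uncertainty(max_posterior, threshold, rng, spread=0.1):
    """Compare against a threshold perturbed by a random factor around 1."""
    noisy_threshold = threshold * rng.gauss(1.0, spread)
    return max_posterior <= noisy_threshold

rng = random.Random(42)
print(fixed_uncertainty(0.55, threshold=0.7))  # low confidence: query
print(fixed_uncertainty(0.95, threshold=0.7))  # high confidence: no query

strategy = VariableUncertainty(threshold=0.7, step=0.1)
print(strategy.should_query(0.6), strategy.threshold)
```

The randomized variant occasionally queries instances that lie slightly on the confident side of the threshold, which is the behaviour the last strategy in the list is meant to provide.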
2.4.4 PROPOSED METHOD
The proposed model, called Network Anomaly Detection using Active Learning
(NADAL) involves an offline and an online step. The selected dataset is preprocessed
in an offline fashion. The NSL-KDD dataset contains instances labeled with the attack
type. During the preprocessing step, the attacks are divided into four categories: DoS,
Probe, R2L, and U2R. Furthermore, there are four classifiers at the respective layers of
attacks. Thus, the preprocessing carried out using Weka selects the appropriate features
for each classifier. The selected features are then given to the feature filtering module
in NADAL.
Figure 2.4.1 illustrates the NADAL framework. In the proposed online method,
at each time, each instance is processed at most once to improve the model. The
instance is then discarded. Initially, instance Xt having label yt passes through the
feature filtering module and the appropriate features for each classifier are considered.
At each layer, the naive Bayesian module incrementally predicts the probability that the
instance belongs to the class. Thereafter, the selected active learning strategy (i.e.
uncertainty with randomization) is called. The output of the strategy determines
whether the label for the instance must be inquired. A logical OR gate is used to
aggregate the results from different active learning modules. The classifiers are updated
using the instance if the gate outputs 1. Otherwise, the aggregate output module
predicts the label according to the maximum certainty calculated by the classifiers. In
this case, ŷt represents the predicted label for instance Xt.
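The online step described above can be outlined in a few lines. This is a loose sketch under stated assumptions: `StubLayer` is a hypothetical stand-in for one layer's feature filter and incremental classifier (its probability is fixed instead of learned), and a plain threshold test replaces the randomized uncertainty strategy so that the example stays deterministic.

```python
class StubLayer:
    """Hypothetical stand-in for one layer: feature filter plus classifier."""

    def __init__(self, name, feature_indices, attack_probability):
        self.name = name
        self.feature_indices = feature_indices
        self.attack_probability = attack_probability  # fixed here, learned in practice
        self.updates = 0

    def filter(self, x):
        """Keep only the features selected offline for this layer."""
        return [x[i] for i in self.feature_indices]

    def predict(self, features):
        """Return (predicted label, certainty) for this layer."""
        p = self.attack_probability
        label = self.name if p >= 0.5 else "normal"
        return label, max(p, 1 - p)

    def update(self, features, label):
        self.updates += 1  # a real layer would update its counts here


def process_instance(x, y, layers, threshold):
    """Process one instance once; return a predicted label, or None if y was queried."""
    votes = []
    for layer in layers:
        label, certainty = layer.predict(layer.filter(x))
        wants_label = certainty <= threshold  # simplified active learning decision
        votes.append((label, certainty, wants_label))
    if any(wants for _, _, wants in votes):   # logical OR gate
        for layer in layers:
            layer.update(layer.filter(x), y)
        return None
    return max(votes, key=lambda v: v[1])[0]  # label with maximum certainty


confident = [StubLayer("DoS", [0, 1], 0.99), StubLayer("Probe", [1, 2], 0.05)]
print(process_instance([1, 2, 3], "DoS", confident, threshold=0.6))

uncertain = [StubLayer("DoS", [0, 1], 0.55), StubLayer("Probe", [1, 2], 0.05)]
print(process_instance([1, 2, 3], "DoS", uncertain, threshold=0.6))
```

In the second call one layer is uncertain, so the OR gate fires, every layer is updated with the queried label, and no prediction is returned.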
2.4.4 EVALUATION
The proposed framework in this paper was implemented using Java in NetBeans
8.0.2. Feature selection was performed using Weka and the Wrapper method. The
active learning modules as well as the incremental naive Bayesian module were
implemented by modifying the code from Massive Online Analysis (MOA) 2016.04,
written in Java. The standard NSL-KDD dataset is used for evaluation purposes. The
dataset was randomized via the Randomize functionality in Weka. The accuracy and
Kappa values were then calculated for the framework at four layers: DoS, Probe, U2R,
and R2L. The results were compared to those of the incremental naïve Bayesian
approach in MOA.
Figure 2.4.1 : The proposed model called NADAL
A. Dataset
As mentioned earlier, in this paper, the standard NSL-KDD dataset is used for
evaluation purposes. The dataset is a revision of the KDD-99 without repetitive
and redundant instances. Each record includes 42 features. The KDDtrain+.txt
file was used, wherein the 42nd feature identifies a normal vs. attack label. There
are four types of attacks: DoS, Probe, R2L, and U2R.
B. Evaluation Criteria
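The results are evaluated according to accuracy and Kappa. Accuracy represents
the percentage of tuples in the dataset that are correctly labeled. The measure is
calculated as below:
$\text{Accuracy} = \frac{\text{number of correctly labeled tuples}}{\text{total number of tuples}} \times 100$ (3)
The Kappa coefficient measures the agreement between raters who classify
or measure items, corrected for the agreement expected by chance. The value is
obtained as follows:
$\kappa = \frac{p_0 - p_c}{1 - p_c}$ (4)
where $p_0$ and $p_c$ denote observed and chance agreement, respectively.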
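Both criteria are straightforward to compute from paired lists of actual and predicted labels. The sketch below uses invented labels purely for illustration; chance agreement is estimated from the marginal label frequencies, which is the usual convention for Cohen's Kappa and is assumed here.

```python
from collections import Counter

def accuracy(actual, predicted):
    """Percentage of tuples whose predicted label matches the actual label."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return 100.0 * correct / len(actual)

def kappa(actual, predicted):
    """Kappa coefficient: (p0 - pc) / (1 - pc)."""
    n = len(actual)
    p0 = sum(a == p for a, p in zip(actual, predicted)) / n  # observed agreement
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    # Chance agreement from the marginal frequencies of each label.
    pc = sum(actual_counts[c] * predicted_counts[c] for c in actual_counts) / (n * n)
    return (p0 - pc) / (1 - pc)  # undefined when pc equals 1

# Invented labels: four of six predictions are correct.
actual = ["normal", "normal", "DoS", "DoS", "DoS", "normal"]
predicted = ["normal", "DoS", "DoS", "DoS", "normal", "normal"]

print(round(accuracy(actual, predicted), 2))  # 66.67
print(round(kappa(actual, predicted), 4))     # 0.3333
```

Here the chance agreement is 0.5, so an accuracy of about 66.67% corresponds to a Kappa of only about 0.33, which shows why the two measures are reported together.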
C. Implementation Results
The results exhibit a clear improvement in both accuracy and Kappa compared
to the incremental naive Bayesian approach. The results are shown for the
NSL-KDD dataset with ten randomizations.
Table 2.4.1 : ACCURACY AND KAPPA FOR TEN RANDOMIZATIONS: NADAL VS. INCREMENTAL
NAIVE BAYESIAN CLASSIFIER
2.4.5 CONCLUSION AND RECOMMENDATIONS
Traditional data packets are inherently static. In contrast, streaming data are
continuously created; they cannot be stored and must be analyzed as a single unit. A
novel network anomaly detection framework was proposed to improve efficiency in
classifying data in an online fashion. Furthermore, active learning was used to reduce
labeling costs. The proposed system was evaluated using the standard NSL-KDD
dataset. Implementation results revealed that the proposed method outperforms the
naive Bayesian approach in terms of both accuracy and Kappa.
2.5 A DEEP LEARNING APPROACH FOR NETWORK
INTRUSION DETECTION SYSTEM
A Network Intrusion Detection System (NIDS) helps system administrators to
detect network security breaches in their organizations. However, many challenges
arise while developing a flexible and efficient NIDS for unforeseen and unpredictable
attacks. We propose a deep learning-based approach for developing such an efficient
and flexible NIDS. We use Self-taught Learning (STL), a deep learning-based
technique, on NSL-KDD - a benchmark dataset for network intrusion. We present the
performance of our approach and compare it with a few previous works. Compared
metrics include accuracy, precision, recall, and f-measure values.
A NIDS monitors and analyzes the network traffic entering into or exiting from
the network devices of an organization and raises alarms if an intrusion is observed.
Based on the methods of intrusion detection, NIDSs are categorized into two classes:
1) Signature (misuse) based NIDS (SNIDS)
2) Anomaly Detection based NIDS (ADNIDS)
In SNIDS, e.g. Snort, attack signatures are pre-installed in the NIDS. A pattern
matching is performed for the traffic against the installed signatures to detect an
intrusion in the network.
In contrast, an ADNIDS classifies network traffic as an intrusion when it
observes a deviation from the normal traffic pattern.
SNIDS is effective in the detection of known attacks and shows high detection
accuracy with low false-alarm rates. However, its performance suffers during
detection of unknown or new attacks due to the limitation of attack signatures
that can be installed beforehand in an IDS.
ADNIDS, on the other hand, is well-suited for the detection of unknown and
new attacks. Although ADNIDS produces high false-positive rates, its
theoretical potential in the identification of novel attacks has caused its wide
acceptance among the research community.
There are primarily two challenges that arise while developing an efficient and
flexible NIDS for unknown future attacks. First, proper feature selection from the
network traffic dataset for anomaly detection is difficult. The features selected for one
class of attack may not work well for other categories of attacks due to continuously
changing and evolving attack scenarios. Second, labeled traffic datasets from real
networks are rarely available for developing a NIDS. Immense effort is required to produce
such a labeled dataset from the raw network traffic traces collected over a period or in
real-time. Additionally, to preserve the confidentiality of the internal organizational
network structure as well as the privacy of various users, network administrators are
reluctant to report any intrusion that might have occurred in their networks.
Various machine learning techniques have been used to develop ADNIDSs,
such as Artificial Neural Networks (ANN), Support Vector Machines (SVM), Naive-
Bayesian (NB), Random Forests (RF), and Self-Organized Maps (SOM). The NIDSs
are developed as classifiers to differentiate the normal traffic from the anomalous
traffic. Many NIDSs perform a feature selection task to extract a subset of relevant
features from the traffic dataset to enhance classification results. Feature selection helps
in the elimination of the possibility of incorrect training through the removal of
redundant features and noises. Recently, deep learning-based methods have been
successfully applied in audio, image, and speech processing applications. These
methods aim to learn a good feature representation from a large amount of unlabeled
data and subsequently apply these learned features on a limited amount of labeled data
in a supervised classification. The labeled and unlabeled data may come from different
distributions. However, they must be relevant to each other.
It is envisioned that the deep learning-based approaches can help to overcome
the challenges of developing an efficient NIDS. We can collect unlabeled network
traffic data from different network sources and a good feature representation from these
datasets using deep learning techniques can be obtained. These features can, then, be
applied for supervised classification to a small, but labeled traffic dataset consisting of
normal as well as anomalous traffic records. The traffic data for labeled dataset can be
collected in a confined, isolated and private network environment. With this
motivation, we use self-taught learning, a deep learning technique based on sparse
autoencoder and soft-max regression, to develop a NIDS. We verify the usability of the
self-taught learning-based NIDS by applying on NSL-KDD intrusion dataset, an
improved version of the benchmark dataset for various NIDS evaluations - KDD Cup
99.
2.5.1 RELATED WORK
This section presents various recent accomplishments in this area. We only
discuss work that has used the NSL-KDD dataset for performance
benchmarking. Therefore, any dataset referred from this point forward should be
considered as NSL-KDD. This approach allows a more accurate comparison of work
with others found in the literature. Finally, we discuss a few deep learning-based
approaches that have been tried so far for similar kinds of work.
One of the earliest works found in literature used ANN with enhanced resilient
back-propagation for the design of such an IDS. This work used only the training
dataset for training (70%), validation (15%) and testing (15%). As expected, use of
unlabeled data for testing resulted in a reduction of performance.
A more recent work used J48 decision tree classifier with 10-fold cross-
validation for testing on the training dataset. This work used a reduced feature set of 22
features instead of the full set of 41 features.
A similar work evaluated various popular supervised tree-based classifiers and
found that Random Tree model performed best with the highest degree of accuracy
along with a reduced false alarm rate.
Many 2-level classification approaches have also been proposed. One such
work used Discriminative Multinomial Naive Bayes (DMNB) as a base classifier and
Nominal-to-Binary supervised filtering at the second level along with 10-fold cross
validation for testing. This work was further extended to use Ensembles of Balanced
Nested Dichotomies (END) at the first level and Random Forest at the second level. As
expected, this enhancement resulted in an improved detection rate and a lower false
positive rate.
Another 2-level implementation used principal component analysis (PCA) for
the feature set reduction and then SVM (using Radial Basis Function) for final
classification, which resulted in a high detection accuracy with only the training
dataset and the full set of 41 features. A reduction of the feature set to 23 resulted in even better detection
accuracy in some of the attack classes, but the overall performance was reduced. The
authors improved their work by using information gain to rank the features and then a
behavior-based feature selection to reduce the feature set to 20. This resulted in an
improvement in reported accuracy using the training dataset.
The second category to look at used both the training and test datasets. An initial
attempt in this category used fuzzy classification with genetic algorithm and resulted in
a detection accuracy of 80%+ with a low false positive rate. Another important work
used unsupervised clustering algorithms and found that the performance using only the
training data was reduced drastically when test data was also used.
A similar implementation using the k-point algorithm resulted in a slightly
better detection accuracy and lower false positive rate, using both training and test
datasets.
Another less popular technique, optimum path forest (OPF), which uses graph
partitioning for feature classification, was found to achieve a high detection
accuracy in one-third of the time required by the SVM-RBF method.
A deep learning approach with a Deep Belief Network (DBN) as the feature
selector and an SVM as the classifier achieved an accuracy of 92.84% when applied to
the training data.
2.5.2 SELF-TAUGHT LEARNING & NSL-KDD DATASET OVERVIEW
1) Self-Taught Learning
Self-taught Learning (STL) is a deep learning approach that consists of two
stages for classification. First, a good feature representation is learnt from a
large collection of unlabeled data, $x_u$, a step termed Unsupervised Feature Learning
(UFL). In the second stage, this learnt representation is applied to labeled data,
$x_l$, and used for the classification task. Figure 2.5.1 shows the architecture
diagram of STL. Different approaches are used for UFL, such as Sparse
Autoencoder, Restricted Boltzmann Machine (RBM), K-Means Clustering, and
Gaussian Mixtures.
A sparse autoencoder is a neural network that consists of an input, a hidden, and an
output layer. The input and output layers contain $N$ nodes, and the hidden layer
contains $K$ nodes. The target values at the output layer are set equal to the input
values, i.e., $\hat{x}_i = x_i$, as shown in Figure 2.5.1(a). The sparse autoencoder network
finds the optimal values for the weight matrices, $W \in \mathbb{R}^{K \times N}$ and $V \in \mathbb{R}^{N \times K}$, and bias
vectors, $b_1 \in \mathbb{R}^{K \times 1}$ and $b_2 \in \mathbb{R}^{N \times 1}$, using the back-propagation algorithm while
trying to learn an approximation of the identity function, i.e., an output $\hat{x}$ similar to
$x$. The sigmoid function, $g(z) = \frac{1}{1 + e^{-z}}$, is used for the activation, $h_{W,b}$, of the nodes
in the hidden and output layers:
$$h_{W,b}(x) = g(Wx + b) \qquad (1)$$
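$$J = \frac{1}{2m}\sum_{i=1}^{m}\left\|x_i - \hat{x}_i\right\|^2 + \frac{\lambda}{2}\left(\sum_{k,n} W^2 + \sum_{n,k} V^2 + \sum_{k} b_1^2 + \sum_{n} b_2^2\right) + \beta\sum_{j=1}^{K} KL(\rho \,\|\, \hat{\rho}_j) \qquad (2)$$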
The cost function minimized by the sparse autoencoder using back-propagation
is given by Eq. (2). The first term is the average of the sum-of-squares
error terms over all $m$ input data. The second term is a weight decay term,
with $\lambda$ as the weight decay parameter, which helps avoid over-fitting during training. The last
term in the equation is a sparsity penalty term that constrains the
hidden layer to maintain low average activation values; it is expressed as the
Kullback-Leibler (KL) divergence shown in Eq. (3):
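$$KL(\rho \,\|\, \hat{\rho}_j) = \rho \log\frac{\rho}{\hat{\rho}_j} + (1 - \rho)\log\frac{1 - \rho}{1 - \hat{\rho}_j} \qquad (3)$$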
where $\rho$ is a sparsity constraint parameter that ranges from 0 to 1 and $\beta$ controls the
weight of the sparsity penalty term. $KL(\rho \,\|\, \hat{\rho}_j)$ attains its minimum value when $\rho = \hat{\rho}_j$,
where $\hat{\rho}_j$ denotes the average activation value of hidden unit $j$ over all training
inputs $x$. Once we learn the optimal values for $W$ and $b_1$ by applying the sparse
autoencoder to the unlabeled data, $x_u$, we evaluate the feature representation
$a = h_{W,b_1}(x_l)$ for the labeled data, $(x_l, y)$. We use this new feature representation, $a$,
with the labels vector, $y$, for the classification task in the second stage. We use
soft-max regression for the classification task, as shown in Figure 2.5.1(b).
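As a rough illustration of the quantities described above, the following Python sketch evaluates the sigmoid activation, the KL sparsity penalty, and a three-term sparse autoencoder cost for a toy network. It is a sketch under stated assumptions, not the authors' implementation: the weight decay term is assumed to cover W, V, and both bias vectors (some formulations leave the biases out), all function names are hypothetical, and the inputs and hyperparameters are made-up values. No training step is shown.

```python
import math


def sigmoid(z):
    """Sigmoid activation g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(M, b, x):
    """Compute g(Mx + b) for a weight matrix M (list of rows) and bias vector b."""
    return [sigmoid(sum(m * xi for m, xi in zip(row, x)) + bi) for row, bi in zip(M, b)]


def kl_divergence(rho, rho_hat):
    """KL divergence between Bernoulli distributions with means rho and rho_hat."""
    return rho * math.log(rho / rho_hat) + (1 - rho) * math.log((1 - rho) / (1 - rho_hat))


def sparse_autoencoder_cost(X, W, b1, V, b2, lam, beta, rho):
    """Reconstruction error + weight decay + sparsity penalty over the inputs X."""
    m = len(X)
    hidden = [layer(W, b1, x) for x in X]        # hidden activations, K values each
    outputs = [layer(V, b2, a) for a in hidden]  # reconstructions, N values each
    reconstruction = sum(
        sum((xi - xh) ** 2 for xi, xh in zip(x, x_hat)) for x, x_hat in zip(X, outputs)
    ) / (2 * m)
    # Assumption: weight decay is applied to the weights and to the biases.
    squares = sum(w * w for row in W for w in row) + sum(v * v for row in V for v in row)
    squares += sum(b * b for b in b1) + sum(b * b for b in b2)
    decay = (lam / 2) * squares
    # Average activation of each hidden unit over all inputs.
    rho_hat = [sum(a[j] for a in hidden) / m for j in range(len(W))]
    sparsity = beta * sum(kl_divergence(rho, r) for r in rho_hat)
    return reconstruction + decay + sparsity


# Toy example with N = 3 inputs and K = 2 hidden units (made-up values).
X_unlabeled = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
W = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]    # K x N
b1 = [0.0, 0.0]                           # K x 1
V = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]  # N x K
b2 = [0.0, 0.0, 0.0]                      # N x 1

# With all-zero parameters every activation equals 0.5.
cost = sparse_autoencoder_cost(X_unlabeled, W, b1, V, b2, lam=1e-4, beta=3.0, rho=0.5)

# Second stage: encode a labeled sample with the learnt W and b1.
x_labeled = [1.0, 1.0, 0.0]
features = layer(W, b1, x_labeled)
```

In practice the parameters would be updated by back-propagation to reduce `cost`, and `features` would then be passed to the soft-max classifier together with the labels.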
2) NSL-KDD Dataset
The NSL-KDD dataset is an improved and reduced version of the KDD Cup 99
dataset. The KDD Cup dataset was prepared using the network traffic captured
by the 1998 DARPA IDS evaluation program. The network traffic includes normal
traffic and different kinds of attack traffic, such as DoS, Probing, user-to-root (U2R),
and remote-to-local (R2L). The network traffic for training was collected for seven
weeks, followed by two weeks of traffic collection for testing, in raw tcpdump
format. The test data contains many attacks that were not injected during the
training data collection phase, to make the intrusion detection task realistic. It is
believed that most of the novel attacks can be derived from the known attacks.
Finally, the training and test data were processed into datasets of five
million and two million TCP/IP connection records, respectively. The KDD
Cup dataset has been widely used as a benchmark dataset for many years in the
evaluation of NIDS. One of the major drawbacks of the dataset is that it
contains an enormous number of redundant records in both the training and test
data. It was observed that almost 78% and 75% records are redundant in the
Figure 2.5.1 : The two-stage process of self-taught learning: a) Unsupervised
Feature Learning (UFL) on unlabeled data. b) Classification on labeled data.
Seminar Report | Network Intrusion Detection using Supervised Machine Learning Technique with Feature Selection

  • 1. Network Intrusion Detection using Supervised Machine Learning Technique with Feature Selection SEMINAR REPORT Submitted by JOWIN JOHN CHEMBAN in partial fulfillment for the award of the degree of Bachelor of Technology in COMPUTER SCIENCE AND ENGINEERING of APJ ABDUL KALAM TECHNOLOGICAL UNIVERSITY DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING HOLY GRACE ACADEMY OF ENGINEERING MALA 680 735 NOVEMBER 2019
  • 2. CERTIFICATE This is to Certify that the seminar report entitled “Network Intrusion Detection using Supervised Machine Learning Technique with Feature Selection” is a bonafide record of the work done by Mr. JOWIN JOHN CHEMBAN, Register No. HGW16CS022 under our supervision, in partial fulfillment of the requirements for the award of Degree of Bachelor of Technology in Computer Science & Engineering from APJ Abdul Kalam Technological University, Trivandrum for the years 2016-2020 Ms. SUJITHA B CHERKOTTU Ms. VIDHU VALSAN A Asst Professor, Dept. of CSE Asst Professor, Dept. of CSE Seminar Coordinator Seminar Guide Ms. SANAM ANTO Head of Department, Dept. of CSE Date :
  • 3. ACKNOWLEDGEMENT An endeavor over a long period may be successful only with the advice and guidance of many well-wishers. I take this opportunity to express my gratitude to all who encouraged me to complete this seminar. I would like to express my deep sense of gratitude to my respected Principal Dr. THRESIAMMA PHILIP for her inspiration and for creating an atmosphere in the college to do the seminar. I would like to thank Ms. SANAM ANTO, Head of the Department of Computer Science and Engineering, for providing permission and facilities to conduct the seminar in a systematic way, and for guiding me and giving timely advice, suggestions and whole-hearted moral support in the successful completion of this seminar. My sincere thanks to the seminar coordinator Ms. SUJITHA B CHERKOTTU, Assistant Professor in the Department of Computer Science and Engineering, for her wholehearted moral support in the completion of this seminar. My sincere thanks to my seminar guide Ms. VIDHU VALSAN A, Assistant Professor in the Department of Computer Science and Engineering, for her wholehearted moral support in the completion of this seminar. Last but not least, I would like to thank all the lecturers, non-teaching staff and friends who have helped me in every possible way in the completion of my seminar. Date : JOWIN JOHN CHEMBAN
  • 4. NETWORK INTRUSION DETECTION USING SUPERVISED MACHINE LEARNING TECHNIQUE WITH FEATURE SELECTION HGAE DEPARTMENT OF CSE ABSTRACT A novel supervised machine learning system is developed to classify network traffic as malicious or benign. To find the best model in terms of detection success rate, combinations of supervised learning algorithms and feature selection methods are evaluated. The study finds that an Artificial Neural Network (ANN) with wrapper feature selection outperforms the Support Vector Machine (SVM) technique when classifying network traffic. Performance is evaluated on the NSL-KDD dataset using both the SVM and ANN supervised machine learning techniques. A comparative study shows that the proposed model is more efficient than other existing models with respect to intrusion detection success rate.
  • 5. TABLE OF CONTENTS
    1 INTRODUCTION
    2 LITERATURE SURVEY
    2.1 IMPORTANCE OF INTRUSION DETECTION SYSTEM (IDS)
    2.2 MACHINE LEARNING TECHNIQUES FOR INTRUSION DETECTION
    2.3 ANOMALY-BASED NETWORK INTRUSION DETECTION: TECHNIQUES, SYSTEMS AND CHALLENGES
    2.4 INCREMENTAL ANOMALY-BASED INTRUSION DETECTION SYSTEM USING LIMITED LABELED DATA
    2.5 A DEEP LEARNING APPROACH FOR NETWORK INTRUSION DETECTION SYSTEM
    3 PROPOSED SYSTEM
    4 APPLICATIONS
    5 CONCLUSION
    REFERENCES
  • 6. LIST OF ABBREVIATIONS
    NIDS: Network Intrusion Detection System
    IDS: Intrusion Detection System
    UTM: Unified Threat Management
    IPS: Intrusion Prevention System
    SVM: Support Vector Machine
    ANN: Artificial Neural Network
    DIDS: Distributed Intrusion Detection System
    CMDS: Computer Misuse Detection System
    ASIM: Automated Security Measurement System
    AFCERT: Air Force's Computer Emergency Response Team
    TCP: Transmission Control Protocol
    IP: Internet Protocol
    HIDS: Host based Intrusion Detection System
    AI: Artificial Intelligence
    CI: Computational Intelligence
    ML: Machine Learning
    kNN: k-Nearest Neighbor
    MLP: Multi-Layer Perceptron
    UDP: User Datagram Protocol
    GA: Genetic Algorithms
    KDD: Knowledge Discovery in Databases
    RBF: Radial Basis Function
    DoS: Denial of Service
    R2L: Remote to Local
    U2R: User to Root
    PRB: Probing
    AIS: Artificial Immune System
  • 7. LIST OF ABBREVIATIONS (continued)
    NSA: Negative Selection Algorithm
    CIDF: Common Intrusion Detection Framework
    IDWG: Intrusion Detection Working Group
    IDXP: Intrusion Detection eXchange Protocol
    IDMEF: Intrusion Detection MEssage Format
    OS: Operating System
    A-NIDS: Anomaly based Network Intrusion Detection System
    TP: True Positive
    FP: False Positive
    TN: True Negative
    FN: False Negative
    LAN: Local Area Network
    SC: Service Classifier
    ITI: Incremental Tree Inductive
    NADAL: Network Anomaly Detection using Active Learning
    STL: Self-Taught Learning
    SNIDS: Signature (misuse) based Network Intrusion Detection System
    ADNIDS: Anomaly Detection based Network Intrusion Detection System
    NB: Naïve Bayesian
    RF: Random Forests
    SOM: Self-Organized Maps
    DMNB: Discriminative Multinomial Naïve Bayes
    END: Ensembles of Balanced Nested Dichotomies
    OPF: Optimum Path Forest
    DBN: Deep Belief Network
    UFL: Unsupervised Feature Learning
    RBM: Restricted Boltzmann Machine
    SMR: Soft Max Regression
    CPU: Central Processing Unit
  • 8. LIST OF FIGURES
    2.1.1 Number of incidents reported
    2.1.2 Vulnerabilities reported
    2.1.3 Layered Security approach for reducing risk
    2.2.1 Average of detection rates for methods evaluated in Pavel Laskov, Patrick Düssel, Christin Schäfer, and Konrad Rieck. Learning intrusion detection: Supervised or unsupervised?
    2.3.1 General CIDF architecture for IDS systems
    2.3.2 Generic A-NIDS functional architecture
    2.3.3 Classification of the anomaly detection techniques according to the nature of the processing involved in the "behavioural" model considered
    2.4.1 The proposed model called NADAL
    2.5.1 The two-stage process of self-taught learning: a) Unsupervised Feature Learning on unlabeled data. b) Classification on labeled data.
    2.5.2 Various steps involved in our NIDS implementation
    2.5.3 Classification accuracy using self-taught learning and soft-max regression for 2-Class, 5-Class, and 23-Class when applied to training data
    2.5.4 Precision, Recall, and F-Measure values using self-taught learning and soft-max regression for 2-Class when applied to training data
    2.5.5 Classification accuracy using self-taught learning and soft-max regression for 2-Class and 5-Class when applied to test data
    2.5.6 Precision, Recall, and F-Measure values using self-taught learning and soft-max regression for 2-Class when applied to test data
    2.5.7 Precision, Recall, and F-Measure values using self-taught learning and soft-max regression for 5-Class when applied to test data
    3.1 Proposed supervised machine learning classifier system
    3.2 SVM classifier in two-dimensional problem spaces
    3.3 Artificial neural network showing the input, output and hidden layers
  • 9. LIST OF TABLES
    2.3.1 Fundamentals of the A-NIDS techniques
    2.4.1 Accuracy and kappa for ten randomizations: NADAL vs. incremental naive Bayesian classifier
    2.5.1 Traffic records distribution in the training
    3.1 Result of feature selection
    3.2 Result of classification
    3.3 Performance comparison with existing models
  • 10. CHAPTER 1 INTRODUCTION 1.1 NIDS using Supervised Machine Learning with Feature Selection With the widespread use of the internet and growing access to online content, cybercrime is also occurring at an increasing rate. An intrusion is an attempt by an attacker (sometimes called a hacker or cracker) to break into or misuse a system or network. Intrusion detection is the first step in preventing a security attack. Hence security solutions such as firewalls, Intrusion Detection Systems (IDS), Unified Threat Management (UTM) and Intrusion Prevention Systems (IPS) are receiving much attention in research. An IDS detects attacks by collecting information from a variety of system and network sources and then analyzing that information for possible security breaches. An IDS installed on a network or system serves much the same purpose as a burglar alarm installed in a house: through various methods, both detect when an intruder is present, and both subsequently issue some type of warning or alert. A network-based IDS analyzes the data packets that travel over a network, and this analysis is carried out in two ways: signature-based detection and anomaly-based detection. To this day, anomaly-based detection lags far behind signature-based detection, and it therefore remains a major area of research. The challenge with anomaly-based intrusion detection is that it must deal with novel attacks, for which there is no prior knowledge to identify the anomaly. The system therefore needs the intelligence to separate harmless traffic from malicious or anomalous traffic, and researchers have been exploring machine learning techniques for this purpose over the last few years. An IDS, however, is not the answer to all security-related problems.
For example, an IDS cannot compensate for weak identification and authentication mechanisms or for weaknesses in network protocols. The study of intrusion detection began in 1980, and the first such model was published in 1987. Despite huge commercial investment and substantial research over the last few decades, intrusion detection technology is still immature and hence not fully effective. While signature-based network IDS have seen commercial success and widespread adoption by technology-based organizations around the globe, anomaly-based network IDS have not succeeded on the same scale. For that reason, anomaly-based detection is currently a major focus of research and development in the field of IDS, and key issues remain to be solved before any wide-scale deployment of anomaly-based intrusion detection systems. However, the literature today is limited when it comes to comparing how intrusion detection performs under different supervised machine learning techniques. Anomaly-based network IDS is a valuable technology for protecting target systems and networks against malicious activities. Despite the variety of anomaly-based network
  • 11. intrusion detection techniques described in the literature in recent years, security tools with anomaly detection functionality are only beginning to appear, and some important problems remain to be solved. Several anomaly-based techniques have been proposed, including Linear Regression, Support Vector Machines (SVM), Genetic Algorithms, Gaussian mixture models, the k-nearest neighbor algorithm, the Naive Bayes classifier and Decision Trees. Among them, the most widely used learning algorithm is SVM, as it has already established itself on many different types of problems. A major issue with anomaly-based detection is that, although all of these techniques can detect novel attacks, they generally suffer from a high false alarm rate. The cause is the complexity of generating profiles of practical normal behaviour by learning from the training data sets. Today, Artificial Neural Networks (ANN) are often trained by the back-propagation algorithm, which has been around since 1970 as the reverse mode of automatic differentiation. A major challenge in evaluating the performance of network IDS is the unavailability of a comprehensive network-based data set. Most of the anomaly-based techniques proposed in the literature were evaluated using the KDD CUP 99 dataset. In this paper we apply two machine learning techniques, SVM and ANN, to NSL-KDD, a popular benchmark dataset for network intrusion detection. The promise and contribution of machine learning to date are remarkable, and many real-life applications in use today are powered by it. Machine learning appears set to play an even larger role in the coming years. Hence, we hypothesize that the challenge of identifying new or zero-day attacks, which technology-enabled organizations face today, can be overcome using machine learning techniques.
Here we developed a supervised machine learning model that can classify unseen network traffic based on what it has learnt from seen traffic. We used both the SVM and ANN learning algorithms to find the classifier with the highest accuracy and success rate.
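The terms "accuracy", "success rate" and "false alarm rate" used in this chapter can be made concrete with the TP, FP, TN and FN counts from the list of abbreviations. The following is a minimal sketch, assuming binary labels where 1 means attack and 0 means benign; the function names and the sample labels are illustrative and are not taken from the report.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # attacks correctly flagged (true positives)
    fp: int  # benign traffic wrongly flagged (false alarms)
    tn: int  # benign traffic correctly passed (true negatives)
    fn: int  # attacks that were missed (false negatives)


def confusion_counts(y_true, y_pred):
    """Count TP/FP/TN/FN for binary labels (1 = attack, 0 = benign)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def accuracy(c):
    """Fraction of all records that were classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def detection_rate(c):
    """Fraction of real attacks that were detected (also called recall)."""
    return c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0


def false_alarm_rate(c):
    """Fraction of benign records that were wrongly flagged as attacks."""
    return c.fp / (c.fp + c.tn) if (c.fp + c.tn) else 0.0


# Illustrative labels only: 3 attacks and 5 benign records.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
counts = confusion_counts(y_true, y_pred)
```

With these sample labels, one attack is missed and one benign record is flagged, so accuracy alone (6 of 8 correct) hides the fact that a third of the attacks went undetected. This is why detection rate and false alarm rate are usually reported alongside accuracy.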
  • 12. CHAPTER 2 LITERATURE SURVEY 2.1 IMPORTANCE OF INTRUSION DETECTION SYSTEM (IDS) Intruders spread across the Internet have become a major threat. Researchers have proposed a number of techniques, such as firewalls and encryption, to prevent penetration and protect computer infrastructure, yet intruders still manage to penetrate computers. IDS has therefore attracted much attention from researchers: an IDS monitors computer resources and reports any anomalous activity or strange patterns. The aim of this paper is to explain the stages in the evolution of the idea of IDS and its importance to researchers, research centres, and the security and military sectors, and to examine the categories and classifications of intrusion detection systems and where an IDS can be placed to reduce the risk to the network. Security is an important issue for the networks of all companies and institutions today. Intruders keep trying to gain access to the data networks and web services of these companies, despite the development of multiple ways to prevent intrusion into the network infrastructure via the Internet, such as firewalls and encryption. IDS is a relatively new intrusion detection technology that has emerged in recent years. The main role of an intrusion detection system in a network is to help computer systems prepare for and deal with network attacks.
Intrusion detection functions include:
    • Monitoring and analyzing both user and system activities
    • Analyzing system configurations and vulnerabilities
    • Assessing system and file integrity
    • Ability to recognize patterns typical of attacks
    • Analysis of abnormal activity patterns
    • Tracking user policy violations
The purpose of an IDS is to help computer systems deal with attacks. It collects information from several different sources within computer systems and networks and compares this information with pre-existing patterns to determine whether there are attacks or weaknesses.
  • 13. 2.1.1 INTRUSION DETECTION SYSTEMS: A BRIEF HISTORY The goal of intrusion detection is to monitor network assets to detect anomalous behaviour and misuse in the network. This concept has been around for nearly twenty years, but only recently has it seen a dramatic rise in popularity and incorporation into the overall information security infrastructure. Intrusion detection was born in 1980 with James Anderson's paper, Computer Security Threat Monitoring and Surveillance. Since then, several pivotal events in IDS technology have advanced intrusion detection to its current state. Anderson's seminal paper, written for a government organization, introduced the notion that audit trails contain vital information that could be valuable in tracking misuse and understanding user behaviour. With the release of this paper, the concept of "detecting" misuse and specific user events emerged. His insight into audit data and its importance led to tremendous improvements in the auditing subsystems of virtually every operating system. Anderson's hypothesis also provided the foundation for future intrusion detection system design and development. His work was the start of host-based intrusion detection and of IDS in general. In 1983, SRI International and Dr. Dorothy Denning began working on a government project that launched a new effort in intrusion detection system development. Their goal was to analyze audit trails from government mainframe computers and create profiles of users based upon their activities. One year later, Dr. Denning helped to develop the first model for intrusion detection, the Intrusion Detection Expert System (IDES), which provided the foundation for the IDS technology development that was soon to follow.
In 1984, SRI also developed a means of tracking and analyzing audit data containing authentication information of users on ARPANET, the original Internet. Soon after, SRI completed a Navy SPAWAR contract with the realization of the first functional intrusion detection system, IDES. Using her research and development work at SRI, Dr. Denning published the decisive work, An Intrusion Detection Model, which revealed the necessary information for commercial intrusion detection system development. The subsequent iteration of this tool was called the Distributed Intrusion Detection System (DIDS). DIDS augmented the existing solution by tracking client machines as well as the servers it originally monitored. Finally, in 1989, the developers from the Haystack project formed the commercial company, Haystack Labs, and released the last generation of the technology, Stalker. Crosby Marks says that "Stalker was a host-based, pattern matching system that included robust search capabilities to manually and automatically query the audit data." The Haystack advances, coupled with the work of SRI and Denning, greatly advanced the development of host-based intrusion detection technologies.
  • 14. Commercial development of intrusion detection technologies began in the early 1990s. Haystack Labs was the first commercial vendor of IDS tools, with its Stalker line of host-based products. SAIC was also developing a form of host-based intrusion detection, called the Computer Misuse Detection System (CMDS). Simultaneously, the Air Force's Cryptologic Support Center developed the Automated Security Measurement System (ASIM) to monitor network traffic on the US Air Force's network. ASIM made considerable progress in overcoming the scalability and portability issues that previously plagued NID products. Additionally, ASIM was the first solution to incorporate both hardware and software in network intrusion detection. ASIM is still in use and is managed by the Air Force's Computer Emergency Response Team (AFCERT) at locations all over the world. As often happened, the development group on the ASIM project formed a commercial company in 1994, the Wheel Group. Their product, NetRanger, was the first commercially viable network intrusion detection device. The intrusion detection market began to gain in popularity and truly generate revenues around 1997. In that year, the security market leader, ISS, developed a network intrusion detection system called RealSecure. A year later, Cisco recognized the importance of network intrusion detection and purchased the Wheel Group, attaining a security solution it could provide to its customers. Similarly, the first visible host-based intrusion detection company, Centrax Corporation, emerged as a result of a merger of the development staff from Haystack Labs and the departure of the CMDS team from SAIC. From there, the commercial IDS world expanded its market base, and a roller coaster ride of start-up companies, mergers, and acquisitions ensued.
Network intrusion detection deals with information passing on the wire between hosts. Typically referred to as "packet-sniffers," network intrusion detection devices intercept packets travelling in and out of the network over various communication media and protocols, usually TCP/IP. Once captured, the packets are analyzed in a number of different ways. Some IDS devices simply compare the packet to a signature database consisting of known attacks and malicious packet "fingerprints", while others look for anomalous packet activity that might indicate malicious behaviour. Figure 2.1.1 : Number of incidents reported
  • 15. The IDS monitors network traffic for activity that is banned in the network. Its main job is to alert network administrators so that they can take corrective action, such as blocking access to vulnerable ports, denying access to specific IP addresses or shutting down services used to allow attacks. It is a front-line weapon in the network administrators' war against hackers. The collected information is compared with predefined blueprints of known attacks and vulnerabilities. 2.1.2 CATEGORIES OF INTRUSION DETECTION SYSTEM Intrusion detection systems are classified into three categories: signature-based, anomaly-based and specification-based detection systems. 1) Signature based Detection System A signature-based detection system (also called misuse-based) is very effective against known attacks. It depends on receiving regular pattern updates and is unable to detect previously unknown threats or new variants. 2) Anomaly based Detection System This type of detection classifies network activity as normal or anomalous. The classification is based on rules or heuristics rather than patterns or signatures, so implementing such a system first requires knowing the normal behaviour of the network. Unlike a misuse-based system, an anomaly-based detection system can detect previously unknown threats, but it is more likely to raise false positives. Figure 2.1.2 : Vulnerabilities reported
  • 16. 3) Specification based Detection System This type of detection system monitors processes and matches their actual behaviour against the program's specification; any abnormal behaviour triggers an alert. The specification must be maintained and updated whenever the monitored programs change. This approach can detect previously unknown attacks, and its number of false positives can be lower than that of the anomaly detection approach. 2.1.3 CLASSIFICATION OF INTRUSION DETECTION SYSTEM Intrusion detection systems are classified into three types. 1) Host based IDS (HIDS) This type is placed on a single device such as a server or workstation, where data collected from different sources is analyzed locally on the machine. HIDS can use both anomaly and misuse detection. 2) Network based IDS (NIDS) NIDS are deployed at strategic points in the network infrastructure. A NIDS captures and analyzes traffic to detect known attacks by comparing it against a database of patterns or signatures, or to detect illegal activities by scanning traffic for anomalous activity. NIDS are also referred to as "packet-sniffers" because they capture the packets passing through the communication media. 3) Hybrid based IDS Hybrid IDS combine the management of and alerting from both network-based and host-based intrusion detection devices, providing the logical complement to NID and HID: central intrusion detection management. Figure 2.1.3 : Layered Security approach for reducing risk
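The difference between the signature-based and anomaly-based categories described above can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the implementation of any product named in this section: the signature patterns, the `RateAnomalyDetector` class, the packets-per-second feature and the threshold of three standard deviations are all hypothetical choices.

```python
import statistics

# Hypothetical signature database: byte patterns standing in for known attacks.
SIGNATURES = {
    "sql_injection": b"' OR 1=1",
    "path_traversal": b"../../",
}


def match_signatures(payload: bytes):
    """Signature-based detection: report every known pattern found in the payload."""
    return [name for name, pattern in SIGNATURES.items() if pattern in payload]


class RateAnomalyDetector:
    """Anomaly-based detection: flag values that deviate from a learnt baseline."""

    def __init__(self, baseline, threshold=3.0):
        # The "normal profile" is just the mean and standard deviation
        # of a traffic feature (assumed here: packets per second).
        self.mean = statistics.mean(baseline)
        self.stdev = statistics.stdev(baseline)
        self.threshold = threshold

    def is_anomalous(self, value):
        if self.stdev == 0:
            return value != self.mean
        # Distance from the baseline mean, measured in standard deviations.
        return abs(value - self.mean) / self.stdev > self.threshold


# Assumed baseline of normal packets-per-second measurements.
detector = RateAnomalyDetector([100, 110, 95, 105, 90])
```

The signature matcher can only report what is already in its database, which is why it misses new attacks. The anomaly detector needs no attack knowledge at all, but any unusual yet legitimate burst of traffic would also exceed the threshold, which is the source of the higher false positive rate noted above.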
  • 17. 2.1.4 CONCLUSION An intrusion detection system is a part of the defensive operations that complements defenses such as firewalls and UTM. It detects signs of attack and then raises alerts. According to detection methodology, intrusion detection systems are typically categorized as misuse detection or anomaly detection systems. From the deployment perspective, they are classified as network-based or host-based IDS. Current intrusion detection systems collect information from both network and host resources. In terms of performance, an intrusion detection system becomes more accurate as it detects more attacks and raises fewer false positive alarms. 2.2 MACHINE LEARNING TECHNIQUES FOR INTRUSION DETECTION An Intrusion Detection System (IDS) is software that monitors a single computer or a network of computers for malicious activities (attacks) aimed at stealing or censoring information or corrupting network protocols. Most techniques used in today's IDS are not able to deal with the dynamic and complex nature of cyber-attacks on computer networks. Hence, efficient adaptive methods such as machine learning techniques can result in higher detection rates, lower false alarm rates and reasonable computation and communication costs. In this paper, we study several such schemes and compare their performance. We divide the schemes into methods based on classical artificial intelligence (AI) and methods based on computational intelligence (CI). We explain how various characteristics of CI techniques can be used to build efficient IDS. Today, political and commercial entities are increasingly engaging in sophisticated cyber-warfare to damage, disrupt, or censor information content in computer networks.
In designing network protocols, there is a need to ensure reliability against intrusions of powerful attackers that can even control a fraction of parties in the network. The controlled parties can launch both passive (e.g., eavesdropping, nonparticipation) and active attacks (e.g., jamming, message dropping, corruption, and forging). Intrusion detection is the process of dynamically monitoring events occurring in a computer system or network, analyzing them for signs of possible incidents and often interdicting the unauthorized access. This is typically accomplished by automatically collecting information from a variety of systems and network sources, and then analyzing the information for possible security problems.
  • 18. Motivation Traditional intrusion detection and prevention techniques, like firewalls, access control mechanisms, and encryption, have several limitations in fully protecting networks and systems from increasingly sophisticated attacks like denial of service. Moreover, most systems built on such techniques suffer from high false positive and false negative rates and cannot adapt continuously to changing malicious behaviours. In the past decade, however, several Machine Learning (ML) techniques have been applied to the problem of intrusion detection with the hope of improving detection rates and adaptability. These techniques are often used to keep the attack knowledge bases up-to-date and comprehensive. Study Approach In this paper, we study several papers that use ML methods for detecting malicious behaviour in distributed computer systems. Because there is a huge body of work in this area, we carefully selected a few papers based on two factors: diversity and citation count. By diversity we mean that most ML techniques for IDS are covered, but only one paper is picked from the set of papers that use the same technique. The papers are also chosen based on their citation count, as this factor indicates how much the corresponding work has influenced the community. All non-survey papers studied here have been cited at least 100 times. 2.2.1 CHALLENGES AND APPROACHES An IDS generally has to deal with problems such as large network traffic volumes, highly uneven data distribution, the difficulty of drawing decision boundaries between normal and abnormal behaviour, and a requirement for continuous adaptation to a constantly changing environment. In general, the challenge is to efficiently capture and classify various behaviours in a computer network.
Strategies for classification of network behaviours are typically divided into two categories: misuse detection and anomaly detection. Misuse detection techniques examine both network and system activity for known instances of misuse using signature matching algorithms. This technique is effective at detecting attacks that are already known. However, novel attacks are often missed, giving rise to false negatives. Alerts may be generated by the IDS, but reacting to every alert wastes time and resources and can destabilize the system. To overcome this problem, an IDS should not start an elimination procedure as soon as the first symptom has been detected; rather, it should be patient enough to collect alerts and decide based on their correlation. Anomaly detection systems rely on constructing a model of user behaviour that is considered normal. This is achieved by using a combination of statistical or machine learning methods to examine network traffic or system calls and processes. The detection of novel attacks is more successful using the anomaly detection approach as
  • 19. any deviant behaviour is classified as an intrusion. However, normal behaviour in a large and dynamic system is not well defined and changes over time. This often results in a substantial number of false alarms, known as false positives. A network-based IDS inspects incoming network traffic for patterns that can signify that a person is probing the network for vulnerable computers. Since responding to each alert consumes relatively large amounts of time and resources, an IDS should not respond to every alert it generates. Disregarding this fact may result in a self-inflicted denial of service. To overcome this problem, alerts should be aggregated and correlated in order to produce fewer but more expressive and meaningful alerts.
2.2.1.1 MACHINE LEARNING APPROACHES
We divide the ML-based approaches to intrusion detection into two categories: approaches based on Artificial Intelligence (AI) techniques and approaches based on Computational Intelligence (CI) methods. AI techniques refer to methods from the domain of classical AI, such as statistical modeling, while CI techniques refer to nature-inspired methods that are used to deal with complex problems that classical methods are unable to solve. Important CI methodologies are evolutionary computation, fuzzy logic, artificial neural networks, and artificial immune systems. CI is different from the well-known field of AI: AI handles symbolic knowledge representation, while CI handles numeric representation of information. Although the boundary between these two categories is not always clear and many hybrid methods have been proposed in the literature, most previous work is based mainly on one category or the other. Moreover, it would be quite useful to understand how well nature-based techniques perform in contrast to classical methods.
1) AI-BASED TECHNIQUES
Laskov et al. develop an experimental framework for comparative analysis of supervised (classification) and unsupervised (clustering) learning techniques for detecting malicious activities. The supervised methods evaluated in this work include decision trees, k-Nearest Neighbor (kNN), Multi-Layer Perceptron (MLP), and Support Vector Machines (SVM). The unsupervised algorithms include the γ-algorithm, k-means clustering, and single linkage clustering. They define two scenarios for evaluating the learning algorithms from both categories. In the first scenario, they assume that training and test data come from the same unknown distribution. In the second scenario, they consider the case where the test data comes from new (i.e., unseen) attack patterns. This scenario helps us understand how well an IDS can generalize its knowledge to new malicious patterns, which is essential for an IDS.
  • 20. Today’s sophisticated adversaries tend to use several intrusion patterns to evade modern IDS. The results show that the supervised algorithms in general achieve better classification accuracy on data with known attacks (the first scenario). Among these algorithms, the decision tree has achieved the best results (95% true positive rate, 1% false positive rate). The next two best algorithms are the MLP and the SVM, followed by the k-nearest neighbor algorithm. However, if there are unseen attacks in the test data, the detection rate of supervised methods decreases significantly. This is where the unsupervised techniques perform better, as they do not show a significant difference in accuracy between seen and unseen attacks. Figure 2.2.1 shows the average true/false positive rates of all methods evaluated. As the plots show, the supervised techniques generally perform better, although unsupervised methods give more robust results across both scenarios.
Zanero and Savaresi introduce a two-tier anomaly-based architecture for IDS in TCP/IP networks based on unsupervised learning. The first tier is an unsupervised clustering algorithm, which builds small-size patterns from the payload of network packets. In other words, TCP or UDP packets are assigned to two clusters representing normal and abnormal traffic. The second tier is an optimized traditional anomaly detection algorithm improved by the availability of data on the packet payload content. The motivation behind the work is that unsupervised learning methods are usually more powerful than supervised methods in generalizing attack patterns, so such an architecture may resist polymorphic attacks more effectively.
Lee and Stolfo build a classifier to detect anomalies in networks using data mining techniques.
Figure 2.2.1: Average detection rates for the methods evaluated in Pavel Laskov, Patrick Düssel, Christin Schäfer, and Konrad Rieck, “Learning intrusion detection: Supervised or unsupervised?”, in Image Analysis and Processing ICIAP 2005, volume 3617 of Lecture Notes in Computer Science, pages 50–57, Springer Berlin Heidelberg, 2005, in two scenarios: test data contains only known attacks (left) and test data contains unknown attacks (right).
They implement two general data mining algorithms that are essential in describing the normal behaviour of a program or user. They propose an agent-based architecture for intrusion detection systems, where the learning agents
  • 21. continuously compute and provide the updated detection models to the agents. They conduct experiments on Sendmail system call data and network tcpdump data to demonstrate the effectiveness of their classification models in detecting anomalies. They finally argue that the most important challenge of using data mining approaches in intrusion detection is that they require a large amount of audit data in order to compute the profile rule sets.
Sommer and Paxson study the imbalance between the extensive amount of research on ML-based intrusion detection and the lack of operational deployments of such systems. They identify challenges particular to network intrusion detection and provide a set of guidelines for strengthening future research on ML-based intrusion detection. More specifically, they argue that an anomaly-based IDS requires outlier detection, while the classic application of ML is a classification problem that deals with finding similarities between activities. It is true that in some cases an outlier detection problem can be modeled as a classification problem with two classes: normal and abnormal. However, in machine learning one needs to train a system with training patterns of all classes, while in anomaly detection one can only train on normal patterns. This means that ML is better suited to finding variations of known attacks than previously unknown malicious activity. This is why ML methods have been applied to spam detection more effectively than to intrusion detection.
2) CI-BASED TECHNIQUES
In this section, we review several algorithms based on the four core techniques of computational intelligence.
• Genetic Algorithms (GA)
Genetic algorithms are aimed at finding optimal solutions to problems.
Each potential solution to a problem is represented as a sequence of bits (genes) called a genome or chromosome. A genetic algorithm begins with a set of genomes (population) and an evaluation function called fitness function that measures the quality (goodness) of each genome. The algorithm uses two reproduction operators called crossover and mutation to create new descendants (solutions), which are then evaluated. Crossover determines how various properties of the parents in a population are inherited by the descendants. Mutation is the spontaneous alteration of a single gene. Sinclair et al. use genetic algorithms and decision trees to create rules for an intrusion detection expert system, which supports the analyst’s job in differentiating anomalous network activity from normal network traffic. In this work, GA is used to evolve simple rules for network traffic. Each rule is represented by a genome and the initial population of genomes is a set of
  • 22. random rules. Each genome comprises 29 genes: 8 for source IP, 8 for destination IP, 6 for source port, 6 for destination port, and 1 for protocol. The fitness function is based on the actual performance of each rule on a preclassified data set. An analyst marks each connection in a data set as either normal or abnormal. The system uses analyst-created training sets for rule development and analyst decision support. If a rule completely matches an abnormal connection, it is rewarded with a bonus, and if it matches a normal connection, it is penalized. Hence, the generations are biased toward rules that match intrusive connections only. Once the genetic algorithm reaches a certain number of generations, it stops and the best genomes (i.e., rules) are selected. The generated rule set can be used as knowledge inside the IDS for judging whether a network connection and related behaviours are potential intrusions. The traditional GA tends to converge to a single best solution, called the global maximum. Since the algorithm requires a set of unique best rules, a nature-inspired technique called niching is used, which attempts to create subpopulations that converge on local maxima.
Li describes a few disadvantages of the proposed algorithm and defines a new technique for defining IDS rules. Li argues that in order to detect intrusive behaviours for a local network, network connections should be used to define normal and abnormal behaviours. An attack can sometimes be as simple as scanning for available ports on a server or a password-guessing scheme. Typically, however, attacks are complex and are generated by automated tools. So, one needs to use temporal and spatial information of network connections to define IDS rules that can classify complex anomalous activities using an efficient genetic algorithm.
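The rule-evolution loop described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the implementation of Sinclair et al.: the genome is simplified to four fields (source IP, destination IP, destination port, protocol) instead of 29 genes, a `None` gene acts as a wildcard, niching is omitted, and all names (`matches`, `fitness`, `evolve`, `DATA`) are hypothetical.

```python
import random

WILDCARD = None  # assumption: a gene value that matches any connection field


def matches(rule, conn):
    """A rule matches when every non-wildcard gene equals the connection field."""
    return all(g is WILDCARD or g == c for g, c in zip(rule, conn))


def fitness(rule, labelled):
    """Reward matches on abnormal connections, penalise matches on normal ones."""
    score = 0
    for conn, is_abnormal in labelled:
        if matches(rule, conn):
            score += 1 if is_abnormal else -1
    return score


def crossover(a, b, rng):
    """Single-point crossover: the child inherits a prefix of a and a suffix of b."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(rule, gene_values, rng, rate=0.1):
    """Spontaneously replace each gene with a random allowed value with probability rate."""
    return tuple(
        rng.choice(gene_values[i]) if rng.random() < rate else gene
        for i, gene in enumerate(rule)
    )


def evolve(labelled, generations=30, pop_size=20, seed=0):
    rng = random.Random(seed)
    n_genes = len(labelled[0][0])
    # Allowed values per gene: everything seen in the data, plus the wildcard.
    gene_values = [
        sorted({conn[i] for conn, _ in labelled}, key=str) + [WILDCARD]
        for i in range(n_genes)
    ]
    # Initial population: a set of random rules.
    population = [tuple(rng.choice(v) for v in gene_values) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda r: fitness(r, labelled), reverse=True)
        parents = population[: pop_size // 2]  # keep the fitter half
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng),
                   gene_values, rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=lambda r: fitness(r, labelled))


# Toy analyst-labelled connections: (src IP, dst IP, dst port, protocol), abnormal?
DATA = [
    (("10.0.0.9", "10.0.0.1", 23, "tcp"), True),
    (("10.0.0.9", "10.0.0.2", 23, "tcp"), True),
    (("10.0.0.9", "10.0.0.3", 23, "tcp"), True),
    (("10.0.0.5", "10.0.0.1", 80, "tcp"), False),
    (("10.0.0.6", "10.0.0.2", 443, "tcp"), False),
    (("10.0.0.7", "10.0.0.1", 53, "udp"), False),
]
best_rule = evolve(DATA)
print(best_rule, fitness(best_rule, DATA))
```

Because the fitter half of each generation is carried over unchanged, the best fitness in the population never decreases. A niching variant would additionally penalise rules that duplicate one another so that several distinct rules survive.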
• Artificial Neural Networks (ANN)
A neural network consists of a collection of processing units called neurons that are highly interconnected according to a given topology. ANN have the ability to learn by example and to generalize from limited, noisy, and incomplete data. They have been successfully employed in a broad spectrum of data-intensive applications. Mukkamala et al. describe approaches to intrusion detection using neural networks and Support Vector Machines (SVM). Their goal is to discover patterns or features that describe user behaviour in order to build classifiers for recognizing anomalies. SVM are supervised learning machines that represent the training vectors in a high-dimensional feature space and label each vector by its class. SVM seek to maximize the margin (separation) between different classes in order to minimize the generalization error, which is the amount of error in the classification of unknown vectors. SVM
  • 23. classify data by determining a set of training points, called support vectors, that define a hyperplane in feature space. Mukkamala et al. use an SVM for non-linear classification of feature vectors in an IDS. The SVM is trained with 7312 data points and tested with 6980 test points from KDD. Each point lies in a 41-dimensional space and the training is done using the radial basis function (RBF). The RBF is used to approximate the non-linear hyperplane that separates the normal and abnormal classes. Using this SVM, they reach an accuracy of 99.5% in the classification of test points. They also use three multilayer feed-forward ANN to classify the same test points. The ANN are trained using the same 7312-point training set. The best result from experimenting with the different ANN architectures is a detection rate of 99.25%. The authors conclude that although their SVM IDS shows higher detection rates than their ANN, SVM can only be used for binary classification, which is a big limitation for IDS that require multiple classes.
• Fuzzy Logic
Fuzzy logic is an approach to computing based on degrees of truth rather than the usual true-or-false Boolean logic on which modern computers are based. With fuzzy spaces, fuzzy logic allows an object to belong to different classes at the same time. This makes fuzzy logic a good choice for intrusion detection, because security itself includes fuzziness and the boundary between normal and anomalous behaviour is not well defined. Moreover, the intrusion detection problem involves many numeric attributes in the collected data, as well as various derived statistical measures. Building models directly on numeric data usually causes high detection errors. A behaviour that deviates only slightly from a model may not be detected, or a small change in normal behaviour may cause a false positive.
With fuzzy logic, it is possible to model these small deviations to keep the false positive/negative rates small. Every fuzzy rule has the following general form: IF condition THEN conclusion [weight], where condition is a fuzzy expression defined using fuzzy logic operators such as fuzzy AND and fuzzy OR, conclusion is an atomic expression, and weight is a real number in [0,1] that expresses the confidence of the rule. Gomez and Dasgupta show that with fuzzy logic, the false alarm rate in determining intrusive activities can be reduced. They define a set of fuzzy rules to describe the normal and abnormal behaviour in a computer network, and a fuzzy inference engine to determine intrusions. They use a genetic algorithm to generate fuzzy classifiers, each of which is a set of fuzzy rules in the form defined above. Each fuzzy rule is represented by a genome and the GA is used to find the best genomes (fuzzy rules) to be added to the fuzzy
  • 24. classifier. The authors conducted experiments using the KDD evaluation data to classify 22 different types of attacks into 4 intrusion classes: denial of service (DoS), unauthorized access from a remote machine (R2L), unauthorized access to local superuser (root) privileges (U2R), and probing (PRB). The results show that their algorithm achieves an overall true positive rate of 98.95% and a false positive rate of 7%.
• Artificial Immune Systems (AIS)
Natural immune systems consist of molecules, cells, and tissues that establish the body’s resistance to infections caused by pathogens such as bacteria, viruses, and parasites. They distinguish pathogens from self-cells and eliminate the pathogens. This provides a great source of inspiration for computer security systems, especially IDS. An artificial immune system is a computationally intelligent system based on the behaviour of natural immune systems. The first immune-inspired model applicable to various computer security problems was proposed by Hofmeyr and Forrest. Their model is specialized to detect intrusions in local area networks based on TCP/IP. They build a database containing normal sequences of system calls that acts as the self-definition of the normal behaviour of a program and as the basis for detecting anomalies. Each TCP connection is modeled by a triple, which encodes the address of the sender, the address of the receiver, and the port number of the receiver. Detectors are generated randomly through a negative selection algorithm (NSA). In addition to the NSA, which results in a signal to stimulate or tolerate the immune response, they use a second signal (called co-stimulation) to confirm the anomaly that was detected through the negative selection procedure. In this system, a human is required to generate this signal manually in order to reduce the false alarms (autoimmunity) of the system. Kim et al. 
provide an introduction to and analysis of the key developments within the field of immune-inspired computer security, as well as suggestions for future research. They summarize six immune features that are desirable for an effective IDS: distributed, multi-layered, self-organized, lightweight, diverse, and disposable. They explain that the human immune system is distributed through immune networks and that it generates unique antibody sets to provide the first four requirements. It is self-organized through gene library evolution, negative selection, and clonal selection. Finally, it is lightweight through approximate binding, memory cells, and gene expression to increase efficiency. Zamani et al. describe an artificial immune algorithm for intrusion detection in distributed systems based on danger theory, an immunological model based on the idea that the immune system does not distinguish between self
  • 25. and non-self, but rather between events that cause damage and those that do not. The authors propose that a multi-agent environment that computationally emulates the behaviour of natural immune systems is effective in reducing false positive rates. They show the effectiveness of their model in practice by performing a case study on the problem of detecting distributed denial-of-service attacks in wireless sensor networks.
Dasgupta proposes a multi-agent IDS based on AIS. He defines three types of agents: monitoring agents that roam around the network and monitor various parameters simultaneously at multiple levels (user to packet level), communicator agents that play the role of the signals between immune cells called lymphokines, and decision/action agents that make decisions based on collected local warning signals. The role of each type of agent is unique, though the agents may work in collaboration. This work unfortunately does not provide any experimental results, making it difficult for the reader to compare the performance of the proposed system with other ML-based IDS.
2.2.2 CONCLUSION
We reviewed several influential algorithms for intrusion detection based on various machine learning techniques. The characteristics of ML techniques make it possible to design IDS that have high detection rates and low false positive rates while quickly adapting to changing malicious behaviours. We divided these algorithms into two types of ML-based schemes: Artificial Intelligence (AI) and Computational Intelligence (CI). Although these two categories of algorithms share many similarities, several features of CI-based techniques, such as adaptation, fault tolerance, high computational speed, and error resilience in the face of noisy information, conform to the requirements of building efficient intrusion detection systems.
2.3 ANOMALY-BASED NETWORK INTRUSION DETECTION: TECHNIQUES, SYSTEMS AND CHALLENGES
The Internet and computer networks are exposed to an increasing number of security threats. With new types of attacks appearing continually, developing flexible and adaptive security-oriented approaches is a serious challenge. In this context, anomaly-based network intrusion detection techniques are a valuable technology to protect target systems and networks against malicious activities. However, despite the variety of such methods described in the literature in recent years, security tools incorporating anomaly detection functionalities are just starting to appear, and several important problems remain to be solved.
Noteworthy work has been carried out by CIDF (“Common Intrusion Detection Framework”), a working group created by DARPA in 1998 mainly oriented towards
  • 26. coordinating and defining a common framework in the IDS field. Integrated within IETF in 2000, and having adopted the new acronym IDWG (“Intrusion Detection Working Group”), the group defined a general IDS architecture based on the consideration of four types of functional modules (Figure 2.3.1):
• E blocks (“Event-boxes”): This kind of block is composed of sensor elements that monitor the target system, thus acquiring information events to be analyzed by other blocks.
• D blocks (“Database-boxes”): These are elements intended to store information from E blocks for subsequent processing by A and R boxes.
• A blocks (“Analysis-boxes”): Processing modules for analyzing events and detecting potential hostile behaviour, so that some kind of alarm will be generated if necessary.
• R blocks (“Response-boxes”): The main function of this type of block is the execution, if any intrusion occurs, of a response to thwart the detected menace.
Other key contributions in the IDS field concern the definition of protocols for data exchange between components (e.g. IDXP, “Intrusion Detection eXchange Protocol”, RFC 4767), and the format considered for this (e.g. IDMEF, “Intrusion Detection MEssage Format”, RFC 4765).
Depending on the information source considered (E boxes in Figure 2.3.1), an IDS may be either host-based or network-based. A host-based IDS analyzes events such as process identifiers and system calls, mainly related to OS information. On the other hand, a network-based IDS analyzes network-related events: traffic volume, IP addresses, service ports, protocol usage, etc. This paper focuses on the latter type of IDS. Depending on the type of analysis carried out (A blocks in Figure 2.3.1), intrusion detection systems are classified as either signature-based or anomaly-based.
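To make the division of labour between the four block types concrete, the following minimal sketch chains an E, D, A and R block together. It is an illustration only: the CIDF architecture does not prescribe any of these names or data structures, and the port-scan rule used in the A block (a source touching more than `max_ports` distinct ports) is a hypothetical stand-in for a real analysis module.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    src_ip: str
    dst_port: int


def e_box(raw_observations):
    """E block: sensors turn raw observations into events."""
    return [Event(src, port) for src, port in raw_observations]


@dataclass
class DBox:
    """D block: stores events for later processing by A and R blocks."""
    events: list = field(default_factory=list)

    def store(self, events):
        self.events.extend(events)


def a_box(events, max_ports=3):
    """A block: raise an alarm for sources that touch too many distinct ports.

    The rule itself is an assumption made for this example.
    """
    ports_by_source = {}
    for ev in events:
        ports_by_source.setdefault(ev.src_ip, set()).add(ev.dst_port)
    return sorted(s for s, ports in ports_by_source.items() if len(ports) > max_ports)


def r_box(alarms):
    """R block: produce a response for every alarm (here, a textual action)."""
    return [f"block {src}" for src in alarms]


raw = [("10.0.0.9", p) for p in (21, 22, 23, 25, 80)]
raw += [("10.0.0.5", 80), ("10.0.0.5", 443)]

store = DBox()
store.store(e_box(raw))
responses = r_box(a_box(store.events))
print(responses)
```

In a real deployment the blocks would exchange messages in a common format such as IDMEF rather than Python objects; the sketch only shows the direction of the data flow.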
Figure 2.3.1: General CIDF architecture for IDS systems.
Signature-based schemes (also denoted as misuse-based) seek defined patterns, or signatures, within the analyzed data. For this purpose, a signature database corresponding to known attacks is specified a priori. On the other hand, anomaly-based
  • 27. detectors attempt to estimate the “normal” behaviour of the system to be protected, and generate an anomaly alarm whenever the deviation between a given observation at an instant and the normal behaviour exceeds a predefined threshold. Another possibility is to model the “abnormal” behaviour of the system and to raise an alarm when the difference between the observed behaviour and the expected one falls below a given limit.
Signature-based and anomaly-based systems are similar in terms of conceptual operation and composition. The main differences between these methodologies are inherent in the concepts of “attack” and “anomaly”. An attack can be defined as “a sequence of operations that puts the security of a system at risk”. An anomaly is just “an event that is suspicious from the perspective of security”. Based on this distinction, the main advantages and disadvantages of each IDS type can be pointed out.
2.3.1 A-NIDS Techniques
Although different A-NIDS approaches exist (Estévez-Tapiador et al., 2004), in general terms all of them consist of the following basic modules or stages (Figure 2.3.2):
• Parameterization: In this stage, the observed instances of the target system are represented in a pre-established form.
• Training stage: The normal (or abnormal) behaviour of the system is characterized and a corresponding model is built. This can be done in very different ways, automatically or manually, depending on the type of A-NIDS considered.
• Detection stage: Once the model for the system is available, it is compared with the (parameterized) observed traffic. If the deviation found exceeds (or is below, in the case of abnormality models) a given threshold, an alarm will be triggered (Estévez-Tapiador et al., 2004).
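The three stages listed above can be sketched as three small functions. This is a minimal illustration under assumptions of our own: a traffic window is represented as a list of packet sizes, the parameterized form is the pair (packet count, mean packet size), the model is a per-feature mean and standard deviation, and the threshold of 3.0 is arbitrary. None of these choices comes from the surveyed work.

```python
from statistics import mean, pstdev


def parameterize(window):
    """Parameterization: represent a window of packet sizes as (count, mean size)."""
    return (len(window), mean(window))


def train(normal_windows):
    """Training: characterise normal behaviour as (mean, std) for each feature."""
    features = [parameterize(w) for w in normal_windows]
    columns = list(zip(*features))
    # Fall back to 1.0 when a feature never varies, to avoid dividing by zero.
    return [(mean(col), pstdev(col) or 1.0) for col in columns]


def deviation(model, window):
    """Largest per-feature distance from the normal profile, in standard deviations."""
    feature = parameterize(window)
    return max(abs(x - m) / s for x, (m, s) in zip(feature, model))


def detect(model, window, threshold=3.0):
    """Detection: raise an alarm when the deviation exceeds the threshold."""
    return deviation(model, window) > threshold


normal_traffic = [[100] * 10, [110] * 11, [90] * 9, [105] * 10]
model = train(normal_traffic)

print(detect(model, [100] * 10))   # a window resembling the training data
print(detect(model, [60] * 200))   # a burst of many small packets
```

For an abnormality model the comparison in `detect` would be reversed: an alarm is raised when the deviation from the modelled attack behaviour falls below the threshold.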
Figure 2.3.2: Generic A-NIDS functional architecture.
According to the type of processing related to the “behavioural” model of the target system, anomaly detection techniques can be classified into three main categories (Lazarevic et al., 2005) (see Figure 2.3.3): statistical-based, knowledge-based, and
  • 28. machine learning-based. In the statistical-based case, the behaviour of the system is represented from a random viewpoint. On the other hand, knowledge-based A-NIDS techniques try to capture the claimed behaviour from available system data (protocol specifications, network traffic instances, etc.). Finally, machine learning A-NIDS schemes are based on the establishment of an explicit or implicit model that allows the patterns analyzed to be categorized.
Figure 2.3.3: Classification of the anomaly detection techniques according to the nature of the processing involved in the “behavioural” model considered.
Two key aspects concern the evaluation, and thus the comparison, of the performance of alternative intrusion detection approaches: these are the efficiency of the detection process, and the cost involved in the operation. Without underestimating the importance of the cost, at this point the efficiency aspect must be emphasized. Four situations exist in this context, corresponding to the relation between the result of the detection for an analyzed event (“normal” vs. “intrusion”) and its actual nature (“innocuous” vs. “malicious”). These situations are: false positive (FP), if the analyzed event is innocuous (or “clean”) from the perspective of security, but it is classified as malicious; true positive (TP), if the analyzed event is correctly classified as intrusion/malicious; false negative (FN), if the analyzed event is malicious but it is classified as normal/innocuous; and true negative (TN), if the analyzed event is
  • 29. correctly classified as normal/innocuous. It is clear that low FP and FN rates, together with high TP and TN rates, will result in good efficiency values.
The fundamentals of statistical, knowledge-based and machine learning-based A-NIDS, as well as the principal subtypes of each, are described below. The main features of all are summarized in Table 2.3.1. Above and beyond other possibilities, the question of efficiency should be a prime consideration in selecting and implementing A-NIDS methodologies.
1) Statistical-based A-NIDS techniques
In statistical-based techniques, the network traffic activity is captured and a profile representing its stochastic behaviour is created. This profile is based on metrics such as the traffic rate, the number of packets for each protocol, the rate of connections, the number of different IP addresses, etc. Two datasets of network traffic are considered during the anomaly detection process: one corresponds to the currently observed profile over time, and the other is for the previously trained statistical profile.
Apart from their inherent features for use as anomaly-based techniques, statistical A-NIDS approaches have a number of virtues. Firstly, they do not require prior knowledge about the normal activity of the target system; instead, they have the ability to learn the expected behaviour of the system from observations. Secondly, statistical methods can provide accurate notification of malicious activities occurring over long periods of time. However, some drawbacks should also be pointed out. First, this kind of A-NIDS is susceptible to being trained by an attacker in such a way that the network traffic generated during the attack is considered normal.
Second, setting the values of the different parameters/metrics is a difficult task, especially because the balance between false positives and false negatives is affected.
Table 2.3.1: Fundamentals of the A-NIDS techniques
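The balance between false positives and false negatives mentioned above can be illustrated by counting the four outcomes (TP, FP, TN, FN) at two different thresholds. The anomaly scores, labels and threshold values below are invented for the example, and the function names are hypothetical.

```python
def confusion(scores, malicious, threshold):
    """Count TP, FP, TN, FN when an alarm is raised for scores above the threshold."""
    tp = fp = tn = fn = 0
    for score, is_malicious in zip(scores, malicious):
        alarm = score > threshold
        if alarm and is_malicious:
            tp += 1
        elif alarm:
            fp += 1
        elif is_malicious:
            fn += 1
        else:
            tn += 1
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}


def rates(c):
    """True positive rate and false positive rate from the four counts."""
    return {
        "TPR": c["TP"] / (c["TP"] + c["FN"]),
        "FPR": c["FP"] / (c["FP"] + c["TN"]),
    }


# Invented anomaly scores and ground truth for seven events.
scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.9]
malicious = [False, False, False, True, False, True, True]

strict = confusion(scores, malicious, threshold=0.5)
loose = confusion(scores, malicious, threshold=0.3)
print(strict, rates(strict))
print(loose, rates(loose))
```

Lowering the threshold from 0.5 to 0.3 removes the single false negative but doubles the number of false positives, which is the trade-off a parameter setting has to resolve.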
  • 30. 2) Knowledge-based techniques
The so-called expert system approach is one of the most widely used knowledge-based IDS schemes. However, like other A-NIDS methodologies, expert systems can also be classified into other, different categories. Expert systems are intended to classify the audit data according to a set of rules, involving three steps. First, different attributes and classes are identified from the training data. Second, a set of classification rules, parameters or procedures is deduced. Third, the audit data are classified accordingly.
More restrictive in some senses are specification-based anomaly methods, for which the desired model is manually constructed by a human expert, in terms of a set of rules (the specifications) that seek to determine legitimate system behaviour. If the specifications are complete enough, the model will be able to detect illegitimate behavioural patterns. Moreover, the number of false positives is reduced, mainly because this kind of system avoids the problem of harmless activities, not previously observed, being reported as intrusions. Specifications could also be developed by using some kind of formal tool.
3) Machine learning-based A-NIDS schemes
Machine learning techniques are based on establishing an explicit or implicit model that enables the patterns analyzed to be categorized. A singular characteristic of these schemes is the need for labelled data to train the behavioural model, a procedure that places severe demands on resources. In many cases, the applicability of machine learning principles coincides with that for the statistical techniques, although the former is focused on building a model that improves its performance on the basis of previous results. Hence, a machine learning A-NIDS has the ability to change its execution strategy as it acquires new information.
Although this feature could make it desirable to use such schemes for all situations, the major drawback is their resource-expensive nature. Several machine learning-based schemes have been applied to A-NIDS. Some of the most important are cited below, and their main advantages and drawbacks are identified.
• Bayesian networks
A Bayesian network is a model that encodes probabilistic relationships among variables of interest. This technique is generally used for intrusion detection in combination with statistical schemes, a procedure that yields several advantages, including the capability of encoding interdependencies between variables and of predicting events, as well as the ability to incorporate both prior knowledge and data.
  • 31. However, a serious disadvantage of using Bayesian networks is that their results are similar to those derived from threshold-based systems, while considerably higher computational effort is required. Although the use of Bayesian networks has proved to be effective in certain situations, the results obtained are highly dependent on the assumptions about the behaviour of the target system, and so a deviation in these hypotheses leads to detection errors, attributable to the model considered.
• Markov models
A Markov chain is a set of states that are interconnected through certain transition probabilities, which determine the topology and the capabilities of the model. During a first training phase, the probabilities associated with the transitions are estimated from the normal behaviour of the target system. The detection of anomalies is then carried out by comparing the anomaly score obtained for the observed sequences with a fixed threshold. Markov-based techniques have been extensively used in the context of host IDS, normally applied to system calls. In all cases, the model derived for the target system has provided a good approach for the claimed profile, while, as in Bayesian networks, the results are highly dependent on the assumptions about the behaviour accepted for the system.
• Neural networks
With the aim of simulating the operation of the human brain, neural networks have been adopted in the field of anomaly intrusion detection, mainly because of their flexibility and adaptability to environmental changes. However, a common characteristic in the proposed variants, from recurrent neural networks to self-organizing maps (Ramadas et al., 2003), is that they do not provide a descriptive model that explains why a particular detection decision has been taken.
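The Markov-chain scheme described above (transition probabilities estimated from normal behaviour, then an anomaly score compared with a fixed threshold) can be sketched as follows. The system-call traces, the choice of average negative log-probability as the score, the floor probability for unseen transitions, and the threshold are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train(sequences):
    """Estimate first-order transition probabilities from normal sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return {
        state: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for state, c in counts.items()
    }


def anomaly_score(model, seq, floor=1e-6):
    """Average negative log-probability of the transitions in seq.

    Transitions never seen in training get the small probability `floor`.
    """
    logs = [
        -math.log(model.get(current, {}).get(following, floor))
        for current, following in zip(seq, seq[1:])
    ]
    return sum(logs) / len(logs)


# Invented system-call traces standing in for the normal behaviour of a program.
normal_traces = [
    ["open", "read", "read", "close"],
    ["open", "read", "write", "close"],
    ["open", "write", "close"],
]
model = train(normal_traces)

THRESHOLD = 5.0  # assumption: fixed threshold chosen by hand
usual = anomaly_score(model, ["open", "read", "close"])
odd = anomaly_score(model, ["open", "exec", "socket", "close"])
print(usual > THRESHOLD, odd > THRESHOLD)
```

The score depends entirely on the traces used for training, which is the dependence on assumptions about system behaviour noted in the text.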
• Fuzzy logic techniques: Fuzzy logic is derived from fuzzy set theory, under which reasoning is approximate rather than precisely deduced from classical predicate logic. Fuzzy techniques are thus used in the field of anomaly detection mainly because the features to be considered can be seen as fuzzy variables. This kind of processing scheme considers an observation as normal if it lies within a given interval. Although fuzzy logic has proved to be effective, especially against port scans and probes, its main disadvantage is the high resource consumption involved. It should also be noted that fuzzy logic is controversial in some circles: it has been rejected by some engineers and by most statisticians, who hold that probability is the only rigorous mathematical description of uncertainty.
• Genetic algorithms: Genetic algorithms are categorized as global search heuristics, and are a particular class of evolutionary algorithms that use techniques inspired by
evolutionary biology such as inheritance, mutation, selection and recombination. Thus, genetic algorithms constitute another type of machine learning-based technique, capable of deriving classification rules and/or selecting appropriate features or optimal parameters for the detection process. The main advantage of this subtype of machine learning A-NIDS is the use of a flexible and robust global search method that converges to a solution from multiple directions, whilst no prior knowledge about the system behaviour is assumed. Its main disadvantage is the high resource consumption involved.
• Clustering and outlier detection: Clustering techniques work by grouping the observed data into clusters, according to a given similarity or distance measure. The most commonly used procedure consists in selecting a representative point for each cluster. Then, each new data point is classified as belonging to a given cluster according to its proximity to the corresponding representative point. Some points may not belong to any cluster; these are named outliers and represent the anomalies in the detection process. Clustering techniques determine the occurrence of intrusion events only from the raw audit data, and so the effort required to tune the IDS is reduced.
4) Additional considerations on A-NIDS processing. KDD and data mining
In addition to the A-NIDS techniques described above, there are others that may help in the task of dealing with the amount of information contained within a dataset. Two of these techniques are principal component analysis (PCA) and association rule discovery. PCA is a technique that is used to reduce the complexity of a dataset. It is not a detection scheme itself but an auxiliary one.
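As a rough illustration of PCA used as an auxiliary reduction step, the sketch below reduces n = 2 correlated features to d = 1. It is restricted to two features so that the leading eigenvector of the covariance matrix has a closed form; the function name `pca_2d_to_1d` and the sample data are hypothetical, and a real deployment would normally rely on a numerical library for larger n.

```python
import math


def pca_2d_to_1d(points):
    """Project 2-D points onto their first principal component.

    Returns the unit direction of maximum variance and the 1-D scores.
    """
    n = len(points)
    if n < 2:
        raise ValueError("at least two points are needed")

    # Centre the data on the mean of each feature.
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 sample covariance matrix.
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)

    # Angle of the leading eigenvector of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    direction = (math.cos(theta), math.sin(theta))

    # Each score is a linear combination of the original two features.
    scores = [x * direction[0] + y * direction[1] for x, y in centered]
    return direction, scores
```

Calling `pca_2d_to_1d` on two strongly correlated traffic features (for example, bytes sent and packets sent) would return a single derived variable per record, which a detector could then consume in place of the original pair.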
A given data collection (or dataset), obtained by means of the different sensors in the target environment, becomes more and more extensive and complex as the number of different services and the speed of the networks grow. To simplify the dataset, PCA performs a change of basis in which the n correlated variables are re-expressed so that the number of variables is reduced to d < n; the new variables are both uncorrelated and linear combinations of the original ones. This makes it possible to express the data in a reduced form, thus facilitating the detection process. To conclude the present section, let us present an important discussion of A-NIDS techniques. During recent decades several scientific communities have contributed to analyzing information from high volume databases. However, in the 1990s, KDD (“Knowledge Discovery in Databases”) burst onto the scene,
to “identify new, valid, potentially useful and comprehensible patterns for data”.
2.3.2 AVAILABLE A-NIDS SYSTEMS
This section describes several reported endeavours in the development and deployment of A-NIDS platforms in real network environments. The analysis is split into two categories: available platforms, commercial or freeware, and research systems. Commercial systems tend to use well-proven techniques, and so they do not usually consider the A-NIDS techniques most recently proposed in the specialized literature. In fact, most of them include a signature-based detection module as the core of the detection platform.
2.3.2.1 A-NIDS platforms
In recent years, a number of important actions have focused on implementing A-NIDS techniques in real security platforms. Currently available IDS software tools in this line include Snort (www.snort.org), Prelude (www.prelude-ids.org), and N@G (www.ncb.ernet.in/nag). Although anomaly-based detection techniques are not yet mature, they are beginning to appear in commercial and open source products. Furthermore, in recent years, some pioneering systems and businesses in the A-NIDS field have been acquired by bigger companies, and their products incorporated into more general and integral network security platforms. More recent systems make use of a distributed architecture for intrusion detection by incorporating agents (or sensors), and a central console to supervise the overall detection process. This is the case of the SecurityFocus DeepSight Threat Management System, now part of DeepNines BBX Intrusion Prevention, which uses a statistical approach to detect potential Internet threats. Data are collected by distributed sensors, which include intrusion detection capabilities. The sensors report current network scans and attacks to the controller, providing a global detection capability.
Most of the platforms perform further analysis on the monitored data, related to audit, tracing and forensic capabilities. Additionally, they may trigger some kind of response to detected attacks, namely an interaction with firewalls, the reset of TCP connections, the use of honey systems, etc. More advanced platforms include the Protocol Anomaly Detection (PAD) technique, which is based on the detection of anomalies in the use of protocols. This kind of analysis is adopted in BarbedWire IDS, DeepNines BBX, N@G, and Strata Guard. PAD combines specification-based and statistical characterization A-NIDS techniques to model the behaviour of a given protocol. This can be complemented by using additional A-NIDS techniques.
2.3.2.2 A-NIDS research-related environments
Although some of the above-mentioned A-NIDS platforms are also usable for research purposes, others have been specifically developed for this. Unlike “commercial” A-NIDS systems, research-oriented environments include more innovative anomaly detection techniques. Conceived as research platforms, these systems enable the integration of contributed modules performing additional detection techniques. This is also the case of Snort and Prelude, two of the most widely deployed NIDS tools today. Another observed tendency is the consideration of intrusion prevention procedures or IPS (Intrusion Prevention System), that is, inline IDS schemes that filter and analyze all the network traffic accessing the target environment. This has two main consequences. On one hand, most projects have a structured architecture in which various detectors can work jointly, typically in a distributed way (e.g. EMERALD, AAFID, GIDRE). On the other hand, as the detectors are now “pluggable” modules, a specialization of their functions and capabilities can be observed. Thus, individual detectors are designed to monitor only a specific protocol or behaviour (e.g. Anagram targets HTTP payloads), and the global detection capabilities of the platform result from combining and correlating the information from different detectors.
2.3.3 OPEN ISSUES AND CHALLENGES
Intrusion detection techniques are continuously evolving, with the goal of improving the security and protection of networks and computer infrastructures. Despite the promising nature of anomaly-based IDS, as well as its relatively long existence, there still exist several open issues regarding these systems.
Some of the most significant challenges in the area are: • Low detection efficiency, especially due to the high false positive rate usually obtained (Axelsson, 2000). This aspect is generally explained as arising from the lack of good studies on the nature of the intrusion events. The problem calls for the exploration and development of new, accurate processing schemes, as well as better structured approaches to modelling network systems. • Low throughput and high cost, mainly due to the high data rates (Gbps) that characterize current wideband transmission technologies (Kruegel et al., 2002). Some proposals intended to optimize intrusion detection are concerned with grid techniques and distributed detection paradigms. • The absence of appropriate metrics and assessment methodologies, as well as a general framework for evaluating and comparing alternative IDS techniques (Stolfo and Fan, 2000; Gaffney and Ulvila, 2001). Due to the importance of this issue, it is analyzed in greater depth below. • The analysis of ciphered data, although this is also a general problem faced by all intrusion detection platforms. Moreover, this problem could be
dealt with by simply locating the detection agents at those functional points in the system where data are available in “plaintext” format, and for which the corresponding detection analysis can be carried out without special restrictions.
A-NIDS assessment
One of the main challenges that researchers must face, when trying to implement and validate a new intrusion detection method, is to assess it and compare its performance with that of other available approaches. It should be noted that this task is not restricted to A-NIDS, but is also applicable to NIDS in general. The need for test-beds that provide robust and reliable metrics to quantify NIDS has been suggested. Although some authors defend a testing methodology in real environments, most of them advocate an evaluation procedure in experimental environments. An advantage of assessment in real environments is that the traffic is sufficiently realistic; however, this approach is subject to:
(a) The risk of potential attacks
(b) The possible interruption of the system operation due to simulated attacks
On the other hand, the evaluation of NIDS methodologies in experimental environments involves the generation of synthetic traffic as well as background traffic representing legal users, which is far from being a trivial undertaking.
2.3.4 SUMMARY
The present paper discusses the foundations of the main A-NIDS technologies, together with their general operational architecture, and provides a classification for them according to the type of processing related to the “behavioural” model for the target system. Another valuable aspect of this study is that it describes, in a concise way, the main features of several currently available IDS systems/platforms. Finally, the most significant open issues regarding A-NIDS are identified, among which that of assessment is given particular emphasis.
2.4 INCREMENTAL ANOMALY-BASED INTRUSION DETECTION SYSTEM USING LIMITED LABELED DATA
With the proliferation of the internet and increased global access to online media, cybercrime is also occurring at an increasing rate. Currently, both personal users and companies are vulnerable to cybercrime. A number of tools including firewalls and Intrusion Detection Systems (IDS) can be used as defense mechanisms. A firewall acts as a checkpoint which allows packets to pass through according to predetermined conditions. In extreme cases, it may even disconnect all network traffic. An IDS, on the
other hand, automates the monitoring process in computer networks. The streaming nature of data in computer networks poses a significant challenge in building IDS. In this paper, a method is proposed to overcome this problem by performing online classification on datasets. In doing so, an incremental naive Bayesian classifier is employed. Furthermore, active learning makes it possible to solve the problem using a small set of labeled data points, which are often very expensive to acquire. The proposed method includes two groups of actions, i.e. offline and online. The former involves data preprocessing while the latter introduces the NADAL online method. The proposed method is compared to the incremental naive Bayesian classifier using the NSL-KDD standard dataset. The proposed method has three advantages: (1) overcoming the streaming data challenge; (2) reducing the high cost associated with instance labeling; and (3) improved accuracy and Kappa compared to the incremental naive Bayesian approach. Thus, the method is well-suited to IDS applications.
An attack refers to a set of actions that compromise the confidentiality, integrity, and availability of resources. A system is considered secure if it can guarantee these three criteria. Attacks must be identified before they do any harm to the organization. Even Local Area Networks (LAN) need to be able to withstand such attacks, since network performance is important in terms of bandwidth and other resources. The most common means of defense against potential attacks involves a two-layered system. The first layer comprises a firewall which controls access to the network, while the second layer is configured to detect threats that somehow manage to pass through the firewall and to take appropriate action to defend the network.
This second layer is known as an Intrusion Detection System (IDS), which is able to identify intrusion attempts by monitoring and analyzing network packets and logs. If an intrusion is detected, the system alerts the network administrator. With respect to information source, IDS are divided into two categories: host-based and network-based. Host-based methods tend to monitor and analyze internal computer operations, for instance by determining the resources that are allowed for each host as well as illegal access attempts. Network-based systems, in contrast, deal with intrusion at the network level. Anomalies at this level are often caused by external attackers whose aim is to gain unauthorized network access, steal information, and disrupt the network. There are certain challenges for anomaly detection systems. Unlike traditional data packets which are inherently static, data streams are continuous flows of data which cannot be stored; they must be analyzed as one unit.
2.4.1 RELATED WORK
Anomaly-based IDS have been extensively studied; however, few studies present an incremental approach. Incremental methods may be supervised, semi-supervised, or unsupervised. In this paper, supervised methods are considered which model the normality of the data. Here, the problem of anomaly detection is converted into one of classification.
• W.-Y. Yu and H.-M. Lee propose an incremental learning method by cascading a Service Classifier (SC) using Incremental Tree Inductive (ITI) learning. The cascading approach includes three steps: (1) training; (2) test; (3) incremental learning.
• In another study, Ren et al. propose a novel anomaly detection system which dynamically updates normal usage profiles. Upon encountering new behavior, density-based incremental clustering is used to insert the new behavior into old profiles. The authors report less sensitivity to data disruptions compared to Anomaly Detection with Fast Incremental Clustering (ADWICE) profiles. The approach also improves cluster quality and reduces false alarms; nevertheless, the method performs poorly on large datasets.
• Other authors propose the Reserved Set-Incremental Support Vector Machine (RS-ISVM), which is an improved incremental SVM for intrusion detection. In order to reduce the noise caused by large differences between feature values, the authors propose a modified kernel function known as U-RBF, which embeds feature means and root mean square differences in the RBF kernel. The authors claim that RS-ISVM alleviates the fluctuation phenomenon in the learning process while providing better and more reliable performance. However, it suffers from low U2R and R2L detection rates and requires a large number of parameters.
Many modern intrusion detection methods focus on feature selection or reduction.
This is because many features may be irrelevant or redundant and may inhibit system performance. Efficient naive Bayesian classifiers are applied to the reduced dataset to detect possible intrusions. Experimental results show that the selected features are more appropriate for designing IDS and result in more effective intrusion detection. In this paper, the naive Bayesian algorithm is evaluated using the NSL-KDD dataset to detect four types of attacks: Probe, DoS, U2R, and R2L. Feature reduction may use three standard feature selection methods: correlation, information gain, or gain ratio. The proposed method in this study employs feature vitality-based reduction. The results indicate that the proposed model provides better performance.
2.4.2 NAIVE BAYESIAN CLASSIFICATION
Naive Bayesian classification is a popular method for stream mining, because the model can be updated with new data very easily. The method is inherently incremental, since the model is updated as new data points arrive, which makes the algorithm well suited to stream mining. Assuming m classes, namely C1, C2, … , Cm, for a tuple X the classifier seeks the class with the highest posterior probability conditioned on X. Therefore, X belongs to Ci if and only if:
$$P(C_i \mid X) > P(C_j \mid X) \ \text{ for } 1 \le j \le m,\ j \ne i, \quad \text{where } P(C_i \mid X) = \frac{P(X \mid C_i)\,P(C_i)}{P(X)} \qquad (1)$$
Since P(X) remains constant for all classes, one must determine the class that maximizes the expression P(X|Ci) P(Ci). If prior probabilities are unknown, they are commonly regarded as being equal, i.e. P(C1) = P(C2) = … = P(Cm); hence, only P(X|Ci) must be maximized. Otherwise, the prior probabilities may be estimated using $P(C_i) = |C_{i,D}| / |D|$, where |Ci,D| is the number of training tuples with the label Ci and |D| is the total number of training tuples. Datasets with large numbers of features impose a high calculation cost for P(X|Ci). To reduce the calculations, the features are assumed to be conditionally independent given the class. Thus, the following is true:
$$P(X \mid C_i) = \prod_{k=1}^{n} P(X_k \mid C_i) = P(X_1 \mid C_i) \times P(X_2 \mid C_i) \times \dots \times P(X_n \mid C_i) \qquad (2)$$
Using the training tuples, the individual probabilities P(X1|Ci), P(X2|Ci), …, P(Xn|Ci) may be estimated.
2.4.3 ACTIVE LEARNING
Instead of inquiring about the correct labels for all instances, active learning determines how input instances are selectively labeled. Quite often, this approach requires considerably fewer labeled instances to learn a concept, compared to typical supervised methods. In active learning, once an instance is scanned, depending on the selected strategy, the algorithm requests the correct label and the predictive model is trained with the new instance.
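The incremental update that makes the naive Bayesian classifier of Section 2.4.2 suitable for streams can be sketched as follows. This is a minimal illustration, not the MOA implementation used in the paper: it assumes categorical features only, and the Laplace smoothing constant `alpha` as well as the class name `IncrementalNaiveBayes` are assumptions introduced here so that unseen feature values do not yield zero probabilities.

```python
import math
from collections import defaultdict


class IncrementalNaiveBayes:
    """Count-based naive Bayes for categorical features, updated one instance at a time."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # Laplace smoothing (assumption, not from the text)
        self.total = 0                          # |D|: instances seen so far
        self.class_counts = defaultdict(int)    # |C_i,D| for every class
        self.feature_counts = defaultdict(int)  # (class, feature index, value) -> count
        self.values = defaultdict(set)          # feature index -> distinct values seen

    def update(self, x, label):
        """Fold one labeled instance into the counts; the instance can then be discarded."""
        self.total += 1
        self.class_counts[label] += 1
        for k, value in enumerate(x):
            self.feature_counts[(label, k, value)] += 1
            self.values[k].add(value)

    def posteriors(self, x):
        """Return P(C_i | X) for every class seen so far, using Eq. (1) and Eq. (2)."""
        if self.total == 0:
            raise ValueError("the model has not seen any labeled instance yet")
        log_scores = {}
        for c, n_c in self.class_counts.items():
            log_p = math.log(n_c / self.total)  # log P(C_i)
            for k, value in enumerate(x):
                num = self.feature_counts.get((c, k, value), 0) + self.alpha
                den = n_c + self.alpha * max(len(self.values.get(k, ())), 1)
                log_p += math.log(num / den)    # log P(X_k | C_i)
            log_scores[c] = log_p
        # Dividing by P(X) amounts to normalising the scores so they sum to 1.
        top = max(log_scores.values())
        unnormalised = {c: math.exp(s - top) for c, s in log_scores.items()}
        z = sum(unnormalised.values())
        return {c: u / z for c, u in unnormalised.items()}

    def predict(self, x):
        """Return the class with the highest posterior probability."""
        post = self.posteriors(x)
        return max(post, key=post.get)
```

Because `update` only increments counters, each instance is touched once and never stored, which is the property the text relies on for stream mining. Numeric NSL-KDD features would need discretization or a per-class density estimate, which this sketch leaves out.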
In the following, we briefly explain four active learning strategies:
• Random Strategy: Input instances are selected for labeling at random.
• Fixed Uncertainty Strategy: The instances for which the current classifier has minimum confidence are labeled. A constant threshold is considered. Only those instances are labeled for which the maximum posterior probability as estimated by the classifier does not exceed the threshold.
• Variable Uncertainty Strategy: The instances with the lowest confidence within a time interval are labeled; the threshold varies with time so that the budget is spent uniformly over time.
• Uncertainty Strategy with Randomization: A random threshold is selected and the labels for instances near the threshold are inquired.
2.4.4 PROPOSED METHOD
The proposed model, called Network Anomaly Detection using Active Learning (NADAL), involves an offline and an online step. The selected dataset is preprocessed in an offline fashion. The NSL-KDD dataset contains instances labeled with the attack type. During the preprocessing step, the attacks are divided into four categories: DoS, Probe, R2L, and U2R. Furthermore, there are four classifiers at the respective layers of attacks. Thus, the preprocessing carried out using Weka selects the appropriate features for each classifier. The selected features are then given to the feature filtering module in NADAL. Figure 2.4.1 illustrates the NADAL framework. In the proposed online method, at each time step, each instance is processed at most once to improve the model. The instance is then discarded. Initially, instance Xt having label yt passes through the feature filtering module and the appropriate features for each classifier are considered. At each layer, the naive Bayesian module incrementally predicts the probability that the instance belongs to the class.
Thereafter, the selected active learning strategy (i.e. uncertainty with randomization) is called. The output of the strategy determines whether the label for the instance must be inquired. A logical OR gate is used to aggregate the results from the different active learning modules. If the gate outputs 1, the classifiers are updated using the instance. Otherwise, the aggregate output module predicts the label according to the maximum certainty calculated by the classifiers. In this case, ŷt represents the predicted label for instance Xt.
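The online decision described above (per-layer uncertainty check, OR gate, then either a label request or a prediction) can be sketched as follows. This is an illustrative simplification rather than the NADAL code: the function names, the default `threshold` and `delta` values, and the choice to randomize the threshold by multiplying it with a draw from a normal distribution centred on 1 are all assumptions made for the example.

```python
import random


def wants_label(max_posterior, threshold=0.9, delta=0.05, rng=random):
    """Randomized-uncertainty check: True if the true label should be requested.

    The threshold is perturbed by a factor drawn from N(1, delta), so instances
    close to the nominal threshold are sometimes queried and sometimes not.
    """
    randomized_threshold = threshold * rng.gauss(1.0, delta)
    return max_posterior < randomized_threshold


def process_instance(layer_posteriors, threshold=0.9, delta=0.05, rng=random):
    """Aggregate the per-layer decisions for one instance.

    layer_posteriors maps a layer name (e.g. "DoS") to a dict of
    label -> posterior probability produced by that layer's classifier.
    Returns ("query_label", None) or ("predict", predicted_label).
    """
    requests = []
    best_confidence, best_label = -1.0, None
    for posteriors in layer_posteriors.values():
        label, confidence = max(posteriors.items(), key=lambda kv: kv[1])
        requests.append(wants_label(confidence, threshold, delta, rng))
        if confidence > best_confidence:
            best_confidence, best_label = confidence, label

    if any(requests):          # logical OR gate over the active learning modules
        return ("query_label", None)
    return ("predict", best_label)   # label with the maximum certainty
```

When the result is `"query_label"`, the caller would obtain the true label yt and update every layer's classifier with the instance; when it is `"predict"`, the returned label plays the role of ŷt.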
2.4.4 EVALUATION
The proposed framework in this paper was implemented using Java in NetBeans 8.0.2. Feature selection was performed using Weka and the Wrapper method. The active learning modules as well as the incremental naive Bayesian module were implemented by modifying the code from Massive Online Analysis (MOA) 2016.04, written in Java. The standard NSL-KDD dataset is used for evaluation purposes. The dataset was randomized via the Randomize functionality in Weka. The accuracy and Kappa values were then calculated for the framework at four layers: DoS, Probe, U2R, and R2L. The results were compared to those of the incremental naive Bayesian approach in MOA.
Figure 2.4.1 : The proposed model called NADAL
A. Dataset
As mentioned earlier, in this paper, the standard NSL-KDD dataset is used for evaluation purposes. The dataset is a revision of KDD-99 without repetitive and redundant instances. Each record includes 42 features. The KDDtrain+.txt file was used, wherein the 42nd feature identifies a normal vs. attack label. There are four types of attacks: DoS, Probe, R2L, and U2R.
B. Evaluation Criteria
The results are evaluated according to accuracy and Kappa. Accuracy represents the percentage of tuples in the dataset that are correctly labeled. The measure is calculated as below:
$$\text{Accuracy} = \frac{\text{number of correctly labeled tuples}}{\text{total number of tuples}} \times 100 \qquad (3)$$
The Kappa coefficient measures the agreement among individuals who classify or measure items. The value is obtained as follows:
$$\kappa = \frac{p_0 - p_c}{1 - p_c} \qquad (4)$$
where p0 and pc denote observed and chance agreement, respectively.
C. Implementation Results
The results exhibit a clear improvement in both accuracy and Kappa compared to the incremental naive Bayesian approach. The results are shown for the NSL-KDD dataset with ten randomizations.
Table 2.4.1 : ACCURACY AND KAPPA FOR TEN RANDOMIZATIONS: NADAL VS. INCREMENTAL NAIVE BAYESIAN CLASSIFIER
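The two evaluation criteria of subsection B can be computed with a few lines of code. The sketch below is only an illustration: the function names are hypothetical, and the chance agreement pc is estimated from the product of the marginal label frequencies, which is the usual convention for Cohen's Kappa but is an assumption here since the text does not state how pc is obtained.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Percentage of tuples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Kappa = (p0 - pc) / (1 - pc), with pc estimated from label marginals."""
    n = len(y_true)
    p0 = sum(t == p for t, p in zip(y_true, y_pred)) / n   # observed agreement
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: probability that both pick the same label independently.
    pc = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if pc == 1.0:
        # Degenerate case (a single label everywhere): Kappa is undefined.
        raise ValueError("Kappa is undefined when chance agreement equals 1")
    return (p0 - pc) / (1.0 - pc)
```

Kappa is reported alongside accuracy because a classifier that always answers "normal" can reach a high accuracy on imbalanced traffic while its Kappa stays near zero.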
2.4.5 CONCLUSION AND RECOMMENDATIONS
Traditional data packets are inherently static. In contrast, streaming data are continuously created; they cannot be stored and must be analyzed as a single unit. A novel network anomaly detection framework was proposed to improve efficiency in classifying data in an online fashion. Furthermore, active learning was used to reduce labeling costs. The proposed system was evaluated using the standard NSL-KDD dataset. Implementation results revealed that the proposed method outperforms the naive Bayesian approach in terms of both accuracy and Kappa.
2.5 A DEEP LEARNING APPROACH FOR NETWORK INTRUSION DETECTION SYSTEM
A Network Intrusion Detection System (NIDS) helps system administrators to detect network security breaches in their organizations. However, many challenges arise while developing a flexible and efficient NIDS for unforeseen and unpredictable attacks. We propose a deep learning-based approach for developing such an efficient and flexible NIDS. We use Self-taught Learning (STL), a deep learning-based technique, on NSL-KDD, a benchmark dataset for network intrusion. We present the performance of our approach and compare it with a few previous works. Compared metrics include accuracy, precision, recall, and f-measure values.
A NIDS monitors and analyzes the network traffic entering into or exiting from the network devices of an organization and raises alarms if an intrusion is observed. Based on the methods of intrusion detection, NIDSs are categorized into two classes:
1) Signature (misuse) based NIDS (SNIDS)
2) Anomaly Detection based NIDS (ADNIDS)
In SNIDS, e.g. Snort, attack signatures are pre-installed in the NIDS. Pattern matching is performed on the traffic against the installed signatures to detect an intrusion in the network.
In contrast, an ADNIDS classifies network traffic as an intrusion when it observes a deviation from the normal traffic pattern. SNIDS is effective in the detection of known attacks and shows high detection accuracy with lower false-alarm rates. However, its performance suffers during detection of unknown or new attacks, because only a limited set of attack signatures can be installed beforehand in an IDS. ADNIDS, on the other hand, is well-suited for the detection of unknown and new attacks. Although ADNIDS produces high false-positive rates, its theoretical potential in the identification of novel attacks has led to its wide acceptance among the research community.
There are primarily two challenges that arise while developing an efficient and flexible NIDS for unknown future attacks. First, proper feature selection from the network traffic dataset for anomaly detection is difficult. The features selected for one class of attack may not work well for other categories of attacks due to continuously changing and evolving attack scenarios. Second, labeled traffic datasets from real networks are unavailable for developing a NIDS. Immense efforts are required to produce such a labeled dataset from the raw network traffic traces collected over a period or in real-time. Additionally, to preserve the confidentiality of the internal organizational network structure as well as the privacy of various users, network administrators are reluctant to report any intrusion that might have occurred in their networks.
Various machine learning techniques have been used to develop ADNIDSs, such as Artificial Neural Networks (ANN), Support Vector Machines (SVM), Naive-Bayesian (NB), Random Forests (RF), and Self-Organized Maps (SOM). The NIDSs are developed as classifiers to differentiate the normal traffic from the anomalous traffic. Many NIDSs perform a feature selection task to extract a subset of relevant features from the traffic dataset to enhance classification results. Feature selection helps to eliminate the possibility of incorrect training through the removal of redundant features and noise. Recently, deep learning-based methods have been successfully applied in audio, image, and speech processing applications. These methods aim to learn a good feature representation from a large amount of unlabeled data and subsequently apply these learned features on a limited amount of labeled data in a supervised classification. The labeled and unlabeled data may come from different distributions.
However, they must be relevant to each other. It is envisioned that deep learning-based approaches can help to overcome the challenges of developing an efficient NIDS. We can collect unlabeled network traffic data from different network sources and obtain a good feature representation from these datasets using deep learning techniques. These features can then be applied for supervised classification to a small but labeled traffic dataset consisting of normal as well as anomalous traffic records. The traffic data for the labeled dataset can be collected in a confined, isolated and private network environment. With this motivation, we use self-taught learning, a deep learning technique based on sparse autoencoder and soft-max regression, to develop a NIDS. We verify the usability of the self-taught learning-based NIDS by applying it to the NSL-KDD intrusion dataset, an improved version of KDD Cup 99, the benchmark dataset for various NIDS evaluations.
2.5.1 RELATED WORK
This section presents various recent accomplishments in this area. We only discuss work that has used the NSL-KDD dataset for performance benchmarking. Therefore, any dataset referred to from this point forward should be considered as NSL-KDD. This approach allows a more accurate comparison of work
with others found in the literature. Finally, we discuss a few deep learning-based approaches that have been tried so far for similar kinds of work. One of the earliest works found in the literature used ANN with enhanced resilient back-propagation for the design of such an IDS. This work used only the training dataset for training (70%), validation (15%) and testing (15%). As expected, use of unlabeled data for testing resulted in a reduction of performance. A more recent work used the J48 decision tree classifier with 10-fold cross-validation for testing on the training dataset. This work used a reduced feature set of 22 features instead of the full set of 41 features. A similar work evaluated various popular supervised tree-based classifiers and found that the Random Tree model performed best, with the highest degree of accuracy along with a reduced false alarm rate.
Many 2-level classification approaches have also been proposed. One such work used Discriminative Multinomial Naive Bayes (DMNB) as a base classifier and Nominal-to-Binary supervised filtering at the second level, along with 10-fold cross-validation for testing. This work was further extended to use Ensembles of Balanced Nested Dichotomies (END) at the first level and Random Forest at the second level. As expected, this enhancement resulted in an improved detection rate and a lower false positive rate. Another 2-level implementation used principal component analysis (PCA) for feature set reduction and then SVM (using Radial Basis Function) for final classification, which resulted in a high detection accuracy with only the training dataset and the full 41-feature set. A reduction of the feature set to 23 resulted in even better detection accuracy in some of the attack classes, but the overall performance was reduced.
The authors improved their work by using information gain to rank the features and then a behavior-based feature selection to reduce the feature set to 20. This resulted in an improvement in reported accuracy using the training dataset.

The second category to look at used both the training and test datasets. An initial attempt in this category used fuzzy classification with a genetic algorithm and resulted in a detection accuracy above 80% with a low false positive rate. Another important work used unsupervised clustering algorithms and found that the performance obtained using only the training data was reduced drastically when test data was also used. A similar implementation using the k-point algorithm resulted in a slightly better detection accuracy and a lower false positive rate, using both training and test datasets. Another less popular technique, optimum path forest (OPF), which uses graph partitioning for feature classification, was found to demonstrate a high detection accuracy in one-third of the time required by the SVM-RBF method.
A deep learning approach with a Deep Belief Network (DBN) as a feature selector and an SVM as a classifier resulted in an accuracy of 92.84% when applied to training data.

2.5.2 SELF-TAUGHT LEARNING & NSL-KDD DATASET OVERVIEW

1) Self-Taught Learning

Self-taught Learning (STL) is a deep learning approach that consists of two stages for the classification. First, a good feature representation is learnt from a large collection of unlabeled data, $x_u$, termed Unsupervised Feature Learning (UFL). In the second stage, this learnt representation is applied to labeled data, $x_l$, and used for the classification task. Figure 2.5.1 shows the architecture diagram of STL. There are different approaches used for UFL, such as Sparse Autoencoder, Restricted Boltzmann Machine (RBM), K-Means Clustering, and Gaussian Mixtures.

A sparse autoencoder is a neural network that consists of an input, a hidden, and an output layer. The input and output layers contain N nodes, and the hidden layer contains K nodes. The target values at the output layer are set equal to the input values, i.e., $\hat{x}_i = x_i$, as shown in Figure 2.5.1(a). The sparse autoencoder network finds the optimal values for the weight matrices, $W \in \mathbb{R}^{K \times N}$ and $V \in \mathbb{R}^{N \times K}$, and bias vectors, $b_1 \in \mathbb{R}^{K \times 1}$ and $b_2 \in \mathbb{R}^{N \times 1}$, using the back-propagation algorithm while trying to learn an approximation of the identity function, i.e., an output $\hat{x}$ similar to $x$. The sigmoid function, $g(z) = \frac{1}{1 + e^{-z}}$, is used for the activation, $h_{W,b}$, of the nodes in the hidden and output layers:

$$h_{W,b}(x) = g(Wx + b) \tag{1}$$

$$J = \frac{1}{2m} \sum_{i=1}^{m} \lVert x_i - \hat{x}_i \rVert^2 + \frac{\lambda}{2} \left( \sum_{k,n} W^2 + \sum_{n,k} V^2 + \sum_{k} b_1^2 + \sum_{n} b_2^2 \right) + \beta \sum_{j=1}^{K} KL(\rho \,\|\, \hat{\rho}_j) \tag{2}$$

The cost function to be minimized in the sparse autoencoder using back-propagation is represented by Eq. (2). The first term is the average of the sum-of-square error terms for all m input data. The second term is a weight decay term, with λ as the weight decay parameter, to avoid over-fitting in training.
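As a rough illustration of the network just described, the following NumPy sketch evaluates the forward pass of Eq. (1) and a cost of the kind represented by Eq. (2). It is a minimal sketch under stated assumptions, not the implementation used in the work discussed here: the cost is assumed to be half the mean squared reconstruction error, plus an L2 weight decay over W, V and the biases scaled by λ/2, plus β times the summed KL sparsity penalty that the text goes on to explain. The function names, the default values of `lam`, `beta` and `rho`, and the layer sizes in the usage lines are all hypothetical.

```python
import numpy as np


def sigmoid(z):
    """Sigmoid activation g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))


def kl_divergence(rho, rho_hat):
    """KL divergence between Bernoulli variables with means rho and rho_hat."""
    return (rho * np.log(rho / rho_hat)
            + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_hat)))


def sparse_autoencoder_cost(X, W, V, b1, b2, lam=1e-4, beta=3.0, rho=0.05):
    """Return (cost, hidden activations) for a one-hidden-layer autoencoder.

    X: (m, N) inputs, W: (K, N), V: (N, K), b1: (K,), b2: (N,).
    lam, beta and rho are illustrative defaults, not values from the text.
    """
    m = X.shape[0]
    A = sigmoid(X @ W.T + b1)        # hidden layer, Eq. (1) with (W, b1)
    X_hat = sigmoid(A @ V.T + b2)    # output layer, Eq. (1) with (V, b2)

    # First term: average sum-of-squares reconstruction error (target = input).
    reconstruction = np.sum((X - X_hat) ** 2) / (2.0 * m)
    # Second term: weight decay (assumed to cover weights and biases).
    decay = (lam / 2.0) * (np.sum(W ** 2) + np.sum(V ** 2)
                           + np.sum(b1 ** 2) + np.sum(b2 ** 2))
    # Last term: sparsity penalty on the average activation of each hidden unit.
    rho_hat = A.mean(axis=0)
    sparsity = beta * np.sum(kl_divergence(rho, rho_hat))

    return reconstruction + decay + sparsity, A


# Usage with random data; sizes are arbitrary and chosen only for illustration.
rng = np.random.default_rng(0)
m, N, K = 5, 41, 10
X = rng.random((m, N))
W = rng.normal(0.0, 0.1, (K, N))
V = rng.normal(0.0, 0.1, (N, K))
b1 = np.zeros(K)
b2 = np.zeros(N)
cost, features = sparse_autoencoder_cost(X, W, V, b1, b2)
```

Training would minimize `cost` with respect to W, V, b1 and b2 by back-propagation, which is omitted here. The returned `features` array is the hidden-layer representation that the second stage of STL would pass to a classifier.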
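The last term in the equation is a sparsity penalty term that puts a constraint on the hidden layer to maintain low average activation values, and is expressed as the Kullback-Leibler (KL) divergence shown in Eq. (3):

$$KL(\rho \,\|\, \hat{\rho}_j) = \rho \log \frac{\rho}{\hat{\rho}_j} + (1 - \rho) \log \frac{1 - \rho}{1 - \hat{\rho}_j} \tag{3}$$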
where ρ is a sparsity constraint parameter that ranges from 0 to 1 and β controls the sparsity penalty term. $KL(\rho \,\|\, \hat{\rho}_j)$ attains its minimum value when $\rho = \hat{\rho}_j$, where $\hat{\rho}_j$ denotes the average activation value of hidden unit j over all training inputs x. Once we learn optimal values for W and $b_1$ by applying the sparse autoencoder to the unlabeled data, $x_u$, we evaluate the feature representation $a = h_{W,b_1}(x_l)$ for the labeled data, $(x_l, y)$. We use this new feature representation, a, with the labels vector, y, for the classification task in the second stage. We use soft-max regression for the classification task, as shown in Figure 2.5.1(b).

2) NSL-KDD Dataset

The NSL-KDD dataset is an improved and reduced version of the KDD Cup 99 dataset. The KDD Cup dataset was prepared using the network traffic captured by the 1998 DARPA IDS evaluation program. The network traffic includes normal traffic and different kinds of attack traffic, such as DoS, Probing, user-to-root (U2R), and remote-to-local (R2L). The network traffic for training was collected for seven weeks, followed by two weeks of traffic collection for testing, in raw tcpdump format. The test data contains many attacks that were not injected during the training data collection phase, to make the intrusion detection task realistic. It is believed that most novel attacks can be derived from known attacks. Finally, the training and test data were processed into datasets of five million and two million TCP/IP connection records, respectively.

The KDD Cup dataset has been widely used as a benchmark dataset for many years in the evaluation of NIDS. One of the major drawbacks of the dataset is that it contains an enormous number of redundant records in both the training and test data.
Figure 2.5.1: The two-stage process of self-taught learning: (a) Unsupervised Feature Learning (UFL) on unlabeled data. (b) Classification on labeled data.

It was observed that almost 78% and 75% records are redundant in the