M.Tech. IDS Lecture-Mid Term.pptx

Intrusion Detection System
By:
Dr. SANTOSH KUMAR
Professor, Department of Computer Science and Engineering
Graphic Era Deemed to be University, Dehradun (India)

Outline of Course work
1. Introduction: Basic Definition, Network Attacks
2. Intrusion Detection Approaches
3. Theoretical foundation of Intrusion Detection
4. IDS and IPS Internals: Implementation and deployment
5. Security and IDS Management
Case studies:

1. Introduction: Basic Definition, Network Attacks
What is Intrusion?
• The act of intruding or the state of being intruded; especially : the act of
wrongfully entering upon, seizing, or taking possession of the property of
another.
What is a computer intrusion?
• Computer Definition. To compromise a computer system by breaking the security
of such a system or causing it to enter into an insecure state. The act of
intruding—or gaining unauthorized access to a system—typically leaves traces
that can be discovered by intrusion detection systems.

What is meant by intruders in network security?
• An Intruder is a person who attempts to gain unauthorized access to a system, to
damage that system, or to disturb data on that system. In summary, this person
attempts to violate Security by interfering with system Availability, data Integrity
or data Confidentiality.
What is Intrusion Detection System (IDS)?
• An intrusion detection system (IDS) is a device or software application that
monitors a network or systems for malicious activity or policy violations.
What is IPS and IDS?
• IDS (Intrusion Detection System) and IPS (Intrusion Prevention System) both
increase the security level of networks, monitoring traffic and inspecting and
scanning packets for suspicious data. Detection in both systems is mainly based
on signatures already detected and recognized.

Attack Taxonomies
• Classification is simply the separation or ordering of objects into
classes; a taxonomy is the theoretical study of classification,
including its bases, principles, procedures and rules.
• The purpose of attack taxonomies is to provide a theoretical and
consistent means of classifying computer and network attacks, thus
improving the efficiency of information exchange when describing
attacks.
• There are three typical taxonomies for computer and network attacks, from the
perspective of general usage, specific usage and the attacker,
respectively.
• The taxonomy called VERDICT (Validation Exposure Randomness
Deallocation Improper Conditions Taxonomy), shows that all
computer attacks can be classified using four improper conditions,
namely validation, exposure, randomness and deallocation.

Network Attacks
In the first dimension, attacks are classified into ten categories, which
are listed below:
• Virus: a self-replicating program that attaches itself to an existing program and
infects a system without the permission or knowledge of the user.
• Worm: a self-replicating program that propagates through network services
on computers without any user intervention.
• Trojan: a program that appears to perform a benign action but in
fact executes different code for a malicious purpose.
• Buffer overflow: an attack in which a process gains control of or crashes another
process by writing past the boundary of a fixed-length buffer.
• Denial of service: an attack that prevents legitimate users from
accessing or using a computer or network resource.
• Network attack: an attack that crashes the users on the network or the
network itself by manipulating network protocols, ranging from the
data-link layer to the application layer.

• Physical attack: an attack that attempts to damage physical
components of a network or computer.
• Password attack: an attack that aims to gain a password and is
usually indicated by a series of failed logins within a short period of
time.
• Information gathering attack: an attack that gathers information or
finds known vulnerabilities by scanning or probing existing computer
networks.

• Probes: Network probes are usually attacks scanning computer
networks to gather information or find known vulnerabilities, which
are exploited for further or future attacks.
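The password-attack indicator described above (a series of failed logins within a short period of time) can be sketched as a sliding-window counter. This is only an illustrative sketch: the event format `(timestamp, source, success)`, the function name `detect_password_attack` and the thresholds are assumptions, not part of any particular IDS.

```python
from collections import deque

def detect_password_attack(events, max_failures=5, window_seconds=60):
    """Return the sources with at least max_failures failed logins in one window.

    events: iterable of (timestamp_seconds, source, success) tuples (assumed format).
    """
    recent = {}      # source -> deque of recent failure timestamps
    flagged = set()
    for timestamp, source, success in sorted(events):
        if success:
            continue
        times = recent.setdefault(source, deque())
        times.append(timestamp)
        # Drop failures that have fallen out of the sliding window.
        while times and timestamp - times[0] > window_seconds:
            times.popleft()
        if len(times) >= max_failures:
            flagged.add(source)
    return flagged

# Hypothetical log: one source fails 5 times in 40 s, another fails twice far apart.
events = [(t, "10.0.0.9", False) for t in range(0, 50, 10)]
events += [(0, "10.0.0.7", False), (5, "10.0.0.7", True), (300, "10.0.0.7", False)]
print(detect_password_attack(events))
```

Only the source with the burst of failures is reported; the thresholds would need tuning for a real environment.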

Questions on IDS
???
• Q#1. What is Security Testing?
• Ans. Security testing can be considered the most important of all types of
software testing. Its main objective is to find vulnerabilities in any
software-based (web or networking) application and to protect its
data from possible attacks or intruders.
• Many applications contain confidential data that needs to be
protected from being leaked. Testing needs to be done
periodically on such applications to identify threats and to take
immediate action on them.

Q#2. What is “Vulnerability”?
• Ans. A vulnerability can be defined as a weakness in a system
through which intruders or bugs can attack the system.
If security testing has not been performed rigorously on the system,
the chance of vulnerabilities increases. Patches or fixes are required
from time to time to protect a system from vulnerabilities.

Q#3. What is the Intrusion Detection?
Ans. Intrusion detection is a process that helps in identifying
possible attacks and dealing with them. It includes collecting
information from many systems and sources, analysing that
information and finding the possible ways of attack on the system.
Intrusion detection checks the following:
• Possible attacks
• Any abnormal activity
• Auditing the system data
• Analysis of different collected data etc.

Q#4. What is “SQL Injection”?
• Ans. SQL injection is one of the common attack techniques used
by hackers to get critical data.
• Hackers look for any loophole in the system through which they can
pass SQL queries that bypass the security checks and return
the critical data. This is known as SQL injection. It can allow hackers to
steal critical data or even crash a system.
• SQL injection is very critical and needs to be avoided. Periodic
security testing can help prevent this kind of attack. Database
security needs to be defined correctly, and input boxes and special
characters should be handled properly.
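To make the point about handling input properly concrete, the sketch below contrasts a query built by string concatenation with a parameterized query, using Python's standard `sqlite3` module. The table, its contents and the function names `lookup_unsafe` and `lookup_safe` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("alice", "a-secret"), ("bob", "b-secret")])

def lookup_unsafe(name):
    # Vulnerable: the user input is pasted directly into the SQL text.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()

def lookup_safe(name):
    # Parameterized: the input is passed as data and never parsed as SQL.
    return conn.execute("SELECT secret FROM users WHERE name = ?", (name,)).fetchall()

payload = "x' OR '1'='1"
print(len(lookup_unsafe(payload)))  # the injected OR clause matches every row
print(lookup_safe(payload))         # no user is literally named like the payload
```

With the concatenated query the payload returns every stored secret, whereas the parameterized version returns nothing for the same input.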

Q#5. List the attributes of Security Testing?
Ans. There are seven attributes of security testing:
• Authentication
• Authorization
• Confidentiality
• Availability
• Integrity
• Non-repudiation (Non- Denial)
• Resilience (Flexibility)

Q#6. What is XSS or Cross Site Scripting?
• Ans. XSS, or cross-site scripting, is a type of vulnerability that hackers
use to attack web applications.
• It allows hackers to inject HTML or JavaScript code into a web page,
which can steal confidential information from the cookies and
return it to the hackers. It is one of the most critical and common
techniques and needs to be prevented.
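One common defence is to escape user-supplied text before placing it in a page, so that injected markup is displayed rather than executed. The sketch below uses `html.escape` from the Python standard library; the `render_comment` helper and the sample payload are hypothetical.

```python
import html

def render_comment(comment):
    # Escape <, >, & and quotes so the browser treats the input as text, not markup.
    return "<p>" + html.escape(comment) + "</p>"

malicious = "<script>send(document.cookie)</script>"
print(render_comment(malicious))
# -> <p>&lt;script&gt;send(document.cookie)&lt;/script&gt;</p>
```

Escaping on output is only one layer; real applications usually combine it with input validation and a templating engine that escapes by default.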

Q#7. What is an SSL connection and an SSL
session?
Ans. An SSL (Secure Sockets Layer) connection is a temporary peer-to-peer
communications link in which each connection is associated with
one SSL session.
• An SSL session can be defined as an association between a client and a server,
generally created by the handshake protocol. It defines a set of parameters
that may be shared by multiple SSL connections.

Q#8. What is “Penetration Testing”?
Ans. Penetration testing is a type of security testing that helps in identifying
vulnerabilities in a system. A penetration test is an attempt to evaluate the
security of a system by manual or automated techniques; if a
vulnerability is found, testers use it to gain deeper access to the
system and find more vulnerabilities. The main purpose of this testing is to
protect a system from possible attacks.
• Penetration testing can be done in two ways: white box testing and black
box testing.
• In white box testing, all the information is available to the testers,
whereas in black box testing the testers have no information and
test the system in a real-world scenario to find the vulnerabilities.

Q#9. Why is “Penetration Testing” important?
Ans. Penetration testing is important because:
• Security breaches and loopholes in systems can be very costly, as the
threat of attack is always present and hackers can steal important
data or even crash the system.
• It is impossible to protect all information all the time. Hackers
always come up with new techniques to steal important data, so it is
necessary for testers to perform testing periodically to
detect possible attacks.
• Penetration testing identifies weaknesses, protects a system from various attacks
and helps organizations keep their data safe.

Q#10. Name the two common techniques
used to protect a password file?
Ans. Two common techniques to protect a password file are hashed
passwords with a salt value (a random value) and password file access control.
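The salted-hash technique can be sketched with the standard library's `hashlib.pbkdf2_hmac`. The helper names, the 16-byte salt and the iteration count are illustrative assumptions rather than a recommended configuration.

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None, iterations=100_000):
    # A fresh random salt makes identical passwords hash to different values.
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest

def verify_password(password, salt, expected, iterations=100_000):
    # Recompute with the stored salt and compare in constant time.
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)

salt, digest = hash_password("correct horse")
print(verify_password("correct horse", salt, digest))  # True
print(verify_password("wrong guess", salt, digest))    # False
```

Only the salt and the digest would be stored in the password file; the plaintext password is never written.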
Q#12. What is ISO 17799?
Ans. ISO/IEC 17799 was originally published in the UK and defines best practices
for information security management. It provides guidelines on information
security for all organizations, small or big.
ISO: International Organization for Standardization
IEC: International Electrotechnical Commission

Ethical Hacking
• Ethical hacking is when a person is allowed to hack a system with the
permission of the product owner in order to find weaknesses in the system and later fix
them.
• IP address: An IP address is assigned to every device so that the device can be
located on the network. In other words, an IP address is like your postal
address: anyone who knows your postal address can send you a
letter.
• MAC (Media Access Control) address: A MAC address is a unique serial
number assigned to every network interface on every device. A MAC address
is like your physical mailbox: only your postal carrier (network router) can
identify it, and you can change it by getting a new mailbox (network card) at
any time and putting your name (IP address) on it.

Common tools used by Ethical hackers
• Metasploit: free; integrates with the Nexpose vulnerability scanner to verify
vulnerabilities
• Wireshark: free and open source; a network protocol
analyzer for Unix and Windows
• Nmap (Network Mapper): free security scanner used to discover
hosts and services on a computer network
• John the Ripper: free password-cracking software tool
• Maltego: proprietary software used for open-source intelligence and
forensics, developed by Paterva

Types of ethical hackers
The types of ethical hackers are
• Grey Box hackers or Cyber warrior
• Black Box penetration Testers
• White Box penetration Testers
• Certified Ethical hacker

Footprinting and techniques used for ethical
hacking
Footprinting refers to accumulating and uncovering as much information as possible
about the target network before gaining access to it. The
approaches adopted by hackers before hacking are:
• Open Source Footprinting: looks for the contact information of
administrators, which can be used for guessing passwords in social
engineering
• Network Enumeration: the hacker tries to identify the domain names and
the network blocks of the target network
• Scanning: once the network is known, the second step is to find the active
IP addresses on the network. ICMP (Internet Control Message Protocol)
is used to identify active IP addresses
• Stack Fingerprinting: once the hosts and ports have been mapped by
scanning the network, the final footprinting step can be performed. This is
called stack fingerprinting.

Brute Force Hack
• A brute force hack is a technique for cracking passwords to gain access
to system and network resources. It takes a long time, and it requires the
hacker to know JavaScript.
• For this purpose, one can use a tool named “Hydra”.

Denial of service attack
• Denial of service (DoS) is a malicious attack on a network that is carried out by
flooding the network with useless traffic. Although DoS does not
cause any theft of information or security breach, it can cost the
website owner a great deal of money and time. Common forms of DoS attack are:
• Buffer Overflow Attacks
• SYN Attack
• Teardrop Attack
• Smurf Attack
• Viruses

Computer based social engineering attacks
and Phishing
Computer based social engineering attacks are
• Phishing
• Baiting
• On-line scams
Phishing involves sending false e-mails, chat messages or websites that
impersonate a real system, with the aim of stealing information from users of the
original website.
Baiting is like the real-world Trojan Horse: it uses physical media and
relies on the curiosity or greed of the victim. In this attack, the attacker
leaves a malware-infected floppy disk, CD-ROM or USB flash drive in a
location where it is sure to be found (bathroom, elevator, sidewalk, parking lot), gives
it a legitimate-looking and curiosity-piquing label, and simply waits for the
victim to use the device.

Network Sniffing
• A network sniffer monitors data flowing over computer network links.
By allowing you to capture and view the packet-level data on your
network, a sniffer tool can help you locate network problems.
Sniffers can be used both for stealing information off a network and
for legitimate network management.
ARP Spoofing or ARP poisoning:
ARP (Address Resolution Protocol) spoofing is a form of attack in which an
attacker changes the MAC (Media Access Control) address and attacks a
local area network (LAN) by corrupting the target computer’s ARP cache with
forged ARP request and reply packets.

Methods to avoid or prevent ARP Poisoning
ARP poisoning can be prevented by the following methods:
• Packet Filtering: Packet filters are capable of filtering out and blocking
packets with conflicting source address information
• Avoid trust relationships: Organizations should develop protocols that rely
on trust relationships as little as possible
• Use ARP spoofing detection software: There are programs that inspect
and certify data before it is transmitted and block data that is spoofed
• Use cryptographic network protocols: Secure communications
protocols such as TLS (Transport Layer Security), SSH (Secure Shell, including
SFTP and SCP) and HTTPS (HTTP Secure) help counter ARP spoofing attacks
by encrypting data prior to transmission and authenticating data when it is
received.
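The idea behind ARP spoofing detection software can be sketched as a check for IP addresses that suddenly map to a different MAC address. The function below works on a list of already-captured `(ip, mac)` pairs; packet capture itself is out of scope, and the function name, the sample addresses and the "first binding is trusted" rule are all assumptions for illustration.

```python
def find_arp_conflicts(observations):
    """Return (ip, known_mac, new_mac) for every ARP reply that contradicts
    the first binding seen for that IP address."""
    table = {}    # ip -> first MAC address observed (assumed trustworthy)
    alerts = []
    for ip, mac in observations:
        mac = mac.lower()
        known = table.get(ip)
        if known is None:
            table[ip] = mac
        elif known != mac:
            alerts.append((ip, known, mac))
    return alerts

# Hypothetical capture: the gateway's IP is later claimed by a second MAC address.
seen = [
    ("192.168.1.1", "AA:BB:CC:00:00:01"),
    ("192.168.1.20", "AA:BB:CC:00:00:14"),
    ("192.168.1.1", "DE:AD:BE:EF:00:99"),
]
print(find_arp_conflicts(seen))
```

A legitimate hardware change would also trigger an alert here, so a real tool would need a way to confirm or expire bindings.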

MAC Flooding
• MAC flooding is a technique in which the security of a given network switch is
compromised. In MAC flooding, the attacker floods the switch
with more frames than the switch can handle. This makes the
switch behave as a hub and transmit all packets on all ports. Taking
advantage of this, the attacker tries to send his packets inside the
network to steal sensitive information.
DHCP Rogue Server
• A rogue DHCP server is a DHCP server on a network that is not under the
administrative control of the network staff. A rogue DHCP server can be a
router or modem. It offers users IP addresses, a default gateway and WINS
servers as soon as they log in. A rogue server can sniff all the
traffic sent by the client to all other networks.

Cross-site scripting
• Cross-site scripting exploits known vulnerabilities in web-based
applications, their servers or the plug-ins users rely
upon. The attacker exploits one of these by inserting malicious code into a link
that appears to come from a trustworthy source. When users click on this
link, the malicious code runs as part of the client’s web request
and executes on the user’s computer, allowing the attacker to steal
information.
• There are three types of cross-site scripting:
• Non-persistent
• Persistent
• Server-side versus DOM (Document Object Model) based
vulnerabilities

Burp Suite
• Burp Suite is an integrated platform used for attacking web applications. It contains all
the Burp tools required for attacking an application. The tools share the same
framework for handling HTTP requests, upstream
proxies, alerting, logging and so on.
• The tools that Burp Suite has are:
Proxy
Spider
Scanner
Intruder
Repeater
Decoder
Comparer
Sequencer

Pharming and Defacement
• Pharming: In this technique the attacker compromises the DNS
(Domain Name System) servers or the user’s computer so that traffic
is directed to a malicious site
• Defacement: In this technique the attacker replaces the organization’s
website with a different page, which contains the hacker’s name and images
and may even include messages and background music

Methods to stop a website from getting hacked
By adopting the following methods you can stop your website from getting hacked:
• Sanitizing and validating user parameters: Sanitizing and validating user parameters
before submitting them to the database can reduce the chances of being attacked by SQL
injection
• Using a firewall: A firewall can be used to drop traffic from suspicious IP addresses if the attack is
a simple DoS
• Encrypting the cookies: Cookie or session poisoning can be prevented by encrypting the
content of the cookies, associating cookies with the client IP address and timing out the
cookies after some time
• Validating and verifying user input: This approach helps prevent form tampering
by verifying and validating the user input before processing it
• Validating and sanitizing headers: This technique is useful against cross-site scripting
(XSS); it includes validating and sanitizing headers, parameters passed via
the URL, form parameters and hidden values to reduce XSS attacks

Keylogger Trojan and Enumeration
• A keylogger Trojan is malicious software that can monitor your
keystrokes, logging them to a file and sending them off to remote
attackers. When the desired behaviour is observed, it records the
keystrokes and captures your login username and password.
• Enumeration is the process of extracting machine names, user names, network
resources, shares and services from a system. Enumeration techniques are
conducted in an intranet environment.

Password cracking techniques
Password cracking techniques include
• Brute Forcing Attacks
• Hybrid Attack
• Syllable Attack
• Rule Attack

Hacking stages
The stages of hacking are
• Gaining Access
• Escalating Privileges
• Executing Applications
• Hiding Files
• Covering Tracks

CSRF (Cross Site Request Forgery)
• CSRF, or cross-site request forgery, is an attack in which a malicious
website sends a request to a web application that a user is
already authenticated against from a different website. To prevent
CSRF, you can attach an unpredictable challenge token to each request
and associate it with the user’s session. This assures the developer
that the request received is from a valid source.
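The challenge-token defence can be sketched as follows. The session is modelled as a plain dictionary and the function names are hypothetical; a real web framework would supply its own session store and usually its own CSRF middleware.

```python
import hmac
import secrets

def issue_csrf_token(session):
    # Generate an unpredictable token and bind it to the user's session.
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token

def is_request_valid(session, submitted_token):
    # Accept the request only if it carries the token stored in the session.
    expected = session.get("csrf_token")
    if not expected or not submitted_token:
        return False
    return hmac.compare_digest(expected.encode("utf-8"), submitted_token.encode("utf-8"))

session = {}
token = issue_csrf_token(session)            # embedded in the legitimate form
print(is_request_valid(session, token))      # True
print(is_request_valid(session, "forged"))   # False: attacker cannot guess the token
```

A forged request from another site fails because that site cannot read the token embedded in the legitimate page.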

IDS: Data Collection
• Data collection is one of the most important steps when designing an
Intrusion Detection System (IDS).
• It influences the whole design and implementation process, and also
the final detection result.
• Intrusion detection systems collect data from many different sources,
such as system log files, network packets or flows, system calls and the
running code itself.
• The place where the data are collected determines the detection
capability and scope of an IDS; e.g. a network-based IDS cannot detect a
User-to-Root attack, while an application-based IDS is not able to find
a port-scanning attack.
• Therefore, data collection is discussed in terms of the different loci:
host-based, network-based and application-based.

Data Collection for Host-Based IDSs
• Host-based Intrusion Detection Systems (HIDSs) analyze activities on
a protected host by monitoring different sources of data that reside
on that host, such as a log file, system calls, file accesses, or the
contents of the memory.
• There are two main data sources that can be used for host-based
detection, namely audit logs and system calls.
• Audit logs are sets of events created by the Operating System
(OS) for performing certain tasks,
• while system calls represent the behaviour of each user-critical
application running on the OS.

System Call Sequences
• System Call (SC) sequences have proved to be an effective
information source in host-based intrusion detection.
• The basic idea behind using system calls for intrusion detection is that
if an anomaly exists in the application, it will also affect the way in
which the application interacts with the OS.
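One simple way to act on this idea is to record the short system-call windows seen during normal runs and then score a new trace by the fraction of windows never seen before. The sketch below assumes this sliding-window approach; the window length, the function names and the sample traces are illustrative and not taken from a specific IDS.

```python
def call_windows(trace, n=3):
    # All length-n windows of consecutive system calls in a trace.
    return [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]

def build_profile(normal_traces, n=3):
    # The profile is the set of windows observed during normal behaviour.
    profile = set()
    for trace in normal_traces:
        profile.update(call_windows(trace, n))
    return profile

def anomaly_score(trace, profile, n=3):
    # Fraction of windows in the trace that the normal profile has never seen.
    seen = call_windows(trace, n)
    if not seen:
        return 0.0
    return sum(1 for w in seen if w not in profile) / len(seen)

normal = [["open", "read", "mmap", "read", "close"],
          ["open", "read", "read", "close"]]
profile = build_profile(normal)
print(anomaly_score(["open", "read", "mmap", "read", "close"], profile))      # 0.0
print(anomaly_score(["open", "read", "execve", "socket", "close"], profile))  # 1.0
```

A score near zero means the application interacted with the OS as it did during training; a high score suggests its behaviour changed.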

Theoretical Foundation of Detection
• Taxonomy of Anomaly Detection Systems:

IDS and IPS Analysis Schemes
• IDSs and IPSs perform analyses, and it is important to understand the
analysis process: what analysis does, what types of analysis are
available, and what the advantages and disadvantages of different
analysis schemes are.
What Is Analysis?
• Analysis, in the context of intrusion detection and prevention, is the
organization of the constituent parts of data and their
interrelationships to identify any anomalous activity of interest.
• Real-time analysis is analysis done on the fly as the data travels the
path to the network or host. This is a bit of a misnomer, however, as
analysis can only be performed after the fact in near-real-time.

Contd….
The fundamental goal of intrusion-detection and intrusion-prevention
analysis is to improve an information system’s security. This goal can be
further broken down:
• Create records of relevant activity for follow-up
• Determine flaws in the network by detecting specific activities
• Record unauthorized activity for use in forensics or criminal
prosecution of intrusion attacks
• Act as a deterrent to malicious activity
• Increase accountability by linking activities of one individual across
systems

The Anatomy of Intrusion Analysis
There are many possible data-analysis schemes for an analysis engine,
and in order to understand them, the intrusion-analysis process can be
broken down into four phases:
i. Preprocessing
ii. Analysis
iii. Response
iv. Refinement
(i) Preprocessing is a key function once data are collected from an IDS or
IPS sensor. In this step, the data are organized in some fashion for
classification. The preprocessing will help determine the format the
data are put into, which is usually some canonical format or could be
a structured database. Once the data are formatted, they are broken
down further into classifications.

• These classifications can depend on the analysis schemes being used. For
example, if rule-based detection is being used, the classification will involve rules
and pattern descriptors.
• If anomaly detection is used, you will usually have a statistical profile based on
different algorithms in which the user behavior is baselined over time and any
behavior that falls outside of that classification is flagged as an anomaly.
• Upon completion of the classification process, the data are concatenated and put
into a defined version or detection template of some object by replacing variables
with values. These detection templates populate the knowledge base, which is
stored in the core analysis engine. Examples include:
• Detection of the modification of system log files
• Detection of unexpected privilege escalation
• Detection of Backdoor Netbus
• Detection of Backdoor SubSeven
• ORACLE grant attempt
• RPC mountd UDP export request

• Once the preprocessing is completed, the analysis stage begins. The data
record is compared to the knowledge base, and the data record will either
be logged as an intrusion event or it will be dropped. Then the next data
record is analyzed.
• The next phase, response, is one of the differentiating factors between IDS
and IPS.
• With IDS, you typically have limited prevention abilities—you are getting
the information passively after the fact, so you will have an alert after the
fact. Once information has been logged as an intrusion, a response can be
initiated.
• With IPS, the sensor is inline and it can provide real-time prevention
through an automated response. This is the essential difference between
reactive security and proactive security.
• Either way, the response is specific to the nature of the intrusion or the
different analysis schemes used.
• The response can be set to be automatically performed, or it can be done
manually after someone has manually analyzed the situation. For example,
Network Flight Recorder (a commercial IDS) offers a feature that can send
a TCP RST packet and kill a session.

• The final phase is the refinement stage. This is where the fine-tuning
of the IDS or IPS system can be done, based on previous usage and
detected intrusions.
• This gives the security professional a chance to reduce false-positive
levels and to have a more accurate security tool.
• This is a very critical stage for getting the most from your IDS or IPS
system. The system must be fine-tuned for your environment to get
any real value from it. There are tools, like Cisco Threat Response
(CTR), that will help with the refining stage by actually making sure
that an alert is valid by checking whether you are vulnerable to that
attack or not.

Rule-Based Detection (Misuse Detection)
• Rule-based detection, also referred to as signature detection, pattern matching
and misuse detection, is the first scheme that was used in early intrusion-
detection systems. Rule-based detection uses pattern matching to detect known
attack patterns.
• Let’s look at how the four phases of the analysis process are applied in a rule-
based detection system:
1. Preprocessing: The first step is to collect data about intrusions, vulnerabilities,
and attacks, and put them into a classification scheme or pattern descriptor.
From the classification scheme, a behavioral model is built, and then put into a
common format:
• Signature Name: The given name of a signature
• Signature ID: A unique ID for the signature
• Signature Description: Description of the signature and what it does
• Possible False Positive Description: An explanation of any “false positives” that may appear to
be an exploit but are actually normal network activity
• Related Vulnerability Information: This field has any related vulnerability information
• User Notes: This field allows a security professional to add specific notes related to their
network

2. Analysis: The event data are formatted and compared against the
knowledge base by using a pattern-matching analysis engine. The
analysis engine looks for defined patterns that are known as attacks.
3. Response: If the event matches the pattern of an attack, the analysis
engine sends an alert. If the event is a partial match, the next event
is examined. Note that partial matches can only be analyzed with a
stateful detector, which has the ability to maintain state, as many IDS
systems do. Different responses can be returned depending on the
specific event records.
4. Refinement: Refinement of pattern-matching analysis comes down to
updating signatures, because an IDS is only as good as its latest
signature update. This is one of the drawbacks of pattern-matching
analysis. Most IDSs allow automatic and manual updating of attack
signatures.
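The preprocessing and analysis steps of rule-based detection can be sketched as a small knowledge base of signature records matched against event text. The two signatures, their IDs and their regular expressions are invented for illustration and are far simpler than production rules.

```python
import re
from dataclasses import dataclass

@dataclass
class Signature:
    name: str          # Signature Name
    sig_id: int        # Signature ID
    description: str   # Signature Description
    pattern: str       # regular expression applied to the event text (assumed field)

# Hypothetical knowledge base; real signature sets are much larger.
KNOWLEDGE_BASE = [
    Signature("Path traversal", 1001,
              "Request tries to climb out of the web root", r"\.\./\.\./"),
    Signature("SQL tautology", 1002,
              "OR 1=1 style injection probe", r"(?i)\bor\s+1\s*=\s*1\b"),
]

def analyze(event, knowledge_base=KNOWLEDGE_BASE):
    # Compare one event record against every signature; return the matching IDs.
    return [sig.sig_id for sig in knowledge_base if re.search(sig.pattern, event)]

print(analyze("GET /../../etc/passwd"))  # [1001]
print(analyze("id=5 OR 1=1"))            # [1002]
print(analyze("GET /index.html"))        # []
```

Refinement in this model amounts to editing `KNOWLEDGE_BASE`: an attack with no matching signature passes through unnoticed.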

Profile-Based Detection (Anomaly Detection)
An anomaly is something that is different from the norm or that cannot
be easily classified. Anomaly detection, also referred to as profile-based
detection, creates a profile system that flags any event that strays from
a normal pattern and passes this information on to output routines.
Anomaly-based schemes fall into three main categories: behavioral,
traffic pattern and protocol.
• Behavioral analysis looks for anomalies in the types of behavior that
have been statistically baselined, such as relationships in packets and
what is being sent over a network.
• Traffic-pattern analysis looks for specific patterns in network traffic.
• Protocol analysis looks for network protocol violations or misuse
based on RFC-based behavior.

Let’s review the analysis model in the context of anomaly detection:
• Preprocessing: The first step in the analysis process is collecting the data, in
which behavior considered normal on the network is baselined over a
period of time. The data are put into a numeric form and are then formatted.
Then the information is classified into a statistical profile that is based on
different algorithms in the knowledge base.
• Analysis: The event data are typically reduced to a profile vector, which is
then compared to the knowledge base. The contents of the profile vector
are compared to a historical record for that particular user, and any data
that fall outside of the baseline normal activity are labeled a deviation.
• Response: At this point, a response can be triggered either automatically or
manually.
• Refinement: The data records must be kept updated. The profile vector
history will typically be deleted after a specific number of days. In addition,
different weighting systems can be used to add more weight to recent
behaviors than past behaviors.
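A minimal statistical profile can be sketched as the mean and standard deviation of a baselined metric, with any observation far from the mean labeled a deviation. The metric (requests per minute), the sample values and the three-standard-deviation threshold are assumptions chosen for illustration.

```python
from statistics import mean, stdev

def baseline_profile(samples):
    # Preprocessing: reduce the baselined observations to a simple statistical profile.
    return mean(samples), stdev(samples)

def is_deviation(value, profile, threshold=3.0):
    # Analysis: flag values more than `threshold` standard deviations from the mean.
    mu, sigma = profile
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold

# Hypothetical baseline: requests per minute observed for one user during normal use.
profile = baseline_profile([98, 102, 101, 99, 100, 103, 97])
print(is_deviation(104, profile))  # False: within normal variation
print(is_deviation(250, profile))  # True: far outside the baseline
```

Refinement would correspond to recomputing the profile from recent samples, optionally weighting newer observations more heavily.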

Target Monitoring
• Target-monitoring systems report whether certain target objects
have been changed or modified. This is usually done through a
cryptographic algorithm that computes a cryptochecksum for each
target file. The IDS reports any changes, such as a file modification or
program logon, that would cause changes in the cryptochecksums.
• Tripwire software performs target monitoring using
cryptochecksums, providing instant notification of changes to
configuration files and enabling automatic restoration. The main
advantage of this approach is that you do not have to continuously
monitor the target files.
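The cryptochecksum idea can be sketched with SHA-256 from the standard library: take a snapshot of checksums for the target files, then report any file whose checksum later differs. This is an illustration of the principle only, not how Tripwire is implemented; the function names and the temporary file are hypothetical.

```python
import hashlib
import os
import tempfile

def checksum(path):
    # SHA-256 of the file contents, read in chunks to handle large files.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def snapshot(paths):
    # Baseline: one checksum per monitored target file.
    return {p: checksum(p) for p in paths}

def changed_targets(baseline, paths):
    # Report every target whose current checksum differs from the baseline.
    return [p for p in paths if checksum(p) != baseline.get(p)]

with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "app.conf")
    with open(target, "w") as f:
        f.write("port=22\n")
    baseline = snapshot([target])
    unchanged = changed_targets(baseline, [target])   # nothing modified yet
    with open(target, "a") as f:
        f.write("port=2222\n")                        # simulated tampering
    modified = changed_targets(baseline, [target])

print(unchanged, modified)
```

The baseline itself must be stored somewhere an intruder cannot rewrite, otherwise the comparison proves nothing.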

Stealth Probes
• Stealth probes correlate data to try to detect attacks made over a long
period of time, often referred to as “low and slow” attacks. Data are
collected from a variety of sources, and they are characterized and sampled to
discover any correlating attacks. This technique is also referred to as wide-
area correlation, and it is typically a combination or hybrid approach that
uses other detection methods to try and uncover malicious behavior.
Heuristics
• The term heuristics refers to artificial intelligence (AI). In theory, an IDS will
identify anomalies to detect an intrusion, and it will then learn over time
what can be considered normal. To use heuristics, an AI scripting language
can apply analysis to the incoming data.
• Heuristics still leave a lot to be desired at this stage, but development is
progressing. What is needed is a pattern-matching language that can use
programming constructs to learn and identify malicious activity more
accurately.

Hybrid Approach
• We have examined the fundamental analysis schemes. You will find
that there is much debate on which is considered the best approach.
In actuality, they all have their merits and drawbacks, but when they
are used together they can offer a more robust security system.
Products that use a hybrid approach typically perform better,
especially against complex attacks.

Theoretical Foundation of Detection
• Understanding the strengths and weaknesses of the machine learning and
data mining approaches helps to choose the best approach to design and
develop a detection system.
Taxonomy of Anomaly Detection Systems
• The idea of applying machine learning techniques for intrusion detection is
to automatically build the model based on the training data set.
• This data set contains a collection of data instances each of which can be
described using a set of attributes (features) and the associated labels.
• The attributes can be of different types such as categorical or continuous.
• The nature of attributes determines the applicability of anomaly detection
techniques.

• For example, distance-based methods are initially built to work with
continuous features and usually do not provide satisfactory results on
categorical attributes.
• The labels associated with data instances are usually in the form of
binary values, i.e. normal and anomalous.
• In contrast, some researchers have employed different types of
attacks such as DoS, U2R (User to root), R2L (remote to local) and
Probe rather than the anomalous label.
• Since labeling is often done manually by human experts, obtaining an
accurate labeled data set which is representative of all types of
behaviors is quite expensive.
• Based on the availability of the labels, three operating modes are
defined for anomaly detection techniques:

1. Supervised Anomaly Detection:
• Supervised methods, also known as classification methods, need a
labeled training set containing both normal and anomalous samples
to build the predictive model.
• Theoretically, supervised methods provide a better detection rate
than semi-supervised and unsupervised methods since they have
access to more information. However, some technical issues make
these methods less accurate in practice than they are assumed to
be.
• As examples of supervised learning methods we can name Neural
Networks, Support Vector Machines (SVM), k-Nearest Neighbors,
Bayesian Networks, and Decision Trees.
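• As an illustration of a supervised method, the following is a minimal
k-Nearest Neighbors sketch. The feature names, the sample records and the
choice k=3 are assumptions made for this example only.

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # train is a list of (feature_vector, label) pairs
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical labeled records: (connection duration in seconds, failed logins)
train = [
    ((0.2, 0), "normal"),
    ((0.3, 0), "normal"),
    ((0.1, 1), "normal"),
    ((5.0, 8), "anomalous"),
    ((6.5, 9), "anomalous"),
    ((4.8, 7), "anomalous"),
]

print(knn_predict(train, (0.25, 0)))  # nearest neighbors are all normal
print(knn_predict(train, (5.5, 8)))   # nearest neighbors are all anomalous
```

• The labeled training set is what makes this supervised: without the
"normal" and "anomalous" labels there would be nothing to vote on.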

2. Semi-supervised Anomaly Detection:
• Semi-supervised learning falls between unsupervised learning (without any
labeled training data) and supervised learning (with completely labeled
training data).
• Semi-supervised methods employ unlabeled data in conjunction with a
small amount of labeled data. As a result, they greatly reduce the labeling
cost, while maintaining the high performance of supervised methods.
• Although the typical approach in semi-supervised techniques is to model
the normal behavior, there exist a limited number of anomaly detection
techniques that assume availability of the anomalous instances for training.
• Such techniques are not widely used since it is almost impossible to obtain
a training set which covers every possible anomalous behavior.

3. Unsupervised Anomaly Detection
• Unsupervised techniques do not require labeled training data.
• This approach is based on two basic assumptions.
• First, it assumes that the majority of the network connections represent
normal traffic and that only a very small percentage of the traffic is
malicious.
• Second, it is expected that malicious traffic is statistically different from
normal traffic.
• Based on these two assumptions, data instances that build groups of
similar instances and appear very frequently are supposed to represent
normal traffic, while instances that appear infrequently and are
significantly different from the majority of the instances are considered to
be suspicious.
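• The two assumptions above can be sketched with a simple robust outlier
test: the median stands in for the "majority" behavior, and instances far
from it (relative to the median absolute deviation) are flagged. The
threshold 3.5 and the traffic values are assumptions for illustration.

```python
import statistics

def flag_suspicious(values, threshold=3.5):
    """Return indices of values that lie far from the median, measured in MADs."""
    med = statistics.median(values)
    # Median absolute deviation: a spread estimate dominated by the majority
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        # All majority values are identical; anything different stands out
        return [i for i, v in enumerate(values) if v != med]
    return [i for i, v in enumerate(values) if abs(v - med) / mad > threshold]

# Hypothetical bytes-per-connection measurements, with no labels attached
traffic = [510, 495, 502, 498, 505, 9000, 499, 501, 503, 497]

print(flag_suspicious(traffic))  # index of the connection that differs strongly
```

• No labels are used anywhere: the detector relies only on the data being
mostly normal and on anomalies being statistically different.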

• Typically, there are three types of output used to report anomalies:
scores, binary labels, and labels.
• 1) Scores: in this technique, anomaly detectors assign a
numeric score to each instance which indicates how likely it is that
the test instance is an anomaly.
• The advantage of this technique is that the analyst can rank the
malicious activities, set a threshold to cut off the anomalies, and
select the most significant ones.
• Bayesian networks such as Naive Bayes are good examples of this
kind of method: they provide the administrator with the
calculated probabilities.

• 2) Binary Labels: some anomaly detection techniques, such as
Decision Trees, do not provide scores for the instances;
instead they label each test instance as either anomalous or normal.
• This approach can be considered a special case of labeling
techniques.
• 3) Labels: anomaly detection techniques in this category assign a
label to each test instance. In this approach, there is usually a single
label, normal, for normal traffic.
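• A short sketch of how score output relates to binary labels: scores can
be ranked, and a threshold converts them into anomalous/normal labels.
The connection names, score values and the 0.7 cut-off are hypothetical.

```python
# Hypothetical anomaly scores in [0, 1] produced by some detector
scores = {"conn-1": 0.05, "conn-2": 0.92, "conn-3": 0.40, "conn-4": 0.77}

THRESHOLD = 0.7  # assumed cut-off chosen by the analyst

# Scores let the analyst rank instances from most to least suspicious
ranked = sorted(scores, key=scores.get, reverse=True)

# Applying the threshold turns each score into a binary label
binary = {
    conn: ("anomalous" if score >= THRESHOLD else "normal")
    for conn, score in scores.items()
}

print(ranked)
print(binary)
```

• Going from scores to binary labels loses the ranking, which is why score
output gives the analyst more room to prioritize.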

Fuzzy
• The word fuzzy refers to things which are not clear or are vague.
• Any event, process, or function that is changing continuously cannot always
be defined as either true or false, which means that we need to define such
activities in a Fuzzy manner.
What is Fuzzy Logic?
• Fuzzy Logic resembles the human decision-making methodology. It deals
with vague and imprecise information.
• It simplifies real-world problems and is based on degrees of truth
rather than the usual true/false (1/0) values of Boolean logic.
• In fuzzy systems, values are indicated by a number in the range from 0
to 1. 1.0 represents absolute truth and 0.0 represents absolute falseness.
The number which indicates the value in fuzzy systems is called the truth
value.
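• Truth values in [0, 1] can be combined much like Boolean values. The
sketch below uses the standard min/max/complement operators commonly
attributed to Zadeh; the variable names and truth values are hypothetical.

```python
def fuzzy_and(a, b):
    """Fuzzy conjunction: the weaker of the two truth values."""
    return min(a, b)

def fuzzy_or(a, b):
    """Fuzzy disjunction: the stronger of the two truth values."""
    return max(a, b)

def fuzzy_not(a):
    """Fuzzy negation: the complement with respect to 1."""
    return 1.0 - a

# Hypothetical truth values for two vague statements
traffic_is_high = 0.8
source_is_unusual = 0.3

print(fuzzy_and(traffic_is_high, source_is_unusual))
print(fuzzy_or(traffic_is_high, source_is_unusual))
print(round(fuzzy_not(traffic_is_high), 2))
```

• With truth values restricted to exactly 0.0 and 1.0, these operators
reduce to ordinary Boolean AND, OR and NOT.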

• Fuzzy Logic was introduced in 1965 by Lotfi A. Zadeh in his research
paper “Fuzzy Sets”. He is considered the father of Fuzzy Logic.
• Fuzzy logic and classical set theory:
• A set is an unordered collection of different elements. It can be
written explicitly by listing its elements using the set bracket. If the
order of the elements is changed or any element of a set is repeated,
it does not make any changes in the set.
• Example
• A set of all positive integers.
• A set of all the planets in the solar system.
• A set of all the states in India.
• A set of all the lowercase letters of the alphabet.

Mathematical Representation of a Set
Roster or Tabular Form:
• In this form, a set is represented by listing all the elements comprising it.
The elements are enclosed within braces and separated by commas.
Following are the examples of set in Roster or Tabular Form −
• Set of vowels in English alphabet, A = {a,e,i,o,u}
• Set of odd numbers less than 10, B = {1,3,5,7,9}
Set Builder Notation
• In this form, the set is defined by specifying a property that elements of the
set have in common. The set is described as A = {x:p(x)}
• Example 1 − The set {a,e,i,o,u} is written as
• A = {x:x is a vowel in English alphabet}
• Example 2 − The set {1,3,5,7,9} is written as
• B = {x:x is an integer, 1 ≤ x < 10 and (x%2) ≠ 0}
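• The two notations map directly onto Python sets: roster form is a set
literal and set builder form is a set comprehension. This is only an
illustration of the notation; the variable names are chosen for the example.

```python
# Roster form: list the elements explicitly
A_roster = {"a", "e", "i", "o", "u"}
B_roster = {1, 3, 5, 7, 9}

# Set builder form: describe the property p(x) the elements share
A_builder = {x for x in "abcdefghijklmnopqrstuvwxyz" if x in "aeiou"}
B_builder = {x for x in range(1, 10) if x % 2 != 0}

print(A_roster == A_builder)
print(B_roster == B_builder)

# Order and repetition do not change a set
print({1, 3, 3, 5} == {5, 3, 1})
```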

Fuzzy Logic - Membership Function
• Fuzzy logic is not logic that is fuzzy but logic that is used to describe fuzziness.
• This fuzziness is best characterized by its membership function.
• In other words, membership function represents the degree of truth in fuzzy logic.

• Fuzzy membership functions of a variable represent the degrees of
similarity of its different values to imprecisely defined properties.
• The core of a membership function (full membership) for a fuzzy set
A, core(A), is defined as those elements n of the universe for which
μA(n) = 1.
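• A minimal sketch of a membership function and its core, assuming a
trapezoidal shape. The shape, its parameters (2, 4, 6, 8) and the integer
universe 0..10 are assumptions for illustration.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 outside (a, d), 1 on [b, c], linear between.

    Assumes a < b <= c < d.
    """
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    return (d - x) / (d - c)       # falling edge

universe = range(0, 11)
mu_A = {n: trapezoid(n, 2, 4, 6, 8) for n in universe}

# core(A): elements of the universe with full membership, mu_A(n) = 1
core_A = [n for n in universe if mu_A[n] == 1.0]

print(mu_A)
print(core_A)
```

• Elements with a membership strictly between 0 and 1 (here 3 and 7)
belong to the set only partially and are therefore not part of the core.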

Fuzzy Logic in Anomaly Detection
• The application of fuzzy logic for computer security was first proposed by Hosmer in 1993.
• Later, Dickerson et al. proposed a Fuzzy Intrusion Recognition Engine (FIRE) for detecting malicious intrusion activities.
• An anomaly-based Intrusion Detection System (IDS) can be built using both fuzzy logic and data mining techniques.
• The fuzzy logic part of the system is mainly responsible for both handling the large number of input parameters and dealing
with the inexactness of the input data.
• Three fuzzy characteristics are generally used:
- COUNT
- UNIQUENESS
- VARIANCE
• The implemented fuzzy inference engine uses five fuzzy sets for each data element:
- LOW
- MEDIUM-LOW
- MEDIUM
- MEDIUM-HIGH
- HIGH
• Appropriate fuzzy rules are required to detect intrusions.
• The choice of fuzzy sets is very important for the fuzzy inference engine; in some cases a genetic approach can be used
to select the best combination.
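• The five fuzzy sets can be sketched as overlapping triangular membership
functions over a data element normalized to [0, 1]. The triangular shape,
the evenly spaced peaks, the width 0.25 and the example rule are all
assumptions for illustration; they are not taken from FIRE itself.

```python
# Assumed peak position of each fuzzy set on a normalized [0, 1] scale
FUZZY_SETS = {
    "LOW": 0.0,
    "MEDIUM-LOW": 0.25,
    "MEDIUM": 0.5,
    "MEDIUM-HIGH": 0.75,
    "HIGH": 1.0,
}
WIDTH = 0.25  # assumed half-width of each triangle

def fuzzify(x):
    """Map a normalized value x to its membership degree in each fuzzy set."""
    return {
        name: max(0.0, 1.0 - abs(x - peak) / WIDTH)
        for name, peak in FUZZY_SETS.items()
    }

def best_label(x):
    """Return the fuzzy set in which x has the highest membership."""
    degrees = fuzzify(x)
    return max(degrees, key=degrees.get)

# Hypothetical normalized measurements for one time window
count, uniqueness = 0.9, 0.8

# Hypothetical rule: IF COUNT is HIGH AND UNIQUENESS is HIGH THEN alert
alert_strength = min(fuzzify(count)["HIGH"], fuzzify(uniqueness)["HIGH"])

print(fuzzify(0.6))
print(best_label(count), round(alert_strength, 2))
```

• Because neighboring sets overlap, a value such as 0.6 is partly MEDIUM
and partly MEDIUM-HIGH, which is how the engine tolerates inexact input.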

• Bayes Theory
- Naive Bayes Classifier
• Bayes Theory in Anomaly Detection
- A pseudo-Bayes estimator based on the Naive Bayes probabilistic model can be
used to enhance an anomaly detection system’s ability to detect new attacks
while keeping the false alarm rate as low as possible.
- Audit Data Analysis and Mining (ADAM) is a method that can be used for
anomaly detection.
- ADAM first applies association rule mining techniques to detect abnormal
events in network traffic data, and then classifies the abnormal events into
normal instances and abnormal instances.
• However, a major limitation of ADAM is that the classifier is limited by
training data, i.e. it can recognize only the normal instances and the attacks
that appeared in the training data.
• As a result, a Naive Bayes classifier is constructed to classify the instances
into normal instances, known attacks and new attacks.
• The assumptions behind the model can sometimes mislead detection.
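• A minimal sketch of a Naive Bayes classifier over categorical features
with add-one (Laplace) smoothing. It separates only normal instances from
known attacks and does not implement the pseudo-Bayes estimator or ADAM;
the features (protocol, connection flag) and records are hypothetical.

```python
from collections import Counter, defaultdict
import math

def train_nb(records):
    """Collect the counts Naive Bayes needs from (features, label) pairs."""
    class_counts = Counter(label for _, label in records)
    feature_counts = defaultdict(Counter)  # (label, position) -> value counts
    values = defaultdict(set)              # position -> distinct values seen
    for features, label in records:
        for i, v in enumerate(features):
            feature_counts[(label, i)][v] += 1
            values[i].add(v)
    return class_counts, feature_counts, values

def predict_nb(model, features):
    """Return the label with the highest posterior (computed in log space)."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    best_label, best_logp = None, -math.inf
    for label, n in class_counts.items():
        logp = math.log(n / total)  # class prior
        for i, v in enumerate(features):
            # Add-one smoothing so unseen values do not zero out the product
            num = feature_counts[(label, i)][v] + 1
            den = n + len(values[i])
            logp += math.log(num / den)  # features treated as independent
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label

# Hypothetical training records: (protocol, connection flag)
records = [
    (("tcp", "SF"), "normal"),
    (("tcp", "SF"), "normal"),
    (("udp", "SF"), "normal"),
    (("tcp", "S0"), "known_attack"),
    (("tcp", "S0"), "known_attack"),
    (("icmp", "S0"), "known_attack"),
]

model = train_nb(records)
print(predict_nb(model, ("tcp", "S0")))
print(predict_nb(model, ("udp", "SF")))
```

• The per-feature probabilities are simply multiplied, which is the
independence assumption of Naive Bayes; when features are strongly
correlated this can distort the result.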

Artificial Neural Networks
• Processing Elements
• Connections
• Network Architectures
Feedforward networks
Recurrent networks
• Learning Process
Supervised learning
Unsupervised learning
• Artificial Neural Networks in Anomaly Detection
• Support Vector Machine (SVM)
Support Vector Machine in Anomaly Detection
• Evolutionary Computation
• Evolutionary Computation in Anomaly Detection
• Association Rules

The Apriori Algorithm
• Association Rules in Anomaly Detection
• Clustering
Taxonomy of Clustering Algorithms
-Hierarchical clustering
-Non-hierarchical clustering
-Model based clustering
K-Means Clustering
• Clustering in Anomaly Detection
• Signal Processing Techniques Based Models
