Anomaly Detection: A Survey
by Varun Chandola, Arindam Banerjee, Vipin Kumar
Presenter: Jaehyeong Ahn
Department of Applied Statistics, Konkuk University
jayahn0104@gmail.com
1 / 48
Table of Contents
Introduction
• Rapid Introduction to Anomaly Detection Problem
• Different Aspects of Anomaly Detection Problem
Anomaly Detection Techniques
• Classification Based Anomaly Detection
• Nearest Neighbor Based Anomaly Detection
• Clustering Based Anomaly Detection
• Statistical Anomaly Detection
• Spectral Anomaly Detection
Conclusion
2 / 48
Introduction
3 / 48
Rapid Introduction to Anomaly Detection Problem
What are Anomalies?
• Anomalies are patterns in data that do not conform to a well
defined notion of normal behavior
Figure: Simple illustration of anomalies in a two-dimensional data set
• N, N: Normal regions, o, o, O: Anomalies
4 / 48
Rapid Introduction to Anomaly Detection Problem
What are they called?
• Anomalies, Outliers, Discordant observations, Exceptions,
Aberrations, Surprises, Peculiarities, or Contaminants in different
application domains
What is an Anomaly Detection?
• Anomaly Detection refers to the problem of finding patterns in data
that do not conform to expected behavior
5 / 48
Rapid Introduction to Anomaly Detection Problem
Various applications of Anomaly Detection
• fraud detection for credit card, insurance, or health care
• intrusion detection for cyber-security
• fault detection in safety critical systems
• military surveillance for enemy activities
Why is detecting anomalies important?
• Anomaly detection is important because, despite their small
numbers, anomalies can cause significant and critical issues
• For example, credit card fraud, cyber-intrusion, terrorist activity, or
system breakdowns
6 / 48
Different Aspects of An Anomaly Detection
Problem
The anomaly detection problem can be characterized as a specific
formulation by considering several different factors:
• Nature of Input Data
• Type of Anomaly
• Availability of Data Labels
• Output Type of Anomaly Detection Technique
7 / 48
Different Aspects of An Anomaly Detection
Problem
Nature of Input Data
Input data can be categorized based on the nature of attribute type and
the relationship among the data instances
• Attribute type
• Univariate vs Multivariate
• Categorical vs Continuous
• Same type vs Mixture of different types
• Relationship among data instances
• No relationship vs Related (e.g. sequence data, spatial data,
graph data)
8 / 48
Different Aspects of An Anomaly Detection
Problem
Type of Anomaly
An important aspect of an anomaly detection technique is the nature of
the desired anomaly.
Anomalies can be classified into three categories
• Point Anomalies
• Contextual Anomalies
• Collective Anomalies
9 / 48
Different Aspects of An Anomaly Detection
Problem
Point Anomalies
• If an individual data instance can be considered as anomalous with
respect to the rest of the data, then the instance is termed a point
anomaly
• This is the simplest type of anomaly and is the focus of majority of
research on anomaly detection
10 / 48
Different Aspects of An Anomaly Detection
Problem
Contextual Anomalies
• If a data instance is anomalous in a specific context, but not
otherwise, then it is termed a contextual anomaly
• That is, a data instance may be a contextual anomaly in a given
context, while an identical data instance could be considered normal
in a different context
• Most commonly explored in time-series data and spatial data
11 / 48
Different Aspects of An Anomaly Detection
Problem
Collective Anomalies
• If a collection of related data instances is anomalous with
respect to the entire data set, it is termed a collective anomaly
• Thus the individual instances in a collective anomaly may not be
anomalies by themselves, but their occurrence together as a
collection is anomalous
12 / 48
Different Aspects of An Anomaly Detection
Problem
Data Labels
• Obtaining accurate labeled data is often prohibitively
expensive
• Furthermore, obtaining a labeled set of anomalous instances that
covers all possible types of anomalous behavior is more difficult than
getting labels for normal behavior (because anomalous behavior
does not follow a standardized pattern)
• Based on the extent to which labels are available, anomaly
detection techniques can operate in one of three modes:
• Supervised Anomaly Detection
• Semi-Supervised Anomaly Detection
• Unsupervised Anomaly Detection
13 / 48
Different Aspects of An Anomaly Detection
Problem
Supervised Anomaly Detection
• Require full accurate class labels
• Extremely imbalanced data set
• Similar to solving an imbalanced classification problem
Semi-Supervised Anomaly Detection
• Require labels only for the normal instances
• More widely applicable than supervised techniques
• Typical approach: to build a model for the normal behavior, and use the
model to identify anomalies in the test data
Unsupervised Anomaly Detection
• Require no label information, and are thus the most widely applicable
• Rely on the implicit assumption that "anomalies are few and different
from the normal points" (if this assumption does not hold, such
techniques suffer from a high false alarm rate)
14 / 48
Different Aspects of An Anomaly Detection
Problem
Output of Anomaly Detection
The outputs produced by anomaly detection techniques are one of the
following two types:
Scores
• Scoring techniques assign an anomaly score to each instance in the
test data depending on the degree to which that instance is considered
an anomaly
• Thus the outputs can be represented in a ranked form
Labels
• These techniques assign a label (normal or anomalous) to each test
instance
15 / 48
Figure: General approach for solving Anomaly Detection problem
16 / 48
Anomaly Detection Techniques
17 / 48
Classification Based
General Assumption
A classifier that can distinguish between normal and anomalous classes
can be learned in the given feature space
Basic Idea
It works like a classification problem in supervised learning, except
that training typically uses only normal class instances
• Training Phase
• Learns a classifier using the available labeled training data
• Testing Phase
• Classifies a test instance as normal or anomalous, using the
classifier
18 / 48
Classification Based
Two broad categories based on the availability of labels
• One-Class Classification based anomaly detection
• Assumption: all training instances have only one class label
• Learns a discriminative boundary around the normal instances
using a one-class classification algorithm
• Multi-Class Classification based anomaly detection
• Assumption: the training data contains labeled instances
belonging to multiple normal classes
• Learns discriminative boundaries around the normal class
instances
Both techniques identify a test instance as anomalous if it is not
classified as normal by the classifier
19 / 48
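To make the one-class idea concrete, here is a minimal sketch (not from the slides): it learns a crude discriminative boundary around the normal training instances, namely a hypersphere centered at their centroid whose radius covers most of the training data. A real implementation would use e.g. a one-class SVM; the function names and data here are purely illustrative.

```python
import math

def fit_hypersphere(train, quantile=0.95):
    """Training phase: learn a boundary around the normal points, i.e. a
    sphere at the centroid whose radius covers `quantile` of the data."""
    dim = len(train[0])
    center = [sum(p[i] for p in train) / len(train) for i in range(dim)]
    dists = sorted(math.dist(p, center) for p in train)
    radius = dists[int(quantile * (len(dists) - 1))]
    return center, radius

def is_anomalous(x, center, radius):
    # Testing phase: a test instance outside the boundary is anomalous
    return math.dist(x, center) > radius

normal = [(i * 0.1, i * 0.1) for i in range(20)]  # normal class only
center, radius = fit_hypersphere(normal)
print(is_anomalous((1.0, 1.0), center, radius))   # inside the cloud
print(is_anomalous((9.0, 9.0), center, radius))   # far away
```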
Classification Based
Two broad categories based on the availability of labels
(a) One-Class Classification (b) Multi-Class Classification
20 / 48
Classification Based
Various Classification-Based Anomaly Detection Techniques
• Neural Networks-Based
• Bayesian Networks-Based
• Support Vector Machines-Based
• · · ·
Computational Complexity
• It mainly depends on the classification algorithm being used
• The testing phase is usually very fast since the testing phase
uses a learned model for classification
21 / 48
Classification Based
Advantages and Disadvantages
(+) It can make use of powerful algorithms that can distinguish between
instances belonging to different classes
(+) The testing phase is fast, since each test instance needs to be
compared against the precomputed model
(–) Require the availability of accurate labels which is often not possible
(–) Return binary output, normal or anomalous, which can be a
disadvantage when a meaningful anomaly score (or ranking) is
desired
(some techniques obtain a probabilistic prediction score to overcome
this problem)
22 / 48
Nearest Neighbor Based
General Assumption
Normal data instances occur in dense neighborhoods, while anomalies
occur far from their closest neighbors
Basic Idea
• Lazy learning algorithm (Training stage does just storing the
training data)
• Training Phase
• Store training data
• Testing Phase
• Compute specified distance between test instance and all
training instances
• For a test instance, if computed distance is high, then it is identified
as anomalous
23 / 48
Nearest Neighbor Based
Two Broad Categories of Nearest Neighbor Techniques
• Using Distance to kth Nearest Neighbor
• Use the distance of a data instance to its kth nearest
neighbor as the anomaly score
• Using Local Density of the test instance
• Use the inverse of the local density of each data instance to
compute its anomaly score
24 / 48
Nearest Neighbor Based
Using Distance to kth Nearest Neighbor
• The anomaly score of a data instance is defined as its distance to its
kth
nearest neighbor in a given data set
(a) Normal point (b) Anomaly point
• Basic Extensions
• Modifies the definition to obtain anomaly score (e.g.
average,max, · · · )
• Use different distance measure to handle complex data type
• Improve the computational efficiency
25 / 48
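The kth-nearest-neighbor distance score can be sketched in a few lines of Python (an illustration, not from the slides; Euclidean distance and a brute-force search are assumed):

```python
import math

def knn_distance_score(x, data, k=3):
    """Anomaly score of x = distance to its k-th nearest neighbor."""
    dists = sorted(math.dist(x, p) for p in data if p != x)
    return dists[k - 1]

data = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
scores = {p: knn_distance_score(p, data) for p in data}
# The isolated point (10, 10) receives by far the largest score,
# so ranking by score surfaces it first
print(max(scores, key=scores.get))
```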
Nearest Neighbor Based
Density-Based
• An instance that lies in a neighborhood with low density is declared
to be anomalous while an instance that lies in a dense neighborhood
is declared to be normal
(a) Normal point (b) Anomaly point
• The density of an instance is usually estimated as k divided by the
radius of the hypersphere, centered at the given data instance, that
contains k other instances
26 / 48
Nearest Neighbor Based
Problem of Basic Nearest Neighbor Method
• If the data has regions of varying densities, the basic nearest
neighbor method fails to detect anomalies
Figure: Problem of Basic Nearest Neighbor method, retrieved from [2]
To handle this issue, Relative Density method has been proposed
27 / 48
Nearest Neighbor Based
Using Relative Density
Figure: Local Outlier Factor
• Basic Extensions
• Estimate the local density in a different way
• Compute more complex data types
• Improve the computational efficiency
28 / 48
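A simplified, LOF-style relative-density score can be sketched as follows (illustrative only; the real LOF of Breunig et al. [2] uses reachability distances, which are omitted here): the score compares a point's k-NN distance to the average k-NN distance of its neighbors, so it stays near 1 in regions of uniform density regardless of how dense they are.

```python
import math

def kth_dist(x, data, k):
    return sorted(math.dist(x, p) for p in data if p != x)[k - 1]

def relative_density_score(x, data, k=3):
    """Ratio of x's k-NN distance to the average k-NN distance of its
    k nearest neighbors; scores well above 1 mean x lies in a sparser
    region than its neighborhood."""
    neighbors = sorted((p for p in data if p != x),
                       key=lambda p: math.dist(x, p))[:k]
    avg_neighbor = sum(kth_dist(p, data, k) for p in neighbors) / k
    return kth_dist(x, data, k) / avg_neighbor

dense = [(i * 0.1, 0) for i in range(10)]        # tightly packed cluster
sparse = [(5 + i, 5) for i in range(5)]          # loosely packed cluster
data = dense + sparse + [(2.5, 2.5)]             # one isolated point
print(relative_density_score((2.5, 2.5), data))  # noticeably > 1
print(relative_density_score((0.5, 0), data))    # close to 1
```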
Nearest Neighbor Based
Advantages and Disadvantages
(+) Unsupervised in nature
(+) Adapting nearest neighbor-based techniques to a different data type
is straightforward, and primarily requires defining an appropriate
distance measure for the given data
(–) If the general assumption does not conform to the given data set,
the detector fails to catch anomalies
(–) The computational complexity of the testing phase is a challenge,
O(N²), since it requires computing the distance between each test
instance and all training data points
(–) Performance greatly relies on a distance measure
29 / 48
Clustering Based
• Clustering is used to group similar data instances into clusters
• Clustering-based anomaly detection techniques can be
grouped into three categories based on their assumptions
• Normal data instances belong to a cluster in the data, while
anomalies do not belong to any cluster
• Normal data instances lie close to their closest cluster centroid,
while anomalies are far away from their closest cluster centroid
• Normal data instances belong to large and dense clusters,
while anomalies either belong to small or sparse clusters
30 / 48
Clustering Based
First Category
• Assumption
• Normal data instances belong to a cluster in the data, while
anomalies do not belong to any cluster
• Several clustering algorithms that do not force every data instance
to belong to a cluster can be used (e.g. DBSCAN, ROCK, SNN,
· · · )
• Disadvantage
• They are not optimized to find anomalies, since the main aim
of the underlying clustering algorithm is to find clusters
31 / 48
Clustering Based
First Category
Figure: Clustering-based anomaly detection of first category, retrieved from [3]
32 / 48
Clustering Based
Second Category
• Assumption
• Normal data instances lie close to their closest cluster centroid,
while anomalies are far away from their closest cluster centroid
• These techniques operate in two steps
1 The data is clustered using a clustering algorithm
2 For each data instance, its distance to its closest cluster
centroid is calculated as its anomaly score
• Various clustering algorithms can be used (e.g. Self-Organizing
Maps (SOM), K-means Clustering, · · · )
• Disadvantage
• If the anomalies in the data form clusters by themselves, these
techniques will not be able to detect such anomalies
33 / 48
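The two-step scheme of the second category can be sketched as follows (an illustration, not from the slides): cluster with a toy k-means, then score each instance by its distance to the closest centroid. The data and parameters are made up for the example.

```python
import math, random

def kmeans(data, k, iters=20, seed=0):
    """Step 1: a minimal k-means returning k cluster centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(data, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in data:
            j = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[j].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = tuple(sum(c) / len(members)
                                     for c in zip(*members))
    return centroids

def centroid_distance_score(x, centroids):
    # Step 2: anomaly score = distance to the closest cluster centroid
    return min(math.dist(x, c) for c in centroids)

data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (5, 5)]
centroids = kmeans(data, k=2)
# (5, 5) sits between the two clusters, so it scores far higher
# than a point inside a cluster
for p in [(0.5, 0.5), (5, 5)]:
    print(p, centroid_distance_score(p, centroids))
```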
Clustering Based
Second Category
Figure: Clustering-based anomaly detection of second category, retrieved from
[4] 34 / 48
Clustering Based
Third Category
• Assumption
• Normal data instances belong to large and dense clusters,
while anomalies either belong to small or sparse clusters
• Techniques based on this assumption declare instances belonging to
clusters whose size and/or density is below a certain threshold, as
anomalous
• Several variations have been proposed for this category
• For example, the Cluster-Based Local Outlier Factor (CBLOF)
captures the size of the cluster to which the data instance
belongs, as well as the distance of the data instance to its
cluster centroid
35 / 48
Clustering Based
Advantages and Disadvantages
(+) Can operate in an unsupervised mode
(+) Can be adapted to other complex data types by simply plugging in a
clustering algorithm that can handle the particular data type
(+) The testing phase is fast since the number of clusters against which
every test instance needs to be compared is a small constant
(–) Performance is highly dependent on the effectiveness of clustering
algorithms
(–) They are not optimized for anomaly detection
(–) The computational complexity for clustering the data is often a
bottleneck
36 / 48
Statistical Anomaly Detection
General Assumption
Normal data instances occur in high probability regions of a stochastic
model, while anomalies occur in the low probability regions of the
stochastic model
Basic Idea
• Fit a statistical model to the given data (usually for normal
behavior) and then apply a statistical inference test to determine if
an unseen instance belongs to this model or not
Two broad categories of statistical anomaly detection
• Parametric Techniques
• which assume the knowledge of the underlying distribution and
estimate the parameters from the given data
• Nonparametric Techniques
• which make no a priori assumption about the underlying
distribution
37 / 48
Statistical Anomaly Detection
Parametric Techniques
• Assume that the normal data is generated by a parametric
distribution with parameters Θ and probability density function
f(x, Θ)
• The anomaly score of a test instance x is the inverse of the
probability density f(x, Θ), where Θ is estimated from the given
data
• Alternatively, a statistical hypothesis test may be used
• The null hypothesis H0: the data instance x has been
generated using the estimated distribution (with parameters Θ)
• If the statistical test rejects H0, x is declared to be an anomaly
• Since a statistical hypothesis test is associated with a test
statistic, the test statistic value can be used as an anomaly
score for x
38 / 48
Statistical Anomaly Detection
Parametric Techniques
• Basic Example: Gaussian Model-Based
• Assume that the data is generated from a Gaussian distribution
• The parameters are estimated using
Maximum Likelihood Estimation (MLE)
• The distance of a data instance to the estimated mean is the
anomaly score for that instance
• A certain threshold is applied to the anomaly scores to
determine the anomalies (e.g. µ±3σ, since this region contains
99.7% of the data)
39 / 48
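The Gaussian model-based procedure can be sketched in a few lines (illustrative, not from the slides): fit μ and σ on training data assumed to be normal, then flag test instances outside the μ ± 3σ band. Note this sketch uses the sample standard deviation rather than the MLE (which divides by n instead of n−1); for any realistic training size the difference is negligible.

```python
import statistics

def fit_gaussian(train):
    """Fit mu and sigma on (assumed normal) training data."""
    return statistics.mean(train), statistics.stdev(train)

def is_gaussian_anomaly(x, mu, sigma, n_sigma=3):
    # mu +/- 3*sigma covers ~99.7% of Gaussian data, so points
    # outside that band are declared anomalous
    return abs(x - mu) > n_sigma * sigma

train = [10.1, 9.8, 10.0, 10.2, 9.9, 10.1, 9.7, 10.3]
mu, sigma = fit_gaussian(train)
print(is_gaussian_anomaly(10.4, mu, sigma))  # within the normal band
print(is_gaussian_anomaly(25.0, mu, sigma))  # far outside it
```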
Statistical Anomaly Detection
Nonparametric Techniques
• The model structure is not defined a priori, but is instead
determined from given data
• Basic Example: Histogram-Based
1 Build a histogram based on the different values in the training
data
2 Check if a test instance falls in any one of the bins of the
histogram
• The size of the bins used when building the histogram is key
to anomaly detection
• Alternatively, the height of the bin that contains the instance
can be used to compute an anomaly score
40 / 48
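The two-step histogram approach can be sketched as follows (illustrative; fixed-width bins and a simple inverse-frequency score are assumptions of this sketch, not prescriptions of the survey):

```python
from collections import Counter

def build_histogram(train, bin_width):
    """Step 1: count training values per bin (bin = floor(x / width))."""
    return Counter(int(x // bin_width) for x in train)

def histogram_score(x, hist, bin_width, n):
    """Step 2: score = inverse of the relative height of x's bin;
    an instance falling in an empty bin gets the maximum score."""
    height = hist.get(int(x // bin_width), 0)
    return 1.0 if height == 0 else 1.0 - height / n

train = [1.1, 1.3, 1.2, 1.4, 2.1, 2.2, 2.3, 8.9]
hist = build_histogram(train, bin_width=1.0)
print(histogram_score(1.5, hist, 1.0, len(train)))  # dense bin, low score
print(histogram_score(6.0, hist, 1.0, len(train)))  # empty bin, max score
```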
Statistical Anomaly Detection
Advantages and Disadvantages
(+) If the underlying data distribution assumption holds true, they
provide a statistically justifiable solution for anomaly detection
(+) The anomaly score is associated with a confidence interval, which
can be used as additional information
(–) They rely on the assumption that the data is generated from a
particular distribution.
(–) Choosing the best statistic is often not a straightforward task
41 / 48
Spectral Anomaly Detection
General Assumption
Data can be embedded into a lower dimensional subspace in which
normal instances and anomalies appear significantly different
Basic Idea
• Spectral techniques try to find an approximation of the data using a
combination of attributes that capture the bulk of the variability in
the data
• Thus these techniques seek to determine the subspaces in which the
anomalous instances can be easily identified
• Basic Example: Principal Component Analysis (PCA)
• Projecting the data into lower dimensional space while
maximizing the variability
• Compute the distance between a data instance and the
subspace, generated by PCA as an anomaly score
42 / 48
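For two-dimensional data the PCA-based score can be computed in closed form, since the leading eigenvector of a 2×2 covariance matrix has angle ½·atan2(2b, a−c). The sketch below (illustrative, not from the slides) scores a point by its distance to the first principal axis, i.e. its reconstruction error after projection onto the 1-D subspace:

```python
import math

def pca_reconstruction_error(x, data):
    """Anomaly score for 2-D data: distance from x to the first
    principal axis (the 1-D subspace capturing most variance)."""
    n = len(data)
    mx = sum(p[0] for p in data) / n
    my = sum(p[1] for p in data) / n
    # Entries of the 2x2 covariance matrix [[a, b], [b, c]]
    a = sum((p[0] - mx) ** 2 for p in data) / n
    c = sum((p[1] - my) ** 2 for p in data) / n
    b = sum((p[0] - mx) * (p[1] - my) for p in data) / n
    # Closed-form leading eigenvector (unit direction of the axis)
    theta = 0.5 * math.atan2(2 * b, a - c)
    ux, uy = math.cos(theta), math.sin(theta)
    dx, dy = x[0] - mx, x[1] - my
    # Error = component of (x - mean) orthogonal to the principal axis
    return abs(dx * uy - dy * ux)

data = [(i, 2 * i + 0.1 * (-1) ** i) for i in range(10)]  # near y = 2x
print(pca_reconstruction_error((5, 10), data))  # on the trend: small
print(pca_reconstruction_error((5, 0), data))   # off the trend: large
```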
Spectral Anomaly Detection
Figure: Simple illustration of PCA-based anomaly detection, retrieved from [5]
43 / 48
Spectral Anomaly Detection
Advantages and Disadvantages
(+) These techniques automatically perform dimensionality reduction
and hence are suitable for handling high dimensional data
(+) Thus they can also be used as a preprocessing step before applying
any existing anomaly detection technique
(+) Can be used in unsupervised setting
(–) Useful only if the normal and anomalous instances are separable in
the lower dimensional embedding of the data
(–) High computational complexity
44 / 48
Conclusion
45 / 48
Conclusion
Relative Strengths and Weaknesses of Anomaly Detection
Techniques
• Each of the anomaly detection techniques discussed in the previous
sections have their unique strengths and weaknesses
• Thus it is important to know which anomaly detection technique is
best suited for a given problem
Concluding Remarks
• We have discussed how the anomaly detection problem can be
characterized and have provided an overview of the various techniques
• For each category, we have identified a unique assumption regarding
the notion of normal and anomalous data
• When applying a given technique to a particular domain, these
assumptions can be used as guidelines
46 / 48
Conclusion
The Contribution of this paper
• This survey is an attempt to provide a structured and broad
overview of extensive research on anomaly detection techniques
spanning multiple research areas and application domains
47 / 48
References
1 Chandola, Varun, Arindam Banerjee, and Vipin Kumar. ”Anomaly
detection: A survey.” ACM computing surveys (CSUR) 41, no. 3 (2009):
1-58.
2 Breunig, Markus M., Hans-Peter Kriegel, Raymond T. Ng, and Jörg
Sander. ”LOF: identifying density-based local outliers.” In Proceedings of
the 2000 ACM SIGMOD international conference on Management of
data, pp. 93-104. 2000.
3 Eik Loo, Chong, Mun Yong Ng, Christopher Leckie, and Marimuthu
Palaniswami. ”Intrusion detection for routing attacks in sensor networks.”
International Journal of Distributed Sensor Networks 2, no. 4 (2006):
313-332.
4 Anomaly Detection Using K means Clustering, Ashen Weerathunga,
https://ashenweerathunga.wordpress.com/2016/01/05/anomaly-
detection-using-k-means-clustering/
5 Anomaly detection using PCA reconstruction error,
https://stats.stackexchange.com/questions/259806/anomaly-detection-
using-pca-reconstruction-error
48 / 48

Decision Tree Decision Tree
Decision Tree
Konkuk University, Korea
 

More from Konkuk University, Korea (7)

(Slide)concentration effect
(Slide)concentration effect(Slide)concentration effect
(Slide)concentration effect
 
House price
House priceHouse price
House price
 
Titanic data analysis
Titanic data analysisTitanic data analysis
Titanic data analysis
 
A Tutorial of the EM-algorithm and Its Application to Outlier Detection
A Tutorial of the EM-algorithm and Its Application to Outlier DetectionA Tutorial of the EM-algorithm and Its Application to Outlier Detection
A Tutorial of the EM-algorithm and Its Application to Outlier Detection
 
Extended Isolation Forest
Extended Isolation Forest Extended Isolation Forest
Extended Isolation Forest
 
Random Forest
Random ForestRandom Forest
Random Forest
 
Decision Tree
Decision Tree Decision Tree
Decision Tree
 

Recently uploaded

一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
vcaxypu
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Subhajit Sahu
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
ArpitMalhotra16
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 
Investigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_CrimesInvestigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_Crimes
StarCompliance.io
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
ukgaet
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
nscud
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
ewymefz
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Boston Institute of Analytics
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
ewymefz
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
ewymefz
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
tapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive datatapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive data
theahmadsaood
 
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
Tiktokethiodaily
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
yhkoc
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMI
AlejandraGmez176757
 
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
nscud
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
Oppotus
 
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
correoyaya
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 

Recently uploaded (20)

一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
Investigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_CrimesInvestigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_Crimes
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
tapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive datatapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive data
 
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMI
 
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
 
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 

Anomaly Detection: A Survey

  • 1. Anomaly Detection: A Survey by Varun Chandola, Arindam Banerjee, Vipin Kumar Presenter: Jaehyeong Ahn Department of Applied Statistics, Konkuk University jayahn0104@gmail.com 1 / 48
  • 2. Table of Contents Introduction • Rapid Introduction to Anomaly Detection Problem • Different Aspects of Anomaly Detection Problem Anomaly Detection Techniques • Classification Based Anomaly Detection • Nearest Neighbor Based Anomaly Detection • Clustering Based Anomaly Detection • Statistical Anomaly Detection • Spectral Anomaly Detection Conclusion 2 / 48
  • 4. Rapid Introduction to Anomaly Detection Problem What are Anomalies? • Anomalies are patterns in data that do not conform to a well-defined notion of normal behavior Figure: Simple illustration of anomalies in a two-dimensional data set • N1, N2: Normal regions; o1, o2, O3: Anomalies 4 / 48
  • 5. Rapid Introduction to Anomaly Detection Problem What are they called? • Anomalies, Outliers, Discordant observations, Exceptions, Aberrations, Surprises, Peculiarities, or Contaminants in different application domains What is Anomaly Detection? • Anomaly Detection refers to the problem of finding patterns in data that do not conform to expected behavior 5 / 48
  • 6. Rapid Introduction to Anomaly Detection Problem Various applications of Anomaly Detection • fraud detection for credit cards, insurance, or health care • intrusion detection for cyber-security • fault detection in safety-critical systems • military surveillance for enemy activities Why is detecting anomalies important? • Despite their small numbers, anomalies can cause significant and critical issues • For example, credit card fraud, cyber-intrusion, terrorist activity, or a system breakdown 6 / 48
  • 7. Different Aspects of An Anomaly Detection Problem Anomaly Detection problem can be characterized as a specific formulation by categorizing several different factors: • Nature of Input Data • Type of Anomaly • Availability of Data Labels • Output Type of Anomaly Detection Technique 7 / 48
  • 8. Different Aspects of An Anomaly Detection Problem Nature of Input Data Input data can be categorized based on the nature of attribute type and the relationship among the data instances • Attribute type • Univariate vs Multivariate • Categorical vs Continuous • Same type vs Mixture of different types • Relationship among data instances • No relationship vs Related (e.g. sequence data, spatial data, graph data) 8 / 48
  • 9. Different Aspects of An Anomaly Detection Problem Type of Anomaly An important aspect of an anomaly detection technique is the nature of the desired anomaly. Anomalies can be classified into three categories • Point Anomalies • Contextual Anomalies • Collective Anomalies 9 / 48
  • 10. Different Aspects of An Anomaly Detection Problem Point Anomalies • If an individual data instance can be considered as anomalous with respect to the rest of the data, then the instance is termed a point anomaly • This is the simplest type of anomaly and is the focus of the majority of research on anomaly detection 10 / 48
  • 11. Different Aspects of An Anomaly Detection Problem Contextual Anomalies • If a data instance is anomalous in a specific context, but not otherwise, then it is termed a contextual anomaly • Thus a data instance may be a contextual anomaly in a given context, while an identical data instance could be considered normal in a different context • Most commonly explored in time-series data and spatial data 11 / 48
  • 12. Different Aspects of An Anomaly Detection Problem Collective Anomalies • If a collection of related data instances is anomalous with respect to the entire data set, it is termed a collective anomaly • Thus the individual instances in a collective anomaly may not be anomalies by themselves, but their occurrence together as a collection is anomalous 12 / 48
  • 13. Different Aspects of An Anomaly Detection Problem Data Labels • Obtaining labeled data that is accurate is often prohibitively expensive • Furthermore, getting a labeled set of anomalous data instances that covers all possible types of anomalous behavior is more difficult than getting labels for normal behavior (because anomalous behavior does not follow a standardized pattern) • Based on the extent to which labels are available, anomaly detection techniques can operate in one of three modes: • Supervised Anomaly Detection • Semi-Supervised Anomaly Detection • Unsupervised Anomaly Detection 13 / 48
  • 14. Different Aspects of An Anomaly Detection Problem Supervised Anomaly Detection • Requires full, accurate class labels • Extremely imbalanced data set • Similar to solving an imbalanced classification problem Semi-Supervised Anomaly Detection • Requires labels only for the normal instances • More widely applicable than supervised techniques • Typical approach: build a model for the normal behavior, and use the model to identify anomalies in the test data Unsupervised Anomaly Detection • Requires no label information, and is thus the most widely applicable • Makes the implicit assumption that anomalies are few and different from the normal points (if this assumption does not hold, such techniques suffer from a high false alarm rate) 14 / 48
  • 15. Different Aspects of An Anomaly Detection Problem Output of Anomaly Detection The outputs produced by anomaly detection techniques are one of the following two types: Scores • Scoring techniques assign an anomaly score to each instance in the test data depending on the degree to which that instance is considered an anomaly • Thus the outputs can be represented in a ranked form Labels • These techniques assign a label (normal or anomalous) to each test instance 15 / 48
  • 16. Figure: General approach for solving Anomaly Detection problem 16 / 48
  • 18. Classification Based General Assumption A classifier that can distinguish between normal and anomalous classes can be learned in the given feature space Basic Idea It works like a supervised classification problem, except that only normal class instances are available for training • Training Phase • Learns a classifier using the available labeled training data • Testing Phase • Classifies a test instance as normal or anomalous, using the classifier 18 / 48
  • 19. Classification Based Two broad categories based on the availability of labels • One-Class Classification based anomaly detection • Assumption: all training instances have only one class label • Learns a discriminative boundary around the normal instances using a one-class classification algorithm • Multi-Class Classification based anomaly detection • Assumption: the training data contains labeled instances belonging to multiple normal classes • Learns discriminative boundaries around the normal class instances Both techniques identify a test instance as anomalous if it is not classified as normal by the classifier 19 / 48
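A minimal sketch of the one-class idea above, assuming Euclidean data: the "discriminative boundary" here is simply a hypersphere around the training mean, with an illustrative 95% radius threshold (real one-class algorithms such as one-class SVMs learn far more flexible boundaries).

```python
import numpy as np

rng = np.random.default_rng(0)
# Training data: normal instances only (one-class setting)
X_train = rng.normal(loc=0.0, scale=1.0, size=(500, 2))

# "Learn" a boundary: a hypersphere around the training mean whose
# radius covers 95% of the normal training instances
center = X_train.mean(axis=0)
radii = np.linalg.norm(X_train - center, axis=1)
threshold = np.quantile(radii, 0.95)

def classify(x):
    """Return 'normal' if x falls inside the learned boundary, else 'anomalous'."""
    return "normal" if np.linalg.norm(x - center) <= threshold else "anomalous"

print(classify(np.array([0.1, -0.2])))   # near the bulk of the data -> normal
print(classify(np.array([6.0, 6.0])))    # far outside the boundary -> anomalous
```

As on the slide, the test instance is flagged anomalous exactly when it is not classified as normal.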
  • 20. Classification Based Two broad categories based on the availability of labels (a) One-Class Classification (b) Multi-Class Classification 20 / 48
  • 21. Classification Based Various Classification-Based Anomaly Detection Techniques • Neural Networks-Based • Bayesian Networks-Based • Support Vector Machines-Based • · · · Computational Complexity • It mainly depends on the classification algorithm being used • The testing phase is usually very fast since the testing phase uses a learned model for classification 21 / 48
  • 22. Classification Based Advantages and Disadvantages (+) It can make use of powerful algorithms that can distinguish between instances belonging to different classes (+) The testing phase is fast, since each test instance needs to be compared against the precomputed model (–) Require the availability of accurate labels which is often not possible (–) Return binary output, normal or anomalous, which can be a disadvantage when a meaningful anomaly score (or ranking) is desired (some techniques obtain a probabilistic prediction score to overcome this problem) 22 / 48
  • 23. Nearest Neighbor Based General Assumption Normal data instances occur in dense neighborhoods, while anomalies occur far from their closest neighbors Basic Idea • Lazy learning algorithm (the training stage simply stores the training data) • Training Phase • Store the training data • Testing Phase • Compute a specified distance between the test instance and all training instances • If the computed distance for a test instance is high, it is identified as anomalous 23 / 48
  • 24. Nearest Neighbor Based Two Broad Categories of Nearest Neighbor Techniques • Using Distance to kth Nearest Neighbor • Use the distance of a data instance to its kth nearest neighbor as the anomaly score • Using Local Density of the test instance • Use the inverse of the local density of each data instance as its anomaly score 24 / 48
  • 25. Nearest Neighbor Based Using Distance to kth Nearest Neighbor • The anomaly score of a data instance is defined as its distance to its kth nearest neighbor in a given data set (a) Normal point (b) Anomaly point • Basic Extensions • Modify the definition of the anomaly score (e.g. average, max, · · · ) • Use different distance measures to handle complex data types • Improve the computational efficiency 25 / 48
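The kth-nearest-neighbor score above can be sketched in a few lines. This is a brute-force O(N²) illustration on synthetic data; the planted anomaly and k = 5 are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))          # normal instances
X = np.vstack([X, [[8.0, 8.0]]])       # one planted anomaly (index 200)

def knn_score(X, k=5):
    """Anomaly score of each instance = distance to its k-th nearest neighbor."""
    # Full pairwise Euclidean distance matrix (brute force, O(N^2))
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    d_sorted = np.sort(d, axis=1)      # column 0 is the distance to itself (0)
    return d_sorted[:, k]              # k-th nearest neighbor, excluding self

scores = knn_score(X, k=5)
print(int(np.argmax(scores)))          # index of the planted anomaly: 200
```

The planted point gets by far the largest score because its 5th nearest neighbor lies in the distant normal cluster.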
  • 26. Nearest Neighbor Based Density-Based • An instance that lies in a neighborhood with low density is declared to be anomalous, while an instance that lies in a dense neighborhood is declared to be normal (a) Normal point (b) Anomaly point • The density of an instance is usually estimated as k divided by the radius of the hypersphere, centered at the given data instance, which contains k other instances 26 / 48
  • 27. Nearest Neighbor Based Problem of Basic Nearest Neighbor Method • If the data has regions of varying densities, the basic nearest neighbor method fails to detect anomalies Figure: Problem of Basic Nearest Neighbor method, retrieved from [2] To handle this issue, Relative Density method has been proposed 27 / 48
  • 28. Nearest Neighbor Based Using Relative Density Figure: Local Outlier Factor • Basic Extensions • Estimate the local density in a different way • Compute more complex data types • Improve the computational efficiency 28 / 48
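A simplified relative-density score in the spirit of LOF can be sketched as follows. This is not the full LOF definition (which uses reachability distances); the two clusters of different density and the planted edge point are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
dense = rng.normal(0.0, 0.1, size=(100, 2))    # tight, dense cluster
sparse = rng.normal(5.0, 1.0, size=(100, 2))   # loose, sparse cluster
outlier = np.array([[0.6, 0.6]])               # just off the dense cluster
X = np.vstack([dense, sparse, outlier])        # outlier has index 200

def relative_density_score(X, k=10):
    """Simplified LOF-style score: mean density of the k neighbors / own density.
    Scores well above 1 mark instances far less dense than their neighborhood."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    nn = np.argsort(d, axis=1)[:, 1:k + 1]        # k nearest neighbors (excl. self)
    knn_dist = np.take_along_axis(d, nn, axis=1)
    density = k / knn_dist.sum(axis=1)            # crude local density estimate
    return density[nn].mean(axis=1) / density     # density relative to neighbors

scores = relative_density_score(X)
print(int(np.argmax(scores)))                     # 200: the planted point
```

A plain kth-NN distance would rank many sparse-cluster points above the planted point; the relative score does not, which is exactly the varying-density problem the slide describes.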
  • 29. Nearest Neighbor Based Advantages and Disadvantages (+) Unsupervised in nature (+) Adapting nearest neighbor-based techniques to a different data type is straightforward, and primarily requires defining an appropriate distance measure for the given data (–) If the general assumption does not hold for the given data set, the detector fails to catch anomalies (–) The computational complexity of the testing phase is a challenge, O(N^2), since it computes the distance between each test instance and all training data points (–) Performance relies greatly on the distance measure 29 / 48
  • 30. Clustering Based • Clustering is used to group similar data instances into clusters • Clustering-Based anomaly detection techniques can be grouped into three categories based on its assumption • Normal data instances belong to a cluster in the data, while anomalies do not belong to any cluster • Normal data instances lie close to their closest cluster centroid, while anomalies are far away from their closest cluster centroid • Normal data instances belong to large and dense clusters, while anomalies either belong to small or sparse clusters 30 / 48
  • 31. Clustering Based First Category • Assumption • Normal data instances belong to a cluster in the data, while anomalies do not belong to any cluster • Several clustering algorithms that do not force every data instance to belong to a cluster can be used (e.g. DBSCAN, ROCK, SNN, · · · ) • Disadvantage • They are not optimized to find anomalies, since the main aim of the underlying clustering algorithm is to find clusters 31 / 48
  • 32. Clustering Based First Category Figure: Clustering-based anomaly detection of first category, retrieved from [3] 32 / 48
  • 33. Clustering Based Second Category • Assumption • Normal data instances lie close to their closest cluster centroid, while anomalies are far away from their closest cluster centroid • These techniques operate in two steps 1 The data is clustered using a clustering algorithm 2 For each data instance, its distance to its closest cluster centroid is calculated as its anomaly score • Various clustering algorithms can be used (e.g. Self-Organizing Maps (SOM), K-means Clustering, · · · ) • Disadvantage • If the anomalies in the data form clusters by themselves, these techniques will not be able to detect such anomalies 33 / 48
  • 34. Clustering Based Second Category Figure: Clustering-based anomaly detection of second category, retrieved from [4] 34 / 48
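The two-step recipe of the second category can be sketched with a minimal K-means. The farthest-point initialization, k = 2, and the planted mid-point between the clusters are illustrative choices, not part of the survey.

```python
import numpy as np

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0.0, 0.5, (100, 2)),
               rng.normal(6.0, 0.5, (100, 2)),
               [[3.0, 3.0]]])               # lies between the two clusters (index 200)

# Step 1: cluster the data (minimal Lloyd's algorithm, k = 2,
# initialized with the first point and the point farthest from it)
centroids = np.array([X[0], X[np.argmax(np.linalg.norm(X - X[0], axis=1))]])
for _ in range(20):
    labels = np.argmin(np.linalg.norm(X[:, None] - centroids[None], axis=-1), axis=1)
    centroids = np.array([X[labels == j].mean(axis=0) for j in range(2)])

# Step 2: anomaly score = distance to the closest cluster centroid
scores = np.min(np.linalg.norm(X[:, None] - centroids[None], axis=-1), axis=1)
print(int(np.argmax(scores)))               # index of the planted point: 200
```

The planted point is far from both centroids, so it receives the largest score even though it was assigned to one of the clusters.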
  • 35. Clustering Based Third Category • Assumption • Normal data instances belong to large and dense clusters, while anomalies belong to either small or sparse clusters • Techniques based on this assumption declare instances belonging to clusters whose size and/or density is below a certain threshold as anomalous • Several variations have been proposed for this category • For example, Cluster-Based Local Outlier Factor (CBLOF) captures the size of the cluster to which the data instance belongs, as well as the distance of the data instance to its cluster centroid 35 / 48
  • 36. Clustering Based Advantages and Disadvantages (+) Can operate in an unsupervised mode (+) Can be adapted to other complex data types by simply plugging in a clustering algorithm that can handle the particular data type (+) The testing phase is fast since the number of clusters against which every test instance needs to be compared is a small constant (–) Performance is highly dependent on the effectiveness of clustering algorithms (–) They are not optimized for anomaly detection (–) The computational complexity for clustering the data is often a bottleneck 36 / 48
  • 37. Statistical Anomaly Detection General Assumption Normal data instances occur in high probability regions of a stochastic model, while anomalies occur in the low probability regions of the stochastic model Basic Idea • Fit a statistical model to the given data (usually for normal behavior) and then apply a statistical inference test to determine if an unseen instance belongs to this model or not Two broad categories of statistical anomaly detection • Parametric Techniques • which assume knowledge of the underlying distribution and estimate its parameters from the given data • Nonparametric Techniques • which do not assume knowledge of the underlying distribution 37 / 48
  • 38. Statistical Anomaly Detection Parametric Techniques • Assume that the normal data is generated by a parametric distribution with parameters Θ and probability density function f(x, Θ) • The anomaly score of a test instance x is the inverse of the probability density function f(x, Θ), where Θ is estimated from the given data • Alternatively, a statistical hypothesis test may be used • The null hypothesis H: the data instance x has been generated by the estimated distribution (with parameters Θ) • If the statistical test rejects H, x is declared to be an anomaly • Since a statistical hypothesis test is associated with a test statistic, the test statistic value can be used as an anomaly score for x 38 / 48
  • 39. Statistical Anomaly Detection Parametric Techniques • Basic Example: Gaussian Model-Based • Assume that the data is generated from a Gaussian distribution • The parameters are estimated using Maximum Likelihood Estimates (MLE) • The distance of a data instance to the estimated mean is the anomaly score for that instance • A certain threshold is applied to the anomaly scores to determine the anomalies (e.g. µ ± 3σ, since this region contains 99.7% of the data) 39 / 48
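The Gaussian recipe above in a few lines; the sample data and the test values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(loc=10.0, scale=2.0, size=1000)   # "normal" training data

# MLE estimates of the Gaussian parameters
mu, sigma = x.mean(), x.std()

def is_anomaly(v):
    """Flag values outside mu +/- 3*sigma (covers ~99.7% of Gaussian data)."""
    return abs(v - mu) > 3 * sigma

print(is_anomaly(10.5))   # well inside the bulk -> False
print(is_anomaly(25.0))   # roughly 7.5 sigma away -> True
```

The distance |v − µ| is the anomaly score; the 3σ rule is just one choice of threshold on it.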
  • 40. Statistical Anomaly Detection Nonparametric Techniques • The model structure is not defined a priori, but is instead determined from the given data • Basic Example: Histogram-Based 1 Build a histogram based on the different values in the training data 2 Check if a test instance falls in any one of the bins of the histogram • The size of the bin used when building the histogram is key for anomaly detection • Alternatively, the (inverse) height of the bin that contains the instance can be used as an anomaly score 40 / 48
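A minimal histogram-based scorer for continuous univariate data; the bin count of 50 and the inverse-height score are illustrative choices (the bin width is the key knob, as the slide notes).

```python
import numpy as np

rng = np.random.default_rng(5)
train = rng.normal(0.0, 1.0, size=5000)

# Step 1: build a histogram of the training values
counts, edges = np.histogram(train, bins=50)

def hist_score(v):
    """Anomaly score = inverse of the height of the bin containing v.
    Values outside all bins (or in empty bins) get an infinite score."""
    i = np.searchsorted(edges, v, side="right") - 1
    if i < 0 or i >= len(counts) or counts[i] == 0:
        return float("inf")
    return 1.0 / counts[i]

print(hist_score(0.0) < hist_score(3.4))   # center bin is far taller: True
print(hist_score(100.0))                   # falls outside the histogram: inf
```

Step 2 of the slide is the lookup: a test value in a tall bin scores low (normal), one in a short or empty bin scores high (anomalous).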
  • 41. Statistical Anomaly Detection Advantages and Disadvantages (+) If the underlying data distribution assumption holds true, they provide a statistically justifiable solution for anomaly detection (+) The anomaly score is associated with a confidence interval, which can be used as additional information (–) They rely on the assumption that the data is generated from a particular distribution. (–) Choosing the best statistic is often not a straightforward task 41 / 48
  • 42. Spectral Anomaly Detection General Assumption Data can be embedded into a lower dimensional subspace in which normal instances and anomalies appear significantly different Basic Idea • Spectral techniques try to find an approximation of the data using a combination of attributes that capture the bulk of the variability in the data • Thus these techniques seek to determine the subspaces in which the anomalous instances can be easily identified • Basic Example: Principal Component Analysis (PCA) • Project the data into a lower-dimensional space while maximizing the variability • Compute the distance between a data instance and the subspace generated by PCA as its anomaly score 42 / 48
  • 43. Spectral Anomaly Detection Figure: Simple illustration of PCA-based anomaly detection, retrieved from [5] 43 / 48
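A sketch of the PCA-based score: SVD on centered data, one retained component, and reconstruction error as the distance to the subspace. The synthetic correlated data and the planted off-axis point are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)
# Normal data varies mostly along one direction in 2-D
t = rng.normal(size=300)
X = np.column_stack([t, t + 0.1 * rng.normal(size=300)])
X = np.vstack([X, [[2.0, -2.0]]])        # off the principal direction (index 300)

# PCA via SVD on the centered data; keep the first principal component
mu = X.mean(axis=0)
Xc = X - mu
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
P = Vt[:1]                               # 1 x 2 projection onto the subspace

# Anomaly score = distance between an instance and its projection
# onto the principal subspace (reconstruction error)
recon = Xc @ P.T @ P + mu
scores = np.linalg.norm(X - recon, axis=1)
print(int(np.argmax(scores)))            # index of the planted anomaly: 300
```

The planted point has a normal-looking magnitude but lies far from the subspace that captures the bulk of the variability, so only the reconstruction error exposes it.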
  • 44. Spectral Anomaly Detection Advantages and Disadvantages (+) These techniques automatically perform dimensionality reduction and hence are suitable for handling high dimensional data (+) Thus they can also be used as a preprocessing step before applying any existing anomaly detection technique (+) Can be used in unsupervised setting (–) Useful only if the normal and anomalous instances are separable in the lower dimensional embedding of the data (–) High computational complexity 44 / 48
  • 46. Conclusion Relative Strengths and Weaknesses of Anomaly Detection Techniques • Each of the anomaly detection techniques discussed in the previous sections has its unique strengths and weaknesses • Thus it is important to know which anomaly detection technique is best suited for a given problem Concluding Remarks • We have discussed the ways in which the anomaly detection problem can be characterized and have provided an overview of the various techniques • For each category, we have identified a unique assumption regarding the notion of normal and anomalous data • When applying a given technique to a particular domain, these assumptions can be used as guidelines for application 46 / 48
  • 47. Conclusion The Contribution of this paper • This survey is an attempt to provide a structured and broad overview of extensive research on anomaly detection techniques spanning multiple research areas and application domains 47 / 48
  • 48. References 1 Chandola, Varun, Arindam Banerjee, and Vipin Kumar. "Anomaly detection: A survey." ACM Computing Surveys (CSUR) 41, no. 3 (2009): 1-58. 2 Breunig, Markus M., Hans-Peter Kriegel, Raymond T. Ng, and Jörg Sander. "LOF: identifying density-based local outliers." In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, pp. 93-104. 2000. 3 Eik Loo, Chong, Mun Yong Ng, Christopher Leckie, and Marimuthu Palaniswami. "Intrusion detection for routing attacks in sensor networks." International Journal of Distributed Sensor Networks 2, no. 4 (2006): 313-332. 4 Anomaly Detection Using K-means Clustering, Ashen Weerathunga, https://ashenweerathunga.wordpress.com/2016/01/05/anomaly-detection-using-k-means-clustering/ 5 Anomaly detection using PCA reconstruction error, https://stats.stackexchange.com/questions/259806/anomaly-detection-using-pca-reconstruction-error 48 / 48