SlideShare a Scribd company logo
1 of 26
Download to read offline
15. Anomaly Detection
≫ PROBLEM MOTIVATION:
Inputs:
Test Data:
Plot:
Problem: we are given some training examples. We assume them
to be NON-ANOMALOUS. So, we need to find if a new example is
anomalous or not.
We train a PROBABILITY MODEL:
It divides the plot into various regions, such that each region
corresponds to a level of provability.
Centre → highest probability
Outside → lowest P
Example:
≫ GAUSSIAN DISTRIBUTION:
Examples:
σ → controls the width and height of curve
µ → mean of all values of x: controls the center..
Area below the curve is always = 1
So, the curve is either taller or wider.
Estimation of µ and σ :
≫ ALGORITHM FOR ANOMALY DETECTION:
Each feature can be individually distributred using
Normal(Gaussian) distribution:
P( x ) = product of probabilities of individual features
Note:
Steps of algorithm:
Example:
Gaussian Curves of both features:
P( x ) → height of curve == P(x1 ; µ1 , σ2
2) x P(x2 ; µ2 , σ2
2)
For a new example:
This means that anything below a particular height in the plot,
given by ϵ : is an anomaly
OR
Anything outside the learned region is an anomaly:
i.e., everything outside the magenta curve:
≫ DEVELOPING ANOMALY DETECTION SYSTEM:
Anomaly detection system problem can be converted to a likes of
Supervised Learning problem.
Therefore:
Thus we can take:
Meaning all the training set examples are non-anomalous
In Cross validation set and test set, we can have some examples of
anomalous (y=1) and some of non-anomalous (y=0) type:
Example:
≫ How to evaluate the algorithm:
≫ How to choose ϵ :
We can choose the value of ϵ which gives the best value of F1 score.
Anomaly detection vs Supervised Learning:
In anomaly detection, it’s better to model anomalies based on
negative examples, rather than positive examples… as future
anomalies may be totally different.
Applications of Anomaly detection vs Supervised Learning:
Fraud detection can be a supervised learning application but only if
there are a lot of people on the website who are doing fraudulent
activity, i.e., most of the examples are positive, otherwise, it’s an
anomaly detection problem only
≫ Choosing what features to use:
We plot the data and the histogram looks like :
We’d be happy to see this as this means that the feature x is a
gaussian feature
But: if the histogram looks like:
This is a non gaussian distribution
So, we use different transforms on our data to make it as close as
possible to a gaussian distribution:
➔ These are feature v/s P( x )
We can use different transforms for different features to make
them gaussian features:
We may have to try out different transforms for the same feature
to find the best (which gives the best gaussian look to the data).
These parameters in the red are parameters we can vary to make
the data look more and more Gaussian.
≫ Coming up with features:
Error analysis method:
If there is an anomalous example in middle of some non-anomalous
examples, then the algo will fail.
➢So, we can look at that particular example and try to come up
with a new feature that can tell what went wrong with that
example
Example:
It may occur that if one of the computers is stuck in an infinite loop,
the CPU load grows but the network traffic doesn’t:
Then we can come up with a new feature:
OR
≫ MULTIVARIATE GAUSSIAN DISTRIBUTION:
Here, red points are training data
Green point is our test data
➢Lets look at both features individually:
The algo will not predict the right o/p … since the i/p data is
distributed on the whole axis, so all the points have some
probability of being correct.
Probability Contours:
To solve this:
Examples:
If we decr Σ :
If we incr Σ :
If we decr the variance of only one of the features:
OR
If we incr the variance of only one of the features:
OR
Other variations in Σ :
Varying the mean ( µ ): it moves the centre of contours, where
the probability ( P( x ) ) is highest.
≫ Multivariate Gaussian Distribution Algorithm:
Algorithm:
≫ Relationship of multivariate Gaussian Model with
Original Gaussian Model:
Original gaussian model is actually a special case of multivariate
model, in which, the contours have their axes aligned with the
features axes, i.e., the contours are not at nay angles:
Original model is mathematically the multivariate model with a
constraint, that is:
≫ WHEN TO USE ORIGINAL MODEL vs MULTIVARIATE MODEL:
➢In some cases, in original model, we may require to manually
create extra features, so that the model can work fine.
➢In case of multivariate model, its important to get rid of
redundant features, o/w the algo is very expensive, and Σ may
even be non-invertible
…………………………………………………………………………………………………………
…………………………………………………………………………………………………………

More Related Content

What's hot

9 neural network learning
9 neural network learning9 neural network learning
9 neural network learningTanmayVijay1
 
6 logistic regression classification algo
6 logistic regression   classification algo6 logistic regression   classification algo
6 logistic regression classification algoTanmayVijay1
 
10 advice for applying ml
10 advice for applying ml10 advice for applying ml
10 advice for applying mlTanmayVijay1
 
U6 Cn2 Definite Integrals Intro
U6 Cn2 Definite Integrals IntroU6 Cn2 Definite Integrals Intro
U6 Cn2 Definite Integrals IntroAlexander Burt
 
Logistic regression in Machine Learning
Logistic regression in Machine LearningLogistic regression in Machine Learning
Logistic regression in Machine LearningKuppusamy P
 
Chapter 2 continuous_random_variable_2009
Chapter 2 continuous_random_variable_2009Chapter 2 continuous_random_variable_2009
Chapter 2 continuous_random_variable_2009ayimsevenfold
 
Simplex Method Flowchart/Algorithm
Simplex Method Flowchart/AlgorithmSimplex Method Flowchart/Algorithm
Simplex Method Flowchart/AlgorithmRaja Adapa
 
Monte carlo simulation
Monte carlo simulationMonte carlo simulation
Monte carlo simulationAnurag Jaiswal
 
Monte carlo-simulation
Monte carlo-simulationMonte carlo-simulation
Monte carlo-simulationjaimarbustos
 
Numerical approximation
Numerical approximationNumerical approximation
Numerical approximationMileacre
 
MolinaLeydi_FinalProject
MolinaLeydi_FinalProjectMolinaLeydi_FinalProject
MolinaLeydi_FinalProjectLeydi Molina
 
Regression Analysis and model comparison on the Boston Housing Data
Regression Analysis and model comparison on the Boston Housing DataRegression Analysis and model comparison on the Boston Housing Data
Regression Analysis and model comparison on the Boston Housing DataShivaram Prakash
 
Random number generation
Random number generationRandom number generation
Random number generationVinit Dantkale
 

What's hot (20)

9 neural network learning
9 neural network learning9 neural network learning
9 neural network learning
 
6 logistic regression classification algo
6 logistic regression   classification algo6 logistic regression   classification algo
6 logistic regression classification algo
 
10 advice for applying ml
10 advice for applying ml10 advice for applying ml
10 advice for applying ml
 
7 regularization
7 regularization7 regularization
7 regularization
 
U6 Cn2 Definite Integrals Intro
U6 Cn2 Definite Integrals IntroU6 Cn2 Definite Integrals Intro
U6 Cn2 Definite Integrals Intro
 
Logistic regression in Machine Learning
Logistic regression in Machine LearningLogistic regression in Machine Learning
Logistic regression in Machine Learning
 
Lecture two
Lecture twoLecture two
Lecture two
 
Chapter 2 continuous_random_variable_2009
Chapter 2 continuous_random_variable_2009Chapter 2 continuous_random_variable_2009
Chapter 2 continuous_random_variable_2009
 
Calc 2.1
Calc 2.1Calc 2.1
Calc 2.1
 
Riemann's Sum
Riemann's SumRiemann's Sum
Riemann's Sum
 
R nonlinear least square
R   nonlinear least squareR   nonlinear least square
R nonlinear least square
 
Teknik Simulasi
Teknik SimulasiTeknik Simulasi
Teknik Simulasi
 
Simplex Method Flowchart/Algorithm
Simplex Method Flowchart/AlgorithmSimplex Method Flowchart/Algorithm
Simplex Method Flowchart/Algorithm
 
Monte carlo simulation
Monte carlo simulationMonte carlo simulation
Monte carlo simulation
 
Monte carlo-simulation
Monte carlo-simulationMonte carlo-simulation
Monte carlo-simulation
 
Numerical approximation
Numerical approximationNumerical approximation
Numerical approximation
 
MolinaLeydi_FinalProject
MolinaLeydi_FinalProjectMolinaLeydi_FinalProject
MolinaLeydi_FinalProject
 
Regression Analysis and model comparison on the Boston Housing Data
Regression Analysis and model comparison on the Boston Housing DataRegression Analysis and model comparison on the Boston Housing Data
Regression Analysis and model comparison on the Boston Housing Data
 
simplex method
simplex methodsimplex method
simplex method
 
Random number generation
Random number generationRandom number generation
Random number generation
 

Similar to Anomaly Detection Explained

Anomaly detection Full Article
Anomaly detection Full ArticleAnomaly detection Full Article
Anomaly detection Full ArticleMenglinLiu1
 
Monte Carlo Berkeley.pptx
Monte Carlo Berkeley.pptxMonte Carlo Berkeley.pptx
Monte Carlo Berkeley.pptxHaibinSu2
 
Bootcamp of new world to taken seriously
Bootcamp of new world to taken seriouslyBootcamp of new world to taken seriously
Bootcamp of new world to taken seriouslykhaled125087
 
Machine learning session4(linear regression)
Machine learning   session4(linear regression)Machine learning   session4(linear regression)
Machine learning session4(linear regression)Abhimanyu Dwivedi
 
chap4_Parametric_Methods.ppt
chap4_Parametric_Methods.pptchap4_Parametric_Methods.ppt
chap4_Parametric_Methods.pptShayanChowdary
 
Prob and statistics models for outlier detection
Prob and statistics models for outlier detectionProb and statistics models for outlier detection
Prob and statistics models for outlier detectionTrilochan Panigrahi
 
Machine learning (5)
Machine learning (5)Machine learning (5)
Machine learning (5)NYversity
 
Aaa ped-16-Unsupervised Learning: clustering
Aaa ped-16-Unsupervised Learning: clusteringAaa ped-16-Unsupervised Learning: clustering
Aaa ped-16-Unsupervised Learning: clusteringAminaRepo
 
CounterFactual Explanations.pdf
CounterFactual Explanations.pdfCounterFactual Explanations.pdf
CounterFactual Explanations.pdfBong-Ho Lee
 
Estimating Causal Effects from Observations
Estimating Causal Effects from ObservationsEstimating Causal Effects from Observations
Estimating Causal Effects from ObservationsAntigoni-Maria Founta
 
MM - KBAC: Using mixed models to adjust for population structure in a rare-va...
MM - KBAC: Using mixed models to adjust for population structure in a rare-va...MM - KBAC: Using mixed models to adjust for population structure in a rare-va...
MM - KBAC: Using mixed models to adjust for population structure in a rare-va...Golden Helix Inc
 
SVM - Functional Verification
SVM - Functional VerificationSVM - Functional Verification
SVM - Functional VerificationSai Kiran Kadam
 
Machine learning introduction lecture notes
Machine learning introduction lecture notesMachine learning introduction lecture notes
Machine learning introduction lecture notesUmeshJagga1
 
Deep learning concepts
Deep learning conceptsDeep learning concepts
Deep learning conceptsJoe li
 
PRML Chapter 5
PRML Chapter 5PRML Chapter 5
PRML Chapter 5Sunwoo Kim
 
Supervised and unsupervised learning
Supervised and unsupervised learningSupervised and unsupervised learning
Supervised and unsupervised learningAmAn Singh
 

Similar to Anomaly Detection Explained (20)

Anomaly detection Full Article
Anomaly detection Full ArticleAnomaly detection Full Article
Anomaly detection Full Article
 
Monte Carlo Berkeley.pptx
Monte Carlo Berkeley.pptxMonte Carlo Berkeley.pptx
Monte Carlo Berkeley.pptx
 
Bootcamp of new world to taken seriously
Bootcamp of new world to taken seriouslyBootcamp of new world to taken seriously
Bootcamp of new world to taken seriously
 
Machine learning session4(linear regression)
Machine learning   session4(linear regression)Machine learning   session4(linear regression)
Machine learning session4(linear regression)
 
chap4_Parametric_Methods.ppt
chap4_Parametric_Methods.pptchap4_Parametric_Methods.ppt
chap4_Parametric_Methods.ppt
 
Prob and statistics models for outlier detection
Prob and statistics models for outlier detectionProb and statistics models for outlier detection
Prob and statistics models for outlier detection
 
Machine learning (5)
Machine learning (5)Machine learning (5)
Machine learning (5)
 
Aaa ped-16-Unsupervised Learning: clustering
Aaa ped-16-Unsupervised Learning: clusteringAaa ped-16-Unsupervised Learning: clustering
Aaa ped-16-Unsupervised Learning: clustering
 
2. diagnostics, collinearity, transformation, and missing data
2. diagnostics, collinearity, transformation, and missing data 2. diagnostics, collinearity, transformation, and missing data
2. diagnostics, collinearity, transformation, and missing data
 
CounterFactual Explanations.pdf
CounterFactual Explanations.pdfCounterFactual Explanations.pdf
CounterFactual Explanations.pdf
 
working with python
working with pythonworking with python
working with python
 
Estimating Causal Effects from Observations
Estimating Causal Effects from ObservationsEstimating Causal Effects from Observations
Estimating Causal Effects from Observations
 
MM - KBAC: Using mixed models to adjust for population structure in a rare-va...
MM - KBAC: Using mixed models to adjust for population structure in a rare-va...MM - KBAC: Using mixed models to adjust for population structure in a rare-va...
MM - KBAC: Using mixed models to adjust for population structure in a rare-va...
 
SVM - Functional Verification
SVM - Functional VerificationSVM - Functional Verification
SVM - Functional Verification
 
Machine learning introduction lecture notes
Machine learning introduction lecture notesMachine learning introduction lecture notes
Machine learning introduction lecture notes
 
Deep learning concepts
Deep learning conceptsDeep learning concepts
Deep learning concepts
 
PRML Chapter 5
PRML Chapter 5PRML Chapter 5
PRML Chapter 5
 
Ann a Algorithms notes
Ann a Algorithms notesAnn a Algorithms notes
Ann a Algorithms notes
 
Chapter 18,19
Chapter 18,19Chapter 18,19
Chapter 18,19
 
Supervised and unsupervised learning
Supervised and unsupervised learningSupervised and unsupervised learning
Supervised and unsupervised learning
 

Recently uploaded

08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 

Recently uploaded (20)

08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 

Anomaly Detection Explained

  • 1. 15. Anomaly Detection ≫ PROBLEM MOTIVATION: Inputs: Test Data: Plot:
  • 2. Problem: we are given some training examples. We assume them to be NON-ANOMALOUS. So, we need to find if a new example is anomalous or not. We train a PROBABILITY MODEL: It divides the plot into various regions, such that each region corresponds to a level of provability. Centre → highest probability Outside → lowest P
  • 4. Examples: σ → controls the width and height of curve µ → mean of all values of x: controls the center.. Area below the curve is always = 1 So, the curve is either taller or wider.
  • 5. Estimation of µ and σ : ≫ ALGORITHM FOR ANOMALY DETECTION: Each feature can be individually distributred using Normal(Gaussian) distribution:
  • 6. P( x ) = product of probabilities of individual features Note: Steps of algorithm:
  • 8. P( x ) → height of curve == P(x1 ; µ1 , σ2 2) x P(x2 ; µ2 , σ2 2) For a new example:
  • 9. This means that anything below a particular height in the plot, given by ϵ : is an anomaly OR Anything outside the learned region is an anomaly: i.e., everything outside the magenta curve: ≫ DEVELOPING ANOMALY DETECTION SYSTEM: Anomaly detection system problem can be converted to a likes of Supervised Learning problem. Therefore:
  • 10. Thus we can take: Meaning all the training set examples are non-anomalous In Cross validation set and test set, we can have some examples of anomalous (y=1) and some of non-anomalous (y=0) type: Example:
  • 11. ≫ How to evaluate the algorithm: ≫ How to choose ϵ : We can choose the value of ϵ which gives the best value of F1 score. Anomaly detection vs Supervised Learning: In anomaly detection, it’s better to model anomalies based on negative examples, rather than positive examples… as future anomalies may be totally different.
  • 12. Applications of Anomaly detection vs Supervised Learning: Fraud detection can be a supervised learning application but only if there are a lot of people on the website who are doing fraudulent activity, i.e., most of the examples are positive, otherwise, it’s an anomaly detection problem only ≫ Choosing what features to use: We plot the data and the histogram looks like : We’d be happy to see this as this means that the feature x is a gaussian feature
  • 13. But: if the histogram looks like: This is a non gaussian distribution So, we use different transforms on our data to make it as close as possible to a gaussian distribution: ➔ These are feature v/s P( x ) We can use different transforms for different features to make them gaussian features: We may have to try out different transforms for the same feature to find the best (which gives the best gaussian look to the data).
  • 14. These parameters in the red are parameters we can vary to make the data look more and more Gaussian. ≫ Coming up with features: Error analysis method:
  • 15. If there is an anomalous example in middle of some non-anomalous examples, then the algo will fail. ➢So, we can look at that particular example and try to come up with a new feature that can tell what went wrong with that example Example:
  • 16. It may occur that if one of the computers is stuck in an infinite loop, the CPU load grows but the network traffic doesn’t: Then we can come up with a new feature: OR ≫ MULTIVARIATE GAUSSIAN DISTRIBUTION:
  • 17. Here, red points are training data Green point is our test data ➢Lets look at both features individually: The algo will not predict the right o/p … since the i/p data is distributed on the whole axis, so all the points have some probability of being correct.
  • 20. If we incr Σ : If we decr the variance of only one of the features: OR
  • 21. If we incr the variance of only one of the features: OR
  • 23. Varying the mean ( µ ): it moves the centre of contours, where the probability ( P( x ) ) is highest.
  • 24. ≫ Multivariate Gaussian Distribution Algorithm: Algorithm:
  • 25. ≫ Relationship of multivariate Gaussian Model with Original Gaussian Model: Original gaussian model is actually a special case of multivariate model, in which, the contours have their axes aligned with the features axes, i.e., the contours are not at nay angles: Original model is mathematically the multivariate model with a constraint, that is:
  • 26. ≫ WHEN TO USE ORIGINAL MODEL vs MULTIVARIATE MODEL: ➢In some cases, in original model, we may require to manually create extra features, so that the model can work fine. ➢In case of multivariate model, its important to get rid of redundant features, o/w the algo is very expensive, and Σ may even be non-invertible ………………………………………………………………………………………………………… …………………………………………………………………………………………………………