Five Things I Learned While Building 
Anomaly Detection Tools 
(Or: 5 things that bit me in the …) 
Toufic Boubez, Ph.D. 
Founder, CTO 
Metafor Software 
toufic@metaforsoftware.com
2 
Preamble 
• IANA Data Scientist! I’m just an engineer who needed to get stuff done! 
• I learned (!) many more things, but they cannot be mentioned! 
– Because lawyers  
– But ask me later  
• I usually beat up on parametric, Gaussian, supervised techniques 
– This talk is not an exception, 
– But more of a “lessons learned” message 
• Note: all data real 
• Note: no y-axis labels on charts – on purpose!! 
• Note to self: remember to SLOW DOWN! 
• Note to self: mention the cats!! Everybody loves cats!!
3 
Toufic intro – who I am 
• Co-Founder/CTO Metafor Software 
• Co-Founder/CTO Layer 7 Technologies 
– Acquired by Computer Associates in 2013 
– I escaped  
• CTO Saffron Technology 
• IBM Chief Architect for SOA 
• Co-Author, Co-Editor: WS-Trust, WS-SecureConversation, 
WS-Federation, WS-Policy 
• Building large scale software systems for >20 
years (I’m older than I look, I know!)
4 
Why Anomaly Detection? 
• Watching screens on the “Wall of Charts” 
cannot scale! 
– Leads to alert fatigue 
• Need to automate detection of anomalous 
behaviors 
• Anomaly detection is the search for items or 
events which do not conform to an expected 
pattern. [Chandola, V.; Banerjee, A.; Kumar, V. (2009). "Anomaly 
detection: A survey". ACM Computing Surveys 41 (3): 1]
Thing 1: 
Your data is NOT Gaussian 
6 
Gaussian or Normal distribution 
• Bell-shaped distribution 
– Has a mean and a standard deviation
7 
This is Normally distributed data
8 
Quick check: Histogram
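A minimal sketch of the histogram quick check, assuming nothing beyond the Python standard library. The two samples are synthetic stand-ins (a Gaussian one and a hypothetical log-normal "latency-like" metric), not the data on the slides, and `text_histogram` is an illustrative helper name.

```python
import random
import statistics

def text_histogram(values, bins=10, width=40):
    """Return (bin_start, count, bar) rows for a quick visual check."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal
    counts = [0] * bins
    for v in values:
        # Clamp the maximum value into the last bin.
        idx = min(int((v - lo) / step), bins - 1)
        counts[idx] += 1
    peak = max(counts)
    return [(lo + i * step, c, "#" * round(width * c / peak))
            for i, c in enumerate(counts)]

rng = random.Random(42)  # fixed seed so the run is repeatable
gaussian = [rng.gauss(100, 10) for _ in range(5000)]
# Hypothetical metric with a long right tail (log-normal).
skewed = [rng.lognormvariate(3, 1) for _ in range(5000)]

for name, data in (("gaussian", gaussian), ("skewed", skewed)):
    print(f"{name}: mean={statistics.mean(data):.1f} "
          f"median={statistics.median(data):.1f}")
    for start, count, bar in text_histogram(data):
        print(f"{start:10.1f} {count:5d} {bar}")
```

A bell shape with mean and median close together is consistent with a Normal distribution; a one-sided tail with the mean pulled well away from the median is not.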
9 
Normal distributions are really useful 
• I can make powerful predictions because of 
the statistical properties of the data 
• I can easily compare different metrics since 
they have similar statistical properties 
• There is a HUGE body of statistical work on 
parametric techniques for normally 
distributed data
Normally distributed vs Not 
10 
Normal distributions 
• Most naturally occurring 
processes 
• Population height, IQ 
distributions (present 
company excepted of 
course) 
• Widget sizes, weights in 
manufacturing 
• … 
Not 
• Your metrics!
11 
Why is that important? 
• Most analytics tools are based on two 
assumptions: 
1. Parametric techniques: Data is normally 
distributed with a useful and usable mean 
and standard deviation 
2. Supervised Learning techniques: Data is 
probabilistically “stationary”
12 
Example: Three-Sigma Rule 
• Three-sigma rule 
– ~68% of the values lie within 1 std deviation of the mean 
– ~95% of the values lie within 2 std deviations 
– 99.73% of the values lie within 3 std deviations: anything 
else is considered an outlier
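The rule above can be sketched in a few lines. This is an illustration on synthetic data, not the data from the slides: `three_sigma_outliers` is a hypothetical helper, and the "bursty" series is an assumed exponential sample standing in for a skewed, non-negative metric.

```python
import random
import statistics

def three_sigma_outliers(values):
    """Return (low, high, outliers) using mean +/- 3 standard deviations."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    low, high = mean - 3 * std, mean + 3 * std
    return low, high, [v for v in values if v < low or v > high]

rng = random.Random(7)  # fixed seed so the run is repeatable
gaussian = [rng.gauss(50, 5) for _ in range(10000)]
# Hypothetical non-negative metric with a long right tail.
bursty = [rng.expovariate(1 / 50) for _ in range(10000)]

for name, data in (("gaussian", gaussian), ("bursty", bursty)):
    low, high, outliers = three_sigma_outliers(data)
    share = 100 * len(outliers) / len(data)
    print(f"{name}: band=({low:.1f}, {high:.1f}) flagged={share:.2f}%")
```

On the Gaussian sample the flagged share stays near the expected 0.27%. On the skewed sample the lower bound drops below zero, so a metric that cannot go negative can never trip it, while the upper bound flags a much larger share of perfectly ordinary tail values.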
13 
Aaahhhh 
• The mysterious red lines explained 
+3σ 
mean 
-3σ
14 
Doesn’t work because THIS
15 
Histogram – probability distribution
16 
3-sigma rule alerts
17 
Holt-Winters predictions
18 
Or worse, THIS!
19 
Histogram – probability distribution
20 
3-sigma rule alerts
Thing 2: 
Yesterday’s anomaly is today’s normal
22 
Why is that important? 
• Most analytics tools are based on two 
assumptions: 
1. Parametric techniques: Data is normally 
distributed with a useful and usable mean 
and standard deviation 
2. Supervised Learning techniques: Data is 
probabilistically “stationary”
23 
Remember this data?
24 
No matter where you look
25 
Its characteristics are stationary
26 
Meanwhile, in our real world 
• Stationarity is not a realistic assumption in the 
large complex systems with which we’re 
dealing 
• “Concept Drift” is very common 
– http://en.wikipedia.org/wiki/Concept_drift 
“ … the statistical properties of the target variable, which 
the model is trying to predict, change over time in 
unforeseen ways. This causes problems because the 
predictions become less accurate as time passes.”
28 
Supervised learning 
• In ML, Supervised Learning is the general set of 
techniques for inferring a model from a set of 
observations: 
– Observations in a Training Set are labelled with the 
desired outcomes (e.g. “normal vs. anomalous”, 
“normal vs. fraudulent”, “red/green/yellow”, etc) 
– As observations are fed into the learning system, it 
learns to differentiate by inferring a model based on 
these labels 
– Once sufficiently “trained”, the system is used in 
production on “real” unlabelled data and can label the 
new data based on the inferred model
What happens when something changes in your 
fundamentals? 
29
This is your new normal: all red all the time 
30
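A toy sketch of the supervised workflow and the "all red" outcome, under loud assumptions: the model is the simplest one imaginable (a single learned threshold), the labelled data is synthetic, and the "change in fundamentals" is a hypothetical doubling of the normal level. None of this is from the talk's own tooling.

```python
import random
import statistics

def train_threshold(samples):
    """Learn a 1-D decision threshold from (value, label) pairs.

    The threshold is the midpoint between the mean of the "normal" values
    and the mean of the "anomalous" values.
    """
    normal = [v for v, label in samples if label == "normal"]
    anomalous = [v for v, label in samples if label == "anomalous"]
    return (statistics.mean(normal) + statistics.mean(anomalous)) / 2

def predict(threshold, value):
    """Label a new, unlabelled observation with the learned model."""
    return "anomalous" if value > threshold else "normal"

rng = random.Random(1)  # fixed seed so the run is repeatable
# Hypothetical training set: normal load near 100, anomalies near 200.
training = [(rng.gauss(100, 10), "normal") for _ in range(500)]
training += [(rng.gauss(200, 10), "anomalous") for _ in range(50)]
threshold = train_threshold(training)

# Production data before and after the normal level moves to 200.
before = [rng.gauss(100, 10) for _ in range(500)]
after = [rng.gauss(200, 10) for _ in range(500)]

def flagged_share(values):
    """Fraction of values the trained model labels as anomalous."""
    return sum(predict(threshold, v) == "anomalous" for v in values) / len(values)

print(f"threshold={threshold:.1f}")
print(f"flagged before shift: {flagged_share(before):.1%}")
print(f"flagged after shift:  {flagged_share(after):.1%}")
```

The model is still "correct" with respect to its training labels; it is the definition of normal that moved, which is why nearly every point after the shift comes back red until the model is retrained.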
31 
Mean Shift and Breakout Detection 
• https://blog.twitter.com/2014/breakout-detection-in-the-wild
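To make the idea of a mean shift concrete, here is a deliberately simple two-window comparison. It is not the algorithm described in the linked post; `find_mean_shift`, its `window` and `threshold` parameters, and the synthetic series are all assumptions made for illustration.

```python
import statistics

def find_mean_shift(series, window=20, threshold=3.0):
    """Return the index where the mean level most clearly shifts, or None.

    For each candidate index, compares the mean of the `window` points
    before it with the mean of the `window` points after it, scaled by the
    standard deviation of the earlier window.
    """
    best_index, best_score = None, threshold
    for i in range(window, len(series) - window + 1):
        left = series[i - window:i]
        right = series[i:i + window]
        spread = statistics.pstdev(left) or 1e-9  # avoid division by zero
        score = abs(statistics.mean(right) - statistics.mean(left)) / spread
        if score > best_score:
            best_index, best_score = i, score
    return best_index

# Synthetic data: a small repeating wiggle around a level of 10,
# then the same wiggle around a level of 20 from index 100 onwards.
noise = [(i % 5) - 2 for i in range(200)]
steady = [10 + n for n in noise]
shifted = [10 + n for n in noise[:100]] + [20 + n for n in noise[100:]]

print(find_mean_shift(steady))
print(find_mean_shift(shifted))
```

Once a shift like this is found, the sensible reaction is usually to re-baseline on the data after it rather than to keep alerting against the old level.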
Thing 3: 
Saying Kolmogorov-Smirnov is a great way to 
impress everyone 
33 
Why is that important? 
• Seriously!? 
• Ok, actually non-parametric techniques that 
make no assumptions about normality or any 
other probability distribution are crucial in 
your effort to understand what’s going on in 
your systems
34 
The Kolmogorov-Smirnov test 
• Non-parametric test 
– Compare two probability 
distributions 
– Makes no assumptions (e.g. 
Gaussian) about the 
distributions of the samples 
– Measures maximum 
distance between 
cumulative distributions 
– Can be used to compare 
periodic/seasonal metric 
periods (e.g. day-to-day or 
week-to-week) 
http://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test
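The "maximum distance between cumulative distributions" can be computed directly. The sketch below is a plain standard-library version of the two-sample statistic, written for illustration; `ks_statistic` is a name chosen here, and the tiny samples are made up.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest vertical distance between the two empirical
    cumulative distribution functions (0 = identical, 1 = fully separated).
    """
    a, b = sorted(sample_a), sorted(sample_b)
    distance = 0.0
    # The empirical CDFs only change at observed values, so checking
    # every observed value is enough to find the maximum gap.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        distance = max(distance, abs(cdf_a - cdf_b))
    return distance

print(ks_statistic([1, 2, 3, 4], [1, 2, 3, 4]))
print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))
print(ks_statistic([1, 2, 3], [10, 11, 12]))
```

Nothing here depends on a mean, a standard deviation, or a bell shape. If SciPy is available, `scipy.stats.ks_2samp` computes the same statistic and also returns a p-value.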
35 
KS with windowing
36 
Data from similar windows
Cumulative distribution for those windows 
37
38 
Data from dissimilar windows
Cumulative distribution for those windows 
39
40 
Sliding window of KS scores
41 
KS anomaly results
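One way the windowed KS approach could be wired together, as a sketch only: each window is scored against the same window one period earlier, and scores above a cutoff are reported. The function names, the `lag` and `threshold` parameters, and the synthetic "four days of hourly samples" are assumptions, not the production implementation behind the slides.

```python
from bisect import bisect_right

def ks_statistic(a, b):
    """Maximum distance between the empirical CDFs of two samples."""
    a, b = sorted(a), sorted(b)
    return max(abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
               for x in a + b)

def sliding_ks(series, window, lag):
    """Score each window against the window `lag` points earlier.

    Returns a list of (start_index, score). With `lag` equal to one period
    (e.g. one day of samples), each window is compared with the same
    stretch of the previous period.
    """
    scores = []
    for start in range(lag, len(series) - window + 1, window):
        current = series[start:start + window]
        reference = series[start - lag:start - lag + window]
        scores.append((start, ks_statistic(current, reference)))
    return scores

def ks_anomalies(series, window, lag, threshold=0.5):
    """Start indices of windows whose KS score exceeds `threshold`."""
    return [start for start, score in sliding_ks(series, window, lag)
            if score > threshold]

# Synthetic metric: four identical days of 24 hourly samples...
day = [5] * 8 + [20] * 8 + [10] * 8
series = day * 4
# ...except that the busy period of day three behaves very differently.
series[56:64] = [60] * 8

print(sliding_ks(series, window=8, lag=24))
print(ks_anomalies(series, window=8, lag=24))
```

In this toy run both the disturbed window and the matching window one day later are reported, because the later window is being compared against an anomalous reference. A real deployment would need to decide how to handle that echo, for example by excluding flagged windows from the reference set.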
Thing 4: 
Take Scope and Context into account!
43 
Some data – is that normal?
44 
Wider scope
45 
Is this an anomaly?
46 
Even wider scope
47 
Is every weekend an anomaly?
48 
Would this be more accurate?
49 
Use domain knowledge! 
• Domain knowledge is NOT a bad thing! 
– There is no algorithm that will work on everything 
– Know your data and its general patterns 
• Periodicity/Seasonality 
• Known events (maintenance, backups, etc) 
– Apply the appropriate algorithms, taking into 
account enough scope for any inherent periodicity 
to appear 
– Customize your alerts to take into account known 
events
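A sketch of what "taking scope and context into account" might look like in code: values are judged against a baseline for their own slot (weekday or weekend, hour of day), and alerts are suppressed during known events. The slot scheme, the `tolerance` parameter, the function names, and the synthetic history are all illustrative assumptions.

```python
from collections import defaultdict
import statistics

def build_baseline(history):
    """Group past values by (is_weekend, hour); keep median and MAD per slot.

    `history` is a list of (day_of_week, hour, value), with 0 = Monday.
    """
    slots = defaultdict(list)
    for day, hour, value in history:
        slots[(day >= 5, hour)].append(value)
    baseline = {}
    for slot, values in slots.items():
        median = statistics.median(values)
        mad = statistics.median(abs(v - median) for v in values)
        baseline[slot] = (median, mad)
    return baseline

def is_anomalous(baseline, day, hour, value, known_events=(), tolerance=5.0):
    """Flag a value that is far from the median of its own slot.

    `known_events` holds (day, hour) pairs, e.g. a weekly backup, during
    which alerts are suppressed.
    """
    if (day, hour) in known_events:
        return False
    median, mad = baseline[(day >= 5, hour)]
    return abs(value - median) > tolerance * max(mad, 1.0)

# Synthetic history: four weeks, busy weekdays and quiet weekends.
history = [(d, h, (100 if d < 5 else 20) + h + w)
           for w in range(4) for d in range(7) for h in range(24)]
baseline = build_baseline(history)

print(is_anomalous(baseline, 5, 10, 31))   # Saturday, typical weekend value
print(is_anomalous(baseline, 2, 10, 31))   # Wednesday, same value
print(is_anomalous(baseline, 2, 3, 500, known_events={(2, 3)}))  # backup slot
```

The same reading of 31 is unremarkable on a Saturday and alarming on a Wednesday, which is the point of the "is every weekend an anomaly?" slides: the verdict depends on which scope the value is compared against.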
Thing 5: 
No data != No information
51 
Why is that important? 
• Some data channels are inherently non-chatty: 
– We don’t have the luxury of always generating 
non-zero values 
– There is a lot of useful information in the fact that 
nothing is happening on a particular channel 
• A lot of time series analytics techniques fail on 
time series with too few values (e.g. RF, 
adjusted box plot, etc)
52 
Communication channel
53 
Box plot results
54
55 
Simple lookup table with priors
56 
Don’t be an analytics snob 
• Sparse data is VERY hard to analyze using 
typical analytics techniques 
• Sparse data conveys VERY important 
information 
• Sometimes the simplest rules, thresholds, 
lookup tables will work
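A lookup table with priors can be as small as the sketch below. It assumes the "prior" for each hour is simply the fraction of observed days on which the channel carried any traffic in that hour; the function names, the cutoffs, and the 30 days of made-up events are hypothetical.

```python
from collections import Counter

def build_activity_priors(events, days_observed):
    """Estimate, per hour of day, how often the channel carries any traffic.

    `events` is a list of (day_index, hour) pairs, one per message seen.
    Returns {hour: fraction of observed days with at least one message}.
    """
    # set() collapses several messages in the same hour of the same day.
    active_days = Counter(hour for day, hour in set(events))
    return {hour: active_days[hour] / days_observed for hour in range(24)}

def silence_is_suspicious(priors, hour, min_prior=0.9):
    """Silence matters only in hours when the channel is almost always active."""
    return priors[hour] >= min_prior

def activity_is_suspicious(priors, hour, max_prior=0.1):
    """Traffic matters only in hours when the channel is almost always silent."""
    return priors[hour] <= max_prior

# Made-up channel: a nightly job reports at 02:00 every day for 30 days,
# plus a handful of ad hoc messages at 14:00 on two of those days.
events = [(d, 2) for d in range(30)] + [(3, 14), (17, 14), (17, 14)]
priors = build_activity_priors(events, days_observed=30)

print(silence_is_suspicious(priors, 2))    # no 02:00 report tonight
print(silence_is_suspicious(priors, 9))    # nothing at 09:00, as usual
print(activity_is_suspicious(priors, 9))   # a message at 09:00
```

There is no model to fit and nothing to fail on a series that is mostly empty: the absence of the 02:00 message is itself the signal.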
57 
Recap 
1. Your data is NOT Gaussian 
2. Yesterday’s anomaly is today’s normal 
3. Kolmogorov-Smirnov is really cool 
4. Scope and Context are important 
5. No data != No information
58 
Questions? 
• Shout out to the Metafor Data Science team! 
– Fred Zhang 
– Iman Makaremi

Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 

Recently uploaded (20)

Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The Ugly
 
Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learning
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdf
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 

Five Things I Learned While Building Anomaly Detection Tools - Toufic Boubez - Metafor Software - LISA 2014

  • 1. Five Things I Learned While Building Anomaly Detection Tools (Or: 5 things that bit me in the …) Toufic Boubez, Ph.D. Founder, CTO Metafor Software toufic@metaforsoftware.com
  • 2. 2 Preamble • IANA Data Scientist! I’m just an engineer who needed to get stuff done! • I learned (!) many more things, but they cannot be mentioned! – Because lawyers – But ask me later • I usually beat up on parametric, Gaussian, supervised techniques – This talk is not an exception, – But more of a “lessons learned” message • Note: all data real • Note: no y-axis labels on charts – on purpose!! • Note to self: remember to SLOW DOWN! • Note to self: mention the cats!! Everybody loves cats!!
  • 3. 3 Toufic intro – who I am • Co-Founder/CTO Metafor Software • Co-Founder/CTO Layer 7 Technologies – Acquired by Computer Associates in 2013 – I escaped • CTO Saffron Technology • IBM Chief Architect for SOA • Co-Author, Co-Editor: WS-Trust, WS-SecureConversation, WS-Federation, WS-Policy • Building large-scale software systems for >20 years (I’m older than I look, I know!)
  • 4. 4 Why Anomaly Detection? • Watching screens on the “Wall of Charts” cannot scale! – Leads to alert fatigue • Need to automate detection of anomalous behaviors • Anomaly detection is the search for items or events which do not conform to an expected pattern. [Chandola, V.; Banerjee, A.; Kumar, V. (2009). "Anomaly detection: A survey". ACM Computing Surveys 41 (3): 1]
  • 5. Thing 1: Your data is NOT Gaussian
  • 6. 6 Gaussian or Normal distribution • Bell-shaped distribution – Has a mean and a standard deviation
  • 7. 7 This is Normally distributed data
  • 8. 8 Quick check: Histogram
  • 9. 9 Normal distributions are really useful • I can make powerful predictions because of the statistical properties of the data • I can easily compare different metrics since they have similar statistical properties • There is a HUGE body of statistical work on parametric techniques for normally distributed data
  • 10. Normally distributed vs Not • Normal distributions: – Most naturally occurring processes – Population height, IQ distributions (present company excepted of course) – Widget sizes, weights in manufacturing – … • Not: – Your metrics!
  • 11. 11 Why is that important? • Most analytics tools are based on two assumptions: 1. Parametric techniques: Data is normally distributed with a useful and usable mean and standard deviation 2. Supervised Learning techniques: Data is probabilistically “stationary”
  • 12. 12 Example: Three-Sigma Rule • Three-sigma rule – ~68% of the values lie within 1 std deviation of the mean – ~95% of the values lie within 2 std deviations – 99.73% of the values lie within 3 std deviations: anything else is considered an outlier
  • 13. 13 Aaahhhh • The mysterious red lines explained: 3σ, mean, 3σ
  • 14. 14 Doesn’t work because THIS
  • 15. 15 Histogram – probability distribution
  • 16. 16 3-sigma rule alerts
  • 18. 18 Or worse, THIS!
  • 19. 19 Histogram – probability distribution
  • 20. 20 3-sigma rule alerts
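The three-sigma rule from slide 12 can be sketched in a few lines, which also makes it easier to see why it misfires on skewed metrics. This is a minimal illustration, not the code behind the charts: the function name `three_sigma_outliers` and the latency values are made up for the example.

```python
import statistics

def three_sigma_outliers(values):
    """Return (mean, std, outliers) using the classic mean +/- 3*std band."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    lower, upper = mean - 3 * std, mean + 3 * std
    outliers = [v for v in values if v < lower or v > upper]
    return mean, std, outliers

# Hypothetical long-tailed metric: mostly ~10, plus one large and one moderate spike.
latencies = [10, 11, 9, 10, 12, 10, 11, 9, 10, 11] * 5 + [400, 50]

mean, std, outliers = three_sigma_outliers(latencies)
print(f"mean={mean:.1f} std={std:.1f} outliers={outliers}")
```

With these made-up values only the 400 is flagged. The 50, five times the typical value, stays inside the band because the single large spike inflates the standard deviation; on non-Gaussian data the mean and standard deviation stop being "useful and usable".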
  • 21. Thing 2: Yesterday’s anomaly is today’s normal
  • 22. 22 Why is that important? • Most analytics tools are based on two assumptions: 1. Parametric techniques: Data is normally distributed with a useful and usable mean and standard deviation 2. Supervised Learning techniques: Data is probabilistically “stationary”
  • 24. 24 No matter where you look
  • 25. 25 Its characteristics are stationary
  • 26. 26 Meanwhile, in our real world • Stationarity is not a realistic assumption in the large complex systems with which we’re dealing • “Concept Drift” is very common – http://en.wikipedia.org/wiki/Concept_drift “ … the statistical properties of the target variable, which the model is trying to predict, change over time in unforeseen ways. This causes problems because the predictions become less accurate as time passes.”
  • 28. 28 Supervised learning • In ML, Supervised Learning is the general set of techniques for inferring a model from a set of observations: – Observations in a Training Set are labelled with the desired outcomes (e.g. “normal vs. anomalous”, “normal vs. fraudulent”, “red/green/yellow”, etc) – As observations are fed into the learning system, it learns to differentiate by inferring a model based on these labels – Once sufficiently “trained”, the system is used in production on “real” unlabelled data and can label the new data based on the inferred model
  • 29. What happens when something changes in your fundamentals?
  • 30. This is your new normal: all red all the time
  • 31. 31 Mean Shift and Breakout Detection • https://blog.twitter.com/2014/breakout-detection-in-the-wild
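To make the "new normal" problem concrete, here is a deliberately simple mean-shift check. It is not the breakout detection algorithm described in the Twitter post; it is a stand-in that compares the medians of two adjacent windows, scaled by the median absolute deviation (MAD) of the earlier window. The function name, window size and threshold are assumptions chosen for illustration.

```python
import statistics

def median_shift_points(series, window=10, threshold=3.0):
    """Flag indices where the median of the next `window` points differs from
    the median of the previous `window` points by more than `threshold` times
    the MAD of the previous window."""
    flagged = []
    for i in range(window, len(series) - window + 1):
        before = series[i - window:i]
        after = series[i:i + window]
        med_before = statistics.median(before)
        mad = statistics.median(abs(x - med_before) for x in before)
        scale = mad if mad > 0 else 1e-9  # avoid dividing by zero on flat data
        if abs(statistics.median(after) - med_before) / scale > threshold:
            flagged.append(i)
    return flagged

# Hypothetical metric whose baseline moves from ~10 to ~20 at index 20.
low = [10, 11, 9, 10, 12, 10, 11, 9, 10, 11]
shifted = low * 2 + [v + 10 for v in low] * 2
stationary = low * 4

print(median_shift_points(shifted))     # indices clustered around the shift
print(median_shift_points(stationary))  # nothing flagged
```

Once a shift like this is confirmed, the sensible reaction is usually to re-baseline on the data after the shift rather than keep alerting against yesterday's normal.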
  • 32. Thing 3: Saying Kolmogorov-Smirnov is a great way to impress everyone
  • 33. 33 Why is that important? • Seriously!? • Ok, actually non-parametric techniques that make no assumptions about normality or any other probability distribution are crucial in your effort to understand what’s going on in your systems
  • 34. 34 The Kolmogorov-Smirnov test • Non-parametric test – Compare two probability distributions – Makes no assumptions (e.g. Gaussian) about the distributions of the samples – Measures maximum distance between cumulative distributions – Can be used to compare periodic/seasonal metric periods (e.g. day-to-day or week-to-week) http://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test
  • 35. 35 KS with windowing
  • 36. 36 Data from similar windows
  • 37. Cumulative distribution for those windows
  • 38. 38 Data from dissimilar windows
  • 39. Cumulative distribution for those windows
  • 40. 40 Sliding window of KS scores
  • 41. 41 KS anomaly results
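The windowed KS idea from slides 34 to 40 can be sketched with the standard library alone. The two-sample KS statistic is the largest gap between two empirical cumulative distributions; the sketch below computes it directly and then scores each window against the one before it. The function names, the non-overlapping windows and the sample "daily" pattern are assumptions for illustration, not the setup used to produce the charts.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    max_gap = 0.0
    for x in a + b:  # the gap can only change at observed values
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap

def sliding_ks_scores(series, window):
    """KS statistic between each window and the window immediately before it."""
    scores = []
    for start in range(window, len(series) - window + 1, window):
        previous = series[start - window:start]
        current = series[start:start + window]
        scores.append((start, ks_statistic(previous, current)))
    return scores

# Hypothetical metric: three similar "days", then a day with a shifted level.
day = [1, 2, 3, 5, 8, 5, 3, 2]
series = day * 3 + [v + 6 for v in day]

for start, score in sliding_ks_scores(series, window=len(day)):
    print(start, score)
```

Similar windows score 0.0 here and the shifted day scores 0.875. Where to put the alerting threshold on that score is a tuning decision this sketch does not address; `scipy.stats.ks_2samp` computes the same statistic together with a p-value if SciPy is available.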
  • 42. Thing 4: Take Scope and Context into account!
  • 43. 43 Some data – is that normal?
  • 45. 45 Is this an anomaly?
  • 46. 46 Even wider scope
  • 47. 47 Is every weekend an anomaly?
  • 48. 48 Would this be more accurate?
  • 49. 49 Use domain knowledge! • Domain knowledge is NOT a bad thing! – There is no algorithm that will work on everything – Know your data and its general patterns • Periodicity/Seasonality • Known events (maintenance, backups, etc.) – Apply the appropriate algorithms, taking into account enough scope for any inherent periodicity to appear – Customize your alerts to take into account known events
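One way to encode that domain knowledge is to compare each sample only with samples from the same position in the cycle (weekends against weekends) and to skip samples that fall on known events. The sketch below is an assumption-laden illustration: the function name, the MAD-based tolerance, the `min_spread` floor and the sample week are all hypothetical.

```python
import statistics
from collections import defaultdict

def seasonal_anomalies(values, period, known_events=(), tolerance=3.0, min_spread=1.0):
    """Flag indices that deviate from the median of their own seasonal slot.

    values: samples at a fixed interval; period: samples per cycle
    (e.g. 7 for daily data with a weekly cycle); known_events: indices
    to ignore, such as maintenance or backups.
    """
    skip = set(known_events)
    by_slot = defaultdict(list)
    for i, v in enumerate(values):
        if i not in skip:
            by_slot[i % period].append(v)

    anomalies = []
    for i, v in enumerate(values):
        if i in skip:
            continue
        slot_values = by_slot[i % period]
        baseline = statistics.median(slot_values)
        spread = statistics.median(abs(x - baseline) for x in slot_values)
        if abs(v - baseline) > tolerance * max(spread, min_spread):
            anomalies.append(i)
    return anomalies

# Hypothetical daily metric: busy weekdays, quiet weekends, four weeks of data.
week = [100, 102, 98, 101, 99, 20, 18]
values = week * 4
values[9] = 30   # unexpected midweek drop
values[15] = 5   # planned maintenance, known in advance

print(seasonal_anomalies(values, period=7, known_events=[15]))
```

The quiet weekends are never flagged because they are only compared with other weekends, and the maintenance dip is suppressed because it was declared up front. Note that the scope has to cover several full cycles before the per-slot baselines mean anything.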
  • 50. Thing 5: No data != No information
  • 51. 51 Why is that important? • Some data channels are inherently non-chatty: – We don’t have the luxury of always generating non-zero values – There is a lot of useful information in the fact that nothing is happening on a particular channel • A lot of time series analytics techniques fail on time series with too few values (e.g. RF, adjusted box plot, etc)
  • 53. 53 Box plot results
  • 54. 54
  • 55. 55 Simple lookup table with priors
  • 56. 56 Don’t be an analytics snob • Sparse data is VERY hard to analyze using typical analytics techniques • Sparse data conveys VERY important information • Sometimes the simplest rules, thresholds, lookup tables will work
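The slides do not say what the lookup table on slide 55 contains, so the following is only one plausible reading: for each time slot (say, hour of day), record how often the channel was active in the past, then flag an observation when history says it should almost never happen. All names, the slot scheme and the 5% threshold are assumptions.

```python
from collections import defaultdict

def build_activity_priors(history):
    """history: iterable of (slot, event_count) pairs, e.g. slot = hour of day.
    Returns {slot: fraction of past observations in that slot with any events}."""
    seen = defaultdict(int)
    active = defaultdict(int)
    for slot, count in history:
        seen[slot] += 1
        if count > 0:
            active[slot] += 1
    return {slot: active[slot] / seen[slot] for slot in seen}

def is_surprising(priors, slot, count, threshold=0.05):
    """True when the observation had prior probability below `threshold`."""
    p_active = priors.get(slot)
    if p_active is None:
        return False  # no history for this slot, so no basis for an alert
    probability = p_active if count > 0 else 1.0 - p_active
    return probability < threshold

# Hypothetical sparse channel: a job that logs every night at 02:00 and is
# silent at 14:00, observed over 30 days.
history = [(2, 5)] * 30 + [(14, 0)] * 30
priors = build_activity_priors(history)

print(is_surprising(priors, slot=2, count=0))   # silence where activity is expected
print(is_surprising(priors, slot=14, count=3))  # activity where silence is expected
print(is_surprising(priors, slot=2, count=4))   # business as usual
```

The point of the sketch is that "nothing happened" is treated as an observation in its own right: silence at 02:00 is as informative as a burst at 14:00.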
  • 57. 57 Recap 1. Your data is NOT Gaussian 2. Yesterday’s anomaly is today’s normal 3. Kolmogorov-Smirnov is really cool 4. Scope and Context are important 5. No data != No information
  • 58. 58 Questions? • Shout out to the Metafor Data Science team! – Fred Zhang – Iman Makaremi