@nathanielvcook
WATCH ANYTHING,
WATCH EVERYTHING
ANOMALY DETECTION BY NATHANIEL COOK
In DevOps we are good at collecting metrics
Why? Because the tooling makes it easy and it's in our
culture.
It is not hard to collect millions of unique metrics at tens of
terabytes a month.
The Problem - Scalability
● Dashboarding doesn’t scale
● Static thresholds don’t scale
● Tooling isn’t easy enough
We need to automate watching
metrics, aka anomaly detection.
How many anomalies does this graph have?
TICK Stack
Ways we can “watch” metrics
● With our eyes
● Static Thresholds
● Machine Learning / Statistical models
Machine Learning 101
1. Get a set of training data
2. Create a model from the data
3. Compare new raw metrics to the model
4. (If you are cool, update the model again.)
Standard Deviation Model
1. Training data: yesterday’s data at the same time of day.
2. Compute the mean and standard deviation of the
training data.
3. The current data is anomalous if: abs(data - mean) >
(threshold * stddev)
Threshold is the number of standard deviations to allow
around the mean. Typically it’s greater than 2.
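The three steps above can be sketched in plain Python. This is a minimal illustration, not Kapacitor code: the function name `is_anomalous`, the sample latencies, and the default threshold of 3.0 are assumptions made for the example.

```python
from statistics import mean, stdev


def is_anomalous(current, training, threshold=3.0):
    """Return True if `current` lies more than `threshold` standard
    deviations away from the mean of `training`."""
    if len(training) < 2:
        raise ValueError("need at least two training points")
    mu = mean(training)
    sigma = stdev(training)  # sample standard deviation of the training data
    return abs(current - mu) > threshold * sigma


# Hypothetical request latencies (ms) from the same hour yesterday.
yesterday = [100, 102, 98, 101, 99, 100, 103, 97]

print(is_anomalous(101, yesterday))  # close to yesterday's mean
print(is_anomalous(150, yesterday))  # far outside yesterday's spread
```

A larger threshold trades fewer false alarms for more missed anomalies, which is why it is worth tuning per metric.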
Visualizing error bands. How would you express this process in code?
// Mean and standard deviation for the same window yesterday,
// shifted forward one day so the timestamps line up with today.
var yesterday = batch
    |query('SELECT mean(value), stddev(value) FROM request_latency')
        .offset(1d)
        .period(1h)
        .every(5m)
        .align()
    |shift(1d)
var today = batch
    |query('SELECT mean(value) FROM request_latency')
        .period(1h)
        .every(5m)
        .align()
// Alert when today's mean is more than 3.5 standard deviations from yesterday's mean.
yesterday
    |join(today)
        .as('yesterday', 'today')
    |alert()
        .crit(lambda: abs("today.mean" - "yesterday.mean") > (3.5 * "yesterday.stddev"))
This code is TICKscript, the DSL Kapacitor uses to define tasks.
Predictive Model
Holt-Winters: A forecasting method from the 60s.
Find anomalies by predicting a trend for our current data.
1. Get the previous 30 days of data.
2. Using Holt-Winters, forecast today's data.
3. If the predicted values differ significantly from the real
values, we have found an anomaly.
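To make the idea concrete, here is a rough Python sketch of additive Holt-Winters followed by a relative-error check. It is an illustration under stated assumptions: the smoothing parameters `alpha`, `beta`, and `gamma` are fixed by hand rather than fitted, the initialization is the simplest common one, the training data is four made-up weeks with a weekly pattern, and the 20% tolerance is only an example value. Kapacitor's own Holt-Winters node may work differently internally.

```python
def holt_winters_forecast(series, season_len, horizon=1,
                          alpha=0.5, beta=0.1, gamma=0.1):
    """Forecast `horizon` future points with additive Holt-Winters."""
    n = len(series)
    if n < 2 * season_len:
        raise ValueError("need at least two full seasons of data")
    first = series[:season_len]
    second = series[season_len:2 * season_len]
    # Initial level: mean of the first season.
    level = sum(first) / season_len
    # Initial trend: average per-step change between the first two seasons.
    trend = (sum(second) - sum(first)) / season_len ** 2
    # Initial seasonal offsets: deviation of each point from the initial level.
    seasonals = [x - level for x in first]
    for t, value in enumerate(series):
        s = seasonals[t % season_len]
        prev_level = level
        level = alpha * (value - s) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonals[t % season_len] = gamma * (value - level) + (1 - gamma) * s
    return [level + (h + 1) * trend + seasonals[(n + h) % season_len]
            for h in range(horizon)]


def differs_significantly(predicted, actual, tolerance=0.2):
    """Relative difference check; assumes `predicted` is non-zero."""
    return abs(predicted - actual) / predicted > tolerance


# Hypothetical daily maximum request counts with a weekly pattern.
week = [100, 120, 130, 125, 110, 60, 50]
training = week * 4
predicted = holt_winters_forecast(training, season_len=7)[0]

print(differs_significantly(predicted, 104))  # near the forecast
print(differs_significantly(predicted, 160))  # far from the forecast
```

Because the example data repeats exactly, the forecast for the next day simply reproduces the first day of the weekly pattern; real data would also exercise the level and trend updates.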
Predictive model for detecting unexpected data.
var training = batch
    |query('SELECT max(value) FROM request_count')
        .offset(1d)
        .groupBy(time(1d))
        .period(30d)
        .every(1d)
var predicted = training
    |holtWinters('max', 1, 7, 1d)
    |last('max')
        .as('value')
var current = batch
    |query('SELECT max(value) FROM request_count')
        .period(1d)
        .every(1d)
    |last('max')
        .as('value')
predicted
    |join(current)
        .as('predicted', 'current')
    |alert()
        .crit(lambda: abs("predicted.value" - "current.value") / "predicted.value" > 0.2)
Custom Model
Morgoth: An unsupervised anomaly detection framework.
Find anomalies by using a custom anomaly detection
framework.
1. No training data is needed.
2. Give each window an anomaly score via Morgoth.
3. Check the anomaly score.
Custom algorithm
stream
    |from()
        .measurement('request_count')
    |window()
        .period(5m)
        .every(5m)
    @morgoth()
        .field('value')
        .scoreField('anomaly_score')
        .sigma(3.5)
    |alert()
        .crit(lambda: "anomaly_score" > 0.9)
How do you pick a model?
● This is the golden question.
● No one model does best.
● Simple is better; start with something simple.
● Let data help you choose a model.
Properties of an Anomaly Detection Method:
● False Positive Rate (FPR) -- Boy who cried wolf
● False Negative Rate (FNR) -- Missed anomalies
● Detection Delay (DD)
Ask yourself: What is the cost of each?
Try it out
1. Pick a metric
2. Pick a model
3. Evaluate the model on a set of historical data
4. Rate the model based on its FPR, FNR and DD values.
If the model isn’t good enough, try a different one or
improve your existing one.
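One way to rate a model on historical data is sketched below in Python. It assumes you have, for each time step, a ground-truth label (was this point really anomalous?) and the model's alert decision. The function name `evaluate`, the sample data, and the definition of detection delay used here (steps from the start of each anomalous run to the first alert inside it, averaged over detected runs) are assumptions for illustration.

```python
def evaluate(labels, alerts):
    """Return (FPR, FNR, mean detection delay in steps) for aligned
    lists of ground-truth labels and model alerts (booleans)."""
    if len(labels) != len(alerts):
        raise ValueError("labels and alerts must have the same length")
    pairs = list(zip(labels, alerts))
    tp = sum(1 for l, a in pairs if l and a)
    fp = sum(1 for l, a in pairs if not l and a)
    fn = sum(1 for l, a in pairs if l and not a)
    tn = sum(1 for l, a in pairs if not l and not a)
    fpr = fp / (fp + tn) if fp + tn else 0.0
    fnr = fn / (fn + tp) if fn + tp else 0.0

    # Detection delay: steps between the start of an anomalous run
    # and the first alert raised inside that run.
    delays = []
    start = None
    detected = False
    for i, is_anomaly in enumerate(labels):
        if not is_anomaly:
            start = None
            continue
        if start is None:
            start = i
            detected = False
        if alerts[i] and not detected:
            delays.append(i - start)
            detected = True
    dd = sum(delays) / len(delays) if delays else None
    return fpr, fnr, dd


# Hypothetical history: two real anomalous runs, one false alarm.
labels = [False, False, True, True, True, False, False, False, True, True]
alerts = [False, True, False, True, True, False, False, False, True, True]

fpr, fnr, dd = evaluate(labels, alerts)
print(fpr, fnr, dd)
```

Comparing these three numbers across candidate models, weighted by what each kind of mistake costs you, gives a concrete basis for choosing one.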
Kapacitor makes this easy
● Select historical data and replay it against your task:
    kapacitor replay-live batch -task request_count_alert -past 180d -rec-time
● Save static data sets to use as test fixtures:
    kapacitor record batch -task request_count_alert -past 180d
● Store anomalies back into InfluxDB to compute FPR and FNR.
Automate “watching”
your metrics
Q&A / More Resources:
● Anomaly Detection 101 -- Elizabeth (Betsy) Nichols Ph.D. https://www.youtube.com/watch?v=5vrY4RbeWkM
● Kapacitor is open source; check it out on GitHub: https://github.com/influxdata/kapacitor
● Wikipedia is your friend. There are many good explanations of how to
employ various anomaly detection techniques.
