EVALUATING REAL-TIME ANOMALY DETECTION:
THE NUMENTA ANOMALY BENCHMARK
MLCONF San Francisco
November 13, 2015
Subutai Ahmad
sahmad@numenta.com
2
REAL-TIME ANOMALY DETECTION
• Exponential growth in IoT, sensors and real-time data collection is driving an
explosion of streaming data
• The biggest application for machine learning is anomaly detection
• Example applications: monitoring IT infrastructure, uncovering fraudulent transactions,
tracking vehicles, real-time health monitoring, monitoring energy consumption
• Detection is necessary, but prevention is often the goal
4
EXAMPLE: PREVENTATIVE MAINTENANCE
• Planned shutdown
• Behavioral change preceding failure
• Catastrophic failure
5
YET ANOTHER BENCHMARK?
• A benchmark consists of:
• Labeled data sets
• Scoring mechanism
• Versioning system
• Most existing benchmarks are designed for batch data, not
streaming data
• Hard to find benchmarks containing real world data labeled with
anomalies
• We saw a need for a benchmark that is designed to test anomaly
detection algorithms on real-time, streaming data
• A standard community benchmark could spur innovation in real-time
anomaly detection algorithms
9
NUMENTA ANOMALY BENCHMARK (NAB)
• NAB: a rigorous benchmark for anomaly
detection in streaming applications
• Real-world benchmark data set
• 58 labeled data streams
(47 real-world, 11 artificial streams)
• Total of 365,551 data points
• Scoring mechanism
• Reward early detection
• Anomaly windows
• Scoring function
• Different “application profiles”
• Open resource
• AGPL repository contains data, source code,
and documentation
• github.com/numenta/NAB
10
EXAMPLE: LOAD BALANCER HEALTH
Unusually high load balancer latency
11
EXAMPLE: HOURLY SERVICE DEMAND
Spike in demand
Unusually low demand
12
EXAMPLE: PRODUCTION SERVER CPU
Spiking behavior becomes the new norm
Spike anomaly
14
HOW SHOULD WE SCORE ANOMALIES?
• The perfect detector
• Detects every anomaly
• Detects anomalies as soon as possible
• Provides detections in real time
• Triggers no false alarms
• Requires no parameter tuning
• Automatically adapts to changing statistics
• Scoring methods in traditional benchmarks are insufficient
• Precision/recall does not incorporate importance of early detection
• Artificial separation into training and test sets does not handle continuous learning
• Batch data files allow look ahead and multiple passes through the data
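The point about precision/recall can be illustrated with a small sketch. The window bounds, detection times, and helper name below are made up for illustration and are not part of NAB. Two hypothetical detectors each fire once inside the same labeled window, so window-based precision and recall cannot tell them apart, even though one fires much earlier.

```python
from typing import List, Tuple

Window = Tuple[int, int]  # inclusive (start, end) time steps of a labeled anomaly


def precision_recall(detections: List[int], windows: List[Window]) -> Tuple[float, float]:
    """Window-based precision and recall; ignores *when* a detection fires."""
    def in_window(t: int, w: Window) -> bool:
        return w[0] <= t <= w[1]

    # A window counts as detected if any detection falls inside it.
    detected = sum(1 for w in windows if any(in_window(t, w) for t in detections))
    # A detection is a false positive if it falls inside no window.
    false_pos = sum(1 for t in detections if not any(in_window(t, w) for w in windows))
    denom = detected + false_pos
    precision = detected / denom if denom else 0.0
    recall = detected / len(windows) if windows else 0.0
    return precision, recall


windows = [(100, 140)]  # hypothetical labeled window
early = [102]           # hypothetical detector A: fires near the window start
late = [139]            # hypothetical detector B: fires near the window end

pr_early = precision_recall(early, windows)
pr_late = precision_recall(late, windows)
print(pr_early, pr_late)  # the two pairs are identical despite the gap in lead time
```

Both detectors receive the same precision and recall, which is why a score that depends on position within the window is needed to reward the earlier one.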
15
WHERE IS THE ANOMALY?
16
NAB DEFINES ANOMALY WINDOWS
18
SCORING FUNCTION
• Effect of each detection is scaled
relative to position within window
• Detections outside window are false
positives (scored low)
• Multiple detections within window are
ignored (use earliest one)
• Total score is sum of scaled detections
+ weighted sum of missed detections
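The scoring rules above can be sketched as follows. This is a minimal illustration, not NAB's implementation: the sigmoid shape, the steepness constant 5, the cutoff at 3, and the default weights `a_tp`, `a_fp`, `a_fn` are all assumptions chosen for the example. The exact function and weights are given in the preprint listed under NAB RESOURCES.

```python
import math
from typing import List, Tuple

Window = Tuple[int, int]  # inclusive (start, end) time steps


def scaled_sigmoid(y: float) -> float:
    """Map relative position y to (-1, 1).

    y = -1 at the window start, 0 at the window end, > 0 past the window.
    Early detections score near +1; detections far past a window approach -1.
    """
    if y > 3.0:  # assumed cutoff; also avoids overflow in exp()
        return -1.0
    return 2.0 / (1.0 + math.exp(5.0 * y)) - 1.0


def score_stream(detections: List[int], windows: List[Window],
                 a_tp: float = 1.0, a_fp: float = 0.11, a_fn: float = 1.0) -> float:
    """Sum of scaled detections plus a weighted penalty per missed window."""
    detections = sorted(detections)
    windows = sorted(windows)
    total = 0.0
    for start, end in windows:
        hits = [t for t in detections if start <= t <= end]
        if hits:
            # Only the earliest detection inside a window counts.
            y = (hits[0] - end) / max(end - start, 1)
            total += a_tp * scaled_sigmoid(y)
        else:
            total -= a_fn  # missed window
    for t in detections:
        if any(s <= t <= e for s, e in windows):
            continue  # already handled as part of a window
        prior = [(s, e) for s, e in windows if e < t]
        if prior:
            s, e = prior[-1]
            # Scale the false positive by its distance past the preceding window.
            total += a_fp * scaled_sigmoid((t - e) / max(e - s, 1))
        else:
            total -= a_fp  # no preceding window: full false-positive penalty
    return total


windows = [(100, 140)]
s_early = score_stream([102], windows)
s_late = score_stream([139], windows)
s_missed = score_stream([], windows)
print(s_early, s_late, s_missed)
```

With these assumed weights, an early detection scores higher than a late one, and both score higher than a missed window. Raising `a_fp` or `a_fn` shifts the total toward penalizing false alarms or misses.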
20
OTHER DETAILS
• Application profiles
• Three application profiles weight false positives and false negatives differently,
reflecting the tradeoff between them: standard, favor low false positives, favor low false negatives.
• EKG monitoring of a cardiac patient tolerates false positives in order to avoid missed anomalies.
• IT / DevOps professionals hate false positives.
• NAB emulates practical real-time scenarios
• Look ahead not allowed for algorithms. Detections must be made on the fly.
• No separation between training and test files. Invoke model, start streaming, and go.
• No batch, per-dataset parameter tuning. Detectors must be fully automated, with a single set of
parameters across datasets. Any further parameter tuning must be done on the fly.
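The streaming constraints above can be sketched with a toy detector. Everything here is hypothetical and is not NAB's interface: the class name, the `handle_record` method, the threshold, the warm-up length, and the synthetic stream are assumptions for illustration. The detector sees one record at a time, learns continuously, and uses one fixed set of parameters.

```python
import math
from typing import Iterable, List


class RunningZScoreDetector:
    """Hypothetical streaming detector: flags values far from the running mean.

    It updates its statistics on every record and never sees future data.
    """

    def __init__(self, threshold: float = 4.0, warmup: int = 20):
        self.threshold = threshold  # assumed z-score cutoff
        self.warmup = warmup        # records to observe before flagging
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations (Welford's algorithm)

    def handle_record(self, value: float) -> bool:
        # Score against statistics learned so far, then update them.
        is_anomaly = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(value - self.mean) / std > self.threshold:
                is_anomaly = True
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (value - self.mean)
        return is_anomaly


def run_stream(detector: RunningZScoreDetector, values: Iterable[float]) -> List[int]:
    """Feed records one at a time; return the indices flagged as anomalous."""
    return [i for i, v in enumerate(values) if detector.handle_record(v)]


# Synthetic stream: a mild periodic signal with one injected spike at index 60.
stream = [10.0 + math.sin(i / 3.0) for i in range(100)]
stream[60] = 30.0
flagged = run_stream(RunningZScoreDetector(), stream)
print(flagged)
```

There is no separate training pass here: the same loop both learns and detects, which is the setup the bullets above describe.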
22
TESTING ALGORITHMS WITH NAB
• NAB is a community effort
• The goal is to have researchers independently evaluate a large number of algorithms
• Very easy to plug in and test new algorithms
• Seed results with three algorithms:
• Hierarchical Temporal Memory
• Numenta’s open source streaming anomaly detection algorithm
• Models temporal sequences in data, continuously learning
• Etsy Skyline
• Popular open source anomaly detection technique
• Mixture of statistical experts, continuously learning
• Twitter ADVec
• Open source anomaly detection released earlier this year (2015)
• Robust outlier statistics + piecewise approximation
23
NAB V1.0 RESULTS (58 FILES)
24
DETECTION RESULTS: CPU USAGE ON PRODUCTION SERVER
[Figure: detections by Etsy Skyline, Numenta HTM, and Twitter ADVec; red denotes a false positive]
• Simple spike, all 3 algorithms detect
• Shift in usage
25
DETECTION RESULTS: MACHINE TEMPERATURE READINGS
[Figure: detections by Etsy Skyline, Numenta HTM, and Twitter ADVec; red denotes a false positive]
• HTM detects purely temporal anomaly
• All 3 detect catastrophic failure
26
DETECTION RESULTS: TEMPORAL CHANGES IN BEHAVIOR OFTEN PRECEDE A LARGER SHIFT
[Figure: detections by Etsy Skyline, Numenta HTM, and Twitter ADVec; red denotes a false positive]
• HTM detects anomaly 3 hours earlier
28
SUMMARY
• Anomaly detection is the most common application for streaming analytics
• NAB is a community benchmark for streaming anomaly detection
• Includes a labeled dataset with real data
• Scoring methodology designed for practical real-time applications
• Fully open source codebase
• What’s next for NAB?
• We hope to see researchers test additional algorithms
• We hope to spark improved algorithms for streaming
• More data sets!
• Could incorporate UC Irvine dataset, Yahoo Labs dataset (not open source)
• Would love to get more labeled streaming datasets from you
• Add support for multivariate anomaly detection
29
NAB RESOURCES
Table 12 at MLConf
Repository: https://github.com/numenta/NAB
Paper:
A. Lavin and S. Ahmad, “Evaluating Real-time Anomaly Detection Algorithms –
the Numenta Anomaly Benchmark,” to appear in 14th International Conference
on Machine Learning and Applications (IEEE ICMLA’15), 2015.
Preprint available: http://arxiv.org/abs/1510.03336
Contact info:
sahmad@numenta.com , alavin@numenta.com
THANK YOU!
QUESTIONS?

 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 

Evaluating Real-Time Anomaly Detection: The Numenta Anomaly Benchmark

  • 1. EVALUATING REAL-TIME ANOMALY DETECTION: THE NUMENTA ANOMALY BENCHMARK • MLCONF San Francisco, November 13, 2015 • Subutai Ahmad, sahmad@numenta.com
  • 2. REAL-TIME ANOMALY DETECTION • Exponential growth in IoT, sensors, and real-time data collection is driving an explosion of streaming data • The biggest application for machine learning is anomaly detection • Use cases: monitoring IT infrastructure; uncovering fraudulent transactions; tracking vehicles; real-time health monitoring; monitoring energy consumption • Detection is necessary, but prevention is often the goal
  • 4. EXAMPLE: PREVENTATIVE MAINTENANCE • Figure annotations: planned shutdown; behavioral change preceding failure; catastrophic failure
  • 5. YET ANOTHER BENCHMARK? • A benchmark consists of labeled data sets, a scoring mechanism, and a versioning system • Most existing benchmarks are designed for batch data, not streaming data • Benchmarks containing real-world data labeled with anomalies are hard to find • We saw a need for a benchmark designed to test anomaly detection algorithms on real-time, streaming data • A standard community benchmark could spur innovation in real-time anomaly detection algorithms
  • 9. NUMENTA ANOMALY BENCHMARK (NAB) • NAB: a rigorous benchmark for anomaly detection in streaming applications • Real-world benchmark data set: 58 labeled data streams (47 real-world, 11 artificial), 365,551 data points in total • Scoring mechanism: rewards early detection; uses anomaly windows, a scoring function, and different “application profiles” • Open resource: AGPL repository contains data, source code, and documentation (github.com/numenta/NAB)
  • 10. EXAMPLE: LOAD BALANCER HEALTH • Figure annotation: unusually high load balancer latency
  • 11. EXAMPLE: HOURLY SERVICE DEMAND • Figure annotations: spike in demand; unusually low demand
  • 12. EXAMPLE: PRODUCTION SERVER CPU • Figure annotations: spiking behavior becomes the new norm; spike anomaly
  • 14. HOW SHOULD WE SCORE ANOMALIES? • The perfect detector: detects every anomaly; detects anomalies as soon as possible; provides detections in real time; triggers no false alarms; requires no parameter tuning; automatically adapts to changing statistics • Scoring methods in traditional benchmarks are insufficient: precision/recall does not incorporate the importance of early detection; an artificial separation into training and test sets does not handle continuous learning; batch data files allow look-ahead and multiple passes through the data
  • 15. WHERE IS THE ANOMALY?
  • 18. SCORING FUNCTION • The effect of each detection is scaled relative to its position within the anomaly window • Detections outside a window are false positives (scored low) • Multiple detections within a window are ignored (the earliest one is used) • The total score is the sum of scaled detections plus a weighted sum of missed detections
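The scoring rules on this slide can be sketched in Python. This is a minimal illustration under stated assumptions, not NAB's implementation: the sigmoid shape, the weight names `a_tp`, `a_fp`, `a_fn`, and the function names `scaled_sigmoid` and `score_stream` are invented for this example. The code in the NAB repository is the authoritative version.

```python
import math


def scaled_sigmoid(rel_pos, a_tp=1.0, a_fp=1.0):
    """Weight of one detection at relative position rel_pos.

    rel_pos is -1.0 at the window start, 0.0 at the window end, and
    positive after the window. Assumed shape: close to +a_tp for early
    detections, 0 at the window end (when a_tp == a_fp), approaching
    -a_fp far past the window.
    """
    return (a_tp + a_fp) / (1.0 + math.exp(5.0 * rel_pos)) - a_fp


def score_stream(detections, windows, a_tp=1.0, a_fp=1.0, a_fn=1.0):
    """Score detection timestamps against labeled anomaly windows.

    detections: iterable of numeric timestamps.
    windows: sorted, non-overlapping (start, end) pairs with end > start.
    """
    score = 0.0
    hit = set()  # indices of windows that already have a counted detection
    for t in sorted(detections):
        idx = next((i for i, (s, e) in enumerate(windows) if s <= t <= e), None)
        if idx is not None:
            if idx in hit:
                continue  # only the earliest detection in a window counts
            hit.add(idx)
            s, e = windows[idx]
            score += scaled_sigmoid((t - e) / (e - s), a_tp, a_fp)
        else:
            # False positive: scale by distance past the preceding window,
            # or apply the full penalty if no window precedes it.
            prev = [(s, e) for s, e in windows if e < t]
            if prev:
                s, e = prev[-1]
                score += scaled_sigmoid((t - e) / (e - s), a_tp, a_fp)
            else:
                score -= a_fp
    # Weighted penalty for every window with no detection at all.
    score -= a_fn * (len(windows) - len(hit))
    return score


window = [(10, 20)]
early = score_stream([11], window)   # detection near the window start
late = score_stream([19], window)    # detection near the window end
missed = score_stream([], window)    # no detection: one missed window
```

In this sketch an early detection scores higher than a late one, and raising `a_fp` or `a_fn` is one way an application profile could shift the tradeoff between false positives and missed anomalies.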
  • 20. OTHER DETAILS • Application profiles: three profiles (standard, favor low false positives, favor low false negatives) assign different weightings based on the tradeoff between false positives and false negatives. EKG monitoring of a cardiac patient tolerates false positives, because a missed anomaly is far more costly. IT / DevOps professionals hate false positives. • NAB emulates practical real-time scenarios: look-ahead is not allowed, so detections must be made on the fly. There is no separation between training and test files: invoke the model, start streaming, and go. There is no batch, per-dataset parameter tuning: a detector must be fully automated with a single set of parameters across datasets, and any further parameter tuning must be done on the fly.
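The real-time constraints on this slide amount to a single pass over each stream with one fixed configuration. The sketch below illustrates that protocol under stated assumptions: `RollingZScoreDetector`, `run_stream`, `PARAMS`, and the toy streams are hypothetical names and data invented for this example. The detector is a simple running-statistics baseline, not one of the algorithms evaluated in NAB, and this is not NAB's detector interface.

```python
import math


class RollingZScoreDetector:
    """Hypothetical online detector using Welford running mean and variance."""

    def __init__(self, threshold=3.0, warmup=10):
        self.threshold = threshold
        self.warmup = warmup
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def handle_record(self, value):
        # Decide using statistics from past records only (no look-ahead).
        is_anomaly = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(value - self.mean) / std > self.threshold:
                is_anomaly = True
        # Keep learning from every record: no separate training phase.
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (value - self.mean)
        return is_anomaly


def run_stream(detector, records):
    """Feed records one at a time; return the indices flagged as anomalous."""
    return [i for i, v in enumerate(records) if detector.handle_record(v)]


# One parameter set shared by every stream: no per-dataset tuning.
PARAMS = {"threshold": 3.0, "warmup": 10}

# Toy streams on different scales, each with one spike at index 20.
streams = {
    "a": [10.0, 11.0] * 10 + [50.0] + [10.0, 11.0] * 5,
    "b": [100.0, 102.0] * 10 + [300.0] + [100.0, 102.0] * 5,
}

# A fresh detector per stream, fed strictly in order.
results = {
    name: run_stream(RollingZScoreDetector(**PARAMS), data)
    for name, data in streams.items()
}
```

Because the detector adapts its own statistics as records arrive, the same `PARAMS` work for both streams even though their scales differ, which is the kind of on-the-fly adaptation the slide asks for.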
  • 22. TESTING ALGORITHMS WITH NAB • NAB is a community effort: the goal is to have researchers independently evaluate a large number of algorithms, and it is very easy to plug in and test new algorithms • Results are seeded with three algorithms • Hierarchical Temporal Memory (HTM): Numenta’s open source streaming anomaly detection algorithm; models temporal sequences in data, continuously learning • Etsy Skyline: popular open source anomaly detection technique; mixture of statistical experts, continuously learning • Twitter ADVec: open source anomaly detection released earlier this year (2015); robust outlier statistics + piecewise approximation
  • 23. NAB V1.0 RESULTS (58 FILES)
  • 24. DETECTION RESULTS: CPU USAGE ON PRODUCTION SERVER • Figure annotations: simple spike, all 3 algorithms detect; shift in usage • Key: Etsy Skyline, Numenta HTM, Twitter ADVec; red denotes false positive
  • 25. DETECTION RESULTS: MACHINE TEMPERATURE READINGS • Figure annotations: HTM detects purely temporal anomaly; all 3 detect catastrophic failure • Key: Etsy Skyline, Numenta HTM, Twitter ADVec; red denotes false positive
  • 26. DETECTION RESULTS: TEMPORAL CHANGES IN BEHAVIOR OFTEN PRECEDE A LARGER SHIFT • Figure annotation: HTM detects anomaly 3 hours earlier • Key: Etsy Skyline, Numenta HTM, Twitter ADVec; red denotes false positive
  • 28. SUMMARY • Anomaly detection is the most common application for streaming analytics • NAB is a community benchmark for streaming anomaly detection: it includes a labeled dataset with real data, a scoring methodology designed for practical real-time applications, and a fully open source codebase • What’s next for NAB? We hope to see researchers test additional algorithms, and we hope to spark improved algorithms for streaming • More data sets: could incorporate the UC Irvine dataset and the Yahoo Labs dataset (not open source); we would love to get more labeled streaming datasets from you • Add support for multivariate anomaly detection
  • 29. NAB RESOURCES • Table 12 at MLConf • Repository: https://github.com/numenta/NAB • Paper: A. Lavin and S. Ahmad, “Evaluating Real-time Anomaly Detection Algorithms – the Numenta Anomaly Benchmark,” to appear in 14th International Conference on Machine Learning and Applications (IEEE ICMLA’15), 2015. Preprint available: http://arxiv.org/abs/1510.03336 • Contact info: sahmad@numenta.com, alavin@numenta.com