SlideShare a Scribd company logo
DETECTING ANOMALIES IN STREAMING DATA
Data By The Bay
May 19, 2016
Subutai Ahmad
@SubutaiAhmad
sahmad@numenta.com
OUTLINE
• Real-time streaming analytics
• Anomaly detection with Hierarchical Temporal Memory
• Benchmarking real-time anomaly detection
• Summary
Monitoring
IT infrastructure
Uncovering
fraudulent
transactions
Tracking
vehicles
Real-time
health
monitoring
Monitoring
energy
consumption
Detection is necessary, but prevention is often the goal
REAL-TIME ANOMALY DETECTION
•  Exponential growth in IoT, sensors and real-time data collection is driving an
explosion of streaming data
•  The biggest application for machine learning is anomaly detection
EXAMPLE: PREVENTIVE MAINTENANCE
EXAMPLE: PREVENTIVE MAINTENANCE
Planned
shutdown
Behavioral change
preceding failure
Catastrophic
failure
THE STREAMING ANALYTICS PROBLEM
Given all past input and current
input, decide whether the system
behavior is anomalous right now.
Must report decision, perform
any retraining, bookkeeping,
etc. before next input arrives.
No look-ahead
No training/test set split – everything must be done online
System must be automated, and customized to each stream
HIERARCHICAL TEMPORAL MEMORY (HTM)
• Powerful sequence memory derived
from recent findings in experimental
neuroscience
• High capacity memory based system
• Models temporal sequences in data
• Inherently streaming
• Continuously learning and predicting
• No need to tune hyper-parameters
• Open source: github.com/numenta
HTM PREDICTS FUTURE INPUT
• Input to the system is a stream of data
• Encoded into a sparse high dimensional vector
• Learns temporal sequences in input stream and makes a prediction
in the form of a sparse vector
•  represents a prediction for upcoming input
HTM
ANOMALY DETECTION WITH HTM
HTM
Raw anomaly
score
Anomaly
likelihood
is an instantaneous measure of
prediction error
•  0 if input was perfectly prediction
•  1 if it was completely unpredicted
•  Could threshold it directly to report
anomalies, but in very noisy
environments we can do better
ANOMALY LIKELIHOOD
• Second order measure: did the predictability of the metric change?
1.  Estimate historical distribution of anomaly scores
2.  Check if recent scores are very different
ANOMALY LIKELIHOOD
• Second order measure: did the predictability of the metric change?
1.  Estimate historical distribution of anomaly scores
2.  Check if recent scores are very different
ANOMALY DETECTION WITH HTM
HTM
Raw anomaly
score
Anomaly
likelihood
Learns temporal sequences
Continuously makes predictions
Continuously learning
Was current input
predicted?
Has level of
predictability changed
significantly?
ANOMALIES IN IT INFRASTRUCTURE
• Grok
•  Commercial server based product detects anomalies in IT infrastructure
•  Runs thousands of HTM anomaly detectors in real time
•  10 milliseconds per input per metric, including continuous learning
•  No parameter tuning required
•  http://grokstream.com
ANOMALIES IN FINANCIAL DATA
• HTM for Stocks
•  Real-time free demo application
•  Continuously monitors top 200 stocks
•  Available on iOS App Store or Google Play Store
•  Open source application: github.com/numenta/numenta-apps
OUTLINE
• Real-time streaming analytics
• Anomaly detection with Hierarchical Temporal Memory
• Benchmarking real-time anomaly detection
• Summary
EVALUATING STREAMING ANOMALY DETECTION
•  Most existing benchmarks are designed for batch data, not
streaming data
•  Hard to find benchmarks containing real world data labeled with
anomalies
•  There is a need for an open benchmark designed to test real-time
anomaly detection
•  A standard community benchmark could spur innovation in
streaming anomaly detection algorithms
NUMENTA ANOMALY BENCHMARK (NAB)
•  NAB: a rigorous benchmark for anomaly
detection in streaming applications
NUMENTA ANOMALY BENCHMARK (NAB)
•  NAB: a rigorous benchmark for anomaly
detection in streaming applications
•  Real-world benchmark data set
•  58 labeled data streams
(47 real-world, 11 artificial streams)
•  Total of 365,551 data points
NUMENTA ANOMALY BENCHMARK (NAB)
•  NAB: a rigorous benchmark for anomaly
detection in streaming applications
•  Real-world benchmark data set
•  58 labeled data streams
(47 real-world, 11 artificial streams)
•  Total of 365,551 data points
•  Scoring mechanism
•  Rewards early detection
•  Different “application profiles”
NUMENTA ANOMALY BENCHMARK (NAB)
•  NAB: a rigorous benchmark for anomaly
detection in streaming applications
•  Real-world benchmark data set
•  58 labeled data streams
(47 real-world, 11 artificial streams)
•  Total of 365,551 data points
•  Scoring mechanism
•  Rewards early detection
•  Different “application profiles”
•  Open resource
•  AGPL repository contains data, source code,
and documentation
•  github.com/numenta/NAB
•  Ongoing competition to expand NAB
EXAMPLE: HOURLY SERVICE DEMAND
Spike in demand
Unusually low demand
EXAMPLE: PRODUCTION SERVER CPU
Spiking behavior becomes the new norm
Spike anomaly
HOW SHOULD WE SCORE ANOMALIES?
•  The perfect detector
•  Detects anomalies as soon as possible
•  Provides detections in real time
•  Triggers no false alarms
•  Requires no parameter tuning
•  Automatically adapts to changing statistics
•  Scoring methods in traditional benchmarks are insufficient
•  Precision/recall does not incorporate importance of early detection
•  Artificial separation into training and test sets does not handle continuous learning
•  Batch data files allow look ahead and multiple passes through the data
WHERE IS THE ANOMALY?
NAB DEFINES ANOMALY WINDOWS
NAB scoring function gives higher score to earlier detections in window
OTHER DETAILS
•  Application profiles
•  Three application profiles assign different weightings based on the tradeoff between
false positives and false negatives.
•  EKG data on a cardiac patient favors False Positives.
•  IT / DevOps professionals hate False Positives.
•  Three application profiles: standard, favor low false positives, favor low false negatives.
•  NAB emulates practical real-time scenarios
•  Look ahead not allowed for algorithms. Detections must be made on the fly.
•  No separation between training and test files. Invoke model, start streaming, and go.
•  No batch parameter tuning. Must be fully automated with single set of parameters
across data streams. Any further parameter tuning must be done on the fly.
TESTING ALGORITHMS WITH NAB
•  NAB is designed to easily plug in and test new algorithms
•  Results with several algorithms:
•  Hierarchical Temporal Memory
•  Etsy Skyline
•  Popular open source anomaly detection technique
•  Mixture of statistical experts, continuously learning
•  Twitter ADVec
•  Open source anomaly detection released last year
•  Robust outlier statistics + piecewise approximation
•  Bayesian Online Change Point Detection
•  Formal Bayesian method for detecting anomalies in time series
NAB V1.0 RESULTS (58 FILES)
DETECTION RESULTS: CPU USAGE ON
PRODUCTION SERVER
Simple spike, all 3
algorithms detect
Shift in usage
Etsy
Skyline
Numenta
HTM
Twitter
ADVec
Red denotes
False Positive
Key
DETECTION RESULTS: MACHINE
TEMPERATURE READINGS
HTM detects purely
temporal anomaly
Etsy
Skyline
Numenta
HTM
Twitter
ADVec
Red denotes
False Positive
Key
All 3 detect
catastrophic failure
DETECTION RESULTS: TEMPORAL CHANGES IN
BEHAVIOR OFTEN PRECEDE A LARGER SHIFT
HTM detects anomaly 3
hours earlier
Etsy
Skyline
Numenta
HTM
Twitter
ADVec
Red denotes
False Positive
Key
NAB COMPETITION!!
•  NAB is a resource for the streaming analytics community
•  Need additional real-world data files and more algorithms tested
•  NAB Competition offers cash prizes for:
•  Additional anomaly detection algorithms tested on NAB
•  Submission of real-world data files with labeled real anomalies
•  Cash prizes of $2,500 each for algorithms and data
•  Easy to enter, high likelihood of winning!
•  Go to http://numenta.org/nab for details
SUMMARY
•  Anomaly detection for streaming data imposes unique challenges
•  Stringent real-time constraints and automation requirements
•  Typical batch methodologies do not work well
•  HTM learning algorithms
•  Can be used to create a streaming anomaly detection system
•  Performs very well across a wide range of datasets
•  Open source, commercially deployable
•  NAB is an open source benchmark for streaming anomaly detection
•  Includes a labeled dataset with real world data
•  Scoring methodology designed for practical real-time applications
•  NAB competition!
RESOURCES
Grok (anomalies in IT infrastructure): http://grokstream.com
HTM Studio (desktop app for easy experimentation): contact me
Open Source Repositories:
Algorithm code: https://github.com/numenta/nupic
HTM Stocks demo: https://github.com/numenta/numenta-apps
NAB code + paper: https://github.com/numenta/nab
Apache Flink: https://github.com/nupic-community/flink-htm
Contact info:
Subutai Ahmad sahmad@numenta.com, @SubutaiAhmad
Alex Lavin alavin@numenta.com, @theAlexLavin

More Related Content

What's hot

Real-Time Streaming Data Analysis with HTM
Real-Time Streaming Data Analysis with HTMReal-Time Streaming Data Analysis with HTM
Real-Time Streaming Data Analysis with HTM
Numenta
 
HTM & Apache Flink (2016-06-27)
HTM & Apache Flink (2016-06-27)HTM & Apache Flink (2016-06-27)
HTM & Apache Flink (2016-06-27)
Eron Wright
 
Detecting Anomalies in Streaming Data
Detecting Anomalies in Streaming DataDetecting Anomalies in Streaming Data
Detecting Anomalies in Streaming Data
Subutai Ahmad
 
State of NuPIC
State of NuPICState of NuPIC
State of NuPIC
Numenta
 
A Practical Guide to Anomaly Detection for DevOps
A Practical Guide to Anomaly Detection for DevOpsA Practical Guide to Anomaly Detection for DevOps
A Practical Guide to Anomaly Detection for DevOps
BigPanda
 
Monitoring at Facebook - Ran Leibman, Facebook - DevOpsDays Tel Aviv 2015
Monitoring at Facebook - Ran Leibman, Facebook - DevOpsDays Tel Aviv 2015Monitoring at Facebook - Ran Leibman, Facebook - DevOpsDays Tel Aviv 2015
Monitoring at Facebook - Ran Leibman, Facebook - DevOpsDays Tel Aviv 2015
DevOpsDays Tel Aviv
 
Five Things I Learned While Building Anomaly Detection Tools - Toufic Boubez ...
Five Things I Learned While Building Anomaly Detection Tools - Toufic Boubez ...Five Things I Learned While Building Anomaly Detection Tools - Toufic Boubez ...
Five Things I Learned While Building Anomaly Detection Tools - Toufic Boubez ...
tboubez
 
SplunkLive! Prelert Session - Extending Splunk with Machine Learning
SplunkLive! Prelert Session - Extending Splunk with Machine LearningSplunkLive! Prelert Session - Extending Splunk with Machine Learning
SplunkLive! Prelert Session - Extending Splunk with Machine LearningSplunk
 
Tuning Java Servers
Tuning Java Servers Tuning Java Servers
Tuning Java Servers
Srinath Perera
 
Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)
Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)
Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)
Brian Brazil
 
1025 track1 Malin
1025 track1 Malin1025 track1 Malin
1025 track1 Malin
Rising Media, Inc.
 
The Epistemology of Software Engineering
The Epistemology of Software EngineeringThe Epistemology of Software Engineering
The Epistemology of Software Engineering
nathanmarz
 
Practical Deep Learning
Practical Deep LearningPractical Deep Learning
Practical Deep Learning
André Karpištšenko
 
Your Code is Wrong
Your Code is WrongYour Code is Wrong
Your Code is Wrong
nathanmarz
 
Migrating to Prometheus: what we learned running it in production
Migrating to Prometheus: what we learned running it in productionMigrating to Prometheus: what we learned running it in production
Migrating to Prometheus: what we learned running it in production
Marco Pracucci
 
Building High Available and Scalable Machine Learning Applications
Building High Available and Scalable Machine Learning ApplicationsBuilding High Available and Scalable Machine Learning Applications
Building High Available and Scalable Machine Learning Applications
Yalçın Yenigün
 
Splash: User-friendly Programming Interface for Parallelizing Stochastic Lear...
Splash: User-friendly Programming Interface for Parallelizing Stochastic Lear...Splash: User-friendly Programming Interface for Parallelizing Stochastic Lear...
Splash: User-friendly Programming Interface for Parallelizing Stochastic Lear...
Turi, Inc.
 
Apache Storm based Real Time Analytics for Recommending Trending Topics and S...
Apache Storm based Real Time Analytics for Recommending Trending Topics and S...Apache Storm based Real Time Analytics for Recommending Trending Topics and S...
Apache Storm based Real Time Analytics for Recommending Trending Topics and S...
Humoyun Ahmedov
 
IEEE ICCCN 2013 - Continuous Gossip-based Aggregation through Dynamic Informa...
IEEE ICCCN 2013 - Continuous Gossip-based Aggregation through Dynamic Informa...IEEE ICCCN 2013 - Continuous Gossip-based Aggregation through Dynamic Informa...
IEEE ICCCN 2013 - Continuous Gossip-based Aggregation through Dynamic Informa...
Kalman Graffi
 
Mining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOAMining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOA
Albert Bifet
 

What's hot (20)

Real-Time Streaming Data Analysis with HTM
Real-Time Streaming Data Analysis with HTMReal-Time Streaming Data Analysis with HTM
Real-Time Streaming Data Analysis with HTM
 
HTM & Apache Flink (2016-06-27)
HTM & Apache Flink (2016-06-27)HTM & Apache Flink (2016-06-27)
HTM & Apache Flink (2016-06-27)
 
Detecting Anomalies in Streaming Data
Detecting Anomalies in Streaming DataDetecting Anomalies in Streaming Data
Detecting Anomalies in Streaming Data
 
State of NuPIC
State of NuPICState of NuPIC
State of NuPIC
 
A Practical Guide to Anomaly Detection for DevOps
A Practical Guide to Anomaly Detection for DevOpsA Practical Guide to Anomaly Detection for DevOps
A Practical Guide to Anomaly Detection for DevOps
 
Monitoring at Facebook - Ran Leibman, Facebook - DevOpsDays Tel Aviv 2015
Monitoring at Facebook - Ran Leibman, Facebook - DevOpsDays Tel Aviv 2015Monitoring at Facebook - Ran Leibman, Facebook - DevOpsDays Tel Aviv 2015
Monitoring at Facebook - Ran Leibman, Facebook - DevOpsDays Tel Aviv 2015
 
Five Things I Learned While Building Anomaly Detection Tools - Toufic Boubez ...
Five Things I Learned While Building Anomaly Detection Tools - Toufic Boubez ...Five Things I Learned While Building Anomaly Detection Tools - Toufic Boubez ...
Five Things I Learned While Building Anomaly Detection Tools - Toufic Boubez ...
 
SplunkLive! Prelert Session - Extending Splunk with Machine Learning
SplunkLive! Prelert Session - Extending Splunk with Machine LearningSplunkLive! Prelert Session - Extending Splunk with Machine Learning
SplunkLive! Prelert Session - Extending Splunk with Machine Learning
 
Tuning Java Servers
Tuning Java Servers Tuning Java Servers
Tuning Java Servers
 
Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)
Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)
Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)
 
1025 track1 Malin
1025 track1 Malin1025 track1 Malin
1025 track1 Malin
 
The Epistemology of Software Engineering
The Epistemology of Software EngineeringThe Epistemology of Software Engineering
The Epistemology of Software Engineering
 
Practical Deep Learning
Practical Deep LearningPractical Deep Learning
Practical Deep Learning
 
Your Code is Wrong
Your Code is WrongYour Code is Wrong
Your Code is Wrong
 
Migrating to Prometheus: what we learned running it in production
Migrating to Prometheus: what we learned running it in productionMigrating to Prometheus: what we learned running it in production
Migrating to Prometheus: what we learned running it in production
 
Building High Available and Scalable Machine Learning Applications
Building High Available and Scalable Machine Learning ApplicationsBuilding High Available and Scalable Machine Learning Applications
Building High Available and Scalable Machine Learning Applications
 
Splash: User-friendly Programming Interface for Parallelizing Stochastic Lear...
Splash: User-friendly Programming Interface for Parallelizing Stochastic Lear...Splash: User-friendly Programming Interface for Parallelizing Stochastic Lear...
Splash: User-friendly Programming Interface for Parallelizing Stochastic Lear...
 
Apache Storm based Real Time Analytics for Recommending Trending Topics and S...
Apache Storm based Real Time Analytics for Recommending Trending Topics and S...Apache Storm based Real Time Analytics for Recommending Trending Topics and S...
Apache Storm based Real Time Analytics for Recommending Trending Topics and S...
 
IEEE ICCCN 2013 - Continuous Gossip-based Aggregation through Dynamic Informa...
IEEE ICCCN 2013 - Continuous Gossip-based Aggregation through Dynamic Informa...IEEE ICCCN 2013 - Continuous Gossip-based Aggregation through Dynamic Informa...
IEEE ICCCN 2013 - Continuous Gossip-based Aggregation through Dynamic Informa...
 
Mining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOAMining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOA
 

Viewers also liked

Temporal memory in racket
Temporal memory in racketTemporal memory in racket
Temporal memory in racket
Numenta
 
Beginner's Guide to NuPIC
Beginner's Guide to NuPICBeginner's Guide to NuPIC
Beginner's Guide to NuPICNumenta
 
Real-Time Anomoly Detection with Spark MLib, Akka and Cassandra by Natalino Busa
Real-Time Anomoly Detection with Spark MLib, Akka and Cassandra by Natalino BusaReal-Time Anomoly Detection with Spark MLib, Akka and Cassandra by Natalino Busa
Real-Time Anomoly Detection with Spark MLib, Akka and Cassandra by Natalino Busa
Spark Summit
 
Anomaly Detection - New York Machine Learning
Anomaly Detection - New York Machine LearningAnomaly Detection - New York Machine Learning
Anomaly Detection - New York Machine Learning
Ted Dunning
 
Principles of Hierarchical Temporal Memory - Foundations of Machine Intelligence
Principles of Hierarchical Temporal Memory - Foundations of Machine IntelligencePrinciples of Hierarchical Temporal Memory - Foundations of Machine Intelligence
Principles of Hierarchical Temporal Memory - Foundations of Machine Intelligence
Numenta
 
2014 Fall NuPIC Hackathon Kickoff
2014 Fall NuPIC Hackathon Kickoff2014 Fall NuPIC Hackathon Kickoff
2014 Fall NuPIC Hackathon Kickoff
Numenta
 
Sparse Distributed Representations: Our Brain's Data Structure
Sparse Distributed Representations: Our Brain's Data Structure Sparse Distributed Representations: Our Brain's Data Structure
Sparse Distributed Representations: Our Brain's Data Structure
Numenta
 
Anomaly Detection at Scale
Anomaly Detection at ScaleAnomaly Detection at Scale
Anomaly Detection at Scale
Jeff Henrikson
 
Anomaly detection in deep learning
Anomaly detection in deep learningAnomaly detection in deep learning
Anomaly detection in deep learning
Adam Gibson
 
Detecting Hacks: Anomaly Detection on Networking Data
Detecting Hacks: Anomaly Detection on Networking DataDetecting Hacks: Anomaly Detection on Networking Data
Detecting Hacks: Anomaly Detection on Networking Data
James Sirota
 
Real-Time Anomaly Detection with Spark MLlib, Akka and Cassandra
Real-Time Anomaly Detection  with Spark MLlib, Akka and  CassandraReal-Time Anomaly Detection  with Spark MLlib, Akka and  Cassandra
Real-Time Anomaly Detection with Spark MLlib, Akka and Cassandra
Natalino Busa
 
Anomaly detection in real-time data streams using Heron
Anomaly detection in real-time data streams using HeronAnomaly detection in real-time data streams using Heron
Anomaly detection in real-time data streams using Heron
Arun Kejariwal
 
HawkEye : A Real-time Anomaly Detection System
HawkEye : A Real-time Anomaly Detection SystemHawkEye : A Real-time Anomaly Detection System
HawkEye : A Real-time Anomaly Detection System
Satnam Singh
 
Internship_presentation
Internship_presentationInternship_presentation
Internship_presentationAditya Gautam
 
Analytics for large-scale time series and event data
Analytics for large-scale time series and event dataAnalytics for large-scale time series and event data
Analytics for large-scale time series and event data
Anodot
 
Velocity 2015-final
Velocity 2015-finalVelocity 2015-final
Velocity 2015-final
Arun Kejariwal
 
"Building Anomaly Detection For Large Scale Analytics", Yonatan Ben Shimon, A...
"Building Anomaly Detection For Large Scale Analytics", Yonatan Ben Shimon, A..."Building Anomaly Detection For Large Scale Analytics", Yonatan Ben Shimon, A...
"Building Anomaly Detection For Large Scale Analytics", Yonatan Ben Shimon, A...
Dataconomy Media
 
Performance and Scale Options for R with Hadoop: A comparison of potential ar...
Performance and Scale Options for R with Hadoop: A comparison of potential ar...Performance and Scale Options for R with Hadoop: A comparison of potential ar...
Performance and Scale Options for R with Hadoop: A comparison of potential ar...
Revolution Analytics
 
Deploying R in BI and Real time Applications
Deploying R in BI and Real time ApplicationsDeploying R in BI and Real time Applications
Deploying R in BI and Real time Applications
Lou Bajuk
 
2014 Spring NuPIC Hackathon Kickoff
2014 Spring NuPIC Hackathon Kickoff2014 Spring NuPIC Hackathon Kickoff
2014 Spring NuPIC Hackathon KickoffNumenta
 

Viewers also liked (20)

Temporal memory in racket
Temporal memory in racketTemporal memory in racket
Temporal memory in racket
 
Beginner's Guide to NuPIC
Beginner's Guide to NuPICBeginner's Guide to NuPIC
Beginner's Guide to NuPIC
 
Real-Time Anomoly Detection with Spark MLib, Akka and Cassandra by Natalino Busa
Real-Time Anomoly Detection with Spark MLib, Akka and Cassandra by Natalino BusaReal-Time Anomoly Detection with Spark MLib, Akka and Cassandra by Natalino Busa
Real-Time Anomoly Detection with Spark MLib, Akka and Cassandra by Natalino Busa
 
Anomaly Detection - New York Machine Learning
Anomaly Detection - New York Machine LearningAnomaly Detection - New York Machine Learning
Anomaly Detection - New York Machine Learning
 
Principles of Hierarchical Temporal Memory - Foundations of Machine Intelligence
Principles of Hierarchical Temporal Memory - Foundations of Machine IntelligencePrinciples of Hierarchical Temporal Memory - Foundations of Machine Intelligence
Principles of Hierarchical Temporal Memory - Foundations of Machine Intelligence
 
2014 Fall NuPIC Hackathon Kickoff
2014 Fall NuPIC Hackathon Kickoff2014 Fall NuPIC Hackathon Kickoff
2014 Fall NuPIC Hackathon Kickoff
 
Sparse Distributed Representations: Our Brain's Data Structure
Sparse Distributed Representations: Our Brain's Data Structure Sparse Distributed Representations: Our Brain's Data Structure
Sparse Distributed Representations: Our Brain's Data Structure
 
Anomaly Detection at Scale
Anomaly Detection at ScaleAnomaly Detection at Scale
Anomaly Detection at Scale
 
Anomaly detection in deep learning
Anomaly detection in deep learningAnomaly detection in deep learning
Anomaly detection in deep learning
 
Detecting Hacks: Anomaly Detection on Networking Data
Detecting Hacks: Anomaly Detection on Networking DataDetecting Hacks: Anomaly Detection on Networking Data
Detecting Hacks: Anomaly Detection on Networking Data
 
Real-Time Anomaly Detection with Spark MLlib, Akka and Cassandra
Real-Time Anomaly Detection  with Spark MLlib, Akka and  CassandraReal-Time Anomaly Detection  with Spark MLlib, Akka and  Cassandra
Real-Time Anomaly Detection with Spark MLlib, Akka and Cassandra
 
Anomaly detection in real-time data streams using Heron
Anomaly detection in real-time data streams using HeronAnomaly detection in real-time data streams using Heron
Anomaly detection in real-time data streams using Heron
 
HawkEye : A Real-time Anomaly Detection System
HawkEye : A Real-time Anomaly Detection SystemHawkEye : A Real-time Anomaly Detection System
HawkEye : A Real-time Anomaly Detection System
 
Internship_presentation
Internship_presentationInternship_presentation
Internship_presentation
 
Analytics for large-scale time series and event data
Analytics for large-scale time series and event dataAnalytics for large-scale time series and event data
Analytics for large-scale time series and event data
 
Velocity 2015-final
Velocity 2015-finalVelocity 2015-final
Velocity 2015-final
 
"Building Anomaly Detection For Large Scale Analytics", Yonatan Ben Shimon, A...
"Building Anomaly Detection For Large Scale Analytics", Yonatan Ben Shimon, A..."Building Anomaly Detection For Large Scale Analytics", Yonatan Ben Shimon, A...
"Building Anomaly Detection For Large Scale Analytics", Yonatan Ben Shimon, A...
 
Performance and Scale Options for R with Hadoop: A comparison of potential ar...
Performance and Scale Options for R with Hadoop: A comparison of potential ar...Performance and Scale Options for R with Hadoop: A comparison of potential ar...
Performance and Scale Options for R with Hadoop: A comparison of potential ar...
 
Deploying R in BI and Real time Applications
Deploying R in BI and Real time ApplicationsDeploying R in BI and Real time Applications
Deploying R in BI and Real time Applications
 
2014 Spring NuPIC Hackathon Kickoff
2014 Spring NuPIC Hackathon Kickoff2014 Spring NuPIC Hackathon Kickoff
2014 Spring NuPIC Hackathon Kickoff
 

Similar to Detecting Anomalies in Streaming Data

Subutai Ahmad, VP of Research, Numenta at MLconf SF - 11/13/15
Subutai Ahmad, VP of Research, Numenta at MLconf SF - 11/13/15Subutai Ahmad, VP of Research, Numenta at MLconf SF - 11/13/15
Subutai Ahmad, VP of Research, Numenta at MLconf SF - 11/13/15
MLconf
 
SmartData Webinar: Applying Neocortical Research to Streaming Analytics
SmartData Webinar: Applying Neocortical Research to Streaming AnalyticsSmartData Webinar: Applying Neocortical Research to Streaming Analytics
SmartData Webinar: Applying Neocortical Research to Streaming Analytics
DATAVERSITY
 
How the Big Data of APM can Supercharge DevOps
How the Big Data of APM can Supercharge DevOpsHow the Big Data of APM can Supercharge DevOps
How the Big Data of APM can Supercharge DevOps
CA Technologies
 
Chris Irwin - Business Development Director, Tridium
Chris Irwin - Business Development Director, TridiumChris Irwin - Business Development Director, Tridium
Chris Irwin - Business Development Director, Tridium
Global Business Intelligence
 
Real Time Business Platform by Ivan Novick from Pivotal
Real Time Business Platform by Ivan Novick from PivotalReal Time Business Platform by Ivan Novick from Pivotal
Real Time Business Platform by Ivan Novick from Pivotal
VMware Tanzu Korea
 
Apeman masta midih-oc2_demo_day
Apeman masta midih-oc2_demo_dayApeman masta midih-oc2_demo_day
Apeman masta midih-oc2_demo_day
MIDIH_EU
 
Machine Intelligence in Manufacturing Industry - Igor Mihajlovic
Machine Intelligence in Manufacturing Industry - Igor MihajlovicMachine Intelligence in Manufacturing Industry - Igor Mihajlovic
Machine Intelligence in Manufacturing Industry - Igor Mihajlovic
Institute of Contemporary Sciences
 
Fraud Analytics with Machine Learning and Big Data Engineering for Telecom
Fraud Analytics with Machine Learning and Big Data Engineering for TelecomFraud Analytics with Machine Learning and Big Data Engineering for Telecom
Fraud Analytics with Machine Learning and Big Data Engineering for Telecom
Sudarson Roy Pratihar
 
Anomaly detection - TIBCO Data Science Central
Anomaly detection - TIBCO Data Science CentralAnomaly detection - TIBCO Data Science Central
Anomaly detection - TIBCO Data Science Central
Michael O'Connell
 
Inside Kafka Streams—Monitoring Comcast’s Outside Plant
Inside Kafka Streams—Monitoring Comcast’s Outside Plant Inside Kafka Streams—Monitoring Comcast’s Outside Plant
Inside Kafka Streams—Monitoring Comcast’s Outside Plant
confluent
 
Building data intensive applications
Building data intensive applicationsBuilding data intensive applications
Building data intensive applications
Amit Kejriwal
 
Observability - The good, the bad and the ugly Xp Days 2019 Kiev Ukraine
Observability -  The good, the bad and the ugly Xp Days 2019 Kiev Ukraine Observability -  The good, the bad and the ugly Xp Days 2019 Kiev Ukraine
Observability - The good, the bad and the ugly Xp Days 2019 Kiev Ukraine
Aleksandr Tavgen
 
Next Gen Big Data Analytics with Apache Apex
Next Gen Big Data Analytics with Apache Apex Next Gen Big Data Analytics with Apache Apex
Next Gen Big Data Analytics with Apache Apex
DataWorks Summit/Hadoop Summit
 
Performance tuning Grails applications
 Performance tuning Grails applications Performance tuning Grails applications
Performance tuning Grails applications
GR8Conf
 
NetFlow Analyzer Training Part II : Diagnosing and troubleshooting traffic is...
NetFlow Analyzer Training Part II : Diagnosing and troubleshooting traffic is...NetFlow Analyzer Training Part II : Diagnosing and troubleshooting traffic is...
NetFlow Analyzer Training Part II : Diagnosing and troubleshooting traffic is...
ManageEngine, Zoho Corporation
 
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache ApexHadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
Apache Apex
 
Anomaly Detection - Real World Scenarios, Approaches and Live Implementation
Anomaly Detection - Real World Scenarios, Approaches and Live ImplementationAnomaly Detection - Real World Scenarios, Approaches and Live Implementation
Anomaly Detection - Real World Scenarios, Approaches and Live Implementation
Impetus Technologies
 
CNIT 121: 6 Discovering the Scope of the Incident & 7 Live Data Collection
CNIT 121: 6 Discovering the Scope of the Incident & 7 Live Data CollectionCNIT 121: 6 Discovering the Scope of the Incident & 7 Live Data Collection
CNIT 121: 6 Discovering the Scope of the Incident & 7 Live Data Collection
Sam Bowne
 
Wikibon #IoT #HyperConvergence Presentation via @theCUBE
Wikibon #IoT #HyperConvergence Presentation via @theCUBE Wikibon #IoT #HyperConvergence Presentation via @theCUBE
Wikibon #IoT #HyperConvergence Presentation via @theCUBE
John Furrier
 
Hyper-Convergence CrowdChat
Hyper-Convergence CrowdChatHyper-Convergence CrowdChat
Hyper-Convergence CrowdChat
Wikibon Community
 

Similar to Detecting Anomalies in Streaming Data (20)

Subutai Ahmad, VP of Research, Numenta at MLconf SF - 11/13/15
Subutai Ahmad, VP of Research, Numenta at MLconf SF - 11/13/15Subutai Ahmad, VP of Research, Numenta at MLconf SF - 11/13/15
Subutai Ahmad, VP of Research, Numenta at MLconf SF - 11/13/15
 
SmartData Webinar: Applying Neocortical Research to Streaming Analytics
SmartData Webinar: Applying Neocortical Research to Streaming AnalyticsSmartData Webinar: Applying Neocortical Research to Streaming Analytics
SmartData Webinar: Applying Neocortical Research to Streaming Analytics
 
How the Big Data of APM can Supercharge DevOps
How the Big Data of APM can Supercharge DevOpsHow the Big Data of APM can Supercharge DevOps
How the Big Data of APM can Supercharge DevOps
 
Chris Irwin - Business Development Director, Tridium
Chris Irwin - Business Development Director, TridiumChris Irwin - Business Development Director, Tridium
Chris Irwin - Business Development Director, Tridium
 
Real Time Business Platform by Ivan Novick from Pivotal
Real Time Business Platform by Ivan Novick from PivotalReal Time Business Platform by Ivan Novick from Pivotal
Real Time Business Platform by Ivan Novick from Pivotal
 
Apeman masta midih-oc2_demo_day
Apeman masta midih-oc2_demo_dayApeman masta midih-oc2_demo_day
Apeman masta midih-oc2_demo_day
 
Machine Intelligence in Manufacturing Industry - Igor Mihajlovic
Machine Intelligence in Manufacturing Industry - Igor MihajlovicMachine Intelligence in Manufacturing Industry - Igor Mihajlovic
Machine Intelligence in Manufacturing Industry - Igor Mihajlovic
 
Fraud Analytics with Machine Learning and Big Data Engineering for Telecom
Fraud Analytics with Machine Learning and Big Data Engineering for TelecomFraud Analytics with Machine Learning and Big Data Engineering for Telecom
Fraud Analytics with Machine Learning and Big Data Engineering for Telecom
 
Anomaly detection - TIBCO Data Science Central
Anomaly detection - TIBCO Data Science CentralAnomaly detection - TIBCO Data Science Central
Anomaly detection - TIBCO Data Science Central
 
Inside Kafka Streams—Monitoring Comcast’s Outside Plant
Inside Kafka Streams—Monitoring Comcast’s Outside Plant Inside Kafka Streams—Monitoring Comcast’s Outside Plant
Inside Kafka Streams—Monitoring Comcast’s Outside Plant
 
Building data intensive applications
Building data intensive applicationsBuilding data intensive applications
Building data intensive applications
 
Observability - The good, the bad and the ugly Xp Days 2019 Kiev Ukraine
Observability -  The good, the bad and the ugly Xp Days 2019 Kiev Ukraine Observability -  The good, the bad and the ugly Xp Days 2019 Kiev Ukraine
Observability - The good, the bad and the ugly Xp Days 2019 Kiev Ukraine
 
Next Gen Big Data Analytics with Apache Apex
Next Gen Big Data Analytics with Apache Apex Next Gen Big Data Analytics with Apache Apex
Next Gen Big Data Analytics with Apache Apex
 
Performance tuning Grails applications
 Performance tuning Grails applications Performance tuning Grails applications
Performance tuning Grails applications
 
NetFlow Analyzer Training Part II : Diagnosing and troubleshooting traffic is...
NetFlow Analyzer Training Part II : Diagnosing and troubleshooting traffic is...NetFlow Analyzer Training Part II : Diagnosing and troubleshooting traffic is...
NetFlow Analyzer Training Part II : Diagnosing and troubleshooting traffic is...
 
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache ApexHadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
 
Anomaly Detection - Real World Scenarios, Approaches and Live Implementation
Anomaly Detection - Real World Scenarios, Approaches and Live ImplementationAnomaly Detection - Real World Scenarios, Approaches and Live Implementation
Anomaly Detection - Real World Scenarios, Approaches and Live Implementation
 
CNIT 121: 6 Discovering the Scope of the Incident & 7 Live Data Collection
CNIT 121: 6 Discovering the Scope of the Incident & 7 Live Data CollectionCNIT 121: 6 Discovering the Scope of the Incident & 7 Live Data Collection
CNIT 121: 6 Discovering the Scope of the Incident & 7 Live Data Collection
 
Wikibon #IoT #HyperConvergence Presentation via @theCUBE
Wikibon #IoT #HyperConvergence Presentation via @theCUBE Wikibon #IoT #HyperConvergence Presentation via @theCUBE
Wikibon #IoT #HyperConvergence Presentation via @theCUBE
 
Hyper-Convergence CrowdChat
Hyper-Convergence CrowdChatHyper-Convergence CrowdChat
Hyper-Convergence CrowdChat
 

More from Numenta

Deep learning at the edge: 100x Inference improvement on edge devices
Deep learning at the edge: 100x Inference improvement on edge devicesDeep learning at the edge: 100x Inference improvement on edge devices
Deep learning at the edge: 100x Inference improvement on edge devices
Numenta
 
Brains@Bay Meetup: A Primer on Neuromodulatory Systems - Srikanth Ramaswamy
Brains@Bay Meetup: A Primer on Neuromodulatory Systems - Srikanth RamaswamyBrains@Bay Meetup: A Primer on Neuromodulatory Systems - Srikanth Ramaswamy
Brains@Bay Meetup: A Primer on Neuromodulatory Systems - Srikanth Ramaswamy
Numenta
 
Brains@Bay Meetup: How to Evolve Your Own Lab Rat - Thomas Miconi
Brains@Bay Meetup: How to Evolve Your Own Lab Rat - Thomas MiconiBrains@Bay Meetup: How to Evolve Your Own Lab Rat - Thomas Miconi
Brains@Bay Meetup: How to Evolve Your Own Lab Rat - Thomas Miconi
Numenta
 
Brains@Bay Meetup: The Increasing Role of Sensorimotor Experience in Artifici...
Brains@Bay Meetup: The Increasing Role of Sensorimotor Experience in Artifici...Brains@Bay Meetup: The Increasing Role of Sensorimotor Experience in Artifici...
Brains@Bay Meetup: The Increasing Role of Sensorimotor Experience in Artifici...
Numenta
 
Brains@Bay Meetup: Open-ended Skill Acquisition in Humans and Machines: An Ev...
Brains@Bay Meetup: Open-ended Skill Acquisition in Humans and Machines: An Ev...Brains@Bay Meetup: Open-ended Skill Acquisition in Humans and Machines: An Ev...
Brains@Bay Meetup: Open-ended Skill Acquisition in Humans and Machines: An Ev...
Numenta
 
Brains@Bay Meetup: The Effect of Sensorimotor Learning on the Learned Represe...
Brains@Bay Meetup: The Effect of Sensorimotor Learning on the Learned Represe...Brains@Bay Meetup: The Effect of Sensorimotor Learning on the Learned Represe...
Brains@Bay Meetup: The Effect of Sensorimotor Learning on the Learned Represe...
Numenta
 
SBMT 2021: Can Neuroscience Insights Transform AI? - Lawrence Spracklen
SBMT 2021: Can Neuroscience Insights Transform AI? - Lawrence SpracklenSBMT 2021: Can Neuroscience Insights Transform AI? - Lawrence Spracklen
SBMT 2021: Can Neuroscience Insights Transform AI? - Lawrence Spracklen
Numenta
 
FPGA Conference 2021: Breaking the TOPS ceiling with sparse neural networks -...
FPGA Conference 2021: Breaking the TOPS ceiling with sparse neural networks -...FPGA Conference 2021: Breaking the TOPS ceiling with sparse neural networks -...
FPGA Conference 2021: Breaking the TOPS ceiling with sparse neural networks -...
Numenta
 
BAAI Conference 2021: The Thousand Brains Theory - A Roadmap for Creating Mac...
BAAI Conference 2021: The Thousand Brains Theory - A Roadmap for Creating Mac...BAAI Conference 2021: The Thousand Brains Theory - A Roadmap for Creating Mac...
BAAI Conference 2021: The Thousand Brains Theory - A Roadmap for Creating Mac...
Numenta
 
Jeff Hawkins NAISys 2020: How the Brain Uses Reference Frames, Why AI Needs t...
Jeff Hawkins NAISys 2020: How the Brain Uses Reference Frames, Why AI Needs t...Jeff Hawkins NAISys 2020: How the Brain Uses Reference Frames, Why AI Needs t...
Jeff Hawkins NAISys 2020: How the Brain Uses Reference Frames, Why AI Needs t...
Numenta
 
OpenAI’s GPT 3 Language Model - guest Steve Omohundro
OpenAI’s GPT 3 Language Model - guest Steve OmohundroOpenAI’s GPT 3 Language Model - guest Steve Omohundro
OpenAI’s GPT 3 Language Model - guest Steve Omohundro
Numenta
 
CVPR 2020 Workshop: Sparsity in the neocortex, and its implications for conti...
CVPR 2020 Workshop: Sparsity in the neocortex, and its implications for conti...CVPR 2020 Workshop: Sparsity in the neocortex, and its implications for conti...
CVPR 2020 Workshop: Sparsity in the neocortex, and its implications for conti...
Numenta
 
Sparsity In The Neocortex, And Its Implications For Machine Learning
Sparsity In The Neocortex,  And Its Implications For Machine LearningSparsity In The Neocortex,  And Its Implications For Machine Learning
Sparsity In The Neocortex, And Its Implications For Machine Learning
Numenta
 
The Thousand Brains Theory: A Framework for Understanding the Neocortex and B...
The Thousand Brains Theory: A Framework for Understanding the Neocortex and B...The Thousand Brains Theory: A Framework for Understanding the Neocortex and B...
The Thousand Brains Theory: A Framework for Understanding the Neocortex and B...
Numenta
 
Jeff Hawkins Human Brain Project Summit Keynote: "Location, Location, Locatio...
Jeff Hawkins Human Brain Project Summit Keynote: "Location, Location, Locatio...Jeff Hawkins Human Brain Project Summit Keynote: "Location, Location, Locatio...
Jeff Hawkins Human Brain Project Summit Keynote: "Location, Location, Locatio...
Numenta
 
Location, Location, Location - A Framework for Intelligence and Cortical Comp...
Location, Location, Location - A Framework for Intelligence and Cortical Comp...Location, Location, Location - A Framework for Intelligence and Cortical Comp...
Location, Location, Location - A Framework for Intelligence and Cortical Comp...
Numenta
 
Have We Missed Half of What the Neocortex Does? A New Predictive Framework ...
 Have We Missed Half of What the Neocortex Does?  A New Predictive Framework ... Have We Missed Half of What the Neocortex Does?  A New Predictive Framework ...
Have We Missed Half of What the Neocortex Does? A New Predictive Framework ...
Numenta
 
Locations in the Neocortex: A Theory of Sensorimotor Prediction Using Cortica...
Locations in the Neocortex: A Theory of Sensorimotor Prediction Using Cortica...Locations in the Neocortex: A Theory of Sensorimotor Prediction Using Cortica...
Locations in the Neocortex: A Theory of Sensorimotor Prediction Using Cortica...
Numenta
 
The Predictive Neuron: How Active Dendrites Enable Spatiotemporal Computation...
The Predictive Neuron: How Active Dendrites Enable Spatiotemporal Computation...The Predictive Neuron: How Active Dendrites Enable Spatiotemporal Computation...
The Predictive Neuron: How Active Dendrites Enable Spatiotemporal Computation...
Numenta
 
The Biological Path Toward Strong AI by Matt Taylor (05/17/18)
The Biological Path Toward Strong AI by Matt Taylor (05/17/18)The Biological Path Toward Strong AI by Matt Taylor (05/17/18)
The Biological Path Toward Strong AI by Matt Taylor (05/17/18)
Numenta
 

More from Numenta (20)

Deep learning at the edge: 100x Inference improvement on edge devices
Deep learning at the edge: 100x Inference improvement on edge devicesDeep learning at the edge: 100x Inference improvement on edge devices
Deep learning at the edge: 100x Inference improvement on edge devices
 
Brains@Bay Meetup: A Primer on Neuromodulatory Systems - Srikanth Ramaswamy
Brains@Bay Meetup: A Primer on Neuromodulatory Systems - Srikanth RamaswamyBrains@Bay Meetup: A Primer on Neuromodulatory Systems - Srikanth Ramaswamy
Brains@Bay Meetup: A Primer on Neuromodulatory Systems - Srikanth Ramaswamy
 
Brains@Bay Meetup: How to Evolve Your Own Lab Rat - Thomas Miconi
Brains@Bay Meetup: How to Evolve Your Own Lab Rat - Thomas MiconiBrains@Bay Meetup: How to Evolve Your Own Lab Rat - Thomas Miconi
Brains@Bay Meetup: How to Evolve Your Own Lab Rat - Thomas Miconi
 
Brains@Bay Meetup: The Increasing Role of Sensorimotor Experience in Artifici...
Brains@Bay Meetup: The Increasing Role of Sensorimotor Experience in Artifici...Brains@Bay Meetup: The Increasing Role of Sensorimotor Experience in Artifici...
Brains@Bay Meetup: The Increasing Role of Sensorimotor Experience in Artifici...
 
Brains@Bay Meetup: Open-ended Skill Acquisition in Humans and Machines: An Ev...
Brains@Bay Meetup: Open-ended Skill Acquisition in Humans and Machines: An Ev...Brains@Bay Meetup: Open-ended Skill Acquisition in Humans and Machines: An Ev...
Brains@Bay Meetup: Open-ended Skill Acquisition in Humans and Machines: An Ev...
 
Brains@Bay Meetup: The Effect of Sensorimotor Learning on the Learned Represe...
Brains@Bay Meetup: The Effect of Sensorimotor Learning on the Learned Represe...Brains@Bay Meetup: The Effect of Sensorimotor Learning on the Learned Represe...
Brains@Bay Meetup: The Effect of Sensorimotor Learning on the Learned Represe...
 
SBMT 2021: Can Neuroscience Insights Transform AI? - Lawrence Spracklen
SBMT 2021: Can Neuroscience Insights Transform AI? - Lawrence SpracklenSBMT 2021: Can Neuroscience Insights Transform AI? - Lawrence Spracklen
SBMT 2021: Can Neuroscience Insights Transform AI? - Lawrence Spracklen
 
FPGA Conference 2021: Breaking the TOPS ceiling with sparse neural networks -...
FPGA Conference 2021: Breaking the TOPS ceiling with sparse neural networks -...FPGA Conference 2021: Breaking the TOPS ceiling with sparse neural networks -...
FPGA Conference 2021: Breaking the TOPS ceiling with sparse neural networks -...
 
BAAI Conference 2021: The Thousand Brains Theory - A Roadmap for Creating Mac...
BAAI Conference 2021: The Thousand Brains Theory - A Roadmap for Creating Mac...BAAI Conference 2021: The Thousand Brains Theory - A Roadmap for Creating Mac...
BAAI Conference 2021: The Thousand Brains Theory - A Roadmap for Creating Mac...
 
Jeff Hawkins NAISys 2020: How the Brain Uses Reference Frames, Why AI Needs t...
Jeff Hawkins NAISys 2020: How the Brain Uses Reference Frames, Why AI Needs t...Jeff Hawkins NAISys 2020: How the Brain Uses Reference Frames, Why AI Needs t...
Jeff Hawkins NAISys 2020: How the Brain Uses Reference Frames, Why AI Needs t...
 
OpenAI’s GPT 3 Language Model - guest Steve Omohundro
OpenAI’s GPT 3 Language Model - guest Steve OmohundroOpenAI’s GPT 3 Language Model - guest Steve Omohundro
OpenAI’s GPT 3 Language Model - guest Steve Omohundro
 
CVPR 2020 Workshop: Sparsity in the neocortex, and its implications for conti...
CVPR 2020 Workshop: Sparsity in the neocortex, and its implications for conti...CVPR 2020 Workshop: Sparsity in the neocortex, and its implications for conti...
CVPR 2020 Workshop: Sparsity in the neocortex, and its implications for conti...
 
Sparsity In The Neocortex, And Its Implications For Machine Learning
Sparsity In The Neocortex,  And Its Implications For Machine LearningSparsity In The Neocortex,  And Its Implications For Machine Learning
Sparsity In The Neocortex, And Its Implications For Machine Learning
 
The Thousand Brains Theory: A Framework for Understanding the Neocortex and B...
The Thousand Brains Theory: A Framework for Understanding the Neocortex and B...The Thousand Brains Theory: A Framework for Understanding the Neocortex and B...
The Thousand Brains Theory: A Framework for Understanding the Neocortex and B...
 
Jeff Hawkins Human Brain Project Summit Keynote: "Location, Location, Locatio...
Jeff Hawkins Human Brain Project Summit Keynote: "Location, Location, Locatio...Jeff Hawkins Human Brain Project Summit Keynote: "Location, Location, Locatio...
Jeff Hawkins Human Brain Project Summit Keynote: "Location, Location, Locatio...
 
Location, Location, Location - A Framework for Intelligence and Cortical Comp...
Location, Location, Location - A Framework for Intelligence and Cortical Comp...Location, Location, Location - A Framework for Intelligence and Cortical Comp...
Location, Location, Location - A Framework for Intelligence and Cortical Comp...
 
Have We Missed Half of What the Neocortex Does? A New Predictive Framework ...
 Have We Missed Half of What the Neocortex Does?  A New Predictive Framework ... Have We Missed Half of What the Neocortex Does?  A New Predictive Framework ...
Have We Missed Half of What the Neocortex Does? A New Predictive Framework ...
 
Locations in the Neocortex: A Theory of Sensorimotor Prediction Using Cortica...
Locations in the Neocortex: A Theory of Sensorimotor Prediction Using Cortica...Locations in the Neocortex: A Theory of Sensorimotor Prediction Using Cortica...
Locations in the Neocortex: A Theory of Sensorimotor Prediction Using Cortica...
 
The Predictive Neuron: How Active Dendrites Enable Spatiotemporal Computation...
The Predictive Neuron: How Active Dendrites Enable Spatiotemporal Computation...The Predictive Neuron: How Active Dendrites Enable Spatiotemporal Computation...
The Predictive Neuron: How Active Dendrites Enable Spatiotemporal Computation...
 
The Biological Path Toward Strong AI by Matt Taylor (05/17/18)
The Biological Path Toward Strong AI by Matt Taylor (05/17/18)The Biological Path Toward Strong AI by Matt Taylor (05/17/18)
The Biological Path Toward Strong AI by Matt Taylor (05/17/18)
 

Recently uploaded

zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
Alex Pruden
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
RinaMondal9
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
James Anderson
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 
Quantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsQuantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIs
Vlad Stirbu
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
Jen Stirrup
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 

Recently uploaded (20)

zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
Quantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsQuantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIs
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 

Detecting Anomalies in Streaming Data

  • 1. DETECTING ANOMALIES IN STREAMING DATA Data By The Bay May 19, 2016 Subutai Ahmad @SubutaiAhmad sahmad@numenta.com
  • 2. OUTLINE • Real-time streaming analytics • Anomaly detection with Hierarchical Temporal Memory • Benchmarking real-time anomaly detection • Summary
  • 3. Monitoring IT infrastructure Uncovering fraudulent transactions Tracking vehicles Real-time health monitoring Monitoring energy consumption Detection is necessary, but prevention is often the goal REAL-TIME ANOMALY DETECTION •  Exponential growth in IoT, sensors and real-time data collection is driving an explosion of streaming data •  The biggest application for machine learning is anomaly detection
  • 5. EXAMPLE: PREVENTIVE MAINTENANCE Planned shutdown Behavioral change preceding failure Catastrophic failure
  • 6. THE STREAMING ANALYTICS PROBLEM Given all past input and current input, decide whether the system behavior is anomalous right now. Must report decision, perform any retraining, bookkeeping, etc. before next input arrives. No look-ahead No training/test set split – everything must be done online System must be automated, and customized to each stream
  • 7. HIERARCHICAL TEMPORAL MEMORY (HTM) • Powerful sequence memory derived from recent findings in experimental neuroscience • High capacity memory based system • Models temporal sequences in data • Inherently streaming • Continuously learning and predicting • No need to tune hyper-parameters • Open source: github.com/numenta
  • 8. HTM PREDICTS FUTURE INPUT • Input to the system is a stream of data • Encoded into a sparse high dimensional vector • Learns temporal sequences in input stream and makes a prediction in the form of a sparse vector •  represents a prediction for upcoming input HTM
  • 9. ANOMALY DETECTION WITH HTM HTM Raw anomaly score Anomaly likelihood is an instantaneous measure of prediction error •  0 if input was perfectly prediction •  1 if it was completely unpredicted •  Could threshold it directly to report anomalies, but in very noisy environments we can do better
  • 10. ANOMALY LIKELIHOOD • Second order measure: did the predictability of the metric change? 1.  Estimate historical distribution of anomaly scores 2.  Check if recent scores are very different
  • 11. ANOMALY LIKELIHOOD • Second order measure: did the predictability of the metric change? 1.  Estimate historical distribution of anomaly scores 2.  Check if recent scores are very different
  • 12. ANOMALY DETECTION WITH HTM HTM Raw anomaly score Anomaly likelihood Learns temporal sequences Continuously makes predictions Continuously learning Was current input predicted? Has level of predictability changed significantly?
  • 13. ANOMALIES IN IT INFRASTRUCTURE • Grok •  Commercial server based product detects anomalies in IT infrastructure •  Runs thousands of HTM anomaly detectors in real time •  10 milliseconds per input per metric, including continuous learning •  No parameter tuning required •  http://grokstream.com
  • 14. ANOMALIES IN FINANCIAL DATA • HTM for Stocks •  Real-time free demo application •  Continuously monitors top 200 stocks •  Available on iOS App Store or Google Play Store •  Open source application: github.com/numenta/numenta-apps
  • 15. OUTLINE • Real-time streaming analytics • Anomaly detection with Hierarchical Temporal Memory • Benchmarking real-time anomaly detection • Summary
  • 16. EVALUATING STREAMING ANOMALY DETECTION •  Most existing benchmarks are designed for batch data, not streaming data •  Hard to find benchmarks containing real world data labeled with anomalies •  There is a need for an open benchmark designed to test real-time anomaly detection •  A standard community benchmark could spur innovation in streaming anomaly detection algorithms
  • 17. NUMENTA ANOMALY BENCHMARK (NAB) •  NAB: a rigorous benchmark for anomaly detection in streaming applications
  • 18. NUMENTA ANOMALY BENCHMARK (NAB) •  NAB: a rigorous benchmark for anomaly detection in streaming applications •  Real-world benchmark data set •  58 labeled data streams (47 real-world, 11 artificial streams) •  Total of 365,551 data points
  • 19. NUMENTA ANOMALY BENCHMARK (NAB) •  NAB: a rigorous benchmark for anomaly detection in streaming applications •  Real-world benchmark data set •  58 labeled data streams (47 real-world, 11 artificial streams) •  Total of 365,551 data points •  Scoring mechanism •  Rewards early detection •  Different “application profiles”
  • 20. NUMENTA ANOMALY BENCHMARK (NAB) •  NAB: a rigorous benchmark for anomaly detection in streaming applications •  Real-world benchmark data set •  58 labeled data streams (47 real-world, 11 artificial streams) •  Total of 365,551 data points •  Scoring mechanism •  Rewards early detection •  Different “application profiles” •  Open resource •  AGPL repository contains data, source code, and documentation •  github.com/numenta/NAB •  Ongoing competition to expand NAB
  • 21. EXAMPLE: HOURLY SERVICE DEMAND Spike in demand Unusually low demand
  • 22. EXAMPLE: PRODUCTION SERVER CPU Spiking behavior becomes the new norm Spike anomaly
  • 23. HOW SHOULD WE SCORE ANOMALIES? •  The perfect detector •  Detects anomalies as soon as possible •  Provides detections in real time •  Triggers no false alarms •  Requires no parameter tuning •  Automatically adapts to changing statistics •  Scoring methods in traditional benchmarks are insufficient •  Precision/recall does not incorporate importance of early detection •  Artificial separation into training and test sets does not handle continuous learning •  Batch data files allow look ahead and multiple passes through the data
  • 24. WHERE IS THE ANOMALY?
  • 25. NAB DEFINES ANOMALY WINDOWS NAB scoring function gives higher score to earlier detections in window
  • 26. OTHER DETAILS •  Application profiles •  Three application profiles assign different weightings based on the tradeoff between false positives and false negatives. •  EKG data on a cardiac patient favors False Positives. •  IT / DevOps professionals hate False Positives. •  Three application profiles: standard, favor low false positives, favor low false negatives. •  NAB emulates practical real-time scenarios •  Look ahead not allowed for algorithms. Detections must be made on the fly. •  No separation between training and test files. Invoke model, start streaming, and go. •  No batch parameter tuning. Must be fully automated with single set of parameters across data streams. Any further parameter tuning must be done on the fly.
  • 27. TESTING ALGORITHMS WITH NAB •  NAB is designed to easily plug in and test new algorithms •  Results with several algorithms: •  Hierarchical Temporal Memory •  Etsy Skyline •  Popular open source anomaly detection technique •  Mixture of statistical experts, continuously learning •  Twitter ADVec •  Open source anomaly detection released last year •  Robust outlier statistics + piecewise approximation •  Bayesian Online Change Point Detection •  Formal Bayesian method for detecting anomalies in time series
  • 28. NAB V1.0 RESULTS (58 FILES)
  • 29. DETECTION RESULTS: CPU USAGE ON PRODUCTION SERVER Simple spike, all 3 algorithms detect Shift in usage Etsy Skyline Numenta HTM Twitter ADVec Red denotes False Positive Key
  • 30. DETECTION RESULTS: MACHINE TEMPERATURE READINGS HTM detects purely temporal anomaly Etsy Skyline Numenta HTM Twitter ADVec Red denotes False Positive Key All 3 detect catastrophic failure
  • 31. DETECTION RESULTS: TEMPORAL CHANGES IN BEHAVIOR OFTEN PRECEDE A LARGER SHIFT HTM detects anomaly 3 hours earlier Etsy Skyline Numenta HTM Twitter ADVec Red denotes False Positive Key
  • 32. NAB COMPETITION!! •  NAB is a resource for the streaming analytics community •  Need additional real-world data files and more algorithms tested •  NAB Competition offers cash prizes for: •  Additional anomaly detection algorithms tested on NAB •  Submission of real-world data files with labeled real anomalies •  Cash prizes of $2,500 each for algorithms and data •  Easy to enter, high likelihood of winning! •  Go to http://numenta.org/nab for details
  • 33. SUMMARY •  Anomaly detection for streaming data imposes unique challenges •  Stringent real-time constraints and automation requirements •  Typical batch methodologies do not work well •  HTM learning algorithms •  Can be used to create a streaming anomaly detection system •  Performs very well across a wide range of datasets •  Open source, commercially deployable •  NAB is an open source benchmark for streaming anomaly detection •  Includes a labeled dataset with real world data •  Scoring methodology designed for practical real-time applications •  NAB competition!
  • 34. RESOURCES Grok (anomalies in IT infrastructure): http://grokstream.com HTM Studio (desktop app for easy experimentation): contact me Open Source Repositories: Algorithm code: https://github.com/numenta/nupic HTM Stocks demo: https://github.com/numenta/numenta-apps NAB code + paper: https://github.com/numenta/nab Apache Flink: https://github.com/nupic-community/flink-htm Contact info: Subutai Ahmad sahmad@numenta.com, @SubutaiAhmad Alex Lavin alavin@numenta.com, @theAlexLavin