Machine Learning and the Elastic Stack
Dr. Stephen Dodson, Tech Lead, Machine Learning, Elastic
Overview
•  Background
•  Machine Learning Overview
•  Machine Learning and the Elastic Stack
•  Demo
•  Architecture
Background
•  Me
–  Currently, Tech Lead, Machine Learning @ Elastic
–  Formerly Founder and CTO of Prelert (acquired by Elastic in
September 2016)
‒  Presented overview of Prelert at Elastic London User Group in
May 2016
•  Prelert
–  VC backed software company, founded 2009
–  Behavioural analytics for machine data based (mainly) on
unsupervised machine learning
–  100+ customers, plus OEMs with CA, Blue Coat, NetApp and others
‒  IT Operations, IT Security, Retail analytics, IoT, etc.
Machine Learning
•  Algorithms and methods for data-driven prediction, decision making, and
modelling [1]
‒  Learn models from past behaviour (training, modelling)
‒  Use models to predict future behaviour (prediction)
‒  Use predictions to make decisions
•  Examples
‒  Image Recognition
‒  Language Translation
‒  Anomaly Detection
[1] Machine Learning Overview, Tommi Jaakkola, MIT
How is this relevant to the Elastic Stack?
•  Extracting useful, valuable information is hard
[Diagram: Search, Aggregations, Visualization, Machine Learning]
How is this relevant to the Elastic Stack?
•  What if we want to search for:
‒  Has my order rate dropped significantly?
‒  Do my application logs contain unusual messages?
‒  Are any users behaving unusually?
‒  What transactions are fraudulent?
•  Goal of ML at Elastic: Extend the Elastic Stack to allow the user to ask these types of
questions and get understandable answers
•  Constraints:
‒  Data may be limited: no markup (labelled examples) may be available or relevant
‒  Compute resource dedicated to machine learning may be limited
‒  User should not need to be a machine learning expert or data scientist
Has my order rate dropped significantly?
•  Learn models from past
behaviour (training, modelling)
•  Use models to predict future
behaviour (prediction)
•  Use predictions to make
decisions
Expected value @ 15:05 = 1859
Actual value @ 15:05 = 280
Probability = 0.0000174025
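A minimal sketch of the three steps above, in Python. It assumes the simplest possible model, a normal distribution fitted to past counts for the same time bucket. The history values are invented for illustration (chosen so that their mean equals the expected value of 1859 shown above), and the function names and the threshold are hypothetical. It does not reproduce the probability shown above and is not the model used by the Elastic or Prelert analytics.

```python
from statistics import NormalDist, mean, stdev


def fit_model(history):
    """Learn a model from past behaviour (here: a fitted normal distribution)."""
    return NormalDist(mu=mean(history), sigma=stdev(history))


def probability_of_drop(model, actual):
    """Probability, under the model, of a value at or below `actual`."""
    return model.cdf(actual)


def is_anomalous(model, actual, threshold=1e-4):
    """Decide: flag the value if it is less likely than `threshold` (assumed)."""
    return probability_of_drop(model, actual) < threshold


# Invented order counts for the 15:05 bucket on previous days (illustrative only)
history = [1820, 1875, 1902, 1840, 1861, 1833, 1890, 1851]

model = fit_model(history)
expected = model.mean                      # prediction for 15:05
actual = 280                               # observed value from the slide

print(f"Expected value @ 15:05 = {expected:.0f}")
print(f"Actual value @ 15:05 = {actual}")
print(f"Anomalous drop? {is_anomalous(model, actual)}")
```

A real-time system would also have to cope with trend, seasonality and non-normal data, which this sketch ignores.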
Demo: Simple Time Series
Do my application logs contain unusual messages?
Classify unstructured log messages by clustering similar messages
Normal log messages
Unusual log messages
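One way to picture this, sketched in Python under strong simplifying assumptions: instead of true clustering, each message is reduced to a template by masking tokens that contain digits, messages with the same template form a category, and categories that occur rarely are treated as unusual. The masking rule, the function names, the `max_count` parameter and the sample log lines are all hypothetical and are not taken from the Elastic product.

```python
import re
from collections import Counter

# Assumption: any token containing a digit (ids, durations, addresses) is variable
VARIABLE_TOKEN = re.compile(r"\S*\d\S*")


def template(message):
    """Reduce a message to a template by masking digit-bearing tokens."""
    return VARIABLE_TOKEN.sub("<*>", message)


def categorize(messages):
    """Count how many messages fall into each template (category)."""
    return Counter(template(m) for m in messages)


def unusual_messages(messages, max_count=1):
    """Return messages whose category occurs at most `max_count` times."""
    counts = categorize(messages)
    return [m for m in messages if counts[template(m)] <= max_count]


# Invented sample log lines (illustrative only)
logs = [
    "Connection from 10.0.0.1 accepted",
    "Connection from 10.0.0.2 accepted",
    "Request 4711 served in 12ms",
    "Request 4712 served in 9ms",
    "Connection from 10.0.0.3 accepted",
    "Transaction log corrupted on disk sda1",
]

for message in unusual_messages(logs):
    print("Unusual:", message)
```

Counting per category over time turns unstructured text into time series, so the same kind of rate model used for metrics can then be applied to each category.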
Demo: Multiple Data Sources
Analytics Outside of Elastic Architecture
[Diagram: Beats, Logstash, Elasticsearch and Kibana with X-Pack, plus a separate Prelert analysis node with its own Prelert UI; data is sent from Elasticsearch to the Prelert analysis node]
•  Issues
–  Data Gravity – data from Elasticsearch needed to be sent to the Prelert analysis node
–  Context – anomalies and data were stored in different data stores and viewed in different UIs
–  Scale – Prelert analysis was not easily distributable across nodes
–  Resilience – Prelert analysis needed to be restored manually on failover
Architecture
•  Machine Learning will be part of X-Pack
•  Machine Learning jobs will be automatically distributed across
the Elasticsearch cluster
•  Machine Learning jobs will be resilient to failover
•  Machine Learning results and data can be in the same cluster
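To make "automatically distributed" and "resilient to failover" concrete, here is a toy Python sketch. It is purely illustrative: the least-loaded placement rule, the function names and the job and node names are assumptions, and nothing here describes how X-Pack actually allocates jobs.

```python
def assign_jobs(jobs, nodes, assignment=None):
    """Place each unassigned job on the node currently running the fewest jobs."""
    assignment = dict(assignment or {})
    load = {node: 0 for node in nodes}
    for node in assignment.values():
        load[node] += 1
    for job in jobs:
        if job not in assignment:
            # Ties are broken by node name so the result is deterministic
            node = min(nodes, key=lambda n: (load[n], n))
            assignment[job] = node
            load[node] += 1
    return assignment


def handle_node_failure(assignment, failed_node, nodes):
    """Move jobs from a failed node onto the surviving nodes."""
    survivors = [n for n in nodes if n != failed_node]
    kept = {job: node for job, node in assignment.items() if node != failed_node}
    return assign_jobs(list(assignment), survivors, kept)


# Hypothetical job and node names
jobs = ["orders", "app-logs", "users"]
nodes = ["node-a", "node-b"]

before = assign_jobs(jobs, nodes)
after = handle_node_failure(before, "node-a", nodes)
print(before)
print(after)
```

In a real cluster the job's model state would also have to be persisted so that a reassigned job can resume rather than relearn from scratch; the sketch leaves that out.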
[Diagram: Beats, Logstash, Elasticsearch and Kibana, with X-Pack providing Security, Alerting, Monitoring, Reporting, Graph and Machine Learning (icon TBD)]
Status
•  Demo on Elastic 5.4 available at Elastic{ON} (March 7th 2017)
•  GA shortly after… (ask Sophie!)
•  Focus of initial ML product is time series analysis in real-time
‒  Metric anomaly detection
‒  Log message classification and anomaly detection
‒  Population analysis (entity profiling)
•  Shrink-wrapped configurations on Beats data - full Elastic Stack
experience!
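Population analysis can be sketched as comparing each entity with its peers rather than with its own history. The Python example below is a rough illustration under stated assumptions: it scores each entity by its distance from the population median in units of the median absolute deviation (MAD). The scoring rule, the threshold, the function name and the sample data are hypothetical and are not the product's method.

```python
from statistics import median


def unusual_entities(counts, threshold=5.0):
    """Flag entities whose value lies far from the population median.

    Distance is measured in units of the median absolute deviation (MAD),
    a robust alternative to the standard deviation.
    """
    values = list(counts.values())
    centre = median(values)
    mad = median(abs(v - centre) for v in values)
    if mad == 0:
        # Most of the population is identical: anything different stands out
        return [name for name, v in counts.items() if v != centre]
    return [
        name for name, v in counts.items()
        if abs(v - centre) / mad > threshold
    ]


# Invented requests-per-hour figures for a handful of users (illustrative only)
requests_per_user = {"alice": 42, "bob": 38, "carol": 45, "dave": 40, "eve": 930}

print(unusual_entities(requests_per_user))
```

This corresponds to the earlier question "Are any users behaving unusually?": a user is judged against the rest of the population, so no per-user history is needed.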
[Diagram: Beats, Elasticsearch with X-Pack (Alerting and Machine Learning, icon TBD) and Kibana]
