SlideShare a Scribd company logo
1 of 25
Copyright © 2016 Splunk Inc.
Operationalizing
Machine Learning
Pierre Brunel
Senior Sales Engineer
Splunk, Inc.
Disclaimer
During the course of this presentation, we may make forward looking statements regarding future events
or the expected performance of the company. We caution you that such statements reflect our current
expectations and estimates based on factors currently known to us and that actual events or results
could differ materially. For important factors that may cause actual results to differ from those contained
in our forward-looking statements, please review our filings with the SEC. The forward-looking
statements made in the this presentation are being made as of the time and date of its live presentation.
If reviewed after its live presentation, this presentation may not contain current or accurate information.
We do not assume any obligation to update any forward looking statements we may make.
In addition, any information about our roadmap outlines our general product direction and is subject to
change at any time without notice. It is for informational purposes only and shall not, be incorporated
into any contract or other commitment. Splunk undertakes no obligation either to develop the features
or functionality described or to include any such feature or functionality in a future release.
Copyright © 2016 Splunk Inc.
Why do we need ML?
Copyright © 2016 Splunk Inc.
Historical Data Real-time Data Statistical Models
DB, HDFS, NoSQL, Splunk, etc Machine Learning
T – a few days T + a few days
Why is this so challenging using traditional methods?
• DATA IS STILL IN MOTION, still in a BUSINESS PROCESS.
• Enrich real-time MACHINE DATA with structured HISTORICAL DATA
• Make decisions IN REAL TIME using ALL THE DATA
Splunk
Security Operations Center
Network Operations Center
Business Operations Center
Copyright © 2016 Splunk Inc.
What is ML?
ML 101: What is it?
• Machine Learning (ML) is a process for generalizing from examples
– Examples = example or “training” data
– Generalizing = building “statistical models” to capture correlations
– Process = ML is never done, you must keep validating & refitting models
• Simple ML workflow:
– Explore data
– FIT models based on data
– APPLY models in production
– Keep validating models
“All models are wrong, but some are useful.”
- George Box
3 Types of Machine Learning
1. Supervised Learning: generalizing from labeled data
3 Types of Machine Learning
2. Unsupervised Learning: generalizing from unlabeled data
Clustering
3 Types of Machine Learning
3. Reinforcement Learning: generalizing from rewards in time
Copyright © 2016 Splunk Inc.
ML Use Cases
IT Ops: Predictive Maintenance
1. Get resource usage data (CPU, latency, outage reports)
2. Explore data, and fit predictive models on past / real-time data
3. Apply & validate models until predictions are accurate
4. Forecast resource saturation, demand & usage
5. Surface incidents to IT Ops, who INVESTIGATES & ACTS
Problem: Network outages and truck rolls cause big time & money expense
Solution: Build predictive model to forecast outage scenarios, act pre-emptively & learn
Security: Find Insider Threats
Problem: Security breaches cause big time & money expense
Solution: Build predictive model to forecast threat scenarios, act pre-emptively & learn
1. Get security data (data transfers, authentication, incidents)
2. Explore data, and fit predictive models on past / real-time data
3. Apply & validate models until predictions are accurate
4. Forecast abnormal behavior, risk scores & notable events
5. Surface incidents to Security Ops, who INVESTIGATES & ACTS
Business Analytics: Predict Customer Churn
Problem: Customer churn causes big time & money expense
Solution: Build predictive model to forecast possible churn, act pre-emptively & learn
1. Get customer data (set-top boxes, web logs, transaction history)
2. Explore data, and fit predictive models on past / real-time data
3. Apply & validate models until predictions are accurate
4. Identify customers likely to churn
5. Surface incidents to Business Ops, who INVESTIGATES & ACTS
Summary: The ML Process
Problem: <Stuff in the world> causes big time & money expense
Solution: Build predictive model to forecast <possible incidents>, act pre-emptively & learn
1. Get all relevant data to problem
2. Explore data, and fit predictive models on past / real-time data
3. Apply & validate models until predictions are accurate
4. Forecast KPIs & metrics associated to use case
5. Surface incidents to X Ops, who INVESTIGATES & ACTS
Operationalize
Copyright © 2016 Splunk Inc.
How do we do ML
with Splunk?
Analysts Business Users
1. Get Data & Find Decision-Makers!
18
IT Users
ODBC
SDK
API
DB Connect
Look-Ups
Ad Hoc
Search
Monitor
and Alert
Reports /
Analyze
Custom
Dashboards
GPS /
Cellular
Devices Networks Hadoop
Servers Applications Online
Shopping Carts
Analysts Business Users
Structured Data Sources
CRM ERP HR Billing Product Finance
Data Warehouse
Clickstreams
2. Explore Data, Build Searches & Dashboards
• Start with the Exploratory Data Analysis phase
– “80% of data science is sourcing, cleaning, and preparing the data”
• For each data source, build “data diagnostic” dashboard
– What’s interesting? Throw up some basic charts.
– What’s relevant for this use case?
– Any anomalies? Are thresholds useful?
• Mix data streams & compute aggregates
– Compute KPIs & statistics w/ stats, eventstats, etc.
– Enrich data streams with useful structured data
– stats count by X Y – where X,Y from different sources
3. Get the ML Toolkit & Showcase App
• Leverages Python for Scientific Computing (PSC) add-on:
– Open-source Python data science ecosystem
– NumPy, SciPy, scitkit-learn, pandas, statsmodels
• Showcase use cases: Hard Drive Failure, Server Power consumption,
Server Response Time, Application Usage
• Standard algorithms out of the box:
– Supervised: LogisticRegression, SVM, PCA
– Unsupervised: LinearRegression, KMeans, DBSCAN, SpectralClustering
• Implement one of 300+ algorithms by editing Python scripts
4. Fit, Apply & Validate Models
• ML SPL – New grammar for doing ML in Splunk
• fit – fit models based on training data
– [training data] | fit LinearRegression costly_KPI
from feature1 feature2 feature3 into my_model
• apply – apply models on testing and production data
– [testing/production data] | apply my_model
• Validate Your Model (The Hard Part)
– Why hard? Because statistics is hard! Also: model error ≠ real world risk.
– Analyze residuals, mean-square error, goodness of fit, cross-validate, etc.
– Take Splunk’s Analytics & Data Science Education course
5. Operationalize Your Models
• Remember the ML Process:
1. Get data
2. Explore data & fit models
3. Apply & validate models
4. Forecast KPIs
5. Surface incidents to Ops team
• Then operationalize: feed back Ops analysis to data inputs, repeat
• Lots of hard work & stats, but lots of value will come out.
Operationalize
Copyright © 2016 Splunk Inc.
Show me the ML!
Sneak Peak Recap: What’s new in GA
• More use cases to explore
• Support added for Search Head Clustering
• Removed 50k limit on model fitting
• Sampling for training/test data
• Guided ML via a ML Assistant aka Model / Query Builder
• Install on 6.4 Search Head
Next Steps with Splunk ML
• Reach out to your Tech Team! We can help architect ML workflows.
• Lots of ML commands in Core Splunk (predict, anomalydetection, stats)
• ML Toolkit & Showcase – available and free, ready to use
• Splunk ITSI: Applied ML for ITOA use cases
– Manage 1000s of KPIs & alerts
– Adaptive Thresholding & Anomaly Detection
• Splunk UBA: Applied ML for Security
– Unsupervised learning of Users & Entities
– Surfaces Anomalies & Threats
• ML Customer Advisory Program:
– Connect with Product & Engineering teams - mlprogram@splunk.com
Copyright © 2016 Splunk Inc.
• September 26-29, 2016
• The Disney Swan and Dolphin, Orlando
• 5000+ IT & Business Professionals
• 3 days of technical content
• 165+ sessions
• 3 days of Splunk University
• Sept 24-26, 2016
• Get Splunk Certified for FREE!
• Get CPE credits for CISSP, CAP, SSCP
• Save thousands on Splunk education!
• 80+ Customer Speakers
• 35+ Apps in Splunk Apps Showcase
• 75+ Technology Partners
• 1:1 networking: Ask The Experts and
• Security Experts, Birds of a Feather and Chalk Talks
• NEW hands-on labs!
• Expanded show floor, Dashboards Control Room &
Clinic, and MORE!
.conf2016: The 7th Annual
Splunk Worldwide Users’ Conference
Copyright © 2016 Splunk Inc.
Thank you!

More Related Content

What's hot

What's hot (20)

Getting Started with Splunk Enterprise Hands-On
Getting Started with Splunk Enterprise Hands-OnGetting Started with Splunk Enterprise Hands-On
Getting Started with Splunk Enterprise Hands-On
 
Machine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout SessionMachine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout Session
 
Leverage Machine Data
Leverage Machine DataLeverage Machine Data
Leverage Machine Data
 
Splunk Enterpise for Information Security Hands-On
Splunk Enterpise for Information Security Hands-OnSplunk Enterpise for Information Security Hands-On
Splunk Enterpise for Information Security Hands-On
 
SplunkLive! Munich 2018: Data Onboarding Overview
SplunkLive! Munich 2018: Data Onboarding OverviewSplunkLive! Munich 2018: Data Onboarding Overview
SplunkLive! Munich 2018: Data Onboarding Overview
 
SplunkLive! Munich 2018: Intro to Security Analytics Methods
SplunkLive! Munich 2018: Intro to Security Analytics MethodsSplunkLive! Munich 2018: Intro to Security Analytics Methods
SplunkLive! Munich 2018: Intro to Security Analytics Methods
 
Machine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout SessionMachine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout Session
 
Drive more value through data source and use case optimization
Drive more value through data source and use case optimization Drive more value through data source and use case optimization
Drive more value through data source and use case optimization
 
Splunk Enterprise for InfoSec Hands-On Breakout Session
Splunk Enterprise for InfoSec Hands-On Breakout SessionSplunk Enterprise for InfoSec Hands-On Breakout Session
Splunk Enterprise for InfoSec Hands-On Breakout Session
 
SplunkLive! Frankfurt 2018 - Integrating Metrics & Logs
SplunkLive! Frankfurt 2018 - Integrating Metrics & LogsSplunkLive! Frankfurt 2018 - Integrating Metrics & Logs
SplunkLive! Frankfurt 2018 - Integrating Metrics & Logs
 
Devops Powered by Splunk
Devops Powered by SplunkDevops Powered by Splunk
Devops Powered by Splunk
 
Getting Started with Splunk Enterprise Hands-On
Getting Started with Splunk Enterprise Hands-OnGetting Started with Splunk Enterprise Hands-On
Getting Started with Splunk Enterprise Hands-On
 
IT Service Intelligence Hands On Breakout Session
IT Service Intelligence Hands On Breakout SessionIT Service Intelligence Hands On Breakout Session
IT Service Intelligence Hands On Breakout Session
 
SplunkLive! Frankfurt 2018 - Legacy SIEM to Splunk, How to Conquer Migration ...
SplunkLive! Frankfurt 2018 - Legacy SIEM to Splunk, How to Conquer Migration ...SplunkLive! Frankfurt 2018 - Legacy SIEM to Splunk, How to Conquer Migration ...
SplunkLive! Frankfurt 2018 - Legacy SIEM to Splunk, How to Conquer Migration ...
 
Splunk for Enterprise Security featuring User Behavior Analytics
Splunk for Enterprise Security featuring User Behavior AnalyticsSplunk for Enterprise Security featuring User Behavior Analytics
Splunk for Enterprise Security featuring User Behavior Analytics
 
Machine Learning + Analytics in Splunk
Machine Learning + Analytics in Splunk Machine Learning + Analytics in Splunk
Machine Learning + Analytics in Splunk
 
SplunkLive! - Splunk for IT Operations
SplunkLive! - Splunk for IT OperationsSplunkLive! - Splunk for IT Operations
SplunkLive! - Splunk for IT Operations
 
SplunkLive! Munich 2018: Predictive, Proactive, and Collaborative ML with IT ...
SplunkLive! Munich 2018: Predictive, Proactive, and Collaborative ML with IT ...SplunkLive! Munich 2018: Predictive, Proactive, and Collaborative ML with IT ...
SplunkLive! Munich 2018: Predictive, Proactive, and Collaborative ML with IT ...
 
Splunk for Enterprise Security Featuring User Behavior Analytics
Splunk for Enterprise Security Featuring User Behavior Analytics Splunk for Enterprise Security Featuring User Behavior Analytics
Splunk for Enterprise Security Featuring User Behavior Analytics
 
Splunk for Developers
Splunk for DevelopersSplunk for Developers
Splunk for Developers
 

Similar to Splunk for Machine Learning and Analytics

SplunkLive! Zurich 2018: Legacy SIEM to Splunk, How to Conquer Migration and ...
SplunkLive! Zurich 2018: Legacy SIEM to Splunk, How to Conquer Migration and ...SplunkLive! Zurich 2018: Legacy SIEM to Splunk, How to Conquer Migration and ...
SplunkLive! Zurich 2018: Legacy SIEM to Splunk, How to Conquer Migration and ...
Splunk
 
MLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMakerMLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMaker
Provectus
 

Similar to Splunk for Machine Learning and Analytics (20)

Machine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout SessionMachine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout Session
 
SplunkLive! Frankfurt 2018 - Get More From Your Machine Data with Splunk AI
SplunkLive! Frankfurt 2018 - Get More From Your Machine Data with Splunk AISplunkLive! Frankfurt 2018 - Get More From Your Machine Data with Splunk AI
SplunkLive! Frankfurt 2018 - Get More From Your Machine Data with Splunk AI
 
DevOps for DataScience
DevOps for DataScienceDevOps for DataScience
DevOps for DataScience
 
Getting Started with Splunk Enterprise
Getting Started with Splunk EnterpriseGetting Started with Splunk Enterprise
Getting Started with Splunk Enterprise
 
Getting Started with Splunk Enterprise
Getting Started with Splunk EnterpriseGetting Started with Splunk Enterprise
Getting Started with Splunk Enterprise
 
SplunkLive! Paris 2018: Legacy SIEM to Splunk
SplunkLive! Paris 2018: Legacy SIEM to SplunkSplunkLive! Paris 2018: Legacy SIEM to Splunk
SplunkLive! Paris 2018: Legacy SIEM to Splunk
 
SplunkLive! Zurich 2018: Legacy SIEM to Splunk, How to Conquer Migration and ...
SplunkLive! Zurich 2018: Legacy SIEM to Splunk, How to Conquer Migration and ...SplunkLive! Zurich 2018: Legacy SIEM to Splunk, How to Conquer Migration and ...
SplunkLive! Zurich 2018: Legacy SIEM to Splunk, How to Conquer Migration and ...
 
SplunkLive! Frankfurt 2018 - Predictive, Proactive, and Collaborative ML with...
SplunkLive! Frankfurt 2018 - Predictive, Proactive, and Collaborative ML with...SplunkLive! Frankfurt 2018 - Predictive, Proactive, and Collaborative ML with...
SplunkLive! Frankfurt 2018 - Predictive, Proactive, and Collaborative ML with...
 
Making Netflix Machine Learning Algorithms Reliable
Making Netflix Machine Learning Algorithms ReliableMaking Netflix Machine Learning Algorithms Reliable
Making Netflix Machine Learning Algorithms Reliable
 
Splunk Discovery: Warsaw 2018 - Legacy SIEM to Splunk, How to Conquer Migrati...
Splunk Discovery: Warsaw 2018 - Legacy SIEM to Splunk, How to Conquer Migrati...Splunk Discovery: Warsaw 2018 - Legacy SIEM to Splunk, How to Conquer Migrati...
Splunk Discovery: Warsaw 2018 - Legacy SIEM to Splunk, How to Conquer Migrati...
 
Machine Learning & IT Service Intelligence for the Enterprise: The Future is ...
Machine Learning & IT Service Intelligence for the Enterprise: The Future is ...Machine Learning & IT Service Intelligence for the Enterprise: The Future is ...
Machine Learning & IT Service Intelligence for the Enterprise: The Future is ...
 
SplunkLive! Paris 2018: Splunk And AI 101
SplunkLive! Paris 2018: Splunk And AI 101SplunkLive! Paris 2018: Splunk And AI 101
SplunkLive! Paris 2018: Splunk And AI 101
 
SplunkLive! Munich 2018: Integrating Metrics and Logs
SplunkLive! Munich 2018: Integrating Metrics and LogsSplunkLive! Munich 2018: Integrating Metrics and Logs
SplunkLive! Munich 2018: Integrating Metrics and Logs
 
Course 2 Machine Learning Data LifeCycle in Production - Week 1
Course 2   Machine Learning Data LifeCycle in Production - Week 1Course 2   Machine Learning Data LifeCycle in Production - Week 1
Course 2 Machine Learning Data LifeCycle in Production - Week 1
 
MLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMakerMLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMaker
 
Machine learning and big data
Machine learning and big dataMachine learning and big data
Machine learning and big data
 
SplunkLive! Zurich 2018: Get More From Your Machine Data with Splunk & AI
SplunkLive! Zurich 2018: Get More From Your Machine Data with Splunk & AISplunkLive! Zurich 2018: Get More From Your Machine Data with Splunk & AI
SplunkLive! Zurich 2018: Get More From Your Machine Data with Splunk & AI
 
SplunkLive! Munich 2018: Get More From Your Machine Data Splunk & AI
SplunkLive! Munich 2018: Get More From Your Machine Data Splunk & AISplunkLive! Munich 2018: Get More From Your Machine Data Splunk & AI
SplunkLive! Munich 2018: Get More From Your Machine Data Splunk & AI
 
artificggggggggggggggialintelligence.pdf
artificggggggggggggggialintelligence.pdfartificggggggggggggggialintelligence.pdf
artificggggggggggggggialintelligence.pdf
 
AI Class Topic 3: Building Machine Learning Predictive Systems (Predictive Ma...
AI Class Topic 3: Building Machine Learning Predictive Systems (Predictive Ma...AI Class Topic 3: Building Machine Learning Predictive Systems (Predictive Ma...
AI Class Topic 3: Building Machine Learning Predictive Systems (Predictive Ma...
 

More from Splunk

More from Splunk (20)

.conf Go 2023 - Data analysis as a routine
.conf Go 2023 - Data analysis as a routine.conf Go 2023 - Data analysis as a routine
.conf Go 2023 - Data analysis as a routine
 
.conf Go 2023 - How KPN drives Customer Satisfaction on IPTV
.conf Go 2023 - How KPN drives Customer Satisfaction on IPTV.conf Go 2023 - How KPN drives Customer Satisfaction on IPTV
.conf Go 2023 - How KPN drives Customer Satisfaction on IPTV
 
.conf Go 2023 - Navegando la normativa SOX (Telefónica)
.conf Go 2023 - Navegando la normativa SOX (Telefónica).conf Go 2023 - Navegando la normativa SOX (Telefónica)
.conf Go 2023 - Navegando la normativa SOX (Telefónica)
 
.conf Go 2023 - Raiffeisen Bank International
.conf Go 2023 - Raiffeisen Bank International.conf Go 2023 - Raiffeisen Bank International
.conf Go 2023 - Raiffeisen Bank International
 
.conf Go 2023 - På liv og død Om sikkerhetsarbeid i Norsk helsenett
.conf Go 2023 - På liv og død Om sikkerhetsarbeid i Norsk helsenett .conf Go 2023 - På liv og død Om sikkerhetsarbeid i Norsk helsenett
.conf Go 2023 - På liv og død Om sikkerhetsarbeid i Norsk helsenett
 
.conf Go 2023 - Many roads lead to Rome - this was our journey (Julius Bär)
.conf Go 2023 - Many roads lead to Rome - this was our journey (Julius Bär).conf Go 2023 - Many roads lead to Rome - this was our journey (Julius Bär)
.conf Go 2023 - Many roads lead to Rome - this was our journey (Julius Bär)
 
.conf Go 2023 - Das passende Rezept für die digitale (Security) Revolution zu...
.conf Go 2023 - Das passende Rezept für die digitale (Security) Revolution zu....conf Go 2023 - Das passende Rezept für die digitale (Security) Revolution zu...
.conf Go 2023 - Das passende Rezept für die digitale (Security) Revolution zu...
 
.conf go 2023 - Cyber Resilienz – Herausforderungen und Ansatz für Energiever...
.conf go 2023 - Cyber Resilienz – Herausforderungen und Ansatz für Energiever....conf go 2023 - Cyber Resilienz – Herausforderungen und Ansatz für Energiever...
.conf go 2023 - Cyber Resilienz – Herausforderungen und Ansatz für Energiever...
 
.conf go 2023 - De NOC a CSIRT (Cellnex)
.conf go 2023 - De NOC a CSIRT (Cellnex).conf go 2023 - De NOC a CSIRT (Cellnex)
.conf go 2023 - De NOC a CSIRT (Cellnex)
 
conf go 2023 - El camino hacia la ciberseguridad (ABANCA)
conf go 2023 - El camino hacia la ciberseguridad (ABANCA)conf go 2023 - El camino hacia la ciberseguridad (ABANCA)
conf go 2023 - El camino hacia la ciberseguridad (ABANCA)
 
Splunk - BMW connects business and IT with data driven operations SRE and O11y
Splunk - BMW connects business and IT with data driven operations SRE and O11ySplunk - BMW connects business and IT with data driven operations SRE and O11y
Splunk - BMW connects business and IT with data driven operations SRE and O11y
 
Splunk x Freenet - .conf Go Köln
Splunk x Freenet - .conf Go KölnSplunk x Freenet - .conf Go Köln
Splunk x Freenet - .conf Go Köln
 
Splunk Security Session - .conf Go Köln
Splunk Security Session - .conf Go KölnSplunk Security Session - .conf Go Köln
Splunk Security Session - .conf Go Köln
 
Data foundations building success, at city scale – Imperial College London
 Data foundations building success, at city scale – Imperial College London Data foundations building success, at city scale – Imperial College London
Data foundations building success, at city scale – Imperial College London
 
Splunk: How Vodafone established Operational Analytics in a Hybrid Environmen...
Splunk: How Vodafone established Operational Analytics in a Hybrid Environmen...Splunk: How Vodafone established Operational Analytics in a Hybrid Environmen...
Splunk: How Vodafone established Operational Analytics in a Hybrid Environmen...
 
SOC, Amore Mio! | Security Webinar
SOC, Amore Mio! | Security WebinarSOC, Amore Mio! | Security Webinar
SOC, Amore Mio! | Security Webinar
 
.conf Go 2022 - Observability Session
.conf Go 2022 - Observability Session.conf Go 2022 - Observability Session
.conf Go 2022 - Observability Session
 
.conf Go Zurich 2022 - Keynote
.conf Go Zurich 2022 - Keynote.conf Go Zurich 2022 - Keynote
.conf Go Zurich 2022 - Keynote
 
.conf Go Zurich 2022 - Platform Session
.conf Go Zurich 2022 - Platform Session.conf Go Zurich 2022 - Platform Session
.conf Go Zurich 2022 - Platform Session
 
.conf Go Zurich 2022 - Security Session
.conf Go Zurich 2022 - Security Session.conf Go Zurich 2022 - Security Session
.conf Go Zurich 2022 - Security Session
 

Recently uploaded

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 

Recently uploaded (20)

Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 

Splunk for Machine Learning and Analytics

  • 1. Copyright © 2016 Splunk Inc. Operationalizing Machine Learning Pierre Brunel Senior Sales Engineer Splunk, Inc.
  • 2. Disclaimer During the course of this presentation, we may make forward looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward-looking statements made in the this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward looking statements we may make. In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only and shall not, be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release.
  • 3. Copyright © 2016 Splunk Inc. Why do we need ML?
  • 4. Copyright © 2016 Splunk Inc. Historical Data Real-time Data Statistical Models DB, HDFS, NoSQL, Splunk, etc Machine Learning T – a few days T + a few days Why is this so challenging using traditional methods? • DATA IS STILL IN MOTION, still in a BUSINESS PROCESS. • Enrich real-time MACHINE DATA with structured HISTORICAL DATA • Make decisions IN REAL TIME using ALL THE DATA Splunk Security Operations Center Network Operations Center Business Operations Center
  • 5. Copyright © 2016 Splunk Inc. What is ML?
  • 6. ML 101: What is it? • Machine Learning (ML) is a process for generalizing from examples – Examples = example or “training” data – Generalizing = building “statistical models” to capture correlations – Process = ML is never done, you must keep validating & refitting models • Simple ML workflow: – Explore data – FIT models based on data – APPLY models in production – Keep validating models “All models are wrong, but some are useful.” - George Box
  • 7. 3 Types of Machine Learning 1. Supervised Learning: generalizing from labeled data
  • 8. 3 Types of Machine Learning 2. Unsupervised Learning: generalizing from unlabeled data Clustering
  • 9. 3 Types of Machine Learning 3. Reinforcement Learning: generalizing from rewards in time
  • 10. Copyright © 2016 Splunk Inc. ML Use Cases
  • 11. IT Ops: Predictive Maintenance 1. Get resource usage data (CPU, latency, outage reports) 2. Explore data, and fit predictive models on past / real-time data 3. Apply & validate models until predictions are accurate 4. Forecast resource saturation, demand & usage 5. Surface incidents to IT Ops, who INVESTIGATES & ACTS Problem: Network outages and truck rolls cause big time & money expense Solution: Build predictive model to forecast outage scenarios, act pre-emptively & learn
  • 12. Security: Find Insider Threats Problem: Security breaches cause big time & money expense Solution: Build predictive model to forecast threat scenarios, act pre-emptively & learn 1. Get security data (data transfers, authentication, incidents) 2. Explore data, and fit predictive models on past / real-time data 3. Apply & validate models until predictions are accurate 4. Forecast abnormal behavior, risk scores & notable events 5. Surface incidents to Security Ops, who INVESTIGATES & ACTS
  • 13. Business Analytics: Predict Customer Churn Problem: Customer churn causes big time & money expense Solution: Build predictive model to forecast possible churn, act pre-emptively & learn 1. Get customer data (set-top boxes, web logs, transaction history) 2. Explore data, and fit predictive models on past / real-time data 3. Apply & validate models until predictions are accurate 4. Identify customers likely to churn 5. Surface incidents to Business Ops, who INVESTIGATES & ACTS
  • 14. Summary: The ML Process Problem: <Stuff in the world> causes big time & money expense Solution: Build predictive model to forecast <possible incidents>, act pre-emptively & learn 1. Get all relevant data to problem 2. Explore data, and fit predictive models on past / real-time data 3. Apply & validate models until predictions are accurate 4. Forecast KPIs & metrics associated to use case 5. Surface incidents to X Ops, who INVESTIGATES & ACTS Operationalize
  • 15. Copyright © 2016 Splunk Inc. How do we do ML with Splunk?
  • 16. Analysts Business Users 1. Get Data & Find Decision-Makers! 18 IT Users ODBC SDK API DB Connect Look-Ups Ad Hoc Search Monitor and Alert Reports / Analyze Custom Dashboards GPS / Cellular Devices Networks Hadoop Servers Applications Online Shopping Carts Analysts Business Users Structured Data Sources CRM ERP HR Billing Product Finance Data Warehouse Clickstreams
  • 17. 2. Explore Data, Build Searches & Dashboards • Start with the Exploratory Data Analysis phase – “80% of data science is sourcing, cleaning, and preparing the data” • For each data source, build “data diagnostic” dashboard – What’s interesting? Throw up some basic charts. – What’s relevant for this use case? – Any anomalies? Are thresholds useful? • Mix data streams & compute aggregates – Compute KPIs & statistics w/ stats, eventstats, etc. – Enrich data streams with useful structured data – stats count by X Y – where X,Y from different sources
  • 18. 3. Get the ML Toolkit & Showcase App • Leverages Python for Scientific Computing (PSC) add-on: – Open-source Python data science ecosystem – NumPy, SciPy, scitkit-learn, pandas, statsmodels • Showcase use cases: Hard Drive Failure, Server Power consumption, Server Response Time, Application Usage • Standard algorithms out of the box: – Supervised: LogisticRegression, SVM, PCA – Unsupervised: LinearRegression, KMeans, DBSCAN, SpectralClustering • Implement one of 300+ algorithms by editing Python scripts
  • 19. 4. Fit, Apply & Validate Models • ML SPL – New grammar for doing ML in Splunk • fit – fit models based on training data – [training data] | fit LinearRegression costly_KPI from feature1 feature2 feature3 into my_model • apply – apply models on testing and production data – [testing/production data] | apply my_model • Validate Your Model (The Hard Part) – Why hard? Because statistics is hard! Also: model error ≠ real world risk. – Analyze residuals, mean-square error, goodness of fit, cross-validate, etc. – Take Splunk’s Analytics & Data Science Education course
  • 20. 5. Operationalize Your Models • Remember the ML Process: 1. Get data 2. Explore data & fit models 3. Apply & validate models 4. Forecast KPIs 5. Surface incidents to Ops team • Then operationalize: feed back Ops analysis to data inputs, repeat • Lots of hard work & stats, but lots of value will come out. Operationalize
  • 21. Copyright © 2016 Splunk Inc. Show me the ML!
  • 22. Sneak Peak Recap: What’s new in GA • More use cases to explore • Support added for Search Head Clustering • Removed 50k limit on model fitting • Sampling for training/test data • Guided ML via a ML Assistant aka Model / Query Builder • Install on 6.4 Search Head
  • 23. Next Steps with Splunk ML • Reach out to your Tech Team! We can help architect ML workflows. • Lots of ML commands in Core Splunk (predict, anomalydetection, stats) • ML Toolkit & Showcase – available and free, ready to use • Splunk ITSI: Applied ML for ITOA use cases – Manage 1000s of KPIs & alerts – Adaptive Thresholding & Anomaly Detection • Splunk UBA: Applied ML for Security – Unsupervised learning of Users & Entities – Surfaces Anomalies & Threats • ML Customer Advisory Program: – Connect with Product & Engineering teams - mlprogram@splunk.com
  • 24. Copyright © 2016 Splunk Inc. • September 26-29, 2016 • The Disney Swan and Dolphin, Orlando • 5000+ IT & Business Professionals • 3 days of technical content • 165+ sessions • 3 days of Splunk University • Sept 24-26, 2016 • Get Splunk Certified for FREE! • Get CPE credits for CISSP, CAP, SSCP • Save thousands on Splunk education! • 80+ Customer Speakers • 35+ Apps in Splunk Apps Showcase • 75+ Technology Partners • 1:1 networking: Ask The Experts and • Security Experts, Birds of a Feather and Chalk Talks • NEW hands-on labs! • Expanded show floor, Dashboards Control Room & Clinic, and MORE! .conf2016: The 7th Annual Splunk Worldwide Users’ Conference
  • 25. Copyright © 2016 Splunk Inc. Thank you!

Editor's Notes

  1. Splunk handles the full continuum: past, present & future. Q: Why do we need ML? A: ML provides the “models” that we can use to make decisions about missing data. We need ML to predict & forecast possible future events based on historical & real-time data.
  2. Q: Why standalone SH? A: Don’t want ML exploration & production to bring down other Splunk workloads
  3. Q: What is a statistical model? A: A model is a little copy of the world you can hold in your hands. Formal: A model is a parametrized relationship between variables. FITTING a model sets the parameters using feature variables & observed values APPLYING a model fills in predicted values using feature variables Image source: http://phdp.github.io/posts/2013-07-05-dtl.html
  4. Supervised learning is where you have existing LABELS in the data to help you out. Example: If you’re training a model for CUSTOMER CHURN, historically you know which customers stayed and which left. You can build a model to correlate historical churn with other features in the data. Then you can PREDICT churn for each customer based on everything they’re doing in real-time and have done in the past.
  5. Unsupervised learning is where you have NO LABELS to help you out. You have to figure out patterns Example: If you’re trying to do BEHAVIORAL ANALYTICS, you might just have a big confusing pile of IT & Security data to wade through. Unsupervised learning is the art & science of finding PATTERNS, BASELINES and ANOMALIES in the data. Once you understand all this (that’s hard!) you can try to predict possible INCIDENTS and THREATS. Good ML involves FEEDBACK loops. Best bet is to incorporate INCIDENT RESPONSE data and learn from what analysts have done in the past. [NEXT SLIDE: Reinforcement Learning]
  6. Reinforcement Learning is basically Supervised Learning where LABELS = REWARDS, and there is a strong focus on TIME and FEEDBACK LOOPS. This is how you OPERATIONALIZE machine learning: by looping back results of analysis and workflow and LEARN from interactions with the world. Rewards can be POSITIVE or NEGATIVE. Image: The Leitner system is reinforcement learning for flashcards. Correct answers “advance” and accumulate more points. Incorrect answers go back to the beginning. https://en.wikipedia.org/wiki/Leitner_system Reinforcement learning is rooted in behavioral psychology. Humans & animals are hard-wired for rewards https://en.wikipedia.org/wiki/Reinforcement_learning
  7. Reinforcement learning lets us OPERATIONALIZE machine learning. When the machine recommends something to an analyst, the model can LEARN from the outcome of their work. Create a culture of REWARDS for your analytics team, not punishments. The machines can/should learn from ALL the available data. You might have to build complex ML workflows. Want good Splunk admin to help architect. Want VIRTUOUS CYCLE between human-machine interaction
  8. Q: How is this slide similar to the previous one? (go back and forth) A: The ML Process is the same, it’s just that the data & the operations teams are different. Also different will be the actual analysis in the middle, but the *process* of doing that analysis is the same.
  9. The ML process is itself a generalization of the different use cases. ML spans domains! The arrow means OPERATIONALIZE. Feed back incident data & other high-level analysis back into the ML Process. Keep exploring that data & fitting better models to align with reality. Loop Step #5 (Act) back to Step #1 (Data).
  10. Before you do machine learning, you need DATA and DECISION-MAKERS. Walk before you can run! Start with useful data sources that can help people solve problems, and build basic dashboards correlating different things in the data. This is called EXPLORATORY DATA ANALYSIS. Once you do that, THEN try to fit models based on what seems to correlate. Interviewing & iterating with decision-makers is key. DATA: ML isn’t magic. You need good data to learn from. DECISION-MAKERS: Once you find patterns, anomalies, etc., who are you going to deliver to them to? How do they want information presented? Emails? Dashboards? Incident tickets?
  11. Walk before running! Precursor to building models & doing ML. Image: OpenStreetMap logo, from Wikipedia. Creative Commons
  12. Q: Why standalone SH? A: Don’t want ML exploration & production to bring down other Splunk workloads Can use standalone 6.3 SH with older version SH cluster & indexers. PSC add-on is FREE. Go to ML App link above, and click Documentation. Links for all distros ()
  13. Remember: Machine Learning is a PROCESS. Takes a lot of work & elbow grease to get from Exploratory Data Analysis to ML Models in Production. Q: Why hard? A1: Statistics is hard. Subtle questions re: model error & statistical assumptions. Remember: “All models are wrong, some are useful” A2: Validation is also difficult because not everyone has the same requirements. For example, for some users false positives may be much more expensive than false negatives; for others, the opposite may be true. For some users, being 2X wrong is twice as bad as being X wrong; for others, there may be a non-linear relationship between error and badness.
  14. Re: ML App v0.9. To be updated after new release. Time estimate: “soon, stay tuned!” If you want to use ML in production, let us know! We have customers using ML in production TODAY. e.g., New York Air Brake
  15. Time for ML demo! Get the ML App: http://tiny.cc/splunkmlapp Want more? Take Splunk’s Analytics & Data Science course! Course prework: http://bit.ly/splunkanalytics
  16. Re: ML App v0.9. To be updated after new release. Stay tuned! Lots to come w/ Splunk ML. Image modified from cover of book Protecting Study Volunteers in Research Publisher: CenterWatch LLC; 4th Edition edition (June 15, 2012) NEXT: either leave slide & discuss OR show ML demo
  17. Re: ML App v0.9. To be updated after new release. Stay tuned! Lots to come w/ Splunk ML. Image modified from cover of book Protecting Study Volunteers in Research Publisher: CenterWatch LLC; 4th Edition edition (June 15, 2012) NEXT: either leave slide & discuss OR show ML demo
  18. We’re headed to the East Coast! 2 inspired Keynotes – General Session and Security Keynote + Super Sessions with Splunk Leadership in Cloud, IT Ops, Security and Business Analytics! 165+ Breakout sessions addressing all areas and levels of Operational Intelligence – IT, Business Analytics, Mobile, Cloud, IoT, Security…and MORE! 30+ hours of invaluable networking time with industry thought leaders, technologists, and other Splunk Ninjas and Champions waiting to share their business wins with you! Join the 50%+ of Fortune 100 companies who attended .conf2015 to get hands on with Splunk. You’ll be surrounded by thousands of other like-minded individuals who are ready to share exciting and cutting edge use cases and best practices. You can also deep dive on all things Splunk products together with your favorite Splunkers. Head back to your company with both practical and inspired new uses for Splunk, ready to unlock the unimaginable power of your data! Arrive in Orlando a Splunk user, leave Orlando a Splunk Ninja! REGISTRATION OPENS IN MARCH 2016 – STAY TUNED FOR NEWS ON OUR BEST REGISTRATION RATES – COMING SOON!
  19. Time for ML demo! Get the ML App: http://tiny.cc/splunkmlapp Want more? Take Splunk’s Analytics & Data Science course! Course prework: http://bit.ly/splunkanalytics
  20. (OPTIONAL) For deep dive with ML experts to discuss how to use ML Toolkit
  21. Q: What is a statistical model? A: A model is a little copy of the world you can hold in your hands. Formal: A model is a parametrized relationship between variables. FITTING a model sets the parameters using feature variables & observed values APPLYING a model fills in predicted values using feature variables