19. 19
ML Toolkit & Showcase
• Splunk-supported framework for building ML apps
– Get it for free: http://tiny.cc/splunkmlapp
• Leverages Python for Scientific Computing (PSC) add-on:
– Open-source Python data science ecosystem
– NumPy, SciPy, scikit-learn, pandas, statsmodels
• Showcase use cases: Predict Hard Drive Failure, Server Power
Consumption, Application Usage, Customer Churn & more
• Standard algorithms out of the box:
– Supervised: Logistic Regression, SVM, Linear Regression, Random Forest, etc.
– Unsupervised: KMeans, DBSCAN, Spectral Clustering, PCA, KernelPCA, etc.
• Implement one of 300+ algorithms by editing Python scripts
25. 25
3. Fit, Apply & Validate Models
• ML SPL – New grammar for doing ML in Splunk
• fit – fit models based on training data
– [training data] | fit LinearRegression costly_KPI
from feature1 feature2 feature3 into my_model
• apply – apply models on testing and production data
– [testing/production data] | apply my_model
• Validate Your Model (The Hard Part)
– Why hard? Because statistics is hard! Also: model error ≠ real world risk.
– Analyze residuals, mean-square error, goodness of fit, cross-validate, etc.
– Take Splunk’s Analytics & Data Science Education course
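The fit / apply / validate steps above can be sketched outside Splunk in plain Python. This is a minimal illustration, not the toolkit's implementation: it assumes a single feature, a hand-rolled least-squares fit, and made-up numbers. The names fit_linear, apply_model and validate are hypothetical helpers chosen to mirror the fit and apply commands.

```python
from statistics import mean


def fit_linear(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares (one feature)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def apply_model(model, xs):
    """Apply a fitted (slope, intercept) model to new feature values."""
    slope, intercept = model
    return [slope * x + intercept for x in xs]


def validate(actual, predicted):
    """Return residuals, mean squared error and R^2 on held-out data."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    mse = mean([r ** 2 for r in residuals])
    a_bar = mean(actual)
    ss_tot = sum((a - a_bar) ** 2 for a in actual)
    ss_res = sum(r ** 2 for r in residuals)
    return residuals, mse, 1 - ss_res / ss_tot


# Hypothetical data: one feature (feature1) against costly_KPI.
train_x, train_y = [1.0, 2.0, 3.0, 4.0], [3.1, 4.9, 7.2, 8.8]
test_x, test_y = [5.0, 6.0, 7.0], [11.0, 13.1, 14.8]

my_model = fit_linear(train_x, train_y)        # "fit ... into my_model"
predictions = apply_model(my_model, test_x)    # "apply my_model"
residuals, mse, r2 = validate(test_y, predictions)
print("residuals:", residuals)
print("MSE:", mse, "R^2:", r2)
```

Validating on data the model never saw during fitting is the point of the split: error measured on the training rows alone tends to look better than it is.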
26. 26
4. Predict & Act
• Forecast KPIs & predict notable events
– When will my system have a critical error?
– In which service or process?
– What’s the probable root cause?
• How will people act on predictions?
– Is this a Sev 1/2/3 event? Who responds?
– Deliver via Notable Events or dashboard?
– Human response or automated response?
• How do you improve the models?
– Iterate, add more data, extract more features
– Keep track of true/false positives
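Keeping track of true/false positives can be as simple as tallying each prediction against what actually happened. The sketch below assumes every prediction is eventually paired with a confirmed outcome; score_predictions and the sample history are hypothetical.

```python
from collections import Counter


def score_predictions(records):
    """Tally (predicted, actual) boolean pairs into confusion counts.

    Returns the counts plus precision (share of alerts that were real)
    and recall (share of real incidents that were alerted on).
    """
    counts = Counter()
    for predicted, actual in records:
        if predicted and actual:
            counts["tp"] += 1
        elif predicted:
            counts["fp"] += 1
        elif actual:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    flagged = counts["tp"] + counts["fp"]
    real = counts["tp"] + counts["fn"]
    precision = counts["tp"] / flagged if flagged else 0.0
    recall = counts["tp"] / real if real else 0.0
    return counts, precision, recall


# Hypothetical history: (model predicted an incident, incident really occurred)
history = [(True, True), (True, False), (False, True), (True, True), (False, False)]
counts, precision, recall = score_predictions(history)
print(dict(counts), precision, recall)
```

Watching precision and recall over time shows whether adding data or features is actually improving the model, or just shifting errors from missed incidents to false alarms.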
27. 27
5. Operationalize Your Models
• Operationalizing closes the loop of the ML Process:
1. Get data
2. Explore data & fit models
3. Apply & validate models
4. Forecast KPIs & events
5. Surface incidents to Ops team
• When you deliver the outcome, keep track of the response
– Human-generated response (detailed journal logs, etc.)
– Machine-generated response (workflow actions, etc.)
– External knowledge (closed-ticket data, DB records, etc.)
• Then operationalize: feed back Ops analysis to data inputs, repeat
• Lots of hard work & stats, but lots of value will come out.
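One way to picture the feedback step: join each delivered prediction with the recorded Ops response, and treat the result as new labeled training data. This is only a sketch under assumptions; the Prediction record, the event_id key and build_feedback are all hypothetical, and in practice the responses would come from journal logs, workflow actions or ticket data.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    event_id: str            # hypothetical key shared with the Ops record
    features: dict           # feature values the model saw
    predicted_incident: bool


def build_feedback(predictions, responses):
    """Join predictions with Ops outcomes to produce labeled training rows.

    responses maps event_id -> True/False (was it a real incident?).
    Predictions without a recorded response are skipped for now.
    """
    labeled = []
    for p in predictions:
        outcome = responses.get(p.event_id)
        if outcome is None:
            continue
        labeled.append({**p.features,
                        "predicted": p.predicted_incident,
                        "actual": outcome})
    return labeled


# Hypothetical predictions and closed-ticket outcomes.
preds = [
    Prediction("evt-1", {"cpu": 0.93, "errors": 14}, True),
    Prediction("evt-2", {"cpu": 0.41, "errors": 1}, False),
    Prediction("evt-3", {"cpu": 0.88, "errors": 9}, True),
]
ops_responses = {"evt-1": True, "evt-3": False}

training_rows = build_feedback(preds, ops_responses)
print(training_rows)
```

The labeled rows go back into the data inputs for the next round of fitting, which is what closes the loop.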
30. 30
SEPT 26-29, 2016
WALT DISNEY WORLD, ORLANDO
SWAN AND DOLPHIN RESORTS
• 5000+ IT & Business Professionals
• 3 days of technical content
• 165+ sessions
• 80+ Customer Speakers
• 35+ Apps in Splunk Apps Showcase
• 75+ Technology Partners
• 1:1 networking: Ask The Experts and Security
Experts, Birds of a Feather and Chalk Talks
• NEW hands-on labs!
• Expanded show floor, Dashboards Control
Room & Clinic, and MORE!
The 7th Annual Splunk Worldwide Users’ Conference
PLUS Splunk University
• Three days: Sept 24-26, 2016
• Get Splunk Certified for FREE!
• Get CPE credits for CISSP, CAP, SSCP
• Save thousands on Splunk education!