The document provides an overview of operationalizing machine learning with Splunk. It discusses how machine learning can help with challenges around data being in constant motion and the need to make real-time decisions using all available historical and real-time data. The document then covers machine learning concepts, different types of machine learning, examples of use cases in IT operations, security, and business analytics. It outlines the typical machine learning process of getting data, exploring it, fitting and applying models, and validating and operationalizing models. Finally, it discusses how machine learning can be done with Splunk through its machine learning toolkit and apps.
The Big Data phenomenon is being driven by the growth of machine data. Critical insights found in machine data enable IT and Security teams to ensure uptime, detect fraud and identify threats. Today, forward-thinking organizations are discovering its value to better understand their customers, improve products, optimize marketing and improve business processes. Learn how Splunk and your machine data can deliver real-time insights from this new class of data and complement your existing BI investments.
Splunk is a powerful platform for understanding your data. The preview of the Machine Learning Toolkit and Showcase App extends Splunk with a rich suite of advanced analytics and machine learning algorithms. In this session, we'll present an overview of the app architecture and API and show you how to use Splunk to easily perform a variety of tasks, including outlier and anomaly detection, predictive analytics, and event clustering. We’ll use real data to explore these techniques and explain the intuition behind the analytics.
Splunk: How to Design, Build and Map IT ServicesSplunk
Your IT department supports critical business functions, processes and products. You're most effective when your technology initiatives are closely aligned and measured with specific business objectives. This session covers best practices and techniques for designing and building an effective service model, using the domain knowledge of your experts and capturing and reporting on key metrics that everyone can understand.
Come and learn from our experts on ways to improve you IT Operational Visibility by using Splunk for monitoring environment health. In this hands-on session we will cover recommended approaches for end to end monitoring, across applications, OSes, and devices. Topics will include: critical services to monitor, use of the Splunk Common Information Model (CIM) for cross-dataset normalization, commonly deployed apps and TAs to gather data for IT infrastructure uses, and use of pre-made dashboard panels to quickly build dashboards for monitoring your environment.
Getting Started with Splunk Enterprise Hands-OnSplunk
Here’s your chance to get hands-on with Splunk for the first time! Bring your laptop, and we’ll go through a simple install of Splunk. Then we’ll load some sample data, and see Splunk in action. At the end of this session you’ll have a hands-on understanding of the pieces that make up the Splunk Platform, how it works, and how it fits in the landscape of Big Data. We’ll share practical examples that differentiate Splunk while demonstrating how to gain quick time to value.
Drive more value through data source and use case optimization Splunk
Do you wish you had a way to better illustrate the value of Splunk to your leadership? Better yet, do you wish you had a way to illustrate how much MORE value is possible from Splunk leveraging the data you plan on indexing in one functional area or are already indexing today? We can help with that! Come learn the how and why of data source and use case optimization.
What is Splunk? At the end of this session you’ll have a high-level understanding of the pieces that make up the Splunk Platform, how it works, and how it fits in the landscape of Big Data. You’ll see practical examples that differentiate Splunk while demonstrating how to gain quick time to value.
The Big Data phenomenon is being driven by the growth of machine data. Critical insights found in machine data enable IT and Security teams to ensure uptime, detect fraud and identify threats. Today, forward-thinking organizations are discovering its value to better understand their customers, improve products, optimize marketing and improve business processes. Learn how Splunk and your machine data can deliver real-time insights from this new class of data and complement your existing BI investments.
Splunk is a powerful platform for understanding your data. The preview of the Machine Learning Toolkit and Showcase App extends Splunk with a rich suite of advanced analytics and machine learning algorithms. In this session, we'll present an overview of the app architecture and API and show you how to use Splunk to easily perform a variety of tasks, including outlier and anomaly detection, predictive analytics, and event clustering. We’ll use real data to explore these techniques and explain the intuition behind the analytics.
Splunk: How to Design, Build and Map IT ServicesSplunk
Your IT department supports critical business functions, processes and products. You're most effective when your technology initiatives are closely aligned and measured with specific business objectives. This session covers best practices and techniques for designing and building an effective service model, using the domain knowledge of your experts and capturing and reporting on key metrics that everyone can understand.
Come and learn from our experts on ways to improve you IT Operational Visibility by using Splunk for monitoring environment health. In this hands-on session we will cover recommended approaches for end to end monitoring, across applications, OSes, and devices. Topics will include: critical services to monitor, use of the Splunk Common Information Model (CIM) for cross-dataset normalization, commonly deployed apps and TAs to gather data for IT infrastructure uses, and use of pre-made dashboard panels to quickly build dashboards for monitoring your environment.
Getting Started with Splunk Enterprise Hands-OnSplunk
Here’s your chance to get hands-on with Splunk for the first time! Bring your laptop, and we’ll go through a simple install of Splunk. Then we’ll load some sample data, and see Splunk in action. At the end of this session you’ll have a hands-on understanding of the pieces that make up the Splunk Platform, how it works, and how it fits in the landscape of Big Data. We’ll share practical examples that differentiate Splunk while demonstrating how to gain quick time to value.
Drive more value through data source and use case optimization Splunk
Do you wish you had a way to better illustrate the value of Splunk to your leadership? Better yet, do you wish you had a way to illustrate how much MORE value is possible from Splunk leveraging the data you plan on indexing in one functional area or are already indexing today? We can help with that! Come learn the how and why of data source and use case optimization.
What is Splunk? At the end of this session you’ll have a high-level understanding of the pieces that make up the Splunk Platform, how it works, and how it fits in the landscape of Big Data. You’ll see practical examples that differentiate Splunk while demonstrating how to gain quick time to value.
Splunk is a powerful platform for understanding your data. The preview of the Machine Learning Toolkit and Showcase App extends Splunk with a rich suite of advanced analytics and machine learning algorithms, which are exposed via an API and demonstrated in a showcase. In this session, we'll present an overview of the app architecture and API and then show you how to use Splunk to easily perform a wide variety of tasks, including outlier detection, predictive analytics, event clustering, and anomaly detection. We’ll use real data to explore these techniques and explain the intuition behind the analytics.
SplunkLive! Frankfurt 2018 - Legacy SIEM to Splunk, How to Conquer Migration ...Splunk
Presented at SplunkLive! Frankfurt 2018:
Introduction
SIEM Migration Methodology
Use Cases
Datasources & Data Onboarding
ES Architecture
Third-Party Integrations
You Got This!
SplunkLive! Munich 2018: Predictive, Proactive, and Collaborative ML with IT ...Splunk
Presented at SplunkLive! Munich 2018:
- What data do we need?
- We need Machine Learning
- Real Use Case Example
- Let's Drive Into How it Works
- Next Steps
SplunkLive! Frankfurt 2018 - Get More From Your Machine Data with Splunk AISplunk
Presented at SpluknLive! Frankfurt 2018:
Why AI & Machine Learning?
What is Machine Learning?
Splunk's Machine Learning Tour
Use Cases & Customer Stories
Wrap Up
Model Monitoring at Scale with Apache Spark and VertaDatabricks
For any organization whose core product or business depends on ML models (think Slack search, Twitter feed ranking, or Tesla Autopilot), ensuring that production ML models are performing with high efficacy is crucial. In fact, according to the McKinsey report on model risk, defective models have led to revenue losses of hundreds of millions of dollars in the financial sector alone. However, in spite of the significant harms of defective models, tools to detect and remedy model performance issues for production ML models are missing.
Based on our experience building ML debugging and robustness tools at MIT CSAIL and managing large-scale model inference services at Twitter, Nvidia, and now at Verta, we developed a generalized model monitoring framework that can monitor a wide variety of ML models, work unchanged in batch and real-time inference scenarios, and scale to millions of inference requests. In this talk, we focus on how this framework applies to monitoring ML inference workflows built on top of Apache Spark and Databricks. We describe how we can supplement the massively scalable data processing capabilities of these platforms with statistical processors to support the monitoring and debugging of ML models.
Learn how ML Monitoring is fundamentally different from application performance monitoring or data monitoring. Understand what model monitoring must achieve for batch and real-time model serving use cases. Then dig in with us as we focus on the batch prediction use case for model scoring and demonstrate how we can leverage the core Apache Spark engine to easily monitor model performance and identify errors in serving pipelines.
Splunk Discovery: Warsaw 2018 - Legacy SIEM to Splunk, How to Conquer Migrati...Splunk
Presented at Splunk Discovery Warsaw 2018:
SIEM Replacement Methodology
Use Cases
Data Sources & Data Onboarding
Architecture
Third Party Integration
You Got This!
Machine Learning & IT Service Intelligence for the Enterprise: The Future is ...Precisely
Enterprises with mainframes and Cloud/server architectures face unique issues and challenges and if your enterprise delivers a service whose operation spans mainframe and distributed and/or Cloud infrastructures (e.g. a mobile banking/customer app), this webinar is for you.
See how you can gain unique business and service-relevant context using your own machine data, including that from your z/OS mainframe. Implicitly learn patterns, eliminate costly false alerts, identify anomalies, and baseline normal operations by employing advanced analytics driven by machine learning. You’ll also see and learn about:
• Accelerating root-cause analysis and getting ahead of customer-impacting outages and slow-downs for your service
• “Glass Table” view for clickable visualization of the entire service-relevant infrastructure
• Machine Learning in IT Service Intelligence
• The Machine Learning Toolkit available today
SplunkLive! Paris 2018: Legacy SIEM to SplunkSplunk
Presented at SplunkLive! Paris 2018: Legacy SIEM to Splunk, How to Conquer Migration and Not Die Trying:
- Why?
- SIEM Replacement
- Use Cases
- Data Sources & Data Onboarding
- Architecture
- Third Party Integrations
- You Got This
-
SplunkLive! Munich 2018: Get More From Your Machine Data Splunk & AISplunk
Presented at SplunkLive! Munich 2018:
- Why AI & Machine Learning?
- What is Machine Learning?
- Splunk's Machine Learning Tour
- Use Cases & Customer Stories
.conf Go 2023 - Das passende Rezept für die digitale (Security) Revolution zu...Splunk
.conf Go 2023 presentation:
"Das passende Rezept für die digitale (Security) Revolution zur Telematik Infrastruktur 2.0 im Gesundheitswesen?"
Speaker: Stefan Stein -
Teamleiter CERT | gematik GmbH M.Eng. IT-Sicherheit & Forensik,
doctorate student at TH Brandenburg & Universität Dresden
.conf Go 2023 presentation:
De NOC a CSIRT
Speakers:
Daniel Reina - Country Head of Security Cellnex (España) & Global SOC Manager Cellnex
Samuel Noval - Global CSIRT Team Leader, Cellnex
Splunk - BMW connects business and IT with data driven operations SRE and O11ySplunk
BMW is defining the next level of mobility - digital interactions and technology are the backbone to continued success with its customers. Discover how an IT team is tackling the journey of business transformation at scale whilst maintaining (and showing the importance of) business and IT service availability. Learn how BMW introduced frameworks to connect business and IT, using real-time data to mitigate customer impact, as Michael and Mark share their experience in building operations for a resilient future.
Data foundations building success, at city scale – Imperial College LondonSplunk
Universities have more in common with modern cities than traditional places of learning. This mini city needs to empower its citizens to thrive and achieve their ambitions. Operationalising data is key to building critical services; from understanding complex IT estates for smarter decision-making to robust security and a more reliable, resilient student experience. Juan will share his experience in building data foundations for a resilient future whilst enabling digital transformation at Imperial College London.
Splunk: How Vodafone established Operational Analytics in a Hybrid Environmen...Splunk
Learn how Vodafone has provided end-to-end visibility across services by building an Operational Analytics Platform. In this session, you will hear how Stefan and his team manage legacy, on premise, hybrid and public cloud services, and how they are providing a platform for complex triage and debugging to tackle use cases across Vodafone’s extensive ecosystem.
.italo operates an Essential Service by connecting more than 100 million people annually across Italy with its super fast and secure railway. And CISO Enrico Maresca has been on a whirlwind journey of his own.
Formerly a Cyber Security Engineer, Enrico started at .italo as an IT Security Manager. One year later, he was promoted to CISO and tasked with building out – and significantly increasing the maturity level – of the SOC. The result was a huge step forward for .italo.
So how did he successfully achieve this ambitious ask? Join Enrico as he reveals the key insights and lessons learned in his SOC journey, including:
Top challenges faced in improving security posture
Key KPIs implemented in order to measure success
Strategies and approaches applied in the SOC
How MITRE ATT&CK and Splunk Enterprise Security were utilised
Next steps in their maturity journey ahead
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Generating a custom Ruby SDK for your web service or Rails API using Smithyg2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
2. 2
Disclaimer
During the course of this presentation, we may make forward looking statements regarding future events
or the expected performance of the company. We caution you that such statements reflect our current
expectations and estimates based on factors currently known to us and that actual events or results
could differ materially. For important factors that may cause actual results to differ from those contained
in our forward-looking statements, please review our filings with the SEC. The forward-looking
statements made in the this presentation are being made as of the time and date of its live presentation.
If reviewed after its live presentation, this presentation may not contain current or accurate information.
We do not assume any obligation to update any forward looking statements we may make.
In addition, any information about our roadmap outlines our general product direction and is subject to
change at any time without notice. It is for informational purposes only and shall not, be incorporated
into any contract or other commitment. Splunk undertakes no obligation either to develop the features
or functionality described or to include any such feature or functionality in a future release.
6. 6
ML 101: What is it?
• Machine Learning (ML) is a process for generalizing from examples
– Examples = example or “training” data
– Generalizing = building “statistical models” to capture correlations
– Process = ML is never done, you must keep validating & refitting models
• Simple ML workflow:
– Explore data
– FIT models based on data
– APPLY models in production
– Keep validating models
“All models are wrong, but some are useful.”
- George Box
7. 7
3 Types of Machine Learning
1. Supervised Learning: generalizing from labeled data
8. 8
3 Types of Machine Learning
2. Unsupervised Learning: generalizing from unlabeled data
9. 9
3 Types of Machine Learning
3. Reinforcement Learning: generalizing from rewards in time
Leitner System Recommender systems
11. 12
IT Ops: Predictive Maintenance
1. Get resource usage data (CPU, latency, outage reports)
2. Explore data, and fit predictive models on past / real-time data
3. Apply & validate models until predictions are accurate
4. Forecast resource saturation, demand & usage
5. Surface incidents to IT Ops, who INVESTIGATES & ACTS
Problem: Network outages and truck rolls cause big time & money expense
Solution: Build predictive model to forecast outage scenarios, act pre-emptively & learn
12. 13
Security: Find Insider Threats
Problem: Security breaches cause big time & money expense
Solution: Build predictive model to forecast threat scenarios, act pre-emptively & learn
1. Get security data (data transfers, authentication, incidents)
2. Explore data, and fit predictive models on past / real-time data
3. Apply & validate models until predictions are accurate
4. Forecast abnormal behavior, risk scores & notable events
5. Surface incidents to Security Ops, who INVESTIGATES & ACTS
13. 14
Business Analytics: Predict Customer Churn
Problem: Customer churn causes big time & money expense
Solution: Build predictive model to forecast possible churn, act pre-emptively & learn
1. Get customer data (set-top boxes, web logs, transaction history)
2. Explore data, and fit predictive models on past / real-time data
3. Apply & validate models until predictions are accurate
4. Identify customers likely to churn
5. Surface incidents to Business Ops, who INVESTIGATES & ACTS
14. 15
Summary: The ML Process
Problem: <Stuff in the world> causes big time & money expense
Solution: Build predictive model to forecast <possible incidents>, act pre-emptively & learn
1. Get all relevant data to problem
2. Explore data, and fit predictive models on past / real-time data
3. Apply & validate models until predictions are accurate
4. Forecast KPIs & metrics associated to use case
5. Surface incidents to X Ops, who INVESTIGATES & ACTS
Operationalize
16. 17
Analysts Business Users
1. Get Data & Find Decision-Makers!
1
IT Users
ODBC
SDK
API
DB Connect
Look-Ups
Ad Hoc
Search
Monitor
and Alert
Reports /
Analyze
Custom
Dashboards
GPS /
Cellular
Devices Networks Hadoop
Servers Applications Online
Shopping Carts
Analysts Business Users
Structured Data Sources
CRM ERP HR Billing Product Finance
Data Warehouse
Clickstreams
17. 18
2. Explore Data, Build Searches & Dashboards
• Start with the Exploratory Data Analysis phase
– “80% of data science is sourcing, cleaning, and preparing the data”
• For each data source, build “data diagnostic” dashboard
– What’s interesting? Throw up some basic charts.
– What’s relevant for this use case?
– Any anomalies? Are thresholds useful?
• Mix data streams & compute aggregates
– Compute KPIs & statistics w/ stats, eventstats, etc.
– Enrich data streams with useful structured data
– stats count by X Y – where X,Y from different sources
18. 19
3. Get the ML Toolkit & Showcase App
• Get the App: http://tiny.cc/splunkmlapp
• Leverages Python for Scientific Computing (PSC) add-on:
– Open-source Python data science ecosystem
– NumPy, SciPy, scitkit-learn, pandas, statsmodels
• Showcase use cases: Hard Drive Failure, Server Power consumption,
Server Response Time, Application Usage
• Standard algorithms out of the box:
– Supervised: Logistic Regression, SVM, Linear Regression, Random Forest
– Unsupervised: KMeans, DBSCAN, Spectral Clustering
• Implement one of 300+ algorithms by editing Python scripts
19. 20
4. Fit, Apply & Validate Models
• ML SPL – New grammar for doing ML in Splunk
• fit – fit models based on training data
– [training data] | fit LinearRegression costly_KPI
from feature1 feature2 feature3 into my_model
• apply – apply models on testing and production data
– [testing/production data] | apply my_model
• Validate Your Model (The Hard Part)
– Why hard? Because statistics is hard! Also: model error ≠ real world risk.
– Analyze residuals, mean-square error, goodness of fit, cross-validate, etc.
– Take Splunk’s Analytics & Data Science Education course
20. 21
5. Operationalize Your Models
• Remember the ML Process:
1. Get data
2. Explore data & fit models
3. Apply & validate models
4. Forecast KPIs
5. Surface incidents to Ops team
• Then operationalize: feed back Ops analysis to data inputs, repeat
• Lots of hard work & stats, but lots of value will come out.
Operationalize
22. 23
Sneak Peak Recap: What’s new in GA
• New Algorithms (Random Forest, Lasso, Kernel PCA, and more…)
• More use cases to explore
• Support added for Search Head Clustering
• Removed 50k limit on model fitting
• Sampling for training/test data
• Guided ML via a ML Assistant aka Model / Query Builder
• Install on 6.4 Search Head
23. 24
Next Steps with Splunk ML
• Reach out to your Tech Team! We can help architect ML workflows.
• Lots of ML commands in Core Splunk (predict, anomalydetection, stats)
• ML Toolkit & Showcase – available and free, ready to use
• Splunk ITSI: Applied ML for ITOA use cases
– Manage 1000s of KPIs & alerts
– Adaptive Thresholding & Anomaly Detection
• Splunk UBA: Applied ML for Security
– Unsupervised learning of Users & Entities
– Surfaces Anomalies & Threats
• ML Customer Advisory Program:
– Connect with Product & Engineering teams - mlprogram@splunk.com
24. 25
Northern Cal Tech Talks!
Monthly WebEx Sessions
• Ted Talk style presentation
• Q&A Chat forum
So what’s next on the agenda?
• March 23rd @ 10AM PST - Building &
Deploying Apps.
• April 20th @ 10AM PST - Top 5 most useful
search commands.
See more at:
http://live.splunk.com/NorCalTechTalks
25. 26
SEPT 26-29, 2016
WALT DISNEY WORLD, ORLANDO
SWAN AND DOLPHIN RESORTS
• 5000+ IT & Business Professionals
• 3 days of technical content
• 165+ sessions
• 80+ Customer Speakers
• 35+ Apps in Splunk Apps Showcase
• 75+ Technology Partners
• 1:1 networking: Ask The Experts and Security
Experts, Birds of a Feather and Chalk Talks
• NEW hands-on labs!
• Expanded show floor, Dashboards Control
Room & Clinic, and MORE!
The 7th Annual Splunk Worldwide Users’ Conference
PLUS Splunk University
• Three days: Sept 24-26, 2016
• Get Splunk Certified for FREE!
• Get CPE credits for CISSP, CAP, SSCP
• Save thousands on Splunk education!
Splunk handles the full continuum: past, present & future.
Q: Why do we need ML?
A: ML provides the “models” that we can use to make decisions about missing data. We need ML to predict & forecast possible future events based on historical & real-time data.
Q: What is a statistical model?A: A model is a little copy of the world you can hold in your hands.
Formal: A model is a parametrized relationship between variables.
FITTING a model sets the parameters using feature variables & observed values
APPLYING a model fills in predicted values using feature variables
Image source: http://phdp.github.io/posts/2013-07-05-dtl.html
Supervised learning is where you have existing LABELS in the data to help you out.
Example: If you’re training a model for CUSTOMER CHURN, historically you know which customers stayed and which left. You can build a model to correlate historical churn with other features in the data. Then you can PREDICT churn for each customer based on everything they’re doing in real-time and have done in the past.
Unsupervised learning is where you have NO LABELS to help you out. You have to figure out patterns
Example: If you’re trying to do BEHAVIORAL ANALYTICS, you might just have a big confusing pile of IT & Security data to wade through. Unsupervised learning is the art & science of finding PATTERNS, BASELINES and ANOMALIES in the data. Once you understand all this (that’s hard!) you can try to predict possible INCIDENTS and THREATS.
Good ML involves FEEDBACK loops. Best bet is to incorporate INCIDENT RESPONSE data and learn from what analysts have done in the past.
[NEXT SLIDE: Reinforcement Learning]
Reinforcement Learning is basically Supervised Learning where LABELS = REWARDS, and there is a strong focus on TIME and FEEDBACK LOOPS. This is how you OPERATIONALIZE machine learning: by looping back results of analysis and workflow and LEARN from interactions with the world.
Rewards can be POSITIVE or NEGATIVE.
Image: The Leitner system is reinforcement learning for flashcards. Correct answers “advance” and accumulate more points. Incorrect answers go back to the beginning.
https://en.wikipedia.org/wiki/Leitner_system
Reinforcement learning is rooted in behavioral psychology. Humans & animals are hard-wired for rewards
https://en.wikipedia.org/wiki/Reinforcement_learning
Reinforcement learning lets us OPERATIONALIZE machine learning. When the machine recommends something to an analyst, the model can LEARN from the outcome of their work. Create a culture of REWARDS for your analytics team, not punishments.
The machines can/should learn from ALL the available data. You might have to build complex ML workflows. Want good Splunk admin to help architect.
Want VIRTUOUS CYCLE between human-machine interaction
Q: How is this slide similar to the previous one? (go back and forth)
A: The ML Process is the same, it’s just that the data & the operations teams are different. Also different will be the actual analysis in the middle, but the *process* of doing that analysis is the same.
The ML process is itself a generalization of the different use cases. ML spans domains!
The arrow means OPERATIONALIZE. Feed back incident data & other high-level analysis back into the ML Process. Keep exploring that data & fitting better models to align with reality. Loop Step #5 (Act) back to Step #1 (Data).
Before you do machine learning, you need DATA and DECISION-MAKERS. Walk before you can run! Start with useful data sources that can help people solve problems, and build basic dashboards correlating different things in the data.
This is called EXPLORATORY DATA ANALYSIS. Once you do that, THEN try to fit models based on what seems to correlate. Interviewing & iterating with decision-makers is key.
DATA: ML isn’t magic. You need good data to learn from.
DECISION-MAKERS: Once you find patterns, anomalies, etc., who are you going to deliver to them to? How do they want information presented? Emails? Dashboards? Incident tickets?
Walk before running! Precursor to building models & doing ML.
Image: OpenStreetMap logo, from Wikipedia. Creative Commons
Q: Why standalone SH?
A: Don’t want ML exploration & production to bring down other Splunk workloads
Can use standalone 6.3 SH with older version SH cluster & indexers.
PSC add-on is FREE. Go to ML App link above, and click Documentation. Links for all distros ()
Remember: Machine Learning is a PROCESS. Takes a lot of work & elbow grease to get from Exploratory Data Analysis to ML Models in Production.
Q: Why hard?
A1: Statistics is hard. Subtle questions re: model error & statistical assumptions. Remember: “All models are wrong, some are useful”
A2: Validation is also difficult because not everyone has the same requirements. For example, for some users false positives may be much more expensive than false negatives; for others, the opposite may be true. For some users, being 2X wrong is twice as bad as being X wrong; for others, there may be a non-linear relationship between error and badness.
Re: ML App v0.9. To be updated after new release. Time estimate: “soon, stay tuned!”
If you want to use ML in production, let us know! We have customers using ML in production TODAY. e.g., New York Air Brake
Time for ML demo!
Get the ML App: http://tiny.cc/splunkmlapp
Want more? Take Splunk’s Analytics & Data Science course!
Course prework: http://bit.ly/splunkanalytics
Re: ML App v0.9. To be updated after new release. Stay tuned! Lots to come w/ Splunk ML.
Image modified from cover of book Protecting Study Volunteers in Research
Publisher: CenterWatch LLC; 4th Edition edition (June 15, 2012)
NEXT: either leave slide & discuss OR show ML demo
Re: ML App v0.9. To be updated after new release. Stay tuned! Lots to come w/ Splunk ML.
Image modified from cover of book Protecting Study Volunteers in Research
Publisher: CenterWatch LLC; 4th Edition edition (June 15, 2012)
NEXT: either leave slide & discuss OR show ML demo
A direct customer-Splunk engagement focused on real-world use of the Splunk Enterprise - MachineLearning Toolkit and Showcase app and related SPL commands
Objectives• Help the customer to be successful in the impactful use of ML• Help Splunk to understand customer use cases and product requirements
Details• Splunk Account SE plus PM/Engineering work directly with customer to guide usage, providesupport, note analytics and product requirements and refine product where feasible• Customer participates in the above, developing 1 or more models and putting them in production• Customer agrees to be referenced publically; sharing reasonable detail and business impact• Customer agrees to participate in a set of activities that may include: case study, press quote, use
of logo, PR/AR reference call, video profile
We’re headed to the East Coast!
2 inspired Keynotes – General Session and Security Keynote + Super Sessions with Splunk Leadership in Cloud, IT Ops, Security and Business Analytics!
165+ Breakout sessions addressing all areas and levels of Operational Intelligence – IT, Business Analytics, Mobile, Cloud, IoT, Security…and MORE!
30+ hours of invaluable networking time with industry thought leaders, technologists, and other Splunk Ninjas and Champions waiting to share their business wins with you!
Join the 50%+ of Fortune 100 companies who attended .conf2015 to get hands on with Splunk. You’ll be surrounded by thousands of other like-minded individuals who are ready to share exciting and cutting edge use cases and best practices. You can also deep dive on all things Splunk products together with your favorite Splunkers.
Head back to your company with both practical and inspired new uses for Splunk, ready to unlock the unimaginable power of your data! Arrive in Orlando a Splunk user, leave Orlando a Splunk Ninja!
REGISTRATION OPENS IN MARCH 2016 – STAY TUNED FOR NEWS ON OUR BEST REGISTRATION RATES – COMING SOON!
Time for ML demo!
Get the ML App: http://tiny.cc/splunkmlapp
Want more? Take Splunk’s Analytics & Data Science course!
Course prework: http://bit.ly/splunkanalytics