Splunk is a powerful platform for understanding your data. The preview of the Machine Learning Toolkit and Showcase App extends Splunk with a rich suite of advanced analytics and machine learning algorithms. In this session, we'll present an overview of the app architecture and API and show you how to use Splunk to easily perform a variety of tasks, including outlier and anomaly detection, predictive analytics, and event clustering. We’ll use real data to explore these techniques and explain the intuition behind the analytics.
2. 2
Disclaimer
During the course of this presentation, we may make forward-looking statements regarding future events
or the expected performance of the company. We caution you that such statements reflect our current
expectations and estimates based on factors currently known to us and that actual events or results
could differ materially. For important factors that may cause actual results to differ from those contained
in our forward-looking statements, please review our filings with the SEC. The forward-looking
statements made in this presentation are being made as of the time and date of its live presentation.
If reviewed after its live presentation, this presentation may not contain current or accurate information.
We do not assume any obligation to update any forward-looking statements we may make.
In addition, any information about our roadmap outlines our general product direction and is subject to
change at any time without notice. It is for informational purposes only and shall not be incorporated
into any contract or other commitment. Splunk undertakes no obligation either to develop the features
or functionality described or to include any such feature or functionality in a future release.
7. 7
Detect Network Outliers at Large Telco
• Monitor network behavior in real time & respond to changes in environment
– “The ability to model the behavior of complex systems and alert on deviations is
where IT operations and security operations are headed and Splunk and MLTS have
given us a head start in moving to the next level...”
• Benefits with Splunk ML:
– Reduced Downtime
– Increased Service Availability
– Better Customer Satisfaction
– Better prioritization of anomalous events
– Increased revenue per unit (RPU)
• Tech Overview:
– An operationalized solution in production, based on Splunk ML Toolkit outlier
detection. The customized detection draws on up to a month of data and uses
voting strategies for robust outlier detection.
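The slide does not say which detectors the customer combined or how votes were counted. As a minimal sketch of the voting idea, assume three common univariate detectors (z-score, interquartile range, and median absolute deviation) that each cast a vote, with a point flagged when at least two agree. The function names, thresholds, and the two-vote rule are illustrative assumptions, not the production configuration.

```python
from statistics import mean, median, quantiles, stdev


def zscore_votes(values, threshold=3.0):
    """Vote for points more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return [False] * len(values)
    return [abs(v - mu) / sigma > threshold for v in values]


def iqr_votes(values, k=1.5):
    """Vote for points outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v < low or v > high for v in values]


def mad_votes(values, threshold=3.5):
    """Vote using the modified z-score built on the median absolute deviation."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [False] * len(values)
    return [0.6745 * abs(v - med) / mad > threshold for v in values]


def vote_outliers(values, min_votes=2):
    """Flag a point as an outlier when at least `min_votes` detectors agree."""
    ballots = zip(zscore_votes(values), iqr_votes(values), mad_votes(values))
    return [sum(ballot) >= min_votes for ballot in ballots]
```

In practice `values` would be the trailing window of a network metric (the slide mentions up to a month of data). Requiring agreement between detectors trades some sensitivity for fewer false alarms from any single method.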
8. 8
Noise Rise Monitoring at Large Telco
• Cell Tower Monitoring:
– Monitor noise rise of more than 20,000 cells per tower automatically
– Detect if cells are misconfigured
– Detect if cells are jammed by an external interfering device
• Benefits with Splunk ML:
– Better Troubleshooting
– Increased Service Availability
– Increased Device Lifespan
• Tech Overview:
– Use Linear Regression to model the Received Total Wideband Power (RTWP)
using the total traffic at the cell.
– By checking model fit, customer is able to say if the cell is configured correctly.
– Monitor cells in real time using properly fit models, identify sudden changes in their
behavior caused by overloading or external interference.
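The approach in the Tech Overview above can be sketched in a few lines: fit a line predicting RTWP from traffic, use goodness of fit as the configuration check, then alert when a new observation strays too far from the fitted line. This is a standalone illustration rather than the customer's implementation; the function names, the use of R² as the fit measure, and the tolerance value are assumptions.

```python
from statistics import mean


def fit_line(traffic, rtwp):
    """Ordinary least squares for rtwp = slope * traffic + intercept."""
    mx, my = mean(traffic), mean(rtwp)
    sxx = sum((x - mx) ** 2 for x in traffic)
    sxy = sum((x - mx) * (y - my) for x, y in zip(traffic, rtwp))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(traffic, rtwp, slope, intercept):
    """Share of RTWP variance explained by the fitted line."""
    my = mean(rtwp)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(traffic, rtwp))
    ss_tot = sum((y - my) ** 2 for y in rtwp)
    return 1.0 - ss_res / ss_tot


def residual_alerts(traffic, rtwp, slope, intercept, tolerance):
    """True where the observed RTWP deviates from the model by more than `tolerance`."""
    return [abs(y - (slope * x + intercept)) > tolerance
            for x, y in zip(traffic, rtwp)]
```

A cell whose R² stays low on historical data would be a candidate for a configuration review, while a well-fit cell that suddenly produces large residuals would suggest overloading or external interference.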
10. 11
ML 101: What is it?
• Machine Learning (ML) is a process for generalizing from examples
– Examples = example or “training” data
– Generalizing = build “statistical models” to capture correlations
– Process = ML is never done, you must keep validating & refitting models
• Simple ML workflow:
– Explore data
– FIT models based on data
– APPLY models in production
– Keep validating models
“All models are wrong, but some are useful.”
- George Box
11. 12
3 Types of Machine Learning
1. Supervised Learning: generalizing from labeled data
2. Unsupervised Learning: generalizing from unlabeled data
3. Reinforcement Learning: generalizing from rewards in time
Examples pictured: Leitner System, Recommender systems
15. 16
IT Ops: Predictive Maintenance
Problem: Network outages and truck rolls cause big time & money expense
Solution: Build predictive model to forecast outage scenarios, act pre-emptively & learn
1. Get resource usage data (CPU, latency, outage reports)
2. Explore data, and fit predictive models on past / real-time data
3. Apply & validate models until predictions are accurate
4. Forecast resource saturation, demand & usage
5. Surface incidents to IT Ops, who INVESTIGATES & ACTS
Operationalize
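To make the "forecast resource saturation" step concrete, here is a minimal sketch that fits a straight-line trend to evenly spaced usage samples and estimates how many sampling intervals remain before usage reaches capacity. The linear trend, the function names, and the 100% capacity default are assumptions for illustration. The slide does not prescribe a forecasting method, and real usage often needs a seasonal model.

```python
def linear_trend(samples):
    """Least-squares slope and intercept for evenly spaced samples (x = 0, 1, 2, ...)."""
    n = len(samples)
    mx = (n - 1) / 2
    my = sum(samples) / n
    sxx = sum((x - mx) ** 2 for x in range(n))
    sxy = sum((x - mx) * (y - my) for x, y in enumerate(samples))
    slope = sxy / sxx
    return slope, my - slope * mx


def steps_until_saturation(samples, capacity=100.0):
    """Estimated intervals from the latest sample until the trend reaches `capacity`.

    Returns None when usage is flat or falling, because the trend never saturates.
    """
    slope, intercept = linear_trend(samples)
    if slope <= 0:
        return None
    crossing = (capacity - intercept) / slope
    return max(0.0, crossing - (len(samples) - 1))
```

For example, daily disk-usage percentages could be passed in, and an incident surfaced to IT Ops when the estimate drops below a chosen number of days.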
16. 17
Security: Find Insider Threats
Problem: Security breaches cause big time & money expense
Solution: Build predictive model to forecast threat scenarios, act pre-emptively & learn
1. Get security data (data transfers, authentication, incidents)
2. Explore data, and fit predictive models on past / real-time data
3. Apply & validate models until predictions are accurate
4. Forecast abnormal behavior, risk scores & notable events
5. Surface incidents to Security Ops, who INVESTIGATES & ACTS
Operationalize
17. 18
Business Analytics: Predict Customer Churn
Problem: Customer churn causes big time & money expense
Solution: Build predictive model to forecast possible churn, act pre-emptively & learn
1. Get customer data (set-top boxes, web logs, transaction history)
2. Explore data, and fit predictive models on past / real-time data
3. Apply & validate models until predictions are accurate
4. Forecast churn rate & identify customers likely to churn
5. Surface incidents to Business Ops, who INVESTIGATES & ACTS
Operationalize
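The churn slide does not name an algorithm. As a sketch, logistic regression is used here because the deck lists it among the toolkit's supervised algorithms. The code trains it with plain batch gradient descent. The feature meanings, learning rate, and epoch count are hypothetical, and this is standalone Python, not the toolkit's own implementation.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n, n_features = len(rows), len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def churn_probability(x, weights, bias):
    """Predicted probability that the customer described by `x` churns."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
```

Each row might hold, say, recent support calls and months since the last purchase, with label 1 for customers who churned. Customers scoring above a chosen probability would be surfaced to Business Ops.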
18. 19
Summary: The ML Process
Problem: <Stuff in the world> causes big time & money expense
Solution: Build predictive model to forecast <possible incidents>, act pre-emptively & learn
1. Get all relevant data to problem
2. Explore data, and fit predictive models on past / real-time data
3. Apply & validate models until predictions are accurate
4. Forecast KPIs & notable events associated to use case
5. Surface incidents to X Ops, who INVESTIGATES & ACTS
Operationalize
20. 22
Splunk User Behavior Analytics (UBA)
• ~100% of breaches involve valid credentials (Mandiant Report)
• Need to understand normal & anomalous behaviors for ALL users
• UBA detects Advanced Cyberattacks and Malicious Insider Threats
• Lots of ML under the hood:
– Behavior Baselining & Modeling
– Anomaly Detection (30+ models)
– Advanced Threat Detection
• E.g., Data Exfil Threat:
– “Saw this strange login & data transfer
for user mpittman at 3am in China…”
– Surface threat to SOC Analysts
21. 23
Machine Learning in Splunk ITSI
Adaptive Thresholding:
• Learn baselines & dynamic thresholds
• Alert & act on deviations
• Manage for 1000s of KPIs & entities
• Stdev/Avg, Quartile/Median, Range
Anomaly Detection:
• Find “hiccups” in expected patterns
• Catches deviations beyond thresholds
• Uses Holt-Winters algorithm
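The three threshold families named above (Stdev/Avg, Quartile/Median, Range) can be illustrated with a small sketch that learns an upper threshold per hour of day from historical KPI samples. The formulas and multipliers below are one plausible reading of those names, assumed for illustration. They are not ITSI's documented calculations.

```python
from statistics import mean, median, quantiles, stdev


def stdev_threshold(history, multiplier=2.0):
    """Average plus a multiple of the standard deviation."""
    return mean(history) + multiplier * stdev(history)


def quartile_threshold(history, multiplier=1.5):
    """Median plus a multiple of the interquartile range."""
    q1, _, q3 = quantiles(history, n=4)
    return median(history) + multiplier * (q3 - q1)


def range_threshold(history, fraction=0.9):
    """A fixed fraction of the way from the observed minimum to the maximum."""
    low, high = min(history), max(history)
    return low + fraction * (high - low)


def thresholds_by_hour(samples, policy=stdev_threshold):
    """Learn one threshold per hour of day from (hour, value) pairs."""
    by_hour = {}
    for hour, value in samples:
        by_hour.setdefault(hour, []).append(value)
    return {hour: policy(values) for hour, values in by_hour.items()}
```

Learning a separate baseline per time bucket is what makes the threshold adaptive: a value that is normal at 9am can still trigger an alert at 3am.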
22. 24
ML Toolkit & Showcase
• Splunk-supported framework for building ML Apps
– Get it for free: http://tiny.cc/splunkmlapp
• Leverages Python for Scientific Computing (PSC) add-on:
– Open-source Python data science ecosystem
– NumPy, SciPy, scikit-learn, pandas, statsmodels
• Showcase use cases: Predict Hard Drive Failure, Server Power
Consumption, Application Usage, Customer Churn & more
• Standard algorithms out of the box:
– Supervised: Logistic Regression, SVM, Linear Regression, Random Forest, etc.
– Unsupervised: KMeans, DBSCAN, Spectral Clustering, PCA, KernelPCA, etc.
• Implement one of 300+ algorithms by editing Python scripts
24. 28
1. Get Data & Find Decision-Makers
[Diagram labels. Data sources: Devices, Networks, Servers, Applications, Clickstreams, GPS / Cellular, Online Shopping Carts, Hadoop. Structured Data Sources: CRM, ERP, HR, Billing, Product, Finance, Data Warehouse. Interfaces: ODBC, SDK, API, DB Connect, Look-Ups. Uses: Ad Hoc Search, Monitor and Alert, Reports / Analyze, Custom Dashboards. Audiences: IT Users, Analysts, Business Users.]
25. 29
2. Explore Data, Build Searches & Dashboards
• Start with the Exploratory Data Analysis phase
– “80% of data science is sourcing, cleaning, and preparing the data”
– Tip: leverage ITSI KPIs – lots of domain knowledge
• For each data source, build “data diagnostic” dashboard
– What’s interesting? Throw up some basic charts.
– What’s relevant for this use case?
– Any anomalies? Are thresholds useful?
• Mix data streams & compute aggregates
– Compute KPIs & statistics w/ stats, eventstats, etc.
– Enrich data streams with useful structured data
– e.g. stats count by X Y, where X and Y come from different sources
– Build new KPIs from what you find
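The "mix data streams & compute aggregates" step can be sketched outside Splunk to show what the searches are doing. The following Python is only an analogy for `stats count by X Y` and `eventstats`, not how SPL is implemented; the events, the `host_team` lookup, and the helper names are all hypothetical.

```python
from collections import Counter, defaultdict
from statistics import fmean

# Hypothetical machine-data events.
events = [
    {"host": "web01", "status": 200, "latency_ms": 100},
    {"host": "web01", "status": 500, "latency_ms": 300},
    {"host": "web02", "status": 200, "latency_ms": 80},
    {"host": "web02", "status": 200, "latency_ms": 120},
]

# Hypothetical structured data (like a lookup table) from a different source.
host_team = {"web01": "checkout", "web02": "search"}

def enrich(events, lookup, key, out_field):
    """Add a field from structured data to each event."""
    return [{**e, out_field: lookup[e[key]]} for e in events]

def count_by(events, *fields):
    """Rough analogue of `stats count by X Y`."""
    return Counter(tuple(e[f] for f in fields) for e in events)

def eventstats_avg(events, value_field, by_field, out_field):
    """Rough analogue of `eventstats avg(value) as out by field`:
    every event keeps its fields and gains the group average."""
    groups = defaultdict(list)
    for e in events:
        groups[e[by_field]].append(e[value_field])
    averages = {k: fmean(v) for k, v in groups.items()}
    return [{**e, out_field: averages[e[by_field]]} for e in events]

enriched = enrich(events, host_team, "host", "team")
counts = count_by(enriched, "team", "status")  # X and Y come from different sources
with_avg = eventstats_avg(enriched, "latency_ms", "host", "avg_latency_ms")
```

Here `team` comes from the structured source and `status` from the machine data, which is the kind of cross-source grouping the slide describes.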
26. 30
3. Fit, Apply & Validate Models
• ML SPL – New grammar for doing ML in Splunk
• fit – fit models based on training data
– [training data] | fit LinearRegression costly_KPI from feature1 feature2 feature3 into my_model
• apply – apply models on testing and production data
– [testing/production data] | apply my_model
• Validate Your Model (The Hard Part)
– Why hard? Because statistics is hard! Also: model error ≠ real world risk.
– Analyze residuals, mean-square error, goodness of fit, cross-validate, etc.
– Take Splunk’s Analytics & Data Science Education course
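To make the fit / apply / validate cycle concrete, here is a minimal sketch in plain Python. It is not the toolkit's code: it uses a single feature and hand-written ordinary least squares, and the function names and sample numbers are assumptions for illustration.

```python
from statistics import fmean

def fit_linear(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares (one feature)."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"slope": slope, "intercept": y_bar - slope * x_bar}

def apply_model(model, xs):
    """Fill in predicted values from the feature values."""
    return [model["slope"] * x + model["intercept"] for x in xs]

def validate(model, xs, ys):
    """Compare predictions with held-out observations."""
    predicted = apply_model(model, xs)
    residuals = [y - p for y, p in zip(ys, predicted)]
    ss_res = sum(r * r for r in residuals)
    y_bar = fmean(ys)
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return {
        "residuals": residuals,
        "mse": ss_res / len(residuals),  # mean-square error
        "r2": 1 - ss_res / ss_tot,       # goodness of fit
    }

# Hypothetical training data, then held-out testing data.
train_x, train_y = [1, 2, 3, 4], [3, 5, 7, 9]
test_x, test_y = [5, 6], [10.5, 13.5]

my_model = fit_linear(train_x, train_y)         # "fit ... into my_model"
report = validate(my_model, test_x, test_y)     # "apply my_model", then score
```

Validating on data the model did not see during fitting is the important design choice here; scoring on the training data alone tends to understate real-world error.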
27. 31
4. Predict & Act
• Forecast KPIs & predict notable events
– When will my system have a critical error?
– In which service or process?
– What’s the probable root cause?
• How will people act on predictions?
– Is this a Sev 1/2/3 event? Who responds?
– Deliver via Notable Events or dashboard?
– Human response or automated response?
• How do you improve the models?
– Iterate, add more data, extract more features
– Keep track of true/false positives
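Keeping track of true and false positives can start as simply as comparing each prediction with what actually happened. The sketch below assumes a hypothetical log of paired booleans (predicted incident, confirmed incident); the function names are illustrative.

```python
def confusion_counts(predicted, actual):
    """Count true/false positives and negatives from paired booleans."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for p, a in zip(predicted, actual):
        if p and a:
            counts["tp"] += 1
        elif p and not a:
            counts["fp"] += 1
        elif not p and a:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts

def precision_recall(counts):
    """Precision: share of alerts that were real. Recall: share of real events caught."""
    alerts = counts["tp"] + counts["fp"]
    real = counts["tp"] + counts["fn"]
    precision = counts["tp"] / alerts if alerts else 0.0
    recall = counts["tp"] / real if real else 0.0
    return precision, recall

# Hypothetical history: did the model predict an incident, and did one occur?
predicted = [True, True, False, True, False, False]
actual = [True, False, False, True, True, False]

counts = confusion_counts(predicted, actual)
precision, recall = precision_recall(counts)
```

Recomputing these numbers after each model iteration gives a simple way to tell whether added data or features are actually helping.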
28. 32
5. Operationalize Your Models
• Operationalizing closes the loop of the ML Process:
1. Get data
2. Explore data & fit models
3. Apply & validate models
4. Forecast KPIs & events
5. Surface incidents to Ops team
• When you deliver the outcome, keep track of the response
– Human-generated response (detailed journal logs, etc)
– Machine-generated response (workflow actions, etc)
– External knowledge (closed tickets data, DB records, etc)
• Then operationalize: feed back Ops analysis to data inputs, repeat
• Lots of hard work & stats, but lots of value will come out.
Operationalize
30. 34
Next Steps with Splunk ML
• Reach out to your Tech Team! We can help architect ML solutions.
• Lots of ML commands in Core Splunk (predict, anomalydetection, stats)
• ML Toolkit & Showcase – available and free, ready to use
– Get it for free: http://tiny.cc/splunkmlapp
• Splunk UBA: Applied ML for Security
– Unsupervised learning of Users & Entities
– Surfaces Anomalies & Threats
• Splunk ITSI: Applied ML for ITOA use cases
– Manage 1000s of KPIs & alerts
– Adaptive Thresholding & Anomaly Detection
• ML Early Adopter Program:
– Connect with Product & Engineering teams - mlprogram@splunk.com
Editor's Notes
We’re headed to the East Coast!
2 inspired Keynotes – General Session and Security Keynote + Super Sessions with Splunk Leadership in Cloud, IT Ops, Security and Business Analytics!
165+ Breakout sessions addressing all areas and levels of Operational Intelligence – IT, Business Analytics, Mobile, Cloud, IoT, Security…and MORE!
30+ hours of invaluable networking time with industry thought leaders, technologists, and other Splunk Ninjas and Champions waiting to share their business wins with you!
Join the 50%+ of Fortune 100 companies who attended .conf2015 to get hands on with Splunk. You’ll be surrounded by thousands of other like-minded individuals who are ready to share exciting and cutting edge use cases and best practices. You can also deep dive on all things Splunk products together with your favorite Splunkers.
Head back to your company with both practical and inspired new uses for Splunk, ready to unlock the unimaginable power of your data! Arrive in Orlando a Splunk user, leave Orlando a Splunk Ninja!
REGISTRATION OPENS IN MARCH 2016 – STAY TUNED FOR NEWS ON OUR BEST REGISTRATION RATES – COMING SOON!
[Shawn]
Q: Why does BA matter?
A1: BA can drive VOLUME in your accounts. Customer examples to come.
A2: BA can drive VALUE in your accounts. Strategic, high-level, high-profile use cases. Use HIGH-VALUE SMALL DATA to enrich LARGE VOLUMES of MACHINE
Q: What is a statistical model?
A: A model is a little copy of the world you can hold in your hands.
Formal: A model is a parametrized relationship between variables.
FITTING a model sets the parameters using feature variables & observed values
APPLYING a model fills in predicted values using feature variables
Image source: http://phdp.github.io/posts/2013-07-05-dtl.html
Supervised learning is where you have existing LABELS in the data to help you out.
Example: If you’re training a model for CUSTOMER CHURN, historically you know which customers stayed and which left. You can build a model to correlate historical churn with other features in the data. Then you can PREDICT churn for each customer based on everything they’re doing in real-time and have done in the past.
Unsupervised learning is where you have NO LABELS to help you out. You have to figure out patterns.
Example: If you’re trying to do BEHAVIORAL ANALYTICS, you might just have a big confusing pile of IT & Security data to wade through. Unsupervised learning is the art & science of finding PATTERNS, BASELINES and ANOMALIES in the data. Once you understand all this (that’s hard!) you can try to predict possible INCIDENTS and THREATS.
Good ML involves FEEDBACK loops. Best bet is to incorporate INCIDENT RESPONSE data and learn from what analysts have done in the past.
[NEXT SLIDE: Reinforcement Learning]
Reinforcement Learning is basically Supervised Learning where LABELS = REWARDS, and there is a strong focus on TIME and FEEDBACK LOOPS. This is how you OPERATIONALIZE machine learning: by looping back results of analysis and workflow and LEARN from interactions with the world.
Rewards can be POSITIVE or NEGATIVE.
Image: The Leitner system is reinforcement learning for flashcards. Correct answers “advance” and accumulate more points. Incorrect answers go back to the beginning.
https://en.wikipedia.org/wiki/Leitner_system
Reinforcement learning is rooted in behavioral psychology. Humans & animals are hard-wired for rewards.
https://en.wikipedia.org/wiki/Reinforcement_learning
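The Leitner rule described in these notes (correct answers advance, incorrect answers go back to the beginning) is small enough to sketch directly. The number of boxes, the card names, and the function name are assumptions for illustration.

```python
def review(box_of, card, correct, max_box=3):
    """Update a card's box after one review.

    Positive reward (correct answer): advance one box, up to max_box.
    Negative reward (incorrect answer): return to box 1.
    """
    if correct:
        box_of[card] = min(box_of[card] + 1, max_box)
    else:
        box_of[card] = 1
    return box_of

# Hypothetical deck: every card starts in box 1.
boxes = {"fit": 1, "apply": 1}

review(boxes, "fit", correct=True)     # fit advances to box 2
review(boxes, "fit", correct=True)     # fit advances to box 3
review(boxes, "apply", correct=False)  # apply stays in box 1
state_after_rewards = dict(boxes)

review(boxes, "fit", correct=False)    # fit goes back to the beginning
```

The feedback from each interaction changes what happens next, which is the same loop the notes describe for operationalizing ML with analyst outcomes.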
Q: How is this slide similar to the previous one? (go back and forth)
A: The ML Process is the same, it’s just that the data & the operations teams are different. Also different will be the actual analysis in the middle, but the *process* of doing that analysis is the same.
The ML process is itself a generalization of the different use cases. ML spans domains!
The arrow means OPERATIONALIZE. Feed back incident data & other high-level analysis back into the ML Process. Keep exploring that data & fitting better models to align with reality. Loop Step #5 (Act) back to Step #1 (Data).
Reinforcement learning lets us OPERATIONALIZE machine learning. When the machine recommends something to an analyst, the model can LEARN from the outcome of their work. Create a culture of REWARDS for your analytics team, not punishments.
The machines can/should learn from ALL the available data. You might have to build complex ML workflows. Want good Splunk admin to help architect.
Want VIRTUOUS CYCLE between human-machine interaction
Re: 100% of breaches involve valid credentials:
"Mandiant is the leader in incident response. They are the best of the best. They're brought in to deal with the largest, highest profile, most damaging breaches. When you read a news headline about a large organization being compromised, there is a great chance that Mandiant is working behind the scenes to eradicate the attackers from the environment. In a recent yearly report-when they looked across all the very damaging attacks they responded to-they noticed that valid credentials were used at some point in every single one of them. Why do we care? Well we care because it means that we cannot use simple techniques like counting failed logins to detect an attack. In fact based on this stat, there may not even be any failed logins to count! Instead we need to be much smarter. For example how can we look at 1000 successful logins and determine which of them was the malicious one? We do that through behavior analytics, baseline/outlier, ML, etc."
Free app! Toolkit & PSC are both free. Go to ML App link above, and click Documentation. Links for all distros.
Q: Why standalone SH?
A: Don’t want ML exploration & production to bring down other Splunk workloads
Can use standalone 6.4 SH with older version SH cluster & indexers.
Re: ML App v0.9. To be updated after new release. Stay tuned! Lots to come w/ Splunk ML.
Image modified from cover of book Protecting Study Volunteers in Research
Publisher: CenterWatch LLC; 4th edition (June 15, 2012)
NEXT: either leave slide & discuss OR show ML demo
Before you do machine learning, you need DATA and DECISION-MAKERS. Walk before you can run! Start with useful data sources that can help people solve problems, and build basic dashboards correlating different things in the data.
This is called EXPLORATORY DATA ANALYSIS. Once you do that, THEN try to fit models based on what seems to correlate. Interviewing & iterating with decision-makers is key.
DATA: ML isn’t magic. You need good data to learn from.
DECISION-MAKERS: Once you find patterns, anomalies, etc., who are you going to deliver them to? How do they want information presented? Emails? Dashboards? Incident tickets?
Walk before running! Precursor to building models & doing ML.
Source for “80% of data science is EDA” quote:
http://www.nytimes.com/2014/08/18/technology/for-big-data-scientists-hurdle-to-insights-is-janitor-work.html?_r=0
Image: OpenStreetMap logo, from Wikipedia. Creative Commons
Remember: Machine Learning is a PROCESS. Takes a lot of work & elbow grease to get from Exploratory Data Analysis to ML Models in Production.
Q: Why hard?
A1: Statistics is hard. Subtle questions re: model error & statistical assumptions. Remember: “All models are wrong, some are useful”
A2: Validation is also difficult because not everyone has the same requirements. For example, for some users false positives may be much more expensive than false negatives; for others, the opposite may be true. For some users, being 2X wrong is twice as bad as being X wrong; for others, there may be a non-linear relationship between error and badness.
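The point about differing requirements can be shown with a small sketch: the same model scores lead to different alerting thresholds once false positives and false negatives are priced differently. The scores, labels, candidate thresholds, and cost values below are hypothetical.

```python
def total_cost(scores, labels, threshold, fp_cost, fn_cost):
    """Sum the cost of wrong decisions when alerting at score >= threshold."""
    cost = 0
    for score, is_real in zip(scores, labels):
        alert = score >= threshold
        if alert and not is_real:
            cost += fp_cost   # false positive: analyst time wasted
        elif not alert and is_real:
            cost += fn_cost   # false negative: real event missed
    return cost

def best_threshold(scores, labels, candidates, fp_cost, fn_cost):
    """Pick the candidate threshold with the lowest total cost."""
    return min(
        candidates,
        key=lambda t: total_cost(scores, labels, t, fp_cost, fn_cost),
    )

# Hypothetical model scores and whether each event was a real incident.
scores = [0.1, 0.4, 0.35, 0.8, 0.65, 0.9]
labels = [False, False, True, True, False, True]
candidates = [0.3, 0.5, 0.7]

# User A: a missed incident is 10x worse than a false alarm.
threshold_a = best_threshold(scores, labels, candidates, fp_cost=1, fn_cost=10)
# User B: a false alarm is 10x worse than a missed incident.
threshold_b = best_threshold(scores, labels, candidates, fp_cost=10, fn_cost=1)
```

The same idea extends to regression errors: a linear penalty treats being 2X wrong as twice as bad, while a squared penalty treats it as four times as bad.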
Also: Output an aggregate table for further analysis? (send via ODBC Driver, DB Connect or Hunk/Hadoop)
Re: ML App v0.9. To be updated after new release. Time estimate: “soon, stay tuned!”
If you want to use ML in production, let us know! We have customers using ML in production TODAY. e.g., New York Air Brake
Time for ML demo!
Get the ML App: http://tiny.cc/splunkmlapp
Want more? Take Splunk’s Analytics & Data Science course!
Course prework: http://bit.ly/splunkanalytics
A direct customer-Splunk engagement focused on real-world use of the Splunk Enterprise Machine Learning Toolkit and Showcase app and related SPL commands
Objectives
• Help the customer to be successful in the impactful use of ML
• Help Splunk to understand customer use cases and product requirements
Details
• Splunk Account SE plus PM/Engineering work directly with customer to guide usage, provide support, note analytics and product requirements, and refine product where feasible
• Customer participates in the above, developing 1 or more models and putting them in production
• Customer agrees to be referenced publicly, sharing reasonable detail and business impact
• Customer agrees to participate in a set of activities that may include: case study, press quote, use of logo, PR/AR reference call, video profile