The document discusses building IT service intelligence with Splunk. It introduces key concepts like services, KPIs, health scores, and the benefits of Splunk's approach to machine data. The presentation demonstrates how to design service intelligence for an example company, Buttercup Games, to gain visibility into their supply chain and online store processes. It also provides a hands-on example of quickly configuring a new KPI and modifying a dashboard within Splunk IT Service Intelligence.
3. Safe Harbor Statement
During the course of this presentation, we may make forward-looking statements regarding future events
or the expected performance of the company. We caution you that such statements reflect our current
expectations and estimates based on factors currently known to us and that actual events or results could
differ materially. For important factors that may cause actual results to differ from those contained in our
forward-looking statements, please review our filings with the SEC. The forward-looking statements
made in this presentation are being made as of the time and date of its live presentation. If reviewed
after its live presentation, this presentation may not contain current or accurate information. We do not
assume any obligation to update any forward-looking statements we may make. In addition, any
information about our roadmap outlines our general product direction and is subject to change at any
time without notice. It is for informational purposes only and shall not be incorporated into any contract
or other commitment. Splunk undertakes no obligation either to develop the features or functionality
described or to include any such feature or functionality in a future release.
8. Data-Defined & Driven Service Insights
Infrastructure Layer
Application Layer
Splunk> is the missing link
• Data Fidelity
• Single Repository for ALL data
• Easier to Manage Services
• Reduced Integrations
• Reduced Point Solutions
• Collaborative Approach
• Quick time to value
Data Fabric Platform
Service Intelligence
Network
Packet, Payload, Traffic,
Utilization, Perf
Synthetic APM
Availability, Capacity,
User Experience
Byte Code Instrumentation
Usage, Experience,
Performance, Quality
Adaptive Thresholding
Apps, Services, Systems
Server
Performance, Usage,
Dependency
Storage
Utilization, Capacity,
Performance
MACHINE DATA
12. IT Service Intelligence Value Stack
§ Adaptive Threshold
§ Behavior Anomaly
§ Correlates Data into Knowledge
§ Visualizes entire stack
§ View the entire Ecosystem
§ 3 clicks to get the answer versus 10
§ Time Series Index
§ Schema on Read
§ Data Model
Service Model
ML
§ Accelerators
§ Trend aggregation
§ Multi KPI Alerts
ITSI
30. Your Service Intelligence Collaborators
Service Owners
• Business functions
• Performance indicators
• Common business issues
• Frequency of issues
• Business impact of issues
Operations and Support
• Common issues
• Performance indicators
• Resolution processes
• Tools used for resolving issues
• Frequency of issues
• IT impact of issues
Enterprise Architecture
• Business processes
• Key inputs and outputs
• Technology architecture
• Data architecture
• Common issues
Administrators
• Current tools and usage, and adoption levels
• Splunk expertise
• Environment expertise
• Personal pain
33. Service Intelligence Design – Buttercup Games
Infrastructure Layer
Application Layer
Business Layer
Service Layer
Order Entry Manufacturing Shipping Fulfillment
Supply Chain
Online Store EDI
Web Tier Middleware
• Total Orders
• Total Revenue
• Unit Count
• Unit Failures
• Service Level
• Delivery Time
• Online Orders
• Online Revenue
• Response Time
• ServiceHealth
• Incidents/Changes
• Customer Satisfaction
• HTTP Hits
• Error Rate
• CPU Load
• Memory Used
• Disk Used
• IO Latency
• CPU Load
• Memory Used
• Disk Used
• IO Latency
• Response Time
• Error Rate
• Response Time
• Storage Free
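The layered design above can be sketched as a small dependency tree. This is only an illustration: the `Service` class, the `all_kpis` helper, and the assignment of KPIs to the Online Store, Web Tier, and Middleware services are assumptions made for the example, since the slide diagram does not survive in a form that shows the exact mapping. It is not how ITSI stores services internally.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """Hypothetical model of a service: a name, its KPIs, and its dependencies."""
    name: str
    kpis: list = field(default_factory=list)
    depends_on: list = field(default_factory=list)


def all_kpis(service, _seen=None):
    """Collect (service name, KPI) pairs for a service and everything below it."""
    seen = set() if _seen is None else _seen
    if service.name in seen:  # guard against shared or cyclic dependencies
        return []
    seen.add(service.name)
    pairs = [(service.name, kpi) for kpi in service.kpis]
    for dependency in service.depends_on:
        pairs.extend(all_kpis(dependency, seen))
    return pairs


# Assumed mapping of KPIs to services, for illustration only.
web_tier = Service("Web Tier", ["HTTP Hits", "Error Rate"])
middleware = Service("Middleware", ["Response Time", "Error Rate"])
online_store = Service(
    "Online Store",
    ["Online Orders", "Online Revenue", "Response Time"],
    depends_on=[web_tier, middleware],
)

for owner, kpi in all_kpis(online_store):
    print(f"{owner}: {kpi}")
```

Walking the tree from a business-facing service down to its supporting tiers is one way to list every KPI that could affect that service.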
37. Putting It All Together
(Same layered service design diagram as slide 33: the layers, services, and KPIs are unchanged.)
38. Typical Data Sources
Infrastructure Layer
Application Layer
Business Layer
Service Layer
Order Entry Manufacturing Shipping Fulfillment
Supply Chain
Online Store EDI
Web Tier Middleware
• Application Logs
• Corporate Databases
• Service Management
• Application Logs
• Webserver Logs
• DB Perf Counters
• Wire data
• Perf Counters
• Access Logs
• Network Logs
47. Let’s Talk Entities
● Select DB Service
● Entities are the relevant things which support this service (usually hosts)
● Select the right entries with filters, ANDs, ORs
● Original Entity list can come from CMDB, spreadsheet, Splunk search, others
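The idea of selecting entities with filters combined by ANDs and ORs can be sketched in a few lines. The `Entity` fields, the host names, and the rule format below are all hypothetical and chosen only for illustration; ITSI defines its own entity rules in the UI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Hypothetical entity record, e.g. one row imported from a CMDB or spreadsheet."""
    title: str
    role: str
    site: str


def matches(entity, rule_groups):
    """True if ANY rule group matches (OR) and, within it, ALL conditions hold (AND)."""
    return any(
        all(getattr(entity, name) == wanted for name, wanted in group.items())
        for group in rule_groups
    )


entities = [
    Entity("db-01", "database", "london"),
    Entity("db-02", "database", "paris"),
    Entity("web-01", "webserver", "london"),
    Entity("db-03", "database", "tokyo"),
]

# (role == database AND site == london) OR (role == database AND site == paris)
db_service_rules = [
    {"role": "database", "site": "london"},
    {"role": "database", "site": "paris"},
]

db_service_entities = [e.title for e in entities if matches(e, db_service_rules)]
print(db_service_entities)
```

Each dictionary is one AND group, and the list of dictionaries is ORed together, which mirrors the filter logic described on the slide.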
51. Final Steps …
Set your thresholds:
● Aggregate (All)
● Per Entity
● Click “Add Threshold” TWICE
● Make the Neapolitan ice cream colors Yellow, Green, Yellow
● Drag the sliders around in order to get the current data graph entirely inside the Green (normal) band
● Click Finish
● Other options are also available, including adaptive thresholds and anomaly detection
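Two thresholds split the KPI range into three bands, which is why the slide asks for three colors. A minimal sketch of that static banding follows, assuming made-up threshold values and sample data; the `classify` function is hypothetical and does not represent ITSI's adaptive thresholding or anomaly detection.

```python
from bisect import bisect_right


def classify(value, thresholds, colors):
    """Map a KPI value to a band color.

    `thresholds` are ascending band boundaries; `colors` has one more
    entry than `thresholds`. A value equal to a boundary falls in the
    band above it.
    """
    return colors[bisect_right(thresholds, value)]


colors = ["yellow", "green", "yellow"]  # low, normal, high
thresholds = [20.0, 80.0]               # hypothetical slider positions

recent_values = [35.2, 41.0, 55.7, 62.3, 48.9]  # made-up KPI samples
bands = [classify(v, thresholds, colors) for v in recent_values]

# "Drag the sliders" until every current data point sits in the green band.
all_normal = all(band == "green" for band in bands)
print(bands, all_normal)
```

If `all_normal` were false, the equivalent of dragging a slider would be to widen `thresholds` until the current data falls inside the normal band.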
60. Finishing up …
• Add a ServiceHealthScore widget for Online Store under Buttercup
• Choose a Viz Type with a sparkline graph, then resize to make it look pretty
• Modify the Custom Drilldown action to go to the saved glass table, Buttercup Games Online Store
• Bonus Points: Make the label bigger, more readable
• Click Save
• View when done
66. Multi-KPI Alerts and Notable Events
● Click on Notable Events Review
● Multiple KPIs and health scores can be combined in sophisticated ways to create Multi-KPI alerts
● When a Multi-KPI alert fires, one of the outcomes is the creation of a Notable Event
● Notable Events allow NOC personnel and others to triage and coordinate event management efforts
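The flow described above, several KPI conditions combined into one alert that produces a Notable Event, can be sketched as follows. Everything here is an assumption for illustration: the severity names and ranking, the "at least N triggers matched" rule, and the shape of the event dictionary are hypothetical and are not ITSI's actual alert definition or event schema.

```python
# Hypothetical severity ordering used to compare KPI states.
SEVERITY_RANK = {"normal": 0, "low": 1, "medium": 2, "high": 3, "critical": 4}


def evaluate_multi_kpi_alert(readings, triggers, min_matches):
    """Return a notable-event dict if enough KPI triggers match, else None.

    readings: {kpi name: current severity}
    triggers: {kpi name: minimum severity that counts as a match}
    """
    matched = sorted(
        kpi
        for kpi, min_severity in triggers.items()
        if kpi in readings
        and SEVERITY_RANK[readings[kpi]] >= SEVERITY_RANK[min_severity]
    )
    if len(matched) < min_matches:
        return None
    return {
        "title": "Multi-KPI alert",
        "matched_kpis": matched,
        # The event takes the worst severity among the matched KPIs.
        "severity": max((readings[k] for k in matched), key=SEVERITY_RANK.get),
        "status": "new",
    }


readings = {"Response Time": "high", "Error Rate": "critical", "CPU Load": "normal"}
triggers = {"Response Time": "high", "Error Rate": "high", "CPU Load": "medium"}

event = evaluate_multi_kpi_alert(readings, triggers, min_matches=2)
print(event)
```

Requiring several KPIs to degrade together before raising one event is one simple way to get fewer, more actionable alerts than alerting on each KPI separately.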
67. Service Analyzer
● Click on Service Analyzer > Default Service Analyzer
● Back where we started!
● This view shows a “no-frills” list of services (top) and hottest KPIs (bottom)
● Provides access into Service Details
● It is useful for NOCs and others who need a high-level situational view
69. Summary
● High-value services can be decomposed and modeled in ITSI, using machine data from the relevant systems
● Services and KPIs can be created in minutes, with sophisticated thresholding techniques to distinguish “normal” from “not normal”
● Glass Tables allow service health and KPI metrics to be displayed in a way that makes sense to specific groups, such as Executive Leadership, Business Service Owners, the NOC, DevOps, and others
● Deep Dives allow KPIs to be compared side-by-side across any time range, accelerating root cause analysis and significantly reducing MTTR
● Multi-KPI Alerts and Notable Events reduce alert noise, producing actionable events and a means to manage them
● … and it’s fast and fun to build!
73. Call Center Service
Service Health Transactions
ACD Analysis – Core Splunk
Call Wait History
Inbound Analysis
Social Media
Online Msg
Social Media
Mail Support
VOIP Service
Inbound Calls
74. Online Transactions
Internal Transfer Service
External Wire Service
Money Exchange Service
Money Transfer Services
Service Health Corporate
Reconciliation Service
Fed Exchange Service
Core Splunk Searches
Transaction History
System Investigation
Heat Map Analysis
75. CIO Scorecard
Enterprise Service Status Major Incidents
Service Health
Continuous Operational Visibility
Volume Revenue Incidents Changes
Major Changes
Service Health Volume Revenue Incidents Changes
Service Health Volume Ontime Delivery Incidents Changes
Service Health Volume Revenue Incidents Changes
Service Health Volume Revenue Incidents Changes
Container Util Service Health Throughput Incidents Changes
77. Sign Up Now – We’re here to help!
Harness the creativity and domain knowledge of your
organization to unlock the value of data and solve an
important Business Service problem through a joint service
intelligence workshop with key stakeholders
Define methods for:
› Proactive service monitoring
› Reduced risk and failures
› Faster issue resolution
› Increased business performance
What is it?
› 1 Day Onsite Workshop
› Tightly linked with value
› Collaborative approach
› Build your own Glass Table