Charles Wheelus gave a presentation on using Splunk to ensure compliance with service level agreements (SLAs) at a wireless carrier. He described developing a system using Splunk to measure key performance indicators, ingest data from various systems, build reports on SLA compliance, and provide real-time analytics and dashboards. Wheelus emphasized how Splunk helped solve challenges like integrating diverse data sources, performing load testing analysis, and alerting operations teams.
3. About me:
Charles Wheelus, MSCS
• Senior Data Scientist, Cequint
• Ph.D. Candidate, Florida Atlantic University
research interests: Data Mining and Machine Learning
• 2012 Splunk Ninja Revolution award recipient
• Splunk Certified Architect
• Technology consultant for 20 years
• Splunk user and evangelist for three years
• Started with version 4.3
4. About Cequint
Cequint provides handset and carrier data services to
most major wireless carriers in the U.S.
http://cequint.com
8. ...or
How to kill a flock of birds with one stone
Charles Wheelus
December 12th, 2013
9. Disclaimer: No birds were injured during the
production of this presentation. :)
10. SLA Compliance
(on a Wireless Carrier network)
The project:
Develop a system that provides proof
of our SLA compliance with our
carrier customer
Time is of the essence!
11. SLA Compliance
Determine the Key Performance Indicators
• Numerous subsystems
• Different development teams
• Different programming languages
• Different operating systems
• Wide variety of hardware types
13. SLA Compliance
Determine what data to get
Study the SLA
Engage others in the process
• Developers
• Management
• Product team
• Operations
15. SLA Compliance
Establish best practice for data input
What simple step can you take
in the beginning that will save time later?
Best practices document
Verify the data is in the expected format!
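One way to verify the format before data reaches Splunk is a small check run against sample log lines. The sketch below is only an illustration: the slides do not show the actual best-practices document, so the line layout (an ISO-8601 timestamp followed by key=value pairs) and the field names `subsystem`, `kpi`, and `value` are assumptions.

```python
import re
from datetime import datetime

# Assumed convention (not from the slides): "<ISO timestamp> key=value key=value ..."
LINE_RE = re.compile(r'^(?P<ts>\S+)\s+(?P<body>.+)$')
PAIR_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')

# Hypothetical field names that every KPI line is expected to carry.
REQUIRED_KEYS = {"subsystem", "kpi", "value"}


def parse_kpi_line(line):
    """Return the fields of one log line, or raise ValueError if it is malformed."""
    match = LINE_RE.match(line.strip())
    if not match:
        raise ValueError("missing timestamp or body")
    try:
        timestamp = datetime.fromisoformat(match.group("ts"))
    except ValueError as exc:
        raise ValueError(f"bad timestamp: {match.group('ts')}") from exc
    # Collect key=value pairs, dropping the quotes around quoted values.
    fields = {key: value.strip('"')
              for key, value in PAIR_RE.findall(match.group("body"))}
    missing = REQUIRED_KEYS - fields.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    fields["timestamp"] = timestamp
    return fields
```

Running a check like this in each team's test suite catches format drift early, before it shows up as broken field extractions in reports.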
22. SLA Compliance
SLA report (RECAP):
• Establish the KPIs
• Get KPI data into Splunk
• Aggregate and reconcile KPI counters
• Use the Splunk REST API to build the
report
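The last recap step can be sketched roughly as follows. This is not the presenter's code: it assumes the report is driven by Splunk's search export endpoint (`/services/search/jobs/export` on the management port), and the host name, the result field names `ok` and `total`, and the helper names are hypothetical. Authentication is left out; a real request would also need credentials or a session token.

```python
import json
import urllib.parse
import urllib.request


def build_export_request(base_url, spl, earliest="-1d@d", latest="@d"):
    """Build (but do not send) a POST request for the search export endpoint."""
    # The endpoint expects the query to start with "search" or a pipe.
    if not spl.lstrip().startswith(("search", "|")):
        spl = "search " + spl
    body = urllib.parse.urlencode({
        "search": spl,
        "earliest_time": earliest,
        "latest_time": latest,
        "output_mode": "json",
    }).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/services/search/jobs/export",
        data=body,
        method="POST",
    )


def parse_export_rows(lines):
    """Yield final result rows from the line-delimited JSON the endpoint streams."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # Skip preview rows and messages that carry no "result" object.
        if "result" in event and not event.get("preview", False):
            yield event["result"]


def compliance_percent(rows, ok_field="ok", total_field="total"):
    """Aggregate per-subsystem counters into one compliance percentage."""
    rows = list(rows)
    ok = sum(float(row[ok_field]) for row in rows)
    total = sum(float(row[total_field]) for row in rows)
    return 100.0 * ok / total if total else None
```

The request would be sent with `urllib.request.urlopen(request)` and the response iterated line by line through `parse_export_rows`; keeping the building and parsing steps separate makes them easy to test without a Splunk server.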
27. “Black-box” testing
The problem:
Performance information about the
carrier’s self-provisioning gateway is
unavailable. We have to run our own tests
to determine the expected performance.
Time is of the essence!
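A black-box test of this kind can be as simple as timing each call from the outside and logging the outcome in a form Splunk can index. The sketch below is an assumption about how such a probe might look; the slides do not describe the actual test harness, and the function name and the key=value output layout are illustrative only.

```python
import time


def timed_probe(call, name, clock=time.perf_counter):
    """Run one black-box call and return a key=value line describing the outcome.

    `call` is any zero-argument callable that exercises the system under test.
    `clock` is injectable so the timing logic can be tested deterministically.
    """
    start = clock()
    try:
        call()
        status = "ok"
    except Exception:
        # The probe records failures instead of propagating them,
        # so a failed call still produces a measurable event.
        status = "error"
    elapsed_ms = (clock() - start) * 1000.0
    return f"probe={name} status={status} elapsed_ms={elapsed_ms:.1f}"
```

In practice each returned line would be written to a log file with a timestamp, and the response-time distribution could then be examined in Splunk.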
33. Load test results analysis
The problem:
We need a quick way to evaluate the
results of load testing.
Time is of the essence!
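The kind of summary a quick load-test evaluation needs can be sketched as below. In the presentation this analysis was done in Splunk; the Python version here only illustrates the arithmetic, and the choice of statistics (median, 95th percentile, count over a threshold) is an assumption rather than something stated on the slides.

```python
import math


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(latencies_ms, threshold_ms):
    """Summarize load-test latencies against a response-time threshold."""
    values = sorted(latencies_ms)
    return {
        "count": len(values),
        "p50": percentile(values, 50),
        "p95": percentile(values, 95),
        "max": values[-1],
        "over_threshold": sum(1 for v in values if v > threshold_ms),
    }
```

Percentiles are usually more informative than averages for load tests, because a small number of very slow responses can hide behind a healthy mean.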
40. Event Reporting
The problem:
Thousands of subsystem events may be
written to the log files, and some events are
interdependent. We need a comprehensive
and robust system for detecting, correlating,
and reporting these events to the correct
development team.
Time is of the essence!
41. Event Reporting
The solution:
Splunk saved and scheduled searches!
With very brief training, the developers are
building, saving, and scheduling their own queries.
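The slide does not say how the searches were created, and the Splunk UI is the usual route. For illustration only, the sketch below shows how a scheduled saved search could also be defined over the REST API by posting to `/services/saved/searches`. The host name, search name, query, and helper name are hypothetical, and authentication is omitted.

```python
import urllib.parse
import urllib.request


def build_saved_search_request(base_url, name, spl, cron):
    """Build (but do not send) a request that creates a scheduled saved search."""
    body = urllib.parse.urlencode({
        "name": name,              # display name of the saved search
        "search": spl,             # the SPL query to run
        "cron_schedule": cron,     # standard five-field cron expression
        "is_scheduled": "1",       # enable the schedule
    }).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/services/saved/searches",
        data=body,
        method="POST",
    )


# Hypothetical example: look for gateway errors every 15 minutes.
request = build_saved_search_request(
    "https://splunk.example.com:8089",
    "gateway_errors",
    "index=app subsystem=gateway status=error",
    "*/15 * * * *",
)
```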
47. Event Monitoring and Alarming
The problem:
The operations team requires that the
KPIs produce alarm output for their pre-existing monitoring and alarm system.
Time is of the essence!
48. Event Monitoring and Alarming
• Operations has pre-existing alarming software
• Splunk was connected to the Operations alarm system
using the Splunk API
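The slides do not describe the alarm system or the glue code, so the following is only a sketch of the general idea: take KPI rows retrieved from Splunk and turn the ones that breach a threshold into alarm records for an external system. The field names `subsystem` and `compliance_pct`, the thresholds, and the alarm record layout are all hypothetical.

```python
def classify(value, warn_below, critical_below):
    """Map a KPI value to a severity label (lower values are worse)."""
    if value < critical_below:
        return "CRITICAL"
    if value < warn_below:
        return "WARNING"
    return "OK"


def to_alarms(rows, warn_below=99.5, critical_below=99.0):
    """Convert KPI result rows into alarm records, skipping healthy rows."""
    alarms = []
    for row in rows:
        value = float(row["compliance_pct"])
        severity = classify(value, warn_below, critical_below)
        if severity != "OK":
            alarms.append({
                "source": row["subsystem"],
                "severity": severity,
                "message": f"compliance {value:.2f}% below target",
            })
    return alarms
```

Each alarm record would then be handed to whatever interface the operations tooling exposes, which depends entirely on that system.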
52. Performance Analysis
The problem:
The entire team needs up-to-the-minute
business intelligence.
Time is of the essence!
54. Performance Analysis
• Customized tools for Developers
• Dashboards for Operations
• Troubleshooting for Developers and
Operations
• Business Intelligence for Management
61. Cut to the chase
Splunk’s greatest benefits:
• Time savings
• Ability to react quickly (SPL)
• Real-time analytics
• Rapid dashboard production
• The Splunk community!
62. What’s next?
• New metrics & dashboards
• Modular inputs
• More use of third-party Splunk apps
• Predictive Analytics
• Data Models / Pivots