2. 2
About Sabre
• Technology leader in global
travel and tourism industry
• Founded in 1960 by American
Airlines; spun off in 2000
• Sabre’s technology processes
over 1.1 trillion system
messages each year
• Based in Southlake, Texas
3. 3
About Me
• John Bland – Sr Software Developer
within the rapidly growing platform
support team
• Responsible for all things Splunk – cluster
design, building views, adding new data
• Spends 95% of day in Splunk
• Was hooked on Splunk the first day
• I am ¼ Australian, so naturally my favorite
Splunk Slogan is “Australian for Grep”
4. 4
Before Splunk: Logs Here, There, & Everywhere
• Had to look at log files directly
• No central view of connected
services in different locations
• Limits on logins due to compliance
• Could see errors happening, but
couldn’t drill in for details
• Manual problem-solving - we had to
log onto one server after another
• Harder to maintain 99.xx% uptime
“Figuring out
problems was one big
long chain of
searching for puzzle
pieces one at a time.”
5. 5
And Tons of Third-Party Applications
• Before Splunk, we had no visibility into our
60+ different applications
• DB Connect / SQL server
• Oracle lookup tables
• IIS
• WMI Inputs
• Tomcat Logs
• Linux System Logs
• Custom Application Logging
• Windows Event – System, Application, and Security
logs
• Splunk for JMX
• And many more….
6. 6
Choosing Splunk
• Staff was struggling with disparate
logs & manual searches
• One day they gave us Splunk
and told me ‘just play with it’
• We quickly had all the means of
searching we’d been needing for
years in one package
“After my first 8
hours in Splunk,
I was totally
impressed.”
7. Splunk at Sabre
• 4 Splunk environments
• 9 clustered indexers
• 40+ GB License for Pre-Production
• 300+ GB License for Production
[Diagram: Hundreds of Universal Forwarders, 9 Indexers, 9 Search Heads + 4 Deployment Servers]
8. Splunk Gives Us…
• Visibility into applications at all levels
w/o labor-intensive searches
• Performance and monitoring alerts
• Individual access for everyone to data
to get their jobs done – no waiting for
a DBA to set something up
• More efficiencies in the day-to-day
“We can now be 100%
sure we always know
the health of our
application and all its
parts.”
9. 9
Use Case: Splunk for Application Dev
• 80 alerts in Splunk monitor:
• Server and app performance
• CPU and memory status
• Specific event-to-error alerts
• Splunk monitoring is part of all
phases of our development cycle
• Multiple teams have easy
information access
• Meet compliance: developers access
production data in Splunk, not on servers
“We count on Splunk
to ensure our
applications are
operating normally.”
10. 10
Use Case: Splunk for Pre-Production
• 7 mirrored, “entirely Splunked” pre-production
environments
• Load test and Automation views are updated in Splunk
and shared with teams
• We find & fix problems before they get into production
11. 11
Use Case: Splunk for Business KPIs
• With Splunk, we can:
•Ensure SLA uptime for customers
•Diagnose issues and achieve faster MTTR
•See and search multiple applications
•Streamline application performance, monitoring and
alerting
12. 12
Use Case: Splunk for Support
• Effective search tools make log
searches easy
• All levels of users can use complex
search language
• Searching email logs can pinpoint and
solve reservation issues
• Smoother maintenance process: we went
from lengthy batch processes to easy
searches of IIS logs, and we can see
anything we want
“Splunk is like Google
for searches.
Effective search tools
took manual effort
and human error out
of the picture.”
13. 13
Internal Splunkers
• Support team runs queries w/o knowing SQL
• Teams have over 300 Splunk logins
(account mgmt, support, technical staff, product managers)
• Able to manage extreme growth (2X
customer base and developers, 3X servers)
• Users can get info they need easily w/o
accessing production systems
• Dashboards/reports keep stakeholders
informed
14. 14
Splunk for External Customers
• 1st level support can now provide instant answers to
phone queries without escalation
• Able to provide second tier support to customers’
own support team
• Answer field queries
• Understand usage spikes
• Improve MTTR: in many cases, problems that took 2
days to get answers are now handled on the spot
• Provide key customers direct access to select Splunk
data on their paid application services
15. 15
Splunking Ahead….
• We’re migrating Splunk to more
and more processes
• Exposing Splunk to more customers
• Developing email tool to give
customers insight into how
applications are performing
• Migrating Splunk configuration into
Source Control to mirror dev cycle
16. 16
A Little Splunk Wisdom….
• Think of monitoring from the start and how you
can incorporate it
• Plan ahead what you can do:
• Build configuration items in the appropriate location.
Be able to rationalize which app the config should
belong to and ensure the config is moved to default
once finalized.
• Standardize alerts and views by picking a naming
convention and sticking with it. Use lookups, macros,
and eventtypes to manage differences between
environments, allowing searches to work universally.
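The last point on this slide can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration, not Sabre's actual setup: the naming pattern (team, category, short description), the environment names, the index names, the thresholds, and the search text are all hypothetical. The dictionary plays the role a lookup or macro would play in Splunk, so a single search definition works in every environment.

```python
import re
from string import Template

# Hypothetical naming convention: <team>_<category>_<short_description>,
# lowercase, underscore-separated, with a fixed set of categories.
ALERT_NAME = re.compile(r"[a-z0-9]+_(perf|error|capacity)_[a-z0-9]+(_[a-z0-9]+)*")


def is_valid_alert_name(name: str) -> bool:
    """Return True if the alert name follows the (assumed) convention."""
    return ALERT_NAME.fullmatch(name) is not None


# Hypothetical per-environment values. In Splunk these differences would
# live in a lookup or macro; a plain dict stands in for that here.
ENVIRONMENTS = {
    "preprod": {"index": "app_preprod", "cpu_threshold": 90},
    "prod": {"index": "app_prod", "cpu_threshold": 80},
}

# One search definition shared by all environments (illustrative text only).
SEARCH_TEMPLATE = Template(
    "index=$index sourcetype=cpu_metrics | where cpu_pct > $cpu_threshold"
)


def render_search(environment: str) -> str:
    """Fill the shared search template with one environment's values."""
    return SEARCH_TEMPLATE.substitute(ENVIRONMENTS[environment])


if __name__ == "__main__":
    for name in ("support_perf_cpu_high", "CPU Alert 2"):
        print(name, is_valid_alert_name(name))
    for env in ENVIRONMENTS:
        print(env, "->", render_search(env))
```

Keeping the environment differences in one table means an alert or view is written once and only the table changes between pre-production and production.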
Background in hosting and managed services; this is my first non-hosting/MS gig.
Been at ClickBank about a year
Splunk user in the past
I’d stand in the break room and have someone ask, “Did you see that Splunk search I sent you?”
I’d have the team wander over to ask questions, and I’d have random folks send me Splunk searches to show me things
Innovation-a-thons included Splunk!
ELK was winning because it was free, and we’ve got a big open source community at the office
We did an honest review of how much each would cost to deploy, in people time as well as money
Losing people time mattered more than the cost to deploy (more engineers back to focusing on their jobs, and not logs, means more getting done!)
We used it for PCI in a tiny way
Deployed a significantly larger installation and had it online in weeks; the value was immediate
All our PCI stuff is centered around Splunk
Splunk is connected to JIRA and is used for reports and daily functionality (security engineers just have to review the JIRA tickets)
Constant searches for sensitive information immediately notify us of bad juju
Dev can now create logs in dev, add things to help them triage, and have them appear within the next release
Dev/QA can review logs in Dev, and across dev instances to confirm fixes/breakage
More stable releases!
“Think of monitoring at the start and how you’ll incorporate it”
“Plan ahead what you can do with Splunk – build configuration items, standardization for alerts, views, etc.”