The Distributed Management Console (DMC) helps Splunk admins monitor the health of their Splunk deployment. In Splunk 6.3, we built views for Splunk Index and Volume Usage, Forwarder Monitoring, Search Head Cluster Monitoring, and Indexer Cluster Monitoring, plus tools for visualizing your Splunk topology. Leverage the Splunk DMC and come see the forest -and- the trees in your Splunk deployment!
2. 2
Personal Introduction
• Kamilo “Kam” Amir
• Works on the Splunk MidAtlantic Majors Team
• 4 years with Splunk; previously worked at BMC Software (BladeLogic) and Verizon Business (Digex)
• Mike Wilson
• Works on Splunk Public Sector Team
• Yes, he has worked at Splunk for the last million years…
3. 3
Agenda
• 6.4 DMC Recap
– Continuous Investment
– DMC Deployment Architectures
• So What’s Up With My Search Head Cluster?
• And that other Clustering thing, the Indexer Cluster?
• Indexes and Volumes Everywhere
• Forwarders (Really Everywhere)
• Oh, and One Other Thing…
10. 10
Continuous Investment in Management/Monitoring
• Started with Introspection in 6.1
• Items in 6.3 that will make Admins happy
– Data Integrity Control
– Forwarder Director
– Runaway Search Preventer
• The future
– Radically simplified setup/expansion
– Granular controls in distributed deployment
– Standard flows for common tasks in a distributed deployment
– Better App model for installation/management
11. 11
History of Splunk Monitoring Tools
• index=_internal sourcetype=splunkd
– Go look at the logs!
• Splunkbase Tools
• Status/System Activity Dashboards
• Deployment Monitor
– License Usage Reporting!
– Alerting, Summarization
• S.o.S
– Developed by Splunk Support for Splunk Support and Customers
– Platform Resource Utilization collection with Technology Add-Ons
– Topology View
13. 13
Setup Tasks
• Prerequisites
– Where does the DMC live?
– Topology Definition
– Forward all logs from all components back to the indexing tier
– All components must be Search Peers of the DMC Host
• Standalone vs Distributed Mode
– Server Roles
– Custom Groups
– Cluster Labels!
16. 16
Search Head Clustering Views
• Motivation
– Plenty of data in logs/CLI
– Lots of customers deploying SHC
– What is going on in my Search Head Cluster?
27. 27
Indexes and Volumes Views
• Motivation
– Customers love Fire Brigade
– Figuring out if you are meeting your retention policies is tricky
• Demo
36. 36
Forwarder Monitoring Views
• Motivation
– No Forwarder info in 6.2!
– Deployment Monitor no longer improved/supported
– Some customers don’t use Deployment Server
• Forwarder Monitoring Setup
– Runs a search against indexers
– Configurable period
– View reads from Asset Table
• Demo
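The setup steps above (a periodic search against the indexers whose results land in an asset table that the view reads) can be sketched roughly as follows. This is only an illustration of the idea, not the DMC's actual implementation: the function name `build_asset_table`, the record layout, the hostnames, and the 15-minute "missing" threshold are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def build_asset_table(connections, now, missing_after=timedelta(minutes=15)):
    """Collapse raw (hostname, seen_at) connection records into one row per forwarder.

    The threshold for calling a forwarder "missing" is a hypothetical value.
    """
    last_seen = {}
    for hostname, seen_at in connections:
        # Keep only the most recent connection time per forwarder.
        if hostname not in last_seen or seen_at > last_seen[hostname]:
            last_seen[hostname] = seen_at

    table = {}
    for hostname, seen_at in last_seen.items():
        status = "missing" if now - seen_at > missing_after else "active"
        table[hostname] = {"last_connected": seen_at, "status": status}
    return table


# Hypothetical connection records, as the scheduled search might return them.
now = datetime(2016, 4, 1, 12, 0)
connections = [
    ("fwd-a", now - timedelta(minutes=30)),
    ("fwd-a", now - timedelta(minutes=2)),
    ("fwd-b", now - timedelta(hours=13)),
]
assets = build_asset_table(connections, now)
for hostname, row in sorted(assets.items()):
    print(hostname, row["status"], row["last_connected"])
```

Because the view reads only from the table, how fresh it looks depends on how often the populating search runs, which is why the period is configurable.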
44. 44
DC Area Splunk Meetups
DC Area
• http://www.meetup.com/SplunkersDC/
• Q&A Chat forum
So what’s next on the agenda?
• April 27th 6:30pm McLean, VA – Happy Splunk, Happy Splunker
45. 45
SEPT 26-29, 2016
WALT DISNEY WORLD, ORLANDO
SWAN AND DOLPHIN RESORTS
• 5000+ IT & Business Professionals
• 3 days of technical content
• 165+ sessions
• 80+ Customer Speakers
• 35+ Apps in Splunk Apps Showcase
• 75+ Technology Partners
• 1:1 networking: Ask The Experts and Security Experts, Birds of a Feather and Chalk Talks
• NEW hands-on labs!
• Expanded show floor, Dashboards Control Room & Clinic, and MORE!
The 7th Annual Splunk Worldwide Users’ Conference
PLUS Splunk University
• Three days: Sept 24-26, 2016
• Get Splunk Certified for FREE!
• Get CPE credits for CISSP, CAP, SSCP
• Save thousands on Splunk education!
Obvious questions about what can be co-hosted.
What does Splunk look like when it gets big?
A typical DMC setup page
The Status and Configuration dashboard is an overview of your search head cluster. It is high-level information.
The Configuration Replication dashboard provides insight into configurations that a user changes on any SHC member, and how these changes propagate through the cluster.
The Artifact Replication dashboard contains several panels describing the cluster's "backlog" of search artifacts to replicate.
This dashboard provides visibility into the captain’s role as a coordinator for scheduled searches in the cluster.
In the Apps status panel, a persistent discrepancy indicates that the deployer has not finished deploying apps to the cluster members.
2 indexer clusters, 1 status view
The status of several indexer clusters can now be consulted from a single location!
No need to connect to several Cluster Master instances
This view shows service tasks undertaken by the indexer clustering framework to meet data replication targets
The marker shows a time when an indexer went down, requiring the surviving ones to start copying data buckets to repair the cluster
We clearly see an initial peak of fix-up tasks identified, which slowly decreases over time as the cluster fixes itself
In that manner, this view provides visibility into the progress of such unplanned reconfiguration events
We’re looking at the _audit index on the ‘potato’ indexer cluster.
We have a target time retention of 150 days for this index, which seems to be respected based on this ‘median data age’ metric.
However, looking at the breakdown of data age per indexer, we can see that one indexer (svdev-centos6-006.sv.splunk.com) does not meet the target of 150 days of retention.
To investigate further, we click on the table row corresponding to this index, which leads us to the Index Detail – Instance view.
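The check described above, where a healthy-looking median hides one indexer that falls short, can be sketched as follows. This is a minimal illustration under stated assumptions: the function `retention_report`, the indexer names, and the age figures are hypothetical and are not taken from the dashboard shown in the demo.

```python
from statistics import median


def retention_report(age_days_by_indexer, target_days):
    """Compare the cluster-wide median data age and each indexer's age to the target."""
    med = median(age_days_by_indexer.values())
    below = sorted(
        host for host, age in age_days_by_indexer.items() if age < target_days
    )
    return {
        "median_age_days": med,
        "median_meets_target": med >= target_days,
        "below_target": below,
    }


# Hypothetical data age (in days) of one index on each indexer.
ages = {"idx-01": 152, "idx-02": 151, "idx-03": 12}
report = retention_report(ages, target_days=150)
print(report)
```

In this made-up data the median still meets the 150-day target even though one indexer holds only 12 days of data, which is why the per-indexer breakdown matters.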
Looking in detail at the indexer that fails to meet the target retention for the _audit index, we see that:
– Data is not being deleted by the time-based retention policy (1st column)
– Data is not being deleted by the index-wide disk usage retention policy (2nd column)
– Data is not being deleted by the directory-level (home & cold path) retention policies (3rd and 4th columns)
Looking at how data age evolved over time, we can see a sharp drop-off on 09/08, indicating an incident on that day
Furthermore, we see that on 09/08 we lost almost all cold buckets, indicating that something happened to the cold directory of this index on that day
Let’s take a closer look at the settings for this index: Is this leveraging volumes?
Indeed, both paths for this index are referencing volumes
homePath (hot + warm buckets) is referencing a volume named “opt”
coldPath (cold buckets) is referencing a volume named “cold”
We should look at these volumes next, using the Volume Detail – Instance scoped to this indexer
First let’s look at the ‘opt’ volume
We see that this volume is _not_ full, so it’s not pushing data out
We also see that the _audit index’s ‘home’ directory is hosted on this volume, with ~3GB worth of data
Let’s move on to the ‘cold’ volume
Looking at the ‘cold’ volume now
This volume *is* full! It is pushing data out aggressively!
All space in this volume is used by the ‘latex_imports’ index, representing only ~ 1 day’s worth of data
Given that a full volume freezes older data first, the surge of recent data from ‘latex_imports’ has caused the volume to push out all data from the ‘_audit’ index
Solution: separate indexes with different data density and target retention periods in different volumes
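The failure mode in this walkthrough can be reproduced with a small simulation. It is a simplified sketch, not Splunk's actual bucket management: it assumes a volume with a size cap that freezes the oldest buckets first regardless of which index they belong to, and the bucket sizes, day numbers, and the function name `enforce_volume_limit` are all hypothetical.

```python
def enforce_volume_limit(buckets, max_volume_mb):
    """Freeze the oldest buckets, across all indexes, until the volume fits its cap."""
    kept = sorted(buckets, key=lambda b: b["latest_day"])  # oldest first
    frozen = []
    used = sum(b["size_mb"] for b in kept)
    while kept and used > max_volume_mb:
        oldest = kept.pop(0)
        used -= oldest["size_mb"]
        frozen.append(oldest)
    return kept, frozen


# Sparse, long-lived index: ten small buckets spread over about 150 days.
audit = [
    {"index": "_audit", "latest_day": day, "size_mb": 100}
    for day in range(0, 150, 15)
]
# Dense index: five large buckets, all written on the most recent day.
latex = [
    {"index": "latex_imports", "latest_day": 150, "size_mb": 1000}
    for _ in range(5)
]

# Shared volume: the dense index pushes out every _audit bucket.
kept_shared, frozen_shared = enforce_volume_limit(audit + latex, max_volume_mb=5000)
print(sorted({b["index"] for b in kept_shared}))

# Separate volume for _audit: nothing needs to be frozen.
kept_own, frozen_own = enforce_volume_limit(audit, max_volume_mb=2000)
print(len(kept_own), len(frozen_own))
```

With the shared cap, only 'latex_imports' buckets survive; once '_audit' has its own volume, it keeps its full history.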
Forwarder Monitoring – Deployment view can highlight missing forwarders
Here we can clearly see two forwarders that have gone missing
The first one – ‘atruong-mbpr15’ – hasn’t sent data to the indexers for ~ 3 hours
The second one – ‘uf-dmcdemo’ – hasn’t sent data to the indexers for ~ 13 hours
Let’s click on one of these missing forwarders for a drill-down to the Forwarder Monitoring – Instance view
Forwarder Monitoring – Instance view
We’re now looking in more detail at the history of forwarder ‘uf-dmcdemo’ connections to the indexers on the previous day
We can clearly see a gap of several hours during which this forwarder did not connect to the indexers, which would have resulted in a “missing” status
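Finding such a gap in a forwarder's connection history amounts to looking for consecutive connections that are too far apart. The following is a minimal sketch under assumed inputs: the timestamps, the one-hour `max_silence` threshold, and the function name `find_gaps` are hypothetical and do not reflect how the DMC computes the "missing" status.

```python
from datetime import datetime, timedelta


def find_gaps(connection_times, max_silence):
    """Return (last_seen, next_seen) pairs where the forwarder was silent too long."""
    ordered = sorted(connection_times)
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier > max_silence:
            gaps.append((earlier, later))
    return gaps


# Hypothetical history: a connection every 30 minutes, with a silent stretch
# between 06:00 and 11:00.
start = datetime(2016, 4, 1, 0, 0)
times = [start + timedelta(minutes=30 * i) for i in range(13)]
times += [start + timedelta(hours=11, minutes=30 * i) for i in range(5)]

gaps = find_gaps(times, max_silence=timedelta(hours=1))
for last_seen, next_seen in gaps:
    print("silent from", last_seen, "to", next_seen)
```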
Missing forwarders can also be proactively detected using a built-in alert!
We’re headed to the East Coast!
2 inspired Keynotes – General Session and Security Keynote + Super Sessions with Splunk Leadership in Cloud, IT Ops, Security and Business Analytics!
165+ Breakout sessions addressing all areas and levels of Operational Intelligence – IT, Business Analytics, Mobile, Cloud, IoT, Security…and MORE!
30+ hours of invaluable networking time with industry thought leaders, technologists, and other Splunk Ninjas and Champions waiting to share their business wins with you!
Join the 50%+ of Fortune 100 companies who attended .conf2015 to get hands on with Splunk. You’ll be surrounded by thousands of other like-minded individuals who are ready to share exciting and cutting edge use cases and best practices. You can also deep dive on all things Splunk products together with your favorite Splunkers.
Head back to your company with both practical and inspired new uses for Splunk, ready to unlock the unimaginable power of your data! Arrive in Orlando a Splunk user, leave Orlando a Splunk Ninja!
REGISTRATION OPENS IN MARCH 2016 – STAY TUNED FOR NEWS ON OUR BEST REGISTRATION RATES – COMING SOON!