2. 2
Customer Discovery
• Activity brief in the box folder:
– https://splunk.box.com/SLPhx
• In the room:
– Door2door’s Splunk Architect (Steve)
– Splunk SE (Nate)
3. 3
Splunk at the Next Level
Time to move beyond initial Splunk environment
• More use cases – how to tackle?
• More data – how do we scale?
• Splunk is mission critical == HA
• Global deployments
• Splunk user experience
5. 5
Growing your Splunk Deployment
Many customers start with a single use case…
• Ex: Monitor the web servers
• Help ensure up-time & response times
• Track usage, errors
• Provides business value
6. 6
Growing your Splunk Deployment
Value statement for each overall service
Your services exist in a larger context than just one app, or one tier.
What is the value of the service as a whole?
What are CIO commitments for the service?
• The company’s web store is one of the most critical parts of the business.
• Performance of the overall environment must be maintained at all times.
• Failures in any portion of the web store must be quickly identified, and the appropriate parties notified.
• Dependencies on external processes must be monitored as well.
7. 7
Growing your Splunk Deployment
The larger context
• Failure in one system cascades
• Map dependencies, estimate costs
• Use Splunk to track all dependencies.
• What happens when it is down?
Dependencies often include:
• Networking dependencies
• Shared storage
• Databases, middleware, custom apps
• Virtualization layer
11. 11
Scaling - Storage
Simple storage to complex
Raw data compresses to roughly 50% of its original size on disk.
Simple: rate * compression * retention
200 GB / day * 50% * 100 days = 10TB
Consider cold storage on NAS
– Changes storage story.
– Separate retention periods for fast and slow storage
Clustering
– Changes storage story
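The simple sizing rule on this slide can be sketched in Python. The function and parameter names are illustrative assumptions, not part of any Splunk tooling, and the 50% ratio is only the slide's rule of thumb.

```python
def storage_needed_gb(daily_ingest_gb, retention_days, compression_ratio=0.5):
    """Estimate on-disk storage as rate * compression * retention.

    compression_ratio=0.5 is the slide's ~50% rule of thumb (an assumption;
    real ratios vary by data type).
    """
    return daily_ingest_gb * compression_ratio * retention_days


# Slide example: 200 GB/day kept for 100 days
total_gb = storage_needed_gb(200, 100)
print(f"{total_gb / 1000:.0f} TB")  # prints "10 TB"
```

Cold storage on NAS or clustering changes the inputs (separate retention per tier, extra copies), so treat this as a starting estimate only.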
16. 17
Scaling the Search Heads
Splunk Search is critical, too!
Splunk Search high availability needs
Scale to handle # of concurrent queries
17. 18
SHP vs SHC
SHP
• Available since v4.2
• Sharing configurations through NFS
• Single point of failure
• Performance issues
SHC
• No NFS
• Replication using local storage
• Commodity hardware
19. 20
Search Head Clustering
The coordinating node is called the “Captain” rather than “Master” to avoid confusion with index clustering
Minimum 3 nodes required. An odd number is always preferred.
The cluster takes certain key decisions based on *majority* (consensus)
In a multi-site setup, place more nodes in the main data center
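Why an odd node count is preferred can be seen from simple majority arithmetic. This is a simplified model of majority voting, not Splunk's actual captain election code; the function names are hypothetical.

```python
def majority(members):
    """Smallest number of nodes that forms a majority of the cluster."""
    return members // 2 + 1


def tolerated_failures(members):
    """Nodes that can fail while a majority is still reachable."""
    return members - majority(members)


for n in (3, 4, 5):
    print(n, majority(n), tolerated_failures(n))
```

Under this model a fourth node adds no failure tolerance over three, and a site holding fewer than half of the nodes cannot reach a majority on its own, which is the reasoning behind keeping more nodes in the main data center.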
21. 22
Deployment Server
Central management of Splunk Forwarders
Deployment Server manages Apps, Configs
Select one or more classes for each host
Class defines apps & configs
Works by phone-home
Notes:
DS does not push forwarder binaries
Use Cluster Master to manage indexers in cluster, not DS
24. 25
Discovery
• 1 TB/day peak ingest
• Up to 50 concurrent users
• All data is being generated from a single data center
• Fault tolerant design for high availability of Splunk
• 90 days data retention
• Standard hardware models in the Activity Brief
26. 27
Forwarding Tier
Design Factors
• Syslog Collectors (HA)
• DBConnect Inputs
– McAfee EPO data
• TA Inputs
– CheckPoint
• Assorted Inputs
– Microsoft AD logs
– Microsoft Exchange Server
– Microsoft Sharepoint logs
– Log4j, Linux, IIS
27. 28
Syslog Collectors
• Best practice: use dedicated syslog servers
• syslog-ng or rsyslog recommended
• Syslog can write events to dedicated log files allowing for easy sourcetype classification on inputs
28. 29
Syslog Collectors
High Availability
• Use a load balancer/VIP with Linux Heartbeat to provide failover for the syslog listener
• syslog-ng PE client-side failover
29. 30
Forwarder for TAs
• TA-McAfee requires DBConnect to pull endpoint events
• TA-Checkpoint uses the LEA Client to retrieve firewall log events
• Not an HA design, but could be hosted on a VM for standby or failover
30. 31
Deployment Server
• Deployment Server to manage Linux and Windows forwarders
• Not an HA design, but could be hosted on a VM for standby or failover
32. 33
Forwarding Tier BOM
Role               Type  Config            #
Syslog Server      F     4vCPU/12GB/200GB  2
HWF                E     2vCPU/8GB/20GB    1
Deployment Server  F     4vCPU/12GB/200GB  1
Load Balancer      -     -                 -
33. 34
Forwarding Tier Design Best Practices
• Use a Syslog Server for Syslog data
• Be careful with intermediate forwarders
– They can introduce bottlenecks
– They can reduce the distribution of events across indexers
• AutoLB spreads data over all available indexers, but don’t assume the spread is even!
– Enable forceTimebasedAutoLB (outputs.conf)
• May need to raise the UF throughput limit for high-velocity sources
– maxKBps in the [thruput] stanza of limits.conf
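To judge whether a source is likely to hit the forwarder's maxKBps limit, it helps to convert its daily volume into an average rate. The sketch below is plain arithmetic; the function names, the example limit of 256 KB/s and the headroom factor of 2 are assumptions for illustration, so check the actual value in your own limits.conf.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_kbps(gb_per_day):
    """Average rate in KB/s needed to ship gb_per_day (1 GB = 1024 * 1024 KB)."""
    return gb_per_day * 1024 * 1024 / SECONDS_PER_DAY


def needs_higher_limit(gb_per_day, max_kbps, headroom=2.0):
    """True when the average rate times a burst headroom exceeds the limit.

    headroom is a hypothetical safety factor for bursty sources.
    """
    return average_kbps(gb_per_day) * headroom > max_kbps


# A source producing 20 GB/day against an example limit of 256 KB/s
print(round(average_kbps(20)))       # prints 243
print(needs_higher_limit(20, 256))   # prints True
```

Averages hide bursts, so a source that looks fine on paper can still fall behind at peak times.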
38. 39
Storage Types
• Local vs Direct Attached vs SAN vs NAS
• SSD/Flash vs Spinning Disk
– SSDs offer much higher IOPS with far lower latency
– Significant performance increases for sparse searches
39. 40
Cluster Master Server
• Indexer Apps are deployed via CM
• Not an HA design, but could be hosted on a VM for standby or failover
41. 42
Indexing Tier BOM – Solution A
Role            Type  Config                      #
Indexer         A     16CPU/64GB/12*1TB (RAID10)  20
Cluster Master  F     4vCPU/12GB/200GB            1
42. 43
Indexing Tier BOM – Solution B
Role            Type  Config                                       #
Indexer         C     24CPU/96GB/6*800GB (RAID6) + 6*2TB (RAID10)  13
Cluster Master  F     4vCPU/12GB/200GB                             1
43. 44
Indexing Tier Design Best Practices
• Depending on search load, plan for a maximum of 100-250 GB per indexer per day
• Consider the maximum number of indexes when clustering is enabled
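The per-indexer guideline gives a quick lower bound on indexer count from ingest alone. A minimal sketch, assuming the 1 TB/day peak from the Discovery slide is treated as 1000 GB; the function name is hypothetical.

```python
import math


def indexers_needed(daily_ingest_gb, per_indexer_gb_per_day):
    """Round up, since a fraction of an indexer still means one more host."""
    return math.ceil(daily_ingest_gb / per_indexer_gb_per_day)


low = indexers_needed(1000, 250)   # light search load
high = indexers_needed(1000, 100)  # heavy search load
print(low, high)  # prints "4 10"
```

This counts ingest capacity only. Storage per host, replication and search concurrency can all push the final number higher.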
44. 45
How Clustering Affects Sizing
• Increased storage:
– 15% of raw usage for every replica copy
– 35% MORE to make that searchable
• Increased processing
– Incoming data to an indexer is streamed to indexing peers to satisfy the required number of copies
• More hosts
– Need at least “replication factor” + 2 hosts (the extra two are the search head and cluster master)
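The storage percentages on this slide combine into a single multiplier on raw daily volume. A minimal sketch, assuming the slide's 15% and 35% rules of thumb; the function and parameter names are illustrative only.

```python
def clustered_storage_gb(raw_gb, replication_factor, search_factor,
                         rawdata_ratio=0.15, index_ratio=0.35):
    """Disk needed for raw_gb of incoming data in an indexer cluster.

    Each replica copy costs rawdata_ratio of the raw size; each searchable
    copy costs a further index_ratio. Both ratios are rules of thumb.
    """
    return raw_gb * (replication_factor * rawdata_ratio
                     + search_factor * index_ratio)


# A 3/2 cluster (replication factor 3, search factor 2)
print(f"{clustered_storage_gb(1, 3, 2):.0%}")  # prints "115%"
```

With one copy that is also searchable the same formula gives 50%, which matches the compression figure used for non-clustered sizing.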
45. 46
Benefits of Clustering
• Data redundancy
• Data availability
• Indexer resiliency
• Simpler management of indexers
• Simpler setup of distributed search
• Multi-site clustering allows site-specific search to reduce WAN traffic
46. 47
Downsides of Clustering
• Increased Storage
• Extra machine (cluster master) required
• Increased bandwidth
• Hard to manage with DS (read: don’t)
48. 49
Search Tier
Design Factors
• High Availability
• Search Head Clustering
• # users
• # concurrent searches
• Forward all data to indexers
49. 50
SHC & Deployer
• Search Head Cluster apps need to be installed by the Deployer
• A minimum of 3 search heads is required for an SHC
• No Exchange or VMware app with SHC
– Anything leveraging tscollect-based searches will need modification
51. 52
Search Tier BOM
Role            Type  Config              #
Search Head     B     16CPU/64GB/2*800GB  3
Deployer        E     2vCPU/8GB/20GB      1
License Server  E     2vCPU/8GB/20GB      1
Load Balancer   -     -                   -
52. 53
Search Tier Design Best Practices
• ES will still require a Separate Search Head or dedicated SHC
• Use LDAP/AD/SSO for user Authentication
• Load Balancer configured for sticky sessions
56. 57
Hybrid Approach
• Add the existing Splunk instance as a search peer until the data retention period has expired
• Disable scheduled searches on the old instance
• Migrate any summary index data to the new indexers
58. 59
Top 5 things to consider
• Indexer Storage requirements – Size and IOPS
• Minimum buy-in for a SHC is 3
• Use VMs for CM/LS/DS/Deployer if possible
• Consider a dedicated SH for a Distributed Management Console
• When in doubt – add another Indexer
59. 60
How Apps Affect Sizing
• Enterprise Security – Requires a dedicated search head
• Don’t share hosts with other services
– Not co-located with Exchange, Active Directory, Hypervisors
• Don’t let anti-virus run on the Splunk partition
• Some data collection apps require a full instance (heavy forwarder)
– VMware
– Checkpoint LEA
62. 63
The 6th Annual Splunk Worldwide Users’ Conference
• September 21-24, 2015
• The MGM Grand Hotel, Las Vegas
• 4000 IT & Business Professionals
• 2 Keynote Sessions
• 3 days of technical content
– 165+ sessions
• 3 days of Splunk University
– Sept 19-21, 2015
– Get Splunk Certified for FREE!
– Get CPE credits for CISSP, CAP, SSCP, etc.
– Save thousands on Splunk education!
• 80 Customer Speakers
• 80 Splunk Speakers
• 35+ Apps in Splunk Apps Showcase
• 65 Technology Partners
• Ask The Experts and Security Experts, Birds of a Feather, Chalk Talks and a new & improved Partner Pavilion!
• Register at conf.splunk.com
64. 65
We Want to Hear your Feedback!
After the Breakout Sessions conclude
Text Splunk PHX to 878787
And be entered for a chance to win a $100 AMEX gift card!
A default 3/2 cluster (replication factor 3, search factor 2) uses 3*.15 + 2*.35 = 115% of license usage for that redundancy
Processing: a little more CPU and more network
This is much better in current versions: the indexed data (tsidx, etc.) is streamed to the replica peer rather than forcing the peer to re-index.
Availability – Cervelli famously smashed a laptop that was part of a distributed cluster, another host answered, search still available
As discussed – default parameters require *more than* original log size
Indexing volume per day (reference indexer = 250 GB / day = 3 MB/s .. ~ ¼ of a forwarder)
Long-term storage (retention)
Users = search activity
Saved searches = search activity
Dense (CPU, time spent unzipping data) / rare / sparse (1 in a million or 1 in 10 million – IOPS)
2 inspired Keynotes – General Session and Security Keynote
150+ Breakout sessions addressing all areas and levels of Operational Intelligence – IT, Business Analytics, Mobile, Cloud, IoT, Security…and MORE!
Join the 50%+ of Fortune 100 companies who attended .conf2014 to get hands on with Splunk. You’ll be surrounded by thousands of other like-minded individuals who are ready to share exciting and cutting edge use cases and best practices. You can also deep dive on all things Splunk products together with your favorite Splunkers.
Head back to your company with both practical and inspired new uses for Splunk, ready to unlock the unimaginable power of your data! Arrive in Vegas a Splunk user, leave Vegas a Splunk Ninja!
----- Meeting Notes (4/22/15 10:47) -----
Splunk Apptitude is live and open.
You've got 90 days.
To win more than $150,000 in cash and prizes.
Last day to submit is July 20th, 2015.
We'll announce the winners at Black Hat in August.
Good luck!