Data Onboarding Breakout Session

Copyright © 2015 Splunk Inc.
Data Onboarding 101
KirkHanson
SalesEngineering

Agenda
1 Splunk Platform – a refresher
Data Onboarding – live looks
TA’s & Apps what are they?
Go Beyond The Logs
Q&A
2
3
4
5

3
Make machine data accessible,
usable and valuable to everyone.
3

Big Data Comes from Machines
Volume | Velocity | Variety | Variability
4
GPS
RFID
Hypervisor
Web Servers
Email
Messaging
Clickstreams
Mobile
Telephony
IVR
Databases
Sensor Servers
Telematics
Storage
Security Devices
Desktops

6
Machine Data
Sources
Order Processing
Twitter
Care IVR
Middleware
Error

7
Machine Data Contains Critical Insights
Order ID
Customer’s Tweet
Time Waiting On Hold
Product ID
Company’s Twitter ID
Order ID
Customer ID
Twitter ID
Customer ID
Customer ID
Sources
Order Processing
Twitter
Care IVR
Middleware
Error

88
Turn Machine Data into Operational Intelligence
INDEX ANY MACHINE DATA: ANY SOURCE, TYPE, VOLUME GAIN REAL-TIME VISIBILITY
Online
Services Web
Services
Servers
Security GPS
Location
Storage
Desktops
Networks
Packaged
Applications
Custom
ApplicationsMessaging
Telecoms
Online
Shopping
Cart
Web
Clickstreams
Databases
Energy
Meters
Call Detail
Records
Smartphones
and Devices
RFID
On-
Premises
Private
Cloud
Public
Cloud
Application
Delivery
Security and
Compliance
Infrastructure
Monitoring
Business Analytics
Internet of Things
8

Best Practices Data
Onboarding

Six things to get right at Index time
Event Boundary
/ LineBreaking
Date
Timestamp
Sourcetype
Source
Host
Index

Best Practices
Local
before Prod
Confirm
Sourcetype
Separate
Index
Specific as
possible
Try before
you buy
Save not
override

1414
Data Onboarding Examples
Complex JSON
Configured
Settings
Simple JSON
Default Settings
Complex CSV
Minimal Settings

1515
Data Onboarding Examples
Complex JSON
Configured
Settings
Simple JSON
Default Settings
Complex CSV
Minimal Settings

Data Onboarding Live Look
(simple JSON)

1717
Simple JSON – Lessons Learned
Complex JSON
Configured
Settings
Simple JSON
Default Settings
Complex CSV
Minimal Settings
• Structured
• TimeStamp found
in first event
• Smaller set of data

1818
Complex CSV onboarding
Complex JSON
Configured
Settings
Complex CSV
Minimal Settings
Simple JSON
Default Settings
• Structured
• TimeStamp found
in first event

(CSV)

2020
Complex CSV - Lessons Learned
Complex JSON
Configured
Settings
Complex CSV
Minimal Settings
Simple JSON
Default Settings
• Structured
• TimeStamp found
in first event
• TimeStamp not
found
• Data otherwise
standard

2121
Complex JSON Onboarding
Complex JSON
Configured
Settings
Simple JSON
Default Settings
• Structured
• TimeStamp found
in first event
Complex CSV
Minimal Settings
• TimeStamp not
found
• Data otherwise
standard

(complex JSON)

2323
Complex JSON - Lessons Learned
Complex JSON
Configured
Settings
Simple JSON
Default Settings
• Structured
• TimeStamp found
in first event
Complex CSV
Minimal Settings
• TimeStamp not
found
• Data otherwise
standard
•Nested
•Multiple
TimeStamp fields
•Larger single event

Why reinvent the wheel
Splunkbase (apps & TA’s)

A Growing, Global Community of Users
Dev.splunk.com40,000+ questions
and answers
1000+ apps Local User Groups
and
SplunkLive! events

Web Framework
SDKs
REST API
Log directly
to Splunk
Extract Splunk
data for archiving
Integrate with third-party
reporting tools and portals
Integrate Splunk search
results into your application
. . . and
more
The Splunk Platform
User&DeveloperInterfaces
Ticketing/Help Desk
Custom Biz Applications
Inputs,Apps,OtherContent
Scripted inputs
(.sh, .py, .bat, .ps1, etc.)
Get data from APIs and other
remote data interfaces and message
queues.
Databases
(JDBC)
Splunk DB Connect lets you enrich
and combine your machine data
with database data.
Network events
(TCP, UDP, SNMP, NetFlow, HTTP(S))
Get data from any network port,
SNMP events, or send your
application data directly via HTTP
(or HTTPS) through HTTP Event
Collector.
Forwarders
(TCP)
Gather machine and historical
data (e.g. text-based files,
Windows event logs, Active
Directory).
Modular inputs
(Stream data as plain text or XML)
Extend the Splunk Enterprise
framework to define a custom
input capability (e.g. Twitter, S3,
Splunk MINT).
Enrich and extend the usefulness of
your event data through interactions
with external resources like asset
info, employee info, threat feeds,
honeypots, and more.
External lookups
(.py or .csv)
Customize the
Splunk Web UI
Real time data collection, indexing and search, as well as alerting, large scale distributed
processing, user authentication (through Splunk’s built-in system, LDAP or a scripted authentication
API for use with an external authentication system), and role-based access control.
CoreEngine
Business Intelligence
ODB
C
Capture wire data from endpoints
and key network locations with the
Splunk App for Stream.Wire data
Systems Management
Infrastructure Apps
XenApp
XenDesktop
Cloud Services
Examples
Mainframe
Other Monitoring
Splunk Premium Solutions
Server, Storage, Network
Examples

Splunkbase by the #’s
IOT
(49)
Application
Management
(158)
IT Ops
(381)
Security &
Compliance
(384)
Business
Analytics
(70)
Utilities
292
Cool Stuff
210

3030
Which input(s) do we NOT support today?
1) Text-based files
2) Windows sources
3) TCP / UDP ports
4) SMNP events
5) NetFlow
6) HTTP(S)
7) FIFO queues
8) Scripted inputs
9) Message queues
10)Modular inputs
11)Databases
12)External lookups
13)Wire data
14)SDK

3131
We support ALL of these inputs!
31

3232
Our focus today
1) Text-based files
2) Windows sources
3) TCP / UDP ports
4) SMNP events
5) NetFlow
6) HTTP(S)
7) FIFO queues
8) Scripted inputs
9) Message queues
10) Modular inputs
(specifically MINT)
11) Databases
12) External lookups
13) Wire data
14) SDK

3434
Http Event Collect
34
EC
HTTP or HTTPS POST
<protocol>://<host>:<mPort>/services/collector(/raw)
Indexer Search Head
Event Source(s)

3636
Splunk> Mobile Intelligence
36
MINT SDKs
MINT Data
Collector
MINT App
MINT Management Console

3838
Stream Concept
Users Search Head(s)* Indexer(s)
Universal Forwarder
+
TA
Physical Data Center
Physical or
Virtual ServersEnd Users
Internet
Firewall
Public or
Private Cloud
LOCAL COLLECTION
+
TA
+
splunk_app_stream

4343
Where to go to learn more
Data Pipeline
– http://goo.gl/FP3JTM
Distributed Deployment Manual
– http://goo.gl/MTJr0K
How Indexing works (the data pipeline)
– https://goo.gl/SGRC1y
Tutorial & tutorial data
– http://goo.gl/OYNCnc
Date and time format variables
– http://goo.gl/E9Onpq
43

4444
Resources: HTTP Event Collector
• Introduction to Splunk HTTP Event Collector (Developer Portal)
• Set up and use HTTP Event Collector (Docs)
• Troubleshooting HTTP Event Collector (Confluence)
• HTTP Event Collector, your DIRECT event pipe to Splunk 6.3
(Blogs: Tips & Tricks)
• Liberate Your Application Logging (.conf2015)
44
EC

4545
Resources: MINT
• Splunk MINT Manual (Docs)
• Start with Splunk MINT SDKs (Management Console)
• Getting Started with Splunk MINT (Blogs: Mobile)
• Splunk MINT: Security & Privacy (Blogs: Mobile)
• What's the difference between MINT Management Console and
the Splunk MINT App?
45

4646
Resources: Stream
• Performance test results and recommendations (Docs)
• Supported protocols (Docs)
• Splunk App for Stream 6.4 (TEC)
• Everything you always wanted to know about SPAN ports,
Network Taps, Packet Mirrors, and the Splunk App for Stream
(but were afraid to ask) (Blogs: Security)
• How Can You Use Ephemeral Streams? (Blogs: Tips & Tricks)
46

47
Northern Cal Tech Talks!
Monthly WebEx Sessions
• Ted Talk style presentation
• Q&A Chat forum
So what’s next on the agenda?
• March 23rd @ 10AM PST - Building &
Deploying Apps.
• April 20th @ 10AM PST - Top 5 most useful
search commands.
See more at:
http://live.splunk.com/NorCalTechTalks

48
SEPT 26-29, 2016
WALT DISNEY WORLD, ORLANDO
SWAN AND DOLPHIN RESORTS
• 5000+ IT & Business Professionals
• 3 days of technical content
• 165+ sessions
• 80+ Customer Speakers
• 35+ Apps in Splunk Apps Showcase
• 75+ Technology Partners
• 1:1 networking: Ask The Experts and Security
Experts, Birds of a Feather and Chalk Talks
• NEW hands-on labs!
• Expanded show floor, Dashboards Control
Room & Clinic, and MORE!
The 7th Annual Splunk Worldwide Users’ Conference
PLUS Splunk University
• Three days: Sept 24-26, 2016
• Get Splunk Certified for FREE!
• Get CPE credits for CISSP, CAP, SSCP
• Save thousands on Splunk education!

5353
Complex JSON
Configured settings
•Nested
•Multiple TimeStamp
fields
•Larger single event
Simple JSON
Default settings
•Structured
•TimeStamp found in first
event
•Smaller set of data
Complex JSON – Lessons Learned

Scalable Syslog Event Collection
Dedicated Syslog
Collector with Splunk
Forwarder
Splunk Forwarder
With Syslog Listener

Data Onboarding Breakout Session

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to Data Onboarding Breakout Session

Similar to Data Onboarding Breakout Session (20)

More from Splunk

More from Splunk (20)

Recently uploaded

Recently uploaded (20)

Data Onboarding Breakout Session

Editor's Notes