2. 2
Why Splunk?
NPR uses Splunk to cost-effectively measure the ebb and flow of its online
"listeners," evaluate the effectiveness of new programs and campaigns,
optimize resource allocation and content delivery and more accurately
account for revenue sharing and royalty payments.
Ubisoft depends on Splunk to make sure its online services stay up and running
for fans of popular games, and to better understand usage patterns and
optimize its services to keep game developers and players happy.
The IT team at Home Depot uses Splunk to optimize performance and quickly
resolve issues across the complex interconnected network of applications.
3. 3
Splunk Company Overview
Company
• Global HQs:
San Francisco
London
Hong Kong
• 2,000 employees globally
• FY14 Revenue: $450M
(YoY +50%)
• NASDAQ: SPLK
Products
Free trial to massive scale
Splunk Enterprise
Splunk Cloud
Splunk Light
Hunk
Splunk MINT
Customers
• 10,000+ customers
• Across 100+ countries
• Small to large orgs
• 80+ of the Fortune 100
• Largest: 400+ TB/day
Solutions Apps
ES
ITSI
UBA
VMware
Exchange
PCI
4. 4
Our Plan of Action
1.Big Data - setting the stage
2.How does Splunk fit in the landscape?
3.What differentiates Splunk?
4.Components that make up Splunk?
5.Demo - How it works?
5. 5
The Accelerating Pace of Data
Volume | Velocity | Variety | Variability
GPS,
RFID,
Hypervisor,
Web Servers,
Email, Messaging,
Clickstreams, Mobile,
Telephony, IVR, Databases,
Sensors, Telematics, Storage,
Servers, Security Devices, Desktops
Machine data is the fastest growing, most
complex, most valuable area of big data
7. 7
Big Data Landscape
NoSQL: Key/Value, Columnar or Other (semi-structured): Cassandra, CouchDB, MongoDB
RDBMS: Relational Database (highly structured): Oracle, MySQL, IBM DB2, Teradata
SQL & MapReduce: Teradata Aster Data, SQL on Hadoop
Hadoop: Distributed File System (semi-structured): HDFS Storage + MapReduce
Temporal, Unstructured, Heterogeneous: Real-Time Indexing, MapReduce
9. 9
Industry Leading Platform For Machine Data
Machine Data: Any Location, Type, Volume
Data sources: Online Services, Web Services, Servers, Security, GPS Location, Storage, Desktops, Networks, Packaged Applications, Custom Applications, Messaging, Telecoms, Online Shopping Cart, Web Clickstreams, Databases, Energy Meters, Call Detail Records, Smartphones and Devices, RFID
Deployment: On-Premises, Private Cloud, Public Cloud
Platform Support (Apps / API / SDKs)
Enterprise Scalability
Universal Indexing
Answer Any Question: Developer Platform, Report and analyze, Custom dashboards, Monitor and alert, Ad hoc search
Any amount, any location, any source: Schema-on-the-fly, Universal indexing, No back-end RDBMS, No need to filter data
11. 13
How to Get Started
Four steps:
1. Download
2. Install
3. Forward Data
4. Search
Data sources: Databases, Networks, Servers, Virtual Machines, Smartphones and Devices, Custom Applications, Security, Web Server, Sensors
12. 14
Demo – How it Works
1. Installing and Starting Splunk
2. Ingesting Data
3. Search Basics: Search Bar, Time Picker, Extracted Fields
4. Alerting
5. Statistics and Reporting
6. Dynamic Field Extraction
7. Command Language
8. Splunk Applications
16. 18
Things to Remember
1. Splunk is Free – Download and get started today
2. Quick Time to Value
3. Data Gold Mines – what informational fortune awaits?!
4. Leverage the Splunk Community
• splunkbase.splunk.com
• answers.splunk.com
• blogs.splunk.com
5. Happy Splunking!
18. 20
We Want to Hear your Feedback!
After the Breakout Sessions conclude
Text Splunk to 20691
And be entered for a chance to win a $100 AMEX gift card!
[Why Splunk? What makes it special? SE presenting can tell their own story]
What gets me excited about Splunk? It’s disruptive! It’s impactful! In my 11 years of working with enterprise software, I have never seen any other software that invigorates, inspires, and resonates with customers, myself included. So I figured, why not share some of Splunk’s testimonials?
Splunk has more than 2000 employees worldwide, with our global headquarters in San Francisco. Our 10,000 customers in 100 countries are using Splunk software to improve service levels, reduce operations costs, mitigate security risks, enable compliance, enhance DevOps collaboration and create new product and service offerings.
Our products are designed to fit your needs and are built to be as frictionless to deploy as possible. Simply download Splunk software, point it at your data, and you’ll be up and running in minutes.
Please always refer to latest company data found here: http://www.splunk.com/company.
Data is growing and embodies new characteristics not found in traditional structured data: Volume, Velocity, Variety, Variability.
"Big data" is a term applied to these expanding data sets whose size is beyond the ability of commonly used software tools to capture, manage, and process the data within a tolerable elapsed time.
Machine data is one of the fastest-growing, most complex and most valuable segments of big data, and it embodies new characteristics not found in traditional structured data in terms of Volume, Velocity, Variety and Variability.
All the web servers, applications and network devices (all of the technology infrastructure running an enterprise or organization) generate massive streams of data: digital exhaust, so to speak. It comes in an array of unpredictable formats that are difficult to process and analyze by traditional methods or in a timely manner.
So why is this “machine data” valuable? Because it contains a trace - a categorical record - of user behavior, cyber-security risks, application behavior, service levels, fraudulent activity and customer experiences.
Splunk’s mission is to make YOUR machine data accessible, usable and valuable to everyone. It’s this overarching mission that drives our company and products that we deliver.
How has big data evolved over time? For a long time, ‘big data’ was simply a large database.
To handle large data, the database industry moved to many smaller databases. Horizontal partitioning (also known as sharding) is a database design principle whereby rows of a database table are held separately (for example, A -> D in one database, E -> H in a second database, etc.).
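As a rough sketch of the sharding idea in these notes, the snippet below routes a row to a database by the first letter of its key. The shard names (`db_1`, `db_2`, `db_3`) and the catch-all I -> Z range are assumptions made for illustration; only the A -> D and E -> H ranges come from the notes.

```python
# Hypothetical illustration of horizontal partitioning (sharding) by key range.
# Shard names are made up; only the A-D and E-H ranges come from the notes.
SHARD_RANGES = [
    ("A", "D", "db_1"),
    ("E", "H", "db_2"),
    ("I", "Z", "db_3"),  # assumed catch-all for the remaining letters
]


def shard_for(key: str) -> str:
    """Return the name of the database that holds rows for this key."""
    if not key:
        raise ValueError("key must be a non-empty string")
    first = key[0].upper()
    for low, high, shard in SHARD_RANGES:
        if low <= first <= high:
            return shard
    raise ValueError(f"no shard covers keys starting with {first!r}")


if __name__ == "__main__":
    for name in ["Alice", "Dave", "Erin", "Zed"]:
        print(name, "->", shard_for(name))
```

Each database holds only its own slice of the rows, so a lookup touches one small database instead of one very large one.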
Hadoop grew out of ideas Google published on MapReduce and the Google File System, and it was adopted as the de facto big data system. Hadoop is an open source project from Apache that has evolved rapidly into a major technology movement. It has emerged as a popular way to handle massive amounts of data, including structured and complex unstructured data. Its popularity is due in part to its ability to store and process large amounts of data effectively and cheaply across clusters of commodity hardware. Apache Hadoop is not actually a single product but a collection of several components. For the most part, Hadoop is a batch-oriented system.
** Teradata Aster Data & SQL on Hadoop are SQL interface systems that can talk to Hadoop
** Cassandra & HBase are NoSQL databases that can process data in real time using a key/value model.
Splunk = Temporal, Unstructured, Heterogeneous, real-time analytics platform.
Apart from relational databases, these technologies leverage a form of MapReduce, which is a programming model for processing and generating large data sets. We’ll dig deeper in a bit to see what truly differentiates Splunk.
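To make the MapReduce programming model concrete, here is a minimal single-process sketch that counts HTTP status codes in a few log lines. The log format and the function names are assumptions chosen for illustration; real systems run the map and reduce phases in parallel across many machines, which this toy does not attempt.

```python
from collections import defaultdict
from itertools import chain

# Made-up sample data: "<path> <status>" per line.
SAMPLE_LOG = [
    "/index.html 200",
    "/missing 404",
    "/index.html 200",
    "/login 500",
]


def map_phase(line):
    """Map: emit a (status_code, 1) pair for one log line."""
    parts = line.split()
    if len(parts) >= 2:
        yield parts[1], 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def map_reduce(lines):
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    print(map_reduce(SAMPLE_LOG))
```

The map step looks at one record at a time and the reduce step looks at one key at a time, which is what lets the work be spread over many nodes.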
Splunk can also enrich your machine data with several types of external data sources, including databases, Hadoop, and NoSQL data stores.
Splunk is able to do this because there’s no requirement to “understand” the data upfront – this is one of our key differentiators that we call “schema on the fly”.
Simply point Splunk at the data or deploy Splunk forwarders to stream data from remote systems. Splunk immediately starts collecting and indexing, so users can start searching and analyzing. No armies of consultants, backend databases or DBAs are needed to make it work. Once you’ve Splunked your data, it is time-stamped and easily searchable. Because we don’t have to do all the up-front work to be able to look at the data, we can load it all and make it all relevant. There’s no need to limit what you load and what you don’t.
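The "schema on the fly" idea can be sketched in a few lines: raw events are kept exactly as they arrived, and fields are pulled out only when a search runs. This is a toy illustration, not Splunk's implementation; the event format, the `key=value` regex, and the function names are all assumptions.

```python
import re

# Raw events are stored untouched; no schema is declared up front.
# The sample events and their format are made up for this sketch.
RAW_EVENTS = [
    "2016-01-05T10:00:01 action=login user=alice status=ok",
    "2016-01-05T10:00:07 action=purchase user=bob amount=19.99",
    "2016-01-05T10:00:09 action=login user=carol status=failed",
]

FIELD_PATTERN = re.compile(r"(\w+)=(\S+)")


def extract_fields(raw):
    """Extract key=value fields from one raw event at search time."""
    timestamp, _, rest = raw.partition(" ")
    fields = dict(FIELD_PATTERN.findall(rest))
    fields["_time"] = timestamp  # illustrative name for the event timestamp
    return fields


def search(events, **criteria):
    """Return the extracted fields of every event matching all criteria."""
    results = []
    for raw in events:
        fields = extract_fields(raw)
        if all(fields.get(k) == v for k, v in criteria.items()):
            results.append(fields)
    return results


if __name__ == "__main__":
    for hit in search(RAW_EVENTS, action="login", status="failed"):
        print(hit)
```

Note that the second event carries an `amount` field the others lack. Nothing had to be redefined to accept it, which is the point of not fixing a schema at load time.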
Getting data into Splunk is designed to be as flexible and easy as possible. In most cases you’ll find that no configuration is required; you just have to determine what data to collect and which method you want to use to get it into Splunk.
Splunk is THE universal machine data platform. It goes beyond log files, ingesting data from syslog, scripts, system events, APIs, and even wire data!
The result is beautifully indexed time-series events that were previously locked in disparate silos and can now be cross-correlated and made accessible to everyone in your organization.
Notice here that we are ingesting local files, data from syslogs, output from scripts and even wire data. Let’s see how the Splunk platform supports all this data collection.
Three major tiers and components of a distributed Splunk deployment:
Data Collection Layer -> The star of the show here is Splunk’s Universal Forwarder. This layer’s job is to collect and/or forward data to the Data Indexing Layer.
Data Indexing Layer -> Powered by Splunk Indexers. Indexers hold indexes, the logical containers in which data resides.
Data Presentation Layer -> Powered by Search Heads, which distribute searches to the indexing layer, aggregate the final results, and present them to the end user.
Viewing the data -> No special or custom client needed! Simply use your favorite browser and point to your Search Head.
Now, in small deployments, data indexing and searching will be done by the same Splunk instance.
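The division of labor between the presentation and indexing layers can be pictured as a scatter-gather pattern. The sketch below is only a conceptual model: the `Indexer` and `SearchHead` classes, their methods, and the sample events are hypothetical and are not Splunk APIs.

```python
from concurrent.futures import ThreadPoolExecutor


class Indexer:
    """Hypothetical indexer holding (timestamp, text) events."""

    def __init__(self, name, events):
        self.name = name
        self.events = list(events)

    def search(self, term):
        # Each indexer searches only the data it holds.
        return [(ts, text) for ts, text in self.events if term in text]


class SearchHead:
    """Hypothetical search head that fans a search out and merges results."""

    def __init__(self, indexers):
        self.indexers = list(indexers)

    def search(self, term):
        if not self.indexers:
            return []
        # Scatter: run the same search on every indexer in parallel.
        with ThreadPoolExecutor(max_workers=len(self.indexers)) as pool:
            partials = list(pool.map(lambda ix: ix.search(term), self.indexers))
        # Gather: merge the partial results, newest event first.
        merged = [hit for partial in partials for hit in partial]
        return sorted(merged, reverse=True)


if __name__ == "__main__":
    head = SearchHead([
        Indexer("idx1", [("2016-01-05T10:00:01", "ERROR disk full"),
                         ("2016-01-05T10:00:05", "INFO ok")]),
        Indexer("idx2", [("2016-01-05T10:00:03", "ERROR timeout")]),
    ])
    for ts, text in head.search("ERROR"):
        print(ts, text)
```

In a single-instance deployment the same process would play both roles, which matches the note above about small deployments.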
It only takes minutes to download and install Splunk on the platform of your choice, bringing you fast time to value. Once Splunk has been downloaded and installed, the next step is to get data into a Splunk instance. The data then becomes searchable from a single place! Since Splunk stores only a copy of the raw data, searches won’t affect the end devices the data comes from. Having a central place to search your data not only simplifies things, it also decreases risk, since a user doesn’t have to log into the end devices.
Splunk can be installed on a single small instance, such as a laptop, or installed on multiple servers to scale as needed. The ability to scale from a single desktop to an enterprise is another of our key differentiators. When installed on multiple servers the functions can be split up to meet any performance, security, or availability requirements.
Start up a brand new Splunk
Have a ready data set, typically use tutorial
Literally drag and drop.
Go back to components, what makes them up
Run two manual queries; paints a picture of what we can do.
Patterns
Create a data model (Use instant pivot)
Create output
Do something completely impressive. (create party on third party system, 3d graph, alert, something tangible outside of Splunk)
Highlight best Splunk 6 features: add data, patterns, instant pivot.
We’re headed to the East Coast!
2 inspired Keynotes – General Session and Security Keynote + Super Sessions with Splunk Leadership in Cloud, IT Ops, Security and Business Analytics!
165+ Breakout sessions addressing all areas and levels of Operational Intelligence – IT, Business Analytics, Mobile, Cloud, IoT, Security…and MORE!
30+ hours of invaluable networking time with industry thought leaders, technologists, and other Splunk Ninjas and Champions waiting to share their business wins with you!
Join the 50%+ of Fortune 100 companies who attended .conf2015 to get hands on with Splunk. You’ll be surrounded by thousands of other like-minded individuals who are ready to share exciting and cutting edge use cases and best practices. You can also deep dive on all things Splunk products together with your favorite Splunkers.
Head back to your company with both practical and inspired new uses for Splunk, ready to unlock the unimaginable power of your data! Arrive in Orlando a Splunk user, leave Orlando a Splunk Ninja!
REGISTRATION OPENS IN MARCH 2016 – STAY TUNED FOR NEWS ON OUR BEST REGISTRATION RATES – COMING SOON!