How to integrate Splunk with any data solution

Julian Hyde (Optiq) @julianhyde

http://github.com/julianhyde/optiq
http://github.com/julianhyde/optiq-splunk

Splunk Worldwide Users Conference 2012

Why are we here?
I'm going to explain how to use Splunk to access all of the data in your
enterprise, and how to let people in your enterprise use the data in Splunk.
This isn't easy. We'll be showing some raw technology: the new Optiq
project and its Splunk adapter.
But it's open source, so you can all get your hands on it. :)

About me
Database hacker
Open source hacker
Author of Mondrian (Pentaho Analysis)
Startup fiend

http://www.flickr.com/photos/torkildr/3462606643

http://www.flickr.com/photos/sylvar/31436961/

“Big Data”
Right data, right time
Diverse data sources / Performance / Suitable format

Example
Accessing Splunk data via SQL
Sqlline (a standard JDBC client)

How to do it (wrong)
Splunk runs: search
Optiq applies: filter (action = 'purchase')

SELECT "source", "product_id"
FROM "splunk"."splunk"
WHERE "action" = 'purchase'

How to do it (right)
Splunk runs: search action=purchase
Optiq applies: nothing (the filter was pushed down to Splunk)

SELECT "source", "product_id"
FROM "splunk"."splunk"
WHERE "action" = 'purchase'
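
The push-down shown on these two slides can be sketched as a small translation step. This is a minimal illustration in Python, not the Optiq adapter's code: the function name `to_splunk_search` and the dictionary input are hypothetical, and only simple equality filters are handled.

```python
def to_splunk_search(equalities):
    """Render {field: value} equality filters as a Splunk search string.

    Hypothetical helper for illustration; it is not part of Optiq or Splunk.
    """
    terms = []
    for field, value in equalities.items():
        text = str(value)
        # Quote values containing whitespace or quotes so they stay one term.
        if any(ch.isspace() for ch in text) or '"' in text:
            text = '"' + text.replace('"', '\\"') + '"'
        terms.append(f"{field}={text}")
    # With no filters this degenerates to a bare "search" (the "wrong" case).
    return " ".join(["search"] + terms)
```

Calling `to_splunk_search({"action": "purchase"})` yields the search string from the "right" slide, while an empty dictionary yields the bare `search` from the "wrong" slide, which leaves all the filtering to the caller.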

Example #2
Combining data from 2 sources (Splunk & MySQL)
Also possible: 3 or more sources; 3-way joins; unions

Expression tree

SELECT p."product_name", COUNT(*) AS c
FROM "splunk"."splunk" AS s
JOIN "mysql"."products" AS p
ON s."product_id" = p."product_id"
WHERE s."action" = 'purchase'
GROUP BY p."product_name"
ORDER BY c DESC

Splunk: scan (Table: splunk)
MySQL: scan (Table: products)
join (Key: product_id)
filter (Condition: action = 'purchase')
group (Key: product_name, Agg: count)
sort (Key: c DESC)
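
To make the order of operators concrete, here is a minimal Python sketch that runs the unoptimized plan (scan, join, filter, group, sort) over in-memory rows. The sample rows and the name `run_plan` are made up for illustration; a real plan would stream rows from Splunk and MySQL rather than hold them in lists.

```python
from collections import Counter

# Made-up sample rows, for illustration only.
splunk_events = [
    {"action": "purchase", "product_id": 1},
    {"action": "view", "product_id": 1},
    {"action": "purchase", "product_id": 2},
    {"action": "purchase", "product_id": 1},
]
mysql_products = [
    {"product_id": 1, "product_name": "widget"},
    {"product_id": 2, "product_name": "gadget"},
]


def run_plan(events, products):
    # join: index products by key, then look up each event (inner join).
    by_id = {p["product_id"]: p for p in products}
    joined = [
        {**e, **by_id[e["product_id"]]}
        for e in events
        if e["product_id"] in by_id
    ]
    # filter: keep only purchases (applied after the join, as in this plan).
    purchases = [r for r in joined if r["action"] == "purchase"]
    # group: count rows per product_name.
    counts = Counter(r["product_name"] for r in purchases)
    # sort: by count, descending.
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
```

Note that every event, including the non-purchase ones, crosses the join before the filter discards it.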

Expression tree (optimized)

SELECT p."product_name", COUNT(*) AS c
FROM "splunk"."splunk" AS s
JOIN "mysql"."products" AS p
ON s."product_id" = p."product_id"
WHERE s."action" = 'purchase'
GROUP BY p."product_name"
ORDER BY c DESC

Splunk: scan (Table: splunk), filter (Condition: action = 'purchase')
MySQL: scan (Table: products)
join (Key: product_id)
group (Key: product_name, Agg: count)
sort (Key: c DESC)
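
The rewrite from the first tree to the optimized one can be sketched as a single rule that moves a filter below a join. The Python below is an assumption-laden illustration, not Optiq's rule API: the `Scan`, `Join`, and `Filter` classes and the function `push_filter_below_join` are hypothetical, and the rule as written is only valid for inner joins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scan:
    source: str   # e.g. "splunk" or "mysql"
    table: str
    alias: str    # table alias used in the query, e.g. "s" or "p"


@dataclass(frozen=True)
class Join:
    left: object
    right: object
    key: str


@dataclass(frozen=True)
class Filter:
    input: object
    alias: str    # alias of the table the filtered column belongs to
    column: str
    value: str


def aliases(node):
    """Collect the table aliases visible beneath a node."""
    if isinstance(node, Scan):
        return {node.alias}
    if isinstance(node, Join):
        return aliases(node.left) | aliases(node.right)
    return aliases(node.input)


def push_filter_below_join(node):
    """Rewrite Filter(Join(l, r)) so the filter sits on the side it refers to."""
    if isinstance(node, Filter) and isinstance(node.input, Join):
        join = node.input
        if node.alias in aliases(join.left):
            pushed = Filter(join.left, node.alias, node.column, node.value)
            return Join(pushed, join.right, join.key)
        if node.alias in aliases(join.right):
            pushed = Filter(join.right, node.alias, node.column, node.value)
            return Join(join.left, pushed, join.key)
    return node  # rule does not apply; leave the tree unchanged
```

Once the filter sits directly on the Splunk scan, an adapter can fold it into the search it sends to Splunk, so fewer rows reach the join.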

http://www.flickr.com/photos/telstra-corp/5069403309/

Conventional database architecture
JDBC client
JDBC server
SQL parser / validator
Metadata
Query optimizer
Data-flow operators
Data, data

Optiq architecture
JDBC client
JDBC server
SQL parser / validator (optional)
Metadata SPI
Query optimizer (core)
Pluggable rules
3rd party ops (pluggable)
3rd party data

What is Optiq?
A really, really smart JDBC driver
Framework
Potential core of a data management system

Writing an adapter
Driver – if you want a vanity URL like “jdbc:splunk:”
Schema – describes what tables exist (Splunk has just one)
Table – what are the columns, and how to get the data. (Splunk's table has
any column you like... just ask for it.)
Operators (optional) – non-relational operations
Rules (optional, but recommended) – improve efficiency by changing the
question
Parser (optional) – to query via a language other than SQL
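
As a rough picture of the Schema and Table pieces, here is a minimal Python sketch. Optiq itself is a Java framework and these classes are not its SPI: `AnyColumnTable`, `SingleTableSchema`, and their methods are hypothetical names, and a list of dictionaries stands in for the remote source.

```python
class AnyColumnTable:
    """Hypothetical table whose columns are whatever the query asks for."""

    def __init__(self, name, events):
        self.name = name
        self._events = events  # list of dicts standing in for the remote source

    def scan(self, columns):
        # Project each event onto the requested columns; a field the event
        # does not carry comes back as None instead of raising an error.
        for event in self._events:
            yield tuple(event.get(c) for c in columns)


class SingleTableSchema:
    """Hypothetical schema that exposes exactly one table."""

    def __init__(self, table):
        self._table = table

    def table_names(self):
        return [self._table.name]

    def get_table(self, name):
        if name != self._table.name:
            raise KeyError(name)
        return self._table
```

A caller would ask the schema for its one table and then scan it with whatever column names the query mentions, for example `schema.get_table("splunk").scan(["source", "product_id"])`.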

Splunk Adapter
Rules for pushing down filters, projections
The tricky bit: changed the validator to allow tables to have any column
To be written: rules for pushing down aggregations, joins
(What you've seen today is on GitHub.)

Would be really nice if...
Splunk pushed down filters, projections, aggregations from its search
pipeline to the MySQL connector.
(Currently you have to hand-write a SQL statement.)

http://www.flickr.com/photos/walkercarpenter/4697637143/

Optiq roadmap ideas
Mondrian: use Optiq to read from data sources such as Splunk
Kettle integration (read/write SQL to ETL)
Adapters: Cascading, MongoDB, HBase, Apache Drill, …?
Front-ends: linq4j, Scala Slick, Java 8 streams
Contributions

Conclusions
Liberate your data!
Optiq is a framework
Build & share Optiq adapters

Questions?

@julianhyde
http://julianhyde.blogspot.com
http://github.com/julianhyde/optiq
http://github.com/julianhyde/optiq-splunk

Additional material: The following queries were used in the demo

select s."source", s."sourcetype"
from "splunk"."splunk" as s;

select * from "mysql"."products";

select s."source", s."sourcetype", s."action"
from "splunk"."splunk" as s
where s."action" = 'purchase';

select p."product_name", s."action"
from "splunk"."splunk" as s
join "mysql"."products" as p
on s."product_id" = p."product_id";

select s."source", s."sourcetype",
s."action" from
