How to integrate Splunk with any data solution

A presentation Julian Hyde gave at the Splunk 2012 User Conference in Las Vegas on Tue 2012/9/11. Julian introduced a new technology called Optiq, described how it can be used to integrate data in Splunk with other systems, and demonstrated several queries that access data in Splunk via SQL and JDBC.

Published in: Technology

  1. Copyright © 2012 Splunk Inc.
  2. How to Integrate Splunk with any Data Solution. Julian Hyde (Optiq), @julianhyde, http://github.com/julianhyde/optiq, http://github.com/julianhyde/optiq-splunk. Splunk Worldwide Users Conference 2012
  3. Why are we here? I'm going to explain how to use Splunk to access all of the data in your enterprise. And also to let people in your enterprise use data in Splunk. This isn't easy. We'll be showing some raw technology – the new Optiq project and its Splunk adapter. But it's open source, so you can all get your hands on it. :)
  4. About me: Database hacker; Open source hacker; Author of Mondrian (Pentaho Analysis); Startup fiend
  5. http://www.flickr.com/photos/torkildr/3462606643
  6. http://www.flickr.com/photos/sylvar/31436961/
  7. “Big Data”: Right data, right time. Diverse data sources / Performance / Suitable format
  8. Example: Accessing Splunk data via SQL. Sqlline (a standard JDBC client)
  9. How to do it (wrong): Splunk runs "search" with no condition; Optiq applies the filter action = 'purchase'. SELECT "source", "product_id" FROM "splunk"."splunk" WHERE "action" = 'purchase'
  10. How to do it (right): Splunk runs "search action=purchase"; Optiq has no filter left to apply. SELECT "source", "product_id" FROM "splunk"."splunk" WHERE "action" = 'purchase'
  11. Example #2: Combining data from 2 sources (Splunk & MySQL). Also possible: 3 or more sources; 3-way joins; unions
  12. Expression tree: SELECT p."product_name", COUNT(*) AS c FROM "splunk"."splunk" AS s JOIN "mysql"."products" AS p ON s."product_id" = p."product_id" WHERE s."action" = 'purchase' GROUP BY p."product_name" ORDER BY c DESC. Plan: scan (Splunk, table: splunk) and scan (MySQL, table: products) -> join (key: product_id) -> filter (condition: action = 'purchase') -> group (key: product_name; agg: count) -> sort (key: c DESC)
  13. Expression tree (optimized): SELECT p."product_name", COUNT(*) AS c FROM "splunk"."splunk" AS s JOIN "mysql"."products" AS p ON s."product_id" = p."product_id" WHERE s."action" = 'purchase' GROUP BY p."product_name" ORDER BY c DESC. Plan: scan (Splunk, table: splunk) -> filter (condition: action = 'purchase') -> join (key: product_id) with scan (MySQL, table: products) -> group (key: product_name; agg: count) -> sort (key: c DESC)
  14. Optiq is not a database.
  15. http://www.flickr.com/photos/torkildr/3462606643
  16. http://www.flickr.com/photos/telstra-corp/5069403309/
  17. Conventional database architecture: JDBC client; JDBC server; SQL parser / validator; metadata; query optimizer; data-flow operators; data
  18. Optiq architecture: JDBC client; JDBC server; optional SQL parser / validator; metadata SPI; core query optimizer; pluggable rules; pluggable 3rd party ops; 3rd party data
  19. What is Optiq? A really, really smart JDBC driver; Framework; Potential core of a data management system
  20. Writing an adapter: Driver – if you want a vanity URL like "jdbc:splunk:". Schema – describes what tables exist (Splunk has just one). Table – what are the columns, and how to get the data. (Splunk's table has any column you like... just ask for it.) Operators (optional) – non-relational operations. Rules (optional, but recommended) – improve efficiency by changing the question. Parser (optional) – to query via a language other than SQL
  21. Splunk adapter: Rules for pushing down filters, projections. The tricky bit: changed the validator to allow tables to have any column. To be written: rules for pushing down aggregations, joins. (What you've seen today is in github.) Would be really nice if... Splunk pushed down filters, projections, aggregations from its search pipeline to the MySQL connector. (Currently you have to hand-write a SQL statement.)
  22. http://www.flickr.com/photos/walkercarpenter/4697637143/
  23. Optiq roadmap ideas: Mondrian to use Optiq to read from data sources such as Splunk; Kettle integration (read/write SQL to ETL); Adapters: Cascading, MongoDB, HBase, Apache Drill, …?; Front-ends: linq4j, Scala SLICK, Java 8 streams; Contributions
  24. Conclusions: Liberate your data! Optiq is a framework. Build & share Optiq adapters
  25. Questions? @julianhyde, http://julianhyde.blogspot.com, http://github.com/julianhyde/optiq, http://github.com/julianhyde/optiq-splunk
  26. Additional material: The following queries were used in the demo. select s."source", s."sourcetype" from "splunk"."splunk" as s; select * from "mysql"."products"; select s."source", s."sourcetype", s."action" from "splunk"."splunk" as s where s."action" = 'purchase'; select p."product_name", s."action" from "splunk"."splunk" as s join "mysql"."products" as p on s."product_id" = p."product_id"; select s."source", s."sourcetype", s."action" from
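
The pushdown contrasted in slides 9 and 10 can be sketched in a few lines of Java. This is a minimal illustration and not Optiq's rule API: the class SplunkPushDownSketch and the method toSplunkSearch are hypothetical names, and carrying the projection in Splunk's fields command is an assumption about how the adapter might express it. The sketch handles only equality filters on simple values (no spaces or quoting).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: NOT Optiq's API. It shows the idea of folding a SQL
// filter and projection into the search string that is sent to Splunk.
class SplunkPushDownSketch {

    /** Builds a Splunk search string from equality filters and projected fields. */
    static String toSplunkSearch(Map<String, String> equalityFilters, List<String> fields) {
        StringBuilder search = new StringBuilder("search");
        if (equalityFilters.isEmpty()) {
            search.append(" *"); // no condition: match every event
        }
        for (Map.Entry<String, String> filter : equalityFilters.entrySet()) {
            // WHERE "action" = 'purchase' becomes the search term action=purchase
            search.append(' ').append(filter.getKey()).append('=').append(filter.getValue());
        }
        if (!fields.isEmpty()) {
            // SELECT "source", "product_id" becomes a trailing fields command (assumption)
            search.append(" | fields ").append(String.join(", ", fields));
        }
        return search.toString();
    }

    public static void main(String[] args) {
        Map<String, String> filters = new LinkedHashMap<>();
        filters.put("action", "purchase");
        List<String> fields = new ArrayList<>();
        fields.add("source");
        fields.add("product_id");
        // prints: search action=purchase | fields source, product_id
        System.out.println(toSplunkSearch(filters, fields));
    }
}
```

With the condition in the search string, Splunk returns only the matching events, which is the difference between the "wrong" and "right" plans in the slides.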

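
The Schema and Table roles listed in slide 20 can be illustrated with a small Java sketch. The interfaces below are hypothetical stand-ins and are not Optiq's real SPI; the names AdapterSketch, AnyColumnTable, and singleTableSchema are invented for this example. The point is the shape of the adapter: one schema, one table, and a table that accepts any column name and returns null where an event lacks that field.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the Schema and Table roles from slide 20.
// These interfaces are illustrative and are NOT Optiq's real SPI.
class AdapterSketch {

    /** Describes what tables exist. */
    interface Schema {
        Map<String, Table> tables();
    }

    /** Says which columns exist and how to get the data. */
    interface Table {
        boolean hasColumn(String name);
        List<Object[]> scan(List<String> columns);
    }

    /** A Splunk-like table: it accepts any column name you ask for. */
    static class AnyColumnTable implements Table {
        private final List<Map<String, String>> events;

        AnyColumnTable(List<Map<String, String>> events) {
            this.events = events;
        }

        @Override public boolean hasColumn(String name) {
            return true; // every column is valid; absent fields read as null
        }

        @Override public List<Object[]> scan(List<String> columns) {
            List<Object[]> rows = new ArrayList<>();
            for (Map<String, String> event : events) {
                Object[] row = new Object[columns.size()];
                for (int i = 0; i < row.length; i++) {
                    row[i] = event.get(columns.get(i));
                }
                rows.add(row);
            }
            return rows;
        }
    }

    /** A schema with exactly one table, as the slide says of Splunk. */
    static Schema singleTableSchema(String name, Table table) {
        return () -> Collections.singletonMap(name, table);
    }

    public static void main(String[] args) {
        Map<String, String> event = new HashMap<>();
        event.put("source", "web.log");
        event.put("action", "purchase");
        Table splunk = new AnyColumnTable(Collections.singletonList(event));
        Schema schema = singleTableSchema("splunk", splunk);

        List<String> columns = new ArrayList<>();
        columns.add("action");
        columns.add("product_id"); // not present in the sample event
        for (Object[] row : schema.tables().get("splunk").scan(columns)) {
            System.out.println(row[0] + ", " + row[1]); // prints: purchase, null
        }
    }
}
```

A real adapter would also supply the optional pieces from the slide (driver, operators, rules, parser); they are left out here to keep the sketch short.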