Optiq: A dynamic data management framework

Optiq is a dynamic data management framework. In this talk, which I gave at the Pentaho Community Meetup in Sintra, Portugal, in October 2013, I describe how Optiq can be combined with Pentaho tools such as Mondrian to build high-performance analytics on Hadoop, NoSQL, and heterogeneous big data systems.

Optiq: A dynamic data management framework Presentation Transcript

  • 1. Optiq A dynamic data management framework @julianhyde Friday, October 4, 13
  • 2. 3 things
      1. Databases are good.
      2. It is hard to build analytics if your data is “all over the place.”
      3. Optiq makes heterogeneous data look and behave more like a database.
  • 3. Mondrian on Optiq on hybrid data + memory
  • 4. Big Data
  • 5. (No text on this slide.)
  • 6. “Data all over the place”
      • Different locations (HDFS, memory, DBMS)
      • Different formats
      • Different workloads
      • Transactional: Mainly write, targeted read
      • Analytic: Mainly read (bulk write)
      • Query latency: Interactive vs. batch
      • Data latency: Can we show out-of-date data?
  • 7. Databases are good
      • Central point to manage data
      • Simple, standard API for apps
      • Powerful modeling techniques (e.g. star schemas)
      • Data independence (i.e. tune your data after you write your application)
      • Query optimization
  • 8. Optiq
  • 9. Conventional DB architecture
  • 10. Optiq architecture
  • 11. Examples
  • 12. Example #1: CSV
      • Uses CSV adapter (optiq-csv)
      • Demo using sqlline
      • Easy to run this for yourself:
          $ git clone https://github.com/julianhyde/optiq-csv
          $ cd optiq-csv
          $ mvn install
          $ ./sqlline
  • 13. More adapters (table columns: Adapters, Embedded, Planned): CSV, Cascading (Lingual), HBase (Phoenix), JDBC, Apache Drill, Spark, MongoDB, Cassandra, Splunk, Mondrian, linq4j
  • 14. Example #2: Splunk + MySQL
          SELECT p.product_name, COUNT(*) AS c
          FROM splunk.splunk AS s
            JOIN mysql.products AS p
            ON s.product_id = p.product_id
          WHERE s.action = 'purchase'
          GROUP BY p.product_name
          ORDER BY c DESC
  • 15. Expression tree
  • 16. Optimized tree
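
The expression and optimized trees on slides 15 and 16 are images and do not survive in this transcript. As a rough illustration only, the Python sketch below evaluates the slide 14 query by hand over two in-memory stand-ins for the Splunk and MySQL sources. It shows one plausible plan shape (apply the `action` filter at the event source, then hash-join against the smaller products table); it is not Optiq's planner or API. The sample rows, the product names, and the function names `scan_events` and `purchases_by_product` are all invented for this example.

```python
from collections import Counter

# Hypothetical stand-ins for the two sources; all rows are invented for illustration.
SPLUNK_EVENTS = [
    {"product_id": 1, "action": "purchase"},
    {"product_id": 2, "action": "view"},
    {"product_id": 1, "action": "purchase"},
    {"product_id": 3, "action": "purchase"},
    {"product_id": 2, "action": "purchase"},
    {"product_id": 1, "action": "view"},
]
MYSQL_PRODUCTS = [
    {"product_id": 1, "product_name": "Widget"},
    {"product_id": 2, "product_name": "Gadget"},
    {"product_id": 3, "product_name": "Gizmo"},
]


def scan_events(action):
    """Scan the event source, applying the filter at the source (pushdown)."""
    return [e for e in SPLUNK_EVENTS if e["action"] == action]


def purchases_by_product():
    """Evaluate the slide 14 query: filter, join, group, order by count descending."""
    # Build a hash table on the smaller (products) side of the join.
    name_by_id = {p["product_id"]: p["product_name"] for p in MYSQL_PRODUCTS}
    counts = Counter()
    for event in scan_events("purchase"):
        name = name_by_id.get(event["product_id"])
        if name is not None:  # inner join: drop events with no matching product
            counts[name] += 1
    # ORDER BY c DESC; ties are broken by name so the result is deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


result = purchases_by_product()
print(result)
```

Filtering before the join means only purchase events cross from one system to the other, which is the kind of saving a federated planner looks for.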
  • 17. Analytics on heterogeneous data
  • 18. Simple analytics problem
      • 100M U.S. census records
      • 1KB each record, 100GB total
      • 4 SATA3 disks, total 1.2GB/s
      • How to count all records in under 5s?
  • 19. Simple analytics problem
      • 100M U.S. census records
      • 1KB each record, 100GB total
      • 4 SATA3 disks, total 1.2GB/s
      • How to count all records in under 5s?
  • 20. Simple analytics problem
      • 100M U.S. census records
      • 1KB each record, 100GB total
      • 4 SATA3 disks, total 1.2GB/s
      • How to count all records in under 5s?
      • Not possible?! It takes 80s just to read the data.
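
The arithmetic behind slide 20 can be checked in a few lines. This sketch assumes decimal units (1KB = 1,000 bytes, 1GB = 10^9 bytes), which the slides do not state; the variable names are chosen for this example.

```python
# Figures from slides 18-20; decimal units are an assumption.
RECORDS = 100_000_000          # 100M census records
RECORD_BYTES = 1_000           # 1KB per record
DISK_BYTES_PER_SEC = 1.2e9     # 4 SATA3 disks, 1.2GB/s in total
TARGET_SECONDS = 5             # desired response time

total_bytes = RECORDS * RECORD_BYTES                 # size of the full data set
scan_seconds = total_bytes / DISK_BYTES_PER_SEC      # time for one full sequential read
required_gb_per_sec = total_bytes / TARGET_SECONDS / 1e9  # bandwidth needed to hit the target
speedup_needed = scan_seconds / TARGET_SECONDS       # how far a raw scan is from the target

print(f"total size:      {total_bytes / 1e9:.0f} GB")
print(f"full scan time:  {scan_seconds:.1f} s")
print(f"needed for 5 s:  {required_gb_per_sec:.0f} GB/s ({speedup_needed:.1f}x faster)")
```

A full scan comes out at a little over 83 seconds, which the slide rounds to 80s. Meeting the 5-second target by brute force would need about 20GB/s of read bandwidth, so the answer has to come from reading less data rather than reading faster.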
  • 21. Solution: Cheat!
  • 22. Solution: Cheat!
      • Compress data
      • Column-oriented storage
      • Store data in sorted order
      • Put data in memory
      • Cache previous query results
      • Pre-compute (materialize) aggregates
  • 23. How Optiq helps you to cheat
  • 24. How Optiq helps you to cheat
      • Materialized views
      • Pre-defined aggregate tables
      • Cached query results = In-memory tables
      • Smart cache maintenance
      • Quickly bring materializations online & offline
      • Materializations over a subset of the data
      • Spark distributed, in-memory processing & cache
      • Application thinks it is talking to a single SQL database
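
To make the "pre-defined aggregate tables" idea from slide 24 concrete, here is a minimal sketch using Python's built-in `sqlite3` module rather than Optiq. The table names `census` and `census_by_state`, the columns, and the sample rows are hypothetical. The rewrite from the detail table to the aggregate table is done by hand here; the point of the slide is that the application should not have to do this itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical detail table standing in for the census records.
conn.execute("CREATE TABLE census (state TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO census VALUES (?, ?)",
    [("CA", 34), ("CA", 51), ("NY", 28), ("TX", 45), ("TX", 19), ("TX", 62)],
)

# Pre-compute (materialize) an aggregate: one row per state instead of one per person.
conn.execute(
    "CREATE TABLE census_by_state AS "
    "SELECT state, COUNT(*) AS c FROM census GROUP BY state"
)


def count_from_detail(conn):
    """Answer the query the slow way: scan every detail row."""
    return conn.execute("SELECT COUNT(*) FROM census").fetchone()[0]


def count_from_aggregate(conn):
    """Answer the same query from the much smaller aggregate table."""
    return conn.execute("SELECT SUM(c) FROM census_by_state").fetchone()[0]


detail_count = count_from_detail(conn)
aggregate_count = count_from_aggregate(conn)
aggregate_rows = conn.execute("SELECT COUNT(*) FROM census_by_state").fetchone()[0]
print(detail_count, aggregate_count, aggregate_rows)
```

Both functions return the same count, but the second reads one row per state. The aggregate has to be refreshed when the detail table changes, which is where the slide's "smart cache maintenance" comes in.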
  • 25. Mondrian on Optiq on hybrid data + memory
  • 26. Summary
  • 27. 3 things (reprise)
      1. Databases are good. Especially the flexibility that SQL gives us.
      2. It is hard to build analytics if your data is all over the place. Different workloads (operational vs. analytic, small write vs. bulk read) require different data structures.
      3. Optiq is not a database. But Optiq creates a federated data architecture that performs well, and looks like a database to your tools.
  • 28. @julianhyde
      • optiq: https://github.com/julianhyde/optiq
      • mondrian: http://mondrian.pentaho.com
      • blog: http://julianhyde.blogspot.com