Mondrian and OLAP Overview


Attached is an overview of Mondrian and OLAP, first presented at the RTP Pentaho User Group Q1 2012 Meetup.

Published in: Technology

  1. Mondrian and OLAP Frontends (RTP Pentaho User Group Q1 2012 Meetup)
  2. Pentaho Overview
      - BI Server (frontend for tools)
      - Report Designer (canned report designer)
      - Mondrian (Schema Workbench, Aggregate Designer)
      - Data Integrator
      - Other ad-hoc tools (reporting)
      - Weka (predictive analytics)
  8. Enterprise Extras
      - Analyzer
      - Interactive Reporting
      - Dashboard Designer
      - Data Integration Scheduler
      - Support
  13. OLAP?
      - On-line Analytical Processing
          - Designed for analytics, not transactions
      - ROLAP (Mondrian)
          - Relational OLAP
      - MOLAP (Palo)
          - Multidimensional OLAP
  14. ROLAP
      - Benefits
          - Data is stored in a Kimball-style star schema
              - Usable by all other tools (reporting, dashboards, etc.)
          - “Cube” is stored in memory
      - Cons
          - Slower queries while the “cube” is being cached
          - Performance depends on the backend database
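
To make the ROLAP idea concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table and column names (`fact_sales`, `dim_product`, `dim_date`) are hypothetical, and the dictionary stands in for the in-memory "cube" cache. It illustrates the general pattern (a star schema queried with SQL, with results held in memory), not Mondrian's actual implementation.

```python
import sqlite3

# Hypothetical Kimball-style star schema: one fact table, two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, family TEXT);
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER);
    CREATE TABLE fact_sales (product_id INTEGER, date_id INTEGER, units INTEGER);
    INSERT INTO dim_product VALUES (1, 'Food'), (2, 'Drink');
    INSERT INTO dim_date VALUES (1, 2011), (2, 2012);
    INSERT INTO fact_sales VALUES
        (1, 1, 10), (1, 1, 4), (1, 2, 5), (2, 1, 3), (2, 2, 7);
""")


def units_by_family_and_year(connection):
    """Run the kind of GROUP BY a ROLAP engine might send to the database."""
    rows = connection.execute("""
        SELECT p.family, d.year, SUM(f.units)
        FROM fact_sales f
        JOIN dim_product p ON p.product_id = f.product_id
        JOIN dim_date d ON d.date_id = f.date_id
        GROUP BY p.family, d.year
    """).fetchall()
    return {(family, year): units for family, year, units in rows}


# The first call pays the database cost; later lookups hit the in-memory dict.
cube_cache = units_by_family_and_year(conn)
print(cube_cache[("Food", 2011)])  # 14
```

Because the data stays in ordinary relational tables, any other SQL-speaking tool can read the same star schema, which is the benefit the slide calls out.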
  16. MOLAP
      - Benefits
          - Data stored in multidimensional format
          - Usually highly compressed
      - Cons
          - Potentially long processing times to handle permutations
          - Higher cardinality (dimensions with millions of records) increases processing
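
The cardinality point can be illustrated with some rough arithmetic. The dimension sizes below are made up, and real MOLAP engines store data sparsely and do not materialize every cell, so treat this as an upper-bound sketch of why processing grows with cardinality.

```python
from math import prod


def potential_cells(cardinalities):
    """Upper bound on cube cells: the product of the dimension cardinalities."""
    return prod(cardinalities)


def group_by_combinations(num_dimensions):
    """Number of dimension subsets that could be pre-aggregated."""
    return 2 ** num_dimensions


# Hypothetical cube: 12 months x 50 regions x N customers.
small = potential_cells([12, 50, 1_000])      # 600,000 potential cells
large = potential_cells([12, 50, 1_000_000])  # 600,000,000 potential cells

print(small, large, group_by_combinations(3))
```

Growing one dimension from a thousand to a million members multiplies the potential cell count by the same factor of a thousand.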
  19. Mondrian Development Life-cycle
  20. ROLAP Optimizations
      - Columnar data stores
          - Built for huge datasets in a conformed dimension format
          - Compress highly and scale well
          - Examples: LucidDB, Infobright, InfiniDB
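
As a rough illustration of why column-oriented storage compresses well, the sketch below pivots a few hypothetical rows into columns and run-length encodes them. Run-length encoding is just one common technique, chosen here for simplicity. This is not a description of how LucidDB, Infobright, or InfiniDB store data internally.

```python
from itertools import groupby

# Hypothetical fact rows: (product family, state, units).
rows = [
    ("Food", "NC", 10),
    ("Food", "NC", 4),
    ("Food", "VA", 5),
    ("Drink", "VA", 3),
    ("Drink", "VA", 7),
]


def to_columns(records):
    """Pivot row-oriented records into one list per column."""
    return [list(column) for column in zip(*records)]


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


family_col, state_col, units_col = to_columns(rows)
encoded_family = run_length_encode(family_col)  # [("Food", 3), ("Drink", 2)]
total_units = sum(units_col)  # an aggregate only has to scan one column
print(encoded_family, total_units)
```

Low-cardinality dimension columns contain long runs of repeated values, so they shrink considerably, and an aggregate query reads only the columns it needs.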
  23. Mondrian-Specific Optimizations
      - Aggregate Designer
          - Performs cost/benefit analysis on all permutations of data
          - Builds SQL queries that can be loaded into ETL or plugins (like with LucidDB) and run at set times
      - Cons
          - Have to refresh aggregate data as new data comes in; it will get stale otherwise!
          - Time to refresh depends on the data set
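
The staleness problem can be shown with a small sketch using Python's `sqlite3` module. The table names (`fact_sales`, `agg_sales_by_family`) and the refresh routine are hypothetical, and they do not follow Mondrian's own aggregate-table naming or recognition rules. The point is only that a pre-computed summary table goes out of date until it is rebuilt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fact_sales (family TEXT, year INTEGER, units INTEGER);
    INSERT INTO fact_sales VALUES
        ('Food', 2011, 10), ('Food', 2011, 4), ('Drink', 2012, 7);
    CREATE TABLE agg_sales_by_family (family TEXT PRIMARY KEY, units INTEGER);
""")


def refresh_aggregate(connection):
    """Rebuild the summary table from the fact table (a full refresh)."""
    with connection:  # commit both statements together
        connection.execute("DELETE FROM agg_sales_by_family")
        connection.execute("""
            INSERT INTO agg_sales_by_family
            SELECT family, SUM(units) FROM fact_sales GROUP BY family
        """)


def aggregate_units(connection, family):
    """Read the pre-computed total instead of scanning the fact table."""
    row = connection.execute(
        "SELECT units FROM agg_sales_by_family WHERE family = ?", (family,)
    ).fetchone()
    return row[0] if row else None


refresh_aggregate(conn)
before = aggregate_units(conn, "Food")  # 14

conn.execute("INSERT INTO fact_sales VALUES ('Food', 2012, 6)")
stale = aggregate_units(conn, "Food")   # still 14: the aggregate is stale

refresh_aggregate(conn)
fresh = aggregate_units(conn, "Food")   # 20 after the refresh
print(before, stale, fresh)
```

A full rebuild like this is the simplest approach. Its cost grows with the fact table, which is why refresh time depends on the data set.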
  26. MDX
      - MultiDimensional Expressions
      - “SQL” for OLAP
      - Open standard developed by Microsoft
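
For readers who have not seen MDX, the sketch below shows a small query written against names from Mondrian's FoodMart sample schema (a `Sales` cube with a `Unit Sales` measure). It is paired with a plain-Python function that computes the same shape of result over a few made-up fact rows. The Python part only illustrates what the query asks for. It is not how Mondrian evaluates MDX.

```python
# MDX: unit sales per product family, sliced to the year 1997.
# Cube, dimension, and measure names assume the FoodMart sample schema.
MDX_QUERY = """
SELECT
  {[Measures].[Unit Sales]} ON COLUMNS,
  {[Product].[Product Family].Members} ON ROWS
FROM [Sales]
WHERE ([Time].[1997])
"""

# Hypothetical fact rows: (product family, year, unit sales).
facts = [
    ("Food", 1997, 10),
    ("Food", 1997, 4),
    ("Drink", 1997, 3),
    ("Food", 1998, 8),
]


def unit_sales_by_family(rows, year):
    """Slice on one year (the WHERE clause), then total per family (the ROWS axis)."""
    totals = {}
    for family, row_year, units in rows:
        if row_year == year:
            totals[family] = totals.get(family, 0) + units
    return totals


result = unit_sales_by_family(facts, 1997)
print(result)  # {'Food': 14, 'Drink': 3}
```

The axes (`ON COLUMNS`, `ON ROWS`) and the slicer (`WHERE`) are what distinguish MDX from SQL: the query describes a grid of cells instead of a flat row set.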
  29. MDX Source: