Mondriaan update
Pentaho community meetup
 Amsterdam
September 2012

@julianhyde
Agenda
Mondrian 4 – beta
Other new stuff


(Yahoo)
Mondrian 4 – What's new?
Attributes


Measure groups


Physical schema


Internals
Richer semantic model
Physical schema:

    Only define attributes and relationships once

    Compound keys


Attribute hierarchies


Hierarchies & attributes grouped into dimensions

    E.g. Customers dimension contains Customer hierarchy
    (State-City-Customer) and Age, Gender, Salary attribute
    hierarchies
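A rough sketch of how the Customers example might look in a new-style schema. The table and column names (customer, state, city, full_name, and so on) are made up for illustration, and the element and attribute names follow the Mondrian 4 beta schema as best understood here, so they may differ in the final release.

```xml
<!-- Sketch only: table and column names are hypothetical. -->
<Schema name='Example' metamodelVersion='4.0'>
  <PhysicalSchema>
    <!-- Each table and its key is declared once and reused below. -->
    <Table name='customer'>
      <Key>
        <Column name='customer_id'/>
      </Key>
    </Table>
  </PhysicalSchema>

  <Dimension name='Customers' table='customer' key='Customer'>
    <Attributes>
      <Attribute name='State' keyColumn='state' hasHierarchy='false'/>
      <!-- Compound key: a city name is only unique within its state. -->
      <Attribute name='City' hasHierarchy='false'>
        <Key>
          <Column name='state'/>
          <Column name='city'/>
        </Key>
        <Name>
          <Column name='city'/>
        </Name>
      </Attribute>
      <Attribute name='Customer' keyColumn='customer_id'
                 nameColumn='full_name' hasHierarchy='false'/>
      <!-- These three are exposed as attribute hierarchies of their own. -->
      <Attribute name='Age' keyColumn='age'/>
      <Attribute name='Gender' keyColumn='gender'/>
      <Attribute name='Salary' keyColumn='salary'/>
    </Attributes>
    <Hierarchies>
      <!-- The multi-level hierarchy: State, then City, then Customer. -->
      <Hierarchy name='Customer'>
        <Level attribute='State'/>
        <Level attribute='City'/>
        <Level attribute='Customer'/>
      </Hierarchy>
    </Hierarchies>
  </Dimension>
</Schema>
```

Each attribute is defined once; the Customer hierarchy only refers to attributes by name, which is what lets the same attribute serve both as a level and as a standalone attribute hierarchy.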
Measure groups
In Mondrian 3.x, if you want a cube with multiple fact tables, you build a virtual cube:

  <Cube name="Sales">
    <Table name="sales_fact"/>
  </Cube>
  <Cube name="Warehouse">
    <Table name="warehouse_fact"/>
  </Cube>
  <VirtualCube name="Warehouse and Sales">
    <Cube name="Sales"/>
    <Cube name="Warehouse"/>
  </VirtualCube>
Measure groups (2)
In Mondrian 4, cubes can contain multiple measure groups

Virtual cubes are obsolete

  <Cube name="Warehouse and Sales">
    <MeasureGroups>
      <MeasureGroup name="Sales">
        <Table name="sales_fact"/>
        <Measure name="unit_sales"/>
      </MeasureGroup>
      <MeasureGroup name="Warehouse">
        <Table name="warehouse_fact"/>
        <Measure name="inventory_units"/>
      </MeasureGroup>
    </MeasureGroups>
  </Cube>
Many-to-many association between measure groups and dimensions

Different ways to link dimensions to fact tables

Aggregate tables are measure groups

                  Sales        Warehouse
    Time          X            X
    Product       X            X
    Customer      X
    Warehouse                  X
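One way the matrix above could be written down, as a sketch. It assumes the beta schema's DimensionLinks element with ForeignKeyLink and NoLink children, and a table attribute on MeasureGroup rather than the nested Table element shown on the earlier slide; the foreign key column names are hypothetical.

```xml
<!-- Sketch only: column names are hypothetical, syntax may change before release. -->
<Cube name='Warehouse and Sales'>
  <Dimensions>
    <Dimension source='Time'/>
    <Dimension source='Product'/>
    <Dimension source='Customer'/>
    <Dimension source='Warehouse'/>
  </Dimensions>
  <MeasureGroups>
    <MeasureGroup name='Sales' table='sales_fact'>
      <Measures>
        <Measure name='Unit Sales' column='unit_sales' aggregator='sum'/>
      </Measures>
      <DimensionLinks>
        <ForeignKeyLink dimension='Time' foreignKeyColumn='time_id'/>
        <ForeignKeyLink dimension='Product' foreignKeyColumn='product_id'/>
        <ForeignKeyLink dimension='Customer' foreignKeyColumn='customer_id'/>
        <!-- Sales has no X in the Warehouse row of the matrix. -->
        <NoLink dimension='Warehouse'/>
      </DimensionLinks>
    </MeasureGroup>
    <MeasureGroup name='Warehouse' table='warehouse_fact'>
      <Measures>
        <Measure name='Inventory Units' column='inventory_units' aggregator='sum'/>
      </Measures>
      <DimensionLinks>
        <ForeignKeyLink dimension='Time' foreignKeyColumn='time_id'/>
        <ForeignKeyLink dimension='Product' foreignKeyColumn='product_id'/>
        <!-- Warehouse has no X in the Customer row of the matrix. -->
        <NoLink dimension='Customer'/>
        <ForeignKeyLink dimension='Warehouse' foreignKeyColumn='warehouse_id'/>
      </DimensionLinks>
    </MeasureGroup>
  </MeasureGroups>
</Cube>
```

Each X in the matrix becomes a link element inside the measure group, and each blank cell becomes an explicit NoLink, so the association is stated per measure group instead of being implied by a virtual cube.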
Gone / Replacements

Gone                     Replacement
Mondrian 3 schema        Mondrian 4 schema; schema upgrader
Aggregate recognizer     Aggregate table API (define / enable / disable)
Schema workbench         Pentaho modeler?
XMLA server              olap4j-xmlaserver @github
Hierarchy syntax         SSAS-style syntax
  [Time.Weekly].[Day]      [Time].[Weekly].[Day]
  [Time].[Month]           [Time].[Time].[Month]
Done / Remaining

Done                          Remaining
The important things work!    Ragged hierarchies
Schema converter              Analyzer upgrade
2511 of 2770 tests pass       Aggregate table API
                              Complex schema mappings
Beta
1. Download from CI
http://ci.pentaho.com/view/Analysis/job/mondrian-git-4.0/


2. Run Mondrian-4 on your current schema

    Auto-upgrade

    Schema converter tool TBA

    MDX syntax differences
    mondrian.olap.SsasCompatibleNaming=true


3. Write a new-style schema


4. Log bugs!
Futures
“Mondrian in Action” book
Publish date: Spring 2013


Join the early-access program:
  http://www.manning.com/back/
Future features
Shelved aggregate tables


Connections

    Defined in schema

    Multiple connections

    Non-JDBC databases


Advanced SQL generation
Regular aggregate table
Shelved aggregate table
Aggregate table API – some ideas

    Define

    Enable

    Disable

    Specify beginning/end of valid range

    Kettle can tell Mondrian that aggregate table is
    no longer valid

    Kettle can ask Mondrian to tell it when it has
    finished using an aggregate table
Multiple connections in schema
  <Schema name='FoodMart'>
    <Connections>
      <Connection name='default' default='true' uuid='abcd-1234'>
        <Jdbc>jdbc:mysql://localhost/foodmart?characterEncoding=latin1</Jdbc>
        <JdbcUser>foodmart</JdbcUser>
        <JdbcPassword>foodmart</JdbcPassword>
      </Connection>
      <Connection name='aggs' default='false' uuid='abcd-2345'>
        <Jdbc>jdbc:mysql://localhost/foodmartAggs?characterEncoding=latin1</Jdbc>
        <JdbcUser>foodmartAggs</JdbcUser>
        <Properties>
          <Property name='prop1'>value1</Property>
          <Property name='prop2'>value2</Property>
        </Properties>
      </Connection>
    </Connections>
    <!-- rest of schema omitted -->
  </Schema>




    Cannot join tables from different connections

    Also: non-JDBC connection (via SPI or Optiq)
Advanced SQL generation

    Access control

    Killing big IN lists

    Push down aggregates (esp. time ranges)

    Need a new strategy... TBD
Summary
Mondrian 4 – A major improvement to Mondrian
 model & engine


As compatible as possible


Will enable further improvements in performance
 / flexibility in upcoming releases


Help us test it, and get it to production quality
 faster
Questions?
@julianhyde


jhyde@pentaho.com


http://julianhyde.blogspot.com


https://github.com/julianhyde/

Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...
 
Lazy beats Smart and Fast
Lazy beats Smart and FastLazy beats Smart and Fast
Lazy beats Smart and Fast
 
Don’t optimize my queries, optimize my data!
Don’t optimize my queries, optimize my data!Don’t optimize my queries, optimize my data!
Don’t optimize my queries, optimize my data!
 
Data profiling with Apache Calcite
Data profiling with Apache CalciteData profiling with Apache Calcite
Data profiling with Apache Calcite
 
Data Profiling in Apache Calcite
Data Profiling in Apache CalciteData Profiling in Apache Calcite
Data Profiling in Apache Calcite
 
Streaming SQL
Streaming SQLStreaming SQL
Streaming SQL
 
Streaming SQL (at FlinkForward, Berlin, 2016/09/12)
Streaming SQL (at FlinkForward, Berlin, 2016/09/12)Streaming SQL (at FlinkForward, Berlin, 2016/09/12)
Streaming SQL (at FlinkForward, Berlin, 2016/09/12)
 
Cost-based Query Optimization in Apache Phoenix using Apache Calcite
Cost-based Query Optimization in Apache Phoenix using Apache CalciteCost-based Query Optimization in Apache Phoenix using Apache Calcite
Cost-based Query Optimization in Apache Phoenix using Apache Calcite
 

Mondrian update (Pentaho community meetup 2012, Amsterdam)

  • 1. Mondriaan update Pentaho community meetup Amsterdam September 2012 @julianhyde
  • 2. Agenda Mondrian 4 – beta Other new stuff (Yahoo)
  • 3. Mondrian 4 – What's new? Attributes Measure groups Physical schema Internals
  • 4. Richer semantic model
       Physical schema:
         - Only define attributes and relationships once
         - Compound keys
       Attribute hierarchies
       Hierarchies & attributes grouped into dimensions
         - E.g. Customers dimension contains Customer hierarchy (State-City-Customer) and Age, Gender, Salary attribute hierarchies
  • 5. Measure groups
       In Mondrian 3.x, if you want a cube with multiple fact tables, you build a virtual cube:

       <Cube name="Sales">
         <Table name="sales_fact"/>
       </Cube>
       <Cube name="Warehouse">
         <Table name="warehouse_fact"/>
       </Cube>
       <VirtualCube name="Warehouse and Sales">
         <Cube name="Sales"/>
         <Cube name="Warehouse"/>
       </VirtualCube>
  • 6. Measure groups (2)
       In Mondrian 4, cubes can contain multiple measure groups
       Virtual cubes are obsolete
       Many-to-many association between measure groups and dimensions
       Different ways to link dimensions to fact tables
       Aggregate tables are measure groups

       <Cube name="Warehouse and Sales">
         <MeasureGroups>
           <MeasureGroup name="Sales">
             <Table name="sales_fact"/>
             <Measure name="unit_sales"/>
           </MeasureGroup>
           <MeasureGroup name="Warehouse">
             <Table name="warehouse_fact"/>
             <Measure name="inventory_units"/>
           </MeasureGroup>
         </MeasureGroups>
       </Cube>

       Dimension   Sales   Warehouse
       Time        X       X
       Product     X       X
       Customer    X
       Warehouse           X
  • 7.
  • 8. Gone / Replacements
       Gone                      Replacement
       Mondrian 3 schema         Mondrian 4 schema; schema upgrader
       Aggregate recognizer      Aggregate table API (define / enable / disable)
       Schema workbench          Pentaho modeler?
       XMLA server               olap4j-xmlaserver @github
       Hierarchy syntax          SSAS-style syntax
         [Time.Weekly].[Day]       [Time].[Weekly].[Day]
         [Time].[Month]            [Time].[Time].[Month]
  • 9. Done / Remaining
       Done:
         - The important things work!
         - 2511 of 2770 tests pass
       Remaining:
         - Ragged hierarchies
         - Schema converter
         - Analyzer upgrade
         - Aggregate table API
         - Complex schema mappings
  • 10. Beta
       1. Download from CI: http://ci.pentaho.com/view/Analysis/job/mondrian-git-4.0/
       2. Run Mondrian-4 on your current schema
          - Auto-upgrade
          - Schema converter tool TBA
          - MDX syntax differences: mondrian.olap.SsasCompatibleNaming=true
       3. Write a new-style schema
       4. Log bugs!
  • 12.
  • 13. “Mondrian in Action” book Publish date: Spring 2013 Join the early-access program: http://www.manning.com/back/
  • 14. Future features
       Shelved aggregate tables
       Connections
         - Defined in schema
         - Multiple connections
         - Non-JDBC databases
       Advanced SQL generation
  • 17. Aggregate table API – some ideas
         - Define
         - Enable
         - Disable
         - Specify beginning/end of valid range
         - Kettle can tell Mondrian that an aggregate table is no longer valid
         - Kettle can ask Mondrian to tell it when it has finished using an aggregate table
  • 18. Multiple connections in schema
       <Schema name='FoodMart'>
         <Connections>
           <Connection name='default' default='true' uuid='abcd-1234'>
             <Jdbc>jdbc:mysql://localhost/foodmart?characterEncoding=latin1</Jdbc>
             <JdbcUser>foodmart</JdbcUser>
             <JdbcPassword>foodmart</JdbcPassword>
           </Connection>
           <Connection name='aggs' default='false' uuid='abcd-2345'>
             <Jdbc>jdbc:mysql://localhost/foodmartAggs?characterEncoding=latin1</Jdbc>
             <JdbcUser>foodmartAggs</JdbcUser>
             <Properties>
               <Property name='prop1'>value1</Property>
               <Property name='prop2'>value2</Property>
             </Properties>
           </Connection>
         </Connections>
         ...
       </Schema>
         - Cannot join tables from different connections
         - Also: non-JDBC connection (via SPI or Optiq)
  • 19. Advanced SQL generation
         - Access control
         - Killing big IN lists
         - Push down aggregates (esp. time ranges)
         - Need a new strategy... TBD
  • 20. Summary
       Mondrian 4 – A major improvement to Mondrian model & engine
       As compatible as possible
       Will enable further improvements in performance / flexibility in upcoming releases
       Help us test it, and get it to production quality faster
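
To make slides 4 and 6 concrete, here is a rough sketch of what a new-style schema might look like, with attributes (including a compound key), a hierarchy built from those attributes, and two measure groups linked to one dimension in different ways. Element names such as PhysicalSchema, Attributes, DimensionLinks, ForeignKeyLink and NoLink follow the Mondrian 4 schema format as later released and may differ from the beta shown on the slides. The customer table and all column names (state, city, customer_id, gender) are illustrative assumptions, not taken from the deck.

```xml
<!-- Sketch only. Element names follow the released Mondrian 4 schema
     format and may not match the beta. The customer table and every
     column name are assumptions made for illustration. -->
<Schema name='FoodMart' metamodelVersion='4.0'>
  <!-- Physical schema: tables are declared once and reused below. -->
  <PhysicalSchema>
    <Table name='sales_fact'/>
    <Table name='warehouse_fact'/>
    <Table name='customer'/>
  </PhysicalSchema>

  <!-- A dimension groups attributes and the hierarchies built from them. -->
  <Dimension name='Customers' table='customer' key='Customer'>
    <Attributes>
      <Attribute name='State' keyColumn='state' hasHierarchy='false'/>
      <!-- Compound key: a city is only unique within its state. -->
      <Attribute name='City' hasHierarchy='false'>
        <Key>
          <Column name='state'/>
          <Column name='city'/>
        </Key>
        <Name>
          <Column name='city'/>
        </Name>
      </Attribute>
      <Attribute name='Customer' keyColumn='customer_id' hasHierarchy='false'/>
      <!-- No hasHierarchy='false', so Gender gets its own attribute hierarchy. -->
      <Attribute name='Gender' keyColumn='gender'/>
    </Attributes>
    <Hierarchies>
      <Hierarchy name='Customer'>
        <Level attribute='State'/>
        <Level attribute='City'/>
        <Level attribute='Customer'/>
      </Hierarchy>
    </Hierarchies>
  </Dimension>

  <Cube name='Warehouse and Sales'>
    <Dimensions>
      <Dimension source='Customers'/>
    </Dimensions>
    <MeasureGroups>
      <!-- Sales is linked to Customers through a foreign key. -->
      <MeasureGroup name='Sales' table='sales_fact'>
        <Measures>
          <Measure name='unit_sales' column='unit_sales' aggregator='sum'/>
        </Measures>
        <DimensionLinks>
          <ForeignKeyLink dimension='Customers' foreignKeyColumn='customer_id'/>
        </DimensionLinks>
      </MeasureGroup>
      <!-- Warehouse has no Customers link, matching the empty table cell. -->
      <MeasureGroup name='Warehouse' table='warehouse_fact'>
        <Measures>
          <Measure name='inventory_units' column='inventory_units' aggregator='sum'/>
        </Measures>
        <DimensionLinks>
          <NoLink dimension='Customers'/>
        </DimensionLinks>
      </MeasureGroup>
    </MeasureGroups>
  </Cube>
</Schema>
```

The explicit NoLink is what replaces the virtual-cube bookkeeping of Mondrian 3.x: each measure group states, per dimension, whether and how it joins.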