Mondrian - Geo Mondrian


Presentation about GeoMondrian features for trajectory data warehousing

Published in: Technology

  1. 1. a Trajectory Data Warehouse using Master Project Introduction. Simone Campora. Advisors: Laura Spinsanti, Jose Antonio Fernandes de Macedo, Stefano Spaccapietra
  2. 2. Choosing an OLAP Platform <ul><li>Which architecture should we use to develop a Trajectory Data Warehouse? </li></ul><ul><li>Several candidates: </li></ul><ul><ul><li>Oracle OLAP </li></ul></ul><ul><ul><li>Microsoft SQL Server BI Workbench </li></ul></ul><ul><ul><li>SAS® OLAP Server </li></ul></ul><ul><ul><li>Pentaho Mondrian </li></ul></ul><ul><li>Mondrian stands out from the crowd in several respects… </li></ul>
  3. 3. What is Mondrian?
  4. 4. Who is Mondrian? <ul><li>A step toward data warehouse integration </li></ul><ul><li>Mondrian is an OLAP server written in Java. </li></ul><ul><li>It lets users interactively analyze very large datasets stored in SQL databases without writing SQL. </li></ul>
  5. 5. <ul><li>It uses MDX as its query language </li></ul><ul><li>It also supports XML for Analysis (XMLA): </li></ul>
      <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
        <soap:Body>
          <Execute xmlns="urn:schemas-microsoft-com:xml-analysis">
            <Command>
              <Statement>SELECT Measures.MEMBERS ON COLUMNS FROM Sales</Statement>
            </Command>
            <Properties>
              <PropertyList>
                <DataSourceInfo/>
                <Catalog>FoodMart</Catalog>
                <Format>Multidimensional</Format>
                <AxisFormat>TupleFormat</AxisFormat>
              </PropertyList>
            </Properties>
          </Execute>
        </soap:Body>
      </soap:Envelope>
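As a rough illustration of the XMLA request above, the following Java sketch assembles the same kind of SOAP envelope from an MDX statement and a catalog name. The class and method names (`XmlaRequestSketch`, `buildExecuteEnvelope`, `escapeXml`) are invented for this example and are not part of Mondrian; the sketch only builds the request text and does not send it to any server.

```java
// Hypothetical helper, not part of Mondrian: it only builds the request text.
class XmlaRequestSketch {

    // Escape characters that cannot appear verbatim in XML text content.
    static String escapeXml(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    // Wrap an MDX statement and a catalog name in an XMLA Execute envelope.
    static String buildExecuteEnvelope(String mdx, String catalog) {
        return String.join("\n",
            "<soap:Envelope xmlns:soap=\"http://schemas.xmlsoap.org/soap/envelope/\">",
            "  <soap:Body>",
            "    <Execute xmlns=\"urn:schemas-microsoft-com:xml-analysis\">",
            "      <Command><Statement>" + escapeXml(mdx) + "</Statement></Command>",
            "      <Properties><PropertyList>",
            "        <DataSourceInfo/>",
            "        <Catalog>" + escapeXml(catalog) + "</Catalog>",
            "        <Format>Multidimensional</Format>",
            "        <AxisFormat>TupleFormat</AxisFormat>",
            "      </PropertyList></Properties>",
            "    </Execute>",
            "  </soap:Body>",
            "</soap:Envelope>");
    }

    public static void main(String[] args) {
        System.out.println(buildExecuteEnvelope(
            "SELECT Measures.MEMBERS ON COLUMNS FROM Sales", "FoodMart"));
    }
}
```

Escaping the statement matters because MDX filters often contain `<` or `&`, which would otherwise break the XML.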
  6. 6. More on MDX <ul><li>MDX stands for MultiDimensional eXpressions, a query language </li></ul><ul><li>De facto standard, introduced by Microsoft for SQL Server OLAP Services (now Analysis Services) </li></ul><ul><li>MDX is for OLAP data cubes what SQL is for relational databases </li></ul><ul><li>Looks like a SQL query but relies on a different model (close to the one used in spreadsheets) </li></ul>
      SELECT { [Measures].[Store Sales] } ON COLUMNS,
             { [Date].[2002], [Date].[2003] } ON ROWS
      FROM Sales
      WHERE ( [Store].[USA].[CA] )
  7. 7. XML Cube Definition <ul><li>Mondrian uses XML “Schemas” to define cubes, for example: </li></ul>
      <Cube name="Sales">
        <Table name="sales">
          <AggName name="agg_1">
            <AggFactCount column="row count"/>
            <AggMeasure name="[Measures].[Unit Sales]" column="sum units"/>
            <AggMeasure name="[Measures].[Min Units]" column="min units"/>
            <AggMeasure name="[Measures].[Max Units]" column="max units"/>
            <AggMeasure name="[Measures].[Dollar Sales]" column="sum dollars"/>
            <AggLevel name="[Time].[Year]" column="year"/>
            <AggLevel name="[Time].[Quarter]" column="quarter"/>
            <AggLevel name="[Product].[Mfrid]" column="mfrid"/>
            <AggLevel name="[Product].[Brand]" column="brand"/>
            <AggLevel name="[Product].[Prodid]" column="prodid"/>
          </AggName>
        </Table>
        <!-- Rest of the cube definition -->
      </Cube>
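To make the structure of such a schema fragment concrete, here is a small Java sketch that reads a cube definition with the JDK's built-in DOM parser and lists which aggregate-table column backs each measure. It is only an illustration of the XML layout shown above: the class `SchemaReaderSketch` and its methods are hypothetical, the sample schema is a shortened copy of the slide's fragment, and Mondrian's own schema loader works differently.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical reader, not Mondrian's schema loader.
class SchemaReaderSketch {

    // Shortened copy of the cube definition from the slide.
    static String sampleSchema() {
        return "<Cube name=\"Sales\">"
             + "<Table name=\"sales\">"
             + "<AggName name=\"agg_1\">"
             + "<AggFactCount column=\"row count\"/>"
             + "<AggMeasure name=\"[Measures].[Unit Sales]\" column=\"sum units\"/>"
             + "<AggMeasure name=\"[Measures].[Dollar Sales]\" column=\"sum dollars\"/>"
             + "<AggLevel name=\"[Time].[Year]\" column=\"year\"/>"
             + "</AggName>"
             + "</Table>"
             + "</Cube>";
    }

    // Return one "measure -> column" string per AggMeasure element.
    static List<String> aggMeasureColumns(String schemaXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(
                    schemaXml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("AggMeasure");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element measure = (Element) nodes.item(i);
                result.add(measure.getAttribute("name")
                    + " -> " + measure.getAttribute("column"));
            }
            return result;
        } catch (Exception e) {
            // Malformed XML or parser setup problems end up here.
            throw new IllegalArgumentException("Could not read schema", e);
        }
    }

    public static void main(String[] args) {
        for (String line : aggMeasureColumns(sampleSchema())) {
            System.out.println(line);
        }
    }
}
```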
  8. 8. It is the OLAP version of JDBC <ul><li>Mondrian's Java API is considered to be for OLAP what the JDBC API is for relational databases. </li></ul><ul><li>Using a similar Java syntax, it is possible to query the OLAP server from any Java application </li></ul>
      import mondrian.olap.*;
      import java.io.PrintWriter;

      // Open a connection: the JDBC data source plus the XML schema (catalog) file.
      Connection connection = DriverManager.getConnection(
          "Provider=mondrian;" +
          "Jdbc=jdbc:odbc:MondrianFoodMart;" +
          "Catalog=/WEB-INF/FoodMart.xml;",
          null,
          false);
      // Parse the MDX query: two measures on columns, product children on rows.
      Query query = connection.parseQuery(
          "SELECT {[Measures].[Unit Sales], [Measures].[Store Sales]} on columns," +
          " {[Product].children} on rows " +
          "FROM [Sales] " +
          "WHERE ([Time].[1997].[Q1], [Store].[CA].[San Francisco])");
      // Execute the query and print the result to standard output.
      Result result = connection.execute(query);
      result.print(new PrintWriter(System.out));
  9. 9. Mondrian is used for… <ul><li>"Dimensional" exploration of data </li></ul><ul><li>Translating Multidimensional Expressions (MDX) queries into Structured Query Language (SQL) to retrieve answers to dimensional queries </li></ul><ul><li>High-speed queries through the use of aggregate tables in the RDBMS </li></ul><ul><li>Advanced calculations using the calculation expressions of the MDX language </li></ul>
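The idea of turning a dimensional request into SQL can be sketched in a few lines of Java. This is a deliberately simplified toy, not Mondrian's SQL generator: it assumes a single denormalized fact table whose columns hold both the level values and the measure, and the names `SqlGenerationSketch` and `toSql` are invented for the example.

```java
import java.util.List;

// Toy translation of "measure by levels" into a GROUP BY query.
// Assumption: one denormalized fact table, no joins to dimension tables.
class SqlGenerationSketch {

    static String toSql(String factTable, String measureColumn, List<String> levelColumns) {
        // The requested levels become both the selected and the grouping columns.
        String levels = String.join(", ", levelColumns);
        return "SELECT " + levels + ", SUM(" + measureColumn + ") AS total"
             + " FROM " + factTable
             + " GROUP BY " + levels;
    }

    public static void main(String[] args) {
        System.out.println(
            toSql("sales", "unit_sales", java.util.Arrays.asList("year", "quarter")));
    }
}
```

A real engine also has to join dimension tables, apply the slicer as a `WHERE` clause, and decide whether an aggregate table can answer the query instead of the fact table.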
  10. 10. Key Features <ul><li>On-Line Analytical Processing (OLAP) cubes </li></ul><ul><ul><li>automated aggregation </li></ul></ul><ul><ul><li>speed-of-thought response times </li></ul></ul><ul><li>Open Architecture </li></ul><ul><ul><li>100% Java </li></ul></ul><ul><ul><li>J2EE </li></ul></ul><ul><ul><li>Supports any JDBC data source </li></ul></ul><ul><ul><li>MDX and XML/A (i.e. SOAP) </li></ul></ul><ul><li>Analysis Viewers </li></ul><ul><ul><li>Enables ad-hoc, interactive data exploration </li></ul></ul><ul><ul><li>Ability to slice-and-dice, drill-down, and pivot </li></ul></ul>
  11. 11. Mondrian’s Architecture
  12. 12. Architecture <ul><li>Database Provides </li></ul><ul><ul><li>Data storage </li></ul></ul><ul><ul><li>SQL query execution </li></ul></ul><ul><ul><li>Heavy-duty sorting, correlation, aggregation </li></ul></ul><ul><li>Mondrian Provides </li></ul><ul><ul><li>Dimensional view of data </li></ul></ul><ul><ul><li>MDX parsing </li></ul></ul><ul><ul><li>SQL generation </li></ul></ul><ul><ul><li>Caching </li></ul></ul><ul><ul><li>Higher-level calculations </li></ul></ul><ul><ul><li>Aggregate awareness </li></ul></ul>[Diagram: Mondrian, cube, RDBMS] RDBMS: Apache Derby, Firebird, hsqldb, IBM DB2, Infobright, Informix, Ingres, Interbase, LucidDB, Microsoft Access, Microsoft SQL Server, MySQL, Netezza, Oracle, PostgreSQL, Sybase, Teradata
  13. 13. Architecture <ul><li>Open Standards (Java, XML, MDX, XML/A, SQL) </li></ul><ul><li>Cross Platform (Windows and Unix/Linux) </li></ul><ul><li>J2EE Architecture </li></ul><ul><ul><li>Server Clustering </li></ul></ul><ul><ul><li>Fault Tolerance </li></ul></ul><ul><li>Data Sources </li></ul><ul><ul><li>JDBC </li></ul></ul><ul><ul><li>JNDI </li></ul></ul>[Diagram labels: Cube Schema XML; J2EE Application Server; Mondrian; Web Server; JDBC; RDBMS; cube; File or RDBMS Repository; Viewers; JPivot servlet; XML/A servlet; Microsoft Excel (via Spreadsheet Services)]
  14. 14. Strengths <ul><li>Database-independent applications </li></ul><ul><li>(it runs on a JVM) </li></ul><ul><li>Open source (Eclipse Public License) </li></ul><ul><li>Standards such as MDX, XMLA, JDBC </li></ul><ul><li>Significant installed base </li></ul><ul><ul><li>(DivX, iStockPhoto, Sun, Mozilla, MySQL…) </li></ul></ul><ul><li>Widely recognized within the open source community </li></ul>
  15. 15. RDBMS Design <ul><li>Mondrian does not store data on disk: </li></ul><ul><ul><li>it only reads data from the DBMS and copies it into its cache </li></ul></ul><ul><li>This limits Mondrian's performance when it is applied to a huge dataset. </li></ul><ul><li>The limitation can be mitigated by designing the database schema with aggregate tables… </li></ul>
  16. 16. Aggregation Tables <ul><li>This is the plain fact table </li></ul><ul><li>This is the aggregate table for a specific aggregation </li></ul>
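Since the slide's tables did not survive, the relationship between a fact table and one of its aggregate tables can be sketched as follows. The data and the names (`AggregateTableSketch`, `FactRow`, `sampleFacts`, `aggregateByYearQuarter`) are made up for illustration; the sketch rolls fact rows up to the (year, quarter) level and keeps a summed measure plus a fact count per group, which is the kind of information an aggregate table stores.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: in practice the aggregate table is built in the RDBMS.
class AggregateTableSketch {

    // One row of a hypothetical fact table.
    static final class FactRow {
        final int year;
        final String quarter;
        final String product;
        final int units;

        FactRow(int year, String quarter, String product, int units) {
            this.year = year;
            this.quarter = quarter;
            this.product = product;
            this.units = units;
        }
    }

    // Made-up sample data.
    static FactRow[] sampleFacts() {
        return new FactRow[] {
            new FactRow(1997, "Q1", "A", 10),
            new FactRow(1997, "Q1", "B", 5),
            new FactRow(1997, "Q2", "A", 7),
        };
    }

    // Roll the facts up to (year, quarter), dropping the product level.
    // Each value holds {sum of units, number of fact rows in the group}.
    static Map<String, int[]> aggregateByYearQuarter(FactRow[] facts) {
        Map<String, int[]> aggregate = new LinkedHashMap<>();
        for (FactRow fact : facts) {
            int[] cell = aggregate.computeIfAbsent(
                fact.year + "-" + fact.quarter, key -> new int[2]);
            cell[0] += fact.units; // summed measure
            cell[1] += 1;          // fact count
        }
        return aggregate;
    }

    public static void main(String[] args) {
        aggregateByYearQuarter(sampleFacts()).forEach((key, cell) ->
            System.out.println(key + ": units=" + cell[0] + ", fact_count=" + cell[1]));
    }
}
```

A query at the quarter level can then read three fact rows' worth of information from two pre-computed rows; on real data the reduction is far larger.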
  17. 17. Tools and Caching <ul><li>You don't need to do any processing to populate special data structures before you start running OLAP queries. </li></ul><ul><li>This makes Mondrian an excellent choice for 'real-time OLAP': running multidimensional queries on a database that is constantly changing. </li></ul><ul><li>There are specific APIs and tools that can be customized to handle aggregate table creation and cache updates. </li></ul>
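The interplay between on-demand loading and cache updates can be sketched with a small in-memory cache. This is a conceptual toy and not Mondrian's cache or its cache-control API: `CellCacheSketch`, its `loader` function (standing in for a SQL query against the RDBMS), and the prefix-based `flush` are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Conceptual toy: cells are loaded on first access and kept until flushed.
class CellCacheSketch {

    private final Map<String, Double> cells = new HashMap<>();
    // Stands in for a SQL query against the database; must not return null.
    private final Function<String, Double> loader;
    // Counts how often the loader (the "database") was actually hit.
    int loads = 0;

    CellCacheSketch(Function<String, Double> loader) {
        this.loader = loader;
    }

    // Return the cached value, loading it from the backing store on a miss.
    double get(String cellKey) {
        Double cached = cells.get(cellKey);
        if (cached == null) {
            cached = loader.apply(cellKey);
            cells.put(cellKey, cached);
            loads++;
        }
        return cached;
    }

    // Drop every cached cell whose key starts with the given prefix,
    // e.g. after the underlying rows for that region of the cube changed.
    void flush(String keyPrefix) {
        cells.keySet().removeIf(key -> key.startsWith(keyPrefix));
    }

    public static void main(String[] args) {
        CellCacheSketch cache = new CellCacheSketch(key -> 42.0);
        cache.get("1997/Q1");   // miss: goes to the loader
        cache.get("1997/Q1");   // hit: served from memory
        cache.flush("1997");    // data for 1997 changed
        cache.get("1997/Q1");   // miss again after the flush
        System.out.println("loads = " + cache.loads);
    }
}
```

No up-front processing is needed because nothing is loaded until a query asks for it; the price is that stale cells must be flushed when the database changes.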
  18. 18. Geo Mondrian
  19. 19. Geo Mondrian <ul><li>GeoMondrian is a "spatially-enabled" version of the Mondrian OLAP server. </li></ul><ul><li>GeoMondrian is an implementation of a Spatial OLAP (SOLAP) server; it is the first implementation of such a server so far. </li></ul><ul><li>It is not yet released, and is being developed by Thierry Badard and Etienne Dubé from Laval University, Canada. </li></ul>
  20. 20. Geo Mondrian… <ul><li>It adds a Geometry data type to Mondrian, enabling storage of vector geometries (points, lines, polygons) natively within the data cubes. </li></ul><ul><ul><ul><ul><li>Instead of fetching them from an external spatial DBMS, a web service, or a GIS file </li></ul></ul></ul></ul><ul><li>Spatial MDX functions add spatial analysis capabilities to analytical queries. </li></ul>
  21. 21. Example Query <ul><li>Example query: filter spatial dimension members based on distance from a feature </li></ul>
      SELECT { [Measures].[Population] } on columns,
             Filter(
               { [Unite geographique].[Region economique].members },
               ST_Distance(
                 [Unite geographique].CurrentMember.Properties("geom"),
                 [Unite geographique].[Province].[Ontario].Properties("geom")
               ) < 2.0
             ) on rows
      FROM [Recensements]
      WHERE [Temps].[Rencensement 2001 (2001-2003)].[2001]
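What the `Filter(..., ST_Distance(...) < 2.0)` expression does can be mimicked in plain Java. The sketch below is an analogy under simplifying assumptions, not GeoMondrian code: every member's geometry is reduced to a single point, distance is plain Euclidean distance rather than a distance between arbitrary geometries, and the names and coordinates (`SpatialFilterSketch`, `sampleRegions`, "Region A") are invented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Analogy for Filter(members, ST_Distance(member.geom, reference.geom) < d).
// Assumption: each geometry is a single point given as {x, y}.
class SpatialFilterSketch {

    // Euclidean distance between two points.
    static double distance(double[] a, double[] b) {
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }

    // Keep the members whose point lies strictly closer than maxDistance
    // to the reference point, preserving the input order.
    static List<String> membersWithin(Map<String, double[]> members,
                                      double[] reference, double maxDistance) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, double[]> member : members.entrySet()) {
            if (distance(member.getValue(), reference) < maxDistance) {
                kept.add(member.getKey());
            }
        }
        return kept;
    }

    // Made-up members with made-up coordinates.
    static Map<String, double[]> sampleRegions() {
        Map<String, double[]> regions = new LinkedHashMap<>();
        regions.put("Region A", new double[] {1.0, 1.0});
        regions.put("Region B", new double[] {0.0, 1.5});
        regions.put("Region C", new double[] {3.0, 0.0});
        return regions;
    }

    public static void main(String[] args) {
        System.out.println(
            membersWithin(sampleRegions(), new double[] {0.0, 0.0}, 2.0));
    }
}
```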
  22. 22. Features <ul><li>Geometry objects are handled using the JTS Topology Suite library, which follows the Open Geospatial Consortium (OGC) Simple Features standard </li></ul><ul><li>For the moment, only PostgreSQL with the PostGIS spatial extension is supported as a data source for Geometry values </li></ul>
  23. 23. Conclusions <ul><li>Open source solution for BI applications </li></ul><ul><li>Active communities developing both projects: Mondrian and GeoMondrian </li></ul><ul><li>Database-independent alternative for BI </li></ul>
  24. 24. Thanks for your attention <ul><li>Questions? </li></ul>