Mondrian is an open source OLAP engine that provides multi-dimensional analysis on top of normalized relational databases. It is bundled with other open source packages, such as JPivot for the UI layer. A schema file defines the logical multi-dimensional model: cubes, dimensions, hierarchies, measures and calculated members. Mondrian supports extensions through plug-ins and user-defined functions. Performance can be improved with aggregate tables, materialized views and query caching. Although it is an open source alternative to proprietary BI tools, it has some constraints around composite key joins and schema normalization.
4. Business Intelligence – Why?
Data is the biggest asset
Structured and Unstructured format
Most of our assets are buried
Helps us understand customer behavior
Helps us deliver better business value
Measure performance
7. Open Source BI – Introduction
Mondrian – OLAP Engine
Initially Independent Open Source Initiative
Now Part of Pentaho Open Source BI Suite
100% Pure Java
Supports MDX and XML/A
Bundled With Other Open Source Packages
8. Open Source BI – Tech Stack
JFreeChart WCF
log4j
JPivot
Mondrian
RDBMS
9. OLAP Engine – Mondrian
Cube Definition – schema.xml
MDX – Query language to access multi dimensional data
Operates on normalized relational database
10. Mondrian – schema.xml
Logical model of a multi dimensional database
Cube, VirtualCube
Dimensions, Hierarchies, Levels
Measure, CalculatedMember
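The elements listed above nest inside one schema file. As a rough sketch, the Java below parses a small hand-written schema with the JDK's DOM parser and lists the names found under a given tag. The schema content (cube, table and column names) and the helper class `SchemaOutline` are hypothetical and only meant to show how Cube, Dimension, Hierarchy, Level and Measure relate; this is not how Mondrian itself loads a schema.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class SchemaOutline {
    // Hypothetical minimal schema; table and column names are made up.
    static final String SAMPLE_SCHEMA =
        "<Schema name=\"Demo\">"
      + "<Cube name=\"Sales\">"
      + "<Table name=\"sales_fact\"/>"
      + "<Dimension name=\"Gender\" foreignKey=\"customer_id\">"
      + "<Hierarchy hasAll=\"true\" primaryKey=\"customer_id\">"
      + "<Table name=\"customer\"/>"
      + "<Level name=\"Gender\" column=\"gender\" uniqueMembers=\"true\"/>"
      + "</Hierarchy>"
      + "</Dimension>"
      + "<Measure name=\"Unit Sales\" column=\"unit_sales\" aggregator=\"sum\"/>"
      + "</Cube>"
      + "</Schema>";

    // Returns the "name" attribute of every element with the given tag,
    // in document order.
    static List<String> namesOf(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName(tag);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(((Element) nodes.item(i)).getAttribute("name"));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse schema XML", e);
        }
    }
}
```

For example, `SchemaOutline.namesOf(SchemaOutline.SAMPLE_SCHEMA, "Measure")` returns the measure names defined in the sample cube.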
17. Extensions – Views
<Cube name="Operations">
<View alias="StateCountyCity">
<SQL dialect="generic">
<![CDATA[
SELECT s.state_name, c.county_name, t.city_name, s.state_id, c.county_id, t.city_id
FROM state s
LEFT JOIN county c ON (c.state_id = s.state_id)
LEFT JOIN city t ON (c.county_id = t.county_id)
]]>
</SQL>
</View>
</Cube>
18. Extensions – User Defined Functions
Must implement mondrian.spi.UserDefinedFunction
Implementation must be available in classpath
UDF Definition in schema.xml
<Schema>
...
<UserDefinedFunction name="PlusOne" className="my.udf.PlusOne" />
</Schema>
MDX Usage
WITH MEMBER [Measures].[Unit Sales Plus One]
AS 'PlusOne([Measures].[Unit Sales])'
SELECT
{[Measures].[Unit Sales], [Measures].[Unit Sales Plus One]} ON COLUMNS,
{[Gender].MEMBERS} ON ROWS
FROM [Sales]
19. MDX / JDBC Parallels
Mondrian | JDBC
Connection – mondrian.olap.Connection | Connection – java.sql.Connection
Query – mondrian.olap.Query | Statement – java.sql.Statement
Result – mondrian.olap.Result | ResultSet – java.sql.ResultSet
Access Axis & Cell from Result | Access Rows & Columns from ResultSet
21. Performance & Scalability
Enable SQL statement logging to analyze Mondrian-generated SQL statements
Index on foreign/join keys
Use Aggregate Tables & Materialized Views
Cache query results in session
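One way to read the last bullet is a small per-session cache keyed by the MDX text, so a repeated query does not hit the engine again. The sketch below is an assumption about how that could look, not Mondrian's or JPivot's own caching: `QueryResultCache` is a hypothetical class, and the `executor` function stands in for whatever actually runs the query.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical least-recently-used cache for query results, one per session.
class QueryResultCache<R> {
    private final int capacity;
    private final Map<String, R> entries;
    private int misses = 0;

    QueryResultCache(int capacity) {
        this.capacity = capacity;
        // accessOrder = true keeps the least recently used entry first,
        // so removeEldestEntry evicts it once the cache is over capacity.
        this.entries = new LinkedHashMap<String, R>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, R> eldest) {
                return size() > QueryResultCache.this.capacity;
            }
        };
    }

    // Returns the cached result for this MDX string, running the query
    // through the supplied executor only on a cache miss.
    synchronized R get(String mdx, Function<String, R> executor) {
        R cached = entries.get(mdx);
        if (cached == null) {
            misses++;
            cached = executor.apply(mdx);
            entries.put(mdx, cached);
        }
        return cached;
    }

    // Number of times the executor actually had to run.
    synchronized int misses() {
        return misses;
    }
}
```

A cache like this trades memory for response time, and entries would need to be dropped whenever the underlying fact data changes.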
22. Constraints
Composite key joins are not supported
Uniqueness within a level is not based on id
Re-using the same table under a different alias has caused issues
Schema must be normalized to keep Mondrian happy
Requires dedicated Time dimension table
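Since a dedicated time dimension table is required, it usually has to be populated up front with one row per day. The following sketch generates such rows and renders them as SQL INSERT statements. The table name `time_by_day`, its columns and the class `TimeDimension` are hypothetical; adjust them to whatever the schema's time dimension actually references.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class TimeDimension {
    // One row of a hypothetical time dimension table.
    static final class Row {
        final int timeId;
        final LocalDate date;
        final int year;
        final String quarter;
        final int month;
        final int day;

        Row(int timeId, LocalDate date, int year, String quarter, int month, int day) {
            this.timeId = timeId;
            this.date = date;
            this.year = year;
            this.quarter = quarter;
            this.month = month;
            this.day = day;
        }
    }

    // Builds one row per calendar day in [start, endInclusive],
    // with a sequential surrogate key starting at 1.
    static List<Row> rows(LocalDate start, LocalDate endInclusive) {
        List<Row> out = new ArrayList<>();
        int id = 1;
        for (LocalDate d = start; !d.isAfter(endInclusive); d = d.plusDays(1)) {
            String quarter = "Q" + ((d.getMonthValue() - 1) / 3 + 1);
            out.add(new Row(id++, d, d.getYear(), quarter,
                            d.getMonthValue(), d.getDayOfMonth()));
        }
        return out;
    }

    // Renders a row as an INSERT statement for the assumed table layout.
    static String toInsert(Row r) {
        return String.format(Locale.ROOT,
            "INSERT INTO time_by_day (time_id, the_date, the_year, quarter, "
          + "month_of_year, day_of_month) VALUES (%d, '%s', %d, '%s', %d, %d);",
            r.timeId, r.date, r.year, r.quarter, r.month, r.day);
    }
}
```

The single-column `time_id` key also fits the constraint that composite key joins are not supported: the fact table joins to the time dimension on one column.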
23. Summary
100% Pure Java BI tool
Not too difficult to work with
Extensible for different front-end layers
Scalable
Viable alternative to proprietary tools
No vendor lock-in – Open Source
Less TCO
Quicker Time To Market