Tutorial olap4j

Brief introduction to olap4j library

Published in: Technology

  1. 1. OLAP4J - Introduction Wanna be better than your opponents? **Competitive advantage** by @borjaeg
  2. 2. KEY POINTS ● MY PROBLEM (IN DREAMS) ● BUSINESS INTELLIGENCE ● WHAT IS OLAP? ● WHAT IS OLAP4J? ● MAIN CLASSES/INTERFACES ● MDX ● SNIPPETS
  3. 3. MY PROBLEM ● Yeah! I own a record label with many artists from Spain and England. I want to get a hit this summer, but right now … I don't have enough money to bet on two different artists. Only one. I need information from data; I need to gain an edge over other record labels...
  4. 4. BUSINESS INTELLIGENCE ● Business Intelligence (BI) is the ability of an organization to collect, maintain, and organize data. ● Large amounts of information can help develop new opportunities. ● Identifying them can provide a competitive market advantage and long-term stability. ● Technology → OLAP, data mining, predictive analytics...
  5. 5. WHAT IS OLAP? ● In computer science (installing printers is not included), online analytical processing, or OLAP, is an approach to answering multi-dimensional analytical queries swiftly. ● OLAP is an important part of business intelligence. ● Main concepts (for me ^^) → Hypercube, dimensions, measures... – ROLAP – MOLAP – HOLAP – SOLAP – ...
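
To make "hypercube, dimensions, measures" a bit more concrete, here is a minimal sketch in plain Java (no olap4j involved). It assumes a toy cube with two dimensions (artist, summer) and one measure (sales amount); the class `TinyCube` and all of its data are invented for illustration and are not part of any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical toy cube: two dimensions (artist, summer) and one measure (amount).
final class TinyCube {
    /** One fact row: a member of each dimension plus the measure value. */
    static final class Sale {
        final String artist;
        final String summer;
        final double amount;

        Sale(String artist, String summer, double amount) {
            this.artist = artist;
            this.summer = summer;
            this.amount = amount;
        }
    }

    private final List<Sale> facts = new ArrayList<>();

    void add(String artist, String summer, double amount) {
        facts.add(new Sale(artist, summer, amount));
    }

    /** Rolls the measure up along the artist dimension (sums over all summers). */
    Map<String, Double> totalByArtist() {
        Map<String, Double> totals = new TreeMap<>();
        for (Sale s : facts) {
            totals.merge(s.artist, s.amount, Double::sum);
        }
        return totals;
    }

    /** Reads one cell: the measure at a single (artist, summer) coordinate. */
    double cell(String artist, String summer) {
        double sum = 0.0;
        for (Sale s : facts) {
            if (s.artist.equals(artist) && s.summer.equals(summer)) {
                sum += s.amount;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        TinyCube cube = new TinyCube();
        cube.add("Artist A", "2011", 100.0);
        cube.add("Artist A", "2012", 50.0);
        cube.add("Artist B", "2011", 70.0);
        System.out.println(cube.totalByArtist());
        System.out.println(cube.cell("Artist A", "2012"));
    }
}
```

A real OLAP server precomputes or indexes these aggregates instead of scanning every fact row; the sketch only shows what a cell and a roll-up mean.
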
  6. 6. WHAT IS OLAP4J? ● JAVA API ● PROVIDES OLAP ANALYSIS ● QUERY LANGUAGE – MDX STRING – MDX PARSE TREE – METADATA IS AT THE HEART OF olap4j – DIFFERENT CUBE SERVERS ARE SUPPORTED ● MSAS via XML/A ● MONDRIAN
  7. 7. MAIN CLASSES/INTERFACES ● Interfaces ...you should know them like the back of your hand. – OlapConnection – OlapWrapper – OlapStatement / PreparedOlapStatement – CellSet – CellSetAxis – Position – Cell
  8. 8. MAIN CLASSES / INTERFACES ● CellSet is the set of cells returned by the MDX query. ● CellSet is where the results are found. ● What's a cell? (Next slide) – Important API methods: getCell, getFilterAxis, getMetaData...
  9. 9. MAIN CLASSES / INTERFACES ● Cell is the structure that holds one part of the CellSet. Every cell contains a part of the result we are looking for with an MDX query. ● It could be said that cells are like rows in RDBs or documents in a document-oriented DB. This is just a very rough analogy. – Important API methods: drillThrough, getValue, isEmpty, isNull...
  10. 10. MAIN CLASSES / INTERFACES ● CellSetAxis is an axis of a CellSet (OK?!!) ● A cell set has the same number of axes as the MDX statement which was executed to produce it. ● Each axis is an ordered collection of members or tuples. ● Each member or tuple on an axis is called a Position. – Important API methods: getPositions, getAxisMetaData...
  11. 11. MAIN CLASSES / INTERFACES ● Position is a position on one of the CellSetAxis objects in a CellSet. ● An axis has a particular dimensionality, that is, a set of one or more dimensions which will appear on that axis, and every position on that axis will have a member of each of those dimensions. (Extracted from the official API) ● WTF?! → As The Beatles said “All you need are snippets” (more or less...) – Important API methods: getMembers, getOrdinal...
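
Since every cell is addressed by one position per axis, a cell can also be identified by a single number. The sketch below is plain Java, not olap4j code. It assumes the raster-scan convention that, as far as I recall, the olap4j Javadoc describes for cell ordinals (the first axis, COLUMNS, varies fastest). `CellOrdinals` and its methods are illustrative names; olap4j's `CellSet` offers its own conversions for this, which should be preferred in real code.

```java
// Illustrative helper, not part of olap4j: converts between per-axis
// coordinates and a single raster-scan ordinal (axis 0 varies fastest).
final class CellOrdinals {
    private CellOrdinals() {}

    /** coords[i] is the zero-based position index on axis i. */
    static int toOrdinal(int[] axisSizes, int[] coords) {
        if (axisSizes.length != coords.length) {
            throw new IllegalArgumentException("one coordinate per axis is required");
        }
        int ordinal = 0;
        int stride = 1; // number of cells spanned by one step along the current axis
        for (int axis = 0; axis < coords.length; axis++) {
            if (coords[axis] < 0 || coords[axis] >= axisSizes[axis]) {
                throw new IndexOutOfBoundsException("coordinate out of range on axis " + axis);
            }
            ordinal += coords[axis] * stride;
            stride *= axisSizes[axis];
        }
        return ordinal;
    }

    /** Inverse of toOrdinal for the same axis sizes. */
    static int[] toCoordinates(int[] axisSizes, int ordinal) {
        int[] coords = new int[axisSizes.length];
        for (int axis = 0; axis < axisSizes.length; axis++) {
            coords[axis] = ordinal % axisSizes[axis];
            ordinal /= axisSizes[axis];
        }
        return coords;
    }

    public static void main(String[] args) {
        int[] sizes = {10, 20}; // 10 columns, 20 rows
        System.out.println(toOrdinal(sizes, new int[] {0, 1})); // column 0, row 1
    }
}
```

With 10 columns and 20 rows, (column 0, row 1) maps to ordinal 10 and (column 9, row 19) maps to 199, the last cell.
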
  12. 12. MDX ● Multidimensional Expressions (MDX) is a query language for OLAP databases, similar to SQL in the RDB world. It is also a calculation language, with syntax similar to spreadsheet formulas. ● Example (the next slide explains this query)
  13. 13. MDX ● Time dimension ON COLUMNS ● Artists dimension ON ROWS ● Results are filtered by Total Sales. To solve my problem, I'm gonna visualize artists, the last three summers, and total sales.
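
The query shown on the original slide is not reproduced in this transcript, so here is a hedged reconstruction of its general shape. The cube name `[Music Sales]`, the member names and the years are all assumptions made up for illustration; only the layout (time on columns, artists on rows, Total Sales as the slicer) comes from the slide.

```java
// Illustrative only: assembles an MDX SELECT with the layout described above.
// Cube, dimension and member names are hypothetical.
final class MdxSketch {
    private MdxSketch() {}

    static String buildQuery(String cube, String columnsSet, String rowsSet, String slicer) {
        return "SELECT " + columnsSet + " ON COLUMNS, "
                + rowsSet + " ON ROWS "
                + "FROM " + cube + " "
                + "WHERE (" + slicer + ")";
    }

    public static void main(String[] args) {
        String mdx = buildQuery(
                "[Music Sales]",
                "{[Time].[2011].[Summer], [Time].[2012].[Summer], [Time].[2013].[Summer]}",
                "[Artists].Members",
                "[Measures].[Total Sales]");
        System.out.println(mdx);
    }
}
```

In MDX the `WHERE` clause is a slicer: it picks which measure (or member) fills the cells rather than filtering rows the way SQL's `WHERE` does.
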
  14. 14. SNIPPETS ● Environment where I'm working...and living. – MacBook Pro ● Core i5 (2.54 GHz) ● 16 GB RAM – PostgreSQL 9.2 – Eclipse Juno (code is copied from Sublime) ● Index – Getting a connection – Validating an MDX query – Executing an MDX query and printing results – Executing an MDX query and going through all results.
  15. 15. SNIPPET I ● Getting and closing a connection. I'm using some JUnit annotations (v4.11) and the Mondrian driver.
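
The code from this slide is not part of the transcript. As a rough sketch of the connection step, the class below builds a Mondrian connect string and opens it through plain JDBC. The driver class name and the `Jdbc=...;Catalog=...;` format are what I recall from the Mondrian documentation and should be checked against your version; `MondrianConnectSketch`, the database URL and the schema path are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Sketch only: class name, database URL and schema path are made up.
final class MondrianConnectSketch {
    private MondrianConnectSketch() {}

    /** Assembles a connect string of the form jdbc:mondrian:Jdbc=...;Catalog=file:...; */
    static String connectString(String jdbcUrl, String catalogPath) {
        return "jdbc:mondrian:Jdbc=" + jdbcUrl + ";Catalog=file:" + catalogPath + ";";
    }

    /** Needs the Mondrian and PostgreSQL jars on the classpath at run time. */
    static Connection open(String jdbcUrl, String catalogPath)
            throws SQLException, ClassNotFoundException {
        // Assumed driver class name; verify it for the Mondrian release in use.
        Class.forName("mondrian.olap4j.MondrianOlap4jDriver");
        return DriverManager.getConnection(connectString(jdbcUrl, catalogPath));
    }

    public static void main(String[] args) {
        System.out.println(connectString("jdbc:postgresql://localhost/music", "/tmp/schema.xml"));
    }
}
```

With olap4j on the classpath, the returned connection would typically be unwrapped with `connection.unwrap(OlapConnection.class)`. Opening it in a try-with-resources block, or closing it in a JUnit `@After` method, takes care of the "closing" half.
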
  16. 16. SNIPPET II ● Validating a query. This snippet also shows how to parse a query. When you parse a query, you obtain the query model (rows, columns, filters...)
  17. 17. SNIPPET III ● Executing an MDX query. – executeOlapQuery ● Printing results. – printQueryResult
  18. 18. SNIPPET IV ● Executing an MDX query and going through the results – It's important to test whether cells are empty. Operating on empty cells can throw exceptions.
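
The original code for this slide is missing from the transcript, so the following is only a sketch of the empty-cell guard it warns about. To keep it runnable without a cube server, it uses a hypothetical stand-in interface, `CellView`, that mirrors the two `Cell` methods named earlier (`isEmpty`, `getValue`). It assumes, as I understand the API, that an empty cell yields a null value.

```java
import java.util.Arrays;
import java.util.List;

// Sketch with a stand-in interface; CellView is NOT an olap4j type.
final class EmptyCellSketch {
    private EmptyCellSketch() {}

    /** Hypothetical mirror of the two Cell methods used below. */
    interface CellView {
        boolean isEmpty();
        Object getValue();
    }

    /** Test helper: a null value plays the role of an empty cell. */
    static CellView cellOf(final Object value) {
        return new CellView() {
            public boolean isEmpty() { return value == null; }
            public Object getValue() { return value; }
        };
    }

    /** Sums numeric cells, skipping empty ones instead of dereferencing null. */
    static double sumNonEmpty(List<CellView> cells) {
        double total = 0.0;
        for (CellView cell : cells) {
            if (cell.isEmpty()) {
                continue; // without this guard, the cast below would hit a null value
            }
            Object value = cell.getValue();
            if (value instanceof Number) {
                total += ((Number) value).doubleValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<CellView> cells = Arrays.asList(cellOf(10.0), cellOf(null), cellOf(5));
        System.out.println(sumNonEmpty(cells));
    }
}
```

Against a real `CellSet`, the same loop would walk the positions of the rows and columns axes, fetch each cell with `getCell`, and apply the same `isEmpty` check before using `getValue`.
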
  19. 19. Conclusion ● Business intelligence is a “huge world” full of different possibilities to get better information quality. This can also be a problem. Each solution has advantages and disadvantages. ● BI concepts aren't new. But nowadays most documents that talk about BI are quite hard to understand. ● My experience with olap4j has been really good. ● The documentation is quite good, but it could be improved with more examples.
  20. 20. Links recommended ● http://en.wikipedia.org/wiki/Business_intelligence ● http://en.wikipedia.org/wiki/Competitive_advantage ● http://en.wikipedia.org/wiki/Online_analytical_processing ● http://en.wikipedia.org/wiki/MultiDimensional_eXpressions ● http://mondrian.pentaho.com/ ● http://www.olap4j.org/
