Multidimensional Data Analysis with JRuby

2,428 views

Lightning talk at RailsConf 2011 about mondrian-olap gem

Published in: Technology, Business

  1. Multidimensional Data Analysis with JRuby. Raimonds Simanovskis, github.com/rsim, @rsim
  2. Relational data model
  3. SQL is good for detailed data queries
     Get all sales transactions in USA, California

         SELECT customers.fullname, products.product_name,
                sales.sales_date, sales.unit_sales, sales.store_sales
         FROM sales
           LEFT JOIN products ON sales.product_id = products.id
           LEFT JOIN customers ON sales.customer_id = customers.id
         WHERE customers.country = 'USA'
           AND customers.state_province = 'CA'
  4. SQL becomes complex for analytical queries
     Get total sales in USA, California in Q1, 2011 by main product groups

         SELECT product_class.product_family,
                SUM(sales.unit_sales) unit_sales_sum,
                SUM(sales.store_sales) store_sales_sum
         FROM sales
           LEFT JOIN product ON sales.product_id = product.product_id
           LEFT JOIN product_class
             ON product.product_class_id = product_class.product_class_id
           LEFT JOIN time_by_day ON sales.time_id = time_by_day.time_id
           LEFT JOIN customer ON sales.customer_id = customer.customer_id
         WHERE time_by_day.the_year = 2011
           AND time_by_day.quarter = 'Q1'
           AND customer.country = 'USA'
           AND customer.state_province = 'CA'
         GROUP BY product_class.product_family
  5. Maybe write a distributed map reduce function?
  6. Multidimensional Data Model: multidimensional cubes, dimensions, hierarchies and levels, measures
  7. OLAP technologies: On-Line Analytical Processing
  8. http://github.com/rsim/mondrian-olap
  9. MDX query language
     Get total units sold and sales amount in USA, California in Q1, 2011 by main product groups

         SELECT {[Measures].[Unit Sales], [Measures].[Store Sales]} ON COLUMNS,
                [Product].children ON ROWS
         FROM [Sales]
         WHERE ([Time].[2011].[Q1], [Customers].[USA].[CA])
  10. Or in Ruby like this
      Get total units sold and sales amount in USA, California in Q1, 2011 by main product groups

          olap.from('Sales').
            columns('[Measures].[Unit Sales]', '[Measures].[Store Sales]').
            rows('[Product].children').
            where('[Time].[2011].[Q1]', '[Customers].[USA].[CA]').
            execute
  11. Also more complex queries
      Get sales amount and profit % of top 50 products sold in USA and Canada during Q1, 2011

          olap.from('Sales').
            with_member('[Measures].[ProfitPct]').
              as('(Measures.[Store Sales] - Measures.[Store Cost]) / Measures.[Store Sales]',
                 :format_string => 'Percent').
            columns('[Measures].[Store Sales]', '[Measures].[ProfitPct]').
            rows('[Product].children').
              crossjoin('[Customers].[Canada]', '[Customers].[USA]').
              top_count(50, '[Measures].[Store Sales]').
            where('[Time].[2011].[Q1]').
            execute
  12. OLAP schema (mapping cube to tables)

          schema = Mondrian::OLAP::Schema.define do
            cube 'Sales' do
              table 'sales'
              dimension 'Gender', :foreign_key => 'customer_id' do
                hierarchy :has_all => true, :primary_key => 'customer_id' do
                  table 'customer'
                  level 'Gender', :column => 'gender', :unique_members => true
                end
              end
              dimension 'Time', :foreign_key => 'time_id' do
                hierarchy :has_all => false, :primary_key => 'time_id' do
                  table 'time_by_day'
                  level 'Year', :column => 'the_year', :type => 'Numeric', :unique_members => true
                  level 'Quarter', :column => 'quarter', :unique_members => false
                  level 'Month', :column => 'month_of_year', :type => 'Numeric', :unique_members => false
                end
              end
              measure 'Unit Sales', :column => 'unit_sales', :aggregator => 'sum'
              measure 'Store Sales', :column => 'store_sales', :aggregator => 'sum'
            end
          end
  13. mondrian-olap gem, eazybi.com
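
To make the dimensions-and-measures idea from the slides concrete, here is a minimal plain-Ruby sketch of what the "total sales in USA, California in Q1, 2011 by main product groups" query computes. It is not how Mondrian or the mondrian-olap gem works internally. The `Sale` struct, the `rollup` helper, and the sample rows are all hypothetical and invented for illustration. Fixed dimension members slice the facts, and the two measures are summed along the remaining product dimension.

```ruby
# Hypothetical fact rows: made-up sample values, not data from the talk.
Sale = Struct.new(:product_family, :country, :state, :year, :quarter,
                  :unit_sales, :store_sales)

SALES = [
  Sale.new("Food",  "USA",    "CA", 2011, "Q1", 10, 25.0),
  Sale.new("Drink", "USA",    "CA", 2011, "Q1",  4, 12.0),
  Sale.new("Food",  "USA",    "CA", 2011, "Q1",  6, 15.0),
  Sale.new("Food",  "USA",    "WA", 2011, "Q1",  3,  7.5),
  Sale.new("Drink", "Canada", "BC", 2011, "Q1",  5, 14.0),
  Sale.new("Food",  "USA",    "CA", 2011, "Q2",  8, 20.0)
].freeze

# Hypothetical helper: keep only the facts matching the fixed dimension
# members in `where`, then sum both measures per member of the `by` dimension.
def rollup(facts, by:, where: {})
  selected = facts.select do |fact|
    where.all? { |dimension, member| fact[dimension] == member }
  end

  selected.group_by { |fact| fact[by] }.transform_values do |rows|
    { unit_sales: rows.sum(&:unit_sales),
      store_sales: rows.sum(&:store_sales) }
  end
end

result = rollup(SALES,
                by: :product_family,
                where: { country: "USA", state: "CA", year: 2011, quarter: "Q1" })

result.each do |family, totals|
  puts "#{family}: units=#{totals[:unit_sales]} sales=#{totals[:store_sales]}"
end
```

The `where` hash plays the role of the MDX `WHERE` tuple and `by` plays the role of the rows axis. An OLAP engine such as Mondrian generates the SQL against the mapped tables instead of scanning rows in memory.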
