RailsWayCon: Multidimensional Data Analysis with JRuby

Presentation at RailsWayCon 2011 conference

  1. Raimonds Simanovskis: Multidimensional Data Analysis with JRuby
  2. Raimonds Simanovskis github.com/rsim @rsim .com
  3. Relational data model
  4. SQL is good for detailed data queries. Get all sales transactions in USA, California:
      SELECT customers.fullname, products.product_name,
             sales.sales_date, sales.unit_sales, sales.store_sales
      FROM sales
        LEFT JOIN products ON sales.product_id = products.id
        LEFT JOIN customers ON sales.customer_id = customers.id
      WHERE customers.country = 'USA' AND customers.state_province = 'CA'
  5. SQL becomes complex for analytical queries. Get total sales in USA, California in Q1, 2011 by main product groups:
      SELECT product_class.product_family,
             SUM(sales.unit_sales) unit_sales_sum,
             SUM(sales.store_sales) store_sales_sum
      FROM sales
        LEFT JOIN product ON sales.product_id = product.product_id
        LEFT JOIN product_class ON product.product_class_id = product_class.product_class_id
        LEFT JOIN time_by_day ON sales.time_id = time_by_day.time_id
        LEFT JOIN customer ON sales.customer_id = customer.customer_id
      WHERE time_by_day.the_year = 2011 AND time_by_day.quarter = 'Q1'
        AND customer.country = 'USA' AND customer.state_province = 'CA'
      GROUP BY product_class.product_family
  6. If SQL is not good then we need NoSQL!
  7. Maybe write distributed map reduce function? http://browsertoolkit.com/fault-tolerance.png
  8. Multidimensional Data Model: multidimensional cubes, dimensions, hierarchies and levels, measures
  9. OLAP technologies: On-Line Analytical Processing
  10. Commercial Vendors: Oracle Essbase, SAP BusinessObjects, Oracle OLAP, Cognos, Analysis Services
  11. MDX query language. Get total units sold and sales amount in USA, California in Q1, 2011 by main product groups:
      SELECT {[Measures].[Unit Sales], [Measures].[Store Sales]} ON COLUMNS,
             [Product].children ON ROWS
      FROM [Sales]
      WHERE ( [Time].[2011].[Q1], [Customers].[USA].[CA] )
  12. http://github.com/rsim/mondrian-olap
  13. (R)OLAP schema. Dimensional model: cubes; dimensions (hierarchies & levels); measures, calculated measures. Mapping to relational model: fact tables, dimension tables, joined by foreign keys
  14. OLAP schema definition:
      schema = Mondrian::OLAP::Schema.define do
        cube 'Sales' do
          table 'sales'
          dimension 'Gender', :foreign_key => 'customer_id' do
            hierarchy :has_all => true, :primary_key => 'customer_id' do
              table 'customer'
              level 'Gender', :column => 'gender', :unique_members => true
            end
          end
          dimension 'Time', :foreign_key => 'time_id' do
            hierarchy :has_all => false, :primary_key => 'time_id' do
              table 'time_by_day'
              level 'Year', :column => 'the_year', :type => 'Numeric', :unique_members => true
              level 'Quarter', :column => 'quarter', :unique_members => false
              level 'Month', :column => 'month_of_year', :type => 'Numeric', :unique_members => false
            end
          end
          measure 'Unit Sales', :column => 'unit_sales', :aggregator => 'sum'
          measure 'Store Sales', :column => 'store_sales', :aggregator => 'sum'
        end
      end
  15. Query Builder in Ruby. Get total units sold and sales amount in USA, California in Q1, 2011 by main product groups:
      olap.from('Sales').
        columns('[Measures].[Unit Sales]', '[Measures].[Store Sales]').
        rows('[Product].children').
        where('[Time].[2011].[Q1]', '[Customers].[USA].[CA]').
        execute
  16. Also more complex queries. Get sales amount and profit % of top 50 products sold in USA and Canada during Q1, 2011:
      olap.from('Sales').
        with_member('[Measures].[ProfitPct]').
          as('(Measures.[Store Sales] - Measures.[Store Cost]) / Measures.[Store Sales]',
             :format_string => 'Percent').
        columns('[Measures].[Store Sales]', '[Measures].[ProfitPct]').
        rows('[Product].children').crossjoin('[Customers].[Canada]', '[Customers].[USA]').
          top_count(50, '[Measures].[Store Sales]').
        where('[Time].[2011].[Q1]').
        execute
  17. Demo
  18. Used in eazybi.com
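
The multidimensional model from slide 8 (dimensions, measures, slicing) can be tried out without any OLAP engine. The following is a minimal plain-Ruby sketch, not how Mondrian or mondrian-olap work internally. The sample fact rows, their values, and the `slice_and_aggregate` helper are all invented for illustration.

```ruby
# Toy fact rows: dimension members (family, country, state, year, quarter)
# plus measures (unit_sales, store_sales). All values are invented.
facts = [
  { family: "Food",  country: "USA",    state: "CA", year: 2011, quarter: "Q1", unit_sales: 3, store_sales: 7.5 },
  { family: "Drink", country: "USA",    state: "CA", year: 2011, quarter: "Q1", unit_sales: 2, store_sales: 4.0 },
  { family: "Food",  country: "USA",    state: "CA", year: 2011, quarter: "Q1", unit_sales: 1, store_sales: 2.5 },
  { family: "Food",  country: "USA",    state: "WA", year: 2011, quarter: "Q1", unit_sales: 5, store_sales: 9.0 },
  { family: "Drink", country: "Canada", state: "BC", year: 2011, quarter: "Q1", unit_sales: 4, store_sales: 8.0 },
  { family: "Food",  country: "USA",    state: "CA", year: 2011, quarter: "Q2", unit_sales: 6, store_sales: 12.0 }
]

# Hypothetical helper: keep the facts that match the slicer (the role of the
# MDX WHERE clause), group them by one dimension (the ROWS axis), and sum
# each requested measure (the COLUMNS axis).
def slice_and_aggregate(facts, slicer:, rows:, measures:)
  selected = facts.select do |fact|
    slicer.all? { |dimension, member| fact[dimension] == member }
  end
  selected.group_by { |fact| fact[rows] }.transform_values do |group|
    measures.map { |measure| [measure, group.sum { |fact| fact[measure] }] }.to_h
  end
end

result = slice_and_aggregate(
  facts,
  slicer: { country: "USA", state: "CA", year: 2011, quarter: "Q1" },
  rows: :family,
  measures: [:unit_sales, :store_sales]
)

result.each do |family, totals|
  puts "#{family}: units=#{totals[:unit_sales]} sales=#{totals[:store_sales]}"
end
```

With these sample rows, California in Q1 2011 totals 4 units and 10.0 in sales for Food, and 2 units and 4.0 for Drink. A real ROLAP engine answers the same kind of question by generating SQL against the fact and dimension tables rather than scanning rows in memory.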

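The chained Ruby query on slide 15 corresponds closely to the MDX on slide 11. As a rough illustration of that correspondence, here is a toy builder in plain Ruby. It is not the mondrian-olap implementation: `ToyMdxQuery` and `to_mdx` are hypothetical names, and the sketch covers only the simple columns/rows/where case.

```ruby
# Toy illustration only: ToyMdxQuery is a hypothetical class, not part of
# mondrian-olap. It collects axis definitions and renders an MDX string.
class ToyMdxQuery
  def initialize(cube)
    @cube = cube
    @columns = []
    @rows = []
    @where = []
  end

  # Each setter stores its members and returns self so calls can be chained.
  def columns(*members)
    @columns = members
    self
  end

  def rows(*members)
    @rows = members
    self
  end

  def where(*members)
    @where = members
    self
  end

  def to_mdx
    mdx = "SELECT #{set(@columns)} ON COLUMNS, #{set(@rows)} ON ROWS FROM [#{@cube}]"
    mdx += " WHERE (#{@where.join(', ')})" unless @where.empty?
    mdx
  end

  private

  # A single set expression is used as-is; several members are wrapped in {}.
  def set(members)
    members.length == 1 ? members.first : "{#{members.join(', ')}}"
  end
end

mdx = ToyMdxQuery.new("Sales").
  columns("[Measures].[Unit Sales]", "[Measures].[Store Sales]").
  rows("[Product].children").
  where("[Time].[2011].[Q1]", "[Customers].[USA].[CA]").
  to_mdx

puts mdx
```

Returning `self` from each setter is what makes the chained style possible; rendering is deferred to a final call, much as the slides defer work until `execute`.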