RailsWayCon: Multidimensional Data Analysis with JRuby

Presentation at RailsWayCon 2011 conference


RailsWayCon: Multidimensional Data Analysis with JRuby Presentation Transcript

  • 1. Raimonds Simanovskis: Multidimensional Data Analysis with JRuby
  • 2. Raimonds Simanovskis github.com/rsim @rsim .com
  • 3. Relational data model
  • 4. SQL is good for detailed data queries. Get all sales transactions in USA, California:
    SELECT customers.fullname, products.product_name,
           sales.sales_date, sales.unit_sales, sales.store_sales
    FROM sales
      LEFT JOIN products ON sales.product_id = products.id
      LEFT JOIN customers ON sales.customer_id = customers.id
    WHERE customers.country = 'USA' AND customers.state_province = 'CA'
  • 5. SQL becomes complex for analytical queries. Get total sales in USA, California in Q1, 2011 by main product groups:
    SELECT product_class.product_family,
           SUM(sales.unit_sales) unit_sales_sum,
           SUM(sales.store_sales) store_sales_sum
    FROM sales
      LEFT JOIN product ON sales.product_id = product.product_id
      LEFT JOIN product_class ON product.product_class_id = product_class.product_class_id
      LEFT JOIN time_by_day ON sales.time_id = time_by_day.time_id
      LEFT JOIN customer ON sales.customer_id = customer.customer_id
    WHERE time_by_day.the_year = 2011 AND time_by_day.quarter = 'Q1'
      AND customer.country = 'USA' AND customer.state_province = 'CA'
    GROUP BY product_class.product_family
  • 6. If SQL is not good then we need NoSQL!
  • 7. Maybe write a distributed map reduce function? http://browsertoolkit.com/fault-tolerance.png
  • 8. Multidimensional Data Model: multidimensional cubes, dimensions, hierarchies and levels, measures
  • 9. OLAP technologies: On-Line Analytical Processing
  • 10. Commercial vendors: Oracle Essbase, SAP BusinessObjects, Oracle OLAP, Cognos, Analysis Services
  • 11. MDX query language. Get total units sold and sales amount in USA, California in Q1, 2011 by main product groups:
    SELECT {[Measures].[Unit Sales], [Measures].[Store Sales]} ON COLUMNS,
           [Product].children ON ROWS
    FROM [Sales]
    WHERE ([Time].[2011].[Q1], [Customers].[USA].[CA])
  • 12. http://github.com/rsim/mondrian-olap
  • 13. (R)OLAP schema
    Dimensional model: cubes, dimensions (hierarchies & levels), measures, calculated measures
    Mapping
    Relational model: fact tables, dimension tables joined by foreign keys
  • 14. OLAP schema definition
    schema = Mondrian::OLAP::Schema.define do
      cube 'Sales' do
        table 'sales'
        dimension 'Gender', :foreign_key => 'customer_id' do
          hierarchy :has_all => true, :primary_key => 'customer_id' do
            table 'customer'
            level 'Gender', :column => 'gender', :unique_members => true
          end
        end
        dimension 'Time', :foreign_key => 'time_id' do
          hierarchy :has_all => false, :primary_key => 'time_id' do
            table 'time_by_day'
            level 'Year', :column => 'the_year', :type => 'Numeric', :unique_members => true
            level 'Quarter', :column => 'quarter', :unique_members => false
            level 'Month', :column => 'month_of_year', :type => 'Numeric', :unique_members => false
          end
        end
        measure 'Unit Sales', :column => 'unit_sales', :aggregator => 'sum'
        measure 'Store Sales', :column => 'store_sales', :aggregator => 'sum'
      end
    end
  • 15. Query Builder in Ruby. Get total units sold and sales amount in USA, California in Q1, 2011 by main product groups:
    olap.from('Sales').
      columns('[Measures].[Unit Sales]', '[Measures].[Store Sales]').
      rows('[Product].children').
      where('[Time].[2011].[Q1]', '[Customers].[USA].[CA]').
      execute
  • 16. Also more complex queries. Get sales amount and profit % of top 50 products sold in USA and Canada during Q1, 2011:
    olap.from('Sales').
      with_member('[Measures].[ProfitPct]').
        as('(Measures.[Store Sales] - Measures.[Store Cost]) / Measures.[Store Sales]',
           :format_string => 'Percent').
      columns('[Measures].[Store Sales]', '[Measures].[ProfitPct]').
      rows('[Product].children').crossjoin('[Customers].[Canada]', '[Customers].[USA]').
        top_count(50, '[Measures].[Store Sales]').
      where('[Time].[2011].[Q1]').
      execute
  • 17. Demo
  • 18. Used in eazybi.com
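
The query on slides 11 and 15 can be read as a slicer filter followed by a group-by over fact rows. Below is a minimal plain-Ruby sketch of that idea. It is not the mondrian-olap API: the FACTS rows, the aggregate method and its rows:, measures: and where: parameters are all hypothetical names chosen for illustration, and the numbers are made up.

```ruby
# Toy in-memory "cube" (hypothetical data, not taken from the presentation).
# Each fact row carries dimension members and measures; :time and :customer
# hold hierarchy paths (year/quarter and country/state).
FACTS = [
  { time: ['2011', 'Q1'], customer: ['USA', 'CA'],    product: 'Food',  unit_sales: 10, store_sales: 25.0 },
  { time: ['2011', 'Q1'], customer: ['USA', 'CA'],    product: 'Drink', unit_sales: 4,  store_sales: 9.5 },
  { time: ['2011', 'Q1'], customer: ['USA', 'CA'],    product: 'Food',  unit_sales: 6,  store_sales: 15.0 },
  { time: ['2011', 'Q1'], customer: ['USA', 'WA'],    product: 'Food',  unit_sales: 3,  store_sales: 7.0 },
  { time: ['2011', 'Q2'], customer: ['USA', 'CA'],    product: 'Drink', unit_sales: 8,  store_sales: 20.0 },
  { time: ['2011', 'Q1'], customer: ['Canada', 'BC'], product: 'Food',  unit_sales: 5,  store_sales: 12.0 }
].freeze

# Hypothetical helper: sum the given measures for each member of the `rows`
# dimension, keeping only facts whose hierarchy paths start with the slicer
# paths given in `where` (the role of the MDX WHERE tuple).
def aggregate(facts, rows:, measures:, where: {})
  selected = facts.select do |fact|
    where.all? { |dim, path| fact[dim].first(path.size) == path }
  end
  selected.group_by { |fact| fact[rows] }.transform_values do |group|
    measures.map { |m| [m, group.sum { |fact| fact[m] }] }.to_h
  end
end

# Unit sales and store sales in USA, California in Q1, 2011 by product group.
result = aggregate(FACTS,
                   rows: :product,
                   measures: [:unit_sales, :store_sales],
                   where: { time: ['2011', 'Q1'], customer: ['USA', 'CA'] })

result.each { |product, totals| puts "#{product}: #{totals}" }
```

An OLAP engine such as Mondrian does this work against the relational fact and dimension tables rather than in memory; the sketch only shows what "measures by dimension members, sliced by other dimensions" means.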