RailsWayCon: Multidimensional Data Analysis with JRuby
Presentation at RailsWayCon 2011 conference


Transcript

  • 1. Raimonds Simanovskis: Multidimensional Data Analysis with JRuby
  • 2. Raimonds Simanovskis github.com/rsim @rsim .com
  • 3. Relational data model
  • 4. SQL is good for detailed data queries. Get all sales transactions in USA, California:
      SELECT customers.fullname, products.product_name,
             sales.sales_date, sales.unit_sales, sales.store_sales
      FROM sales
        LEFT JOIN products ON sales.product_id = products.id
        LEFT JOIN customers ON sales.customer_id = customers.id
      WHERE customers.country = 'USA' AND customers.state_province = 'CA'
  • 5. SQL becomes complex for analytical queries. Get total sales in USA, California in Q1, 2011 by main product groups:
      SELECT product_class.product_family,
             SUM(sales.unit_sales) unit_sales_sum,
             SUM(sales.store_sales) store_sales_sum
      FROM sales
        LEFT JOIN product ON sales.product_id = product.product_id
        LEFT JOIN product_class ON product.product_class_id = product_class.product_class_id
        LEFT JOIN time_by_day ON sales.time_id = time_by_day.time_id
        LEFT JOIN customer ON sales.customer_id = customer.customer_id
      WHERE time_by_day.the_year = 2011 AND time_by_day.quarter = 'Q1'
        AND customer.country = 'USA' AND customer.state_province = 'CA'
      GROUP BY product_class.product_family
  • 6. If SQL is not good then we need NoSQL!
  • 7. Maybe write a distributed map-reduce function? http://browsertoolkit.com/fault-tolerance.png
  • 8. Multidimensional Data Model: multidimensional cubes, dimensions, hierarchies and levels, measures
  • 9. OLAP technologies On-Line Analytical Processing
  • 10. Commercial vendors: Oracle Essbase, SAP BusinessObjects, Oracle OLAP, Cognos, Analysis Services
  • 11. MDX query language. Get total units sold and sales amount in USA, California in Q1, 2011 by main product groups:
      SELECT {[Measures].[Unit Sales], [Measures].[Store Sales]} ON COLUMNS,
             [Product].children ON ROWS
      FROM [Sales]
      WHERE ( [Time].[2011].[Q1], [Customers].[USA].[CA] )
  • 12. http://github.com/rsim/mondrian-olap
  • 13. (R)OLAP schema: a mapping between two models.
      Dimensional model: cubes, dimensions (hierarchies & levels), measures, calculated measures
      Relational model: fact tables, dimension tables joined by foreign keys
  • 14. OLAP schema definition
      schema = Mondrian::OLAP::Schema.define do
        cube 'Sales' do
          table 'sales'
          dimension 'Gender', :foreign_key => 'customer_id' do
            hierarchy :has_all => true, :primary_key => 'customer_id' do
              table 'customer'
              level 'Gender', :column => 'gender', :unique_members => true
            end
          end
          dimension 'Time', :foreign_key => 'time_id' do
            hierarchy :has_all => false, :primary_key => 'time_id' do
              table 'time_by_day'
              level 'Year', :column => 'the_year', :type => 'Numeric', :unique_members => true
              level 'Quarter', :column => 'quarter', :unique_members => false
              level 'Month', :column => 'month_of_year', :type => 'Numeric', :unique_members => false
            end
          end
          measure 'Unit Sales', :column => 'unit_sales', :aggregator => 'sum'
          measure 'Store Sales', :column => 'store_sales', :aggregator => 'sum'
        end
      end
  • 15. Query Builder in Ruby. Get total units sold and sales amount in USA, California in Q1, 2011 by main product groups:
      olap.from('Sales').
        columns('[Measures].[Unit Sales]', '[Measures].[Store Sales]').
        rows('[Product].children').
        where('[Time].[2011].[Q1]', '[Customers].[USA].[CA]').
        execute
  • 16. Also more complex queries. Get sales amount and profit % of top 50 products sold in USA and Canada during Q1, 2011:
      olap.from('Sales').
        with_member('[Measures].[ProfitPct]').
          as('(Measures.[Store Sales] - Measures.[Store Cost]) / Measures.[Store Sales]',
             :format_string => 'Percent').
        columns('[Measures].[Store Sales]', '[Measures].[ProfitPct]').
        rows('[Product].children').crossjoin('[Customers].[Canada]', '[Customers].[USA]').
          top_count(50, '[Measures].[Store Sales]').
        where('[Time].[2011].[Q1]').
        execute
  • 17. Demo
  • 18. Used in eazybi.com
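
The cube vocabulary from slides 8 and 11 (dimensions, levels, measures, slicing by a WHERE tuple) can be illustrated with a small plain-Ruby sketch. This is only a toy model of the concepts, not how Mondrian or mondrian-olap work internally. The names FACTS, slice and roll_up, and all sample rows, are made up for illustration and do not come from the slides.

```ruby
# Toy in-memory "cube": a sketch of the concepts only.
# FACTS, slice and roll_up are hypothetical names; the rows are invented sample data.

# Each fact row carries dimension coordinates (time, customer, product) and measures.
FACTS = [
  { year: 2011, quarter: 'Q1', country: 'USA',    state: 'CA', family: 'Food',  unit_sales: 3, store_sales: 7.50 },
  { year: 2011, quarter: 'Q1', country: 'USA',    state: 'CA', family: 'Drink', unit_sales: 2, store_sales: 4.00 },
  { year: 2011, quarter: 'Q1', country: 'USA',    state: 'CA', family: 'Food',  unit_sales: 1, store_sales: 2.50 },
  { year: 2011, quarter: 'Q1', country: 'USA',    state: 'WA', family: 'Food',  unit_sales: 5, store_sales: 9.00 },
  { year: 2011, quarter: 'Q2', country: 'USA',    state: 'CA', family: 'Drink', unit_sales: 4, store_sales: 8.00 },
  { year: 2011, quarter: 'Q1', country: 'Canada', state: 'BC', family: 'Food',  unit_sales: 6, store_sales: 12.00 }
].freeze

# Slice: keep only the facts that match the given dimension members
# (the role played by the MDX WHERE tuple).
def slice(facts, members)
  facts.select { |fact| members.all? { |level, value| fact[level] == value } }
end

# Roll up: group by one level and sum the requested measures
# (the role played by ON ROWS and ON COLUMNS).
def roll_up(facts, level, measures)
  facts.group_by { |fact| fact[level] }.transform_values do |rows|
    measures.to_h { |measure| [measure, rows.sum { |row| row[measure] }] }
  end
end

q1_california = slice(FACTS, year: 2011, quarter: 'Q1', country: 'USA', state: 'CA')
result = roll_up(q1_california, :family, [:unit_sales, :store_sales])

result.each do |family, totals|
  puts "#{family}: units=#{totals[:unit_sales]}, sales=#{totals[:store_sales]}"
end
```

The sketch scans every fact row on each query. An OLAP engine answers the same kind of question from a schema that maps cubes onto fact and dimension tables, as on slide 13.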
