NB: This presentation was delivered at the Singapore Ruby Brigade meetup 6-Jan-2010 (at hackerspace.sg)
BI & DW for Ruby/Rails “!???”
Why should we care about this enterprisey stuff?
Have you heard a client ask for..
  • A “dashboard”?
  • Management reports?
  • Operational statistics?
..in addition to the actual site?
Or maybe you want to pitch for the dashboard/BI projects themselves? ..using your Rails skills, of course.
  • BI: Business Intelligence
  • CPM: Corporate Performance Management
  • BPM: Business Performance Management
  • B&P: Budgeting and Planning
  • EPM: Enterprise Performance Management
  • Dashboard: Enterprise Dashboards
BI Basics
No, BI is not (always) an oxymoron
BI = Business Feedback & Control Systems
  • Infrastructure & Systems: keeping the doors open. Uptime on the servers; alerts.
  • Operational Management: optimising in the short term (intra-day), with a focus on systems in isolation. Need extra call centre staff on shift? Daily sales numbers?
  • Executive Management: strategic performance (monthly, quarterly, yearly) across all systems. Profitability by product; utilisation and sales performance.
Traditional Rails perspective..
  • Infrastructure & Systems: e.g. New Relic
  • Operational Management: custom AR (ActiveRecord) reports
  • Executive Management: someone else’s problem (opportunity)
Someone Else’s Problem..
  • Your Rails storefront app
  • Fulfillment (maybe a third party)
  • To report on sales fulfillment..
  • AR/AP/GL
  • To report on revenue and profitability..
  • To report on sales revenue, actuals and forecast..
  • And don’t forget all those other systems: CRM, MRP, FA
Who is “Someone Else”?
The GigaOM network: “5 Free Business Intelligence Crunchers for Your 2010 Arsenal”
 
Data Sources, ETL, BI & DW
  • Data sources: your Rails app and other transactional systems
  • ETL: Extract – Transform – Load. Or, Extract – Load – Transform. Or, Transform – Extract – Load (depending on the technology)
  • ODS (operational data store): DBoR (database of record), relational reporting
  • BI & DW: a copy of transaction data specifically structured for query and analysis
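To make the extract, transform, load flow concrete, here is a minimal plain-Ruby sketch. It does not use ActiveWarehouse-ETL; the sample data, the field names, and the method names (`extract_rows`, `transform_rows`, `load_rows`) are all assumptions made up for illustration, and an in-memory array stands in for the warehouse.

```ruby
require "date"

# Hypothetical source data; in practice this would come from the app database or a file.
RAW_ORDERS = <<~CSV
  order_id,product,ordered_on,amount
  1,Vanilla Cupcake,2009-12-30,12.50
  2,Chocolate Cupcake,2009-12-30,9.50
  3,Vanilla Cupcake,2010-01-02,22.00
CSV

# Extract: read raw rows from the source (naive comma split, no quoting support).
def extract_rows(text)
  header, *lines = text.lines.map(&:chomp)
  keys = header.split(",").map(&:to_sym)
  lines.map { |line| keys.zip(line.split(",")).to_h }
end

# Transform: convert types and derive the attributes the reports will need.
def transform_rows(rows)
  rows.map do |row|
    date = Date.strptime(row[:ordered_on], "%Y-%m-%d")
    { product: row[:product], year: date.year, month: date.month, amount: Float(row[:amount]) }
  end
end

# Load: append to the reporting store (an array stands in for the warehouse).
def load_rows(rows, warehouse)
  warehouse.concat(rows)
end

warehouse = []
load_rows(transform_rows(extract_rows(RAW_ORDERS)), warehouse)
```

Swapping the order of the three calls (for example loading raw rows first and transforming inside the warehouse) gives the E-L-T variant mentioned on the slide.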
“cubes”
  • “Fact”: Sales = $22, keyed by Customer ID, Product ID, Date ID, …
  • Fact categorisation: Customer dimension, Date dimension, Product dimension
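The fact/dimension idea can be sketched in a few lines of plain Ruby. Everything here is invented for illustration (the sample rows, the `DIMENSIONS` lookup and the `roll_up` method); it is not how ActiveWarehouse stores or queries data, only a picture of a fact row pointing at dimension rows.

```ruby
# Dimension tables: descriptive attributes keyed by id (sample data is invented).
PRODUCTS  = { 1 => { name: "Vanilla Cupcake", category: "Classic" },
              2 => { name: "Chocolate Cupcake", category: "Classic" } }
CUSTOMERS = { 1 => { name: "Alice", country: "SG" },
              2 => { name: "Bob", country: "MY" } }
DATES     = { 20091230 => { year: 2009, month: 12 },
              20100102 => { year: 2010, month: 1 } }

# Fact table: one row per sale, holding only foreign keys and the measure.
SALES_FACTS = [
  { customer_id: 1, product_id: 1, date_id: 20091230, amount: 22.0 },
  { customer_id: 2, product_id: 2, date_id: 20091230, amount: 9.5 },
  { customer_id: 1, product_id: 1, date_id: 20100102, amount: 12.5 },
]

# Which dimension table each name refers to, and the fact column that points at it.
DIMENSIONS = {
  product:  [PRODUCTS,  :product_id],
  customer: [CUSTOMERS, :customer_id],
  date:     [DATES,     :date_id],
}

# Sum the measure grouped by any attribute of any dimension.
def roll_up(facts, dimension, attribute)
  table, key = DIMENSIONS.fetch(dimension)
  facts.group_by { |fact| table.fetch(fact[key])[attribute] }
       .transform_values { |rows| rows.sum { |row| row[:amount] } }
end

sales_by_year    = roll_up(SALES_FACTS, :date, :year)
sales_by_country = roll_up(SALES_FACTS, :customer, :country)
```

The same facts answer "sales by year" and "sales by country" without a new query being written for each question.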
MOLAP, ROLAP, HOLAP
  • MOLAP: proprietary format to optimize for analytical queries
  • ROLAP: use a relational database to mimic multi-dimensionality
  • HOLAP: hybrid. Drive analytics from MOLAP, drill down to relational
  • Relational layouts: star schema, snowflake
Why?? What’s wrong with..
    select a.name, sum(b.amount) from products a join order_items b on a.id = b.product_id group by a.id, a.name
    Product.sum(:amount, :include => :orders, :group => 'product_id')
  • Every question needs its own query
  • Can’t predict all the questions in advance
  • Un-scalable grunt work
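One way to see what a ROLAP layer buys you: once the data sits in a star schema, a single generator can produce the aggregate query for any dimension attribute. The sketch below only builds SQL strings. The table and column names (`sales_facts`, `product_dimension`, `calendar_year` and so on) are assumptions for illustration, not the schema ActiveWarehouse generates.

```ruby
# Hypothetical star schema: a sales_facts table with foreign keys into *_dimension tables.
STAR = {
  product:  { table: "product_dimension",  fk: "product_id" },
  customer: { table: "customer_dimension", fk: "customer_id" },
  date:     { table: "date_dimension",     fk: "date_id" },
}

# Build the aggregate query for any dimension attribute instead of hand-writing each one.
# The attribute is interpolated directly, so it must come from trusted code, not user input.
def sales_by_sql(dimension, attribute)
  dim = STAR.fetch(dimension)
  column = "#{dim[:table]}.#{attribute}"
  <<~SQL
    SELECT #{column}, SUM(sales_facts.amount) AS total
    FROM sales_facts
    JOIN #{dim[:table]} ON #{dim[:table]}.id = sales_facts.#{dim[:fk]}
    GROUP BY #{column}
  SQL
end

puts sales_by_sql(:date, :calendar_year)
puts sales_by_sql(:product, :name)
```

Each new question becomes a new pair of arguments rather than a new hand-written query.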
ActiveWarehouse  ActiveWarehouse-ETL
ActiveWarehouse
  • Rails plugin by Anthony Eden
  • ROLAP solution based on ActiveRecord
  • Features: generators for Facts, Dimensions, Cubes and Bridges; supports calculated fields; view helpers for reports with drill-down
ActiveWarehouse-ETL
  • Rails gem/plugin by Anthony Eden
  • DSL for extract – transform – load
  • Source/sink: file, db, xml, .. (extensible)
  • Features: pre/post processors; transformations
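To give a feel for what "a DSL with pre/post processors and transformations" means, here is a toy pipeline in plain Ruby. It is explicitly NOT the ActiveWarehouse-ETL API: the class `TinyEtl` and its methods are invented for this sketch, and the real control-file syntax is documented in the project repository.

```ruby
# A toy ETL DSL. All names here are hypothetical and unrelated to the real gem.
class TinyEtl
  def initialize(&definition)
    @pre = []
    @transforms = []
    @post = []
    instance_eval(&definition)
  end

  # Register a step that runs once before any row is processed.
  def pre_process(&step)
    @pre << step
  end

  # Register a per-field transformation applied to every row.
  def transform(field, &step)
    @transforms << [field, step]
  end

  # Register a step that runs once after all rows are processed.
  def post_process(&step)
    @post << step
  end

  # Run pre-processors, transform every row field by field, then run post-processors.
  def run(rows)
    @pre.each(&:call)
    out = rows.map do |row|
      @transforms.reduce(row) { |r, (field, step)| r.merge(field => step.call(r[field])) }
    end
    @post.each { |step| step.call(out) }
    out
  end
end

log = []
etl = TinyEtl.new do
  pre_process        { log << "truncate staging table" }
  transform(:name)   { |value| value.strip }
  transform(:amount) { |value| Float(value) }
  post_process       { |rows| log << "loaded #{rows.size} rows" }
end

result = etl.run([{ name: " Vanilla Cupcake ", amount: "22" }])
```

The appeal of the DSL approach is that the job definition reads as configuration while remaining ordinary Ruby.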
 
The Cupcakes Store
  1. Use ActiveWarehouse-ETL to load seed data from CSV to the app DB (MySQL)
The Cupcakes BI Dashboard
  2. Use ActiveWarehouse-ETL to load dimension and fact data to the warehouse (MySQL to MySQL)
  3. Use ActiveWarehouse to build a simple analytical “dashboard” and reporting tool
Follow the documentation at http://github.com/tardate/cupcakesinc to see how this works (and try it yourself)
Product listing at Cupcakes Inc..
Customer listing at Cupcakes Inc..
Order listing at Cupcakes Inc..
Order detail at Cupcakes Inc..
Sales By Product AW Report
Sales By Product (drill to 2009)
Reasons to be Cheerful..
Language
  • BI Suites: ETL processing, cube rules etc. typically use custom languages (often archaic and limited)
  • Ruby/Rails: It’s … Ruby!
UI Customisation and Presentation Integration
  • BI Suites: web delivery is typically very constrained. Often rely on strong integration with office software (Excel). Leads to “custom application development in Excel” syndrome.
  • Ruby/Rails: It’s … ActionPack! Google Maps mashups, social graph links .. you get full UI control, as long as you have the development budget.
Speed of development
  • BI Suites: basic deployments can be very fast. But UI inflexibility can lead to either lots of time wasted trying to shoe-horn, or a need to “reset customer expectations”
  • Ruby/Rails: It’s … Ruby & Rails. Say no more ;-)
TCO
  • BI Suites: top-tier suites can come with a hefty $ tag. And prices are going up.. But some analysts are predicting 2010 to be the year BI gets FLOSS momentum (see the GigaOM review of 5 well-established alternatives)
  • Ruby/Rails: It’s … Ruby & Rails. Say no more ;-) Trade in software license costs for more development.
Caveats..
Native MOLAP
  • BI Suites: generally good support for database MOLAP features. Can be platform specific though, e.g. Microsoft MDX, SQL Server Analysis Services
  • Ruby/Rails: a gap. No real support currently available. ActiveWarehouse uses a relational model to “fake” MOLAP (ROLAP)
Performance
  • BI Suites: generally, all established analytical engines (and backing databases) have a great performance track record. Huge scalability (millions of rows)
  • Ruby/Rails: unproven. ActiveWarehouse/ETL does not have many (public) proof points. Given that it is tied to AR performance, expect that scalability could be an issue.
Take-aways ~ ActiveWarehouse
It’s an impressive codebase. When you get it working, it works well.. but
  • Virtually no documentation!
  • No contemporary examples
  • Not under very active development
  • A “textbook” data warehouse implementation. May or may not be exactly what you want..
  • Remember: data is batched. Not realtime.
  • Rails 2.x: install the plugin (gem is 1.x)
Take-aways ~ ActiveWarehouse-ETL
Neat tool. In addition to feeding AW:
  • Generate and load seed/test data
  • Move data between systems
But again,
  • Poor documentation
  • When it fails, it can do so silently (make sure filename paths are delimited correctly for your platform!)
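Given the warning about silent failures and path delimiters, one defensive habit is to build source paths portably and check them before starting a load. The helper below is a hypothetical sketch (`checked_source` is not part of ActiveWarehouse-ETL); it relies only on `File.join` and `File.file?` from Ruby core.

```ruby
# Build the path with File.join (no hand-typed separators) and fail loudly if it is missing.
def checked_source(*parts)
  path = File.join(*parts)
  raise ArgumentError, "ETL source not found: #{path}" unless File.file?(path)
  path
end

# A missing file now stops the run with a clear message instead of loading nothing.
begin
  checked_source("data", "no_such_file.csv")
rescue ArgumentError => e
  warn e.message
end
```

Calling a check like this at the top of a load script turns a quiet no-op into an obvious error.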
Take-aways ~ BI on Rails Solutions
  • Plain AR: just avoid the rabbit hole
  • AR + ETL: get all the data you need in one place
  • AW + ETL: traditional ROLAP, make Rails the focus of the BI effort
  • Go the BI suite route: when you need to adapt to many transactional systems at scale, and the customer has the $$ (Rails remains just for transactional apps)
  • Or… (discussion point ;-)
Thank you! Questions?
Some References
  • ActiveWarehouse: http://github.com/aeden/activewarehouse
  • ActiveWarehouse-ETL: http://github.com/aeden/activewarehouse-etl
  • Cupcakes Inc sample site(s): http://github.com/tardate/cupcakesinc
  • Singapore Ruby Brigade (SRB): http://groups.google.com/group/singapore-rb

ActiveWarehouse/ETL - BI & DW for Ruby/Rails


Editor's Notes

  • #12 http://www.salon.com/technology/the_gigaom_network/tech_insider/2009/12/22/5_free_business_intelligence_crunchers_for_your_2010_arsenal/index.html