Cubes 1.0 Overview

New feature overview of Cubes 1.0 – lightweight Python OLAP and pluggable data warehouse. Video: https://www.youtube.com/watch?v=-FDTK80zsXc Github sources: https://github.com/databrewery/cubes

  1. 1. Cubes 1.0 Overview: light data warehouse and conceptual modelling. Štefan Urbánek (@Stiivi), stefan.urbanek@gmail.com, November 2014
  2. 2. understanding through metadata
  3. 3. model data reporting apps / modules metadata
  4. 4. ❄ logical physical
  5. 5. Categorical Data Σ =
  6. 6. OLAP (online analytical processing) lightweight framework for conceptual modelling and analytics
  7. 7. Original Cubes before 1.0
  8. 8. Workspace 1 × 1 × store process or server | model
  9. 9. We needed more!
  10. 10. Models Stores file Σ database API Postgres Mongo API multiple model parts, different sources; multiple data sources, heterogeneous
  11. 11. Cubes 1.0
  12. 12. Python ≥ 3.4; works with ≥ 2.7 too, for the “two” series
  13. 13. ■ analytical workspace ■ model providers ■ new and improved backends ■ better extensibility ■ authorisation
  14. 14. Analytical Workspace
  15. 15. Model Providers Cubes Stores Static Model Provider API Model Provider sales churn activations events BI Data (Postgres) BI Data 2 (Mongo) Events (API)
  16. 16. Workspace Model Providers Cubes Stores Static Model Provider sales churn activations events BI Data (Postgres) BI Data 2 (Mongo) crm sales events [workspace] models_path: /var/lib/cubes/models [models] crm: crm.cubesmodel sales: sales.cubesmodel events: events.cubesmodel [store crm] type: sql url: postgresql://localhost/crm [store events] type: mongo host: localhost collection: events
  17. 17. BYOB bring your own backend Slicer
  18. 18. Backend
  19. 19. Browser, Store, Provider
  20. 20. Logical Physical create model connect physical data store (database or API) Browser Store Provider Σ aggregate model cubes dimensions model backend objects
  21. 21. Model Provider model cubes dimensions
  22. 22. Model Provider ■ metadata on-the-fly ■ local or external source ■ might be linked to a store model cubes dimensions
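  23. 23. Model by backend. SQL: model required; cube/fact = table; dimension = column (table). MongoDB: model required; cube/fact = collection; dimension = key/attribute. Mixpanel: model automatic; cube/fact = event; dimension = property. Google Analytics: model automatic; cube/fact = metric; dimension = dimension. Slicer: model automatic; cube/fact = cube; dimension = dimension.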
  24. 24. Model Improvements
  25. 25. Model ■ measures → aggregates ■ more front-end metadata cube categories, dimension role and cardinality ■ customised dimension linking
  26. 26. "measures": [ { "name": "amount", "label": "Sales Amount" }, { "name": "vat", "label": "VAT" } ], "aggregates": [ { "name": "total_sales", "label": "Total Sales Amount", "measure": "amount", "function": "sum" }, { "name": "total_vat", "label": "Total VAT", "measure": "vat", "function": "sum" }, { "name": "item_count", "label": "Item Count", "function": "count" } ]
  27. 27. Aggregates ■ custom name ■ can refer to other aggregates (post-aggregation calculations) ■ functions are backend-specific; SQL aggregations: sum, count, count_nonempty, count_distinct, min, max, avg, stddev, variance, …
  28. 28. Contextual Dimensions { "measures": [ … ], "dimensions": [ {"name": "date", "hierarchies": ["ym", "yqm"]}, {"name": "date", "alias": "contract_date"} ], … } customisable linking properties: alias, hierarchies, exclude_hierarchies, default_hierarchy_name, cardinality, nonadditive
  29. 29. Dimension Roles dimension.role time level.role year, month, day, … hint for reporting applications or backends
  30. 30. Cardinality overload precautions dimension.cardinality level.cardinality tiny < low < medium < high < <
  31. 31. Browser Σ
  32. 32. Browser ■ uses logical model ■ implements aggregation ■ builds queries ■ retrieves data Logical Physical physical data store (database or API) Browser Store Σ aggregate model
  33. 33. Browser Methods ■ features() ■ aggregate(cell, drilldown,…) ■ members(cell, dimension, …) ■ facts(cell, …) ■ fact(id) ■ cell_details(cell, drilldown, …)
  34. 34. Split Cell False True aggregate(split=cell) __within_split__ generated dimension
  35. 35. Post-aggregation “statutils” ■ computed on aggregation result in Python ■ moving averages, deviation, variance wma, sma, sms, smstd, smsrd, smsvar ■ aggregate property: window_size
  36. 36. Store
  37. 37. Store ■ provides database or API connection ■ might provide a model ■ slicer tool actions (future): validation, schema, optimization, ... Logical Physical physical data store (database or API) Browser connect Store
  38. 38. SQL Backend also known as ROLAP or SQL query generator
  39. 39. SQL Overview ■ new query builder ■ join optimisation ■ support for outer-joins ■ support for “split” dimension ■ new aggregate functions
  40. 40. ❄ fact table join optimisation
  41. 41. facts match date master detail facts detail date master detail facts master date master detail "joins": [ { "master": "fact_contracts.contract_date_id", "detail": "dim_date.id", "method": "detail" } ]
  42. 42. Authentication and Authorisation
  43. 43. { "lidia": { "allowed_cubes": ["sales"], "cube_restrictions": { "sales": ["store:3"] } }, "martin": { "allowed_cubes": ["sales"], "cube_restrictions": { "sales": ["store:5"] } } }
  44. 44. [workspace] authorization: simple [authorization] rights_file: access_rights.json Authorizer
  45. 45. Slicer server ✂
  46. 46. Model Queries ■ GET /cubes overview of cubes from all providers ■ GET /cube/sales/model detailed cube model with described dimensions
  47. 47. Browser Queries ■ GET /cube/name/aggregate ■ GET /cube/name/members/dim ■ GET /cube/name/facts ■ GET /cube/name/fact ■ GET /cube/name/cell
  48. 48. Aggregate GET /cube/sales/aggregate?cut=date:2010&drilldown=date|region&split=status:1&page=10&page_size=100
  49. 49. { "cell": [], "total_cell_count": 2, "drilldown": [ { "record_count": 31, "amount_sum": 550840, "date.year": 2009 }, { "record_count": 31, "amount_sum": 566020, "date.year": 2010 } ], "summary": { "record_count": 62, "amount_sum": 1116860 } }
  50. 50. Special Characters “category:10-24” → “10-24” “city:Nové Mesto nad Váhom” → “Nové Mesto nad Váhom”
  51. 51. Relative Time uses dimension roles and Calendar date:yesterday date:90daysago-today expiration_date:lastmonth-next2months
  52. 52. Output Format format=csv format=json format=json_lines* (*for facts and members)
  53. 53. Deployment reporting for your app or stand-alone
  54. 54. Public HTML & JS Application Slicer server store HTTP request JSON reply model Public HTML & JS Application WSGI store HTTP request JSON reply Slicer Flask App model Public HTML Django, Flask, … store JSON reply Cubes Python API model Public Public store Flask HTML HTML Web Application PHP, RoR, Django Slicer server Slicer Blueprint model Internal store HTTP request JSON reply model
  55. 55. Front-ends generic ad-hoc reporting
  56. 56. ✂ Slicer
  57. 57. Cubes Viewer Jose Juan Montes, jjmontesl/cubesviewer
  58. 58. checkgermany.de * Front-end by: Felix Ebert (@femeb) Data by: Friedrich Lindenberg (@pudo) Cubes 0.10.2
  59. 59. Summary & Future
  60. 60. Summary ■ heterogeneous pluggable environment ■ easier to extend ■ better SQL query generator
  61. 61. Not Mentioned ■ localisation ■ namespaces ■ calendar ■ query logging
  62. 62. Incubated ■ non-additive properties ■ periods-to-date ■ modeler app ■ cubes.js
  63. 63. Future ■ arithmetic expressions ■ SQL improvements ■ improved API for custom browsers ■ cubes.js
  64. 64. Nutrition Facts Serving Size 1 cube Amount Per Serving Total Fat 0g Saturated Fat 0g Trans Fat 0g % Daily Value Total Carbohydrate 0g Dietary Fiber 0g Sugars 0g 0% 0%
  65. 65. Want to contribute? #TODO, #FIXME, Issue # https://github.com/DataBrewery/cubes/issues
  66. 66. Credits
  67. 67. Thanks for 1.0: Robin Thomas, Ryan Berlew, Jose Juan Montes, Squarespace, and all contributors on GitHub
  68. 68. Thank You @Stiivi
  69. 69. github.com/DataBrewery/cubes cubes.databrewery.org
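
Sketch for slides 16 and 44: the workspace configuration shown on those slides appears to be an INI-style file, so it can be read with Python's standard configparser. The snippet below is a minimal sketch, not the Cubes loader itself. Merging the two slides into one file and the helper name load_stores are assumptions made for illustration; the section names, keys, and values are copied from the slides.

```python
import configparser

# Settings copied from slides 16 and 44. Combining them into a single file
# is an assumption; the slides show the two fragments separately.
SLICER_INI = """
[workspace]
models_path: /var/lib/cubes/models
authorization: simple

[models]
crm: crm.cubesmodel
sales: sales.cubesmodel
events: events.cubesmodel

[store crm]
type: sql
url: postgresql://localhost/crm

[store events]
type: mongo
host: localhost
collection: events

[authorization]
rights_file: access_rights.json
"""


def load_stores(text):
    """Hypothetical helper: parse the config and collect the [store NAME] sections."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    stores = {}
    for section in parser.sections():
        if section.startswith("store "):
            # "store crm" -> "crm"
            name = section.split(None, 1)[1]
            stores[name] = dict(parser[section])
    return parser, stores


parser, stores = load_stores(SLICER_INI)
print(sorted(parser["models"]))   # model names declared in [models]
print(stores["crm"]["url"])       # connection URL of the "crm" store
```

Each store gets its own section, which is how a single workspace can point at several heterogeneous data sources (here one SQL store and one Mongo store).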

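
Sketch for slides 48 and 49: the aggregate request and its JSON reply can be exercised from a client with the standard library alone. This is a rough illustration under stated assumptions: the base URL http://localhost:5000 and the helper name aggregate_url are hypothetical, no request is actually sent, and the reply is the sample from slide 49 pasted in as a string.

```python
import json
from urllib.parse import urlencode


def aggregate_url(base, cube, **params):
    """Hypothetical helper: build the aggregate URL in the form shown on slide 48."""
    # Keep ':' and '|' readable, as they appear in the slide's cut and drilldown values.
    query = urlencode(params, safe=":|")
    return f"{base}/cube/{cube}/aggregate?{query}"


# The host and port are assumptions; the parameters are the ones from slide 48.
url = aggregate_url(
    "http://localhost:5000",
    "sales",
    cut="date:2010",
    drilldown="date|region",
    split="status:1",
    page=10,
    page_size=100,
)

# Sample reply copied from slide 49 (smart quotes normalised to plain quotes).
RESPONSE = """
{
  "cell": [],
  "total_cell_count": 2,
  "drilldown": [
    {"record_count": 31, "amount_sum": 550840, "date.year": 2009},
    {"record_count": 31, "amount_sum": 566020, "date.year": 2010}
  ],
  "summary": {"record_count": 62, "amount_sum": 1116860}
}
"""

result = json.loads(RESPONSE)

# One row per drilldown member; key the aggregate value by year.
by_year = {row["date.year"]: row["amount_sum"] for row in result["drilldown"]}

print(url)
print(by_year)
print(result["summary"]["amount_sum"])
```

In the sample reply the drilldown rows add up to the summary (550840 + 566020 = 1116860), which makes a handy sanity check when consuming such a response.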