Cubes - Lightweight OLAP Framework

Cubes is a lightweight online analytical processing (OLAP) framework and HTTP OLAP server.

Documentation: http://packages.python.org/cubes/

Published in: Technology

  1. Cubes: Lightweight Online Analytical Processing. Stefan Urbanek, stefan.urbanek@gmail.com, April 2011, @Stiivi
  2. Features: ■ logical model (metadata) ■ aggregated data browsing ■ OLAP HTTP server with JSON interface ■ data and metadata localization ■ multiple backends
  3. Introduction: cubes, facts and dimensions
  4. [Diagram: data cell, data cube]
  5. Fact (data cell): most detailed information. Fact examples: contract, donation, spending, invoice, project. Measure: measurable. Measure examples: contract amount, revenue, duration, price with VAT, ...
  6. [Diagram: dimensions: location, type, time]
  7. Dimensions (type, location, time): ■ provide context for facts ■ used to filter queries or reports ■ control scope of aggregation of facts ■ used for ordering or sorting ■ define master-detail relationships ☛ [star]
  8. Hierarchies: region; date (2010, May, 1st)
  9. [Diagram: subject area, data mart]
  10. Summary: ■ fact – most detailed information for analysis ■ measure – an attribute for computation ■ dimension – context of facts ■ hierarchy – master-detail relationship
  11. ✂ Slicing and Dicing
  12. [Cube diagram: dimensions location, type, time (2006 to 2010); slice: spending in 2010]
  13. [Cube diagram: dimensions location (Czech Republic, Slovakia, Hungary, Poland, Estonia), type, time; slice: contracts in Estonia]
  14. [Cube diagram: ✂ Estonia, ✂ 2010; dimensions location, type, time; result: contracts in Estonia in 2010]
  15. [Cube diagram: dimensions location, type, time; slice: IT contracts]
  16. [Cube diagram: ✂ IT, ✂ Estonia, ✂ 2010; dimensions location, type, time; result: IT contracts in Estonia in 2010]
  17. Measures can be aggregated: spending in 2010, revenue from IT projects, top 10 contractors
  18. Drilling down
  19. [Chart: ∑ amount for all years; drill down by date: ∑ amount by year, 2006 to 2010]
  20. looking at a more detailed level
  21. [Chart: top level (all years), year level (2006 to 2010), month level (Jan, Feb, Mar, Apr, May, ...)]
  22. Logical Model: description based on how you analyze data
  23. Logical Model: user’s or analyst’s perspective: how data are being measured, aggregated and reported. [Chart: amount; ∑ amount shown as shares of 24%, 12%, 28%, 20%, 16%]
  24. Abstraction over physical data: ✶ star, ❄ snowflake
  25. [Model diagram: entities Cube, Dimension, Hierarchy, Level, Attribute, Join, Mapping, with properties such as name, label, description, measures, details, default hierarchy, fact table, key attribute, label attribute, master, detail, alias, locale(s). Legend: localizable; required during cube creation and denormalisation; required by browser]
  26. Slicer: Cubes OLAP server
  27. [Architecture diagram: Application sends an HTTP request to Slicer and gets a JSON reply; Slicer uses the model and the Aggregation Browser; a cell (point of view) yields ∑ aggregates and facts (details)]
  28. GET /model { "cubes": { "contracts": { "measures": [ { "name": "amount", "label": "Contract amount" } ], "dimensions": ["date", "supplier", "process_type", "cpv"] } }, "dimensions": { "supplier": { ... } } ... }
  29. GET /aggregate: aggregate measures
  30. GET /aggregate: ∑ amount
  31. GET /aggregate { "drilldown": {}, "summary": { "record_count": 19278, "zmluva_hodnota_sum": 11222821530.12966 } }
  32. cut=...: slice and dice with the cut parameter
  33. GET /aggregate?cut=date:2010: ∑ amount, ✂ 2010
  34. GET /aggregate?cut=date:2010 (✂ 2010) { "drilldown": {}, "summary": { "date.year": 2011, "record_count": 64, "zmluva_hodnota_sum": 78717997.108 } }
  35. /aggregate?cut=date:2010|region:ee: ∑ amount, ✂ Estonia, ✂ 2010
  36. cut=date:2010|region:es: a cut is made of dimension points
  37. date:2010, region:ee: each dimension point is a dimension followed by a path
  38. date:2010,12: dimension date, path 2010,12 (year level, month level); hierarchies
  39. date:2010,12|category:it,sw|region:ee: contracts in December 2010 in Estonia for IT – Software; more hierarchies
  40. date:2010,12|category:it,sw|region:ee: pipe separates cuts, comma separates levels
  41. PUT /report: if you want multiple tables and charts with a single request
  42. drilldown=: get more details
  43. [Chart: ∑ amount for all years; drilldown=date: ∑ amount by year, 2006 to 2010]
  44. [Chart: all years; drilldown=date: by year, 2006 to 2010; cut=date:2010&drilldown=date: by month (Jan, Feb, Mar, Apr, May, ...), implicit hierarchy]
  45. report by year: drilldown=date; report by month: drilldown=date&cut=date:2010; report by day: drilldown=date&cut=date:2010,11
  46. more dimensions: drilldown=date,supplier for cross-tables
  47. Attributes and Measures: naming
  48. date.year, date.month, region.region_name, region.city_name, region.region_code, category.description (format: dimension.attribute)
  49. hierarchical attribute dependencies are implicit, defined in model
  50. there is no “date” data type; date components (year, month, day) are normal attributes of the date dimension
  51. received_amount_sum: measure name followed by aggregation; record_count
  52. page=...&order=: ordering and pagination
  53. page=3&pagesize=20: 3rd page with 20 results per page
  54. order=region.name: order by region name; order=amount:desc: order by amount descending
  55. Natural Ordering: attributes can have a default order specified in the model; order=year:asc... might be omitted if the model contains: {"name": "year", "order": "asc"}
  56. lang=: transparent localization
  57. Localization [diagram: steps 1, 2, 3]
  58. [Diagram: master model + translations]
  59. drilldown=type&lang=en, drilldown=type&lang=sk: report query is language independent
  60. Cubes: online analytical processing. github/bitbucket: Stiivi
  61. ☛ References and further reading: [star] Christopher Adamson: Star Schema, 2010
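
The query syntax from slides 32 to 55 (pipe-separated cuts, comma-separated hierarchy levels, plus drilldown, order, page, pagesize and lang) can be composed mechanically. Below is a minimal Python sketch of that composition. The helper names `build_cut`, `parse_cut` and `aggregate_url`, and the base URL in the usage line, are assumptions made for illustration; they are not part of the Cubes API, and the sketch only builds strings, it does not contact a Slicer server.

```python
from urllib.parse import urlencode


def build_cut(cuts):
    """Join {dimension: path} pairs into a cut string.

    Levels within a path are separated by commas, cuts by pipes,
    e.g. {"date": [2010, 12], "region": ["ee"]} -> "date:2010,12|region:ee".
    """
    parts = []
    for dimension, path in cuts.items():
        levels = ",".join(str(value) for value in path)
        parts.append(f"{dimension}:{levels}")
    return "|".join(parts)


def parse_cut(cut):
    """Split a cut string back into {dimension: [level values as strings]}."""
    result = {}
    for part in cut.split("|"):
        dimension, _, path = part.partition(":")
        result[dimension] = path.split(",")
    return result


def aggregate_url(base, cuts=None, drilldown=None, order=None,
                  page=None, pagesize=None, lang=None):
    """Build an /aggregate URL from the parameters shown on the slides."""
    params = {}
    if cuts:
        params["cut"] = build_cut(cuts)
    if drilldown:
        params["drilldown"] = ",".join(drilldown)  # several dimensions: cross-table
    if order:
        params["order"] = order                    # e.g. "amount:desc"
    if page is not None:
        params["page"] = page
    if pagesize is not None:
        params["pagesize"] = pagesize
    if lang:
        params["lang"] = lang
    # Keep ":", "," and "|" readable, as they are written on the slides.
    query = urlencode(params, safe=":,|")
    return f"{base}/aggregate?{query}" if query else f"{base}/aggregate"


if __name__ == "__main__":
    BASE = "http://localhost:5000"  # hypothetical Slicer address
    print(aggregate_url(BASE, cuts={"date": [2010]}, drilldown=["date"]))
```

Keeping a cut as a mapping from dimension to path until the last moment makes it easy to drill down one level further: append a value to the path (for example `[2010]` to `[2010, 11]`) and rebuild the URL.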