Cubes
                 Light-weight Online Analytical Processing




Stefan Urbanek
stefan.urbanek@gmail.com                                     April 2011
@Stiivi
Features

■   logical model (metadata)
■   aggregated data browsing
■   OLAP HTTP server with JSON interface
■   data and metadata localization
■   multiple backends
Introduction
cubes, facts and dimensions
[Figure: a data cube made up of data cells]
fact: most detailed information
measure: a measurable attribute of a fact

Fact examples:
  •   contract
  •   donation
  •   spending
  •   invoice
  •   project
  •   ...

Measure examples:
  •   contract amount
  •   revenue
  •   duration
  •   price with VAT




[Figure: a data cell located in the cube by three dimensions: location, type and time]
dimensions




■   provide context for facts
■   used to filter queries or reports
■   control scope of aggregation of facts
■   used for ordering or sorting
■   define master-detail relationships

                                                 ☛ [star]
hierarchies
[Figure: dimension hierarchies, for example region, and date with levels year (2010), month (May) and day (1st)]
[Figure: subject area, data mart]
Summary

■   fact – most detailed information for analysis
■   measure – an attribute for computation
■   dimension – context of facts
■   hierarchy – master-detail relationship
✂   Slicing and Dicing
[Figures: a cube with dimensions location, type and time, the time axis labelled 2006 to 2010, cut along one or more dimensions]

■   cut by time: spending in 2010
■   cut by location: contracts in Estonia (other locations shown: Czech Republic, Slovakia, Hungary, Poland)
■   cut by Estonia and 2010: contracts in Estonia in 2010
■   cut by type: IT contracts
■   cut by Estonia, 2010 and IT: IT contracts in Estonia in 2010

measures can be aggregated:

■   spending in 2010
■   revenue from IT projects
■   top 10 contractors
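
The slicing and aggregation ideas above can be sketched over an in-memory list of facts. This is only an illustration: the sample facts, their amounts, and the helper names slice_facts and aggregate are made up here and are not the Cubes API.

```python
# Hypothetical facts: each has dimension attributes and one measure (amount).
FACTS = [
    {"location": "Estonia", "type": "IT", "year": 2010, "amount": 10.0},
    {"location": "Estonia", "type": "IT", "year": 2009, "amount": 5.0},
    {"location": "Estonia", "type": "construction", "year": 2010, "amount": 20.0},
    {"location": "Slovakia", "type": "IT", "year": 2010, "amount": 7.0},
]


def slice_facts(facts, **cuts):
    """Keep only facts whose dimension values match every given cut."""
    return [f for f in facts if all(f[dim] == value for dim, value in cuts.items())]


def aggregate(facts, measure="amount"):
    """Aggregate a measure over the selected facts."""
    return {
        "record_count": len(facts),
        measure + "_sum": sum(f[measure] for f in facts),
    }


# One cut is a slice; several cuts together dice the cube.
spending_2010 = aggregate(slice_facts(FACTS, year=2010))
it_estonia_2010 = aggregate(slice_facts(FACTS, location="Estonia", type="IT", year=2010))
print(spending_2010)
print(it_estonia_2010)
```

Each keyword argument plays the role of one cut, so adding more of them narrows the cell that is aggregated.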
Drilling down
[Chart: ∑ amount for all years = 250]

drill down by date

[Chart: ∑ amount by year: 2006 = 30, 2007 = 50, 2008 = 40, 2009 = 70, 2010 = 60]

looking at more detailed level

■   top level: all years = 250
■   year level: 2006 = 30, 2007 = 50, 2008 = 40, 2009 = 70, 2010 = 60
■   month level: Jan = 3, Feb = 2, Mar = 4, Apr = 5, ...
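
Drilling down can be sketched as grouping facts one level below the current position in a hierarchy. The facts, the level names in DATE_LEVELS and the function drill_down below are assumptions made for illustration, not part of Cubes.

```python
from collections import defaultdict

# Levels of a hypothetical date hierarchy, from least to most detailed.
DATE_LEVELS = ["year", "month", "day"]

# Made-up facts; the amounts are illustrative only.
FACTS = [
    {"year": 2009, "month": 12, "day": 24, "amount": 70},
    {"year": 2010, "month": 1, "day": 5, "amount": 3},
    {"year": 2010, "month": 1, "day": 20, "amount": 4},
    {"year": 2010, "month": 2, "day": 1, "amount": 2},
]


def drill_down(facts, path=()):
    """Sum amounts one level below `path` in the date hierarchy.

    An empty path groups by year, (2010,) groups by month within 2010,
    and (2010, 1) groups by day within January 2010.
    """
    depth = len(path)
    if depth >= len(DATE_LEVELS):
        raise ValueError("path is already at the most detailed level")
    next_level = DATE_LEVELS[depth]
    totals = defaultdict(int)
    for fact in facts:
        # Keep only facts inside the cell described by the path.
        if all(fact[DATE_LEVELS[i]] == path[i] for i in range(depth)):
            totals[fact[next_level]] += fact["amount"]
    return dict(totals)


by_year = drill_down(FACTS)
by_month_2010 = drill_down(FACTS, (2010,))
print(by_year)
print(by_month_2010)
```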
Logical Model
description based on how you analyze data
user’s or analyst’s perspective: how data are being measured, aggregated and reported

[Figure: amount is measured, ∑ amount is aggregated, shares are reported in a pie chart (24%, 12%, 28%, 20%, 16%)]

abstraction over physical data: ✶ star or ❄ snowflake schema
Model

Legend:
"   localizable
!   required during cube creation and denormalisation
$   required by browser

Model: name, label ", description ", locale
Cube: name, label ", description ", measures, details, fact table !
Dimension: name, label ", description ", default hierarchy
Hierarchy: name, label "
Level: name, label ", key attribute, label attribute
Attribute: name, label ", description ", locales !$
Join !: master, detail, alias
Mapping !: master, detail
Slicer
Cubes OLAP server
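[Figure: an application sends an HTTP request to Slicer, which uses the model and returns a JSON reply]

Aggregation Browser
[Figure: for a cell (point of view) the browser returns ∑ aggregates and facts (details), using the model]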
GET /model
{
    "cubes": {
        "contracts": {
            "measures": [
                {
                    "name": "amount",
                    "label": "Contract amount"
                }
            ],
            "dimensions": ["date", "supplier", "process_type", "cpv"]
        }
    },
    "dimensions": {
        "supplier": {
            ...
        }
    }
    ...
}
GET /aggregate
aggregate measures: ∑ amount

GET /aggregate
{
    "drilldown": {},
    "summary": {
        "record_count": 19278,
        "zmluva_hodnota_sum": 11222821530.12966
    }
}
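
A client might read such a reply as sketched below. The reply text is the one shown above; the helper summary_average is a made-up name, and nothing here calls a real server.

```python
import json

# Reply body as shown on the slide.
REPLY = """
{
    "drilldown": {},
    "summary": {
        "record_count": 19278,
        "zmluva_hodnota_sum": 11222821530.12966
    }
}
"""


def summary_average(reply_text, sum_key):
    """Return the mean value per record from an /aggregate reply."""
    summary = json.loads(reply_text)["summary"]
    count = summary["record_count"]
    if count == 0:
        return None
    return summary[sum_key] / count


average = summary_average(REPLY, "zmluva_hodnota_sum")
print(round(average, 2))
```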




cut=...
slice and dice with cut parameter
[Figure: ∑ amount cut to the year 2010]

GET /aggregate?cut=date:2010
{
    "drilldown": {},
    "summary": {
        "date.year": 2010,
        "record_count": 64,
        "zmluva_hodnota_sum": 78717997.108
    }
}
[Figure: ∑ amount cut to Estonia and 2010]
/aggregate?cut=date:2010|region:ee
cut=date:2010|region:ee

dimension points

■   date:2010 (dimension: date, path: 2010)
■   region:ee (dimension: region, path: ee)
hierarchies

date:2010,12

■   dimension: date
■   path: 2010,12 (year level, then month level)

more hierarchies

date:2010,12|category:it,sw|region:ee
contracts in December 2010 in Estonia for IT – Software

cut

date:2010,12|category:it,sw|region:ee

■   pipe separates cuts
■   comma separates levels
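
The cut syntax described above (pipes between cuts, a colon between dimension and path, commas between levels) can be parsed in a few lines. The function names parse_cut and format_cut are hypothetical; this is a sketch of the string format only, not the server's own parser.

```python
def parse_cut(cut):
    """Parse a cut string into a {dimension: path} mapping.

    Pipes separate cuts, a colon separates the dimension from its path,
    and commas separate the levels of the path.
    """
    cuts = {}
    for part in cut.split("|"):
        dimension, _, path = part.partition(":")
        if not dimension or not path:
            raise ValueError("malformed cut: %r" % part)
        cuts[dimension] = path.split(",")
    return cuts


def format_cut(cuts):
    """Inverse of parse_cut: build the cut string from a mapping."""
    return "|".join(
        "%s:%s" % (dimension, ",".join(path)) for dimension, path in cuts.items()
    )


parsed = parse_cut("date:2010,12|category:it,sw|region:ee")
print(parsed)
print(format_cut(parsed))
```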
PUT /report
if you want multiple tables and charts with a single request
drilldown=
 get more details
[Chart: ∑ amount for all years = 250]

drilldown=date

[Chart: ∑ amount by year: 2006 = 30, 2007 = 50, 2008 = 40, 2009 = 70, 2010 = 60]

implicit hierarchy

■   no parameters: all years = 250
■   drilldown=date: amount by year, 2006 to 2010
■   cut=date:2010&drilldown=date: amount by month, Jan = 3, Feb = 2, Mar = 4, Apr = 5, ...
report by year:
        drilldown=date



report by month:
        drilldown=date&cut=date:2010


report by day:
        drilldown=date&cut=date:2010,11
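
The three reports above differ only in how deep the date cut goes, so the query strings can be built from the cut path. A minimal sketch, assuming the date hierarchy has the levels year, month and day; report_query and drilldown_level are made-up helper names.

```python
from urllib.parse import urlencode

# Assumed level names of the date hierarchy, most general first.
DATE_LEVELS = ["year", "month", "day"]


def report_query(date_path=()):
    """Build a query string that drills down one level below `date_path`."""
    params = {"drilldown": "date"}
    if date_path:
        params["cut"] = "date:" + ",".join(str(p) for p in date_path)
    # Keep ':' and ',' unescaped so the cut reads as on the slides.
    return urlencode(params, safe=":,")


def drilldown_level(date_path=()):
    """Name the level the report is grouped by for a given cut path."""
    return DATE_LEVELS[len(date_path)]


for path in [(), (2010,), (2010, 11)]:
    print(drilldown_level(path), "->", report_query(path))
```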
more dimensions:


drilldown=date,supplier


     for cross-tables
Attributes and Measures
         naming

attribute names have the form dimension.attribute:

■   date.year
■   date.month
■   region.region_name
■   region.city_name
■   region.region_code
■   category.description

hierarchical attribute dependencies are implicit: they are defined in the model

there is no “date” data type: date components (year, month, day) are normal attributes of the date dimension

aggregated measure names have the form measure_aggregation:

■   received_amount_sum (measure: received_amount, aggregation: sum)
■   record_count
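
The naming conventions above can be taken apart mechanically. The helpers split_attribute and split_aggregate are hypothetical, and only the sum suffix is assumed because it is the only aggregation shown in the slides.

```python
def split_attribute(name):
    """Split 'dimension.attribute' into its two parts."""
    dimension, _, attribute = name.partition(".")
    if not dimension or not attribute:
        raise ValueError("expected dimension.attribute, got %r" % name)
    return dimension, attribute


def split_aggregate(name, aggregations=("sum",)):
    """Split 'measure_aggregation' on the last underscore.

    Only the 'sum' suffix appears in the slides; other suffixes are
    not assumed here.
    """
    measure, _, aggregation = name.rpartition("_")
    if not measure or aggregation not in aggregations:
        raise ValueError("not an aggregated measure name: %r" % name)
    return measure, aggregation


print(split_attribute("region.region_name"))
print(split_aggregate("received_amount_sum"))
```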
page=...&order=
    ordering and pagination
page=3&pagesize=20


3rd page with 20 results per page
order=region.name
   order by region name



   order=amount:desc
order by amount descending
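
What the order, page and pagesize parameters ask for can be sketched on a list of rows. This assumes pages are numbered from 1, as the "3rd page" example suggests; the rows and the function names are made up for illustration.

```python
def parse_order(order):
    """Parse 'attr' or 'attr:desc' into (attribute, descending)."""
    attribute, _, direction = order.partition(":")
    direction = direction or "asc"
    if direction not in ("asc", "desc"):
        raise ValueError("unknown direction: %r" % direction)
    return attribute, direction == "desc"


def order_and_page(rows, order, page, pagesize):
    """Sort rows, then return one page; pages are numbered from 1 here."""
    attribute, descending = parse_order(order)
    ordered = sorted(rows, key=lambda row: row[attribute], reverse=descending)
    start = (page - 1) * pagesize
    return ordered[start:start + pagesize]


# Hypothetical result rows.
ROWS = [
    {"region.name": "Nitra", "amount": 5},
    {"region.name": "Bratislava", "amount": 9},
    {"region.name": "Trnava", "amount": 2},
]

top_two = order_and_page(ROWS, "amount:desc", page=1, pagesize=2)
second_page = order_and_page(ROWS, "region.name", page=2, pagesize=2)
print(top_two)
print(second_page)
```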
Natural Ordering
  attributes can have default order
          specified in model

           order=year:asc


... might be omitted if model contains:

{"name": "year", "order": "asc"}
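
One way to read natural ordering: an explicit direction in the request wins, otherwise the attribute's default from the model applies. The sketch below assumes attribute descriptions shaped like the entry above; effective_order is a made-up helper, not a Cubes function.

```python
# Hypothetical attribute descriptions, in the shape shown on the slide.
MODEL_ATTRIBUTES = [
    {"name": "year", "order": "asc"},
    {"name": "amount"},
]


def effective_order(attribute, requested=None, attributes=MODEL_ATTRIBUTES):
    """Return the direction to sort by, or None for no ordering.

    An explicit direction from the request wins; otherwise fall back to
    the attribute's default order from the model, if it has one.
    """
    if requested is not None:
        return requested
    for description in attributes:
        if description["name"] == attribute:
            return description.get("order")
    return None


print(effective_order("year"))
print(effective_order("year", "desc"))
print(effective_order("amount"))
```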
lang=
transparent localization
Localization
[Figure: three numbered steps; the master model is combined with translations]
drilldown=type&lang=en
  drilldown=type&lang=sk



report query is language independent
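
Combining a master model with translations can be pictured as overlaying translated labels on a copy of the master. The model fragment, the Slovak label and the function names below are all assumptions for illustration; the slides do not show the actual translation format.

```python
import copy

# Hypothetical master model and translation; only labels are translated.
MASTER_MODEL = {
    "dimensions": {
        "type": {"name": "type", "label": "Type"},
    }
}
TRANSLATIONS = {
    "sk": {"dimensions": {"type": {"label": "Typ"}}},
}


def merge(base, overrides):
    """Recursively overlay `overrides` on a copy of `base`."""
    result = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = value
    return result


def localized_model(lang):
    """Return the model for `lang`, falling back to the master model."""
    return merge(MASTER_MODEL, TRANSLATIONS.get(lang, {}))


print(localized_model("sk")["dimensions"]["type"])
print(localized_model("en")["dimensions"]["type"])
```

Names stay the same in every language, which is why the same query works with any lang value.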
Cubes
online analytical processing




  github/bitbucket: Stiivi
☛ References and further reading




[star] Christopher Adamson: Star Schema, 2010

More Related Content

What's hot

200981104 management-information-system-case-study
200981104 management-information-system-case-study200981104 management-information-system-case-study
200981104 management-information-system-case-studyhomeworkping4
 
Data mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationData mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationDr. Abdul Ahad Abro
 
Dimensional Modeling
Dimensional ModelingDimensional Modeling
Dimensional ModelingSunita Sahu
 
management information systems
management information systemsmanagement information systems
management information systemsNisha Choudhary
 
Information system ethics
Information system ethicsInformation system ethics
Information system ethicsKriscila Yumul
 
Lesson 2 network database system
Lesson 2 network database systemLesson 2 network database system
Lesson 2 network database systemGiO Friginal
 
Data Warehousing and Data Mining
Data Warehousing and Data MiningData Warehousing and Data Mining
Data Warehousing and Data Miningidnats
 
Information System Concepts & Types of Information Systems
Information System Concepts & Types of Information SystemsInformation System Concepts & Types of Information Systems
Information System Concepts & Types of Information SystemsVR Talsaniya
 
multi dimensional data model
multi dimensional data modelmulti dimensional data model
multi dimensional data modelmoni sindhu
 
FORMAL & INFORMAL INFORMATION SYSTEM
FORMAL & INFORMAL  INFORMATION SYSTEMFORMAL & INFORMAL  INFORMATION SYSTEM
FORMAL & INFORMAL INFORMATION SYSTEMZahid Parvez
 
Transaction processing system (TPS)
Transaction processing system (TPS)Transaction processing system (TPS)
Transaction processing system (TPS)Jaisha Jaikishan
 

What's hot (17)

200981104 management-information-system-case-study
200981104 management-information-system-case-study200981104 management-information-system-case-study
200981104 management-information-system-case-study
 
Data mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationData mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, Classification
 
Dimensional Modeling
Dimensional ModelingDimensional Modeling
Dimensional Modeling
 
management information systems
management information systemsmanagement information systems
management information systems
 
Web Mining
Web MiningWeb Mining
Web Mining
 
Information system ethics
Information system ethicsInformation system ethics
Information system ethics
 
Lesson 2 network database system
Lesson 2 network database systemLesson 2 network database system
Lesson 2 network database system
 
Gch2 l6
Gch2 l6Gch2 l6
Gch2 l6
 
Data Mining
Data MiningData Mining
Data Mining
 
Data Warehousing and Data Mining
Data Warehousing and Data MiningData Warehousing and Data Mining
Data Warehousing and Data Mining
 
Information System Concepts & Types of Information Systems
Information System Concepts & Types of Information SystemsInformation System Concepts & Types of Information Systems
Information System Concepts & Types of Information Systems
 
multi dimensional data model
multi dimensional data modelmulti dimensional data model
multi dimensional data model
 
Data mining in e commerce
Data mining in e commerceData mining in e commerce
Data mining in e commerce
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
FORMAL & INFORMAL INFORMATION SYSTEM
FORMAL & INFORMAL  INFORMATION SYSTEMFORMAL & INFORMAL  INFORMATION SYSTEM
FORMAL & INFORMAL INFORMATION SYSTEM
 
DEEP LEARNING IN HEALTHCARE
DEEP LEARNING IN HEALTHCAREDEEP LEARNING IN HEALTHCARE
DEEP LEARNING IN HEALTHCARE
 
Transaction processing system (TPS)
Transaction processing system (TPS)Transaction processing system (TPS)
Transaction processing system (TPS)
 

Viewers also liked

Data Warehouse and OLAP - Lear-Fabini
Data Warehouse and OLAP - Lear-FabiniData Warehouse and OLAP - Lear-Fabini
Data Warehouse and OLAP - Lear-FabiniScott Fabini
 
OLAP Cubes: Basic operations
OLAP Cubes: Basic operationsOLAP Cubes: Basic operations
OLAP Cubes: Basic operationsSthefan Berwanger
 
Processing Big Data in Realtime
Processing Big Data in RealtimeProcessing Big Data in Realtime
Processing Big Data in RealtimeTikal Knowledge
 
Open Data Decentralisation
Open Data DecentralisationOpen Data Decentralisation
Open Data DecentralisationStefan Urbanek
 
Nata de Coco Management Case
Nata de Coco Management CaseNata de Coco Management Case
Nata de Coco Management Caserogel84
 
Business Intelligence Presentation 1 (15th March'16)
Business Intelligence Presentation 1 (15th March'16)Business Intelligence Presentation 1 (15th March'16)
Business Intelligence Presentation 1 (15th March'16)Muhammad Fahad
 
Cubes – pluggable model explained
Cubes – pluggable model explainedCubes – pluggable model explained
Cubes – pluggable model explainedStefan Urbanek
 
Bubbles – Virtual Data Objects
Bubbles – Virtual Data ObjectsBubbles – Virtual Data Objects
Bubbles – Virtual Data ObjectsStefan Urbanek
 
Thesis: Slicing of Java Programs using the Soot Framework (2006)
Thesis:  Slicing of Java Programs using the Soot Framework (2006) Thesis:  Slicing of Java Programs using the Soot Framework (2006)
Thesis: Slicing of Java Programs using the Soot Framework (2006) Arvind Devaraj
 
Case Study Real Time Olap Cubes
Case Study Real Time Olap CubesCase Study Real Time Olap Cubes
Case Study Real Time Olap Cubesmister_zed
 
Open Source Monitoring Tools
Open Source Monitoring ToolsOpen Source Monitoring Tools
Open Source Monitoring Toolsm_richardson
 
Olap Cube Design
Olap Cube DesignOlap Cube Design
Olap Cube Designh1m
 
OLAP Cubes in Datawarehousing
OLAP Cubes in DatawarehousingOLAP Cubes in Datawarehousing
OLAP Cubes in DatawarehousingPrithwis Mukerjee
 
Design, Fabrication and Analysis of Crank and Slotted Lever Quick Return Mech...
Design, Fabrication and Analysis of Crank and Slotted Lever Quick Return Mech...Design, Fabrication and Analysis of Crank and Slotted Lever Quick Return Mech...
Design, Fabrication and Analysis of Crank and Slotted Lever Quick Return Mech...Mohammed Naseeruddin Shah
 
MS SQL SERVER: Olap cubes and data mining
MS SQL SERVER: Olap cubes and data miningMS SQL SERVER: Olap cubes and data mining
MS SQL SERVER: Olap cubes and data miningDataminingTools Inc
 
OLAP & DATA WAREHOUSE
OLAP & DATA WAREHOUSEOLAP & DATA WAREHOUSE
OLAP & DATA WAREHOUSEZalpa Rathod
 

Viewers also liked (19)

Data Warehouse and OLAP - Lear-Fabini
Data Warehouse and OLAP - Lear-FabiniData Warehouse and OLAP - Lear-Fabini
Data Warehouse and OLAP - Lear-Fabini
 
OLAP Cubes: Basic operations
OLAP Cubes: Basic operationsOLAP Cubes: Basic operations
OLAP Cubes: Basic operations
 
Processing Big Data in Realtime
Processing Big Data in RealtimeProcessing Big Data in Realtime
Processing Big Data in Realtime
 
Open Data Decentralisation
Open Data DecentralisationOpen Data Decentralisation
Open Data Decentralisation
 
Nata de Coco Management Case
Nata de Coco Management CaseNata de Coco Management Case
Nata de Coco Management Case
 
Business Intelligence Presentation 1 (15th March'16)
Business Intelligence Presentation 1 (15th March'16)Business Intelligence Presentation 1 (15th March'16)
Business Intelligence Presentation 1 (15th March'16)
 
Cubes – pluggable model explained
Cubes – pluggable model explainedCubes – pluggable model explained
Cubes – pluggable model explained
 
Bubbles – Virtual Data Objects
Bubbles – Virtual Data ObjectsBubbles – Virtual Data Objects
Bubbles – Virtual Data Objects
 
Thesis: Slicing of Java Programs using the Soot Framework (2006)
Thesis:  Slicing of Java Programs using the Soot Framework (2006) Thesis:  Slicing of Java Programs using the Soot Framework (2006)
Thesis: Slicing of Java Programs using the Soot Framework (2006)
 
Case Study Real Time Olap Cubes
Case Study Real Time Olap CubesCase Study Real Time Olap Cubes
Case Study Real Time Olap Cubes
 
Open Source Monitoring Tools
Open Source Monitoring ToolsOpen Source Monitoring Tools
Open Source Monitoring Tools
 
Olap Cube Design
Olap Cube DesignOlap Cube Design
Olap Cube Design
 
OLAP Cubes in Datawarehousing
OLAP Cubes in DatawarehousingOLAP Cubes in Datawarehousing
OLAP Cubes in Datawarehousing
 
OLAP
OLAPOLAP
OLAP
 
Design, Fabrication and Analysis of Crank and Slotted Lever Quick Return Mech...
Design, Fabrication and Analysis of Crank and Slotted Lever Quick Return Mech...Design, Fabrication and Analysis of Crank and Slotted Lever Quick Return Mech...
Design, Fabrication and Analysis of Crank and Slotted Lever Quick Return Mech...
 
MS SQL SERVER: Olap cubes and data mining
MS SQL SERVER: Olap cubes and data miningMS SQL SERVER: Olap cubes and data mining
MS SQL SERVER: Olap cubes and data mining
 
OLAP
OLAPOLAP
OLAP
 
Data cubes
Data cubesData cubes
Data cubes
 
OLAP & DATA WAREHOUSE
OLAP & DATA WAREHOUSEOLAP & DATA WAREHOUSE
OLAP & DATA WAREHOUSE
 

Similar to Lightweight Online Analytical Processing Cubes

Phil Mechanisms for e-Gov, ICT Devt, Innovation and Entrepreneurship
Phil Mechanisms for e-Gov, ICT Devt, Innovation and EntrepreneurshipPhil Mechanisms for e-Gov, ICT Devt, Innovation and Entrepreneurship
Phil Mechanisms for e-Gov, ICT Devt, Innovation and EntrepreneurshipAlejandro Melchor III
 
Part 2 of the webinar - Which freaking database should I use?
Part 2 of the webinar - Which freaking database should I use?Part 2 of the webinar - Which freaking database should I use?
Part 2 of the webinar - Which freaking database should I use?Dipti Borkar
 
What Can Data Tell Us?
What Can Data Tell Us?What Can Data Tell Us?
What Can Data Tell Us?OReillyTOC
 
Wikipedia ws
Wikipedia wsWikipedia ws
Wikipedia wsYu Suzuki
 
Managing a Website Performance Optimization (WPO) Project
Managing a Website Performance Optimization (WPO) ProjectManaging a Website Performance Optimization (WPO) Project
Managing a Website Performance Optimization (WPO) ProjectYottaa
 
Sql 2008 and project server 2010
Sql 2008 and project server 2010Sql 2008 and project server 2010
Sql 2008 and project server 2010Eduardo Castro
 
Lattes and other Brazilian ST&I platforms
Lattes and other Brazilian ST&I platformsLattes and other Brazilian ST&I platforms
Lattes and other Brazilian ST&I platformsRoberto C. S. Pacheco
 
A short introduction of D3js
A short introduction of D3jsA short introduction of D3js
A short introduction of D3jsdreampuf
 
Lean principles and practices
Lean principles and practicesLean principles and practices
Lean principles and practicesJelle Bens
 
Strathcona update sept.20 2010
Strathcona  update sept.20 2010Strathcona  update sept.20 2010
Strathcona update sept.20 2010Gerry Gabinet
 
Ukázkový report - SolidVision
Ukázkový report - SolidVisionUkázkový report - SolidVision
Ukázkový report - SolidVision3dskenovani
 
Bratislava Airport City
Bratislava Airport CityBratislava Airport City
Bratislava Airport Cityreformcapital
 
The State of Distress in Chicago Area Real Estate
The State of Distress in Chicago Area Real EstateThe State of Distress in Chicago Area Real Estate
The State of Distress in Chicago Area Real EstateRyan Slack
 
State of chicago area distres -matthew anderson
State of chicago area distres -matthew andersonState of chicago area distres -matthew anderson
State of chicago area distres -matthew andersonguest381588
 

Similar to Lightweight Online Analytical Processing Cubes (17)

Phil Mechanisms for e-Gov, ICT Devt, Innovation and Entrepreneurship
Phil Mechanisms for e-Gov, ICT Devt, Innovation and EntrepreneurshipPhil Mechanisms for e-Gov, ICT Devt, Innovation and Entrepreneurship
Phil Mechanisms for e-Gov, ICT Devt, Innovation and Entrepreneurship
 
Part 2 of the webinar - Which freaking database should I use?
Part 2 of the webinar - Which freaking database should I use?Part 2 of the webinar - Which freaking database should I use?
Part 2 of the webinar - Which freaking database should I use?
 
What Can Data Tell Us?
What Can Data Tell Us?What Can Data Tell Us?
What Can Data Tell Us?
 
International Marginal Tax Rates (three graphs)
International Marginal Tax Rates (three graphs)International Marginal Tax Rates (three graphs)
International Marginal Tax Rates (three graphs)
 
Wikipedia ws
Wikipedia wsWikipedia ws
Wikipedia ws
 
Managing a Website Performance Optimization (WPO) Project
Managing a Website Performance Optimization (WPO) ProjectManaging a Website Performance Optimization (WPO) Project
Managing a Website Performance Optimization (WPO) Project
 
Information Aesthetics
Information AestheticsInformation Aesthetics
Information Aesthetics
 
Intel ppt
Intel pptIntel ppt
Intel ppt
 
Sql 2008 and project server 2010
Sql 2008 and project server 2010Sql 2008 and project server 2010
Sql 2008 and project server 2010
 
Lattes and other Brazilian ST&I platforms
Lattes and other Brazilian ST&I platformsLattes and other Brazilian ST&I platforms
Lattes and other Brazilian ST&I platforms
 
A short introduction of D3js
A short introduction of D3jsA short introduction of D3js
A short introduction of D3js
 
Lean principles and practices
Lean principles and practicesLean principles and practices
Lean principles and practices
 
Strathcona update sept.20 2010
Strathcona  update sept.20 2010Strathcona  update sept.20 2010
Strathcona update sept.20 2010
 
Ukázkový report - SolidVision
Ukázkový report - SolidVisionUkázkový report - SolidVision
Ukázkový report - SolidVision
 
Bratislava Airport City
Bratislava Airport CityBratislava Airport City
Bratislava Airport City
 
The State of Distress in Chicago Area Real Estate
The State of Distress in Chicago Area Real EstateThe State of Distress in Chicago Area Real Estate
The State of Distress in Chicago Area Real Estate
 
State of chicago area distres -matthew anderson
State of chicago area distres -matthew andersonState of chicago area distres -matthew anderson
State of chicago area distres -matthew anderson
 

More from Stefan Urbanek

Forces and Threats in a Data Warehouse (and why metadata and architecture is ...
Forces and Threats in a Data Warehouse (and why metadata and architecture is ...Forces and Threats in a Data Warehouse (and why metadata and architecture is ...
Forces and Threats in a Data Warehouse (and why metadata and architecture is ...Stefan Urbanek
 
New york data brewery meetup #1 – introduction
New york data brewery meetup #1 – introductionNew york data brewery meetup #1 – introduction
New york data brewery meetup #1 – introductionStefan Urbanek
 
Cubes – ways of deployment
Cubes – ways of deploymentCubes – ways of deployment
Cubes – ways of deploymentStefan Urbanek
 
Knowledge Management Lecture 4: Models
Knowledge Management Lecture 4: ModelsKnowledge Management Lecture 4: Models
Knowledge Management Lecture 4: ModelsStefan Urbanek
 
Dallas Data Brewery Meetup #2: Data Quality Perception
Dallas Data Brewery Meetup #2: Data Quality PerceptionDallas Data Brewery Meetup #2: Data Quality Perception
Dallas Data Brewery Meetup #2: Data Quality PerceptionStefan Urbanek
 
Dallas Data Brewery - introduction
Dallas Data Brewery - introductionDallas Data Brewery - introduction
Dallas Data Brewery - introductionStefan Urbanek
 
Python business intelligence (PyData 2012 talk)
Python business intelligence (PyData 2012 talk)Python business intelligence (PyData 2012 talk)
Python business intelligence (PyData 2012 talk)Stefan Urbanek
 
Cubes - Lightweight Python OLAP (EuroPython 2012 talk)
Cubes - Lightweight Python OLAP (EuroPython 2012 talk)Cubes - Lightweight Python OLAP (EuroPython 2012 talk)
Cubes - Lightweight Python OLAP (EuroPython 2012 talk)Stefan Urbanek
 
Knowledge Management Lecture 3: Cycle
Knowledge Management Lecture 3: CycleKnowledge Management Lecture 3: Cycle
Knowledge Management Lecture 3: CycleStefan Urbanek
 
Knowledge Management Lecture 2: Individuals, communities and organizations
Knowledge Management Lecture 2: Individuals, communities and organizationsKnowledge Management Lecture 2: Individuals, communities and organizations
Knowledge Management Lecture 2: Individuals, communities and organizationsStefan Urbanek
 
Knowledge Management Lecture 1: definition, history and presence
Knowledge Management Lecture 1: definition, history and presenceKnowledge Management Lecture 1: definition, history and presence
Knowledge Management Lecture 1: definition, history and presenceStefan Urbanek
 
Open spending as-is 2011-06
Open spending   as-is 2011-06Open spending   as-is 2011-06
Open spending as-is 2011-06Stefan Urbanek
 
Data Cleansing introduction (for BigClean Prague 2011)
Data Cleansing introduction (for BigClean Prague 2011)Data Cleansing introduction (for BigClean Prague 2011)
Data Cleansing introduction (for BigClean Prague 2011)Stefan Urbanek
 
Knowledge Management Introduction
Knowledge Management IntroductionKnowledge Management Introduction
Knowledge Management IntroductionStefan Urbanek
 

More from Stefan Urbanek (17)

StepTalk Introduction
StepTalk IntroductionStepTalk Introduction
StepTalk Introduction
 
Forces and Threats in a Data Warehouse (and why metadata and architecture is ...
Forces and Threats in a Data Warehouse (and why metadata and architecture is ...Forces and Threats in a Data Warehouse (and why metadata and architecture is ...
Forces and Threats in a Data Warehouse (and why metadata and architecture is ...
 
Sepro - introduction
Sepro - introductionSepro - introduction
Sepro - introduction
 
New york data brewery meetup #1 – introduction
New york data brewery meetup #1 – introductionNew york data brewery meetup #1 – introduction
New york data brewery meetup #1 – introduction
 
Cubes 1.0 Overview
Cubes 1.0 OverviewCubes 1.0 Overview
Cubes 1.0 Overview
 
Cubes – ways of deployment
Cubes – ways of deploymentCubes – ways of deployment
Cubes – ways of deployment
 
Knowledge Management Lecture 4: Models
Knowledge Management Lecture 4: ModelsKnowledge Management Lecture 4: Models
Knowledge Management Lecture 4: Models
 
Dallas Data Brewery Meetup #2: Data Quality Perception
Dallas Data Brewery Meetup #2: Data Quality PerceptionDallas Data Brewery Meetup #2: Data Quality Perception
Dallas Data Brewery Meetup #2: Data Quality Perception
 
Dallas Data Brewery - introduction
Dallas Data Brewery - introductionDallas Data Brewery - introduction
Dallas Data Brewery - introduction
 
Python business intelligence (PyData 2012 talk)
Python business intelligence (PyData 2012 talk)Python business intelligence (PyData 2012 talk)
Python business intelligence (PyData 2012 talk)
 
Cubes - Lightweight Python OLAP (EuroPython 2012 talk)
Cubes - Lightweight Python OLAP (EuroPython 2012 talk)Cubes - Lightweight Python OLAP (EuroPython 2012 talk)
Cubes - Lightweight Python OLAP (EuroPython 2012 talk)
 
Knowledge Management Lecture 3: Cycle
Knowledge Management Lecture 3: CycleKnowledge Management Lecture 3: Cycle
Knowledge Management Lecture 3: Cycle
 
Knowledge Management Lecture 2: Individuals, communities and organizations
Knowledge Management Lecture 2: Individuals, communities and organizationsKnowledge Management Lecture 2: Individuals, communities and organizations
Knowledge Management Lecture 2: Individuals, communities and organizations
 
Knowledge Management Lecture 1: definition, history and presence
Knowledge Management Lecture 1: definition, history and presenceKnowledge Management Lecture 1: definition, history and presence
Knowledge Management Lecture 1: definition, history and presence
 
Open spending as-is 2011-06
Open spending   as-is 2011-06Open spending   as-is 2011-06
Open spending as-is 2011-06
 
Data Cleansing introduction (for BigClean Prague 2011)
Data Cleansing introduction (for BigClean Prague 2011)Data Cleansing introduction (for BigClean Prague 2011)
Data Cleansing introduction (for BigClean Prague 2011)
 
Knowledge Management Introduction
Knowledge Management IntroductionKnowledge Management Introduction
Knowledge Management Introduction
 

Lightweight Online Analytical Processing Cubes

  • 1. Cubes: Lightweight Online Analytical Processing. Stefan Urbanek, stefan.urbanek@gmail.com, @Stiivi, April 2011
  • 2. Features ■ logical model (metadata) ■ aggregated data browsing ■ OLAP HTTP server with JSON interface ■ data and metadata localization ■ multiple backends
  • 4. data cube and data cell
  • 5. fact (data cell): most detailed information; fact examples: contract, donation, spending, invoice, project, ... measure: measurable attribute of a fact; measure examples: contract amount, revenue, duration, price with VAT
  • 6. dimensions: location, type, time
  • 7. dimensions (type, location, time) ■ provide context for facts ■ used to filter queries or reports ■ control scope of aggregation of facts ■ used for ordering or sorting ■ define master-detail relationships ☛ [star]
  • 8. hierarchies: date (2010, May, 1st), region
  • 9. data mart: subject area
  • 10. Summary ■ fact – most detailed information for analysis ■ measure – an attribute for computation ■ dimension – context of facts ■ hierarchy – master-detail relationship
  • 11. ✂ ✂ Slicing and Dicing
  • 12. spending in 2010: one slice of the cube (dimensions: location, type, time) along the time axis (2006 to 2010)
  • 13. contracts in Estonia: one slice along the location axis (Estonia, Czech Republic, Slovakia, Hungary, Poland)
  • 14. ✂ ✂ contracts in Estonia in 2010: two cuts (location: Estonia, time: 2010)
  • 15. IT contracts: one slice along the type axis
  • 16. ✂ ✂ ✂ IT contracts in Estonia in 2010: three cuts (type: IT, location: Estonia, time: 2010)
  • 17. measures can be aggregated: spending in 2010, revenue from IT projects, top 10 contractors
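
The slicing and dicing idea can be illustrated with a minimal sketch in plain Python. This is not the Cubes API: the fact rows, the attribute names (`year`, `region`, `type`, `amount`) and the helpers `slice_facts` and `aggregate` are all hypothetical, and the values are made up for illustration.

```python
# Hypothetical in-memory fact table; every value here is illustrative only.
facts = [
    {"year": 2009, "region": "ee", "type": "it", "amount": 30.0},
    {"year": 2010, "region": "ee", "type": "it", "amount": 20.0},
    {"year": 2010, "region": "ee", "type": "construction", "amount": 15.0},
    {"year": 2010, "region": "sk", "type": "it", "amount": 25.0},
]


def slice_facts(rows, **cuts):
    """Keep only the facts whose dimension values match every cut."""
    return [
        row for row in rows
        if all(row[dimension] == value for dimension, value in cuts.items())
    ]


def aggregate(rows, measure="amount"):
    """Aggregate a measure over the selected facts (count and sum)."""
    rows = list(rows)
    return {
        "record_count": len(rows),
        measure + "_sum": sum(row[measure] for row in rows),
    }


# One cut: spending in 2010.
spending_2010 = aggregate(slice_facts(facts, year=2010))

# Three cuts: IT contracts in Estonia in 2010.
it_estonia_2010 = aggregate(slice_facts(facts, year=2010, region="ee", type="it"))

print(spending_2010)
print(it_estonia_2010)
```

Each additional cut narrows the cell, and the same aggregation is applied to whatever facts remain.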
  • 19. drill down by date [chart: ∑ amount for all years (250), broken down into yearly bars for 2006 to 2010]
  • 20. looking at a more detailed level
  • 21. [chart: top level: all years (250); year level: 2006 to 2010; month level: Jan, Feb, Mar, Apr, May, ...]
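
Drilling down through the date hierarchy amounts to grouping the same facts by a longer and longer prefix of the hierarchy path. The following is a rough sketch under stated assumptions: the fact tuples and the `drill_down` helper are hypothetical and are not taken from Cubes.

```python
from collections import defaultdict

# Hypothetical facts as (year, month, amount); the values are illustrative.
facts = [
    (2009, 12, 40.0),
    (2010, 1, 3.0),
    (2010, 2, 4.0),
    (2010, 2, 1.0),
]


def drill_down(rows, depth):
    """Sum amounts grouped by the first `depth` levels of the date hierarchy.

    depth=0 is the top level (all years), 1 the year level, 2 the month level.
    """
    totals = defaultdict(float)
    for year, month, amount in rows:
        path = (year, month)[:depth]
        totals[path] += amount
    return dict(totals)


top_level = drill_down(facts, 0)
year_level = drill_down(facts, 1)
month_level = drill_down(facts, 2)

print(top_level)
print(year_level)
print(month_level)
```

The totals at each level add up to the total of the level above, which is what makes moving between levels consistent.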
  • 22. Logical Model: description based on how you analyze data
  • 23. Logical Model: the user’s or analyst’s perspective on how data are measured, aggregated and reported
  • 24. abstraction over physical data: ✶ star or ❄ snowflake schema
  • 25. Model diagram. Legend: " localizable; ! required during cube creation and denormalisation; $ required by browser. Model: name, label ", description ", locale. Cube: name, label ", description ", measures, details, fact table !. Dimension: name, label ", description ", default hierarchy. Hierarchy: name, label ". Level: name, label ", key attribute, label attribute !. Attribute: name, label ", description ", locales !$. Join !: master, detail, alias. Mapping !
  • 27. Architecture: the application sends an HTTP request to the Slicer and receives a JSON reply; the Slicer uses the Aggregation Browser and the model to return ∑ aggregates and facts (details) for a cell (point of view)
  • 28. GET /model { "cubes": { "contracts": { "measures": [ { "name": "amount", "label": "Contract amount" } ], "dimensions": ["date", "supplier", "process_type", "cpv"] } }, "dimensions": { "supplier": { ... } } ... }
  • 29. GET /aggregate: aggregate measures
  • 30. GET /aggregate: ∑ amount over the whole cube
  • 31. GET /aggregate reply: { "drilldown": {}, "summary": { "record_count": 19278, "zmluva_hodnota_sum": 11222821530.12966 } }
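
A client can read such a reply with the standard `json` module. The sketch below parses the reply body shown on the slide; treating every key that ends in `_sum` as an aggregated measure is an assumption made for illustration, not documented server behaviour.

```python
import json

# Reply body as shown on the GET /aggregate slide.
reply_body = """
{ "drilldown": {},
  "summary": { "record_count": 19278,
               "zmluva_hodnota_sum": 11222821530.12966 } }
"""

reply = json.loads(reply_body)
summary = reply["summary"]

# Assumption: aggregated measures are the summary keys ending in "_sum".
sums = {key: value for key, value in summary.items() if key.endswith("_sum")}

# Average contract value derived from the two aggregates in the reply.
average = summary["zmluva_hodnota_sum"] / summary["record_count"]

print(sums)
print(round(average, 2))
```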
  • 32. cut=...: slice and dice with the cut parameter
  • 33. ∑ amount ✂ 2010: GET /aggregate?cut=date:2010
  • 34. ✂ 2010: GET /aggregate?cut=date:2010 reply: { "drilldown": {}, "summary": { "date.year": 2011, "record_count": 64, "zmluva_hodnota_sum": 78717997.108 } }
  • 35. ∑ amount ✂ 2010 ✂ Estonia: /aggregate?cut=date:2010|region:ee
  • 36. cut=date:2010|region:ee: a cut is a list of dimension points separated by |
  • 37. dimension points: date:2010 and region:ee, each written as dimension:path
  • 38. hierarchies: in date:2010,12 the path lists the year level (2010) and then the month level (12)
  • 39. more hierarchies: date:2010,12|category:it,sw|region:ee selects contracts in December 2010 in Estonia for IT – Software
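
The cut syntax from the slides (dimension points separated by `|`, each written as `dimension:path`, with path levels separated by `,`) can be split apart in a few lines. The `parse_cut` helper below is a hypothetical sketch of that reading of the syntax, not the server's own parser, and it ignores escaping and error handling.

```python
def parse_cut(cut):
    """Split a cut string into {dimension: [path level values]}.

    Hypothetical helper: follows only the syntax shown on the slides.
    """
    points = {}
    for point in cut.split("|"):
        dimension, _, path = point.strip().partition(":")
        points[dimension] = path.split(",")
    return points


cuts = parse_cut("date:2010,12|category:it,sw|region:ee")
print(cuts)
```

Path values stay strings here; converting `2010` or `12` to numbers would depend on how the model defines the level keys.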
  • 41. PUT /report: use it if you want multiple tables and charts with a single request
  • 43. drilldown=date [chart: ∑ amount for all years (250), broken down into yearly bars for 2006 to 2010]
  • 44. implicit hierarchy: drilldown=date gives the year level (2006 to 2010); cut=date:2010&drilldown=date gives the month level of 2010 (Jan, Feb, Mar, Apr, May, ...)
  • 45. report by year: drilldown=date; report by month: drilldown=date&cut=date:2010; report by day: drilldown=date&cut=date:2010,11
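
The three report requests differ only in their query string, so a client can build them from one helper. This is a sketch: the base URL `http://localhost:5000` and the function `aggregate_url` are assumptions for illustration, while the `drilldown` and `cut` parameter names come from the slides.

```python
from urllib.parse import urlencode


def aggregate_url(base, drilldown=None, cut=None):
    """Build an /aggregate URL; `base` is a hypothetical server address."""
    params = {}
    if drilldown:
        params["drilldown"] = drilldown
    if cut:
        params["cut"] = cut
    # Keep ':' ',' and '|' readable instead of percent-encoding them.
    query = urlencode(params, safe=":,|")
    return base + "/aggregate" + ("?" + query if query else "")


BASE = "http://localhost:5000"  # assumption, not from the slides

by_year = aggregate_url(BASE, drilldown="date")
by_month = aggregate_url(BASE, drilldown="date", cut="date:2010")
by_day = aggregate_url(BASE, drilldown="date", cut="date:2010,11")

for url in (by_year, by_month, by_day):
    print(url)
```

Each longer cut path moves the drilldown one level deeper in the date hierarchy.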
  • 48. attribute references have the form dimension.attribute: date.year, date.month, region.region_name, region.city_name, region.region_code, category.description
  • 49. hierarchical attribute dependencies are implicit: they are defined in the model
  • 50. there is no “date” data type: date components (year, month, day) are normal attributes of the date dimension
  • 51. aggregated measures are named measure_aggregation, for example received_amount_sum; record_count is the number of aggregated records
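
Because date components are ordinary attributes of the date dimension, a calendar date has to be broken into those attributes before it can be used in cuts or drilldowns. A minimal sketch, assuming the attribute names `date.year`, `date.month` and `date.day` and a hypothetical helper `date_attributes`:

```python
from datetime import date


def date_attributes(day):
    """Turn a calendar date into plain attributes of a date dimension."""
    return {
        "date.year": day.year,
        "date.month": day.month,
        "date.day": day.day,
    }


def split_reference(reference):
    """Split 'dimension.attribute' into its two parts."""
    dimension, _, attribute = reference.partition(".")
    return dimension, attribute


row = date_attributes(date(2010, 5, 1))
print(row)
print(split_reference("region.region_name"))
```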
  • 52. page=...&order=...: ordering and pagination
  • 53. page=3&pagesize=20: 3rd page with 20 results per page
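
Pagination with `page` and `pagesize` reduces to computing an offset into the result list. The sketch below assumes 1-based page numbers, as the wording "3rd page" for `page=3` suggests; the `paginate` helper is hypothetical and the real server may count pages differently.

```python
def paginate(rows, page, pagesize):
    """Return one page of rows.

    Assumption: pages are numbered from 1, so page=3 is the 3rd page.
    """
    start = (page - 1) * pagesize
    return rows[start:start + pagesize]


results = list(range(1, 101))  # 100 illustrative result rows
third_page = paginate(results, page=3, pagesize=20)
print(third_page[0], third_page[-1], len(third_page))
```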
  • 54. order=region.name: order by region name; order=amount:desc: order by amount descending
  • 55. Natural Ordering: attributes can have a default order specified in the model; order=year:asc might be omitted if the model contains: {"name": "year", "order": "asc"}
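
One way to read the ordering rules is that an explicit `order` parameter wins, and the order declared in the model is the fallback. The sketch below follows that reading; the helpers `resolve_order` and `sort_rows`, the comma-separated list of several attributes, and the sample rows are all assumptions made for illustration.

```python
def resolve_order(order, natural_order):
    """Return [(attribute, direction)] from an order string.

    `natural_order` maps attribute -> direction as declared in the model.
    Assumption: several attributes may be given separated by commas.
    """
    if not order:
        return list(natural_order.items())
    resolved = []
    for item in order.split(","):
        attribute, _, direction = item.partition(":")
        resolved.append((attribute, direction or natural_order.get(attribute, "asc")))
    return resolved


def sort_rows(rows, order=None, natural_order=None):
    """Sort rows by the resolved order; the first attribute is most significant."""
    keys = resolve_order(order, natural_order or {})
    # Python's sort is stable, so sorting by the last key first gives
    # the first key the highest priority.
    for attribute, direction in reversed(keys):
        rows = sorted(rows, key=lambda row: row[attribute], reverse=(direction == "desc"))
    return rows


rows = [
    {"year": 2009, "amount": 30.0},
    {"year": 2010, "amount": 70.0},
    {"year": 2008, "amount": 60.0},
]

by_amount = sort_rows(rows, order="amount:desc")
by_default = sort_rows(rows, natural_order={"year": "asc"})
print([row["amount"] for row in by_amount])
print([row["year"] for row in by_default])
```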
  • 57. Localization
  • 58. master model + translations
  • 60. Cubes: online analytical processing. github/bitbucket: Stiivi
  • 61. ☛ References and further reading [star] Christopher Adamson: Star Schema, 2010