Real World Business Intelligence
     and Data Warehousing
           Dr. Thomas Zurek
            January 2012
Agenda

1. Business Intelligence and Data Warehouses

      definition

      examples

2. What are the Challenges?

3. SQL and OLAP

4. What SAP does …

5. Take Aways
Examples of Business Intelligence Scenarios

 ▪ fraud detection
    • retail company
    • point-of-sales data & given discounts
    • huge amounts of data
    • a prototypical BI question
    • screencam
 ▪ production analysis
    • solar power production
 ▪ long tail analysis
    • e-commerce companies like Amazon, eBay, iTunes, Netflix, …
    • translate sales of popular products into (additional) sales in the long tail
    • BI integrated into operational processes
Long Tail Analysis (1) – An Example from Amazon
Long Tail Analysis (2)
 Source: Chris Anderson, The Long Tail, Wired, October 2004, http://www.wired.com/wired/archive/12.10/tail.html
Long Tail Analysis (3)
 Source: Chris Anderson, The Long Tail, Wired, October 2004, http://www.wired.com/wired/archive/12.10/tail.html
Business Intelligence and Data Warehouses

• Business Intelligence
  An environment in which business users conduct analyses that yield an overall
  understanding of
   ✓ where the business has been,
   ✓ where it is now, and
   ✓ where it will be in the near future (i.e. planning, predictive).

• Data Warehouse
   ➢ An implementation of an informational database used to collect, integrate
     and provide sharable data sourced from multiple operational databases for
     analyses.
   ➢ Provides data that is reliable, consistent, and understandable.
   ➢ It typically serves as the foundation for a business intelligence system.
A Typical Data Warehouse Architecture

Layers, from end-user access (top) down to the source systems (bottom):

 End-user access / Presentation

 Reporting / Analyses / Planning (BI Layer)
   Main Service : Make data available for reporting & planning tools
   Transform    : Application specific / (dis-)aggregate / lookup
   Content      : Application specific
   History      : Application specific
   Store        : IC, DSO, Info Set, Virtual Provider, Multi Provider

 Data Propagation
   Main Service : Spot for apps / Delta to app / App recovery
   Transform    : Enriched || General business logic
   Content      : Data source || Business domain specific
   History      : Determined by rebuild requirements of apps
   Store        : DSO (can be logically partitioned)

 Harmonization
   Main Service : Integrated, harmonized
   Transform    : Harmonize, quality assure (in flow || lookup)
   Content      : Defined fields
   History      : Short or not at all || Long term
   Store        : Info source || IO / DSO / Z-table

 Data Acquisition
   Main Service : Decouple, fast load and distribute
   Transform    : 1:1
   Content      : 1 data source, all fields
   History      : 4 weeks
   Store        : PSA, DSO-WO

 Source 1   Source 2   Source 3   Source 4   Source 5

Other labels in the diagram: Data Warehouse, ODS, Corp. Memory, Business transform,
Provide data, Project Governance, IT Governance.
Main Challenges in the Data Warehousing Layer
 ▪ physical connectivity to source systems
    • many protocols
    • many formats, code pages, Unicode / non-Unicode
    • network quality
    • source system dependency (down times, peak times, …)
 ▪ transformation, cleansing, scrubbing
    • Jun 1, 2011 = 1.6.2011 = 06/01/11 = …
    • VW Touareg = VW TOUAREG = *product+ 87654 = …
    • currency and unit conversions: e.g. box ↔ kg
    • resolve ID clashes: e.g. same product no. used in different subsidiaries
    • enrich data: add attributes from source A to data from source B
 ▪ consistency, integrity, compliance
    • create one version of the truth
    • track data flows; know where the data originated ("data provenance")
    • keep log and other change information for audits
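The cleansing bullets above can be illustrated with a small sketch. It is only an illustration, not how any particular ETL tool works: the list of accepted date formats, the function names, and the subsidiary codes `DE01` / `US01` are assumptions made for this example.

```python
from datetime import date, datetime

# Candidate source formats, tried in order. The list and its order are
# assumptions for this sketch; "%b" also assumes English month names.
DATE_FORMATS = ("%b %d, %Y", "%d.%m.%Y", "%m/%d/%y")


def normalize_date(raw: str) -> date:
    """Parse one of the known source formats into a single date value."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue  # try the next candidate format
    raise ValueError(f"unrecognized date format: {raw!r}")


def normalize_name(raw: str) -> str:
    """Collapse case and whitespace so spelling variants compare equal."""
    return " ".join(raw.split()).casefold()


def qualified_product_id(subsidiary: str, product_no: str) -> str:
    """Prefix a product number with its source to avoid ID clashes."""
    return f"{subsidiary}:{product_no}"


if __name__ == "__main__":
    for raw in ("Jun 1, 2011", "1.6.2011", "06/01/11"):
        print(raw, "->", normalize_date(raw).isoformat())
    print(normalize_name("VW Touareg") == normalize_name("VW TOUAREG"))
    # Hypothetical subsidiary codes; the same product number stays distinct.
    print(qualified_product_id("DE01", "87654"), qualified_product_id("US01", "87654"))
```

Note that a string such as 06/01/11 is ambiguous on its own; the sketch reads it as month/day/year only because the slide equates it with Jun 1, 2011.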
Main Challenges in the BI Layer
 ▪ calculations
    • aggregation of facts: SUM, MIN, MAX, AVG, COUNT, COUNT DISTINCT, …
    • formulas: e.g. revenue per employee, profitability, …
    • multi-dimensionality: e.g. time – region – product – sales org
    • hierarchies: versioning, logic, various types of hierarchies
    • currency and unit conversions
    • exceptions: e.g. "good": revenue > 1 million, "bad": revenue < 500,000
 ▪ security
 ▪ performance
    • use efficient data structures
    • caching
    • precalculation
 ▪ planning
    • actuals (read-only) vs plan data
    • planning session / transaction
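As a hedged illustration of the "formulas" and "exceptions" bullets, the sketch below computes revenue per employee after aggregation and classifies revenue against the thresholds named on the slide. The `OrgUnit` type, the sample figures, and the "neutral" label for values between the thresholds are assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class OrgUnit:
    name: str
    revenue: float  # assumed to be in one common currency
    employees: int


def revenue_per_employee(units) -> float:
    """Formula key figure: aggregate both operands first, then divide."""
    total_revenue = sum(u.revenue for u in units)
    total_employees = sum(u.employees for u in units)
    return total_revenue / total_employees


def exception_status(revenue, good_above=1_000_000, bad_below=500_000) -> str:
    """Classify revenue with the slide's thresholds; 'neutral' is assumed."""
    if revenue > good_above:
        return "good"
    if revenue < bad_below:
        return "bad"
    return "neutral"


# Hypothetical sample data.
UNITS = [OrgUnit("A", 1_200_000, 10), OrgUnit("B", 400_000, 40)]

if __name__ == "__main__":
    print(revenue_per_employee(UNITS))  # ratio of the totals
    print(sum(u.revenue / u.employees for u in UNITS) / len(UNITS))  # mean of ratios differs
    for unit in UNITS:
        print(unit.name, exception_status(unit.revenue))
```

The two printed ratios differ, which is why a formula usually has to be evaluated on each aggregation level rather than aggregated itself.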
Main Challenges in the BI Frontend Layer
The frontend layer exposes the rich functionality of the platform.
 ▪ many user groups
    • casual user
    • advanced user
    • expert user: familiar with domain, data model, technology
 ▪ many contexts
    • operational: any employee supervising operations, processes
    • tactical: managers
    • strategic: higher management, board
 ▪ many technologies
    • web: browser, portals, …
    • Office (esp. Excel)
    • specific tools
    • dissemination via email, collaboration spaces, …
SQL and OLAP: Example of a Simple Query

 Key figures in the result:
    • Quantity: (standard) key figure aggregated by SUM
    • No. of Customers: COUNT DISTINCT key figure
    • Share per Country: calculated key figure, normalizing to the subtotal

   Country       Material   Quantity   No. of Customers   Share per Country
   DE            Pencil     10         5                  67% (10/15)
   DE            Paper      5          3                  33% (5/15)
   DE            Subtotal   15         6                  100%
   US            Pencil     7          3                  39% (7/18)
   US            Glue       11         5                  61% (11/18)
   US            Subtotal   18         7                  100%
   Grand Total              33         11                 100%
SQL and OLAP: Data to Calculate the Query Result
            SELECT Country, Material, Customer, SUM(Quantity), 1 FROM …

   Country   Material   Customer   Quantity   No. of Customers
   DE        Pencil     Aral       2          1
   DE        Pencil     BP         3          1
   DE        Pencil     Esso       1          1
   DE        Pencil     Shell      2          1
   DE        Pencil     Texaco     2          1
   DE        Paper      BP         1          1
   DE        Paper      Esso       1          1
   DE        Paper      Jet        3          1
   US        Pencil     Agip       1          1
   US        Pencil     Chevron    3          1
   US        Pencil     Texaco     3          1
   US        Glue       Agip       3          1
   US        Glue       Elf        3          1
   US        Glue       Exxon      1          1
   US        Glue       Repsol     2          1
   US        Glue       Shell      2          1

 ✓ This is what can be retrieved by SQL.
 ✓ This is the starting point for further calculations.
 → 16 rows
    ➢ imagine a retailer
       o 10000s of materials
       o 10000s of customers
    ➢ imagine a utilities or mobile phone company
       o millions of customers
    ➢ combinatorics let this result explode
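To make the point about further calculations concrete, the following sketch recomputes the example result from the 16 rows above. It is an illustration only; the function and variable names are assumptions. It shows that Quantity can simply be added up, whereas the customer count has to be recomputed from the customer-level rows on every aggregation level.

```python
from collections import defaultdict

# The 16 rows of the slide: (country, material, customer, quantity).
ROWS = [
    ("DE", "Pencil", "Aral", 2), ("DE", "Pencil", "BP", 3),
    ("DE", "Pencil", "Esso", 1), ("DE", "Pencil", "Shell", 2),
    ("DE", "Pencil", "Texaco", 2),
    ("DE", "Paper", "BP", 1), ("DE", "Paper", "Esso", 1),
    ("DE", "Paper", "Jet", 3),
    ("US", "Pencil", "Agip", 1), ("US", "Pencil", "Chevron", 3),
    ("US", "Pencil", "Texaco", 3),
    ("US", "Glue", "Agip", 3), ("US", "Glue", "Elf", 3),
    ("US", "Glue", "Exxon", 1), ("US", "Glue", "Repsol", 2),
    ("US", "Glue", "Shell", 2),
]


def aggregate(rows, key):
    """Group rows by key(country, material).

    Returns {group: (sum of quantity, number of distinct customers)}.
    """
    quantity = defaultdict(int)
    customers = defaultdict(set)
    for country, material, customer, qty in rows:
        group = key(country, material)
        quantity[group] += qty
        customers[group].add(customer)
    return {g: (quantity[g], len(customers[g])) for g in quantity}


by_material = aggregate(ROWS, lambda c, m: (c, m))
by_country = aggregate(ROWS, lambda c, m: c)
grand_total = aggregate(ROWS, lambda c, m: "total")["total"]


def share_per_country(country, material):
    """Calculated key figure: quantity normalized to the country subtotal."""
    return by_material[(country, material)][0] / by_country[country][0]


if __name__ == "__main__":
    print(by_material[("DE", "Pencil")], by_material[("DE", "Paper")])
    print(by_country["DE"])  # customers: 6, not 5 + 3
    print(grand_total)       # customers: 11, not 6 + 7
    print(round(100 * share_per_country("US", "Pencil")))
```

BP and Esso buy both materials in DE, and Shell and Texaco buy in both countries, which is why the distinct counts on the subtotal and grand-total rows are smaller than the sums of the rows beneath them.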
SQL and OLAP: Layer Definition for Example Query

 LQ: Coun, Mat, Cust, SUM(Quan), 1

    L1: Coun, SUM(Quan)
    L5: Coun, Cust, 1
    L6: Cust, 1

    L2: LQ.Coun, LQ.Mat, SUM(LQ.Quan)/SUM(L1.Quan) from LQ join L1
    L3: LQ.Coun, SUM(LQ.Quan)/SUM(L1.Quan) from LQ join L1
    L4: SUM(LQ.Quan)/SUM(L1.Quan) from LQ join L1
SQL and OLAP: Assemble Query Result

   Result row                Quantity                    No. of Customers         Share per Country
   Country / Material rows   LQ: Coun, Mat, SUM(Quan)    LQ: Coun, Mat, SUM(1)    L2
   Subtotal                  L1                          L5: Coun, SUM(1)         L3
   Grand Total               L1: SUM(Quan)               L6: SUM(1)               L4
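One possible reading of the layers LQ, L1, L2, L5 and L6 is sketched below as SQL views, run through Python's built-in `sqlite3` module. The table name `sales`, the column names, and the exact SQL are assumptions made for this example, since the slides give the layers only in shorthand. L3 and L4 are left out because they evaluate to 100% in this example.

```python
import sqlite3

# The 16 example rows: (country, material, customer, quantity).
ROWS = [
    ("DE", "Pencil", "Aral", 2), ("DE", "Pencil", "BP", 3),
    ("DE", "Pencil", "Esso", 1), ("DE", "Pencil", "Shell", 2),
    ("DE", "Pencil", "Texaco", 2),
    ("DE", "Paper", "BP", 1), ("DE", "Paper", "Esso", 1),
    ("DE", "Paper", "Jet", 3),
    ("US", "Pencil", "Agip", 1), ("US", "Pencil", "Chevron", 3),
    ("US", "Pencil", "Texaco", 3),
    ("US", "Glue", "Agip", 3), ("US", "Glue", "Elf", 3),
    ("US", "Glue", "Exxon", 1), ("US", "Glue", "Repsol", 2),
    ("US", "Glue", "Shell", 2),
]


def build_layers(rows):
    """Load the rows into an in-memory database and define the layers as views."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales (coun TEXT, mat TEXT, cust TEXT, quan INTEGER)")
    con.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    con.executescript("""
        -- LQ: the base query at customer granularity
        CREATE VIEW lq AS
            SELECT coun, mat, cust, SUM(quan) AS quan, 1 AS cnt
            FROM sales GROUP BY coun, mat, cust;
        -- L1: quantity subtotal per country
        CREATE VIEW l1 AS
            SELECT coun, SUM(quan) AS quan FROM lq GROUP BY coun;
        -- L5: one row per distinct customer within a country
        CREATE VIEW l5 AS
            SELECT DISTINCT coun, cust, 1 AS cnt FROM lq;
        -- L6: one row per distinct customer overall
        CREATE VIEW l6 AS
            SELECT DISTINCT cust, 1 AS cnt FROM lq;
        -- L2: share of each material in its country subtotal
        CREATE VIEW l2 AS
            SELECT lq.coun AS coun, lq.mat AS mat,
                   SUM(lq.quan) * 1.0 / l1.quan AS share
            FROM lq JOIN l1 ON l1.coun = lq.coun
            GROUP BY lq.coun, lq.mat, l1.quan;
    """)
    return con


if __name__ == "__main__":
    con = build_layers(ROWS)
    print(con.execute(
        "SELECT coun, mat, SUM(quan), SUM(cnt) FROM lq GROUP BY coun, mat").fetchall())
    print(con.execute("SELECT coun, SUM(cnt) FROM l5 GROUP BY coun").fetchall())
    print(con.execute("SELECT SUM(quan) FROM l1").fetchone(),
          con.execute("SELECT SUM(cnt) FROM l6").fetchone())
    print(con.execute("SELECT coun, mat, share FROM l2").fetchall())
```

Each cell of the result table is then filled from the layer named for it, so the distinct customer counts come from L5 and L6 rather than from adding up the rows above them.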




© SAP AG 2009. All rights
What SAP Offers in this Context
 ▪ SAP Business Objects portfolio
    o frontend tools
    o data quality and extraction
    o modeling tools
    o analytic applications (EPM)
 ▪ SAP Sybase portfolio
    o databases (ASE, IQ, …)
    o modeling tools
 ▪ SAP Business Warehouse
    o DW application on top of DB
    o best practice approach
    o built-in SAP semantics
 ▪ SAP HANA
    o in-memory DB appliance
SAP HANA + SAP Business Warehouse (BW)
• In general:
          DW = DB + X     e.g. with X = BW

• Now:
          DB → HANA

• Thus:
          DW = HANA + Y   with Y = BW optimized for HANA
SAP Business Warehouse: the X or Y in more detail

• Data Warehouse
 o modeling of
    ▪ data flows
    ▪ transformations
    ▪ data containers
 o data movement and transformation processes
    ▪ design tools for such processes
    ▪ scheduling
    ▪ monitoring
    ▪ archiving
 o connectivity and extraction
    ▪ native connectivity to SAP systems and extractors
    ▪ first-class integration of Data Services (ETL)

• BI Layer
 o analytic modeling
    ▪ shared dimensions
    ▪ hierarchies
    ▪ measures + KPIs
    ▪ currency and unit handling
    ▪ time dependency / versioning
    ▪ formulas
 o dimensional data containers (cubes)
 o planning infrastructure
    ▪ modeling
    ▪ planning session concept
    ▪ planning functions
 o security
SAP HANA: Key Impacts on Modern DBMS

Advances in Technology
• column-store
• in-memory
• multi-core processors
• data compression
• infiniband
• hard- and software bundling
• NoSQL (i.e. no-ACID)
• …

Application-Awareness
• DB tailored towards the applications
• providing generic operations
  •   frequently used by those applications
  •   not in standard SQL (or else)
• examples
  •   currency conversion
  •   unit of measure conversion
  •   hierarchy logic
  •   delta management ≈ BW's DSO
  •   calculation engine
  •   planning engine
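The "column-store" and "data compression" items can be pictured with a toy example. The sketch below is a generic illustration of dictionary encoding in a columnar layout, not a description of how SAP HANA is implemented; the class and function names are assumptions.

```python
from dataclasses import dataclass


@dataclass
class DictColumn:
    """Dictionary-encoded column: each distinct value is stored once,
    and the column itself holds only small integer codes."""
    dictionary: list
    codes: list

    @classmethod
    def encode(cls, values):
        dictionary = sorted(set(values))
        index = {value: code for code, value in enumerate(dictionary)}
        return cls(dictionary, [index[v] for v in values])

    def decode(self):
        return [self.dictionary[code] for code in self.codes]


def sum_where(filter_col: DictColumn, value, measure: list):
    """SUM(measure) WHERE filter_col = value, reading only two columns."""
    if value not in filter_col.dictionary:
        return 0
    code = filter_col.dictionary.index(value)
    # Compare integer codes instead of the original values.
    return sum(m for c, m in zip(filter_col.codes, measure) if c == code)


if __name__ == "__main__":
    country = DictColumn.encode(["DE", "DE", "US", "US", "DE"])
    quantity = [10, 5, 7, 11, 3]
    print(country.dictionary, country.codes)
    print(sum_where(country, "DE", quantity))
```

An aggregate over one key figure only has to scan the columns it needs, and repeated values such as country codes compress well, which is the property the slide refers to.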
SAP HANA: In-Memory Computing
                  Programming Against a New Scarce Resource…

 Type of Memory   Size            Latency (~)
 L1 CPU Cache     64K             1 ns
 L2 CPU Cache     256K            5 ns
 L3 CPU Cache     8M              20 ns
 Main Memory      GBs up to TBs   100 ns
 Disk             TBs             >1,000,000 ns
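A short piece of arithmetic on the latencies in the table shows the orders of magnitude involved. The figures are the approximate ones from the slide, with the disk value taken as its lower bound; the function names are assumptions for this sketch.

```python
# Approximate access latencies from the table, in nanoseconds.
# The disk entry is a lower bound (the slide gives "> 1,000,000 ns").
LATENCY_NS = {
    "L1 CPU cache": 1,
    "L2 CPU cache": 5,
    "L3 CPU cache": 20,
    "Main memory": 100,
    "Disk": 1_000_000,
}


def slowdown(level: str, baseline: str) -> float:
    """How many times slower one access at `level` is than at `baseline`."""
    return LATENCY_NS[level] / LATENCY_NS[baseline]


def total_seconds(level: str, accesses: int) -> float:
    """Time for `accesses` dependent random accesses at the given level."""
    return LATENCY_NS[level] * accesses / 1e9


if __name__ == "__main__":
    print(slowdown("Disk", "Main memory"))          # at least this factor
    print(slowdown("Main memory", "L1 CPU cache"))
    for level in LATENCY_NS:
        print(level, total_seconds(level, 1_000_000), "s")
```

Moving data from disk into main memory removes a factor of roughly ten thousand, but main memory is still about a hundred times slower than the L1 cache, which is the scarce resource the slide title refers to.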



 ⇒ need cache-conscious data structures and algorithms!
SAP HANA™
 ▪ in-memory software + hardware (HP, IBM, Fujitsu, Cisco, Dell, Hitachi)
 ▪ data modeling and data management
 ▪ data acquisition

Current Scenarios
 ▪ stand-alone data marts
    ➢ operational data marts
    ➢ analytic data marts
 ▪ accelerator for ERP scenarios
    ➢ e.g. controlling & profitability analysis (CO-PA)
    ➢ transparent, i.e. consumption stays with ERP
 ▪ DB for Business Warehouse (BW)
    ➢ BW optimized for HANA
    ➢ HANA optimizations for BW

Architecture diagram, top to bottom:
 SAP Business Objects tools (SQL, BICS)   Other query tools / apps (SQL, MDX)
 SAP HANA
    SAP In-Memory Computing Studio
    SAP In-Memory Database: Calculation and Planning Engine, Row & Column Storage
    Real-Time Data Replication, SAP Business Objects Data Services
 SAP Business Suite   SAP NetWeaver Business Warehouse   Other data sources
Take Aways

1. What are Business Intelligence and Data Warehousing?

2. What are some of the challenges?

3. SAP's efforts and products in that space.
Neo4j
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Nexer Digital
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 

Recently uploaded (20)

Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 

Real World Business Intelligence and Data Warehousing

  • 1. Real World Business Intelligence and Data Warehousing Dr. Thomas Zurek January 2012
  • 2. Agenda 1. Business Intelligence and Data Warehouses  definition  examples 2. What are the Challenges? 3. SQL and OLAP 4. What SAP does … 5. Take Aways
  • 3. Agenda 1. Business Intelligence and Data Warehouses  definition  examples 2. What are the Challenges? 3. SQL and OLAP 4. What SAP does … 5. Take Aways
  • 4. Examples of Business Intelligence Scenarios  fraud detection • retail company • point-of-sales data & given discounts • huge amounts of data • a prototypical BI question • screencam  production analysis • solar power production  long tail analysis • e-commerce companies like Amazon, Ebay, iTunes, Netflix, … • translate sales of popular products into (additional) sales in the long tail • BI integrated into operational processes
  • 5. Long Tail Analysis (1) – An Example from Amazon
  • 6. Long Tail Analysis (2) Source: Chris Anderson, The Long Tail, Wired, October 2004, http://www.wired.com/wired/archive/12.10/tail.html
  • 7. Long Tail Analysis (3) • Source: Chris Anderson, The Long Tail, Wired, October 2004, http://www.wired.com/wired/archive/12.10/tail.html
  • 8. Business Intelligence and Data Warehouses • Business Intelligence An environment in which business users conduct analyses that yield overall understanding of where  the business has been,  where it is now, and  where it will be in the near future (i.e. planning, predictive). • Data Warehouse  An implementation of an informational database used to collect, integrate and provide sharable data sourced from multiple operational databases for analyses.  Provide data that is reliable, consistent, understandable.  It typically serves as the foundation for a business intelligence system.
  • 9. A Typical Data Warehouse Architecture (layers from top to bottom; diagram labels: Project Governance, Business IT Governance, ODS, Data Warehouse, Corp. Memory, transform, Provide data)
      End-user access / Presentation
      BI Layer: Reporting / Analyses / Planning
        Main Service : Make data available for reporting & planning tools
        Transform    : Application specific / (dis-)aggregate / lookup
        Content      : Application specific
        History      : Application specific
        Store        : IC, DSO, Info Set, Virtual Provider, Multi Provider
      Data Propagation
        Main Service : Spot for apps / Delta to app / App recovery
        Transform    : Enriched || General business logic
        Content      : Data source || Business domain specific
        History      : Determined by rebuild requirements of apps
        Store        : DSO (can be logically partitioned)
      Harmonization
        Main Service : Integrated, harmonized
        Transform    : Harmonize, quality assure (in flow || lookup)
        Content      : Defined fields
        History      : Short or not at all || Long term
        Store        : Info source || IO / DSO / Z-table
      Data Acquisition
        Main Service : Decouple, fast load and distribute
        Transform    : 1:1
        Content      : 1 data source, all fields
        History      : 4 weeks
        Store        : PSA, DSO-WO
      Source 1, Source 2, Source 3, Source 4, Source 5
  • 10. Agenda 1. Business Intelligence and Data Warehouses  definition  examples 2. What are the Challenges? 3. SQL and OLAP 4. What SAP does … 5. Take Aways
  • 11. Main Challenges in the Data Warehousing Layer  physical connectivity to source systems • many protocols • many formats, code pages, unicode / non-unicode • network quality • source system dependency (down times, peak times, …)  transformation, cleansing, scrubbing • Jun 1, 2011 = 1.6.2011 = 06/01/11 = … • VW Touareg = VW TOUAREG = *product+ 87654 = … • currency and unit conversions: e.g. box → kg • resolve ID clashes: e.g. same product no. used in different subsidiaries • enrich data: add attributes from source A to data from source B  consistency, integrity, compliance • create one version of the truth • track data flows; know where the data originated ("data provenance") • keep log and other change information for audits
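The date and product-name bullets on slide 11 can be illustrated with a minimal Python sketch. It is not taken from the deck: the format list `KNOWN_FORMATS` and the helper names `normalize_date` and `normalize_product` are hypothetical, and reading `06/01/11` as month/day/year is an assumption, since the slide does not say which source convention applies.

```python
from datetime import date, datetime

# Hypothetical list of formats seen in source systems. "%b" depends on the
# locale; this sketch assumes English month abbreviations. Treating
# "06/01/11" as month/day/year is an assumption, not something the slide states.
KNOWN_FORMATS = ("%b %d, %Y", "%d.%m.%Y", "%m/%d/%y")


def normalize_date(raw: str) -> date:
    """Return the first successful parse of `raw` against KNOWN_FORMATS."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {raw!r}")


def normalize_product(raw: str) -> str:
    """Collapse whitespace and upper-case so spelling variants compare equal."""
    return " ".join(raw.split()).upper()


variants = ["Jun 1, 2011", "1.6.2011", "06/01/11"]
normalized = {normalize_date(v) for v in variants}
print(normalized)  # a single date once all variants are harmonized
```

A real cleansing step would attach the format to each source system rather than guess per value, because a string such as `06/01/11` is ambiguous on its own.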
  • 12. Main Challenges in the BI Layer  calculations • aggregation of facts: SUM, MIN, MAX, AVG, COUNT, COUNT DISTINCT, … • formulas: e.g. revenue per employee, profitability, … • multi-dimensionality: e.g. time – region – product – sales org • hierarchies: versioning, logic, various types of hierarchies • currency and unit conversions • exceptions: e.g. "good": revenue > 1 mio, "bad": revenue < 500000  security  performance • use efficient data structures • caching • precalculation  planning • actuals (read-only) vs plan data • planning session / transaction
  • 13. Main Challenges in the BI Frontend Layer The frontend layer exposes the rich functionality of the platform.  many user groups • casual user • advanced user • expert user: familiar with domain, data model, technology  many contexts • operational: any employee supervising operations, processes • tactical: managers • strategic: higher management, board  many technologies • web: browser, portals, … • Office (esp. Excel) • specific tools • dissemination via email, collaboration spaces, …
  • 14. Agenda 1. Business Intelligence and Data Warehouses  definition  examples 2. What are the Challenges? 3. SQL and OLAP 4. What SAP does … 5. Take Aways
  • 15. SQL and OLAP: Example of a Simple Query
      Country       Material   Quantity   No. of Customers   Share per Country
      DE            Pencil     10         5                  67% (10/15)
      DE            Paper      5          3                  33% (5/15)
      DE            Subtotal   15         6                  100%
      US            Pencil     7          3                  39% (7/18)
      US            Glue       11         5                  61% (11/18)
      US            Subtotal   18         7                  100%
      Grand Total              33         11                 100%
      Column notes: Quantity is a (standard) key figure aggregated by SUM; No. of Customers is a COUNT DISTINCT key figure; Share per Country is a calculated key figure, normalizing to the subtotal.
  • 16. SQL and OLAP: Data to Calculate the Query Result
      SELECT Country, Material, Customer, SUM(Quantity), 1 FROM …
      Country   Material   Customer   Quantity   No. of Customers
      DE        Pencil     Aral       2          1
      DE        Pencil     BP         3          1
      DE        Pencil     Esso       1          1
      DE        Pencil     Shell      2          1
      DE        Pencil     Texaco     2          1
      DE        Paper      BP         1          1
      DE        Paper      Esso       1          1
      DE        Paper      Jet        3          1
      US        Pencil     Agip       1          1
      US        Pencil     Chevron    3          1
      US        Pencil     Texaco     3          1
      US        Glue       Agip       3          1
      US        Glue       Elf        3          1
      US        Glue       Exxon      1          1
      US        Glue       Repsol     2          1
      US        Glue       Shell      2          1
      This is what can be retrieved by SQL. This is the starting point for further calculations.
      16 rows → imagine a retailer
        o 10000s of materials
        o 10000s of customers
      imagine a utilities or mobile phone company
        o millions of customers
      combinatorics let this result explode
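To make the step from the 16 base rows on slide 16 to the result on slide 15 concrete, here is a small Python sketch. The data is copied from the slide; the function name `aggregate` and the variable names are hypothetical. It shows why the customer count cannot simply be summed up from the detail rows: the distinct customers must be kept per aggregation level.

```python
from collections import defaultdict

# The 16 rows from slide 16: (country, material, customer, quantity).
ROWS = [
    ("DE", "Pencil", "Aral", 2), ("DE", "Pencil", "BP", 3),
    ("DE", "Pencil", "Esso", 1), ("DE", "Pencil", "Shell", 2),
    ("DE", "Pencil", "Texaco", 2), ("DE", "Paper", "BP", 1),
    ("DE", "Paper", "Esso", 1), ("DE", "Paper", "Jet", 3),
    ("US", "Pencil", "Agip", 1), ("US", "Pencil", "Chevron", 3),
    ("US", "Pencil", "Texaco", 3), ("US", "Glue", "Agip", 3),
    ("US", "Glue", "Elf", 3), ("US", "Glue", "Exxon", 1),
    ("US", "Glue", "Repsol", 2), ("US", "Glue", "Shell", 2),
]


def aggregate(rows, key):
    """Return {group: (SUM(quantity), COUNT DISTINCT customer)} per group."""
    quantity = defaultdict(int)
    customers = defaultdict(set)
    for country, material, customer, qty in rows:
        group = key(country, material)
        quantity[group] += qty
        customers[group].add(customer)
    return {g: (quantity[g], len(customers[g])) for g in quantity}


detail = aggregate(ROWS, lambda c, m: (c, m))    # Country x Material rows
subtotal = aggregate(ROWS, lambda c, m: c)       # Subtotal rows per country
grand = aggregate(ROWS, lambda c, m: "all")      # Grand Total row

# Calculated key figure: quantity normalized to the country subtotal.
share = {(c, m): q / subtotal[c][0] for (c, m), (q, _) in detail.items()}

for (country, material), (qty, n_cust) in detail.items():
    print(country, material, qty, n_cust, f"{share[(country, material)]:.0%}")
```

The DE subtotal has 6 customers rather than 5 + 3 = 8, because BP and Esso bought both pencils and paper. That is why the customer dimension has to be carried along until the final aggregation level is known.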
  • 17. SQL and OLAP: Layer Definition for Example Query
      LQ: Coun, Mat, Cust, SUM(Quan), 1
      L1: Coun, SUM(Quan)
      L5: Coun, Cust, 1
      L6: Cust, 1
      L2: LQ.Coun, LQ.Mat, SUM(LQ.Quan)/SUM(L1.Quan) from LQ join L1
      L3: LQ.Coun, SUM(LQ.Quan)/SUM(L1.Quan) from LQ join L1
      L4: SUM(LQ.Quan)/SUM(L1.Quan) from LQ join L1
  • 18. SQL and OLAP: Assemble Query Result
      Country / Material   Quantity                   No. of Customers        Share per Country
      …                    LQ: Coun, Mat, SUM(Quan)   LQ: Coun, Mat, SUM(1)   L2
      Subtotal             L1                         L5: Coun, SUM(1)        L3
      Grand Total          L1: SUM(Quan)              L6: SUM(1)              L4
      © SAP AG 2009. All rights
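The layer idea from slides 17 and 18 can be tried out with Python's built-in `sqlite3` module. This is only a sketch under assumptions: the table name `sales`, the view names and the column names are hypothetical, and the exact SQL is one possible reading of the abbreviated layer definitions, not the statements an SAP system generates. In L2 the subtotal is taken with `MAX(l1.quan)` instead of the slide's `SUM(L1.Quan)`, because after the join the subtotal is repeated on every row of a group.

```python
import sqlite3

# The 16 rows from slide 16: (country, material, customer, quantity).
ROWS = [
    ("DE", "Pencil", "Aral", 2), ("DE", "Pencil", "BP", 3),
    ("DE", "Pencil", "Esso", 1), ("DE", "Pencil", "Shell", 2),
    ("DE", "Pencil", "Texaco", 2), ("DE", "Paper", "BP", 1),
    ("DE", "Paper", "Esso", 1), ("DE", "Paper", "Jet", 3),
    ("US", "Pencil", "Agip", 1), ("US", "Pencil", "Chevron", 3),
    ("US", "Pencil", "Texaco", 3), ("US", "Glue", "Agip", 3),
    ("US", "Glue", "Elf", 3), ("US", "Glue", "Exxon", 1),
    ("US", "Glue", "Repsol", 2), ("US", "Glue", "Shell", 2),
]

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE sales (country TEXT, material TEXT, customer TEXT, quantity INTEGER)"
)
con.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", ROWS)

# LQ: the base layer, i.e. what plain SQL retrieves (slide 16).
con.execute("""CREATE VIEW lq AS
    SELECT country, material, customer, SUM(quantity) AS quan, 1 AS cnt
    FROM sales GROUP BY country, material, customer""")
# L1: quantity subtotal per country.
con.execute("CREATE VIEW l1 AS SELECT country, SUM(quan) AS quan FROM lq GROUP BY country")
# L5 / L6: one row per distinct customer, per country and overall.
con.execute("CREATE VIEW l5 AS SELECT DISTINCT country, customer, 1 AS cnt FROM lq")
con.execute("CREATE VIEW l6 AS SELECT DISTINCT customer, 1 AS cnt FROM lq")

# L2: share per country and material, from LQ joined with L1.
l2 = con.execute("""
    SELECT lq.country, lq.material, 1.0 * SUM(lq.quan) / MAX(l1.quan) AS share
    FROM lq JOIN l1 ON lq.country = l1.country
    GROUP BY lq.country, lq.material
    ORDER BY lq.country, lq.material""").fetchall()

customers_per_country = con.execute(
    "SELECT country, SUM(cnt) FROM l5 GROUP BY country ORDER BY country"
).fetchall()
customers_total = con.execute("SELECT SUM(cnt) FROM l6").fetchone()[0]

print(l2, customers_per_country, customers_total)
```

Each cell of the final report reads from a different layer, which is the point of slide 18: a single flat SQL result is not enough, and the OLAP engine has to combine several aggregation levels.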
  • 19. Agenda 1. Business Intelligence and Data Warehouses  definition  examples 2. What are the Challenges? 3. SQL and OLAP 4. What SAP does … 5. Take Aways
  • 20. What SAP Offers in this Context (shown over the architecture diagram from slide 9)
       SAP Business Objects portfolio
        o frontend tools
        o data quality and extraction
        o modeling tools
        o analytic applications (EPM)
       SAP Sybase portfolio
        o databases (ASE, IQ, …)
        o modeling tools
       SAP Business Warehouse
        o DW application on top of DB
        o best practice approach
        o SAP semantics built-in
       SAP HANA
        o in-memory DB appliance
  • 21. SAP HANA + SAP Business Warehouse (BW) • In general: DW = DB + X e.g. with X = BW • Now: DB → HANA • Thus: DW = HANA + Y with Y = BW optimized for HANA
  • 22. SAP Business Warehouse: the X or Y in more detail
      • Data Warehouse
        o modeling of: data flows, transformations, data containers
        o data movement and transformation processes: design tools for such processes, scheduling, monitoring, archiving
        o connectivity and extraction: native connectivity to SAP systems and extractors, first-class integration of Data Services (ETL)
      • BI Layer
        o analytic modeling: shared dimensions, hierarchies, measures + KPIs, currency and unit handling, time dependency / versioning, formulas
        o dimensional data containers (cubes)
        o planning infrastructure: modeling, planning session concept, planning functions
        o security
  • 23. SAP HANA: Key Impacts on Modern DBMS
      Advances in Technology
        • column-store
        • in-memory
        • multi-core processors
        • data compression
        • infiniband
        • hard- and software bundling
        • NoSQL (i.e. no-ACID)
        • …
      Application-Awareness
        • DB tailored towards the applications
        • providing generic operations
        • frequently used by those applications
        • not in standard SQL (or else)
        • examples
          • currency conversion
          • unit of measure conversion
          • hierarchy logic
          • delta management → BW's DSO
          • calculation engine
          • planning engine
  • 24. SAP HANA: In-Memory Computing. Programming Against a New Scarce Resource…
      Type of Memory   Size            Latency (~)
      L1 CPU Cache     64K             1 ns
      L2 CPU Cache     256K            5 ns
      L3 CPU Cache     8M              20 ns
      Main Memory      GBs up to TBs   100 ns
      Disk             TBs             >1.000.000 ns
      → need cache-conscious data-structures and algorithms!
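The latency column on slide 24 is easier to appreciate as ratios. The short Python calculation below uses only the approximate figures from the slide; the dictionary name `LATENCY_NS` is hypothetical, and the disk value is the slide's lower bound (">1.000.000 ns"), so the real gap can be larger.

```python
# Approximate latencies in nanoseconds, as listed on slide 24.
# The disk figure is a lower bound.
LATENCY_NS = {
    "L1 CPU cache": 1,
    "L2 CPU cache": 5,
    "L3 CPU cache": 20,
    "main memory": 100,
    "disk": 1_000_000,
}

# How many times slower each level is than an L1 cache hit.
relative_to_l1 = {name: ns // LATENCY_NS["L1 CPU cache"] for name, ns in LATENCY_NS.items()}

# The gap that in-memory databases remove, and the gap that remains.
disk_vs_memory = LATENCY_NS["disk"] // LATENCY_NS["main memory"]
memory_vs_l1 = LATENCY_NS["main memory"] // LATENCY_NS["L1 CPU cache"]

print(relative_to_l1)
print(f"disk is at least {disk_vs_memory}x slower than main memory")
print(f"main memory is about {memory_vs_l1}x slower than L1 cache")
```

Moving data from disk into main memory removes a factor of at least 10,000, but main memory is still roughly 100 times slower than the L1 cache. That remaining gap is why the slide calls for cache-conscious data structures and algorithms.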
  • 25. SAP HANA™
      SAP HANA™
         in-memory software + hardware (HP, IBM, Fujitsu, Cisco, Dell, Hitachi)
         data modeling and data management
         data acquisition
      Current Scenarios
         stand-alone data marts
          • operational data marts
          • analytic data marts
         accelerator for ERP scenarios
          • e.g. controlling & profitability analysis (CO-PA)
          • transparent, i.e. consumption stays with ERP
         DB for Business Warehouse (BW)
          • BW optimized for HANA
          • HANA optimizations for BW
      Diagram (top to bottom)
        Consumers: SAP Business Objects tools (SQL, BICS); other query tools / apps (SQL, MDX)
        SAP HANA: SAP In-Memory Computing Studio; SAP In-Memory Database (Calculation and Planning Engine, Row & Column Storage)
        Data provisioning: Real-Time Data Replication; SAP Business Objects Data Services
        Sources: SAP Business Suite; SAP NetWeaver Business Warehouse; other data sources
  • 26. Agenda 1. Business Intelligence and Data Warehouses  definition  examples 2. What are the Challenges? 3. SQL and OLAP 4. What SAP does … 5. Take Aways
  • 27. Take Aways 1. What are Business Intelligence and Data Warehousing? 2. What are some of the challenges? 3. SAP's efforts and products in that space.

Editor's Notes

  1. So, what’s inside HANA? This architecture diagram explains the main components and capabilities. …So, I keep throwing around words like ‘massive’ amounts of data and ‘amazing’ speed. What kinds of scale, speed and improvement are customers seeing?