Data Warehouse
                               An Introduction

                                   Lecture - 2


Dept of MCA, NIT, Durgapur.           September 6, 2012   1
Data, Data everywhere yet ...
                              I can’t find the data I need
                                 data is scattered over the network
                                 many versions, subtle differences

                              I can’t get the data I need
                                 need an expert to get the data


                              I can’t understand the data I found
                                 available data poorly documented


                              I can’t use the data I found
                                 results are unexpected
                                 data needs to be transformed from one form to
                                 another




What We Need?
     A single, complete and consistent
     store of data obtained from a variety
     of different sources made available to
     end users in a way they can
     understand and use, in a Business
     Context / Subject.

                                   [Barry Devlin]



 Leads towards Business Analysis




Subject Orientation
       Organized around major subjects, such as
        customer, product, sales.

       Focuses on the modeling and analysis of data for
       decision makers, not on daily operations or
       transaction processing.

       Provides a simple and concise view around
       particular subject issues, by excluding data that are
       not useful in the decision support process.

What Are Analytical Needs?
       Which are our lowest/highest margin customers?
       Who are my customers and what products are they buying?
       What is the most effective distribution channel?
       What product promotions have the biggest impact on revenue?
       Which customers are most likely to go to the competition?
       What impact will new products/services have on revenue and margins?
Decision Support System
                  Used to manage and control business
                  Data is historical or point-in-time
                  Optimized for inquiry rather than update
                  Use of the system is loosely defined and can
                  be ad-hoc
                  Used by managers and end-users to
                  understand the business and make
                  judgements




Evolution of Decision Support
          60’s: Batch reports
                hard to find and analyze information

                inflexible and expensive; every request requires reprogramming

          70’s: Terminal based DSS and EIS

          80’s: Desktop data access and analysis tools
                query tools, spreadsheets, GUIs

                easy to use, but can access only the operational database

          90’s: Data warehousing with integrated OLAP engines and
          tools
                To meet the analytical needs of the business.

What are the users saying...

           Data should be integrated across the
           enterprise
           Summary data has real value to
           the organization
           Historical data holds the key to
           understanding data over time
           What-if capabilities are required




Need Separate Process?

                               A technique for assembling and
                               managing data from various sources
                               for the purpose of answering business
                               questions, and so making decisions that
                               were not previously possible.

                               A decision support database
                               maintained separately from the
                               organization’s operational database




Traditional RDBMS used for OLTP
                  Database Systems have been used traditionally
                  for OLTP
                        clerical data processing tasks
                        detailed, up to date data
                        structured repetitive tasks
                        read/update a few records
                        isolation, recovery and integrity are critical
                        Normalization is mandatory



                  We will call these operational databases.
Decision Support Database
               Defined in many different ways, but not
               rigorously.
                     A decision support database that is
                     maintained separately from the
                     organization’s operational database
                     Supports information processing by providing
                     a solid platform of consolidated, historical
                     data for analysis.

Some Common Terms
     Operational databases: Operational databases are detail oriented
     databases defined to meet the needs of sometimes very complex
     processes in a company. This detailed view is reflected in the data
     arrangement in the database. The data is highly normalized to avoid data
     redundancy and "complex-maintenance".


     OLTP: On-Line Transaction Processing (OLTP) describes the way data
     is processed by an end user or a computer system. It is detail oriented,
     highly repetitive with massive amounts of updates and changes of the
     data by the end user. It is also very often described as the use of
     computers to run the on-going operation of a business.

Some Common Terms
                                         Cont…

          Data warehouse: A data warehouse collects, organizes, and makes
          data available for the purpose of analysis — to give management the
          ability to access and analyze information about its business. This type
          of data can be called "informational data". The systems used to work
          with informational data are referred to as OLAP (On-Line Analytical
          Processing).


          We will call it an informational database.




Some Common Terms
                                            Cont…




          Operational versus informational databases
          The major difference between operational and informational databases is the
          update frequency:
          1. On operational databases, a high number of transactions take place every
          hour. The database is always "up to date" and represents a snapshot of
          the current business situation, commonly referred to as point-in-time
          data.

          2. Informational databases are usually stable over a period of time and
          represent the situation at a specific point in time in the past, that is,
          historical data.
Some Common Terms
                                             Cont…

          OLAP: On-Line Analytical Processing (OLAP) is a category of software
          technology that enables analysts, managers and executives to gain insight into
          data through fast, consistent, interactive access to a wide variety of possible
          views of information that has been transformed from raw data to reflect the real
          dimensionality of the enterprise as understood by the user.

          OLAP is implemented in a multi-user client/server mode and offers
          consistently rapid response to queries, regardless of database size and
          complexity. OLAP helps the user synthesize enterprise information through
          comparative, personalized viewing, as well as through analysis of historical
          and projected data in various "what-if" data model scenarios. This is achieved
          through use of an OLAP Server.
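
The idea of viewing the same data along different dimensions can be sketched in a few lines of Python. This is only an illustration, not how an OLAP server works internally: the `facts` rows, the `region` dimension, and the `roll_up` helper are all hypothetical names and values chosen for the example.

```python
from collections import defaultdict

# Hypothetical fact rows: each row has three dimensions and one measure.
facts = [
    {"product": "XYZ", "region": "East", "quarter": "4Q96", "sales": 30},
    {"product": "XYZ", "region": "West", "quarter": "4Q96", "sales": 27},
    {"product": "XYZ", "region": "East", "quarter": "4Q97", "sales": 66},
    {"product": "ABC", "region": "East", "quarter": "4Q96", "sales": 29},
    {"product": "ABC", "region": "West", "quarter": "4Q97", "sales": 24},
]


def roll_up(rows, *dimensions):
    """Sum the 'sales' measure for every combination of the chosen dimensions."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row["sales"]
    return dict(totals)


# Three different views of the same underlying data.
by_product = roll_up(facts, "product")
by_quarter = roll_up(facts, "quarter")
by_product_and_quarter = roll_up(facts, "product", "quarter")
```

Each call answers a different business question from the same rows; an OLAP server offers this kind of regrouping interactively and at much larger scale.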



OLTP vs. Data Warehouse
                  OLTP                                      Warehouse (OLAP)
                  Application oriented                      Subject oriented
                  Used to run the business                  Used to analyze the business
                  Clerical user                             Manager/analyst
                  Detailed data                             Summarized and refined data
                  Current, up-to-date data                  Snapshot data
                  Isolated data                             Integrated data
                  Repetitive access by small transactions   Ad-hoc access using large queries
                  Read/update access                        Mostly read access (batch update)
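
The two access patterns in the comparison above can be sketched with Python's built-in `sqlite3` module. The `orders` table and its rows are hypothetical. In practice the operational and warehouse data would live in separate databases; a single in-memory table is used here only to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY, customer TEXT,"
    " product TEXT, amount INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO orders (customer, product, amount, status) VALUES (?, ?, ?, ?)",
    [
        ("alice", "widget", 10, "open"),
        ("bob", "gadget", 25, "open"),
        ("alice", "gadget", 5, "open"),
        ("carol", "widget", 40, "open"),
    ],
)

# OLTP style: a small transaction that reads or updates a single record.
with conn:  # commits on success, rolls back on error
    conn.execute("UPDATE orders SET status = 'shipped' WHERE order_id = ?", (2,))
status = conn.execute(
    "SELECT status FROM orders WHERE order_id = ?", (2,)
).fetchone()[0]

# OLAP style: a read-only query that scans and summarizes many records.
totals = conn.execute(
    "SELECT product, SUM(amount) FROM orders GROUP BY product ORDER BY product"
).fetchall()
```

The update touches one row by key, while the summary query reads every row. Running both workloads against the same tables is one reason a separate decision support database is attractive.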

Some Common Terms
                                              Cont…

          Metadata — a definition

          Metadata is the kind of information that describes the data stored in a
          database and includes such information as:

          • A description of tables and fields in the data warehouse, including data
          types and the range of acceptable values.

          • A similar description of tables and fields in the source databases, with a
          mapping of fields from the source to the warehouse.

          • A description of how the data has been transformed, including formulae,
          formatting, currency conversion, and time aggregation.

          • Any other information that is needed to support and manage the operation
          of the data warehouse.
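
As a rough sketch, one metadata entry covering the first three bullets above might be recorded like this. The `FieldMetadata` class and every table, field, and rule name in it are hypothetical; real warehouse tools keep such records in their own catalog formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMetadata:
    # Description of the field in the warehouse.
    warehouse_table: str
    warehouse_field: str
    data_type: str
    allowed_range: tuple  # (minimum, maximum) acceptable values
    # Where the field comes from in the source database.
    source_table: str
    source_field: str
    # How the source value was transformed on the way in.
    transformation: str


def describe(entry: FieldMetadata) -> str:
    """Render the source-to-warehouse mapping as one readable line."""
    return (
        f"{entry.warehouse_table}.{entry.warehouse_field} ({entry.data_type}) "
        f"<- {entry.source_table}.{entry.source_field}: {entry.transformation}"
    )


def in_range(entry: FieldMetadata, value) -> bool:
    """Check a value against the documented range of acceptable values."""
    low, high = entry.allowed_range
    return low <= value <= high


amount_meta = FieldMetadata(
    warehouse_table="sales_fact",
    warehouse_field="amount_usd",
    data_type="DECIMAL(12,2)",
    allowed_range=(0, 1_000_000),
    source_table="orders",
    source_field="total_local",
    transformation="convert local currency to USD, then sum by quarter",
)
```

With the mapping stored as data, a load job could call `in_range` to reject bad values and `describe` to document where each warehouse field came from.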


Some Common Terms
                                       Cont…

     Data mart: A data mart contains a subset of corporate data that is of
     value to a specific business unit, department, or set of users. This subset
     consists of historical, summarized, and possibly detailed data captured
     from transaction processing systems, or from an enterprise data
     warehouse. It is important to realize that a data mart is defined by the
     functional scope of its users, and not by the size of the data mart
     database. Most data marts today involve less than 100 GB of data; some
     are larger, and data marts are expected to grow rapidly in size as their
     usage increases.

     Data mining: Data mining is the process of extracting valid, useful,
     previously unknown, and comprehensible information from data and using
     it to make business decisions.
Problem in General Purpose SQL
            Consider the following database schemas:
            1. Product ( P_ID, P_NAME, P_DESC);
            2. Sales (R_NO, P_ID, Q_ID, AMOUNT);
            3. Time (Q_ID, Q_DESC);


            Suppose the organization needs to generate a report like this:

              Product         4Q96 Sales        4Q97 Sales
                  XYZ              57                66
                  ABC              29                24
                  PQR             115               89


Problem in SQL                   Cont…


       The SQL needed to display the fourth-quarter 1996 sales might be
       as follows:


       SELECT Product.P_NAME, SUM(Sales.AMOUNT)
       FROM Sales, Product, Time
       WHERE . . . Time.Q_ID = '4Q96'
       AND Product.P_NAME IN ('XYZ', 'ABC', 'PQR')
       GROUP BY Product.P_NAME

       If one expands the Time constraint to include both quarters, as follows:

       WHERE . . . Time.Q_ID IN ('4Q96', '4Q97')

       then the SUM expression adds up the sales from both quarters, which
       we do not want. Plain SQL offers no simple, direct alternative.

          Hence a general-purpose SQL engine is a poor fit for queries like the one above.
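
The problem can be reproduced with Python's built-in `sqlite3` module. This is a sketch under stated assumptions: the join conditions elided in the slides are assumed to be the obvious key matches, and the sample rows are invented so that they add up to the figures in the report (one sale per product per quarter). The second query shows one common workaround, conditional aggregation, which is not part of the original slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (P_ID INTEGER PRIMARY KEY, P_NAME TEXT, P_DESC TEXT);
    CREATE TABLE "Time"  (Q_ID TEXT PRIMARY KEY, Q_DESC TEXT);
    CREATE TABLE Sales   (R_NO INTEGER PRIMARY KEY, P_ID INTEGER,
                          Q_ID TEXT, AMOUNT INTEGER);
""")
conn.executemany("INSERT INTO Product VALUES (?, ?, ?)",
                 [(1, "XYZ", ""), (2, "ABC", ""), (3, "PQR", "")])
conn.executemany('INSERT INTO "Time" VALUES (?, ?)',
                 [("4Q96", "Fourth quarter 1996"),
                  ("4Q97", "Fourth quarter 1997")])
# Assumed sample data: one sale per product per quarter.
conn.executemany("INSERT INTO Sales (P_ID, Q_ID, AMOUNT) VALUES (?, ?, ?)",
                 [(1, "4Q96", 57), (1, "4Q97", 66),
                  (2, "4Q96", 29), (2, "4Q97", 24),
                  (3, "4Q96", 115), (3, "4Q97", 89)])

# Widening the quarter filter merges both quarters into a single sum.
combined = conn.execute("""
    SELECT p.P_NAME, SUM(s.AMOUNT)
    FROM Sales s
    JOIN Product p ON p.P_ID = s.P_ID
    JOIN "Time" t ON t.Q_ID = s.Q_ID
    WHERE t.Q_ID IN ('4Q96', '4Q97')
    GROUP BY p.P_NAME
    ORDER BY p.P_NAME
""").fetchall()

# Workaround (an assumption, not from the slides): one conditional
# aggregate per report column.
per_quarter = conn.execute("""
    SELECT p.P_NAME,
           SUM(CASE WHEN s.Q_ID = '4Q96' THEN s.AMOUNT ELSE 0 END),
           SUM(CASE WHEN s.Q_ID = '4Q97' THEN s.AMOUNT ELSE 0 END)
    FROM Sales s
    JOIN Product p ON p.P_ID = s.P_ID
    WHERE s.Q_ID IN ('4Q96', '4Q97')
    GROUP BY p.P_NAME
    ORDER BY p.P_NAME
""").fetchall()
```

The workaround needs one hand-written expression per quarter, so every new comparison column means editing the query. That is the kind of friction OLAP tools are meant to remove.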

Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 

Recently uploaded (20)

The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 

Dw

  • 1. Data Warehouse An Introduction Lecture - 2 Dept of MCA, NIT, Durgapur. September 6, 2012 1
  • 2. Data, data everywhere, yet ... I can’t find the data I need: data is scattered over the network, in many versions with subtle differences. I can’t get the data I need: it takes an expert to get the data. I can’t understand the data I found: the available data is poorly documented. I can’t use the data I found: results are unexpected, and data needs to be transformed from one form to another.
  • 3. What Do We Need? A single, complete and consistent store of data obtained from a variety of different sources, made available to end users in a way they can understand and use, in a business context / subject. [Barry Devlin] This leads towards business analysis.
  • 4. Subject Orientation: Organized around major subjects, such as customer, product, sales. Focuses on the modeling and analysis of data for decision makers, not on daily operations or transaction processing. Provides a simple and concise view around particular subject issues, by excluding data that are not useful in the decision support process.
  • 5. What Are Analytical Needs? Which are our lowest/highest margin customers? Who are my customers and what products are they buying? What is the most effective distribution channel? What product promotions have the biggest impact on revenue? Which customers are most likely to go to the competition? What impact will new products/services have on revenue and margins?
  • 6. Decision Support System: Used to manage and control the business. Data is historical or point-in-time. Optimized for inquiry rather than update. Use of the system is loosely defined and can be ad hoc. Used by managers and end users to understand the business and make judgements.
  • 7. Evolution of Decision Support: 60’s: Batch reports; hard to find and analyze information; inflexible and expensive, reprogram every request. 70’s: Terminal-based DSS and EIS. 80’s: Desktop data access and analysis tools; query tools, spreadsheets, GUIs; easy to use, but access only the operational db. 90’s: Data warehousing with integrated OLAP engines and tools, to meet the analytical needs of the business.
  • 8. What are the users saying... Data should be integrated across the enterprise. Summary data has real value to the organization. Historical data holds the key to understanding data over time. What-if capabilities are required.
  • 9. Need a Separate Process? A technique for assembling and managing data from various sources for the purpose of answering business questions, thus enabling decisions that were not previously possible. It uses a decision support database maintained separately from the organization’s operational database.
  • 10. Traditional RDBMS Used for OLTP. Database systems have traditionally been used for OLTP: clerical data processing tasks; detailed, up-to-date data; structured, repetitive tasks; reading/updating a few records; isolation, recovery and integrity are critical; normalization is mandatory. We will call these operational databases.
  • 11. Decision Support Database: Defined in many different ways, but not rigorously. A decision support database that is maintained separately from the organization’s operational database. Supports information processing by providing a solid platform of consolidated, historical data for analysis.
  • 12. Some Common Terms. Operational databases: Operational databases are detail-oriented databases defined to meet the needs of sometimes very complex processes in a company. This detailed view is reflected in the data arrangement in the database. The data is highly normalized to avoid data redundancy and "complex maintenance". OLTP: On-Line Transaction Processing (OLTP) describes the way data is processed by an end user or a computer system. It is detail oriented and highly repetitive, with massive amounts of updates and changes of the data by the end user. It is also very often described as the use of computers to run the ongoing operation of a business.
  • 13. Some Common Terms Cont… Data warehouse: A data warehouse collects, organizes, and makes data available for the purpose of analysis, to give management the ability to access and analyze information about its business. This type of data can be called "informational data". The systems used to work with informational data are referred to as OLAP (On-Line Analytical Processing). We will call it an informational database.
  • 14. Some Common Terms Cont… Operational versus informational databases: The major difference between operational and informational databases is the update frequency. 1. On operational databases a high number of transactions take place every hour. The database is always "up to date", and it represents a snapshot of the current business situation, more commonly referred to as a point in time. 2. Informational databases are usually stable over a period of time to represent a situation at a specific point in time in the past, which can be noted as historical data.
  • 15. Some Common Terms Cont… OLAP: On-Line Analytical Processing (OLAP) is a category of software technology that enables analysts, managers and executives to gain insight into data through fast, consistent, interactive access to a wide variety of possible views of information that has been transformed from raw data to reflect the real dimensionality of the enterprise as understood by the user. OLAP is implemented in a multi-user client/server mode and offers consistently rapid response to queries, regardless of database size and complexity. OLAP helps the user synthesize enterprise information through comparative, personalized viewing, as well as through analysis of historical and projected data in various "what-if" data model scenarios. This is achieved through use of an OLAP Server.
  • 16. OLTP vs. Data Warehouse
      OLTP                                    | Warehouse (OLAP)
      Application oriented                    | Subject oriented
      Used to run business                    | Used to analyze business
      Clerical user                           | Manager/analyst
      Detailed data                           | Summarized and refined
      Current, up to date                     | Snapshot data
      Isolated data                           | Integrated data
      Repetitive access by small transactions | Ad-hoc access using large queries
      Read/update access                      | Mostly read access (batch update)
  • 17. Some Common Terms Cont… Metadata: a definition. Metadata is the kind of information that describes the data stored in a database and includes such information as:
      • A description of tables and fields in the data warehouse, including data types and the range of acceptable values.
      • A similar description of tables and fields in the source databases, with a mapping of fields from the source to the warehouse.
      • A description of how the data has been transformed, including formulae, formatting, currency conversion, and time aggregation.
      • Any other information that is needed to support and manage the operation of the data warehouse.
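
The kinds of metadata listed on slide 17 can be pictured as small records kept alongside the warehouse. The following is only a minimal sketch: the class names (`FieldMeta`, `FieldMapping`), the helper `in_range`, and the example table and field names are all hypothetical and do not come from the lecture or from any particular metadata tool.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FieldMeta:
    """Describes one field: where it lives, its type, and its acceptable range."""
    table: str
    name: str
    data_type: str
    valid_range: Optional[Tuple[float, float]] = None  # None means "no range recorded"


@dataclass
class FieldMapping:
    """Maps a source field to a warehouse field and records the transformation."""
    source: FieldMeta
    target: FieldMeta
    transformation: str  # free-text description: formula, formatting, conversion, aggregation


def in_range(meta: FieldMeta, value: float) -> bool:
    """Check a value against the acceptable range recorded in the metadata."""
    if meta.valid_range is None:
        return True
    low, high = meta.valid_range
    return low <= value <= high


# Hypothetical example: an operational order amount feeding a warehouse sales fact.
source_field = FieldMeta(table="orders", name="amt_local", data_type="DECIMAL(12,2)")
target_field = FieldMeta(table="sales_fact", name="amount_usd",
                         data_type="DECIMAL(14,2)", valid_range=(0.0, 1_000_000.0))
mapping = FieldMapping(source=source_field, target=target_field,
                       transformation="currency conversion to USD, summed per quarter")

print(mapping.target.name, "<-", mapping.source.name, ":", mapping.transformation)
print(in_range(target_field, 250.0))   # value inside the recorded range
print(in_range(target_field, -5.0))    # value outside the recorded range
```

Keeping the range and the transformation next to the field description is one way to make the warehouse self-documenting; a real deployment would normally store this in a metadata repository rather than in application code.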
  • 18. Some Common Terms Cont… Data mart: A data mart contains a subset of corporate data that is of value to a specific business unit, department, or set of users. This subset consists of historical, summarized, and possibly detailed data captured from transaction processing systems, or from an enterprise data warehouse. It is important to realize that a data mart is defined by the functional scope of its users, and not by the size of the data mart database. Most data marts today involve less than 100 GB of data; some are larger, and it is expected that as data mart usage increases they will rapidly increase in size. Data mining: Data mining is the process of extracting valid, useful, previously unknown, and comprehensible information from data and using it to make business decisions.
  • 19. Problem in General Purpose SQL. Let the set of database schemas be as follows:
      1. Product (P_ID, P_NAME, P_DESC);
      2. Sales (R_NO, P_ID, Q_ID, AMOUNT);
      3. Time (Q_ID, Q_DESC);
      Say the organization needs to generate a report as follows:
      Product | 4Q96 Sales | 4Q97 Sales
      XYZ     | 57         | 66
      ABC     | 29         | 24
      PQR     | 115        | 89
  • 20. Problem in SQL Cont… The SQL needed to display the fourth quarter 1996 sales may be as follows:
      SELECT Product.P_NAME, SUM(Sales.AMOUNT)
      FROM Sales, Product, Time
      WHERE . . . Time.Q_ID = '4Q96' AND Product.P_NAME IN ('XYZ', 'ABC', 'PQR')
      GROUP BY Product.P_NAME
      If one expands the Time constraint to include both quarters, as follows:
      WHERE . . . Time.Q_ID IN ('4Q96', '4Q97')
      then the SUM expression adds up the sales from both quarters, which we do not want. Plain SQL of this form offers no simple alternative. Hence a general-purpose SQL engine falls short for queries like the one above.
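
The behaviour described on slides 19 and 20 can be reproduced with a small in-memory database. This is a minimal sketch using Python's standard `sqlite3` module. The schema and the report figures come from slide 19; the join conditions are an assumption, since the slide elides them as `WHERE . . .`, and the one-row-per-quarter sales data is invented so that the totals match the report.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (P_ID INTEGER PRIMARY KEY, P_NAME TEXT, P_DESC TEXT);
    CREATE TABLE "Time"  (Q_ID TEXT PRIMARY KEY, Q_DESC TEXT);
    CREATE TABLE Sales   (R_NO INTEGER PRIMARY KEY, P_ID INTEGER, Q_ID TEXT, AMOUNT INTEGER);

    INSERT INTO Product VALUES (1, 'XYZ', ''), (2, 'ABC', ''), (3, 'PQR', '');
    INSERT INTO "Time" VALUES ('4Q96', 'Fourth quarter 1996'),
                              ('4Q97', 'Fourth quarter 1997');
    -- Illustrative rows chosen to match the report on slide 19.
    INSERT INTO Sales VALUES
        (1, 1, '4Q96', 57),  (2, 1, '4Q97', 66),
        (3, 2, '4Q96', 29),  (4, 2, '4Q97', 24),
        (5, 3, '4Q96', 115), (6, 3, '4Q97', 89);
""")

# Assumed join conditions (the slide only shows "WHERE . . .").
JOIN = """
    FROM Sales, Product, "Time"
    WHERE Sales.P_ID = Product.P_ID AND Sales.Q_ID = "Time".Q_ID
"""

# One quarter: gives the "4Q96 Sales" column of the report.
one_quarter = conn.execute(f"""
    SELECT Product.P_NAME, SUM(Sales.AMOUNT) {JOIN}
      AND "Time".Q_ID = '4Q96'
    GROUP BY Product.P_NAME ORDER BY Product.P_NAME
""").fetchall()
print(one_quarter)    # [('ABC', 29), ('PQR', 115), ('XYZ', 57)]

# Both quarters with IN: the two quarters are merged into a single sum.
both_quarters = conn.execute(f"""
    SELECT Product.P_NAME, SUM(Sales.AMOUNT) {JOIN}
      AND "Time".Q_ID IN ('4Q96', '4Q97')
    GROUP BY Product.P_NAME ORDER BY Product.P_NAME
""").fetchall()
print(both_quarters)  # [('ABC', 53), ('PQR', 204), ('XYZ', 123)]

# Grouping by quarter as well separates the sums, but yields one row per
# product per quarter rather than the side-by-side columns of the report.
by_quarter = conn.execute(f"""
    SELECT Product.P_NAME, "Time".Q_ID, SUM(Sales.AMOUNT) {JOIN}
      AND "Time".Q_ID IN ('4Q96', '4Q97')
    GROUP BY Product.P_NAME, "Time".Q_ID
    ORDER BY Product.P_NAME, "Time".Q_ID
""").fetchall()
print(len(by_quarter))  # 6
```

The second result shows the unwanted combined totals, and the third shows that adding the quarter to `GROUP BY` changes the shape of the result instead of producing the two-column comparison the report asks for.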