Data Warehousing
and Data Mining
Unit 1
Overview
● Introduction
● Need of Data Warehousing
● Data Warehousing Building Blocks
● OLAP in Data Warehouse
● OLAP Models
● Operational DB vs Data Warehouse
Introduction
Data Warehouse
"A data warehouse is a subject-oriented, integrated, time-variant and
non-volatile collection of data in support of management's decision-making
process." (Bill Inmon)
Data warehousing is the process whereby organizations extract value from
their informational assets through the use of special stores called data
warehouses.
Need of Data Warehouse
Computer applications that support day-to-day business operations are
effective at what they are designed to do. They gather, store, and process all the
data needed to perform the daily operations successfully.
As businesses grew more complex, executives became desperate for
information to stay competitive. What they needed were different kinds of
information that could be readily used to make strategic decisions. The
operational systems, important as they were, could not provide strategic
information. Businesses were therefore compelled to turn to new ways of
getting strategic information. Data warehousing is a new paradigm
specifically intended to provide vital strategic information.
Need of Data Warehouse
ESCALATING NEED FOR STRATEGIC INFORMATION
The executives and managers who are responsible for keeping the enterprise
competitive need information to make proper decisions. They need information
to formulate business strategies, establish goals, set objectives, and
monitor results.
Critical business decisions depend on the availability of proper strategic
information in an enterprise.
The desired characteristics of strategic information are:
1. INTEGRATED: Must have a single, enterprise-wide view.
2. DATA INTEGRITY: Information must be accurate and must conform to business rules.
3. ACCESSIBLE: Easily accessible with intuitive access paths, and responsive for analysis.
4. CREDIBLE: Every business factor must have one and only one value.
5. TIMELY: Information must be available within the stipulated time frame.
Need of Data Warehouse
FAILURES OF PAST DECISION-SUPPORT SYSTEMS
The user must be able to query online, get results, and query some more. The
information must be in a format suitable for analysis.
PAST DECISION-SUPPORT SYSTEMS
Ad Hoc Reports. Users would send requests to IT for special reports. IT would write special programs, typically one for
each request, and produce the ad hoc reports.
Special Extract Programs. For the types of reports that would be requested from time to time, IT would write a
suite of programs and run them periodically to extract data from the various applications. For any reports
that could not be run off the extracted files, IT would write individual special programs.
Need of Data Warehouse
PAST DECISION-SUPPORT SYSTEMS
Small Applications. IT would create simple applications based on the extracted files. The users could stipulate the
parameters for each special report.
Information Centers. The information center was typically a place where users could view special information on
screens.
Decision-Support Systems. These systems were supported by extracted files. They were menu-driven
and provided online information as well as the ability to print special reports.
Executive Information Systems. The main criteria were simplicity and ease of use. The system would display
key information every day and provide the ability to request simple, straightforward reports. However, only
preprogrammed screens and reports were available.
Data Warehouse: the only viable solution
DATA WAREHOUSE IS DEFINED AS:
The data warehouse is an informational environment that:
● Provides an integrated and total view of the enterprise
● Makes the enterprise’s current and historical information easily available for decision making
● Makes decision-support transactions possible without hindering operational systems
● Renders the organization’s information consistent
● Presents a flexible and interactive source of strategic information
Data Warehouse: The building block
A data warehouse is typically a dedicated database system for decision
making that is separate from the production database(s) used
operationally. It differs from a production system in that:
● it covers a much longer time horizon than transaction systems
● it includes data from multiple databases, processed so that the warehouse’s data are defined uniformly (i.e., ‘clean’ data)
● it is optimized for answering complex queries from managers and analysts
Standard Database vs Data warehouse
Characteristics
Subject Oriented
● Data is organized around major subjects of the enterprise.
● Data warehouses are designed to help you analyze data.
● For example, to learn more about your company's sales data, you can build a warehouse that concentrates on sales.
● Using this warehouse, you can answer questions like "Who was our best customer for this item last year?"
● This ability to define a data warehouse by subject matter, sales in this case, makes the data warehouse subject oriented.
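The "best customer" question above can be sketched as a query over a sales-focused store. This is a minimal illustration using Python's built-in sqlite3 module; the table name, column names, and sample rows are assumptions made for the example, not part of the slides.

```python
import sqlite3

# Hypothetical sales table; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (customer TEXT, item TEXT, sale_year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Asha", "widget", 2023, 120.0),
        ("Asha", "widget", 2023, 80.0),
        ("Ravi", "widget", 2023, 150.0),
        ("Ravi", "gadget", 2023, 500.0),
        ("Asha", "widget", 2022, 900.0),
    ],
)


def best_customer(conn, item, year):
    """Return (customer, total) with the highest total sales of an item in a year."""
    return conn.execute(
        """
        SELECT customer, SUM(amount) AS total
        FROM sales
        WHERE item = ? AND sale_year = ?
        GROUP BY customer
        ORDER BY total DESC
        LIMIT 1
        """,
        (item, year),
    ).fetchone()


result = best_customer(conn, "widget", 2023)
```

Because the store is organized around one subject (sales), the question maps onto a single grouped query rather than a trawl through several operational applications.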
Integrated
● Integration is closely related to subject orientation.
● Data warehouses must put data from disparate sources into a consistent format.
● They must resolve such problems as naming conflicts and inconsistencies among units of measure.
● When they achieve this, they are said to be integrated.
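A rough sketch of what resolving naming conflicts and unit inconsistencies can look like in practice. The two source systems, their field names, and the target schema are hypothetical; only the pound-to-kilogram factor is a fixed definition.

```python
# Two hypothetical source systems that describe the same fact differently.
orders_system = [{"cust_id": "C1", "weight_lb": 10.0}]
shipping_system = [{"customerNumber": "C2", "weight_kg": 3.0}]

LB_TO_KG = 0.45359237  # international pound expressed in kilograms


def from_orders(row):
    # Rename cust_id -> customer_id and convert pounds to kilograms.
    return {
        "customer_id": row["cust_id"],
        "weight_kg": round(row["weight_lb"] * LB_TO_KG, 3),
    }


def from_shipping(row):
    # Rename customerNumber -> customer_id; the unit is already kilograms.
    return {
        "customer_id": row["customerNumber"],
        "weight_kg": round(row["weight_kg"], 3),
    }


# After integration every row has the same field names and the same unit.
integrated = [from_orders(r) for r in orders_system] + [
    from_shipping(r) for r in shipping_system
]
```

Each source gets its own small mapping function, so a new source only needs one more adapter into the shared format.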
Non-Volatile
● Non-volatile means that, once entered into the warehouse, data are not changed or updated.
● This is logical because the purpose of a warehouse is to enable you to analyze what has occurred.
Time Variant
● In order to discover trends in business, analysts need large amounts of data.
● This is very much in contrast to online transaction processing (OLTP) systems, where performance requirements demand that historical data be moved to an archive.
● In a warehouse, the data are kept for many years so they can be used for trends, forecasting, and comparisons over time.
● A data warehouse's focus on change over time is what is meant by the term time variant.
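The non-volatile and time-variant properties can be contrasted with an operational store in a few lines. This is a toy sketch: the product, prices, and dates are made up, and a real warehouse would load snapshots in batches rather than on every change.

```python
from datetime import date

# Operational store: keeps only the current value per key.
operational = {}
# Warehouse: keeps every loaded snapshot; rows are appended, never updated.
warehouse = []


def record_price(product, price, as_of):
    operational[product] = price                 # overwrite: history is lost
    warehouse.append((product, price, as_of))    # append: history is kept


record_price("widget", 10.0, date(2022, 1, 1))
record_price("widget", 12.0, date(2023, 1, 1))
record_price("widget", 15.0, date(2024, 1, 1))


def price_history(product):
    """Return (year, price) pairs for a product, in load order."""
    return [(as_of.year, price) for p, price, as_of in warehouse if p == product]
```

The operational dictionary can only answer "what is the price now?", while the appended snapshots support comparisons over time.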
Data Granularity
Data Marts
● Data Mart: A scaled-down version of the data warehouse
● A data mart is a small warehouse designed for the strategic business unit (SBU) or department level.
● It is often a way to gain entry to data warehousing and provides an opportunity to learn.
● Major problem: if data marts differ from department to department, they can be difficult to integrate enterprise-wide.
Data Warehouse vs Data Mart
Data Warehouse vs Operational Database Management System
● OLTP (on-line transaction processing)
○ Major task of traditional relational DBMS
○ Day-to-day operations: purchasing, inventory, banking, manufacturing, payroll, registration, accounting, etc.
● OLAP (on-line analytical processing)
○ Major task of data warehouse system
○ Data analysis and decision making
● Distinct features (OLTP vs. OLAP):
○ User and system orientation: customer vs. market
○ Data contents: current, detailed vs. historical, consolidated
○ Database design: ER + application vs. star + subject
○ View: current, local vs. evolutionary, integrated
○ Access patterns: update vs. read-only but complex queries
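To make "star + subject" concrete, here is a minimal star schema sketch using Python's built-in sqlite3 module: one fact table surrounded by dimension tables, queried read-only with joins and aggregation. The table names, columns, and rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables describe the "who/what/when" of each fact.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
    -- The fact table holds the measures and points at the dimensions.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id    INTEGER REFERENCES dim_date(date_id),
        amount     REAL
    );
    INSERT INTO dim_product VALUES (1, 'toys'), (2, 'books');
    INSERT INTO dim_date VALUES (1, 2023, 1), (2, 2023, 2), (3, 2024, 1);
    INSERT INTO fact_sales VALUES
        (1, 1, 100.0), (1, 2, 50.0), (2, 1, 30.0), (2, 3, 70.0), (1, 3, 20.0);
    """
)

# A typical analytical query: read-only, joins the fact table to its
# dimensions, and consolidates detail rows into totals.
rows = conn.execute(
    """
    SELECT p.category, d.year, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    JOIN dim_date d ON d.date_id = f.date_id
    GROUP BY p.category, d.year
    ORDER BY p.category, d.year
    """
).fetchall()
```

An OLTP design would instead normalize for fast single-row updates; the star layout trades that for simple, wide aggregations.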
OLTP vs OLAP
Why separate Data Warehouse?
● High performance for both systems
○ DBMS: tuned for OLTP (access methods, indexing, concurrency control, recovery)
○ Warehouse: tuned for OLAP (complex OLAP queries, multidimensional view, and consolidation)
● Different functions and different data:
○ Missing data: Decision support (DS) requires historical data, which operational DBs do not typically maintain
○ Data consolidation: DS requires consolidation (aggregation, summarization) of data from heterogeneous sources
○ Data quality: different sources typically use inconsistent data representations, codes, and formats, which have to be reconciled
OLAP
OLAP (online analytical processing) is software for performing
multidimensional analysis at high speeds on large volumes of data from a
data warehouse, data mart, or some other unified, centralized data store.
In a data warehouse, data sets are stored in tables, each of which can
organize data along just two dimensions (rows and columns) at a time.
OLAP extracts data from multiple relational data sets and reorganizes it
into a multidimensional format that enables very fast processing and very
insightful analysis.
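As a small illustration of a multidimensional format, the sketch below stores measures in cells keyed by three dimensions and shows two common cube operations, roll-up and slice. The dimension names and figures are invented for the example, and real OLAP engines use far more compact storage than a Python dictionary.

```python
from collections import defaultdict

# A tiny "cube": each cell is keyed by (product, region, quarter).
DIMENSIONS = ("product", "region", "quarter")
cells = {
    ("widget", "north", "Q1"): 10,
    ("widget", "south", "Q1"): 5,
    ("widget", "north", "Q2"): 7,
    ("gadget", "north", "Q1"): 3,
}


def roll_up(cells, keep):
    """Aggregate the cube, keeping only the named dimensions."""
    idx = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for key, value in cells.items():
        totals[tuple(key[i] for i in idx)] += value
    return dict(totals)


def slice_cube(cells, dimension, value):
    """Keep only the cells where one dimension has a fixed value."""
    i = DIMENSIONS.index(dimension)
    return {k: v for k, v in cells.items() if k[i] == value}


by_product = roll_up(cells, ["product"])
by_product_quarter = roll_up(cells, ["product", "quarter"])
q1_only = slice_cube(cells, "quarter", "Q1")
```

The same cells answer questions along any combination of dimensions, which a single two-dimensional table cannot do without restructuring.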
Types of OLAP models
We have four types of OLAP models:
● Relational OLAP (ROLAP)
○ Uses a relational or extended-relational DBMS to store and manage warehouse data, and OLAP middleware to support missing pieces
○ Includes optimization of the DBMS backend, implementation of aggregation navigation logic, and additional tools and services
○ Greater scalability
● Multidimensional OLAP (MOLAP)
○ Array-based multidimensional storage engine (sparse matrix techniques)
○ Fast indexing to pre-computed summarized data
● Hybrid OLAP (HOLAP)
○ User flexibility, e.g., low level: relational, high level: array
● Specialized SQL servers
○ Specialized support for SQL queries over star/snowflake schemas
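The difference between the ROLAP and MOLAP styles can be sketched side by side. This is a simplified illustration, not how any particular product works: the ROLAP half aggregates with SQL at query time via Python's built-in sqlite3 module, and the MOLAP half uses a small dense array with precomputed totals (real MOLAP engines use sparse techniques, as noted above). All names and figures are assumptions.

```python
import sqlite3

facts = [("widget", "north", 10.0), ("widget", "south", 5.0), ("gadget", "north", 3.0)]

# ROLAP-style: facts stay in relational tables; aggregates are computed
# by SQL when the question is asked.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", facts)


def rolap_total(product):
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM sales WHERE product = ?", (product,)
    ).fetchone()
    return row[0]


# MOLAP-style: facts are loaded into an array indexed by dimension position,
# and summaries are computed once, ahead of any query.
products = ["gadget", "widget"]
regions = ["north", "south"]
cube = [[0.0 for _ in regions] for _ in products]
for product, region, amount in facts:
    cube[products.index(product)][regions.index(region)] += amount
product_totals = [sum(row) for row in cube]  # precomputed summary


def molap_total(product):
    # Answering is a direct lookup into the precomputed totals.
    return product_totals[products.index(product)]
```

Both functions return the same totals; the trade-off is where the work happens. A HOLAP design would keep the detailed rows relational and only the summaries in arrays.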
Thank you
Saleha Mariyam
Assistant Professor
Dept. of Computer Science
Integral University


Editor's Notes

  1. https://www2.seas.gwu.edu/~bell/csci243/lectures/data_warehousing.pdf