Data Warehousing
and Data Mining
Unit 1
Overview
● Introduction
● Need of Data warehousing
● Data warehousing Building Blocks
● OLAP in Data warehouse
● OLAP models
● Operational DB vs Data warehouse
Introduction
Data Warehouse
A data warehouse is a subject-oriented, integrated, time-variant and
non-volatile collection of data in support of management's
decision-making process. - Bill Inmon
Data warehousing is the process whereby organizations extract value from
their informational assets through the use of special stores called data
warehouses.
Need of Data Warehouse
Computer applications that support day-to-day business operations are
effective at what they are designed to do. They gather, store, and process all the
data needed to successfully perform the daily operations.
As businesses grew more complex, business executives became
desperate for information to stay competitive. What the executives needed
were different kinds of information that could be readily used to make strategic
decisions. The operational systems, important as they were, could not provide
strategic information. Businesses were therefore compelled to turn to new
ways of getting strategic information. Data warehousing is a new paradigm
specifically intended to provide vital strategic information.
Need of Data Warehouse
ESCALATING NEED FOR STRATEGIC INFORMATION
The executives and managers who are responsible for keeping the enterprise
competitive need information to make proper decisions. They need information
to formulate business strategies, establish goals, set objectives, and
monitor results.
Critical business decisions depend on the availability of proper strategic
information in an enterprise.
The desired characteristics of strategic information are:
1. INTEGRATED - Must have a single, enterprise-wide view.
2. DATA INTEGRITY - Information must be accurate and must conform to business rules.
3. ACCESSIBLE - Easily accessible with intuitive access paths, and responsive for analysis.
4. CREDIBLE - Every business factor must have one and only one value.
5. TIMELY - Information must be available within the stipulated time frame.
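The CREDIBLE requirement (one and only one value per business factor) can be checked mechanically. The following is a minimal sketch, assuming hypothetical department reports; the source names, factor names, and figures are invented for illustration only.

```python
# Hypothetical reports from several departments for the same business factors.
reports = [
    ("finance", "q1_revenue", 1000),
    ("sales", "q1_revenue", 1040),
    ("finance", "headcount", 50),
    ("hr", "headcount", 50),
]


def conflicting_factors(rows):
    """Return the factors that have more than one distinct reported value."""
    seen = {}
    for _source, factor, value in rows:
        seen.setdefault(factor, set()).add(value)
    return sorted(f for f, values in seen.items() if len(values) > 1)


print(conflicting_factors(reports))
```

Here q1_revenue is reported with two different values, so it would not count as credible until the sources are reconciled.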
Need of Data Warehouse
FAILURES OF PAST DECISION-SUPPORT SYSTEMS
The user must be able to query online, get results, and query some more. The
information must be in a format suitable for analysis.
PAST DECISION-SUPPORT SYSTEMS
Ad Hoc Reports. Users would send requests to IT for special reports. IT would write special programs, typically one for each
request, and produce the ad hoc reports.
Special Extract Programs. For the types of reports that would be requested from time to time, IT would write a
suite of programs and run them periodically to extract data from the various applications. For any reports
that could not be run off the extracted files, IT would write individual special programs.
Small Applications. IT would create simple applications based on the extracted files. The users could stipulate the
parameters for each special report.
Information Centers. The information center typically was a place where users could view special information on
screens.
Decision-Support Systems. These systems were supported by extracted files. The systems were menu-driven
and provided online information and also the ability to print special reports.
Executive Information Systems. The main criteria were simplicity and ease of use. The system would display
key information every day and provide the ability to request simple, straightforward reports. However, only
preprogrammed screens and reports were available.
Data Warehouse: the only viable solution
DATA WAREHOUSE IS DEFINED AS:
The data warehouse is an informational environment that:
● Provides an integrated and total view of the enterprise
● Makes the enterprise’s current and historical information easily available
for decision making
● Makes decision-support transactions possible without hindering
operational systems
● Renders the organization’s information consistent
● Presents a flexible and interactive source of strategic information
Data Warehouse: The building block
A data warehouse is typically a dedicated database system for decision
making that is separate from the production database(s) used
operationally. It differs from a production system in that:
● it covers a much longer time horizon than transaction systems
● it includes multiple databases that have been processed so that the
warehouse’s data are defined uniformly (i.e., ‘clean’ data)
● it is optimized for answering complex queries from managers and
analysts
Standard Database vs Data warehouse
Characteristics
Subject Oriented
● Data is organized around major subjects of the enterprise.
● Data warehouses are designed to help you analyze data.
● For example, to learn more about your company's sales data, you
can build a warehouse that concentrates on sales.
● Using this warehouse, you can answer questions like "Who was
our best customer for this item last year?" This ability to define a
data warehouse by subject matter, sales in this case, makes the
data warehouse subject oriented.
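The "best customer" question above can be sketched as a query against a sales subject area. This is only an illustration using Python's built-in sqlite3 module; the table name, column names, and sample rows are assumptions, not part of the original slides.

```python
import sqlite3

# Hypothetical sales subject area; table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (customer TEXT, item TEXT, sale_year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Acme", "widget", 2023, 500.0),
        ("Acme", "widget", 2023, 250.0),
        ("Birch", "widget", 2023, 600.0),
        ("Birch", "gadget", 2023, 900.0),
        ("Acme", "widget", 2022, 5000.0),
    ],
)


def best_customer(item, year):
    """Return (customer, total) with the highest total sales of an item in a year."""
    return conn.execute(
        "SELECT customer, SUM(amount) AS total FROM sales "
        "WHERE item = ? AND sale_year = ? "
        "GROUP BY customer ORDER BY total DESC LIMIT 1",
        (item, year),
    ).fetchone()


print(best_customer("widget", 2023))
```

Because all the data is organized around the sales subject, the question reduces to a single grouped query.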
Integrated
● Integration is closely related to subject orientation.
● Data warehouses must put data from disparate sources into a
consistent format.
● They must resolve such problems as naming conflicts and
inconsistencies among units of measure.
● When they achieve this, they are said to be integrated.
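As a rough sketch of what integration involves, the snippet below maps rows from two hypothetical source systems into one format. It resolves a naming conflict (cust_name vs customer) and a unit conflict (pounds vs kilograms). The source systems and field names are assumptions made for illustration.

```python
# Two hypothetical source systems that describe the same kind of fact differently.
crm_rows = [{"cust_name": "Acme", "weight_lb": 10.0}]
erp_rows = [{"customer": "Birch", "weight_kg": 2.0}]

LB_TO_KG = 0.45359237  # definition of the international avoirdupois pound


def from_crm(row):
    """Rename cust_name to customer and convert pounds to kilograms."""
    return {"customer": row["cust_name"], "weight_kg": row["weight_lb"] * LB_TO_KG}


def from_erp(row):
    """ERP rows already use the target names and units; copy them as-is."""
    return {"customer": row["customer"], "weight_kg": row["weight_kg"]}


integrated = [from_crm(r) for r in crm_rows] + [from_erp(r) for r in erp_rows]
print(integrated)
```

After this step every row uses the same field names and the same unit of measure, whatever system it came from.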
Non Volatile
● Non-volatile means that, once entered into the warehouse, data are
not changed/updated.
● This is logical because the purpose of a warehouse is to enable you
to analyze what has occurred.
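Non-volatility can be pictured as a store that offers loading and reading but no update operation. The class below is a hypothetical in-memory stand-in, not how any particular warehouse product works.

```python
class AppendOnlyStore:
    """Hypothetical stand-in for a warehouse table: rows are loaded and read, never changed."""

    def __init__(self):
        self._rows = []

    def load(self, row):
        # Copy the row so later changes to the caller's dict do not leak in.
        self._rows.append(dict(row))

    def rows(self):
        # Hand out copies so readers cannot modify what is stored.
        return tuple(dict(r) for r in self._rows)


store = AppendOnlyStore()
order = {"order_id": 1, "status": "shipped"}
store.load(order)
order["status"] = "cancelled"  # a later change in the operational system

print(store.rows()[0]["status"])
```

The stored row still records what was true when it was loaded, which is what analysis of past events needs.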
Time Variant
● In order to discover trends in business, analysts need large
amounts of data.
● This is very much in contrast to online transaction processing
(OLTP) systems, where performance requirements demand that
historical data be moved to an archive.
● The data are kept for many years so they can be used for trends,
forecasting, and comparisons over time.
● A data warehouse's focus on change over time is what is meant
by the term time variant.
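A small sketch of the time-variant idea: each stored value carries the period it describes, so periods can be compared. The snapshot figures below are invented purely for illustration.

```python
from datetime import date

# Hypothetical monthly revenue snapshots, each stamped with the period it describes.
snapshots = [
    (date(2022, 1, 31), 100.0),
    (date(2022, 2, 28), 110.0),
    (date(2023, 1, 31), 130.0),
    (date(2023, 2, 28), 121.0),
]


def yearly_totals(rows):
    """Sum revenue per calendar year."""
    totals = {}
    for period_end, revenue in rows:
        totals[period_end.year] = totals.get(period_end.year, 0.0) + revenue
    return totals


totals = yearly_totals(snapshots)
# Year-over-year growth is only computable because the 2022 rows were kept.
growth = (totals[2023] - totals[2022]) / totals[2022]
print(totals, round(growth, 3))
```

An OLTP system that archived the 2022 rows could not answer this comparison directly.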
Data Granularity
Data Marts
● Data Mart: A scaled-down version of the data warehouse
● A data mart is a small warehouse designed for the Strategic
Business Unit (SBU) or department level.
● It is often a way to gain entry and provide an opportunity to
learn.
● Major problem: if they differ from department to department,
they can be difficult to integrate enterprise-wide.
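One common way to limit the integration problem is to derive each data mart from a shared warehouse, so that every mart inherits the same definitions. The sketch below assumes a hypothetical list of warehouse rows and a hypothetical dept column.

```python
# Hypothetical warehouse rows covering several departments.
warehouse = [
    {"dept": "sales", "region": "north", "amount": 120.0},
    {"dept": "sales", "region": "south", "amount": 80.0},
    {"dept": "hr", "region": "north", "amount": 15.0},
]


def build_mart(rows, dept):
    """Keep one department's rows and drop the now-redundant dept column."""
    return [
        {k: v for k, v in r.items() if k != "dept"}
        for r in rows
        if r["dept"] == dept
    ]


sales_mart = build_mart(warehouse, "sales")
print(sales_mart)
```

Marts built independently by each department have no such shared source, which is where the enterprise-wide integration difficulty comes from.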
Data Warehouse vs Data Mart
Data Warehouse vs Operational Database Management System
● OLTP (on-line transaction processing)
○ Major task of traditional relational DBMS
○ Day-to-day operations: purchasing, inventory, banking,
manufacturing, payroll, registration, accounting, etc.
● OLAP (on-line analytical processing)
○ Major task of data warehouse system
○ Data analysis and decision making
● Distinct features (OLTP vs. OLAP):
○ User and system orientation: customer vs. market
○ Data contents: current, detailed vs. historical, consolidated
○ Database design: ER + application vs. star + subject
○ View: current, local vs. evolutionary, integrated
○ Access patterns: update vs. read-only but complex queries
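The difference in access patterns can be sketched with Python's built-in sqlite3 module. The two tables and their columns are hypothetical. In practice the two workloads would run on separate systems rather than in one database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# OLTP-style table: one current row per account (hypothetical schema).
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL)")
conn.execute("INSERT INTO account VALUES (1, 100.0)")
# OLAP-style table: one historical row per transaction.
conn.execute(
    "CREATE TABLE txn_history (account_id INTEGER, txn_year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO txn_history VALUES (?, ?, ?)",
    [(1, 2022, 40.0), (1, 2022, 60.0), (1, 2023, -25.0)],
)

# OLTP access pattern: a short update that touches a single current row.
conn.execute("UPDATE account SET balance = balance - 25.0 WHERE id = 1")
current_balance = conn.execute(
    "SELECT balance FROM account WHERE id = 1"
).fetchone()[0]

# OLAP access pattern: a read-only query that scans and consolidates history.
yearly = conn.execute(
    "SELECT txn_year, SUM(amount) FROM txn_history "
    "GROUP BY txn_year ORDER BY txn_year"
).fetchall()

print(current_balance, yearly)
```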
OLTP vs OLAP
Why separate Data Warehouse?
● High performance for both systems
○ DBMS - tuned for OLTP: access methods, indexing, concurrency control, recovery
○ Warehouse - tuned for OLAP: complex OLAP queries, multidimensional view, and
consolidation.
● Different functions and different data:
○ Missing data: Decision support requires historical data which operational DBs do not
typically maintain
○ Data consolidation: DS requires consolidation (aggregation, summarization) of data
from heterogeneous sources
○ Data quality: different sources typically use inconsistent data representations, codes
and formats which have to be reconciled.
OLAP
OLAP (online analytical processing) is software for performing
multidimensional analysis at high speed on large volumes of data from a
data warehouse, data mart, or some other unified, centralized data store.
Most business data has several dimensions, for example product, region,
and time. In a data warehouse, data sets are stored in tables, each of which can
organize data into just two of these dimensions at a time. OLAP extracts
data from multiple relational data sets and reorganizes it into a
multidimensional format that enables very fast processing and very
insightful analysis.
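To make "multidimensional format" concrete, here is a minimal sketch of a cube held in a Python dictionary, with a roll-up that aggregates away chosen dimensions. The three dimensions (product, region, quarter) and the sample rows are assumptions for illustration, and real OLAP engines use far more elaborate storage.

```python
from collections import defaultdict

# Hypothetical fact rows with three dimensions: product, region, quarter.
facts = [
    ("widget", "north", "Q1", 10),
    ("widget", "south", "Q1", 20),
    ("gadget", "north", "Q1", 5),
    ("widget", "north", "Q2", 7),
]

# The "cube" maps each (product, region, quarter) cell to its summed measure.
cube = defaultdict(int)
for product, region, quarter, units in facts:
    cube[(product, region, quarter)] += units


def roll_up(cells, keep):
    """Aggregate away every dimension whose index is not listed in keep."""
    out = defaultdict(int)
    for cell, units in cells.items():
        out[tuple(cell[i] for i in keep)] += units
    return dict(out)


by_product = roll_up(cube, keep=(0,))
by_region_quarter = roll_up(cube, keep=(1, 2))
print(by_product)
print(by_region_quarter)
```

The same cube answers both questions. Only the set of dimensions kept changes.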
Types of OLAP models
We have four types of OLAP models:
● Relational OLAP (ROLAP)
○ Uses a relational or extended-relational DBMS to store and manage warehouse data, and OLAP
middleware to support missing pieces
○ Includes optimization of the DBMS backend, implementation of aggregation navigation logic, and
additional tools and services
○ Greater scalability
● Multidimensional OLAP (MOLAP)
○ Array-based multidimensional storage engine (sparse matrix techniques)
○ Fast indexing to pre-computed summarized data
● Hybrid OLAP (HOLAP)
○ User flexibility, e.g., low level: relational, high level: array
● Specialized SQL servers
○ Specialized support for SQL queries over star/snowflake schemas
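The contrast between ROLAP and MOLAP can be sketched as follows. This is a simplified illustration with hypothetical data. The MOLAP side uses a small dense nested list for clarity, whereas real engines rely on sparse-matrix techniques as noted above.

```python
import sqlite3

rows = [("widget", "north", 10), ("widget", "south", 20), ("gadget", "north", 5)]

# ROLAP-style: facts stay in a relational table and are aggregated at query time.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact (product TEXT, region TEXT, units INTEGER)")
conn.executemany("INSERT INTO fact VALUES (?, ?, ?)", rows)


def rolap_total(product):
    """Compute the total on demand with a SQL aggregate."""
    return conn.execute(
        "SELECT SUM(units) FROM fact WHERE product = ?", (product,)
    ).fetchone()[0]


# MOLAP-style: an array indexed by dimension position, with summaries computed up front.
products = ["gadget", "widget"]
regions = ["north", "south"]
array = [[0] * len(regions) for _ in products]
for product, region, units in rows:
    array[products.index(product)][regions.index(region)] += units
product_totals = [sum(cells) for cells in array]  # pre-computed summary


def molap_total(product):
    """Look up a pre-computed total by array position."""
    return product_totals[products.index(product)]


print(rolap_total("widget"), molap_total("widget"))
```

Both functions return the same figure. The trade-off is where the work happens: at query time for ROLAP, at load time for MOLAP.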
Thank you
Saleha Mariyam
Assistant Professor
Dept. of Computer Science
Integral University


Editor's Notes

  1. https://www2.seas.gwu.edu/~bell/csci243/lectures/data_warehousing.pdf