DATA WAREHOUSE
Compiled by:-
Anshika Nigam
Roll No.-1303213012
Department of Information Technology (IT)
19-02-2016
Topics to be covered-
 History
 Definition
 Need for Data Warehousing
 Attributes of Data Warehousing
 Applications of Data Warehousing
 Difference between Data Warehouse and a Database
 Basic architecture of a Data Warehouse
 Advantages of Data Warehousing
 Future scope/Limitations
 References
History
 Bill Inmon, often called the “father of data warehousing”, is widely
credited with coining the term. He began working on the concept in the
1970s as a data professional and became an expert in relational data
modelling.
 In the 1990s, Ralph Kimball introduced the dimensional modelling
approach to data warehouse design.
 Inmon’s approach is centralized (top-down) and suits large,
enterprise-wide data warehouses, whereas Kimball’s approach is
bottom-up, building the warehouse from integrated data marts, and is
often favoured by smaller businesses.
What is a Data Warehouse?
 Warehouse- A large building where raw materials or
manufactured goods are stored prior to their distribution.
Data Warehouse
 A data warehouse is a central repository that integrates data from
multiple databases so that the data can be used together.
 The process of centralizing or aggregating data from
multiple sources into one common repository is called Data
Warehousing.
 This data is then readily available to business professionals,
managers and others who need it to create forecasts and for
other purposes.
What is the need for data warehousing?
 Decisions need to be made quickly and correctly, using all available
data.
 Information storage needs to be automated and rationalized rather
than spread across many individually developed databases.
 Without consolidated data, forecasting was difficult, and this held
back the overall growth of an organisation.
 Large amounts of data were difficult to collect, store, use and
understand when kept in discrete databases.
 With the overall advancement in business intelligence, there is a
need for quick strategic decision making.
Data Mining v/s Data Warehouse!
Attributes of a Data Warehouse-
 Subject Oriented- Data is organised around a particular subject
(for example sales or customers) rather than around the
organization’s ongoing operations.
 Integrated- Data is gathered from a variety of heterogeneous
sources (individual databases) into one interlinked repository (the
data warehouse).
 Time Variant- All the data in the data warehouse is identified
with a specific time period, so it can be analysed as it changes
over time.
 Non-Volatile- Data is stable in a data warehouse. Previous
data is not deleted when new data is added.
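
The time-variant and non-volatile attributes can be illustrated with a small sketch. This is only an illustration under simple assumptions, not how any particular warehouse product stores data; the class `WarehouseTable`, its methods and the sample customer records are hypothetical names invented for the example.

```python
from datetime import date


class WarehouseTable:
    """Hypothetical append-only table: rows are added with a date and never overwritten."""

    def __init__(self):
        self._rows = []

    def load(self, subject_key, value, as_of):
        # Non-volatile: new data is appended; earlier rows are left untouched.
        self._rows.append({"key": subject_key, "value": value, "as_of": as_of})

    def history(self, subject_key):
        # Time-variant: every row carries the date it applies from.
        return sorted(
            (r for r in self._rows if r["key"] == subject_key),
            key=lambda r: r["as_of"],
        )

    def value_as_of(self, subject_key, when):
        # Latest value recorded on or before `when`, or None if nothing was loaded yet.
        earlier = [r for r in self.history(subject_key) if r["as_of"] <= when]
        return earlier[-1]["value"] if earlier else None


table = WarehouseTable()
table.load("customer-1", "Lucknow", date(2015, 1, 1))
table.load("customer-1", "Delhi", date(2016, 1, 1))
```

Loading the second row does not remove the first, so a question such as "where was this customer in mid-2015?" can still be answered with `table.value_as_of("customer-1", date(2015, 6, 1))`.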
Applications of Data Warehousing-
As discussed before, a data warehouse helps business executives to
organize, analyze, and use their data for decision making.
Data warehouses are widely used in the following fields:
• Financial Services
• Banking Services
• Consumer Goods
• Retail Sectors
• Controlled Manufacturing
Difference between Operational Database and Data Warehouse

DATA WAREHOUSE                                     | OPERATIONAL DATABASE
It involves historical processing of information.  | It involves day-to-day processing.
It is used to analyze the business.                | It is used to run the business.
It contains historical data.                       | It contains current data.
It focuses on information out.                     | It focuses on data in.
The number of records accessed is in the millions. | The number of records accessed is in the tens.
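
The contrast between the two access patterns can be sketched with Python's built-in `sqlite3` module. This is a toy example: the `sales` table, its columns and the sample rows are assumptions made for illustration, and a real deployment would keep the operational database and the warehouse in separate systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (order_id INTEGER PRIMARY KEY, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, 2014, 100.0), (2, 2014, 250.0), (3, 2015, 300.0), (4, 2015, 50.0)],
)

# Operational style ("run the business"): fetch one current record by its key.
single_order = conn.execute(
    "SELECT amount FROM sales WHERE order_id = ?", (3,)
).fetchone()

# Warehouse style ("analyze the business"): scan all history and summarise it.
yearly_totals = conn.execute(
    "SELECT year, SUM(amount) FROM sales GROUP BY year ORDER BY year"
).fetchall()
```

The first query touches a single row, while the second reads every row in the table, which is why warehouse queries typically access far more records.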
Basic architecture of a Data Warehouse -
Data Warehouse Architecture: with a Staging Area
 Data Source Layer- Covers the different data sources. All the data
that is extracted from the various databases comes under this layer.
 ETL (Extract, Transform, Load)- The tool used for extracting,
cleaning, transforming and loading the data.
 Staging Area- With the help of the ETL tool, the data from the
source systems is copied into a temporary location called the Staging
Area.
 Warehouse- The raw data is first cleaned and transformed into a
structured format, and only then loaded into the warehouse.
 Presentation Layer- Once the data is fully loaded into the
warehouse, it can be accessed according to the requirements of the
user, e.g. for data mining, forecasting and analysis.
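
The flow through these layers can be sketched in a few lines of Python. This is a minimal illustration rather than a real ETL tool: the function names, the source names `billing_db` and `crm_db`, and the sample rows are all hypothetical, and the cleaning rules are deliberately simple.

```python
def extract(sources):
    """Copy raw rows from every source into a staging list, tagging their origin."""
    staging = []
    for name, rows in sources.items():
        for row in rows:
            staging.append({"source": name, **row})
    return staging


def transform(staging):
    """Clean the staged rows: normalise names, convert amounts, drop unusable rows."""
    cleaned = []
    for row in staging:
        customer = row.get("customer", "").strip().title()
        if not customer:
            continue  # a row without a customer name cannot be used
        cleaned.append(
            {
                "source": row["source"],
                "customer": customer,
                "amount": float(row["amount"]),
            }
        )
    return cleaned


def load(warehouse, cleaned):
    """Append the cleaned rows to the warehouse (here just a list)."""
    warehouse.extend(cleaned)
    return warehouse


# Data source layer: two hypothetical operational databases.
sources = {
    "billing_db": [{"customer": " alice ", "amount": "120.50"}],
    "crm_db": [{"customer": "BOB", "amount": 80}, {"customer": "", "amount": 10}],
}

warehouse = load([], transform(extract(sources)))
```

Keeping `extract`, `transform` and `load` as separate steps mirrors the staging area: raw data is copied first and only the cleaned, structured rows reach the warehouse.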
Advantages of Data Warehousing-
 Competitive Advantage- The high returns on investment reported by
companies that have successfully implemented a data warehouse are
evidence of the competitive advantage that accompanies this
technology.
 Increased productivity of corporate decision-makers- An integrated
database of consistent, subject-oriented, historical data improves
the productivity of decision makers.
 More cost-effective decision-making- Data warehousing helps to
reduce the overall cost of the product by reducing the number of
channels.
 Better enterprise intelligence- Helps in enhancing customer
services.
Big Data v/s Data Warehouse!
Future Scope/Limitations-
 No Real Time Processing- The stored data is not real time. It takes
some time for recent data to be loaded.
 Hidden problems with source systems- Problems in the source systems
that feed the data warehouse may go undetected for years.
 Data homogenization- May result in the loss of some important data.
 Complexity of integration- An organization must spend a significant
amount of time determining how well the various data warehousing
tools can be integrated into the overall solution that is needed.
 High maintenance- Data warehouses are high-maintenance systems,
which results in high maintenance costs.
References
 www.tutorialspoint.com
 www.programmerinterview.com
 www.ecomputernotes.com
 www.wikipedia.com
 www.study.com
 Video tutorials by Edureka
THANK YOU!