2. PROBLEMS
1. We can't find the data we need (data is scattered over the network)
2. We can't get the data we need (an expert is needed to get the data)
3. We can't understand the data we found (available data is poorly documented)
4. We can't use the data we found (data needs to be transformed from one form to another)
[Figure: Company, with B1 to B8]
3. SOLUTION: DATA WAREHOUSE
A big collection of data from smaller databases.
4. DATA STORED IN DATA WAREHOUSE
Types of data:
1. Operational data (day to day)
2. Strategic information (used in decision making)
Strategic information includes each branch's sales, which products sell the most, which products are in stock, at which times of the year sales are high or low, which products should be marketed, etc.
Operational data is stored in the respective branches, whereas strategic information is stored in data warehouses.
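The split between operational data and strategic information can be sketched in a few lines of Python. This is only an illustration: the branch names, products, and quantities are made up, and `load_warehouse` and `units_sold_by_product` are hypothetical helpers, not part of any real warehouse product.

```python
from collections import defaultdict

# Hypothetical operational records, as each branch might keep them day to day.
branch_sales = {
    "B1": [("soap", 3), ("tea", 5)],
    "B2": [("tea", 2), ("rice", 4)],
}

def load_warehouse(branch_sales):
    """Copy every branch's records into one list, tagging each row with its branch."""
    warehouse = []
    for branch, rows in branch_sales.items():
        for product, qty in rows:
            warehouse.append({"branch": branch, "product": product, "qty": qty})
    return warehouse

def units_sold_by_product(warehouse):
    """Derive one piece of strategic information: total units sold per product."""
    totals = defaultdict(int)
    for row in warehouse:
        totals[row["product"]] += row["qty"]
    return dict(totals)

warehouse = load_warehouse(branch_sales)
totals = units_sold_by_product(warehouse)
best_seller = max(totals, key=totals.get)  # product with the highest total
```

The branches keep their own records, while questions such as "which product sells the most" are answered from the combined copy.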
5. DEFINITION
According to Bill Inmon, also known as the father of data warehousing,
“A data warehouse is a subject-oriented, integrated, time-variant, and non-volatile collection of data in support of management’s decision-making process.”
6. SUBJECT ORIENTED
Organized around major subjects, such as customer, product, sales, etc.
A data warehouse consists of smaller partitions called ‘data marts’.
Each data mart contains detailed information about a particular subject.
Example subjects: Sales, Products, Customers, Employees
Data about these subjects is inserted into the data warehouse in a well-organized form.
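As a rough sketch of subject orientation, incoming records can be routed into one partition per subject. The record layout (`subject` and `data` keys) and the `build_datamarts` helper are assumptions made for this example only.

```python
# The four example subjects from the slide.
SUBJECTS = ("sales", "products", "customers", "employees")

def build_datamarts(records):
    """Group records by subject; reject any record with an unknown subject."""
    marts = {subject: [] for subject in SUBJECTS}
    for record in records:
        subject = record["subject"]
        if subject not in marts:
            raise ValueError(f"unknown subject: {subject}")
        marts[subject].append(record["data"])
    return marts

# Hypothetical incoming records.
records = [
    {"subject": "sales", "data": {"product": "tea", "qty": 2}},
    {"subject": "customers", "data": {"name": "A. Buyer"}},
    {"subject": "sales", "data": {"product": "rice", "qty": 4}},
]

marts = build_datamarts(records)
```

Rejecting unknown subjects is one simple way to keep the inserted data in a controlled form.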
8. NON VOLATILE
Operational data is updated frequently.
For example, when a customer places an order, its details are stored in the database. If the customer changes the delivery location, the database is updated accordingly. After delivery, the database is updated again to mark the order as completed.
In a data warehouse, however, data cannot simply be edited once it has been added.
The only options are to delete it all or to add new information on top of the existing data.
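A minimal sketch of this non-volatile behaviour, assuming a small in-memory store: the class `AppendOnlyStore` and its methods are hypothetical and only model the rule that rows can be added or cleared as a whole, never edited in place.

```python
class AppendOnlyStore:
    """Toy store that allows appending rows and clearing everything, but no edits."""

    def __init__(self):
        self._rows = []

    def append(self, row):
        # Store a copy so later changes to the caller's dict do not leak in.
        self._rows.append(dict(row))

    def rows(self):
        # Return copies so callers cannot modify stored rows.
        return tuple(dict(r) for r in self._rows)

    def clear(self):
        # "Delete it all" is the only removal operation offered.
        self._rows.clear()

store = AppendOnlyStore()
# The order example: each change becomes a new row instead of an update.
store.append({"order_id": 1, "status": "placed", "address": "Old Street"})
store.append({"order_id": 1, "status": "address changed", "address": "New Street"})
store.append({"order_id": 1, "status": "delivered", "address": "New Street"})

history = [r["status"] for r in store.rows()]
```

Where the operational database would hold only the final state of the order, this store keeps all three states.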
9. TIME VARIANT
The time horizon of a data warehouse is significantly longer than that of the operational systems.
It provides information from a historical perspective (e.g. the past 5-10 years).
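To illustrate the time-variant property, the sketch below keeps a date on every row and summarizes sales over a span of years. The sample rows and the `total_by_year` helper are invented for this example.

```python
from datetime import date

# Hypothetical warehouse rows; every row carries the date it refers to.
sales = [
    {"day": date(2015, 3, 1), "amount": 100},
    {"day": date(2015, 11, 20), "amount": 250},
    {"day": date(2019, 6, 5), "amount": 300},
    {"day": date(2023, 1, 15), "amount": 400},
]

def total_by_year(rows, first_year, last_year):
    """Sum amounts per year for rows dated within [first_year, last_year]."""
    totals = {}
    for row in rows:
        year = row["day"].year
        if first_year <= year <= last_year:
            totals[year] = totals.get(year, 0) + row["amount"]
    return totals

yearly = total_by_year(sales, 2015, 2019)
```

Because old rows are kept with their dates, questions about earlier years remain answerable long after the operational systems have moved on.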