Data Warehouse
Presented by: Akkal Bist
akkalbist55@gmail.com
Agenda
Introduction to data warehouse
The goals of a data warehouse
Types of data warehouse
Architecture of data warehouse
Benefits and problems of data warehouse
Introduction to Data Warehouse
A data warehouse is a:
• Subject-oriented
• Integrated
• Time-variant
• Non-volatile
collection of data in support of management's decision-making process.
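As a rough illustration of the time-variant and non-volatile properties, the sketch below models a warehouse table as an append-only list of dated snapshots. The `CustomerSnapshot` type, its field names, the sample rows, and the `as_of` helper are all hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)  # frozen: a loaded row is never modified (non-volatile)
class CustomerSnapshot:
    customer_id: int
    city: str
    load_date: date  # every row records when it was loaded (time-variant)


# Hypothetical warehouse table: a change in the source becomes a new row,
# the old row is kept as history.
warehouse = [
    CustomerSnapshot(1, "Kathmandu", date(2020, 1, 1)),
    CustomerSnapshot(1, "Pokhara", date(2021, 6, 1)),
]


def as_of(rows, customer_id, when):
    """Return the latest snapshot for a customer loaded on or before `when`."""
    candidates = [r for r in rows
                  if r.customer_id == customer_id and r.load_date <= when]
    return max(candidates, key=lambda r: r.load_date, default=None)
```

Because old rows are kept rather than overwritten, a query such as `as_of(warehouse, 1, date(2020, 12, 31))` can still answer what was true at an earlier point in time.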
The goals of a data warehouse
• Makes an organization’s information accessible
• Makes the organization’s information consistent
• Is an adaptive and resilient source of information
• Is a secure bastion that protects our information assets
• Is the foundation for decision making
Basic elements of the data warehouse
(Diagram) Source Systems (Legacy) → Data Staging Area → Presentation Servers → End User Data Access

1) Source Systems (Legacy)
• Extract: data is extracted from each source system

2) Data Staging Area
• Storage: flat files; RDBMS; other
• Processing: clean; prune; combine; remove duplicates; household; standardize; conform dimensions; store awaiting replication; archive; export to data marts
• No user query services
• Populate, replicate, recover (one link per data mart)

3) Presentation Servers
• Data Mart #1: OLAP (ROLAP or MOLAP) query services; dimensional; subject oriented; locally implemented; user group driven; may store atomic data; may be frequently refreshed; conforms to DW bus
• Data Mart #2
• Data Mart #3
• DW BUS between the data marts: conformed dimensions, conformed facts
• Feed (links to end user data access)

4) End User Data Access
• Ad Hoc Query Tools
• Report Writers
• End User Applications
• Models: forecasting; scoring; allocating; data mining; other downstream systems; other parameters; special UI

Other arrow labels in the diagram: upload cleaned dimensions; upload model results
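A few of the staging-area processing steps named above (clean, standardize, remove duplicates) can be sketched as small functions. This is only a minimal illustration under assumed data: the record fields, the upper-case country-code convention, and the function names are hypothetical and do not come from the slides.

```python
def clean(record):
    """Strip surrounding whitespace; turn empty strings into None."""
    out = {}
    for key, value in record.items():
        if isinstance(value, str):
            value = value.strip() or None
        out[key] = value
    return out


def standardize(record):
    """Bring values to one convention (assumed here: upper-case country codes)."""
    rec = dict(record)
    if rec.get("country"):
        rec["country"] = rec["country"].upper()
    return rec


def remove_duplicates(records, key_fields):
    """Keep the first record seen for each combination of key fields."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def stage(extracted, key_fields):
    """Run extracted source rows through the staging steps in order."""
    prepared = [standardize(clean(r)) for r in extracted]
    return remove_duplicates(prepared, key_fields)


# Hypothetical rows extracted from two source systems.
extracted = [
    {"customer_id": 1, "name": " Asha ", "country": "np"},
    {"customer_id": 1, "name": "Asha", "country": "NP"},
    {"customer_id": 2, "name": "Bikram", "country": ""},
]
staged = stage(extracted, key_fields=("customer_id", "name", "country"))
```

Cleaning and standardizing run before de-duplication so that rows differing only in spacing or letter case are recognized as the same record.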
Types of data warehouse
1) Data Mart
• A logical subset of the complete data warehouse
• It focuses on a single subject (for example, the financial area)
• A data warehouse is made up of the union of all its data marts
• It is built and controlled by a single department within an organization
• A data mart can contain not only summary data but also atomic data
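The idea of a data mart as a single-subject subset that holds both atomic and summary data can be sketched as follows. The fact rows, the `subject` and `account` fields, and `build_data_mart` are hypothetical names chosen for illustration only.

```python
from collections import defaultdict

# Hypothetical atomic fact rows held in the complete warehouse.
warehouse_facts = [
    {"subject": "finance", "account": "A-100", "amount": 250.0},
    {"subject": "finance", "account": "A-100", "amount": 100.0},
    {"subject": "finance", "account": "B-200", "amount": 75.0},
    {"subject": "inventory", "account": "W-1", "amount": 40.0},
]


def build_data_mart(facts, subject):
    """Return the atomic rows for one subject plus a per-account summary."""
    atomic = [row for row in facts if row["subject"] == subject]
    summary = defaultdict(float)
    for row in atomic:
        summary[row["account"]] += row["amount"]
    return {"atomic": atomic, "summary": dict(summary)}


finance_mart = build_data_mart(warehouse_facts, "finance")
```

Here `finance_mart["atomic"]` keeps the individual finance rows, while `finance_mart["summary"]` holds the totals per account.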
2) Online Analytical Processing (OLAP)
• OLAP is characterized by a low volume of transactions
• Queries are very complex and involve aggregation
• Response time is the measure of effectiveness
• These systems typically have a data latency of a few hours, as opposed to data marts, where latency is expected to be closer to one day
OLAP (cont.)
Basic operations
• Roll-up (consolidation), drill-down, and slicing and dicing
By contrast, Online Transaction Processing (OLTP):
• Is characterized by a large number of short online transactions (INSERT, UPDATE, DELETE)
• Emphasizes fast query processing and maintaining data integrity in multi-access environments
• Measures effectiveness by the number of transactions per second
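The basic OLAP operations listed above (roll-up, drill-down, slicing and dicing) can be illustrated on a tiny in-memory cube. This is a sketch only: the sales rows, the dimension names, and the `aggregate` and `select` helpers are hypothetical, and a real OLAP server would do this work inside the database.

```python
from collections import defaultdict

# Hypothetical sales cube stored as flat fact rows.
sales = [
    {"year": 2023, "quarter": "Q1", "region": "East", "product": "Tea", "amount": 100},
    {"year": 2023, "quarter": "Q2", "region": "East", "product": "Tea", "amount": 150},
    {"year": 2023, "quarter": "Q1", "region": "West", "product": "Rice", "amount": 200},
    {"year": 2024, "quarter": "Q1", "region": "East", "product": "Rice", "amount": 50},
]


def aggregate(rows, dims):
    """Sum `amount` grouped by the given dimension names."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[d] for d in dims)] += row["amount"]
    return dict(totals)


def select(rows, **conditions):
    """Keep rows whose dimension values fall in the allowed set for each condition."""
    return [r for r in rows
            if all(r[d] in allowed for d, allowed in conditions.items())]


# Roll-up: consolidate to a coarser level (totals per year).
roll_up = aggregate(sales, ["year"])
# Drill-down: move to a finer level (totals per year and quarter).
drill_down = aggregate(sales, ["year", "quarter"])
# Slice: fix a single dimension to one value.
slice_2023 = select(sales, year={2023})
# Dice: pick a sub-cube by restricting several dimensions at once.
dice = select(sales, year={2023}, region={"East"}, product={"Tea"})
```

Roll-up and drill-down differ only in how many dimensions are grouped on, while slice and dice differ only in how many dimensions are restricted.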
Security in data warehouse
• It is an integrated repository derived from multiple source (OL) databases
• Replication control
• Aggregation and generalization
• Exaggeration and misleading
• Anonymity
Applications of data warehouse
Micromarketing and profitability calculation
Stock control
Product category management
Basket analysis
Fraud analysis
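To make one of these applications concrete, basket analysis can be sketched as counting how often pairs of products appear in the same transaction. The baskets and the `pair_support` helper below are hypothetical, and this simple pair counting is only one possible way to approach basket analysis.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each basket is the set of products bought together.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]


def pair_support(baskets):
    """Return the fraction of baskets that contain each product pair."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair has a single canonical ordering.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}


support = pair_support(baskets)
```

Pairs with high support are candidates for joint promotion or shelf placement.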
Architecture
Benefits
Potential high return on investment
Competitive advantages
Increased productivity of corporate decision-making
=> by creating an integrated database of consistent, subject-oriented, historical data.
More cost-effective decision making
=> helps to reduce the overall cost of the product by reducing the number of channels.
Problems
Underestimation of resources for data loading
Hidden problems with source systems
Required data not captured
Increased end-user demands
Data homogenization
High demand for resources
Data ownership
High maintenance
Long-duration project
Thank you!!!
