INTRODUCTION TO
DATA WAREHOUSING
By: Eng. Eyad R. Manaa
Published in: Technology, Business
  1. INTRODUCTION TO DATA WAREHOUSING
     By: Eng. Eyad R. Manaa
  2. INTRODUCTION
     • Data: Meaningful facts, text, graphics, images, sound, video segments.
     • Database: An organized collection of logically related data.
     • Information: Data processed to be useful in decision making.
     • Metadata: Data that describes data.
  3. ADVANTAGES OF THE DATABASE APPROACH
     • Data Independence/Reduced Maintenance
     • Improved Data Sharing
     • Increased Application Development Productivity
     • Enforcement of Standards
     • Improved Data Quality (Constraints)
     • Better Data Accessibility/Responsiveness
     • Security, Backup/Recovery, Concurrency
  4. PROBLEM: HETEROGENEOUS INFORMATION SOURCES
     “Heterogeneities are everywhere”
     Sources: Personal Databases, Digital Libraries, Scientific Databases, World Wide Web
     • Different interfaces
     • Different data representations
     • Duplicate and inconsistent information
  5. PROBLEM: DATA MANAGEMENT IN LARGE ENTERPRISES
     • Fragmentation of informational systems
     • Result of application (user)-driven development of operational systems
     [Diagram: operational systems grouped by department (Sales, Administration, Finance, Manufacturing, ...), with applications such as Sales Planning, Stock Mngmt, Suppliers, Debt Mngmt, Num. Control, Inventory, ...]
  6. SOLUTION: UNIFIED ACCESS TO DATA
     Integration System
     • Collects and combines information
     • Provides integrated view, uniform user interface
     • Supports sharing
     Sources: World Wide Web, Digital Libraries, Scientific Databases, Personal Databases
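     As a rough illustration of "collects and combines information" and "provides integrated view", the sketch below maps two sources with different representations onto one common record type and merges duplicates. The source names, field names, and the `Customer` type are all hypothetical; the slides do not prescribe any particular schema or tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Common (integrated) representation; field names are illustrative only."""
    customer_id: int
    name: str
    country: str


# Two hypothetical sources describing the same kind of entity differently.
personal_db = [{"id": "17", "full_name": "ada lovelace", "country": "UK"}]
web_source = [("17", "Ada Lovelace", "uk"), ("42", "Alan Turing", "UK")]


def from_personal_db(row: dict) -> Customer:
    # Wrapper for the dictionary-shaped source.
    return Customer(int(row["id"]), row["full_name"].title(), row["country"].upper())


def from_web_source(row: tuple) -> Customer:
    # Wrapper for the tuple-shaped source.
    ident, name, country = row
    return Customer(int(ident), name.title(), country.upper())


def integrate() -> list[Customer]:
    """Collect from every source and combine into one duplicate-free view."""
    combined = {from_personal_db(r) for r in personal_db}
    combined |= {from_web_source(r) for r in web_source}
    return sorted(combined, key=lambda c: c.customer_id)


integrated_view = integrate()
```

     Because both wrappers normalise names and country codes the same way, the record that appears in both sources collapses into a single entry in `integrated_view`.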
  7. WHAT IS A DATA WAREHOUSE?
     “A data warehouse is simply a single, complete, and consistent store of data obtained from a variety of sources and made available to end users in a way they can understand and use it in a business context.”
  8. WHAT IS A DATA WAREHOUSE?
     “A DW is a subject-oriented, integrated, time-varying, non-volatile collection of data that is used primarily in organizational decision making.”
  9. A DATA WAREHOUSE IS
     • Stored collection of diverse data
       • A solution to data integration problem
       • Single repository of information
     • Subject-oriented
       • Organized by subject, not by application
       • Used for analysis, data mining, etc.
     • Optimized differently from transaction-oriented DB
  10. A DATA WAREHOUSE IS
     • Large volume of data (GB, TB)
     • Non-volatile
       • Historical
       • Time attributes are important
     • Updates infrequent
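     One way to picture "non-volatile", "historical", and "time attributes are important" is an append-only table in which every row carries a date. The sketch below uses Python's built-in `sqlite3` module; the table name `stock_history`, its columns, and the sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stock_history ("
    " product TEXT NOT NULL,"
    " quantity INTEGER NOT NULL,"
    " snapshot_date TEXT NOT NULL,"  # the time attribute is part of the key
    " PRIMARY KEY (product, snapshot_date))"
)


def load_snapshot(product: str, quantity: int, snapshot_date: str) -> None:
    """Append a new dated row; existing history is never updated or deleted."""
    conn.execute(
        "INSERT INTO stock_history VALUES (?, ?, ?)",
        (product, quantity, snapshot_date),
    )


load_snapshot("widget", 120, "2024-01-31")
load_snapshot("widget", 95, "2024-02-29")

# Both states remain queryable, unlike a table that keeps only the current value.
history = conn.execute(
    "SELECT snapshot_date, quantity FROM stock_history"
    " WHERE product = ? ORDER BY snapshot_date",
    ("widget",),
).fetchall()
```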
  11. OLTP VS. OLAP
     • OLTP: On Line Transaction Processing
       • Describes processing at operational sites
     • OLAP: On Line Analytical Processing
       • Describes processing at warehouse
  12. WAREHOUSE IS A SPECIALIZED DB
     Standard DB (OLTP)
     • Mostly updates
     • Many small transactions
     • MB to GB of data
     • Current snapshot
     • Raw data
     • Thousands of users (e.g., clerical users)
     Warehouse (OLAP)
     • Mostly reads
     • Queries are long and complex
     • GB to TB of data
     • History
     • Summarized data
     • Hundreds of users (e.g., decision-makers, analysts)
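     The contrast between the two workloads can be sketched with one small table: an OLTP-style operation touches a single row in a short transaction, while an OLAP-style query reads across history and summarizes it. The `sales` table and its contents are invented for this example, and a real warehouse would normally live in a separate database from the operational system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales (region, year, amount) VALUES (?, ?, ?)",
    [
        ("north", 2022, 100.0),
        ("north", 2023, 150.0),
        ("south", 2022, 80.0),
        ("south", 2023, 70.0),
    ],
)

# OLTP-style: a small transaction that updates one current record.
with conn:
    conn.execute("UPDATE sales SET amount = amount + 5.0 WHERE id = ?", (1,))

# OLAP-style: a read-only query that scans many rows and returns summarized data.
totals = conn.execute(
    "SELECT region, year, SUM(amount) FROM sales"
    " GROUP BY region, year ORDER BY region, year"
).fetchall()
```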
  13. GENERIC WAREHOUSE ARCHITECTURE
     [Diagram: several Extractor/Monitor components feed an Integrator, which loads the Warehouse; Clients use the Warehouse for Query & Analysis. Other labels: Metadata, Design Phase, Loading, Maintenance, Optimization, ...]
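     A minimal sketch of how a monitor and an integrator might cooperate, assuming the monitor detects changes by comparing two snapshots of a source keyed by record id. Snapshot comparison is only one possible change-detection strategy, and the function names and data shapes here are hypothetical rather than taken from the slides.

```python
def detect_changes(previous: dict, current: dict) -> dict:
    """Monitor: compare two snapshots of one source, keyed by record id."""
    return {
        "inserted": {k: current[k] for k in current.keys() - previous.keys()},
        "deleted": {k: previous[k] for k in previous.keys() - current.keys()},
        "updated": {
            k: current[k]
            for k in current.keys() & previous.keys()
            if current[k] != previous[k]
        },
    }


def integrate(warehouse: list, source_name: str, changes: dict) -> None:
    """Integrator: append new and changed records, tagged with their source.

    Deletions are ignored in this sketch so that history is preserved.
    """
    for kind in ("inserted", "updated"):
        for key, record in changes[kind].items():
            warehouse.append((source_name, key, record))


warehouse: list = []
old_snapshot = {1: "a", 2: "b"}
new_snapshot = {2: "B", 3: "c"}

changes = detect_changes(old_snapshot, new_snapshot)
integrate(warehouse, "sales", changes)
```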
  14. WAREHOUSING PROCESS
     • ETL Concept
  15. ETL CONCEPT
  16. ETL CONCEPT
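     The ETL slides survive here only as titles. ETL stands for extract, transform, load, and the three steps can be sketched as follows, assuming a CSV export as the source and an in-memory SQLite table as the target. The column names, the cleaning rules, and the `orders` table are illustrative assumptions, not the author's design.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from an operational system.
RAW_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,not-a-number
3,carol,7
"""


def extract(text: str) -> list:
    """Extract: read raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows: list) -> list:
    """Transform: normalise names and drop rows whose amount is not numeric."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # cleansing rule assumed for this sketch: skip bad rows
        clean.append((int(row["order_id"]), row["customer"].strip().lower(), amount))
    return clean


def load(conn: sqlite3.Connection, rows: list) -> None:
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders"
        " (order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
loaded = conn.execute(
    "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
).fetchall()
```

     Keeping the three steps as separate functions makes it possible to test or replace each one independently, for example swapping the CSV reader for a database query.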
  17. ISSUES IN DATA WAREHOUSING
     • Warehouse Design
     • Extraction
       • Wrappers, monitors (change detectors)
     • Integration
       • Cleansing & merging
     • Warehousing specification & Maintenance
     • Optimizations