Introduction to Data Warehousing
Transcript

  • 1. Introduction to Data Warehousing. December 20, 2012. Tameem Ahmad, M.Tech. (F), ZHCET, AMU, Aligarh
  • 2. References: • “Building the Data Warehouse” by Inmon (Third Edition), New York: John Wiley & Sons, 2002 • “Data Mining: Concepts and Techniques” by Han and Kamber, 2000 • http://www.data-warehouse-online.com/ [Accessed: November 4, 2012] • “Data Warehousing Battle of the Giants: Comparing the Basics of the Kimball and Inmon Models” by Mary Breslin, http://www.bibestpractices.com/view-articles/4768 | 12/27/2012 Tameem Ahmad
  • 3. Plan for the Presentation • Necessity of Data Warehousing (why is it needed?) • What is Data Warehousing? • Architecture • Schema • How to build a Data Warehouse (components) • Data Warehousing Tools
  • 4. Necessity is the mother of invention… Why Data Warehouse?
  • 5. Scenario • ABC Pvt. Ltd. is a company with branches at Mumbai, Delhi, Chennai and Bangalore. The Sales Manager wants a quarterly sales report. Each branch has a separate operational system.
  • 6. Scenario: ABC Pvt. Ltd. [Diagram: branches at Mumbai, Delhi, Chennai and Bangalore; the Sales Manager asks for sales per item type per branch for the first quarter.]
  • 7. Solution: ABC Pvt. Ltd. • Extract sales information from each database. • Store the information in a common repository at a single site.
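
The two steps on this slide can be sketched roughly as follows. This is only an illustration under stated assumptions: the per-branch extracts, the `sales` table layout, and the figures are all invented, and an in-memory SQLite database stands in for the common repository.

```python
import sqlite3

# Hypothetical extracts, one list per branch: (item_type, quarter, amount).
# All values are invented for illustration.
BRANCH_SALES = {
    "Mumbai": [("Laptop", "Q1", 1200.0), ("Phone", "Q1", 800.0)],
    "Delhi": [("Laptop", "Q1", 950.0)],
    "Chennai": [("Phone", "Q1", 400.0), ("Phone", "Q2", 300.0)],
    "Bangalore": [("Laptop", "Q1", 700.0)],
}

def build_repository(branch_sales):
    """Store the extracted rows from every branch in one common table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (branch TEXT, item_type TEXT, quarter TEXT, amount REAL)"
    )
    for branch, rows in branch_sales.items():
        conn.executemany(
            "INSERT INTO sales VALUES (?, ?, ?, ?)",
            [(branch, *row) for row in rows],
        )
    return conn

def quarterly_report(conn, quarter):
    """Sales per item type per branch for one quarter."""
    cursor = conn.execute(
        "SELECT branch, item_type, SUM(amount) FROM sales "
        "WHERE quarter = ? GROUP BY branch, item_type "
        "ORDER BY branch, item_type",
        (quarter,),
    )
    return cursor.fetchall()

repo = build_repository(BRANCH_SALES)
report = quarterly_report(repo, "Q1")
for line in report:
    print(line)
```

Once the rows sit in one place, the manager's report is a single grouped query instead of four separate ones against four operational systems.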
  • 8. Solution: ABC Pvt. Ltd. [Diagram: Mumbai, Delhi, Chennai and Bangalore feed the Data Warehouse; Query & Analysis tools produce a Report for the Sales Manager.]
  • 9. Data Warehousing… • Definition: A data warehouse is a subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management’s decision-making process.
  • 10. Subject-oriented • A data warehouse is organized around subjects such as sales, product, customer. • It focuses on modeling and analysis of data for decision makers. • It excludes data not useful in the decision support process.
  • 11. Integration • A data warehouse is constructed by integrating multiple heterogeneous sources. • Data preprocessing is applied to ensure consistency. [Diagram: RDBMS, Legacy System and Flat File sources pass through Data Processing and Data Transformation into the Data Warehouse.]
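
As a rough sketch of what integrating heterogeneous sources can involve, the example below converts records from three assumed source shapes into one common format. The record layouts, the `LEGACY_BRANCH_CODES` lookup, and the sample values are hypothetical and not taken from the slides.

```python
import csv
import io
from datetime import datetime

# Hypothetical lookup for branch codes used by the assumed legacy system.
LEGACY_BRANCH_CODES = {"MUM": "Mumbai", "DEL": "Delhi"}

def from_rdbms(row):
    """RDBMS rows are assumed to already use ISO dates and numeric amounts."""
    return {
        "branch": row["branch"],
        "date": row["sale_date"],
        "amount": float(row["amount"]),
    }

def from_legacy(record):
    """Legacy records are assumed to be (branch code, DD/MM/YYYY, amount in paise)."""
    code, day, paise = record
    iso_date = datetime.strptime(day, "%d/%m/%Y").date().isoformat()
    return {"branch": LEGACY_BRANCH_CODES[code], "date": iso_date, "amount": paise / 100}

def from_flat_file(text):
    """Flat files are assumed to be CSV with a header row and untidy branch names."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {
            "branch": r["branch"].strip().title(),
            "date": r["date"],
            "amount": float(r["amount"]),
        }
        for r in reader
    ]

# Every record ends up with the same keys, date format, and currency unit.
integrated = (
    [from_rdbms({"branch": "Mumbai", "sale_date": "2012-01-15", "amount": 1200})]
    + [from_legacy(("DEL", "20/01/2012", 95000))]
    + from_flat_file("branch,date,amount\n chennai ,2012-02-03,400.5\n")
)
print(integrated)
```

Each source gets its own small adapter, so the consistency rules (names, date format, units) live in one place per source.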
  • 12. Time-variant • Provides information from a historical perspective, e.g. the past 5-10 years.
  • 13. Nonvolatile • Data once recorded cannot be updated. • A data warehouse requires only two operations in data accessing: – Initial loading of data – Access of data
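
One way to picture the nonvolatile property is a store that exposes only the two operations named on the slide. The `WarehouseTable` class below is a hypothetical toy, not how any particular warehouse product works.

```python
class WarehouseTable:
    """Toy store with load and access only; there is deliberately no update or delete."""

    def __init__(self):
        self._rows = []

    def load(self, rows):
        """Append a batch of rows; rows already recorded are never modified."""
        self._rows.extend(tuple(row) for row in rows)

    def access(self, predicate=lambda row: True):
        """Return copies of the rows that match, leaving stored data untouched."""
        return [row for row in self._rows if predicate(row)]

table = WarehouseTable()
table.load([("Mumbai", "Q1", 2000.0), ("Delhi", "Q1", 950.0)])
table.load([("Mumbai", "Q2", 2100.0)])  # later loads add history, they do not overwrite
mumbai_rows = table.access(lambda row: row[0] == "Mumbai")
print(mumbai_rows)
```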
  • 14. Data Warehousing Architecture
  • 15. Data Warehousing Architecture (contd.) • Data Warehouse server: almost always a relational DBMS, rarely flat files • OLAP servers: to support and operate on multi-dimensional data structures • Clients: query and reporting tools, analysis tools, data mining tools
  • 16. Data Warehousing Schema • Star Schema • Snowflake Schema
  • 17. Measures & Dimensions • Measures – Units sold, Amount. • Dimensions – Product, Time, Region
  • 18. Star Schema • A single, large, central fact table and one table for each dimension. • Every fact points to one tuple in each of the dimensions and has additional attributes. • Does not capture hierarchies directly.
  • 19. Star Schema (contd.) • Fact Table: Store Key, Product Key, Period Key, Units, Price • Store Dimension: Store Key, Store Name, City, State, Region • Time Dimension: Period Key, Year, Quarter, Month • Product Dimension: Product Key, Product Desc • Benefits: easy to understand, easy to define hierarchies, reduces the number of physical joins.
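
The star schema on this slide could be written down as tables roughly like this. The column names follow the slide; the table names, column types, and sample rows are assumptions, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE store_dim (store_key INTEGER PRIMARY KEY, store_name TEXT,
                            city TEXT, state TEXT, region TEXT);
    CREATE TABLE time_dim (period_key INTEGER PRIMARY KEY, year INTEGER,
                           quarter TEXT, month TEXT);
    CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, product_desc TEXT);
    -- Each fact row points to exactly one row in every dimension.
    CREATE TABLE sales_fact (store_key INTEGER REFERENCES store_dim,
                             product_key INTEGER REFERENCES product_dim,
                             period_key INTEGER REFERENCES time_dim,
                             units INTEGER, price REAL);
""")

# Invented sample rows.
conn.executemany("INSERT INTO store_dim VALUES (?, ?, ?, ?, ?)", [
    (1, "Store A", "Mumbai", "Maharashtra", "West"),
    (2, "Store B", "Delhi", "Delhi", "North"),
])
conn.executemany("INSERT INTO time_dim VALUES (?, ?, ?, ?)", [
    (1, 2012, "Q1", "Jan"),
    (2, 2012, "Q2", "Apr"),
])
conn.executemany("INSERT INTO product_dim VALUES (?, ?)", [(1, "Laptop"), (2, "Phone")])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?)", [
    (1, 1, 1, 10, 500.0),
    (1, 2, 1, 5, 200.0),
    (2, 1, 1, 3, 500.0),
    (2, 2, 2, 7, 200.0),
])

# Units sold per region and quarter: one join per dimension used.
units_by_region_quarter = conn.execute("""
    SELECT s.region, t.quarter, SUM(f.units)
    FROM sales_fact f
    JOIN store_dim s ON s.store_key = f.store_key
    JOIN time_dim t ON t.period_key = f.period_key
    GROUP BY s.region, t.quarter
    ORDER BY s.region, t.quarter
""").fetchall()
print(units_by_region_quarter)
```

Region is reached with a single join because City, State and Region all sit in the one store dimension table.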
  • 20. Snowflake Schema • Variant of the star schema model. • A single, large, central fact table and one or more tables for each dimension. • Dimension tables are normalized, i.e. dimension table data is split into additional tables.
  • 21. Snowflake Schema (contd.) • Fact Table: Store Key, Product Key, Period Key, Units, Price • Store Dimension: Store Key, Store Name, City Key • City Dimension: City Key, City, State, Region • Time Dimension: Period Key, Year, Quarter, Month • Product Dimension: Product Key, Product Desc • Drawbacks: time-consuming joins, slow report generation.
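
To make the extra join concrete, here is a minimal sketch of the snowflaked store dimension from this slide. Table names, types, and sample rows are assumptions, and the time and product dimensions are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The store dimension is normalized: city details move to their own table.
    CREATE TABLE city_dim (city_key INTEGER PRIMARY KEY, city TEXT,
                           state TEXT, region TEXT);
    CREATE TABLE store_dim (store_key INTEGER PRIMARY KEY, store_name TEXT,
                            city_key INTEGER REFERENCES city_dim);
    CREATE TABLE sales_fact (store_key INTEGER REFERENCES store_dim,
                             product_key INTEGER, period_key INTEGER,
                             units INTEGER, price REAL);
""")

# Invented sample rows.
conn.executemany("INSERT INTO city_dim VALUES (?, ?, ?, ?)", [
    (1, "Mumbai", "Maharashtra", "West"),
    (2, "Delhi", "Delhi", "North"),
])
conn.executemany("INSERT INTO store_dim VALUES (?, ?, ?)", [
    (1, "Store A", 1),
    (2, "Store B", 2),
])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?)", [
    (1, 1, 1, 10, 500.0),
    (1, 2, 1, 5, 200.0),
    (2, 1, 1, 3, 500.0),
    (2, 2, 2, 7, 200.0),
])

# Reaching region now takes two joins (fact -> store -> city) instead of one.
units_by_region = conn.execute("""
    SELECT c.region, SUM(f.units)
    FROM sales_fact f
    JOIN store_dim s ON s.store_key = f.store_key
    JOIN city_dim c ON c.city_key = s.city_key
    GROUP BY c.region
    ORDER BY c.region
""").fetchall()
print(units_by_region)
```

The normalized layout stores each city's state and region once, at the cost of the longer join path the slide lists as a drawback.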
  • 22. Building the Data Warehouse • Data Selection • Data Pre-processing – Fill missing values – Remove inconsistency • Data Transformation & Integration • Data Loading • Data in the warehouse is stored in the form of fact tables and dimension tables.
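
The build steps listed above might look something like this in miniature. Everything here is an assumption made for illustration: the raw record layout, the choice to fill missing units with 0, and the use of plain dictionaries and lists in place of real dimension and fact tables. The transformation step is reduced to standardizing product names.

```python
# Hypothetical raw operational records; "internal_note" is not useful for decision support.
RAW = [
    {"store": "Mumbai Central", "product": "laptop", "units": 4, "price": 500.0, "internal_note": "x"},
    {"store": "Mumbai Central", "product": "Laptop ", "units": None, "price": 500.0, "internal_note": ""},
    {"store": "Delhi North", "product": "PHONE", "units": 2, "price": 200.0, "internal_note": ""},
]

def select(rows, fields):
    """Data selection: keep only the fields relevant to analysis."""
    return [{f: r[f] for f in fields} for r in rows]

def preprocess(rows, default_units=0):
    """Pre-processing: fill missing values and remove naming inconsistencies."""
    cleaned = []
    for r in rows:
        r = dict(r)
        if r["units"] is None:
            r["units"] = default_units  # assumed fill policy, purely illustrative
        r["product"] = r["product"].strip().title()
        cleaned.append(r)
    return cleaned

def load(rows):
    """Loading: assign surrogate keys and split rows into dimension and fact tables."""
    store_dim, product_dim, fact = {}, {}, []
    for r in rows:
        store_key = store_dim.setdefault(r["store"], len(store_dim) + 1)
        product_key = product_dim.setdefault(r["product"], len(product_dim) + 1)
        fact.append((store_key, product_key, r["units"], r["price"]))
    return store_dim, product_dim, fact

selected = select(RAW, ["store", "product", "units", "price"])
store_dim, product_dim, fact = load(preprocess(selected))
print(store_dim, product_dim, fact)
```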
  • 23. Data Warehousing Tools • Data Warehouse – SQL Server 2000 DTS – Oracle 8i Warehouse Builder • ETL tools – Ab Initio – Informatica • Reporting tools – MS Excel Pivot Chart – VB Applications – Cognos – MicroStrategy – Hyperion • OLAP tools – SQL Server Analysis Services – Oracle Express Server
  • 24. Thank You
