Final presentation


Data Warehousing

Published in: Technology, Business

  1. 1. Introduction to Data Warehousing
  2. 2. Data Warehousing
       • “A data warehouse is a subject-oriented, integrated, time-variant, and nonupdatable collection of data in support of management’s decision-making process.”
       • Subject-oriented: organized around high-level entities such as customers, patients, students, products, and time.
       • Integrated: data is gathered from several internal systems of record or from sources external to the organization.
  3. 3.
       • Time-variant: the time dimension is used in data warehousing to study trends and changes.
       • Nonupdatable: new data is always added as a supplement to the database rather than as a replacement. The database continually absorbs this new data, incrementally integrating it with previous data.
       • A data warehouse can consist of more than one database.
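
The nonupdatable and time-variant properties can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `AppendOnlyStore`, its method names, and the sample keys and dates are hypothetical and do not come from the slides.

```python
from datetime import date


class AppendOnlyStore:
    """Hypothetical in-memory stand-in for a nonupdatable, time-variant table."""

    def __init__(self):
        self._rows = []  # rows are only ever appended, never edited or deleted

    def load(self, key, value, as_of):
        # New data supplements earlier data instead of replacing it.
        self._rows.append({"key": key, "value": value, "as_of": as_of})

    def history(self, key):
        # Every past value is kept, ordered along the time dimension.
        return sorted(
            (r for r in self._rows if r["key"] == key),
            key=lambda r: r["as_of"],
        )

    def value_as_of(self, key, when):
        # Latest value recorded on or before `when`, or None if none exists.
        past = [r for r in self.history(key) if r["as_of"] <= when]
        return past[-1]["value"] if past else None


store = AppendOnlyStore()
store.load("customer-1:city", "Boston", date(2020, 1, 1))
store.load("customer-1:city", "Denver", date(2021, 6, 1))

print(store.value_as_of("customer-1:city", date(2020, 12, 31)))  # Boston
print(len(store.history("customer-1:city")))                     # 2
```

Because the earlier row is never overwritten, a question about the past ("where was this customer at the end of 2020?") can still be answered after the newer row arrives.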
  4. 4. In Simple Words
       • “A data warehouse is simply a single, complete, and consistent store of data obtained from a variety of sources and made available to end users in a way they can understand and use it in a business context.”
  5. 5. Problem: Heterogeneous Information Sources
       “Heterogeneities are everywhere”
       • Different interfaces
       • Different data representations
       • Duplicate and inconsistent information
       • Combined research results from different bioinformatics repositories
       [Diagram labels: Personal Databases, Digital Libraries, Scientific Databases, World Wide Web]
  6. 6. Goal: Unified Access to Data
       • Collects and combines information
       • Provides an integrated view and a uniform user interface
       • Supports sharing
       [Diagram labels: Digital Libraries, Scientific Databases, Personal Databases, World Wide Web, all feeding an Integration System]
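
One way to picture an integration system is a set of small adapters that map each source's representation onto a shared schema, followed by de-duplication. The sketch below assumes two made-up sources (`crm`, `billing`) with invented field names; none of these names appear in the slides.

```python
def from_crm(record):
    # Hypothetical source A: full name in one field, e-mail under "email".
    return {"name": record["full_name"].strip(), "email": record["email"].lower()}


def from_billing(record):
    # Hypothetical source B: split name fields and a differently named e-mail key.
    name = f'{record["first"]} {record["last"]}'.strip()
    return {"name": name, "email": record["EMAIL_ADDR"].lower()}


def integrate(sources):
    """Map every record to the shared schema and drop duplicates by e-mail."""
    unified = {}
    for adapter, records in sources:
        for record in records:
            row = adapter(record)
            # First occurrence wins here; a real system needs an explicit conflict rule.
            unified.setdefault(row["email"], row)
    return list(unified.values())


crm = [{"full_name": "Ada Lovelace ", "email": "Ada@Example.com"}]
billing = [
    {"first": "Ada", "last": "Lovelace", "EMAIL_ADDR": "ada@example.com"},
    {"first": "Alan", "last": "Turing", "EMAIL_ADDR": "alan@example.com"},
]

customers = integrate([(from_crm, crm), (from_billing, billing)])
print(len(customers))  # 2
```

The duplicate "Ada Lovelace" record collapses into one row because both adapters normalize the e-mail address before it is used as the matching key.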
  7. 7. The Need for Data Warehousing
       • A business requires an integrated, companywide view of high-quality information.
       • The information systems department must separate informational systems from operational systems (systems of record) to dramatically improve performance in managing company data.
  8. 8. Why a Warehouse
       • For analysis and decision support, end users require access to data captured and stored in an organization’s operational or production systems.
       • This data is stored in multiple formats, on multiple platforms, in multiple data structures, with multiple names, and was probably created using different business rules.
  9. 9. Why should we consider Data Warehousing solutions?
       • When users request access to a large amount of historical information for reporting purposes, you should strongly consider a warehouse. Users benefit when the information is organized efficiently for this type of access.
  10. 10. An Example Illustrating the Need for Data Warehousing
  11. 11. Data Warehouse Components
  12. 13. Administration and Management Tools
       • A data warehouse requires tools to support the administration and management of such a complex environment.
       • Given the various types of metadata and the day-to-day operations of the data warehouse, the administration and management tools must be capable of supporting the following tasks:
       • monitoring data loading from multiple sources
       • data quality and integrity checks
       • managing and updating metadata
       • monitoring database performance to ensure efficient query response times and resource utilization
  13. 14.
       • auditing data warehouse usage to provide user chargeback information
       • replicating, subsetting, and distributing data
       • maintaining efficient data storage management
       • archiving and backing up data
       • implementing recovery following failure
       • security management
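
As an illustration of the "data quality and integrity checks" task in the list above, a load step might validate each batch before it is accepted. The function `check_quality`, its parameters, and the sample batch are assumptions made for this sketch, not part of any particular tool.

```python
def check_quality(rows, required, key):
    """Return a list of (row_index, problem) pairs; an empty list means clean."""
    problems = []
    seen = set()
    for i, row in enumerate(rows):
        # Completeness check: every required field must be present and non-empty.
        for field in required:
            if row.get(field) in (None, ""):
                problems.append((i, f"missing {field}"))
        # Integrity check: the key must be unique within the batch.
        k = row.get(key)
        if k is not None:
            if k in seen:
                problems.append((i, f"duplicate {key}={k}"))
            seen.add(k)
    return problems


batch = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": None},
    {"id": 1, "amount": 5.0},
]

issues = check_quality(batch, required=["id", "amount"], key="id")
print(issues)  # [(1, 'missing amount'), (2, 'duplicate id=1')]
```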
  14. 15.
       • In computing, data flow is the path of data from source document to data entry to processing to final reports. Data changes format and sequence (within a file) as it moves from program to program.
  15. 16. Data Flow
       • Inflow: the processes associated with the extraction, cleansing, and loading of data from the source systems into the data warehouse.
       • Upflow: the processes associated with adding value to the data in the warehouse through summarizing and distributing the data.
       • Downflow: the processes associated with archiving and backing up data in the warehouse.
       • Outflow: the processes associated with making the data available to end users.
       • Meta-flow: the processes associated with the management of the metadata.
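
The inflow and upflow steps can be sketched as a small pipeline. This is a minimal illustration, assuming a plain Python list as the "warehouse" and invented sample rows; the function names are hypothetical.

```python
from collections import defaultdict


def extract(source_rows):
    # Inflow, step 1: pull raw rows from a (hypothetical) source system.
    return list(source_rows)


def cleanse(rows):
    # Inflow, step 2: normalize text and drop rows with unparseable amounts.
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except (TypeError, ValueError):
            continue
        clean.append({"region": row["region"].strip().upper(), "amount": amount})
    return clean


def load(warehouse, rows):
    # Inflow, step 3: append the cleansed rows to the warehouse.
    warehouse.extend(rows)


def summarize(warehouse):
    # Upflow: add value by aggregating detail rows into totals per region.
    totals = defaultdict(float)
    for row in warehouse:
        totals[row["region"]] += row["amount"]
    return dict(totals)


warehouse = []
raw = [
    {"region": " east", "amount": "10.5"},
    {"region": "East ", "amount": "4.5"},
    {"region": "west", "amount": "n/a"},
]

load(warehouse, cleanse(extract(raw)))
totals = summarize(warehouse)
print(totals)  # {'EAST': 15.0}
```

The "west" row is rejected during cleansing because its amount cannot be parsed, so it never reaches the summary.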
  16. 17. Architectures
       • Many database architectures have been implemented.
       • Two architectures are worth noting:
       • OLTP (online transaction processing)
       • Data warehouse (OLAP, online analytical processing)
       • OLTP is used to store data that is queried frequently and is based on normalized schemas.
       • A data warehouse is used to store historical data and is based on fact tables and dimension tables.
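
To make "fact tables and dimension tables" concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names (`dim_product`, `dim_date`, `fact_sales`) and the sample rows are assumptions chosen for illustration; the slides do not define a schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables describe the "who/what/when" of each measurement.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, year INTEGER);
    -- The fact table holds the measurements, keyed by the dimensions.
    CREATE TABLE fact_sales  (
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id    INTEGER REFERENCES dim_date(date_id),
        amount     REAL
    );
    INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO dim_date    VALUES (10, 2022), (11, 2023);
    INSERT INTO fact_sales  VALUES (1, 10, 100.0), (1, 11, 150.0), (2, 11, 70.0);
""")

# A typical analytical query: total sales per year and product.
rows = conn.execute("""
    SELECT d.year, p.name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_date    AS d ON d.date_id = f.date_id
    GROUP BY d.year, p.name
    ORDER BY d.year, p.name
""").fetchall()

print(rows)  # [(2022, 'Widget', 100.0), (2023, 'Gadget', 70.0), (2023, 'Widget', 150.0)]
```

An OLTP design would instead favor normalized tables tuned for many small reads and writes; the layout above trades that for simple joins and aggregations over history.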
  17. 18. Difference between OLTP and Data Warehouse
  18. 19.
       • Special thanks to
       • and other sites.
       • Thank you