Presented by Gopalakrishnan K
KG Data Solutions
gopalk@kgds.org
• What Is a Data Warehouse?
• History
• Current scenario
• Characteristics
• Operational Database vs. Data Warehouse
• Architecture
• Data Model
• The term "data warehouse" refers to a special
type of database that acts as the central
repository for company data. It can be thought of
as a database archive that is segregated from the
operational databases and used primarily for
reporting and data mining purposes.
• The relational database revolution of the early
1980s ushered in an era of improved access to
the valuable information contained deep within
data. Still, improvements were needed.
• It was soon discovered that databases modeled
to be efficient at transactional processing were
not always optimized for complex reporting or
analytical needs.
• Inmon champions the large, centralized Data Warehouse approach,
leveraging solid relational design principles. His Corporate
Information Factory remains an example of this "top down"
philosophy.
• Kimball, on the other hand, favors the development of individual
data marts at the departmental level that are integrated
using the Information Bus architecture. This "bottom up" approach
dovetails nicely with Kimball's preference for star-schema modeling.
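To make the star-schema idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (fact_sales, dim_date, dim_product), the columns, and the sample rows are all hypothetical and chosen only for illustration; they are not taken from Kimball's or Inmon's material.

```python
import sqlite3

# Hypothetical star schema: one central fact table that references
# small descriptive dimension tables by key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE fact_sales  (
        date_key    INTEGER REFERENCES dim_date(date_key),
        product_key INTEGER REFERENCES dim_product(product_key),
        quantity    INTEGER,
        amount      REAL
    );
""")
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                 [(20240101, 2024, 1), (20240201, 2024, 2)])
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                 [(1, "Widget", "Hardware"), (2, "Manual", "Books")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                 [(20240101, 1, 3, 30.0),
                  (20240101, 2, 1, 12.5),
                  (20240201, 1, 2, 20.0)])


def sales_by_category(connection):
    """Total sales amount per product category, via a fact-to-dimension join."""
    rows = connection.execute("""
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()
    return dict(rows)


totals = sales_by_category(conn)
```

The reporting query touches the fact table once and joins outward to a dimension, which is the access pattern a star schema is designed around.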
Many of the current changes in today's data industry also affect Data
Warehousing. Cloud storage and high-velocity, real-time data analysis
are two obvious factors playing a role in the practice's evolution. On
the end-user side, web-based and mobile access to decision support or
reporting data is a major requirement on many projects. Advances in
the practice of ontology have enhanced the capabilities of ETL systems
to parse information out of unstructured as well as structured data
sources.
• Subject-oriented
The data in the database is organized so that all the data elements
relating to the same real-world event or object are linked together.
• Time-variant
The changes to the data in the database are tracked and recorded
so that reports can be produced showing changes over time.
• Non-volatile
Data in the database is never overwritten or deleted. Once
committed, the data is static and read-only, but retained for future
reporting.
• Integrated
The database contains data from most or all of an organization's
operational applications, and this data is made consistent.
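The time-variant and non-volatile characteristics above can be sketched as an append-only history. This is a minimal illustration under stated assumptions: the CustomerVersion and CustomerHistory names, the fields, and the sample data are hypothetical, and a real warehouse would implement this in its database rather than in application code.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)  # frozen: a committed version cannot be modified
class CustomerVersion:
    customer_id: int
    city: str
    valid_from: date


class CustomerHistory:
    """Append-only store: new versions are added, old ones are never changed."""

    def __init__(self):
        self._versions = []

    def record(self, customer_id, city, valid_from):
        # Non-volatile: a change is stored as a new row, not an overwrite.
        self._versions.append(CustomerVersion(customer_id, city, valid_from))

    def as_of(self, customer_id, when):
        """Return the version in effect on `when`, or None if none existed yet."""
        candidates = [v for v in self._versions
                      if v.customer_id == customer_id and v.valid_from <= when]
        return max(candidates, key=lambda v: v.valid_from, default=None)


history = CustomerHistory()
history.record(42, "Chennai", date(2020, 1, 1))
history.record(42, "Bengaluru", date(2023, 6, 1))
```

Because earlier versions are retained, a report can ask what was true on any past date, for example `history.as_of(42, date(2021, 1, 1))`.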
• The processing load of reporting reduced the
response time of the operational systems.
• The database designs of operational systems
were not optimized for information analysis and
reporting.
• Most organizations had more than one
operational system, so company-wide reporting
could not be supported from a single system.
• Development of reports in operational systems
often required writing specific computer
programs, which was slow and expensive.
• Consolidation of data from a wide variety of data
sources.
• Ability to analyze data beyond the level of
standard monitoring reports.
• Operational response time unaffected.
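As a rough sketch of what consolidating data from several sources can involve, the example below maps records from two operational systems into one consistent format. Both source systems (a billing system and a CRM), their field names, and their encodings are hypothetical assumptions made up for this illustration.

```python
def from_billing(record):
    # Hypothetical billing system: amounts in cents, gender coded "M"/"F".
    return {
        "customer_id": record["cust_no"],
        "gender": {"M": "male", "F": "female"}[record["sex"]],
        "amount": record["amount_cents"] / 100,
    }


def from_crm(record):
    # Hypothetical CRM: amounts in currency units, gender coded 0/1.
    return {
        "customer_id": record["id"],
        "gender": {0: "male", 1: "female"}[record["gender_code"]],
        "amount": float(record["amount"]),
    }


billing_rows = [{"cust_no": 7, "sex": "F", "amount_cents": 1250}]
crm_rows = [{"id": 9, "gender_code": 0, "amount": 40}]

# The consolidated list uses one set of field names, codes, and units,
# regardless of which operational system a record came from.
consolidated = ([from_billing(r) for r in billing_rows]
                + [from_crm(r) for r in crm_rows])
```

Each source gets its own small mapping function, so adding another operational system means adding one more function rather than changing the reporting side.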
Data warehouse presentation