2. In today’s competitive global business environment,
understanding and managing enterprise-wide
information is crucial for making timely decisions and
responding to changing business conditions.
A tremendous amount of data is generated by
day-to-day operational business applications. Studies
indicate that the amount of data in a given
organization doubles every five years, so it is difficult
to analyse this data to make better decisions.
The data warehouse has emerged to solve this
problem.
INTRODUCTION TO DATA WAREHOUSE
3. Data warehousing is a popular and powerful concept of
applying information technology to turn these huge islands
of data into meaningful information for better business
decisions.
4. A data warehouse is a subject-oriented, integrated,
time-variant and non-volatile collection of data
in support of management’s decision-making process.
(Bill Inmon, associated with the top-down approach)
A data warehouse is a copy of transaction data
specifically structured for query and analysis.
(Ralph Kimball, associated with the bottom-up approach)
5.
6. An OLTP system is an application that modifies data
and has a large number of concurrent users.
This environment is the source for the data warehouse.
It consists of current day-to-day data and is
normalised.
LIST OF OPERATIONAL DATABASES:
Relational databases,
e.g. Oracle, SQL Server, Sybase, Teradata
Files
CRM
ERP
External sources and legacy systems
7. Extract: Get the data out of the source systems.
Transform: Convert the data into a useful format for
analysis.
Load: Get the data into the data warehouse.
POPULAR ETL TOOLS ARE:
Informatica
DataStage
Ab Initio
Oracle Warehouse Builder
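The extract, transform and load steps can be sketched in a few lines of Python. This is only an illustration, not how any of the tools listed work internally: the CSV text, the fact_orders table and all function names are assumptions made up for the example, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; in practice this would come from an
# operational database or a file export.
SOURCE_CSV = """order_id,customer,amount,order_date
1,alice,120.50,2024-01-05
2,BOB,80.00,2024-01-06
3,alice,,2024-01-07
"""


def extract(text):
    """Extract: get the raw rows out of the source system."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop incomplete rows and normalise names and types."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip rows with a missing amount
        cleaned.append((
            int(row["order_id"]),
            row["customer"].strip().lower(),
            float(row["amount"]),
            row["order_date"],
        ))
    return cleaned


def load(conn, rows):
    """Load: write the cleaned rows into the (hypothetical) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, "
        "amount REAL, order_date TEXT)"
    )
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def run_etl(conn, text=SOURCE_CSV):
    """Run the three steps in order and return the number of rows loaded."""
    rows = transform(extract(text))
    load(conn, rows)
    return len(rows)
```

Calling run_etl(sqlite3.connect(":memory:")) loads the two complete orders and skips the one with a missing amount.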
8. OLAP (online analytical processing) is computer processing
that enables a user to easily and selectively extract and
view data from different points of view.
This environment lets business users view the
data in a multidimensional format, which helps them
make better decisions more easily.
POPULAR OLAP TOOLS ARE:
Cognos
BusinessObjects
MicroStrategy
SAS
Crystal Reports
Hyperion
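Viewing the same data "from different points of view" can be illustrated with a tiny pivot over a fact table. This is a minimal sketch in plain Python, not the engine of any OLAP tool: the SALES rows, the dimension names and the pivot and slice_by helpers are all hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, quarter, sales)
SALES = [
    ("North", "Laptop", "Q1", 100),
    ("North", "Phone", "Q1", 50),
    ("South", "Laptop", "Q1", 70),
    ("North", "Laptop", "Q2", 120),
    ("South", "Phone", "Q2", 30),
]

# Position of each dimension inside a fact row.
DIMENSIONS = {"region": 0, "product": 1, "quarter": 2}


def pivot(rows, row_dim, col_dim):
    """Sum sales by two chosen dimensions, rolling up the remaining one."""
    r, c = DIMENSIONS[row_dim], DIMENSIONS[col_dim]
    table = defaultdict(lambda: defaultdict(int))
    for rec in rows:
        table[rec[r]][rec[c]] += rec[3]
    return {key: dict(cols) for key, cols in table.items()}


def slice_by(rows, dim, value):
    """Keep only the rows where one dimension has a fixed value."""
    i = DIMENSIONS[dim]
    return [rec for rec in rows if rec[i] == value]
```

pivot(SALES, "region", "quarter") and pivot(SALES, "product", "region") give two different views of the same facts, and slice_by(SALES, "quarter", "Q1") restricts the data to one quarter before pivoting.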
9. Generic Two-Level Architecture
Independent Data Mart
Dependent Data Mart and Operational Data Store
11. Data marts: mini-warehouses, limited in scope
Separate ETL for each independent data mart
Data access complexity due to multiple data marts
12. ODS provides an option for obtaining current data
Single ETL for the enterprise data warehouse (EDW)
Dependent data marts loaded from the EDW
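The idea of dependent data marts can be sketched as follows: the EDW is loaded once, and each mart is derived from it rather than from the source systems. The edw_sales table, its columns and both function names are assumptions for illustration only, with in-memory SQLite standing in for the warehouse.

```python
import sqlite3


def build_edw(conn):
    """Single load into the enterprise data warehouse (hypothetical schema)."""
    conn.execute("CREATE TABLE edw_sales (dept TEXT, region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO edw_sales VALUES (?, ?, ?)",
        [
            ("marketing", "north", 100.0),
            ("marketing", "south", 40.0),
            ("finance", "north", 250.0),
        ],
    )
    conn.commit()


def build_dependent_mart(conn, dept):
    """Derive a department-scoped mart from the EDW, not from the sources."""
    if not dept.isidentifier():
        # The name is interpolated into SQL below, so restrict it.
        raise ValueError("dept must be a simple identifier")
    table = f"mart_{dept}"
    conn.execute(f"CREATE TABLE {table} (region TEXT, total REAL)")
    conn.execute(
        f"INSERT INTO {table} "
        "SELECT region, SUM(amount) FROM edw_sales "
        "WHERE dept = ? GROUP BY region",
        (dept,),
    )
    conn.commit()
    return table
```

Because every mart reads from the same EDW table, there is one ETL process to maintain instead of one per mart.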
13.
14. Data mining, the extraction of hidden predictive
information from large databases, is a powerful new
technology with great potential to help companies focus
on the most important information in their data
warehouses.
Data mining predicts future trends and behaviors,
allowing businesses to make proactive, knowledge-driven
decisions.
Data mining is the process of analyzing business data in
the data warehouse to find unknown patterns or rules
that you can use to tailor business
operations.
15. Data mining software is one of a number of
analytical tools for analyzing data. It allows
users to analyze data from many different
dimensions or angles, categorize it, and
summarize the relationships identified.
16. Clustering: the task of discovering groups and
structures in the data that are in some way
"similar", without using known structures in the data.
Classification: the task of generalizing known
structure to apply to new data.
Regression: attempts to find a function that
models the data with the least error.
Association rule learning: searches for
relationships between variables.
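As one concrete instance of the regression task, the sketch below fits a straight line to points by ordinary least squares, i.e. it picks the slope and intercept that minimise the sum of squared errors. The choice of a straight line and the fit_line and predict names are assumptions for illustration; real data mining tools support many other model families.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new x value."""
    slope, intercept = model
    return slope * x + intercept
```

For example, fitting the points (1, 3), (2, 5), (3, 7), (4, 9) recovers the line y = 2x + 1, which can then be used to predict y for an unseen x.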
17. Artificial neural networks: Non-linear predictive models
that learn through training and resemble biological neural
networks in structure.
Genetic algorithms: Optimization techniques that use
processes such as genetic combination, mutation, and natural
selection in a design based on the concepts of natural
evolution.
Decision trees: Tree-shaped structures that represent sets of
decisions.
Nearest neighbor method: A technique that classifies each
record in a dataset based on a combination of the classes of the
k record(s) most similar to it in a historical dataset (where k ≥ 1).
Rule induction: The extraction of useful if-then rules from
data based on statistical significance.
Data visualization: The visual interpretation of complex
relationships in multidimensional data. Graphics tools are
used to illustrate data relationships.
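The nearest neighbor method is simple enough to sketch directly. The version below assumes numeric features, Euclidean distance as the similarity measure and a majority vote as the "combination of the classes"; these choices and the knn_classify name are illustrative assumptions, not the only way to do it.

```python
import math
from collections import Counter


def knn_classify(history, query, k=3):
    """Classify `query` by majority vote among its k nearest records.

    `history` is a list of (features, label) pairs, where features is a
    tuple of numbers with the same length as `query`.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Sort the historical records by distance to the query, keep the k closest.
    nearest = sorted(history, key=lambda rec: math.dist(rec[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    # On a tied vote, the label seen first among the nearest records wins.
    return votes.most_common(1)[0][0]
```

With a history of labelled records such as ((1, 1), "low") and ((8, 8), "high"), a new record is assigned the label that dominates among its closest neighbours.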
18. The data warehouse is the hub for decision support data.
A good data warehouse will provide the RIGHT data to
the RIGHT people at the RIGHT time: RIGHT NOW!
Customers can therefore use data warehouses to improve their
decision making and their competitive advantage.
The data warehouse also plays a major role in data
mining, which predicts future trends and behaviours and
supports knowledge-driven decisions.