2. TOPICS!!
INTRODUCTION
BENEFITS
GENERIC
CHARACTERISTICS
EXAMPLES
3. INTRODUCTION
What is a Data Warehouse?
In computing, a data warehouse (DW or DWH), also
known as an enterprise data warehouse (EDW), is a
system used for reporting and data analysis and is
considered a core component of business intelligence. DWs
are central repositories of integrated data from one or more
disparate sources. They store current and historical data in
a single place, and this data is used to create analytical reports
for workers throughout the enterprise.
4. MORE ABOUT WAREHOUSE
The data stored in the warehouse is uploaded from the operational
systems (such as marketing or sales). The data may pass through
an operational data store (ODS) and may require data cleansing and
other operations to ensure data quality before it is used in the DW
for reporting.
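The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample rows, the field names (`customer`, `amount`), and the cleansing rules are hypothetical and do not come from any specific product.

```python
# Hypothetical rows exported from an operational sales system.
operational_rows = [
    {"customer": " Alice ", "amount": "120.50"},
    {"customer": "BOB", "amount": "n/a"},   # amount cannot be parsed
    {"customer": "", "amount": "15"},       # customer name is missing
    {"customer": "carol", "amount": "80"},
]


def cleanse(row):
    """Return a cleaned copy of the row, or None if it cannot be repaired."""
    name = row["customer"].strip().title()
    if not name:
        return None
    try:
        amount = float(row["amount"])
    except ValueError:
        return None
    return {"customer": name, "amount": amount}


def load_into_warehouse(rows):
    """Cleanse rows and split them into loaded and rejected lists."""
    loaded, rejected = [], []
    for row in rows:
        clean = cleanse(row)
        if clean is None:
            rejected.append(row)
        else:
            loaded.append(clean)
    return loaded, rejected


warehouse_rows, rejected_rows = load_into_warehouse(operational_rows)
print(warehouse_rows)
print(len(rejected_rows), "rows rejected")
```

Keeping the rejected rows, rather than silently dropping them, lets someone review the bad data later.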
5. BENEFITS
•Integrate data from multiple sources into a single database and data model. Consolidating data in a single database
means a single query engine can be used to present data in an ODS.
•Mitigate lock contention caused by isolation levels in transaction processing systems, which arises when large,
long-running analysis queries are run against transaction processing databases.
•Maintain data history, even if the source transaction systems do not.
•Integrate data from multiple source systems, enabling a central view across the enterprise. This benefit is always valuable,
but particularly so when the organization has grown by merger.
•Improve data quality by providing consistent codes and descriptions and by flagging or even fixing bad data.
•Restructure the data so that it makes sense to business users.
6. GENERIC
•Source systems that provide data to the warehouse or mart
•Data integration technology and processes that are needed to prepare the data for use
•Different architectures for storing data in an organization's data warehouse or data marts
•Different tools and applications for a variety of users
•Metadata, data quality, and governance processes must be in place to ensure that the
warehouse or mart meets its purposes
7. LET'S TALK ABOUT ITS CHARACTERISTICS!!
BASIC FEATURES
8. CHARACTERISTICS
Subject-oriented
Unlike operational systems, the data in a data warehouse revolves around the subjects of the enterprise.
Subject orientation is not the same as database normalization. Organizing data by subject, gathering what each
subject requires, is especially useful for decision-making.
Integrated
The data found within the data warehouse is integrated. Since it comes from several operational systems, all
inconsistencies must be removed. Areas that must be made consistent include naming conventions, measurement of
variables, encoding structures, physical attributes of data, and so forth.
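As a rough illustration of integration, the sketch below maps rows from two hypothetical source systems onto one naming convention, one gender encoding, and one unit of measurement. The system names, field names, and sample values are assumptions made up for this example.

```python
# Two hypothetical source systems that encode the same facts differently.
crm_rows = [{"cust_name": "Alice", "gender": "F", "height_cm": 170.0}]
billing_rows = [{"customer": "Bob", "sex": "male", "height_in": 70.0}]

# One agreed encoding and one agreed unit for the warehouse.
GENDER_CODES = {"f": "F", "female": "F", "m": "M", "male": "M"}
CM_PER_INCH = 2.54


def from_crm(row):
    """Map a CRM row onto the warehouse naming convention."""
    return {
        "customer_name": row["cust_name"],
        "gender": GENDER_CODES[row["gender"].lower()],
        "height_cm": row["height_cm"],
    }


def from_billing(row):
    """Map a billing row onto the same convention, converting inches to cm."""
    return {
        "customer_name": row["customer"],
        "gender": GENDER_CODES[row["sex"].lower()],
        "height_cm": round(row["height_in"] * CM_PER_INCH, 1),
    }


integrated = [from_crm(r) for r in crm_rows] + [from_billing(r) for r in billing_rows]
for row in integrated:
    print(row)
```

After the mapping, every row has the same field names, the same gender codes, and heights in centimetres, whichever system it came from.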
9. CHARACTERISTICS
Time-variant
While operational systems reflect current values as they support day-to-day operations, data warehouse data
represents a long time horizon (up to 10 years), which means it stores mostly historical data. It is mainly meant for
data mining and forecasting. For example, if a user is searching for the buying pattern of a specific customer, the
user needs to look at data on both current and past purchases.
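The buying-pattern example can be sketched with Python's built-in `sqlite3` module. The `purchases` table, its columns, and the sample rows are hypothetical; a real warehouse would hold far more history.

```python
import sqlite3

# In-memory database standing in for a warehouse table of purchase history.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (customer TEXT, purchase_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [
        ("alice", "2022-03-10", 40.0),
        ("alice", "2023-03-12", 55.0),
        ("alice", "2024-03-09", 70.0),
        ("bob", "2024-01-05", 20.0),
    ],
)

# Total spend per year for one customer, covering past and current purchases.
yearly_spend = conn.execute(
    """
    SELECT substr(purchase_date, 1, 4) AS year, SUM(amount)
    FROM purchases
    WHERE customer = ?
    GROUP BY year
    ORDER BY year
    """,
    ("alice",),
).fetchall()

print(yearly_spend)
conn.close()
```

An operational system that keeps only the latest order could not answer this query, because the earlier years would no longer be stored.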
Nonvolatile
The data in the data warehouse is read-only, which means it cannot be updated, created, or deleted
(unless there is a regulatory or statutory obligation to do so).
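One way to picture the read-only property is a store whose rows are fixed at load time and handed out as read-only views. The class below is a hypothetical sketch, not how any particular warehouse product enforces this.

```python
from types import MappingProxyType


class ReadOnlyWarehouse:
    """Hypothetical store: rows are fixed at load time and exposed read-only."""

    def __init__(self, rows):
        # Copy each row so later changes to the caller's dicts do not leak in.
        self._rows = tuple(MappingProxyType(dict(row)) for row in rows)

    def query(self, **filters):
        """Return rows whose fields equal all the given filter values."""
        return [
            row for row in self._rows
            if all(row.get(key) == value for key, value in filters.items())
        ]


warehouse = ReadOnlyWarehouse([
    {"customer": "alice", "amount": 40.0},
    {"customer": "bob", "amount": 20.0},
])

row = warehouse.query(customer="alice")[0]
try:
    row["amount"] = 0.0  # attempted update of warehouse data
except TypeError as error:
    print("update rejected:", error)
```

Users can query the rows freely, but any attempt to assign to a field raises a `TypeError`.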
10. DATA WAREHOUSE ARCHITECTURE
An organization can construct and organize
a data warehouse in many different ways. The
hardware used, the software created, and the
data resources required for the warehouse to
function correctly are the main components of
the data warehouse architecture. All data
warehouses go through multiple phases in which
the organization's requirements are modified
and fine-tuned.
11. EXAMPLES
MarkLogic is a data warehousing solution that makes data integration easier and faster using an
array of enterprise features. It can perform very complex search operations and can query
different types of data, such as documents, relationships, and metadata.
Oracle is a widely used database. It offers a wide range of data warehouse solutions,
both on-premises and in the cloud. It aims to optimize customer experiences by increasing
operational efficiency.
Amazon Redshift is a data warehouse tool. It is a simple and cost-effective tool for analyzing
data using standard SQL and existing BI tools. It also allows running complex queries against
petabytes of structured data, using query optimization techniques.
12. THANK YOU
Prof. Anuja Chaturvedi
BCA Presentation - 2
By AMAN KUMAR