Presented to : Prof. Lokesh Upreti
Presented by: Tirthankar Mandal
PGDM (16-18)
sec: B
A data warehouse is a relational database that is designed for query and
analysis rather than for transaction processing. It usually contains
historical data derived from transaction data, but it can include data
from other sources. It separates analysis workload from transaction
workload and enables an organization to consolidate data from several
sources.
A data warehouse environment includes an extraction, transportation,
transformation, and loading (ETL) solution, an online analytical
processing (OLAP) engine, client analysis tools, and other applications
that manage the process of gathering data and delivering it to business
users.
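As a rough illustration of the ETL flow described above, here is a minimal sketch using Python's built-in `sqlite3` module. The `SOURCE_ROWS` data, the `sales` table, and all function names are assumptions made up for this example; a real warehouse would use a dedicated ETL tool and a much larger database.

```python
import sqlite3

# Hypothetical transaction records standing in for an operational source system.
SOURCE_ROWS = [
    {"order_id": 1, "region": " north ", "amount": "120.50"},
    {"order_id": 2, "region": "South", "amount": "80.00"},
    {"order_id": 3, "region": "north", "amount": "bad"},  # unparseable amount
    {"order_id": 4, "region": "SOUTH", "amount": "19.50"},
]

def extract():
    """Extract: read raw rows from the source (here, an in-memory list)."""
    return list(SOURCE_ROWS)

def transform(rows):
    """Transform: standardize region codes and drop rows with invalid amounts."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # a real pipeline would log or quarantine the row
        clean.append((row["order_id"], row["region"].strip().lower(), amount))
    return clean

def load(conn, rows):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

def sales_by_region(conn):
    """Analysis query of the kind a business user might run."""
    return conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()

conn = sqlite3.connect(":memory:")
load(conn, transform(extract()))
print(sales_by_region(conn))
```

The analysis query runs against the warehouse table only, so it does not compete with the transaction workload on the source system.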
Benefits :
•Integrate data from multiple sources into a single database and
data model.
•Maintain data history, even if the source transaction systems do
not.
•Integrate data from multiple source systems, enabling a central
view across the enterprise.
•Improve data quality, by providing consistent codes and
descriptions, flagging or even fixing bad data.
•Present the organization's information consistently.
•Add value to operational business applications, notably
customer relationship management (CRM) systems.
Disadvantage: Data warehouses are expensive to scale, and
do not excel at handling raw, unstructured, or complex data.
However, data warehouses are still an important tool in the big
data era.
Characteristics :
Subject-Oriented: A data warehouse can be used to analyze a
particular subject area. For example, "sales" can be a particular
subject.
Integrated: A data warehouse integrates data from multiple data
sources. For example, source A and source B may have different
ways of identifying a product, but in a data warehouse, there will
be only a single way of identifying a product.
Time-Variant: Historical data is kept in a data warehouse. For
example, one can retrieve data from 3, 6, or 12 months ago, or
even older, from a data warehouse. This contrasts with a
transaction system, where often only the most recent data is
kept.
Non-volatile: Once data is in the data warehouse, it will not
change. So, historical data in a data warehouse should never be
altered.
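The integrated, time-variant, and non-volatile properties can be sketched in a few lines of Python. Everything here is hypothetical: the two key-mapping tables, the `Warehouse` class, and the sample dates are assumptions chosen only to illustrate the idea.

```python
from datetime import date

# Hypothetical lookup tables: each source identifies products in its own way.
SOURCE_A_KEYS = {"SKU-001": "P1", "SKU-002": "P2"}
SOURCE_B_KEYS = {"1001": "P1", "1002": "P2"}

class Warehouse:
    def __init__(self):
        # Append-only list of (load_date, product_key, units).
        # Rows are never updated or deleted (non-volatile).
        self._facts = []

    def load(self, load_date, source, source_id, units):
        """Integrated: translate the source-specific ID to a single product key."""
        keys = SOURCE_A_KEYS if source == "A" else SOURCE_B_KEYS
        self._facts.append((load_date, keys[source_id], units))

    def units_sold(self, product_key, since):
        """Time-variant: history is kept, so any past period can be queried."""
        return sum(
            units
            for load_date, key, units in self._facts
            if key == product_key and load_date >= since
        )

dw = Warehouse()
dw.load(date(2017, 1, 15), "A", "SKU-001", 10)
dw.load(date(2017, 4, 10), "B", "1001", 5)
dw.load(date(2017, 6, 20), "A", "SKU-001", 7)
dw.load(date(2017, 6, 21), "B", "1002", 3)

print(dw.units_sold("P1", since=date(2017, 1, 1)))
print(dw.units_sold("P1", since=date(2017, 4, 1)))
```

Both "SKU-001" from source A and "1001" from source B end up under the single key "P1", so a query does not need to know which source a row came from.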
Design Methods :
Bottom-up design
In the bottom-up approach, data marts are first created to provide
reporting and analytical capabilities for specific business processes.
These data marts can then be integrated to create a comprehensive
data warehouse.
Top-down design
In the top-down approach, the data warehouse is designed using a
normalized enterprise data model. "Atomic" data, that is, data at
the greatest level of detail, are stored in the data warehouse.
Hybrid design
Data warehouses (DW) often resemble a hub-and-spoke
architecture. Legacy systems feeding the warehouse often include
customer relationship management (CRM) and enterprise resource
planning (ERP) systems, which generate large amounts of data. To
consolidate these various data models and facilitate the extract,
transform, load (ETL) process, data warehouses often make use of
an operational data store (ODS), the information from which is
parsed into the actual DW.
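The role of the operational data store can be sketched as a staging step between the legacy feeds and the warehouse. This is only an illustration under assumed record shapes: the field names in `crm_feed` and `erp_feed`, and the helper functions, are hypothetical and do not come from any particular CRM or ERP product.

```python
# Hypothetical record shapes for two legacy feeds with different data models.
crm_feed = [{"cust_no": "C-7", "full_name": "Asha Rao", "email": "ASHA@EXAMPLE.COM"}]
erp_feed = [{"customer_id": 7, "name": "Rao, Asha", "total_orders": 3}]

def from_crm(rec):
    """Map a CRM record onto the common model used by the ODS."""
    return {
        "customer_key": int(rec["cust_no"].split("-")[1]),
        "email": rec["email"].lower(),
    }

def from_erp(rec):
    """Map an ERP record onto the same common model."""
    return {
        "customer_key": rec["customer_id"],
        "total_orders": rec["total_orders"],
    }

def stage_in_ods(ods, records):
    """Merge normalized records into the ODS, keyed by customer_key."""
    for rec in records:
        ods.setdefault(rec["customer_key"], {}).update(rec)

def load_warehouse(ods):
    """Copy the consolidated ODS records into the warehouse, ordered by key."""
    return [dict(ods[key]) for key in sorted(ods)]

ods = {}
stage_in_ods(ods, map(from_crm, crm_feed))
stage_in_ods(ods, map(from_erp, erp_feed))
warehouse_rows = load_warehouse(ods)
print(warehouse_rows)
```

Each source gets its own small adapter function, so adding another legacy system means adding one more adapter rather than changing the warehouse load.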
Architecture
Three-Tier Data Warehouse Architecture:
•Bottom tier
•Middle tier
•Top tier
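The slide only names the three tiers. In the usual textbook description, the bottom tier is the warehouse database server, the middle tier is an OLAP server, and the top tier is the set of front-end query and reporting tools. Assuming that reading, the layering can be sketched as follows; the `sales` table, its sample rows, and all function names are hypothetical.

```python
import sqlite3

def bottom_tier():
    """Bottom tier: the warehouse database holding the stored data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (year INTEGER, region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            (2016, "north", 100.0),
            (2016, "south", 50.0),
            (2017, "north", 150.0),
            (2017, "south", 75.0),
        ],
    )
    return conn

ALLOWED_DIMENSIONS = {"year", "region"}

def middle_tier_rollup(conn, dimension):
    """Middle tier: an OLAP-style roll-up of sales along one dimension."""
    if dimension not in ALLOWED_DIMENSIONS:
        raise ValueError(f"unknown dimension: {dimension}")
    # The column name is checked against a whitelist before being interpolated.
    sql = (
        f"SELECT {dimension}, SUM(amount) FROM sales "
        f"GROUP BY {dimension} ORDER BY {dimension}"
    )
    return conn.execute(sql).fetchall()

def top_tier_report(rows):
    """Top tier: a client-side report formatted for a business user."""
    return "\n".join(f"{label}: {total:.2f}" for label, total in rows)

conn = bottom_tier()
print(top_tier_report(middle_tier_rollup(conn, "year")))
print(top_tier_report(middle_tier_rollup(conn, "region")))
```

Each tier only talks to the one directly below it, which is the main point of the three-tier layout.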
Datawarehouse
