Copyright © 2016 Ramez Elmasri and Shamkant B. Navathe
CHAPTER 29
Overview of
Data Warehousing and OLAP
Chapter Outline
 1 Purpose of Data Warehousing
 2 Introduction, Definitions, and Terminology
 3 Comparison with Traditional Databases
 4 Characteristics of Data Warehouses
 5 Classification of Data Warehouses
 6 Data Modeling for Data Warehouses
 7 Multi-dimensional Schemas
 8 Building a Data Warehouse
 9 Typical Functionality of a Data Warehouse
 10 Data Warehouse vs. Data Views
 11 Implementation difficulties and open issues
Introduction, Definitions, and Terminology (1)
 The data warehouse (DW) was proposed as a new type of
database management system that would keep no
transactional data, only summarized historical
information for decision-making purposes.
 DWs are modeled and structured differently, use different
techniques for storage and retrieval and cater to a
different set of users.
 W. H. Inmon characterized a data warehouse as:
 “A subject-oriented, integrated, nonvolatile, time-
variant collection of data in support of
management’s decisions.”
Purpose of Data Warehousing
 Traditional databases are not optimized for data access - they have to
balance the requirement of data access with the need to ensure
integrity of data.
 DWs provide access for complex analysis of data, knowledge
discovery, and decision support through both ad hoc and canned
queries.
 Most of the time, data warehouse users need only read access,
but they need that access to be fast over a large volume of data.
 Most of the data required for data warehouse analysis comes from
multiple sources, which may include databases built on different data
models and sometimes files acquired from independent systems and
platforms.
Introduction, Definitions, and Terminology (2)
 Data warehouses are databases that store and maintain
analytical data separately from transaction-oriented
databases for the purpose of decision support.
 Traditional databases support online transaction processing
(OLTP).
 Data warehouses are for analytical applications, largely OLAP.
 Applications that a data warehouse supports include:
 OLAP (Online Analytical Processing) is a term used to
describe the analysis of complex data from the data
warehouse.
 DSS (Decision Support Systems), also known as EIS
(Executive Information Systems), support an organization’s
leading decision makers in making complex and important
decisions.
 Data Mining is used for knowledge discovery, the process
of searching data for unanticipated new knowledge (See
Chapter 28).
Conceptual Structure of Data Warehouse
 Data Warehouse processing involves
 Cleaning and reformatting of data
 ETL (Extract, Transform, Load)
 OLAP – Data Analytics
 Data Mining
Figure 29.1 Overview of the general architecture of a data warehouse.
Comparison with Traditional Databases
 Data Warehouses are mainly optimized for appropriate
data access.
 Traditional databases are transactional and are optimized for
both transaction processing and integrity assurance.
 Data warehouses place more emphasis on historical data, as
their main purpose is to support time-series and trend
analysis.
 In transactional databases, the transaction is the mechanism
of change to the database. By contrast, information in a
data warehouse is relatively coarse-grained, and DWs are
regarded as non-real-time. The periodic refresh policy is
carefully chosen and is usually incremental.
 Compared with transactional databases, data warehouses
are nonvolatile.
Characteristics of Data Warehouses
Based on the Codd and Salley (1993) article on providing OLAP
to users, the following characteristics of data warehouses
were identified:
 Multidimensional conceptual view
 Unlimited dimensions and aggregation levels
 Unrestricted cross-dimensional operations
 Dynamic sparse matrix handling
 Client-server architecture
 Multiuser support
 Accessibility
 Transparency
 Intuitive data manipulation
 Inductive and deductive analysis
 Flexible distributed reporting
Classification of Data Warehouses
 Generally, Data Warehouses are an order of magnitude
larger than the source databases.
 The sheer volume of data is an issue; data warehouses
can be classified accordingly as follows:
 Enterprise-wide data warehouses
 They are huge projects requiring massive investment of time
and resources.
 Virtual data warehouses
 They provide views of operational databases that are
materialized for efficient access.
 Logical Data Warehouses
 Use data federation, distribution and virtualization techniques
 Data marts
 These are generally targeted to a subset of the organization, such
as a department, and are more tightly focused.
Other Concepts Common with Data Warehouses
 ODS (Operational Data Store): This term is
commonly used for an intermediate form of
databases before they are cleansed, aggregated,
and transformed into a warehouse.
 ADS (Analytical Data Store): These are
databases built for analysis. Typically, ODSs are
reconfigured and repurposed into ADSs.
Data Modeling for Data Warehouses (1)
 Traditional databases generally deal with two-dimensional
data (similar to a spreadsheet).
 However, query performance in a multi-dimensional
data storage model (matrices) can be much more efficient.
 Data warehouses can take advantage of this feature because,
generally:
 They are nonvolatile.
 The degree of predictability of the analysis that will be
performed on them is high.
 Typical Dimensions used in corporate DWs:
 Fiscal Periods, Product Categories, Geographic Regions
Data Modeling for Data Warehouses (2)
 Example of two-dimensional vs. multi-dimensional data
(a 3D model is typically called a “data cube”)
Figure 29.2 A two-dimensional matrix model.
Figure 29.3 A three-dimensional data cube model.
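The relationship between the two models can be sketched in Python. This is only an illustration: the product labels, regions, quarters, and sales amounts are invented and are not taken from Figures 29.2 or 29.3. The cube is stored as a dictionary keyed by (product, region, quarter), and summing out the quarter dimension gives back a two-dimensional matrix.

```python
# Hypothetical sales figures, invented for illustration.
# Two-dimensional matrix: product x region.
sales_2d = {
    ("P123", "East"): 100,
    ("P123", "West"): 80,
    ("P124", "East"): 60,
    ("P124", "West"): 40,
}

# Three-dimensional cube: product x region x fiscal quarter.
sales_cube = {
    ("P123", "East", "Q1"): 60,
    ("P123", "East", "Q2"): 40,
    ("P123", "West", "Q1"): 50,
    ("P123", "West", "Q2"): 30,
    ("P124", "East", "Q1"): 35,
    ("P124", "East", "Q2"): 25,
    ("P124", "West", "Q1"): 15,
    ("P124", "West", "Q2"): 25,
}


def collapse_quarters(cube):
    """Sum out the quarter dimension, giving a product x region matrix."""
    matrix = {}
    for (product, region, _quarter), amount in cube.items():
        key = (product, region)
        matrix[key] = matrix.get(key, 0) + amount
    return matrix


matrix_from_cube = collapse_quarters(sales_cube)
```

In this sketch the two-dimensional matrix is just the cube with one dimension aggregated away, which is why the cube can answer every question the matrix can, plus questions about individual quarters.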
Functionality of a Data Warehouse
 Functionality that can be expected:
 Pivot: Cross tabulation (also referred to as rotation) is
performed.
 Roll-up (also Drill-up): Data is summarized with
increasing generalization (for example, weekly to
quarterly to annually).
 Drill-down: Increasing levels of detail are revealed
(the complement of roll-up).
 Slice and dice: Projection operations are performed
on the dimensions.
 Sorting: Data is sorted by ordinal value.
 Selection: Data is filtered by value or range.
 Derived (computed) attributes: Attributes are
computed by operations on stored and derived values.
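Several of these operations can be sketched on a tiny in-memory cube. The cube contents and all helper names below are hypothetical. The sketch reads "slice" as fixing one dimension to a single value and "dice" as restricting dimensions to subsets of values, which is one common interpretation and an assumption here.

```python
# Hypothetical cube: (product, region, quarter) -> sales amount.
cube = {
    ("P123", "East", "Q1"): 60,
    ("P123", "West", "Q1"): 50,
    ("P124", "East", "Q1"): 35,
    ("P123", "East", "Q2"): 40,
    ("P124", "West", "Q2"): 25,
}


def slice_cube(cube, quarter):
    """Slice: fix the quarter dimension, leaving a product x region view."""
    return {(p, r): v for (p, r, q), v in cube.items() if q == quarter}


def dice_cube(cube, products, regions):
    """Dice: keep cells whose product and region fall in the given sets."""
    return {k: v for k, v in cube.items() if k[0] in products and k[1] in regions}


def select_range(cube, low, high):
    """Selection: keep cells whose value lies in the closed range [low, high]."""
    return {k: v for k, v in cube.items() if low <= v <= high}


def sort_by_value(cube):
    """Sorting: list cells ordered by sales amount, largest first."""
    return sorted(cube.items(), key=lambda item: item[1], reverse=True)


def with_share(cube):
    """Derived attribute: each cell's share of total sales."""
    total = sum(cube.values())
    return {k: v / total for k, v in cube.items()}


q1 = slice_cube(cube, "Q1")
east_p123 = dice_cube(cube, {"P123"}, {"East"})
mid_range = select_range(cube, 30, 50)
ranked = sort_by_value(cube)
shares = with_share(cube)
```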
The Pivot operation in a Data Warehouse
 Figure 29.4 Pivoted version of the data
cube from Figure 29.3.
This presents data in terms of Region vs. Fiscal
Quarter, product by product. “Pivoting” is also
called “rotation.”
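Pivoting changes only the orientation of the data, not its values. A minimal sketch, assuming a cube stored as a Python dictionary keyed by (product, region, quarter); the data and the `pivot` helper are hypothetical:

```python
# Hypothetical cube keyed by (product, region, quarter).
cube = {
    ("P123", "East", "Q1"): 60,
    ("P123", "West", "Q1"): 50,
    ("P124", "East", "Q2"): 25,
}


def pivot(cube, order):
    """Reorder the dimensions of every cell key; values are unchanged.

    `order` lists the old axis positions in their new order, so
    (1, 2, 0) turns (product, region, quarter) into (region, quarter, product).
    """
    return {tuple(key[i] for i in order): value for key, value in cube.items()}


# Region vs. fiscal quarter, product by product.
pivoted = pivot(cube, (1, 2, 0))
```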
The “roll-up” operation in a Data Warehouse
 Figure 29.5 The roll-up operation.
The roll-up in the figure above aggregates data for all products
numbered 123, 124, … into 1XX, and so on.
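The aggregation in a roll-up can be sketched as follows. The rule that maps a product number to its group by keeping the leading digit (123 becomes 1XX) is inferred from the labels above, and the sales amounts are invented for illustration.

```python
from collections import defaultdict

# Hypothetical sales by (product number, region).
sales = {
    (123, "East"): 10,
    (124, "East"): 15,
    (123, "West"): 5,
    (245, "East"): 20,
    (251, "West"): 30,
}


def product_group(product_number):
    """Map a three-digit product number to its group label, e.g. 123 -> '1XX'."""
    return f"{product_number // 100}XX"


def roll_up(sales):
    """Aggregate sales from individual products to product groups."""
    totals = defaultdict(int)
    for (product, region), amount in sales.items():
        totals[(product_group(product), region)] += amount
    return dict(totals)


rolled = roll_up(sales)
```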
The “drill-down” operation in a Data Warehouse
 Figure 29.6 The drill-down operation.
A drill-down expands aggregate data into a finer-grained
view. In this example, products are divided into styles
and regions into sub-regions.
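Drill-down requires the finer-grained data to be stored; it cannot be computed from the aggregate alone. A minimal sketch, assuming the detail rows are kept alongside the summary; the style names, sub-regions, and amounts are hypothetical.

```python
# Hypothetical detail data: (product, style, region, sub_region) -> sales.
detail = {
    ("P123", "Style A", "East", "Northeast"): 30,
    ("P123", "Style B", "East", "Northeast"): 10,
    ("P123", "Style A", "East", "Southeast"): 20,
    ("P123", "Style A", "West", "Pacific"): 50,
}


def summary(detail):
    """Coarse view: total sales per (product, region)."""
    totals = {}
    for (product, _style, region, _sub), amount in detail.items():
        totals[(product, region)] = totals.get((product, region), 0) + amount
    return totals


def drill_down(detail, product, region):
    """Expand one summary cell into its (style, sub_region) components."""
    return {
        (style, sub): amount
        for (p, style, r, sub), amount in detail.items()
        if p == product and r == region
    }


coarse = summary(detail)
fine = drill_down(detail, "P123", "East")
```

The components returned by `drill_down` always sum to the corresponding cell of `summary`, which is what makes drill-down the complement of roll-up.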
Multi-dimensional Schemas (1)
 The multi-dimensional model (also called the “dimensional
model”) includes two types of tables:
 Dimension table
 It consists of tuples of attributes of the dimension.
 Fact table
 Each tuple is a recorded fact. This fact contains some
measured or observed variable(s) and identifies them with
pointers to dimension tables. The fact table contains the
data, and the dimensions identify each tuple in the
data.
 A fact table is an agglomerated view of transaction
data, whereas each dimension table represents the “master
data” that those transactions belong to.
Multi-dimensional Schemas (2)
 Multidimensional DW systems implemented the
multidimensional model as is.
 The more popular approach is to implement the
multidimensional model on top of the relational model.
 Two common multi-dimensional schemas are
 Star schema:
 Consists of a fact table with a single table for each dimension
 Snowflake Schema:
 It is a variation of star schema, in which the dimensional tables
from a star schema are organized into a hierarchy by
normalizing them.
Multi-dimensional Schemas (3)
 Star schema:
 Consists of a fact table with a single table for each
dimension.
Figure 29.7 A star schema with fact and dimensional tables.
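A star schema can be sketched with Python's built-in `sqlite3` module. The table names, column names, and data below are illustrative assumptions, not the ones shown in Figure 29.7.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One table per dimension.
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE region  (region_id  INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE quarter (quarter_id INTEGER PRIMARY KEY, label TEXT);

    -- The fact table holds the measure plus a foreign key to each dimension.
    CREATE TABLE sales_fact (
        product_id INTEGER REFERENCES product(product_id),
        region_id  INTEGER REFERENCES region(region_id),
        quarter_id INTEGER REFERENCES quarter(quarter_id),
        amount     REAL
    );

    INSERT INTO product VALUES (1, 'Widget', 'Hardware'), (2, 'Gadget', 'Hardware');
    INSERT INTO region  VALUES (1, 'East'), (2, 'West');
    INSERT INTO quarter VALUES (1, 'Q1'), (2, 'Q2');
    INSERT INTO sales_fact VALUES (1, 1, 1, 100.0), (1, 2, 1, 80.0),
                                  (2, 1, 2, 60.0),  (2, 2, 2, 40.0);
""")

# A typical star join: total sales per region.
rows = conn.execute("""
    SELECT r.name, SUM(f.amount)
    FROM sales_fact AS f
    JOIN region AS r ON r.region_id = f.region_id
    GROUP BY r.name
    ORDER BY r.name
""").fetchall()
```

Every dimension is exactly one join away from the fact table, which keeps analytical queries simple at the cost of some redundancy inside the dimension tables (here, the repeated category value).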
Multi-dimensional Schemas (4)
 Snowflake Schema:
 It is a variation of star schema, in which the
dimensional tables from a star schema are
organized into a hierarchy by normalizing them.
Figure 29.8 A snowflake schema.
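Normalizing a dimension can be sketched with `sqlite3`. In this hypothetical example, a product dimension that would otherwise repeat the category name in every row is split into a product table and a category table; all names and data are invented and do not reproduce Figure 29.8.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized dimension: product points to category instead of repeating it.
    CREATE TABLE category (category_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product (
        product_id  INTEGER PRIMARY KEY,
        name        TEXT,
        category_id INTEGER REFERENCES category(category_id)
    );
    CREATE TABLE sales_fact (
        product_id INTEGER REFERENCES product(product_id),
        amount     REAL
    );

    INSERT INTO category VALUES (1, 'Hardware'), (2, 'Software');
    INSERT INTO product  VALUES (1, 'Widget', 1), (2, 'Gadget', 1), (3, 'App', 2);
    INSERT INTO sales_fact VALUES (1, 100.0), (2, 50.0), (3, 70.0);
""")

# Reaching the category now takes one more join than in a star schema.
rows = conn.execute("""
    SELECT c.name, SUM(f.amount)
    FROM sales_fact AS f
    JOIN product  AS p ON p.product_id = f.product_id
    JOIN category AS c ON c.category_id = p.category_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
```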
Multi-dimensional Schemas (5)
 Fact Constellation
 A fact constellation is a set of fact tables that share some
dimension tables. However, fact constellations limit the
possible queries for the warehouse.
 The example shows the Product dimension table being
shared by two fact tables.
Figure 29.9 A fact constellation.
Multi-dimensional Schemas (6)
 Indexing
 Data warehouses also use indexing to support
high-performance access.
 A technique called bitmap indexing constructs a bit
vector for each value in the domain being indexed.
 Bitmap indexing works very well for domains of low
cardinality. (See the example of using a bitmap index in
Section 19.8.)
 Master Data Management (MDM)
 The purpose of MDM is to define standards, processes,
policies, and governance related to the critical
data entities of the organization.
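The bitmap indexing technique described on this slide can be sketched in a few lines. This illustrates the idea only, using Python integers as bit vectors; the column values and function names are hypothetical, and real systems typically store and compress bit vectors differently.

```python
def build_bitmap_index(column):
    """Return {value: bit vector}; bit i is set when row i holds that value."""
    index = {}
    for row, value in enumerate(column):
        index[value] = index.get(value, 0) | (1 << row)
    return index


def rows_of(bitmap):
    """List the row numbers whose bits are set in the bit vector."""
    return [row for row in range(bitmap.bit_length()) if (bitmap >> row) & 1]


# Hypothetical low-cardinality columns over the same six rows.
region = ["East", "West", "East", "North", "West", "East"]
quarter = ["Q1", "Q1", "Q2", "Q2", "Q1", "Q1"]

region_index = build_bitmap_index(region)
quarter_index = build_bitmap_index(quarter)

# Rows where region = 'East' AND quarter = 'Q1': a single bitwise AND.
east_q1 = rows_of(region_index["East"] & quarter_index["Q1"])
# Rows where region = 'East' OR region = 'North': a single bitwise OR.
east_or_north = rows_of(region_index["East"] | region_index["North"])
```

One bit vector is needed per distinct value, so the index stays small when the domain has few values, which is why low cardinality suits this technique.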
Building A Data Warehouse (1)
 The builders of a data warehouse should take a
broad view of the anticipated use of the
warehouse.
 The design should harmonize dimensions across
the whole enterprise and multiple data sources.
 The design should support ad hoc querying.
 An appropriate schema should be chosen that
reflects the anticipated usage and the business
model of the organization.
Building A Data Warehouse (2)
 The design of a data warehouse involves
the following steps:
 Acquisition of data for the warehouse.
 Ensuring that Data Storage meets the query
requirements efficiently.
 Giving full consideration to the environment in
which the data warehouse resides.
Data Acquisition (1)
 Acquisition of data for the warehouse
 The data must be extracted from multiple,
heterogeneous sources.
 Data must be formatted for consistency within the
warehouse.
 The data must be cleaned to ensure validity.
 The cleaning process is difficult to automate.
 Back flushing refers to upgrading the source data
by returning cleaned data.
Data Acquisition (2)
 Acquisition of data for the warehouse (contd.)
 The data must be fitted into the data model of the
warehouse. Data may have to be converted from
its source model into a multi-dimensional format.
 The data must be loaded into the warehouse.
 The refresh policy should be designed
carefully.
 Data may come from different systems, language
areas and time-zones.
 Order of loading is critical and semantic constraints
must be obeyed.
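The acquisition steps (extract, format, clean, load) can be sketched as a small pipeline. Everything here is a hypothetical illustration: the two source layouts, the field names, the validity rule, and the in-memory list standing in for the warehouse are assumptions made for the example.

```python
from datetime import date

# Two hypothetical sources with different layouts and date formats.
source_a = [{"prod": "123", "sold_on": "2016-01-15", "amt": "100.50"}]
source_b = [
    {"product_id": 124, "date": "15/01/2016", "amount": 80.0},
    {"product_id": 125, "date": "16/01/2016", "amount": -5.0},  # invalid amount
]


def from_a(rec):
    """Format a source A record into the warehouse layout."""
    return {
        "product_id": int(rec["prod"]),
        "sale_date": date.fromisoformat(rec["sold_on"]),
        "amount": float(rec["amt"]),
    }


def from_b(rec):
    """Format a source B record (day/month/year dates) into the warehouse layout."""
    day, month, year = (int(part) for part in rec["date"].split("/"))
    return {
        "product_id": rec["product_id"],
        "sale_date": date(year, month, day),
        "amount": float(rec["amount"]),
    }


def is_valid(row):
    """Cleaning rule assumed for this sketch: amounts must not be negative."""
    return row["amount"] >= 0


def run_etl():
    """Extract from both sources, clean, and load in a deterministic order."""
    extracted = [from_a(r) for r in source_a] + [from_b(r) for r in source_b]
    cleaned = [row for row in extracted if is_valid(row)]
    rejected = [row for row in extracted if not is_valid(row)]
    loaded = sorted(cleaned, key=lambda row: (row["sale_date"], row["product_id"]))
    return loaded, rejected


warehouse, rejected = run_etl()
```

Rejected rows are returned rather than silently dropped; in a back-flushing arrangement they are the candidates for correction at the source.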
Storing Data in a Data Warehouse
 Storing the data according to the data model of
the warehouse
 Creating and maintaining required data structures
 Creating and maintaining appropriate access
paths
 Providing for time-variant data as new data are
added
 Supporting the updating of warehouse data.
 Refreshing the data
 Purging data
DW Design Considerations
 Usage projections
 The fit of the data model
 Characteristics of available resources
 Design of the metadata component
 Modular component design
 Design for manageability and change
 Considerations of distributed and parallel architecture
 Distributed DWs: Replication, Partitioning,
Communication, Consistency issues
 Federated DWs: Decentralized federation of
autonomous DWs.
Metadata Repositories
 The metadata repository is a key data warehouse
component. It includes both technical and business
metadata.
 Technical metadata covers details of acquisition,
processing, storage structures, data descriptions,
warehouse operations and maintenance, and access
support.
 Business metadata includes the relevant business rules
and organizational details supporting the warehouse.
Copyright © 2016 Ramez Elmasri and Shamkant B. Navathe Slide 29- 31
Functionality of Data Warehouses
 We already presented operations of Pivot, Roll-up, Drill-
down, etc.
 Other functions include:
 intersection and union of indexes
 SQL extensions for aggregation
 Advanced join techniques for star schemas
 Among OLAP functions, we have:
 ROLAP: Relational OLAP, which keeps the DW as a relational DB
 MOLAP: OLAP on a multidimensional model
 HOLAP: Hybrid OLAP, which combines the above two. HOLAP
can “drill through” the data to the underlying relations.
Data Warehouse vs. Data Views
 Views and data warehouses are alike in that they both
have read-only extracts from the databases.
 However, data warehouses are different from views in the
following ways:
 Data Warehouses exist as persistent storage instead of being
materialized on demand.
 Data Warehouses are not just relational, but rather multi-
dimensional with multiple levels of aggregation.
 Data Warehouses can be indexed for optimal performance.
Views cannot be indexed directly.
 Data Warehouses provide specific support of functionality.
 Data warehouses deal with large volumes of integrated data
that are generally contained in more than one database.
 Data warehouses bring in data periodically from multiple
sources via a complex ETL process. Views do not go through
cleaning and pruning.
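The first difference, persistent storage versus evaluation on demand, can be illustrated with `sqlite3`. The table names and data are hypothetical; the summary table stands in for warehouse storage and changes only when it is explicitly refreshed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (region TEXT, amount REAL);
    INSERT INTO orders VALUES ('East', 100.0), ('West', 80.0);

    -- A view is re-evaluated against the source every time it is queried.
    CREATE VIEW sales_view AS
        SELECT region, SUM(amount) AS total FROM orders GROUP BY region;

    -- A persistent summary table is a stored copy taken at load time.
    CREATE TABLE sales_summary AS
        SELECT region, SUM(amount) AS total FROM orders GROUP BY region;

    -- Stored tables can carry their own indexes.
    CREATE INDEX idx_summary_region ON sales_summary(region);
""")

# A new transaction arrives in the operational table.
conn.execute("INSERT INTO orders VALUES ('East', 50.0)")

view_east = conn.execute(
    "SELECT total FROM sales_view WHERE region = 'East'").fetchone()[0]
stored_east = conn.execute(
    "SELECT total FROM sales_summary WHERE region = 'East'").fetchone()[0]
```

After the insert, the view reflects the new order while the stored summary keeps its load-time value until the next refresh, which mirrors the nonvolatile, periodically refreshed nature of a warehouse.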
Difficulties of Implementing Data Warehouses
 Lead time is huge in building a data warehouse
 Potentially it takes years to build and efficiently maintain a data
warehouse.
 Both quality and consistency of data as well as master
data management are major concerns.
 Revising the usage projections regularly to meet the
current requirements.
 The data warehouse should be designed to accommodate
addition and attrition of data sources without major redesign
 DW schema and acquisition component must handle the
changes in data sources in terms of structure and content.
 Administration of data warehouse would require far
broader skills than are needed for a traditional database.
Open Issues in Data Warehousing
 Data cleaning, indexing, partitioning, and views could be
given new attention from the perspective of data warehousing.
 Automation of
 data acquisition
 data quality management
 selection and construction of access paths and structures
 self-maintainability
 functionality and performance optimization
 Incorporating domain and business rules into the
warehouse creation and maintenance process more
intelligently.
 Creating appropriate teams of technical experts that also
understand the business.
Future of Data Warehousing
 Businesses are becoming dissatisfied with traditional
data warehousing techniques and technologies.
 New analytic requirements are driving new analytic
appliances such as IBM Netezza, EMC Greenplum, SAP
Hana, and ParAccel.
 Big data analytics have driven Hadoop and other
specialized databases, such as graph and key-value
stores, into the next generation of data warehousing (see
Chapter 25 for Big Data technology based on Hadoop).
 Data virtualization platforms such as the one from Cisco
will enable such Logical Data Warehouses to be built in
the future.
See Description of Cisco’s Data Virtualization Platform at
http://www.compositesw.com/products-services/data-virtualization-platform/
Future of Data Warehousing
 Gartner, a prestigious source of guidance for
industry, says:
“the Logical Data Warehouse (LDW) is a new data
management architecture for analytics which combines the
strengths of traditional repository warehouses with
alternative data management and access strategy. The
LDW will form a new best practices by the end of 2015.”
 The term “Logical Data Warehouse” is a clear demarcation
between centralized repository approaches and
managed data services for analytics.
See: https://www.gartner.com/doc/2057915/understanding-logical-data-warehouse-emerging
Recap
 Purpose of Data Warehousing
 Introduction, Definitions, and Terminology
 Comparison with Traditional Databases
 Characteristics of Data Warehouses
 Classification of Data Warehouses
 Data Modeling for Data Warehouses
 Multi-dimensional Schemas
 Building A Data Warehouse
 Functionality of a Data Warehouse
 Data Warehouse vs. Data Views
 Implementation difficulties, open issues, and future.

Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
 

Chapter29.ppt

  • 1. Copyright © 2016 Ramez Elmasri and Shamkant B. Navathe
  • 2. CHAPTER 29: Overview of Data Warehousing and OLAP
  • 3. Chapter Outline
      1. Purpose of Data Warehousing
      2. Introduction, Definitions, and Terminology
      3. Comparison with Traditional Databases
      4. Characteristics of Data Warehouses
      5. Classification of Data Warehouses
      6. Data Modeling for Data Warehouses
      7. Multi-dimensional Schemas
      8. Building a Data Warehouse
      9. Typical Functionality of a Data Warehouse
      10. Data Warehouse vs. Data Views
      11. Implementation Difficulties and Open Issues
  • 4. Introduction, Definitions, and Terminology (1)
      • The data warehouse (DW) was proposed as a new type of database management system that would keep no transactional data, only summarized historical information for decision-making purposes.
      • DWs are modeled and structured differently, use different techniques for storage and retrieval, and cater to a different set of users.
      • W. H. Inmon characterized a data warehouse as “a subject-oriented, integrated, nonvolatile, time-variant collection of data in support of management’s decisions.”
  • 5. Purpose of Data Warehousing
      • Traditional databases are not optimized for data access alone: they have to balance the requirement of data access with the need to ensure data integrity.
      • DWs provide access for complex analysis of data, knowledge discovery, and decision support through both ad hoc and canned queries.
      • Most of the time, data warehouse users need only read access, but they need that access to be fast over a large volume of data.
      • Most of the data required for data warehouse analysis comes from multiple sources, which may include databases built on different data models and sometimes files acquired from independent systems and platforms.
  • 6. Introduction, Definitions, and Terminology (2)
      • Data warehouses are databases that store and maintain analytical data separately from transaction-oriented databases for the purpose of decision support.
      • Traditional databases support online transaction processing (OLTP).
      • Data warehouses are for analytical applications, largely OLAP.
      • Applications that a data warehouse supports include:
          • OLAP (Online Analytical Processing): a term used to describe the analysis of complex data from the data warehouse.
          • DSS (Decision Support Systems), also known as EIS (Executive Information Systems): support an organization’s leading decision makers in making complex and important decisions.
          • Data mining: used for knowledge discovery, the process of searching data for unanticipated new knowledge (see Chapter 28).
  • 7. Conceptual Structure of Data Warehouse
      • Data warehouse processing involves:
          • Cleaning and reformatting of data
          • ETL (Extract, Transform, Load)
          • OLAP (data analytics)
          • Data mining
      • Figure 29.1: Overview of the general architecture of a data warehouse.
  • 8. Comparison with Traditional Databases
      • Data warehouses are mainly optimized for appropriate data access.
      • Traditional databases are transactional and are optimized for both transaction processing and integrity assurance.
      • Data warehouses place more emphasis on historical data, since their main purpose is to support time-series and trend analysis.
      • In transactional databases, the transaction is the mechanism of change to the database. By contrast, information in a data warehouse is relatively coarse-grained, and DWs are regarded as non-real-time. The periodic refresh policy is carefully chosen and is usually incremental.
      • Compared with transactional databases, data warehouses are nonvolatile.
  • 9. Characteristics of Data Warehouses
      Based on the Codd and Salley (1993) article on providing OLAP to users, the following characteristics of data warehouses were identified:
      • Multidimensional conceptual view
      • Unlimited dimensions and aggregation levels
      • Unrestricted cross-dimensional operations
      • Dynamic sparse matrix handling
      • Client-server architecture
      • Multiuser support
      • Accessibility
      • Transparency
      • Intuitive data manipulation
      • Inductive and deductive analysis
      • Flexible distributed reporting
  • 10. Classification of Data Warehouses
      • Generally, data warehouses are an order of magnitude larger than the source databases.
      • The sheer volume of data is an issue, and data warehouses can be classified on that basis as follows:
          • Enterprise-wide data warehouses: huge projects requiring a massive investment of time and resources.
          • Virtual data warehouses: provide views of operational databases that are materialized for efficient access.
          • Logical data warehouses: use data federation, distribution, and virtualization techniques.
          • Data marts: generally targeted at a subset of the organization, such as a department, and more tightly focused.
  • 11. Other Concepts Common with Data Warehouses
      • ODS (Operational Data Store): this term is commonly used for the intermediate form of databases before they are cleansed, aggregated, and transformed into a warehouse.
      • ADS (Analytical Data Store): databases built for analysis. Typically, ODSs are reconfigured and repurposed into ADSs.
  • 12. Data Modeling for Data Warehouses (1)
      • Traditional databases generally deal with two-dimensional data (similar to a spreadsheet).
      • However, query performance in a multi-dimensional data storage model (matrices) is much more efficient.
      • Data warehouses can take advantage of this feature because, in general:
          • They are nonvolatile.
          • The degree of predictability of the analysis that will be performed on them is high.
      • Typical dimensions used in corporate DWs: fiscal periods, product categories, geographic regions.
  • 13. Data Modeling for Data Warehouses (2)
      • Example of two-dimensional vs. multi-dimensional data (the 3D case is typically called a “data cube”).
      • Figure 29.2: A two-dimensional matrix model.
      • Figure 29.3: A three-dimensional data cube model.
  • 14. Functionality of a Data Warehouse
      Functionality that can be expected:
      • Pivot: cross tabulation (also referred to as rotation) is performed.
      • Roll-up (also drill-up): data is summarized with increasing generalization (for example, weekly to quarterly to annually).
      • Drill-down: increasing levels of detail are revealed (the complement of roll-up).
      • Slice and dice: projection operations are performed on the dimensions.
      • Sorting: data is sorted by ordinal value.
      • Selection: data is filtered by value or range.
      • Derived (computed) attributes: attributes are computed by operations on stored and derived values.
  • 15. The Pivot Operation in a Data Warehouse
      • Figure 29.4: Pivoted version of the data cube from Figure 29.3. It presents the data as Region vs. Fiscal Quarter, product by product.
      • Pivoting is also called rotation.
  • 16. The “Roll-up” Operation in a Data Warehouse
      • Figure 29.5: The roll-up operation.
      • The roll-up in the figure aggregates data for all products numbered 123, 124, … into 1XX, and so on.
  • 17. The “Drill-down” Operation in a Data Warehouse
      • Figure 29.6: The drill-down operation.
      • A drill-down expands aggregate data into a finer-grained view. In this example, products are divided into styles and regions into sub-regions.
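The roll-up, slice, and pivot operations described in the preceding slides can be sketched on a tiny in-memory cube. This is only an illustrative sketch: the cube contents, the `DIMS` tuple, and the function names are hypothetical, and "slice" is taken here in the usual OLAP sense of fixing one member of a dimension. The roll-up mirrors the slide's example of grouping products 123, 124, … into 1XX.

```python
from collections import defaultdict

# Hypothetical sales cube: (product, region, quarter) -> revenue.
DIMS = ("product", "region", "quarter")
cube = {
    ("123", "East", "Q1"): 100,
    ("123", "East", "Q2"): 120,
    ("123", "West", "Q1"): 80,
    ("124", "East", "Q1"): 50,
    ("124", "West", "Q2"): 70,
    ("231", "West", "Q1"): 30,
}

def roll_up(cells, dim, to_group):
    """Summarize along one dimension by mapping each member to a coarser group."""
    i = DIMS.index(dim)
    out = defaultdict(int)
    for key, value in cells.items():
        coarse_key = key[:i] + (to_group(key[i]),) + key[i + 1:]
        out[coarse_key] += value
    return dict(out)

def slice_cube(cells, dim, member):
    """Keep only the cells whose value on `dim` equals `member`."""
    i = DIMS.index(dim)
    return {key: value for key, value in cells.items() if key[i] == member}

def pivot(cells, row_dim, col_dim):
    """Cross-tabulate two dimensions, summing over the remaining one."""
    r, c = DIMS.index(row_dim), DIMS.index(col_dim)
    table = defaultdict(lambda: defaultdict(int))
    for key, value in cells.items():
        table[key[r]][key[c]] += value
    return {row: dict(cols) for row, cols in table.items()}

# Products 123 and 124 roll up into family 1XX, product 231 into 2XX.
by_family = roll_up(cube, "product", lambda p: p[0] + "XX")
q1_only = slice_cube(cube, "quarter", "Q1")
region_by_quarter = pivot(cube, "region", "quarter")
```

Drill-down is the reverse direction: it requires the finer-grained cells to still be available, which is why the sketch keeps the original `cube` untouched and returns new dictionaries.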
  • 18. Multi-dimensional Schemas (1)
      • The multi-dimensional model (also called the “dimensional model”) includes two types of tables:
          • Dimension table: consists of tuples of attributes of the dimension.
          • Fact table: each tuple is a recorded fact. The fact contains one or more measured or observed variables and identifies them with pointers to dimension tables. The fact table contains the data, and the dimensions identify each tuple in the data.
      • A fact table is an agglomerated view of transaction data, whereas each dimension table represents the “master data” that those transactions belong to.
  • 19. Multi-dimensional Schemas (2)
      • Multidimensional DW systems implemented the multidimensional model as is.
      • The more popular approach is to implement the multidimensional model on top of the relational model.
      • Two common multi-dimensional schemas are:
          • Star schema: consists of a fact table with a single table for each dimension.
          • Snowflake schema: a variation of the star schema in which the dimension tables are organized into a hierarchy by normalizing them.
  • 20. Multi-dimensional Schemas (3)
      • Star schema: consists of a fact table with a single table for each dimension.
      • Figure 29.7: A star schema with fact and dimensional tables.
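As a rough illustration of a star schema implemented on top of the relational model, the sketch below builds one fact table and three dimension tables in an in-memory SQLite database and runs a typical aggregate query. All table names, column names, and rows are hypothetical and are not taken from Figure 29.7.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One table per dimension (hypothetical names and columns).
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_region  (region_id  INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_quarter (quarter_id INTEGER PRIMARY KEY, label TEXT, year INTEGER);

    -- The fact table holds the measure plus a pointer into each dimension.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        region_id  INTEGER REFERENCES dim_region(region_id),
        quarter_id INTEGER REFERENCES dim_quarter(quarter_id),
        revenue    REAL
    );
""")
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                 [(1, "Widget", "Hardware"), (2, "Gadget", "Hardware"), (3, "Manual", "Books")])
conn.executemany("INSERT INTO dim_region VALUES (?, ?)", [(1, "East"), (2, "West")])
conn.executemany("INSERT INTO dim_quarter VALUES (?, ?, ?)", [(1, "Q1", 2016), (2, "Q2", 2016)])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                 [(1, 1, 1, 100.0), (1, 2, 1, 80.0), (2, 1, 2, 50.0), (3, 2, 2, 20.0)])

# A typical star join: the fact table joined to the dimensions it is grouped by.
rows = conn.execute("""
    SELECT p.category, r.name, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_region  AS r ON r.region_id  = f.region_id
    GROUP BY p.category, r.name
    ORDER BY p.category, r.name
""").fetchall()
```

In a snowflake variant, `category` would presumably move out of `dim_product` into its own table referenced by a key, at the cost of one more join in the query.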
  • 21. Multi-dimensional Schemas (4)
      • Snowflake schema: a variation of the star schema in which the dimension tables are organized into a hierarchy by normalizing them.
      • Figure 29.8: A snowflake schema.
  • 22. Multi-dimensional Schemas (5)
      • Fact constellation: a set of fact tables that share some dimension tables. However, fact constellations limit the possible queries for the warehouse.
      • The example shows the Product dimension table being shared by two fact tables.
      • Figure 29.9: A fact constellation.
  • 23. Multi-dimensional Schemas (6)
      • Indexing
          • Data warehouses also use indexing to support high-performance access.
          • A technique called bitmap indexing constructs a bit vector for each value in the domain being indexed.
          • Bitmap indexing works very well for domains of low cardinality (see the example of using a bitmap index in Section 19.8).
      • Master Data Management (MDM)
          • The purpose of MDM is to define the standards, processes, policies, and governance related to the critical data elements and entities of the organization.
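A minimal sketch of the bitmap indexing idea follows, using Python integers as bit vectors: one vector per distinct value, with bit i set when row i holds that value. The column data and function names are hypothetical. Combining vectors with `&` and `|` shows why the technique suits low-cardinality domains: there are only a few vectors, and multi-condition filters reduce to bitwise operations.

```python
def build_bitmap_index(values):
    """Return {distinct value: bit vector}; bit i is set when row i holds that value."""
    index = {}
    for row, value in enumerate(values):
        index[value] = index.get(value, 0) | (1 << row)
    return index

def rows_of(bits):
    """Decode a bit vector back into an ascending list of row numbers."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]

# Two hypothetical low-cardinality columns of the same six-row table.
region = ["East", "West", "East", "North", "West", "East"]
status = ["open", "open", "closed", "open", "closed", "open"]

region_idx = build_bitmap_index(region)
status_idx = build_bitmap_index(status)

# WHERE region = 'East' AND status = 'open'
east_and_open = rows_of(region_idx["East"] & status_idx["open"])
# WHERE region = 'East' OR region = 'North'
east_or_north = rows_of(region_idx["East"] | region_idx["North"])
```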
  • 24. Building a Data Warehouse (1)
      • The builders of a data warehouse should take a broad view of the anticipated use of the warehouse.
          • The design should harmonize dimensions across the whole enterprise and multiple data sources.
          • The design should support ad hoc querying.
          • An appropriate schema should be chosen that reflects the anticipated usage and the business model of the organization.
  • 25. Building a Data Warehouse (2)
      • The design of a data warehouse involves the following steps:
          • Acquiring data for the warehouse.
          • Ensuring that data storage meets the query requirements efficiently.
          • Giving full consideration to the environment in which the data warehouse resides.
  • 26. Data Acquisition (1)
      • Acquisition of data for the warehouse:
          • The data must be extracted from multiple, heterogeneous sources.
          • The data must be formatted for consistency within the warehouse.
          • The data must be cleaned to ensure validity.
              • The cleaning process is difficult to automate.
              • Back flushing refers to upgrading the source data by returning cleaned data.
  • 27. Data Acquisition (2)
      • Acquisition of data for the warehouse (contd.):
          • The data must be fitted into the data model of the warehouse. Data may have to be converted from its source model into a multi-dimensional format.
          • The data must be loaded into the warehouse.
              • A proper design for the refresh policy should be considered.
              • Data may come from different systems, language areas, and time zones.
              • The order of loading is critical, and semantic constraints must be obeyed.
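The acquisition steps above (extract from heterogeneous sources, format consistently, clean, then load in the right order) might look roughly like the following sketch. The two source feeds, the validation rules, and the table names are all assumptions made for illustration; real cleaning logic is far more involved, which is the slide's point about automation being difficult.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical raw feeds from two source systems in different time zones.
source_a = [
    {"id": "A1", "amount": "100.50", "sold_at": "2016-03-01T09:00:00+01:00", "region": " east "},
    {"id": "A2", "amount": "n/a", "sold_at": "2016-03-01T10:00:00+01:00", "region": "East"},
]
source_b = [
    {"id": "B1", "amount": "75", "sold_at": "2016-03-01T06:30:00-05:00", "region": "WEST"},
]

def transform(record):
    """Clean and format one raw record; return None if it fails validation."""
    try:
        sale_id = record["id"]
        amount = float(record["amount"])
        sold_at = datetime.fromisoformat(record["sold_at"])
        region = record["region"].strip().title()
    except (KeyError, ValueError):
        return None
    if amount < 0 or sold_at.tzinfo is None:
        return None
    # Normalize every timestamp to UTC so sources in different zones are comparable.
    return (sale_id, region, amount, sold_at.astimezone(timezone.utc).isoformat())

def load(conn, rows):
    """Load dimension rows before fact rows so the foreign key is always satisfied."""
    with conn:
        for sale_id, region, amount, sold_at_utc in rows:
            conn.execute("INSERT OR IGNORE INTO dim_region (name) VALUES (?)", (region,))
            conn.execute("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                         (sale_id, region, amount, sold_at_utc))

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE dim_region (name TEXT PRIMARY KEY);
    CREATE TABLE fact_sales (
        sale_id     TEXT PRIMARY KEY,
        region      TEXT NOT NULL REFERENCES dim_region(name),
        amount      REAL NOT NULL,
        sold_at_utc TEXT NOT NULL
    );
""")

cleaned = [row for row in map(transform, source_a + source_b) if row is not None]
load(conn, cleaned)
loaded = conn.execute("SELECT * FROM fact_sales ORDER BY sale_id").fetchall()
```

With foreign keys enforced, reversing the two inserts in `load` would fail for any region not yet present, which is one concrete reading of "order of loading is critical".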
  • 28. Storing Data in a Data Warehouse
      • Storing the data according to the data model of the warehouse
      • Creating and maintaining required data structures
      • Creating and maintaining appropriate access paths
      • Providing for time-variant data as new data are added
      • Supporting the updating of warehouse data
      • Refreshing the data
      • Purging data
  • 29. DW Design Considerations
      • Usage projections
      • The fit of the data model
      • Characteristics of available resources
      • Design of the metadata component
      • Modular component design
      • Design for manageability and change
      • Considerations of distributed and parallel architecture
          • Distributed DWs: replication, partitioning, communication, and consistency issues
          • Federated DWs: decentralized federation of autonomous DWs
  • 30. Metadata Repositories
      • The metadata repository is a key data warehouse component. It includes both technical and business metadata.
          • Technical metadata covers details of acquisition, processing, storage structures, data descriptions, warehouse operations and maintenance, and access support.
          • Business metadata includes the relevant business rules and organizational details supporting the warehouse.
  • 31. Functionality of Data Warehouses
      • The pivot, roll-up, and drill-down operations were presented earlier.
      • Other functions include:
          • Intersection and union of indexes
          • SQL extensions for aggregation
          • Advanced join techniques for star schemas
      • Among OLAP approaches, we have:
          • ROLAP (Relational OLAP): keeps the DW as a relational database.
          • MOLAP (Multidimensional OLAP): OLAP on the multidimensional model.
          • HOLAP (Hybrid OLAP): combines the two. HOLAP can “drill through” the data to the underlying relations.
  • 32. Data Warehouse vs. Data Views
      • Views and data warehouses are alike in that both hold read-only extracts from databases.
      • However, data warehouses differ from views in the following ways:
          • Data warehouses exist as persistent storage instead of being materialized on demand.
          • Data warehouses are not just relational but multi-dimensional, with multiple levels of aggregation.
          • Data warehouses can be indexed for optimal performance. Views cannot be indexed directly.
          • Data warehouses provide specific support of functionality.
          • Data warehouses deal with large volumes of integrated data that is generally contained in more than one database.
          • Data warehouses bring in data periodically from multiple sources via a complex ETL process. Views do not go through cleaning and pruning.
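The persistence and indexing differences listed above can be seen in a small SQLite sketch. The `orders` table and both summary objects are hypothetical; the persistent table stands in for a warehouse only in the narrow sense that it is a stored, indexable copy filled at load time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL);
    INSERT INTO orders VALUES (1, 'East', 100), (2, 'West', 40);

    -- A view is a stored query, recomputed from orders each time it is read.
    CREATE VIEW v_sales_by_region AS
        SELECT region, SUM(amount) AS total FROM orders GROUP BY region;

    -- A warehouse-style table is a persistent copy filled at load time,
    -- and unlike the view it can carry its own index.
    CREATE TABLE dw_sales_by_region AS
        SELECT region, SUM(amount) AS total FROM orders GROUP BY region;
    CREATE INDEX idx_dw_region ON dw_sales_by_region (region);
""")

# A new operational transaction arrives after the warehouse load.
conn.execute("INSERT INTO orders VALUES (3, 'East', 60)")

query = "SELECT total FROM {} WHERE region = 'East'"
view_total = conn.execute(query.format("v_sales_by_region")).fetchone()[0]
dw_total = conn.execute(query.format("dw_sales_by_region")).fetchone()[0]
```

After the insert, the view reflects the new order while the persistent table still holds the value from its last load; it changes only at the next periodic refresh.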
  • 33. Difficulties of Implementing Data Warehouses
      • Lead time is huge in building a data warehouse.
          • It can take years to build and efficiently maintain a data warehouse.
      • Both the quality and consistency of data and master data management are major concerns.
      • Usage projections must be revised regularly to meet current requirements.
      • The data warehouse should be designed to accommodate the addition and attrition of data sources without major redesign.
      • The DW schema and acquisition component must handle changes in data sources in terms of structure and content.
      • Administration of a data warehouse requires far broader skills than are needed for a traditional database.
  • 34. Open Issues in Data Warehousing
      • Data cleaning, indexing, partitioning, and views could be given new attention from the perspective of data warehousing.
      • Automation of:
          • data acquisition
          • data quality management
          • selection and construction of access paths and structures
          • self-maintainability
          • functionality and performance optimization
      • Incorporating domain and business rules more intelligently into the warehouse creation and maintenance process.
      • Creating appropriate teams of technical experts who also understand the business.
  • 35. Future of Data Warehousing
      • Businesses are becoming dissatisfied with traditional data warehousing techniques and technologies.
      • New analytic requirements are driving new analytic appliances such as IBM Netezza, EMC Greenplum, SAP Hana, and ParAccel.
      • Big data analytics have driven Hadoop and other specialized databases, such as graph and key-value stores, into the next generation of data warehousing (see Chapter 25 for Big Data technology based on Hadoop).
      • Data virtualization platforms such as the one from Cisco will enable Logical Data Warehouses to be built in the future. See the description of Cisco’s Data Virtualization Platform at http://www.compositesw.com/products-services/data-virtualization-platform/
  • 36. Future of Data Warehousing
      • Gartner, a prestigious source of guidance for industry, reports: “the Logical Data Warehouse (LDW) is a new data management architecture for analytics which combines the strengths of traditional repository warehouses with alternative data management and access strategy. The LDW will form a new best practices by the end of 2015.”
      • The term “Logical Data Warehouse” marks a clear demarcation between centralized repository approaches and managed data services for analytics. See: https://www.gartner.com/doc/2057915/understanding-logical-data-warehouse-emerging
  • 37. Recap
      • Purpose of Data Warehousing
      • Introduction, Definitions, and Terminology
      • Comparison with Traditional Databases
      • Characteristics of Data Warehouses
      • Classification of Data Warehouses
      • Data Modeling for Data Warehouses
      • Multi-dimensional Schemas
      • Building a Data Warehouse
      • Functionality of a Data Warehouse
      • Data Warehouse vs. Data Views
      • Implementation difficulties, open issues, and future