Decision Support, Data Warehousing, and OLAP

Decision Support and OLAP
• Information technology to help the knowledge worker (executive, manager, analyst) make faster and better decisions.
  • What were the sales volumes by region and product category for the last year?
  • How did the share price of computer manufacturers correlate with quarterly profits over the past 10 years?
  • Which orders should we fill to maximize revenues?
  • Will a 10% discount increase sales volume sufficiently?
  • Which of two new medications will result in the best outcome: higher recovery rate and shorter hospital stay?
• On-Line Analytical Processing (OLAP) is an element of decision support systems (DSS).

Evolution
• 60’s: Batch reports
  • hard to find and analyze information
  • inflexible and expensive, reprogram for every new request
• 70’s: Terminal-based DSS and EIS (executive information systems)
  • still inflexible, not integrated with desktop tools
• 80’s: Desktop data access and analysis tools
  • query tools, spreadsheets, GUIs
  • easier to use, but only access operational databases
• 90’s: Data warehousing with integrated OLAP engines and tools

OLTP vs. OLAP
• OLTP (Online Transaction Processing) systems process transactions (insert, update, delete) directly over the network.
• OLAP (Online Analytical Processing) systems are built to help with planning, problem solving, and decision support.

OLTP vs. OLAP
Item            | OLTP                                            | OLAP
Data source     | Operational data; OLTP is the original source.  | Consolidated data; OLAP data comes from OLTP.
Data function   | Controlling and running the main tasks.         | Planning, problem solving, and decision support.
Data shown      | Ongoing business processes.                     | A variety of business activities.
Query used      | Simple queries.                                 | Complex queries.
Speed of access | Faster.                                         | Depends on the data involved; can be faster using indexes.
Space required  | Smaller.                                        | Larger; needs more indexing than OLTP.
Database design | Normalized, with many tables.                   | De-normalized, with fewer tables, using star / snowflake schemas.
User            | IT professional                                 | Knowledge worker

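The "simple query" versus "complex query" row can be made concrete with a small sketch. It uses Python's built-in sqlite3 module; the orders table, its columns, and its rows are hypothetical and exist only for illustration, not as part of the original slides.

```python
import sqlite3

# Hypothetical schema and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        order_no INTEGER PRIMARY KEY,
        region   TEXT,
        category TEXT,
        year     INTEGER,
        amount   REAL
    );
    INSERT INTO orders VALUES
        (1, 'West', 'Beverages', 1995, 120.0),
        (2, 'West', 'Dairy',     1995,  80.0),
        (3, 'East', 'Beverages', 1995,  50.0),
        (4, 'West', 'Beverages', 1996, 200.0);
""")

# OLTP-style work: a short transaction that touches a single row by key.
with conn:
    conn.execute("UPDATE orders SET amount = 125.0 WHERE order_no = 1")
oltp_row = conn.execute(
    "SELECT order_no, amount FROM orders WHERE order_no = 1"
).fetchone()

# OLAP-style work: scan and aggregate many rows along several dimensions.
olap_rows = conn.execute("""
    SELECT region, category, SUM(amount) AS total
    FROM orders
    WHERE year = 1995
    GROUP BY region, category
    ORDER BY region, category
""").fetchall()

print(oltp_row)
print(olap_rows)
```

The first statement pair reads and writes one row through its key, while the second has to visit every 1995 row to build the totals, which is the kind of workload a separate warehouse is meant to absorb.
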
Data Warehouse
• A decision support database that is maintained separately from the organization’s operational databases.
• A data warehouse is a
  • subject-oriented,
  • integrated,
  • time-varying,
  • non-volatile
  collection of data that is used primarily in organizational decision making.

Why Separate Data Warehouse?
• Performance
  • Operational databases are designed and tuned for known transactions and workloads.
  • Complex OLAP queries would degrade performance for operational transactions.
  • Special data organization, access, and implementation methods are needed for multidimensional views and queries.

Why Separate Data Warehouse?
• Function
  • Missing data: Decision support requires historical data, which operational DBs do not typically maintain.
  • Data consolidation: Decision support requires consolidation (aggregation, summarization) of data from many heterogeneous sources: operational DBs, external sources.
  • Data quality: Different sources typically use inconsistent data representations, codes, and formats, which have to be reconciled.

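As a rough illustration of the data quality point, the sketch below reconciles two hypothetical sources that disagree on customer-code case, state spelling, date format, and numeric type. The record layout, the STATE_CODES lookup, and the function names are assumptions made for this example.

```python
from datetime import datetime

# Hypothetical records from two sources that encode the same facts differently.
source_a = [{"cust": "C001", "state": "CA", "date": "01/15/1996", "amount": "100.50"}]
source_b = [{"cust": "c001", "state": "California", "date": "1996-01-16", "amount": 75}]

# Assumed lookup table mapping each source's state spelling to one code.
STATE_CODES = {"CA": "CA", "California": "CA"}

def parse_date(text):
    """Try each known source format and return an ISO date string."""
    for fmt in ("%m/%d/%Y", "%Y-%m-%d"):
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")

def reconcile(record):
    """Map one source record onto a single warehouse representation."""
    return {
        "cust": record["cust"].upper(),
        "state": STATE_CODES[record["state"]],
        "date": parse_date(record["date"]),
        "amount": float(record["amount"]),
    }

cleaned = [reconcile(r) for r in source_a + source_b]
print(cleaned)
```

Unknown state spellings or date formats raise an error here instead of being guessed, on the assumption that a bad mapping is worse in a warehouse than a rejected row.
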
Data Warehousing Market
• Hardware: servers, storage, clients
• Warehouse DBMS
• Tools
• Market growing from
  • $2B in 1995 to $8B in 1998 (Meta Group)
  • $1.5B today to $6.9B in 1999 (Gartner Group)
• Systems integration and consulting
• Already deployed in many industries: manufacturing, retail, financial, insurance, transportation, telecom., utilities, healthcare.

Data Warehousing Architecture
[Figure: architecture diagram. Labels: Operational DB, External Sources; Extract, Transform, Load, Refresh; Monitoring and Administration; Metadata Repository; Data Marts; Serve; OLAP servers; Analysis, Query/Reporting, Data Mining.]

Three-Tier Architecture
• Warehouse database server
  • Almost always a relational DBMS; rarely flat files
• OLAP servers
  • Relational OLAP (ROLAP): extended relational DBMS that maps operations on multidimensional data to standard relational operations.
  • Multidimensional OLAP (MOLAP): special-purpose server that directly implements multidimensional data and operations.
• Clients
  • Query and reporting tools
  • Analysis tools
  • Data mining tools (e.g., trend analysis, prediction)

Design and Operational Process
• Define architecture. Do capacity planning.
• Integrate DB and OLAP servers, storage, and client tools.
• Design warehouse schema, views.
• Design physical warehouse organization: data placement, partitioning, access methods.
• Connect sources: gateways, ODBC drivers, wrappers.
• Design and implement scripts for data extract, load, and refresh.
• Define metadata and populate repository.
• Design and implement end-user applications.
• Roll out warehouse and applications.
• Monitor the warehouse.

OLAP for Decision Support
• Goal of OLAP is to support ad-hoc querying for the business analyst
• Business analysts are familiar with spreadsheets
• Extend spreadsheet analysis model to work with warehouse data
  • Large data set
  • Semantically enriched to understand business terms (e.g., time, geography)
  • Combined with reporting features
• Multidimensional view of data is the foundation of OLAP

Multidimensional Data Model
• Database is a set of facts (points) in a multidimensional space
• A fact has a measure dimension
  • quantity that is analyzed, e.g., sale, budget
• A set of dimensions on which data is analyzed
  • e.g., store, product, date associated with a sale amount

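One way to picture facts as points in a multidimensional space is a mapping from dimension coordinates to a measure. The following is a minimal sketch; the Sale type, the store and product names, and the amounts are made up for illustration.

```python
from collections import namedtuple

# Dimensions (store, product, date) locate a fact; amount is the measure.
Sale = namedtuple("Sale", ["store", "product", "date", "amount"])

# Hypothetical facts, invented for this example.
facts = [
    Sale("Palo Alto", "Juice", "3/1", 10),
    Sale("Palo Alto", "Cola", "3/1", 47),
    Sale("Palo Alto", "Milk", "3/2", 30),
    Sale("San Jose", "Cream", "3/2", 12),
]

# Index by dimension coordinates: only cells that hold a fact are stored.
cube = {(f.store, f.product, f.date): f.amount for f in facts}

def lookup(store, product, date):
    """Return the measure at the given coordinates, or None if the cell is empty."""
    return cube.get((store, product, date))

print(lookup("Palo Alto", "Cola", "3/1"))
print(lookup("San Jose", "Juice", "3/1"))
```

Most coordinate combinations have no sale, so a dictionary keyed by coordinates stores far fewer entries than the full product of the dimension sizes.
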
Multidimensional Data Model
• Dimensions form a sparsely populated coordinate system
• Each dimension has a set of attributes
  • e.g., owner, city, and county of store
• Attributes of a dimension may be related by partial order
  • Hierarchy: e.g., street -> county -> city
  • Lattice: e.g., date -> month -> year, date -> week -> year

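The date lattice can be sketched with Python's datetime module: the same daily facts roll up along two different paths, and weeks do not nest inside months. The sales figures are hypothetical, and ISO week numbering is an assumption chosen for this example rather than something the slides specify.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sales keyed by date.
daily_sales = {
    date(1996, 1, 15): 100,
    date(1996, 1, 16): 50,
    date(1996, 1, 31): 30,
    date(1996, 2, 1): 70,
}

def to_month(d):
    """date -> month, identified together with its year."""
    return (d.year, d.month)

def to_week(d):
    """date -> ISO week, identified together with its ISO year."""
    iso = d.isocalendar()
    return (iso[0], iso[1])

def roll_up(sales, level):
    """Sum the measure after mapping each date to a coarser attribute."""
    totals = defaultdict(int)
    for day, amount in sales.items():
        totals[level(day)] += amount
    return dict(totals)

by_month = roll_up(daily_sales, to_month)
by_week = roll_up(daily_sales, to_week)
print(by_month)
print(by_week)
```

January 31 and February 1 fall in the same ISO week but in different months, so neither the week totals nor the month totals can be derived from the other. That is why these attributes form a lattice and not a single hierarchy.
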
Multidimensional Data
[Figure: data cube of sales volume as a function of time, city, and product. Product axis: Juice, Cola, Milk, Cream. Date axis: 3/1, 3/2, 3/3, 3/4. Sample cell values: 10, 47, 30, 12.]

Operations in Multidimensional Data Model
• Aggregation (roll-up)
  • dimension reduction: e.g., total sales by city
  • summarization over aggregate hierarchy: e.g., total sales by city and year -> total sales by region and by year
• Selection (slice) defines a subcube
  • e.g., sales where city = Palo Alto and date = 1/15/96
• Navigation to detailed data (drill-down)
  • e.g., (sales - expense) by city, top 3% of cities by average income
• Visualization Operations (e.g., Pivot)

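The operations above can be sketched over a small list of fact rows. This is an illustrative sketch only: the rows, the region assignments, and the aggregate helper are assumptions, not part of the original material.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, city, year, sales).
facts = [
    ("West", "Palo Alto", 1995, 100),
    ("West", "Palo Alto", 1996, 150),
    ("West", "San Jose", 1996, 80),
    ("East", "Boston", 1996, 60),
]

def aggregate(rows, key):
    """Group rows by key(row) and sum the sales measure."""
    totals = defaultdict(int)
    for row in rows:
        totals[key(row)] += row[3]
    return dict(totals)

# Roll-up by dimension reduction: drop year, keep city.
sales_by_city = aggregate(facts, lambda r: r[1])

# Roll-up along the hierarchy: (city, year) -> (region, year).
sales_by_city_year = aggregate(facts, lambda r: (r[1], r[2]))
sales_by_region_year = aggregate(facts, lambda r: (r[0], r[2]))

# Slice: fix dimension values to select a subcube.
palo_alto_1996 = [r for r in facts if r[1] == "Palo Alto" and r[2] == 1996]

# Drill-down: from a region total back to the cities that make it up.
west_1996_detail = aggregate(
    [r for r in facts if r[0] == "West" and r[2] == 1996], lambda r: r[1]
)

print(sales_by_region_year)
print(west_1996_detail)
```

Roll-up and drill-down use the same grouping helper with a coarser or finer key, which reflects that they are inverse moves along a dimension's hierarchy.
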
A Visual Operation: Pivot (Rotate)
[Figure: the cube rotated so that Product (Juice, Cola, Milk, Cream) and Date (3/1, 3/2, 3/3, 3/4) swap axes. Sample cell values: 10, 47, 30, 12.]

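A pivot can be sketched as swapping the row and column dimensions of a two-dimensional view. The cell values below are hypothetical, since the figure residue does not say which number belongs to which cell, and pivot is a name chosen for this example.

```python
# Hypothetical cell values: rows are products, columns are dates.
table = {
    "Juice": {"3/1": 10, "3/2": 0},
    "Cola": {"3/1": 47, "3/2": 30},
}

def pivot(nested):
    """Swap the row and column dimensions of a dict-of-dicts table."""
    rotated = {}
    for row_key, columns in nested.items():
        for col_key, value in columns.items():
            rotated.setdefault(col_key, {})[row_key] = value
    return rotated

by_date = pivot(table)
print(by_date)
```

Pivoting changes only how the data is presented: applying pivot twice returns the original table.
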
Approaches to OLAP Servers
• Relational OLAP (ROLAP)
  • Relational and specialized relational DBMS to store and manage warehouse data
  • OLAP middleware to support missing pieces
    – Optimize for each DBMS backend
    – Aggregation navigation logic
    – Additional tools and services
  • Example: Microstrategy, MetaCube (Informix)

Approaches to OLAP Servers
• Multidimensional OLAP (MOLAP)
  • Array-based storage structures
  • Direct access to array data structures
  • Example: Essbase (Arbor), Accumate (Kenan)
• Domain-specific enrichment

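To illustrate array-based storage with direct access, the sketch below keeps a small cube in one flat array and computes each cell's position from its dimension members. It is a toy model under assumed names and members, not a description of how Essbase or any other product lays out data.

```python
# Dimension members in a fixed order, so each one maps to an array position.
products = ["Juice", "Cola", "Milk", "Cream"]
dates = ["3/1", "3/2", "3/3", "3/4"]
cities = ["Palo Alto", "San Jose"]  # hypothetical members

product_pos = {name: i for i, name in enumerate(products)}
date_pos = {name: i for i, name in enumerate(dates)}
city_pos = {name: i for i, name in enumerate(cities)}

# One flat, dense array holds every cell, populated or not.
cells = [0] * (len(products) * len(dates) * len(cities))

def offset(product, date, city):
    """Row-major offset of a cell: computed from positions, not searched for."""
    return (product_pos[product] * len(dates) + date_pos[date]) * len(cities) + city_pos[city]

def put(product, date, city, value):
    cells[offset(product, date, city)] = value

def get(product, date, city):
    return cells[offset(product, date, city)]

put("Cola", "3/1", "Palo Alto", 47)
put("Cream", "3/4", "San Jose", 12)
print(get("Cola", "3/1", "Palo Alto"))
```

The trade-off in this sketch is that lookups need no index or join, but the array reserves space for all 32 cells even though only two are populated.
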
Relational DBMS as Warehouse Server
• Schema design
• Specialized scan, indexing and join techniques
• Handling of aggregate views (querying and materialization)
• Supporting query language extensions beyond SQL
• Complex query processing and optimization
• Data partitioning and parallelism

Warehouse Database Schema
• ER design techniques not appropriate
• Design should reflect multidimensional view
  • Star Schema
  • Snowflake Schema
  • Fact Constellation Schema

Example of a Star Schema
[Figure: star schema. A central fact table references the dimension tables listed below.]
Fact Table: OrderNO, SalespersonID, CustomerNO, ProdNo, DateKey, CityName, Quantity, TotalPrice
Order: Order No, Order Date
Customer: Customer No, Customer Name, Customer Address, City
Salesperson: SalespersonID, SalespersonName, City, Quota
Product: ProductNO, ProdName, ProdDescr, Category, CategoryDescription, UnitPrice
Date: DateKey, Date
City: CityName, State, Country

Star Schema
• A single fact table and a single table for each dimension
• Every fact points to one tuple in each of the dimensions and has additional attributes
• Does not capture hierarchies directly
• Generated keys are used for performance and maintenance reasons
• Fact constellation: Multiple fact tables that share many dimension tables
  • Example: Projected expense and the actual expense may share dimensional tables

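A star schema query typically joins the fact table to each dimension it needs exactly once. The sketch below builds a cut-down, hypothetical version of such a schema with Python's sqlite3 module; the table names, columns, and rows are simplified assumptions and not the full schema from the example figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Two dimension tables, one row per member, hierarchy levels kept inline.
    CREATE TABLE product (prod_no INTEGER PRIMARY KEY, prod_name TEXT, category TEXT);
    CREATE TABLE city (city_name TEXT PRIMARY KEY, state TEXT, country TEXT);

    -- Fact table: one foreign key per dimension plus the measures.
    CREATE TABLE sales_fact (
        prod_no     INTEGER REFERENCES product(prod_no),
        city_name   TEXT REFERENCES city(city_name),
        quantity    INTEGER,
        total_price REAL
    );

    -- Hypothetical rows, invented for illustration.
    INSERT INTO product VALUES (1, 'Cola', 'Beverages'), (2, 'Milk', 'Dairy');
    INSERT INTO city VALUES ('Palo Alto', 'CA', 'USA'), ('Boston', 'MA', 'USA');
    INSERT INTO sales_fact VALUES
        (1, 'Palo Alto', 10, 15.0),
        (2, 'Palo Alto', 4, 8.0),
        (1, 'Boston', 6, 9.0);
""")

# One join per dimension, then group by dimension attributes.
star_rows = conn.execute("""
    SELECT c.state, p.category, SUM(f.total_price) AS revenue
    FROM sales_fact AS f
    JOIN product AS p ON p.prod_no = f.prod_no
    JOIN city AS c ON c.city_name = f.city_name
    GROUP BY c.state, p.category
    ORDER BY c.state, p.category
""").fetchall()
print(star_rows)
```

Because state and country sit directly in the city table, grouping by state needs no further join. The cost is that the city-to-state relationship is repeated in every city row and is not modeled as its own table.
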
Example of a Snowflake Schema
[Figure: snowflake schema. The fact table references dimension tables, some of which are normalized into further tables.]
Fact Table: OrderNO, SalespersonID, CustomerNO, ProdNo, DateKey, CityName, Quantity, TotalPrice
Order: Order No, Order Date
Customer: Customer No, Customer Name, Customer Address, City
Salesperson: SalespersonID, SalespersonName, City, Quota
Product: ProductNO, ProdName, ProdDescr, Category, UnitPrice
Category: CategoryName, CategoryDescr
Date: DateKey, Date, Month
Month: Month, Year
Year: Year
City: CityName, StateName
State: StateName, Country

Snowflake Schema
• Represent dimensional hierarchy directly by normalizing the dimension tables
• Easy to maintain
• Saves storage, but it is alleged that it reduces effectiveness of browsing (Kimball)
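Normalizing a dimension can be sketched by splitting a city dimension into city and state tables, so the hierarchy becomes explicit at the price of an extra join. The sqlite3 schema and rows below are hypothetical and chosen only to show that trade-off.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The city -> state -> country hierarchy is split into separate tables.
    CREATE TABLE state (state_name TEXT PRIMARY KEY, country TEXT);
    CREATE TABLE city (
        city_name  TEXT PRIMARY KEY,
        state_name TEXT REFERENCES state(state_name)
    );
    CREATE TABLE sales_fact (
        city_name   TEXT REFERENCES city(city_name),
        total_price REAL
    );

    -- Hypothetical rows, invented for illustration.
    INSERT INTO state VALUES ('CA', 'USA'), ('MA', 'USA');
    INSERT INTO city VALUES ('Palo Alto', 'CA'), ('San Jose', 'CA'), ('Boston', 'MA');
    INSERT INTO sales_fact VALUES ('Palo Alto', 15.0), ('San Jose', 5.0), ('Boston', 9.0);
""")

# Reaching a higher hierarchy level now takes one join per level.
snowflake_rows = conn.execute("""
    SELECT s.country, s.state_name, SUM(f.total_price) AS revenue
    FROM sales_fact AS f
    JOIN city AS c ON c.city_name = f.city_name
    JOIN state AS s ON s.state_name = c.state_name
    GROUP BY s.country, s.state_name
    ORDER BY s.state_name
""").fetchall()
print(snowflake_rows)
```

Each state's country is stored once, which is where the storage saving comes from, while the longer join path is one plausible reading of the browsing concern attributed to Kimball.
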

KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsKuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
 
The Python for beginners. This is an advance computer language.
The Python for beginners. This is an advance computer language.The Python for beginners. This is an advance computer language.
The Python for beginners. This is an advance computer language.
 
Mechanical Engineering on AAI Summer Training Report-003.pdf
Mechanical Engineering on AAI Summer Training Report-003.pdfMechanical Engineering on AAI Summer Training Report-003.pdf
Mechanical Engineering on AAI Summer Training Report-003.pdf
 
Transformers design and coooling methods
Transformers design and coooling methodsTransformers design and coooling methods
Transformers design and coooling methods
 
john krisinger-the science and history of the alcoholic beverage.pptx
john krisinger-the science and history of the alcoholic beverage.pptxjohn krisinger-the science and history of the alcoholic beverage.pptx
john krisinger-the science and history of the alcoholic beverage.pptx
 
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURSCompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
 
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
 
22CYT12-Unit-V-E Waste and its Management.ppt
22CYT12-Unit-V-E Waste and its Management.ppt22CYT12-Unit-V-E Waste and its Management.ppt
22CYT12-Unit-V-E Waste and its Management.ppt
 
原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样
原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样
原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样
 
artificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptxartificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptx
 

05_Decision Support and OLAP.pdf

  • 1. Decision Support, Data Warehousing, and OLAP
  • 2. Decision Support and OLAP
      Information technology to help the knowledge worker (executive, manager, analyst) make faster and better decisions.
  • 3. Decision Support and OLAP
      • What were the sales volumes by region and product category for the last year?
      • How did the share price of computer manufacturers correlate with quarterly profits over the past 10 years?
      • Which orders should we fill to maximize revenues?
      • Will a 10% discount increase sales volume sufficiently?
      • Which of two new medications will result in the best outcome: higher recovery rate and shorter hospital stay?
      On-Line Analytical Processing (OLAP) is an element of decision support systems (DSS).
  • 4. Evolution
      60’s: Batch reports
        • hard to find and analyze information
        • inflexible and expensive, reprogram every new request
      70’s: Terminal-based DSS and EIS (executive information systems)
        • still inflexible, not integrated with desktop tools
      80’s: Desktop data access and analysis tools
        • query tools, spreadsheets, GUIs
        • easier to use, but only access operational databases
      90’s: Data warehousing with integrated OLAP engines and tools
  • 5. OLTP vs. OLAP
      OLTP (Online Transaction Processing Systems) is a direct transactional processing system (insert, update, delete) through the network.
      OLAP (Online Analytical Processing Systems) is a system built to help in planning, problem solving, and decision support.
  • 6. OLTP vs. OLAP
      Item | OLTP | OLAP
      Data source | Operational data; OLTP is the original source of the data. | Consolidated data; OLAP data comes from OLTP.
      Data function | Controlling and running the main tasks. | Planning, problem solving, and supporting decisions.
      Data shown | Ongoing business processes. | A variety of business activities.
      Queries used | Simple queries. | Complex queries.
  • 7. OLTP vs. OLAP
      Item | OLTP | OLAP
      Speed of access | Faster. | Depends on the data involved; indexes can make it faster.
      Space required | Smaller. | Larger; needs more indexing than OLTP.
      Database design | Normalized, with many tables. | Denormalized, with fewer tables, using star or snowflake schemas.
      User | IT professional | Knowledge worker
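The contrast between simple OLTP statements and complex OLAP queries can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The `sales` table, its columns, and all values are assumptions made up for illustration; they do not come from the slides.

```python
import sqlite3

# Hypothetical single-table database; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "id INTEGER PRIMARY KEY, region TEXT, product TEXT, year INTEGER, amount REAL)"
)

rows = [
    ("West", "Cola", 1995, 100.0),
    ("West", "Milk", 1995, 40.0),
    ("East", "Cola", 1995, 70.0),
    ("East", "Cola", 1996, 90.0),
]

# OLTP-style work: a short transaction that inserts and updates individual rows.
with conn:
    conn.executemany(
        "INSERT INTO sales (region, product, year, amount) VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.execute("UPDATE sales SET amount = 45.0 WHERE id = 2")

# OLAP-style work: one query that scans and summarizes many rows at once.
olap_result = conn.execute(
    "SELECT region, product, SUM(amount) FROM sales "
    "WHERE year = 1995 GROUP BY region, product ORDER BY region, product"
).fetchall()

print(olap_result)
```

The first part touches one row at a time by key; the second reads every 1995 row to answer a question like "sales volume by region and product".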
  • 8. Data Warehouse
      A decision support database that is maintained separately from the organization’s operational databases.
      A data warehouse is a
        • subject-oriented,
        • integrated,
        • time-varying,
        • non-volatile
      collection of data that is used primarily in organizational decision making.
  • 9. Why Separate Data Warehouse?
      Performance
        • Operational databases are designed and tuned for known transactions and workloads.
        • Complex OLAP queries would degrade performance for operational transactions.
        • Special data organization, access, and implementation methods are needed for multidimensional views and queries.
  • 10. Why Separate Data Warehouse?
      Function
        • Missing data: Decision support requires historical data, which operational DBs do not typically maintain.
        • Data consolidation: Decision support requires consolidation (aggregation, summarization) of data from many heterogeneous sources: operational DBs, external sources.
        • Data quality: Different sources typically use inconsistent data representations, codes, and formats, which have to be reconciled.
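As a rough illustration of the data quality point, the sketch below reconciles inconsistent country codes and date formats into one representation before loading. The code table, the list of accepted date formats, and the function names are all hypothetical; a real warehouse would derive them from its actual sources.

```python
from datetime import datetime

# Hypothetical mapping from source-specific spellings to one canonical code.
COUNTRY_CODES = {
    "us": "US",
    "usa": "US",
    "u.s.": "US",
    "united states": "US",
    "ca": "CA",
    "canada": "CA",
}

# Hypothetical set of date formats used by the different sources.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")


def normalize_country(raw):
    """Map a source-specific country spelling to the canonical code."""
    key = raw.strip().lower()
    if key not in COUNTRY_CODES:
        raise ValueError(f"unknown country code: {raw!r}")
    return COUNTRY_CODES[key]


def normalize_date(raw):
    """Try each known source format and return an ISO 8601 date string."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {raw!r}")


print(normalize_country(" U.S. "), normalize_date("1/15/1996"))
```

Rejecting unknown values instead of guessing keeps bad codes from silently entering the warehouse.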
  • 11. Data Warehousing Market
      Hardware: servers, storage, clients
      Warehouse DBMS
      Tools
      Market growing from
        • $2B in 1995 to $8B in 1998 (Meta Group)
        • $1.5B today to $6.9B in 1999 (Gartner Group)
      Systems integration and consulting
      Already deployed in many industries: manufacturing, retail, financial, insurance, transportation, telecom., utilities, healthcare.
  • 12. Data Warehousing Architecture
      [Architecture diagram. Labels: Operational DBs; External Sources; Extract, Transform, Load, Refresh; Metadata Repository; Monitoring and Administration; Data Marts; Serve; OLAP servers; Analysis; Query/Reporting; Data Mining]
  • 13. Three-Tier Architecture
      Warehouse database server
        • Almost always a relational DBMS; rarely flat files
      OLAP servers
        • Relational OLAP (ROLAP): extended relational DBMS that maps operations on multidimensional data to standard relational operations.
        • Multidimensional OLAP (MOLAP): special purpose server that directly implements multidimensional data and operations.
      Clients
        • Query and reporting tools
        • Analysis tools
        • Data mining tools (e.g., trend analysis, prediction)
  • 14. Design and Operational Process
      Define architecture. Do capacity planning.
      Integrate DB and OLAP servers, storage, and client tools.
      Design warehouse schema, views.
      Design physical warehouse organization: data placement, partitioning, access methods.
      Connect sources: gateways, ODBC drivers, wrappers.
      Design and implement scripts for data extract, load, and refresh.
      Define metadata and populate repository.
      Design and implement end-user applications.
      Roll out warehouse and applications.
      Monitor the warehouse.
  • 15. OLAP for Decision Support
      Goal of OLAP is to support ad-hoc querying for the business analyst
      Business analysts are familiar with spreadsheets
      Extend spreadsheet analysis model to work with warehouse data
        • Large data set
        • Semantically enriched to understand business terms (e.g., time, geography)
        • Combined with reporting features
      Multidimensional view of data is the foundation of OLAP
  • 16. Multidimensional Data Model
      Database is a set of facts (points) in a multidimensional space
      A fact has a measure dimension
        • quantity that is analyzed, e.g., sale, budget
      A set of dimensions on which data is analyzed
        • e.g., store, product, date associated with a sale amount
  • 17. Multidimensional Data Model
      Dimensions form a sparsely populated coordinate system
      Each dimension has a set of attributes
        • e.g., owner, city, and county of store
      Attributes of a dimension may be related by partial order
        • Hierarchy: e.g., street -> county -> city
        • Lattice: e.g., date -> month -> year, date -> week -> year
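One way to picture facts as points in a sparsely populated coordinate system is a dictionary keyed by dimension coordinates, sketched below. The store names, products, dates, and sale amounts are invented for illustration, and the helper functions are only one possible encoding of the date lattice.

```python
from datetime import date

# Each fact is a point: (store, product, date) -> measure (sale amount).
# All coordinates and values are hypothetical.
facts = {
    ("Store A", "Cola", date(1996, 3, 1)): 47,
    ("Store A", "Milk", date(1996, 3, 2)): 30,
    ("Store B", "Juice", date(1996, 3, 1)): 10,
    ("Store B", "Cream", date(1996, 3, 4)): 12,
}

# The coordinate system is the cross product of the dimension members.
stores = {coords[0] for coords in facts}
products = {coords[1] for coords in facts}
dates = {coords[2] for coords in facts}
possible_cells = len(stores) * len(products) * len(dates)

# Sparsity: only a small fraction of the possible cells hold a fact.
density = len(facts) / possible_cells


# Lattice on the date dimension: a date rolls up to a month or to a week,
# and both of those roll up to a year.
def date_to_month(d):
    return (d.year, d.month)


def date_to_week(d):
    iso = d.isocalendar()
    return (iso[0], iso[1])  # (ISO year, ISO week number)


def month_to_year(month):
    return month[0]


def week_to_year(week):
    return week[0]


print(len(facts), possible_cells, density)
```

Storing only the populated points, rather than every cell, is what makes a sparse representation practical.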
  • 18. Multidimensional Data
      [Figure: Sales volume as a function of time, city and product. Product axis: Juice, Cola, Milk, Cream. Date axis: 3/1, 3/2, 3/3, 3/4. Sample cell values: 10, 47, 30, 12]
  • 19. Operations in Multidimensional Data Model
      Aggregation (roll-up)
        • dimension reduction: e.g., total sales by city
        • summarization over aggregate hierarchy: e.g., total sales by city and year -> total sales by region and by year
      Selection (slice) defines a subcube
        • e.g., sales where city = Palo Alto and date = 1/15/96
      Navigation to detailed data (drill-down)
        • e.g., (sales - expense) by city, top 3% of cities by average income
      Visualization operations (e.g., pivot)
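The operations on this slide can be sketched in plain Python over a small cube stored as a dictionary. The cube contents, the `REGION_OF` mapping, and the function names are assumptions for illustration, not part of the original material.

```python
from collections import defaultdict

# Hypothetical cube: (city, product, date) -> sales.
sales = {
    ("Palo Alto", "Cola", "1996-01-15"): 47,
    ("Palo Alto", "Milk", "1996-01-15"): 30,
    ("Palo Alto", "Cola", "1996-01-16"): 10,
    ("San Jose", "Cola", "1996-01-15"): 12,
    ("Boston", "Juice", "1996-01-16"): 20,
}
# Hypothetical city -> region hierarchy.
REGION_OF = {"Palo Alto": "West", "San Jose": "West", "Boston": "East"}


def roll_up(cube, key_fn):
    """Aggregate the measure over every cell that maps to the same key."""
    totals = defaultdict(int)
    for coords, value in cube.items():
        totals[key_fn(coords)] += value
    return dict(totals)


def slice_cube(cube, predicate):
    """Select the subcube whose coordinates satisfy the predicate."""
    return {c: v for c, v in cube.items() if predicate(c)}


def pivot(table):
    """Swap the two axes of a table keyed by (row, column)."""
    return {(col, row): v for (row, col), v in table.items()}


# Roll-up by dimension reduction: total sales by city.
by_city = roll_up(sales, lambda c: c[0])

# Roll-up along a hierarchy: city and date become region and year.
by_region_year = roll_up(sales, lambda c: (REGION_OF[c[0]], c[2][:4]))

# Slice: sales where city = Palo Alto and date = 1/15/96.
palo_alto_jan15 = slice_cube(
    sales, lambda c: c[0] == "Palo Alto" and c[2] == "1996-01-15"
)

# Drill-down: break the Palo Alto total into per-product detail.
palo_alto_by_product = roll_up(
    slice_cube(sales, lambda c: c[0] == "Palo Alto"), lambda c: (c[0], c[1])
)

# Pivot: view the city-by-product table as product-by-city.
product_by_city = pivot(roll_up(sales, lambda c: (c[0], c[1])))

print(by_city, by_region_year)
```

Roll-up and drill-down are the same aggregation run at a coarser or finer key, which is why one helper covers both.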
  • 20. A Visual Operation: Pivot (Rotate)
      [Figure: the sales cube rotated so that the Date and Product axes are exchanged. Products: Juice, Cola, Milk, Cream. Dates: 3/1, 3/2, 3/3, 3/4. Sample cell values: 10, 47, 30, 12]
  • 21. Approaches to OLAP Servers
      Relational OLAP (ROLAP)
        • Relational and specialized relational DBMS to store and manage warehouse data
        • OLAP middleware to support missing pieces
            – Optimize for each DBMS backend
            – Aggregation navigation logic
            – Additional tools and services
        • Example: Microstrategy, MetaCube (Informix)
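The ROLAP idea of mapping a multidimensional operation onto standard relational operations can be sketched as a small function that turns a roll-up request into a SQL GROUP BY query. This is a toy illustration using sqlite3, not how any of the named products work; the `sales` table, column names, and `roll_up_sql` helper are hypothetical.

```python
import sqlite3

# Dimension columns the hypothetical middleware is allowed to group by.
ALLOWED_DIMENSIONS = ("city", "product", "year")


def roll_up_sql(dimensions):
    """Translate a roll-up over the given dimensions into a GROUP BY query."""
    unknown = [d for d in dimensions if d not in ALLOWED_DIMENSIONS]
    if unknown:
        raise ValueError(f"unknown dimensions: {unknown}")
    cols = ", ".join(dimensions)
    return (
        f"SELECT {cols}, SUM(amount) FROM sales "
        f"GROUP BY {cols} ORDER BY {cols}"
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (city TEXT, product TEXT, year INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Palo Alto", "Cola", 1995, 10.0),
        ("Palo Alto", "Cola", 1996, 47.0),
        ("Palo Alto", "Milk", 1996, 30.0),
        ("San Jose", "Cola", 1996, 12.0),
    ],
)

# The same request at two levels of detail becomes two relational queries.
by_city = conn.execute(roll_up_sql(["city"])).fetchall()
by_city_year = conn.execute(roll_up_sql(["city", "year"])).fetchall()

print(by_city)
print(by_city_year)
```

Checking requested dimensions against a fixed list keeps the generated SQL limited to known column names.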
  • 22. Approaches to OLAP Servers
      Multidimensional OLAP (MOLAP)
        • Array-based storage structures
        • Direct access to array data structures
        • Example: Essbase (Arbor), Accumate (Kenan)
      Domain-specific enrichment
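Array-based storage with direct access can be sketched as a flat array whose cell position is computed from the dimension coordinates. The `DenseCube` class below is a hypothetical, simplified illustration (row-major layout, no compression of empty cells), not the storage format of any product named above.

```python
class DenseCube:
    """Flat array holding one cell per combination of dimension members."""

    def __init__(self, *dimensions):
        self.dimensions = [list(d) for d in dimensions]
        # Per-dimension lookup from member name to its position on the axis.
        self.index = [{m: i for i, m in enumerate(d)} for d in self.dimensions]
        size = 1
        for d in self.dimensions:
            size *= len(d)
        self.cells = [None] * size

    def offset(self, coords):
        """Row-major position of a cell, computed directly from its coordinates."""
        position = 0
        for lookup, dim, member in zip(self.index, self.dimensions, coords):
            position = position * len(dim) + lookup[member]
        return position

    def set(self, coords, value):
        self.cells[self.offset(coords)] = value

    def get(self, coords):
        return self.cells[self.offset(coords)]


# Hypothetical dimension members and sales value.
cube = DenseCube(
    ["Juice", "Cola", "Milk", "Cream"],
    ["3/1", "3/2", "3/3", "3/4"],
    ["Palo Alto", "San Jose"],
)
cube.set(("Cola", "3/2", "Palo Alto"), 47)

print(cube.offset(("Cola", "3/2", "Palo Alto")), cube.get(("Cola", "3/2", "Palo Alto")))
```

A lookup needs no search or join, only arithmetic on the coordinates; the cost is that every empty cell still occupies a slot.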
  • 23. Relational DBMS as Warehouse Server
      Schema design
      Specialized scan, indexing and join techniques
      Handling of aggregate views (querying and materialization)
      Supporting query language extensions beyond SQL
      Complex query processing and optimization
      Data partitioning and parallelism
  • 24. Warehouse Database Schema
      ER design techniques not appropriate
      Design should reflect multidimensional view
        • Star Schema
        • Snowflake Schema
        • Fact Constellation Schema
  • 25. Example of a Star Schema
      [Figure: star schema diagram with a central fact table and one table per dimension]
      Fact Table: OrderNO, SalespersonID, CustomerNO, ProdNo, DateKey, CityName, Quantity, TotalPrice
      Order: Order No, Order Date
      Customer: Customer No, Customer Name, Customer Address, City
      Salesperson: SalespersonID, SalespersonName, City, Quota
      Product: ProductNO, ProdName, ProdDescr, Category, CategoryDescription, UnitPrice
      Date: DateKey, Date
      City: CityName, State, Country
  • 26. Star Schema
      A single fact table and a single table for each dimension
      Every fact points to one tuple in each of the dimensions and has additional attributes
      Does not capture hierarchies directly
      Generated keys are used for performance and maintenance reasons
      Fact constellation: Multiple fact tables that share many dimension tables
        • Example: Projected expense and the actual expense may share dimensional tables
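A minimal sketch of a star schema, using sqlite3, may help make the structure concrete. It keeps only three of the dimensions from the example slide; the snake_case table and column names, the sample rows, and the query are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- One table per dimension; the hierarchy (city, state, country) is kept
    -- flat inside the dimension table rather than split into separate tables.
    CREATE TABLE product (prod_no INTEGER PRIMARY KEY, prod_name TEXT, category TEXT);
    CREATE TABLE city (city_name TEXT PRIMARY KEY, state TEXT, country TEXT);
    CREATE TABLE date_dim (date_key INTEGER PRIMARY KEY, calendar_date TEXT);

    -- A single fact table: each row points to one tuple in every dimension
    -- and carries the measures as additional attributes.
    CREATE TABLE fact (
        prod_no INTEGER REFERENCES product(prod_no),
        city_name TEXT REFERENCES city(city_name),
        date_key INTEGER REFERENCES date_dim(date_key),
        quantity INTEGER,
        total_price REAL
    );
    """
)

# Hypothetical sample data.
conn.executemany("INSERT INTO product VALUES (?, ?, ?)",
                 [(1, "Cola", "Beverage"), (2, "Milk", "Dairy")])
conn.executemany("INSERT INTO city VALUES (?, ?, ?)",
                 [("Palo Alto", "CA", "USA"), ("Austin", "TX", "USA")])
conn.executemany("INSERT INTO date_dim VALUES (?, ?)",
                 [(1, "1996-01-15"), (2, "1996-01-16")])
conn.executemany("INSERT INTO fact VALUES (?, ?, ?, ?, ?)", [
    (1, "Palo Alto", 1, 10, 15.0),
    (1, "Austin", 1, 5, 7.5),
    (2, "Palo Alto", 2, 3, 6.0),
    (1, "Palo Alto", 2, 4, 6.0),
])

# Star join: the fact table joined to the dimensions it is analyzed by.
quantity_by_category_state = conn.execute(
    """
    SELECT p.category, c.state, SUM(f.quantity)
    FROM fact AS f
    JOIN product AS p ON p.prod_no = f.prod_no
    JOIN city AS c ON c.city_name = f.city_name
    GROUP BY p.category, c.state
    ORDER BY p.category, c.state
    """
).fetchall()

print(quantity_by_category_state)
```

Every dimension is exactly one join away from the fact table, which is what gives the schema its star shape.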
  • 27. Example of a Snowflake Schema
      [Figure: snowflake schema diagram with a central fact table and normalized dimension tables]
      Fact Table: OrderNO, SalespersonID, CustomerNO, ProdNo, DateKey, CityName, Quantity, TotalPrice
      Order: Order No, Order Date
      Customer: Customer No, Customer Name, Customer Address, City
      Salesperson: SalespersonID, SalespersonName, City, Quota
      Product: ProductNO, ProdName, ProdDescr, Category, UnitPrice
      Category: CategoryName, CategoryDescr
      Date: DateKey, Date, Month
      Month: Month, Year
      Year: Year
      City: CityName, StateName
      State: StateName, Country
  • 28. Snowflake Schema
      Represents the dimensional hierarchy directly by normalizing the dimension tables
      Easy to maintain
      Saves storage, but allegedly reduces the effectiveness of browsing (Kimball)
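To illustrate what normalizing a dimension means in practice, the sketch below splits the city dimension into city and state tables, following the City and State tables of the example slide. The snake_case names, the sample rows, and the reduced fact table are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- The hierarchy city -> state -> country is stored as separate,
    -- normalized tables instead of one flat dimension table.
    CREATE TABLE state (state_name TEXT PRIMARY KEY, country TEXT);
    CREATE TABLE city (
        city_name TEXT PRIMARY KEY,
        state_name TEXT REFERENCES state(state_name)
    );
    CREATE TABLE fact (
        city_name TEXT REFERENCES city(city_name),
        quantity INTEGER
    );
    """
)

# Hypothetical sample data.
conn.executemany("INSERT INTO state VALUES (?, ?)",
                 [("CA", "USA"), ("TX", "USA"), ("ON", "Canada")])
conn.executemany("INSERT INTO city VALUES (?, ?)",
                 [("Palo Alto", "CA"), ("Austin", "TX"), ("Toronto", "ON")])
conn.executemany("INSERT INTO fact VALUES (?, ?)",
                 [("Palo Alto", 10), ("Austin", 5), ("Toronto", 7), ("Palo Alto", 4)])

# Rolling up to country walks the hierarchy: one join per level.
quantity_by_country = conn.execute(
    """
    SELECT s.country, SUM(f.quantity)
    FROM fact AS f
    JOIN city AS c ON c.city_name = f.city_name
    JOIN state AS s ON s.state_name = c.state_name
    GROUP BY s.country
    ORDER BY s.country
    """
).fetchall()

print(quantity_by_country)
```

Each country is stored once per state rather than once per city, which is the storage saving; the extra join per hierarchy level is the price paid when browsing.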