What is a Data Warehouse?

• A data warehouse is a
  - subject-oriented
  - integrated
  - time-variant (historical perspective)
  - nonvolatile
  collection of data in support of management’s decision-making process (William H. Inmon)
• A semantically consistent data store that serves as a physical implementation of a decision support data model
Data Warehouse—Subject-Oriented
• Organized around major subjects, such as customer, product, sales
• Focusing on the modeling and analysis of data for decision makers, not on daily operations or transaction processing
• Provide a simple and concise view around particular subject issues by excluding data that are not useful in the decision support process
Data Warehouse—Integrated
• Constructed by integrating multiple, heterogeneous data sources
  - relational databases, flat files, on-line transaction records
• Data cleaning and data integration techniques are applied.
  - Ensure consistency in naming conventions, encoding structures, attribute measures, etc. among different data sources
  - E.g., hotel price: currency, tax, breakfast covered, etc.
  - When data is moved to the warehouse, it is converted.
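The hotel price example can be sketched as a conversion step. This is a minimal illustration, not a prescribed method: the two sources, their field names, the exchange rate, and the tax rate are all hypothetical assumptions chosen to show how differing currency, tax, and breakfast conventions are brought to a single warehouse convention.

```python
from dataclasses import dataclass

# Illustrative assumptions: a fixed exchange rate and a flat tax rate.
EUR_TO_USD = 1.10
TAX_RATE = 0.10


@dataclass(frozen=True)
class HotelPrice:
    """Assumed warehouse convention: USD, tax included, explicit breakfast flag."""
    hotel: str
    price_usd: float
    breakfast_included: bool


def from_source_a(row: dict) -> HotelPrice:
    # Hypothetical source A: USD, price excludes tax, breakfast coded "Y"/"N".
    return HotelPrice(
        hotel=row["hotel_name"].strip().title(),
        price_usd=round(row["net_price"] * (1 + TAX_RATE), 2),
        breakfast_included=row["bfast"] == "Y",
    )


def from_source_b(row: dict) -> HotelPrice:
    # Hypothetical source B: EUR, price includes tax, breakfast as a boolean.
    return HotelPrice(
        hotel=row["name"].strip().title(),
        price_usd=round(row["gross_eur"] * EUR_TO_USD, 2),
        breakfast_included=bool(row["breakfast"]),
    )


a = from_source_a({"hotel_name": " grand plaza ", "net_price": 100.0, "bfast": "Y"})
b = from_source_b({"name": "GRAND PLAZA", "gross_eur": 100.0, "breakfast": True})
print(a == b)  # both records now follow the same naming and encoding
```

Once both records follow one convention, they can be compared or aggregated directly, which is the point of converting at load time.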
Data Warehouse—Time Variant
• The time horizon for the data warehouse is significantly longer than that of operational systems
  - Operational database: current value data
  - Data warehouse data: provide information from a historical perspective (e.g., past 5-10 years)
• Every key structure in the data warehouse
  - Contains an element of time, explicitly or implicitly
  - But the key of operational data may or may not contain a "time element"
Data Warehouse—Nonvolatile
• A physically separate store of data transformed from the operational environment
• Operational update of data does not occur in the data warehouse environment
  - Does not require transaction processing, recovery, and concurrency control mechanisms
  - Requires only two operations in data accessing: initial loading of data and access of data
Differences between Operational Database
Systems and Data Warehouses


• Online operational database systems
  - Perform online transaction and query processing
  - Known as online transaction processing (OLTP) systems
  - Cover the day-to-day operations of an organization (purchasing, inventory, banking, payroll, etc.)
• Data warehouses
  - Serve users or knowledge workers in the role of data analysis and decision making
  - Known as online analytical processing (OLAP) systems
Differences between OLTP and OLAP


• Users and system orientation
  - OLTP: customer-oriented
  - OLAP: market-oriented
• Data contents
  - OLTP: manages current data that is typically too detailed to be easily used for decision making
  - OLAP: manages large amounts of historical data and provides facilities for summarization
• Database design
  - OLTP: entity-relationship (ER) data model and application-oriented database design
  - OLAP: star or snowflake model and subject-oriented database design
Differences between OLTP and OLAP(contd..)




• View
  - OLTP: focuses on current data within an enterprise or department, without referring to historical data or data in different organizations
  - OLAP: deals with information from different organizations, integrating information from many data stores
• Access patterns
  - OLTP: mainly short, atomic transactions
  - OLAP: complex queries
OLTP vs. OLAP
Feature            | OLTP                                | OLAP
users              | clerk, IT professional              | knowledge worker
function           | day-to-day operations               | decision support
DB design          | application-oriented                | subject-oriented
data               | current, up-to-date; detailed,      | historical; summarized, multidimensional;
                   | flat relational; isolated           | integrated, consolidated
usage              | repetitive                          | ad hoc
access             | read/write; index/hash on prim. key | lots of scans
unit of work       | short, simple transaction           | complex query
# records accessed | tens                                | millions
# users            | thousands                           | hundreds
DB size            | 100MB-GB                            | 100GB-TB
metric             | transaction throughput              | query throughput, response
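The contrast in access patterns and unit of work can be illustrated with a small sketch. It uses Python's built-in sqlite3 module purely for convenience; the `orders` table, its columns, and its rows are hypothetical, and a real warehouse would not share one database with the operational system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, "
    "year INTEGER, amount REAL)"
)
rows = [
    (1, "East", 2022, 120.0),
    (2, "East", 2023, 80.0),
    (3, "West", 2022, 200.0),
    (4, "West", 2023, 50.0),
]
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", rows)

# OLTP-style unit of work: a short, atomic read/write transaction that
# touches a single record through its primary key.
with conn:
    conn.execute("UPDATE orders SET amount = amount + 10 WHERE order_id = ?", (2,))

# OLAP-style query: read-only, scans the table and summarizes history.
olap_result = conn.execute(
    "SELECT region, year, SUM(amount) FROM orders "
    "GROUP BY region, year ORDER BY region, year"
).fetchall()
print(olap_result)
```

The update reads and writes one row, while the summary query has to visit every row, which is why the two workloads are usually tuned and measured differently.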

Why a Separate Data Warehouse?

• High performance for both systems
  - DBMS: tuned for OLTP (access methods, indexing, concurrency control, recovery)
  - Warehouse: tuned for OLAP (complex OLAP queries, multidimensional view, consolidation)
• Different functions and different data:
  - missing data: decision support requires historical data, which operational DBs do not typically maintain
  - data consolidation: decision support requires consolidation (aggregation, summarization) of data from heterogeneous sources
  - data quality: different sources typically use inconsistent data representations, codes, and formats, which have to be reconciled
Data Warehouse: A Multi-Tiered Architecture

• Bottom tier: warehouse database server
  - Back-end tools and utilities are used to feed data into the bottom tier from other databases
  - Contains the metadata repository
  - Data are extracted using application program interfaces known as gateways, e.g., ODBC, OLE DB, JDBC
• Middle tier: OLAP server, implemented using either
  - (1) relational OLAP (ROLAP), or
  - (2) multidimensional OLAP (MOLAP)
• Top tier: front-end client layer, which contains query and reporting tools, analysis tools, and/or data mining tools
[Figure: Three-tier data warehousing architecture]
• Data sources: operational DBs and other sources, fed through extract, transform, load, and refresh, with a monitor and integrator
• Data storage: data warehouse, data marts, and metadata, which serve the OLAP server
• OLAP engine: OLAP server
• Front-end tools: analysis, query, reports, data mining
Three Data Warehouse Models (Architecture Point of View)

• Enterprise warehouse
  - collects all of the information about subjects spanning the entire organization (either detailed or summarized data, in gigabytes, terabytes, or beyond)
  - implemented on mainframes, superservers, or parallel architecture platforms
• Data mart
  - a subset of corporate-wide data that is of value to a specific group of users. Its scope is confined to specific, selected subjects, such as a marketing data mart (customer, item, sales)
  - implemented on low-cost departmental servers that are Unix/Linux or Windows based
  - independent vs. dependent (sourced directly from the warehouse) data mart
• Virtual warehouse
  - a set of views over operational databases for efficient query processing
Extraction, Transformation, and Loading (ETL)
Back-end tools and utilities include the following functions:
• Data extraction
  - get data from multiple, heterogeneous, and external sources
• Data cleaning
  - detect errors in the data and rectify them when possible
• Data transformation
  - convert data from legacy or host format to warehouse format
• Load
  - sort, summarize, consolidate, compute views, check integrity, and build indices and partitions
• Refresh
  - propagate the updates from the data sources to the warehouse
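The first four functions can be sketched as a small pipeline. This is only an illustration under stated assumptions: the raw rows, the `sales` table, and the function names `clean`, `transform`, and `load` are hypothetical, sqlite3 stands in for the warehouse, and the refresh step is omitted.

```python
import sqlite3

# Hypothetical raw rows, as if already extracted from heterogeneous sources.
source_rows = [
    {"item": "TV", "sold": "3", "price": "400.0"},
    {"item": "pc ", "sold": "2", "price": "900.0"},
    {"item": "TV", "sold": "oops", "price": "400.0"},  # dirty record
]


def clean(rows):
    """Data cleaning: drop rows whose numeric fields cannot be parsed."""
    for row in rows:
        try:
            yield {"item": row["item"], "sold": int(row["sold"]),
                   "price": float(row["price"])}
        except ValueError:
            continue


def transform(rows):
    """Data transformation: normalize item names and derive dollars_sold."""
    for row in rows:
        yield (row["item"].strip().upper(), row["sold"], row["sold"] * row["price"])


def load(conn, rows):
    """Load: insert the rows into the warehouse table and build an index."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales "
                 "(item TEXT, units_sold INTEGER, dollars_sold REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.execute("CREATE INDEX IF NOT EXISTS idx_sales_item ON sales (item)")
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(clean(source_rows)))
loaded = conn.execute("SELECT * FROM sales ORDER BY item").fetchall()
print(loaded)
```

Here the dirty record is simply dropped; a production cleaning step would more likely log or repair it, in line with "rectify them when possible".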
Metadata Repository
Metadata is the data defining warehouse objects. It stores:
• Description of the structure of the data warehouse
  - schema, view, dimensions, hierarchies, derived data definitions, data mart locations and contents
• Operational metadata
  - data lineage (history of migrated data and transformation path), currency of data (active, archived, or purged), monitoring information (warehouse usage statistics, error reports, audit trails)
• The algorithms used for summarization
• The mapping from the operational environment to the data warehouse
• Data related to system performance
  - warehouse schema, view, and derived data definitions
• Business data
  - business terms and definitions, ownership of data, charging policies
From Tables and Spreadsheets to Data Cubes

• A data warehouse is based on a multidimensional data model, which views data in the form of a data cube defined by dimensions and facts
• A data cube allows data to be modeled and viewed in multiple dimensions
  - Example: sales with dimensions time, branch, item, and location
  - Dimension tables, such as item (item_name, brand, type) or time (day, week, month, quarter, year)
  - Fact table contains numeric measures (such as dollars_sold) and keys to each of the related dimension tables
• In data warehousing literature, an n-D base cube is called a base cuboid. The topmost 0-D cuboid, which holds the highest level of summarization, is called the apex cuboid. The lattice of cuboids forms a data cube.
Cube: A Lattice of Cuboids
• 0-D (apex) cuboid: all
• 1-D cuboids: time; item; location; supplier
• 2-D cuboids: time,item; time,location; time,supplier; item,location; item,supplier; location,supplier
• 3-D cuboids: time,item,location; time,item,supplier; time,location,supplier; item,location,supplier
• 4-D (base) cuboid: time, item, location, supplier
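The lattice can be enumerated mechanically: each cuboid corresponds to one subset of the dimensions, so n dimensions (ignoring concept hierarchies) give 2^n cuboids. A minimal sketch follows; the function name `cuboid_lattice` is an illustrative choice.

```python
from itertools import combinations

dimensions = ("time", "item", "location", "supplier")


def cuboid_lattice(dims):
    """Group every subset of the dimensions by its size (0-D up to n-D)."""
    return {k: list(combinations(dims, k)) for k in range(len(dims) + 1)}


lattice = cuboid_lattice(dimensions)
total_cuboids = sum(len(level) for level in lattice.values())

for k, cuboids in lattice.items():
    # The empty subset is the apex cuboid, shown as "all".
    print(f"{k}-D:", [", ".join(c) or "all" for c in cuboids])
print("total cuboids:", total_cuboids)
```

For the four dimensions on this slide the levels hold 1, 4, 6, 4, and 1 cuboids respectively.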
Conceptual Modeling of Data Warehouses
• Modeling data warehouses: dimensions & measures
  - Star schema: a fact table in the middle connected to a set of dimension tables, i.e., (1) a large central table (fact table) and (2) a set of dimension tables
  - Snowflake schema: a refinement of the star schema where some dimensional hierarchy is normalized into a set of smaller dimension tables, forming a shape similar to a snowflake
  - Fact constellations: multiple fact tables share dimension tables, viewed as a collection of stars, therefore called a galaxy schema or fact constellation
Example of Star Schema
• Sales fact table
  - keys: time_key, item_key, branch_key, location_key
  - measures: units_sold, dollars_sold, avg_sales
• time dimension: time_key, day, day_of_the_week, month, quarter, year
• item dimension: item_key, item_name, brand, type, supplier_type
• branch dimension: branch_key, branch_name, branch_type
• location dimension: location_key, street, city, state_or_province, country
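A cut-down version of this star schema can be sketched with Python's built-in sqlite3 module. Only the time and item dimensions are included for brevity, the table names `time_dim`, `item_dim`, and `sales_fact` are illustrative, and the rows (including the brand "Acme") are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, quarter TEXT, year INTEGER);
CREATE TABLE item_dim (item_key INTEGER PRIMARY KEY, item_name TEXT, brand TEXT);
-- The fact table holds one foreign key per dimension plus the measures.
CREATE TABLE sales_fact (
    time_key INTEGER REFERENCES time_dim(time_key),
    item_key INTEGER REFERENCES item_dim(item_key),
    units_sold INTEGER,
    dollars_sold REAL
);
INSERT INTO time_dim VALUES (1, 'Q1', 2023), (2, 'Q2', 2023);
INSERT INTO item_dim VALUES (10, 'TV', 'Acme'), (20, 'PC', 'Acme');
INSERT INTO sales_fact VALUES (1, 10, 2, 800.0), (1, 20, 1, 900.0), (2, 10, 3, 1200.0);
""")

# A typical star join: the fact table joined to each dimension, then grouped.
by_quarter = conn.execute("""
    SELECT t.quarter, i.item_name, SUM(f.dollars_sold)
    FROM sales_fact AS f
    JOIN time_dim AS t ON t.time_key = f.time_key
    JOIN item_dim AS i ON i.item_key = f.item_key
    GROUP BY t.quarter, i.item_name
    ORDER BY t.quarter, i.item_name
""").fetchall()
print(by_quarter)
```

Every query follows the same shape: start from the fact table, join outward to whichever dimensions supply the grouping attributes.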
Example of Snowflake Schema

• Sales fact table
  - keys: time_key, item_key, branch_key, location_key
  - measures: units_sold, dollars_sold, avg_sales
• time dimension: time_key, day, day_of_the_week, month, quarter, year
• item dimension: item_key, item_name, brand, type, supplier_key
  - supplier: supplier_key, supplier_type
• branch dimension: branch_key, branch_name, branch_type
• location dimension: location_key, street, city_key
  - city: city_key, city, state_or_province, country
Example of Fact Constellation

• Sales fact table
  - keys: time_key, item_key, branch_key, location_key
  - measures: units_sold, dollars_sold, avg_sales
• Shipping fact table
  - keys: time_key, item_key, shipper_key, from_location, to_location
  - measures: dollars_cost, units_shipped
• time dimension: time_key, day, day_of_the_week, month, quarter, year
• item dimension: item_key, item_name, brand, type, supplier_type
• branch dimension: branch_key, branch_name, branch_type
• location dimension: location_key, street, city, province_or_state, country
• shipper dimension: shipper_key, shipper_name, location_key, shipper_type
A Concept Hierarchy: Dimension (location)

Levels: all > region > country > city > office

all
  Europe
    Germany
      Frankfurt
      ...
    ...
    Spain
  ...
  North_America
    Canada
      Vancouver
        L. Chan
        ...
        M. Wind
      ...
      Toronto
    ...
    Mexico
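A concept hierarchy like this one is what lets a measure be summarized level by level. The sketch below encodes part of the hierarchy as child-to-parent maps and aggregates one level at a time; the sales figures are invented for illustration and the helper name `roll_up` is an assumption.

```python
from collections import defaultdict

# city -> country and country -> region, following the hierarchy above.
CITY_TO_COUNTRY = {"Frankfurt": "Germany", "Vancouver": "Canada", "Toronto": "Canada"}
COUNTRY_TO_REGION = {"Germany": "Europe", "Spain": "Europe",
                     "Canada": "North_America", "Mexico": "North_America"}

# Illustrative sales figures per city (not from the slides).
sales_by_city = {"Frankfurt": 100, "Vancouver": 40, "Toronto": 60}


def roll_up(values, mapping):
    """Aggregate values one level up the hierarchy using a child->parent map."""
    totals = defaultdict(int)
    for child, amount in values.items():
        totals[mapping[child]] += amount
    return dict(totals)


sales_by_country = roll_up(sales_by_city, CITY_TO_COUNTRY)
sales_by_region = roll_up(sales_by_country, COUNTRY_TO_REGION)
print(sales_by_country)
print(sales_by_region)
```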
Data Cube Measures: Three Categories
• Distributive: if the result derived by applying the function to n aggregate values is the same as that derived by applying the function on all the data without partitioning
  - E.g., count(), sum(), min(), max()
• Algebraic: if it can be computed by an algebraic function with M arguments (where M is a bounded integer), each of which is obtained by applying a distributive aggregate function
  - E.g., avg(), min_N(), standard_deviation()
• Holistic: if there is no constant bound on the storage size needed to describe a subaggregate
  - E.g., median(), mode(), rank()
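The three categories can be checked on a small partitioned data set. The numbers below are arbitrary illustrative values; the sketch shows that sum() combines across partitions, that avg() can be rebuilt from two distributive aggregates (sum and count, so M = 2), and that combining per-partition medians does not in general give the overall median.

```python
from statistics import median

partitions = [[4, 8], [1, 9, 3], [5]]
all_values = [v for part in partitions for v in part]

# Distributive: the sum of per-partition sums equals the sum over all data.
partial_sums = [sum(p) for p in partitions]
assert sum(partial_sums) == sum(all_values)

# Algebraic: avg is derived from two distributive aggregates, sum and count.
partial_counts = [len(p) for p in partitions]
avg_from_parts = sum(partial_sums) / sum(partial_counts)
assert avg_from_parts == sum(all_values) / len(all_values)

# Holistic: the median of per-partition medians is generally not the median
# of all the data, so a bounded-size summary per partition is not enough.
median_of_medians = median(median(p) for p in partitions)
true_median = median(all_values)
print(median_of_medians, true_median)
```

This is why distributive and algebraic measures are cheap to precompute in a cube, while holistic measures usually need the underlying data or an approximation.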
Multidimensional Data
• Sales volume as a function of product, month, and region
• Dimensions: Product, Location, Time
• Hierarchical summarization paths
  - Product: Industry > Category > Product
  - Location: Region > Country > City > Office
  - Time: Year > Quarter > Month or Week > Day
A Sample Data Cube
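[Figure: a 3-D sales data cube with dimensions Date (1Qtr, 2Qtr, 3Qtr, 4Qtr, sum), Product (TV, PC, VCR, sum), and Country (U.S.A., Canada, Mexico, sum). One highlighted cell holds the total annual sales of TVs in the U.S.A.]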
Cuboids Corresponding to the Cube

• 0-D (apex) cuboid: all
• 1-D cuboids: product; date; country
• 2-D cuboids: product,date; product,country; date,country
• 3-D (base) cuboid: product, date, country
Typical OLAP Operations
• Roll up (drill-up): summarize data
  - by climbing up a hierarchy or by dimension reduction
• Drill down (roll down): reverse of roll-up
  - from higher-level summary to lower-level summary or detailed data, or introducing new dimensions
• Slice and dice: project and select
  - Slice: a selection on one dimension, resulting in a subcube, e.g., Time = "Q1"
  - Dice: a subcube defined by selecting on two or more dimensions, e.g., (Location = "Toronto" or "Vancouver") and (Time = "Q1" or "Q2")
• Pivot (rotate):
  - reorient the cube, visualization, 3D to series of 2D planes
• Other operations
  - drill across: involving (across) more than one fact table
  - drill through: through the bottom level of the cube to its back-end relational tables (using SQL)
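Slice, dice, and roll-up by dimension reduction can be sketched on a tiny cube held as a dictionary. This is an illustration only: the cell values, the extra city "Chicago", and the helper names `slice_`, `dice`, and `roll_up` are assumptions, and a real OLAP server would work on precomputed cuboids rather than a Python dict.

```python
from collections import defaultdict

# Illustrative base cells: (location, time, item) -> units sold.
cube = {
    ("Toronto", "Q1", "TV"): 5,
    ("Toronto", "Q2", "TV"): 7,
    ("Vancouver", "Q1", "PC"): 3,
    ("Vancouver", "Q3", "TV"): 4,
    ("Chicago", "Q1", "PC"): 6,
}
DIMS = ("location", "time", "item")


def roll_up(cells, keep):
    """Roll up by dimension reduction: aggregate away dimensions not in keep."""
    idx = [DIMS.index(d) for d in keep]
    out = defaultdict(int)
    for key, value in cells.items():
        out[tuple(key[i] for i in idx)] += value
    return dict(out)


def dice(cells, **allowed):
    """Keep cells whose value on each named dimension is in the allowed set."""
    return {
        key: value for key, value in cells.items()
        if all(key[DIMS.index(d)] in vals for d, vals in allowed.items())
    }


def slice_(cells, dim, value):
    """Slice: a selection on a single dimension."""
    return dice(cells, **{dim: {value}})


q1 = slice_(cube, "time", "Q1")
sub = dice(cube, location={"Toronto", "Vancouver"}, time={"Q1", "Q2"})
by_location = roll_up(cube, keep=("location",))
print(q1, sub, by_location, sep="\n")
```

The slice and dice calls mirror the two examples on the slide: Time = "Q1", and Location in {Toronto, Vancouver} with Time in {Q1, Q2}.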
[Figure: Fig. 3.10 Typical OLAP Operations]
Starnet Query model







• Used for querying multidimensional databases
• Consists of radial lines emanating from a central point
• Each line represents a concept hierarchy for a dimension
• Each abstraction level in the hierarchy is called a footprint
A Star-Net Query Model
[Figure: a star-net with one radial line per dimension; each circle on a line is called a footprint]
• Customer Orders: CONTRACTS, ORDER
• Shipping Method: AIR-EXPRESS, TRUCK
• Time: ANNUALLY, QTRLY, DAILY
• Product: PRODUCT LINE, PRODUCT GROUP, PRODUCT ITEM
• Location: CITY, COUNTRY, REGION
• Organization: SALES PERSON, DISTRICT, DIVISION
• Customer
• Promotion
Browsing a Data Cube

• Visualization
• OLAP capabilities
• Interactive manipulation
Data Warehouse Design and Usage

• What goes into a data warehouse design?
• How are data warehouses used?
• How do data warehousing and OLAP relate to data mining?
Design of Data Warehouse: A Business
Analysis Framework
• What can business analysts gain from having a data warehouse?
  - Competitive advantage
  - Productivity
  - Customer relationship management
  - Cost reduction
• Effective data warehouse design needs to understand and analyze business needs and construct a business analysis framework
Design of Data Warehouse: A Business
Analysis Framework(contd..)


• Four views regarding the design of a data warehouse
  - Top-down view: allows selection of the relevant information necessary for the data warehouse (information that matches current and future business needs)
  - Data source view: exposes the information being captured, stored, and managed by operational systems (modeled by an E-R model or CASE tools)
  - Data warehouse view: consists of fact tables and dimension tables with precalculated totals and counts (along with source, date, and time information)
  - Business query view: sees the perspectives of data in the warehouse from the viewpoint of the end user
Design of Data Warehouse: A Business Analysis
Framework(contd..)


• Building a data warehouse is a complex, long-term task that requires
  - Business skills
  - Technology skills
  - Program management skills
• Implementation scope should be clearly defined. The goals of an initial data warehouse implementation should be
  - Specific
  - Achievable
  - Measurable
Data Warehouse Design Process
• Top-down, bottom-up approaches or a combination of both
  - Top-down: starts with overall design and planning (mature)
  - Bottom-up: starts with experiments and prototypes (rapid)
• From a software engineering point of view
  - Waterfall: structured and systematic analysis at each step before proceeding to the next
  - Spiral: rapid generation of increasingly functional systems, with short turnaround times
Data Warehouse Design Process
(contd..)


• Typical data warehouse design process
  - Choose a business process to model, e.g., orders, invoices, etc.
  - Choose the grain (atomic level of data) of the business process, e.g., individual transactions
  - Choose the dimensions that will apply to each fact table record. Typical dimensions are time, item, customer, etc.
  - Choose the measures that will populate each fact table record, e.g., dollars_sold or units_sold
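The four choices can be written down as a small record before any tables are built. The sketch below is one hypothetical way to capture them; the class name `FactTableDesign`, its fields, and the `_key` naming convention are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FactTableDesign:
    """Records the four design choices for one fact table."""
    business_process: str
    grain: str
    dimensions: tuple
    measures: tuple

    def columns(self):
        # One foreign key per dimension, followed by the numeric measures.
        return [f"{d}_key" for d in self.dimensions] + list(self.measures)


sales = FactTableDesign(
    business_process="orders",
    grain="individual transaction",
    dimensions=("time", "item", "customer"),
    measures=("dollars_sold", "units_sold"),
)
print(sales.columns())
```

Writing the grain down explicitly is useful because it fixes what one fact table record means before dimensions and measures are attached to it.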
Data Warehouse Usage for Information Processing

• Information processing
  - Supports query processing, statistical analysis, and reporting
• Analytical processing
  - Supports basic OLAP operations, including slice-and-dice, drill-down, roll-up, and pivoting
• Data mining
  - Supports knowledge discovery by finding hidden patterns and associations, constructing analytical models, performing classification and prediction, and presenting the results
From On-Line Analytical Processing (OLAP) to On-Line Analytical Mining (OLAM)

• Integrate OLAP with data mining to uncover knowledge in multidimensional databases.
• Why online analytical mining?
  - High quality of data in data warehouses
    - DW contains integrated, consistent, cleaned data
  - Available information processing structure surrounding data warehouses
    - ODBC, OLE DB, Web accessing, service facilities, reporting and OLAP tools
  - OLAP-based exploratory data analysis
    - Mining with drilling, dicing, pivoting, etc.
  - On-line selection of data mining functions
    - Integration and swapping of multiple mining functions, algorithms, and tasks

ESC Beyond Borders _From EU to You_ InfoPack general.pdfESC Beyond Borders _From EU to You_ InfoPack general.pdf
ESC Beyond Borders _From EU to You_ InfoPack general.pdf
 
The Art Pastor's Guide to Sabbath | Steve Thomason
The Art Pastor's Guide to Sabbath | Steve ThomasonThe Art Pastor's Guide to Sabbath | Steve Thomason
The Art Pastor's Guide to Sabbath | Steve Thomason
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
 
2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...
 
special B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdfspecial B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdf
 
How to Split Bills in the Odoo 17 POS Module
How to Split Bills in the Odoo 17 POS ModuleHow to Split Bills in the Odoo 17 POS Module
How to Split Bills in the Odoo 17 POS Module
 

Datawarehouse and OLAP

  • 1.
  • 2. What is a Data Warehouse?
      • A data warehouse is a subject-oriented, integrated, time-variant (historical perspective), nonvolatile collection of data in support of management's decision-making process (William H. Inmon)
      • A semantically consistent data store that serves as a physical implementation of a decision support data model
  • 3. Data Warehouse: Subject-Oriented
      • Organized around major subjects, such as customer, product, and sales
      • Focuses on the modeling and analysis of data for decision makers, not on daily operations or transaction processing
      • Provides a simple and concise view of particular subject issues by excluding data that are not useful in the decision support process
  • 4. Data Warehouse: Integrated
      • Constructed by integrating multiple, heterogeneous data sources
          • Relational databases, flat files, online transaction records
      • Data cleaning and data integration techniques are applied
          • Ensure consistency in naming conventions, encoding structures, attribute measures, etc. among different data sources
          • E.g., hotel price: currency, tax, whether breakfast is covered, etc.
      • When data is moved to the warehouse, it is converted
  • 5. Data Warehouse: Time-Variant
      • The time horizon for the data warehouse is significantly longer than that of operational systems
          • Operational database: current-value data
          • Data warehouse data: provide information from a historical perspective (e.g., the past 5-10 years)
      • Every key structure in the data warehouse contains an element of time, explicitly or implicitly
          • The key of operational data may or may not contain a time element
  • 6. Data Warehouse: Nonvolatile
      • A physically separate store of data transformed from the operational environment
      • Operational update of data does not occur in the data warehouse environment
          • Does not require transaction processing, recovery, and concurrency control mechanisms
          • Requires only two operations in data accessing: initial loading of data and access of data
  • 7. Differences between Operational Database Systems and Data Warehouses
      • Online operational database systems
          • Perform online transaction and query processing
          • Known as online transaction processing (OLTP) systems
          • Cover the day-to-day operations of an organization (purchasing, inventory, banking, payroll, etc.)
      • Data warehouses
          • Serve users or knowledge workers in the role of data analysis and decision making
          • Known as online analytical processing (OLAP) systems
  • 8. Differences between OLTP and OLAP
      • Users and system orientation
          • OLTP: customer-oriented
          • OLAP: market-oriented
      • Data contents
          • OLTP: manages current data that are typically too detailed to be easily used for decision making
          • OLAP: manages large amounts of historical data and provides facilities for summarization
      • Database design
          • OLTP: entity-relationship (ER) data model and an application-oriented database design
          • OLAP: star or snowflake model and a subject-oriented database design
  • 9. Differences between OLTP and OLAP (contd.)
      • View
          • OLTP: manages current data within an enterprise or department, without referring to historical data or data in different organizations
          • OLAP: deals with information from different organizations, integrating information from many data stores
      • Access patterns
          • OLTP: mainly short, atomic transactions
          • OLAP: complex queries
  • 10. OLTP vs. OLAP
      • Users: OLTP clerk, IT professional; OLAP knowledge worker
      • Function: OLTP day-to-day operations; OLAP decision support
      • DB design: OLTP application-oriented; OLAP subject-oriented
      • Data: OLTP current, up-to-date, detailed, flat relational, isolated; OLAP historical, summarized, multidimensional, integrated, consolidated
      • Usage: OLTP repetitive; OLAP ad hoc
      • Access: OLTP read/write, index/hash on primary key; OLAP lots of scans
      • Unit of work: OLTP short, simple transaction; OLAP complex query
      • Number of records accessed: OLTP tens; OLAP millions
      • Number of users: OLTP thousands; OLAP hundreds
      • DB size: OLTP 100 MB to GB; OLAP 100 GB to TB
      • Metric: OLTP transaction throughput; OLAP query throughput, response time
  • 11. Why a Separate Data Warehouse?
      • High performance for both systems
          • DBMS: tuned for OLTP (access methods, indexing, concurrency control, recovery)
          • Warehouse: tuned for OLAP (complex OLAP queries, multidimensional view, consolidation)
      • Different functions and different data
          • Missing data: decision support requires historical data, which operational DBs do not typically maintain
          • Data consolidation: decision support requires consolidation (aggregation, summarization) of data from heterogeneous sources
          • Data quality: different sources typically use inconsistent data representations, codes, and formats, which have to be reconciled
  • 12. Data Warehouse: A Multi-Tiered Architecture
      • Bottom tier
          • Warehouse database server
          • Back-end tools and utilities feed data into the bottom tier from other databases
          • Contains the metadata repository
          • Data are extracted using application program interfaces known as gateways, e.g., ODBC, OLEDB, JDBC
      • Middle tier
          • OLAP server, implemented using 1) relational OLAP (ROLAP) or 2) multidimensional OLAP (MOLAP)
      • Top tier
          • Front-end client layer, which contains query and reporting tools, analysis tools, and/or data mining tools
  • 13. Three-tier data warehousing architecture
  • 14. Data Warehouse: A Multi-Tiered Architecture (figure)
      • Data sources: operational DBs, other sources
      • Data storage: extract, transform, load, and refresh into the data warehouse and data marts, with metadata and a monitor & integrator
      • OLAP engine: OLAP server
      • Front-end tools: analysis, query, reports, data mining
  • 15. Three Data Warehouse Models (Architecture point of view)
      • Enterprise warehouse
          • Collects all of the information about subjects spanning the entire organization (detailed or summarized data, ranging from gigabytes to terabytes or beyond)
          • Implemented on mainframes, superservers, or parallel architecture platforms
      • Data mart
          • A subset of corporate-wide data that is of value to a specific group of users. Its scope is confined to specific, selected groups, such as a marketing data mart (customer, item, sales)
          • Independent vs. dependent (sourced directly from the warehouse) data marts
          • Implemented on low-cost departmental servers that are Unix/Linux or Windows based
      • Virtual warehouse
          • A set of views over operational databases for efficient query processing
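
The virtual warehouse idea (a set of views over operational databases) can be illustrated with a small sketch. It uses Python's built-in sqlite3 module; the `orders` table, its columns, and the view name `sales_by_customer` are hypothetical and not taken from the slides.

```python
import sqlite3

# In-memory database standing in for an operational system (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL);
INSERT INTO orders VALUES (1, 'Ann', 20.0), (2, 'Ann', 5.0), (3, 'Bob', 12.0);

-- The "virtual warehouse" part: a summary view, no separate physical store.
CREATE VIEW sales_by_customer AS
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer;
""")

# Analysts query the view; the summary is computed from operational data on demand.
totals = conn.execute(
    "SELECT customer, total FROM sales_by_customer ORDER BY customer"
).fetchall()
print(totals)
```

Because the view is evaluated against the operational table each time, it is easy to build but competes with operational workloads for capacity.
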
  • 16. Extraction, Transformation, and Loading (ETL)
      • Back-end tools and utilities include the following functions
          • Data extraction: get data from multiple, heterogeneous, and external sources
          • Data cleaning: detect errors in the data and rectify them when possible
          • Data transformation: convert data from legacy or host format to warehouse format
          • Load: sort, summarize, consolidate, compute views, check integrity, and build indices and partitions
          • Refresh: propagate the updates from the data sources to the warehouse
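
The ETL functions listed above can be sketched as a small pipeline. This is a minimal illustration only: the two source layouts, the field names, and the cleaning rule (drop rows with no customer name) are assumptions made for the example.

```python
# Hypothetical rows from two heterogeneous sources (field names are assumptions).
SOURCE_A = [
    {"cust": "Ann", "amount": "12.50"},
    {"cust": "", "amount": "7.00"},      # missing customer: a data error
]
SOURCE_B = [{"customer_name": "Bob", "amt_cents": 330}]


def extract():
    """Extraction: gather rows from multiple sources, tagged by origin."""
    return [("A", r) for r in SOURCE_A] + [("B", r) for r in SOURCE_B]


def clean(rows):
    """Cleaning: drop rows whose customer field is missing or empty."""
    return [(src, r) for src, r in rows
            if r.get("cust") or r.get("customer_name")]


def transform(rows):
    """Transformation: convert each source format to one warehouse format."""
    out = []
    for src, r in rows:
        if src == "A":
            cents = round(float(r["amount"]) * 100)
            out.append({"customer": r["cust"], "amount_cents": cents})
        else:
            out.append({"customer": r["customer_name"],
                        "amount_cents": r["amt_cents"]})
    return out


def load(rows, warehouse):
    """Load: sort and append; real loaders also build indices and summaries."""
    warehouse.extend(sorted(rows, key=lambda r: r["customer"]))
    return warehouse


warehouse = load(transform(clean(extract())), [])
print(warehouse)
```

A refresh step would rerun the same pipeline on only the rows that changed since the last load.
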
  • 17. Metadata Repository
      • Metadata is the data defining warehouse objects. It stores:
          • Description of the structure of the data warehouse: schema, view, dimensions, hierarchies, derived data definitions, data mart locations and contents
          • Operational metadata: data lineage (history of migrated data and transformation path), currency of data (active, archived, or purged), monitoring information (warehouse usage statistics, error reports, audit trails)
          • The algorithms used for summarization
          • The mapping from the operational environment to the data warehouse
          • Data related to system performance: warehouse schema, view, and derived data definitions
          • Business data: business terms and definitions, ownership of data, charging policies
  • 18. From Tables and Spreadsheets to Data Cubes
      • A data warehouse is based on a multidimensional data model, which views data in the form of a data cube, defined by dimensions and facts
      • A data cube allows data to be modeled and viewed in multiple dimensions
          • Example: sales with dimensions time, branch, item, and location
          • Dimension tables, such as item (item_name, brand, type) or time (day, week, month, quarter, year)
          • The fact table contains numeric measures (such as dollars_sold) and keys to each of the related dimension tables
      • In data warehousing literature, an n-D base cube is called a base cuboid. The topmost 0-D cuboid, which holds the highest level of summarization, is called the apex cuboid. The lattice of cuboids forms a data cube.
  • 19. Cube: A Lattice of Cuboids
      • 0-D (apex) cuboid: all
      • 1-D cuboids: time; item; location; supplier
      • 2-D cuboids: (time, item); (time, location); (time, supplier); (item, location); (item, supplier); (location, supplier)
      • 3-D cuboids: (time, item, location); (time, item, supplier); (time, location, supplier); (item, location, supplier)
      • 4-D (base) cuboid: (time, item, location, supplier)
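
The lattice on this slide can be enumerated mechanically: each cuboid corresponds to one subset of the dimensions, so n dimensions (ignoring concept hierarchies) give 2^n cuboids. The helper name `cuboids` below is hypothetical; the sketch uses only the standard library.

```python
from itertools import combinations


def cuboids(dimensions):
    """Group every subset of the dimensions by size: 0-D apex up to n-D base."""
    return {k: list(combinations(dimensions, k))
            for k in range(len(dimensions) + 1)}


lattice = cuboids(["time", "item", "location", "supplier"])
total = sum(len(group) for group in lattice.values())

for k, group in lattice.items():
    print(f"{k}-D cuboids: {group}")
print("total cuboids:", total)
```

The empty tuple at level 0 plays the role of the apex cuboid "all", and the single 4-tuple at level 4 is the base cuboid.
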
  • 20. Conceptual Modeling of Data Warehouses
      • Modeling data warehouses: dimensions & measures
          • Star schema: a fact table in the middle connected to a set of dimension tables, that is, 1) a large central table (the fact table) and 2) a set of dimension tables
          • Snowflake schema: a refinement of the star schema where some dimensional hierarchy is normalized into a set of smaller dimension tables, forming a shape similar to a snowflake
          • Fact constellation: multiple fact tables share dimension tables; viewed as a collection of stars, it is therefore called a galaxy schema or fact constellation
  • 21. Example of Star Schema
      • Sales fact table: time_key, item_key, branch_key, location_key; measures: units_sold, dollars_sold, avg_sales
      • time: time_key, day, day_of_the_week, month, quarter, year
      • item: item_key, item_name, brand, type, supplier_type
      • branch: branch_key, branch_name, branch_type
      • location: location_key, street, city, state_or_province, country
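
A cut-down version of this star schema can be tried out with Python's built-in sqlite3 module. The sketch keeps only two dimensions and a few columns; the table names `dim_time`, `dim_item`, and `fact_sales` and all sample rows are assumptions for illustration, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables (reduced column sets).
CREATE TABLE dim_time (time_key INTEGER PRIMARY KEY, quarter TEXT, year INTEGER);
CREATE TABLE dim_item (item_key INTEGER PRIMARY KEY, item_name TEXT, brand TEXT);

-- Central fact table: foreign keys to each dimension plus numeric measures.
CREATE TABLE fact_sales (
    time_key     INTEGER REFERENCES dim_time(time_key),
    item_key     INTEGER REFERENCES dim_item(item_key),
    units_sold   INTEGER,
    dollars_sold REAL
);

INSERT INTO dim_time VALUES (1, 'Q1', 2024), (2, 'Q2', 2024);
INSERT INTO dim_item VALUES (1, 'TV', 'Acme'), (2, 'PC', 'Acme');
INSERT INTO fact_sales VALUES (1, 1, 2, 800.0), (1, 2, 1, 500.0), (2, 1, 3, 1200.0);
""")

# A typical star join: fact table joined to its dimensions, then aggregated.
rows = conn.execute("""
    SELECT t.quarter, i.item_name, SUM(s.dollars_sold)
    FROM fact_sales AS s
    JOIN dim_time AS t ON t.time_key = s.time_key
    JOIN dim_item AS i ON i.item_key = s.item_key
    GROUP BY t.quarter, i.item_name
    ORDER BY t.quarter, i.item_name
""").fetchall()
print(rows)
```

A snowflake variant would move, for example, supplier attributes out of the item dimension into their own table and add one more join.
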
  • 22. Example of Snowflake Schema
      • Sales fact table: time_key, item_key, branch_key, location_key; measures: units_sold, dollars_sold, avg_sales
      • time: time_key, day, day_of_the_week, month, quarter, year
      • item: item_key, item_name, brand, type, supplier_key
      • supplier: supplier_key, supplier_type
      • branch: branch_key, branch_name, branch_type
      • location: location_key, street, city_key
      • city: city_key, city, state_or_province, country
  • 23. Example of Fact Constellation
      • Sales fact table: time_key, item_key, branch_key, location_key; measures: units_sold, dollars_sold, avg_sales
      • Shipping fact table: time_key, item_key, shipper_key, from_location, to_location; measures: dollars_cost, units_shipped
      • time: time_key, day, day_of_the_week, month, quarter, year
      • item: item_key, item_name, brand, type, supplier_type
      • branch: branch_key, branch_name, branch_type
      • location: location_key, street, city, province_or_state, country
      • shipper: shipper_key, shipper_name, location_key, shipper_type
  • 24. A Concept Hierarchy: Dimension (location)
      • Levels: all > region > country > city > office
      • all: Europe, ..., North_America
      • Europe: Germany, ..., Spain
      • North_America: Canada, ..., Mexico
      • Germany: Frankfurt, ...
      • Canada: Vancouver, ..., Toronto
      • Vancouver: L. Chan, ..., M. Wind
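
One simple way to hold such a concept hierarchy in code is a child-to-parent map. The sketch below encodes only the members named on the slide; the function name `path_to_all` is hypothetical, and placing the offices L. Chan and M. Wind under Vancouver is an assumption read off the flattened figure.

```python
# Child -> parent links: office -> city -> country -> region -> all
PARENT = {
    "L. Chan": "Vancouver", "M. Wind": "Vancouver",
    "Vancouver": "Canada", "Toronto": "Canada", "Frankfurt": "Germany",
    "Canada": "North_America", "Mexico": "North_America",
    "Germany": "Europe", "Spain": "Europe",
    "Europe": "all", "North_America": "all",
}


def path_to_all(member):
    """Walk up the hierarchy from a member to the top level 'all'."""
    path = [member]
    while path[-1] != "all":
        path.append(PARENT[path[-1]])
    return path


print(path_to_all("L. Chan"))
print(path_to_all("Frankfurt"))
```

Each step up this path is the mapping a roll-up along the location dimension would apply.
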
  • 25. Data Cube Measures: Three Categories
      • Distributive: if the result derived by applying the function to n aggregate values is the same as that derived by applying the function on all the data without partitioning
          • E.g., count(), sum(), min(), max()
      • Algebraic: if it can be computed by an algebraic function with M arguments (where M is a bounded integer), each of which is obtained by applying a distributive aggregate function
          • E.g., avg(), min_N(), standard_deviation()
      • Holistic: if there is no constant bound on the storage size needed to describe a subaggregate
          • E.g., median(), mode(), rank()
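
The three categories can be checked on a small example. The partitioned data below is made up for illustration; the sketch shows that sum() combines across partitions, that avg() can be rebuilt from two distributive aggregates (sum and count), and that combining per-partition medians does not in general give the true median.

```python
from statistics import median

# Hypothetical data, split into partitions as a warehouse might store it.
partitions = [[1, 2, 3], [10, 20], [7]]
everything = [x for part in partitions for x in part]

# Distributive: the sum of per-partition sums equals the sum over all data.
sum_of_sums = sum(sum(p) for p in partitions)

# Algebraic: avg is derived from a bounded number of distributive
# aggregates, here the total sum and the total count.
avg_from_parts = sum(sum(p) for p in partitions) / sum(len(p) for p in partitions)

# Holistic: the median of per-partition medians is not the true median here.
median_of_medians = median(median(p) for p in partitions)
true_median = median(everything)

print(sum_of_sums, sum(everything))
print(avg_from_parts, sum(everything) / len(everything))
print(median_of_medians, true_median)
```

This is why distributive and algebraic measures are cheap to precompute per cuboid, while holistic measures generally need access to the underlying data.
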
  • 26. Multidimensional Data
      • Sales volume as a function of product, month, and region
      • Dimensions: Product, Location, Time
      • Hierarchical summarization paths
          • Product: Industry > Category > Product
          • Location: Region > Country > City > Office
          • Time: Year > Quarter > Month or Week > Day
  • 27. A Sample Data Cube (figure)
      • Product: TV, PC, VCR, sum
      • Date: 1Qtr, 2Qtr, 3Qtr, 4Qtr, sum
      • Country: U.S.A., Canada, Mexico, sum
      • Highlighted cell: total annual sales of TVs in the U.S.A.
  • 28. Cuboids Corresponding to the Cube
      • 0-D (apex) cuboid: all
      • 1-D cuboids: product; date; country
      • 2-D cuboids: (product, date); (product, country); (date, country)
      • 3-D (base) cuboid: (product, date, country)
  • 29. Typical OLAP Operations
      • Roll up (drill-up): summarize data by climbing up a hierarchy or by dimension reduction
      • Drill down (roll down): reverse of roll-up, going from a higher-level summary to a lower-level summary or detailed data, or introducing new dimensions
      • Slice and dice: project and select
          • Slice: selection on one dimension, resulting in a subcube. E.g., Time = "Q1"
          • Dice: a subcube obtained by selecting on two or more dimensions. E.g., (Location = "Toronto" or "Vancouver") and (Time = "Q1" or "Q2")
      • Pivot (rotate): reorient the cube for visualization, e.g., 3D to a series of 2D planes
      • Other operations
          • Drill across: involving (across) more than one fact table
          • Drill through: through the bottom level of the cube to its back-end relational tables (using SQL)
  • 30. Fig. 3.10 Typical OLAP Operations
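
The operations above can be sketched on a tiny cube held in a Python dict. The cell values, the city-to-country map, and the function names are all hypothetical; a real OLAP server would work on precomputed cuboids rather than a dict of cells.

```python
from collections import defaultdict

# Hypothetical base cuboid: (city, quarter, item) -> dollars_sold
cube = {
    ("Toronto", "Q1", "TV"): 100, ("Toronto", "Q2", "TV"): 150,
    ("Vancouver", "Q1", "TV"): 80, ("Vancouver", "Q1", "PC"): 60,
    ("Frankfurt", "Q1", "PC"): 90,
}
CITY_TO_COUNTRY = {"Toronto": "Canada", "Vancouver": "Canada",
                   "Frankfurt": "Germany"}


def roll_up_location(cells):
    """Roll-up: climb the location hierarchy from city to country."""
    out = defaultdict(int)
    for (city, quarter, item), value in cells.items():
        out[(CITY_TO_COUNTRY[city], quarter, item)] += value
    return dict(out)


def slice_time(cells, quarter):
    """Slice: select on one dimension (time), leaving a 2-D subcube."""
    return {(c, i): v for (c, q, i), v in cells.items() if q == quarter}


def dice(cells, cities, quarters):
    """Dice: select on two or more dimensions, keeping all three axes."""
    return {k: v for k, v in cells.items()
            if k[0] in cities and k[1] in quarters}


def pivot(plane):
    """Pivot: swap the two axes of a 2-D plane."""
    return {(b, a): v for (a, b), v in plane.items()}


by_country = roll_up_location(cube)
q1 = slice_time(cube, "Q1")
sub = dice(cube, {"Toronto", "Vancouver"}, {"Q1", "Q2"})
rotated = pivot(q1)
print(by_country)
print(q1)
print(sub)
print(rotated)
```

Drill-down is the reverse of `roll_up_location`: it needs the finer-grained cells, which is why the base cuboid is kept.
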
  • 31. Starnet Query Model
      • Used for querying multidimensional databases
      • Consists of radial lines originating from a central point
      • Each line represents a concept hierarchy for a dimension
      • Each abstraction level in the hierarchy is called a footprint
  • 32. A Star-Net Query Model (figure)
      • Customer Orders: CONTRACTS, ORDER
      • Shipping Method: AIR-EXPRESS, TRUCK
      • Time: ANNUALLY, QTRLY, DAILY
      • Product: PRODUCT LINE, PRODUCT GROUP, PRODUCT ITEM
      • Organization: DIVISION, DISTRICT, SALES PERSON
      • Location: REGION, COUNTRY, CITY
      • Other dimensions shown: Customer, Promotion
      • Each circle is called a footprint
  • 33. Browsing a Data Cube
      • Visualization
      • OLAP capabilities
      • Interactive manipulation
  • 34. Data Warehouse Design and Usage
      • What goes into a data warehouse design?
      • How are data warehouses used?
      • How do data warehousing and OLAP relate to data mining?
  • 35. Design of Data Warehouse: A Business Analysis Framework
      • What can business analysts gain from having a data warehouse?
          • Competitive advantage
          • Productivity
          • Customer relationship management
          • Cost reduction
      • Effective data warehouse design requires understanding and analyzing business needs and constructing a business analysis framework
  • 36. Design of Data Warehouse: A Business Analysis Framework (contd.)
      • Four views regarding the design of a data warehouse
          • Top-down view: allows selection of the relevant information necessary for the data warehouse (information that matches current and future business needs)
          • Data source view: exposes the information being captured, stored, and managed by operational systems (modeled by an E-R model or CASE tools)
          • Data warehouse view: consists of fact tables and dimension tables, with precalculated totals and counts and information on source, date, and time
          • Business query view: sees the data in the warehouse from the end user's perspective
  • 37. Design of Data Warehouse: A Business Analysis Framework (contd.)
      • Building a data warehouse is a complex, long-term task that requires
          • Business skills
          • Technology skills
          • Program management skills
      • The implementation scope should be clearly defined. The goals of an initial data warehouse implementation should be
          • Specific
          • Achievable
          • Measurable
  • 38. Data Warehouse Design Process
      • Top-down, bottom-up, or a combination of both
          • Top-down: starts with overall design and planning (mature)
          • Bottom-up: starts with experiments and prototypes (rapid)
      • From a software engineering point of view
          • Waterfall: structured and systematic analysis at each step before proceeding to the next
          • Spiral: rapid generation of increasingly functional systems, with short turnaround time
  • 39. Data Warehouse Design Process (contd.)
      • Typical data warehouse design process
          • Choose a business process to model, e.g., orders, invoices, etc.
          • Choose the grain (atomic level of data) of the business process, e.g., individual transactions
          • Choose the dimensions that will apply to each fact table record. Typical dimensions are time, item, customer, etc.
          • Choose the measures that will populate each fact table record, e.g., dollars_sold or units_sold
  • 40. Data Warehouse Usage for Information Processing
      • Information processing
          • Supports querying, basic statistical analysis, and reporting
      • Analytical processing
          • Supports basic OLAP operations, including slice-and-dice, drill-down, roll-up, and pivoting
      • Data mining
          • Supports knowledge discovery by finding hidden patterns and associations, constructing analytical models, performing classification and prediction, and presenting the mining results
  • 41. From On-Line Analytical Processing (OLAP) to On-Line Analytical Mining (OLAM)
      • Integrates OLAP with data mining to uncover knowledge in multidimensional databases
      • Why online analytical mining?
          • High quality of data in data warehouses: a DW contains integrated, consistent, cleaned data
          • Available information processing structure surrounding data warehouses: ODBC, OLEDB, Web accessing, service facilities, reporting and OLAP tools
          • OLAP-based exploratory data analysis: mining with drilling, dicing, pivoting, etc.
          • On-line selection of data mining functions: integration and swapping of multiple mining functions, algorithms, and tasks
