[Figure: Benefit of BI — Reduce Action Time. Added value is lost between the operational transaction and the implemented action, across data access time, analysis time, decision time and implementation time. Source: Hackathorn, R.: Minimizing Action Distance (2003), http://www.tdan.com/view-articles/5132/]
Chapter 16: Business intelligence
Paul Mireault (HEC Montréal)
This chapter introduces the student to the topic of Business Intelligence (BI). BI comprises a set of tools whose purpose is to help decision makers make better decisions by presenting them with the pertinent data and allowing them to analyse it. To use BI properly, business users must understand the infrastructure set up by IT specialists.
16.1 Introduction
“Intelligence” is information gathered by a government or other institution to guide decisions and actions (Boyer, 2001). While many associate this definition with espionage, the act of covertly gathering such information, its use in the business world does not have the cloak-and-dagger connotation that is the fodder of espionage novels and movies. But its fundamental objective is the same: guiding decisions and actions.
Business Intelligence (BI) is a set of processes (data gathering,
data analysis), technologies (data warehouse), and presentation
tools (report generator, dashboard) used by organizations to
analyse data (either internal or external) in order to gain new
insight on their environment (customers, suppliers) and make
better decisions.
O2, a mobile phone company, uses BI to reduce its churn rate (i.e. clients leaving it for a competitor). It estimates that its churn rate of 30% costs over €270m per year; a reduction of 1% would then save €9m per year. Churn analysis identifies the characteristics and behaviour of clients on the threshold of leaving, and informs client managers ahead of time so they can offer incentives for remaining with the company. (ComputerWeekly.com, 2010)
Hallmark, a greeting card manufacturer, uses data from its
13 million loyalty program customers to figure out how to
engage customers all year, not only during holidays and special
occasions. (Computerworld.com, 2011)
The FBI is using BI to identify fraudulent housing transactions,
and to look for patterns suggesting the presence of identity
theft rings in given areas. (Computerworld.com, 2007)
Those are but a few examples of companies and organizations
using Business Intelligence to improve their profitability or
become more efficient.
Originally, BI was restricted to advanced analytical tools used
by analysts with degrees in data analysis. But now, BI’s
purview
encompasses information presentation tools as well, like
management reports and dashboards.
Analytical tools seek new knowledge and insight related to the
data amassed through the organization’s daily operations, with
the expectation that this new knowledge will help managers
think of new products, policies, or processes that benefit the
organization.
Presentation tools are designed to help managers make
decisions related to the current state of their operations. They
present key business indicators and compare them with either
previous values, showing trends and direction of changes, or
with similar indicators in other categories, showing distributions.
In the context of an ERP, presentation and analytical tools
present the information using the same naming conventions
that the user sees in his normal interactions with the ERP. Tools
closely linked to the inner workings of the ERP offer this
desired
characteristic thanks to their access to the ERP’s data dictionary
and standardized data structures.
But for users to be able to perform significant analyses, some important preparation work has to be done by IT specialists. They need to set up a proper working environment, called a Data Warehouse, which will not interfere with normal daily operations and, at the same time, will make it easy for users to manipulate data without being bothered by technical details.
This behind-the-scenes work is crucial to the success of a BI
implementation. We will explain concepts related to the design
of the Data Warehouse as well as its operation.
The first part of the chapter will present the business user’s
view
of a BI environment, and the second part will present the behind
the scenes work that has to be done by the technical developers.
Mireault – Business intelligenceCHAPTER 16
Readings on Enterprise Resource Planning
Preliminary Version - send comments to [email protected] 209
16.2 Using Business Intelligence
Business Intelligence tools fall into two broad categories. The first category consists of analytical tools that use a wide range of mathematical methods to analyse data and, hopefully, discover new information. The second category consists of presentation tools that present data, basic or aggregated, in a visual format that lets users manipulate it themselves by changing the way it is presented, hopefully leading to new insight.
Analytical tools are usually designed for data analysts who have been schooled in advanced mathematical techniques. Two major programs are SAS Data Miner and SPSS Clementine.
On the other hand, presentation tools are designed for the more occasional user: managers and directors who don't use them all day. They can be as simple as pivot tables in a spreadsheet. They can also be easy to interact with but more complicated to develop, like dashboards.
16.2.1 Analytical Tools
16.2.1.1 Process
The analyst must understand the origin of the data he analyses. That may seem obvious, but different data sources may use
homonyms to represent slightly different concepts, which could
then affect the analysis itself. For example, in one business unit
the term Client may refer to somebody who has made at least
one purchase, but in another business unit the term Client may
include prospective clients. Using the same term for slightly
different concepts will lead to problems when we analyse data
from the two business units.
Units of measure are a common cause of incompatible data. For example, Company A buys Company B and merges its historical data with its own in the data warehouse. But Company A counts the boxes sold in its QTY SOLD field, and Company B counts the number of items in the boxes in its QTY SOLD field. The result would be much higher quantities in Company B's data.
While it may be tempting to use all the available data to build a model, Berry & Linoff (2004) recommend partitioning the available data into three sets: a training set to develop the model, a validation set “to adjust the initial model to make it more general and less tied to the idiosyncrasies of the training set”, and a test set to evaluate the model's performance with data unused during development. Analysis packages, like the aforementioned SAS Data Miner and SPSS Clementine, will easily partition the whole data set into the recommended sets.
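The three-way partition can be sketched in a few lines of Python. The 60/20/20 proportions, the `partition` helper and the fixed seed are illustrative choices, not something prescribed by the chapter:

```python
import random

def partition(data, train=0.6, valid=0.2, seed=42):
    """Randomly partition a data set into training, validation and test sets."""
    rows = list(data)
    random.Random(seed).shuffle(rows)          # reproducible shuffle
    n = len(rows)
    n_train = int(n * train)
    n_valid = int(n * valid)
    return (rows[:n_train],                    # used to fit the model
            rows[n_train:n_train + n_valid],   # used to generalize it
            rows[n_train + n_valid:])          # held out for final evaluation

train_set, valid_set, test_set = partition(range(100))
print(len(train_set), len(valid_set), len(test_set))  # 60 20 20
```

The shuffle matters: taking the first 60% of a file sorted by date would bias the training set.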
16.2.1.2 Techniques
There are many techniques to analyse data, too numerous to list
here. And new ones are invented regularly. We will just present
a few of them.
There are two categories of analysis. One group of techniques aims to analyse data and find some sort of structure to gather understanding. This group uses traditional statistical analysis tools, like hypothesis tests and confidence intervals, as well as non-statistical methods like ABC analysis.
The other group aims to predict events. Traditional linear regression is an example many readers already know.
Regression Analysis defines a dependent variable as a function of one or more independent variables. This technique can only be used with quantitative variables measured on a scale. Linear regression is taught in most undergraduate programs and is useful when the relationship between the dependent variable and each independent variable is linear. When the relationship between the dependent variable and some independent variable is not linear, the data analyst can use more advanced non-linear regression techniques.
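For one independent variable, ordinary least squares reduces to two closed-form expressions. The sketch below fits y = a + b·x; the data (advertising spend versus sales) and the `linear_regression` helper are made up for the example:

```python
from statistics import mean

def linear_regression(x, y):
    """Ordinary least squares for y = a + b*x (one independent variable)."""
    mx, my = mean(x), mean(y)
    # slope: covariance of x and y divided by variance of x
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) \
        / sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx          # intercept: the line passes through the means
    return a, b

x = [1, 2, 3, 4, 5]          # hypothetical advertising spend
y = [52, 54, 57, 60, 62]     # hypothetical sales
a, b = linear_regression(x, y)
print(round(a, 2), round(b, 2))  # 49.2 2.6
```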
ABC Analysis is a technique that classifies items according to their relative importance and produces three groups. It is based on the Pareto Principle, also called the 80-20 rule, which is commonly used in business: 80% of your revenues come from 20% of your clients, and 80% of your sales are made by 20% of your products. The actual values 80 and 20 must be taken lightly. In the ABC Analysis, we divide our data into three groups. Group A contains the smallest number of items representing a total of about 80% of the measured value. Group B contains the smallest number of the remaining items representing a total of about 15% of the measured value. Group C contains the leftover items.
Table 1 - Sales Data for ABC Analysis

City      Sales
Berlin     1517
Boston    72099
Madrid    34302
Montreal  10328
New York   1915
Paris      8974
Toronto    1284
For example, consider the sales data shown in Table 1. We then calculate the sales percentages and order the data in decreasing sales order, as shown in Table 2.
Table 2 - Sales Data in Decreasing Sales Order

City      Sales   Pct Sales  Cum Pct Sales  Group
Boston    72099   55.28%      55.28%        A
Madrid    34302   26.30%      81.58%        A
Montreal  10328    7.92%      89.50%        B
Paris      8974    6.88%      96.38%        B
New York   1915    1.47%      97.85%        C
Berlin     1517    1.16%      99.02%        C
Toronto    1284    0.98%     100.00%        C
We can then assign Boston and Madrid to Group A, Montreal and Paris to Group B, and New York, Berlin and Toronto to Group C.
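The grouping rule can be sketched directly on the sales data of Table 1. The `abc_groups` helper and the 80%/95% cumulative cut-offs are illustrative choices (an item that crosses a threshold stays in the group it crosses, which reproduces the assignments of Table 2):

```python
sales = {"Berlin": 1517, "Boston": 72099, "Madrid": 34302, "Montreal": 10328,
         "New York": 1915, "Paris": 8974, "Toronto": 1284}   # Table 1

def abc_groups(values, a_cut=0.80, b_cut=0.95):
    """Classify items into A/B/C groups by cumulative share of the total."""
    total = sum(values.values())
    groups, cum = {}, 0.0
    for name, v in sorted(values.items(), key=lambda kv: -kv[1]):
        # the item that crosses a threshold still belongs to the group it crosses
        groups[name] = "A" if cum < a_cut else "B" if cum < b_cut else "C"
        cum += v / total
    return groups

print(abc_groups(sales))
```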
Associative Analysis refers to techniques that try to find associations in the data. A popular associative analysis technique is Market Basket Analysis. “Market basket analysis provides insight into the merchandise by telling us which products tend to be purchased together and which are more amenable to promotions.” (Berry & Linoff, 2004, p. 287)
Knowing which products are often bought together, like a clothes washer and dryer, or in sequence, like a flight, a car rental and a hotel room, can be valuable information for marketing purposes.
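The core of a market basket analysis is counting how often products co-occur in the same transaction. A minimal sketch follows; the `baskets` data and product names are invented for the example:

```python
from collections import Counter
from itertools import combinations

# Hypothetical sales tickets; each inner list is one market basket
baskets = [["washer", "dryer", "soap"],
           ["washer", "dryer"],
           ["soap", "towels"],
           ["washer", "dryer", "towels"]]

pair_counts = Counter()
for basket in baskets:
    # count every unordered pair of distinct products in the basket
    pair_counts.update(combinations(sorted(set(basket)), 2))

print(pair_counts.most_common(1))  # the pair most often bought together
```

Real implementations add support and confidence measures on top of such counts, but the co-occurrence table is the starting point.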
A Decision Tree is a classification model that divides a population into small homogeneous groups with respect to a categorical variable.
A simple categorical variable may be Membership Renewed, which may take the values Yes or No. A decision tree could be built using the clients' demographic data and all the service calls and complaints to determine the probability that an individual client may not renew his subscription. Clients who are identified as having the highest probability of non-renewal may then be targeted with special incentives to reduce that probability.
The major publishers of data analysis software are SAS and SPSS, now a division of IBM.
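A full decision tree chooses splits automatically, but a single split already shows the idea: partition the clients on one attribute and estimate the non-renewal probability in each leaf. The client records, the complaint threshold and the helper name below are all hypothetical:

```python
from collections import defaultdict

# Hypothetical client records: (number of complaints, membership renewed?)
clients = [(0, True), (0, True), (1, True), (1, False),
           (3, False), (4, False), (0, True), (2, False)]

def non_renewal_rate_by_split(records, threshold):
    """One decision-tree split: divide clients on a complaint threshold
    and estimate the probability of non-renewal in each leaf."""
    leaves = defaultdict(list)
    for complaints, renewed in records:
        leaves[complaints >= threshold].append(renewed)
    rates = {}
    for above, renewals in leaves.items():
        label = f"complaints >= {threshold}" if above else f"complaints < {threshold}"
        rates[label] = sum(1 for r in renewals if not r) / len(renewals)
    return rates

print(non_renewal_rate_by_split(clients, 2))
```

A tree-building algorithm would repeat this on each leaf, each time picking the attribute and threshold that make the leaves most homogeneous.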
16.2.1.3 Outliers
Outliers are cases whose values are unnaturally far from the majority of the other cases. While it is not unusual to have values in the extremities of a distribution, there are situations where some abnormally extreme values occur and can affect the data analysis we want to perform.
For example, a small jewellery store may have average daily sales of 10 000$, with daily sales extremes of 50 000$ happening three or four times a year. This week, it offers a 100$ rebate on all purchases over 1 000$. A billionaire comes in and buys 1 000 000$ worth of rings and necklaces. Now, an analysis of the rebate's effectiveness may show a great impact. But is it realistic? First of all, we may doubt that the billionaire was attracted by the 100$ rebate. And if the actual sales that week were 1 020 000$, most people would conclude that the rebate did not have any impact.
For this reason, many data analysts start their analysis by looking for outliers and removing them from their data sets. An easy way to identify data points that can be considered outliers is to use the mean and the standard deviation (values computed by any statistical analysis program, measuring, respectively, the central point and the dispersion of the data set). Values that are within 3 standard deviations of the mean are usually the result of normal randomness. Values that are outside that range should be examined more closely. You should remove only values for which you can give a proper explanation, like in the example above. Data analysts can use more advanced outlier detection techniques (Grubbs, 1969; Rousseeuw & Leroy, 1996).
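The 3-standard-deviation rule can be sketched as follows. The `find_outliers` helper and the sales figures are illustrative, and the sketch uses the population standard deviation; with a more advanced test the flagged set could differ:

```python
from statistics import mean, pstdev

def find_outliers(values, k=3):
    """Flag values more than k standard deviations away from the mean."""
    m, s = mean(values), pstdev(values)
    return [v for v in values if abs(v - m) > k * s]

# Hypothetical daily sales, with one extreme day (the billionaire's visit)
daily_sales = [10_000] * 30 + [12_000] * 5 + [1_000_000]
print(find_outliers(daily_sales))  # [1000000]
```

Note that the extreme value itself inflates the standard deviation, which is one reason analysts turn to the more robust techniques cited above.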
16.2.2 BI Presentation Tools
Managers often need to visualize data in summary representations like tables and charts to help them answer questions or identify problems in their area. They formulate their requests for information in ways that are very similar.
Here are a few typical requests:
• I want to see sales totals by month.
• I want to see total production per product per month.
• I want to see the monthly average number of defects per
employee.
• I want the year-to-date sales by week, for this year and the
similar periods last year.
Those requests have an intrinsic structure that becomes apparent when you analyse them. First, there is the thing the manager needs to see: sales amount, production quantity, and the number of defects. We call those things measures; they are usually numerical data aggregated in some way: total, average, and running total.
Secondly, there are the grouping terms: month, employee, and week. We call them attributes; they can be quantitative or qualitative data. They are usually recognized by the use of the words by and per.
Figure 1 – Multidimensional Data Model
Attributes that pertain to a single concept are grouped together to form a dimension. Thus, the attributes Year, Month, Date, and Weekday form a Time dimension. Attributes that form a logical sequence from the broadest, with a small number of values, to the most specific, with the largest number of values, form a hierarchy.
Measures, dimensions and hierarchies are illustrated in a Multidimensional Data Model (Golfarelli & Rizzi, 2009), as shown in Figure 1. This example indicates that we are interested in three measures related to sales: Quantity Sold, Price, and Number of Orders. We have three dimensions (Time, Client and Product) with corresponding hierarchies, and we also have two attributes (Weekday and Brand) that are not part of any hierarchy but are of interest to the data analysts. There are natural hierarchies, like Time, but you can create artificial hierarchies to suit your needs. An American retailer could create a hierarchy Region, State and City, where Region can have the value Northeast or Midwest. An international retailer could have a hierarchy Region, Country and City, where Region can have the value Europe or South Pacific. In these examples, the Region dimension has different meanings.
Time is a special case that needs careful attention. In our everyday activities we use homonyms to describe time concepts that are different. Consider the following expressions:
• What day is it today? Oh! It's Tuesday.
• I'm going on vacation in three days.
• I want the total sales per day from January 10, 2011 to January 14, 2011.
In the first case, day refers to a generic day of the week. In the second case, day refers to a period of time. And in the last case, day refers to specific dates. The time hierarchy deals with specific dates.
16.2.2.1 Queries
Queries are the basis of all data extractions. A query is a request to extract data from a database system. Queries need to specify many details about the information that the user wants: which data elements (called fields), where they are in the database (in which tables), what conditions they must satisfy (called criteria, which can be used to specify a date range, for example), what calculations need to be performed (for example, multiplying price by quantity to get an amount), and what groupings are wanted (for example, group by product and calculate the sum of the amounts sold).
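The query components just listed (fields, tables, criteria, calculations, groupings) can be tried out with Python's built-in sqlite3 module. The `sales` table and its rows are invented for the example:

```python
import sqlite3

# In-memory database with a hypothetical sales table
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (product TEXT, qty INTEGER, price REAL)")
con.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                [("Gizmo", 2, 10.0), ("Widget", 1, 25.0), ("Gizmo", 3, 10.0)])

# Fields, a calculation (qty * price) and a grouping, as described above
for row in con.execute("""SELECT product, SUM(qty * price) AS amount
                          FROM sales
                          GROUP BY product
                          ORDER BY amount DESC"""):
    print(row)
```

Adding a WHERE clause to the SELECT would supply the criteria component, for example a date range on an order-date field.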
Queries are used in dashboards to obtain data that will be used to calculate indicators (see section 16.2.2.2). They are used in reports to provide printed data. They are also used to construct the fact table that is the base of a data cube (see section 16.2.2.4).
With many data analysis programs, queries can be specified with a point-and-click method, where the analyst selects data tables and data fields, can indicate selection criteria, and can perform some basic mathematical calculations. For example, you could ask for the total sales per salesperson per month. Figure 2 shows how this query is written with Microsoft Access, and Figure 3 shows its result.
Figure 2 - Simple Query with Graphical Interface
Figure 3 - Result of the Simple Query
While simple to use, this approach usually cannot perform sophisticated extractions. It would be hard to get the top selling product per month, which is a simple request in appearance (see Figure 4).
If an analyst needs to extract data using complex criteria, he usually has access to SQL (Structured Query Language), a language used to write queries. While programmers use SQL to develop operational database information systems, it can also be used to extract data from a data warehouse. The SQL query used to produce the result shown in Figure 4 is shown in Figure 5. Explaining the structure of SQL is beyond the scope of this chapter; the interested reader can consult (Pratt & Last, 2009) for an introduction and (Celko, 2011) for advanced SQL programming.
SELECT to_char (o1.order_date, 'YYYY-MM') AS month,
       product_id,
       SUM (qty * price) AS sales
FROM order o1
     JOIN order_line USING (order_id)
GROUP BY to_char (order_date, 'YYYY-MM'), product_id
HAVING SUM (qty * price)
    >= ALL (SELECT SUM (qty * price)
            FROM order o2
                 JOIN order_line USING (order_id)
            WHERE to_char (o2.order_date, 'YYYY-MM') = to_char (o1.order_date, 'YYYY-MM')
            GROUP BY product_id)
ORDER BY month
Figure 5 - Complex Query
16.2.2.2 Dashboard
Dashboards are information presentation tools designed to help decision makers see at a glance all the pertinent information related to their domain. Like a car dashboard, with which we are familiar, a management dashboard will show key indicators managers can use to evaluate the state of their business unit. For a car, the usual key indicators are speed, headlight status, oil pressure, remaining fuel, turn signal activation, etc. In a management context, many key indicators come from financial statements, like net profit, but the manager may also want indicators specific to his needs. For example, a purchasing manager may want to see the delivery delay for each supplier.
Like a car dashboard, a management dashboard may represent many indicators in a compact format. A well designed dashboard will present information in a way that is most appropriate for the decision maker: line charts, pie charts, bar charts, tabular data with colour coded indicators. Geographical information may be superimposed on a map, showing it in a way that tables and charts can't do justice (see Figure 6). A dashboard can also be interactive and present data according to where the user clicks, as shown in Figure 7.
Figure 6 - Representation of Data on a Map
Figure 7 - Dashboard with Dynamic Charts
Dashboard development tools are offered by the major
data warehouse management systems. SAP offers SAP
BusinessObjects Xcelsius Enterprise, IBM has Cognos Business
Intelligence and Oracle has Analytics Workspace Manager.
Dashboard design is a balancing act between simplicity and complexity. On one hand, we want to present information that is simple to understand; on the other hand, the underlying information is complex.
The data elements shown in a dashboard are extracted from the data warehouse with queries. There are situations where data elements may come from the operational database, but those are mostly limited to dashboards used by operational managers. In such situations, care must be taken that the underlying queries do not adversely affect the operations themselves: a complex query may slow down the operational system.
Alerts are associated with indicators. The dashboard designer must set up threshold levels for all the desired alert warnings. For example, a low cash reserve may be accompanied by a red light, a medium-sized cash reserve by a yellow light, and a high cash level by a green light. Figure 6 illustrates states with low revenues in red.
Some warnings may be associated with both low and high values: too much stock in inventory may need the same attention from the warehouse manager as too little stock.
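The threshold logic behind such alerts can be sketched in a few lines. The `alert_colour` helper and the cash thresholds are illustrative:

```python
def alert_colour(value, low, high):
    """Map an indicator value to a traffic-light alert given two thresholds."""
    if value < low:
        return "red"      # e.g. a low cash reserve needs immediate attention
    if value < high:
        return "yellow"
    return "green"

print(alert_colour(40_000, low=50_000, high=150_000))   # red
print(alert_colour(200_000, low=50_000, high=150_000))  # green
```

A two-sided indicator like stock level would use two such bands, flagging both the too-low and too-high ranges.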
16.2.2.3 Reports
Managers have used reports since the beginning of computer-
ized information systems. Until recently, reports were printed
and visually examined by managers to see if anything needed
attention. Modern reports are now produced as files that can
be analyzed further by the manager using a spreadsheet like
Microsoft Excel.
Reports are designed, usually by systems analysts, according to
the information needed by different users.
The simplest report is an extraction of the basic data from the database, like the list of all orders received yesterday, or the list of products in the warehouse with their current stock.
Reports can be produced at regular intervals, according to the user's needs. The inventory manager may want to see the stock level report every day, and the VP of Sales may want to see the regional sales report every month.
Occasional or alert reports are produced only when a predefined situation occurs. Thus, the Production VP may receive a special report when more than 5% of raw materials have a level corresponding to less than 3 days of production. It could indicate that production may cease if those raw materials are not received soon. An alert report is, by itself, important information that can lead to action. The design of the report includes not only the information content but also the event definition, which must balance the reduction of false alerts against the reaction time needed to act on the problem.
Figure 8 - MDM of Cube Data
Reports often combine detailed information and different levels
of aggregation. For example, the inventory report containing the
list of every product in the warehouse, along with its quantity
and its value, may group products by product type or by brand,
and compute a subtotal value by product type or brand.
Such reports are useful for a visual examination of their content. But their structure needs to be defined in advance, and any variation needs to be programmed by the IS specialists.
The major reporting packages are SAP Crystal Reports and IBM Cognos Business Intelligence Query and Reporting.
16.2.2.4 Data Cubes and Pivot Tables
Pivot tables are the logical evolution of reports, with the major
improvement being that the user can easily redefine them. They
can present detailed as well as aggregate information.
A data cube is a representation of a fact table presenting
measures organized by dimensions and hierarchies. While a
cube is limited to three dimensions, the underlying data table
may have more than three dimensions. In that case, we call it
a hyper-cube. Its full visual representation becomes impossible
with the limitations of two-dimensional sheets of paper and
computer monitors.
The data cube we will use in our example comes from the
Multidimensional Data Model of Figure 8, and its data is shown
in Table 3.
Figure 9 presents the data cube corresponding to the data
presented in Table 3.
Figure 9 - Data Cube
Pivot tables are the visual representation of projections of the hyper-cube on one or two dimensions, called the vertical dimension and the horizontal dimension. Pivot tables try to make the most of those two physical dimensions by allowing the user to combine more than one cube dimension in the table's vertical or horizontal dimensions.
There are a few basic operations that we can perform on pivot tables.
Pivoting
When we interchange the dimensions among its axes, we are pivoting the table. The data shown in the table does not change, but the user's view and perception are different. A different view may provide different insight into the data.
Slicing
Slicing is the act of selecting a specific value for one dimension. For example, Figure 10 shows the slice for the product Gizmo.
Figure 10 - Slice for Product Gizmo
Dicing
Dicing is a generalization of slicing, where we choose more than one specific value for one or more dimensions. For example, Figure 11 shows the sales for Q1 and Q2 for Montreal and Boston.
Figure 11 - Dice for Q1 and Q2, Montreal and Boston
Drill Down & Roll Up
Drilling down and rolling up are operations that use a dimension's hierarchy. We can roll up from the quarters to the years in the Period dimension, and from the cities to the regions in the City dimension. The table then contains less detailed data, as illustrated in Figure 12. We could also drill down from the cities to the neighbourhoods (if we had access to more detailed information). The table would then contain more detailed data.
Figure 12 - Roll Up and Drill Down
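The cube operations above can be sketched on a toy cube stored as a Python dictionary keyed by (product, city, quarter). The cube data and the helper names (`slice_`, `dice`, `roll_up`) are invented for the example:

```python
# A tiny cube: sales indexed by (product, city, quarter) — hypothetical data
cube = {("Gizmo", "Montreal", "Q1"): 10, ("Gizmo", "Boston", "Q1"): 20,
        ("Gadget", "Montreal", "Q2"): 15, ("Gizmo", "Montreal", "Q2"): 5}

def slice_(cube, product):
    """Slicing: fix a single value on one dimension."""
    return {(c, q): v for (p, c, q), v in cube.items() if p == product}

def dice(cube, cities, quarters):
    """Dicing: keep several values on one or more dimensions."""
    return {k: v for k, v in cube.items() if k[1] in cities and k[2] in quarters}

def roll_up(cube, quarter_to_year):
    """Rolling up: aggregate along a hierarchy (here, quarters to years)."""
    out = {}
    for (p, c, q), v in cube.items():
        key = (p, c, quarter_to_year[q])
        out[key] = out.get(key, 0) + v
    return out

print(slice_(cube, "Gizmo"))
print(roll_up(cube, {"Q1": "2011", "Q2": "2011"}))
```

Pivoting, the remaining operation, would only change which key components map to rows and columns of the display; the values are untouched.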
16.3 Infrastructure supporting BI
Performing Business Intelligence analyses can be successful only if you have the data available for those analyses. “You can't use data that you don't have,” might have said Yogi Berra1. The data used for BI must have a source and it must be stored somewhere.
This section will explain the technology infrastructure behind the use of BI, along with the behind-the-scenes work that must be performed regularly to maintain a viable BI initiative. While this work is usually done by an organization's IT department, BI analysts must be familiar with it because of its impact on data quality. Ideally, BI analysts should participate in the design of the infrastructure and its processes.
The major data warehouse vendors are Teradata, Oracle, IBM
InfoSphere, and SAP Business Information Warehouse.
16.3.1 Concept
An organization's information systems are the engine that makes it run smoothly. In the course of day-to-day operations, an organization records information that originates externally, like client orders, or is produced internally, like product manufacturing data, and that is then used by different departments to perform the required processes.
For example, a client’s order is recorded in the database by the
order-entry system. That order’s data, along with other orders,
is used by the manufacturing system to produce ordered
products and to record product availability in the database.
Then, the logistics system uses client orders and product avail-
ability to schedule and execute deliveries, and record delivery
data. Finally, the accounting system uses the delivery data to
produce and send invoices to clients, recording invoice data in
the database.
This is how information systems functioned, for decades, before
the advent of Business Intelligence.
The operational database is not well suited for BI applications. Because it is designed for efficiency and performance, data analysts find that it hinders their ability to analyze data properly. Even though it usually enforces business rules and integrity constraints, an operational database does not have the same data quality requirements needed by data analysts. Typos in names will not affect operations, but they may have an impact on some analyses.
Also, an operational database is not designed to record the
evolution of data. When a client changes his address, the new
address usually overwrites the old address. This may affect
data analysis that now associates all past sales to the new
neighbourhood.
When no longer needed, data from an operational database
can be archived and purged to reclaim disk space for current
operations. The data analyst then loses precious data.
Finally, the data analysts can perform analyses that can slow
the operational database down to a crawl by overwhelming
the database management system with resource consuming
queries. For most organizations, even a small slowdown is not
tolerable.
16.3.2 Data Sources
The data analyst may need data that is not in the operational database. For example, when analysing sales data for ice cream, it may be interesting to correlate it with each day's maximum temperature. Such data comes from external sources. (Consequently, the operational database is an internal source.)
External sources have to be chosen for their reliability and official status. You should not choose a random exchange rate on the Internet, but rather use your bank or your government's central bank. There are also practical considerations to take into account: if your bank publishes exchange rates with a 6-month lag, then you might look for a source with fresher data.
External data sources may be free or may need to be purchased. Usually, weather information can be obtained for free from most governments' web sites. But you have to purchase financial data from Bloomberg and market data from Nielsen.
16.3.3 Warehouse
Those reasons have led to the development of data warehouses.
A data warehouse is a type of database better suited for business
intelligence applications, using data structures designed for
data analysis rather than for operational systems.
Because data warehouses are huge, some organizations
create restricted views of the data for different users. These
views are called data marts, and they are usually accessible by
specific business areas, like marketing, manufacturing, human
resources, etc.
1 Yogi Berra was the manager of the New York Mets and the
New York Yankees baseball teams in the 1970’s and the 1980’s.
He is known for uttering truisms like “It
ain’t over till it’s over” and “You can observe a lot by
watching.” (But he never said anything about data.)
16.3.3.1 Star Schema
In a star schema, the measures are stored in a fact table, and each of its dimensions is linked to a dimension table, describing in detail each of its values.
The chapter’s Appendix illustrates a sales information fact
table (Table 4) with measures for the number of sales made
(NbSales), the total quantity sold (QtySold), the total amount
sold (AmountSold), aggregated for the dimensions ProductID,
PeriodID, and CityID.
The source of this fact table is the detailed sales transactions of
the operational database. Figure 13 illustrates the correspond-
ing star schema.
16.3.3.2 Snowflake Schema
The snowflake schema is an extension of the star schema that represents hierarchies. The fact table is built with the most specific dimension of a hierarchy. Then, each successive dimension of the hierarchy is linked to the previous one.
To build a snowflake schema from the star schema of Figure 13, we transform the City Dimension Table shown in Table 5 into a revised City Dimension Table (Table 8), a Country Dimension Table (Table 9), and a Region Dimension Table (Table 10). We also transform the Product Dimension Table of Table 7 into a revised Product Dimension Table (Table 11) and a Category Dimension Table (Table 12). The resulting snowflake schema is shown in Figure 14.
Both schemas are functionally identical. Choosing one or the other is a matter of trading redundancy for complexity. In a star schema, we store redundant information: the fact that the country USA is in the region North America is repeated for each city in the warehouse, which may be thousands of times. The snowflake schema saves storage space, but adds a layer of complexity whenever we need to see the Region field: the query must follow an extra link.
The business manager does not have to deal with this decision: reports and dashboards use queries that are preprogrammed by the IT specialists. The data analyst who writes his own SQL queries will need to know the structure chosen by the data warehouse designer.
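The extra links a snowflake query must follow can be demonstrated with Python's sqlite3 module. The table layout below is a minimal sketch of a Region–Country–City hierarchy with invented data, not the chapter's actual Tables 8 to 10:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Snowflake: each level of the hierarchy is its own table
CREATE TABLE region  (region_id INTEGER PRIMARY KEY, region_name TEXT);
CREATE TABLE country (country_id INTEGER PRIMARY KEY, country_name TEXT,
                      region_id INTEGER REFERENCES region);
CREATE TABLE city    (city_id INTEGER PRIMARY KEY, city_name TEXT,
                      country_id INTEGER REFERENCES country);
CREATE TABLE sales_fact (city_id INTEGER REFERENCES city, amount REAL);

INSERT INTO region  VALUES (1, 'North America');
INSERT INTO country VALUES (1, 'USA', 1), (2, 'Canada', 1);
INSERT INTO city    VALUES (1, 'Boston', 1), (2, 'Montreal', 2);
INSERT INTO sales_fact VALUES (1, 100.0), (2, 50.0), (1, 25.0);
""")

# Reaching the Region field means following every link of the hierarchy
for row in con.execute("""SELECT region_name, SUM(amount)
                          FROM sales_fact
                          JOIN city    USING (city_id)
                          JOIN country USING (country_id)
                          JOIN region  USING (region_id)
                          GROUP BY region_name"""):
    print(row)
```

In the star variant, the city table would carry the country and region names directly, so the same total would need a single join, at the cost of repeating 'North America' on every city row.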
Figure 13 - Star Schema