Business Intelligence: Multidimensional Analysis

Business Intelligence
Michael Lamont, ’12
lamont@post.harvard.edu

The Analysis Gap

Soda Example
Products: Cola, Cherry, Grape, Lemon-Lime
Cities: Munich, Frankfurt, Cologne, Berlin

Soda Example
Time          $ Sales
Q3            $16,000
Q4            $16,000
Total         $32,000

Product       $ Sales
Cola           $8,000
Cherry         $8,000
Grape          $8,000
Lemon-Lime     $8,000
Total         $32,000

Geography     $ Sales
Munich         $8,000
Frankfurt      $8,000
Cologne        $8,000
Berlin         $8,000
Total         $32,000

Soda Example
                 Munich  Frankfurt    Cologne     Berlin      Total
Q3
  Cola              $ -        $ -     $2,500     $1,500     $4,000
  Cherry            $ -        $ -     $2,000     $2,000     $4,000
  Grape          $1,000     $3,000        $ -        $ -     $4,000
  Lemon-Lime     $2,000     $2,000        $ -        $ -     $4,000
  Total Q3       $3,000     $5,000     $4,500     $3,500    $16,000
Q4
  Cola           $4,000        $ -        $ -        $ -     $4,000
  Cherry         $1,000     $3,000        $ -        $ -     $4,000
  Grape             $ -        $ -     $1,500     $2,500     $4,000
  Lemon-Lime        $ -        $ -     $2,000     $2,000     $4,000
  Total Q4       $5,000     $3,000     $3,500     $4,500    $16,000
Total            $8,000     $8,000     $8,000     $8,000    $32,000
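
The contrast between the flat one-dimensional summaries and the uneven cross-tab can be checked with a minimal Python sketch. The `cells` dictionary restates the non-empty cells of the soda cross-tab; the helper name `total_by` is an illustrative assumption, not part of any OLAP product.

```python
from collections import defaultdict

# Non-empty cells of the soda cross-tab: (quarter, product, city) -> dollar sales.
cells = {
    ("Q3", "Cola", "Cologne"): 2500, ("Q3", "Cola", "Berlin"): 1500,
    ("Q3", "Cherry", "Cologne"): 2000, ("Q3", "Cherry", "Berlin"): 2000,
    ("Q3", "Grape", "Munich"): 1000, ("Q3", "Grape", "Frankfurt"): 3000,
    ("Q3", "Lemon-Lime", "Munich"): 2000, ("Q3", "Lemon-Lime", "Frankfurt"): 2000,
    ("Q4", "Cola", "Munich"): 4000,
    ("Q4", "Cherry", "Munich"): 1000, ("Q4", "Cherry", "Frankfurt"): 3000,
    ("Q4", "Grape", "Cologne"): 1500, ("Q4", "Grape", "Berlin"): 2500,
    ("Q4", "Lemon-Lime", "Cologne"): 2000, ("Q4", "Lemon-Lime", "Berlin"): 2000,
}

def total_by(position):
    """Sum sales along one dimension (0 = time, 1 = product, 2 = geography)."""
    totals = defaultdict(int)
    for key, sales in cells.items():
        totals[key[position]] += sales
    return dict(totals)

by_time = total_by(0)
by_product = total_by(1)
by_city = total_by(2)

print(by_time)      # every quarter has the same total
print(by_product)   # every product has the same total
print(by_city)      # every city has the same total
print(min(cells.values()), max(cells.values()))  # individual cells still vary
```

Each single-dimension view reports identical totals, so the variation only becomes visible once the dimensions are crossed.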

Multidimensional Analysis
Intuitive way for people with business training to analyze data
Natural
Easy
Effective
Difficult to get data into a format that supports multidimensional analysis

Operational Databases
Where did our data come from?
Lots of individual shoppers buying a soda
Each transaction stored in database designed to store checkout transactions
Operational Database: supports the day-to-day operations of a company
Data in operational databases can’t easily be analyzed

Core operational database functionality:
Gather data
Update data
Store data
Retrieve data
Archive data

OLTP: Online Transaction Processing

OLTP Example
Buying toothpaste at Target:
1. You place toothpaste on conveyor belt
2. Cashier swipes barcode over POS scanner
3. POS system looks up price of toothpaste
4. POS totals cost of transaction + tax
5. POS prompts for payment
6. You swipe debit card and enter PIN
7. POS system transfers cost of toothpaste from your bank account to Target’s account
8. POS generates receipt and cashier bags purchase
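
The steps above can be sketched in Python as a single rule-driven transaction. Everything here is an assumption made for illustration: the price list, the tax rate, the account names, and the use of an in-memory SQLite database as a stand-in for the bank. It is not how any real POS or payment network is implemented.

```python
import sqlite3

# Hypothetical price list and tax rate; a real POS would look these up elsewhere.
PRICES_CENTS = {"toothpaste": 399}
TAX_RATE = 0.0625

def open_bank():
    """Create an in-memory stand-in for the bank's account database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (owner TEXT PRIMARY KEY, cents INTEGER NOT NULL)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                     [("shopper", 5000), ("store", 0)])
    conn.commit()
    return conn

def total_due(item):
    """Steps 3-4: look up the price and add tax, in whole cents."""
    price = PRICES_CENTS[item]
    return price + round(price * TAX_RATE)

def transfer(conn, payer, payee, cents):
    """Step 7: move the money as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE accounts SET cents = cents - ? WHERE owner = ?",
                         (cents, payer))
            (left,) = conn.execute("SELECT cents FROM accounts WHERE owner = ?",
                                   (payer,)).fetchone()
            if left < 0:
                raise ValueError("insufficient funds")  # business rule
            conn.execute("UPDATE accounts SET cents = cents + ? WHERE owner = ?",
                         (cents, payee))
        return True
    except ValueError:
        return False

def balances(conn):
    """Return the current balance of every account, in cents."""
    return dict(conn.execute("SELECT owner, cents FROM accounts"))

bank = open_bank()
due = total_due("toothpaste")
approved = transfer(bank, "shopper", "store", due)
print(due, approved, balances(bank))
```

If the payer cannot cover the amount, the rollback leaves both balances untouched, which is the "processes a transaction according to rules" behavior in miniature.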

Key OLTP Characteristics
Processes a transaction according to rules
Performs all elements of a transaction in real time
Continually processes multiple transactions

OLTP Systems
OLTP systems are everywhere:
Order tracking
Invoicing
Credit card processing
Retail POS
Banking
Airline reservations
OLTP is optimized for managing low-level business data

OLTP Systems
OLTP systems can be used to answer transactional questions
Raw transactional data not really useful for business intelligence
OLTP systems can’t be used to answer most analysis questions
Can’t search, sort, & summarize large numbers of records
Can’t handle required calculations
Negative impact on OLTP system performance

OLTP Systems
OLTP systems gather raw data used for multidimensional analysis
Raw data has to be converted into something suitable for analysis
Converting raw data to something useful isn’t easy
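
As a rough illustration of that conversion, the following Python sketch rolls individual checkout records up into the (quarter, product, city) cells that multidimensional analysis works with. The sample transactions and the function names `quarter` and `summarize` are hypothetical; real conversions also have to deal with cleansing, matching codes across systems, and much larger volumes.

```python
from collections import defaultdict
from datetime import date

# Hypothetical checkout records: (sale date, product, city, amount in cents).
transactions = [
    (date(2013, 7, 3), "Grape", "Munich", 150),
    (date(2013, 8, 19), "Grape", "Munich", 300),
    (date(2013, 9, 30), "Lemon-Lime", "Frankfurt", 150),
    (date(2013, 10, 1), "Cola", "Munich", 450),
    (date(2013, 12, 24), "Cola", "Munich", 150),
]

def quarter(day):
    """Map a calendar date to its quarter label, e.g. July -> 'Q3'."""
    return f"Q{(day.month - 1) // 3 + 1}"

def summarize(rows):
    """Aggregate raw transactions into (quarter, product, city) totals."""
    cells = defaultdict(int)
    for sold_on, product, city, cents in rows:
        cells[(quarter(sold_on), product, city)] += cents
    return dict(cells)

cells = summarize(transactions)
for key, cents in sorted(cells.items()):
    print(key, cents)
```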

OLTP Systems
IT departments used to spend most of their time and resources on operational systems
Usually purchased as packaged apps today
Today’s operational apps usually include some meaningful reporting capabilities

OLTP Systems
Packaged systems have 2 big limitations:
1. Can only report on their own data – “silos” of data
2. Don’t really support multidimensional analysis
Example silos: Sales, Marketing, Accounting, Finance

OLTP Systems
Every large company has some sort of BI system to analyze operational data
OLTP system vendors are constantly improving their ability to integrate with BI systems

OLAP
Modern BI systems are designed to follow the Online Analytical Processing (OLAP) model
Named by E.F. Codd, who invented the relational database model while at IBM
All OLAP systems have to meet three key criteria

Three Key OLAP Criteria
1. Must support multidimensional analysis
Top managers/analysts have always thought multidimensionally
The qualifiers that follow “by” in a request (sales by product, by city, by month) are usually dimensions
OLAP systems organize data into multidimensional structures
Provide tools for users to examine/filter dimensional data

Three Key OLAP Criteria
2. Fast retrieval times
Answer more questions in less time
“Infinite Question Syndrome”
3. Calculation engine that can handle specialized multidimensional math
Lets analysts use simple formulas that are auto-performed across dimensions

Dimensions
Dimension: categorically consistent view of data
Two tests for dimensionality:
1. Can data about members be compared?
○ Sales numbers of one product compared to sales numbers of another product
2. Can data from members be aggregated into summaries?
○ Jan, Feb, Mar aggregate together as Q1

Slicing & Dicing
Dimensions let you “slice and dice” multidimensional data

Slicing & Dicing
[Figure: grid with months (Jan, Feb, Mar, Apr, May) as columns and cities (Boston, New York, Philadelphia, Baltimore, Washington) as rows]
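
A minimal Python sketch of the two operations, using a few cells from the soda example. The definitions follow common OLAP usage rather than anything stated on the slides: a slice fixes one member of one dimension, and a dice keeps a chosen set of members on each dimension. The function names are assumptions made for illustration.

```python
# A small cube: (quarter, product, city) -> dollar sales.
cube = {
    ("Q3", "Grape", "Munich"): 1000,
    ("Q3", "Lemon-Lime", "Munich"): 2000,
    ("Q4", "Cola", "Munich"): 4000,
    ("Q4", "Cherry", "Munich"): 1000,
    ("Q3", "Grape", "Frankfurt"): 3000,
    ("Q4", "Cherry", "Frankfurt"): 3000,
}

def slice_cube(cube, position, member):
    """Fix one dimension to a single member and drop it from the keys."""
    return {
        key[:position] + key[position + 1:]: value
        for key, value in cube.items()
        if key[position] == member
    }

def dice(cube, allowed):
    """Keep only cells whose member on every dimension is in the allowed set."""
    return {
        key: value
        for key, value in cube.items()
        if all(member in ok for member, ok in zip(key, allowed))
    }

munich = slice_cube(cube, 2, "Munich")
q4_cola_cherry = dice(cube, [{"Q4"}, {"Cola", "Cherry"}, {"Munich", "Frankfurt"}])

print(munich)
print(q4_cola_cherry)
```

The slice returns a two-dimensional view (quarter by product) for one city, while the dice returns a smaller cube that still has all three dimensions.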

Pivoted Soda Data
                   Cola     Cherry      Grape Lemon-Lime      Total
Munich
  Qtr 3             $ -        $ -     $1,000     $2,000     $3,000
  Qtr 4          $4,000     $1,000        $ -        $ -     $5,000
  Total          $4,000     $1,000     $1,000     $2,000     $8,000
Frankfurt
  Qtr 3             $ -        $ -     $3,000     $2,000     $5,000
  Qtr 4             $ -     $3,000        $ -        $ -     $3,000
  Total             $ -     $3,000     $3,000     $2,000     $8,000
Cologne
  Qtr 3          $2,500     $2,000        $ -        $ -     $4,500
  Qtr 4             $ -        $ -     $1,500     $2,000     $3,500
  Total          $2,500     $2,000     $1,500     $2,000     $8,000
Berlin
  Qtr 3          $1,500     $2,000        $ -        $ -     $3,500
  Qtr 4             $ -        $ -     $2,500     $2,000     $4,500
  Total          $1,500     $2,000     $2,500     $2,000     $8,000
Grand Total      $8,000     $8,000     $8,000     $8,000    $32,000
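
Pivoting changes only the arrangement of the dimensions, never the underlying values. The Python sketch below illustrates this with a handful of soda cells; `pivot` and `nest` are hypothetical helper names, and a real OLAP tool would do this interactively rather than in code.

```python
from collections import defaultdict

# A few soda cells in the original orientation: (quarter, product, city).
cells = {
    ("Q3", "Grape", "Munich"): 1000,
    ("Q3", "Lemon-Lime", "Munich"): 2000,
    ("Q4", "Cola", "Munich"): 4000,
    ("Q4", "Cherry", "Munich"): 1000,
    ("Q3", "Grape", "Frankfurt"): 3000,
    ("Q3", "Lemon-Lime", "Frankfurt"): 2000,
    ("Q4", "Cherry", "Frankfurt"): 3000,
}

def pivot(cells, order):
    """Reorder the dimensions in every key, e.g. (2, 0, 1) puts city first."""
    return {tuple(key[i] for i in order): value for key, value in cells.items()}

def nest(cells):
    """Group cells into report rows keyed by the first two dimensions."""
    rows = defaultdict(dict)
    for (outer, inner, column), value in cells.items():
        rows[(outer, inner)][column] = value
    return dict(rows)

by_quarter = nest(cells)                  # rows: quarter > product, columns: city
by_city = nest(pivot(cells, (2, 0, 1)))   # rows: city > quarter, columns: product

print(by_quarter[("Q3", "Grape")])
print(by_city[("Munich", "Q3")])
```

Both layouts are built from the same cells, so any total computed from one matches the other.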

OLAP
[Figure sequence: a data cube assembled one dimension at a time. Geography dimension: Munich, Frankfurt, Cologne, Berlin. Time dimension: Q1, Q2, Q3, Q4. Product dimension: Cola, Cherry, Grape, Lemon-Lime. Successive slides highlight the values $2,000, $32,000, and $8,000 within the cube.]

OLAP
Data cubes can have very large numbers of members
OLAP Cube: multidimensional structure that stores and maintains discrete intersection values
Some OLAP systems let cubes intersect with each other

Hierarchies
Typical analysis task:
Units Sold, Average Price, Dollar Sales
100 products
24 months
200 major cities
Total data points: 1,440,000
Not all products sold in all cities during all months
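
The total follows directly from multiplying the sizes of each dimension, which the short Python sketch below confirms. The `populated_cells` figure is a made-up number used only to show how sparsity would be expressed; the slides do not say how many intersections actually hold data.

```python
from math import prod

# Sizes taken from the analysis task above.
sizes = {"measures": 3, "products": 100, "months": 24, "cities": 200}
possible_cells = prod(sizes.values())

# Hypothetical count of intersections that actually contain data.
populated_cells = 360_000
density = populated_cells / possible_cells

print(f"{possible_cells:,} possible data points, {density:.0%} populated")
```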

Hierarchies
Hierarchy – organizes data by levels
Each level in the hierarchy is the aggregate of the levels beneath it
Examples:
Monthly data rolls up to quarters and years
Cities roll up to states and regions
Products roll up to product lines and groups
Calculations, like Average Price, can be back-calculated at each hierarchy level
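
One reading of "back-calculated" is that a ratio such as Average Price is recomputed at each level from the aggregated totals (Dollar Sales divided by Units Sold) instead of averaging the lower-level averages. The Python sketch below illustrates the difference with hypothetical monthly figures; `roll_up` is an illustrative name, not a function from any OLAP product.

```python
# Hypothetical monthly figures for one product in one city.
monthly = {
    "Jan": {"units": 100, "dollars": 200.0},
    "Feb": {"units": 300, "dollars": 450.0},
    "Mar": {"units": 100, "dollars": 250.0},
}

def roll_up(periods):
    """Aggregate additive measures, then derive Average Price from the totals."""
    periods = list(periods)
    units = sum(p["units"] for p in periods)
    dollars = sum(p["dollars"] for p in periods)
    return {"units": units, "dollars": dollars, "avg_price": dollars / units}

q1 = roll_up(monthly.values())

# Averaging the three monthly average prices ignores how many units each month sold.
naive_avg = sum(m["dollars"] / m["units"] for m in monthly.values()) / len(monthly)

print(q1)
print(naive_avg)
```

Because February sold three times as many units at a lower price, the quarter-level Average Price derived from totals is lower than the simple mean of the monthly averages.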

Hierarchies
Hierarchies let you drill-down into data to explore interesting patterns and anomalies
Top-down approach is like “20 Questions”
Start by exploring broad trends
Become more focused as analysis progresses
Top-down thinking is a natural way for humans to organize complex info

Ad hoc Analysis
Point-and-click drill-down is made usable by OLAP’s rapid response model
Lets managers and analysts perform ad hoc analysis
Paper-based reporting gives fixed answers to fixed questions
OLAP-based ad hoc analysis lets virtually any question be answered quickly

Ad hoc Analysis
Virtually any report can be formatted multidimensionally (pivoting & nesting dimensions)
Virtually anyone can be taught how to do their own analysis work with minimal training

Sample Hierarchy
2013
  Q1: Jan, Feb, Mar
  Q2: Apr, May, Jun
  Q3: Jul, Aug, Sep
  Q4: Oct, Nov, Dec

Attributes
Attribute: descriptive non-hierarchical information
Examples:
Model number
Size
List price
Color
Flavor
Street address

Measures
Measure: any quantitative expression contained in an OLAP system
A measure is the data that’s being analyzed across multiple dimensions
Example: Dollar Sales of soda by month, by product, and by city

Measures
Four important properties of a measure:
1. Always a quantity or expression that yields a quantity
2. Can take any quantitative format
3. Can be derived from any original data source or calculation
4. At least one measure required to perform OLAP analysis

Measures
The measures to be analyzed depend on the purpose of the OLAP system
In BI, measures are known by different names depending on application:
Metric/Key Performance Indicator (KPI)
Benchmark
Ratio

Summary
The analysis gap between raw data and BI can be bridged by combining OLTP systems with BI systems
OLAP systems provide ad hoc analysis, slicing and dicing, pivoting dimensions, and drilling down through hierarchies
OLAP provides significant advantages over standard single-dimensional analysis
