2. Business Intelligence (BI)
Business Intelligence (BI) is a technology infrastructure for gaining maximum
information from available data for the purpose of improving business
processes.
The most common kinds of Business Intelligence systems are:
EIS - Executive Information Systems
DSS - Decision Support Systems
MIS - Management Information Systems
GIS - Geographic Information Systems
CRM - Customer Relationship Management
OLAP - Online Analytical Processing
4. OLTP vs OLAP
What is OLTP?
Online transactional processing (OLTP) enables the real-time execution of large numbers
of database transactions by large numbers of people, typically over the Internet.
Besides financial transactions, OLTP also drives non-financial ones, including password
changes and text messages.
OLTP systems use a relational database that can do the following:
Process a large number of relatively simple transactions — usually insertions, updates and
deletions to data.
Enable multi-user access to the same data, while ensuring data integrity.
Support very rapid processing, with response times measured in milliseconds.
Provide indexed data sets for rapid searching, retrieval and querying.
Be available 24/7/365, with constant incremental backups.
Organizations use OLTP systems to provide data for OLAP.
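As a rough illustration of the short, integrity-preserving transactions described above, the sketch below uses Python's built-in sqlite3 module. The accounts table, its columns and the transfer function are hypothetical names chosen for this example; a production OLTP system would normally run on a multi-user client-server database.

```python
import sqlite3

# In-memory database standing in for an OLTP store (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts (id, balance) VALUES (?, ?)", [(1, 100), (2, 50)]
)
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one atomic transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected an overdraft


ok = transfer(conn, 1, 2, 30)         # succeeds
rejected = transfer(conn, 1, 2, 500)  # overdraft: the credit is rolled back too
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
```

The second call fails on the debit, and the credit that preceded it is undone, which is the data-integrity guarantee the list above refers to.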
5. OLTP vs OLAP
What is OLAP?
Online analytical processing (OLAP) is a system for performing multi-dimensional analysis
at high speeds on large volumes of data.
This data is from a data warehouse, data mart or some other centralized data store.
OLAP is ideal for data mining, business intelligence and complex analytical calculations, as
well as business reporting functions like financial analysis, budgeting and sales
forecasting.
The core of most OLAP databases is the OLAP cube, which allows you to quickly query,
report on and analyze multidimensional data.
The OLAP cube extends the row-by-column format of a traditional relational database
schema and adds layers for other data dimensions.
6. OLTP vs OLAP
What is OLAP?
Figure: an OLAP cube for sales data in multiple dimensions (by region, by quarter and by product).
While the top layer of the cube might organize sales
by region, data analysts can also “drill-down” into
layers for sales by state/province, city and/or specific
stores.
Aggregated data for OLAP is usually stored in a star
schema or snowflake schema.
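A star schema can be sketched with Python's built-in sqlite3 module. All table names, column names and figures below are hypothetical and only illustrate the shape: one central fact table that references several dimension tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe the axes of analysis.
CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, region TEXT, city TEXT);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_time    (time_id    INTEGER PRIMARY KEY, quarter TEXT);

-- The fact table is the centre of the star: it holds the measure
-- and one foreign key per dimension.
CREATE TABLE fact_sales (
    store_id   INTEGER REFERENCES dim_store(store_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    time_id    INTEGER REFERENCES dim_time(time_id),
    amount     INTEGER
);

INSERT INTO dim_store   VALUES (1, 'North', 'Lille'), (2, 'South', 'Nice'), (3, 'South', 'Marseille');
INSERT INTO dim_product VALUES (1, 'Ice-cream'), (2, 'Cookies');
INSERT INTO dim_time    VALUES (1, 'Q1'), (2, 'Q2');
INSERT INTO fact_sales  VALUES (1, 1, 1, 100), (1, 2, 1, 40), (2, 1, 1, 70), (2, 1, 2, 90), (3, 2, 2, 30);
""")

# Top layer: sales by region.
by_region = conn.execute(
    "SELECT s.region, SUM(f.amount) FROM fact_sales f "
    "JOIN dim_store s ON s.store_id = f.store_id "
    "GROUP BY s.region ORDER BY s.region"
).fetchall()

# Drilling down one level: sales by region and city.
by_city = conn.execute(
    "SELECT s.region, s.city, SUM(f.amount) FROM fact_sales f "
    "JOIN dim_store s ON s.store_id = f.store_id "
    "GROUP BY s.region, s.city ORDER BY s.region, s.city"
).fetchall()
```

In a snowflake schema the dimension tables would themselves be normalized further, for example by moving region into its own table referenced from dim_store.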
7. OLTP vs OLAP
The main distinction between the two systems is in their names: analytical vs.
transactional. Each system is optimized for that type of processing.
OLAP:
Optimized for conducting complex data analysis for smarter decision-making.
Designed for use by data scientists, business analysts and knowledge workers.
Supports business intelligence (BI), data mining and other decision support applications.
OLTP:
Optimized for processing a massive number of transactions.
Designed for use by frontline workers (e.g., cashiers, bank tellers, hotel desk clerks) or for customer
self-service applications (e.g., online banking, e-commerce, travel reservations).
8. OLTP vs OLAP
Aspect | OLTP | OLAP
Source of data | Operational data; OLTP systems are the original source of the data | Consolidated data; OLAP data comes from the various OLTP databases
Purpose of data | To control and run fundamental business tasks | To help with planning, problem solving, and decision support
Inserts and updates | Short and fast inserts and updates initiated by end users | Periodic long-running batch jobs refresh the data
Queries | Relatively standardized and simple queries returning relatively few records | Often complex queries involving aggregations
Processing speed | Very fast | Depends on the amount of data involved
Space requirement | Relatively small if historical data is archived | Larger due to the existence of aggregation structures and history data; requires more indexes than OLTP
Database design | Highly normalized with many tables | Typically de-normalized with fewer tables
9. OLAP (Online Analytical Processing)
What is OLAP?
OLAP is software for performing multidimensional analysis at high speeds on large volumes
of data from a data warehouse, data mart, or some other unified, centralized data store.
Most business data have multiple dimensions—multiple categories into which the data
are broken down for presentation, tracking, or analysis.
For example, sales figures might have several dimensions related to location (region, country,
state/province, store), time (year, month, week, day), product (clothing, men/women/children,
brand, type), and more.
But in a data warehouse, data sets are stored in tables, each of which can organize data
into just two of these dimensions at a time.
OLAP extracts data from multiple relational data sets and reorganizes it into a
multidimensional format that enables very fast processing and very insightful analysis.
10. OLAP (Online Analytical Processing)
What is OLAP?
An OLAP cube consists of numeric facts, called measures, which are categorized by
dimensions.
The cube metadata may be created from a star schema or snowflake schema of tables in
a relational database.
11. Multidimensional Analysis
Multidimensional analysis refers to the process commonly used in data
warehousing applications of examining data using various combinations of
dimensions.
Dimensions are the categories used to classify data, such as time, geography, a
company’s departments, product lines, and so on.
The results associated with a particular set of dimensions are called facts.
Facts are typically figures associated with product sales, profits, volumes, counts,
etc.
In order to obtain these facts according to a set of dimensions in a relational
database system, SQL aggregation is typically used.
In SQL aggregation, data is grouped according to certain criteria (dimensions) and the result set
consists of aggregates of facts such as counts, sums, and averages of the data in each group.
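The SQL aggregation described above can be tried with Python's built-in sqlite3 module. The sales table and its columns are hypothetical; the Paris figures follow the ice-cream example used later in the deck, and the Lyon figures are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (store TEXT, product TEXT, month TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Paris", "Ice-cream", "Jan", 210),
        ("Paris", "Ice-cream", "Feb", 251),
        ("Paris", "Ice-cream", "Mar", 140),
        ("Lyon", "Ice-cream", "Jan", 120),  # made-up figure
        ("Lyon", "Ice-cream", "Feb", 80),   # made-up figure
    ],
)

# Group by one dimension (store) and aggregate the fact (amount).
per_store = conn.execute(
    "SELECT store, COUNT(*), SUM(amount), AVG(amount) "
    "FROM sales GROUP BY store ORDER BY store"
).fetchall()
```

Adding more columns to the GROUP BY clause (for example store, month) yields the facts for a finer combination of dimensions.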
12. Multidimensional Analysis
A multidimensional model views data in the form of a data-cube.
A data cube enables data to be modeled and viewed in multiple dimensions.
It is defined by dimensions and facts.
A data cube helps to analyze facts/measures by multiple dimensions.
Example:
We want to analyze sales data by product by time and by store location.
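One simple way to picture such a cube in code is a mapping from one member per dimension to a fact. The sketch below is only an illustration: the names cube, DIMENSIONS and total_by are hypothetical, and most of the sales figures are made up.

```python
from collections import defaultdict

DIMENSIONS = ("product", "time", "store")

# Each cell is addressed by (product, time, store) and holds the sales fact.
cube = {
    ("Ice-cream", "Jan", "Paris"): 210,
    ("Ice-cream", "Feb", "Paris"): 251,
    ("Ice-cream", "Jan", "Lyon"): 120,
    ("Cookies", "Jan", "Paris"): 90,
    ("Cookies", "Feb", "Lyon"): 60,
}


def total_by(cube, dimension):
    """Sum the fact over every other dimension, keeping only `dimension`."""
    axis = DIMENSIONS.index(dimension)
    totals = defaultdict(int)
    for key, sales in cube.items():
        totals[key[axis]] += sales
    return dict(totals)


by_product = total_by(cube, "product")
by_time = total_by(cube, "time")
by_store = total_by(cube, "store")
```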
17. Hypercube
Suppose analysts want to analyze not just Sales but other metrics / facts as well.
Assume that the other metrics to be analyzed are profit margin, fixed cost and
indirect sale.
Store Location = “Paris”
Product Category = “Ice-cream”
Time | Sale | Profit Margin | Fixed Cost | Indirect Sale
Jan | 210 | 70 | 25 | 245
Feb | 251 | 76 | 20 | 275
Mar | 140 | 52 | 15 | 230
… | … | … | … | …
Dec | 521 | 139 | 30 | 200
Figure: Multidimensional Domain Structure (MDS). Time: Jan to Dec. Product: Ice-cream, Pancakes, Waffles, Cookies. Metrics: Sale, Profit Margin, Fixed Cost, Indirect Sale.
18. Hypercube
Figure: Multidimensional Domain Structure (MDS) with an additional dimension. Time: Jan to Dec. Metrics: Sale, Profit Margin, Fixed Cost, Indirect Sale. Store Location: Paris, Lyon, Nantes, Nice. Product: Ice-cream, Pancakes, Waffles, Cookies.
How can we represent these four groups
as edges of a three-dimensional cube?
The MDS is well suited to
represent four or more
dimensions.
19. Hypercube
What is Hypercube?
A hypercube is a representation that accommodates more than three dimensions.
At a lower level of simplification, a hypercube can very well accommodate three
dimensions.
A hypercube is a general metaphor for representing multidimensional data.
Figure: 3-D cube, 4-D cube (tesseract), 5-D cube (hypercube).
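In code, a hypercube does not need a geometric picture: a cell is simply addressed by one member per dimension, however many dimensions there are. The sketch below builds a four-dimensional structure from the Paris ice-cream figures shown earlier; the variable names and the fifth "promotion" dimension are assumptions made for illustration.

```python
METRICS = ("Sale", "Profit Margin", "Fixed Cost", "Indirect Sale")

# Paris / Ice-cream figures from the table on the earlier slide.
paris_ice_cream = {
    "Jan": (210, 70, 25, 245),
    "Feb": (251, 76, 20, 275),
    "Mar": (140, 52, 15, 230),
    "Dec": (521, 139, 30, 200),
}

# Four dimensions: (store location, product, time, metric).
hypercube = {}
for month, values in paris_ice_cream.items():
    for metric, value in zip(METRICS, values):
        hypercube[("Paris", "Ice-cream", month, metric)] = value

n_dimensions = len(next(iter(hypercube)))
feb_margin = hypercube[("Paris", "Ice-cream", "Feb", "Profit Margin")]

# A fifth dimension (a hypothetical promotion type) only lengthens the key.
hypercube_5d = {key + ("Coupon",): value for key, value in hypercube.items()}
```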
20. Hypercube
How to view multidimensional data?
Store Location: Paris
Time | Ice-cream: Sale | Ice-cream: Profit Margin | Ice-cream: Fixed Cost | Ice-cream: Indirect Sale
Jan | 210 | 70 | 25 | 245
Feb | 251 | 76 | 20 | 275
Mar | 140 | 52 | 15 | 230
… | … | … | … | …
Dec | 521 | 139 | 30 | 200
Page: Store Location Dimension
Rows: Time Dimension
Column: Product Category and Metric/facts
Page displays for four-dimensional data.
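The page layout above can be sketched as a function that fixes the page dimension (store location) and lays out time as rows and product/metric pairs as columns. The function name page and the data layout are hypothetical; the Paris figures come from the slide, while the Lyon figures are made up.

```python
METRICS = ("Sale", "Profit Margin", "Fixed Cost", "Indirect Sale")

raw = {
    ("Paris", "Jan"): (210, 70, 25, 245),
    ("Paris", "Feb"): (251, 76, 20, 275),
    ("Lyon", "Jan"): (120, 40, 25, 100),  # made-up figures
    ("Lyon", "Feb"): (80, 30, 20, 90),    # made-up figures
}

# 4-D cube keyed by (store, product, month, metric).
cube = {
    (store, "Ice-cream", month, metric): value
    for (store, month), values in raw.items()
    for metric, value in zip(METRICS, values)
}


def page(cube, store, months, columns):
    """Return one page (a 2-D table) of the 4-D cube for a fixed store."""
    header = ["Time"] + [f"{product}: {metric}" for product, metric in columns]
    rows = [header]
    for month in months:
        rows.append(
            [month]
            + [cube.get((store, product, month, metric)) for product, metric in columns]
        )
    return rows


columns = [("Ice-cream", metric) for metric in METRICS]
paris_page = page(cube, "Paris", ["Jan", "Feb"], columns)
lyon_page = page(cube, "Lyon", ["Jan", "Feb"], columns)
```

Each store location yields its own page with the same row and column layout, which is how a flat display copes with a fourth dimension.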
21. Hypercube
How to view multidimensional data?
Life Style : Coupon
Store Location | Time | Ice-cream: Sale | Ice-cream: Profit Margin | Cookies: Sale | Cookies: Profit Margin
Paris | Jan | 210 | 70 | 25 | 245
Paris | Feb | 251 | 76 | 20 | 275
Lyon | Jan | 140 | 52 | 15 | 230
Lyon | Feb | 521 | 139 | 30 | 200
Page: Demographics and Promotion Dimensions
Rows: Store Location and Time Dimensions
Column: Product Category and Metric/facts
Page displays for Six-dimensional data.
22. Hypercube
A model with three dimensions can be represented by a physical cube.
But a physical cube is limited to three dimensions or fewer.
A hypercube can be visualized as multiple tables spread across multiple pages.
23. OLAP Operation
The following operations can be performed on OLAP data cubes.
Roll up
Drill down
Slice
Dice
Pivot
24. OLAP Operation
Roll-up
Performs aggregation on a data cube, either by climbing up a concept hierarchy for a dimension or by
dimension reduction.
Summarizes data along a dimension.
When a roll-up is performed by dimension reduction, one or more dimensions are removed from
the cube.
In this example, data is aggregated along the Time dimension:
Monthly Sale -> Quarterly Sale -> Yearly Sale -> All Sale
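A roll-up along the Time hierarchy (month, quarter, year) can be sketched as repeated aggregation to the parent level. The helper roll_up and the month-to-quarter mapping are hypothetical names; the Jan, Feb, Mar and Dec figures follow the ice-cream example, and the other months are made up.

```python
from collections import defaultdict

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
MONTH_TO_QUARTER = {month: f"Q{i // 3 + 1}" for i, month in enumerate(MONTHS)}

monthly = {
    "Jan": 210, "Feb": 251, "Mar": 140,
    "Apr": 180, "May": 200, "Jun": 220,  # made-up figures
    "Jul": 300, "Aug": 310, "Sep": 190,  # made-up figures
    "Oct": 170, "Nov": 230, "Dec": 521,
}


def roll_up(values, parent_of):
    """Aggregate each member's value into its parent in the hierarchy."""
    totals = defaultdict(int)
    for member, value in values.items():
        totals[parent_of(member)] += value
    return dict(totals)


quarterly = roll_up(monthly, MONTH_TO_QUARTER.__getitem__)
yearly = roll_up(quarterly, lambda quarter: "Year")
```

Each call climbs one level of the concept hierarchy, so the result has fewer, coarser cells than its input.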
26. OLAP Operation
Drill-down
The reverse operation of roll-up.
Navigates from less detailed data to more detailed data.
Can be performed by stepping down a concept hierarchy for a dimension or by adding dimensions.
Extracts data with more specific details, so it can also be performed by adding a new dimension to a cube.
E.g. extracting monthly data from yearly data:
Yearly data -> Quarterly data -> Monthly data
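Drill-down only works if the detailed data is still available, so a minimal sketch keeps rows at the finest grain and re-aggregates them at whichever level is requested. The names LEVELS, view and drill_down, the year 2023 and the April figure are assumptions for illustration.

```python
LEVELS = ["year", "quarter", "month"]  # ordered from coarse to fine

# Detail rows at the finest grain of the Time hierarchy.
detail = [
    {"year": 2023, "quarter": "Q1", "month": "Jan", "sales": 210},
    {"year": 2023, "quarter": "Q1", "month": "Feb", "sales": 251},
    {"year": 2023, "quarter": "Q1", "month": "Mar", "sales": 140},
    {"year": 2023, "quarter": "Q2", "month": "Apr", "sales": 180},  # made up
]


def view(level):
    """Aggregate the detail rows to the given level of the hierarchy."""
    keys = LEVELS[: LEVELS.index(level) + 1]
    totals = {}
    for row in detail:
        key = tuple(row[k] for k in keys)
        totals[key] = totals.get(key, 0) + row["sales"]
    return totals


def drill_down(level):
    """Return the view one level finer than `level`."""
    position = LEVELS.index(level)
    if position == len(LEVELS) - 1:
        raise ValueError(f"{level!r} is already the finest level")
    return view(LEVELS[position + 1])


by_year = view("year")
by_quarter = drill_down("year")
by_month = drill_down("quarter")
```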
28. OLAP Operation
Slice
A slice is a subset of the cube corresponding to a single
value of one dimension.
It forms a new sub-cube by performing a selection on that
dimension.
Figure: a cube and a slice of the cube.
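As a minimal sketch, a slice can be written as a filter that fixes one dimension to a single member and drops that dimension from the key. The cube contents and the function name slice_cube are hypothetical.

```python
DIMENSIONS = ("product", "time", "store")

# Cells keyed by (product, time, store); figures are illustrative.
cube = {
    ("Ice-cream", "Jan", "Paris"): 210,
    ("Ice-cream", "Feb", "Paris"): 251,
    ("Ice-cream", "Jan", "Lyon"): 120,
    ("Cookies", "Jan", "Paris"): 90,
    ("Cookies", "Feb", "Lyon"): 60,
}


def slice_cube(cube, dimension, member):
    """Fix `dimension` to `member`; the result has one dimension fewer."""
    axis = DIMENSIONS.index(dimension)
    return {
        key[:axis] + key[axis + 1:]: value
        for key, value in cube.items()
        if key[axis] == member
    }


january = slice_cube(cube, "time", "Jan")
```

Here january is a two-dimensional table of product by store for the single time member Jan.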
29. OLAP Operation
Dice
The dice operation defines a sub-cube by performing a
selection on two or more dimensions.
Figure: a cube and the sub-cube produced by the dice operation.
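A dice can be sketched as a filter that keeps only the cells whose members fall inside chosen sets on two or more dimensions, without removing any dimension. The cube contents and the function name dice are hypothetical.

```python
DIMENSIONS = ("product", "time", "store")

# Cells keyed by (product, time, store); figures are illustrative.
cube = {
    ("Ice-cream", "Jan", "Paris"): 210,
    ("Ice-cream", "Feb", "Paris"): 251,
    ("Ice-cream", "Mar", "Paris"): 140,
    ("Ice-cream", "Jan", "Lyon"): 120,
    ("Cookies", "Jan", "Paris"): 90,
}


def dice(cube, **selections):
    """Keep cells whose member is in the chosen set on every named dimension."""
    axes = {
        DIMENSIONS.index(name): set(members)
        for name, members in selections.items()
    }
    return {
        key: value
        for key, value in cube.items()
        if all(key[axis] in members for axis, members in axes.items())
    }


sub_cube = dice(cube, store=["Paris"], time=["Jan", "Feb"])
```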
30. OLAP Operation
Pivot
The pivot operation is also called a rotation.
Pivot is a visualization operation that rotates the data axes in view to provide an alternative
presentation of the data.
It may involve swapping the rows and columns or moving one of the row dimensions into the
column dimensions.
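For a two-dimensional view, the simplest pivot swaps rows and columns. The sketch below assumes the view is held as a nested mapping of row member to column member to value; the function name pivot and the Lyon figures are made up for illustration.

```python
# Rows are time members, columns are store members.
by_time = {
    "Jan": {"Paris": 210, "Lyon": 120},
    "Feb": {"Paris": 251, "Lyon": 80},
}


def pivot(view):
    """Swap the row and column axes of a two-dimensional view."""
    rotated = {}
    for row, cells in view.items():
        for column, value in cells.items():
            rotated.setdefault(column, {})[row] = value
    return rotated


by_store = pivot(by_time)
```

The values are unchanged; only the orientation of the presentation differs, and pivoting twice returns the original view.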
32. OLAP Operation
Drill Through
The drill-through operation makes use of relational SQL facilities to drill through the bottom level
of a data cube down to its back-end relational tables.
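A rough sketch of drill-through, using Python's built-in sqlite3 module: the cube holds one aggregate per bottom-level cell, and drill_through issues SQL against the back-end table to fetch the rows behind a cell. The sales table, its columns, the function name and the individual amounts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "sale_id INTEGER PRIMARY KEY, store TEXT, month TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO sales (store, month, amount) VALUES (?, ?, ?)",
    [("Paris", "Jan", 120), ("Paris", "Jan", 90), ("Paris", "Feb", 251), ("Lyon", "Jan", 140)],
)

# Bottom level of the cube: one aggregated cell per (store, month).
cube = {
    (store, month): total
    for store, month, total in conn.execute(
        "SELECT store, month, SUM(amount) FROM sales GROUP BY store, month"
    )
}


def drill_through(store, month):
    """Fetch the relational rows that make up one bottom-level cube cell."""
    return conn.execute(
        "SELECT sale_id, amount FROM sales "
        "WHERE store = ? AND month = ? ORDER BY sale_id",
        (store, month),
    ).fetchall()


rows = drill_through("Paris", "Jan")
```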
33. OLAP Operation
Drill Across
Accesses more than one fact table that is linked by common dimensions. Combines cubes that
share one or more dimensions.
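One way to sketch drill-across is to aggregate each fact table to the dimensions the two have in common and then combine the results on those shared keys. Both fact tables, their fields and all figures below are hypothetical.

```python
from collections import defaultdict

# Two fact tables that share the Store and Time dimensions.
sales_facts = [  # (store, month, product, units_sold)
    ("Paris", "Jan", "Ice-cream", 210),
    ("Paris", "Jan", "Cookies", 90),
    ("Lyon", "Jan", "Ice-cream", 140),
]
return_facts = [  # (store, month, reason, units_returned)
    ("Paris", "Jan", "damaged", 7),
    ("Paris", "Jan", "expired", 5),
    ("Lyon", "Jan", "damaged", 4),
]


def aggregate(facts):
    """Sum the measure (last field) over the shared (store, month) dimensions."""
    totals = defaultdict(int)
    for store, month, _other, measure in facts:
        totals[(store, month)] += measure
    return totals


def drill_across(first, second):
    """Combine two fact tables on their common dimensions."""
    a, b = aggregate(first), aggregate(second)
    return {key: (a.get(key, 0), b.get(key, 0)) for key in sorted(set(a) | set(b))}


combined = drill_across(sales_facts, return_facts)
```

Each combined cell holds the measures of both fact tables side by side, for example units sold and units returned per store and month.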
34. OLAP Operation
Examples
Consider a data warehouse for a hospital with three dimensions (Doctor, Patient and
Time) and two measures: Count and Charge (the fee that a doctor charges a patient for a visit). Describe
the OLAP operations Roll-up, Drill-down, Slice and Dice.
The DigiOne company has a sales department. Consider the three dimensions Time, Product and Store.
The schema contains a central fact table, Sales, with two measures: Dollar_cost and Units_sold.
Describe the OLAP operations Roll-up, Drill-down, Slice and Dice.
36. OLAP Models
The different OLAP models offer similar analytical processing but differ in their storage
methodology.
ROLAP: Relational Online Analytical Processing
The OLAP system is built on top of a relational database.
MOLAP: Multidimensional Online Analytical Processing
The OLAP system is implemented through a specialized multidimensional database.
HOLAP: Hybrid Online Analytical Processing
Combines the strengths and features of ROLAP and MOLAP.
DOLAP: Desktop Online Analytical Processing
Provides portability to users of online analytical processing.
37. OLAP Models
Database OLAP:
A relational database management system (RDBMS) designed to support OLAP
structures and to perform OLAP calculations.
Web OLAP:
Online analytical processing where OLAP data is accessible from a Web browser.
39. OLAP Models
MOLAP:
Data for analysis is stored in specialized multidimensional databases (MDDBs).
Precalculated and prefabricated multidimensional data cubes are stored in
the MDDBs.
The MOLAP engine in the application layer pushes a multidimensional view of the data from
the MDDBs to the users.
Multidimensional database management systems are proprietary software systems
that provide the capability to consolidate and fabricate summarized cubes during the process
that loads data into the MDDBs from the main data warehouse.
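The idea of fabricating summarized cubes at load time can be sketched by precomputing an aggregate for every combination of dimensions, so that later queries become lookups. This is a simplified illustration and not how any particular MDDB product stores data; the names load and cuboids and the figures are hypothetical.

```python
from itertools import combinations

DIMENSIONS = ("product", "time", "store")

facts = [  # (product, time, store, sales)
    ("Ice-cream", "Jan", "Paris", 210),
    ("Ice-cream", "Feb", "Paris", 251),
    ("Ice-cream", "Jan", "Lyon", 120),
    ("Cookies", "Jan", "Paris", 90),
]


def load(facts):
    """Build every summarized cuboid once, while the data is being loaded."""
    cuboids = {}
    for size in range(len(DIMENSIONS) + 1):
        for axes in combinations(range(len(DIMENSIONS)), size):
            names = tuple(DIMENSIONS[a] for a in axes)
            cells = cuboids.setdefault(names, {})
            for *members, value in facts:
                key = tuple(members[a] for a in axes)
                cells[key] = cells.get(key, 0) + value
    return cuboids


cuboids = load(facts)

# Queries are now dictionary lookups instead of scans over the facts.
grand_total = cuboids[()][()]
paris_total = cuboids[("store",)][("Paris",)]
ice_cream_jan = cuboids[("product", "time")][("Ice-cream", "Jan")]
```

The trade-off is visible even at this scale: three dimensions already produce eight cuboids, so storage grows quickly as dimensions are added.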
41. OLAP Models
ROLAP:
Data is stored as rows and columns as in a relational data model
Presents data to the users in the form of business dimensions
The metadata layer supports the mapping of dimensions to the relational tables
The analytical server in the middle tier application layer creates multidimensional views on the
fly
The multidimensional system at the presentation layer provides a multidimensional view of
the data to the users
Queries based on this multidimensional view are transformed into complex SQL directed to
the relational database
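The translation step in the last point can be sketched as a small function that turns a multidimensional request (dimensions to group by, members to filter on) into SQL for the relational store. The function to_sql, the sales table and its columns are hypothetical; real ROLAP servers generate far more elaborate SQL.

```python
import sqlite3

DIMENSIONS = ("product", "month", "store")  # columns known to the metadata layer


def to_sql(group_by, filters=None):
    """Translate a multidimensional request into a parameterized SQL query."""
    filters = filters or {}
    unknown = (set(group_by) | set(filters)) - set(DIMENSIONS)
    if not group_by or unknown:
        raise ValueError("need at least one dimension, and only known dimensions")
    columns = ", ".join(group_by)
    sql = f"SELECT {columns}, SUM(amount) FROM sales"
    if filters:
        sql += " WHERE " + " AND ".join(f"{name} = ?" for name in filters)
    sql += f" GROUP BY {columns} ORDER BY {columns}"
    return sql, list(filters.values())


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, month TEXT, store TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Ice-cream", "Jan", "Paris", 210),
        ("Ice-cream", "Feb", "Paris", 251),
        ("Ice-cream", "Jan", "Lyon", 120),
        ("Cookies", "Jan", "Paris", 90),
    ],
)

# "Sales by store for January" expressed as dimensions, then run as SQL.
sql, params = to_sql(["store"], {"month": "Jan"})
result = conn.execute(sql, params).fetchall()
```

Column names are checked against the known dimensions before being placed in the query text, and filter values are passed as parameters rather than interpolated.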
42. OLAP Models
ROLAP:
Characteristics
Supports all the basic OLAP features and functions
Stores data in a relational form
Supports some form of aggregation
Local hypercubing
The user issues a query.
The results of the query get stored in a small, local, multidimensional database.
The user performs analysis against this local database.
If additional data is required to continue the analysis, the user issues another query and the analysis
continues.
43. OLAP Models
Figure: ROLAP versus MOLAP, compared by complexity of analysis and query performance.
ROLAP vs MOLAP:
The choice between ROLAP and MOLAP also depends on the complexity of the queries from
your users.
MOLAP is the choice for faster response and more intensive queries.