• Q1 – Why do organizations need business intelligence?
• Q2 – What business intelligence systems are available?
• Q3 – What are typical reporting applications?
• Q4 – What are typical data-mining applications?
• Q5 – What is the purpose of data warehouses and data marts?
Why do organizations need business intelligence?
• Business intelligence consists of
information that contains patterns,
relationships, and trends about customers,
suppliers, business partners, and employees.
• Business intelligence systems process, store,
and provide useful information to users who
need it, when they need it.
What business intelligence systems are available?
• A business intelligence (BI) system is an
information system that employs business
intelligence tools to produce and deliver
information.
• Business intelligence tools are computer
programs that implement a particular BI
technique. The techniques are categorized as
reporting, data mining, and knowledge management.
Business Intelligence Tools
– Reporting tools read data, process them, and format
them into structured reports that are delivered to
users. They are used primarily for assessment.
– Data-mining tools process data using statistical
techniques, search for patterns and relationships, and
make predictions based on the results.
– Knowledge-management tools store employee
knowledge and make it available to whoever needs it.
These tools are distinguished from the others because
the source of their data is human knowledge.
It’s important that you understand the difference
between these business intelligence components:
– A BI tool is a computer program that implements
the logic of a particular procedure or process.
– A BI application uses BI tools on a particular type
of data for a particular purpose.
– A BI system is an information system with all
five components (hardware, software, data,
procedures, people) that delivers the results of a
BI application to users.
What are typical reporting applications?
• Reporting applications read data from one or
more sources and apply a reporting tool to the
data to produce information. The reporting
system delivers the information to users.
• Basic reporting operations include sorting,
grouping, calculating, filtering, and
formatting.
• This figure shows raw sales data before
any reporting operations are applied.
• The figure on the left shows the raw sales data
sorted by customer names.
• The figure on the right shows data that’s been
sorted and grouped.
Sales Data Sorted by Customer Name
Sales Data, Sorted by Customer Name &
Grouped by Number of Orders &
Fig 9-5 Sales Data Filtered to Show Repeat Customers
This figure shows the sales data filtered and formatted according to specific
criteria, here to show only repeat customers.
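The basic reporting operations can be sketched in a few lines of Python. This is only an illustration: the customer names, order amounts, and the "more than one order" filter are made-up assumptions, not the data in the figures.

```python
from collections import defaultdict

# Hypothetical raw sales rows: (customer name, order amount in dollars).
raw_sales = [
    ("Lopez", 120.00),
    ("Ashley", 45.50),
    ("Lopez", 80.25),
    ("Chen", 300.00),
    ("Ashley", 19.99),
    ("Lopez", 15.00),
]

# Sorting: order the rows by customer name.
sorted_sales = sorted(raw_sales, key=lambda row: row[0])

# Grouping and calculating: count orders and total the amounts per customer.
summary = defaultdict(lambda: {"orders": 0, "total": 0.0})
for name, amount in sorted_sales:
    summary[name]["orders"] += 1
    summary[name]["total"] += amount

# Filtering: keep only repeat customers (assumed here to mean more than one order).
repeat_customers = {
    name: stats for name, stats in summary.items() if stats["orders"] > 1
}

# Formatting: render the result as aligned text columns.
report_lines = [f"{'Customer':<10}{'Orders':>7}{'Total':>10}"]
for name, stats in repeat_customers.items():
    report_lines.append(f"{name:<10}{stats['orders']:>7}{stats['total']:>10.2f}")

print("\n".join(report_lines))
```

Each step maps to one reporting operation, so the same raw rows become more useful information at every stage.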
• RFM analysis allows you to
analyze and rank
customers according to
purchasing patterns, as this
figure shows:
– R = how recently a
customer purchased your
products
– F = how frequently a
customer purchases your
products
– M = how much money a
customer typically spends
on your products
• The lower the score, the
better the customer.
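Here is a minimal sketch of how RFM scores might be computed. The customer data, the `score` helper, and the choice of five equal-sized groups scored 1 (best) to 5 (worst) are assumptions for illustration; the slide itself only says that lower scores are better.

```python
from datetime import date

# Hypothetical customers:
# (name, last purchase date, orders per year, average order in dollars).
customers = [
    ("Ajax",   date(2024, 6, 28), 40,  900.0),
    ("Bloom",  date(2024, 6, 1),  25,  150.0),
    ("Caruso", date(2024, 3, 15), 12,  400.0),
    ("Dietz",  date(2023, 11, 2),  5, 1200.0),
    ("Everly", date(2022, 8, 20),  1,   60.0),
]


def score(values, groups=5):
    """Rank values from best (largest) to worst and return a score per value.

    Scores run from 1 (best group) to `groups` (worst group). Ties are
    broken by input order, which is a simplification.
    """
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    scores = [0] * len(values)
    for rank, i in enumerate(order):
        scores[i] = rank * groups // len(values) + 1
    return scores


# A later date, more orders, and a larger average order all count as "better".
r_scores = score([c[1] for c in customers])
f_scores = score([c[2] for c in customers])
m_scores = score([c[3] for c in customers])

rfm = {
    c[0]: (r, f, m)
    for c, r, f, m in zip(customers, r_scores, f_scores, m_scores)
}

for name, scores in rfm.items():
    print(name, scores)
```

In this made-up data, Dietz scores (4, 4, 1): the customer has not ordered recently or often, but spends the most per order.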
• Online Analytical Processing (OLAP) is more
generic than RFM and provides you with the
dynamic ability to sum, count, average, and
perform other arithmetic operations on
groups of data. OLAP reports, also called OLAP
cubes, use:
– Measures, which are data items of interest. In the
next figure, a measure is Store Sales Net.
– Dimensions, which are characteristics of a measure. In the
next figure, a dimension is Product Family.
Fig 9-7 OLAP Product Family by Store Type
• A presentation like the one you saw in the prior
slide is often called an OLAP cube or a cube.
• Know that an OLAP cube and an OLAP report are the
same thing.
• Users can alter the format of a report.
• It’s possible to drill down into the available
data.
Further drilled down to just stores in
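The idea of measures, dimensions, and drilling down can be sketched as follows. The fact rows, the `region` dimension, and the `olap_sum` helper are hypothetical and only illustrate the technique; they are not the data behind the figures.

```python
from collections import defaultdict

# Hypothetical fact rows: (product family, store type, region, store sales net).
facts = [
    ("Drink", "Supermarket", "West", 1200.0),
    ("Drink", "Small Grocery", "East", 300.0),
    ("Food", "Supermarket", "West", 5400.0),
    ("Food", "Supermarket", "East", 4100.0),
    ("Food", "Small Grocery", "West", 700.0),
    ("Non-Consumable", "Supermarket", "East", 900.0),
]

# Column position of each dimension; the measure is always the last column.
FIELDS = {"product_family": 0, "store_type": 1, "region": 2}


def olap_sum(rows, dimensions, where=None):
    """Sum the measure for each combination of the chosen dimensions.

    `where` optionally restricts the rows to given dimension values.
    """
    where = where or {}
    totals = defaultdict(float)
    for row in rows:
        if all(row[FIELDS[d]] == v for d, v in where.items()):
            key = tuple(row[FIELDS[d]] for d in dimensions)
            totals[key] += row[-1]
    return dict(totals)


# The report: Store Sales Net by Product Family and Store Type.
by_family_and_type = olap_sum(facts, ["product_family", "store_type"])

# Drill down: add a further dimension to see more detail.
drilled = olap_sum(facts, ["product_family", "store_type", "region"])

# Drill down to just one region's stores.
only_west = olap_sum(
    facts, ["product_family", "store_type"], where={"region": "West"}
)

print(by_family_and_type)
print(only_west)
```

Changing the list of dimensions is what lets users alter the format of a report without touching the underlying data.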
What are typical data-mining applications?
Fig 9-11 Convergence Disciplines for Data Mining
Businesses use statistical techniques to find patterns and relationships
among data and use them for classification and prediction. Data-mining
techniques are a blend of statistics, mathematics, artificial
intelligence, and machine learning.
• Because data mining is an odd blend of terms
from different disciplines, it is sometimes
referred to as knowledge discovery in
databases (KDD).
• There are two types of data-mining techniques:
– Unsupervised data-mining characteristics:
• No model or hypothesis exists before running the analysis
• Analysts apply data-mining techniques and then observe the
results
• Analysts create hypotheses after the analysis is completed
• Cluster analysis, a common technique in this category, groups
entities together that have similar characteristics
– Supervised data-mining characteristics:
• Analysts develop a model prior to their analysis
• Analysts apply statistical techniques to estimate parameters of the model
• Regression analysis is a technique in this category that measures
the impact of a set of variables on another variable
• Neural networks predict values and make classifications
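As one illustration of supervised data mining, here is a minimal least-squares regression sketch with a single explanatory variable. The model form (units sold = intercept + slope × advertising spend) is chosen before the analysis, and the data are used only to estimate its parameters. The numbers and the `fit_line` helper are hypothetical.

```python
# Hypothetical observations: advertising spend (in $1,000s) and units sold.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
units_sold = [12.0, 15.0, 21.0, 24.0, 28.0]


def fit_line(xs, ys):
    """Estimate slope and intercept of y = intercept + slope * x by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(ad_spend, units_sold)

# The slope estimates the impact of one more unit of ad spend on units sold.
print(f"units_sold = {intercept:.2f} + {slope:.2f} * ad_spend")

# Use the fitted model to predict a value for a new input.
predicted = intercept + slope * 6.0
print(f"predicted units at ad_spend=6.0: {predicted:.2f}")
```

A real analysis would use several variables and check how well the model fits; this sketch shows only the parameter-estimation step.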
Market-basket analysis is a data-mining technique for determining sales
patterns.
It helps businesses create cross-selling opportunities. Two terms used with
this type of analysis, and shown in the figure, are:
Support: the probability that two items will be purchased together
Confidence: the conditional probability that a customer buys one item, given
that the customer buys another
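Support and confidence can be computed directly from a list of transactions. The sketch below uses made-up transactions and hypothetical helper names (`support`, `confidence`); it is not the data in the figure.

```python
# Hypothetical transactions: each set holds the items bought together.
transactions = [
    {"mask", "fins"},
    {"mask", "fins", "tank"},
    {"tank", "weights"},
    {"mask", "weights"},
    {"fins", "mask", "dive computer"},
    {"tank"},
    {"fins", "weights"},
    {"mask", "fins", "weights"},
]


def support(items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(given, then):
    """Estimate P(`then` is bought | `given` is bought)."""
    return support({given, then}) / support({given})


# How often are masks and fins bought together?
print("support(mask, fins):", support({"mask", "fins"}))

# Of the customers who bought a mask, what share also bought fins?
print("confidence(mask -> fins):", confidence("mask", "fins"))

# Confidence depends on direction and on the item pair.
print("confidence(tank -> mask):", confidence("tank", "mask"))
```

A high confidence for mask → fins would suggest offering fins to customers who put a mask in the basket, which is the cross-selling opportunity.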
• A decision tree is a hierarchical arrangement
of criteria that predicts a classification or
value. It’s an unsupervised data-mining
technique in the sense used earlier: the analyst
supplies no model in advance, and the program
selects the most useful attributes for
classifying entities on some criterion. It
uses if…then rules in the decision process.
• Next are two examples.
Fig 9-13 Grades of Students from Past MIS
Class (Hypothetical Data)
Fig 9-14 Credit Score Decision Tree
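The core step of a decision tree, selecting the most useful attribute and turning it into if…then rules, can be sketched as follows. The loan records, attribute names, and the simple "count the misclassified rows" measure are assumptions for illustration. Decision tree programs typically use more refined measures and repeat the split at each level of the tree.

```python
from collections import Counter

# Hypothetical past loans: two attributes plus the known outcome.
loans = [
    {"paid_over_half": "yes", "owns_home": "yes", "outcome": "repaid"},
    {"paid_over_half": "yes", "owns_home": "no", "outcome": "repaid"},
    {"paid_over_half": "yes", "owns_home": "no", "outcome": "repaid"},
    {"paid_over_half": "no", "owns_home": "yes", "outcome": "default"},
    {"paid_over_half": "no", "owns_home": "no", "outcome": "default"},
    {"paid_over_half": "no", "owns_home": "yes", "outcome": "repaid"},
    {"paid_over_half": "yes", "owns_home": "yes", "outcome": "repaid"},
    {"paid_over_half": "no", "owns_home": "no", "outcome": "default"},
]


def class_counts(rows, attribute, target="outcome"):
    """Count target classes separately for each value of `attribute`."""
    counts = {}
    for row in rows:
        counts.setdefault(row[attribute], Counter())[row[target]] += 1
    return counts


def split_errors(rows, attribute, target="outcome"):
    """Rows misclassified if each group is assigned its majority class."""
    return sum(
        sum(c.values()) - max(c.values())
        for c in class_counts(rows, attribute, target).values()
    )


# Select the attribute whose split misclassifies the fewest rows.
attributes = ["paid_over_half", "owns_home"]
best = min(attributes, key=lambda a: split_errors(loans, a))

# Turn the chosen split into if...then rules (one level of a tree).
rules = {
    value: c.most_common(1)[0][0]
    for value, c in class_counts(loans, best).items()
}
for value, outcome in rules.items():
    print(f"if {best} == {value!r} then predict {outcome!r}")
```

With this made-up data, splitting on `paid_over_half` misclassifies one loan, while splitting on `owns_home` misclassifies three, so the first attribute is chosen.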
What is the purpose of data warehouses and data marts?
Fig 9-15 Components of a Data Warehouse
Data warehouses and data marts address the problems companies have with
missing data values and inconsistent data. They also help standardize data formats
between operational data and data purchased from third-party vendors.
These facilities prepare, store, and manage data specifically for data mining and
other analyses.
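The kind of preparation described here can be sketched in a few lines. The rows, field names, and date formats below are hypothetical; the example only shows dates being standardized across operational and purchased data, and missing values being flagged.

```python
from datetime import datetime

# Hypothetical rows: operational data use ISO dates; purchased data use US-style dates.
operational = [
    {"customer": "Ajax", "order_date": "2024-03-05", "amount": "120.50"},
    {"customer": "Bloom", "order_date": "2024-03-07", "amount": ""},
]
purchased = [
    {"customer": " Caruso ", "order_date": "03/09/2024", "amount": "75"},
]

# Date formats this sketch knows how to read (an assumption).
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def parse_date(text):
    """Return a date for any known format, or None if the value is unusable."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


def clean(row):
    """Standardize one row; missing or unreadable values become None."""
    amount = row["amount"].strip()
    return {
        "customer": row["customer"].strip(),
        "order_date": parse_date(row["order_date"]),
        "amount": float(amount) if amount else None,
    }


warehouse = [clean(row) for row in operational + purchased]

# Flag rows that still have missing values so they can be reviewed.
incomplete = [row for row in warehouse if None in row.values()]

print(warehouse)
print("rows with missing values:", [row["customer"] for row in incomplete])
```

Flagging rather than guessing the missing amount keeps the decision about how to handle it with the people who manage the warehouse.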
Figure 9-16, left, lists some of the data
that’s readily available for purchase
from data vendors.
Some of the problems companies
experience with operational data are
shown in Figure 9-17 below.
Granularity refers to whether
data are too fine or too coarse.
Clickstream data refers to the
clicking behavior of customers on
a Web site.
The curse of dimensionality is the
phenomenon that just because you
have more attributes doesn’t mean
you have a better model.
Here’s the difference between a data warehouse
and a data mart:
Fig 9-18 Data Mart Examples
A data warehouse stores operational data and purchased data. It cleans and
processes data as necessary. It serves the entire organization.
A data mart is smaller than a data warehouse and addresses a particular
component or functional area of an organization.