1. Business Intelligence
• BI Meaning
• Components of BI
• BI process
• BI Providers
• Functions of BI server
• BI Capabilities
• BI Users
• BI Infrastructure
• BI Tools
• Others
1
8/19/2023
2. Business Intelligence
• Information systems (IS) generate enormous amounts of operational data that contain patterns,
relationships, clusters, and other information that can facilitate management,
especially planning and forecasting.
• BI systems produce such information from operational data.
• BI is the collective information about:
– Customers
– Competitors
– Business partners
– Competitive environment
• BI is the process of extracting data from an OLAP database and then
analyzing that data for information that you can use to make informed
business decisions and take action.
13. BUSINESS INTELLIGENCE USERS
Power users (producers)    Capabilities                       Casual users (consumers)
20% of employees                                              80% of employees
IT developers              Production reports                 Operational employees
Super users                Parameterized reports              Senior managers
Business analysts          Dashboards                         Managers/staff
Analytical modelers        Ad hoc, drill down, search/OLAP    Business analysts
14. BUSINESS INTELLIGENCE AND ANALYTICS CAPABILITIES
• Production reports: These are predefined reports based on
industry-specific requirements.
• Parameterized reports: Users enter several parameters, as
in a pivot table, to filter data and isolate the impacts of
those parameters. For instance, you might enter region and
time of day to understand how
sales of a product vary by region and time.
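A parameterized report can be pictured as a function whose arguments are the user's filter choices. The following is only a minimal sketch under stated assumptions: the `SALES` records, the field layout, and the `parameterized_report` name are hypothetical and stand in for whatever a real BI tool would query.

```python
from collections import defaultdict

# Hypothetical sales records: (region, hour of day, units sold).
SALES = [
    ("East", 9, 12),
    ("East", 18, 30),
    ("West", 9, 7),
    ("West", 18, 22),
    ("East", 18, 5),
]

def parameterized_report(sales, region=None, hour=None):
    """Total units per (region, hour), filtered by the optional parameters."""
    totals = defaultdict(int)
    for rec_region, rec_hour, units in sales:
        if region is not None and rec_region != region:
            continue  # record is outside the requested region
        if hour is not None and rec_hour != hour:
            continue  # record is outside the requested time of day
        totals[(rec_region, rec_hour)] += units
    return dict(totals)

# The user supplies region and time of day as parameters.
east_evening = parameterized_report(SALES, region="East", hour=18)
print(east_evening)
```

Leaving a parameter unset widens the report, so the same function can show every region at a given hour or every hour in a given region.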
15. BUSINESS INTELLIGENCE AND ANALYTICS CAPABILITIES
• Dashboards/scorecards: These are visual tools for presenting
performance data defined by users.
• Ad hoc query/search/report creation: These allow users to create
their own reports based on queries and searches.
• Drill down: This is the ability to move from a high-level summary
to a more detailed view.
• Forecasts, scenarios, models: These include the ability to perform
linear forecasting and what-if scenario analysis, and to analyze data using
standard statistical tools.
17. BI Infrastructure
• BI infrastructure is an array of tools for obtaining useful information from all the
different types of data used by businesses today, including
semi-structured and unstructured big data in vast quantities.
• These capabilities include
– data warehouses and data marts,
– Hadoop,
– in-memory computing, and
– analytical platforms.
19. Hadoop
• For handling unstructured and semi-structured data
in vast quantities, as well as structured data,
organizations are using Hadoop.
• Hadoop is an open source software framework
managed by the Apache Software Foundation that
enables distributed parallel processing of huge amounts of data
across inexpensive computers.
20. Key services:
• Hadoop consists of several key services:
– the Hadoop Distributed File System (HDFS) for data
storage and
– MapReduce for high-performance parallel data
processing.
• HDFS links together the file systems on the numerous
nodes in a Hadoop cluster to turn them into one big file
system.
• Hadoop's MapReduce was inspired by Google's
MapReduce system for breaking down processing of huge
datasets and assigning work to the various nodes in a
cluster.
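The MapReduce idea of breaking a job into pieces and combining the partial results can be imitated on a single machine. The sketch below is plain Python, not the Hadoop API; the function names and the word-count task are assumptions chosen only to show the map, shuffle, and reduce steps.

```python
from collections import defaultdict

def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(mapped_pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine the values for each key into one result."""
    return {key: sum(values) for key, values in groups.items()}

# Each string stands in for a block of data held on a different node.
chunks = ["big data big clusters", "data nodes", "big nodes"]

mapped = []
for chunk in chunks:  # on a real cluster these map calls would run in parallel
    mapped.extend(map_phase(chunk))

word_counts = reduce_phase(shuffle(mapped))
print(word_counts)
```

Because each chunk is mapped independently, the map step is the part that a cluster can spread across many inexpensive machines.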
21. Hadoop contd…
• HBase, Hadoop's non-relational database, provides rapid
access to the data stored on HDFS and a
transactional platform for running high-scale real-time
applications.
• Hadoop can process large quantities of any kind of data,
including structured transactional data, loosely structured
data such as Facebook and Twitter feeds, complex data such
as Web server log files, and unstructured audio and video
data.
• Hadoop runs on a cluster of inexpensive servers, and
processors can be added or removed as needed. Companies
use Hadoop for analyzing very large
22. In-Memory Computing
• Another way of facilitating big data analysis is to use in-memory
computing, which relies primarily on a computer's
main memory (RAM) for data storage.
(Conventional DBMSs use disk storage systems.)
• Users access data stored in system primary memory, thereby
eliminating bottlenecks from retrieving and reading data in a
traditional, disk-based database and dramatically shortening
query response times.
• In-memory processing makes it possible for very large sets of
data, amounting to the size of a data mart or small data
warehouse, to reside entirely in memory.
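The principle of holding a whole dataset in RAM can be tried at small scale with Python's built-in `sqlite3` module, which creates a memory-only database when given the name `":memory:"`. This is only an illustration of the idea, not a stand-in for a commercial in-memory platform; the `sales` table and its rows are hypothetical.

```python
import sqlite3

# ":memory:" asks SQLite to keep the whole database in RAM, not on disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 120.0), ("West", 80.0), ("East", 40.0)],
)

# The query is answered from main memory, with no database file on disk.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(totals)
conn.close()
```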
23. In-Memory Computing contd…
• Leading commercial products for in-memory computing
include SAP's High Performance Analytics Appliance (HANA)
and Oracle Exalytics.
• Each provides a set of integrated software components,
including in-memory database software and specialized
analytics software, that run on hardware optimized for in-memory
computing work.
24. In-Memory Computing contd…
• Centrica, a gas and electric utility, uses HANA to quickly
capture and analyze the vast amounts of data generated by
smart meters.
• The company is able to analyze usage every 15 minutes, giving
it a much clearer picture of usage by
neighborhood, home size, type of business served, or building
type.
• HANA also helps Centrica show its customers their energy
usage patterns in real time using online and mob
25. Analytic platforms
• Commercial database vendors have developed specialized
high-speed analytic platforms using both relational and non-relational
technology that are optimized for analyzing large
datasets.
• These analytic platforms, such as IBM Netezza and Oracle Exadata,
feature preconfigured hardware-software
systems that are specifically designed for query processing
and analytics.
26. Analytic platforms contd…
• For example, IBM Netezza features tightly integrated
database, server, and storage components that
handle complex analytic queries 10 to 100 times
faster than traditional systems.
• Analytic platforms also include in-memory systems
and NoSQL non-relational database management
systems.
27. BI Tools
1. Reporting Tools
• Integrate data from multiple systems
• Sort, group, sum, average, and compare data
2. Data-mining Tools
• Used to discover hidden patterns and relationships
• Use sophisticated statistical techniques such as regression analysis, decision
tree analysis, and market-basket analysis
3. Knowledge-management Tools
• Create value by collecting and sharing human knowledge about products,
product uses, best practices, and other critical knowledge
28. Reporting Tools
• Reporting tools produce information from data using
five basic operations:
– Sorting
– Grouping
– Calculating
– Filtering
– Formatting
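The five reporting operations can be seen working together on a handful of records. This is a minimal sketch; the `orders` data, the field names, and the threshold of 50 are hypothetical choices made only for illustration.

```python
from itertools import groupby

# Hypothetical order records; field names are illustrative only.
orders = [
    {"customer": "Ann", "region": "East", "amount": 120.0},
    {"customer": "Bob", "region": "West", "amount": 15.0},
    {"customer": "Cara", "region": "East", "amount": 60.0},
    {"customer": "Dev", "region": "West", "amount": 200.0},
]

# Filtering: keep only orders of at least 50.
large = [o for o in orders if o["amount"] >= 50]

# Sorting: order by region so that grouping sees each region contiguously.
large.sort(key=lambda o: o["region"])

# Grouping and calculating: total amount per region.
totals = {
    region: sum(o["amount"] for o in group)
    for region, group in groupby(large, key=lambda o: o["region"])
}

# Formatting: render the result as aligned report lines.
report = [f"{region:<6}{total:>10.2f}" for region, total in totals.items()]
print("\n".join(report))
```

Note that `itertools.groupby` only merges adjacent records with the same key, which is why the sort comes before the grouping step.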
29. RFM Analysis
RFM analysis allows you to analyze and rank customers according to
purchasing patterns, as this figure shows.
• R = how recently a customer purchased your products
• F = how frequently a customer purchases your products
• M = how much money a customer typically spends on your products
30. RFM Analysis…
Divides customers into five groups on each measure and assigns each group a score from 1 to 5
• R score 1 = top 20 percent in most recent orders
• R score 5 = bottom 20 percent (longest since last order)
• F score 1 = top 20 percent in most frequent orders
• F score 5 = bottom 20 percent (least frequent orders)
• M score 1 = top 20 percent in most money spent
• M score 5 = bottom 20 percent in amount of money spent
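The quintile scoring can be sketched as a ranking followed by a split into five equal groups. The customer data and the `quintile_scores` helper below are hypothetical, and the treatment of ties (ranked in input order) is an assumption the slides do not address.

```python
def quintile_scores(values, best_is_high=True):
    """Score each value from 1 (top 20 percent) to 5 (bottom 20 percent)."""
    # Positions of the values, ordered from best to worst.
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=best_is_high)
    scores = [0] * len(values)
    for rank, idx in enumerate(order):
        scores[idx] = rank * 5 // len(values) + 1
    return scores

# Hypothetical customers: days since last order, order count, total spent.
customers = ["A", "B", "C", "D", "E"]
days_since_order = [3, 40, 12, 90, 25]
order_count = [20, 2, 9, 1, 5]
money_spent = [900.0, 50.0, 400.0, 20.0, 700.0]

r = quintile_scores(days_since_order, best_is_high=False)  # fewer days is better
f = quintile_scores(order_count)
m = quintile_scores(money_spent)

rfm = {c: (r[i], f[i], m[i]) for i, c in enumerate(customers)}
print(rfm)
```

A customer scored (1, 1, 1) ordered recently, orders often, and spends the most, while (5, 5, 5) marks the opposite end on all three measures.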
32. Online Analytical Processing (OLAP)
• OLAP, a second type of reporting tool, is more generic than
RFM.
• OLAP provides the ability to sum, count, average, and perform
other simple arithmetic operations on groups of data.
• It also provides the ability to manipulate and analyze large volumes of data from
multiple perspectives.
34. OLAP Features…
• Dynamic
– User can change report structure
– View online
• Dimension
– Characteristic of a measure: purchase date, customer
type, location, sales region
• Users can drill down into data.
– Divide data into more detail
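Drilling down amounts to regrouping the same measure by one more dimension. The sketch below assumes a hypothetical list of sales facts with two dimensions and one measure; the `summarize` function is illustrative and not part of any OLAP product.

```python
from collections import defaultdict

# Hypothetical sales facts: two dimensions (region, customer type) and one
# measure (sale amount).
facts = [
    {"region": "East", "customer_type": "Retail", "amount": 100.0},
    {"region": "East", "customer_type": "Wholesale", "amount": 300.0},
    {"region": "West", "customer_type": "Retail", "amount": 50.0},
    {"region": "East", "customer_type": "Retail", "amount": 150.0},
]

def summarize(rows, dimensions):
    """Sum, count, and average the measure for each combination of dimensions."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        groups[key].append(row["amount"])
    return {
        key: {"sum": sum(v), "count": len(v), "avg": sum(v) / len(v)}
        for key, v in groups.items()
    }

by_region = summarize(facts, ["region"])                  # high-level summary
drilled = summarize(facts, ["region", "customer_type"])   # drill down one level
print(by_region[("East",)])
print(drilled[("East", "Retail")])
```

Changing the list of dimensions passed to `summarize` is the code-level analogue of a user restructuring an OLAP report.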
38. Data Mining
• A major use of data warehouse databases
• Data is analyzed to reveal hidden correlations, patterns, and
trends
• Uses a variety of techniques to find hidden patterns and
relationships in large pools of data and infer rules from them
that can be used to predict future behavior and guide
decisions.
39. Data Mining Tools
i. Query-and-reporting tools: similar to QBE tools, SQL, and
report generators
ii. Intelligent agents: utilize AI tools to help you "discover"
information and trends
iii. Multidimensional analysis (MDA) tools: slice-and-dice
techniques for viewing multidimensional information
iv. Statistical tools: for applying mathematical models to data
warehouse information
40. Data Mining Tools/Techniques
• Can be of two types
– Supervised data mining: regression and neural networks (NN)
– Unsupervised data mining: decision trees and cluster
analysis
41. Supervised data mining…
• Model developed before analysis (e.g., regression, NN)
• Statistical techniques used to estimate parameters
• Examples:
– Regression analysis: measures the impact of a set of variables
on another variable
– Used for making predictions
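In regression, the model form is fixed first and the data are then used to estimate its parameters. The sketch below fits a one-variable line by ordinary least squares; the advertising and sales figures are hypothetical, and a real analysis would typically involve several explanatory variables and a statistics library.

```python
def fit_line(xs, ys):
    """Ordinary least squares for the model y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: advertising spend (x) and resulting sales (y).
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_line(ad_spend, sales)
predicted = intercept + slope * 5.0  # prediction for an unseen spend level
print(intercept, slope, predicted)
```

The estimated slope is the measured impact of the explanatory variable, and plugging a new value into the fitted line gives the prediction.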
42. Supervised data mining… Neural Network
• Popular supervised data-mining technique used to
predict values and make classifications such as "good
prospect" or "poor prospect" customers
43. Unsupervised data mining…
• Analysts do not create a model before running the analysis (e.g.,
cluster analysis, decision trees).
• Analysts create hypotheses after the analysis to explain the patterns
found.
• No prior model about the patterns and relationships that
might exist. Common statistical technique used:
– Cluster analysis to find groups of similar customers from
customer order and demographic data
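One common way to perform cluster analysis is k-means, which the slides do not name; it is used here only as an assumed example. The sketch clusters customers on a single hypothetical attribute (annual order total) with hand-picked starting centers, whereas real analyses use many attributes and a library implementation.

```python
def kmeans_1d(values, centers, iterations=10):
    """Tiny k-means on one numeric attribute, starting from the given centers."""
    centers = list(centers)
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each value with its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Hypothetical annual order totals for eight customers.
order_totals = [20, 25, 30, 22, 400, 420, 380, 410]

centers, clusters = kmeans_1d(order_totals, centers=[0, 500])
print(centers)
print(clusters)
```

No labels are supplied in advance: the two groups emerge from the data, and the analyst then has to explain what distinguishes them.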
44. Unsupervised data mining… Decision tree
• Hierarchical arrangement of criteria that predict a
classification or value
• Basic idea of a decision tree
– Select attributes most useful for classifying something
on some criteria that create disparate groups
• The more different or pure the groups, the better the
classification
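Group purity can be put into numbers. The sketch below uses Gini impurity, one widely used purity measure that the slides do not specify, to compare two candidate attributes; the student records and attribute names are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 for a perfectly pure group, higher when mixed."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def split_impurity(rows, attribute, target):
    """Weighted impurity of the groups produced by splitting on one attribute."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    return sum(len(g) / len(rows) * gini(g) for g in groups.values())

# Hypothetical student records; attribute names are illustrative only.
students = [
    {"year": "junior", "works": "yes", "grade": "high"},
    {"year": "junior", "works": "no", "grade": "high"},
    {"year": "senior", "works": "yes", "grade": "low"},
    {"year": "senior", "works": "no", "grade": "low"},
]

by_year = split_impurity(students, "year", "grade")
by_works = split_impurity(students, "works", "grade")
# The attribute with the lowest impurity creates the purest groups.
best = min(["year", "works"], key=lambda a: split_impurity(students, a, "grade"))
print(by_year, by_works, best)
```

In this toy data, splitting on `year` yields perfectly pure groups, so a tree builder following this criterion would choose it as the first split.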
46. Create Set of If/Then Decision Rules
• If student is a junior and works in a restaurant, then
predict grade > 3.0.
• If student is a senior and is a non-business major,
then predict grade < 3.0.
• If student is a junior and does not work in a
restaurant, then predict grade < 3.0.
• If student is a senior and is a business major, then
make no prediction.
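Once a tree has been reduced to if/then rules, they translate directly into code. The function below encodes the four rules listed above; the function and parameter names are hypothetical, and returning `None` for "no prediction" is a design choice made for this sketch.

```python
def predict_grade(year, works_in_restaurant, business_major):
    """Apply the four if/then rules; return None when no prediction is made."""
    if year == "junior":
        # Juniors are split on whether they work in a restaurant.
        return "> 3.0" if works_in_restaurant else "< 3.0"
    if year == "senior":
        # Seniors are split on major; business majors get no prediction.
        return None if business_major else "< 3.0"
    return None  # the rules do not cover other class standings

print(predict_grade("junior", True, False))
print(predict_grade("senior", False, True))
```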
47. Market Basket Analysis (MBA)
Market-basket analysis is a data-mining technique for
determining sales patterns.
– Uses statistical methods to identify sales patterns in large volumes of
data
– Shows which products customers tend to buy together
– Used to estimate probability of customer purchase
– Helps identify cross-selling opportunities
• "Customers who bought book X also bought book Y"
49. Market-Basket terminologies
Support
• Probability that two items will be bought together
• Fins and masks were purchased together 150 times out of 1,000 transactions,
thus support for fins and a mask is 150/1,000, or
15 percent
• Support for fins and weights is 60/1,000, or 6
percent
• Support for fins along with a second pair of fins is
10/1,000, or 1 percent
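Support can be computed by counting how many baskets contain both items and dividing by the number of baskets. The sketch below uses a tiny hypothetical set of dive-shop transactions rather than the 1,000 transactions behind the slide's figures.

```python
from collections import Counter
from itertools import combinations

# Hypothetical dive-shop transactions; each set is one customer's basket.
transactions = [
    {"fins", "mask"},
    {"fins", "mask", "weights"},
    {"fins"},
    {"mask"},
    {"tank", "weights"},
]

# Count how many baskets contain each unordered pair of items.
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

def support(item_a, item_b):
    """Share of all transactions that contain both items."""
    pair = tuple(sorted((item_a, item_b)))
    return pair_counts[pair] / len(transactions)

print(support("fins", "mask"))
```

Sorting each pair before counting makes the measure symmetric, so `support("fins", "mask")` and `support("mask", "fins")` give the same value.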
50. Market-Basket terminologies
Confidence
• What proportion of the customers who bought a mask also
bought fins?
• Conditional probability estimate
• Example:
» Probability of buying fins = 28%
» Probability of buying swim mask = 27%
• After buying a mask,
» Probability of buying fins = 150/270 or 55.56%
• Likelihood that a customer will also buy fins almost doubles, from
28% to 55.56%.
• Thus, all sales personnel should try to sell fins to anyone buying a
mask.
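Written as a conditional probability, and assuming the 1,000-transaction base implied by the support figures (so that 27% corresponds to 270 mask purchases), the confidence figure can be worked out as follows:

```latex
\[
\text{confidence}(\text{mask} \Rightarrow \text{fins})
  = P(\text{fins} \mid \text{mask})
  = \frac{P(\text{fins and mask})}{P(\text{mask})}
  = \frac{150/1000}{270/1000}
  = \frac{150}{270} \approx 0.5556
\]
```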
51. Market-Basket terminologies…
Lift
• Ratio of confidence to base probability of buying an
item
• Shows how much the base probability increases or
decreases when other products are purchased
• Example:
– Lift of fins and a mask is the confidence of fins given
a mask, divided by the base probability of fins.
– Lift of fins and a mask is .5556/.28 = 1.98
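The lift calculation can be reproduced from raw counts. The sketch below uses the figures from the fins-and-mask example (150 joint purchases, 27% mask buyers, 28% fins buyers) and assumes a base of 1,000 transactions; the function and parameter names are hypothetical.

```python
def lift(both_count, antecedent_count, consequent_count, total):
    """Lift = confidence(antecedent -> consequent) / P(consequent)."""
    confidence = both_count / antecedent_count
    base_probability = consequent_count / total
    return confidence / base_probability

# 150 baskets with both items, 270 with a mask, 280 with fins; the total of
# 1,000 transactions is an assumption taken from the support figures.
fins_given_mask = lift(both_count=150, antecedent_count=270,
                       consequent_count=280, total=1000)
print(round(fins_given_mask, 2))
```

A lift above 1 means the two items are bought together more often than their individual purchase rates alone would suggest.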