Foundations of Business Intelligence:
Data Warehouses and Data Mining
Mr. Roshan Bhattarai
Kathmandu, Nepal
Data warehouse
• a database separate from the operational database
• stores current and historical data of potential
interest to decision makers throughout the company
• information can be used across the enterprise for
management analysis and decision making; the warehouse
supports reporting and query tools
• data may originate from sales, customer accounts,
website transactions, manufacturing, competitors,
regulatory bodies, the market, etc.
Components of a Data Warehouse
Data Mart
• A data mart is a subset of a data warehouse in which a
summarized and highly focused portion of the
organization’s data is placed in a separate database
• Smaller and decentralized warehouses
• Focuses on a single subject area, so it can be constructed
more rapidly and at lower cost than an enterprise-wide
data warehouse
• E.g.: Marketing and Sales data mart, Manufacturing
data mart, etc.
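
As a rough illustration of the idea above, the sketch below derives a sales data mart from a few warehouse rows by keeping one subject area and summarizing it. The record layout (`subject`, `region`, `amount`), the sample values and the function name `build_data_mart` are assumptions made for this example, not part of any particular product.

```python
from collections import defaultdict

# Hypothetical warehouse rows; field names and values are made up for illustration.
WAREHOUSE = [
    {"subject": "sales", "region": "East", "amount": 120.0},
    {"subject": "sales", "region": "West", "amount": 80.0},
    {"subject": "sales", "region": "East", "amount": 30.0},
    {"subject": "manufacturing", "region": "East", "amount": 500.0},
]


def build_data_mart(rows, subject):
    """Keep one subject area and summarize its amounts per region."""
    totals = defaultdict(float)
    for row in rows:
        if row["subject"] == subject:
            totals[row["region"]] += row["amount"]
    return dict(totals)


# The mart holds only the focused, summarized portion of the warehouse.
sales_mart = build_data_mart(WAREHOUSE, "sales")
print(sales_mart)
```

The mart is smaller than the warehouse because it drops other subject areas and stores totals rather than individual rows.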
Tools for Business Intelligence
• Business Intelligence tools enable users to analyze data
to see new patterns, relationships and insights that are
useful for guiding decision making
• Principal tools include:
– Online Analytical Processing (OLAP)
– Data Mining
– Text Mining and Web Mining
1. Online Analytical Processing (OLAP)
• Tool for multi-dimensional data analysis
• Enables users to view the same data in different ways
using multiple dimensions
• Supports manipulation and analysis of large volumes of
data from multiple perspectives
• E.g.: actual and projected sales by product, actual and
projected sales by region, etc.
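
The multi-dimensional view described above can be sketched in a few lines: the same fact rows are totaled once by product and once by region. The sample figures and the helper name `summarize` are assumptions for illustration only; a real OLAP tool works on far larger data and more dimensions.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, region, actual_sales, projected_sales).
FACTS = [
    ("Nuts", "East", 50, 60),
    ("Nuts", "West", 70, 65),
    ("Bolts", "East", 40, 45),
    ("Bolts", "West", 30, 25),
]


def summarize(facts, dimension):
    """Total actual and projected sales along one dimension.

    `dimension` is either "product" or "region".
    """
    index = {"product": 0, "region": 1}[dimension]
    totals = defaultdict(lambda: {"actual": 0, "projected": 0})
    for row in facts:
        key = row[index]
        totals[key]["actual"] += row[2]
        totals[key]["projected"] += row[3]
    return dict(totals)


# Two views of the same underlying data.
by_product = summarize(FACTS, "product")
by_region = summarize(FACTS, "region")
print(by_product)
print(by_region)
```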
2. Data Mining
• Provides insights into corporate data that cannot be
obtained with OLAP by finding hidden patterns and
relationships in large databases
• Infers rules from them to predict future behavior
• Patterns and rules are used to guide decision making
and forecast the effect of those decisions
• The types of information obtainable from data mining
include:
a) Associations
– Occurrences linked to a single event
– E.g.: Promotion vs Sales
After a promotion, Coca-Cola is purchased 80% of the time
that popcorn is purchased, up from 60%
b) Sequences
– Events linked over time
– E.g.: if a house is purchased, an oven is often bought
within one month
c) Classification
– Recognizes patterns that describe the group to which an
item belongs by examining existing items that have
already been classified
d) Clustering
– No groups have been defined in advance; the data mining
tool discovers different groupings within the data
e) Forecasts
– Estimate future values of continuous variables by using a
series of existing values
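
The association example in item a) can be made concrete as a simple ratio: of the baskets that contain popcorn, what share also contains cola? The sketch below computes that share (often called the confidence of the rule) before and after a promotion. The basket data is invented so that the ratios come out at the 60% and 80% figures quoted above, and the function name `confidence` is an assumption for this example.

```python
def confidence(baskets, antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    both = sum(1 for b in with_antecedent if consequent in b)
    return both / len(with_antecedent)


# Hypothetical baskets, chosen only to reproduce the 60% and 80% figures.
before_promotion = [{"popcorn", "cola"}] * 3 + [{"popcorn"}] * 2 + [{"bread"}]
after_promotion = [{"popcorn", "cola"}] * 4 + [{"popcorn"}] * 1 + [{"bread"}]

conf_before = confidence(before_promotion, "popcorn", "cola")
conf_after = confidence(after_promotion, "popcorn", "cola")
print(f"before: {conf_before:.0%}, after: {conf_after:.0%}")
```

Baskets without popcorn (the bread basket here) do not affect the result, since the ratio is taken only over popcorn purchases.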
3. Text mining and Web mining
• Much of an organization’s data is unstructured, mostly in
the form of text files
• Unstructured data is believed to account for 80% of an
organization’s useful information
• E.g.: emails, memos, survey responses, service reports, etc.
• Text mining tools are used to analyze this data
• They discover hidden patterns and relationships in large
unstructured data sets
• Web mining is the discovery and analysis of useful
patterns and information from the World Wide Web
• Google Trends and Google Insights for Search services
track the popularity of various words and phrases used in
Google search queries
• Web mining looks for patterns in data through content
mining (text, audio, video), structure mining (links in web
documents) and usage mining (user interaction data
recorded by the web server).
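
A very small sketch of the text mining idea above: counting which terms recur across free-text survey responses is one simple way to surface a pattern in unstructured data. The responses, the short stop-word list and the function name `term_counts` are all made up for illustration; real text mining tools go well beyond word counts.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real tool would use a much larger one.
STOP_WORDS = {"the", "a", "is", "was", "and", "to", "of", "my", "but"}


def term_counts(documents):
    """Count how often each non-stop word appears across all documents."""
    counts = Counter()
    for text in documents:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts


# Made-up survey responses for illustration.
responses = [
    "The delivery was late and the box was damaged.",
    "Late delivery, but support was helpful.",
    "Support resolved my delivery issue.",
]

counts = term_counts(responses)
print(counts.most_common(3))
```

Here the most frequent term points to a recurring theme (delivery) that no single response states on its own.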
