Business Intelligence
    Data Mining
      (Part 2 of 2)
The End?
How far can I go?

• By storing and analyzing historical data you can see only
  one part of reality (the past and the present)
• Is there a way to answer questions not yet asked?
  Can I look into the future?
• Can I predict how my business is going to perform?
  What about the market? And my customers?
Data Mining

• A process for extracting patterns from data
• “We’re drowning in data but thirsty for
  information”
• Data Mining borrows techniques from
  statistics, probability, maths, artificial
  intelligence and other fields
Business Problems
• Recommendations
• Anomaly Detection
• Customer churn (attrition) analysis
• Risk Management
• Customer segmentation
• Targeted advertising
• Projections
Data Mining Tasks

• Classification
• Estimation / Regression
• Prediction / Projection (Forecasting)
• Association Rules / Affinity Groups
• Clustering
Predictive Models
• Classification
 • Discrete value prediction
   • Yes, No
   • High, Medium, Low
• Estimation / Regression
 • Continuous value prediction
   • Amounts
   • Numbers
• Projection / Forecasting
Descriptive Models
• Association Rules / Affinity
 • Looks for correlations among items that
    occur together
 • Market Basket Analysis
• Clustering
 • Groups items according to similarity
 • “Automatic” classification
Work Cycle

• Identify Business Opportunities
• Transform Data to Information
• Act with Information
• Measure Results
• (The cycle then starts again from the first step)
Data Mining and DWh

• The Data Warehouse unifies diverse data sources
  in one common repository
• Before the DM process, you must have reliable data
  sources
• Data must be presented in a way that eases analysis
Project Cycle
• Business Problem Formulation
• Data Gathering
• Data transformation and cleansing
• Model Construction
• Model Evaluation
• Reports and Prediction
• Application Integration
• Model Management
What is a Model?
• A model is a set of conclusions (expressed in
  mathematical form) reached by processing data
• It captures knowledge that is compared against
  new data to reach new conclusions
• It has a measurable effectiveness (for example, the percentage of correct predictions)
• It must be tuned to make useful predictions
• It is time-constrained: it is valid only for a limited period
Cases
Outlook    Temperature (C)    Humidity   Wind   Play Golf?
 Sunny          29.4            85%      NO        No
 Sunny          26.6            90%      YES       No
Overcast        28.3            78%      NO        Yes
 Rainy          21.1            96%      NO        Yes
 Rainy          20.0            80%      NO        Yes
 Rainy          18.3            70%      YES       No
Overcast        17.7            65%      YES       Yes
 Sunny          22.2            95%      NO        No
 Sunny          20.5            70%      NO        Yes
 Rainy          23.8            80%      NO        Yes
 Sunny          23.8            70%      YES       Yes
Overcast        22.2            90%      YES       Yes
Overcast        27.2            75%      NO        Yes
 Rainy          21.6            80%      YES       No
Model
Outlook
  Overcast: YES
  Rainy: Wind
    NO: YES
    YES: NO
  Sunny: Humidity
    <=77.5: YES
    >77.5: NO
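The tree on this slide can be written down as a small function and checked against the 14 cases from the previous slide. The following is a minimal sketch: the names `CASES`, `predict_play` and `accuracy` are made up for illustration, and humidity is stored as a plain number of percentage points.

```python
# The 14 cases from the "Cases" slide:
# (outlook, temperature_c, humidity_pct, windy, play_golf)
CASES = [
    ("Sunny",    29.4, 85, False, False),
    ("Sunny",    26.6, 90, True,  False),
    ("Overcast", 28.3, 78, False, True),
    ("Rainy",    21.1, 96, False, True),
    ("Rainy",    20.0, 80, False, True),
    ("Rainy",    18.3, 70, True,  False),
    ("Overcast", 17.7, 65, True,  True),
    ("Sunny",    22.2, 95, False, False),
    ("Sunny",    20.5, 70, False, True),
    ("Rainy",    23.8, 80, False, True),
    ("Sunny",    23.8, 70, True,  True),
    ("Overcast", 22.2, 90, True,  True),
    ("Overcast", 27.2, 75, False, True),
    ("Rainy",    21.6, 80, True,  False),
]


def predict_play(outlook, humidity_pct, windy):
    """Apply the decision tree from the "Model" slide to one case."""
    if outlook == "Overcast":
        return True                    # Overcast: always play
    if outlook == "Rainy":
        return not windy               # Rainy: play only when there is no wind
    if outlook == "Sunny":
        return humidity_pct <= 77.5    # Sunny: play only when humidity is low
    raise ValueError(f"unknown outlook: {outlook!r}")


def accuracy(cases):
    """Fraction of cases where the tree agrees with the recorded outcome."""
    hits = sum(
        predict_play(outlook, humidity, windy) == play
        for outlook, _temperature, humidity, windy, play in cases
    )
    return hits / len(cases)


if __name__ == "__main__":
    print(f"Accuracy on the 14 known cases: {accuracy(CASES):.0%}")
```

Note that temperature is never used by the tree. The accuracy here is measured on the same cases the tree was built from, so it says nothing yet about how the model behaves on new data.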
Data Mining Algorithms
 • Naive Bayes
 • Decision Trees
 • Autoregressive trees (ARTxp) and ARIMA
 • K-Means
 • Kohonen Maps
 • Neural Networks
 • Logistic regression
 • Time Series
Where can I use them?
• Marketing: Segmentation, Campaigns, Results,
  Loyalty,...
• Sales: Behaviour detection, Sales habits
• Finance: Investments, Portfolio Management
• Banking and Insurance: Credit Checks
• Security: Fraud Detection
• Medicine: Possible treatment analysis
• Manufacturing: Quality Control
• Internet: Click analysis, Text Mining
Data Mining and CRM (1)

 • Detect the best prospects / customers
 • Select the best communication channel for
   prospects / customers
 • Select an appropriate message for
   prospects / customers
 • Cross-selling, Up-selling and sales
   recommendation engines
Data Mining and CRM (2)

 • Improve direct marketing campaign results
 • Customer base segmentation
 • Reduce credit risk exposure
 • Customer Lifetime Value
 • Customer retention and loss
Clustering
• “Self” Customer Segmentation
 • Descriptive Characteristics
 • Behavioural Characteristics
   • Relationship
   • Purchases
   • Payments
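As a rough illustration of how behavioural characteristics can drive a segmentation, here is a minimal k-means sketch in plain Python. Everything in it is an assumption made for the example: the `CUSTOMERS` data is invented, the two features (purchases per year, average payment delay in days) are just one possible choice, and the centroids are initialised naively from the first `k` points.

```python
from math import dist

# Hypothetical customers: (purchases_per_year, avg_payment_delay_days)
CUSTOMERS = [(2, 40), (30, 5), (3, 35), (28, 3), (1, 45), (32, 4)]


def k_means(points, k, iterations=20):
    """Group points into k clusters; returns (centroids, labels)."""
    # Naive, deterministic initialisation: use the first k points.
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    labels = [min(range(k), key=lambda i: dist(p, centroids[i])) for p in points]
    return centroids, labels


if __name__ == "__main__":
    centroids, labels = k_means(CUSTOMERS, k=2)
    for customer, label in zip(CUSTOMERS, labels):
        print(customer, "-> segment", label)
```

In practice the features would usually be rescaled first, so that one characteristic does not dominate the distance simply because of its units.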
Classification
• Customers by purchase behaviour
• Customers by payment behaviour
• Customers by resources devoted/needed
  to their service
• Customers by credit profile
• Customers by attention required
Association Rules

• Market Basket Analysis
• Cross Selling
• Up Selling
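Market basket analysis is usually expressed through three measures of a rule "A implies B": support, confidence and lift. The sketch below computes them over a handful of invented baskets; `BASKETS` and the function names are hypothetical and only meant to show the arithmetic.

```python
# Hypothetical shopping baskets, one set of items per transaction.
BASKETS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Of the baskets containing the antecedent, the fraction that also
    contain the consequent. Assumes the antecedent occurs at least once."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)


def lift(antecedent, consequent, baskets):
    """How much more often the two occur together than if they were
    independent; values above 1 suggest a positive association."""
    return confidence(antecedent, consequent, baskets) / support(consequent, baskets)


if __name__ == "__main__":
    rule = ({"bread"}, {"butter"})
    print("support   :", support(rule[0] | rule[1], BASKETS))
    print("confidence:", confidence(*rule, BASKETS))
    print("lift      :", lift(*rule, BASKETS))
```

A rule with high confidence and a lift clearly above 1 is a candidate for cross-selling; a rule with very low support may be true but too rare to act on.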
Prediction / Forecasting

• Revenue Projection
• Payment Projection
• Units Sold Projection
• Cash Flow Projection
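The simplest possible projection fits a straight line to past values and extends it forward. The sketch below does this with ordinary least squares in plain Python. It is only an illustration: `MONTHLY_REVENUE` is invented data, the function names are hypothetical, and a real forecast would normally also account for seasonality and uncertainty.

```python
# Hypothetical revenue for five consecutive months (any currency unit).
MONTHLY_REVENUE = [100.0, 108.0, 121.0, 130.0, 141.0]


def fit_linear_trend(values):
    """Least-squares line through (0, values[0]), (1, values[1]), ...

    Returns (slope, intercept). Needs at least two values.
    """
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def project(values, periods_ahead):
    """Extend the fitted trend line for the next periods_ahead periods."""
    slope, intercept = fit_linear_trend(values)
    n = len(values)
    return [intercept + slope * (n + i) for i in range(periods_ahead)]


if __name__ == "__main__":
    for offset, value in enumerate(project(MONTHLY_REVENUE, 3), start=1):
        print(f"month +{offset}: {value:.1f}")
```

The same two functions apply unchanged to payments, units sold or cash flow, since they only need a list of past values in time order.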
Some other DM cases


• Key Influencers
• Predictions Calculator
Some Possible Problems (1)
• Learning things that are not true
 • The patterns may not reflect any underlying rule
 • The model may not be based on a representative number of
    examples
 • The data may not be detailed enough for analysis
Some Possible Problems (2)

• Learning things that are true, but not
  useful
 • Learning things that we already knew
 • Learning things that cannot be applied
Thank you!
