
Probabilistic Structural Equations - Bayesian Networks for the Analysis of a Perfume Market

After a brief introduction to Bayesian Belief Networks, we describe how Probabilistic Structural Equations (PSE) can be induced by BayesiaLab to analyze a specific Perfume Market. We also describe the Multi-Quadrant Analysis (opportunity plots), a new analysis tool that takes into account the competitive position of each product's drivers when computing the optimal policies.


Probabilistic Structural Equations - Bayesian Networks for the Analysis of a Perfume Market

  1. Probabilistic Structural Equations: Application to the Analysis of a Perfume Market. Dr. Lionel JOUFFE, August 2009. Plan: Introduction, Bayesian Networks, Application. ©2009 Bayesia SA All rights reserved. Forbidden reproduction in whole or part without the Bayesia’s express written permission
  2. BayesiaLab’s Probabilistic Structural Equations for Perfume Market Analysis
  3. INTRODUCTION
  4. Bayesian Networks: A Computational Tool to Model Uncertainty. Based on both graph theory and probability theory. Manual modeling through brainstorming: probabilistic expert systems. Induction by automatic learning: data analysis, data mining.
  5. Bayesian Networks. 1763: Bayes’ Theorem, P(A|B) = P(B|A)P(A)/P(B). 1988: Judea Pearl, “Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference”. 1996: “Microsoft's competitive advantage is its expertise in Bayesian networks”, Bill Gates. 2004: Bayesian Machine Learning ranked 4th among the 10 Emerging Technologies That Will Change Your World.
  6. Example of Probabilistic Reasoning: Letter from the analysis laboratory. “You recently went to our laboratory for a screening test. The targeted rare disease has a prevalence of one person out of ten thousand. We regret to inform you that this test, which has a symmetric efficiency of 99%, is positive.” What is your feeling after reading this letter? Do you think that the probability that you are affected is 1%, 50% or 99%?
  7. Example of Probabilistic Reasoning: Letter from the analysis laboratory. One person out of 10 000 is affected. He will receive “0.99 letter” with a positive test result. Among the 9 999 other persons, “99.99 persons” will receive a letter with a positive test result.
  8. Example of Probabilistic Reasoning: Letter from the analysis laboratory. There is then a total of 0.99 + 99.99 letters with a positive test result. Probability to be affected when one receives such a letter: 0.99/(0.99+99.99) = 0.98%
  9. Example of Probabilistic Reasoning: Letter from the analysis laboratory
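
The arithmetic of the laboratory letter can be checked with a few lines of Python. This is only a sketch: it reads "symmetric efficiency of 99%" as a sensitivity and a specificity both equal to 0.99, and the function name is chosen here for illustration.

```python
def posterior_given_positive(prevalence, sensitivity, specificity):
    """Bayes' theorem: P(affected | positive test)."""
    true_positive = prevalence * sensitivity
    false_positive = (1.0 - prevalence) * (1.0 - specificity)
    return true_positive / (true_positive + false_positive)

# Prevalence of 1 in 10 000, test assumed 99% reliable in both directions.
p_affected = posterior_given_positive(1 / 10_000, 0.99, 0.99)
print(f"{p_affected:.2%}")
```

The result matches the ratio 0.99/(0.99+99.99) of the letter counts: the rarity of the disease dominates the reliability of the test.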
  10. BAYESIAN BELIEF NETWORKS
  11. ... are made of Two Distinct Parts. Structure: Directed Acyclic Graph (DAG), i.e. no directed loop; nodes represent the domain’s variables; arcs represent the direct probabilistic influences between the variables (possibly causal). Parameters: probability distributions are associated with each node, usually by using tables.
  12. ... are Powerful Inference Engines. We get some evidence on the states of a subset of variables: hard positive evidence; hard negative evidence; likelihoods; probability distributions (fixed or not); mean values (fixed or not). We then want to take these findings into account in a rigorous way to update our belief on the states of the other variables: probability distributions on their values. Multi-Directional Inference (Simulation and/or Diagnosis).
  13. How to Build a Bayesian Network? Modeling by Brainstorming: productive exchange between experts that can ease the consensus; an Expert System with powerful computational and analytical abilities; modeling of rare or never occurred cases. Automatic Modeling by Data Mining: probability estimation/updating of a network; structural learning and probability estimation; missing values; filtered/censored states; initial network proposed by experts; discovering of all the direct probabilistic relations; target node characterization (supervised learning); data clustering; variable clustering; Probabilistic Structural Equations.
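
As an illustration of the two parts (structure and parameters) and of multi-directional inference, here is a minimal sketch in plain Python. The two-node network, its probability tables and all function names are hypothetical and reuse the screening-test numbers; inference is done by brute-force enumeration with hard evidence only, which is not how BayesiaLab works internally.

```python
from itertools import product

# Structure: a DAG stored as node -> list of parents (toy example).
parents = {"Disease": [], "Test": ["Disease"]}
states = {"Disease": [True, False], "Test": [True, False]}

# Parameters: one conditional probability table per node,
# keyed by the tuple of parent states.
cpt = {
    "Disease": {(): {True: 0.0001, False: 0.9999}},
    "Test": {
        (True,): {True: 0.99, False: 0.01},
        (False,): {True: 0.01, False: 0.99},
    },
}

def joint_probability(assignment):
    """Factorized joint: product of each node's probability given its parents."""
    p = 1.0
    for node, node_parents in parents.items():
        key = tuple(assignment[q] for q in node_parents)
        p *= cpt[node][key][assignment[node]]
    return p

def posterior(target, evidence):
    """P(target | hard evidence), by enumerating every joint assignment."""
    names = list(parents)
    scores = {s: 0.0 for s in states[target]}
    for values in product(*(states[n] for n in names)):
        assignment = dict(zip(names, values))
        if all(assignment[k] == v for k, v in evidence.items()):
            scores[assignment[target]] += joint_probability(assignment)
    total = sum(scores.values())
    return {s: p / total for s, p in scores.items()}

diagnosis = posterior("Disease", {"Test": True})   # from effect to cause
simulation = posterior("Test", {"Disease": True})  # from cause to effect
```

The same tables answer both questions, which is what "multi-directional" means here: evidence can be set on any node and propagated to the others.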
  14. PROBABILISTIC STRUCTURAL EQUATIONS*: Perfume Market Analysis. * See “Probabilistic Structural Equations and Path Analysis - Part I” (http://www.bayesia.com/en/products/bayesialab/resources/tutorials/probabilistic-structural-equations-I.php) for a detailed BayesiaLab tutorial describing the complete workflow to get Probabilistic Structural Equations.
  15. Perfume Market Analysis: Questionnaire’s characteristics. To get an insight into the market (11 products), 1,300 monadic tests have been carried out (each woman has evaluated only one perfume). 1 target variable, the Purchase Intent: 6 numerical states. 27 questions relative to the perfume: 10 numerical levels considered as continuous values and discretized into 5 numerical states (equal distances). 19 questions relative to the woman wearing the perfume: 10 numerical levels considered as continuous values and discretized into 5 numerical states (equal distances). 1 Just About Right (JAR) question for the fragrance Intensity: 5 numerical states.
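
The equal-distance discretization of a 10-level answer into 5 states could look like the sketch below. The slides do not give the actual range of the scale; the range 1 to 10 and the function name are assumptions made for the example.

```python
def discretize_equal_width(x, lo=1.0, hi=10.0, n_states=5):
    """Map a value in [lo, hi] to one of n_states intervals of equal width."""
    width = (hi - lo) / n_states
    # The upper bound itself falls into the last interval.
    return min(int((x - lo) / width), n_states - 1)

levels = list(range(1, 11))                       # assumed scale: 1..10
discrete_states = [discretize_equal_width(v) for v in levels]
print(discrete_states)
```

With these assumptions each discrete state groups two consecutive answer levels.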
  16. Step 1: Unsupervised learning on the Manifest variables only
  17. Analysis of the arcs’ strength. Here is the Kullback-Leibler Divergence associated with the arc, and its relative weight in the factorized representation of the Joint Probability distribution.
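
To make the Kullback-Leibler measure concrete, the sketch below compares a joint distribution over two variables with the product of its marginals, i.e. the same two-node network with and without the arc. Reading the arc strength this way is an assumption for illustration; the joint table is hypothetical.

```python
import math

def kl_divergence(p, q):
    """D_KL(p || q) in bits for two discrete distributions on the same cells."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical joint P(X, Y) for two binary variables.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
cells = sorted(joint)

# Marginals, used to build the joint of the network without the arc X -> Y.
p_x = {x: sum(p for (a, _), p in joint.items() if a == x) for x in (0, 1)}
p_y = {y: sum(p for (_, b), p in joint.items() if b == y) for y in (0, 1)}

with_arc = [joint[c] for c in cells]
without_arc = [p_x[x] * p_y[y] for (x, y) in cells]

arc_strength = kl_divergence(with_arc, without_arc)
```

For a single arc between two nodes this quantity is the mutual information between them; it is zero exactly when removing the arc loses nothing.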
  18. Step 2: Variables’ Clustering to find the concepts. Based on those Kullback-Leibler measures, 15 clusters are automatically proposed by BayesiaLab’s variable clustering algorithm.
  19. Step 2: Variables’ Clustering
  20. Step 3: Multiple Data Clustering. By using BayesiaLab’s Multiple-Clustering algorithm, we carry out data clustering on the implied subset of variables, for each cluster of variables. Factor 0 is a new random variable summarizing these 5 manifest variables. Factor 1 is a new random variable that summarizes these 5 manifest variables. Factor 2 is a new random variable that summarizes these 4 manifest variables. .....
  21. Analysis of the Induced Factors: Factor 0. Based on the associated variables, we name this Factor “IS SELF-CONFIDENT”. 5 states have been automatically created by BayesiaLab’s Data Clustering algorithm. Here is the Marginal Distribution over those 5 states.
  22. Analysis of the Induced Factors: Quality measurement of Factor 0. The state’s Purity is the mean of its posterior probabilities (given the manifest variables), over all the points that have been associated to that state with the maximum likelihood rule. When the purity is not 100%, the remaining probabilities are used to define the probabilistic neighborhood. The 2-dimensional representation of Factor 0: the bubble size is proportional to the prior probability, the darkness of the blue represents the state purity, and the bubble proximity is based on the probabilistic vicinity.
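
The Purity definition translates directly into code. In this sketch each data point carries the posterior distribution over the factor's states given its manifest variables; the three posteriors and the function name are made up for the example.

```python
def state_purity(posteriors):
    """Mean posterior probability of each state over the points assigned to it.

    posteriors: one dict per data point, mapping state -> P(state | manifests).
    A point is assigned to its most probable state (maximum likelihood rule).
    """
    assigned = {}
    for post in posteriors:
        best = max(post, key=post.get)
        assigned.setdefault(best, []).append(post[best])
    return {state: sum(ps) / len(ps) for state, ps in assigned.items()}

# Hypothetical posteriors for three respondents and two states.
points = [
    {"C1": 0.9, "C2": 0.1},
    {"C1": 0.7, "C2": 0.3},
    {"C1": 0.2, "C2": 0.8},
]
purity = state_purity(points)
```

A purity of 100% would mean that every point assigned to a state belongs to it with certainty.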
  23. Analysis of the Induced Factors: Quality measurement of Factor 0. The 5 states of Factor 0 summarize the Joint Probability Distribution over its 5 associated manifest variables. This Joint is a 5-dimensional hypercube, with 5 states per dimension, i.e. 5^5 cells = 3,125 probabilities. This probability density function is based on the database’s log-Likelihood returned by Factor 0’s network. The Contingency Table Fit measures the representation quality of the Joint Probability Distribution. 100% corresponds to the perfect representation with the fully connected network (no independence hypothesis), 0% corresponds to the representation with the fully unconnected network (no dependence hypothesis).
  24. Analysis of the Induced Factors: Quality measurement of Factor 0. In the specific case of a Factor’s analysis, the dimension represented by that factor is not taken into account in the Joint. The Contingency Table Fit then measures the quality of the Joint’s summary realized by the Factor’s states. Contingency Table Fit: 78.39%. Contingency Table Fit: 85.04%. The representation of the Joint (defined over the 5 manifest variables) with the 5-state latent variable Factor 0 is more precise than the one obtained with an unsupervised learning representing the direct probabilistic relations between the manifest variables.
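
One way to read the Contingency Table Fit is as a linear rescaling of the model's log-likelihood between the two reference networks named above. The formula below is an assumption consistent with the stated 0% and 100% endpoints, not a formula quoted from BayesiaLab, and the log-likelihood values are invented.

```python
def contingency_table_fit(ll_model, ll_unconnected, ll_fully_connected):
    """Position of the model's log-likelihood between the two references, in %."""
    return 100.0 * (ll_model - ll_unconnected) / (ll_fully_connected - ll_unconnected)

# Size of the joint over 5 manifest variables with 5 states each.
n_cells = 5 ** 5

# Hypothetical log-likelihoods of the same database under three networks.
ll_unconnected = -9000.0
ll_fully_connected = -7000.0
ll_model = -7400.0

fit = contingency_table_fit(ll_model, ll_unconnected, ll_fully_connected)
```

Under this reading a fit of 80% means the model recovers four fifths of the log-likelihood gap between no dependence at all and the full joint.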
  25. Analysis of the Induced Factors: Semantic analysis of Factor 0. The numerical value associated with each state corresponds to the mean value over the manifest variables when this latent state is observed (weighted by the relative significance of the manifest variables wrt that state). These values give a quick insight into the meaning of the state. For example, C3 corresponds to the lowest evaluations, whereas C5 corresponds to the highest ones.
  26. Analysis of the Induced Factors. Here is a table describing the Multiple Clustering key measures obtained during the data clustering of the 15 manifest variables’ clusters.
  27. Final Step: Unsupervised Learning on Manifest, Latent, and Target variables. The “Probabilistic Structural Equation” has been obtained under some constraints: no arc from Manifests toward Factors; no direct relation between Manifests; no direct relation between the Target and Manifests.
  28. Path Analysis: Focussing on Factor variables only. The Path can be highlighted just by hiding the Manifest variables. As we can see, the Purchase Intent is only directly connected to one Latent variable, the “ADEQUACY”.
  29. Path Analysis: Focussing on Factor variables only. Factors’ Hierarchization by using the Standardized Total Effects (STE). Graphical representation of each Factor’s influence on the Purchase Intent.
  30. Path Analysis: Focussing on Factor variables only. Our Quadrant Analysis gives a concise view of the Factors’ hierarchy wrt the Purchase Intent. Whereas the Y-axis is based on the Standardized Total Effect (STE), the X-axis corresponds to the Factors’ mean value. The quadrants are delimited by the Mean of the Mean Values and the Mean of the STEs.
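
The quadrant construction can be sketched as follows: each factor is placed relative to the mean of the mean values (X-axis) and the mean of the STEs (Y-axis). The factor names and numbers are hypothetical, and the STEs are taken as given inputs rather than computed.

```python
def quadrant_positions(factors):
    """Place each factor relative to the two dividing lines of the quadrant plot.

    factors: name -> (mean value, standardized total effect).
    """
    mean_of_values = sum(v for v, _ in factors.values()) / len(factors)
    mean_of_stes = sum(s for _, s in factors.values()) / len(factors)
    positions = {}
    for name, (value, ste) in factors.items():
        effect = "high effect" if ste >= mean_of_stes else "low effect"
        level = "high value" if value >= mean_of_values else "low value"
        positions[name] = (effect, level)
    return positions

# Hypothetical factors: (mean value, STE wrt the Purchase Intent).
factors = {
    "A": (6.0, 0.30),
    "B": (7.5, 0.25),
    "C": (7.0, 0.05),
    "D": (5.5, 0.08),
}
positions = quadrant_positions(factors)
```

Because the dividing lines are means over the plotted factors, the positions are relative: adding or removing a factor can move the others to a different quadrant.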
  31. Driver Analysis: Focussing on Manifest variables only. The Bayesian network representing the Probabilistic Structural Equation (PSE) has been learnt by using the Perfume Total Market (11 products): useful for understanding the Total Market; inappropriate for finding the levers that can be used to improve a given product. To be able to analyze the products’ drivers, we define the Product variable as a BayesiaLab Breakout variable: the PSE’s structure remains the same for all the products; the PSE’s parameters (conditional probability tables) are estimated, for each perfume, on its corresponding subset of lines.
  32. Driver Analysis: Focussing on Manifest variables only. Only a subset of Manifest variables can be used as Drivers. The PSE below masks the non-actionable variables.
  33. Driver Analysis for Product 10. Due to non-linearity, the Standardized Total Effect (STE) does not reflect the importance of Intensity. This graph highlights the non-linear influence of Intensity (JAR variable) on Purchase Intent.
  34. Driver Analysis for Product 10. Note that STE is only proposed in BayesiaLab for some analysis tools. This is not a measure used for learning Bayesian networks (BN). As the states are discrete, the learning algorithms are not sensitive to linearity. The analysis below ranks the Drivers wrt the Mutual Information criterion. As we can see, Intensity is now in the 4th position.
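
Why a linear measure can miss a Just About Right variable while Mutual Information does not can be shown on a toy joint distribution. The numbers below are hypothetical, purchase intent is reduced to a binary outcome, and plain covariance stands in for a linear effect measure; this is not BayesiaLab's STE computation.

```python
import math

# Hypothetical JAR-shaped relation: purchase is most likely when Intensity is
# "just right" (state 3) and drops symmetrically on both sides.
intensity_states = [1, 2, 3, 4, 5]
p_intensity = [0.2, 0.2, 0.2, 0.2, 0.2]
p_buy_given_intensity = [0.1, 0.5, 0.9, 0.5, 0.1]

joint = {}
for i, p_i, p_buy in zip(intensity_states, p_intensity, p_buy_given_intensity):
    joint[(i, 1)] = p_i * p_buy
    joint[(i, 0)] = p_i * (1.0 - p_buy)

def covariance(joint):
    """Linear association between the two numerical variables."""
    e_x = sum(p * x for (x, _), p in joint.items())
    e_y = sum(p * y for (_, y), p in joint.items())
    e_xy = sum(p * x * y for (x, y), p in joint.items())
    return e_xy - e_x * e_y

def mutual_information(joint):
    """I(X; Y) in bits, which does not depend on the ordering of the states."""
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    return sum(
        p * math.log2(p / (p_x[x] * p_y[y]))
        for (x, y), p in joint.items()
        if p > 0
    )

linear_effect = covariance(joint)
information = mutual_information(joint)
```

Here the covariance vanishes by symmetry although Intensity clearly matters, whereas the mutual information is about 0.3 bits.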
  35. Driver Analysis for Product 10. To be able to use STE properly, we can use BayesiaLab to linearize Intensity. It will then associate numerical values with the states in order to get a positive linear relation (sorting of the states wrt their relation to Purchase Intent). Intensity is now in the 4th position with STE and with the Slopes in the Graphical representation.
  36. Driver Analysis for Product 10. Quadrant based on the potential Drivers (quadrants labelled 1 and 2 on the top row, 4 and 3 on the bottom row). Usually this kind of quadrant can be used to quickly see which Drivers to prioritize. 1: Concentrate here. 2: Keep up the good work. 3: Possible overkill. 4: Low priority.
  37. Driver Analysis for Product 10. However, this kind of interpretation is not appropriate here. Indeed, quadrants are defined with the means (STEs and Mean Values) of the studied product. Even if a variable is located in Quadrants 1 or 4, its value can be the highest of the Total Market. Conversely, variables belonging to Quadrants 2 and 3 can also have low values compared with the other products. Thanks to the scales associated with each variable, this new BayesiaLab Quadrant gives a quick insight into how the variables are ranked wrt the other products. Product 10 has the best Intensity value, but a poor Flowery value (lower than the mean value over the products).
  38. Driver Analysis for Product 10. By hovering over the point, it is possible to have a specific view of the variable values for all the products. The best ranked product on Flowery is then Product 11, the worst one being Product 1. This Multiple-Quadrant tool can export the variation percentage needed to reach the best market value, for each product and each variable. For Product 10, we need to apply a 10.02% increase on the Flowery mean to reach Product 11’s level.
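
The exported variation percentage is read here as a relative increase of the product's mean up to the best mean on the market. The sketch below uses invented means (the slides only report the resulting 10.02% for Flowery) and a hypothetical function name.

```python
def variation_to_best(current_mean, best_mean):
    """Relative increase (in %) needed to move current_mean up to best_mean."""
    return 100.0 * (best_mean - current_mean) / current_mean

# Hypothetical Flowery means for the studied product and for the market leader.
studied_product_mean = 6.0
best_market_mean = 6.6
needed_increase = variation_to_best(studied_product_mean, best_market_mean)
```

A product that already holds the best market value gets a variation of 0%, so this driver leaves no room for improvement relative to competitors.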
  39. Driver Analysis for Product 10. We use our Target Dynamic Profile tool to estimate the most realistic action policy. Here are the optimization parameters: maximize the Purchase Intent Mean value; take into account the Joint Probability of the actions; take the costs into account (1 per action consisting in reaching the max authorized value); “Soft Increase” of the drivers’ mean by taking into account the exported variation values. The induced policy is then to work on Flowery, then Feminine, ...., and Fruity, to increase the Purchase Intent Value from 3.65 to 3.92. The Joint is 50.35%, which means that half of those product evaluations correspond to this setting. The column “Value/Mean at T” indicates the impact of each action on the other drivers. As we see, those impacts reduce the cost for the actions.
  40. Driver Analysis for Product 10. Here is the complete policy over all the drivers. The BayesiaLab Soft Increase reaches a targeted mean value by using the closest probability distribution to the initial one. This means that the corresponding action should be the easiest one, as it is close to the current state.
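
The idea of reaching a targeted mean with the closest distribution to the initial one can be sketched as follows. The slides do not say how closeness is measured; this example assumes the Kullback-Leibler divergence, for which the solution is an exponential tilting of the initial distribution, with the tilting parameter found by bisection. All names and numbers are hypothetical.

```python
import math

def tilt(p, values, lam):
    """Exponentially tilted distribution: q_i proportional to p_i * exp(lam * v_i)."""
    weights = [pi * math.exp(lam * v) for pi, v in zip(p, values)]
    total = sum(weights)
    return [w / total for w in weights]

def mean(q, values):
    return sum(qi * v for qi, v in zip(q, values))

def soft_target(p, values, target, lo=-50.0, hi=50.0, iterations=200):
    """Distribution closest to p (in KL divergence) whose mean equals target.

    The mean of the tilted distribution increases with lam, so bisection works
    as long as target lies strictly between the smallest and largest value.
    """
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if mean(tilt(p, values, mid), values) < target:
            lo = mid
        else:
            hi = mid
    return tilt(p, values, (lo + hi) / 2.0)

# Hypothetical driver with 5 numerical states and a current mean of 3.0.
state_values = [1, 2, 3, 4, 5]
initial = [0.1, 0.2, 0.4, 0.2, 0.1]
shifted = soft_target(initial, state_values, target=3.3)
```

Every state keeps a non-zero probability in the shifted distribution, in contrast with hard evidence that would put all the mass on a single state.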
  41. Driver Analysis for Product 10
  42. Driver Analysis for Product 5. Let’s compute the same Driver Analysis for Product 5.
  43. Driver Analysis for Product 5
  44. Contact. Address: BAYESIA SA, 6 rue Léonard de Vinci, BP0119, 53001 LAVAL Cedex, France. Contact: Dr. Lionel JOUFFE, Managing Director / Cofounder. Tel.: +33(0)243 49 75 58. Mobile: +33(0)607 25 70 05. Fax: +33(0)243 49 75 83.