Modeling Vehicle Choice and Simulating Market
Share with Bayesian Networks

A case study about predicting the U.S. market share of the Porsche Panamera
using the Bayesia Market Simulator


White Paper 2010/II

Stefan Conrady, stefan.conrady@conradyscience.com

Dr. Lionel Jouffe, jouffe@bayesia.com

December 18, 2010




Conrady Applied Science, LLC - Bayesia’s North American Partner for Sales and Consulting
Simulating Market Share with the Bayesia Market Simulator




Table of Contents

Modeling Vehicle Choice and Simulating Market Share with Bayesian Networks
    Abstract/Executive Summary
    Objective
    About the Authors
        Stefan Conrady
        Lionel Jouffe
    Acknowledgements

  Introduction
    Bayesian Networks for Choice Modeling

  Case Study
        Porsche Panamera
        Common Forecasting Practices

  Tutorial
    Data Preparation
        Consumer Research
        Variable Selection
        Set of Choice Alternatives
        Filtered Values (Censored States)
    Data Modeling
        Data Import
        Missing Values
        Discretization
        Variable Classes and Forbidden Arcs
        Unsupervised Learning

  Simulation
        Product Scenario Baseline
    Product Scenario Simulation
        Substitution and Cannibalization
    Market Scenario Simulation

  Limitations

  Outlook

  Summary

  Appendix
    Utility-Based Choice Theory
        Multinomial Logit Models
    Stated Preference Data
    Revealed Preference Data
    NVES Variables

  References

  Contact Information
        Conrady Applied Science, LLC
        Bayesia SAS

  Copyright

Conrady Applied Science, LLC - www.conradyscience.com




Modeling Vehicle Choice and Simulating Market Share with Bayesian Networks

Abstract/Executive Summary
We present a new method and the associated workflow for estimating market shares of future products based exclusively on pre-introduction data, such as syndicated studies conducted prior to product launch. Our approach provides a highly practical, fast and economical alternative to conducting new primary research.

With Bayesian networks as the framework, and by employing the BayesiaLab and Bayesia Market Simulator software packages, this approach helps market researchers and product planners to reliably perform market share simulations on their desktop computers[1], which would have been entirely inconceivable in the past.

This innovative approach is explained step-by-step in a study about the introduction of the new Porsche Panamera in the U.S. market. The results confirm that market share simulation with Bayesian networks is feasible even in niche markets that provide relatively few observations.

We believe that making this method and the tools accessible to practitioners is an important contribution to real-world marketing. We are confident that for many companies this approach can yield a step-change in their forecasting ability.

[Figure: Market Share Simulation Workflow with BayesiaLab and Bayesia Market Simulator. Market Data (from Survey) -> Modeling (BayesiaLab) -> Market Model (Bayesian Network) -> Simulation (Bayesia Market Simulator), together with the Scenario Definition (from Analyst) -> Projection (Market Shares)]

[1]   BayesiaLab and Bayesia Market Simulator can run on a wide range of operating systems, including Windows, OS X, Linux/Unix, etc.

Objective
This tutorial is intended for marketing practitioners who are exploring the use of Bayesian networks for their work. The example in this tutorial is meant to illustrate the capabilities of BayesiaLab with a real-world case study and actual consumer data. Beyond market researchers, analysts in many fields will hopefully find the proposed methodology valuable and intuitive. In this context, many of the technical steps are outlined in great detail, such as data preparation and the network learning, as they are applicable to research with BayesiaLab in general, regardless of the domain.

This paper is part of a series of tutorials, which are exploring a broad range of real-world applications of Bayesian networks.

About the Authors

Stefan Conrady
Stefan Conrady is the co-founder and managing partner of Conrady Applied Science, LLC, a privately held consulting firm specializing in knowledge discovery and probabilistic reasoning with Bayesian networks. In 2010, Conrady Applied Science was appointed the authorized sales and consulting partner of Bayesia SAS for North America. Stefan Conrady has many years of marketing, product planning and market research experience with Mercedes-Benz, BMW Group, Rolls-Royce Motor Cars and Nissan. In the context of these management assignments, Stefan has been based in Europe, North America and Asia.

Lionel Jouffe
Dr. Lionel Jouffe is co-founder and CEO of France-based Bayesia SAS. Lionel Jouffe holds a Ph.D. in Computer Science and has been working in the field of Artificial Intelligence since the early 1990s. He and his team have been developing BayesiaLab since 1999 and it has emerged as the leading software package for knowledge discovery, data mining and knowledge modeling using Bayesian networks. BayesiaLab enjoys broad acceptance in academic communities as well as in business and industry. The relevance of Bayesian networks, especially in the context of market research, is highlighted by Bayesia’s strategic partnership with Procter & Gamble, who has deployed BayesiaLab globally since 2007.

Acknowledgements
Strategic Vision, Inc.[2] (SVI) has generously made their 2009 New Vehicle Experience Survey available as a data source for this case study. In this context, special thanks go to Alexander Edwards, President, Automotive Division of Strategic Vision.

We would also like to thank Jeff Dotson[3], John Fitzgerald[4] and Frank Koppelman[5] for their ongoing coaching and their valuable comments on this paper. However, all errors remain the responsibility of the authors.

Finally, Kenneth Train’s[6] books and articles have been very helpful over the years as we explored the field of consumer choice modeling.

[2]   www.strategicvision.com
[3]   Assistant Professor of Marketing, Vanderbilt University, Owen Graduate School of Management.
[4]   President, Fitzgerald Brunetti Productions, Inc., New York.
[5]   Professor Emeritus of Civil and Environmental Engineering, Robert R. McCormick School of Engineering and Applied Science, Northwestern University.
[6]   Adjunct Professor of Economics and Public Policy, University of California, Berkeley.

Introduction
For the vast majority of businesses, market share is a key performance indicator. Market share is used as a metric that allows comparing competitive performance independently from overall market size and its fluctuations.

In the product planning process, the expected market share is critical, along with the overall market forecast,



as together they define the sales volume expectation, which, for obvious reasons, is a key element in most business cases.

As a result, it is critical for decision makers to correctly predict the future market shares of products not yet developed. The task of such market share forecasts typically falls to marketing and market research departments, who are most closely involved with understanding consumer behavior and, more specifically, the product choices consumers make.

If we fully understood the consumer’s decision making process and observed all components of it, we could simply generate a deterministic model for predicting future consumer choices. However, we do not, and it is obvious that many elements contributing to a consumer’s purchase decision are inherently unobservable. Despite our limited comprehension of the true human choice process, there are a number of tools that still allow modeling consumer choice with what is observable, and accounting for what will remain unknowable. In this context, and based on the seminal works of Nobel laureate Daniel McFadden[7], choice modeling has emerged as an important tool in understanding and simulating consumer choice.

Such choice models serve as a representation of the “real world” and thus become what Judea Pearl likes to call “oracles” that allow us to “deliberately reason about the consequences of actions we have not yet taken.”[8]

Bayesian Networks for Choice Modeling
Using Bayesian networks[9] as the general framework for modeling a domain or system has many advantages, which Darwiche (2010) summarizes as follows:

• “Bayesian networks provide a systematic and localized method for structuring probabilistic information about a situation into a coherent whole […]”

• “Many applications can be reduced to Bayesian network inference, allowing one to capitalize on Bayesian network algorithms instead of having to invent specialized algorithms for each new application.”

Given the very attractive properties of Bayesian networks for representing a wide range of problem domains, it seems appropriate to apply them to choice modeling as well. In particular, the BayesiaLab software package has made it very convenient to automatically machine-learn fairly large and complex Bayesian networks from observational data.

Beyond the convenience and speed of estimating Bayesian networks with BayesiaLab, there are four fundamental differences in modeling consumer choice with Bayesian networks compared to traditional discrete choice models.[10]

[7]   Daniel McFadden received, jointly with James Heckman, the 2000 Nobel Memorial Prize in Economic Sciences; McFadden’s share of the prize was “for his development of theory and methods for analyzing discrete choice”.
[8]   A recurring quote from Judea Pearl’s many lectures on causality.
[9]   A Bayesian network is a graphical model that represents the joint probability distribution over a set of random variables and their conditional dependencies via a directed acyclic graph (DAG). For example, a Bayesian network could represent the probabilistic relationships between diseases and symptoms. Given symptoms, the network can be used to compute the probabilities of the presence of various diseases. A very concise introduction to Bayesian networks can be found in Darwiche (2010).
[10]   A very brief overview of utility-based choice models is provided in the appendix.
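The disease-and-symptom example in footnote 9 can be made concrete with a minimal sketch. It assumes a hypothetical two-node network (Disease -> Symptom) and invented probabilities; neither the numbers nor the function name come from the paper or from BayesiaLab. The sketch only illustrates how observing a symptom updates the probability of a disease.

```python
# Hypothetical two-node Bayesian network: Disease -> Symptom.
# All probabilities are invented illustration values, not data from the paper.

def posterior_disease_given_symptom(p_disease,
                                    p_symptom_given_disease,
                                    p_symptom_given_healthy):
    """Return P(disease | symptom observed) using Bayes' rule."""
    # Joint probability of (disease, symptom) and (no disease, symptom).
    joint_disease = p_disease * p_symptom_given_disease
    joint_healthy = (1.0 - p_disease) * p_symptom_given_healthy
    # Normalize by the total probability of observing the symptom.
    return joint_disease / (joint_disease + joint_healthy)


# Assumed prior of 1%, symptom present in 90% of sick and 5% of healthy cases.
posterior = posterior_disease_given_symptom(0.01, 0.90, 0.05)
print(round(posterior, 4))
```

With these assumed inputs the posterior is roughly 15%, far above the 1% prior. A learned network for vehicle choice works on the same principle, only with many more variables.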



1.     Whereas utility-based choice models, such as multinomial logit models (MNL), will “flatten” the vector of attribute utilities into a single scalar value, Bayesian networks do not inherently restrict all the dimensions relating to choice. For example, learning a Bayesian network on observed vehicle choices might reveal that fuel economy and vehicle price are subject to tradeoff, while safety is a nonnegotiable basic requirement for the consumer. Correctly recognizing such dynamics is obviously critical for making predictions about future consumer choices.

2.     Bayesian networks are nonparametric and therefore do not require the specification of a functional form. No assumptions need to be made regarding the form of links between variables. Potentially nonlinear patterns are therefore not an issue for model estimation or simulation.

3.     Bayesian networks are inherently probabilistic and as such there is no need to specify an error term. An error term would be needed in a traditional choice model to make it non-deterministic.

4.     In BayesiaLab all computations are natively discrete and therefore no transformation functions, such as logit or probit, are needed. Given that we are dealing with discrete consumer choices, this all-discrete approach is an advantage.

For our case study we use BayesiaLab 5.0 Professional Edition to learn a Bayesian network from consumer choices in the form of stated preference (SP) or revealed preference (RP) data.[11],[12] The learned Bayesian network allows us to compute the posterior probability distribution in each choice situation, including hypothetical product alternatives (and even hypothetical consumers). As a result we obtain a choice probability as a function of product and consumer attributes.

In order to obtain a product’s projected market share, we then need to simulate choice probabilities across all product scenarios and across all individuals in the population under study. For this specific purpose Bayesia SAS has developed the Bayesia Market Simulator, which uses the Bayesian networks generated by BayesiaLab. Both tools will play a central role in this case study.

[11]   The properties of Stated Preference (SP) and Revealed Preference (RP) data are explained in the appendix.
[12]   Although we focus here exclusively on machine-learning consumer behavior, within the BayesiaLab framework we can also utilize expert knowledge about consumer behavior. For instance, vehicle dealers and their salespeople will have extensive knowledge about how consumers behave in the showroom. A special Knowledge Elicitation module in BayesiaLab can formally capture such expertise and build a new Bayesian network from it or augment an existing one. Knowledge Elicitation with BayesiaLab will be the subject of a separate tutorial to be published in the near future.

Case Study
To illustrate the entire market share estimation process with Bayesian networks, we have derived a case study from the U.S. auto industry. More specifically, we will model consumer choice behavior in the high-end vehicle market based on 2009 survey data. This is an interesting point in time, as it precedes the launch of the new Porsche Panamera in model year 2010 (MY 2010), which will be the focus of our study.

Porsche Panamera
After the highly successful Cayenne, a four-door luxury SUV, the Panamera is Porsche’s second vehicle with four doors. Clearly influenced by the legendary 911’s styling, the Panamera offers sports-car looks and performance while comfortably accommodating four passengers. It



enters a segment with well-established contenders, such as the Mercedes-Benz S-Class[13], the BMW 7-series[14] and the Audi A8[15], shown below in that order.

Beyond these traditional premium sedans, there are a number of less conventional products that one can assume to be in the Panamera’s competitive field as well. The coupe-like Mercedes-Benz CLS[16] would probably fall into this category.

Finally, the new Panamera may draw customers away from Porsche’s own product offerings, such as the Cayenne[17], an effect that is often referred to as “product substitution” or “product cannibalization.”

[13]   MY 2010 shown
[14]   MY 2009 shown
[15]   MY 2009 shown
[16]   MY 2010 shown
[17]   MY 2009 shown

It is not our intention to speculate about potential product interactions, but rather to attempt learning from



revealed consumer behavior in a very formal way with Bayesian networks.

In order not to prematurely restrict our consumer choice set, we have defined a broad set of competitors for our purposes and included all non-domestic luxury vehicles[18] (including Light Trucks) priced above $75,000.[19]

What was certainly a very real task for Porsche’s product planning team in recent years, i.e. predicting the Panamera market share, now becomes the topic of our case study and tutorial. Our objective is to predict what market share the Panamera will achieve without conducting any new research, strictly using RP data from before the product launch.

Common Forecasting Practices
Although we have no knowledge of the specific forecasting methods at Porsche, we know from industry experience that volume and market share forecasts are often determined through a long series of negotiations[20] between stakeholders, typically with an optimistic marketing group on one side and a skeptical CFO on the other. While expert consensus may indeed be a reasonable heuristic for business planning, the lack of forecasting formalisms is often justified by saying that forecasting is at least as much art as it is science.

The authors believe strongly that there is great risk in relying too heavily on “art”, which is inherently non-auditable, and have therefore been pursuing easily tractable, but scientifically sound methods to support managerial decision making, especially in the context of forecasting. With this in mind, this very formal and structured forecasting exercise was consciously chosen as the topic of the tutorial.

[18]   We followed the SVI segmentation and included “Luxury Car”, “Premium Coupe”, “Premium Convertible/Roadster” and “Luxury Utility” in our selection.
[19]   The $75,000 threshold was chosen as it marks the lower end of the Panamera price range.
[20]   As an interesting aside, these negotiations are usually Markovian in nature, i.e. the starting point of today’s negotiation only depends on the outcome of the previous negotiation.

Tutorial
In this tutorial we will explain each step from data preparation to market share simulation using BayesiaLab and Bayesia Market Simulator, according to the following outline:

1. Data preparation (external)

2. BayesiaLab:

     a. Data import

     b. Data modeling

3. Baseline product scenario generation (external)

4. Bayesia Market Simulator:

     a. Network import

     b. Definition of scenarios

     c. Market share simulation

Notation
To clearly distinguish between natural language, software-specific functions and study-specific variable names, the following notation is used:

• BayesiaLab and Bayesia Market Simulator functions, keywords, commands, etc., are shown in bold type.

• Variable/node names are capitalized and italicized.

Data Preparation

Consumer Research
This tutorial utilizes the 2009 New Vehicle Experience Survey, a syndicated study conducted annually by Strategic Vision, Inc., which surveys new vehicle buyers in the



U.S. This study is widely used in the auto industry and it         cles actual buyers did consider and which vehicles they
serves one of the primary market research tools. NVES              disposed in the context of their most recent purchase.23
contains over 1,000 variables and close to 200,000 re-
                                                                   As mentioned in the case study introduction, we included
spondent records. In large auto companies, hundreds of
                                                                   “Luxury Car”, “Premium Coupe”, “Premium
analysts typically have access to NVES, most often
through the mTAB interface provided by Productive Ac-              Convertible/Roadster” and “Luxury Utility” 24 in the
                                                                   choice set and we further restricted it by excluding all
cess, Inc. (PAI).21
                                                                   domestic vehicles and vehicles priced below $75,000. For
Variable Selection                                                 this segment of assumed Panamera competitors we have
Compared to traditional statistical models, Bayesian               approximately 1,200 unweighted observations in the
networks require much less “care” in terms of variable             2009 NVES, which, on a weighted basis, re ect ap-
selection, as overparameterization is generally not an             proximately 25,000 vehicles purchased in 2009.
issue. So, although we could easily start with all 1,000+
                                                                   Filtered Values (Censored States)
variables, for expositional clarity we will initially select
                                                                   Although in BayesiaLab we can be less rigorous regard-
only about 50 variables22 from the following categories,
                                                                   ing the maximum number of variables, we still need to
which we assume to capture relevant characteristics of
both the consumer and the product:                                 be conscious of the information contained in them.

                                                                   For instance, we need to distinguish unobserved values
1. Vehicle/product attributes, e.g. brand, segment, num-
                                                                   from non-existing values, although at      rst glance both
     ber of cylinders, transmission, drive type, etc.
                                                                   appear to be “simple” missing values in the database.
2. Consumer demographics, e.g. age, income, gender, etc.           BayesiaLab has a unique feature that allows treating
                                                                   non-existing values as Filtered Values or Censored States.
3. Vehicle-related consumer attitudes, e.g. “I want to
     look good when driving my vehicle”, “I want a basic,
no-frills vehicle that does the job,” etc.

Set of Choice Alternatives
Beyond variable selection, we must also define the set of choice alternatives and make assumptions about which vehicles a potential Panamera customer would consider. Not only that, but we also need to make sure that the alternatives to each of the Panamera’s choice alternatives are included. For instance, if we included the Porsche Cayenne in the choice set, then the Mercedes-Benz M-Class and the BMW X5 should be included too, and so on. One might argue that the vehicle purchase might be an alternative to a kitchen renovation or the purchase of a boat. Expert knowledge is clearly required at this point as to how far to expand the choice set. Furthermore, SVI’s NVES can also help us in this regard as it contains questions about what vehi-

To explain Filtered Values we need to resort to an automotive example from outside our specific study. We assume that we have two questions about trailer towing. We first ask, “do you use your vehicle for towing?”, and then, “what is the towing weight?” If the response to the first question is “no”, then a value for the second one cannot exist, which in BayesiaLab’s nomenclature is a Filtered Value or Censored State. We must not impute a value for towing weight in this case; instead, a Filtered Value code will indicate this special condition.

On the other hand, a respondent may answer “yes”, but then fail to provide a towing weight. In this case, a true value for the towing weight exists, but we cannot observe it. Here it is entirely appropriate to impute a missing value, as we will explain as part of the Data Import procedure.

21   www.paiwhq.com
22   A list of all variables used is given in the appendix. It should be noted that even 50 variables would create a major computational challenge with MNL models.
23   Martin Krzywinski’s visualization tool, Circos, is highly recommended for the interpretation of cross-shopping behavior: www.mkweb.bcgsc.ca/circos/
24   According to SVI’s segment definition.

To indicate Filtered Values to BayesiaLab, we will need to apply a study-specific logic and recode the relevant variables in the original database. Most statistical software packages have a set of functions for this kind of task.

For example, in STATISTICA this can be done with the
Recode function.
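
Outside a statistics package, the same recoding can be sketched in a few lines of Python. The field names `towing` and `towing_weight`, the `FV` marker and the use of `None` for an unobserved value are assumptions made for this illustration; the code that actually marks a Filtered Value depends on the study and on how the data are passed to BayesiaLab.

```python
# Hypothetical marker for a Filtered Value; not an official BayesiaLab code.
FILTERED = "FV"

def recode_towing_weight(record):
    """Return a copy of the record with towing_weight recoded.

    towing == "no"  -> the weight cannot exist, so mark it as filtered.
    towing == "yes" -> leave the weight unchanged; None stays a genuine
                       missing value that may be imputed later.
    """
    recoded = dict(record)
    if recoded["towing"] == "no":
        recoded["towing_weight"] = FILTERED
    return recoded

respondents = [
    {"id": 1, "towing": "yes", "towing_weight": 3500},
    {"id": 2, "towing": "no", "towing_weight": None},   # value cannot exist
    {"id": 3, "towing": "yes", "towing_weight": None},  # value exists, unobserved
]

recoded = [recode_towing_weight(r) for r in respondents]
for row in recoded:
    print(row["id"], row["towing"], row["towing_weight"])
```

Respondent 3 keeps `None`, so the unobserved weight remains a candidate for imputation, while respondent 2 carries the filtered marker and must not be imputed.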



Alternatively, this recoding logic can also be expressed with the following pseudo code:

IF towing=yes THEN towing weight=unchanged

IF towing=no THEN towing weight=FV (Filtered Value)

A simple Excel function will achieve the same and it is assumed that the reader can implement this without further guidance.

Although Filtered Values are very important in many research contexts, hence the emphasis here, our case study does not require using them.

Data Modeling

Data Import
To start the analysis with BayesiaLab, we first import the database, which needs to be formatted as a CSV file.25 With Data>Open Data Source>Text File, we start the Data Import wizard, which immediately provides a preview of the data file.

25   CSV stands for “comma-separated values”, a common format for text-based data files. As an alternative to this import format, BayesiaLab offers a JDBC connection, which is practical when accessing large databases on servers.

The table displayed in the Data Import wizard shows the individual variables as columns and the respondent records as rows. There are a number of options available, such as for Sampling. However, this is not necessary in our example given the relatively small size of the database.

Clicking the Next button prompts a data type analysis, which provides BayesiaLab’s best guess regarding the data type of each variable.

Furthermore, the Information box provides a brief summary regarding the number of records, the number of missing values, filtered states, etc.

For this example, we will need to override the default data type for the Unique Identifier variable, as each value is a nominal record identifier rather than a numerical scale value. We can change the data type by highlighting the Unique Identifier column and clicking the Row Identifier check box, which changes the color of the Unique Identifier column to beige.

Although it is not imperative to maintain a Row Identifier, and we could instead assign the Not Distributed status to the Unique Identifier variable, it can be quite helpful for finding individual respondent records at a later point in the analysis.

As the respondent records in the NVES survey are weighted, we need to select the Weight by clicking on the Combined Base Weight variable, which will turn the column green.

Missing Values
In the context of data import, it is important to point out how missing values are treated in BayesiaLab. The native, automatic processing of missing values reveals a particular strength of BayesiaLab.

In traditional statistical analysis, the analyst has to choose from a number of methods to handle missing values in a database, but unfortunately many of them have serious drawbacks. Perhaps the most common method is case-wise deletion, which simply excludes records that contain any missing values. Casually speaking, this means throwing away lots of good data (the non-missing values) along with the bad (the missing values). Another method is means-imputation, by which any missing value is filled in with the variable’s mean. Inevitably, this reduces the variance of the variable and thus has an impact on its summary statistics, which is clearly undesirable considering the intended analysis. In the case of discrete distributions, means-imputation typically also introduces a bias. There are other, better techniques, which typically demand significant computational effort and thus often turn into a labor-intensive standalone project rather than being just a preparatory step.

Without going into too much detail at this point, BayesiaLab can estimate all missing values given the learned network structure using the Expectation Maximization (EM) algorithm. As a result, we obtain a complete database without “making things up.” In traditional statistics, the equivalent would be to say that neither the mean nor the variance of the variables is affected by the imputation process.

Continuing in our data import process, the next screen provides options as to how to treat the missing values. Clicking the small upside-down triangle next to the variable names brings up a window with key statistics of the selected variable, in this case Age Bracket.

The very basic functions of filtering, i.e. case-wise deletion, and mean/modal value imputation are available. However, at this point, we can take advantage of BayesiaLab’s advanced missing values processing algorithms. We will select Dynamic Completion, which will continuously “fill in” and “update” the missing values according to the conditional distribution of the variable, as defined by the current structure of the network. However, as our network is not yet connected and hence does not have a structure, BayesiaLab will draw from the
marginal distribution of each variable to “tentatively”
establish placeholder values for each missing value.
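
The variance reduction caused by means-imputation, and the idea of drawing placeholders from the marginal distribution, can be illustrated with a small Python sketch. The sample values are made up, and the random draw only mimics the principle described here; it is not BayesiaLab’s Dynamic Completion algorithm.

```python
import random
import statistics

# Made-up ages; None marks a missing value.
ages = [25, 35, 35, 45, None, 55, None, 65, 35, 45]
observed = [a for a in ages if a is not None]

# Means-imputation: every gap receives the same value, which shrinks the variance.
mean_age = statistics.mean(observed)
mean_imputed = [mean_age if a is None else a for a in ages]

# Placeholder draw: sample each gap from the observed (marginal) distribution.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
drawn = [rng.choice(observed) if a is None else a for a in ages]

print("variance, observed values only:", statistics.pvariance(observed))
print("variance, after means-imputation:", statistics.pvariance(mean_imputed))
print("variance, after marginal draw:", statistics.pvariance(drawn))
```

Means-imputation leaves the mean untouched but adds cases with zero deviation, so the variance necessarily falls; the marginal draw uses only values that were actually observed.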

A screenshot from STATISTICA, where we have done most of the preprocessing, shows the marginal distribution of the Age Bracket variable in the form of a histogram.26

26   The normal curve in the histogram is just for illustration purposes. BayesiaLab always uses the actual discrete distribution, not a parametric approximation.

The missing Age Bracket values will be drawn from this marginal distribution and are used as placeholders, until we can use the structure of the Bayesian network to reestimate our missing values. As Dynamic Completion implies, BayesiaLab performs this on a continuous basis in the background, so at any point we would have the best possible estimates for the missing values, given the current network structure.

Discretization
The next step is the Discretization and Aggregation dialogue, which allows the analyst to determine the type of discretization, which must be performed on all continuous variables.27 We will use the Purchase Price variable to explain the process. Highlighting a variable will show the default discretization algorithm while the graph panel is initially blank.

27   BayesiaLab requires discrete distributions for all variables.

Clicking on the Type drop-down menu displays the choice of discretization algorithms.

Selecting Manual will show a cumulative graph of the Purchase Price distribution, and we can see that it ranges from $75,000 to $180,000.28

28   $75,000 was previously selected as the lower boundary for this particular vehicle segment. $180,000 was the highest reported price in NVES.

We could now manually select binning thresholds by way of point-and-click directly on the graph panel. This might be relevant, if there were government regulations in place with specific vehicle price thresholds.29

For our purposes, however, we want to create price categories that are meaningful in the context of our vehicle segment, and five bins may seem like a reasonable starting point.

Clicking Generate Discretization will prompt us to select the type of discretization and the number of desired intervals. Without having a-priori knowledge about the distribution of the Price variable, we may want to start with the Equal Distances algorithm.

The resulting view shows the generated intervals and by clicking on the interval boundaries we can see the percentage of cases falling into the adjacent intervals.

We learn from this that our bottom two intervals contain 89% of the cases, whereas the top two intervals contain just under 5% of the cases. This suggests that we may not have enough granularity to characterize the bulk of the market towards the bottom end of the price spectrum. Perhaps we also have too few cases within the top two intervals. So we will generate a new discretization, now with four intervals, and select K-Means as the type this time.

The resulting bins appear much more suitable to describe our domain.

We will proceed similarly with the only other continuous variable in the database, i.e. Age Bracket.

Note

For choosing discretization algorithms beyond this example, the following rule of thumb may be helpful:

• For supervised learning, choose Decision Tree.

• For unsupervised learning, choose, in the order of priority, K-Means, Equal Distances or Equal Frequencies.

Clicking Finish completes the import process and 49 variables (columns) from our database are now shown as blue nodes in the Graph Panel, which is the main window for network editing.

29   The now-expired luxury tax for passenger cars in the U.S. would be an example for such a policy.
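
The difference between Equal Distances and K-Means binning on a skewed price distribution can be sketched in Python. The functions below are simplified stand-ins written for this illustration, not BayesiaLab’s implementations, and the prices are made up to resemble a right-skewed distribution between $75,000 and $180,000.

```python
def equal_distance_edges(values, bins):
    """Interior thresholds that split [min, max] into equally wide intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    return [lo + width * i for i in range(1, bins)]

def kmeans_1d_edges(values, bins, iterations=50):
    """Interior thresholds from a simple one-dimensional k-means (Lloyd's algorithm)."""
    data = sorted(values)
    # Start the centers at evenly spaced positions in the sorted data.
    centers = [data[(2 * i + 1) * len(data) // (2 * bins)] for i in range(bins)]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in data:
            nearest = min(range(bins), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    centers.sort()
    # A threshold sits halfway between neighboring centers.
    return [(a + b) / 2 for a, b in zip(centers, centers[1:])]

def share_per_bin(values, edges):
    """Fraction of cases that falls into each interval."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[sum(v > e for e in edges)] += 1
    return [c / len(values) for c in counts]

# Made-up, right-skewed prices between $75,000 and $180,000.
prices = [75000, 78000, 80000, 82000, 85000, 88000, 90000, 92000,
          95000, 98000, 105000, 110000, 125000, 150000, 180000]

for name, edges in [("Equal Distances, 5 bins", equal_distance_edges(prices, 5)),
                    ("K-Means, 4 bins", kmeans_1d_edges(prices, 4))]:
    print(name, [round(e) for e in edges],
          [round(s, 2) for s in share_per_bin(prices, edges)])
```

With equally wide intervals most of the made-up cases pile up in the lowest bin, whereas the k-means thresholds move towards the dense lower end of the price range.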





The six nodes in the far left column reflect product attributes (green), the second-from-left column shows ten demographic attributes (yellow) and all remaining nodes to the right represent 33 vehicle-related attitudes (red). This initial view represents a fully unconnected Bayesian network.

Also, to simplify our nomenclature, we will combine the demographic attributes (yellow) and the vehicle-related attitudes (red) and refer to them together as “Market” variables (now all red).

Variable Classes and Forbidden Arcs
One is now tempted to immediately start with Unsupervised Learning to see how all these variables relate to each other.

However, there are two reasons why we need to introduce another step at this point:

1.   Our mission is to model the interactions between product variables on the one side and market variables on the other, so we can see the consumer response to products. For instance, we are more interested in learning P(Transmission=“Manual” | Attitude=“Driving is one of my favorite things”) than we are in P(Age < 45 | Number of children under 6 = 2). Hence we focus the learning algorithm on the area of interest, i.e. product attributes vis-à-vis market attributes.

2.   We must not learn the dependencies between the product variables themselves because they would simply reflect today’s product offerings and their contingencies, e.g. P(Vehicle Segment=“4-door sedan” | Brand=“Porsche”)=0. We do want to understand what is available today, but we certainly do not want to encode today’s product scenarios as constraints in the network. Instead, we want to be able to introduce new scenarios, which are not available today.

To focus learning in a specific area, we need to take an indirect approach and tell BayesiaLab “what not to learn.” So, to prevent the algorithm from learning the product-to-product variable relationships, we will “forbid” such arcs.

We first create a Class by highlighting all product nodes and then right-clicking them. From the menu, we then select Properties>Classes>Add.

When prompted for a name, we can choose something descriptive, so we give this new Class the label “Product”.





Having introduced this Class of nodes, we can now very easily manage Forbidden Arcs. More specifically, we want to make all arcs within the Class Product forbidden. A right-click anywhere on the Graph Panel opens up the menu from which we can select Edit Forbidden Arcs.

In the Forbidden Arc Editor, we can select the Class Product both as start and end.

We now repeat the above steps and also create Forbidden Arcs for the Market variables.

As a result, these Forbidden Arc relationships will appear in the Forbidden Arc Editor and will remain there unless we subsequently choose to modify them.

We are also reminded about the presence of Forbidden Arcs by the symbol in the lower right corner of the screen.

Unsupervised Learning
Now that the learning constraints are in place, we continue to learn the network by selecting Learning>Association Discovering>EQ.30

The resulting network may appear somewhat unwieldy at first glance, but upon closer inspection we can see that arcs exist only between Product variables (green) and Market variables (red), which is precisely what we intended by establishing Forbidden Arcs.


30   EQ is one of the unsupervised learning algorithms implemented in BayesiaLab. Koller and Friedman (2009) provide a
comprehensive introduction to learning algorithms.
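
The effect of forbidding all arcs within a Class can be made concrete with a short Python sketch. The node names are hypothetical placeholders (the actual network has 6 Product and 43 Market nodes), and the `forbidden` set is merely one way to represent such constraints; it does not reflect how BayesiaLab stores them internally.

```python
from itertools import permutations

# Hypothetical node names; the actual network has 6 Product and 43 Market nodes.
classes = {
    "Product": ["Brand", "Segment", "Price"],
    "Market": ["Age Bracket", "Income", "Attitude: Driving is fun"],
}

# Forbid every directed arc whose start and end belong to the same Class.
forbidden = {
    (start, end)
    for members in classes.values()
    for start, end in permutations(members, 2)
}

def is_allowed(start, end):
    """An arc may be learned only if it is not on the forbidden list."""
    return start != end and (start, end) not in forbidden

nodes = [n for members in classes.values() for n in members]
allowed = [(a, b) for a in nodes for b in nodes if is_allowed(a, b)]
print(len(forbidden), "forbidden arcs,", len(allowed), "candidate arcs remain")
```

Only arcs that cross from one Class to the other remain as candidates, which is why a learned structure can connect Product and Market variables but never two Product or two Market variables.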





However, we will not analyze this structure any further, but rather use it solely as a statistical device to be used in the Bayesia Market Simulator. We simply need to save the network in its native xbl file format, so the Bayesia Market Simulator can subsequently import it.

Simulation
With the Bayesia Market Simulator we have the ability to simulate “alternate worlds” for both the Product variables and the Market variables. In most applications, however, marketing analysts will want to primarily study new Product scenarios assuming the Market remains invariant, meaning that consumer demographics and attitudes remain the same.31

31   The year-to-year invariance assumption of the market has been challenged by many marketing executives during the most recent recession. In this context, many media headlines also proclaimed a paradigm shift in consumer behavior. The authors have believed - then as well as now - that more has remained the same than has changed in terms of consumer attitudes.

It will be the task of the analyst to define new product scenarios, which will need to include all products assumed to be in the marketplace for the to-be-projected timeframe, in our case 2010.32 As many products carry over from one year to the next, e.g. from model year 2010 to model year 2011, it is very helpful to use the currently available products as a baseline scenario, upon which changes can be built. Quite simply, we need to take inventory of the product landscape today. In the current version of Bayesia Market Simulator this step is not yet automated, so a practical procedure for generating the baseline scenario is described in the following section.

32   For expositional simplicity, we make no distinction between model year and calendar year.

Product Scenario Baseline
The idea is that all available product configurations were manifested in the market in 2009 and thus captured in the 2009 NVES.33

33   In our example, we judge this to be a reasonable simplification, even though a small number of automobiles at the very top end of the market, e.g. the Rolls-Royce Phantom, may not be captured in the survey.

It still requires careful consideration as to how many Product variables should be included to generate the baseline product scenario. We want to create a type of coordinate system that allows us to identify products through their principal characteristics. For instance, the following attributes would uniquely define a “Mercedes-Benz S550 4Matic”:

• Brand=“Mercedes-Benz”
• Engine Type=“V8”
• Drive Type=“AWD”
• Transmission=“Automatic”
• Segment=“High Premium”34
• Price=“>$85,795 AND <= $99,378”

34   Using the Strategic Vision segmentation nomenclature, “High Premium” defines a large four-door luxury sedan.

Relating consumer attributes and attitudes to these individual product attributes, rather than to the vehicle as a whole, will then allow us to construct hypothetical products during our simulation. To stay with the Mercedes example, we could define a new product by setting the engine type to “V6” and changing the price to “<$85,795”.

It is easy to imagine how one can get the number of permutations to exceed the number of consumers. For instance, in the High Premium segment, we could further differentiate between short wheelbase and long wheelbase versions, which would increase the number of baseline product scenarios. We want to find a reasonable balance between product granularity and the ratio of consumers to product scenarios, although we cannot provide the reader with a hard-and-fast rule.
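
The “coordinate system” idea can be sketched in Python by treating each product as a mapping from attribute to state. The S550 entries are taken from the example above; the helper `derive_scenario` and the per-attribute state counts are hypothetical and serve only to show how a new scenario is derived and how quickly the number of combinations grows.

```python
import math

# Attribute names and states follow the S550 4Matic example in the text.
s550_4matic = {
    "Brand": "Mercedes-Benz",
    "Engine Type": "V8",
    "Drive Type": "AWD",
    "Transmission": "Automatic",
    "Segment": "High Premium",
    "Price": ">$85,795 AND <= $99,378",
}

def derive_scenario(base, changes):
    """Copy a baseline product and override selected attributes."""
    unknown = set(changes) - set(base)
    if unknown:
        raise KeyError(f"not part of the coordinate system: {sorted(unknown)}")
    return {**base, **changes}

# The hypothetical V6 variant described in the text.
hypothetical = derive_scenario(
    s550_4matic, {"Engine Type": "V6", "Price": "<$85,795"}
)
print(hypothetical)

# Made-up numbers of states per attribute, only to show how fast
# the count of possible combinations grows.
states_per_attribute = {"Brand": 10, "Engine Type": 4, "Drive Type": 2,
                        "Transmission": 2, "Segment": 3, "Price": 4}
print(math.prod(states_per_attribute.values()), "possible combinations")
```

Adding one more attribute, such as wheelbase, multiplies the number of possible combinations again, which is the granularity trade-off discussed above.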

Pricing is obviously a very important part of the product scenario configuration and here we are confronted with the reality that no two customers pay exactly the same for the identical product, and the survey data makes this very evident. Furthermore, there are numerous product features outside our “coordinate system”, e.g. an optional $6,000 high-end audio system, that would materially affect the price point of an individual vehicle, but which would not move the vehicle into a different category from a consumer’s perspective. With options, an S550 can easily reach a price of over $100,000. Still we would want such a high-end S550 to be grouped with the standard S550. Thus it is important to define reasonable price brackets that cover the price spectrum of each vehicle and minimize model fragmentation.

During the Data Import stage, BayesiaLab has discretized all continuous numerical values, including Price, and created discrete states. If these discrete states are adequate considering the price positioning and price spectrum of the vehicles under study, we can now leverage this existing binning for generating all current product scenarios and select Data>Save Data.

In the subsequently appearing dialogue box, we need to select Use the States’ Long Name. It is important that Use Continuous Values is not checked, otherwise we will lose the discretized states of the Price variable.

This will export all variables and all records, including values from previously performed missing value imputations. The output will be in a semicolon-delimited text file, which can be easily imported into Excel or any statistical application, such as SPSS or STATISTICA. The purpose of loading this into an external application is to manipulate the database to extract the unique product combinations available in the market.

In Excel this can be done very quickly by deleting all columns unrelated to the product configuration, which leaves us with just the product attributes.

In Excel 2010 (for Windows) and Excel 2011 (for Mac), there is a very convenient feature that quickly removes all duplicates, which is exactly what we want to achieve. We want to know all the unique product configurations currently in the market.

This leaves us with a table of approximately 100 unique product scenario combinations available at the time of the survey.
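
For readers who prefer a script to the spreadsheet steps, the same extraction of unique product configurations can be sketched with Python’s standard csv module. The column names and records below are made up and stand in for the semicolon-delimited export; the result is written with semicolons as well.

```python
import csv
import io

# Made-up stand-in for the semicolon-delimited export; column names are assumptions.
exported = io.StringIO(
    "Brand;Engine Type;Drive Type;Age Bracket\n"
    "Mercedes-Benz;V8;AWD;45-54\n"
    "Mercedes-Benz;V8;AWD;55-64\n"
    "BMW;V8;RWD;35-44\n"
)
product_columns = ["Brand", "Engine Type", "Drive Type"]

# Keep only the product columns and drop duplicates, preserving first-seen order.
unique_rows = []
seen = set()
for record in csv.DictReader(exported, delimiter=";"):
    config = tuple(record[c] for c in product_columns)
    if config not in seen:
        seen.add(config)
        unique_rows.append(config)

# Write the result with semicolons again.
output = io.StringIO()
writer = csv.writer(output, delimiter=";", lineterminator="\n")
writer.writerow(product_columns)
writer.writerows(unique_rows)
print(output.getvalue(), end="")
```

In practice the two `io.StringIO` objects would be replaced by files opened with `newline=""`, as the csv module documentation recommends.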





To make these unique product scenarios available for subsequent use in the Bayesia Market Simulator, we need to save the table as a semicolon-delimited CSV file. This is important to point out, as most programs will save CSV files by default as comma-delimited files.

Product Scenario Simulation
Now that we have the Bayesian network describing the overall market (as an xbl file) as well as the baseline product scenarios (as a csv file), we can proceed to open the Bayesia Market Simulator.

Clicking File>Open will prompt us to open the xbl network file we previously generated with BayesiaLab.

Upon loading we will see the principal interface of the Bayesia Market Simulator. On the left panel, all nodes of the network appear as variables. We will now need to separate all variables into Market Variables and Scenario Variables by clicking the respective arrow buttons. In our case, the aptly named Market variables are the Market Variables in BMS nomenclature and Product variables are the Scenario Variables.

All variables must be allocated before being able to continue to Scenario Editing. This also implies that Product variables, which are not to be included as Scenario Variables, must be excluded from the Bayesian network file. If necessary, we will return to BayesiaLab to make such edits.

As we are working with RP data, every record in our database reflects one vehicle purchase, i.e. “reveals” one choice, and therefore we need to leave the Target Variable and Target State fields blank. These fields would only be used in conjunction with SP data, which includes a variable indicating acceptance versus rejection.

Clicking Scenario Editing opens up a new window. We can now manually add any product scenarios we wish to simulate. Given the potentially large number of scenarios, it will typically be better to load the baseline product scenarios, which were saved earlier.








We can do that by selecting Offer>Import Offers.

We now select to open the semicolon-delimited CSV file with the baseline product scenarios. It is very important that the CSV file is formatted precisely as specified, for instance, without any extra blank lines.

In case there are any import issues, it can be helpful to review the CSV file in a text editor and to visually inspect the formatting.

Upon successful import, all baseline product scenarios will appear in the Scenario Editing dialogue.

The analyst can now add any new product scenarios or delete those products that are no longer expected to be in the market.35 By clicking Add Offer an additional scenario will be added at the bottom of the product scenario list. In the case of long product scenario lists, this may require scrolling all the way down.

35   To maintain expositional simplicity, we have added all Panamera versions for the entire year 2010 and not changed any other product scenarios. It should be pointed out that the V6 version of the Porsche Panamera was introduced only in mid-2010. BMW has also launched an additional six-cylinder version of the 7-series as well as AWD variants, which are not reflected in the simulation. Finally, Jaguar has released a new XJ in 2010, while that year marked the runout of the old-generation Audi A8.

Clicking on the product attributes of any scenario prompts drop-down menus to appear with the available

attribute states, e.g. RWD or AWD.36 This also allows to       done by associating the original database, from which
change attributes of existing products, according to the       the network was learned, or by creating a new, arti cial
analysts requirements.                                         one that re ects the joint probability distribution of the
                                                               learned Bayesian network.

                                                               The latter can be achieved by selecting Database>Gener-
                                                               ate.




                                                               It is up to the analyst to determine the size of the data-
                                                               base to be generated. Although there is no xed rule, too
                                                               small of a database will limit the observability of prod-
                                                               ucts with a very small market share.




For our case study, we will add the following versions of
the Panamera as new product scenarios:

• Panamera (V6, RWD)
                                                               Alternatively, we can also associate the original database,
• Panamera 4 (V6, AWD)                                         which contains the survey responses. In our case, the
                                                               original database contains 1,203 records, which is very
• Panamera S (V8, RWD)                                         reasonable in terms of computational requirements.

• Panamera 4S (V8, AWD)                                        Once a database is associated, clicking the Simulation
                                                               button will start the market share estimation process.
• Panamera Turbo (V8 Turbo, RWD)

To characterize all of them as large 4-door luxury se-
dans, which is the key distinction versus previous Por-
sche products, we will assign the “High Premium” at-
tribute to them.




Once this is completed, we need to obtain a database
that represents the consumer base, on which these new
product scenarios will be “tried out”. This can either be


36   RWD and AWD stands for rear-wheel drive and all-wheel drive respectively
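The remark that a small database limits the observability of low-share products can be made concrete with a back-of-the-envelope calculation. The sketch below is an illustration under a simplifying assumption: each record is treated as an independent draw that picks a given product with probability equal to its market share. The function names and the 1% share are hypothetical; only the 1,203-record database size comes from the case study.

```python
def expected_buyers(n_records, share):
    """Expected number of records choosing a product with the given share."""
    return n_records * share


def prob_unobserved(n_records, share):
    """Probability that not a single record chooses the product.

    Assumption: records are independent draws, each choosing the product
    with probability `share`.
    """
    return (1.0 - share) ** n_records


if __name__ == "__main__":
    hypothetical_share = 0.01  # a product holding 1% of the market
    for n in (100, 1203, 10000):
        print(
            n,
            expected_buyers(n, hypothetical_share),
            prob_unobserved(n, hypothetical_share),
        )
```

Under these assumptions, a 100-record database would miss a 1% product entirely in roughly a third of all cases, while 1,203 records would be expected to contain about a dozen of its buyers.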





With the given complexity of our network and around 100 product
scenarios, the simulation should take no longer than 30 seconds on a
typical desktop computer.

Upon completion, the simulation results will appear in the form of a pie
chart and a table. One can go back and review the scenarios by clicking
the Scenario Editing button.

The aggregated simulated market shares can also be copied from the
results table and pasted into Excel or any other application for further
editing and presentation purposes. An example is provided below, showing
the simulated market shares of the brands under study in the High
Premium segment.

[Figure: pie chart “Simulated High Premium Market Shares ($75,000+)”;
legend: Audi, BMW, Jaguar, Lexus, Mercedes, Porsche; slice labels: 1%,
12%, 21%, 3%, 10%, 53%]

As can be seen from the results, the Porsche Panamera’s predicted market
share appears to be compatible with the reported running rate for
calendar year 2010, which was available at the time of writing.
Unfortunately, we do not know how this compares to Porsche’s
expectations, but the Panamera seems to be quite successful overall.

Substitution and Cannibalization
The fully simulated database can also be saved as a semicolon-delimited
CSV file, which will allow reviewing the choice probability for each
product scenario by individual consumer in a spreadsheet.

We can literally examine the new, simulated choices record-by-record and
see which customers have made the switch to the Panamera. Applying
conditional formatting to the spreadsheet can also be very helpful. The
above screenshot, for example, shows a selection of actual Mercedes
buyers who would either consider or pick the Porsche Panamera in this
simulation. High choice probabilities are shown in shades of red, while
near-zero probabilities are depicted in dark blue.
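The record-by-record review can also be done programmatically. The following Python sketch assumes a hypothetical layout for the saved file: one row per consumer, an `ID` column, and one column per product scenario holding that consumer's choice probability. The actual export layout of the Bayesia Market Simulator may differ, and the function names and sample values are made up for illustration. Survey weights are ignored here, so the aggregate is a plain average.

```python
import csv
import io


def load_choice_probabilities(text, id_column="ID", delimiter=";"):
    """Parse semicolon-delimited text into {consumer_id: {scenario: prob}}."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    table = {}
    for row in reader:
        consumer = row.pop(id_column)
        table[consumer] = {name: float(value) for name, value in row.items()}
    return table


def market_shares(table):
    """Average each scenario's choice probability over all consumers."""
    totals = {}
    for probabilities in table.values():
        for scenario, p in probabilities.items():
            totals[scenario] = totals.get(scenario, 0.0) + p
    return {scenario: total / len(table) for scenario, total in totals.items()}


def likely_switchers(table, scenario, threshold=0.5):
    """Consumers whose choice probability for `scenario` meets the threshold."""
    return [c for c, probs in table.items() if probs[scenario] >= threshold]


if __name__ == "__main__":
    # Made-up probabilities for three consumers, not taken from the survey.
    sample = (
        "ID;S-Class;7-series;Panamera\n"
        "1;0.7;0.2;0.1\n"
        "2;0.2;0.2;0.6\n"
        "3;0.6;0.3;0.1\n"
    )
    table = load_choice_probabilities(sample)
    print(market_shares(table))
    print(likely_switchers(table, "Panamera"))
```

Averaging the per-consumer probabilities reproduces the idea of an aggregated market share, while the threshold filter mirrors what conditional formatting highlights in a spreadsheet.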







It is equally interesting to examine which Porsche buyers would pick the
Panamera over their current vehicle choice.

Not surprisingly, our simulation suggests high probabilities of Panamera
choice for several current Cayenne owners. One is tempted to take this a
step further and calculate a rate of cannibalization. In this particular
survey, however, the sample size is too small to attempt doing so.
Otherwise, such a computation would be simple arithmetic.

Market Scenario Simulation
Although experimenting with product scenarios is expected to be the
primary use of the Bayesia Market Simulator, it is also possible to
change the market scenarios.

For example, this can be used to simulate the impact of policy changes.
One could hypothesize that legislation would prohibit or severely
penalize ownership of vehicles of a certain size or of a specific engine
type in urban areas.37

Upon editing the market segments, the simulation can be rerun to obtain
the new market share results.

Limitations
This approach can simulate product and market scenarios consisting of
variations of configurations that can be observed with sufficient sample
today. However, the impact of entirely new technologies cannot be
simulated on this basis. As a result, projecting the market share of the
all-electric Nissan Leaf38 would not be possible, whereas estimating the
share of a hypothetical three-row BMW crossover vehicle would be
feasible. In all cases, it requires the analyst’s expert knowledge and
judgment to determine the adequacy and equivalency of product attributes
observable today.

37   Given the draconian restrictions on motorists in Central London,
this example is presumably not very far-fetched.
38   The all-electric Leaf was launched by Nissan in the U.S. in
December of 2010.

Outlook
There exist several natural extensions to the presented methodology;
however, it would go beyond the scope of this paper to present them. A
brief summary shall suffice for now and we will go into greater detail
in forthcoming case studies in this series:

1.   Beyond learning from data, we can use expert knowledge to create
     or augment Bayesian networks. BayesiaLab offers a Knowledge
     Elicitation module, which formally captures expert knowledge and
     encodes it in a Bayesian network. In the absence of market data,
     this is an excellent approach to have decision makers reason
     collectively (and in a formally correct way) about future states
     of the world.

2.   We can extend the concept of product attributes to consumers’
     product satisfaction ratings. This will allow estimating the
     market share impact as a function of changes in consumer ratings.
     For instance, an automaker could reason about the volume impact
     from a vehicle facelift, which is expected to raise the consumer
     rating of “styling”.

3.   The product cannibalization or substitution rate can be estimated
     based on the simulated choice behavior, given that there is
     sufficient sample size. So, for most mainstream products, this
     seems to be realistic.

4.   With the ability to study consumer choice at the model level, we
     can also aggregate these results to the segment level.
     Alternatively, using a less granular approach, we can model the
     entire market at the segment and brand level, which would allow
     studying market changes at a larger scale.

5.   Beyond simulating “hard” policy changes affecting the market, e.g.
     excluding a product class from a certain geography, we can also
     use BayesiaLab to simulate new populations with small changes in
     average consumer attitudes versus the originally surveyed
     population. For instance, such an artificially modified population
     could be more environmentally conscious and one could apply
     opinions prevalent on the West Coast to the whole country. Bayesia
     Market Simulator can then generate new market shares based on
     these new hypothetical market conditions.
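The cannibalization rate mentioned in the outlook and in the discussion of Cayenne owners is indeed simple arithmetic once the sample is large enough. The Python sketch below uses one possible definition, stated here as an assumption: the share of the new product's expected volume that comes from consumers who currently own a vehicle of the same brand. The function name and the sample records are hypothetical and not taken from the survey.

```python
def cannibalization_rate(records, own_brand):
    """Fraction of the new product's expected volume drawn from `own_brand`.

    `records` is an iterable of (current_brand, probability_of_new_product)
    pairs, one per consumer. Summing the probabilities gives the expected
    number of buyers of the new product.
    """
    total = 0.0
    from_own_brand = 0.0
    for current_brand, p_new in records:
        total += p_new
        if current_brand == own_brand:
            from_own_brand += p_new
    if total == 0.0:
        raise ValueError("the new product has no expected volume")
    return from_own_brand / total


if __name__ == "__main__":
    # Made-up illustration data: current brand and simulated probability
    # of choosing the new product.
    records = [
        ("Porsche", 0.50),
        ("Mercedes", 0.25),
        ("BMW", 0.25),
        ("Lexus", 0.00),
    ]
    print(cannibalization_rate(records, "Porsche"))  # 0.5
```

With only a handful of own-brand records, as in this survey, the resulting ratio would be too noisy to report, which is why the computation is not attempted in the case study.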

Summary
BayesiaLab and Bayesia Market Simulator are unique in their ability to
use Bayesian networks for choice modeling and market share simulation.
The presented workflow provides a comprehensive method for simulating
market shares of future products based on their key characteristics,
without requiring new and costly experiments.

As a result, BayesiaLab and Bayesia Market Simulator allow using a vast
range of existing research for market share predictions. Given the
significant resources many corporations have allocated over many years
to conducting consumer surveys, these BayesiaLab tools offer an entirely
new way to turn the accumulated research data into practical market
oracles.







Appendix

Utility-Based Choice Theory

In today’s choice modeling practice, utility-based choice theory plays a
dominant role.

1.     The first concept of utility-based choice theory is that each
       individual chooses the alternative that yields him or her the
       highest utility.

2.     The second idea refers to being able to collapse a vector
       describing attributes of choice alternatives into a single
       scalar utility value for the chooser. For instance, a vector of
       attributes for one choice alternative, e.g. [Price, Fuel
       Economy, Safety Rating], would translate into one scalar value,
       e.g. [5], specific to each chooser.

The following example is meant to illustrate both:

For Consumer A:

• Utility of Product 1:
  [Price=$25,000, Fuel Economy=25MPG, Safety Rating=4 stars] = 7 ✓

• Utility of Product 2:
  [Price=$29,000, Fuel Economy=23MPG, Safety Rating=5 stars] = 5.5

For Consumer B:

• Utility of Product 1:
  [Price=$25,000, Fuel Economy=25MPG, Safety Rating=4 stars] = 4

• Utility of Product 2:
  [Price=$29,000, Fuel Economy=23MPG, Safety Rating=5 stars] = 7.5 ✓

This concept implies that consumers make tradeoffs, either explicitly or
implicitly, and that there exists an amount x of “Fuel Economy” that is
equivalent in utility to an amount y of “Safety”. The reader may
reasonably object that not even a fuel economy of 100MPG would make it
acceptable to drive a vehicle that is rated very poorly on safety.

Also, we do not know a priori what the utility values are nor can we
measure them. Neither do we know in advance how individual product and
consumer attributes relate to these unobservable utilities. However,
there are methods that allow us to estimate these unknown variables and,
based on this knowledge, they allow us to predict choice in the future.
One such method is briefly highlighted in the following.

Multinomial Logit Models
In the domain of choice modeling, MultiNomial Logit models (MNL) have
become the workhorse of the industry, but here we only want to provide a
cursory overview, so the reader can compare the approach presented in
the case study with current practice.

MNL models provide a functional form for describing the relationship
between the utilities of alternatives and the probability of choice.

For instance, using an MNL model for a choice situation with three
vehicle alternatives, Altima, Accord and Camry, the probability of
choosing the Altima can be expressed as:

$$
\Pr(\text{Altima}) = \frac{\exp(V_{\text{Altima}})}{\exp(V_{\text{Altima}}) + \exp(V_{\text{Accord}}) + \exp(V_{\text{Camry}})}
$$

$V_{\text{Altima}}$ in this case stands for the utility of the Altima
alternative. The utilities $V_{\text{Altima}}$, $V_{\text{Accord}}$, and
$V_{\text{Camry}}$ are a function of the product attributes, e.g.

$$
V_{\text{Altima}} = \beta_1 \times \text{Cost}_{\text{Altima}} + \beta_2 \times \text{FuelEconomy}_{\text{Altima}} + \beta_3 \times \text{SafetyRating}_{\text{Altima}}
$$

As we can observe tangible attributes like vehicle cost, fuel economy
and safety rating, and we can also observe who bought which vehicle, we
can estimate the unknown parameters. Once we have the parameters, we can
simulate choices based on new, hypothetical product attributes, such as
a better fuel economy for the Altima or a lower price for the Camry.

The parameters of MNL models can be estimated both from “stated
preference” (SP) data, i.e. asking consumers about what they would
choose, and “revealed preference” (RP) data, i.e. observing what they
have actually chosen. There are numerous variations and extensions to
the class of MNL models and the reader is referred to Train (2003) and
Koppelman (2006) for a comprehensive introduction.
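The MNL choice probability and the linear utility function described in this appendix can be illustrated with a few lines of Python. This is a sketch only: the coefficient values and vehicle attributes below are hypothetical and chosen purely for illustration, not estimated from any data, and the function names are not taken from any particular software package.

```python
import math


def utility(attributes, betas):
    """Linear utility: the sum of beta_k * attribute_k over all attributes."""
    return sum(betas[name] * value for name, value in attributes.items())


def mnl_probabilities(utilities):
    """Multinomial logit choice probabilities from a {name: utility} dict."""
    # Subtracting the maximum utility avoids overflow in exp() and does not
    # change the resulting probabilities.
    highest = max(utilities.values())
    weights = {name: math.exp(v - highest) for name, v in utilities.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


if __name__ == "__main__":
    # Hypothetical coefficients; cost in $1,000, fuel economy in MPG,
    # safety rating in stars.
    betas = {"cost": -0.1, "fuel_economy": 0.05, "safety_rating": 0.3}
    vehicles = {
        "Altima": {"cost": 25.0, "fuel_economy": 25.0, "safety_rating": 4.0},
        "Accord": {"cost": 26.0, "fuel_economy": 24.0, "safety_rating": 5.0},
        "Camry": {"cost": 27.0, "fuel_economy": 23.0, "safety_rating": 5.0},
    }
    base = mnl_probabilities(
        {name: utility(attrs, betas) for name, attrs in vehicles.items()}
    )
    print("baseline:", base)

    # Simulate a hypothetical product change: better fuel economy for the Altima.
    vehicles["Altima"]["fuel_economy"] = 30.0
    changed = mnl_probabilities(
        {name: utility(attrs, betas) for name, attrs in vehicles.items()}
    )
    print("improved Altima:", changed)
```

Because the probabilities always sum to one, any gain for the Altima in the second scenario is drawn from the Accord and the Camry, which is the substitution logic that market share simulation relies on.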






Stated Preference Data
Stated preference data typically comes from experiments, i.e. consumer
surveys or product clinics. In this context, conjoint experiments have
become a very popular choice elicitation method and a wide range of
tools have been developed for this particular approach. In conjoint
studies, consumers would typically be given a set of artificially
generated product choices along with their attributes, from which
preference responses are then elicited. There are many variations of
this method that all attempt to address some of the inherent challenges
related to dealing with responses to hypothetical questions.

The Sawtooth software package has become the de facto industry standard
for such conjoint studies.39

Revealed Preference Data
In contrast to SP data, revealed preference data is purely derived from
passive observations. As the name implies, the consumer choice is
revealed by their actual behavior rather than by their stated intent in
a hypothetical situation. A key benefit is that it is typically easier
and more economical to obtain passive observations than to conduct
formal experiments. A conceptual limitation of RP data relates to the
fact that not-yet-existing products can obviously not be chosen by
consumers in the present market environment. Thus simulating market
shares of hypothetical products requires “assembling” them from
components and attributes of products that are already available in the
market. This inherently limits the exploration of entirely new
technologies, which have little in common with the technologies they may
replace.

Studies based on RP data have become very popular for researching travel
mode choice, as is also documented in a large body of research. In
market research related to CPG products or durable goods, using RP data
is somewhat less common.

We speculate that one of the reasons for the lack of popularity outside
the world of academia is the absence of easy-to-use software packages.
Only recently, with the release of Easy Logit Modeling (ELM)40,
specifying and estimating multinomial logit models has become practical
for a much broader audience. Although ELM has successfully removed the
burden of manual coding, countless iterations of specification and
estimation remain a very time-consuming task for the analyst.

39   A wide range of tools is available from Sawtooth Software, Inc.,
www.sawtoothsoftware.com.
40   Easy Logit Modeling is available from ELM-Works, Inc.,
www.elm-works.com. ELM can estimate models based on both RP and SP data,
although we only mention it in the RP context.

NVES Variables
The following variables from the 2009 Strategic Vision NVES were
included in this case study:

• UNIQUE IDENTIFIER

• Combined Base Weight

• New Model Purchased - Make/Model/Series (Alpha Order)

• New Model Purchased - Brand

• New Model Purchased - Region Origin

• New Model Segment

• Segmentation 2

• Type Of Transmission

• Number Of Cylinders (VIN)

• Drive Type (VIN)

• Fuel Type

• Gender

• Marital Status

• Age Bracket

• Children Under 6

• Children 6 To 12

• Children 13 To 17

• Total Family Pre-Tax Income

• Ethnic Group

• Location Of Residence





• Customer Region Classification #1

• I Seek Variety in My Life

• I'm Curious and Open to Experiences

• Luxury is Not Important Unless it Has Purpose

• I Enjoy Expressing Myself Creatively

• I See Life as Full of Endless Possibilities

• Driving is one of my favorite things to do

• I really don't enjoy driving

• Whenever I get a chance, I love to go for a drive

• When I drive for fun, I mainly prefer to relax and listen to music or
  talk

• I want vehicles that provide that open-air driving experience

• I prefer a vehicle that has the capability to outperform others

• I prefer vehicles that provide superior straight ahead power

• I prefer vehicles that provide superior handling and cornering agility

• I prefer a balance of comfort and performance

• I prefer vehicles that provide the softest, most comfortable ride
  quality

• I just want the basics on my vehicle - no extras

• Value equals balance of costs, comfort & performance

• I prefer vehicles that project a tough and workmanlike image

• Vehicles are a 'tool' or a part of the 'gear' in an active outdoors
  lifestyle

• I Want to be able to tow heavy loads

• I want to be able to traverse any terrain

• I want the most versatility in my interior

• I want a basic, no frills vehicle that does the job

• My choice of vehicle reflects my personality

• I want a vehicle that says a lot about my success in life / career

• I will switch brand for features or price

• There are lots of different brands of vehicles that I would consider
  buying

• I prefer sofa-like comfort over a cockpit-like interior

• I want a vehicle that provides the quietest interior

• I want to look good when driving my vehicle

• I want my vehicle to stand out in a crowd

• I would pay significantly more for environmentally friendly vehicle

• Price is most important to me when buying a new vehicle

• Purchase Price (100's)





References


Barber, David. “Bayesian Reasoning and Machine Learning.”
    http://www.cs.ucl.ac.uk/staff/d.barber/brml.
———. Bayesian Reasoning and Machine Learning. Cambridge University
    Press, 2011.
Darwiche, Adnan. “Bayesian networks.” Communications of the ACM 53,
    no. 12 (12, 2010): 80.
Koller, Daphne, and Nir Friedman. Probabilistic Graphical Models:
    Principles and Techniques. The MIT Press, 2009.
Koppelman, Frank, and Chandra Bhat. “A Self Instructing Course in Mode
    Choice Modeling: Multinomial and Nested Logit Models.” January 31,
    2006.
Krzywinski, M., J. Schein, I. Birol, J. Connors, R. Gascoyne, D.
    Horsman, S. J. Jones, and M. A. Marra. “Circos: An information
    aesthetic for comparative genomics.” Genome Research 19, no. 9 (6,
    2009): 1639-1645.
Neapolitan, Richard E., and Xia Jiang. Probabilistic Methods for
    Financial and Marketing Informatics. 1st ed. Morgan Kaufmann, 2007.
Pearl, Judea. Causality: Models, Reasoning and Inference. 2nd ed.
    Cambridge University Press, 2009.
Spirtes, Peter, Clark Glymour, and Richard Scheines. Causation,
    Prediction, and Search, Second Edition. 2nd ed. The MIT Press, 2001.
Train, Kenneth. Qualitative Choice Analysis: Theory, Econometrics, and
    an Application to Automobile Demand. 1st ed. The MIT Press, 1985.
Train, Kenneth E. Discrete Choice Methods with Simulation. Cambridge
    University Press, 2003.







Contact Information

Conrady Applied Science, LLC
312 Hamlet’s End Way
Franklin, TN 37067
USA
+1 888-386-8383
info@conradyscience.com
www.conradyscience.com

Bayesia SAS
6, rue Léonard de Vinci
BP 119
53001 Laval Cedex
France
+33(0)2 43 49 75 69
info@bayesia.com
www.bayesia.com

Copyright
© 2010 Conrady Applied Science, LLC and Bayesia SAS. All rights
reserved.

Any redistribution or reproduction of part or all of the contents in any
form is prohibited other than the following:

• You may print or download this document for your personal and
  noncommercial use only.

• You may copy the content to individual third parties for their
  personal use, but only if you acknowledge Conrady Applied Science, LLC
  and Bayesia SAS as the source of the material.

• You may not, except with our express written permission, distribute or
  commercially exploit the content. Nor may you transmit it or store it
  in any other website or other form of electronic retrieval system.




ing the Unique Identi er column and clicking the Row 25 CSV stands for “comma-separated values”, a common format for text-based data les. As an alternative to this im- port format, BayesiaLab offers a JDBC connection, which is practical when accessing large databases on servers. Conrady Applied Science, LLC - www.conradyscience.com 8
  • 12. Simulating Market Share with the Bayesia Market Simulator Identi er check box, which changes the color of the of discrete distributions, means-imputation typically also Unique Identi er column to beige. introduces a bias. There are other, better techniques, which typically demand signi cant computational effort Although it is not imperative to maintain a Row Identi- and thus often turn out like a labor-intensive standalone er, and we could instead assign the Not Distributed project rather than being just a preparatory step. status to the Unique Identi er variable, it can be quite helpful for nding individual respondent records at a Without going into too much detail at this point, later point in the analysis. BayesiaLab can estimate all missing values given the learned network structure using the Expectation Maxi- As the respondent records in the NVES survey are mization (EM) algorithm. As a result, we obtain a com- weighted, we need to select the Weight by clicking on the plete database without “making things up.” In tradi- Combined Base Weight variable, which will turn the tional statistics, the equivalent would be to say that nei- column green. ther the mean nor the variance of the variables is af- fected by the imputation process. Continuing in our data import process, the next screen provides options as to how to treat the missing values. Clicking the small upside-down triangle next to the vari- able names brings up a window with key statistics of the selected variable, in this case Age Bracket. Missing Values In the context of data import, it is important to point out how missing values are treated in BayesiaLab. The na- tive, automatic processing of missing values reveals a particular strength of BayesiaLab. In traditional statistical analysis, the analyst has to choose from a number of methods to handle missing The very basic functions of ltering, i.e. 
case-wise dele- values in a database, but unfortunately many of them tion, and mean/modal value imputation are available. have serious drawbacks. Perhaps the most common However, at this point, we can take advantage of method is case-wise deletion, which simply excludes re- BayesiaLab’s advanced missing values processing algo- cords that contain any missing values. Casually speaking, rithms. We will select Dynamic Completion, which will this means throwing away lots of good data (the non- continuously “ ll in” and “update” the missing values missing values) along with the bad (the missing values). according to the conditional distribution of the variable, Another method is means-imputation, by which any as de ned by the current structure of the networks. missing value is lled in with the variable’s mean. Inevi- However, as our network is not yet connected and hence tably, this reduces the variance of the variable and thus does not have a structure, BayesiaLab will draw from the has an impact on its summary statistics, which is clearly undesirable considering the intended analysis. In the case Conrady Applied Science, LLC - www.conradyscience.com 9
  • 13. Simulating Market Share with the Bayesia Market Simulator marginal distribution of each variable to “tentatively” establish placeholder values for each missing value. A screenshot from STATISTICA, where we have done most of the preprocessing, shows the marginal distribu- tion of the Age Bracket variable in the form of a histogram.26 By clicking on the Type drop-down menu, the choice of discretization algorithms appears. Selecting Manual will show a cumulative graph of the The missing Age Bracket values will be drawn from this Purchase Price distribution, and we can see that it ranges marginal distribution and are used as placeholders, until from $75,000 to $180,000.28 we can use the structure of the Bayesian network to rees- timate our missing values. As Dynamic Completion im- plies, BayesiaLab performs this on continuous basis in the background, so at any point we would have the best possible estimates for the missing values, given the cur- rent network structure. Discretization The next step is the Discretization and Aggregation dia- logue, which allows the analyst to determine the type of discretization, which must be performed on all continu- ous variables.27 We will use the Purchase Price variable to explain the process. Highlighting a variable will show the default discretization algorithm while the graph panel is initially blank. We could now manually select binning thresholds by way of point-and-click directly on the graph panel. This 26 The normal curve in the histogram is just for illustration purposes. BayesiaLab always uses the actual discrete distri- bution, not a parametric approximation. 27 BayesiaLab requires discrete distributions for all variables. 28 $75,000 was previously selected as the lower boundary for this particular vehicle segment. $180,000 was the highest reported price in NVES. Conrady Applied Science, LLC - www.conradyscience.com 10
  • 14. Simulating Market Share with the Bayesia Market Simulator might be relevant, if there were government regulations in place with speci c vehicle price thresholds.29 For our purposes, however, we want to create price cate- gories that are meaningful in the context of our vehicle segment and ve bins may seem like a reasonable start- ing point. The resulting bins appear much more suitable to describe Clicking Generate Discretization will prompt us to select our domain. the type of discretization and the number of desired in- tervals. Without having a-priori knowledge about the distribution of the Price variable, we may want to start with the Equal Distances algorithm. The resulting view shows the generated intervals and by clicking on the interval boundaries we can see the per- We will proceed similarly with the only other continuous centage of cases falling into the adjacent intervals. variable in the database, i.e. Age Bracket. Note For choosing discretization algorithms beyond this example, the following rule of thumb may be helpful: • For supervised learning, choose Decision Tree. • For unsupervised learning, choose, in the order of priority, K-Means, Equal Distances or Equal Frequencies. Clicking Finish completes the import process and 49 variables (columns) from our database are now shown as blue nodes in the Graph Panel, which is the main win- We learn from this that our bottom two intervals contain dow for network editing. 89% of the cases, whereas the top two intervals contain just under 5% of the cases. This suggests that we may not have enough granularity to characterize the bulk of the market towards the bottom end of the price spec- trum. Perhaps we also have too few cases within the top two intervals. So we will generate a new discretization, now with four intervals, and select KMeans as the type this time. 29 The now-expired luxury tax for passenger cars in the U.S. would be an example for such a policy. 
Conrady Applied Science, LLC - www.conradyscience.com 11
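The contrast between an Equal Distances and a K-Means discretization of a skewed price variable can be mimicked outside BayesiaLab. The following is only a minimal sketch: the price sample is invented, the function names are hypothetical, and the plain one-dimensional k-means shown here is a generic textbook version, not necessarily what BayesiaLab implements.

```python
from bisect import bisect_right


def equal_width_edges(values, k):
    """Interior thresholds of k equal-width intervals between min and max."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / k
    return [lo + step * i for i in range(1, k)]


def kmeans_1d_edges(values, k, iterations=50):
    """Interior thresholds from a plain one-dimensional k-means (Lloyd's algorithm)."""
    data = sorted(values)
    n = len(data)
    # Start the k centers at evenly spaced quantiles of the sorted data.
    centers = [data[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            clusters[nearest].append(v)
        # Move each center to its cluster mean; keep it in place if the cluster is empty.
        centers = [sum(c) / len(c) if c else centers[j] for j, c in enumerate(clusters)]
    centers.sort()
    # Place each threshold halfway between two neighboring centers.
    return [(a + b) / 2 for a, b in zip(centers, centers[1:])]


def bin_shares(values, edges):
    """Fraction of cases falling into each interval defined by the thresholds."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_right(edges, v)] += 1
    return [c / len(values) for c in counts]


# Invented, right-skewed sample: most prices sit just above the $75,000 floor.
prices = (
    [75_000 + 500 * i for i in range(40)]
    + [96_000 + 2_000 * i for i in range(12)]
    + [130_000, 145_000, 160_000, 180_000]
)

equal_edges = equal_width_edges(prices, 5)
kmeans_edges = kmeans_1d_edges(prices, 4)
print("Equal Distances:", equal_edges, bin_shares(prices, equal_edges))
print("K-Means:        ", kmeans_edges, bin_shares(prices, kmeans_edges))
```

With a sample of this shape, equal-width intervals put most cases into the lowest interval, whereas the k-means thresholds follow the concentration of the data. That is the same qualitative pattern that motivated the switch of algorithm in the text.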
The six nodes on the far left column reflect product attributes (green), the second-from-left column shows ten demographic attributes (yellow) and all remaining nodes to the right represent 33 vehicle-related attitudes (red). This initial view represents a fully unconnected Bayesian network.

Also, to simplify our nomenclature, we will combine the demographic attributes (yellow) and the vehicle-related attitudes (red) and refer to them together as “Market” variables (now all red).

Variable Classes and Forbidden Arcs

One is now tempted to immediately start with Unsupervised Learning to see how all these variables relate to each other.

However, there are two reasons why we need to introduce another step at this point:

1. Our mission is to model the interactions between Product variables on the one side and Market variables on the other, so we can see the consumer response to products. For instance, we are more interested in learning P(Transmission=“Manual” | Attitude=“Driving is one of my favorite things”) than we are in P(Age < 45 | Number of children under 6 = 2). Hence we focus the learning algorithm on the area of interest, i.e. product attributes vis-à-vis market attributes.

2. We must not learn the dependencies between the product variables themselves because they would simply reflect today’s product offerings and their contingencies, e.g. P(Vehicle Segment=“4-door sedan” | Brand=“Porsche”)=0. We do want to understand what is available today, but we certainly do not want to encode today’s product scenarios as constraints in the network. Instead, we want to be able to introduce new scenarios, which are not available today.

To focus learning in a specific area, we need to take an indirect approach and tell BayesiaLab “what not to learn.” So, to prevent the algorithm from learning the product-to-product variable relationships, we will “forbid” such arcs.

We first create a Class by highlighting all product nodes and then right-clicking them. From the menu, we then select Properties>Classes>Add.

When prompted for a name, we can choose something descriptive, so we give this new Class the label “Product”.
Having introduced this Class of nodes, we can now very easily manage Forbidden Arcs. More specifically, we want to make all arcs within the Class Product forbidden. A right-click anywhere on the Graph Panel opens up the menu from which we can select Edit Forbidden Arcs.

In the Forbidden Arc Editor, we can select the Class Product both as start and end.

We now repeat the above steps and also create Forbidden Arcs for the Market variables.

As a result, these Forbidden Arc relationships will appear in the Forbidden Arc Editor and will remain there unless we subsequently choose to modify them.

We are also reminded about the presence of Forbidden Arcs by the symbol in the lower right corner of the screen.

Unsupervised Learning

Now that the learning constraints are in place, we continue to learn the network by selecting Learning>Association Discovering>EQ.30

30 EQ is one of the unsupervised learning algorithms implemented in BayesiaLab. Koller and Friedman (2009) provide a comprehensive introduction to learning algorithms.

The resulting network may appear somewhat unwieldy at first glance, but upon closer inspection we can see that arcs exist only between Product variables (green) and Market variables (red), which is precisely what we intended by establishing Forbidden Arcs.
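The effect of forbidding all arcs within a Class can be written down as a simple set of excluded node pairs. The sketch below only illustrates that constraint and does not show how BayesiaLab stores Forbidden Arcs internally; the function names are hypothetical, and the five nodes are a small illustrative subset of the Product and Market variables.

```python
from itertools import permutations


def forbidden_arcs(node_classes):
    """Every directed arc whose start and end node belong to the same class."""
    members_by_class = {}
    for node, cls in node_classes.items():
        members_by_class.setdefault(cls, []).append(node)
    forbidden = set()
    for members in members_by_class.values():
        forbidden.update(permutations(members, 2))
    return forbidden


def violations(arcs, forbidden):
    """Arcs from a candidate structure that break the constraint."""
    return [arc for arc in arcs if arc in forbidden]


# Illustrative subset of the nodes; the class labels follow the text.
node_classes = {
    "Brand": "Product",
    "Drive Type": "Product",
    "Price": "Product",
    "Age Bracket": "Market",
    "Income": "Market",
}

forbidden = forbidden_arcs(node_classes)
candidate = [("Income", "Price"), ("Brand", "Price"), ("Drive Type", "Age Bracket")]
print(len(forbidden), violations(candidate, forbidden))
```

Any structure that passes this check contains only Product-to-Market or Market-to-Product arcs, which is the property observed in the learned network.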
However, we will not analyze this structure any further, but rather use it solely as a statistical device to be used in the Bayesia Market Simulator. We simply need to save the network in its native xbl file format, so the Bayesia Market Simulator can subsequently import it.

Simulation

With the Bayesia Market Simulator we have the ability to simulate “alternate worlds” for both the Product variables as well as for the Market variables. In most applications, however, marketing analysts will want to primarily study new Product scenarios assuming the Market remains invariant, meaning that consumer demographics and attitudes remain the same.31

31 The year-to-year invariance assumption of the market has been challenged by many marketing executives during the most recent recession. In this context, many media headlines also proclaimed a paradigm shift in consumer behavior. The authors have believed - then as well as now - that more has remained the same than has changed in terms of consumer attitudes.

It will be the task of the analyst to define new product scenarios, which will need to include all products assumed to be in the marketplace for the to-be-projected timeframe, in our case 2010.32 As many products carry over from one year to the next, e.g. from model year 2010 to model year 2011, it is very helpful to use the currently available products as a baseline scenario, upon which changes can be built. Quite simply, we need to take inventory of the product landscape today. In the current version of Bayesia Market Simulator this step is not yet automated, so a practical procedure for generating the baseline scenario is described in the following section.

32 For expositional simplicity, we make no distinction between model year and calendar year.

Product Scenario Baseline

The idea is that all available product configurations were manifested in the market in 2009 and thus captured in the 2009 NVES.33

33 In our example, we judge this to be a reasonable simplification, even though a small number of automobiles at the very top end of the market, e.g. the Rolls-Royce Phantom, may not be captured in the survey.

It still requires careful consideration as to how many Product variables should be included to generate the baseline product scenario. We want to create a type of coordinate system that allows us to identify products through their principal characteristics. For instance, the following attributes would uniquely define a “Mercedes-Benz S550 4Matic”:

• Brand=“Mercedes-Benz”
• Engine Type=“V8”
• Drive Type=“AWD”
• Transmission=“Automatic”
• Segment=“High Premium”34
• Price=“>$85,795 AND <= $99,378”

34 Using the Strategic Vision segmentation nomenclature, “High Premium” defines a large four-door luxury sedan.

Relating consumer attributes and attitudes to these individual product attributes, rather than to the vehicle as a whole, will then allow us to construct hypothetical products during our simulation. To stay with the Mercedes example, we could define a new product by setting the engine type to “V6” and changing the price to “< $85,795”.

It is easy to imagine how one can get the number of permutations to exceed the number of consumers. For instance, in the High Premium segment, we could further differentiate between short wheelbase and long wheelbase versions, which would increase the number of baseline product scenarios. We want to find a reasonable balance between product granularity and the ratio of consumers to product scenarios, although we cannot provide the reader with a hard-and-fast rule.

Pricing is obviously a very important part of the product scenario configuration and here we are confronted with the reality that no two customers pay exactly the same for the identical product, and the survey data makes this very evident. Furthermore, there are numerous product features outside our “coordinate system”, e.g. an optional $6,000 high-end audio system, that would materially affect the price point of an individual vehicle, but which would not move the vehicle into a different category from a consumer’s perspective. With options, an S550 can easily reach a price of over $100,000. Still we would want such a high-end S550 to be grouped with the standard S550. Thus it is important to define reasonable price brackets that cover the price spectrum of each vehicle and minimize model fragmentation.

During the Data Import stage, BayesiaLab has discretized all continuous numerical values, including Price, and created discrete states. If these discrete states are adequate considering the price positioning and price spectrum of the vehicles under study, we can now leverage this existing binning for generating all current product scenarios and select Data>Save Data.

In the subsequently appearing dialogue box, we need to select Use the States’ Long Name. It is important that Use Continuous Values is not checked, otherwise we will lose the discretized states of the Price variable.

This will export all variables and all records, including values from previously performed missing value imputations. The output will be in a semicolon-delimited text file, which can be easily imported into Excel or any statistical application, such as SPSS or STATISTICA. The purpose of loading this into an external application is to manipulate the database to extract the unique product combinations available in the market.

In Excel this can be done very quickly by deleting all columns unrelated to the product configuration, which leaves us with just the product attributes.

In Excel 2010 (for Windows) and Excel 2011 (for Mac), there is a very convenient feature that allows us to quickly remove all duplicates, which is exactly what we want to achieve. We want to know all the unique product configurations currently in the market.

This leaves us with a table of approximately 100 unique product scenario combinations available at the time of the survey.
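For readers who prefer a script over a spreadsheet, the same two steps (keep only the product columns, then remove duplicate rows) can be sketched with Python's standard csv module. The column names, the sample rows and the helper names are assumptions made for illustration; the actual layout of the BayesiaLab export may differ.

```python
import csv
import io


def unique_product_rows(export_text, product_columns):
    """Distinct combinations of the product columns, in order of first appearance."""
    reader = csv.DictReader(io.StringIO(export_text), delimiter=";")
    seen = set()
    rows = []
    for record in reader:
        key = tuple(record[column] for column in product_columns)
        if key not in seen:
            seen.add(key)
            rows.append(key)
    return rows


def to_semicolon_csv(header, rows):
    """Semicolon-delimited text with one line per row and no blank lines."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=";", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Invented stand-in for the exported database (layout is an assumption).
export_text = """Brand;Engine Type;Drive Type;Price;Age Bracket
Mercedes-Benz;V8;AWD;>85795 AND <=99378;45-54
Mercedes-Benz;V8;AWD;>85795 AND <=99378;55-64
BMW;V8;RWD;<=85795;35-44
Mercedes-Benz;V8;AWD;>85795 AND <=99378;65+
"""

product_columns = ["Brand", "Engine Type", "Drive Type", "Price"]
scenarios = unique_product_rows(export_text, product_columns)
print(to_semicolon_csv(product_columns, scenarios))
```

Writing the result with an explicit semicolon delimiter avoids the comma-delimited default that most programs apply when saving CSV files.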
To make these unique product scenarios available for subsequent use in the Bayesia Market Simulator, we need to save the table as a semicolon-delimited CSV file. This is important to point out, as most programs will save CSV files by default as comma-delimited files.

Product Scenario Simulation

Now that we have the Bayesian network describing the overall market (as an xbl file) as well as the baseline product scenarios (as a csv file), we can proceed to open the Bayesia Market Simulator.

Clicking File>Open will prompt us to open the xbl network file we previously generated with BayesiaLab.

As we are working with RP data, every record in our database reflects one vehicle purchase, i.e. “reveals” one choice, and therefore we need to leave the Target Variable and Target State fields blank. These fields would only be used in conjunction with SP data, which includes a variable indicating acceptance versus rejection.

Upon loading we will see the principal interface of the Bayesia Market Simulator. On the left panel, all nodes of the network appear as variables. We will now need to separate all variables into Market Variables and Scenario Variables by clicking the respective arrow buttons. In our case, the aptly named Market variables are the Market Variables in BMS nomenclature and Product variables are the Scenario Variables.

All variables must be allocated before being able to continue to Scenario Editing. This also implies that Product variables, which are not to be included as Scenario Variables, must be excluded from the Bayesian network file. If necessary, we will return to BayesiaLab to make such edits.

Clicking Scenario Editing opens up a new window. We can now manually add any product scenarios we wish to simulate. Given the potentially large number of scenarios, it will typically be better to load the baseline product scenarios, which were saved earlier.
We can do that by selecting Offer>Import Offers.

We now select to open the semicolon-delimited CSV file with the baseline product scenarios. It is very important that the CSV file is formatted precisely as specified, for instance, without any extra blank lines. In case there are any import issues, it can be helpful to review the CSV file in a text editor and to visually inspect the formatting.

Upon successful import, all baseline product scenarios will appear in the Scenario Editing dialogue.

The analyst can now add any new product scenarios or delete those products which are no longer expected to be in the market.35 By clicking Add Offer an additional scenario will be added at the bottom of the product scenario list. In the case of long product scenario lists, this may require scrolling all the way down.

35 To maintain expositional simplicity, we have added all Panamera versions for the entire year 2010 and not changed any other product scenarios. It should be pointed out that the V6 version of the Porsche Panamera was introduced only in mid-2010. BMW has also launched an additional six-cylinder version of the 7-series as well as AWD variants, which are not reflected in the simulation. Finally, Jaguar has released a new XJ in 2010, while that year marked the runout of the old-generation Audi A8.

Clicking on the product attributes of any scenario prompts drop-down menus to appear with the available attribute states, e.g. RWD or AWD.36 This also allows changing attributes of existing products according to the analyst’s requirements.

36 RWD and AWD stand for rear-wheel drive and all-wheel drive respectively.

For our case study, we will add the following versions of the Panamera as new product scenarios:

• Panamera (V6, RWD)
• Panamera 4 (V6, AWD)
• Panamera S (V8, RWD)
• Panamera 4S (V8, AWD)
• Panamera Turbo (V8 Turbo, RWD)

To characterize all of them as large 4-door luxury sedans, which is the key distinction versus previous Porsche products, we will assign the “High Premium” attribute to them.

Once this is completed, we need to obtain a database that represents the consumer base, on which these new product scenarios will be “tried out”. This can either be done by associating the original database, from which the network was learned, or by creating a new, artificial one that reflects the joint probability distribution of the learned Bayesian network.

The latter can be achieved by selecting Database>Generate.

It is up to the analyst to determine the size of the database to be generated. Although there is no fixed rule, too small a database will limit the observability of products with a very small market share.

Alternatively, we can also associate the original database, which contains the survey responses. In our case, the original database contains 1,203 records, which is very reasonable in terms of computational requirements.

Once a database is associated, clicking the Simulation button will start the market share estimation process.
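The remark that a small database limits the observability of low-share products can be made concrete with some back-of-the-envelope arithmetic. The sketch below rests on a simplifying assumption that is not taken from the Bayesia Market Simulator: each record is treated as one independent draw that picks a product with probability equal to its market share. The function names are hypothetical.

```python
import math


def expected_count(share, records):
    """Expected number of records choosing a product with the given share."""
    return share * records


def prob_unobserved(share, records):
    """Chance that no record at all chooses the product (independent draws)."""
    return (1.0 - share) ** records


def records_needed(share, confidence):
    """Smallest database size that shows the product at least once with the given confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - share))


# A product with a 1% share, viewed through databases of different sizes.
for n in (100, 1_203, 10_000):
    print(n, expected_count(0.01, n), prob_unobserved(0.01, n))
print(records_needed(0.01, 0.95))
```

Under this assumption, a product with a 1% share would be chosen by only about a dozen of the 1,203 records in the original database, which illustrates why estimates for niche products become noisy quickly.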
With the given complexity of our network and around 100 product scenarios, the simulation should take no longer than 30 seconds on a typical desktop computer.

Upon completion, the simulation results will appear in the form of a pie chart and a table. One can go back and review the scenarios by clicking the Scenario Editing button.

The aggregated simulated market shares can also be copied from the results table and pasted into Excel or any other application for further editing and presentation purposes. An example is provided below, showing the simulated market shares of the brands under study in the High Premium segment.

[Figure: pie chart “Simulated High Premium Market Shares ($75,000+)” covering Audi, BMW, Jaguar, Lexus, Mercedes and Porsche. The slice labels read 1%, 12%, 21%, 3%, 10% and 53%; the assignment of values to brands is not recoverable from the extracted text.]

As can be seen from the results, the Porsche Panamera’s predicted market share appears to be compatible with the reported running rate for calendar year 2010, which was available at the time of writing. Unfortunately, we do not know how this compares to Porsche’s expectations, but the Panamera seems to be quite successful overall.

Substitution and Cannibalization

The fully simulated database can also be saved as a semicolon-delimited CSV file, which will allow reviewing the choice probability for each product scenario by individual consumer in a spreadsheet.

We can literally examine the new, simulated choices record-by-record and see which customers have made the switch to the Panamera. Applying conditional formatting to the spreadsheet can also be very helpful. The above screenshot, for example, shows a selection of actual Mercedes buyers, who would either consider or pick the Porsche Panamera in this simulation. High choice probabilities are shown in shades of red, while near-zero probabilities are depicted in dark blue.
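One plausible way to relate the per-consumer choice probabilities in the saved database to aggregate figures is a weighted average over respondents. The sketch below is an assumption about that arithmetic, not a description of the Bayesia Market Simulator's internals; the records, offers and function names are invented for illustration.

```python
def market_shares(records):
    """Weighted average of per-consumer choice probabilities for each offer."""
    total_weight = sum(weight for weight, _, _ in records)
    totals = {}
    for weight, _, probabilities in records:
        for offer, p in probabilities.items():
            totals[offer] = totals.get(offer, 0.0) + weight * p
    return {offer: value / total_weight for offer, value in totals.items()}


def volume_sources(records, offer):
    """Split an offer's expected volume by the brand each consumer actually bought."""
    by_brand = {}
    for weight, current_brand, probabilities in records:
        contribution = weight * probabilities.get(offer, 0.0)
        by_brand[current_brand] = by_brand.get(current_brand, 0.0) + contribution
    total = sum(by_brand.values())
    return {brand: value / total for brand, value in by_brand.items()}


# Invented records: (survey weight, brand actually bought, simulated choice probabilities).
records = [
    (20.0, "Mercedes", {"S550": 0.7, "Panamera S": 0.2, "750i": 0.1}),
    (30.0, "Porsche", {"S550": 0.1, "Panamera S": 0.6, "750i": 0.3}),
    (50.0, "BMW", {"S550": 0.5, "Panamera S": 0.1, "750i": 0.4}),
]

print(market_shares(records))
print(volume_sources(records, "Panamera S"))
```

The second function gives the portion of an offer's expected volume that comes from buyers of each current brand. For a new Porsche model, the portion attributed to current Porsche buyers is one simple way to express a cannibalization rate, provided the sample is large enough to support it.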
It is equally interesting to examine which Porsche buyers would pick the Panamera over their current vehicle choice.

Not surprisingly, our simulation suggests high probabilities of Panamera choice for several current Cayenne owners. One is tempted to take this a step further and calculate a rate of cannibalization. In this particular survey, however, the sample size is too small to attempt doing so. Otherwise, such a computation would be simple arithmetic.

Market Scenario Simulation

Although experimenting with product scenarios is expected to be the primary use of the Bayesia Market Simulator, it is also possible to change the market scenarios.

For example, this can be used to simulate the impact of policy changes. One could hypothesize that legislation would prohibit or severely penalize ownership of vehicles of a certain size or of a specific engine type in urban areas.37

37 Given the draconian restrictions on motorists in Central London, this example is presumably not very far-fetched.

Upon editing the market segments, the simulation can be rerun to obtain the new market share results.

Limitations

This approach can simulate product and market scenarios consisting of variations of configurations, which can be observed with sufficient sample today. However, the impact of entirely new technologies cannot be simulated on this basis. As a result, projecting the market share of the all-electric Nissan Leaf38 would not be possible, whereas estimating the share of a hypothetical three-row BMW crossover vehicle would be feasible. In all cases, it requires the analyst’s expert knowledge and judgment to determine the adequacy and equivalency of product attributes observable today.

38 The all-electric Leaf was launched by Nissan in the U.S. in December of 2010.

Outlook

There exist several natural extensions to the presented methodology; however, it would go beyond the scope of this paper to present them. A brief summary shall suffice for now and we will go into greater detail in forthcoming case studies in this series:

1. Beyond learning from data, we can use expert knowledge to create or augment Bayesian networks. BayesiaLab offers a Knowledge Elicitation module, which formally captures expert knowledge and encodes it in a Bayesian network. In the absence of market data, this is an excellent approach to have decision makers collectively (and formally correctly) reason about future states of the world.

2. We can extend the concept of product attributes to consumers’ product satisfaction ratings. This will allow estimating the market share impact as a function of changes in consumer ratings. For instance, an automaker could reason about the volume impact from a vehicle facelift, which is expected to raise the consumer rating of “styling”.

3. The product cannibalization or substitution rate can be estimated based on the simulated choice behavior, given that there is sufficient sample size. So, for most mainstream products, this seems to be realistic.
4. With the ability to study consumer choice at the model level, we can also aggregate these results to the segment level. Alternatively, using a less granular approach, we can model the entire market at the segment and brand level, which would allow studying market changes at a larger scale.

5. Beyond simulating “hard” policy changes affecting the market, e.g. excluding a product class from a certain geography, we can also use BayesiaLab to simulate new populations with small changes in average consumer attitudes versus the originally surveyed population. For instance, such an artificially modified population could be more environmentally conscious and one could apply opinions prevalent on the West Coast to the whole country. Bayesia Market Simulator can then generate new market shares based on these new hypothetical market conditions.

Summary

BayesiaLab and Bayesia Market Simulator are unique in their ability to use Bayesian networks for choice modeling and market share simulation. The presented workflow provides a comprehensive method for simulating market shares of future products based on their key characteristics, without requiring new and costly experiments.

As a result, BayesiaLab and Bayesia Market Simulator allow using a vast range of existing research for market share predictions. Given the significant resources many corporations have allocated over many years to conducting consumer surveys, these BayesiaLab tools offer an entirely new way to turn the accumulated research data into practical market oracles.
Appendix

Utility-Based Choice Theory

In today’s choice modeling practice, utility-based choice theory plays a dominant role.

1. The first concept of utility-based choice theory is that each individual chooses the alternative that yields him or her the highest utility.

2. The second idea refers to being able to collapse a vector describing attributes of choice alternatives into a single scalar utility value for the chooser. For instance, a vector of attributes for one choice alternative, e.g. [Price, Fuel Economy, Safety Rating], would translate into one scalar value, e.g. [5], specific to each chooser.

The following example is meant to illustrate both (✓ marks the alternative each consumer chooses):

For Consumer A:

• Utility of Product 1: [Price=$25,000, Fuel Economy=25MPG, Safety Rating=4 stars] = 7 ✓

• Utility of Product 2: [Price=$29,000, Fuel Economy=23MPG, Safety Rating=5 stars] = 5.5

For Consumer B:

• Utility of Product 1: [Price=$25,000, Fuel Economy=25MPG, Safety Rating=4 stars] = 4

• Utility of Product 2: [Price=$29,000, Fuel Economy=23MPG, Safety Rating=5 stars] = 7.5 ✓

This concept implies that consumers make tradeoffs, either explicitly or implicitly, and that there exists an amount x of “Fuel Economy” that is equivalent in utility to an amount y of “Safety”. The reader may reasonably object that not even a fuel economy of 100MPG would make it acceptable to drive a vehicle that is rated very poorly on safety.

Also, we do not know a priori what the utility values are nor can we measure them. Neither do we know in advance how individual product and consumer attributes relate to these unobservable utilities. However, there are methods that allow us to estimate these unknown variables and, based on this knowledge, they allow us to predict choice in the future. One such method is briefly highlighted in the following.

Multinomial Logit Models

In the domain of choice modeling, Multinomial Logit models (MNL) have become the workhorse of the industry, but here we only want to provide a cursory overview, so the reader can compare the approach presented in the case study with current practice.

MNL models provide a functional form for describing the relationship between the utilities of alternatives and the probability of choice.

For instance, using an MNL model for a choice situation with three vehicle alternatives, Altima, Accord and Camry, the probability of choosing the Altima can be expressed as:

$$\Pr(\text{Altima}) = \frac{\exp(V_{\text{Altima}})}{\exp(V_{\text{Altima}}) + \exp(V_{\text{Accord}}) + \exp(V_{\text{Camry}})}$$

$V_{\text{Altima}}$ in this case stands for the utility of the Altima alternative. The utilities $V_{\text{Altima}}$, $V_{\text{Accord}}$, and $V_{\text{Camry}}$ are a function of the product attributes, e.g.

$$V_{\text{Altima}} = \beta_1 \times \text{Cost}_{\text{Altima}} + \beta_2 \times \text{FuelEconomy}_{\text{Altima}} + \beta_3 \times \text{SafetyRating}_{\text{Altima}}$$

As we can observe tangible attributes like vehicle cost, fuel economy and safety rating, and we can also observe who bought which vehicle, we can estimate the unknown parameters. Once we have the parameters, we can simulate choices based on new, hypothetical product attributes, such as a better fuel economy for the Altima or a lower price for the Camry.

The parameters of MNL models can be estimated both from “stated preference” (SP) data, i.e. asking consumers about what they would choose, and “revealed preference” (RP) data, i.e. observing what they have actually chosen. There are numerous variations and extensions to the class of MNL models and the reader is referred to Train (2003) and Koppelman (2006) for a comprehensive introduction.
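The MNL choice probability and the linear utility function described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the β values are hypothetical and not estimated from any data, and the attribute values (cost in thousands of dollars, fuel economy in MPG, safety rating in stars) are invented for the three vehicles.

```python
import math


def utility(attributes, betas):
    """Linear-in-parameters utility: V = sum over k of beta_k * attribute_k."""
    return sum(betas[name] * value for name, value in attributes.items())


def mnl_probabilities(utilities):
    """MNL choice probabilities: exp(V_i) / sum over j of exp(V_j)."""
    # Subtracting the maximum utility avoids overflow and does not change
    # the resulting probabilities.
    v_max = max(utilities.values())
    exp_v = {alt: math.exp(v - v_max) for alt, v in utilities.items()}
    total = sum(exp_v.values())
    return {alt: e / total for alt, e in exp_v.items()}


def simulate(vehicles, betas):
    """Compute choice probabilities for a set of alternatives."""
    return mnl_probabilities(
        {name: utility(attrs, betas) for name, attrs in vehicles.items()}
    )


# Hypothetical parameters (assumed for illustration, NOT estimated).
betas = {"cost": -0.1, "fuel_economy": 0.05, "safety_rating": 0.3}

# Hypothetical attributes: cost in $1,000s, fuel economy in MPG, safety in stars.
vehicles = {
    "Altima": {"cost": 25.0, "fuel_economy": 25.0, "safety_rating": 4.0},
    "Accord": {"cost": 26.0, "fuel_economy": 24.0, "safety_rating": 5.0},
    "Camry": {"cost": 27.0, "fuel_economy": 26.0, "safety_rating": 4.0},
}

baseline = simulate(vehicles, betas)

# What-if scenario: a better fuel economy for the Altima.
scenario = {name: dict(attrs) for name, attrs in vehicles.items()}
scenario["Altima"]["fuel_economy"] = 30.0
what_if = simulate(scenario, betas)

for name in vehicles:
    print(f"{name}: baseline {baseline[name]:.3f}, what-if {what_if[name]:.3f}")
```

In this sketch the Altima gains share in the what-if scenario, while the ratio between the Accord and Camry probabilities stays the same. That proportional substitution pattern is a known property of the plain MNL form and is one motivation for extensions such as nested logit models.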
Stated Preference Data

Stated preference data typically comes from experiments, i.e. consumer surveys or product clinics. In this context, conjoint experiments have become a very popular choice elicitation method and a wide range of tools have been developed for this particular approach. In conjoint studies, consumers would typically be given a set of artificially generated product choices along with their attributes, from which preference responses are then elicited. There are many variations of this method that all attempt to address some of the inherent challenges related to dealing with responses to hypothetical questions.

The Sawtooth software package has become the de facto industry standard for such conjoint studies.[39]

Revealed Preference Data

In contrast to SP data, revealed preference data is purely derived from passive observations. As the name implies, the consumer choice is revealed by their actual behavior rather than by their stated intent in a hypothetical situation. A key benefit is that it is typically easier and more economical to obtain passive observations than to conduct formal experiments. A conceptual limitation of RP data relates to the fact that not-yet-existing products can obviously not be chosen by consumers in the present market environment. Thus simulating market shares of hypothetical products requires “assembling” them from components and attributes of products that are already available in the market. This inherently limits the exploration of entirely new technologies, which have little in common with the technologies they may replace.

Studies based on RP data have become very popular for researching travel mode choice, as is also documented in a large body of research. In market research related to CPG products or durable goods, using RP data is somewhat less common.

We speculate that one of the reasons for the lack of popularity outside the world of academia is the absence of easy-to-use software packages. Only recently, with the release of Easy Logit Modeling (ELM)[40], specifying and estimating multinomial logit models has become practical for a much broader audience. Although ELM has successfully removed the burden of manual coding, countless iterations of specification and estimation remain a very time-consuming task for the analyst.

[39] A wide range of tools is available from Sawtooth Software, Inc., www.sawtoothsoftware.com.

[40] Easy Logit Modeling is available from ELM-Works, Inc., www.elm-works.com. ELM can estimate models based on both RP and SP data, although we only mention it in the RP context.

NVES Variables

The following variables from the 2009 Strategic Vision NVES were included in this case study:

• UNIQUE IDENTIFIER
• Combined Base Weight
• New Model Purchased - Make/Model/Series (Alpha Order)
• New Model Purchased - Brand
• New Model Purchased - Region Origin
• New Model Segment
• Segmentation 2
• Type Of Transmission
• Number Of Cylinders (VIN)
• Drive Type (VIN)
• Fuel Type
• Gender
• Marital Status
• Age Bracket
• Children Under 6
• Children 6 To 12
• Children 13 To 17
• Total Family Pre-Tax Income
• Ethnic Group
• Location Of Residence
• Customer Region Classification #1
• I Seek Variety in My Life
• I'm Curious and Open to Experiences
• Luxury is Not Important Unless it Has Purpose
• I Enjoy Expressing Myself Creatively
• I See Life as Full of Endless Possibilities
• Driving is one of my favorite things to do
• I really don't enjoy driving
• Whenever I get a chance, I love to go for a drive
• When I drive for fun, I mainly prefer to relax and listen to music or talk
• I want vehicles that provide that open-air driving experience
• I prefer a vehicle that has the capability to outperform others
• I prefer vehicles that provide superior straight ahead power
• I prefer vehicles that provide superior handling and cornering agility
• I prefer a balance of comfort and performance
• I prefer vehicles that provide the softest, most comfortable ride quality
• I just want the basics on my vehicle - no extras
• Value equals balance of costs, comfort & performance
• I prefer vehicles that project a tough and workmanlike image
• Vehicles are a 'tool' or a part of the 'gear' in an active outdoors lifestyle
• I Want to be able to tow heavy loads
• I want to be able to traverse any terrain
• I want the most versatility in my interior
• I want a basic, no frills vehicle that does the job
• My choice of vehicle reflects my personality
• I want a vehicle that says a lot about my success in life / career
• I will switch brand for features or price
• There are lots of different brands of vehicles that I would consider buying
• I prefer sofa-like comfort over a cockpit-like interior
• I want a vehicle that provides the quietest interior
• I want to look good when driving my vehicle
• I want my vehicle to stand out in a crowd
• I would pay significantly more for environmentally friendly vehicle
• Price is most important to me when buying a new vehicle
• Purchase Price (100's)
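As a small illustration of how a survey weight such as “Combined Base Weight” might enter a revealed-preference analysis, the sketch below computes weighted market shares from respondent-level purchase records. The exact role of the weight in the NVES data is an assumption here, and the records, model names, and weight values are hypothetical.

```python
from collections import defaultdict

# Hypothetical respondent records: (new model purchased, survey weight).
records = [
    ("Model A", 1.2),
    ("Model B", 0.8),
    ("Model A", 1.0),
    ("Model C", 2.0),
    ("Model B", 1.0),
]


def weighted_shares(records):
    """Share of each model = sum of its respondents' weights / total weight."""
    totals = defaultdict(float)
    for model, weight in records:
        totals[model] += weight
    grand_total = sum(totals.values())
    return {model: w / grand_total for model, w in totals.items()}


shares = weighted_shares(records)
for model, share in sorted(shares.items()):
    print(f"{model}: {share:.1%}")
```

With weights, a respondent who represents more buyers in the population contributes proportionally more to the estimated share than a simple count of survey responses would give.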
References

Barber, David. “Bayesian Reasoning and Machine Learning.” http://www.cs.ucl.ac.uk/staff/d.barber/brml.

Barber, David. Bayesian Reasoning and Machine Learning. Cambridge University Press, 2011.

Darwiche, Adnan. “Bayesian networks.” Communications of the ACM 53, no. 12 (12, 2010): 80.

Koller, Daphne, and Nir Friedman. Probabilistic Graphical Models: Principles and Techniques. The MIT Press, 2009.

Koppelman, Frank, and Chandra Bhat. “A Self Instructing Course in Mode Choice Modeling: Multinomial and Nested Logit Models.” January 31, 2006.

Krzywinski, M., J. Schein, I. Birol, J. Connors, R. Gascoyne, D. Horsman, S. J. Jones, and M. A. Marra. “Circos: An information aesthetic for comparative genomics.” Genome Research 19, no. 9 (6, 2009): 1639-1645.

Neapolitan, Richard E., and Xia Jiang. Probabilistic Methods for Financial and Marketing Informatics. 1st ed. Morgan Kaufmann, 2007.

Pearl, Judea. Causality: Models, Reasoning and Inference. 2nd ed. Cambridge University Press, 2009.

Spirtes, Peter, Clark Glymour, and Richard Scheines. Causation, Prediction, and Search, Second Edition. 2nd ed. The MIT Press, 2001.

Train, Kenneth. Qualitative Choice Analysis: Theory, Econometrics, and an Application to Automobile Demand. 1st ed. The MIT Press, 1985.

Train, Kenneth E. Discrete Choice Methods with Simulation. Cambridge University Press, 2003.
Contact Information

Conrady Applied Science, LLC
312 Hamlet’s End Way
Franklin, TN 37067
USA
+1 888-386-8383
info@conradyscience.com
www.conradyscience.com

Bayesia SAS
6, rue Léonard de Vinci
BP 119
53001 Laval Cedex
France
+33(0)2 43 49 75 69
info@bayesia.com
www.bayesia.com

Copyright © 2010 Conrady Applied Science, LLC and Bayesia SAS. All rights reserved.

Any redistribution or reproduction of part or all of the contents in any form is prohibited other than the following:

• You may print or download this document for your personal and noncommercial use only.

• You may copy the content to individual third parties for their personal use, but only if you acknowledge Conrady Applied Science, LLC and Bayesia SAS as the source of the material.

• You may not, except with our express written permission, distribute or commercially exploit the content. Nor may you transmit it or store it in any other website or other form of electronic retrieval system.