Market basket predictive_model
Advanced Analytics – Business Intelligence (IT)
PREDICTIVE MODEL ANALYSIS
Prepared By: Fatima Khalid
Figure A: Market Basket Analysis Predictive Model
Purpose:
Increase profit by raising the purchasing-power threshold per user, with a year-end target of
approximately 10% average revenue uplift per user. Using a cross-selling approach, predict
customer behavior to create new marketing optimization strategies that direct offers to the
customers who have the highest likelihood of buying them and that allow the sale of new
products/services, making the revenue uplift possible.
Reduce churn. This is largely possible because customer dependence increases over time as
customer-specific strategies are launched and customers subscribe to more services, thereby
increasing customer loyalty and reducing volatility.
Enable real-time decision-making at the point of sale. The model (through what-if analysis and
intelligent Q&A) should be able to predict customer behavior and responses so that each
subscriber's interest is addressed. This would allow delivering the right proposition to the right
customer at the point of sale, thus influencing their decision there and then.
Data sources and format:
Input Data Specifications:
Name          Description                          Rank  Storage    ValueType   KeyLevel  OrderLevel  MissingString
Accs_Meth_ID  User identification                  0     Number     Ordinal     1         1           ?
Tct           Total connection time                4     Date/Time  Continuous  -         -           ?
Ncm           # of connections made                2     Number     Ordinal     -         1           ?
CSTD          Connection STD                       2     Date/Time  Continuous  -         -           ?
CED           Connection ED                        3     Date/Time  Continuous  -         -           ?
Acl           Avg conn length                      5     Number     Continuous  -         -           ?
Tcs-x         Total connection time for service x  6     Date/Time  Continuous  -         1           ?
Ncs-x         Number of connections for service x  7     Number     Ordinal     -         1           ?
CCP           Customer's calling plan              1     String     Nominal     -         -           ?
ALNO          Active LNO                           8     String     Nominal     -         -           ?
ANSM          Active NSM                           9     String     Nominal     -         -           ?
AFN           Active FN                            10    String     Nominal     -         -           ?
Key To Above
Rank           Rank of the variable in the data file
Storage        Storage type of the variable
ValueType      Nominal (for categorical variables), Ordinal (for numeric values), or Continuous
               (numeric variables containing metric values)
OrderLevel     Can be used to sort data (1)
KeyLevel       Primary key (1), secondary key (2), etc.
MissingString  Specific code for a missing value
Description    Describes the variable
Test Data Set:
Target Variable: Service Offered Selected (by the customer).
Format: Binary (yes/no)
One row of data per example: all of the information known about each customer should be on a
single line.
Text File: A data set in a single comma- or tab-delimited text file is preferred.
Size: A file that fits into Excel (no more than 256 columns and 65,536 rows). This makes it simple
to browse through the data prior to building a model.
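The format requirements above can be checked mechanically before modeling. The following is a minimal sketch, not a prescribed procedure: the sample rows are invented, the column names Accs_Meth_ID, Ncm and CCP are borrowed from the input specification, and "Accepted" is a hypothetical name for the binary target variable.

```python
import csv
import io

MAX_ROWS, MAX_COLS = 65_536, 256  # Excel limits quoted in the text

# Invented sample data; "Accepted" is a hypothetical name for the target.
SAMPLE = """Accs_Meth_ID,Ncm,CCP,Accepted
1001,12,PlanA,yes
1002,3,PlanB,no
1003,?,PlanA,yes
"""


def check_dataset(text, key="Accs_Meth_ID", target="Accepted", missing="?"):
    """Return (list of format problems, count of missing-value codes)."""
    rows = list(csv.DictReader(io.StringIO(text)))
    problems = []
    if rows and len(rows[0]) > MAX_COLS:
        problems.append("too many columns")
    if len(rows) + 1 > MAX_ROWS:  # +1 accounts for the header row
        problems.append("too many rows")
    keys = [row[key] for row in rows]
    if len(keys) != len(set(keys)):
        problems.append("more than one row per customer")
    if any(row[target] not in ("yes", "no") for row in rows):
        problems.append("non-binary target value")
    n_missing = sum(value == missing for row in rows for value in row.values())
    return problems, n_missing


problems, n_missing = check_dataset(SAMPLE)
print(problems, n_missing)
```

A tab-delimited file could be handled the same way by passing delimiter="\t" to csv.DictReader.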
Technique:
One of the techniques tried for implementing the model uses a Bayesian network to represent
customer behavior and then uses this network to predict which customers are most likely to
pick each service offered. The network determines a joint probability distribution over the
attributes it describes, allowing inferences based on that distribution.
The method starts with a (possibly empty) network representing the user's background knowledge.
At each iteration, patterns are found whose probabilities in the customer data diverge most from
what the network predicts. The analyst then explains those discrepancies by updating the
network.
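To make the idea of a joint distribution concrete, here is a small sketch of a toy network in which the number of connections (Ncm) influences whether a customer holds GPRS and MCA. The structure and every probability in it are made-up assumptions for illustration only; they are not taken from customer data or from Figure B.

```python
from itertools import product

# Made-up probabilities for a toy network: Ncm -> GPRS, Ncm -> MCA.
P_NCM = {"low": 0.4, "high": 0.6}
P_GPRS = {"low": 0.1, "high": 0.5}  # P(GPRS = yes | Ncm)
P_MCA = {"low": 0.2, "high": 0.7}   # P(MCA = yes | Ncm)


def joint(ncm, gprs, mca):
    """P(Ncm, GPRS, MCA) as the product of the network's local tables."""
    p_gprs = P_GPRS[ncm] if gprs else 1 - P_GPRS[ncm]
    p_mca = P_MCA[ncm] if mca else 1 - P_MCA[ncm]
    return P_NCM[ncm] * p_gprs * p_mca


def prob(event):
    """Probability of a partial assignment, e.g. {"gprs": True}."""
    total = 0.0
    for ncm, gprs, mca in product(P_NCM, (True, False), (True, False)):
        world = {"ncm": ncm, "gprs": gprs, "mca": mca}
        if all(world[name] == value for name, value in event.items()):
            total += joint(ncm, gprs, mca)
    return total


# Inference from the joint distribution: P(Ncm = high | GPRS = yes).
p_gprs = prob({"gprs": True})
p_high_given_gprs = prob({"gprs": True, "ncm": "high"}) / p_gprs
print(p_gprs, p_high_given_gprs)
```

Enumerating every combination is only practical for a handful of variables; it is used here because it keeps the link between the local tables and the joint distribution visible.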
To illustrate, we pick the following three optional value-added services (VAS) currently offered
to customers and then build a model that predicts which of the three a customer is most likely
to accept when offered, based on the customer's selection of previous services/products.
GPRS: Internet browsing.
Missed Call Alert (MCA): Notification about calls that you did not answer
voluntarily, or that arrived when you were out of the coverage area or the handset was powered off.
Voice Mail: Up to 3 favorite Mobilink numbers at a rate of 50 paisa/minute at any
time (FN).
Dummy customer profile scenarios are then considered, based on the above offers, to further
the model creation. An example is below:
Profile  Characteristics
1        More SMS-type services. Short connection time.
2        Number of connections greater during peak hours.
3        Longer connection time to selected few numbers.
4        Peak hour connections made within network.
With reference to Figure A, the association rules, which form an integral part of market basket
analysis, are now formed as below.
Basic Rule: I → J, where I and J are both subsets of H (the set of attributes) and I ∩ J = ∅. Two
further measures follow from this.
Support: S(I → J) = S(I ∪ J), the proportion of transactions in the database that contain all items in I ∪ J.
Confidence: C(I → J) = S(I ∪ J) / S(I), how likely it is that a transaction containing I also contains J.
Applying association rules requires a method of selecting interesting rules.
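The support and confidence measures can be computed directly from their definitions. The sketch below uses four invented baskets of services; the basket contents and the service labels used as set items are illustrative assumptions.

```python
def support(transactions, items):
    """S(items): share of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in transactions) / len(transactions)


def confidence(transactions, lhs, rhs):
    """C(lhs -> rhs) = S(lhs U rhs) / S(lhs); 0.0 if lhs never occurs."""
    s_lhs = support(transactions, lhs)
    if s_lhs == 0:
        return 0.0
    return support(transactions, set(lhs) | set(rhs)) / s_lhs


# Invented baskets: the services each of four customers subscribes to.
baskets = [
    {"GPRS", "MCA"},
    {"GPRS"},
    {"MCA", "VoiceMail"},
    {"GPRS", "MCA", "VoiceMail"},
]

s = support(baskets, {"GPRS", "MCA"})       # rule GPRS -> MCA
c = confidence(baskets, {"GPRS"}, {"MCA"})
print(s, c)
```

Here two of the four baskets contain both GPRS and MCA, and three contain GPRS, so the rule GPRS → MCA has support 0.5 and confidence 0.5 / 0.75.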
The method is based on taking into account the user's knowledge of the analyzed problem. The
knowledge is represented using a formal model (a Bayesian network). Association rules
discovered in the data that do not agree with what the user's knowledge predicts are considered
interesting. Such rules are then used by the user to update the model, and the algorithm is
applied again to find new interesting rules.
The interestingness of an event E is defined as inter(E) = |P_BN(E) − P_D(E)|, that is, as the
absolute difference between the probability of that event obtained from the data (P_D) and the
probability predicted by the Bayesian network (P_BN).
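A minimal sketch of ranking events by this interestingness measure follows. It assumes the starting point is an empty network, interpreted here as treating all attributes as independent, so the network's prediction for an event is the product of the marginal frequencies. That interpretation, the four records and the abbreviation VM for Voice Mail are assumptions made for illustration.

```python
from itertools import combinations
from math import prod

# Invented records: 1 means the customer holds the service, 0 means not.
DATA = [
    {"GPRS": 1, "MCA": 1, "VM": 0},
    {"GPRS": 1, "MCA": 1, "VM": 1},
    {"GPRS": 0, "MCA": 0, "VM": 1},
    {"GPRS": 0, "MCA": 0, "VM": 0},
]


def p_data(event):
    """P_D(E): relative frequency of the event in the data."""
    hits = sum(all(row[k] == v for k, v in event.items()) for row in DATA)
    return hits / len(DATA)


def p_network(event):
    """P_BN(E) for an empty network: product of the marginal frequencies."""
    return prod(p_data({k: v}) for k, v in event.items())


def inter(event):
    """inter(E) = |P_BN(E) - P_D(E)|."""
    return abs(p_network(event) - p_data(event))


# Candidate events: each pair of services held together.
events = [{a: 1, b: 1} for a, b in combinations(["GPRS", "MCA", "VM"], 2)]
ranked = sorted(events, key=inter, reverse=True)
print(ranked[0], inter(ranked[0]))
```

In this toy data GPRS and MCA always occur together, which the independence assumption does not predict, so that pair ranks first. In the method described above, the analyst would respond by adding a dependency between the two to the network and running the search again.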
Trivial dependencies known a priori, such as number of connections (Ncm) = Ncs1 + Ncs2 + Ncs3,
are drawn first, as in Figure B. The rest of the vertices are drawn according to the algorithm
for interesting rules. For example, the number of connections affects the services added:
customers who make few calls do not use any of the services offered to them, so that
vertex is added to the diagram below, and so on.
An example of the Bayesian network formulated from the above-mentioned variables and rules
is shown in Figure B, which is self-explanatory. However, this is only a cross-sectional
view. More comprehensive additions are made as the algorithm is executed more times and
vertices are drawn accordingly. The figure is presented to show how the variables fit into
the final picture.
With a more comprehensive structure and set of vertices, we would in the end be able to determine
which factors affect the three offers the most. This helps in predicting customer behavior,
as we can determine which customer behavior influences those factors, and the resulting
probabilities indicate whether the customer would accept the offer or not (our final target
variable). The offers can then be targeted to the right customers and their buying habits
influenced.
Figure B: Bayesian Network