Data Mining with Enterprise Miner
Improving Marketing Effectiveness for Home Equity Product
UC Berkeley Extension
Business Detail
Marketing Data
Business Objective: Improve marketing efficiency by estimating the likelihood of response to a marketing campaign promoting the home equity product
Marketing Definition (Policy K.O.): Target potential customers with FICO exceeding 680
Marketing Budget: Constrained to cover 30% of overall market
Data Source: Response data from previous untargeted trial marketing campaign results
Analysis Variables
Credit Inquiries
Derogatory Accounts
Collection Accounts
Liens
Home Equity
Mortgage
Bank Cards
Installment Loans
Leases
Retail Trades
Revolving
FICO
MDS
Response Flag (response variable)
Tech Specs
Response Variable: binary (1 = Y / 0 = N). The model estimates the probability that y = 1 (the rare target event) given a particular set of values for x1, x2, . . . , xp: E(y) = P(y = 1 | x1, x2, . . . , xp)
Sampling: Target event over-sampled in order to improve statistical learning (28% YES vs. 72% NO)
Data Partition: Train, Validation, Test (40%, 30%, 30% respectively)
Transformations: FICO and MDS transformed in order to maximize normality
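The 40/30/30 data partition described above can be sketched roughly as follows. This is only an illustration in plain Python, not the Enterprise Miner partition node; the function name `partition`, the fixed seed, and the simple random (non-stratified) shuffle are assumptions made for the example.

```python
import random

def partition(rows, fractions=(0.4, 0.3, 0.3), seed=0):
    """Shuffle rows and split them into train / validation / test lists.

    Hypothetical helper: simple random split, no stratification on the target.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    n = len(shuffled)
    n_train = round(n * fractions[0])
    n_valid = round(n * fractions[1])
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]    # remainder goes to the test set
    return train, valid, test

train, valid, test = partition(range(1000))
print(len(train), len(valid), len(test))
```

A stratified split on the response flag would keep the 28% / 72% mix identical in each partition; the sketch above leaves that out for brevity.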
Tech Specs
Replacement (Missing): 28% of FICO and MDS values are missing; missing values are imputed with the median
Variable Selection: R-square criterion used as first-pass variable selection for Logit and Neural Network models
Models:
Logit: Full, Forward, Backward, Stepwise, Interaction Stepwise
Decision Tree: CHAID, CART, Entropy (C4.5)
Neural Network: RBF, MLP2-2, MLP3-3
Prior Probability: adjusted for oversampling by applying a 1% (response) prior probability to the target event
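One common way to apply a prior probability to scores from a model trained on an over-sampled set is to reweight each class by the ratio of its population prior to its sample proportion. The sketch below assumes that approach, with the 28% sample rate and 1% prior taken from the slides; the function name `adjust_for_oversampling` is hypothetical, and the slides do not state exactly how Enterprise Miner performed the adjustment.

```python
def adjust_for_oversampling(p_sample, sample_rate=0.28, prior=0.01):
    """Rescale a probability estimated on over-sampled data to the population.

    p_sample    : predicted P(y = 1) from the model fit on the over-sampled data
    sample_rate : share of responders in the training sample (28% in the slides)
    prior       : assumed share of responders in the population (1% in the slides)
    """
    w1 = prior / sample_rate                  # weight for the response class
    w0 = (1.0 - prior) / (1.0 - sample_rate)  # weight for the non-response class
    num = p_sample * w1
    return num / (num + (1.0 - p_sample) * w0)

# A score equal to the sample response rate maps back to the population prior.
print(adjust_for_oversampling(0.28))
```

The adjustment is monotone, so it changes the predicted probabilities but not the rank order of prospects.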
Model Architecture
[Figure: model architecture diagram]
LIFT
Logit Forward
The Logit Forward model provides the largest lift percentage (nearly 67%) at 30% market coverage, which equals the budget constraint.
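The slides do not define how the 67% figure is computed. As a rough illustration, the sketch below computes two common metrics at a given coverage depth: the share of all responders captured in the top-scored fraction, and the lift (that share divided by the depth). The function name `lift_at_depth` and the toy scores are assumptions for the example, not data from the study.

```python
def lift_at_depth(scores, responses, depth=0.30):
    """Return (captured response share, lift) for the top `depth` of prospects.

    scores    : model scores, higher means more likely to respond
    responses : 1 for responders, 0 for non-responders
    """
    ranked = sorted(zip(scores, responses), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(len(ranked) * depth))   # prospects the budget can cover
    top_responders = sum(r for _, r in ranked[:n_top])
    captured = top_responders / sum(responses)   # share of all responders reached
    lift = captured / (n_top / len(ranked))      # relative to random targeting
    return captured, lift

# Toy data (hypothetical): 10 prospects, 3 responders.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
responses = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
captured, lift = lift_at_depth(scores, responses)
print(round(captured, 3), round(lift, 3))
```

Comparing this pair of numbers across candidate models at the 30% depth is one way to pick the model that best fits the budget constraint.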
Targeted Response Histogram
The probability distribution of the target response is very telling. The left side of the distribution shows that the majority of the market has a low probability of responding. Campaign focus should be geared to the left.
Adverse Selection
A target's willingness to respond is negatively correlated with his/her FICO score.
Results
Best Predictive Model: Logit Forward (67% lift at 30% target market)
Extreme bimodal distribution separates responsive targets from non-responsive targets
As expected, adverse selection exists. Target deciles 5, 6 and 7 to marry marketing response expectations with credit risk guidance.
