Behavior-Based Predictive Models
Published in: Technology, Business
  1. 2. Behavior-Based Predictive Models
  2. 3. Disclaimer
     - Any opinions, advice, statements, or other information or content expressed or made in the following presentation are those of the presenter and do not necessarily state or reflect the positions or opinions of JPMorgan Chase, its affiliates or subsidiaries.
  3. 4. Special Thanks to
     - SAS and Dr. Jerry Oglesby
     - The ChoicePoint Precision Marketing (CPPM) analytic team, which supported me in finishing this work but no longer exists.
     - Rules:
       1. During the talk, stop me at any time if you have a question.
       2. After the talk, you are welcome to discuss with me offline.
  4. 5. Introduction
     - Most popular Data Mining models:
       - Logistic regression and its variants (NNets, SVM, ...)
       - 2-state assumption => Bernoulli outcome
       - Predict the presence of certain behaviors (response, ...)
     - Major Limitation:
       - Ignore frequency and severity given presence of behavior
         Ex. 1st-time auto claim => bad luck => normal
             2 or more claims => bad habit => risky
       - Consequence of 2-state: rank-orders head count but not $$$
  5. 6. Current Status
     - Most efforts focus on relationship exploration:
       - NNets, SVM, GAM, CART, ...
     - Overlook definition of left-hand side:
       - Binary outcome is derived from, but over-simplifies, behaviors
       - Why not model behaviors directly using Count Models?
     - Any loss without the 2-state assumption (Logistic Regression)?
     - Law of Small Numbers:
       - Binomial(N, p), the sum of N Bernoulli(p) trials, ≈ Poisson(Np) as N -> ∞ and p -> 0
       - => Prob(Y = 1 | Y ~ Bern.) ≈ Prob(Y ≥ 1 | Y ~ Pois.) (shown later)
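
The Law of Small Numbers claim can be checked numerically. The following is a minimal sketch in Python (not the deck's SAS); the values of `n` and `p` are hypothetical and chosen only to represent "many trials, rare event". It compares the two probability mass functions and the implied probability of at least one event.

```python
import math

def binomial_pmf(k, n, p):
    """P(K = k) for K ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(K = k) for K ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Hypothetical values: many trials, rare event.
n, p = 10_000, 0.0002
lam = n * p

# Largest pointwise gap between the two pmfs over k = 0..20.
max_gap = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(21))

# Binary view of the outcome: probability of at least one event.
p_any_binomial = 1 - binomial_pmf(0, n, p)
p_any_poisson = 1 - poisson_pmf(0, lam)

print(f"max pmf gap: {max_gap:.2e}")
print(f"P(at least one): binomial={p_any_binomial:.5f}, poisson={p_any_poisson:.5f}")
```

Under these assumed values the two "at least one event" probabilities nearly coincide, which is the sense in which a count model can reproduce the binary prediction.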
  6. 7. Genuine Count Model
     - Starting Point: basic Poisson Model
     - Major Drawback:
       - Strong assumption of equi-dispersion => Mean = Variance
     - Real-world Data => Over-Dispersion
       - Excess Zeroes: majority with 0 delinquency in Credit Card
       - Long Right Tail: severely sick patients in Insurance
     - Diagram label: Observed Heterogeneity
  7. 8. Major Alternative
     - Negative Binomial Model (continuous mixture):
       - E(Y|X) = λ and Var(Y|X) = λ + αλ² > λ => Problem solved!
     - Potential Limitation: 1-Process Assumption
       - Lack of flexibility for heterogeneous population
       - Lack of intuitive interpretation on excess zeroes
       - Lack of insight for customer segmentation
     - Diagram labels: Observed Heterogeneity, Unobserved Heterogeneity
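
The variance formula Var(Y|X) = λ + αλ² can be verified by summing over the negative binomial probability mass function. This is a minimal Python sketch (not the deck's SAS) using the common mean/dispersion parameterization with r = 1/α; the values `lam = 0.5` and `alpha = 6.0` are hypothetical, picked only so that the variance comes out at 4 times the mean.

```python
import math

def nb_pmf(k, lam, alpha):
    """Negative binomial pmf with mean lam and dispersion alpha (size r = 1/alpha)."""
    r = 1.0 / alpha
    log_p = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(r / (r + lam))
        + k * math.log(lam / (r + lam))
    )
    return math.exp(log_p)

# Hypothetical parameters, not estimates from the credit card data.
lam, alpha = 0.5, 6.0

# Truncate the infinite sum; the tail beyond 500 is negligible for these values.
probs = [nb_pmf(k, lam, alpha) for k in range(500)]
mean = sum(k * p for k, p in enumerate(probs))
var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))

print(f"mean={mean:.4f}, variance={var:.4f}, lam + alpha*lam^2={lam + alpha * lam**2:.4f}")
```

Setting `alpha` close to 0 pushes the variance back toward the mean, which is the Poisson limit.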
  8. 9. Composite Models
     - Main Assumption => Multiple (2+) Components
       - Data governed by multiple processes
         Ex. Insurance claimant might behave differently after 1st claim.
     - Models covered:
       - Hurdle Model (Mullahy 1986)
       - Zero-Inflated Poisson Model (Lambert 1992)
       - Latent Class Poisson Model (Wedel 1993)
     - Additional Benefit:
       - Segmentation by behavior and/or characteristics
  9. 10. An Application
     - Credit Card Data used in Econometric Analysis (Greene 1992)
     - Outcome: # of 60-day delinquencies in payment
     - Predictors:
  10. 11. Data Summary: for the outcome, Variance = 4 times Mean
  11. 12. EDA on Outcome
      1. 80% of cardholders have 0 delinquencies.
      2. Large dispersion with a long tail.
  12. 13. Traditional Modeling Practice
      - Logistic Regression based on 2-State assumption:
        - Define Y = 0 if MajorDrg = 0 and Y = 1 otherwise
        - Fit a logistic regression with 0/1 Bernoulli outcome
            proc logistic data = credit;
              model Y = < PREDICTORS >;
            run;
      - Can't differentiate between 1 delinquency and 3 delinquencies
      - Able to capture head counts but not dollars
  13. 14. Standard Count Data Model
      - Basic Poisson Model => not sufficient for data with 80% zeroes
      - Negative Binomial Model:
          proc genmod data = credit;
            model Y = < PREDICTORS > / dist = NB link = log;
          run;
      - Goodness-of-Fit: both portfolio level and account level
  14. 15. NB Output
  15. 16. NB Portfolio Prediction
  16. 17. How to Score
  17. 18. NB Account Prediction
  18. 19. Hurdle Model
      - Two-Component Assumption:
        - Zero counts determined by Binomial distribution
        - Positive counts governed by Zero-Truncated Poisson distribution
      - 2-Group Segmentation:
        - Group without delinquency
        - Group with delinquency
  19. 20. Hurdle Model in SAS
        proc nlmixed data = data;
          params b0 = 0 b1 = 0 ... a0 = 0 a1 = 0 ...;
          xb = b0 + b1 * INCOME ... ;
          mu = exp(xb);
          xa = a0 + a1 * INCOME ... ;
          /* Probability for zero */
          if y = 0 then p = exp(xa) / (1 + exp(xa));
          /* Probability for zero-truncated Poisson */
          else p = (1 - exp(xa) / (1 + exp(xa))) / (1 - exp(-mu)) * (exp(-mu) * mu ** y / fact(y));
          ll = log(p);
          model y ~ general(ll);
        run;
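
The two branches of the hurdle likelihood can be read as a single probability mass function. Below is a minimal Python sketch (not the deck's SAS) of that function; `p_zero` stands for the logistic term exp(xa) / (1 + exp(xa)), `mu` for exp(xb), and the numeric values are hypothetical.

```python
import math

def hurdle_pmf(y, p_zero, mu):
    """Hurdle Poisson pmf: point mass p_zero at 0, zero-truncated Poisson above 0."""
    if y == 0:
        return p_zero
    poisson = math.exp(-mu) * mu**y / math.factorial(y)
    # Rescale the Poisson so the positive counts sum to (1 - p_zero).
    return (1 - p_zero) * poisson / (1 - math.exp(-mu))

def hurdle_mean(p_zero, mu):
    """Expected count implied by the hurdle model."""
    return (1 - p_zero) * mu / (1 - math.exp(-mu))

# Hypothetical account: 80% chance of no delinquency, severity rate 1.2.
p_zero, mu = 0.8, 1.2
probs = [hurdle_pmf(y, p_zero, mu) for y in range(60)]

print(f"total probability: {sum(probs):.6f}")
print(f"expected delinquencies: {hurdle_mean(p_zero, mu):.4f}")
```

Because the zero probability is set entirely by the first component, the share of zeroes can be matched without distorting the severity rate `mu`.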
  20. 21. Hurdle Output (labels: Drivers for Presence of Delinquency; Drivers for Severity of Delinquency)
  21. 22. Hurdle Portfolio Prediction (labels: Un-normalized Truncated Poisson Distribution; Composite Distribution)
  22. 23. Hurdle Segmentation
      1. Segmentation Model: Logistic Model separates BLUE from RED
      2. Severity Model: Truncated Poisson predicts severity of RED
  23. 24. Hurdle Account Prediction
  24. 25. Zero-Inflated Poisson Model
      - Two-Component Assumption:
        - Part of zeroes determined by Binomial distribution
        - Rest of zeroes together with positive counts determined by standard Poisson distribution
      - 2-Group Segmentation:
        - Group without delinquency risk
        - Group with delinquency risk
  25. 26. ZIP Model in SAS
        proc nlmixed data = data;
          params b0 = 0 b1 = 0 ... a0 = 0 a1 = 0 ...;
          xb = b0 + b1 * INCOME ... ;
          mu = exp(xb);
          xa = a0 + a1 * INCOME ... ;
          /* Probability for zero */
          if y = 0 then p = exp(xa) / (1 + exp(xa)) + (1 - exp(xa) / (1 + exp(xa))) * exp(-mu);
          /* Probability for Poisson after excluding zero */
          else p = (1 - exp(xa) / (1 + exp(xa))) * (exp(-mu) * mu ** y / fact(y));
          ll = log(p);
          model y ~ general(ll);
        run;
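
The ZIP likelihood differs from the hurdle likelihood only in how zeroes arise: some come from the risk-free group and the rest from the Poisson itself. A minimal Python sketch (not the deck's SAS) follows; `pi_zero` stands for exp(xa) / (1 + exp(xa)), `mu` for exp(xb), and the numeric values are hypothetical.

```python
import math

def zip_pmf(y, pi_zero, mu):
    """Zero-inflated Poisson pmf: extra point mass pi_zero at 0 plus a full Poisson."""
    poisson = math.exp(-mu) * mu**y / math.factorial(y)
    if y == 0:
        return pi_zero + (1 - pi_zero) * poisson
    return (1 - pi_zero) * poisson

def structural_zero_share(pi_zero, mu):
    """Among observed zeroes, the fraction that comes from the risk-free group."""
    return pi_zero / zip_pmf(0, pi_zero, mu)

# Hypothetical account: 70% chance of belonging to the risk-free group.
pi_zero, mu = 0.7, 1.0
probs = [zip_pmf(y, pi_zero, mu) for y in range(60)]
mean = sum(y * p for y, p in enumerate(probs))

print(f"P(Y = 0) = {probs[0]:.4f}")
print(f"expected delinquencies = {mean:.4f}")
print(f"share of zeroes that are structural = {structural_zero_share(pi_zero, mu):.4f}")
```

The last quantity is what separates two accounts with the same observed outcome of zero into different risk groups.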
  26. 27. ZIP Output (labels: Drivers for Existence of Risk; Drivers for Severity of Risk)
  27. 28. ZIP Portfolio Prediction (labels: Un-normalized Poisson Distribution; Composite Distribution)
  28. 29. ZIP Segmentation: same outcome but different risk implications
      1. Blue (72%): Established, free from financial risk
      2. Red (8%): Vulnerable, might deteriorate in bad times
  29. 30. ZIP Account Prediction
  30. 31. Latent Class Poisson Model
      - General S-Component Assumption for S ≥ 2:
        - Avoid sharp dichotomization
        - Each case drawn from an unobserved Poisson component with different parameter
        - S is determined by AIC / BIC
      - Segmentation assumed S = 2:
        - Group with low risk
        - Group with high risk
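
Choosing S by AIC / BIC amounts to fitting the model for several values of S and keeping the one with the smallest criterion. The sketch below, in Python rather than the deck's SAS, uses the standard definitions AIC = 2k - 2·logL and BIC = k·ln(n) - 2·logL. The log-likelihoods, parameter counts, and sample size are hypothetical placeholders, not results from the presentation.

```python
import math

def aic(log_lik, n_params):
    """Akaike information criterion; smaller is better."""
    return 2 * n_params - 2 * log_lik

def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion; smaller is better."""
    return n_params * math.log(n_obs) - 2 * log_lik

# HYPOTHETICAL fits, keyed by number of components S:
# (maximized log-likelihood, number of estimated parameters).
fits = {1: (-1200.0, 5), 2: (-1150.0, 11), 3: (-1147.0, 17)}
n_obs = 1000  # hypothetical sample size

best_by_aic = min(fits, key=lambda s: aic(*fits[s]))
best_by_bic = min(fits, key=lambda s: bic(*fits[s], n_obs))

for s, (ll, k) in fits.items():
    print(f"S={s}: AIC={aic(ll, k):.1f}, BIC={bic(ll, k, n_obs):.1f}")
print(f"selected S: AIC -> {best_by_aic}, BIC -> {best_by_bic}")
```

BIC penalizes each extra component more heavily than AIC once n exceeds about 7 observations, so the two criteria can disagree on larger S.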
  31. 32. LCP Model in SAS
        proc nlmixed data = data;
          params a0 = 0 ... b0 = 1 ...
                 prior1 = 0 to 1 by 0.1;
          /* Probability of LC component 1 */
          xa = a0 + a1 * INCOME ... ; ma = exp(xa);
          pa = exp(-ma) * ma ** y / fact(y);
          /* Probability of LC component 2 */
          xb = b0 + b1 * INCOME ... ; mb = exp(xb);
          pb = exp(-mb) * mb ** y / fact(y);
          p = prior1 * pa + (1 - prior1) * pb;
          ll = log(p);
          /* model statement not in the transcript; assumed to match the Hurdle and ZIP slides */
          model y ~ general(ll);
        run;
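
The mixture probability `p = prior1 * pa + (1 - prior1) * pb` also yields a segmentation rule through Bayes' theorem: the posterior probability that an account belongs to component 1 given its observed count. The following Python sketch (not the deck's SAS) illustrates this for S = 2; the function `posterior_low` and all numeric values are hypothetical additions for illustration.

```python
import math

def poisson_pmf(y, mu):
    """P(Y = y) for Y ~ Poisson(mu)."""
    return math.exp(-mu) * mu**y / math.factorial(y)

def lcp_pmf(y, prior_low, mu_low, mu_high):
    """Two-component latent class Poisson pmf."""
    return prior_low * poisson_pmf(y, mu_low) + (1 - prior_low) * poisson_pmf(y, mu_high)

def posterior_low(y, prior_low, mu_low, mu_high):
    """Posterior probability of the low-risk class given the observed count y."""
    return prior_low * poisson_pmf(y, mu_low) / lcp_pmf(y, prior_low, mu_low, mu_high)

# Hypothetical components: 85% low risk (mean 0.1), 15% high risk (mean 2.0).
prior_low, mu_low, mu_high = 0.85, 0.1, 2.0

total = sum(lcp_pmf(y, prior_low, mu_low, mu_high) for y in range(60))
for y in range(4):
    share = posterior_low(y, prior_low, mu_low, mu_high)
    print(f"y={y}: P(low risk | y) = {share:.4f}")
```

Unlike the hurdle split, an account with zero delinquencies is not assigned to the low-risk class with certainty; it only receives a high posterior probability.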
  32. 33. LCP Output (labels: Drivers for Low Risk; Drivers for High Risk)
  33. 34. LCP Portfolio Prediction (labels: Poisson Distribution of High Mean; Composite Distribution; Poisson Distribution of Low Mean)
  34. 35. LCP Segmentation
  35. 36. LCP Account Prediction: ~5% benefit in the high-risk zone
  36. 37. Parameter Comparison: in Hurdle / ZIP, the 1st set of betas explains why an account is delinquent and the 2nd set explains how many delinquencies there will be.
  37. 38. Prediction Comparison
      - Overall, NB model fits the best
      - Hurdle / ZIP works better in excess zeroes
      - In cherry-picking, all are comparable to Logistic regression
      - Implied Models: Hurdle NB / Zero-Inflated NB / Latent Class NB?
  38. 39. Model Comparison
      - Statistical Consideration:
        - Better statistics, more parsimonious => NB
      - Business Consideration:
        - Better interpretation, more insight => Hurdle / ZIP / LCP
