# Linear Probability Models and Big Data: Prediction, Inference and Selection Bias

We compare the LPM with logit and probit under three study goals: inference, prediction, and selection bias.


1. Linear Probability Models and Big Data: Prediction, Inference and Selection Bias. Suneel Chatla and Galit Shmueli, Institute of Service Science, National Tsing Hua University, Taiwan.
2. Outline:
   - Introduction to binary outcome models
   - Motivation: rare use of LPM
   - Study goals
     - Estimation and inference
     - Classification
     - Selection bias
   - Simulation study
   - eBay data (in paper)
   - Conclusions
4. Binary outcome models, $Z \in \{0,1\}$, with $E[Z|x_1,\dots,x_p] = \Pr(Z = 1|x_1,\dots,x_p) \triangleq p$:
   - Logit: $\log\frac{p}{1-p} = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p$
   - Probit: $\Phi^{-1}(p) = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p$, where $\Phi$ is the standard normal CDF
   - LPM (OLS regression): $p = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p$
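The three models above differ most visibly in their fitted probabilities. A minimal sketch (not from the slides; all data and settings here are assumptions) fits the LPM by OLS and the logit model by Newton-Raphson on simulated data, and shows that only the LPM can predict outside [0,1]:

```python
import numpy as np

# Illustrative sketch: simulate a binary outcome with a logistic link,
# then fit the LPM (plain OLS on the 0/1 outcome) and the logit model.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
p_true = 1 / (1 + np.exp(-(0.5 + 1.0 * x)))    # true logit model
z = rng.binomial(1, p_true)
X = np.column_stack([np.ones(n), x])

# LPM: ordinary least squares on the binary outcome.
beta_lpm = np.linalg.lstsq(X, z, rcond=None)[0]
p_lpm = X @ beta_lpm

# Logit: Newton-Raphson iterations on the log-likelihood.
beta = np.zeros(2)
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ beta))
    H = (X * (p * (1 - p))[:, None]).T @ X     # negative Hessian
    beta += np.linalg.solve(H, X.T @ (z - p))  # Newton step
p_logit = 1 / (1 + np.exp(-X @ beta))

print(p_logit.min(), p_logit.max())  # always strictly inside (0, 1)
print(p_lpm.min(), p_lpm.max())      # LPM predictions can leave [0, 1]
```

This is the "unbounded predictions" criticism of the LPM in concrete form: the sigmoid link constrains logit fitted values, while the linear fit does not.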
5. The purpose of binary-outcome regression models: inference and estimation, prediction (classification), and selection bias.
6. Summary of IS literature (MISQ, JAIS, ISR and MS, 2000-2016):
   - Inference and estimation: 60
   - Selection bias: 31
   - Classification and prediction: 5

   Only 8 papers used the LPM; 3 are from this year alone.

   "Implementing a campaign fixed effects model with multinomial logit is challenging due to the incidental parameter problem, so we opt to employ LPM ..." (Burtch et al., 2016)

   "The LPM is simple for both estimation and inference. LPM is fast and it allows for a reasonably accurate approximation of true preferences." (Schlereth & Skiera, 2016)
7. Statisticians don't like LPM. Econometricians love LPM. Researchers rarely use LPM. Why?
8. Criticisms: comparison of the three models in terms of their theoretical properties.

   | Model  | Non-normal error | Non-constant error variance | Unbounded predictions | Functional form |
   |--------|------------------|-----------------------------|-----------------------|-----------------|
   | Logit  | ✔                | ✔                           | ✔                     | ✔✖              |
   | Probit | ✔                | ✔                           | ✔                     | ✔✖              |
   | LPM    | ✖                | ✖                           | ✖                     | ✖               |
9. Advantages: comparison in terms of practical issues.

   | Model  | Convergence issues | Incidental parameters | Easier interpretation | Computational speed |
   |--------|--------------------|-----------------------|-----------------------|---------------------|
   | Logit  | ✖                  | ✖                     | ✔                     | ✔✖                  |
   | Probit | ✖                  | ✖                     | ✖                     | ✔✖                  |
   | LPM    | ✔                  | ✔                     | ✔                     | ✔                   |
10. The questions that matter to researchers: how do logit, probit, and the LPM compare for inference & estimation, classification, and selection bias?
11. Inference and estimation: consistency and marginal effects.
12. Inference and estimation: the latent framework. $Y_{n \times 1} = X_{n \times (p+1)} \beta_{(p+1) \times 1} + \varepsilon_{n \times 1}$, where $Y$ is a latent continuous variable (not observable) and $Z = 1$ if $Y > 0$, $Z = 0$ otherwise. The error distribution determines the model: $\text{logistic}(0,1)$ gives the logit model, $N(0,1)$ the probit model, and $U(0,1)$ the linear probability model.
13. Inference and estimation: consistency. The MLEs of both logit and probit are consistent: $\hat{\beta} \xrightarrow{p} \beta$. LPM estimates are proportionally and directionally consistent (Billinger, 2012): $\hat{\beta}_{lpm} \xrightarrow{p} k\beta$.
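Proportional consistency means LPM slopes converge to a common multiple of the true slopes, so ratios between coefficients are preserved. A small sketch under an assumed logistic latent model (data-generating choices here are illustrative, not from the paper):

```python
import numpy as np

# Simulate a logistic latent-variable model with the slide-16 coefficients,
# fit the LPM by least squares, and check that slope ratios survive even
# though the overall scale k differs from 1.
rng = np.random.default_rng(1)
n = 50000
X = rng.normal(size=(n, 4))
beta_true = np.array([1.0, -1.0, 0.5, -0.5])
z = (X @ beta_true + rng.logistic(size=n) > 0).astype(int)

Xc = np.column_stack([np.ones(n), X])
b_lpm = np.linalg.lstsq(Xc, z, rcond=None)[0]

ratio = b_lpm[1] / b_lpm[3]   # LPM slope for x1 over slope for x3
print(round(ratio, 2))        # close to the true ratio 1/0.5 = 2
```

The individual LPM slopes are shrunk toward zero by the factor $k$, but their relative sizes, and hence directional conclusions, match the true model.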
14. Inference and estimation: marginal effects for interpreting effect size.
   - LPM: ME for $x_{ik}$ is $\frac{\partial E[z_i]}{\partial x_k} = \beta_k$ (easy interpretation)
   - Logit: ME for $x_{ik}$ is $\frac{\partial E[z_i]}{\partial x_k} = \frac{e^{x_i\beta}}{(1+e^{x_i\beta})^2}\,\beta_k$ (no direct interpretation)
   - Probit: ME for $x_{ik}$ is $\frac{\partial E[z_i]}{\partial x_k} = \phi(x_i\beta)\,\beta_k$ (no direct interpretation)
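The three marginal-effect formulas above translate directly into code. A minimal sketch (helper names are my own):

```python
import numpy as np

# Marginal effect of x_k for each model, given the coefficient beta_k
# and the linear index xb = x_i @ beta.
def me_lpm(beta_k):
    return beta_k                                       # constant everywhere

def me_logit(xb, beta_k):
    return np.exp(xb) / (1 + np.exp(xb)) ** 2 * beta_k  # logistic density * beta_k

def me_probit(xb, beta_k):
    phi = np.exp(-xb ** 2 / 2) / np.sqrt(2 * np.pi)     # standard normal pdf
    return phi * beta_k

# At xb = 0 the logit ME is beta_k/4 and the probit ME is beta_k/sqrt(2*pi).
print(me_logit(0.0, 1.0))    # 0.25
print(me_probit(0.0, 1.0))   # ~0.3989
```

This makes the interpretation gap concrete: the LPM coefficient *is* the marginal effect, while logit and probit effects depend on where in covariate space they are evaluated.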
15. Inference and estimation: simulation study. Sample sizes {50, 500, 50,000}; error distributions {logistic, normal, uniform}; 100 bootstrap samples.
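The simulation design above can be sketched as follows. This is my reconstruction of the setup from the slides (the exact covariate design and uniform-error parameterization are assumptions):

```python
import numpy as np

# Latent-variable simulation: four standard-normal covariates, the true
# coefficients from slide 16, and one of the three error distributions.
rng = np.random.default_rng(1)

def simulate(n, error):
    X = rng.normal(size=(n, 4))
    beta = np.array([1.0, -1.0, 0.5, -0.5])
    if error == "logistic":
        eps = rng.logistic(size=n)
    elif error == "normal":
        eps = rng.normal(size=n)
    else:
        eps = rng.uniform(0, 1, size=n)
    z = (X @ beta + eps > 0).astype(int)   # observed binary outcome
    return X, z

for n in (50, 500, 50000):
    X, z = simulate(n, "logistic")
    print(n, z.mean())
```

In the full study each simulated dataset would be refit on 100 bootstrap samples to obtain the coefficient and marginal-effect distributions compared on the next slides.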
16. Inference and estimation: comparison of standard models ($k = 0.4$).

   | Coefficient | True | Logit | Probit | LPM   | LPM (rescaled) |
   |-------------|------|-------|--------|-------|----------------|
   | Intercept   | 0    | 0     | 0      | 0.5   |                |
   | $x_1$       | 1    | 0.99  | 1      | 0.47  | 1.02           |
   | $x_2$       | -1   | -1    | -1.01  | -0.43 | -1.07          |
   | $x_3$       | 0.5  | 0.5   | 0.5    | 0.21  | 0.52           |
   | $x_4$       | -0.5 | -0.5  | -0.5   | -0.21 | -0.52          |
17. Inference and estimation: comparison of significance. Coefficient significance and non-significance results are identical across the three models.
18. Inference and estimation: comparison of marginal effects. [Figure: distributions of bootstrap marginal-effect estimates ($\widehat{ME}$) for $x_1$-$x_4$ under logistic, normal, and uniform errors, at sample sizes 50, 500, and 50,000, for probit, logit, and LPM.] The distributions of marginal effects are identical across the three models.
19. Classification and prediction: predictions beyond [0,1].
20. Classification and prediction: is trimming appropriate? Out-of-range LPM predictions are trimmed: values above 1 are replaced with 0.99 or 0.999, and values below 0 with 0.001 or 0.0001, before comparing against logit predictions.
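The trimming step described above amounts to clipping the fitted values into the unit interval. A minimal sketch (the specific thresholds 0.001 and 0.999 are two of the options mentioned on the slide):

```python
import numpy as np

# Trim LPM predictions into (0, 1): out-of-range values are replaced
# with values just inside the interval.
def trim(p, lo=0.001, hi=0.999):
    return np.clip(p, lo, hi)

p_hat = np.array([-0.2, 0.4, 0.7, 1.3])
print(trim(p_hat))   # [0.001 0.4 0.7 0.999]
```

For classification or ranking the exact threshold barely matters, since only the ordering and the cutoff comparison are used; it matters only when the trimmed values are interpreted as probabilities.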
21. Classification and prediction: classification accuracies are identical.
22. Selection bias.
23. Selection bias: quasi-experiments are like randomized experimental designs that test causal hypotheses, but they lack random assignment. Treatment is either assigned by the experimenter or self-selected.
24. Selection bias: two-stage methods.
   - Heckman (1977), probit: Stage 1 selection model $E[T|X] = \Phi(X\gamma)$; adjustment $IMR = \frac{\phi(X\gamma)}{\Phi(X\gamma)}$; Stage 2 outcome model $Y = X\beta + \delta \, IMR + \varepsilon$.
   - Olsen (1980), LPM: Stage 1 selection model $E[T|X] = X\gamma$; adjustment $\lambda = X\gamma - 1$; Stage 2 outcome model $Y = X\beta + \delta\lambda + \varepsilon$. Olsen's adjustment is simpler.
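Both adjustments reduce to computing a correction term from the first-stage index and adding it as a regressor in the outcome equation. A sketch on simulated data (the coefficients, $\delta = 0.5$, and the data-generating choices are all assumptions made for illustration):

```python
import numpy as np
from scipy.stats import norm

# Simulate covariates and a first-stage index X @ gamma.
rng = np.random.default_rng(2)
n = 20000
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
gamma = np.array([0.2, 1.0, -0.5])
xg = X @ gamma

# Heckman (1977): probit selection model, inverse Mills ratio adjustment.
imr = norm.pdf(xg) / norm.cdf(xg)

# Olsen (1980): LPM selection model, linear adjustment term.
lam = xg - 1.0

# Stage 2: outcome regression with the adjustment as an extra regressor,
# Y = X @ beta + delta * adjustment + eps (here using the Heckman IMR).
y = X @ np.array([1.0, 2.0, -1.0]) + 0.5 * imr + rng.normal(size=n)
Xh = np.column_stack([X, imr])
beta_hat = np.linalg.lstsq(Xh, y, rcond=None)[0]
print(beta_hat.round(2))   # last coefficient estimates delta, near 0.5
```

The multicollinearity warning on the concluding slide shows up here too: when the selection and outcome models share all predictors, the adjustment term is nearly a function of the included regressors, which inflates the standard errors of the second stage, especially for Olsen's exactly linear $\lambda$.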
25. Selection bias: outcome model coefficients (bootstrap). Both Heckman's and Olsen's methods perform similarly to the MLE.
26. Bottom line.
   - Inference and estimation: use the LPM with large samples; otherwise logit/probit is preferable. With small-sample LPM, use robust standard errors.
   - Classification: use the LPM if the goal is classification or ranking, and trim predicted probabilities. If probabilities themselves are needed, logit/probit is preferable.
   - Selection bias: use the LPM if the sample is large. If both the selection and outcome models have the same predictors, the LPM suffers from multicollinearity.
27. Thank you! Suneel Chatla and Galit Shmueli (2016), "An Extensive Examination of Linear Regression Models with a Binary Outcome Variable," Journal of the Association for Information Systems (accepted).