Stepwise Logistic Regression - Lecture for Students /Faculty of Mathematics and Informatics/
Published in: Technology, Economy & Finance
Transcript

  • 1. © Experian Limited 2007. All rights reserved. Experian and the marks used herein are service marks or registered trademarks of Experian Limited. Other product and company names mentioned herein may be the trademarks of their respective owners. No part of this copyrighted work may be reproduced, modified, or distributed in any form or manner without the prior written permission of Experian Limited. Confidential and proprietary. Stepwise Logistic Regression Lecture for FMI Students 27.05.2010 Alexander Efremov
  • 2. Agenda
      • Introduction
          • Applications of the Logistic Regression
          • System Identification & Stepwise Regression
      • Part I. Logistic Regression Model Development
          • Logistic Model
          • Maximum Likelihood Estimator
          • Potential Problems
          • Model Analysis and Validation
      • Part II. Stepwise Logistic Regression (SWR)
          • Basic Idea
          • SWR Algorithm
          • Potential Problems
      • Summary
  • 4. Introduction: Applications of the Logistic Regression
      • Medicine – diagnostics, modeling of disease growth, treatment effect
      • Psychology – learning process modeling, evaluation of psychological tests
      • Economics – risk analysis, country debt investigation, occupational choices
      • Marketing – product consumption, effect of retailers' actions
      • Criminology – risk factors for committing a criminal act
      • Sociology – employment, graduation, vote analysis
      • Ecology – modeling population growth
      • Linguistics – language changes
      • Chemistry – reaction models
      • Media – news effects, copycat reaction
      • Finance – credit scoring, fraud detection
      • Physics, Biology, etc.
      • [Diagram: the application areas arranged around "The Logistic Model"]
  • 5. Introduction: System Under Investigation. [Diagram: Individuals /raw data/ => System => Model]
  • 6. Introduction: System Identification Stages [figure]
  • 9. Part I. Logistic Regression Model Development: Logistic Model. [Figure panels: linear relation; logistic relation]
  • 10. Part I. Logistic Regression Model Development: Logistic Model. Logistic Relation – General Form
      • Notation: $k$ – index of the current individual, $k = 1, \dots, N$; $N$ – number of observations; $y_k$ – dependent variable /prob. of good/; $\hat{y}_k$ – model output /predicted prob. of good/; $x_{i,k}$ – the i-th independent variable, $i = 1, \dots, n$; $\theta_0$ – intercept; $\theta_i$ – the (i+1)-th model parameter
      • General form: $\hat{y}_k = \frac{e^{M_k}}{1 + e^{M_k}} = \frac{1}{1 + e^{-M_k}}$
      • "Linear" logistic regression model: $M_k = \theta_0 + \theta_1 x_{1,k} + \dots + \theta_n x_{n,k}$, so that $\hat{y}_k = \frac{1}{1 + e^{-(\theta_0 + \theta_1 x_{1,k} + \dots + \theta_n x_{n,k})}}$ and $\ln \frac{\hat{y}_k}{1 - \hat{y}_k} = \theta_0 + \theta_1 x_{1,k} + \dots + \theta_n x_{n,k}$
  • 11. Part I. Logistic Regression Model Development: Logistic Model. Notation
      • Parameters vector: $\theta = [\theta_0\ \theta_1\ \dots\ \theta_n]^T \in \mathbb{R}^{n+1}$
      • Regression vector: $\varphi_k = [1\ x_{1,k}\ \dots\ x_{n,k}]^T \in \mathbb{R}^{n+1}$
      • Logistic model: $\hat{y}_k = \frac{1}{1 + e^{-(\theta_0 + \theta_1 x_{1,k} + \dots + \theta_n x_{n,k})}} = \frac{1}{1 + e^{-\varphi_k^T \theta}}$
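The vector form of the logistic model on slide 11 can be sketched in a few lines of Python. This is only an illustration: the function names `logistic_output` and `log_odds` are hypothetical and do not come from the lecture.

```python
import math

def logistic_output(theta, x):
    """Model output y_hat = 1 / (1 + exp(-phi^T theta)).

    theta -- parameters [theta_0, theta_1, ..., theta_n]
    x     -- independent variables [x_1, ..., x_n] of one individual
    """
    phi = [1.0] + list(x)                        # regression vector, leading 1 for the intercept
    m = sum(t * p for t, p in zip(theta, phi))   # linear predictor M_k = phi^T theta
    return 1.0 / (1.0 + math.exp(-m))

def log_odds(y_hat):
    """Inverse relation: ln(y_hat / (1 - y_hat)) recovers the linear predictor."""
    return math.log(y_hat / (1.0 - y_hat))
```

Applying `log_odds` to the result of `logistic_output` returns the linear predictor, which is the "linear" relation from slide 10. For very large negative predictors `math.exp` can overflow; a production version would need to guard against that.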
  • 12. Part I. Logistic Regression Model Development: Residual
      • The residual: $y_k = \frac{1}{1 + e^{-\varphi_k^T \theta}} + e_k = \hat{y}_k + e_k$, so $e_k = y_k - \hat{y}_k$, which equals $1 - \hat{y}_k$ for $y_k = 1$ and $-\hat{y}_k$ for $y_k = 0$
      • Sources of uncertainty: unavailable significant factors; simplified relations; time-varying performance; database errors; fraud
  • 14. Part I. Logistic Regression Model Development: Maximum Likelihood Estimator. Cost Function
      • Model output: $\hat{y}_k = P(y_k = 1 \mid \varphi_k)$
      • Likelihood contribution: $l_{\theta,k} = \hat{y}_k^{\,y_k} (1 - \hat{y}_k)^{1 - y_k}$
      • Likelihood function: $L_\theta = \prod_{k=1}^{N} l_{\theta,k}$
      • Log-likelihood function: $\ln L_\theta = \sum_{k=1}^{N} \left( y_k \ln \hat{y}_k + (1 - y_k) \ln(1 - \hat{y}_k) \right)$
      • Maximum likelihood criterion: $\max_\theta \ln L_\theta \Leftrightarrow \min_\theta\, (-2 \ln L_\theta)$
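As a small illustration of the cost function on slide 14, the sketch below evaluates $-2 \ln L$ for given outcomes and predicted probabilities. The function name `neg2_log_likelihood` is an assumption made for this example.

```python
import math

def neg2_log_likelihood(y, y_hat):
    """Return -2 ln L for binary outcomes y (0 or 1) and predicted probabilities y_hat."""
    log_l = 0.0
    for yk, pk in zip(y, y_hat):
        # ln of the likelihood contribution y_hat^y * (1 - y_hat)^(1 - y)
        log_l += yk * math.log(pk) + (1 - yk) * math.log(1.0 - pk)
    return -2.0 * log_l
```

Predictions closer to the observed outcomes give a smaller value, which is why minimising $-2 \ln L$ is equivalent to maximising the likelihood. Probabilities of exactly 0 or 1 are not handled here.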
  • 15. Part I. Logistic Regression Model Development: Maximum Likelihood Estimator. [Figure: cost function /-2 Log L/ for a real-life case]
  • 16. Part I. Logistic Regression Model Development: Maximum Likelihood Estimator. Taylor Series Expansion
      • Estimates update: $\hat\theta^{(i+1)} = \hat\theta^{(i)} + \Delta\theta^{(i)}$, with $\Delta\theta^{(i)} = ?$
      • Cost function: $f_{\hat\theta^{(i)}} = -\ln L_{\hat\theta^{(i)}}$; gradient: $g^{(i)} = \nabla^T f_{\hat\theta^{(i)}}$; Hessian: $H^{(i)} = \nabla^2 f_{\hat\theta^{(i)}}$
      • Expansion: $f_{\hat\theta^{(i)} + \Delta\theta^{(i)}} = f_{\hat\theta^{(i)}} + g^{(i)T} \Delta\theta^{(i)} + \frac{1}{2} \Delta\theta^{(i)T} H^{(i)} \Delta\theta^{(i)} + O_3$
      • Cost function models: linear model $M^{(i)}(\Delta\theta^{(i)}) = f_{\hat\theta^{(i)}} + g^{(i)T} \Delta\theta^{(i)}$; quadratic model $M^{(i)}(\Delta\theta^{(i)}) = f_{\hat\theta^{(i)}} + g^{(i)T} \Delta\theta^{(i)} + \frac{1}{2} \Delta\theta^{(i)T} H^{(i)} \Delta\theta^{(i)}$
  • 17. Part I. Logistic Regression Model Development: Maximum Likelihood Estimator
      • Gradient: $g = \left[ \frac{\partial f}{\partial \theta_0}\ \frac{\partial f}{\partial \theta_1}\ \dots\ \frac{\partial f}{\partial \theta_n} \right]^T \in \mathbb{R}^{n+1}$
      • Hessian: $H = \left[ \frac{\partial^2 f}{\partial \theta_i \partial \theta_j} \right]_{i,j = 0, \dots, n} \in \mathbb{R}^{(n+1) \times (n+1)}$
      • First-order methods /e.g. Steepest Descent/: $\Delta\theta = -\alpha g$
      • Second-order methods /e.g. Newton-Raphson/: $\Delta\theta = -\alpha H^{-1} g$
      • [Figures: iteration paths for each method from $\theta^{(0)}$ through steps 1 and 2, with $\theta^*$ and $\theta_{opt}$ marked]
  • 18. Part I. Logistic Regression Model Development: Maximum Likelihood Estimator
      • Steepest Descent: $\Delta\theta = -\alpha g$
      • Newton-Raphson (NR): $\Delta\theta = -\alpha H^{-1} g$
      • NR with Line Search: $\Delta\theta = -\alpha^* H^{-1} g$
      • NR with Quadratic Interpolation: $\Delta\theta = -\alpha^* H^{-1} g$
      • [Figures: one iteration path per method from $\theta^{(0)}$ through steps 1 and 2, with $\theta^*$ and $\theta_{opt}$ marked]
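The Newton-Raphson update $\Delta\theta = -\alpha H^{-1} g$ can be made concrete for the smallest non-trivial case, a model with an intercept and one independent variable, where the 2×2 Hessian can be inverted in closed form. This is a minimal sketch under that assumption; the names `predict` and `fit_one_factor`, the fixed step length `alpha`, and the stopping rule are illustrative choices, not taken from the slides. It uses the standard gradient $\sum_k (\hat{y}_k - y_k)\varphi_k$ and Hessian $\sum_k \hat{y}_k (1 - \hat{y}_k) \varphi_k \varphi_k^T$ of $f = -\ln L$.

```python
import math

def predict(t0, t1, xk):
    """Logistic model output for one individual with a single variable xk."""
    return 1.0 / (1.0 + math.exp(-(t0 + t1 * xk)))

def fit_one_factor(x, y, alpha=1.0, max_iter=25, tol=1e-10):
    """Newton-Raphson estimate of (theta_0, theta_1) minimising f = -ln L."""
    t0, t1 = 0.0, 0.0
    for _ in range(max_iter):
        p = [predict(t0, t1, xk) for xk in x]
        # Gradient of -ln L: sum over k of (y_hat_k - y_k) * phi_k, phi_k = [1, x_k]
        g0 = sum(pk - yk for pk, yk in zip(p, y))
        g1 = sum((pk - yk) * xk for pk, yk, xk in zip(p, y, x))
        # Hessian of -ln L: sum over k of y_hat_k (1 - y_hat_k) * phi_k phi_k^T
        w = [pk * (1.0 - pk) for pk in p]
        h00 = sum(w)
        h01 = sum(wk * xk for wk, xk in zip(w, x))
        h11 = sum(wk * xk * xk for wk, xk in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Step delta = -alpha * H^{-1} g, using the closed-form 2x2 inverse
        d0 = -alpha * (h11 * g0 - h01 * g1) / det
        d1 = -alpha * (-h01 * g0 + h00 * g1) / det
        t0, t1 = t0 + d0, t1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return t0, t1
```

With `alpha=1.0` this is the plain Newton-Raphson step; the line-search and quadratic-interpolation variants on slide 18 would instead choose the step length $\alpha^*$ at every iteration. The sketch assumes the data are not perfectly separable, otherwise the estimates diverge and the Hessian becomes numerically singular.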
  • 20. Part I. Logistic Regression Model Development: Potential Problems
      • Numerical problems: the update $\hat\theta^{(i+1)} = \hat\theta^{(i)} - \alpha H^{-1} g$ requires a matrix inversion, hence SVD, EVD, QR, etc.
      • Local minima [figure: $-2 \ln L$]
      • Model overfitting [figure: $y_k$, $y_{1,k}$ and $y_{2,k}$ versus $k$]
  • 22. Part I. Logistic Regression Model Development: Frequently Used Statistics for Model Analysis
      • Individual estimate measures:
          • Standard error: $\sigma_{\hat\theta_i} = \sqrt{[\mathrm{diag}(H^{-1})]_i}$, with $H$ the Hessian of $-\ln L$ at $\hat\theta$
          • Wald statistic: $W_i = \hat\theta_i^2 / \sigma_{\hat\theta_i}^2 \sim \chi_1^2$
          • p-value: $\Pr(\chi_1^2 > W_i)$ [figure: $\chi^2$ density with $W_i$ and the p-value area marked]
      • Overall model measures:
          • Coefficient of determination ($R^2$): generalized $R^2 = 1 - e^{\frac{2}{N} (\ln L_0 - \ln L_{\hat\theta})}$, where $L_0$ is the likelihood of the intercept-only model; its maximum $R_s^2 = 1 - e^{\frac{2}{N} \ln L_0}$; gen. max. rescaled $R_m^2 = R^2 / R_s^2$
          • Cost function: $f_{\hat\theta^{(i)}} = -2 \ln L_{\hat\theta^{(i)}}$
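A short sketch of the Wald statistic and its p-value for a single estimate follows. It relies on the identity $\Pr(\chi_1^2 > W) = \operatorname{erfc}(\sqrt{W/2})$, which holds because a $\chi_1^2$ variable is the square of a standard normal one, so no statistics library is needed. The function name `wald_test` is hypothetical.

```python
import math

def wald_test(theta_hat, std_err):
    """Return (W, p_value) for one estimate and its standard error."""
    w = (theta_hat / std_err) ** 2            # Wald statistic, chi-square with 1 d.o.f.
    p_value = math.erfc(math.sqrt(w / 2.0))   # Pr(chi2_1 > W)
    return w, p_value
```

A small p-value indicates that the estimate differs significantly from zero; the stepwise procedure in Part II compares such p-values against the SLE and SLS thresholds.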
  • 23. Part I. Logistic Regression Model Development: Frequently Used Statistics for Model Analysis
      • Modified criteria:
          • Akaike Information Criterion (AIC): $AIC_{\hat\theta} = -2 \ln L_{\hat\theta} + 2n$
          • Schwarz Criterion (SC): $SC_{\hat\theta} = -2 \ln L_{\hat\theta} + n \ln(N - 1)$
          • Minimum Description Length (MDL), Final Prediction Error (FPE), etc.
          • [Figure: AIC and $-2 \ln L$]
      • Model validation: data split into development and validation samples
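The two modified criteria can be computed directly from $-2 \ln L$. The sketch below follows the formulas as they appear on the slide, including the $\ln(N - 1)$ term in SC (many textbooks use $\ln N$ instead). The function names and argument names are assumptions for illustration.

```python
import math

def aic(neg2_log_l, n_params):
    """Akaike Information Criterion: -2 ln L + 2 n."""
    return neg2_log_l + 2 * n_params

def schwarz_criterion(neg2_log_l, n_params, n_obs):
    """Schwarz Criterion as written on the slide: -2 ln L + n ln(N - 1)."""
    return neg2_log_l + n_params * math.log(n_obs - 1)
```

Both criteria add a penalty that grows with the number of parameters, so a larger model is preferred only if it lowers $-2 \ln L$ by more than the penalty.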
  • 26. Part II. Stepwise Logistic Regression: Basic Idea
      • $x_o$, $x_e$ – sets of all variables outside / entered in the model
      • $x_{oi}$, $x_{ei}$ – the most significant variable in $x_o$ / the least significant variable in $x_e$
      • SLE – Significance Level to Enter
      • SLS – Significance Level to Stay
      • [Figure: SWR]
  • 27. Part II. Stepwise Logistic Regression: Basic Idea. [Figure: available information]
  • 28. Part II. Stepwise Logistic Regression: Basic Idea. [Figure: step 1, Initialization]
  • 29. Part II. Stepwise Logistic Regression: Basic Idea. [Figure: Forward Selection, steps 1 and 2]
  • 30. Part II. Stepwise Logistic Regression: Basic Idea. [Figure: Forward Selection, steps 1, 2 and 3]
  • 31. Part II. Stepwise Logistic Regression: Basic Idea. [Figure: Backward Elimination, steps 2 and 3]
  • 33. Part II. Stepwise Logistic Regression: Step 0. Initialization
      • Logistic model: $\hat{y}_k = \frac{1}{1 + e^{-\varphi_k^T \theta}}$
      • 1. Intercept model: $\varphi_k = 1$, $\theta \in \mathbb{R}$
      • 2. Full model: $\varphi_k = [1\ x_{1,k}\ \dots\ x_{n,k}]^T$, $\theta \in \mathbb{R}^{n+1}$
      • 3. One-factor model:
          • Check for enter: score chi-square for all potential models, $S_i = g_i^T H_i^{-1} g_i$; maximum score chi-square, $\ell_1 = \arg\max_i S_i$; p-value & threshold, $\text{p-value}_{\ell_1} < SLE$
          • Model determination (optimization): $\varphi_k = [1\ x_{\ell_1,k}]^T$, $\theta \in \mathbb{R}^2$
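The "check for enter" in Step 0 can be sketched for the transition from the intercept model to a one-factor model. In that case the intercept-only fit predicts the mean of $y$ for every individual, the intercept component of the gradient is zero, and $S_i = g_i^T H_i^{-1} g_i$ reduces to a closed form. The helper names `score_chi_square` and `best_candidate`, the dictionary of candidates, and the default SLE of 0.05 are assumptions made for this example.

```python
import math

def score_chi_square(x, y):
    """Score statistic S = g^T H^{-1} g for adding variable x to the intercept model."""
    n_obs = len(y)
    p = sum(y) / n_obs                                # intercept-only fit: y_hat_k = mean(y)
    # Gradient of -ln L w.r.t. the new parameter (the intercept component is 0)
    g1 = sum((p - yk) * xk for xk, yk in zip(x, y))
    sx = sum(x)
    sxx = sum(xk * xk for xk in x)
    # Element of H^{-1} for the new parameter, H = p(1-p) * [[N, sx], [sx, sxx]]
    h_inv_11 = n_obs / (p * (1.0 - p) * (n_obs * sxx - sx * sx))
    return g1 * g1 * h_inv_11

def best_candidate(candidates, y, sle=0.05):
    """Pick the variable with the maximum score chi-square and test it against SLE."""
    scores = {name: score_chi_square(x, y) for name, x in candidates.items()}
    best = max(scores, key=scores.get)
    p_value = math.erfc(math.sqrt(scores[best] / 2.0))   # Pr(chi2_1 > S)
    return best, scores[best], p_value, p_value < sle
```

The last returned value says whether the best candidate would enter the model. The sketch does not handle a constant variable or an outcome that is all 0 or all 1, where the Hessian is singular.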
  • 34. Part II. Stepwise Logistic Regression: Step 1. Forward Selection
      • 1. Check for enter: score chi-square of all potential models, $S_i = g_i^T H_i^{-1} g_i$; maximum score chi-square, $\ell_i = \arg\max_i S_i$; p-value & threshold, $\text{p-value}_{\ell_i} < SLE$
      • 2. Model determination (optimization): $\varphi_k = [1\ x_{\ell_1,k}\ \dots\ x_{\ell_i,k}]^T$, $\theta \in \mathbb{R}^{i+1}$
      • 3. Statistics for model analysis: individual estimate measures (standard error, Wald statistic & p-value)
  • 35. Part II. Stepwise Logistic Regression: Step 1. Forward Selection
      • 3. Statistics for model analysis (part 2): overall model measures (coefficients of determination, cost function); modified criteria (Akaike Information Criterion (AIC), Schwarz Criterion (SC))
  • 36. Part II. Stepwise Logistic Regression: Stepwise Logistic Regression. [Figure: SWR]
  • 37. Part II. Stepwise Logistic Regression: Step 2. Backward Elimination
      • 1. Check for leave: Wald statistic & p-value of all potential models; p-value & threshold, $\max \text{p-value}_{\ell_i} > SLS$
      • 2. Model determination (optimization): $\varphi_k = [1\ x_{\ell_1,k}\ \dots\ x_{\ell_{j-1},k}\ x_{\ell_{j+1},k}\ \dots\ x_{\ell_i,k}]^T$, $\theta \in \mathbb{R}^{i}$
      • 3. Statistics for model analysis: individual estimate measures (standard error, Wald statistic & p-value)
  • 38. Part II. Stepwise Logistic Regression: Step 2. Backward Elimination
      • 3. Statistics for model analysis (part 2): overall model measures (coefficients of determination, cost function); modified criteria (Akaike Information Criterion (AIC), Schwarz Criterion (SC))
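The "check for leave" of the backward elimination step can be sketched as follows: compute the Wald p-value of every entered variable and remove the one with the largest p-value if it exceeds the stay threshold. The function name `variable_to_remove`, the dictionary inputs, and the default threshold of 0.05 are assumptions for illustration; the intercept is assumed to be excluded from the inputs because it always stays in the model.

```python
import math

def variable_to_remove(estimates, std_errors, sls=0.05):
    """Return the entered variable that should leave the model, or None.

    estimates, std_errors -- dicts keyed by variable name (intercept excluded)
    sls                   -- significance level to stay
    """
    p_values = {}
    for name, theta_hat in estimates.items():
        w = (theta_hat / std_errors[name]) ** 2           # Wald statistic
        p_values[name] = math.erfc(math.sqrt(w / 2.0))    # Pr(chi2_1 > W)
    worst = max(p_values, key=p_values.get)               # least significant variable
    return worst if p_values[worst] > sls else None
```

After a removal the reduced model is re-estimated (model determination) and the statistics are recomputed before the next forward or backward check.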
  • 40. Part II. Stepwise Logistic Regression: Potential Problems in the Stepwise Regression
      • Local minima & initial conditions
      • Numerical problems /SVD, EVD, QR, etc./
      • Model overfitting
  • 41. Summary
      • Introduction
          • Applications of the Logistic Regression
          • System Identification & Stepwise Regression
      • Part I. Logistic Regression Model Development
          • Logistic Model
          • Maximum Likelihood Estimator
          • Potential Problems
          • Model Analysis and Validation
      • Part II. Stepwise Logistic Regression (SWR)
          • Basic Idea
          • SWR Algorithm
          • Potential Problems
      • Summary
  • 42. Stepwise Logistic Regression. Lecture for FMI Students, 27.05.2010, Alexander Efremov. Thank You! http://anp.tu-sofia.bg/aefremov/index.htm
