# Prediction of Credit Default by Continuous Optimization

AACIMP 2009 Summer School lecture by Gerhard Wilhelm Weber. "Modern Operational Research and Its Mathematical Methods" course.


### Prediction of Credit Default by Continuous Optimization

1. 4th International Summer School "Achievements and Applications of Contemporary Informatics, Mathematics and Physics", National University of Technology of the Ukraine, Kiev, Ukraine, August 5-16, 2009. Prediction of Credit Default by Continuous Optimization. Gerhard-Wilhelm Weber*, Efsun Kürüm, Kasırga Yıldırak. Institute of Applied Mathematics, Middle East Technical University, Ankara, Turkey. *Also: Faculty of Economics, Management and Law, University of Siegen, Germany; Center for Research on Optimization and Control, University of Aveiro, Portugal.
2. Outline
   - Main Problem from Credit Default
   - Logistic Regression and Performance Evaluation
   - Cut-Off Values and Thresholds
   - Classification and Optimization
   - Nonlinear Regression
   - Numerical Results
   - Outlook and Conclusion
3. Main Problem from Credit Default: should a credit application be approved or rejected? Solution: learn the default probability of the applicant.
5. Logistic Regression
   $$\log\frac{P(Y=1 \mid X=x_l)}{P(Y=0 \mid X=x_l)} = \beta_0 + \beta_1 x_{l1} + \beta_2 x_{l2} + \dots + \beta_p x_{lp} \qquad (l = 1, 2, \dots, N)$$
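Once the coefficients are fitted, the model turns a firm's covariate vector into a default probability. A minimal sketch in Python; the coefficient and ratio values below are made up for illustration, not taken from the study:

```python
import math

def default_probability(beta0, beta, x):
    """P(Y=1 | X=x) under the logistic model: the log-odds are linear in x."""
    log_odds = beta0 + sum(b * xi for b, xi in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical coefficients and financial ratios, for illustration only.
p = default_probability(-2.0, [1.5, -0.8], [0.4, 0.3])
```

By construction, `log(p / (1 - p))` recovers the linear predictor β₀ + βᵀx.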
6. Goal. Our study is based on one of the Basel II criteria, which recommends that a bank divide corporate firms into 8 rating degrees, one of them being the default class. We have two problems to solve here: to distinguish the defaults from the non-defaults, and to put the non-default firms in an order based on their credit quality and classify them into (sub)classes.
7. Data. The data have been collected by a bank from firms operating in the manufacturing sector in Turkey and cover the period between 2001 and 2006. Originally there are 54 qualitative and 36 quantitative variables. The quantitative variables are formed from the balance sheets submitted by the firms' accountants; essentially, they are the well-known financial ratios. The data set covers 3150 firms, of which 92 are in the state of default. As the number of defaults is small, in order to avoid possible statistical problems we downsize the sample to 551 firms, keeping all the default cases in the set.
8. We evaluate the performance of the model. [Figure: overlapping score distributions of the non-default and default cases with a cut-off value on the test-result axis, and the resulting ROC curve plotting TPF (sensitivity) against FPF (1 − specificity).]
9. Model outcome versus truth:

   | model outcome \ truth | d (default) | n (non-default) |
   |---|---|---|
   | d′ | True Positive Fraction (TPF) | False Positive Fraction (FPF) |
   | n′ | False Negative Fraction (FNF) | True Negative Fraction (TNF) |
   | total | 1 | 1 |
10. Definitions
    - sensitivity (TPF) := P(D′ | D)
    - specificity := P(ND′ | ND)
    - 1 − specificity (FPF) := P(D′ | ND)
    - the points (FPF, TPF) constitute the ROC curve
    - c := cut-off value; c takes values between −∞ and ∞
    - TPF(c) := P(z > c | D)
    - FPF(c) := P(z > c | ND)
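These two conditional probabilities can be estimated directly from scored samples. A small sketch; the score lists in the usage line are made up for illustration:

```python
def roc_point(c, default_scores, nondefault_scores):
    """Empirical ROC point for cut-off c:
    TPF(c) = P(z > c | D), FPF(c) = P(z > c | ND)."""
    tpf = sum(z > c for z in default_scores) / len(default_scores)
    fpf = sum(z > c for z in nondefault_scores) / len(nondefault_scores)
    return tpf, fpf

# Sweeping c from +inf down to -inf traces the ROC curve from (0, 0) to (1, 1).
tpf, fpf = roc_point(0.5, [1.0, 2.0, 0.0], [0.0, 0.2, 1.0])
```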
11. [Figure: the ROC curve plotted on normal-deviate axes, Normal Deviate(FPF) vs. Normal Deviate(TPF).]
    $$\mathrm{FPF}(c_i) := \Phi(c_i), \qquad \mathrm{TPF}(c_i) := \Phi(a + b \cdot c_i), \qquad a := \frac{\mu_n - \mu_s}{\sigma_s}, \quad b := \frac{\sigma_n}{\sigma_s}.$$
12. [Figure: the same plot on normal-deviate axes, now with a threshold t and its corresponding cut-off value c marked.]
13. Classification. Example: cut-off values partition the test-result axis from −∞ to ∞ into class I, …, class V, separating the actually non-default from the actually default cases. To assess the discriminative power of such a model, we calculate the Area Under the (ROC) Curve:
    $$\mathrm{AUC} := \int_{-\infty}^{\infty} \Phi(a + b \cdot c)\, d\Phi(c).$$
14. Relationship between thresholds and cut-off values. Example with R = 5 classes: thresholds t₀, t₁, …, t₅ on the FPF axis correspond to cut-off values via
    $$\Phi(c) = t \iff c = \Phi^{-1}(t).$$
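The threshold–cut-off correspondence is just the standard normal CDF and its inverse; a minimal sketch using the Python standard library (the threshold value in the usage line is illustrative):

```python
from statistics import NormalDist

Phi = NormalDist().cdf          # standard normal CDF
Phi_inv = NormalDist().inv_cdf  # its inverse (quantile function)

def cutoff_from_threshold(t):
    """Phi(c) = t  <=>  c = Phi^{-1}(t), for t in (0, 1)."""
    return Phi_inv(t)

# e.g. a threshold of 0.3 on the FPF axis maps to a negative cut-off value.
c = cutoff_from_threshold(0.3)
```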
15. Optimization in Credit Default. Problem: simultaneously obtain the thresholds and the parameters a and b that maximize AUC, while balancing the sizes of the classes (regularization) and guaranteeing a good accuracy.
16. Optimization Problem
    $$\max_{a,b,\tau}\; \alpha_1 \int_0^1 \Phi\bigl(a + b\,\Phi^{-1}(t)\bigr)\,dt \;-\; \alpha_2 \sum_{i=0}^{R-1} \Bigl(\frac{\gamma_i}{n} - (t_{i+1} - t_i)\Bigr)^{\!2}$$
    subject to
    $$\int_{t_i}^{t_{i+1}} \Phi\bigl(a + b\,\Phi^{-1}(t)\bigr)\,dt \ge \delta_i \quad (i = 0, 1, \dots, R-1),$$
    where $\tau := (t_1, t_2, \dots, t_{R-1})^T$, $t_0 = 0$, $t_R = 1$.
17. The same problem, with the observation that requiring $\delta_i > 0$ in the constraints
    $$\int_{t_i}^{t_{i+1}} \Phi\bigl(a + b\,\Phi^{-1}(t)\bigr)\,dt \ge \delta_i > 0 \quad (i = 0, 1, \dots, R-1)$$
    implies $t_{i+1} > t_i$, so the thresholds remain strictly ordered.
18. Over the ROC Curve. [Figure: the ROC curve over thresholds t₀, …, t₅, with the area AUC below the curve and 1 − AUC above it.]
    $$\mathrm{AOC} := \int_0^1 \bigl(1 - \Phi(a + b\,\Phi^{-1}(t))\bigr)\,dt.$$
19. New Version of the Optimization Problem
    $$\min_{a,b,\tau}\; \alpha_2 \sum_{i=0}^{R-1} \Bigl(\frac{\gamma_i}{n} - (t_{i+1} - t_i)\Bigr)^{\!2} + \alpha_1 \int_0^1 \bigl(1 - \Phi(a + b\,\Phi^{-1}(t))\bigr)\,dt$$
    subject to
    $$\int_{t_j}^{t_{j+1}} \bigl(1 - \Phi(a + b\,\Phi^{-1}(t))\bigr)\,dt \le t_{j+1} - t_j - \delta_j \quad (j = 0, 1, \dots, R-1).$$
20. Regression in Credit Default. Optimization problem: simultaneously obtain the thresholds and the parameters a and b that maximize AUC, while balancing the sizes of the classes (regularization) and guaranteeing a good accuracy. Discretization of the integral turns this into a nonlinear regression problem.
21. Discretization of the Integral.
    Riemann–Stieltjes integral:
    $$\mathrm{AUC} = \int_{-\infty}^{\infty} \Phi(a + b \cdot c)\, d\Phi(c).$$
    Riemann integral:
    $$\mathrm{AUC} = \int_0^1 \Phi\bigl(a + b\,\Phi^{-1}(t)\bigr)\,dt.$$
    Discretization:
    $$\mathrm{AUC} \approx \sum_{k=1}^{R} \Phi\bigl(a + b\,\Phi^{-1}(t_k)\bigr)\,\Delta t_k.$$
22. Optimization Problem with Penalty Parameters. If any of these constraints is violated, we introduce penalty parameters; as a penalty is increased, the iterates are forced towards the feasible set of the optimization problem.
    $$\Pi_\Theta(a,b,\tau) := \alpha_2 \sum_{i=0}^{R-1} \Bigl(\frac{\gamma_i}{n} - (t_{i+1} - t_i)\Bigr)^{\!2} + \alpha_1 \int_0^1 \bigl(1 - \Phi(a + b\,\Phi^{-1}(t))\bigr)\,dt + \alpha_3 \sum_{j=0}^{R-1} \theta_j \underbrace{\Bigl(\delta_j - \int_{t_j}^{t_{j+1}} \Phi\bigl(a + b\,\Phi^{-1}(t)\bigr)\,dt\Bigr)}_{=:\,\Psi_j(a,b,\tau)},$$
    where $\Theta := (\theta_0, \theta_1, \dots, \theta_{R-1})^T$ and $\theta_j \ge 0$ $(j = 0, 1, \dots, R-1)$.
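The penalized objective can be evaluated numerically. A sketch assuming α₁ = α₂ = α₃ = 1 by default and a midpoint rule for both integrals; the thresholds, class counts γ, δ bounds, and θ weights in the test case are illustrative, not the study's values:

```python
from statistics import NormalDist

Phi, Phi_inv = NormalDist().cdf, NormalDist().inv_cdf

def _integral(a, b, lo, hi, m=500):
    """Midpoint rule for the integral of Phi(a + b*Phi_inv(t)) over [lo, hi]."""
    h = (hi - lo) / m
    return sum(Phi(a + b * Phi_inv(lo + (k + 0.5) * h)) for k in range(m)) * h

def penalty_objective(a, b, t, gamma, n, delta, theta, alpha=(1.0, 1.0, 1.0)):
    """Pi_Theta(a, b, tau), with thresholds t = [t_0 = 0, ..., t_R = 1]."""
    a1, a2, a3 = alpha
    R = len(t) - 1
    reg = sum((gamma[i] / n - (t[i + 1] - t[i])) ** 2 for i in range(R))
    aoc = 1.0 - _integral(a, b, 0.0, 1.0)  # area over the ROC curve
    pen = sum(theta[j] * (delta[j] - _integral(a, b, t[j], t[j + 1]))
              for j in range(R))
    return a2 * reg + a1 * aoc + a3 * pen
```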
23. Optimization Problem, further discretized:
    $$\Pi_\Theta(a,b,\tau) = \alpha_2 \sum_{i=0}^{R-1} \Bigl(\frac{\gamma_i}{n} - (t_{i+1} - t_i)\Bigr)^{\!2} + \alpha_1 \sum_{j=1}^{R} \Bigl(\bigl(1 - \Phi(a + b\,\Phi^{-1}(t_j))\bigr)\,\Delta t_j\Bigr)^{\!2} + \alpha_3 \sum_{j=0}^{R-1} \theta_j \Bigl(\sum_{\nu=0}^{n_j} \Phi\bigl(a + b\,\Phi^{-1}(\eta_\nu^j)\bigr)\,\Delta\eta_\nu^j - \delta_j\Bigr)^{\!2},$$
    where $\eta_0^j, \dots, \eta_{n_j}^j$ is a grid on $[t_j, t_{j+1}]$ with spacing $\Delta\eta_\nu^j$.
25. Nonlinear Regression
    $$\min_\beta\; f(\beta) = \sum_{j=1}^{N} \bigl(d_j - g(x_j, \beta)\bigr)^2 =: \sum_{j=1}^{N} f_j^2(\beta),$$
    $$F(\beta) := \bigl(f_1(\beta), \dots, f_N(\beta)\bigr)^T, \qquad \min_\beta\; f(\beta) = F^T(\beta)\,F(\beta).$$
26. Nonlinear Regression: iteration $\beta^{k+1} := \beta^k + q^k$.
    - Gauss–Newton method: $\nabla F(\beta)\,\nabla^T F(\beta)\, q = -\nabla F(\beta)\, F(\beta)$.
    - Levenberg–Marquardt method ($\lambda \ge 0$): $\bigl(\nabla F(\beta)\,\nabla^T F(\beta) + \lambda I_p\bigr)\, q = -\nabla F(\beta)\, F(\beta)$.
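One Levenberg–Marquardt step solves the regularized normal equations above. A self-contained sketch, writing the Jacobian as an N×p matrix J so that $\nabla F(\beta)\,\nabla^T F(\beta) = J^T J$ in this notation (λ = 0 recovers the Gauss–Newton step):

```python
def lm_step(J, F, lam):
    """Solve (J^T J + lam*I) q = -J^T F for the update q."""
    N, p = len(J), len(J[0])
    A = [[sum(J[k][i] * J[k][j] for k in range(N)) + (lam if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    g = [-sum(J[k][i] * F[k] for k in range(N)) for i in range(p)]
    # Gaussian elimination with partial pivoting on the augmented system [A | g].
    M = [A[i] + [g[i]] for i in range(p)]
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, p):
            f = M[r][col] / M[col][col]
            for c in range(col, p + 1):
                M[r][c] -= f * M[col][c]
    q = [0.0] * p
    for r in range(p - 1, -1, -1):
        q[r] = (M[r][p] - sum(M[r][c] * q[c] for c in range(r + 1, p))) / M[r][r]
    return q
```

In practice one would use a linear-algebra library for the solve; the point here is only the structure of the step: larger λ shrinks q towards a damped gradient step.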
27. Nonlinear Regression, alternative solution:
    $$\min_{t,q}\; t \quad \text{subject to} \quad \bigl\|\bigl(\nabla F(\beta)\,\nabla^T F(\beta) + \lambda I_p\bigr)\, q - \bigl(-\nabla F(\beta)\, F(\beta)\bigr)\bigr\|_2 \le t, \quad t \ge 0, \quad \|Lq\|_2 \le M$$
    — a conic quadratic programming problem.
28. This conic quadratic programming problem can be solved efficiently by interior point methods.
29. Numerical Results

    Initial parameters:

    | a | b | threshold values (t) |
    |---|---|---|
    | 1 | 0.95 | 0.0006, 0.0015, 0.0035, 0.01, 0.035, 0.11, 0.35 |
    | 1.5 | 0.85 | 0.0006, 0.0015, 0.0035, 0.01, 0.035, 0.11, 0.35 |
    | 0.80 | 0.95 | 0.0006, 0.0015, 0.0035, 0.01, 0.035, 0.11, 0.35 |
    | 2 | 0.70 | 0.0006, 0.0015, 0.0035, 0.01, 0.035, 0.11, 0.35 |

    Optimization results:

    | a | b | threshold values (t) | AUC |
    |---|---|---|---|
    | 0.9999 | 0.9501 | 0.0004, 0.0020, 0.0032, 0.012, 0.03537, 0.09, 0.3400 | 0.8447 |
    | 1.4999 | 0.8501 | 0.0003, 0.0017, 0.0036, 0.011, 0.03537, 0.10, 0.3500 | 0.9167 |
    | 0.7999 | 0.9501 | 0.0004, 0.0018, 0.0032, 0.011, 0.03400, 0.10, 0.3300 | 0.8138 |
    | 2.0001 | 0.7001 | 0.0004, 0.0020, 0.0031, 0.012, 0.03343, 0.11, 0.3400 | 0.9671 |
30. Numerical Results

    Accuracy error in each class:

    | I | II | III | IV | V | VI | VII | VIII |
    |---|---|---|---|---|---|---|---|
    | 0.0000 | 0.0000 | 0.0000 | 0.0001 | 0.0001 | 0.0010 | 0.0010 | 0.0075 |
    | 0.0000 | 0.0000 | 0.0000 | 0.0001 | 0.0001 | 0.0010 | 0.0018 | 0.0094 |
    | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0001 | 0.0002 | 0.0018 | 0.0059 |
    | 0.0000 | 0.0000 | 0.0000 | 0.0001 | 0.0001 | 0.0006 | 0.0018 | 0.0075 |

    Number of firms in each class:

    | I | II | III | IV | V | VI | VII | VIII |
    |---|---|---|---|---|---|---|---|
    | 4 | 56 | 27 | 133 | 115 | 102 | 129 | 61 |
    | 2 | 42 | 52 | 120 | 119 | 111 | 120 | 61 |
    | 4 | 43 | 40 | 129 | 114 | 116 | 120 | 61 |
    | 4 | 56 | 24 | 136 | 106 | 129 | 111 | 61 |

    Number of firms in each class at the beginning: 10, 26, 58, 106, 134, 121, 111, 61.
31. Generalized Additive Models: http://144.122.137.55/gweber/