Summer 07-mfin7011-tang1922

MFIN 7011: Credit Risk Management Summer, 2007 Dragon Tang Lecture 18 Consumer Credit Risk Thursday, August 2, 2007 Readings: Niu (2004); Agarwal, Chomsisengphet, Liu, and Souleles (2006)

Consumer Credit Risk Objectives : Credit scoring approach for consumer credit risk Practice, challenge, and opportunity

Consumer Credit Default Risk (low in general) Low High Credit Products Fixed Term Revolving Residential Mortgage Retail Finance Personal Loans Overdrafts Credit Cards

Consumer Lending Examples: Automobile loans Home equity loans Revolving credit There is an exponential growth in consumer credit outstanding in the US, from USD 9.8 billion in 1946 to USD 2411 billion in January 2007 $878 billion revolving; $1526 billion non-revolving Currently interest rate is 13%; interest accessed is 15%

Consumer vs. Corporate Lending Consumer lending is not as glamorous as corporate lending Consumer lending is a volume business, where low cost producers who can manage the credit losses are able to enjoy profitable margins Corporate lending is often unprofitable as every bank is chasing the same corporate customers, depressing margins

Consumer Credit Risk: Art or Science? Art: consumers care about reputation Value of reputation is hard to model Reduced form model may be useful Science: creditworthiness can be predicted from financial health Using structural models of Merton type The answer is probably both! Hybrid structural-reduced form model should be most promising

Never make predictions, especially about the future. — Casey Stengel

The credit Decision Scoring vs. Judgmental Both methods Assume that the future will resemble the past Compare applicants to past experience Aim to grant credit only to acceptable risks Added value of scoring Defines degree of credit risk for each applicant Ranks risk relative to other applicants Allows decisions based on degree of risk Enables tracking of performance over time Permits known and measurable adjustments Permits decision automation

Evaluating the credit applicant Time at present address Time at present job Residential status Debt ratio Bank reference Age Income # of Recent inquiries % of Balance to avail. lines # of Major derogs. Overall Decision Odds of repayment • • • CHARACTERISTICS + + - + + N / A - - + + + Accept ? • • • JUDGMENT 12 20 5 21 28 15 5 -7 10 35 212 Accept 11:1 • • • CREDIT SCORING

Credit Scoring Project Input x feature vector Label y, default or not Data (x i , y i ) Target y=f(x) Objective Given new x, predict y so that probability of error is minimal

Typical Input Data Time at present address 0-1, 1-2, 3-4, 5+ years Home status Owner, tenant, other Telephone Yes, no Applicant's annual income $(0-10000), $(11000-20000), $(21000+) Credit card Yes, no Type of bank account Cheque and/or savings, none Age 18-25, 26-40, 41-55, 55+ years Type of occupation Coded Purpose of loan Coded Marital status Married, divorced, single, widow Time with bank Years Time with employer Years

Input Data: FICO Score Not in the score: demographic data

Characteristics of Data X: Continuous Discrete Normal distribution? Y: Binary data: 0 or 1 (=default)

Scoring Models Statistical Methods DA (Discriminant Analysis) Linear regression Logistic regression Probit analysis Non-parametric models Nearest-neighbor approach

Statistical Methods: Discriminant Analysis Multivariate statistical analysis: several predictors (independent variables) and several groups (categorical dependent variable, e.g. 0 and 1) Predictive DA: for a new observation, calculate the discriminant score, then classify it according to the score The objective is to maximize the between group to within group sum of squares ratio that results in the best discrimination between the groups (within group variance is solely due to randomness; between group variability is due to the difference of the means) Normal distribution for the response variables (dependent variables) is assumed (but normality only becomes important if significance tests are to be taken for small samples)

Statistical Credit Scoring Credit Score #Customers Good Credit Bad Credit Cut-off Score

Statistical Credit Scoring Credit scoring systems: Altman Z-score model: Z = .012 X 1 +.014 X 2 +.033 X 3 +.006 X 4 +1.0 X 5 X 1 = working capital/total assets ratio X 2 = retained earnings/total assets ratio X 3 = earnings before interest and taxes/total assets ratio X 4 = market value of equity/book value of total liabilities ratio X 5 = sales/total assets ratio

Statistical Methods: Linear Regression The regression model is like: For the true model, u can take only two values as Y; thus u can’t be normally distributed. u has heteroskedastic variances, which makes the OLS inefficient The estimated probability may well lie outside [0,1].

Statistical Methods: Nearest-Neighbor Approach A historical database has been divided into two groups (good and bad) When a consumer comes, calculate the distance between the consumer and everyone in the database The consumer will be classified in the category which is the same as the nearest one(s) Problems: The definition of distance and the number of the nearest ones Scoring speed: when a new x comes, we need calculate the distance between the new x and all of the historical data; too much calculation!

Scoring Models Non-statistical Methods Mathematical programming Recursive partitioning Expert systems Machine Learning Neural Networks Support Vector Machine (SVM)

Which Method is Best? In general there is no overall best method. What is best will depend on the details of the problem: The data structure The characteristics used The extent to which it is possible to separate the classes by using those characteristics The objective of the classification (overall misclassification rate, cost-weighted misclassification rate, bad risk rate among those accepted, some measure of profitability, etc.) In the following slides, we will introduce three models, Logistic, Neural Networks, and SVM in detail, which are used widely today

Logistic Regression Empirical studies show, logistic regression may perform better than linear models (Hence, better than Discriminant Analysis), when data is nonnormal (particularly for binary data), or when covariance matrices of the two groups are not identical. Therefore, logistic regression is the preferred method among the statistical methods Probit regression is similar to logistic regression

Performing Logistic Regression Logistic Regression can be performed using the Maximum Likelihood method In the maximum likelihood method, we are seeking parameter values that maximize the likelihood of the observations occurring

Logistic Regression: Setup Directly models the default probability as a function of the input variables X (a vector) Define Assume

Logistic Regression: Setup Assume the observations are independent, the probability (likelihood) of the observed sample is given by

Logistic Regression and ML ML estimator (of the coefficients a’s) for Logistic Regression can be found by applying non-linear optimization on the above likelihood function. The simplified version is given by

Logistic Regression and ML It is easy to show that the log of the odds (= logit) are a linear function: Therefore, the odds per se are a multiplicative function. Since probability takes on values between (0,1), the odds take on values between (0,∞), logits take on values between (-∞,∞). So, it looks very much like linear regression, and it does not need to restrict the dependent variable to values of {0, 1}. It is not solvable using OLS.

Logistic Function and Distribution

Normal Distribution The tails are much thinner than Logistic

RiskCalc: Moody’s Default Model Probit Regression Where x is the vector of the ratios

Neural Networks Non-parametric method Non-linear model estimation technique: e.g. Saturation effect: i.e. marginal effect of a financial ratio may decline quickly Multiplicative factors: highly leveraged firms have a harder time borrowing money Neural networks decide how to combine and transform the raw characteristics in the data, as well as yielding estimates of the parameters of the decision surface Well suited to situations where we have a poor understanding of the data structure

Neural Networks Use the logistic function as the activation function in all the nodes Works well with classification problems Drawbacks May take much longer to train In credit scoring, there is solid understanding of data

Multilayer Perceptron (MLP) The input values X are sent along with 1 to the hidden layer neuron The hidden layer generates a weight and generates a nonlinear output that is sent to the next layer The output neuron takes 1 with input from the hidden layer and generates the output signal When learning occurs, the weights are adjusted so that the final OUTs produce the least error (The output of a single neuron is called OUT) X1 X2 1 H1 H2 1 O Input Layer Hidden Layer Output Layer w01 w12 w21 w22 w11 w02 w1 w2 w0

Multilayer Perceptron (MLP) Input nodes do not perform processing Each hidden and output node processes the signals by an activation function. The most frequently used is given on the right. The parameters, w, are obtained by “training” the Neural Net to historical data.

Support Vector Machine (SVM) A relatively new promising supervised learning method for Pattern recognition (Classification) Regression estimation This originates from the statistical learning theory developed by Vaqnik and Chervonenkis 1960s, Vapnik V. N., Support Vector 1995, Statistical Learning Theory Vapnik, V. N., “The Nature of Statistical Learning Theory”. New York: Springer-Verlag, 1995 2 Cortes C. and Vapnik, V. N., “Support Vector Networks”, Machine Learning, 20:1-25,1995 Development, from 1995 to now

SVM Extension Proximal Support Vector Machine (PSVM) Glenn Fung and Olvi L. Mangasariany 2001 Incremental and Decremental Support Vector Machine Learning Least Squares Support Vector Machine (LS-SVM) Also, SVMs can be seen as a new training method for learning machines (such as NNs)

Linear Classifier There are infinitely many lines that have zero training error. Which line should we choose?

Choose the line with the largest margin . The optimal separating hyperplane (OSH) The “large margin classifier” Linear Classifier margin ” Support Vectors”

Performance of SVM S&P CreditModel White Paper Fan and Palaniswami (2000): SVM 70.35%–70.90% NN 66.11%–68.33% MDA 59.79%–63.68%

Credit Scoring and Beyond Data collected at application will become outdated pretty fast The way a customer uses its credit account is an indicator for future performance (Behavior Scoring) This leads to an update path of PD and credit control tools The future is moving into profitability scoring. Banks should not only care about getting its money back Banks want to extend credit to those it can make a positive NPV, risk-adjusted

Best Practice in Consumer Credit Risk Management Credit decision-making Adopt to changes in economy or within customer segment Credit scoring Adaptive algorithms using credit bureau data and firm’s own experience Loss forecasting Historical delinquency rates and charge-off trend analysis Delinquency flow and segmented vintage analysis Portfolio management Risk adjusted return on capital (RAROC)

Analytical Techniques Response analysis: avoid adverse selection consequences that result in increased concentrations of high-risk borrowers Pricing strategies: avoid “follow the competition”, focus on segment profitability and cash flow Loan amount determination: avoid to be judgmental, quantify probabilities of losses Credit loss forecasting: decompositional roll rate modeling, trend and seasonal indexing, and vintage curve Portfolio management strategies: important for repricing and retention, don’t be judgmental, integrating behavioral element and cash flow profitability analysis ( underwriting ) Collection strategies: behavioral models are useful

Credit Scoring and Loss Forecasting Two critical components of consumer credit risk analysis Corresponds to default probabilities and loss given default These two are linked Loss given default is higher when default probability is greater Market and economic variables matter In bad economic states, there will be more default and lower recovery Good modeling should achieve stability

Do Consumers Choose the Right Credit Contracts? Agarwal, Chomsisengphet, Liu, and Souleles (2006): Some don’t, especially when the stake is small But consumers with high balance do! Other issues: Personal bankruptcy in the U.S. soared! Avoid/fight predatory lending! (e.g., subprime lending) China is starting to have a consumer credit market

China’s Consumer Spending 64% 9198 8407 7811 7037 6462 6001 5603 TOTAL 80% 441 400 367 330 296 268 244 Services 120% 931 842 752 663 599 507 424 Housing 113% 1170 1057 945 837 739 643 550 Education&Entertainment 112% 614 554 498 437 385 337 290 Transport&Communication 91% 790 727 657 595 569 485 414 Household Durables 22% 958 885 866 791 728 750 785 Clothing 138% 506 455 401 356 300 255 213 Medicine&Healthcare 41% 3789 3487 3326 3029 2845 2756 2684 Food 97-03 2003 2002 2001 2000 1999 1998 1997 %Chg

China’s Consumer Credit Market 1999-2004: Growth rate 52% Automobile loans: 110% Only 15% of auto sales, compared to 80% in U.S. Bankcard: 36% Mostly debit cards Mortgage: 1000% Still a long way to go! Only 8% of GDP, compared to 45% in developed economies Other markets Student loan Credit cards! More opportunities are waiting!

Summary Introduction to Consumer Credit Risk: Credit scoring methods Practical issues Exam: Saturday, August 4, 2PM

Review for Exam Topics: Credit risk modeling: structural/reduced-form/incomplete information Recovery rate & default correlation Credit derivatives Credit VaR/Basel II/consumer credit risk Question Types (tentative!): True or False (20%) Multiple Choice (20%) Short Answers (20%) Problems (40%) 60% conceptual; 40% analytical Formulas will be provided if needed.

The plane separating and is defined by The dashed planes are given by Computing the Margin margin w

Divide by b Define new w = w/ b and α = a/b Computing the Margin margin w We have defined a scale for w and a

We have which gives Computing the Margin margin  w) x x +  w)

Quadratic Programming Problem Maximizing the margin is equivalent to minimizing || w || 2 . Minimize || w || 2 subject to the constraints: Where we have defined y(n) = +1 for all y(n) = –1 for all This enables us to write the constraints as

Quadratic Programming Problem Minimize the cost function (Lagrangian) Here we have introduced non-negative Lagrange multipliers l n  0 that express the constraints

Quadratic Programming Problem The first order conditions evaluated at the optimal solution are The solution can be derived (together with the constraint)

Quadratic Programming Problem The original minimizing problem is equivalent to the following maximizing problem (dual) For non-support vectors, λ will be zero, as the original constraint is not binding; only a few λ ’s would be nonzero.

Quadratic Programming Problem Having solved for the optimal λ ’s (denoted as ), we can derive others To classify a new data point x, simply solve

Summer 07-mfin7011-tang1922

More Related Content

What's hot

Viewers also liked

Similar to Summer 07-mfin7011-tang1922

More from stone55

Recently uploaded

Summer 07-mfin7011-tang1922

Editor's Notes