Loan Portfolio Manufacturing SMEs – Statistical Analysis
Manzar Ahmed
Advanced Credit Risk Management Coursework
1. Introduction
This paper considers 310 manufacturing SMEs from a European country, constituting a loan portfolio with data collected in 2014.
The objective of this paper is to analyse the data in order to fit a probability of default (PD) model that
describes the relationship between the categorical response variable - default (0 = non-default and 1 = default)
and a set of predictor variables.
2. Statistical Analysis Tool
The loan portfolio data will be analysed using SAS University Edition; the tool is available as a free download from the link below.
http://www.sas.com/en_gb/software/university-edition.html
The raw loan portfolio data has been uploaded into the SAS VM environment (via Jupyter Notebook) into /folders/myfolders/CreditRisk. Please refer to Appendix B of this document for the upload script as well as the scripts used for the analysis and model in this paper.
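The upload script itself is not reproduced here; the following is a minimal sketch of what such a step could look like. The raw file name, delimiter and column order are assumptions, not taken from Appendix B.

```sas
/* Hypothetical upload step: assign the CRDR library and read the raw file.
   File name, delimiter and column order are assumed for illustration. */
libname crdr '/folders/myfolders/CreditRisk';

data crdr.default_data;
   infile '/folders/myfolders/CreditRisk/default_data.csv' dsd firstobs=2;
   input Default Supplier_Target_Years Outside_Capital_Structure Cash_Ratio
         Capital_Tied_Up Equity_Ratio Cost_Income_Ratio Trade_Payable_Ratio
         Liabilities_Ratio Liquidity_Ratio Age;
   Counterparty_Id = _N_;            /* key assigned from the automatic variable _N_ */
   length Default_Status $15;
   if Default = 1 then Default_Status = 'Default';
   else Default_Status = 'Non-Default';
run;
```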
3. Default Data
The loan portfolio data is contained in the SAS library CRDR, in a dataset called Default_Data.
| Column Name | Type | Format | Description | Example |
|---|---|---|---|---|
| Counterparty_Id | Num | 4. | A key that uniquely identifies each counterparty, in order of the data in the raw file; assigned using the automatic variable _N_ by SAS | 1 |
| Default | Num | 1. | Binary value 0 or 1 to indicate default; 1 = Default | 0 |
| Supplier_Target_Years | Num | 5.2 | A temporal measure of financial sustainability, expressed in years, that considers all short- and medium-term debts as well as other payables | 0.36 |
| Outside_Capital_Structure | Num | 5.2 | Evaluates a firm's capability to attract forms of financing other than banks' loans; the higher the better | 0.18 |
| Cash_Ratio | Num | 5.2 | The proportion of cash a company can generate in relation to its size | 0.70 |
| Capital_Tied_Up | Num | 5.2 | Turnover of short-term debts with respect to sales | 0.12 |
| Equity_Ratio | Num | 5.2 | Financial leverage measure that divides equity by total assets | 0.89 |
| Cost_Income_Ratio | Num | 5.2 | Efficiency measure that tells us how costs are changing compared to income | 0.32 |
| Trade_Payable_Ratio | Num | 5.2 | How often the company turns over trade payables during the year; a low ratio may be a sign of chronic cash shortages | 0.16 |
| Liabilities_Ratio | Num | 5.2 | A debt ratio between long-term liabilities and total assets | 0.83 |
| Liquidity_Ratio | Num | 5.2 | An index giving an idea of how quickly a company can liquidate its assets to cover short-term liabilities; the higher the better | 0.70 |
| Age | Num | 2. | The age of the SME in years, by the end of 2014 | 3 |
| Default_Status | Char | $15. | Indicates if the counterparty has defaulted on their loan agreement; contains either Default or Non-Default | Non-Default |
4. Data Analysis
The data will be analysed using proc univariate, proc means and proc corr to get a feel for the data.
4.1 Proc Univariate
The proc univariate procedure shows the distribution of the data, including an assessment of normality and the discovery of outliers. All the predictor variables have been listed in the var statement of the procedure to get a feel for the data.
The following statistics are available from the proc univariate analysis: N, Mean, Standard Deviation, Skewness, Uncorrected Sum of Squares, Coefficient of Variation, Sum of Weights, Sum of Observations, Variance, Kurtosis, Corrected Sum of Squares and Standard Error of the Mean.
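The full script is in the appendix referenced above; a minimal sketch of the call, using the dataset and variable names from section 3, could look as follows:

```sas
/* Distribution and outlier check on the predictor variables (section 4.1). */
proc univariate data=crdr.default_data;
   var Supplier_Target_Years Outside_Capital_Structure Cash_Ratio
       Capital_Tied_Up Equity_Ratio Cost_Income_Ratio
       Trade_Payable_Ratio Liabilities_Ratio Liquidity_Ratio;
run;
```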
The UNIVARIATE Procedure

| Statistic | Supplier Target Years | Outside Capital Structure | Cash Ratio | Capital Tied Up | Equity Ratio | Cost Income Ratio | Trade Payables Ratio | Liabilities Ratio | Liquidity Ratio |
|---|---|---|---|---|---|---|---|---|---|
| N | 310 | 310 | 310 | 310 | 310 | 310 | 310 | 310 | 310 |
| Mean | 0.1974 | 0.5254 | 0.1265 | 0.0784 | 0.4342 | 0.2843 | 0.2407 | 0.6535 | 0.1211 |
| Std Deviation | 0.1601 | 0.2410 | 0.1018 | 0.0717 | 0.2599 | 0.1850 | 0.2083 | 0.2027 | 0.0982 |
| Skewness | 2.1444 | -0.2262 | 1.8982 | 4.3089 | 0.3858 | 0.6971 | 1.0480 | -0.5288 | 2.0616 |
| Uncorrected SS | 19.9969 | 103.4996 | 8.1618 | 3.4905 | 79.3137 | 35.6212 | 31.3595 | 145.0873 | 7.5213 |
| Coeff Variation | 81.1019 | 45.8652 | 80.4338 | 91.4701 | 59.8734 | 65.0720 | 86.5325 | 31.0113 | 81.0864 |
| Sum Observations | 61.1900 | 162.8600 | 39.2200 | 24.2900 | 134.5900 | 88.1200 | 74.6100 | 202.5900 | 37.5300 |
| Variance | 0.0256 | 0.0581 | 0.0104 | 0.0051 | 0.0676 | 0.0342 | 0.0434 | 0.0411 | 0.0096 |
| Kurtosis | 5.5300 | -1.0363 | 4.6882 | 30.4751 | -0.9059 | 0.2093 | 0.1134 | -0.3641 | 5.7568 |
| Corrected SS | 7.9188 | 17.9403 | 3.1998 | 1.5873 | 20.8799 | 10.5724 | 13.4026 | 12.6915 | 2.9777 |
| Std Error Mean | 0.0091 | 0.0137 | 0.0058 | 0.0041 | 0.0148 | 0.0105 | 0.0118 | 0.0115 | 0.0056 |
4.2 Proc Means
The proc means procedure can be used to analyse the mean, distribution and shape of the data by grouping the data on the response variable default (using the class statement). The results show that there is a significant difference in the mean and kurtosis of the predictor variables when grouped by counterparties with status default and non-default. The results are as expected, since we would expect the healthier counterparties to be different from the defaulted counterparties.
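A minimal sketch of the call, grouping on Default_Status and requesting the statistics reported below, could look as follows:

```sas
/* Group statistics by default status (section 4.2): mean, spread, shape and
   95% confidence limits for the mean, per predictor variable. */
proc means data=crdr.default_data n mean stddev stderr kurtosis lclm uclm;
   class Default_Status;
   var Supplier_Target_Years Outside_Capital_Structure Cash_Ratio
       Capital_Tied_Up Equity_Ratio Cost_Income_Ratio
       Trade_Payable_Ratio Liabilities_Ratio Liquidity_Ratio;
run;
```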
The MEANS Procedure

| Default Status | N Obs | Variable | Mean | Std Dev | Std Error | Kurtosis | Lower 95% CL for Mean | Upper 95% CL for Mean |
|---|---|---|---|---|---|---|---|---|
| Default | 34 | Supplier Target Years | 0.294 | 0.150 | 0.026 | -0.983 | 0.241 | 0.346 |
| | | Outside Capital Structure | 0.655 | 0.165 | 0.028 | -0.207 | 0.598 | 0.713 |
| | | Cash Ratio | 0.094 | 0.088 | 0.015 | 2.164 | 0.064 | 0.125 |
| | | Capital Tied Up | 0.118 | 0.076 | 0.013 | -0.684 | 0.091 | 0.144 |
| | | Equity Ratio | 0.298 | 0.223 | 0.038 | -0.083 | 0.220 | 0.376 |
| | | Cost Income Ratio | 0.331 | 0.171 | 0.029 | -1.191 | 0.271 | 0.390 |
| | | Trade Payable Ratio | 0.270 | 0.179 | 0.031 | 0.899 | 0.208 | 0.332 |
| | | Liabilities Ratio | 0.768 | 0.159 | 0.027 | 0.017 | 0.712 | 0.823 |
| | | Liquidity Ratio | 0.093 | 0.086 | 0.015 | 1.934 | 0.063 | 0.123 |
| Non-Default | 276 | Supplier Target Years | 0.186 | 0.157 | 0.009 | 7.475 | 0.167 | 0.204 |
| | | Outside Capital Structure | 0.509 | 0.244 | 0.015 | -1.112 | 0.480 | 0.538 |
| | | Cash Ratio | 0.130 | 0.103 | 0.006 | 4.788 | 0.118 | 0.143 |
| | | Capital Tied Up | 0.074 | 0.070 | 0.004 | 39.796 | 0.065 | 0.082 |
| | | Equity Ratio | 0.451 | 0.260 | 0.016 | -0.912 | 0.420 | 0.482 |
| | | Cost Income Ratio | 0.279 | 0.186 | 0.011 | 0.443 | 0.256 | 0.301 |
| | | Trade Payable Ratio | 0.237 | 0.212 | 0.013 | 0.101 | 0.212 | 0.262 |
| | | Liabilities Ratio | 0.639 | 0.203 | 0.012 | -0.378 | 0.615 | 0.663 |
| | | Liquidity Ratio | 0.124 | 0.099 | 0.006 | 5.940 | 0.113 | 0.136 |
4.3 Proc Corr
The proc corr procedure can be used to analyse the correlation (Pearson's correlation coefficient) between each pair of predictor variables. The correlation coefficient measures the linear dependence between each pair of predictor variables, giving a value between +1 and -1 inclusive, where 1 is total positive linear correlation, 0 is no linear correlation, and -1 is total negative linear correlation. The p-value tests whether the correlation coefficient is significantly different from zero.
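A minimal sketch of the call, again using the variable names from section 3, could look as follows:

```sas
/* Pairwise Pearson correlations between the predictor variables (section 4.3). */
proc corr data=crdr.default_data pearson;
   var Supplier_Target_Years Outside_Capital_Structure Cash_Ratio
       Capital_Tied_Up Equity_Ratio Cost_Income_Ratio
       Trade_Payable_Ratio Liabilities_Ratio Liquidity_Ratio;
run;
```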
4.4 Pearson Correlation Coefficients
The CORR Procedure
Pearson Correlation Coefficients, N = 310
Prob > |r| under H0: Rho=0
Each cell shows the correlation coefficient with its p-value in parentheses.

| | Supplier Target Years | Outside Capital Structure | Cash Ratio | Capital Tied Up | Equity Ratio | Cost Income Ratio | Trade Payable Ratio | Liabilities Ratio | Liquidity Ratio |
|---|---|---|---|---|---|---|---|---|---|
| Supplier Target Years | 1.00000 | 0.18792 (0.0009) | -0.18124 (0.0014) | 0.34919 (<.0001) | -0.12503 (0.0277) | 0.31932 (<.0001) | -0.16938 (0.0028) | 0.13971 (0.0138) | -0.16088 (0.0045) |
| Outside Capital Structure | 0.18792 (0.0009) | 1.00000 | -0.16005 (0.0047) | 0.28974 (<.0001) | -0.45627 (<.0001) | -0.43969 (<.0001) | 0.65225 (<.0001) | 0.47682 (<.0001) | -0.11401 (0.0449) |
| Cash Ratio | -0.18124 (0.0014) | -0.16005 (0.0047) | 1.00000 | -0.00434 (0.9394) | 0.21627 (0.0001) | -0.07254 (0.2028) | 0.13143 (0.0206) | -0.02776 (0.6263) | 0.97047 (<.0001) |
| Capital Tied Up | 0.34919 (<.0001) | 0.28974 (<.0001) | -0.00434 (0.9394) | 1.00000 | -0.19404 (0.0006) | 0.08843 (0.1202) | 0.40803 (<.0001) | 0.17815 (0.0016) | 0.00605 (0.9156) |
| Equity Ratio | -0.12503 (0.0277) | -0.45627 (<.0001) | 0.21627 (0.0001) | -0.19404 (0.0006) | 1.00000 | 0.24874 (<.0001) | -0.45780 (<.0001) | -0.92874 (<.0001) | 0.18564 (0.0010) |
| Cost Income Ratio | 0.31932 (<.0001) | -0.43969 (<.0001) | -0.07254 (0.2028) | 0.08843 (0.1202) | 0.24874 (<.0001) | 1.00000 | -0.50407 (<.0001) | -0.25646 (<.0001) | -0.07783 (0.1717) |
| Trade Payable Ratio | -0.16938 (0.0028) | 0.65225 (<.0001) | 0.13143 (0.0206) | 0.40803 (<.0001) | -0.45780 (<.0001) | -0.50407 (<.0001) | 1.00000 | 0.47200 (<.0001) | 0.15177 (0.0074) |
| Liabilities Ratio | 0.13971 (0.0138) | 0.47682 (<.0001) | -0.02776 (0.6263) | 0.17815 (0.0016) | -0.92874 (<.0001) | -0.25646 (<.0001) | 0.47200 (<.0001) | 1.00000 | 0.02040 (0.7205) |
| Liquidity Ratio | -0.16088 (0.0045) | -0.11401 (0.0449) | 0.97047 (<.0001) | 0.00605 (0.9156) | 0.18564 (0.0010) | -0.07783 (0.1717) | 0.15177 (0.0074) | 0.02040 (0.7205) | 1.00000 |
5. Probability of Default Model (PD)
In this section the logistic regression function will be used to model the probability of default (PD) for the loan portfolio data.
5.1 The Logit Link Function
The SAS proc logistic will be used to model the binary response variable default. The logit of the default probability will be used as the response in the regression equation:
$$\ln\!\left(\frac{P}{1-P}\right) = \beta_0 + \sum_{i=1}^{10} \beta_i x_i$$
P is defined as the probability that Default = 1 (Default_Status = 'Default'). The x_i are the predictor variables, as follows:
x_1: Supplier Target Years
x_2: Outside Capital Structure
x_3: Cash Ratio
x_4: Capital Tied Up
x_5: Equity Ratio
x_6: Cost Income Ratio
x_7: Trade Payable Ratio
x_8: Liabilities Ratio
x_9: Liquidity Ratio
x_10: Age
The SAS proc logistic procedure will be used to estimate the betas.
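The model-fitting script is listed in the appendix referenced above; a minimal sketch of the call, consistent with the output reported below (OUTEST= for the scoring step in section 5.4 and LACKFIT for the Hosmer-Lemeshow test), could look as follows:

```sas
/* Binary logit PD model (sections 5.2-5.4). With a character response, PROC LOGISTIC
   models the first ordered value, here Default_Status = 'Default'.
   OUTEST= writes the beta estimates; LACKFIT requests the Hosmer-Lemeshow test. */
proc logistic data=crdr.default_data outest=default_data_est;
   model Default_Status = Supplier_Target_Years Outside_Capital_Structure Cash_Ratio
                          Capital_Tied_Up Equity_Ratio Cost_Income_Ratio
                          Trade_Payable_Ratio Liabilities_Ratio Liquidity_Ratio Age
                          / lackfit;
run;
```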
5.2 PD Model - Logit
The full output for the proc logistic can be found in appendix 6.1.
The summary of the PD model output is as follows:
- The binary logit model was used with Fisher's scoring optimisation technique.
- The model used 310 observations (counterparties).
- The probability modelled is Default_Status = 'Default'.
- The model convergence status shows it successfully converged to a solution.
- Using a standard alpha criterion for significance of 0.05, the model is significant based on the likelihood ratio and score tests, since their chi-square p-values are < .0001.
Testing Global Null Hypothesis: BETA=0

| Test | Chi-Square | DF | Pr > ChiSq |
|---|---|---|---|
| Likelihood Ratio | 39.3721 | 10 | <.0001 |
| Score | 37.6438 | 10 | <.0001 |
| Wald | 30.1195 | 10 | 0.0008 |
The Analysis of Maximum Likelihood Estimates table shows the estimates for the betas in the logistic regression equation. The maximum likelihood estimates show that most of the predictor variables are not significant at the 0.05 alpha level; only Outside_Capital_Structure (p = 0.0194) and Liabilities_Ratio (p = 0.0372) fall below 0.05.
Analysis of Maximum Likelihood Estimates

| Parameter | DF | Estimate | Standard Error | Wald Chi-Square | Pr > ChiSq |
|---|---|---|---|---|---|
| Intercept | 1 | -11.0281 | 3.2990 | 11.1746 | 0.0008 |
| Supplier_Target_Years | 1 | 0.0976 | 1.5079 | 0.0042 | 0.9484 |
| Outside_Capital_Structure | 1 | 3.2529 | 1.3918 | 5.4624 | 0.0194 |
| Cash_Ratio | 1 | -7.6011 | 22.5496 | 0.1136 | 0.7361 |
| Capital_Tied_Up | 1 | 5.3388 | 2.7314 | 3.8205 | 0.0506 |
| Equity_Ratio | 1 | 2.6575 | 2.4262 | 1.1997 | 0.2734 |
| Cost_Income_Ratio | 1 | 2.6339 | 1.4102 | 3.4886 | 0.0618 |
| Trade_Payable_Ratio | 1 | -2.2046 | 1.5860 | 1.9321 | 0.1645 |
| Liabilities_Ratio | 1 | 7.0681 | 3.3923 | 4.3411 | 0.0372 |
| Liquidity_Ratio | 1 | 4.6822 | 22.7943 | 0.0422 | 0.8373 |
| Age | 1 | 0.2205 | 0.1934 | 1.2998 | 0.2542 |
The table below summarises the ability of the model to discriminate between counterparties that will and will not default on their loan agreement. A typical value to report is the concordance statistic, labelled c. A value of 0.8 indicates that, for 80% of all pairs consisting of one defaulting and one non-defaulting counterparty, the model assigns the higher predicted probability of default to the counterparty that actually defaulted.
Association of Predicted Probabilities and Observed Responses

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| Percent Concordant | 80 | Somers' D | 0.6 |
| Percent Discordant | 20 | Gamma | 0.6 |
| Percent Tied | 0 | Tau-a | 0.118 |
| Pairs | 9384 | c | 0.8 |
The Hosmer-Lemeshow goodness of fit (GOF) test is a way to assess whether there is evidence of lack of fit in a regression model. The results from proc logistic give an HL chi-square of 8.5784 with 8 df, yielding a p-value of 0.3791. As the p-value is greater than 0.05, we fail to reject the null hypothesis, i.e. there is no evidence of lack of fit.
5.3 Final Model
The beta estimates for the model are as follows:

| Beta | Parameter | Estimate |
|---|---|---|
| 0 | Intercept | -11.0281 |
| 1 | Supplier Target Years | 0.0976 |
| 2 | Outside Capital Structure | 3.2529 |
| 3 | Cash Ratio | -7.6011 |
| 4 | Capital Tied Up | 5.3388 |
| 5 | Equity Ratio | 2.6575 |
| 6 | Cost Income Ratio | 2.6339 |
| 7 | Trade Payable Ratio | -2.2046 |
| 8 | Liabilities Ratio | 7.0681 |
| 9 | Liquidity Ratio | 4.6822 |
| 10 | Age | 0.2205 |
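Substituting these estimates into the logit equation from section 5.1, the fitted model (with x_1 to x_10 as defined above) is:

$$\ln\!\left(\frac{\hat{P}}{1-\hat{P}}\right) = -11.0281 + 0.0976\,x_1 + 3.2529\,x_2 - 7.6011\,x_3 + 5.3388\,x_4 + 2.6575\,x_5 + 2.6339\,x_6 - 2.2046\,x_7 + 7.0681\,x_8 + 4.6822\,x_9 + 0.2205\,x_{10}$$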
5.4 PD Calculation
The SAS proc logistic procedure provides the OUTEST= option, which populates the betas into a SAS dataset called Default_Data_Est. The script below generates the counterparty PD calculation into a new dataset called Counteryparty_PD.
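The scoring script itself is not reproduced in this transcript. A minimal sketch, which hard-codes the betas from section 5.3 for readability rather than reading them from Default_Data_Est, could look as follows:

```sas
/* Score each counterparty with the fitted logit model. The coefficients below are
   the estimates from section 5.3; the original script presumably reads them from
   the OUTEST= dataset Default_Data_Est instead of hard-coding them. */
data counteryparty_pd;
   set crdr.default_data;
   xbeta = -11.0281
         + 0.0976 * Supplier_Target_Years
         + 3.2529 * Outside_Capital_Structure
         - 7.6011 * Cash_Ratio
         + 5.3388 * Capital_Tied_Up
         + 2.6575 * Equity_Ratio
         + 2.6339 * Cost_Income_Ratio
         - 2.2046 * Trade_Payable_Ratio
         + 7.0681 * Liabilities_Ratio
         + 4.6822 * Liquidity_Ratio
         + 0.2205 * Age;
   PD = exp(xbeta) / (1 + exp(xbeta));   /* inverse logit: probability of default */
   keep Counterparty_Id Default_Status PD;
run;
```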
6. Appendix A
6.1 Proc Logistic
Model Information
Data Set CRDR.DEFAULT_DATA
Response Variable Default_Status Default Status
Number of Response Levels 2
Model binary logit
Optimization Technique Fisher's scoring
Number of Observations Read 310
Number of Observations Used 310
Response Profile
Ordered Value Default_Status Total Frequency
1 Default 34
2 Non-Default 276
Probability modelled is Default_Status='Default'.
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Model Fit Statistics

| Criterion | Intercept Only | Intercept and Covariates |
|---|---|---|
| AIC | 216.421 | 197.049 |
| SC | 220.158 | 238.151 |
| -2 Log L | 214.421 | 175.049 |
R-Square 0.1193 Max-rescaled R-Square 0.2389
Testing Global Null Hypothesis: BETA=0

| Test | Chi-Square | DF | Pr > ChiSq |
|---|---|---|---|
| Likelihood Ratio | 39.3721 | 10 | <.0001 |
| Score | 37.6438 | 10 | <.0001 |
| Wald | 30.1195 | 10 | 0.0008 |