Construction of a robust
prediction model to forecast the
likelihood of a credit card holder
to experience payment defaults in
upcoming months.
Xi global resources
Group of company
March 28, 2024
Contents
01 Introduction
02 Resources
03 Methodology
04 Result and discussion
05 Recommendations
06 Conclusion
Overview
• Card payments are essential in digital commerce.
• Their prevalence reflects consumer preference and retailer confidence.
• eWallets offer a convenient alternative and have affirmed their status.
• Installment options are on the rise; nearly half of retailers accept them.
• Bank transfers and direct debits are less common.
• They have longer processing times and lower consumer demand.
Overview (cont.)
• A payment card is one of the best options for obtaining cash, and every year the traditional cash in the wallet is being displaced more and more by “plastic money”.
• Figure: Number of issued payment cards in 2009–Q2 2021 (in units) with adjusted linear trend and forecast (including 90% confidence interval).
Source: Development of the Payment Cards Market in Poland in the Era of the Covid-19 Pandemic. Available from: https://www.researchgate.net/publication/361429793_Development_of_the_Payment_Cards_Market_in_Poland_in_the_Era_of_the_Covid-19_Pandemic [accessed Mar 30 2024].
Introduction
Research Motivation and Significance of Study
• Build a logistic regression model to classify customers who will have a default payment next month apart from those who will not.
• Accurate prediction is vital for informed decision-making.
• The payment process significantly affects company finances.
• It also has an impact on customer relationships and credit risk.
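For reference, a logistic regression model of the kind named above is usually written as follows. The notation is an assumption made here and does not come from the slides: Y is the default indicator (1 = default next month), x_1, ..., x_p are the predictors, and beta_0, ..., beta_p are the coefficients to be estimated.

```latex
\[
P(Y = 1 \mid x_1, \dots, x_p)
  = \frac{1}{1 + \exp\!\bigl(-(\beta_0 + \sum_{j=1}^{p} \beta_j x_j)\bigr)}
\]
```

A customer is then classified as a likely defaulter when this probability exceeds a chosen cutoff; 0.5 is a common default, though the slides do not say which cutoff was used.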
Problem Statement
Resources
Question addressed
• Given historical data and demographic information, can a predictive model effectively estimate default payment with a high degree of accuracy?
Dataset Information
• The only resource provided is the dataset. It contains information on default payments, demographic factors, credit data, history of payment, and bill statements of credit card clients in Taiwan from April 2005 to September 2005. The dataset contains 25 variables.
Resources
Content
There are 25 variables:
1. ID: ID of each client
2. LIMIT_BAL: Amount of given credit in NT dollars (includes individual and family/supplementary credit)
3. SEX: Gender (1=male, 2=female)
4. EDUCATION: Education level (1=graduate school, 2=university, 3=high school, 4=others, 5=unknown, 6=unknown)
5. MARRIAGE: Marital status (1=married, 2=single, 3=others)
6. AGE: Age in years
7. PAY_0 and PAY_2 to PAY_6 (6 features): Repayment status from September (PAY_0) back to April (PAY_6), 2005 (-1=pay duly, 1=payment delay for one month, 2=payment delay for two months, … 8=payment delay for eight months, 9=payment delay for nine months and above)
8. BILL_AMT1 to BILL_AMT6: Amount of bill statement from September (BILL_AMT1) back to April (BILL_AMT6), 2005 (NT dollar)
9. PAY_AMT1 to PAY_AMT6: Amount of previous payment from September (PAY_AMT1) back to April (PAY_AMT6), 2005 (NT dollar)
10. default.payment.next.month: Default payment (1=yes, 0=no)
Methods
1. Labeling data: assigning descriptive labels to the coded values
2. Preliminary analysis (exploratory analysis):
   • Checking the data structure
   • Detecting missing values
   • Detecting outliers with boxplots
3. Visualization: Bar charts were used for qualitative variables, while boxplots and density plots were used for quantitative (continuous) variables.
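The preliminary steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not say which tool was used, the sample rows are invented, and the helper names (`label_row`, `count_missing`, `boxplot_outliers`) are hypothetical. Only the column names and category codes come from the variable list.

```python
from statistics import quantiles

# Invented mini-sample; only the column names and codes follow the deck.
rows = [
    {"SEX": 1, "EDUCATION": 2, "AGE": 24, "LIMIT_BAL": 20000.0},
    {"SEX": 2, "EDUCATION": 1, "AGE": 35, "LIMIT_BAL": 50000.0},
    {"SEX": 2, "EDUCATION": 3, "AGE": None, "LIMIT_BAL": 50000.0},
    {"SEX": 1, "EDUCATION": 2, "AGE": 41, "LIMIT_BAL": 90000.0},
    {"SEX": 2, "EDUCATION": 2, "AGE": 29, "LIMIT_BAL": 120000.0},
    {"SEX": 1, "EDUCATION": 1, "AGE": 52, "LIMIT_BAL": 140000.0},
    {"SEX": 2, "EDUCATION": 4, "AGE": 38, "LIMIT_BAL": 200000.0},
    {"SEX": 1, "EDUCATION": 1, "AGE": 47, "LIMIT_BAL": 1000000.0},
]

SEX_LABELS = {1: "male", 2: "female"}
EDUCATION_LABELS = {
    1: "graduate school", 2: "university", 3: "high school",
    4: "others", 5: "unknown", 6: "unknown",
}


def label_row(row):
    """Return a copy of the row with coded categories replaced by labels."""
    out = dict(row)
    out["SEX"] = SEX_LABELS.get(row["SEX"], "unknown")
    out["EDUCATION"] = EDUCATION_LABELS.get(row["EDUCATION"], "unknown")
    return out


def count_missing(rows, column):
    """Count rows where the column is absent or None."""
    return sum(1 for r in rows if r.get(column) is None)


def boxplot_outliers(values):
    """Values beyond the whiskers of a standard boxplot (1.5 * IQR rule)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


labeled = [label_row(r) for r in rows]
missing_age = count_missing(rows, "AGE")
outliers = boxplot_outliers([r["LIMIT_BAL"] for r in rows])
print(labeled[0]["SEX"], missing_age, outliers)
```

The 1.5 * IQR rule is the usual convention behind boxplot whiskers; whether the deck's boxplots used that exact rule is not stated.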
Methods
• Relationship: A correlation matrix was used to investigate relationships between variables.
• Data partitioning: The dataset was split 75:25 into training and test sets, so that the model is evaluated on data it was not fitted to and overfitting can be detected.
• Model training: The model was first trained with all features.
• Feature selection: A backward selection process was applied to remove insignificant features, with the stopping condition set at an alpha level of 0.05.
• Model evaluation: The accuracy of the model was estimated.
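The partitioning, training, and accuracy steps listed above can be sketched as follows. This is a minimal illustration, not the deck's implementation: the data are synthetic, the single `delay` feature is a stand-in for a repayment-status column, the model is fitted by plain gradient descent, and every function name is hypothetical. The backward-selection step is not shown.

```python
import math
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold out test_fraction of them (75:25 by default)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Batch gradient descent on the mean log-loss; returns [intercept, coefs...]."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi, start=1):
                grad[j] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict(w, xi, threshold=0.5):
    """Classify as default (1) when the predicted probability reaches the threshold."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
    return int(sigmoid(z) >= threshold)


# Synthetic data: larger "delay" values make default more likely.
rng = random.Random(42)
data = []
for _ in range(200):
    delay = rng.uniform(-2.0, 2.0)
    default = int(delay + rng.gauss(0.0, 0.2) > 0.0)
    data.append(([delay], default))

train, held_out = train_test_split(data)
w = fit_logistic([x for x, _ in train], [y for _, y in train])
acc = sum(predict(w, x) == y for x, y in held_out) / len(held_out)
print(f"train={len(train)} test={len(held_out)} accuracy={acc:.2f}")
```

Accuracy is computed only on the held-out 25%, which is what makes the split useful: a model that merely memorises the training rows would score well on them and poorly here.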
Structure of the dataset
Figure 1: Variables contained in the dataset, showing the total number of observations by variable type (integer or numeric)
Figure 2: Variables contained in the dataset, showing the percentage of values present (or missing) out of the total number of observations
Preliminary Analysis: Exploratory data analysis
Figure 3: Distribution of default payment next month
Figure 3: Distribution of gender by default payment next month
Figure 4: Distribution of marriage by default payment next month
Figure 5: Distribution of education by default payment next month
Figure 6: Distribution of repayment status in September, 2005 by default payment next month
Figure 7: Distribution of repayment status in August, 2005 by default payment next month
Figure 8: Distribution of repayment status in July, 2005 by default payment next month
Figure 9: Distribution of repayment status in June, 2005 by default payment next month
Figure 10: Distribution of repayment status in May, 2005 by default payment next month
Figure 11: Distribution of repayment status in April, 2005 by default payment next month
Figure 12: Distribution of age by default payment next month
Figure 12: Distribution of amount of given credit bill by default payment next month
Model evaluation
Figure 31: Confusion matrix showing the counts of true positive (TP), true negative (TN), false positive (FP), and false negative (FN) predictions made by the model on a dataset

Model            Accuracy   AUC
Trained model    80.84 %    0.73
Retrained model  80.91 %    0.72
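As a reminder of how the two reported metrics are defined, the sketch below computes the confusion-matrix counts, accuracy, and AUC for a small invented example. The labels, scores, 0.5 threshold, and function names are assumptions for illustration and have no connection to the figures in the table.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels where 1 = default."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Share of all predictions that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented labels and predicted probabilities.
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
scores = [0.1, 0.2, 0.15, 0.3, 0.6, 0.25, 0.05, 0.7, 0.4, 0.2]
y_pred = [int(s >= 0.5) for s in scores]

counts = confusion_counts(y_true, y_pred)
acc = accuracy(*counts)
area = auc(y_true, scores)
print(counts, acc, round(area, 3))
```

Accuracy depends on the chosen threshold and on how common defaults are, while AUC summarises ranking quality across all thresholds, which is one reason to read the two columns together.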
Recommendation
• Significant features: The bill amounts (e.g., BILL_AMT1, BILL_AMT3) and previous payment amounts (e.g., PAY_AMT1, PAY_AMT2) play a significant role in predicting default.
• Payment status is important: It is crucial for the business to closely monitor customers' payment behavior, especially when there are signs of payment delays or defaults.
• Customer segmentation: Use the insights from the model to segment customers based on their risk profiles.
• Customer assistance program: Implement customer assistance programs or financial counseling services to support customers experiencing financial difficulties.
Conclusion
• In conclusion, the model provides valuable insights into the factors influencing default payment next month in the dataset.
• The analysis highlights the significance of payment status, age, bill amounts, and previous payments in predicting default.
• By closely monitoring these factors and adapting strategies accordingly, businesses can better manage default risks and improve their financial stability.
THANK YOU!
