Machine Learning Application:
Credit Scoring
Programming Techniques
Professor Carlos Costa
Master in Mathematical Finance
Federico Innocenti 53251
Miguel Albergaria 48547
Claudio Napoli 53358
Iacopo Fiorentino 53315
Lisbon, December 11th, 2019
Context
► The data is collected from Thomson Reuters and covers firms
included in the main stock indexes.
► The goal is to assign each company a score, based on its
probability of default, to decide whether or not to grant that
firm a loan.
► To do so we compute many financial ratios and, in the end, we
want to “differentiate winners from losers”.
Data preparation
► Importing the data, checking the data types and
removing missing values;
► Computing the correlation matrix;
► Plotting graphs to see how the data is distributed;
► Cleaning the data by removing very low and very
high values, i.e., outliers.
► After all of that, we recomputed the correlation
matrix and redrew the graphs to compare them with
the originals and get a better view of our results.
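The preparation steps above can be sketched with pandas. This is only an illustration: the column names, the synthetic values and the 1st/99th-percentile outlier rule are assumptions, since the slides do not say how outliers were defined.

```python
import numpy as np
import pandas as pd

# Hypothetical ratio data standing in for the Thomson Reuters extract.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "current_ratio": rng.normal(1.5, 0.5, 200),
    "debt_ratio": rng.normal(0.5, 0.1, 200),
    "return_on_assets": rng.normal(0.05, 0.03, 200),
})
df.loc[0, "current_ratio"] = np.nan   # inject a missing value
df.loc[1, "debt_ratio"] = 50.0        # inject an extreme outlier


def prepare(data, lower=0.01, upper=0.99):
    """Drop rows with missing values, then drop rows with any value
    outside the [lower, upper] quantile range of its column (assumed rule)."""
    clean = data.dropna()
    lo = clean.quantile(lower)
    hi = clean.quantile(upper)
    inside = clean.ge(lo, axis=1) & clean.le(hi, axis=1)
    return clean[inside.all(axis=1)]


print(df.dtypes)              # check the type of each column
corr_before = df.corr()       # correlation matrix on the raw data
clean = prepare(df)
corr_after = clean.corr()     # correlation matrix after cleaning
print(len(df), "rows before,", len(clean), "rows after cleaning")
```

Comparing `corr_before` with `corr_after` mirrors the before/after comparison described in the last bullet.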
Modelling data
► Our data doesn't include a probability of default, so we need to create one.
► For the machine learning approach we use:
► Supervised learning: logistic regression and random forest
► Unsupervised learning: K-means clustering
► We decided to use a financial scorecard to assign a score to each of the
different ratios.
Setting the score
► Relevant ratios: current ratio, debt ratio, equity-to-assets ratio,
debt-to-equity ratio, return on assets, return on equity, long-term coverage
ratio and asset turnover ratio.
► A company's goal is to obtain the highest possible score, which we compute
ratio by ratio with the scorecard. An example of the code is shown here:
[code screenshot from the original slide not reproduced]
► The final score is the sum of all of the individual “ratio scores”.
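As an illustration of how such a scorecard could work, the sketch below maps each ratio to points through bands and sums the points. It is not the authors' code: the ratios chosen, the cut-offs and the points are all hypothetical.

```python
import bisect

# Hypothetical scorecard: ascending cut-offs per ratio and the points awarded
# for each band, so len(points) == len(cutoffs) + 1.
SCORECARD = {
    "current_ratio":    {"cutoffs": [1.0, 1.5, 2.0], "points": [0, 40, 80, 120]},
    "debt_ratio":       {"cutoffs": [0.4, 0.6, 0.8], "points": [120, 80, 40, 0]},
    "return_on_equity": {"cutoffs": [0.0, 0.1, 0.2], "points": [0, 40, 80, 120]},
}


def ratio_score(name, value):
    """Return the points for one ratio by locating the band its value falls in."""
    band = SCORECARD[name]
    idx = bisect.bisect_right(band["cutoffs"], value)
    return band["points"][idx]


def total_score(ratios):
    """Final score = sum of the individual ratio scores."""
    return sum(ratio_score(name, value) for name, value in ratios.items())


firm = {"current_ratio": 1.8, "debt_ratio": 0.35, "return_on_equity": 0.12}
print(total_score(firm))  # 80 + 120 + 80 = 280
```

Note that the debt ratio awards more points for lower values, while the other two ratios award more points for higher values.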
Evaluation
► To evaluate our models we compute a confusion matrix, which shows the
results at a glance and gives us a simple first metric for comparing the three
models.
► After setting the score we binarize it: 1 denotes the lowest probability
of default and 0 the highest. We chose a threshold of 500 points and
then proceeded to the evaluation.
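A minimal sketch of the binarization and the confusion matrix, assuming scikit-learn (the slides do not name the library). The scores and predictions are made up, and treating a score of exactly 500 as class 1 is an assumption.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

THRESHOLD = 500  # threshold stated in the slides


def binarize(scores, threshold=THRESHOLD):
    """1 = low probability of default, 0 = high probability of default.
    Assumption: a score equal to the threshold counts as 1."""
    return (np.asarray(scores) >= threshold).astype(int)


scores = [620, 480, 500, 310, 750]      # hypothetical scorecard totals
y_true = binarize(scores)               # [1, 0, 1, 0, 1]
y_pred = np.array([1, 1, 1, 0, 0])      # hypothetical model predictions

# Rows are actual classes, columns are predicted classes, in the order [0, 1].
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
print(cm)
```

Here the matrix is `[[1, 1], [1, 2]]`: one firm labelled 0 is wrongly predicted as 1, and one firm labelled 1 is wrongly predicted as 0.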
Logistic Regression
► We leave the logistic regression
at its default settings, with
a test size of 0.7.
► The final result is good, with an
AUC of 0.75, which means that
the model distinguishes the two
classes reasonably well.
► But there is a problem!
► The model makes type II errors: it
predicts 1 (low default risk) when the
actual class is 0 (high default risk),
i.e., it misses actual defaults.
► As a result, the F1 score (the
harmonic mean of precision and
recall) is 0.68.
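The setup described above could look as follows in scikit-learn, which is an assumption since the slides do not name the library. The data is synthetic, so the printed metrics will not match the 0.75 and 0.68 reported in the slides.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the eight ratios and the binarized score.
X, y = make_classification(n_samples=1000, n_features=8, n_informative=5,
                           random_state=0)

# test_size=0.7 as in the slides: 70% of the rows are held out for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.7, random_state=0)

model = LogisticRegression()          # default settings
model.fit(X_train, y_train)

proba = model.predict_proba(X_test)[:, 1]   # probability of class 1
pred = model.predict(X_test)

auc = roc_auc_score(y_test, proba)    # ranking quality across all thresholds
f1 = f1_score(y_test, pred)           # harmonic mean of precision and recall
print(f"AUC={auc:.2f}  F1={f1:.2f}")
```

AUC is computed from predicted probabilities, while F1 depends on the hard 0/1 predictions, which is why a model can have a good AUC and a weaker F1.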
Random Forest
► In order to optimize the process we set the “number of jobs” to 150 and the
“number of estimators” to 1, since it is a binomial classification.
► This model achieved a really high AUC (0.87) and a good F1 score.
► High precision and high recall mean a low probability of type I and type II errors.
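A hedged sketch of a random forest evaluation in scikit-learn (an assumed library). In scikit-learn, `n_estimators` is the number of trees and `n_jobs` the number of parallel workers. The values below are illustrative and are not the settings reported in the slides, and the data is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the ratio data.
X, y = make_classification(n_samples=1000, n_features=8, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.7, random_state=0)

# Illustrative settings: 100 trees, a single worker.
forest = RandomForestClassifier(n_estimators=100, n_jobs=1, random_state=0)
forest.fit(X_train, y_train)

pred = forest.predict(X_test)
proba = forest.predict_proba(X_test)[:, 1]

auc = roc_auc_score(y_test, proba)
precision = precision_score(y_test, pred)  # TP / (TP + FP): few false positives
recall = recall_score(y_test, pred)        # TP / (TP + FN): few false negatives
print(f"AUC={auc:.2f}  precision={precision:.2f}  recall={recall:.2f}")
```

High precision corresponds to few false positives (type I errors) and high recall to few false negatives (type II errors), which is the link drawn in the last bullet.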
K-Means
► We increased the number of iterations to 400 in order to optimize this
model and to try to get more stable results.
► The main problem with the K-means clustering model is that it suffers from
low precision when predicting the default cases (type I errors).
► On the other hand, it has an acceptable F1 score and an AUC of 0.80.
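A minimal sketch of using K-means as a two-class predictor, assuming scikit-learn and interpreting "number of iterations" as its `max_iter` parameter. The data is synthetic, and the step that matches cluster ids to classes is one possible choice, not necessarily what the authors did.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import precision_score
from sklearn.preprocessing import StandardScaler

# Synthetic two-group data standing in for the standardized ratios.
X, y = make_blobs(n_samples=400, centers=2, n_features=8, random_state=0)
X = StandardScaler().fit_transform(X)

km = KMeans(n_clusters=2, max_iter=400, n_init=10, random_state=0)
clusters = km.fit_predict(X)

# Cluster ids are arbitrary, so flip them if they disagree with the
# reference labels more often than they agree.
if (clusters == y).mean() < 0.5:
    clusters = 1 - clusters

# Precision for the default class (label 0 in the slides' convention).
precision_default = precision_score(y, clusters, pos_label=0)
print(f"precision on class 0: {precision_default:.2f}")
```

Because K-means never sees the labels during fitting, the labels are used here only to name the clusters and to evaluate the result.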
Conclusions
► Standardizing the ratios and cleaning the data allows all three models
to reach a high AUC.
► The best model is the Random Forest, with the highest AUC (0.87, against
0.80 for K-means and 0.75 for logistic regression).
► We confirm that machine learning algorithms are really powerful for analysing
data and can help to solve this specific problem.
