Study on Application of
Ensemble learning on
Credit Scoring
Ce Chen
Hokkaido University
Graduate School of Information Science and Technology
Laboratory of Harmonious Systems Engineering
Background
• Increase in financing opportunities due to the
development of Fintech
– Many people who have not received loans until now will be eligible
for loans.
– More and more people are involved in credit.
• Growing needs for “credit scoring”
– We need an effective way to judge whether an individual's
credit is good or bad.
– Judging individual credit correctly can effectively reduce
creditors' losses.
Problems of Credit Scoring
• Few understandable models
– Models whose credit scoring calculation process is not
known are difficult to use in actual operation
• Multiple evaluation indices
– Improve accuracy, minimize loss amount, etc.
• Imbalanced data in credit samples
– Good credit samples outnumber bad credit samples.
• Machine learning is needed to tackle the problem
Purpose
• Purpose
– Building a model that can better deal with credit scoring
problems with machine learning.
• Approach
– Credit scoring as classification problem
– Ensemble Learning : XGBoost
• High accuracy reported in previous research
• Explainable criteria based on decision tree
• Cost-sensitive evaluation indicators
– EasyEnsemble and Focal loss for imbalanced data
Credit Scoring as Classification Problem
• Use machine learning to solve credit scoring
Training data (past debtors):
X11, X12, X13, ..., Y1
X21, X22, X23, ..., Y2
...
Xn1, Xn2, Xn3, ..., Yn
→ scoring model
Input: Xm1, Xm2, Xm3, ... (new customer's features)
→ scoring model
Output: probability of Y (new customer's credit)
If probability of Y > threshold: bad credit
If probability of Y <= threshold: good credit
X: features of an individual
Y: credit of an individual
Y=1: bad credit
Y=0: good credit
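The principle of XGBoost
• Objective = cost function + regularization
$obj(\theta) = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)$
– $obj(\theta)$: objective
– $\theta$: the coefficients in front of x
– $\sum_{i=1}^{n} l(y_i, \hat{y}_i)$: cost function
– $n$: number of samples
– $y_i$: real value of y
– $\hat{y}_i$: predicted value of y
– $\sum_{k=1}^{K} \Omega(f_k)$: regularization
– $K$: number of decision trees
– $\Omega(f_k)$: the complexity of decision tree k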
The principle of XGBoost
• $\hat{y}_i^{(t)}$: the predicted result of round t
• $f_t(x_i)$: the predicted result of the current tree
• After round t of training: $\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i)$, and $y_{predict} = \hat{y}_i^{(t)}$
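The principle of XGBoost
• Difference between XGBoost and Gradient Boost: XGBoost uses a
second-order Taylor expansion of the cost function
$obj^{(t)} \approx \sum_{i=1}^{n} \left[ l(y_i, \hat{y}_i^{(t-1)}) + g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \right] + \Omega(f_t)$
– $g_i$: first derivative of the cost function with respect to $\hat{y}_i^{(t-1)}$
– $h_i$: second derivative of the cost function with respect to $\hat{y}_i^{(t-1)}$
• The objective depends only on the first and second
derivatives of the cost function, which makes it easier to
calculate for complex cost functions.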
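The second-order expansion can be checked numerically. The following is a minimal sketch, assuming the log loss as the cost function and a raw score z with p = sigmoid(z); the function names are illustrative and do not come from the slides.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logloss(y, z):
    """Log loss of one sample as a function of the raw score z."""
    p = sigmoid(z)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def grad_hess(y, z):
    """First and second derivatives of the log loss with respect to z."""
    p = sigmoid(z)
    return p - y, p * (1.0 - p)

def taylor_approx(y, z, step):
    """Second-order Taylor approximation of logloss(y, z + step)."""
    g, h = grad_hess(y, z)
    return logloss(y, z) + g * step + 0.5 * h * step ** 2

# Compare the exact loss after adding a small tree output with its approximation.
exact = logloss(1, 0.3 + 0.1)
approx = taylor_approx(1, 0.3, 0.1)
print(f"exact={exact:.6f} approx={approx:.6f}")
```

Only g and h enter the approximation, which is why a different cost function can be swapped in by supplying its two derivatives.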
XGBoost in Previous Research
• Datasets:
Australian, Japanese
• Models:
k-Nearest Neighbor, Logistic Regression, Linear Discriminant
Analysis, Support Vector Machine, Decision Tree, Random Forest,
Gradient Boost Decision Tree, AdaBoost, XGBoost
• Result:
In the comparison of these models, XGBoost has the higher accuracy
He, H., Zhang, W., & Zhang, S. (2018). A novel ensemble
method for credit scoring: Adaption of different
imbalance ratios. Expert Systems with Applications, 98,
105–117.
Assessment method (AUC)
              predict_true  predict_false
label_true    TP            FN
label_false   FP            TN
false positive rate = FP / (FP + TN)
true positive rate = TP / (TP + FN)
AUC: size of the blue area (the area under the ROC curve)
In the figure, (0,1) is the best case, in which all samples are
separated correctly. The closer the blue line is to (0,1), the
more accurate the model.
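The two rates and the AUC can be computed with a few lines of code. This is a minimal sketch: the confusion counts and scores are made-up illustration values, and the AUC is computed by the pairwise-ranking definition (the fraction of positive/negative pairs in which the positive sample receives the higher score, ties counting half), which equals the area under the ROC curve.

```python
def rates(tp, fn, fp, tn):
    """Return (true positive rate, false positive rate) from confusion counts."""
    return tp / (tp + fn), fp / (fp + tn)

def auc(labels, scores):
    """AUC as the probability that a positive sample outscores a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical confusion counts and scores, for illustration only.
tpr, fpr = rates(tp=30, fn=10, fp=5, tn=55)
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(tpr, fpr, toy_auc)
```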
Assessment method (cost sensitive)
• For the credit scoring problem, the total cost value alone is
enough to judge whether the model is good or bad.
• Cost sensitive:
• 1. Total cost: $C_{FP} L_1 + C_{FN} L_2$
• 2. Normalized cost: $\frac{1}{C_{FN} + C_{FP}} (C_{FP} L_1 + C_{FN} L_2)$
• 3. EMC: $\frac{1}{Total} (C_{FP} L_1 + C_{FN} L_2)$
      amount  cost
FP    $L_1$   $C_{FP}$
FN    $L_2$   $C_{FN}$
Total: number of test samples
Assessment method (cost sensitive)
Total cost, $C_{FP} L_1 + C_{FN} L_2$:
          German  Australian
LR        177.8   51
DT        191.6   86
RF        190.8   52.6
XGBoost   165.2   43
Normalized cost, $\frac{1}{C_{FN} + C_{FP}} (C_{FP} L_1 + C_{FN} L_2)$:
          German  Australian
LR        29.6    8.8
DT        30.5    14.2
RF        31.8    9.1
XGBoost   27.5    8.6
EMC, $\frac{1}{Total} (C_{FP} L_1 + C_{FN} L_2)$:
          German  Australian
LR        0.875   0.393
DT        0.976   0.620
RF        0.954   0.394
XGBoost   0.813   0.381
The EMC is the average cost of a sample, so a change in the
number of test samples does not affect the result.
LR: Logistic Regression
DT: Decision Tree
RF: Random Forest
West, D. (2000). Neural network credit
scoring models. Computers & Operations
Research, 27(11-12), 1131–1152.
Assessment method (cost sensitive)
• The values of $C_{FN}$ and $C_{FP}$
– $C_{FN}$ is usually set to 1, and $C_{FP}$ is set to a constant $n$
that is greater than 1.
• In many papers, $C_{FN} = 1$, $C_{FP} = 5$
– Ting, K. M. Inducing cost-sensitive trees via instance weighting.
Lecture Notes in Computer Science, 1998
– Elkan, C. The foundations of cost-sensitive learning. International
Joint Conference on Artificial Intelligence, 2001
– The contributor of the German dataset suggests $C_{FN} = 1$, $C_{FP} = 5$
• In this paper, $C_{FN} = 1$, $C_{FP} = 5$
Proposed Methods
1. EasyEnsemble + XGBoost
– EasyEnsemble as the resampling technique, XGBoost as the
base model.
2. Change the structure of XGBoost
2.1. Customizing the evaluation metric
• EMC cost formula as the evaluation metric
• weight (parameter of XGBoost) + threshold (parameter in the cost
formula)
2.2. Customizing the cost function (Focal loss)
• EMC cost formula as the evaluation metric
• Focal loss as the cost function.
Experiment Setting
• 5-fold cross-validation
• XGBoost
– XGBoost module in Python
• Tuning method
– Grid search (two rounds of tuning on the important parameters,
to avoid getting stuck in a local optimum)
• number of boosting rounds → eta (learning rate) → max depth, min child weight →
subsample, colsample bytree → eta (learning rate) → number of boosting rounds
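The staged tuning order can be sketched as a loop over small grids, where each stage keeps the values chosen by the earlier stages. This is a minimal sketch: `toy_cost` is a hypothetical stand-in for the cross-validated cost of an XGBoost model, `n_rounds` is an illustrative key for the number of boosting rounds, and the grid values are made up.

```python
from itertools import product

def staged_grid_search(score, start, stages):
    """Tune one group of parameters per stage, minimizing `score`."""
    best = dict(start)
    for grid in stages:
        names = list(grid)
        trials = []
        for values in product(*(grid[name] for name in names)):
            params = {**best, **dict(zip(names, values))}
            trials.append((score(params), values))
        _, best_values = min(trials, key=lambda t: t[0])
        best.update(zip(names, best_values))
    return best

def toy_cost(p):
    """Hypothetical stand-in for a 5-fold cross-validated cost."""
    return ((p["eta"] - 0.1) ** 2 + (p["max_depth"] - 4) ** 2
            + (p["min_child_weight"] - 3) ** 2 + (p["subsample"] - 0.8) ** 2
            + (p["colsample_bytree"] - 0.7) ** 2
            + ((p["n_rounds"] - 200) / 100) ** 2)

start = {"n_rounds": 100, "eta": 0.3, "max_depth": 6,
         "min_child_weight": 1, "subsample": 1.0, "colsample_bytree": 1.0}
stages = [
    {"n_rounds": [100, 200, 300]},
    {"eta": [0.05, 0.1, 0.3]},
    {"max_depth": [3, 4, 5], "min_child_weight": [1, 3, 5]},
    {"subsample": [0.6, 0.8, 1.0], "colsample_bytree": [0.5, 0.7, 0.9]},
    {"eta": [0.05, 0.1, 0.3]},           # second round for eta
    {"n_rounds": [100, 200, 300]},       # second round for boosting rounds
]
best = staged_grid_search(toy_cost, start, stages)
print(best)
```

Revisiting eta and the number of boosting rounds at the end lets the two most influential parameters adapt to the tree and sampling settings chosen in between.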
Data Set for Credit Scoring
• Data on credit scoring
– the amount of public data is small
• Data used in previous research
Datasets    Samples  Features  Good/Bad
German      1000     24        700/300
Australian  690      14        307/383
Taiwan      30000    23        23364/6636
Qianhai     40000    491       34737/5263
Japanese    690      15        307/383
Data Set for Credit Scoring
• Introduction of data set
• The first method
German,Australian,Taiwan,Qianhai
• The second method
German,Taiwan
Datasets    Samples  Features  Good/Bad
German      1000     24        700/300
Australian  690      14        307/383
Taiwan      30000    23        23364/6636
Qianhai     40000    491       34737/5263
Method 1 (EasyEnsemble)
Purpose: increase the sensitivity to minority samples and reduce cost without reducing AUC,
that is, reduce creditors' losses without reducing customer satisfaction.
Example samples (X1, X2, Y): (1,1,0), (2,2,0), (3,3,0), (4,4,1), (5,5,1)
A (majority class, Y=0): (1,1,0), (2,2,0), (3,3,0)
Subsets A1, A2, A3 of A: {(1,1,0), (2,2,0)}, {(2,2,0), (3,3,0)}, {(3,3,0), (1,1,0)}
B (minority class, Y=1): (4,4,1), (5,5,1)
The samples are divided into two classes according to the value of Y. The majority
class (good credit) is divided into several small groups, each of the same
size as the minority class (bad credit).
Method 1 (EasyEnsemble)
[Figure: an AdaBoost model is trained on each of A1+B, A2+B and A3+B, and their outputs are combined to predict Class (Y)]
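The resampling step of EasyEnsemble can be sketched as follows. This is a minimal sketch, assuming the label is the last element of each sample and that label 0 is the majority class, as in the toy example; the function name is illustrative and the base learners are left out.

```python
import random

def easy_ensemble_subsets(samples, n_subsets, seed=0):
    """Build n_subsets balanced training sets: a random majority subset + all minority samples."""
    rng = random.Random(seed)
    majority = [s for s in samples if s[-1] == 0]
    minority = [s for s in samples if s[-1] == 1]
    subsets = []
    for _ in range(n_subsets):
        # Draw as many majority samples as there are minority samples.
        drawn = rng.sample(majority, len(minority))
        subsets.append(drawn + minority)
    return subsets

# Toy samples (X1, X2, Y) from the slide.
data = [(1, 1, 0), (2, 2, 0), (3, 3, 0), (4, 4, 1), (5, 5, 1)]
for subset in easy_ensemble_subsets(data, n_subsets=3):
    print(subset)
```

Each subset contains only original samples, and across subsets most of the majority class is eventually used, which is the contrast with plain undersampling.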
Experiment (EasyEnsemble)
AUC by resampling method:
Resampling         German dataset  Taiwan dataset
Origin (AdaBoost)  0.750           0.763
OverSampling       0.751           0.758
SMOTE              0.733           0.719
UnderSampling      0.742           0.759
EasyEnsemble       0.771           0.776
• Oversampling easily leads to overfitting.
• Undersampling easily leads to underfitting.
• SMOTE: uses manually generated data.
• EasyEnsemble: all data is original data.
Experiment Setting (EasyEnsemble)
[Figure: as in Method 1, the majority class A is divided into subsets A1, A2, A3; in addition, the minority class B = {(4,4,1), (5,5,1)} is resampled into B1 = {(4,4,1), (5,5,1)} and B2 = {(4,4,1), (4,4,1)}]
Bootstrap method: sampling with replacement (each sample is put back after it is extracted)
Experiment Setting (EasyEnsemble)
Ensemble learning
• Resampling: EasyEnsemble
• Base model: XGBoost, one model trained on each of A1+B1, A2+B2, ..., An+Bn
• Each base model i outputs $P_i(Y=1)$, $P_i(Y=0)$ and feature importance(i)
• n: number of base models
• Parameter colsample bytree (column sampling: the proportion of
features selected) is adjusted within a range of 0.1
• Output (simple average method):
– probability of Y: $P(Y=1) = \frac{1}{n} \sum_{i=1}^{n} P_i(Y=1)$, $P(Y=0) = \frac{1}{n} \sum_{i=1}^{n} P_i(Y=0)$
– feature importance: $\text{feature importance} = \frac{1}{n} \sum_{i=1}^{n} \text{feature importance}_i$
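The bootstrap of the minority class and the simple-average output can be sketched as follows. This is a minimal sketch: the base models are not trained here, and the per-model probabilities and feature importances are hypothetical values standing in for the outputs of n = 3 XGBoost models.

```python
import random

def bootstrap(samples, rng):
    """Draw len(samples) items with replacement."""
    return [rng.choice(samples) for _ in samples]

def simple_average(rows):
    """Element-wise mean over equally long rows (one row per base model)."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]

# Minority class B from the toy example, resampled with replacement.
minority = [(4, 4, 1), (5, 5, 1)]
b1 = bootstrap(minority, random.Random(0))

# Hypothetical outputs of three base models for two customers: P_i(Y=1).
probs = [[0.2, 0.9], [0.4, 0.7], [0.3, 0.8]]
# Hypothetical feature importances of the same models for two features.
importances = [[0.5, 0.5], [0.7, 0.3], [0.6, 0.4]]

avg_prob = simple_average(probs)
avg_importance = simple_average(importances)
print(b1, avg_prob, avg_importance)
```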
Outcome (EasyEnsemble)
AUC:
                      German  Australian  Taiwan  Qianhai
EasyEnsemble          0.77    0.88        0.70    0.65
XGBoost               0.81    0.95        0.78    0.71
XGBoost_EasyEnsemble  0.82    0.95        0.78    0.72
Cost (EMC):
                      German  Australian  Taiwan  Qianhai
XGBoost               0.813   0.381       0.741   0.713
XGBoost_EasyEnsemble  0.578   0.343       0.556   0.541
• Cost is reduced without reducing AUC.
• XGBoost is optimized without losing accuracy.
Method 2(Change structure)
• Purpose:
– Customize the evaluation metric to get the minimum cost
– Reduce cost without considering AUC
– The only objective is to reduce loss
• Evaluation metric
– Plays no direct role in optimizing or training the model
– Stops training once the model stops improving
– Example: train with the logloss objective and use an AUC
metric to evaluate the model.
• Cost function
– The function that drives training
– It is what needs to be optimized
Method 2.1 (Evaluation metric)
• Weight (parameter of XGBoost)
– Adjusts the weight of the minority class. When the cost reaches its lowest,
the value of weight cannot be obtained.
– $grad_{new} = weight \times grad_{origin}$ (grad: first derivative)
– $hess_{new} = weight \times hess_{origin}$ (hess: second derivative)
• Evaluation metric (parameter of XGBoost)
– By customizing the evaluation metric, we can minimize the cost of
the model
Experiment Setting (Evaluation metric)
2.1 Customizing the evaluation metric
• Weight
Adjust the weight of the minority class
– $grad_{new} = weight \times grad_{origin}$
– $hess_{new} = weight \times hess_{origin}$
• Add a threshold to the evaluation metric
– default threshold = 0.5
– If probability of y > threshold, predicted value of y = 1
– If probability of y <= threshold, predicted value of y = 0
– $EMC = \frac{1}{Total} (C_{FP} L_1 + C_{FN} L_2)$
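The two knobs of this setting, the minority-class weight and the decision threshold, can be sketched in plain Python. This is a minimal sketch with illustrative function names and made-up probabilities; it assumes that the cost of 5 applies to bad credit (y = 1) predicted as good and the cost of 1 to good credit (y = 0) predicted as bad, which is the assignment implied by the confusion tables later in the deck.

```python
def weighted_grad_hess(grad, hess, labels, weight):
    """Scale the derivatives of minority-class samples (y = 1) by `weight`."""
    new_grad = [g * weight if y == 1 else g for g, y in zip(grad, labels)]
    new_hess = [h * weight if y == 1 else h for h, y in zip(hess, labels)]
    return new_grad, new_hess

def emc_at_threshold(probs, labels, threshold=0.5,
                     cost_bad_as_good=5.0, cost_good_as_bad=1.0):
    """Average misclassification cost when y is predicted as 1 if prob > threshold."""
    cost = 0.0
    for p, y in zip(probs, labels):
        pred = 1 if p > threshold else 0
        if y == 1 and pred == 0:
            cost += cost_bad_as_good
        elif y == 0 and pred == 1:
            cost += cost_good_as_bad
    return cost / len(labels)

# Hypothetical predicted probabilities of bad credit and true labels.
probs = [0.1, 0.3, 0.45, 0.6, 0.35, 0.8]
labels = [0, 0, 0, 0, 1, 1]
thresholds = [0.2, 0.3, 0.4, 0.5]
best_threshold = min(thresholds, key=lambda t: emc_at_threshold(probs, labels, t))
print(best_threshold, emc_at_threshold(probs, labels, best_threshold))
```

In this toy case a threshold below the default 0.5 gives the lowest cost, because missing a bad-credit sample is five times as expensive as rejecting a good one.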
Outcome (Evaluation metric)
EMC (n=5):
                                      German  Taiwan
XGBoost                               0.813   0.742
XGBoost_Customized Evaluation_metric  0.565   0.553
Range of weight (1, 10, interval=1)
Range of threshold (0.2, 0.8, interval=0.05)
The best parameters
• German: weight=5, threshold=0.5
• Taiwan: weight=3, threshold=0.4
Method 2.2 (Focal loss)
• Log loss:
$logloss(y, x) = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log(p(x_i)) + (1 - y_i) \log(1 - p(x_i)) \right]$
• Focal loss:
$FL = -\sum_{i=1}^{m} \left[ \alpha\, y_i (1 - p(x_i))^{\gamma} \log(p(x_i)) + (1 - y_i)\, p(x_i)^{\gamma} \log(1 - p(x_i)) \right]$
• In focal loss, the factors built from $p(x_i)$ and $(1 - p(x_i))$:
• Reduce the weight of easy negatives (easy examples)
• Increase the weight of hard negatives (hard
examples)
• Increase the sensitivity to the minority class
Lin, T.-Y., Goyal, P., Girshick, R., He, K., & Dollar, P. (2018). Focal loss for dense object detection.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 1–1.
Method 2.2 (Focal loss)
• Credit scoring
– Features of good credit and bad credit are very different.
– Most good credit samples are easy negative samples.
– Only a few samples have features similar to those of bad credit samples.
• The cost function needs its first and second derivatives, taken here with
respect to the raw score $z_i$, where $p_i = p(x_i) = 1 / (1 + e^{-z_i})$
– First derivative:
$g_i = y_i (1 - p_i)^{\gamma} (\gamma p_i \log p_i + p_i - 1) + (1 - y_i)\, p_i^{\gamma} (p_i - \gamma (1 - p_i) \log(1 - p_i))$
– Second derivative:
$h_i = y_i\, p_i (1 - p_i)^{\gamma} \left[ \gamma (1 - p_i - \gamma p_i) \log p_i + (2\gamma + 1)(1 - p_i) \right] + (1 - y_i)(1 - p_i)\, p_i^{\gamma} \left[ \gamma (p_i + \gamma p_i - \gamma) \log(1 - p_i) + (2\gamma + 1)\, p_i \right]$
• Weight (same function as alpha, so alpha is ignored in the derivatives)
– $grad_{new} = weight \times grad_{origin}$
– $hess_{new} = weight \times hess_{origin}$
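The focal loss and its derivatives can be written out and checked against finite differences. This is a minimal sketch under stated assumptions: the per-sample loss is taken as -[alpha * y * (1-p)^gamma * log(p) + (1-y) * p^gamma * log(1-p)] with p = sigmoid(z), the derivatives are taken with respect to the raw score z, and alpha scales only the y = 1 term; the function names are illustrative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def focal_loss(y, z, alpha=1.0, gamma=0.0):
    """Per-sample focal loss as a function of the raw score z."""
    p = sigmoid(z)
    return -(alpha * y * (1.0 - p) ** gamma * math.log(p)
             + (1 - y) * p ** gamma * math.log(1.0 - p))

def focal_grad_hess(y, z, alpha=1.0, gamma=0.0):
    """First and second derivatives of the focal loss with respect to z."""
    p = sigmoid(z)
    q = 1.0 - p
    grad_pos = q ** gamma * (gamma * p * math.log(p) + p - 1.0)
    grad_neg = p ** gamma * (p - gamma * q * math.log(q))
    hess_pos = p * q ** gamma * (gamma * (1.0 - p - gamma * p) * math.log(p)
                                 + (2.0 * gamma + 1.0) * q)
    hess_neg = q * p ** gamma * (gamma * (p + gamma * p - gamma) * math.log(q)
                                 + (2.0 * gamma + 1.0) * p)
    grad = alpha * y * grad_pos + (1 - y) * grad_neg
    hess = alpha * y * hess_pos + (1 - y) * hess_neg
    return grad, hess

# With alpha = 1 and gamma = 0 the derivatives reduce to those of the log loss.
g0, h0 = focal_grad_hess(1, 0.4)
p0 = sigmoid(0.4)
print(g0, p0 - 1, h0, p0 * (1 - p0))
```

A pair of functions like these is what a gradient-boosting library needs from a custom cost function: the gradient and Hessian per sample, evaluated at the current raw prediction.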
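Experiment Setting (Focal loss)
$FL = -\sum_{i=1}^{m} \left[ \alpha\, y_i (1 - p(x_i))^{\gamma} \log(p(x_i)) + (1 - y_i)\, p(x_i)^{\gamma} \log(1 - p(x_i)) \right]$
$logloss(y, x) = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log(p(x_i)) + (1 - y_i) \log(1 - p(x_i)) \right]$
When α=1 and γ=0, FL reduces to the log loss (up to the constant factor 1/N):
$FL(\alpha = 1, \gamma = 0) = logloss$
Therefore the defaults are:
• default α = 1
• default γ = 0
• default threshold = 0.5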
Outcome (Focal loss)
Cost by threshold (alpha=1, gamma=0):
threshold  German dataset  Taiwan dataset
0.05       0.567           0.741
0.1        0.539           0.612
0.2        0.545           0.552
0.3        0.614           0.616
0.4        0.693           0.686
Cost by gamma (alpha=1, threshold=0.5):
gamma  German dataset  Taiwan dataset
0      0.813           0.742
0.5    0.797           0.741
1      0.800           0.747
2      0.822           0.741
3      0.817           0.745
4      0.832           0.746
Cost by alpha (gamma=0, threshold=0.5):
alpha  German dataset  Taiwan dataset
1      0.813           0.742
2      0.671           0.641
3      0.636           0.581
4      0.582           0.553
5      0.567           0.554
6      0.575           0.557
Range
alpha (0, 10, interval=1)
gamma (0, 5, interval=0.5)
threshold (0, 0.8, interval=0.05)
The best parameters
• German: alpha=2, threshold=0.3, gamma=1.5, lowest cost=0.532
• Taiwan: alpha=1, threshold=0.4, gamma=3, lowest cost=0.547
Outcome (Focal loss)
German dataset, EMC (n=5):
XGBoost        0.813
XGBoost_focal  0.532
Number of FP, FN in XGBoost:
           predict good  predict bad
true good  124           15
true bad   29            30
Number of FP, FN in XGBoost_focal:
           predict good  predict bad
true good  126           53
true bad   12            27
Saved cost: 47
The total cost = $C_{FP} L_1 + C_{FN} L_2$, with $C_{FN} = 1$, $C_{FP} = 5$
Outcome (Focal loss)
Taiwan dataset, EMC (n=5):
XGBoost        0.742
XGBoost_focal  0.547
Number of FP, FN in XGBoost:
           predict good  predict bad
true good  4445          227
true bad   844           482
Number of FP, FN in XGBoost_focal:
           predict good  predict bad
true good  3397          1275
true bad   402           924
Cost based on the loan amount, where $X_1$ of each sample is the amount of the loan:
$C_{FN} = 20\% \times$ amount of $X_1$
$C_{FP} =$ amount of $X_1$
Total cost of FP ($L_1$), FN ($L_2$) in XGBoost_focal:
           predict good  predict bad
true good  0             1680
true bad   74000         0
Total cost of FP ($L_1$), FN ($L_2$) in XGBoost:
           predict good  predict bad
true good  0             2880
true bad   158000        0
Saved cost: 85200 (New Taiwan Dollar)
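The reported figures can be re-derived from the confusion tables. This is a minimal worked check, assuming the cost of 5 applies to the "true bad, predict good" cell and the cost of 1 to the "true good, predict bad" cell; under that assumption the German saved cost of 47 is reproduced exactly and the Taiwan EMC values to within rounding.

```python
def total_cost(bad_as_good, good_as_bad, cost_bad_as_good=5, cost_good_as_bad=1):
    """Total misclassification cost from the two off-diagonal counts."""
    return cost_bad_as_good * bad_as_good + cost_good_as_bad * good_as_bad

# German dataset: counts taken from the two confusion tables.
german_xgb = total_cost(bad_as_good=29, good_as_bad=15)
german_focal = total_cost(bad_as_good=12, good_as_bad=53)
german_saved = german_xgb - german_focal

# Taiwan dataset: EMC = total cost / number of test samples.
taiwan_n = 4445 + 227 + 844 + 482
taiwan_xgb_emc = total_cost(bad_as_good=844, good_as_bad=227) / taiwan_n
taiwan_focal_emc = total_cost(bad_as_good=402, good_as_bad=1275) / taiwan_n

# Taiwan loan-amount costs: saved cost from the two cost tables.
taiwan_saved = (2880 + 158000) - (1680 + 74000)

print(german_saved, round(taiwan_xgb_emc, 3), round(taiwan_focal_emc, 3), taiwan_saved)
```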
Discussion
1. EasyEnsemble + XGBoost
– All the data are original data
– Error = bias + variance
– $\sigma^2$ is the variance, $\rho$ is the correlation coefficient
– $n$ is the number of base models
– Increasing the difference between the base models reduces the
variance and thus the error
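The effect of base-model diversity on the variance can be illustrated numerically. This is a minimal sketch, assuming the standard result for the average of n identically distributed outputs with variance sigma^2 and pairwise correlation rho, Var = rho * sigma^2 + (1 - rho) * sigma^2 / n; the function name is illustrative.

```python
def ensemble_variance(sigma2, rho, n):
    """Variance of the mean of n equally correlated outputs with variance sigma2."""
    return rho * sigma2 + (1.0 - rho) * sigma2 / n

# Lower correlation between base models gives a lower ensemble variance.
for rho in (0.9, 0.5, 0.1):
    print(rho, ensemble_variance(1.0, rho, n=10))
```

With fully correlated models (rho = 1) averaging brings no reduction, while with uncorrelated models (rho = 0) the variance falls to sigma^2 / n.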
$\mathrm{Var}\left(\frac{1}{n} \sum_{i=1}^{n} X_i\right) = \rho\sigma^2 + \frac{1 - \rho}{n}\sigma^2$
[Equation: comparison of $\mathrm{Var}\left(\frac{1}{n} \sum_{i=1}^{n} X_i\right)$, $\mathrm{Var}\left(\frac{1}{n} \sum_{i=1}^{n} P_i\right)$ and $\mathrm{Var}\left(\frac{1}{n} \sum_{i=1}^{n} D_i\right)$]
Discussion
2. Change the structure of XGBoost
2.1. Customizing the evaluation metric
• A clear target is set: the model stops training when the
value of the evaluation metric is optimal
2.2. Customizing the cost function (Focal loss)
• Increases the sensitivity to minority samples.
• Distinguishes between samples that are difficult and easy to classify.
Conclusion
• Growing needs for “credit scoring”
• Problem
– Understandable
– Imbalanced data
– Cost sensitive
• Solution
– Resampling (EasyEnsemble)
– Changing structure (customized evaluation metric and cost function)
• Outcome
– Reducing cost
– Reducing creditors' losses
Research performance
・Information Processing Society of Japan
1) Ce Chen, Soichiro Yokoyama, Tomohisa Yamashita, Hidenori Kawamura: Application of XGBoost to credit
scoring, Special Interest Groups (SIG), Vol. 194, Hokkaido (2019)
 
DLゼミ: ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation
DLゼミ: ViTPose: Simple Vision Transformer Baselines for Human Pose EstimationDLゼミ: ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation
DLゼミ: ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation
 
Voyager: An Open-Ended Embodied Agent with Large Language Models
Voyager: An Open-Ended Embodied Agent with Large Language ModelsVoyager: An Open-Ended Embodied Agent with Large Language Models
Voyager: An Open-Ended Embodied Agent with Large Language Models
 
DLゼミ: Ego-Body Pose Estimation via Ego-Head Pose Estimation
DLゼミ: Ego-Body Pose Estimation via Ego-Head Pose EstimationDLゼミ: Ego-Body Pose Estimation via Ego-Head Pose Estimation
DLゼミ: Ego-Body Pose Estimation via Ego-Head Pose Estimation
 
ReAct: Synergizing Reasoning and Acting in Language Models
ReAct: Synergizing Reasoning and Acting in Language ModelsReAct: Synergizing Reasoning and Acting in Language Models
ReAct: Synergizing Reasoning and Acting in Language Models
 
形態素解析を用いた帝国議会議事速記録の変遷に関する研究
形態素解析を用いた帝国議会議事速記録の変遷に関する研究形態素解析を用いた帝国議会議事速記録の変遷に関する研究
形態素解析を用いた帝国議会議事速記録の変遷に関する研究
 
【卒業論文】深層生成モデルを用いたユーザ意図に基づく衣服画像の生成に関する研究
【卒業論文】深層生成モデルを用いたユーザ意図に基づく衣服画像の生成に関する研究【卒業論文】深層生成モデルを用いたユーザ意図に基づく衣服画像の生成に関する研究
【卒業論文】深層生成モデルを用いたユーザ意図に基づく衣服画像の生成に関する研究
 
灯油タンク内の液面高計測を用いた 灯油残量推定システムに関する研究
灯油タンク内の液面高計測を用いた灯油残量推定システムに関する研究灯油タンク内の液面高計測を用いた灯油残量推定システムに関する研究
灯油タンク内の液面高計測を用いた 灯油残量推定システムに関する研究
 
深層自己回帰モデルを用いた俳句の生成と評価に関する研究
深層自己回帰モデルを用いた俳句の生成と評価に関する研究深層自己回帰モデルを用いた俳句の生成と評価に関する研究
深層自己回帰モデルを用いた俳句の生成と評価に関する研究
 
競輪におけるレーティングシステムを用いた予想記事生成に関する研究
競輪におけるレーティングシステムを用いた予想記事生成に関する研究競輪におけるレーティングシステムを用いた予想記事生成に関する研究
競輪におけるレーティングシステムを用いた予想記事生成に関する研究
 
【卒業論文】B2Bオークションにおけるユーザ別 入札行動予測に関する研究
【卒業論文】B2Bオークションにおけるユーザ別 入札行動予測に関する研究【卒業論文】B2Bオークションにおけるユーザ別 入札行動予測に関する研究
【卒業論文】B2Bオークションにおけるユーザ別 入札行動予測に関する研究
 
A Study on Estimation of Household Kerosene Consumption for Optimization of D...
A Study on Estimation of Household Kerosene Consumption for Optimization of D...A Study on Estimation of Household Kerosene Consumption for Optimization of D...
A Study on Estimation of Household Kerosene Consumption for Optimization of D...
 
マルチエージェント深層強化学習による自動運転車両の追越行動の獲得に関する研究
マルチエージェント深層強化学習による自動運転車両の追越行動の獲得に関する研究マルチエージェント深層強化学習による自動運転車両の追越行動の獲得に関する研究
マルチエージェント深層強化学習による自動運転車両の追越行動の獲得に関する研究
 

Recently uploaded

247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).pptssuser5c9d4b1
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...RajaP95
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Christo Ananth
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escortsranjana rawat
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingrknatarajan
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...ranjana rawat
 

Recently uploaded (20)

247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 

Study on Application of Ensemble learning on Credit Scoring

  • 1. Study on Application of Ensemble learning on Credit Scoring. Ce Chen, Hokkaido University, Graduate School of Information Science and Technology, Laboratory of Harmonious Systems Engineering
  • 2. Background
    • Increase in financing opportunities due to the development of Fintech
      – Many people who have not received loans until now will become eligible for loans.
      – More and more people are involved in credit.
    • Growing need for “credit scoring”
      – An effective way is needed to judge whether an individual's credit is good or bad.
      – Judging individual credit correctly can effectively reduce losses.
  • 3. Problems of Credit Scoring
    • Few understandable models
      – Models whose credit scoring calculation process is not known are difficult to operate in practice.
    • Multiple evaluation indices
      – Improve accuracy, minimize loss amount, etc.
    • Imbalanced data in credit samples
      – Good credit samples outnumber bad credit samples.
    • Machine learning is needed to tackle these problems
  • 4. Purpose
    • Purpose
      – Build a model that deals better with credit scoring problems using machine learning.
    • Approach
      – Credit scoring as a classification problem
      – Ensemble learning: XGBoost
        • High accuracy reported in previous research
        • Explainable criteria based on decision trees
        • Cost-sensitive evaluation indicators
      – EasyEnsemble and focal loss for imbalanced data
  • 5. Credit Scoring as Classification Problem
    • Use machine learning to solve credit scoring
    • Training data (past debtors): X11, X12, X13, ..., Y1; X21, X22, X23, ..., Y2; ...; Xn1, Xn2, Xn3, ..., Yn
    • The scoring model is trained on the past debtors and then applied to a new customer
      – Input: Xm1, Xm2, Xm3, ... (new customer's features)
      – Output: probability of Y (new customer's credit)
      – If probability of Y > threshold: bad credit
      – If probability of Y <= threshold: good credit
    • X: features of an individual
    • Y: credit of an individual (Y = 1: bad credit, Y = 0: good credit)
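  • 6. The principle of XGBoost
    • Objective = cost function + regularization
      $\mathrm{obj}(\theta) = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)$
      – $\mathrm{obj}(\theta)$: objective
      – $\theta$: the coefficients in front of x
      – $\sum_{i=1}^{n} l(y_i, \hat{y}_i)$: cost function
      – $n$: number of samples
      – $y_i$: real value of y
      – $\hat{y}_i$: predicted value of y
      – $\sum_{k=1}^{K} \Omega(f_k)$: regularization
      – $K$: number of decision trees
      – $\Omega(f_k)$: the complexity of decision tree $f_k$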
  • 7. The principle of XGBoost
    – $\hat{y}_i^{(t)}$: the predicted result of round t
    – $f_t(x_i)$: the predicted result of the current tree
    • After round t of training: $y_{\mathrm{predict}} = \hat{y}_i^{(t)}$
  • 8. The principle of XGBoost
    • Difference between XGBoost and Gradient Boost: XGBoost applies a second-order Taylor expansion to the cost function
      – grad: first derivative of the cost function
      – hess: second derivative of the cost function
    • The objective depends only on the first and second derivatives of the cost function, so it can be easier to calculate for complex cost functions.
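The second-order Taylor expansion named on slide 8 can be written out as below. This is the standard form from the XGBoost literature rather than a transcription of the slide image; the symbols $g_i$ and $h_i$ are assumed names for the first and second derivatives (grad and hess on the later slides), and the additive update $\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i)$ is assumed for the prediction after round t.

```latex
\begin{aligned}
\mathrm{obj}^{(t)} &= \sum_{i=1}^{n} l\bigl(y_i,\ \hat{y}_i^{(t-1)} + f_t(x_i)\bigr) + \Omega(f_t) \\
&\approx \sum_{i=1}^{n} \Bigl[\, l\bigl(y_i, \hat{y}_i^{(t-1)}\bigr) + g_i\, f_t(x_i) + \tfrac{1}{2}\, h_i\, f_t(x_i)^2 \Bigr] + \Omega(f_t), \\
g_i &= \frac{\partial\, l\bigl(y_i, \hat{y}_i^{(t-1)}\bigr)}{\partial\, \hat{y}_i^{(t-1)}}, \qquad
h_i = \frac{\partial^2 l\bigl(y_i, \hat{y}_i^{(t-1)}\bigr)}{\partial \bigl(\hat{y}_i^{(t-1)}\bigr)^2}.
\end{aligned}
```

Because $l(y_i, \hat{y}_i^{(t-1)})$ is fixed when tree t is fitted, only $g_i$ and $h_i$ carry information about the cost function into round t, which is why a custom cost function only has to supply these two quantities.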
  • 9. XGBoost in Previous Research
    • Datasets: Australian, Japanese
    • Models: k-Nearest Neighbor, Logistic Regression, Linear Discriminant Analysis, Support Vector Machine, Decision Tree, Random Forest, Gradient Boost Decision Tree, AdaBoost, XGBoost
    • Result: in the comparison of the models, XGBoost has higher accuracy
    He, H., Zhang, W., & Zhang, S. (2018). A novel ensemble method for credit scoring: Adaption of different imbalance ratios. Expert Systems with Applications, 98, 105–117.
  • 10. Assessment method (AUC)
                      predict_true   predict_false
      label_true      TP             FN
      label_false     FP             TN
    – False positive rate = FP / (FP + TN)
    – True positive rate = TP / (TP + FN)
    – AUC: the size of the blue area in the figure
    – In the figure, (0, 1) is the best case, where all samples are separated correctly. The closer the blue line is to (0, 1), the more accurate the model.
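The rates and the area on slide 10 can be sketched in plain Python as follows. This is a minimal illustration, not the evaluation code used in the study: the function names are hypothetical, the four labels and scores are made-up toy values, and the area is computed with the trapezoid rule over the points obtained by sweeping the threshold.

```python
def confusion_counts(labels, scores, threshold):
    """Return (TP, FN, FP, TN); a sample is predicted positive when score > threshold."""
    tp = fn = fp = tn = 0
    for y, s in zip(labels, scores):
        predicted_positive = s > threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def roc_auc(labels, scores):
    """Area under the ROC curve; assumes both classes occur in labels."""
    # Sweep the threshold from the highest score down to -inf, so the
    # curve runs from (0, 0) to (1, 1).
    thresholds = sorted(set(scores), reverse=True) + [float("-inf")]
    points = []
    for t in thresholds:
        tp, fn, fp, tn = confusion_counts(labels, scores, t)
        points.append((fp / (fp + tn), tp / (tp + fn)))  # (FPR, TPR)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0  # trapezoid between neighbours
    return area


# Toy data (made up for illustration): 1 is the positive class.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
```

A curve that passes through (0, 1) gives an area of 1, which is the best case described on the slide.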
  • 11. Assessment method (cost sensitive)
    • For the credit scoring problem, only the total cost value is needed to judge whether the model is good or bad.
            amount   cost
      FP    $L_1$    $C_{FP}$
      FN    $L_2$    $C_{FN}$
    • Cost sensitive:
      1. The total cost: $C_{FP} L_1 + C_{FN} L_2$
      2. The total cost divided by $C_{FP} + C_{FN}$: $\frac{1}{C_{FP} + C_{FN}} (C_{FP} L_1 + C_{FN} L_2)$
      3. The total cost divided by the total number of test samples: $\mathrm{EMC} = \frac{1}{\mathrm{Total}} (C_{FP} L_1 + C_{FN} L_2)$
  • 12. Assessment method (cost sensitive)
    LR: Logistic Regression, DT: Decision Tree, RF: Random Forest
    • $C_{FP} L_1 + C_{FN} L_2$
                  German   Australian
      LR          177.8    51
      DT          191.6    86
      RF          190.8    52.6
      XGBoost     165.2    43
    • $\frac{1}{C_{FP} + C_{FN}} (C_{FP} L_1 + C_{FN} L_2)$
                  German   Australian
      LR          29.6     8.8
      DT          30.5     14.2
      RF          31.8     9.1
      XGBoost     27.5     8.6
    • $\mathrm{EMC} = \frac{1}{\mathrm{Total}} (C_{FP} L_1 + C_{FN} L_2)$
                  German   Australian
      LR          0.875    0.393
      DT          0.976    0.620
      RF          0.954    0.394
      XGBoost     0.813    0.381
      – EMC is the average cost of a sample; changing the number of test samples does not affect the result.
    West, D. (2000). Neural network credit scoring models. Computers & Operations Research, 27(11-12), 1131–1152.
  • 13. Assessment method (cost sensitive)
    • The values of $C_{FN}$ and $C_{FP}$
      – $C_{FP} > C_{FN}$; $C_{FN}$ is usually set to 1, and $C_{FP} = n$ is set to a constant greater than 1.
    • In many papers, $C_{FN} = 1$, $C_{FP} = n = 5$
      – Ting, K. M. Inducing cost-sensitive trees via instance weighting. Lecture Notes in Computer Science, 1998
      – Elkan, C. The foundations of cost-sensitive learning. International Joint Conference on Artificial Intelligence, 2001
      – The contributor of the German dataset suggests $C_{FN} = 1$, $C_{FP} = 5$
    • In this paper, $C_{FN} = 1$, $C_{FP} = n = 5$
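The three cost measures from slides 11 to 13 can be sketched as small functions. This is a minimal illustration under stated assumptions: the function names are hypothetical, the default costs follow the setting $C_{FP} = 5$, $C_{FN} = 1$ given for this study, and the error counts in the example are made up rather than taken from the experiments.

```python
def total_cost(n_fp, n_fn, c_fp=5.0, c_fn=1.0):
    """C_FP * L1 + C_FN * L2, where L1 and L2 are the FP and FN counts."""
    return c_fp * n_fp + c_fn * n_fn


def cost_per_unit_weight(n_fp, n_fn, c_fp=5.0, c_fn=1.0):
    """Total cost divided by C_FP + C_FN."""
    return total_cost(n_fp, n_fn, c_fp, c_fn) / (c_fp + c_fn)


def emc(n_fp, n_fn, n_total, c_fp=5.0, c_fn=1.0):
    """Total cost divided by the number of test samples (average cost per sample)."""
    return total_cost(n_fp, n_fn, c_fp, c_fn) / n_total


# Made-up counts for illustration: 20 FP and 30 FN among 200 test samples.
example_total = total_cost(20, 30)
example_emc = emc(20, 30, 200)
```

Only the third measure is independent of the test set size, which is the property slide 12 points out for EMC.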
  • 14. Proposed Methods
    1. EasyEnsemble + XGBoost
      – EasyEnsemble as the resampling technique, XGBoost as the base model.
    2. Change the structure of XGBoost
      2.1. Customizing the evaluation metric
        • EMC cost formula as the evaluation metric
        • weight (parameter of XGBoost) + threshold (parameter in the cost formula)
      2.2. Customizing the cost function (focal loss)
        • EMC cost formula as the evaluation metric
        • Focal loss as the cost function
  • 15. Experiment Setting
    • 5-fold cross-validation
    • XGBoost
      – XGBoost module in Python
    • Tuning method
      – Grid search (to avoid local optima, important parameters are tuned in two rounds)
        • number of boosting rounds → eta (learning rate) → max depth, min child weight → subsample, colsample bytree → eta (learning rate) → number of boosting rounds
  • 16. Data Set for Credit Scoring
    • Data on credit scoring
      – The amount of public data is small
    • Data used in previous research
      Datasets     Samples   Features   Good/Bad
      German       1000      24         700/300
      Australian   690       14         307/383
      Taiwan       30000     23         23364/6636
      Qianhai      40000     491        34737/5263
      Japanese     690       15         307/383
  • 17. Data Set for Credit Scoring
    • Introduction of the data sets
      Datasets     Samples   Features   Good/Bad
      German       1000      24         700/300
      Australian   690       14         307/383
      Taiwan       30000     23         23364/6636
      Qianhai      40000     491        34737/5263
    • The first method: German, Australian, Taiwan, Qianhai
    • The second method: German, Taiwan
  • 18. Method 1 (EasyEnsemble)
    • Purpose: increase the sensitivity to minority samples.
      – Reduce cost without reducing AUC.
      – Reduce creditors' losses without reducing customer satisfaction.
    • Divide the samples into two classes according to the value of Y. The majority class (good credit) is divided into several small groups, each with as many samples as the minority class (bad credit).
    • Example, with rows written as (X1, X2, Y):
      – All samples: (1, 1, 0), (2, 2, 0), (3, 3, 0), (4, 4, 1), (5, 5, 1)
      – A (majority class): (1, 1, 0), (2, 2, 0), (3, 3, 0)
      – Small groups A1, A2, A3 drawn from A: {(1, 1, 0), (2, 2, 0)}, {(2, 2, 0), (3, 3, 0)}, {(3, 3, 0), (1, 1, 0)}
      – B (minority class): (4, 4, 1), (5, 5, 1)
  • 19. Method 1 (EasyEnsemble)
    • A1+B, A2+B and A3+B are each used to train an AdaBoost model; the AdaBoost models are combined to output Class (Y).
  • 20. Experiment (EasyEnsemble)
    • AUC by resampling method
      Resampling          German dataset   Taiwan dataset
      Origin (AdaBoost)   0.750            0.763
      OverSampling        0.751            0.758
      SMOTE               0.733            0.719
      UnderSampling       0.742            0.759
      EasyEnsemble        0.771            0.776
    • Oversampling easily leads to overfitting.
    • Undersampling easily leads to underfitting.
    • SMOTE: the added data is manually generated.
    • EasyEnsemble: all data is original data.
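  • 21. Experiment Setting (EasyEnsemble)
    • Example, with rows written as (X1, X2, Y):
      – All samples: (1, 1, 0), (2, 2, 0), (3, 3, 0), (4, 4, 1), (5, 5, 1)
      – A (majority class): (1, 1, 0), (2, 2, 0), (3, 3, 0)
      – Small groups A1, A2, A3 drawn from A: {(1, 1, 0), (2, 2, 0)}, {(2, 2, 0), (3, 3, 0)}, {(3, 3, 0), (1, 1, 0)}
      – B (minority class): (4, 4, 1), (5, 5, 1)
      – B1: (4, 4, 1), (5, 5, 1)
      – B2: (4, 4, 1), (4, 4, 1)
    • Bootstrap method: each extracted sample is put back before the next one is drawn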
  • 22. Experiment Setting (EasyEnsemble)
    • Ensemble learning
      – Resampling: EasyEnsemble
      – Base model: XGBoost
      – Each subset A1+B1, A2+B2, ..., An+Bn trains one XGBoost model, which outputs P(Y=1), P(Y=0) and feature importance (1), (2), ..., (n)
      – The parameter colsample bytree is adjusted within a range of 0.1
      – colsample bytree: column sampling, the proportion of features selected
      – n: number of base models
    • Output (simple average method):
      – Probability of y: $P(Y=1) = \frac{1}{n}\sum_{i=1}^{n} P_i(Y=1)$, $P(Y=0) = \frac{1}{n}\sum_{i=1}^{n} P_i(Y=0)$
      – Feature importance: $\text{feature importance} = \frac{1}{n}\sum_{i=1}^{n} \text{feature importance}_i$
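The resampling and averaging steps of slides 18 to 22 can be sketched with the standard library alone. This is a minimal sketch, not the study's implementation: the function names and the seed are hypothetical, rows are written as (X1, X2, Y) tuples as in the slide example, and the base model is left out, so the averaging function simply takes the probabilities that the n base models would return.

```python
import random


def easy_ensemble_subsets(majority, minority, n_models, seed=0):
    """Build n_models balanced training sets A_k + B_k.

    A_k: random subset of the majority class, same size as the minority class.
    B_k: bootstrap sample of the minority class (drawn with replacement).
    Assumes len(minority) <= len(majority).
    """
    rng = random.Random(seed)
    subsets = []
    for _ in range(n_models):
        a_k = rng.sample(majority, len(minority))
        b_k = [rng.choice(minority) for _ in minority]
        subsets.append(a_k + b_k)
    return subsets


def simple_average(values):
    """Simple average of the outputs of the base models."""
    return sum(values) / len(values)


# The five rows from the slide example, as (X1, X2, Y).
majority_rows = [(1, 1, 0), (2, 2, 0), (3, 3, 0)]
minority_rows = [(4, 4, 1), (5, 5, 1)]
training_sets = easy_ensemble_subsets(majority_rows, minority_rows, n_models=3)

# Hypothetical P(Y=1) values from three base models for one new customer.
averaged_probability = simple_average([0.2, 0.4, 0.6])
```

Every training set contains only original rows, which is the property the slides contrast with SMOTE.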
  • 23. Outcome (EasyEnsemble)
    • AUC
                               German   Australian   Taiwan   Qianhai
      EasyEnsemble             0.77     0.88         0.70     0.65
      XGBoost                  0.81     0.95         0.78     0.71
      XGBoost_EasyEnsemble     0.82     0.95         0.78     0.72
    • Cost (EMC)
                               German   Australian   Taiwan   Qianhai
      XGBoost                  0.813    0.381        0.741    0.713
      XGBoost_EasyEnsemble     0.578    0.343        0.556    0.541
    • Cost is reduced without reducing AUC.
    • XGBoost is optimized without losing accuracy.
  • 24. Method 2 (Change structure)
    • Purpose:
      – Customize the evaluation metric to get the minimum cost
      – Reduce cost without considering AUC
      – The only objective is to reduce loss
    • Evaluation metric
      – Plays no direct role in optimizing or training the model
      – Stops training once the model stops improving
      – Example: train with the logloss objective and create an AUC metric to evaluate the model
    • Cost function
      – The critical function for training
      – It needs to be optimized
  • 25. Method 2.1 (Evaluation metric)
    • Weight (parameter of XGBoost)
      – Adjusts the weight of the minority class. When the cost reaches its lowest, the value of weight cannot be obtained.
      – $grad_{new} = weight \times grad_{origin}$ (grad: first derivative)
      – $hess_{new} = weight \times hess_{origin}$ (hess: second derivative)
    • Evaluation metric (parameter of XGBoost)
      – By customizing the evaluation metric, the cost of the model can be minimized
  • 26. Experiment Setting (Evaluation metric)
    2.1 Customizing the evaluation metric
    • Weight: adjust the weight of the minority class
      – $grad_{new} = weight \times grad_{origin}$
      – $hess_{new} = weight \times hess_{origin}$
    • Add a threshold to the evaluation metric, $\mathrm{EMC} = \frac{1}{\mathrm{Total}} (C_{FP} L_1 + C_{FN} L_2)$
      – Default threshold = 0.5
      – If probability of y > threshold, predicted value of y = 1
      – If probability of y <= threshold, predicted value of y = 0
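The two knobs on slides 25 and 26, the class weight and the decision threshold, can be sketched as follows. This is a minimal sketch under stated assumptions: all names are hypothetical, the unweighted derivatives are those of logloss with respect to the raw score (grad = p - y, hess = p(1 - p)), the weight is applied to the bad-credit class (Y = 1), and the cost of 5 is attached to a bad-credit sample predicted as good, which is how the later confusion tables in this deck appear to count it. Labels and probabilities are toy values.

```python
def weighted_grad_hess(p, y, weight):
    """Logloss derivatives w.r.t. the raw score, scaled by weight for Y = 1."""
    scale = weight if y == 1 else 1.0
    grad = p - y            # first derivative of logloss
    hess = p * (1.0 - p)    # second derivative of logloss
    return scale * grad, scale * hess


def emc_at_threshold(labels, probs, threshold,
                     cost_bad_as_good=5.0, cost_good_as_bad=1.0):
    """Average misclassification cost when y is predicted 1 if prob > threshold."""
    total = 0.0
    for y, p in zip(labels, probs):
        predicted = 1 if p > threshold else 0
        if y == 1 and predicted == 0:
            total += cost_bad_as_good
        elif y == 0 and predicted == 1:
            total += cost_good_as_bad
    return total / len(labels)


def best_threshold(labels, probs, low=0.2, high=0.8, step=0.05):
    """Grid search over the threshold range used on the outcome slide."""
    count = int(round((high - low) / step)) + 1
    candidates = [round(low + k * step, 2) for k in range(count)]
    return min(candidates, key=lambda t: emc_at_threshold(labels, probs, t))


# Toy data (made up): Y = 1 is bad credit.
toy_labels = [1, 1, 0, 0, 0]
toy_probs = [0.9, 0.45, 0.6, 0.3, 0.1]
chosen = best_threshold(toy_labels, toy_probs)
```

In this toy case lowering the threshold below 0.5 catches the second bad-credit sample, so the average cost drops even though a good-credit sample is still rejected.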
  • 27. Outcome (Evaluation metric)
    • German dataset, n = 5
        XGBoost                                  0.813
        XGBoost_Customized Evaluation_metric     0.565
      – Range of weight: (1, 10, interval = 1)
      – Range of threshold: (0.2, 0.8, interval = 0.05)
      – The best parameters: weight = 5, threshold = 0.5
    • Taiwan dataset, n = 5
        XGBoost                                  0.742
        XGBoost_Customized Evaluation_metric     0.553
      – Range of weight: (1, 10, interval = 1)
      – Range of threshold: (0.2, 0.8, interval = 0.05)
      – The best parameters: weight = 3, threshold = 0.4
  • 28. Method 2.2 (Focal loss)
    • Logloss: $\mathrm{logloss}(y, x) = -\frac{1}{N}\sum_{i=1}^{N}\left[ y_i \log p(x_i) + (1 - y_i) \log(1 - p(x_i)) \right]$
    • Focal loss: $FL = -\sum_{i=1}^{m}\left[ y_i (1 - p(x_i))^{\gamma} \log(p(x_i)) + (1 - y_i)\, p(x_i)^{\gamma} \log(1 - p(x_i)) \right]$
    • Focal loss with $\alpha$: $FL = -\sum_{i=1}^{m}\left[ \alpha\, y_i (1 - p(x_i))^{\gamma} \log(p(x_i)) + (1 - y_i)\, p(x_i)^{\gamma} \log(1 - p(x_i)) \right]$
    • In focal loss, the factors $p(x_i)^{\gamma}$ and $(1 - p(x_i))^{\gamma}$:
      – Reduce the weight of easy negatives (easy examples)
      – Increase the weight of hard negatives (hard examples)
      – Increase the sensitivity to the minority class
    Lin, T.-Y., Goyal, P., Girshick, R., He, K., & Dollar, P. (2018). Focal loss for dense object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1–1.
  • 29. Method 2.2 (Focal loss)
    • Credit scoring
      – Features of good credit and bad credit are very different.
      – Most good credit samples are easy negative samples.
      – Only a few samples have features similar to those of bad credit samples.
    • The cost function requires its first and second derivatives, taken here with respect to the raw score, with $p = p(x_i)$ and $y = y_i$
      – First derivative: $y (1 - p)^{\gamma} \left( \gamma p \log p + p - 1 \right) + (1 - y)\, p^{\gamma} \left( p - \gamma (1 - p) \log(1 - p) \right)$
      – Second derivative: $y\, p (1 - p)^{\gamma} \left[ (\gamma \log p + \gamma + 1)(1 - p) - \gamma (\gamma p \log p + p - 1) \right] + (1 - y)\, p^{\gamma} (1 - p) \left[ \gamma \left( p - \gamma (1 - p) \log(1 - p) \right) + (1 + \gamma + \gamma \log(1 - p))\, p \right]$
    • Weight (works the same way as alpha, so alpha is ignored here)
      – $grad_{new} = weight \times grad_{origin}$
      – $hess_{new} = weight \times hess_{origin}$
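  • 30. Experiment Setting (Focal loss)
    • $FL = -\sum_{i=1}^{m}\left[ \alpha\, y_i (1 - p(x_i))^{\gamma} \log(p(x_i)) + (1 - y_i)\, p(x_i)^{\gamma} \log(1 - p(x_i)) \right]$
    • When $\alpha = 1$ and $\gamma = 0$, FL = logloss: $FL(\alpha = 1, \gamma = 0) = \mathrm{logloss}$
      $\mathrm{logloss}(y, x) = -\frac{1}{N}\sum_{i=1}^{N}\left[ y_i \log p(x_i) + (1 - y_i) \log(1 - p(x_i)) \right]$
    • Therefore: default $\gamma = 0$, default $\alpha = 1$, default threshold = 0.5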
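The relation between focal loss and logloss on slide 30 can be checked numerically. The sketch below works on a single sample and makes several assumptions that are not confirmed by the slides: the function names are hypothetical, $\alpha$ is taken to weight only the Y = 1 term, $p$ is the sigmoid of the raw score, and the gradient is the derivative with respect to that raw score.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_loss_term(y, p):
    """Per-sample logloss."""
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def focal_loss_term(y, p, alpha=1.0, gamma=0.0):
    """Per-sample focal loss; alpha is assumed to weight the Y = 1 term only."""
    positive = alpha * y * (1.0 - p) ** gamma * math.log(p)
    negative = (1 - y) * p ** gamma * math.log(1.0 - p)
    return -(positive + negative)


def focal_grad(y, p, alpha=1.0, gamma=0.0):
    """First derivative of focal_loss_term w.r.t. the raw score z, p = sigmoid(z)."""
    positive = alpha * y * (1.0 - p) ** gamma * (gamma * p * math.log(p) + p - 1.0)
    negative = (1 - y) * p ** gamma * (p - gamma * (1.0 - p) * math.log(1.0 - p))
    return positive + negative


# With alpha = 1 and gamma = 0 the focal loss term equals the logloss term.
same_as_logloss = all(
    abs(focal_loss_term(y, p) - log_loss_term(y, p)) < 1e-12
    for y in (0, 1) for p in (0.1, 0.7)
)

# An easy negative (good credit, low predicted probability) is down-weighted.
easy_negative_focal = focal_loss_term(0, 0.1, alpha=1.0, gamma=2.0)
easy_negative_logloss = log_loss_term(0, 0.1)
```

With gamma = 2 the easy negative contributes about one hundredth of its logloss value, so training focuses on the samples that are still hard to separate.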
  • 31. Outcome (Focal loss)
    • Search range: alpha (0, 10, interval = 1), gamma (0, 5, interval = 0.5), threshold (0, 0.8, interval = 0.05)
    • Cost by threshold (alpha = 1, gamma = 0)
      threshold   German dataset   Taiwan dataset
      0.05        0.567            0.741
      0.1         0.539            0.612
      0.2         0.545            0.552
      0.3         0.614            0.616
      0.4         0.693            0.686
    • Cost by gamma (alpha = 1, threshold = 0.5)
      gamma       German dataset   Taiwan dataset
      0           0.813            0.742
      0.5         0.797            0.741
      1           0.800            0.747
      2           0.822            0.741
      3           0.817            0.745
      4           0.832            0.746
    • Cost by alpha (gamma = 0, threshold = 0.5)
      alpha       German dataset   Taiwan dataset
      1           0.813            0.742
      2           0.671            0.641
      3           0.636            0.581
      4           0.582            0.553
      5           0.567            0.554
      6           0.575            0.557
    • Best parameters
      – alpha = 2, threshold = 0.3, gamma = 1.5: lowest cost = 0.532
      – alpha = 1, threshold = 0.4, gamma = 3: lowest cost = 0.547
  • 32. Outcome (Focal loss)
    • German dataset, n = 5
        XGBoost         0.813
        XGBoost_focal   0.532
    • Number of FP, FN in XGBoost
                    predict good   predict bad
        true good   124            15
        true bad    29             30
    • Number of FP, FN in XGBoost_focal
                    predict good   predict bad
        true good   126            53
        true bad    12             27
    • The total cost $= C_{FP} L_1 + C_{FN} L_2$, with $C_{FN} = 1$, $C_{FP} = 5$
    • Saved cost: 47
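The saved cost on slide 32 can be reproduced from the two confusion tables. The sketch below assumes that the cost of 5 applies to a bad-credit sample predicted as good and the cost of 1 to a good-credit sample predicted as bad; that reading is an assumption, chosen because it yields the saved cost stated on the slide. The function and variable names are hypothetical.

```python
def total_cost(bad_predicted_good, good_predicted_bad,
               cost_bad_as_good=5, cost_good_as_bad=1):
    """Total misclassification cost from the two off-diagonal cells."""
    return (cost_bad_as_good * bad_predicted_good
            + cost_good_as_bad * good_predicted_bad)


# Off-diagonal counts read from the German dataset tables.
xgboost_cost = total_cost(bad_predicted_good=29, good_predicted_bad=15)  # 5*29 + 15
focal_cost = total_cost(bad_predicted_good=12, good_predicted_bad=53)    # 5*12 + 53
saved_cost = xgboost_cost - focal_cost
```

The focal loss model rejects more good-credit applicants (53 instead of 15) but accepts far fewer bad-credit ones (12 instead of 29), and with the 5-to-1 cost ratio the second effect dominates.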
  • 33. Outcome (Focal loss)
    • Taiwan dataset, n = 5
        XGBoost         0.742
        XGBoost_focal   0.547
    • $X_1$ of each sample: amount of loan
      – $C_{FN} = 20\% \times \text{amount of } X_1$
      – $C_{FP} = \text{amount of } X_1$
    • Number of FP ($L_1$), FN ($L_2$) in XGBoost
                    predict good   predict bad
        true good   4445           227
        true bad    844            482
    • Number of FP ($L_1$), FN ($L_2$) in XGBoost_focal
                    predict good   predict bad
        true good   3397           1275
        true bad    402            924
    • Total cost of FP, FN in XGBoost
                    predict good   predict bad
        true good   0              2880
        true bad    158000         0
    • Total cost of FP, FN in XGBoost_focal
                    predict good   predict bad
        true good   0              1680
        true bad    74000          0
    • Saved cost: 85200 (Taiwan New Dollar)
  • 34. Discussion
    1. EasyEnsemble + XGBoost
      – All the data are original data
      – Error = bias + variance
      – $\mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{\sigma^2}{n} + \frac{n-1}{n}\rho\sigma^2$
      – $\sigma^2$ is the variance, $\rho$ is the correlation coefficient
      – $n$ is the number of base models
      – Increasing the difference between the base models can reduce the variance and thus reduce the error
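The variance argument on slide 34 can be illustrated numerically. The sketch below uses the standard result for the average of n identically distributed outputs with common variance $\sigma^2$ and pairwise correlation $\rho$; the function name and the example values are hypothetical.

```python
def ensemble_variance(sigma2, rho, n):
    """Variance of the average of n base-model outputs.

    sigma2: variance of a single base model's output
    rho:    pairwise correlation between base-model outputs
    n:      number of base models
    """
    return sigma2 / n + (n - 1) / n * rho * sigma2


# Made-up values: four base models with unit variance.
independent = ensemble_variance(1.0, 0.0, 4)   # uncorrelated base models
correlated = ensemble_variance(1.0, 0.5, 4)    # moderately correlated
identical = ensemble_variance(1.0, 1.0, 4)     # perfectly correlated
```

Perfectly correlated base models give no variance reduction at all, while less correlated ones approach $\sigma^2 / n$, which is why training each base model on a different resampled subset helps.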
  • 35. Discussion
    2. Change the structure of XGBoost
      2.1. Customizing the evaluation metric
        • A clear target is set: the model stops training when the value of the evaluation metric is optimal
      2.2. Customizing the cost function (focal loss)
        • Increases the sensitivity to minority samples
        • Distinguishes between difficult and easy samples
  • 36. Conclusion
    • Growing need for “credit scoring”
    • Problem
      – Understandable
      – Imbalanced data
      – Cost sensitive
    • Solution
      – Resampling (EasyEnsemble)
      – Changing structure (customized evaluation metric and cost function)
    • Outcome
      – Reducing cost
      – Reducing creditors' losses
  • 37. Research performance
    ・Information Processing Society of Japan
      1) Ce Chen, Soichiro Yokoyama, Tomohisa Yamashita, Hidenori Kawamura: Application of XGBoost to credit scoring, Special Interest Groups (SIG), Vol 194, Hokkaido (2019)