3. Problem Statement
• Credit risk is a crucial factor when
commercial banks and financial
institutions grant loans to
customers.
• Constructing reliable evaluation
models plays a major role in
loss control
and revenue maximization.
• This project aims to reduce credit
risks by predicting defaulters based
on the behaviour of past defaulters.
5. • The banking sector has always required an automated and
reliable way to distinguish between ‘good’ and ‘bad’ customers.
Because of inaccuracies in the current models, the need for
accuracy far outweighs what is available.
• The objective of this model is to improve the system and make it
more usable by creating three levels:
• the first an unsupervised, clustering level;
• the second a supervised, classification level, which involves
several algorithms; and
• the third a semi-supervised level, which takes a consensus
of the classes.
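The consensus step of the third level can be illustrated with a simple majority vote. This is a minimal sketch, assuming each classifier outputs a 'good' or 'bad' label per customer; the function name `consensus_label`, the sample predictions, and the tie-breaking rule (ties fall back to 'bad') are hypothetical and are not taken from the original model.

```python
from collections import Counter

def consensus_label(votes, tie_label="bad"):
    """Return the majority label among classifier votes.

    `tie_label` is a hypothetical default returned when the two most
    common labels receive the same number of votes.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return tie_label
    return counts[0][0]

# Hypothetical predictions from three classifiers for four customers.
predictions = [
    ["good", "good", "bad"],
    ["bad", "bad", "bad"],
    ["good", "bad", "bad"],
    ["good", "good", "good"],
]
consensus = [consensus_label(votes) for votes in predictions]
```

Breaking ties towards 'bad' is a conservative choice for a lender; a different rule may suit other risk appetites.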
6. The need for this work
is evident in the banking sector,
where about 5.21 trillion Indian
rupees were lost to non-performing
assets (NPAs) due to defaulting.
8. What can be
• Guided, carefully analysed
decisions
• Highly accurate predictions
• Understanding over
intuition
• More performing assets,
fewer liabilities
10. • The current system is largely based on credit scoring,
which in India is handled by CIBIL.
• The score ranges between 300 and 900.
• The problem with this type of scoring is that it
depends on the number of defaults rather than the
density of the amount defaulted.
• This leads to numerous exploits and loopholes in the
system that can affect the economic balance of
the customers.
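The gap between counting defaults and weighing the amount defaulted can be shown with a small sketch. The borrower records and both helper functions below are hypothetical illustrations; they do not reproduce CIBIL's actual scoring method, which is not described here.

```python
def default_count(loans):
    """Number of loans that ended in default."""
    return sum(1 for loan in loans if loan["defaulted"])

def defaulted_amount_ratio(loans):
    """Share of the total borrowed amount that was defaulted on."""
    total = sum(loan["amount"] for loan in loans)
    if total == 0:
        return 0.0
    defaulted = sum(loan["amount"] for loan in loans if loan["defaulted"])
    return defaulted / total

# Hypothetical borrower A: two small defaults among four loans.
borrower_a = [
    {"amount": 10000, "defaulted": True},
    {"amount": 10000, "defaulted": True},
    {"amount": 500000, "defaulted": False},
    {"amount": 480000, "defaulted": False},
]
# Hypothetical borrower B: a single large default.
borrower_b = [
    {"amount": 900000, "defaulted": True},
    {"amount": 100000, "defaulted": False},
]
```

A count-based view ranks borrower A (two defaults) as riskier than borrower B (one default), although B defaulted on a far larger share of the amount borrowed.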
16. Route 1: No model works best for every
problem.
Route 2: Cannot make sense of the data before
it is processed, which can add a lot of
complexity and error.
Route 3: Does not provide accuracy good
enough to make the system useful.
Route 4: This reduces pre-processing problems,
inconsistencies and inaccuracies of the system.
17. Process Flow of the Ensemble - Credit Risk Assessment system,
based on Route 4
21. Software Requirements Specification
Hardware Requirements:
• Minimum 2GB RAM
• Intel Pentium 4 or Higher
• Recommended 1GB Storage
Space
Preferences:
• 6GB RAM or higher
• Intel Core i3 6600K or higher
Software Requirements:
• Python 3.6 installed
Preferred Operating Systems:
• Windows 10
• Ubuntu 18.04 LTS
22. Experimental Results
• Algorithms have been tested on
the main dataset. The
performance metrics used to
appraise this model are
Accuracy, Recall, Precision, F1
Score and Confusion Matrix.
• The accuracy of the model is
93%.
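The metrics listed above can all be derived from the four cells of a binary confusion matrix. The sketch below assumes 'defaulter' is the positive class; the function name `classification_metrics` and the counts passed to it are made up for illustration and are not the project's experimental results.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives, with 'defaulter' assumed to be the positive class.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Illustrative counts only: 50 defaulters among 1000 customers.
metrics = classification_metrics(tp=30, fp=10, fn=20, tn=940)
```

With these illustrative counts, accuracy is 0.97 while recall on defaulters is only 0.6, which is why accuracy alone is not enough when one class has few samples.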
23. ● As can be inferred from both tables on the next slide, the prediction of defaulters is
affected mostly by the low number of samples available for them.
● The overall performance of the model is significantly better than many
currently used models, and with a few more improvements can be useful for
progressing research in credit risk assessment.
24. Future Work
• Using clustering algorithms that do not
require a pre-set number of clusters, and that
also identify noisy data and exclude it
from the data points.
• The proposed CRA model can be
further enhanced in the following ways:
• Faster processing
• Reduced data overheads
• Client-side advisory assistant, so that the
debtor is warned about following a bad
spending pattern.
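DBSCAN is one well-known clustering algorithm that needs no pre-set number of clusters and labels low-density points as noise. The slides do not name a specific algorithm, so choosing DBSCAN here is an assumption, as are the sample points and the `eps` and `min_pts` values. The sketch below is a minimal pure-Python version for illustration, not an optimized implementation.

```python
import math

NOISE = -1

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)

    def distance(p, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points)
                if distance(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = neighbours(j)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # core point: keep expanding
    return labels

# Hypothetical 2-D feature vectors: two dense groups and one outlier.
points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2),
          (9.0, 0.0)]
labels = dbscan(points, eps=0.5, min_pts=3)
```

The number of clusters falls out of the data, but `eps` and `min_pts` still have to be chosen, so the approach trades one parameter for two others.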
25. Conclusion
• This approach to assessing credit risk allows for much more accurate
and far-sighted predictions such that countermeasures or advisory
procedures can be followed beforehand.
• This, in turn, helps recover public monetary assets that are rendered
ineffective by large-scale defaults, which are a major
problem in the Indian economy.
• Hence, the improved CRA model can benefit both society
and the country.