“LOAN PREDICTION BASED ON
CUSTOMER BEHAVIOR”
Predict who possible Defaulters are for the Consumer
Loans Product
PRESENTED BY:
Aa (000)
Aa (000)
Aa (000)
Aa (000)
DATA ANALYTICS:
SECOND DELIVERABLE -
VIDEO PRESENTATION
• Introduction
• Name And Source Of Dataset
• Collection Of Dataset
• Purpose Of Each Column In Dataset
• Overview Of The Data
• Data Quality Problems And Their Solutions
• Top Distinctive Categorical Attribute
• Top Distinctive Numerical Attribute
• Attribute Having No Impact
INTRODUCTION
• Loan prediction is a crucial aspect of the financial
industry.
• Data analytics can provide valuable insights into
customer behavior for accurate loan prediction.
• This presentation will explore the benefits and process of
using data analytics for loan prediction based on
customer behavior.
DATA EXPLORATION
Aa (000)
THE NAME AND SOURCE
OF DATASET
• Name: Loan Prediction Based on Customer Behavior: Predict
who possible Defaulters are for the Consumer Loans Product
• Source: https://www.kaggle.com/datasets/subhamjain/loan-prediction-based-on-customer-behavior
COLLECTION OF
DATASET
• The dataset was created by collecting data about each
user's demographics, professional experience,
ownership (car/house), marital status, risk flag, area of
residence, and time in that residence. Based on this
information, the dataset is used to predict whether a
customer is worthy of credit or not.
PURPOSE OF EACH
COLUMN IN DATASET
• Income – Income of the user
• Age – Age of the user
• Experience – Professional experience of the user in years
• Married – Whether the user is married or single
• Ownership – Whether the user owns a car/house
• Risk Flag – Whether the user defaulted on a loan
• Current Job/House Years – Years spent in the current
job/house
AN OVERVIEW OF THE
DATA
• We converted the .csv file to .arff using Weka
itself.
• The file converted and loaded without errors.
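
For readers without Weka at hand, the conversion step can be sketched in plain Python. This is a minimal sketch, not Weka's own converter: the function name `csv_to_arff`, the relation name `loan_data`, and the rule "a column is numeric if every value parses as a float" are assumptions made for illustration. It only handles simple column names and values without spaces or commas, and the sample rows are made up rather than taken from the Kaggle file.

```python
import csv
import io


def is_number(text):
    """Return True if the text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def csv_to_arff(csv_text, relation="loan_data"):
    """Convert CSV text (with a header row) into ARFF text.

    Assumption: a column is numeric when every value parses as a float;
    otherwise it becomes a nominal attribute listing its distinct values.
    """
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header, data = rows[0], rows[1:]
    lines = ["@relation " + relation, ""]
    for index, name in enumerate(header):
        values = [row[index] for row in data]
        if all(is_number(value) for value in values):
            lines.append("@attribute " + name + " numeric")
        else:
            distinct = sorted(set(values))
            lines.append("@attribute " + name + " {" + ",".join(distinct) + "}")
    lines.append("")
    lines.append("@data")
    for row in data:
        lines.append(",".join(row))
    return "\n".join(lines) + "\n"


# Made-up sample rows, not taken from the real dataset.
sample = "Income,Married,Risk_Flag\n50000,single,0\n72000,married,1\n"
arff_text = csv_to_arff(sample)
print(arff_text)
```

Under this rule a 0/1 label column such as `Risk_Flag` is declared `numeric`, which is the situation the data quality slide describes.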
DATA QUALITY
PROBLEMS AND THEIR
SOLUTIONS
• The only data quality issue we faced was that the
visualizations were black and white, and inaccurate.
• We later discovered that this was because the class label
had been read as numeric.
• To rectify this, we applied a filter to the class label
(Unsupervised > Attribute > NumericToNominal -R last).
• The issue was solved immediately.
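
What the filter changes can be illustrated on the ARFF header itself. The sketch below is a stand-in for the effect of NumericToNominal with `-R last`, not Weka's implementation: the helper `last_attribute_to_nominal` is hypothetical, it assumes a simple dense ARFF file with unquoted names, and the sample data is made up.

```python
def last_attribute_to_nominal(arff_text):
    """Rewrite the last @attribute declaration as a nominal attribute.

    The declared type of the last attribute is replaced by the set of
    values observed in the last data column.
    """
    lines = arff_text.splitlines()
    attribute_rows = [i for i, line in enumerate(lines)
                      if line.lower().startswith("@attribute")]
    data_start = next(i for i, line in enumerate(lines)
                      if line.lower().startswith("@data")) + 1
    data_rows = [line for line in lines[data_start:] if line.strip()]
    # Collect the distinct values seen in the last column.
    labels = sorted({row.split(",")[-1].strip() for row in data_rows})
    last = attribute_rows[-1]
    name = lines[last].split()[1]
    lines[last] = "@attribute " + name + " {" + ",".join(labels) + "}"
    return "\n".join(lines) + "\n"


# Made-up sample, not taken from the real dataset.
sample_arff = """@relation loan_data

@attribute Income numeric
@attribute Risk_Flag numeric

@data
50000,0
72000,1
61000,0
"""
converted = last_attribute_to_nominal(sample_arff)
print(converted)
```

Only the declaration of the last attribute changes; the data rows stay the same, but a tool can now treat the label as a set of classes instead of a number.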
VISUALIZATION
TOP DISTINCTIVE
CATEGORICAL
ATTRIBUTE
• The top distinctive categorical attribute is Risk Flag.
• This attribute is highly correlated with the class label
because it plays an important role in predicting
whether the customer ever defaulted.
TOP DISTINCTIVE
NUMERICAL ATTRIBUTE
• The top distinctive numerical attribute is the
creditworthiness of the customer.
• It plays a vital role in determining whether the customer
should be granted a loan, and at what interest rate.
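
A claim like this can be checked numerically by computing the Pearson correlation between a numeric attribute and the 0/1 class label. The following is a minimal sketch under stated assumptions: the `pearson` helper is written out by hand for illustration, and the `income` and `risk_flag` values are made up, so the resulting number says nothing about the real dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up values for illustration only; not taken from the real dataset.
income = [20000, 35000, 50000, 65000, 80000, 95000]
risk_flag = [1, 1, 0, 1, 0, 0]
r = pearson(income, risk_flag)
print(round(r, 3))
```

A value near -1 or 1 would indicate a strongly distinctive numeric attribute, while a value near 0 would indicate little linear relationship with the class label.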
ATTRIBUTE HAVING NO
IMPACT ON THE CLASS
LABEL
• The attribute that has no impact on the class label is
whether the customer is married or single.
• According to the visualization, there is minimal
correlation between the two.
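
The same reading can be expressed as numbers by comparing the default rate within each category. This is a minimal sketch, assuming a hypothetical helper `default_rate_by_group` and made-up values chosen so that both groups default equally often; it does not reproduce figures from the real dataset.

```python
from collections import defaultdict


def default_rate_by_group(groups, labels):
    """Return the share of label == 1 within each group value."""
    totals = defaultdict(int)
    defaults = defaultdict(int)
    for group, label in zip(groups, labels):
        totals[group] += 1
        defaults[group] += label
    return {group: defaults[group] / totals[group] for group in totals}


# Made-up values for illustration only; not taken from the real dataset.
married = ["single", "married", "single", "married",
           "single", "married", "single", "married"]
risk_flag = [1, 0, 0, 1, 0, 0, 1, 1]
rates = default_rate_by_group(married, risk_flag)
print(rates)
```

Near-equal rates across the groups are what minimal correlation looks like in numbers; a large gap would suggest the attribute does matter.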
THE END
