2. PROBLEM 1 - CREDIT CARD DATASET FOR CLUSTERING
Steps performed:
EDA
Handling missing values
Building K-means clustering model
Hyperparameter tuning of the model
Characterize customers who
(a) make frequent purchases using the credit card
(b) have high balances in their account
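The missing-value, model-building and tuning steps above could be sketched roughly as follows. This is only an illustration on synthetic data: median imputation, standard scaling, the silhouette score as the tuning criterion, and the range of k are all assumptions, since the slides do not say which choices were actually made.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.impute import SimpleImputer
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the credit card features (NOT the real dataset):
# three well-separated groups in two hypothetical numeric columns.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])
X[3, 1] = np.nan  # inject one missing value to exercise the imputation step

# Handle missing values (median imputation is an assumption), then scale
# so that no single feature dominates the Euclidean distances.
X_filled = SimpleImputer(strategy="median").fit_transform(X)
X_scaled = StandardScaler().fit_transform(X_filled)

# Tune the number of clusters k: fit K-means for each candidate k and
# keep the k with the highest silhouette score (assumed criterion).
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_scaled)
    scores[k] = silhouette_score(X_scaled, labels)

best_k = max(scores, key=scores.get)
print("silhouette by k:", scores)
print("selected k:", best_k)
```

An elbow plot of `KMeans.inertia_` against k is a common alternative to the silhouette score for choosing k.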
3. CHARACTERIZE CUSTOMERS – BALANCE, PURCHASE FREQUENCY
Cluster 2 groups customers with:
High Balance, High Purchase Frequency
High amount of purchases done in installment
High amount of cash advance transactions
Highest credit limit
Cluster 1 groups customers with:
Low Balance but relatively moderate purchase frequency
Low amount of cash advance transactions, installment purchases and credit limit
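Cluster profiles like the two above are typically read off per-cluster summary statistics. A minimal sketch, assuming a DataFrame that already carries a cluster label; the column names and values below are made up for illustration and are not taken from the real dataset:

```python
import pandas as pd

# Toy rows with hypothetical column names and made-up values.
df = pd.DataFrame({
    "cluster": [1, 1, 1, 2, 2, 2],
    "balance": [100.0, 150.0, 200.0, 4000.0, 5000.0, 6000.0],
    "purchases_frequency": [0.4, 0.5, 0.6, 0.8, 0.9, 1.0],
    "credit_limit": [1000.0, 1500.0, 2000.0, 9000.0, 10000.0, 11000.0],
})

# Mean of every feature within each cluster: one row per cluster.
profile = df.groupby("cluster").mean()

# For each feature, the cluster whose mean is highest.
top_cluster = profile.idxmax()

print(profile)
print(top_cluster)
```

Comparing the rows of `profile` side by side is what supports statements such as "high balance" or "highest credit limit" for a given cluster.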
4. PROBLEM 2 - SOUTH GERMAN CREDIT DATA FOR CLUSTERING
Steps performed:
EDA
Checking for missing values and dropping the target column
Building K-means clustering model
Hyperparameter tuning of the model
Predicting good/bad credits
Characterize debtors who are bad credits
5. PREDICTING GOOD/BAD CREDITS
The target column is added back to the cluster output dataset
Records are grouped by cluster; the majority target value within each cluster is assigned as the predicted value for all records in that cluster
The predicted value is compared against the actual target value, and accuracy is calculated as correct predictions / total records
True positives and false negatives are also calculated by comparing predicted vs. actual target values
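The majority-voting procedure above can be illustrated with a small sketch using only the standard library. The data is made up, and both the label encoding (1 = good, 0 = bad) and the choice of "bad credit" as the positive class are assumptions made for this example:

```python
from collections import Counter, defaultdict

# Toy data (made up): cluster label and actual target for ten records.
# Assumed encoding: 1 = good credit, 0 = bad credit.
clusters = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]
actual = [1, 1, 0, 0, 0, 1, 0, 1, 1, 1]

# Group the actual target values by cluster.
by_cluster = defaultdict(list)
for c, y in zip(clusters, actual):
    by_cluster[c].append(y)

# Majority vote: the most frequent target value in each cluster becomes
# the prediction for every record in that cluster. On a tie,
# Counter.most_common returns the value that was seen first.
majority = {c: Counter(ys).most_common(1)[0][0] for c, ys in by_cluster.items()}
predicted = [majority[c] for c in clusters]

# Accuracy = correct predictions / total records.
accuracy = sum(p == y for p, y in zip(predicted, actual)) / len(actual)

# True positives and false negatives, with "bad credit" (0) treated as
# the positive class (an assumption for this example).
POSITIVE = 0
true_pos = sum(p == POSITIVE and y == POSITIVE for p, y in zip(predicted, actual))
false_neg = sum(p != POSITIVE and y == POSITIVE for p, y in zip(predicted, actual))

print(majority, accuracy, true_pos, false_neg)
```

Because the vote uses the actual target values, this accuracy measures how well the clusters align with the labels rather than performance on unseen data.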
6. CHARACTERIZING DEBTORS WHO ARE BAD CREDITS
Clusters 7 and 10 predict bad credits
Based on the statistics for these clusters, bad credits tend to have:
Credit duration of around 40 months
High credit amount
Another debtor or guarantor for the credit
No valuable property
Debtor age of around 35 years