Reduction in Customer Complaints –
Mortgage Servicing Industry
By predicting which customers are likely to complain and taking
proactive steps to prevent the complaints.
Original Project Partners
Pranov Mishra
Aniket Chhabra
Vivek Chandel
Madhu Gollpudi
The code and cleaned dataset can be
found in my GitHub account.
Link
(https://github.com/Pranov1984/Great-Lakes-Capstone-Project)
Executive Summary
Project Overview:
The project analyses customer complaints/inquiries received by a US-based mortgage (loan)
servicing company. The scope of the project is limited to complaints about the part of the servicing
life cycle related to Escrow Analysis and other related or subsidiary activities.
Goal Statement
Identify the major contributors to complaints/inquiries, then use the significant
contributors to recommend changes or new implementations with the goals below:
▫ Reduce Re-work
▫ Reduce Operational Cost
▫ Improve Customer Satisfaction
▫ Improve company preparedness to respond to customers
Data Considered
A few months of data on the organisation's standard servicing loans, comprising circa 154,000
records, was used for data exploration, visualization and hypothesis generation. The data had many missing
values, typically where the event corresponding to the variable concerned did not
occur for that observation. The data was cleaned by creating dummy variables with no missing values.
The code and dataset provided at the GitHub link reflect the cleaned data.
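The dummy-variable cleaning described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the project's own code: the column names (`waiver_date`, `reversal_date`), the helper `to_event_flag`, and the sample records are all hypothetical.

```python
from typing import Optional

def to_event_flag(value: Optional[str]) -> int:
    """Return 1 if the event was recorded, 0 if the value is missing."""
    if value is None:
        return 0
    return 0 if value.strip() in ("", "NA", "NaN") else 1

# Hypothetical loan records: a missing value means the event never occurred.
records = [
    {"loan_id": 1, "waiver_date": "2017-03-02", "reversal_date": None},
    {"loan_id": 2, "waiver_date": None, "reversal_date": "NA"},
    {"loan_id": 3, "waiver_date": "", "reversal_date": "2017-04-11"},
]

event_columns = ["waiver_date", "reversal_date"]  # assumed column names

# Replace each sparse event column with a 0/1 dummy that has no missing values.
cleaned = [
    {
        "loan_id": r["loan_id"],
        **{c.replace("_date", "_flag"): to_event_flag(r[c]) for c in event_columns},
    }
    for r in records
]

for row in cleaned:
    print(row)
```

Encoding "event did not occur" as 0 keeps every observation in the dataset instead of dropping rows with missing values.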
Executive Summary Continued ...
• Escalations typically lead to extra work, reputational damage, sometimes regulatory scrutiny and penalties.
Preventing customer complaints and escalations is in the best interest of the company.
• The data was highly imbalanced (majority class : minority class = 96% : 4%), so an appropriate model evaluation
metric had to be chosen. A combination of the F1 score (the harmonic mean of precision and recall) and Area Under
the Curve (AUC) was used to select the best model.
• The models tried were:
▫ A simple model, Logistic Regression, with different thresholds for classification
▫ Random Forest, after balancing the dataset using Synthetic Minority Oversampling Technique (SMOTE)
▫ Stochastic Gradient Boosting, after balancing the dataset in the same way as for Random Forest
• The key insights from the best-performing model indicate that the variables that significantly affect
customer behaviour can be broadly classified as follows:
▫ Waiver of escrow payments, which could arise when an incorrect escrow analysis leads the customer to
request a waiver of the extra charges levied.
▫ Presence of “Initials”, which comes into play when a customer is escrowed for the first time. The customer
could be escalating either because they were incorrectly escrowed or because the initial payment calculation
for the escrow services is incorrect.
▫ Inadequate handling of force-escrowed loans, leading to customer queries and complaints.
• A gains chart was prepared, which gave a cumulative lift of 133% in the first 4 deciles, and the customers with
the highest probability of making a complaint were identified. The company could use this information to
proactively review the operations performed on each such customer’s account and correct any errors found.
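To illustrate why accuracy alone is a weak guide at a 96% : 4% class ratio, and why the classification threshold matters, here is a minimal Python sketch. The scores are synthetic and the helper names are hypothetical; nothing here reproduces the project's actual model output.

```python
def confusion(y_true, scores, threshold):
    """Count TP, FP, FN, TN when scores >= threshold are predicted as complaints."""
    tp = fp = fn = tn = 0
    for y, s in zip(y_true, scores):
        predicted = 1 if s >= threshold else 0
        if predicted and y:
            tp += 1
        elif predicted and not y:
            fp += 1
        elif y:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Synthetic example: 96 non-complainers and 4 complainers (96% : 4%).
y_true = [0] * 96 + [1] * 4
scores = [i / 300 for i in range(96)] + [0.20, 0.35, 0.45, 0.80]

results = {}
for threshold in (0.50, 0.33, 0.25, 0.15):
    tp, fp, fn, tn = confusion(y_true, scores, threshold)
    accuracy = (tp + tn) / len(y_true)
    results[threshold] = (accuracy, f1_score(tp, fp, fn))
    print(f"threshold={threshold:.2f} accuracy={accuracy:.2f} F1={results[threshold][1]:.3f}")

# Pick the threshold with the highest F1 rather than the highest accuracy.
best_threshold = max(results, key=lambda t: results[t][1])
print("best threshold by F1:", best_threshold)
```

In this toy data the default 0.50 threshold is 97% accurate yet finds only one of the four complainers, which is the kind of result that motivates F1 and AUC over accuracy.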
Brief Overview of Escrow Account
Escrow:
Money held by a third party on behalf of transacting parties
Escrow account:
An escrow account is established with a lender to pay for recurring
expenses related to one’s property, such as real estate taxes and
homeowner’s insurance.
It helps the borrower anticipate and manage payment of these expenses
by including them as a portion of the monthly mortgage payment.
How does an escrow account work?
• When an escrow account is established, the customer’s
annual real estate taxes and homeowner’s insurance are estimated,
based on the customer’s most recent bills and premiums.
• An incremental amount for these expenses is added to the
customer’s monthly mortgage payment, in order to cover the
expenses when they are due.
• Each year, the escrow account is reviewed to determine whether the
amount being escrowed each month is sufficient to pay for any
change in the customer’s real estate taxes or homeowner’s insurance
premiums.
• In case a non-escrowed customer defaults on payment of taxes and
insurance, the lender advances payments to the respective agencies
to protect its rights on the property. The lender then force escrows
the delinquent customer to recover the money.
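The escrow mechanics above can be illustrated with a simplified Python sketch. It assumes the estimated annual bills are spread evenly over 12 months and ignores any cushion or servicer-specific rules; the function names and dollar figures are hypothetical.

```python
def monthly_escrow(annual_taxes: float, annual_insurance: float) -> float:
    """Spread the estimated annual bills evenly over 12 monthly payments (assumption)."""
    return round((annual_taxes + annual_insurance) / 12, 2)

def annual_review(collected: float, disbursed: float):
    """Classify the year-end escrow position as a surplus, a shortage, or balanced."""
    difference = round(collected - disbursed, 2)
    if difference > 0:
        return "surplus", difference
    if difference < 0:
        return "shortage", -difference
    return "balanced", 0.0

# Illustrative figures only: estimates taken from last year's bills.
payment = monthly_escrow(annual_taxes=3600.0, annual_insurance=1200.0)
collected = payment * 12

# Taxes rose during the year, so more was paid out than was collected.
status, amount = annual_review(collected, disbursed=5100.0)
print(payment, status, amount)

# The yearly review then resets the monthly amount from the new bills.
new_payment = monthly_escrow(annual_taxes=3900.0, annual_insurance=1200.0)
print(new_payment)
```

A surplus or shortage found at the yearly review is the kind of event the later slides link to customer complaints.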
Analytical framework used to prepare the model
Data Treatment
• Missing value treatment
• Outlier treatment
• Removing inconsistencies
Exploratory Data Analysis
• Review of each variable and transforming the appropriate variables
Balancing Data set
• Event rate is highly skewed in favor of “No Complaints”.
• SMOTE* used to balance the data.
Data Partition
• Data was split in 70:30 ratio.
• Models were trained on 70% data & validated on 30%.
* Synthetic Minority Oversampling Technique
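The balancing and partition steps can be sketched in plain Python as below. This is a simplified stand-in for SMOTE, not the library routine the project used; the data is synthetic and the function names are hypothetical. The sketch oversamples only the 70% training portion so that the 30% validation data keeps its original class ratio; the slide lists balancing before partitioning, so the original project may have ordered these steps differently.

```python
import random

def stratified_split(labels, train_frac=0.7, seed=42):
    """Split row indices 70:30 while preserving the class ratio in both parts."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_frac)
        train_idx += idx[:cut]
        test_idx += idx[cut:]
    return train_idx, test_idx

def smote(minority, n_new, k=3, seed=42):
    """Create n_new synthetic points, each interpolated between a minority point
    and one of its k nearest minority neighbours (needs at least 2 points)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic

# Synthetic two-feature data with a 96% : 4% class ratio.
data_rng = random.Random(0)
rows = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(192)]
rows += [[data_rng.gauss(2, 1), data_rng.gauss(2, 1)] for _ in range(8)]
labels = [0] * 192 + [1] * 8

train_idx, test_idx = stratified_split(labels)
train_minority = [rows[i] for i in train_idx if labels[i] == 1]
n_majority = sum(1 for i in train_idx if labels[i] == 0)

# Add just enough synthetic minority rows to match the majority count.
synthetic = smote(train_minority, n_new=n_majority - len(train_minority))
print(len(train_idx), len(test_idx), n_majority, len(train_minority) + len(synthetic))
```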
Model Building
• Logistic Regression
• Random Forest
• Stochastic Gradient Boosting
Validation – Evaluation Metrics
• Accuracy
• Sensitivity (TPR) & Specificity (TNR)
• Area under the Curve (AUC)
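The validation metrics listed above can be computed from first principles. The Python sketch below uses a tiny made-up set of labels and scores; the function names are hypothetical, and the AUC is computed through its rank interpretation (the probability that a random complainer is scored above a random non-complainer).

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity (true positive rate) and specificity (true negative rate)."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def auc(y_true, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up validation labels and model scores.
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.5, 0.9]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

accuracy = sum(1 for y, p in zip(y_true, y_pred) if y == p) / len(y_true)
sens, spec = sensitivity_specificity(y_true, y_pred)
auc_value = auc(y_true, scores)
print(f"accuracy={accuracy:.2f} sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc_value:.3f}")
```

Unlike accuracy, sensitivity and specificity, the AUC does not depend on the chosen threshold, which makes it useful for comparing models on imbalanced data.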
Data Visualization & Data Preparation
Both graphs suggest that these are factor variables: most of the data crowds around zero and the
remaining few points crowd around one, with no data points in between. The variables were transformed to
factors.
Data Visualization & Data Preparation
Both graphs suggest that these are factor variables: most of the data crowds around zero and the
remaining few points crowd around one, with no data points in between. The variables were transformed to
factors. Additionally, the presence of Waiver appears to be associated with complaints received.
Data Visualization & Data Preparation
Both graphs suggest that these are factor variables: most of the data crowds around zero and the
remaining few points crowd around one, with no data points in between. The variables were transformed to
factors. Additionally, the presence of Reversed appears to be associated with complaints received to a
greater degree.
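The two checks described on these slides, confirming that a numeric column only takes the values 0 and 1 and comparing complaint rates across its levels, can be sketched in Python as follows. The `waiver` and `complaint` vectors are invented for illustration and the function names are hypothetical.

```python
from collections import Counter

def is_binary(values):
    """True when the column only ever takes the values 0 and 1."""
    return set(values) <= {0, 1}

def complaint_rate_by_level(flag, complaint):
    """Complaint rate within each level of a 0/1 flag."""
    totals, complaints = Counter(), Counter()
    for f, c in zip(flag, complaint):
        totals[f] += 1
        complaints[f] += c
    return {level: complaints[level] / totals[level] for level in sorted(totals)}

# Invented data: 90 loans without a waiver, 10 with one.
waiver = [0] * 90 + [1] * 10
complaint = [1] * 2 + [0] * 88 + [1] * 3 + [0] * 7

if is_binary(waiver):
    # Treat the column as a categorical (factor) variable rather than a number.
    waiver_factor = ["No" if v == 0 else "Yes" for v in waiver]

rates = complaint_rate_by_level(waiver, complaint)
print(rates)
```

A much higher complaint rate in one level than the other is what the slides read off the graphs as an association with complaints.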
Data Visualization & Data Preparation
Surplus looks like a numeric variable with a high number of outliers: most of the values are small, with
a few very high values. Close to 17% of observations are outliers. On deciling the variable, eight deciles,
which constitute 80% of the data, have a minimum and maximum of zero. Complaints are received only in the 9th and
10th deciles, where the surplus is greater than 0. Essentially, there is a very high chance of a complaint/query
when there is a surplus. This suggests that the customer thinks the analysis is incorrect or that the company is
not returning the surplus on time. The variable was converted to a factor.
summary(mydata$Surplus)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
     0.0      0.0      0.0     75.7      0.0 452213.2
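The decile analysis described for Surplus can be sketched in Python as below. The data is synthetic, built so that 80% of loans have a zero surplus as on the slide, and the function name `decile_table` is hypothetical. Deciles are assigned by rank, so tied zero values can still fill eight equal-sized groups.

```python
def decile_table(values, outcomes):
    """Sort by value, cut into 10 equal-sized groups, and summarise each group.
    Assumes at least 10 observations."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    n = len(order)
    table = []
    for d in range(10):
        idx = order[d * n // 10:(d + 1) * n // 10]
        group = [values[i] for i in idx]
        table.append({
            "decile": d + 1,
            "min": min(group),
            "max": max(group),
            "complaints": sum(outcomes[i] for i in idx),
        })
    return table

# Synthetic data: 80 loans with no surplus, 20 with a positive surplus.
surplus = [0.0] * 80 + [10.0 * (i + 1) for i in range(20)]
complaint = [0] * 80 + [1 if i % 4 == 0 else 0 for i in range(20)]

table = decile_table(surplus, complaint)
for row in table:
    print(row)

# Since only "surplus present or not" matters here, reduce the variable to a flag.
surplus_flag = [1 if v > 0 else 0 for v in surplus]
```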
Data Visualization & Data Preparation
Shortage Spread looks like a continuous variable but has outliers. The outliers were treated by capping the
extreme values to the range between 0 and the 85th percentile of the actual values.
Whenever there is a shortage, there is a higher chance of a complaint.
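The outlier treatment above amounts to capping values at a percentile. A minimal Python sketch, using invented shortage values and hypothetical helper names, with the percentile computed by linear interpolation between the closest ranks:

```python
def percentile(values, pct):
    """Percentile by linear interpolation between the two closest ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * pct / 100
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])

def cap(values, lower, upper):
    """Pull every value into the range [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]

# Invented shortage amounts with two extreme values at the end.
shortage = [0, 0, 0, 0, 5, 10, 15, 20, 25, 30,
            35, 40, 45, 50, 55, 60, 65, 70, 900, 12000]

p85 = percentile(shortage, 85)
capped = cap(shortage, 0, p85)
print(p85, capped[-3:])
```

Capping keeps every observation while limiting the influence of the few extreme values on the model.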
Data Visualization & Data Preparation
Because the numbers are highly skewed in favor of NonBK analysis, a separate analysis restricted to NonBK analysis
loans could be contemplated.
Random Forest models gave the best results. Tuning the mtry parameter (the number of variables randomly sampled
as candidates at each split) improved the results marginally. The gradient boosting results were nearly as good
as, though marginally lower than, the results from Random Forest. The important point to note, though, is
that the tree-based models were trained on data that had been balanced using SMOTE (Synthetic
Minority Oversampling Technique). The logistic regression results were the least impressive.
Lift Chart
A cumulative lift of 133% is achieved in the first four deciles. This means that by selecting the 40% of customers
with the highest predicted probability, with the aid of the model and the associated gains chart, we can identify
more than 50% of the customers who are likely to complain. Selecting 40% of customers at random, without the
model, would be expected to identify only about 40% of the potential complaining customers.
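A cumulative gains and lift table can be built as in the Python sketch below. The data is synthetic and was chosen so that the top four deciles capture 8 of 15 complainers (about 53%), which gives a cumulative lift of about 1.33 (133%) relative to the 40% expected from random selection. It does not reproduce the project's actual scores, and `gains_table` is a hypothetical name.

```python
def gains_table(y_true, scores, n_bins=10):
    """Rank customers by score (highest first) and accumulate captured events per bin."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n = len(order)
    total_events = sum(y_true)
    table, captured = [], 0
    for b in range(n_bins):
        idx = order[b * n // n_bins:(b + 1) * n // n_bins]
        captured += sum(y_true[i] for i in idx)
        pct_customers = (b + 1) / n_bins
        pct_events = captured / total_events
        # Cumulative lift: share of events captured / share of customers contacted.
        table.append((b + 1, pct_customers, pct_events, pct_events / pct_customers))
    return table

# Synthetic data: 100 customers already ordered by descending score, 15 complainers.
scores = [1 - i / 100 for i in range(100)]
complainers = {0, 3, 7, 12, 18, 25, 31, 38, 44, 52, 61, 70, 79, 88, 97}
y_true = [1 if i in complainers else 0 for i in range(100)]

table = gains_table(y_true, scores)
for decile, pct_customers, pct_events, lift in table:
    print(f"decile {decile:2d}: customers={pct_customers:.0%} "
          f"complainers captured={pct_events:.1%} lift={lift:.2f}")
```

The cumulative lift always returns to 1.0 in the last decile, because contacting every customer captures every complainer with or without a model.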
Variable importance from the Random Forest model, which gave the best results

More Related Content

What's hot

Deep learning for NLP and Transformer
 Deep learning for NLP  and Transformer Deep learning for NLP  and Transformer
Deep learning for NLP and Transformer
Arvind Devaraj
 
Gnn overview
Gnn overviewGnn overview
Gnn overview
Louis (Yufeng) Wang
 
Generative Adversarial Networks and Their Medical Imaging Applications
Generative Adversarial Networks and Their Medical Imaging ApplicationsGenerative Adversarial Networks and Their Medical Imaging Applications
Generative Adversarial Networks and Their Medical Imaging Applications
Kyuhwan Jung
 
FAKE NEWS DETECTION (1).pptx
FAKE NEWS DETECTION (1).pptxFAKE NEWS DETECTION (1).pptx
FAKE NEWS DETECTION (1).pptx
SrivarshiniInakollu
 
CounterFactual Explanations.pdf
CounterFactual Explanations.pdfCounterFactual Explanations.pdf
CounterFactual Explanations.pdf
Bong-Ho Lee
 
Interpretable machine learning
Interpretable machine learningInterpretable machine learning
Interpretable machine learning
Sri Ambati
 
Introduction to Grad-CAM (complete version)
Introduction to Grad-CAM (complete version)Introduction to Grad-CAM (complete version)
Introduction to Grad-CAM (complete version)
Hsing-chuan Hsieh
 
Generative AI Fundamentals - Databricks
Generative AI Fundamentals - DatabricksGenerative AI Fundamentals - Databricks
Generative AI Fundamentals - Databricks
Vijayananda Mohire
 
Data drift and machine learning
Data drift and machine learningData drift and machine learning
Data drift and machine learning
Smita Agrawal
 
Tutorial on Deep Generative Models
 Tutorial on Deep Generative Models Tutorial on Deep Generative Models
Tutorial on Deep Generative Models
MLReview
 
Credit card fraud detection through machine learning
Credit card fraud detection through machine learningCredit card fraud detection through machine learning
Credit card fraud detection through machine learning
dataalcott
 
OpenAI’s GPT 3 Language Model - guest Steve Omohundro
OpenAI’s GPT 3 Language Model - guest Steve OmohundroOpenAI’s GPT 3 Language Model - guest Steve Omohundro
OpenAI’s GPT 3 Language Model - guest Steve Omohundro
Numenta
 
Domain adaptation
Domain adaptationDomain adaptation
Domain adaptation
Tomoya Koike
 
07 regularization
07 regularization07 regularization
07 regularization
Ronald Teo
 
And then there were ... Large Language Models
And then there were ... Large Language ModelsAnd then there were ... Large Language Models
And then there were ... Large Language Models
Leon Dohmen
 
Generative models
Generative modelsGenerative models
Generative models
Birger Moell
 
Data Quality for Machine Learning Tasks
Data Quality for Machine Learning TasksData Quality for Machine Learning Tasks
Data Quality for Machine Learning Tasks
Hima Patel
 
Split Learning versus Federated Learning for Data Transparent ML, Camera Cult...
Split Learning versus Federated Learning for Data Transparent ML, Camera Cult...Split Learning versus Federated Learning for Data Transparent ML, Camera Cult...
Split Learning versus Federated Learning for Data Transparent ML, Camera Cult...
Camera Culture Group, MIT Media Lab
 
Artificial Intelligence Roadmap 2021-2025
Artificial Intelligence Roadmap 2021-2025Artificial Intelligence Roadmap 2021-2025
Artificial Intelligence Roadmap 2021-2025
Ikhwan115951
 
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attent...
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attent...LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attent...
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attent...
Po-Chuan Chen
 

What's hot (20)

Deep learning for NLP and Transformer
 Deep learning for NLP  and Transformer Deep learning for NLP  and Transformer
Deep learning for NLP and Transformer
 
Gnn overview
Gnn overviewGnn overview
Gnn overview
 
Generative Adversarial Networks and Their Medical Imaging Applications
Generative Adversarial Networks and Their Medical Imaging ApplicationsGenerative Adversarial Networks and Their Medical Imaging Applications
Generative Adversarial Networks and Their Medical Imaging Applications
 
FAKE NEWS DETECTION (1).pptx
FAKE NEWS DETECTION (1).pptxFAKE NEWS DETECTION (1).pptx
FAKE NEWS DETECTION (1).pptx
 
CounterFactual Explanations.pdf
CounterFactual Explanations.pdfCounterFactual Explanations.pdf
CounterFactual Explanations.pdf
 
Interpretable machine learning
Interpretable machine learningInterpretable machine learning
Interpretable machine learning
 
Introduction to Grad-CAM (complete version)
Introduction to Grad-CAM (complete version)Introduction to Grad-CAM (complete version)
Introduction to Grad-CAM (complete version)
 
Generative AI Fundamentals - Databricks
Generative AI Fundamentals - DatabricksGenerative AI Fundamentals - Databricks
Generative AI Fundamentals - Databricks
 
Data drift and machine learning
Data drift and machine learningData drift and machine learning
Data drift and machine learning
 
Tutorial on Deep Generative Models
 Tutorial on Deep Generative Models Tutorial on Deep Generative Models
Tutorial on Deep Generative Models
 
Credit card fraud detection through machine learning
Credit card fraud detection through machine learningCredit card fraud detection through machine learning
Credit card fraud detection through machine learning
 
OpenAI’s GPT 3 Language Model - guest Steve Omohundro
OpenAI’s GPT 3 Language Model - guest Steve OmohundroOpenAI’s GPT 3 Language Model - guest Steve Omohundro
OpenAI’s GPT 3 Language Model - guest Steve Omohundro
 
Domain adaptation
Domain adaptationDomain adaptation
Domain adaptation
 
07 regularization
07 regularization07 regularization
07 regularization
 
And then there were ... Large Language Models
And then there were ... Large Language ModelsAnd then there were ... Large Language Models
And then there were ... Large Language Models
 
Generative models
Generative modelsGenerative models
Generative models
 
Data Quality for Machine Learning Tasks
Data Quality for Machine Learning TasksData Quality for Machine Learning Tasks
Data Quality for Machine Learning Tasks
 
Split Learning versus Federated Learning for Data Transparent ML, Camera Cult...
Split Learning versus Federated Learning for Data Transparent ML, Camera Cult...Split Learning versus Federated Learning for Data Transparent ML, Camera Cult...
Split Learning versus Federated Learning for Data Transparent ML, Camera Cult...
 
Artificial Intelligence Roadmap 2021-2025
Artificial Intelligence Roadmap 2021-2025Artificial Intelligence Roadmap 2021-2025
Artificial Intelligence Roadmap 2021-2025
 
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attent...
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attent...LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attent...
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attent...
 

Similar to Reduction in customer complaints - Mortgage Industry

Churn in the Telecommunications Industry
Churn in the Telecommunications IndustryChurn in the Telecommunications Industry
Churn in the Telecommunications Industry
skewdlogix
 
Creditscore
CreditscoreCreditscore
Creditscorekevinlan
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
Boston Institute of Analytics
 
Case Study: It’s All About Data – And the Customer
Case Study: It’s All About Data – And the CustomerCase Study: It’s All About Data – And the Customer
Case Study: It’s All About Data – And the Customer
Jill Kirkpatrick
 
Customer_Churn_prediction.pptx
Customer_Churn_prediction.pptxCustomer_Churn_prediction.pptx
Customer_Churn_prediction.pptx
Aniket Patil
 
Customer_Churn_prediction.pptx
Customer_Churn_prediction.pptxCustomer_Churn_prediction.pptx
Customer_Churn_prediction.pptx
patilaniket2418
 
Predictive analytics-white-paper
Predictive analytics-white-paperPredictive analytics-white-paper
Predictive analytics-white-paperShubhashish Biswas
 
7. Plan, perform, and evaluate samples for substantive procedures IPPTChap009...
7. Plan, perform, and evaluate samples for substantive procedures IPPTChap009...7. Plan, perform, and evaluate samples for substantive procedures IPPTChap009...
7. Plan, perform, and evaluate samples for substantive procedures IPPTChap009...
55296
 
Leveraging Data Analysis for Sales
Leveraging Data Analysis for SalesLeveraging Data Analysis for Sales
Leveraging Data Analysis for Sales
Aditya Ratnaparkhi
 
statistical measurement project presentation
statistical measurement project presentationstatistical measurement project presentation
statistical measurement project presentation
KexinZhang22
 
Bank churn with Data Science
Bank churn with Data ScienceBank churn with Data Science
Bank churn with Data Science
Carolyn Knight
 
Telecom customer churn prediction
Telecom customer churn predictionTelecom customer churn prediction
Telecom customer churn prediction
Saleesh Satheeshchandran
 
Module_6_-_Datamining_tasks_and_tools_uGuVaDv4iv-2.pptx
Module_6_-_Datamining_tasks_and_tools_uGuVaDv4iv-2.pptxModule_6_-_Datamining_tasks_and_tools_uGuVaDv4iv-2.pptx
Module_6_-_Datamining_tasks_and_tools_uGuVaDv4iv-2.pptx
HarshitGoel87
 
Neural Network Model
Neural Network ModelNeural Network Model
Neural Network ModelEric Esajian
 
1440 track 2 boire_using our laptop
1440 track 2 boire_using our laptop1440 track 2 boire_using our laptop
1440 track 2 boire_using our laptop
Rising Media, Inc.
 
201406 IASA: Analytics Maturity - Unlocking The Business Impact
201406 IASA: Analytics Maturity - Unlocking The Business Impact201406 IASA: Analytics Maturity - Unlocking The Business Impact
201406 IASA: Analytics Maturity - Unlocking The Business Impact
Steven Callahan
 
Essay On Stamford International Inc
Essay On Stamford International IncEssay On Stamford International Inc
Essay On Stamford International Inc
Deborah Gastineau
 
CollectionOptimization
CollectionOptimizationCollectionOptimization
CollectionOptimizationMike Nguyen
 
Data mining and analysis of customer churn dataset
Data mining and analysis of customer churn datasetData mining and analysis of customer churn dataset
Data mining and analysis of customer churn dataset
Rohan Choksi
 

Similar to Reduction in customer complaints - Mortgage Industry (20)

Churn in the Telecommunications Industry
Churn in the Telecommunications IndustryChurn in the Telecommunications Industry
Churn in the Telecommunications Industry
 
Creditscore
CreditscoreCreditscore
Creditscore
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 
Case Study: It’s All About Data – And the Customer
Case Study: It’s All About Data – And the CustomerCase Study: It’s All About Data – And the Customer
Case Study: It’s All About Data – And the Customer
 
Customer_Churn_prediction.pptx
Customer_Churn_prediction.pptxCustomer_Churn_prediction.pptx
Customer_Churn_prediction.pptx
 
Customer_Churn_prediction.pptx
Customer_Churn_prediction.pptxCustomer_Churn_prediction.pptx
Customer_Churn_prediction.pptx
 
Predictive analytics-white-paper
Predictive analytics-white-paperPredictive analytics-white-paper
Predictive analytics-white-paper
 
7. Plan, perform, and evaluate samples for substantive procedures IPPTChap009...
7. Plan, perform, and evaluate samples for substantive procedures IPPTChap009...7. Plan, perform, and evaluate samples for substantive procedures IPPTChap009...
7. Plan, perform, and evaluate samples for substantive procedures IPPTChap009...
 
Leveraging Data Analysis for Sales
Leveraging Data Analysis for SalesLeveraging Data Analysis for Sales
Leveraging Data Analysis for Sales
 
statistical measurement project presentation
statistical measurement project presentationstatistical measurement project presentation
statistical measurement project presentation
 
Bank churn with Data Science
Bank churn with Data ScienceBank churn with Data Science
Bank churn with Data Science
 
Telecom customer churn prediction
Telecom customer churn predictionTelecom customer churn prediction
Telecom customer churn prediction
 
Module_6_-_Datamining_tasks_and_tools_uGuVaDv4iv-2.pptx
Module_6_-_Datamining_tasks_and_tools_uGuVaDv4iv-2.pptxModule_6_-_Datamining_tasks_and_tools_uGuVaDv4iv-2.pptx
Module_6_-_Datamining_tasks_and_tools_uGuVaDv4iv-2.pptx
 
Neural Network Model
Neural Network ModelNeural Network Model
Neural Network Model
 
Final Report
Final ReportFinal Report
Final Report
 
1440 track 2 boire_using our laptop
1440 track 2 boire_using our laptop1440 track 2 boire_using our laptop
1440 track 2 boire_using our laptop
 
201406 IASA: Analytics Maturity - Unlocking The Business Impact
201406 IASA: Analytics Maturity - Unlocking The Business Impact201406 IASA: Analytics Maturity - Unlocking The Business Impact
201406 IASA: Analytics Maturity - Unlocking The Business Impact
 
Essay On Stamford International Inc
Essay On Stamford International IncEssay On Stamford International Inc
Essay On Stamford International Inc
 
CollectionOptimization
CollectionOptimizationCollectionOptimization
CollectionOptimization
 
Data mining and analysis of customer churn dataset
Data mining and analysis of customer churn datasetData mining and analysis of customer churn dataset
Data mining and analysis of customer churn dataset
 

More from Pranov Mishra

Automation of IT Ticket Automation using NLP and Deep Learning
Automation of IT Ticket Automation using NLP and Deep LearningAutomation of IT Ticket Automation using NLP and Deep Learning
Automation of IT Ticket Automation using NLP and Deep Learning
Pranov Mishra
 
Sales Performance Deep Dive and Forecast: A ML Driven Analytics Solution
Sales Performance Deep Dive and Forecast: A ML Driven Analytics SolutionSales Performance Deep Dive and Forecast: A ML Driven Analytics Solution
Sales Performance Deep Dive and Forecast: A ML Driven Analytics Solution
Pranov Mishra
 
Prediction of potential customers for term deposit
Prediction of potential customers for term depositPrediction of potential customers for term deposit
Prediction of potential customers for term deposit
Pranov Mishra
 
Prediction of customer propensity to churn - Telecom Industry
Prediction of customer propensity to churn - Telecom IndustryPrediction of customer propensity to churn - Telecom Industry
Prediction of customer propensity to churn - Telecom Industry
Pranov Mishra
 
Impact of Macro-Economic Factors on Customer Behaviour in the US Insurance In...
Impact of Macro-Economic Factors on Customer Behaviour in the US Insurance In...Impact of Macro-Economic Factors on Customer Behaviour in the US Insurance In...
Impact of Macro-Economic Factors on Customer Behaviour in the US Insurance In...
Pranov Mishra
 
Recommendations for Preventive Maintenance - A Machine Learning Project
Recommendations for Preventive Maintenance - A Machine Learning ProjectRecommendations for Preventive Maintenance - A Machine Learning Project
Recommendations for Preventive Maintenance - A Machine Learning Project
Pranov Mishra
 

More from Pranov Mishra (6)

Automation of IT Ticket Automation using NLP and Deep Learning
Automation of IT Ticket Automation using NLP and Deep LearningAutomation of IT Ticket Automation using NLP and Deep Learning
Automation of IT Ticket Automation using NLP and Deep Learning
 
Sales Performance Deep Dive and Forecast: A ML Driven Analytics Solution
Sales Performance Deep Dive and Forecast: A ML Driven Analytics SolutionSales Performance Deep Dive and Forecast: A ML Driven Analytics Solution
Sales Performance Deep Dive and Forecast: A ML Driven Analytics Solution
 
Prediction of potential customers for term deposit
Prediction of potential customers for term depositPrediction of potential customers for term deposit
Prediction of potential customers for term deposit
 
Prediction of customer propensity to churn - Telecom Industry
Prediction of customer propensity to churn - Telecom IndustryPrediction of customer propensity to churn - Telecom Industry
Prediction of customer propensity to churn - Telecom Industry
 
Impact of Macro-Economic Factors on Customer Behaviour in the US Insurance In...
Impact of Macro-Economic Factors on Customer Behaviour in the US Insurance In...Impact of Macro-Economic Factors on Customer Behaviour in the US Insurance In...
Impact of Macro-Economic Factors on Customer Behaviour in the US Insurance In...
 
Recommendations for Preventive Maintenance - A Machine Learning Project
Recommendations for Preventive Maintenance - A Machine Learning ProjectRecommendations for Preventive Maintenance - A Machine Learning Project
Recommendations for Preventive Maintenance - A Machine Learning Project
 

Recently uploaded

一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
yhkoc
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
ArpitMalhotra16
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
haila53
 
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
nscud
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
ewymefz
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
ukgaet
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
axoqas
 
tapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive datatapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive data
theahmadsaood
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMI
AlejandraGmez176757
 
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
Tiktokethiodaily
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
TravisMalana
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
enxupq
 
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape ReportSOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar
 
Investigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_CrimesInvestigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_Crimes
StarCompliance.io
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
NABLAS株式会社
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
AbhimanyuSinha9
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
ewymefz
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Subhajit Sahu
 

Recently uploaded (20)

一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf

Reduction in customer complaints - Mortgage Industry

  • 1. Reduction in Customer Complaints – Mortgage Servicing Industry. Predicting which customers are likely to complain and taking proactive steps to prevent those complaints. Original project partners: Pranov Mishra, Aniket Chhabra, Vivek Chandel, Madhu Gollpudi. The code and the cleaned dataset are available on GitHub: https://github.com/Pranov1984/Great-Lakes-Capstone-Project
  • 2. Executive Summary
    Project Overview: The project analyses customer complaints and inquiries received by a US-based mortgage (loan) servicing company. The scope is limited to complaints about the part of the servicing life cycle related to escrow analysis and its related or subsidiary activities.
    Goal Statement: Identify the major contributors to complaints and inquiries, and use the significant contributors to recommend changes or new implementations with the goals below.
      ▫ Reduce rework
      ▫ Reduce operational cost
      ▫ Improve customer satisfaction
      ▫ Improve the company's preparedness to respond to customers
    Data Considered: A few months of data on the organisation's standard servicing loans, comprising circa 154,000 records, was used for data exploration, visualization and hypothesis generation. The data had many missing values, which typically arose when the event corresponding to the variable concerned did not occur for that observation. The data was cleaned by creating dummy variables with no missing values. The dataset provided at the GitHub link is the cleaned data.
  • 3. Executive Summary Continued ...
    • Escalations typically lead to extra work, reputational damage and, sometimes, regulatory scrutiny and penalties. Preventing customer complaints and escalations is in the best interest of the company.
    • The data was highly imbalanced (majority class : minority class = 96% : 4%), so an appropriate model evaluation metric had to be chosen. A combination of the F1 score (the harmonic mean of precision and recall) and the Area Under the Curve (AUC) was used to select the best model.
    • Models evaluated:
      ▫ A simple model, Logistic Regression, with different classification thresholds
      ▫ Random Forest, after balancing the dataset using the Synthetic Minority Oversampling Technique (SMOTE)
      ▫ Stochastic Gradient Boosting, after balancing the dataset in the same way as for Random Forest
    • The key insights from the best-performing model indicate that the variables that significantly affect customer behaviour can be broadly classified as follows:
      ▫ Waiver of escrow payments, which could arise from an incorrect escrow analysis that leads the customer to request a waiver of the extra charges levied.
      ▫ Presence of “Initials”, which comes into play when a customer is escrowed for the first time. The customer could be escalating either because they were incorrectly escrowed or because the initial payment calculation for the escrow services is incorrect.
      ▫ Inadequate handling of force-escrowed loans, leading to customer queries and complaints.
    • A gains chart was prepared, which gave a cumulative lift of 133% in the first 4 deciles. The customers with the highest probability of making a complaint were identified. The company could use this information to proactively review the operations performed on each such customer’s account and correct any errors found.
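The choice of F1 and AUC over plain accuracy on a 96% : 4% dataset can be illustrated with a small sketch. The project's own code appears to be in R; the plain-Python functions below (`f1_score`, `auc`) and the toy labels are illustrative assumptions, not the project's implementation.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores):
    """Probability that a random positive is scored above a random negative
    (ties count as half). Quadratic in size, which is fine for a sketch."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data with the same 96:4 imbalance (hypothetical, not the project data).
y_true = [0] * 96 + [1] * 4
always_no = [0] * 100  # a "model" that never predicts a complaint

accuracy = sum(t == p for t, p in zip(y_true, always_no)) / len(y_true)
print(accuracy)                     # high accuracy despite finding no complaints
print(f1_score(y_true, always_no))  # F1 exposes the failure
```

A model that never predicts a complaint scores 96% accuracy here while its F1 is zero, which is why accuracy alone is a poor selection criterion for this kind of data.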
  • 4. Brief Overview of Escrow Account
    Escrow: money held by a third party on behalf of transacting parties.
    Escrow account: an account established with a lender to pay recurring expenses related to one’s property, such as real estate taxes and homeowner’s insurance. It helps the borrower anticipate and manage these expenses by including them as a portion of the monthly mortgage payment.
    How does an escrow account work?
    • When an escrow account is established, the customer’s annual real estate taxes and homeowner’s insurance are estimated from the customer’s most recent bills and premiums.
    • An incremental amount for these expenses is added to the customer’s monthly mortgage payment, so that the expenses are covered when they fall due.
    • Each year, the escrow account is reviewed to determine whether the amount escrowed each month is sufficient to cover any change in the customer’s real estate taxes or homeowner’s insurance premiums.
    • If a non-escrowed customer defaults on payment of taxes and insurance, the lender advances the payments to the respective agencies to protect its rights in the property. The lender then force-escrows the delinquent customer to recover the money.
  • 5. Analytical framework used to prepare the model
    Data Treatment
      • Missing value treatment
      • Outlier treatment
      • Removing inconsistencies
    Exploratory Data Analysis
      • Review of each variable and transformation of the appropriate variables
    Balancing Data Set
      • Event rate is highly skewed in favor of “No Complaints”.
      • SMOTE* used to balance the data.
    Data Partition
      • Data was split in a 70:30 ratio.
      • Models were trained on 70% of the data and validated on 30%.
    Model Building
      • Logistic Regression
      • Random Forest
      • Stochastic Gradient Boosting
    Validation – Evaluation Metrics
      • Accuracy
      • Sensitivity (TPR) & Specificity (TNR)
      • Area Under the Curve (AUC)
    * Synthetic Minority Oversampling Technique
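The balancing and partition steps can be sketched in plain Python as follows. This is a simplified illustration of the SMOTE idea (interpolating between a minority row and one of its nearest minority neighbours), not the library routine used in the project; the function names `split_70_30` and `smote`, the parameter `k`, and the toy rows are all assumptions. The sketch splits first and would oversample only the training part, which is a common practice and not necessarily the order followed in the project.

```python
import random


def split_70_30(rows, seed=0):
    """Shuffle a copy of the rows and cut it into 70% train / 30% validation."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = len(shuffled) * 7 // 10
    return shuffled[:cut], shuffled[cut:]


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority rows. Each one lies on the segment between
    a randomly chosen minority row and one of its k nearest minority neighbours
    (squared Euclidean distance). Needs at least two minority rows."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


# Hypothetical minority-class rows with two numeric features each.
minority_rows = [[0.0, 0.0], [1.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
new_rows = smote(minority_rows, n_new=10)

train, valid = split_70_30(list(range(100)))
print(len(train), len(valid), len(new_rows))
```

Because every synthetic row is a convex combination of two real minority rows, the new points stay inside the region already occupied by the minority class.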
  • 6. Data Visualization & Data Preparation: Both graphs suggest that the variables are really factor variables: most of the data sits at zero, the remaining few points sit at one, and there are no data points in between. The variables were converted to factors.
  • 7. Data Visualization & Data Preparation: Both graphs suggest that the variables are really factor variables: most of the data sits at zero, the remaining few points sit at one, and there are no data points in between. The variables were converted to factors. Additionally, the presence of Waiver appears to be associated with complaints received.
  • 8. Data Visualization & Data Preparation: Both graphs suggest that the variables are really factor variables: most of the data sits at zero, the remaining few points sit at one, and there are no data points in between. The variables were converted to factors. Additionally, the presence of Reversed appears to be associated with complaints received to a greater extent.
  • 9. Data Visualization & Data Preparation: Surplus looks like a variable that should be numeric, but it has a high number of outliers. Most values are small, with a few very high values; close to 17% of observations are outliers. On deciling the variable, eight deciles, which make up 80% of the data, have both minimum and maximum equal to zero. Complaints are received only in the 9th and 10th deciles, where the surplus is greater than 0. In other words, a complaint or query is much more likely when there is a surplus, which suggests that the customer believes the analysis is incorrect or that the company is not returning the surplus on time. The variable was converted to a factor.
    summary(mydata$Surplus)
       Min. 1st Qu.  Median    Mean 3rd Qu.     Max.
        0.0     0.0     0.0    75.7     0.0 452213.2
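The deciling step described above can be sketched in plain Python. The values below are hypothetical (80% zeros, 20% positive) and only mimic the shape of the Surplus variable; `decile_ranges` and `has_surplus` are illustrative names, not taken from the project code.

```python
def decile_ranges(values):
    """Sort the values, cut them into 10 equal-sized groups and return the
    (min, max) of each group. Assumes at least 10 values."""
    ordered = sorted(values)
    n = len(ordered)
    ranges = []
    for d in range(10):
        chunk = ordered[d * n // 10:(d + 1) * n // 10]
        ranges.append((chunk[0], chunk[-1]))
    return ranges


# Hypothetical zero-inflated data: 80 zeros and 20 positive amounts (5, 10, ..., 100).
surplus = [0.0] * 80 + [float(x) for x in range(5, 105, 5)]

ranges = decile_ranges(surplus)
for i, (lo, hi) in enumerate(ranges, start=1):
    print(i, lo, hi)

# Converting to a two-level factor: 1 if there is any surplus, else 0.
has_surplus = [1 if v > 0 else 0 for v in surplus]
```

When the first eight deciles all have a minimum and maximum of zero, the numeric scale carries little information beyond "surplus or no surplus", which is the motivation for the binary flag.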
  • 10. Data Visualization & Data Preparation: Shortage Spread looks like a continuous variable but has outliers. The outliers were treated by capping the extreme values to the range between 0 and the 85th percentile of the actual values. Whenever there is a shortage, the chance of a complaint is higher.
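One way to read the outlier treatment above is as clamping each value into the range from 0 to the 85th percentile. The sketch below assumes that reading; the linear-interpolation percentile, the function names and the sample values are assumptions for illustration only.

```python
def percentile(values, q):
    """Percentile with linear interpolation between order statistics, q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def cap(values, upper_q=85):
    """Clamp every value into [0, upper_q-th percentile]."""
    upper = percentile(values, upper_q)
    return [min(max(v, 0.0), upper) for v in values]


# Hypothetical shortage spread values: 0, 1, ..., 100.
values = [float(v) for v in range(101)]
capped = cap(values)
print(max(values), max(capped))
```

Values at or below the 85th percentile are left unchanged, so only the long right tail is compressed.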
  • 11. Data Visualization & Data Preparation: The numbers are heavily skewed in favor of NonBK analysis, so a separate analysis restricted to NonBK analysis loans could be considered.
  • 12. Random Forest models gave the best results. Tuning the mtry parameter (the number of variables randomly sampled as candidates at each split) improved the results marginally. The gradient boosting results were nearly as good as those from Random Forest, only marginally lower. The important point to note is that the tree-based models were trained on the data after it was balanced using SMOTE (Synthetic Minority Oversampling Technique). The logistic regression results were the least impressive.
  • 13.
  • 14. Lift Chart: A cumulative lift of 133% is achieved in the first four deciles. This means that by targeting the 40% of customers that the model and the associated gains chart rank as most likely to complain, we can identify more than 50% of the customers who are likely to complain (1.33 × 40% ≈ 53%). Selecting 40% of customers without the model would be expected to identify only about 40% of the potential complainers.
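The relationship between the gains chart and cumulative lift can be sketched in plain Python. A lift of 1.33 at 40% of customers corresponds to capturing about 0.40 × 1.33 ≈ 53% of complainers. The function name `cumulative_gains` and the toy scores and labels below are assumptions, not the project's data or code.

```python
def cumulative_gains(y_true, scores, n_bins=10):
    """Rank customers by score (highest first) and, for each cumulative bin, return
    (share of customers selected, share of complainers captured, cumulative lift)."""
    ranked = [t for _, t in sorted(zip(scores, y_true), key=lambda pair: -pair[0])]
    total_pos = sum(ranked)
    n = len(ranked)
    table = []
    for b in range(1, n_bins + 1):
        cutoff = b * n // n_bins
        share = cutoff / n
        captured = sum(ranked[:cutoff]) / total_pos
        table.append((share, captured, captured / share))
    return table


# Toy example: 10 customers, 3 complainers, scores already in descending order.
scores = [i / 10 for i in range(10, 0, -1)]
y_true = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]

table = cumulative_gains(y_true, scores)
for share, captured, lift in table:
    print(f"{share:.0%} of customers -> {captured:.0%} of complainers, lift {lift:.2f}")
```

In this toy data the top 40% of customers contains two of the three complainers, so the cumulative lift at the fourth decile is (2/3) / 0.4 ≈ 1.67; by construction the lift always returns to 1.0 once every customer is selected.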
  • 15. Variable importance from the Random Forest model, which gave the best results.