ADVANCED BUSINESS
ANALYTICS
A study on identifying the factors
influencing Fraudulent Insurance
Claims
BALA GOWTHAM CHANDRASEKARAN- A0148536X
Vision Statement
• To perform an exploratory data analysis on the
ABIBA automobile insurance transactional data.
• To employ dimensionality reduction methods such as
Principal Component Analysis (PCA) to reduce
the given input variables to a minimal set of factors.
• To determine the factors, their usage, and their
reliability, to support the data analytics process for
the given data sets.
Team 8 - Assignment 1
Factor Analysis
• The data set contains:
◦ 33 input variables and 15,420 sample records
◦ Initial scree plot: 12 components have eigenvalues greater
than 1 (λ > 1), so the factor analysis starts with 12 factors.
Factor Analysis
• The factor analysis was run separately for fraudsters
(cases with FraudFound = 1) and for non-fraudsters
(cases with FraudFound = 0).
• The sample data was checked for multicollinearity
using the correlation table.
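
The split-and-check step described above can be sketched in plain Python. This is a minimal illustration, not the SPSS procedure used in the study: the records, the numeric coding of the columns, and the 0.8 flagging threshold are all assumptions made for the example.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_by_flag(records, flag="FraudFound"):
    """Group records into {flag value: [records]}."""
    groups = {}
    for row in records:
        groups.setdefault(row[flag], []).append(row)
    return groups

def high_correlations(rows, columns, threshold=0.8):
    """Return (col_a, col_b, r) for every column pair with |r| >= threshold."""
    flagged = []
    for a, b in combinations(columns, 2):
        r = pearson([row[a] for row in rows], [row[b] for row in rows])
        if abs(r) >= threshold:
            flagged.append((a, b, round(r, 3)))
    return flagged

# Hypothetical, numerically coded records (not taken from the ABIBA data).
records = [
    {"FraudFound": 1, "Month": 1, "MonthClaimed": 1, "AgeOfVehicle": 7},
    {"FraudFound": 1, "Month": 4, "MonthClaimed": 5, "AgeOfVehicle": 2},
    {"FraudFound": 1, "Month": 9, "MonthClaimed": 9, "AgeOfVehicle": 5},
    {"FraudFound": 0, "Month": 2, "MonthClaimed": 2, "AgeOfVehicle": 3},
    {"FraudFound": 0, "Month": 6, "MonthClaimed": 7, "AgeOfVehicle": 6},
    {"FraudFound": 0, "Month": 11, "MonthClaimed": 12, "AgeOfVehicle": 1},
]

groups = split_by_flag(records)
columns = ["Month", "MonthClaimed", "AgeOfVehicle"]
for flag_value, rows in sorted(groups.items()):
    # Pairs listed here are candidates for loading on the same factor.
    print(flag_value, high_correlations(rows, columns))
```

Strongly correlated pairs flagged this way are the ones a factor analysis would be expected to group into a common factor.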
Significant Variables (Fraudsters vs. Non-Fraudsters)
• The order of variables based on communality:
1. Policy Type
2. Vehicle Category
3. Month
4. Month Claimed
All six variables in the table below have high communalities (i.e. > 0.5).

Variable             Communality (Extraction)
PolicyType           .930
VehicleCategory      .930
Month                .919
MonthClaimed         .919
AgeOfVehicle         .871
AgeOfPolicyHolder    .871
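
For reference, the communality of a variable is the sum of its squared loadings over the retained components. The sketch below applies that definition. The deck does not show the full loading matrix, so the two small cross-loadings are made-up placeholders; only the 0.964 value comes from the rotated component matrix reported later in the deck.

```python
def communality(loadings):
    """Sum of squared loadings of one variable across retained components."""
    return sum(value ** 2 for value in loadings)

# 0.964 is the displayed loading for VehicleCategory; the other two
# values are hypothetical stand-ins for loadings not shown in the deck.
vehicle_category = [0.964, 0.020, -0.015]

h2 = communality(vehicle_category)
print(round(h2, 3))
```

A communality close to 1 means the retained components reproduce almost all of that variable's variance.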
How Does ABIBA Benefit?
• The variables above help ABIBA find fraudsters by
reducing the 33 input variables to 3 significant factors.
• These factors give ABIBA a higher probability of
distinguishing fraudsters from non-fraudsters.
• ABIBA can closely monitor these six input
variables to prevent fraudulent activity in
the company.
Model Output Indicating Factors

Rotated Component Matrix (a, b)

                     Component
Variable             1       2       3
VehicleCategory      .964
PolicyType           .964
Month                        .959
MonthClaimed                 .958
AgeOfPolicyHolder                    .933
AgeOfVehicle                         .933

Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 4 iterations.
b. Only cases for which FraudFound = 0 are used in the analysis phase.

• The Rotated Component Matrix shows which of the most significant
variables load on each resulting component.
• Varimax rotation yields three components, i.e. three factors.
Model Output Indicating Factors
Both scree plots, for fraudulent (FraudFound = 1) and non-fraudulent
(FraudFound = 0) cases, show 3 factors with eigenvalues greater than 1.
Model Output Indicating Factors
The cumulative variance explained is greater than 90% for both the
fraudulent and non-fraudulent factors.
The absolute-value cutoff for loadings was set to 0.5.

Total Variance Explained (a)

            Initial Eigenvalues                    Extraction Sums of Squared Loadings    Rotation Sums of Squared Loadings
Component   Total   % of Variance   Cumulative %   Total   % of Variance   Cumulative %   Total   % of Variance   Cumulative %
1           1.946   32.435          32.435         1.946   32.435          32.435         1.927   32.118          32.118
2           1.805   30.086          62.521         1.805   30.086          62.521         1.770   29.500          61.618
3           1.712   28.536          91.057         1.712   28.536          91.057         1.766   29.439          91.057
4           .233    3.884           94.941
5           .230    3.837           98.778
6           .073    1.222           100.000

Extraction Method: Principal Component Analysis.
a. Only cases for which FraudFound = 1 are used in the analysis phase.

Total Variance Explained (a)

            Initial Eigenvalues                    Extraction Sums of Squared Loadings    Rotation Sums of Squared Loadings
Component   Total   % of Variance   Cumulative %   Total   % of Variance   Cumulative %   Total   % of Variance   Cumulative %
1           1.867   31.110          31.110         1.867   31.110          31.110         1.860   31.006          31.006
2           1.859   30.985          62.095         1.859   30.985          62.095         1.837   30.625          61.631
3           1.714   28.571          90.666         1.714   28.571          90.666         1.742   29.035          90.666
4           .259    4.311           94.977
5           .162    2.706           97.684
6           .139    2.316           100.000

Extraction Method: Principal Component Analysis.
a. Only cases for which FraudFound = 0 are used in the analysis phase.
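
The percentage and cumulative columns, and the "eigenvalue greater than 1" rule used to keep three factors, can be reproduced from the initial eigenvalues alone. The sketch below is a minimal illustration using the eigenvalues from the two tables; the function names are chosen for the example and are not part of the SPSS output. Because the reported eigenvalues are rounded to three decimals, the recomputed percentages can differ from the tables in the second decimal place.

```python
def variance_explained(eigenvalues):
    """Return (percent, cumulative percent) lists, one entry per component.

    For PCA on a correlation matrix the eigenvalues sum to the number of
    variables, so each share is eigenvalue / sum(eigenvalues).
    """
    total = sum(eigenvalues)
    percents = [100 * ev / total for ev in eigenvalues]
    cumulative, running = [], 0.0
    for share in percents:
        running += share
        cumulative.append(running)
    return percents, cumulative

def kaiser_count(eigenvalues):
    """Number of components with eigenvalue greater than 1."""
    return sum(1 for ev in eigenvalues if ev > 1)

# Initial eigenvalues as reported in the tables.
fraud = [1.946, 1.805, 1.712, 0.233, 0.230, 0.073]       # FraudFound = 1
non_fraud = [1.867, 1.859, 1.714, 0.259, 0.162, 0.139]   # FraudFound = 0

for label, eigenvalues in (("FraudFound = 1", fraud), ("FraudFound = 0", non_fraud)):
    _, cumulative = variance_explained(eigenvalues)
    retained = kaiser_count(eigenvalues)
    print(label, retained, round(cumulative[retained - 1], 2))
```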
Factors Based on Order of Importance
1. Vehicle Category and Policy Type
2. Month and Month Claimed
3. Age of Policy Holder and Age of Vehicle
Factors Contributing to Percentage of Variance
• From the Total Variance Explained tables, the cumulative
variance is greater than 90% (91.057 for fraudulent and
90.666 for non-fraudulent cases).
• KMO measure of sampling adequacy: .600, which meets the 0.600 threshold
• Bartlett's test significance: .000 (< 0.05)

KMO and Bartlett's Test (a)
Kaiser-Meyer-Olkin Measure of Sampling Adequacy           .600
Bartlett's Test of Sphericity     Approx. Chi-Square      3441.530
                                  df                      15
                                  Sig.                    .000
a. Only cases for which FraudFound = 1 are used in the analysis phase.
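
As a quick consistency check on the table above, the degrees of freedom of Bartlett's test of sphericity depend only on the number of variables p, as p(p - 1)/2. The sketch below also includes the usual chi-square approximation for the statistic. The sample size and correlation-matrix determinant passed to it are hypothetical, since the deck reports neither.

```python
from math import log

def bartlett_df(p):
    """Degrees of freedom for Bartlett's test with p variables."""
    return p * (p - 1) // 2

def bartlett_chi_square(n, p, det_r):
    """Approximate chi-square statistic.

    n: number of cases, p: number of variables,
    det_r: determinant of the p x p correlation matrix.
    """
    return -((n - 1) - (2 * p + 5) / 6) * log(det_r)

# Six variables were analysed, so df should match the reported value.
print(bartlett_df(6))  # 15

# Hypothetical inputs, for illustration only.
print(round(bartlett_chi_square(n=500, p=6, det_r=0.5), 2))
```

A determinant near 1 (variables nearly uncorrelated) drives the statistic toward 0, while strongly correlated variables give a small determinant and a large statistic.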
Reliability of Factors

Reliability Statistics
Factor (variables)                         Cronbach's Alpha    N of Items
Vehicle Category and Policy Type           .786                2
Age of Vehicle and Age of Policy Holder    .853                2
Month and Month Claimed                    .909                2

Cronbach's alpha is greater than 0.7 for all three factors,
so all the factors are considered highly reliable.
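
Cronbach's alpha for a set of k items compares the sum of the item variances with the variance of the summed score. The sketch below is a minimal plain-Python version of that formula; the two item vectors are invented for illustration and are not ABIBA data.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha.

    items: list of k lists, one per item, each holding one score per case.
    """
    k = len(items)
    item_variance_sum = sum(variance(scores) for scores in items)
    totals = [sum(case) for case in zip(*items)]  # summed score per case
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical scores for a two-item factor.
item_a = [1, 2, 3, 4]
item_b = [2, 1, 4, 3]

alpha = cronbach_alpha([item_a, item_b])
print(round(alpha, 2))
```

With only two items per factor, as here, alpha is driven entirely by how strongly the two items correlate.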
My Factors = ABIBA’s
Success
• By employing factor analysis, we reduced the
number of variables that influence fraud to 6, from
the original 33.
• This allows ABIBA to narrow down to the exact variables
to manage, which lowers the cost of spotting
fraud.
• These 6 variables can be used to construct a logistic
regression model, or any other model, instead of the given 33
input variables.
• The value added for the business lies in how critical these 6 variables
are for predicting the probability that a claim is fraudulent.
My Factors = ABIBA’s
Success
My Factors = ABIBA’s
Success
• From the composite factor scores, ABIBA
can find the component that contributes most to
suspicion of fraud.
• For instance, in the previous slide, the factor 1
score is high for customer 1 (around 2.255), so
for that customer the suspicion is attributed to
that component.
• Similarly, the score of 1.35 for customer 4 is attributed to
factor 2, and so on.
• A negative value indicates that the factor
contributes negatively to identifying the fraudster.
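
Reading off the dominant factor per customer, as described above, amounts to taking the factor with the highest score. In the sketch below only the 2.255 (customer 1, factor 1) and 1.35 (customer 4, factor 2) scores come from the slides; every other score is a hypothetical filler value, and the function name is chosen for the example.

```python
def dominant_factor(scores):
    """Return (factor name, score) for the highest-scoring factor."""
    return max(scores.items(), key=lambda pair: pair[1])

# 2.255 and 1.35 are quoted in the slides; the remaining scores are
# hypothetical placeholders.
factor_scores = {
    "customer 1": {"factor 1": 2.255, "factor 2": 0.40, "factor 3": -0.75},
    "customer 4": {"factor 1": -0.30, "factor 2": 1.35, "factor 3": 0.10},
}

for customer, scores in factor_scores.items():
    name, value = dominant_factor(scores)
    print(customer, name, value)
```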