This document discusses applications of unsupervised learning techniques, such as clustering, in property and casualty insurance fraud analysis. It reviews classic clustering methods and introduces two new techniques, Random Forest and PRIDIT. The document applies these techniques to an automobile insurance fraud dataset to group fraudulent and abusive claims. It aims to develop models that can identify fraud patterns without needing pre-classified examples.
This notice provides information about the driving history and discounts applied to the policyholder's auto insurance policy. It lists three events in the policyholder's driving history: two at-fault accidents in 2009 and 2011 and a non-chargeable license suspension in 2010. The notice also lists available discounts for good drivers with no more than one violation point or at-fault accident in the past three years. Finally, it provides contact information for the consumer reporting agency that provided the driving history data and instructions for obtaining a vehicle report or disputing inaccuracies.
Machine Learning and Real-World Applications (MachinePulse)
This presentation was created by Ajay, Machine Learning Scientist at MachinePulse, to present at a Meetup on Jan. 30, 2015. These slides provide an overview of widely used machine learning algorithms. The slides conclude with examples of real world applications.
Ajay Ramaseshan is a Machine Learning Scientist at MachinePulse. He holds a Bachelor's degree in Computer Science from NITK Surathkal and a Master's in Machine Learning and Data Mining from Aalto University School of Science, Finland. He has extensive experience in the machine learning domain and has worked on a variety of real-world problems.
Application of Clustering in Data Science using Real-life Examples (Edureka!)
This document outlines an Edureka webinar on applications of clustering in real life. The webinar instructor is Kumaran Ponnambalam. The objectives are to understand data science applications and prospects, machine learning categories, clustering and k-means clustering. Examples of clustering applications include wine recommendation, pizza delivery optimization, and news summarization. K-means clustering is demonstrated on pizza delivery location data. The webinar also discusses data science job trends and covers 10 modules on data science topics including machine learning techniques in R.
Cluster analysis for market segmentation (Vishal Tandel)
Cluster analysis is a technique used to segment markets by grouping consumers into clusters based on their characteristics. It aims to maximize similarity within clusters and dissimilarity between clusters. Marketers can use cluster analysis to discover distinct groups of customers and develop targeted marketing programs for each group. Common variables used to segment markets include demographics, psychographics, geographics, product benefits, and behavior.
Look through the slides from the July 17th FierceWireless webinar with guests from Cisco, SDNCentral, and Openwave Mobility as they examine the place of SDN in facilitating next-generation application services.
You will learn:
1. How to provide subscriber-awareness in an SDN network via the SDN Controller
2. Why a hierarchical (L2-4 and L7) SDN approach is necessary
3. The critical business and ROI drivers for service providers considering Gi-LAN services
For more SP Mobility related content, visit our Cisco SP Mobility Community: http://cisco.com/go/mobilitycommunity
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ... (Brocade)
Presentation by Brocade Chief Scientist and Fellow, David Meyer, given at Orange Gardens July 2016. What is Machine Learning and what is all the excitement about?
An associated blog is available here: http://community.brocade.com/t5/CTO-Corner/Networking-Meets-Artificial-Intelligence-A-Glimpse-into-the-Very/ba-p/88196
Summit 16: Applying Machine Learning to Intent-based Networking and NFV Scali... (OPNFV)
The talk will highlight how Machine Learning techniques can be used to address different aspects of the operation and control of NFV and propose future OPNFV activities in this area. First, Diego will introduce how Machine Learning is being applied by the CogNet project to address intent-based networking, and discuss the architecture defined there as a potential framework for future ML integration. Glen will demonstrate a policy-based system for automating VNF scaling using performance data collection and analytics with machine learning (ML), based on OPNFV Brahmaputra and the underlying OpenStack telemetry system (Ceilometer), as well as the open-source Apache Kafka, Apache Zookeeper and Apache Spark streaming and MLlib libraries. Available as open-source, it combines predictive and reactive inputs to make the VNF scaling decision and trigger action in the MANO stack. The presentation will provide an overview of the system, demonstrate the VNF auto-scaling use case and discuss how this system will fit into a future OPNFV release.
PCA is an unsupervised learning technique used to reduce the dimensionality of large data sets by transforming the data to a new set of variables called principal components. The first principal component accounts for as much of the variability in the data as possible, and each succeeding component accounts for as much of the remaining variability as possible. PCA is commonly used for applications like dimensionality reduction, data compression, and visualization. The document discusses PCA algorithms and applications of PCA in domains like face recognition, image compression, and noise filtering.
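The variance-ordering property described in that summary can be illustrated with a small NumPy sketch (the `pca` helper below is illustrative, not taken from the presentation itself): after centering the data, the right singular vectors of the data matrix are the principal axes, and the squared singular values give each component's share of the variance, in decreasing order.

```python
import numpy as np

def pca(X, n_components):
    """Project X onto its top principal components via SVD.

    Centers the data, then uses the right singular vectors as the
    component directions; squared singular values (scaled by n - 1)
    give each component's explained variance.
    """
    X_centered = X - X.mean(axis=0)
    # SVD of the centered data: rows of Vt are the principal axes,
    # already sorted by decreasing singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained_variance = (S ** 2) / (len(X) - 1)
    components = Vt[:n_components]
    return X_centered @ components.T, explained_variance

rng = np.random.default_rng(0)
# 200 points stretched strongly along one axis: the first component
# should account for most of the variability.
X = rng.normal(size=(200, 2)) @ np.array([[3.0, 0.0], [0.0, 0.5]])
scores, var = pca(X, n_components=1)
```

Here `var[0] > var[1]` holds by construction of the SVD, matching the summary's statement that each succeeding component accounts for as much of the remaining variability as possible.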
Application of machine learning in industrial applications (Anish Das)
The group will present an introduction to machine learning and its basics, along with applications of machine learning in industry such as product categorization, improving the accuracy of inertial measurement units using supervised machine learning, data mining techniques, and machine learning for medical diagnosis. They will also discuss the future scope of machine learning.
Machine Learning with Applications in Categorization, Popularity and Sequence... (Nicolas Nicolov)
This document provides an overview of machine learning techniques including categorization, popularity, and sequence labeling applications. It outlines the goals of introducing important machine learning concepts and illustrating techniques through examples. The tutorial aims to be self-contained and explain notation. The outline includes examples of machine learning applications, encoding objects with features, the machine learning framework, linear models, tree models, boosting, ranking evaluation, and sequence labeling with hidden Markov models.
Real-Time Fraud Detection in Payment Transactions (Christian Gügi)
This document discusses building a real-time fraud detection system using big data technologies. It outlines the cyber threat landscape, what anomalies and fraud detection are, and proposes an architecture with a data layer to integrate various sources and an analytics layer using stream processing, rules engines, and machine learning to score transactions in real-time and detect fraud. The system aims to scalably and reliably detect threats for increased security.
Three case studies deploying cluster analysis (Greg Makowski)
Three case studies that include cluster analysis as a component are discussed.
1) Customer description for a credit card attrition model, to describe how to talk to customers.
2) Hotel price optimization. Use clusters to find subsets of similar behavior, and optimize prices within each cluster. Use a neural net as the objective function.
3) Retail supply chain: planning replenishment with 52-week demand curves, using thousands of seasonal "profiles" or clusters.
A 45-minute talk given at the LondonR Meetup in March 2014.
The presentation describes how one might go about an insights-driven data science project using the R language and packages, using an open source dataset.
The document provides an overview of cluster analysis techniques. It discusses the need for segmentation to group large populations into meaningful subsets. Common clustering algorithms like k-means are introduced, which assign data points to clusters based on similarity. The document also covers calculating distances between observations, defining the distance between clusters, and interpreting the results of clustering analysis. Real-world applications of segmentation and clustering are mentioned such as market research, credit risk analysis, and operations management.
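The assign-then-update loop that k-means performs can be sketched in a few lines of NumPy (the `kmeans` helper and its farthest-first initialization are illustrative choices, not taken from the document above): each point is assigned to its nearest centroid by Euclidean distance, each centroid then moves to the mean of its assigned points, and the two steps repeat until the centroids stop moving.

```python
import numpy as np

def kmeans(X, k, n_iter=100):
    """Minimal k-means: alternate nearest-centroid assignment and mean update."""
    # Farthest-first initialization: start from the first point, then
    # repeatedly add the point farthest from the centroids chosen so far.
    centroids = [X[0]]
    for _ in range(k - 1):
        d = np.min(np.linalg.norm(X[:, None, :] - np.array(centroids)[None, :, :],
                                  axis=2), axis=1)
        centroids.append(X[d.argmax()])
    centroids = np.array(centroids)
    for _ in range(n_iter):
        # Assignment step: label each point with its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its points.
        new = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

# Two well-separated blobs: the algorithm should recover them exactly.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)),
               rng.normal(5.0, 0.5, (50, 2))])
labels, centroids = kmeans(X, k=2)
```

On cleanly separated data like this, the loop converges in a handful of iterations; with overlapping clusters, the result depends on initialization, which is why production implementations typically restart from several random seeds.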
This presentation provides a brief insight into the need to undertake an analytics project, particularly as it pertains to claims management and fraud. To this end the presentation will touch on the general challenges confronting the property and casualty insurance industry, as well as the challenges and lessons learnt from early adopters of business intelligence. In the face of these challenges analytics holds the potential to generate substantial value as evidenced by several short case study examples. The presentation concludes with a look at the issue of fraud as it pertains to the industry and some of the metrics that are influenced by it.
The presentation draws extensively on, and focuses on, the work and viewpoints of industry participants including: Accenture, IBM, Ernst & Young, Strategy Meets Action, Ordnance Survey, Gartner, Insurance Institute of America, American Institute for Chartered Property Casualty Underwriters, International Risk Management Institute and John Standish Consulting. References are included on each slide as well as on the "References" slides at the end of the presentation.
Neira Jones PCI London January 2013 (Neira Jones)
Data breaches increased 36% in 2012 compared to 2011. Personal information breaches now yield larger amounts of stolen data than payment card breaches. Verizon predicts that social engineering, web application exploits, and authentication failures will be the most common causes of data breaches in 2013. Organizations are also warned that mobile devices and payments will increase risk and that third parties and lost/stolen devices often contribute to breaches. Effective incident response is important for mitigating costs, which a CISO and outside consultants can help reduce. The role of the CIO is expanding to include more legal, financial, security, and vendor management responsibilities.
Using Advanced Analytics to Combat P&C Claims Fraud (Cognizant)
P&C insurers need to embrace predictive and advanced analytics, as well as analytics as a service, to combat the growing complexity and sophistication of claims fraud.
The 2016 Verizon Data Breach Investigations Report analyzed over 100,000 security incidents, including 3,141 confirmed data breaches. Key findings include:
- 89% of breaches were financially or espionage motivated. Phishing and point-of-sale intrusions were the most common initial vectors for breaches.
- External actors were responsible for the majority of breaches, with hacking and malware being the most common threat actions.
- Common asset targets included people falling for phishing scams and user devices like desktops and POS terminals getting infected with malware.
The 2016 Verizon Data Breach Investigations Report analyzed over 100,000 security incidents, including 3,141 confirmed data breaches affecting organizations in 82 countries. Some key points:
- 89% of breaches were financially or espionage motivated.
- The public sector accounted for most incidents, though industries like accommodation and retail saw a larger share of actual breaches due to handling of desirable consumer data.
- The U.S. remained the most affected country, likely due to mandatory reporting requirements, but incidents impacted organizations globally.
- The nine main incident classification patterns from 2014 (including web application attacks, POS intrusions, insider threats, etc.) continued to dominate the threat landscape.
Our ninth Data Breach Investigations Report (DBIR) pulls together incident data from 67 contributors around the world to reveal the biggest IT security risks you’ll face.
How safe is your web application?
How safe is your Network?
How safe is your e-commerce site, which holds customer card and banking details?
When did you last check your internal and external assets for vulnerabilities?
- Insurance companies are increasingly using data and analytics to improve fraud detection. By analyzing large amounts of data from claims, underwriting, and other sources, insurers can identify patterns and flags of potentially fraudulent activity.
- However, adopting new analytic technologies can be costly, and regulations make sharing some data between insurers and departments difficult. Insurers must weigh these challenges against the losses caused by fraud.
- As analytic capabilities advance, fraud detection is moving from a siloed function to one integrated across the insurance lifecycle, from underwriting to claims. This holistic approach allows insurers to gain a more complete view of fraud risks.
This document discusses issues with the accuracy of criminal background checks conducted by commercial background screening companies. It finds that these companies routinely make mistakes that harm job applicants, including misidentifying applicants, reporting sealed or expunged records, omitting key details, providing misleading information, and misclassifying offenses. These errors are attributed to practices like purchasing bulk records without updating them, failing to verify information from subcontractors, using simple matching criteria, and lacking understanding of state criminal justice systems. The document recommends that regulators implement mandatory accuracy measures, define reasonable time for applicants to dispute reports before adverse actions, require agency registration, and investigate major companies and employers for Fair Credit Reporting Act violations. It also calls for states to improve how they provide
Analytics, Big Data and The Cloud II Conference - Kiribatu Labs (Pawel Brzeminski)
Kiribatu is a predictive analytics company that serves the Canadian financial sector, predominantly property and casualty insurance. They help insurers assess risk by analyzing large datasets to predict human behavior. Currently, most insurers still use outdated risk assessment methods from the 1960s-1970s. Kiribatu's predictive models generate an underwriting score to assess risk and profitability, helping insurers optimize their risk sharing pools by predicting which risks require pooling and which can be independently underwritten. The presentation outlines their methodology including data preparation, rating factor analysis, model development, and gain assessment.
On April 25th, the Equal Employment Opportunity Commission issued new “enforcement guidance on the consideration of arrest and conviction records in employment decisions under Title VII of the Civil Rights Act of 1964.”
This is the first guidance on this topic issued by the EEOC in more than 20 years. It reflects the EEOC’s recent litigation trend of trying to limit employers’ use of criminal records in making employment decisions.
An overview of historical trends related to suspect counterfeit and nonconfor... (Kristal Snider)
The document analyzes trends in counterfeit electronic component reporting from 2005 to 2013 using data from the ERAI and GIDEP databases. It finds that counterfeit incident reporting increased significantly after the 2007 BIS study when reporting was mandated compared to voluntary reporting previously. While counterfeit incidents appear correlated to market fluctuations, data sharing needs more widespread participation to accurately measure trends. Most counterfeits are still identified through established screening processes, but more sophisticated fakes may be slipping through.
When a structural failure occurs, the cause of the failure is not immediately apparent. Engineer uses an investigative process to determine the root cause.
Intelligent Transportation Trends chpt.5 - Tolling and EnforcementNovavia Solutions
The term Intelligent Transportation Systems (ITS) was coined over two decades ago to designate applications of information and communication technologies to the operational management of transportation networks. The main promise of ITS has been very consistent over that period: network capacity can be freed up by optimizing traffic controls and empowering users with accurate travel information.
It can be debated how much faith practitioners and policy makers have placed in technology by investing their resources, as well as the extent to which Intelligent Transportation Systems have delivered on their promise. However, there is no question that steady and sometimes spectacular advances in computing technologies and usage trickle down to transportation applications in important ways. As a result, new products and services emerge continuously. They include systems that address the direct needs of networks managers, as well as others that are developed in tangential markets (e.g. automotive) or even through non-market mechanisms (e.g. many mobile web applications).
This talk presentation reviews major trends in information and communication technologies and demonstrate how each of them is driving innovative transportation services. We attempt to envision how those trends might develop in the future, so that we can finally examine some of their implications for travel demand and network management. There lie both challenges and opportunities for transportation engineers and planners, but either way, profound changes appear inevitable.
ORX Analytics & Scenario Forum 2019 - summaryLuke Carrivick
Discover more about the ORX Analytics and Scenarios forum.
On 3-4 July, more than 50 operational risk and scenario experts from banking and insurance met in London for two days of discussion and networking. The ORX Analytics and Scenario Forum takes place each year, and gives participants the chance to talk about the biggest issues facing the industry today.
A TIERED BLOCKCHAIN FRAMEWORK FOR VEHICULAR FORENSICSIJNSA Journal
This document proposes a tiered blockchain framework for vehicular forensics. It identifies key entities involved in the forensics process such as vehicles, manufacturers, technicians, authorities. It describes how these entities interact and how their interactions would be recorded on a permissioned blockchain to generate comprehensive evidence. It also introduces a watchdog entity to prevent collusion and proposes a vehicle state mechanism to verify vehicle sensor data after an accident. Finally, it conducts a security analysis and compares the framework to other proposals.
A TIERED BLOCKCHAIN FRAMEWORK FOR VEHICULAR FORENSICSIJNSA Journal
In this paper, we present a tiered vehicular forensics framework based on permission BlockChain. We integrate all entities involved in the forensics process and record their interactions in the BlockChain to generate comprehensive evidence for settling disputes and appropriating blame. We incorporate a watchdog entity in our tiered framework to prevent collusive tendencies of potentiality liable entities and to prevent exploitation of evidence. Also, we incorporate a state mechanism to prove the state of a smart vehicle when an accident occurs. Furthermore, we conduct a security analysis to demonstrate the resilience of our framework against identified attacks and describe security mechanisms used to achieve key requirements for vehicular forensics. Finally, we comparatively evaluate our framework against existing proposals.
Fraud Analysis and Other Applications of Unsupervised Learning in Property and Casualty Insurance
1. Applications of Unsupervised Learning in Property and Casualty Insurance
with emphasis on fraud analysis
Louise Francis, FCAS, MAAA
Francis Analytics and Actuarial Data Mining, Inc.
www.data-mines.com
Louise.francis@data-mines.com
2. Objectives
Review classic unsupervised learning techniques
Introduce two new unsupervised learning techniques: Random Forest and PRIDIT
Apply the techniques to insurance data:
the automobile fraud data set
a publicly available automobile insurance database
3. Motivation for Topic
New book: Predictive Modeling in Actuarial Science
An introduction to predictive modeling for actuaries and other insurance professionals
Publisher: Cambridge University Press
Hoped publication date: Fall 2012
Chapter on Unsupervised Learning, by Li Yang and Louise Francis
Li Yang: variable grouping (PCA)
Louise Francis: record grouping (clustering)
4. Book Project
Predictive Modeling: a two-volume book project
A joint project leading to a two-volume pair of books on Predictive Modeling in Actuarial Science. Volume 1 is on theory and methods; Volume 2 is on property and casualty applications.
The first volume will be introductory, with basic concepts and a wide range of techniques designed to acquaint actuaries with this sector of problem-solving techniques. The second volume will be a collection of applications to P&C problems, written by authors who are well aware of the advantages and disadvantages of the first volume's techniques but who can explore relevant applications in detail with positive results.
5. The Fraud Study Data
• 1993 AIB closed PIP claims
• Dependent variables
• Suspicion score: expert assessment of likelihood of fraud or abuse
• Predictor variables
• Red flag indicators
• Claim file variables
Francis Analytics and Actuarial Data Mining, Inc., 6/26/2012
6. The Fraud Problem
from: www.agentinsure.com
7. The Fraud Problem (2)
from Coalition Against Insurance Fraud
8. Fraud and Abuse
Planned fraud: staged accidents
Abuse: opportunistic, e.g. exaggerating a claim
9. The Fraud Red Flags
Binary variables that capture characteristics of claims associated with fraud and abuse:
Accident variables (acc01 – acc19)
Injury variables (inj01 – inj12)
Claimant variables (ch01 – ch11)
Insured variables (ins01 – ins06)
Treatment variables (trt01 – trt09)
Lost wages variables (lw01 – lw07)
10. The Red Flag Variables

Subject     Indicator  Description
Accident    ACC01      No report by police officer at scene
            ACC04      Single vehicle accident
            ACC09      No plausible explanation for accident
            ACC10      Claimant in old, low-valued vehicle
            ACC11      Rental vehicle involved in accident
            ACC14      Property damage was inconsistent with accident
            ACC15      Very minor impact collision
            ACC16      Claimant vehicle stopped short
            ACC19      Insured felt set up, denied fault
Claimant    CLT02      Had a history of previous claims
            CLT04      Was an out-of-state accident
            CLT07      Was one of three or more claimants in vehicle
Injury      INJ01      Injury consisted of strain or sprain only
            INJ02      No objective evidence of injury
            INJ03      Police report showed no injury or pain
            INJ05      No emergency treatment was given
            INJ06      Non-emergency treatment was delayed
            INJ11      Unusual injury for auto accident
Insured     INS01      Had history of previous claims
            INS03      Readily accepted fault for accident
            INS06      Was difficult to contact/uncooperative
            INS07      Accident occurred soon after effective date
Lost Wages  LW01       Claimant worked for self or a family member
            LW03       Claimant recently started employment
11. Dependent Variable Problem
Insurance companies frequently do not collect information as to whether a claim is suspected of fraud or abuse, even when claims are referred for special investigation.
Solution: unsupervised learning
12. Supervised Learning
13. Dimension Reduction
ZipCode  FrequencyBI  FrequencyPD  FrequencyComb  PolicyCountNonBusinessUse  VehicleCountNonBusinessUse  SeverityBI  SeverityPD
90095    -            54.50        0.03           2.00                       3.00                        -           1,973.50
93741    -            -            -              1.00                       1.00                        -           -
90015    22.65        43.93        0.04           1.00                       2.00                        10,181.16   2,442.36
90067    15.53        44.41        0.04           3.00                       6.00                        13,146.57   2,565.56
90004    26.71        48.45        0.04           11.00                      17.00                       8,538.56    2,354.08
14. The CAARP Data
This assigned risk automobile data was made available to researchers in 2005 for the purpose of studying the effect of a change in regulation on territories.
Variables contain exposure information (car counts, premium) and claim and loss information (Bodily Injury (BI) claim counts, BI ultimate losses, Property Damage (PD) claim counts, PD ultimate losses).
Each record is a zip code.
Good example of using unsupervised learning for territory construction.
15. R Cluster Library
The “cluster” library from R is used.
Many of the functions in the library are described in Kaufman and Rousseeuw’s (1990) classic book on clustering, Finding Groups in Data.
16. Grouping Records
17. Dissimilarity
Euclidean distance: for each pair of records, the square root of the sum of squared differences between the value of each variable for one record and the corresponding value for the record it is being compared to.
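The Euclidean dissimilarity described on this slide can be sketched in a few lines (the analysis itself uses R's cluster library; this Python version and its function names are only illustrative):

```python
import math

def euclidean_distance(rec_a, rec_b):
    """Square root of the sum of squared differences between
    corresponding variables of two records."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(rec_a, rec_b)))

def dissimilarity_matrix(records):
    """Record-by-record pairwise Euclidean distances."""
    n = len(records)
    return [[euclidean_distance(records[i], records[j]) for j in range(n)]
            for i in range(n)]
```

A matrix like this is what the clustering routines below consume in place of raw data.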
18. RF Similarity
Varies between 0 and 1
Proximity matrix is an output of RF
After a tree is fit, all records are run through the model
If 2 records land in the same terminal node, their proximity is increased by 1
1 - proximity forms a distance
Can be used as an input to clustering and other unsupervised learning procedures
See “Unsupervised Learning with Random Forest Predictors” by Shi and Horvath
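The proximity steps above can be sketched directly from the terminal-node assignments of a fitted forest. Here `leaf_ids` is a hypothetical trees-by-records layout of leaf labels, not output from Salford or R:

```python
def rf_proximity(leaf_ids):
    """leaf_ids: one list per tree, giving the terminal node each record
    falls into. Proximity of two records = fraction of trees in which
    they land in the same terminal node (so it varies between 0 and 1)."""
    n_trees = len(leaf_ids)
    n_recs = len(leaf_ids[0])
    prox = [[0.0] * n_recs for _ in range(n_recs)]
    for tree in leaf_ids:
        for i in range(n_recs):
            for j in range(n_recs):
                if tree[i] == tree[j]:
                    prox[i][j] += 1.0
    return [[p / n_trees for p in row] for row in prox]

def rf_distance(prox):
    """1 - proximity forms a dissimilarity usable by clustering."""
    return [[1.0 - p for p in row] for row in prox]
```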
19. Clustering
Hierarchical clustering
K-Means clustering
This analysis uses k-means
20. K-means Clustering
An iterative procedure is used to assign each record in the data to one of k clusters, where k is specified by the user.
The iteration begins with initial centers or medoids for the k groups.
A dissimilarity measure is used to assign records to a group and to iterate to a final grouping.
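The assignment/update iteration above can be sketched as follows. This is an illustrative one-dimensional toy in Python, not the R cluster routine used in the analysis:

```python
def kmeans(records, centers, n_iter=10):
    """Toy 1-D k-means: assign each record to its nearest center,
    then move each center to the mean of its assigned records."""
    for _ in range(n_iter):
        # assignment step: nearest center by squared distance
        groups = [[] for _ in centers]
        for x in records:
            nearest = min(range(len(centers)), key=lambda c: (x - centers[c]) ** 2)
            groups[nearest].append(x)
        # update step: recompute each center as its group mean
        centers = [sum(g) / len(g) if g else centers[c]
                   for c, g in enumerate(groups)]
    return centers, groups
```

For example, starting with centers 1 and 10 on the data 1, 2, 3, 10, 11, 12, the procedure converges to centers 2 and 11 with the obvious two groups.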
21. R Cluster Output
22. Cluster Plot
23. Silhouette Plot
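A silhouette plot displays, for every record, the silhouette width s = (b - a) / max(a, b), where a is the record's average distance to its own cluster-mates and b its average distance to the nearest other cluster. A minimal sketch, assuming a caller-supplied distance function (names are illustrative, not from R's cluster library):

```python
def silhouette_width(point, own_cluster, other_clusters, dist):
    """s = (b - a) / max(a, b). own_cluster: the point's cluster-mates
    (excluding the point itself); other_clusters: the remaining clusters."""
    a = sum(dist(point, p) for p in own_cluster) / len(own_cluster)
    b = min(sum(dist(point, p) for p in c) / len(c) for c in other_clusters)
    return (b - a) / max(a, b)
```

Widths near 1 indicate a record sits well inside its cluster; widths near 0 or below suggest it lies between clusters.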
30. RF Ranking of the
“Predictors”: Top 10 of 44
Variable  MeanDecreaseGini  Description
acc10     10.50             claimant in old, low-value vehicle
trt01     9.05              large # visits to chiro
inj01     8.64              strain or sprain
inj02     8.64              readily accepted fault
inj05     8.62              no emergency treatment given for injury
acc01     8.55              no police report
clt07     7.47              one of 3 or more claimants in vehicle
inj06     7.44              non-emergency trt delayed
acc15     7.36              very minor collision
trt03     6.82              large # visits to PT
31. Problem: Categorical
Variables
It is not clear how to best perform Principal Components/Factor Analysis on categorical variables
The categories may be coded as a series of binary dummy variables
If the categories are ordered categories, you may lose important information
This is the problem that PRIDIT addresses
32. RIDIT
Variables are ordered so that the lowest value is associated with the highest probability of fraud
Use the cumulative distribution of claims at each value i to create the RIDIT statistic for claim t, value i:

R_{ti} = \sum_{j < i} \hat{p}_{tj} - \sum_{j > i} \hat{p}_{tj}

(the proportion of claims below value i minus the proportion above it)
33. Example: RIDIT for Legal
Representation
Legal Representation
Value  Code  Number  Proportion  Proportion Below  Proportion Above  RIDIT
Yes    1     706     0.504       0.000             0.496             -0.496
No     2     694     0.496       0.504             0.000             0.504
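A small sketch of the RIDIT calculation (illustrative Python; the function name is ours). Given the counts 706 and 694 from the Legal Representation table above, it returns the same scores, -0.496 and 0.504:

```python
def ridit_scores(counts):
    """counts: number of claims at each ordered value of a variable.
    RIDIT for value i = proportion of claims below i
                        minus proportion of claims above i."""
    total = sum(counts)
    props = [c / total for c in counts]
    scores = []
    for i in range(len(props)):
        below = sum(props[:i])
        above = sum(props[i + 1:])
        scores.append(below - above)
    return scores
```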
34. PRIDIT
Use RIDIT statistics in Principal Components Analysis

Component Matrix (Component 1)
SIU            .248
Police Report  .220
At Fault       .709
Legal Rep      .752
Medical Audit  .341
Prior Claim    .406

Extraction Method: Principal Component Analysis. 1 component extracted.
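PRIDIT scores each categorical response by its RIDIT and then runs an ordinary principal components analysis; the first component's loadings weight the flags, as in the component matrix above. A minimal numpy sketch of extracting that first component, assuming a records-by-flags matrix of RIDIT scores (the actual loadings shown on the slide came from the analysis, not from this code):

```python
import numpy as np

def first_principal_component(ridit_matrix):
    """Loadings of the first PC of a (records x flags) matrix of RIDIT scores."""
    X = np.asarray(ridit_matrix, dtype=float)
    Xc = X - X.mean(axis=0)              # center each flag
    cov = Xc.T @ Xc / (len(X) - 1)       # covariance of the RIDIT scores
    vals, vecs = np.linalg.eigh(cov)     # eigh returns ascending eigenvalues
    return vecs[:, -1]                   # eigenvector of the largest eigenvalue
```

On two perfectly correlated flags the loadings come out equal in magnitude, as expected.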
35. PRIDITS of Accident
Flags
36. Fit Tree with PRIDITS for
Each Type of Flag
37. Importance Ranking of
Pridits
38. Importance Ranking of
Factors
39. Add RF and Euclid
Clusters to PRIDIT
Factors
40. Use Salford RF MDS
Top variable in importance (acc10) used as binary dependent
Run a forest with 1,000 trees
Output proximities and MDS
Use MDS scales to cluster (k=3)
Run tree to get importance ranking
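The MDS step above, scaling the RF proximities down to a few coordinates, can be sketched as classical (Torgerson) MDS with numpy. This is an illustrative stand-in, not the Salford routine: it double-centers the squared distances and takes the top eigenvectors as coordinates.

```python
import numpy as np

def classical_mds(dist, n_dims=2):
    """Embed records in n_dims coordinates so that Euclidean distances
    approximate the input distance matrix (e.g. 1 - RF proximity)."""
    D = np.asarray(dist, dtype=float)
    n = len(D)
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centered squared distances
    vals, vecs = np.linalg.eigh(B)           # ascending eigenvalues
    idx = np.argsort(vals)[::-1][:n_dims]    # keep the largest n_dims
    L = np.sqrt(np.maximum(vals[idx], 0))    # clip tiny negatives from rounding
    return vecs[:, idx] * L                  # coordinates: scaled eigenvectors
```

The resulting scales can then be fed to k-means (k=3) as on this slide.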
41. MDS Graph
42. Rank of cluster
procedures to Tree
Prediction
43. Labeling Clusters
44. Relation Between
PRIDIT Factor and
Suspicion
45. Next Steps
Add claim file variables
Rerun clusters
Rerun PRIDITS
Do Random Forest proximities on
the RIDITS
Apply the procedures to other
fraud databases
46. PRIDIT REFERENCES
Ai, J., Brockett, P.L., and Golden, L.L. (2009), “Assessing Consumer Fraud Risk in Insurance Claims with Discrete and Continuous Data,” North American Actuarial Journal, 13: 438-458.
Brockett, P.L., Derrig, R.A., Golden, L.L., Levine, A., and Alpert, M. (2002), “Fraud Classification Using Principal Component Analysis of RIDITs,” Journal of Risk and Insurance, 69:3, 341-373.
Brockett, P.L., Xia, X., and Derrig, R.A. (1998), “Using Kohonen’s Self-Organizing Feature Map to Uncover Automobile Bodily Injury Claims Fraud,” Journal of Risk and Insurance, 65: 245-274.
Bross, I.D.J. (1958), “How To Use RIDIT Analysis,” Biometrics, 14: 18-38.
Chipman, H., George, E.I., and McCulloch, R.E. (2006), “Bayesian Ensemble Learning,” Neural Information Processing Systems.
Lieberthal, R.D. (2008), “Hospital Quality: A PRIDIT Approach,” Health Services Research, 43:3, 988-1005.
47. Questions?