Claim Pattern Anomalies
Making a Mole Hill Out of a Mountain
Predictive Analytics World for Business
San Francisco
May 17, 2017
CAS Analytics & Data Provisioning Team v01
Darryl Humphrey, PhD, PMP
linkedin.com/in/dghumphrey1
Provider and member claiming behavior is affected by many factors.
Fraud Analytics
Member and provider claiming patterns are dynamic. Contributing factors:
– Economic conditions
– Plan design
– Policies and processes
– Compliance verification
– Industry realities
CAS ADP Team
Analyzing the equivalent of 87,000,000 claim lines monthly, encompassing 17,000 providers and 1.6 million members.
– Nine (9) practice areas across health, dental, and pharmacy benefits
– 70 measures of claiming behavior
– Six (6) algorithms
– Look for converging results
A multivariate distance measure identifies providers whose claiming patterns differ from the population.
[Figure: scatter plot, November Analytic Run, all providers reviewed. x-axis: DrugRD (0 to 150); y-axis: Proportion of Total $ Associated with Risky Claims (0 to 1); legend: Non Outlier, MCD Outlier; marked value: 4.18]
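To make the distance idea concrete, here is a minimal sketch, assuming "MCD" refers to the Minimum Covariance Determinant estimator. It approximates that estimator with a few concentration steps from a single start, which is a simplification of the usual multi-start procedure. The two measures per provider, the data values, `keep_fraction`, and the cutoff of 3.0 are all hypothetical.

```python
import math

def mean_cov(points):
    """Mean and 2x2 sample covariance (sxx, sxy, syy) of (x, y) points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis(p, mean, cov):
    """Mahalanobis distance of p, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = p[0] - mean[0], p[1] - mean[1]
    return math.sqrt((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)

def robust_distances(points, keep_fraction=0.75, steps=10):
    """Re-estimate mean and covariance from the closest points, then score all points."""
    h = int(len(points) * keep_fraction)
    subset = list(points)
    for _ in range(steps):
        mean, cov = mean_cov(subset)
        ranked = sorted(points, key=lambda p: mahalanobis(p, mean, cov))
        if ranked[:h] == subset:
            break
        subset = ranked[:h]
    mean, cov = mean_cov(subset)
    return [mahalanobis(p, mean, cov) for p in points]

# Hypothetical (claim volume measure, proportion of risky $) per provider.
providers = [(10, 0.10), (12, 0.12), (11, 0.09), (13, 0.15), (9, 0.11),
             (12, 0.10), (10, 0.13), (11, 0.14), (120, 0.90)]
distances = robust_distances(providers)
flagged = [i for i, d in enumerate(distances) if d > 3.0]  # hypothetical cutoff
```

Because the mean and covariance are re-estimated from the most central points only, a single extreme provider cannot inflate the covariance and hide itself.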
Clustering algorithm sharpens the focus on the riskiest providers.
Providers that cluster together have similar claiming patterns.
Small clusters with high RD scores are of most interest.
[Figure: scatter plot of k-means results, November Analytic Run, all providers reviewed. x-axis: DrugRD (0 to 150); y-axis: Proportion of Total $ that are at Risk (0 to 1); legend: Cluster 1 to Cluster 8; marked value: 4.18; cluster-size labels: 24, 5, 54, n=34]
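The clustering step can be sketched as plain k-means. This is an illustration only: the points are hypothetical, already-standardized measures, and the starting centroids are fixed by hand so the run is repeatable. In practice the measures would need to be put on a common scale first.

```python
from collections import Counter

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, centroids, iterations=20):
    """Alternate between assigning points and recomputing centroids."""
    labels = []
    for _ in range(iterations):
        labels = [min(range(len(centroids)),
                      key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        new_centroids = []
        for c, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical standardized measures for seven providers.
points = [(0.1, 0.0), (-0.2, 0.1), (0.0, -0.1), (0.2, 0.2),
          (5.0, 5.1), (5.2, 4.9), (0.1, 6.0)]
labels, centroids = kmeans(points, [(0.0, 0.0), (5.0, 5.0), (0.0, 6.0)])
sizes = Counter(labels)  # small clusters far from the bulk are the ones to review
```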
Reviewing the cluster characteristics gives insight into what
claiming patterns are driving the outlier scores.
Mean Z Scores (Var1 to Var9)
Cluster  # Providers  Avg DrugRD   Var1   Var2   Var3   Var4   Var5   Var6   Var7   Var8   Var9
4                  5       24       2.0   29.7   -0.5   -0.1    0.1   -0.4   -0.2   -0.2   -0.5
2                 24       14      -0.1    0.8   10.3    1.7    1.7   -0.3    0.3    0.6    0.7
3                 34       84      -1.3    0.8   -0.4   36.8    8.4    8.9    4.3    0.5    1.1
5                 54       46      -0.4    0.9    0.9   17.1    3.4    1.9    2.5    0.9    0.5
…
1                682     2.15       0.2    0.0    0.2   -0.3    0.0   -0.2   -0.1   -0.2    0.2
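A table like this can be produced by standardizing each measure across all providers and then averaging the z-scores within each cluster. The sketch below shows that for one measure. The values and cluster labels are hypothetical, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev

def cluster_mean_z(values, labels):
    """Mean z-score of one measure within each cluster."""
    mu, sigma = mean(values), pstdev(values)
    z = [(v - mu) / sigma for v in values]
    return {lab: mean([zi for zi, l in zip(z, labels) if l == lab])
            for lab in set(labels)}

# Hypothetical measure for eight providers in two clusters.
values = [1, 1, 1, 1, 1, 1, 9, 9]
labels = [1, 1, 1, 1, 1, 1, 4, 4]
mean_z = cluster_mean_z(values, labels)
```

A cluster whose mean z-score on a measure is far from zero is one where that measure is driving the outlier score.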
Claim-specific risk is estimated for the variables highlighted in
the K-means and MCD analyses.
\[
\mathrm{Risk}_{MA}(i,j) = \frac{e^{-\frac{MA(i,j)}{\mathrm{Max}_{MA}(i)}\left(1-\frac{d_i(j)}{r}\right)} - e^{-1}}{1 - e^{-1}}
\]
Limited investigation resources are targeted on the specific claims
most likely to be an issue.
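As a sketch, the risk score can be evaluated directly. This reads the exponent as -(MA(i,j)/MaxMA(i)) * (1 - d_i(j)/r), which is an assumption about how the slide's expression is grouped. The slides do not define the symbols, so the argument names below simply mirror them and the input values are hypothetical.

```python
import math

def claim_risk(ma, max_ma, d, r):
    """Evaluate the risk expression; the result is rescaled to lie in [0, 1]."""
    t = (ma / max_ma) * (1.0 - d / r)  # assumed grouping of the exponent
    return (math.exp(-t) - math.exp(-1.0)) / (1.0 - math.exp(-1.0))

risk_low_ratio = claim_risk(ma=0.0, max_ma=100.0, d=0.0, r=10.0)
risk_high_ratio = claim_risk(ma=100.0, max_ma=100.0, d=0.0, r=10.0)
```

Under this reading the exponent term runs from 0 to 1, so subtracting e^-1 and dividing by 1 - e^-1 maps the score onto the unit interval.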
Network analysis can reveal relationships that warrant
further investigation.
Collusion between members
and suspect providers?
Problematic providers tend to
have customers in common.
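One simple way to surface shared customers is to count, for every pair of providers, how many members have claims with both. The sketch below assumes a hypothetical list of (member, provider) pairs and is not the team's actual network method.

```python
from collections import defaultdict
from itertools import combinations

def shared_members(claims):
    """Count members in common for each provider pair with at least one."""
    by_provider = defaultdict(set)
    for member, provider in claims:
        by_provider[provider].add(member)
    shared = {}
    for a, b in combinations(sorted(by_provider), 2):
        common = by_provider[a] & by_provider[b]
        if common:
            shared[(a, b)] = len(common)
    return shared

# Hypothetical (member, provider) pairs taken from claim lines.
claims = [("m1", "pA"), ("m1", "pB"), ("m2", "pA"), ("m2", "pB"),
          ("m3", "pA"), ("m3", "pC"), ("m4", "pC")]
shared = shared_members(claims)
```

The counts can serve as edge weights in a provider-to-provider network, where heavy edges between already-suspect providers stand out.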
Claiming patterns for narcotics are of particular
interest.
Highly concentrated business
relationships are flagged.
Are members seeking narcotics from
multiple doctors and pharmacies?
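The question above can be turned into a simple screen: count the distinct prescribers and pharmacies behind each member's narcotic claims. The sketch below is illustrative. The record layout and both thresholds are hypothetical.

```python
from collections import defaultdict

def multi_source_members(claims, min_prescribers=3, min_pharmacies=3):
    """Members whose claims span many distinct prescribers and pharmacies."""
    prescribers = defaultdict(set)
    pharmacies = defaultdict(set)
    for member, prescriber, pharmacy in claims:
        prescribers[member].add(prescriber)
        pharmacies[member].add(pharmacy)
    return sorted(m for m in prescribers
                  if len(prescribers[m]) >= min_prescribers
                  and len(pharmacies[m]) >= min_pharmacies)

# Hypothetical (member, prescriber, pharmacy) records for narcotic claims.
narcotic_claims = [
    ("m1", "dr1", "ph1"), ("m1", "dr2", "ph2"), ("m1", "dr3", "ph3"),
    ("m2", "dr1", "ph1"), ("m2", "dr1", "ph1"),
]
flagged_members = multi_source_members(narcotic_claims)
```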
Machine learning (ML) = architectures for building
algorithms that learn.
[Diagram labels: Machine Learning; SVM; Random Forest; K-NN; Neural Network (NN); Deep Learning; CNN; DBN; RBM; RNN]
Random Forest algorithm classifies observations based
on the majority vote of many decision trees.
[Diagram: a data set of 1200 obs and 7 vars is sampled with replacement several times (three samples shown, then "…"); the output is a risk classification]
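The bootstrap-and-vote mechanism can be sketched in a few lines. To stay short, each "tree" here is a single-split stump rather than a full decision tree, which is a deliberate simplification of Random Forest. The data, feature count, and tree count are hypothetical.

```python
import random
from collections import Counter

def fit_stump(rows, labels, features):
    """Best single split (feature, threshold, left label, right label)."""
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            errors = (sum(y != l_lab for y in left)
                      + sum(y != r_lab for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    return best[1:]

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Fit each stump on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    k = max(1, int(n_features ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feats = rng.sample(range(n_features), k)
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feats))
    return forest

def predict(forest, row):
    """Classify by majority vote across all stumps."""
    votes = Counter(l if row[f] <= t else r for f, t, l, r in forest)
    return votes.most_common(1)[0][0]

# Hypothetical observations with two measures each; 1 = risky, 0 = not.
rows = [(1, 1), (2, 1), (1, 2), (2, 2), (8, 9), (9, 8), (9, 9), (8, 8)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(rows, labels)
```

Each stump sees a different resample of the data, so individual mistakes tend to be outvoted.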
Random Forest technique shows promise in predicting
which investigations will yield findings of note.
             Predicted 1   Predicted 0
Actual 1          25             9
Actual 0           1            50
True Positive Rate: 74%
True Negative Rate: 98%
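The two rates follow from the four counts in the confusion matrix. The check below assumes the rows are actual outcomes and the columns are predictions, which is the layout that reproduces the stated 74% and 98%.

```python
# Counts from the confusion matrix (rows assumed actual, columns predicted).
tp, fn = 25, 9   # actual 1: predicted 1, predicted 0
fp, tn = 1, 50   # actual 0: predicted 1, predicted 0

tpr = tp / (tp + fn)  # share of investigations with findings that were predicted
tnr = tn / (tn + fp)  # share of investigations without findings that were predicted
```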
\[
\mathrm{Risk}_{MA}(i,j) = \frac{e^{-\frac{MA(i,j)}{\mathrm{Max}_{MA}(i)}\left(1-\frac{d_i(j)}{r}\right)} - e^{-1}}{1 - e^{-1}}
\]
Random Forest provides a measure of a variable’s
importance to classification success.
[Figure: variable importance chart; variables listed in the order Var 6, Var 2, Var 3, Var 4, Var 1, Var 5, Var 7]
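One common way to measure a variable's importance is permutation importance: shuffle one variable and see how much accuracy drops. The slides do not say which importance measure was used, so this is an assumed variant. The `model` below is a hypothetical stand-in for a fitted classifier.

```python
import random

def accuracy(model, rows, labels):
    """Fraction of rows the model classifies correctly."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(model, rows, labels, n_repeats=20, seed=0):
    """Mean drop in accuracy when each variable is shuffled in turn."""
    rng = random.Random(seed)
    base = accuracy(model, rows, labels)
    importances = []
    for f in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[f] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:f] + (v,) + r[f + 1:] for r, v in zip(rows, column)]
            drops.append(base - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical data: only the first variable carries the signal.
rows = [(1, 7), (2, 3), (3, 9), (4, 1), (6, 8), (7, 2), (8, 6), (9, 4)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
model = lambda r: int(r[0] > 5)  # stand-in for a trained classifier
importances = permutation_importance(model, rows, labels)
```

A variable the model never uses scores zero, while shuffling a variable the model relies on lowers accuracy.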
Automated review of receipts provides early detection
of potential issues.
Machine learning algorithm is being used to
determine if the document is a valid receipt.
Data lift technology extracts
the information.
Analytics is one input used to match cost-to-investigate
with the anticipated ROI.
There are many paths to generating ROI from
fraud detection analytics.
– Business knowledge and a willingness to learn are more important than the tool set
– Analytics are tools; keep them sharp
– Verify that the analyses are:
  – Relevant
  – Reliable
  – Responsible
– Tailor audit investigations to the nature and magnitude of the risk
Jil Tanguay, BSc (Spec), CFI, CRMA
Manager
Claims Assurance Services
Alberta Blue Cross
jtanguay@ab.bluecross.ca
Darryl Humphrey, PhD, PMP
Senior Data Scientist
Claims Assurance Services
Alberta Blue Cross
dhumphrey@ab.bluecross.ca
Yemi Dare-Ode, BSc
Nazanin Tahmasebi, PhD
Wesley Wood, BSc
Random forest classification accuracy stabilizes at
approximately 220 trees.
Many data sets contain nonlinear relationships which can
reduce the effectiveness of some detection methods.
– Datasets that are linearly separable with some noise work out great
– Some data sets aren’t linear in their initial state
– The data can be mapped to a higher-dimensional space
[Figures: points along a one-dimensional x axis; the same points plotted as x² against x]
Map feature space to one of higher dimensionality
where the training set is linearly separable.
Φ: x → φ(x)
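A small worked example of such a mapping, assuming the map φ(x) = (x, x²) and hypothetical one-dimensional data: no single cut point on x separates the classes, but a cut on the new x² coordinate does.

```python
def phi(x):
    """Map a scalar to the higher-dimensional point (x, x^2)."""
    return (x, x * x)

def separable_by_threshold(values, labels):
    """True if a single cut point puts each class entirely on one side."""
    zeros = [v for v, y in zip(values, labels) if y == 0]
    ones = [v for v, y in zip(values, labels) if y == 1]
    return max(zeros) < min(ones) or max(ones) < min(zeros)

# Hypothetical data: class 1 sits on both sides of class 0.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [1, 1, 0, 0, 0, 1, 1]

separable_in_x = separable_by_threshold(xs, ys)
separable_in_x2 = separable_by_threshold([phi(x)[1] for x in xs], ys)
```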
Support Vector Machines find the optimal surface that separates the groups.
– Maximizes the distance between the hyperplane and the “difficult points” close to the decision boundary
– If there are no points near the decision surface, then there will be fewer false positives and false negatives
– Support vectors are the observations near the decision boundary that contribute to determining the boundary
– Implies that only support vectors matter; other training examples are ignorable
Ch. 15
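A minimal sketch of a linear support vector machine, trained by subgradient descent on the regularized hinge loss. This is one of several ways to fit an SVM and not necessarily the one used here. The data and the settings `lam`, `lr`, and `epochs` are hypothetical.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=10000):
    """Minimize lam/2 * |w|^2 + mean hinge loss; labels must be +1 or -1."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # inside the margin or misclassified
                for j, xj in enumerate(xi):
                    grad_w[j] -= yi * xj / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def decision(w, b, x):
    """Signed score; its sign is the predicted class."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

# Hypothetical two-measure observations with labels -1 and +1.
X = [(1, 1), (2, 1), (1, 2), (4, 4), (5, 4), (4, 5)]
y = [-1, -1, -1, 1, 1, 1]
w, b = train_linear_svm(X, y)
predictions = [1 if decision(w, b, x) >= 0 else -1 for x in X]
```

Only observations with a margin below 1 contribute to the update, which is the sense in which points far from the boundary are ignorable.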
[Two panels, each with x-axis: RD Quintile]
Random Forest Confusion Matrix
Quintile     Accuracy
0-0.20       0.80
0.20-0.40    0.67
0.40-0.60    0.50
0.60-0.80    0.69
0.80-1       0.91
Quintile     Accuracy
0-0.20       0.85
0.20-0.40    0.54
0.40-0.60    0.54
0.60-0.80    0.67
0.80-1       0.94
SVM Confusion Matrix
Neural networks are well-suited to detection tasks.
– Artificial neural networks are composed of multiple nodes which imitate neurons of the human brain
– Neurons are connected by links and they interact with each other. Each link is associated with a weight
– Artificial neural networks learn by modifying the weights in response to feedback
– Deep learning = lots of hidden layers
– Most often used for images
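Learning by modifying weights in response to feedback can be shown with a single artificial neuron. The sketch below trains one sigmoid neuron on the logical OR function. The task, learning rate, and epoch count are hypothetical choices for illustration.

```python
import math

def sigmoid(z):
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def train_neuron(inputs, targets, lr=0.5, epochs=5000):
    """Adjust the weights after each example in proportion to the error."""
    weights = [0.0] * len(inputs[0])
    bias = 0.0
    for _ in range(epochs):
        for x, t in zip(inputs, targets):
            out = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = out - t  # the feedback signal
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias

# Logical OR as a toy detection task.
inputs = [(0, 0), (1, 0), (0, 1), (1, 1)]
targets = [0, 1, 1, 1]
weights, bias = train_neuron(inputs, targets)
outputs = [int(sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) > 0.5)
           for x in inputs]
```

A deep network stacks many such units in layers and propagates the same kind of error signal back through them.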
Eye movement research indicates that we recognize
objects by extracting features.
The series of layers between input & output do
feature extraction and processing in stages, just as our
brains do.
[Diagram labels: Learning; Variables]
Network analysis is used to show the effect of
ownership on a pharmacy’s claiming behavior.
– Assertion is that company policy / implicit guidelines can drive claiming behavior across the pharmacies owned by a single corporate entity
– Network defined by pharmacies registered with the same legal name
– Red = high total $ from risky claiming relative to other pharmacies
– Large = high proportion of a pharmacy’s $ from risky claiming
– Close together = similar high total $ at risk
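The grouping behind such a network can be sketched by collecting pharmacies under their registered legal name and summarizing the risky dollars per owner. The field names and amounts below are hypothetical.

```python
from collections import defaultdict

def owner_summary(pharmacies):
    """Risky-claim totals and shares for pharmacies grouped by legal name."""
    groups = defaultdict(list)
    for p in pharmacies:
        groups[p["legal_name"]].append(p)
    summary = {}
    for owner, members in groups.items():
        risky = sum(p["risky"] for p in members)
        total = sum(p["total"] for p in members)
        summary[owner] = {
            "pharmacies": len(members),
            "risky_total": risky,          # drives node color in the network
            "risky_share": risky / total,  # drives node size in the network
        }
    return summary

# Hypothetical pharmacies with risky and total claim dollars.
pharmacies = [
    {"id": "ph1", "legal_name": "Alpha Drugs Ltd", "risky": 40.0, "total": 100.0},
    {"id": "ph2", "legal_name": "Alpha Drugs Ltd", "risky": 60.0, "total": 100.0},
    {"id": "ph3", "legal_name": "Beta Pharmacy Inc", "risky": 5.0, "total": 100.0},
]
summary = owner_summary(pharmacies)
```

If pharmacies under one legal name show similar risky shares, that is consistent with the assertion that ownership shapes claiming behavior.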

More Related Content

Similar to 1530 track2 humphrey

V34132136
V34132136V34132136
V34132136
IJERA Editor
 
Customer Profiling using Data Mining
Customer Profiling using Data Mining Customer Profiling using Data Mining
Customer Profiling using Data Mining
Suman Chatterjee
 
Early Identification of Diseases Based on Responsible Attribute using Data Mi...
Early Identification of Diseases Based on Responsible Attribute using Data Mi...Early Identification of Diseases Based on Responsible Attribute using Data Mi...
Early Identification of Diseases Based on Responsible Attribute using Data Mi...
IRJET Journal
 
II-SDV 2017: The Next Era: Deep Learning for Biomedical Research
II-SDV 2017: The Next Era: Deep Learning for Biomedical ResearchII-SDV 2017: The Next Era: Deep Learning for Biomedical Research
II-SDV 2017: The Next Era: Deep Learning for Biomedical Research
Dr. Haxel Consult
 
Review of Algorithms for Crime Analysis & Prediction
Review of Algorithms for Crime Analysis & PredictionReview of Algorithms for Crime Analysis & Prediction
Review of Algorithms for Crime Analysis & Prediction
IRJET Journal
 
Healthcare deserts: How accessible is US healthcare?
Healthcare deserts: How accessible is US healthcare?Healthcare deserts: How accessible is US healthcare?
Healthcare deserts: How accessible is US healthcare?
Data Con LA
 
IRJET-A Hybrid Intrusion Detection Technique based on IRF & AODE for KDD-CUP ...
IRJET-A Hybrid Intrusion Detection Technique based on IRF & AODE for KDD-CUP ...IRJET-A Hybrid Intrusion Detection Technique based on IRF & AODE for KDD-CUP ...
IRJET-A Hybrid Intrusion Detection Technique based on IRF & AODE for KDD-CUP ...
IRJET Journal
 
DataMining_CA2-4
DataMining_CA2-4DataMining_CA2-4
DataMining_CA2-4
Aravind Kumar
 
Introduction to the Open Source HPCC Systems Platform by Arjuna Chala
Introduction to the Open Source HPCC Systems Platform by Arjuna ChalaIntroduction to the Open Source HPCC Systems Platform by Arjuna Chala
Introduction to the Open Source HPCC Systems Platform by Arjuna Chala
HPCC Systems
 
Imtiaz khan data_science_analytics
Imtiaz khan data_science_analyticsImtiaz khan data_science_analytics
Imtiaz khan data_science_analytics
imtiaz khan
 
Deep learning methods applied to physicochemical and toxicological endpoints
Deep learning methods applied to physicochemical and toxicological endpointsDeep learning methods applied to physicochemical and toxicological endpoints
Deep learning methods applied to physicochemical and toxicological endpoints
Valery Tkachenko
 
Decoding the Acronyms in Clinical Data Standards
Decoding the Acronyms in Clinical Data StandardsDecoding the Acronyms in Clinical Data Standards
Decoding the Acronyms in Clinical Data Standards
d-Wise Technologies
 
Cyb 5675 class project final
Cyb 5675   class project finalCyb 5675   class project final
Cyb 5675 class project final
Craig Cannon
 
Seminar Presentation
Seminar PresentationSeminar Presentation
Seminar Presentation
Vaibhav Dhattarwal
 
Neural networks, naïve bayes and decision tree machine learning
Neural networks, naïve bayes and decision tree machine learningNeural networks, naïve bayes and decision tree machine learning
Neural networks, naïve bayes and decision tree machine learning
Francisco E. Figueroa-Nigaglioni
 
Data Mining based on Hashing Technique
Data Mining based on Hashing TechniqueData Mining based on Hashing Technique
Data Mining based on Hashing Technique
ijtsrd
 
IRJET- Improved Model for Big Data Analytics using Dynamic Multi-Swarm Op...
IRJET-  	  Improved Model for Big Data Analytics using Dynamic Multi-Swarm Op...IRJET-  	  Improved Model for Big Data Analytics using Dynamic Multi-Swarm Op...
IRJET- Improved Model for Big Data Analytics using Dynamic Multi-Swarm Op...
IRJET Journal
 
Feature Subset Selection for High Dimensional Data using Clustering Techniques
Feature Subset Selection for High Dimensional Data using Clustering TechniquesFeature Subset Selection for High Dimensional Data using Clustering Techniques
Feature Subset Selection for High Dimensional Data using Clustering Techniques
IRJET Journal
 
DA ST-1 SET-B-Solution.pdf we also provide the many type of solution
DA ST-1 SET-B-Solution.pdf we also provide the many type of solutionDA ST-1 SET-B-Solution.pdf we also provide the many type of solution
DA ST-1 SET-B-Solution.pdf we also provide the many type of solution
gitikasingh2004
 
Data analytics and visualization
Data analytics and visualizationData analytics and visualization
Data analytics and visualization
Vini Vasundharan
 

Similar to 1530 track2 humphrey (20)

V34132136
V34132136V34132136
V34132136
 
Customer Profiling using Data Mining
Customer Profiling using Data Mining Customer Profiling using Data Mining
Customer Profiling using Data Mining
 
Early Identification of Diseases Based on Responsible Attribute using Data Mi...
Early Identification of Diseases Based on Responsible Attribute using Data Mi...Early Identification of Diseases Based on Responsible Attribute using Data Mi...
Early Identification of Diseases Based on Responsible Attribute using Data Mi...
 
II-SDV 2017: The Next Era: Deep Learning for Biomedical Research
II-SDV 2017: The Next Era: Deep Learning for Biomedical ResearchII-SDV 2017: The Next Era: Deep Learning for Biomedical Research
II-SDV 2017: The Next Era: Deep Learning for Biomedical Research
 
Review of Algorithms for Crime Analysis & Prediction
Review of Algorithms for Crime Analysis & PredictionReview of Algorithms for Crime Analysis & Prediction
Review of Algorithms for Crime Analysis & Prediction
 
Healthcare deserts: How accessible is US healthcare?
Healthcare deserts: How accessible is US healthcare?Healthcare deserts: How accessible is US healthcare?
Healthcare deserts: How accessible is US healthcare?
 
IRJET-A Hybrid Intrusion Detection Technique based on IRF & AODE for KDD-CUP ...
IRJET-A Hybrid Intrusion Detection Technique based on IRF & AODE for KDD-CUP ...IRJET-A Hybrid Intrusion Detection Technique based on IRF & AODE for KDD-CUP ...
IRJET-A Hybrid Intrusion Detection Technique based on IRF & AODE for KDD-CUP ...
 
DataMining_CA2-4
DataMining_CA2-4DataMining_CA2-4
DataMining_CA2-4
 
Introduction to the Open Source HPCC Systems Platform by Arjuna Chala
Introduction to the Open Source HPCC Systems Platform by Arjuna ChalaIntroduction to the Open Source HPCC Systems Platform by Arjuna Chala
Introduction to the Open Source HPCC Systems Platform by Arjuna Chala
 
Imtiaz khan data_science_analytics
Imtiaz khan data_science_analyticsImtiaz khan data_science_analytics
Imtiaz khan data_science_analytics
 
Deep learning methods applied to physicochemical and toxicological endpoints
Deep learning methods applied to physicochemical and toxicological endpointsDeep learning methods applied to physicochemical and toxicological endpoints
Deep learning methods applied to physicochemical and toxicological endpoints
 
Decoding the Acronyms in Clinical Data Standards
Decoding the Acronyms in Clinical Data StandardsDecoding the Acronyms in Clinical Data Standards
Decoding the Acronyms in Clinical Data Standards
 
Cyb 5675 class project final
Cyb 5675   class project finalCyb 5675   class project final
Cyb 5675 class project final
 
Seminar Presentation
Seminar PresentationSeminar Presentation
Seminar Presentation
 
Neural networks, naïve bayes and decision tree machine learning
Neural networks, naïve bayes and decision tree machine learningNeural networks, naïve bayes and decision tree machine learning
Neural networks, naïve bayes and decision tree machine learning
 
Data Mining based on Hashing Technique
Data Mining based on Hashing TechniqueData Mining based on Hashing Technique
Data Mining based on Hashing Technique
 
IRJET- Improved Model for Big Data Analytics using Dynamic Multi-Swarm Op...
IRJET-  	  Improved Model for Big Data Analytics using Dynamic Multi-Swarm Op...IRJET-  	  Improved Model for Big Data Analytics using Dynamic Multi-Swarm Op...
IRJET- Improved Model for Big Data Analytics using Dynamic Multi-Swarm Op...
 
Feature Subset Selection for High Dimensional Data using Clustering Techniques
Feature Subset Selection for High Dimensional Data using Clustering TechniquesFeature Subset Selection for High Dimensional Data using Clustering Techniques
Feature Subset Selection for High Dimensional Data using Clustering Techniques
 
DA ST-1 SET-B-Solution.pdf we also provide the many type of solution
DA ST-1 SET-B-Solution.pdf we also provide the many type of solutionDA ST-1 SET-B-Solution.pdf we also provide the many type of solution
DA ST-1 SET-B-Solution.pdf we also provide the many type of solution
 
Data analytics and visualization
Data analytics and visualizationData analytics and visualization
Data analytics and visualization
 

More from Rising Media, Inc.

1415 track 1 wu_using his laptop
1415 track 1 wu_using his laptop1415 track 1 wu_using his laptop
1415 track 1 wu_using his laptop
Rising Media, Inc.
 
Matt gershoff
Matt gershoffMatt gershoff
Matt gershoff
Rising Media, Inc.
 
Keynote adam greco
Keynote adam grecoKeynote adam greco
Keynote adam greco
Rising Media, Inc.
 
1620 keynote olson_using our laptop
1620 keynote olson_using our laptop1620 keynote olson_using our laptop
1620 keynote olson_using our laptop
Rising Media, Inc.
 
1530 track 2 stuart_using our laptop
1530 track 2 stuart_using our laptop1530 track 2 stuart_using our laptop
1530 track 2 stuart_using our laptop
Rising Media, Inc.
 
1530 track 1 fader_using our laptop
1530 track 1 fader_using our laptop1530 track 1 fader_using our laptop
1530 track 1 fader_using our laptop
Rising Media, Inc.
 
1415 track 2 richardson
1415 track 2 richardson1415 track 2 richardson
1415 track 2 richardson
Rising Media, Inc.
 
1215 daa lunch owusu_using our laptop
1215 daa lunch owusu_using our laptop1215 daa lunch owusu_using our laptop
1215 daa lunch owusu_using our laptop
Rising Media, Inc.
 
1215 daa lunch a bos intro slides_using our laptop
1215 daa lunch a bos intro slides_using our laptop1215 daa lunch a bos intro slides_using our laptop
1215 daa lunch a bos intro slides_using our laptop
Rising Media, Inc.
 
915 e metrics_claudia perlich
915 e metrics_claudia perlich915 e metrics_claudia perlich
915 e metrics_claudia perlich
Rising Media, Inc.
 
855 sponsor movassate_using our laptop
855 sponsor movassate_using our laptop855 sponsor movassate_using our laptop
855 sponsor movassate_using our laptop
Rising Media, Inc.
 
1615 plack using our laptop
1615 plack using our laptop1615 plack using our laptop
1615 plack using our laptop
Rising Media, Inc.
 
1530 rimmele do not share
1530 rimmele do not share1530 rimmele do not share
1530 rimmele do not share
Rising Media, Inc.
 
1325 keynote yale_pdf shareable
1325 keynote yale_pdf shareable1325 keynote yale_pdf shareable
1325 keynote yale_pdf shareable
Rising Media, Inc.
 
1115 fiztgerald schuchardt
1115 fiztgerald schuchardt1115 fiztgerald schuchardt
1115 fiztgerald schuchardt
Rising Media, Inc.
 
1000 kondic do not share
1000 kondic do not share1000 kondic do not share
1000 kondic do not share
Rising Media, Inc.
 
905 keynote peele_using our laptop
905 keynote peele_using our laptop905 keynote peele_using our laptop
905 keynote peele_using our laptop
Rising Media, Inc.
 
Stephen morse sharable
Stephen morse sharableStephen morse sharable
Stephen morse sharable
Rising Media, Inc.
 
Elder shareable
Elder shareableElder shareable
Elder shareable
Rising Media, Inc.
 
1115 ramirez using our laptop
1115 ramirez using our laptop1115 ramirez using our laptop
1115 ramirez using our laptop
Rising Media, Inc.
 

More from Rising Media, Inc. (20)

1415 track 1 wu_using his laptop
1415 track 1 wu_using his laptop1415 track 1 wu_using his laptop
1415 track 1 wu_using his laptop
 
Matt gershoff
Matt gershoffMatt gershoff
Matt gershoff
 
Keynote adam greco
Keynote adam grecoKeynote adam greco
Keynote adam greco
 
1620 keynote olson_using our laptop
1620 keynote olson_using our laptop1620 keynote olson_using our laptop
1620 keynote olson_using our laptop
 
1530 track 2 stuart_using our laptop
1530 track 2 stuart_using our laptop1530 track 2 stuart_using our laptop
1530 track 2 stuart_using our laptop
 
1530 track 1 fader_using our laptop
1530 track 1 fader_using our laptop1530 track 1 fader_using our laptop
1530 track 1 fader_using our laptop
 
1415 track 2 richardson
1415 track 2 richardson1415 track 2 richardson
1415 track 2 richardson
 
1215 daa lunch owusu_using our laptop
1215 daa lunch owusu_using our laptop1215 daa lunch owusu_using our laptop
1215 daa lunch owusu_using our laptop
 
1215 daa lunch a bos intro slides_using our laptop
1215 daa lunch a bos intro slides_using our laptop1215 daa lunch a bos intro slides_using our laptop
1215 daa lunch a bos intro slides_using our laptop
 
915 e metrics_claudia perlich
915 e metrics_claudia perlich915 e metrics_claudia perlich
915 e metrics_claudia perlich
 
855 sponsor movassate_using our laptop
855 sponsor movassate_using our laptop855 sponsor movassate_using our laptop
855 sponsor movassate_using our laptop
 
1615 plack using our laptop
1615 plack using our laptop1615 plack using our laptop
1615 plack using our laptop
 
1530 rimmele do not share
1530 rimmele do not share1530 rimmele do not share
1530 rimmele do not share
 
1325 keynote yale_pdf shareable
1325 keynote yale_pdf shareable1325 keynote yale_pdf shareable
1325 keynote yale_pdf shareable
 
1115 fiztgerald schuchardt
1115 fiztgerald schuchardt1115 fiztgerald schuchardt
1115 fiztgerald schuchardt
 
1000 kondic do not share
1000 kondic do not share1000 kondic do not share
1000 kondic do not share
 
905 keynote peele_using our laptop
905 keynote peele_using our laptop905 keynote peele_using our laptop
905 keynote peele_using our laptop
 
Stephen morse sharable
Stephen morse sharableStephen morse sharable
Stephen morse sharable
 
Elder shareable
Elder shareableElder shareable
Elder shareable
 
1115 ramirez using our laptop
1115 ramirez using our laptop1115 ramirez using our laptop
1115 ramirez using our laptop
 

Recently uploaded

4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
Social Samosa
 
Global Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headedGlobal Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headed
vikram sood
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
University of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma TranscriptUniversity of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma Transcript
soxrziqu
 
一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理
一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理
一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理
g4dpvqap0
 
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
nuttdpt
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
aqzctr7x
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Kiwi Creative
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
javier ramirez
 
The Ipsos - AI - Monitor 2024 Report.pdf
The  Ipsos - AI - Monitor 2024 Report.pdfThe  Ipsos - AI - Monitor 2024 Report.pdf
The Ipsos - AI - Monitor 2024 Report.pdf
Social Samosa
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
g4dpvqap0
 
Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......
Sachin Paul
 
A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
AlessioFois2
 
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdfUdemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Fernanda Palhano
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
manishkhaire30
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
jerlynmaetalle
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
apvysm8
 
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
Timothy Spann
 
State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023
kuntobimo2016
 
Intelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicineIntelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicine
AndrzejJarynowski
 

Recently uploaded (20)

4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
 
Global Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headedGlobal Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headed
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
University of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma TranscriptUniversity of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma Transcript
 
一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理
一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理
一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理
 
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
 
The Ipsos - AI - Monitor 2024 Report.pdf
The  Ipsos - AI - Monitor 2024 Report.pdfThe  Ipsos - AI - Monitor 2024 Report.pdf
The Ipsos - AI - Monitor 2024 Report.pdf
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
 
Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......
 
A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
 
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdfUdemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
 
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
 
State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023
 
Intelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicineIntelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicine
 

1530 track2 humphrey

  • 1. Claim Pattern Anomalies Making a Mole Hill Out of a Mountain Predictive Analytics World for Business San Francisco May 17, 2017 CAS Analytics & Data Provisioning Team v01 Darryl Humphrey, PhD, PMP linkedin.com/in/dghumphrey1
  • 2. Provider and member claiming behavior is affected by many factors. 2CAS ADP Team FraudAnalytics Member and Provider Claiming Patterns are Dynamic Economic Conditions Plan Design Policies and Processes Compliance Verification Industry Realities
  • 3. Analyzing equivalent of 87,000,000 claim lines monthly encompassing 17,000 providers and 1.6 million members. –Nine (9) practice areas across health, dental, and pharmacy benefits –70 measures of claiming behavior –Six (6) algorithms –Look for converging results 3CAS ADP Team
  • 4. Multi-variate distance measure identifies providers whose claiming patterns differ from the population. ProportionofTotal$ AssociatedwithRiskyClaims 0 .2.4.6.8 1 0 50 100 150 DrugRD Non Outlier MCD Outlier November Analytic Run All providers reviewed 4.18 4CAS ADP Team
  • 5. 0 .2.4.6.8 1 0 50 100 150 DrugRD Cluster 1 Cluster 2 Cluster 3 Cluster 4 Cluster 5 Cluster 6 Cluster 7 Cluster 8 November Analytic Run All Providers reviewed kmeans results Clustering algorithm sharpens the focus on the riskiest providers. 4.18 Providers that cluster together have similar claiming patterns. 24 5 54 n=34 Small clusters with high RD scores are of most interest. ProportionofTotal$ thatareatRisk 5CAS ADP Team
  • 6. CAS ADP Team 6 Reviewing the cluster characteristics gives insight into what claiming patterns are driving the outlier scores. Mean Z Scores Cluster # Prvr Avg DrugRD Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9 4 5 24 2.0 29.7 -0.5 -0.1 0.1 -0.4 -0.2 -0.2 -0.5 2 24 14 -0.1 0.8 10.3 1.7 1.7 -0.3 0.3 0.6 0.7 3 34 84 -1.3 0.8 -0.4 36.8 8.4 8.9 4.3 0.5 1.1 5 54 46 -0.4 0.9 0.9 17.1 3.4 1.9 2.5 0.9 0.5 … 1 682 2.15 0.2 0.0 0.2 -0.3 0.0 -0.2 -0.1 -0.2 0.2
  • 7. CAS ADP Team 7 Claim-specific risk is estimated for the variables highlighted in the K-means and MCD analyses. RiskMA(i,j) = (e-(MA(i,j)/Max MA (i)) * (1-di(j)/r))-e-1)/(1-e-1) Limited investigation resources are targeted on the specific claims most likely to be an issue.
  • 8. Network analysis can reveal relationships that warrant further investigation. Collusion between members and suspect providers? CAS ADP Team 8 Problematic providers tend to have customers in common.
  • 9. Claiming patterns for narcotics are of particular interest. Highly concentrated business relationships are flagged. Are members seeking narcotics from multiple doctors and pharmacies?
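The question about members using multiple doctors and pharmacies can be approximated by counting distinct prescribers and pharmacies per member. The thresholds and records below are hypothetical and chosen only to make the sketch concrete.

```python
from collections import defaultdict

def multi_source_members(narcotic_claims, min_prescribers=3, min_pharmacies=3):
    """narcotic_claims: iterable of (member_id, prescriber_id, pharmacy_id).
    Returns members who meet both (assumed) thresholds."""
    prescribers = defaultdict(set)
    pharmacies = defaultdict(set)
    for member, prescriber, pharmacy in narcotic_claims:
        prescribers[member].add(prescriber)
        pharmacies[member].add(pharmacy)
    return sorted(m for m in prescribers
                  if len(prescribers[m]) >= min_prescribers
                  and len(pharmacies[m]) >= min_pharmacies)

# Hypothetical narcotic claims.
narcotic_claims = [
    ("m1", "dr1", "ph1"), ("m1", "dr2", "ph2"), ("m1", "dr3", "ph3"),
    ("m2", "dr1", "ph1"), ("m2", "dr1", "ph1"), ("m2", "dr1", "ph2"),
]
flagged = multi_source_members(narcotic_claims)
```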
  • 10. Machine learning (ML) = architectures for building algorithms that learn. [Diagram labels: Machine Learning; SVM; Random Forest; K-NN; Neural Network (NN); Deep Learning; CNN; DBN; RBM; RNN.]
  • 11. Random Forest algorithm classifies observations based on the majority vote of many decision trees. [Diagram: a data set of 1,200 observations and 7 variables is sampled with replacement for each tree; the trees' votes produce the risk classification.]
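The two ingredients named on this slide, sampling with replacement and majority voting, can be sketched with a toy ensemble. As a deliberate simplification, one-split decision stumps on a randomly chosen variable stand in for full decision trees; the data and all names are hypothetical.

```python
import random
from collections import Counter

def fit_stump(rows, labels, feature):
    """Pick the threshold on one feature that misclassifies the fewest rows."""
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        for above in (0, 1):  # class predicted at or above the threshold
            errors = sum(
                (above if row[feature] >= threshold else 1 - above) != y
                for row, y in zip(rows, labels)
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, above)
    _, threshold, above = best
    return feature, threshold, above

def predict_stump(stump, row):
    feature, threshold, above = stump
    return above if row[feature] >= threshold else 1 - above

def fit_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feature = rng.randrange(len(rows[0]))            # random variable per stump
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feature))
    return forest

def predict_forest(forest, row):
    """Majority vote across all stumps."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]

# Hypothetical observations: two variables, label 1 = finding of note.
rows = [(0.1, 1), (0.2, 2), (0.15, 1), (0.3, 2),
        (0.8, 9), (0.9, 8), (0.85, 10), (0.7, 9)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(rows, labels)
high_risk = predict_forest(forest, (0.95, 11))
low_risk = predict_forest(forest, (0.05, 0))
```

Because each stump sees a different bootstrap sample, individual stumps can be wrong while the vote as a whole remains stable.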
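  • 12. Random Forest technique shows promise in predicting which investigations will yield findings of note. True Positive Rate: 74%. True Negative Rate: 98%. Confusion matrix (rows: actual, columns: predicted, the orientation that reproduces the stated rates):
                     Predicted 1  Predicted 0
      Actual 1                25            9
      Actual 0                 1           50
    $\mathrm{Risk}_{MA}(i,j) = \dfrac{e^{-\frac{MA(i,j)}{\mathrm{Max\,MA}(i)}\left(1 - \frac{d_i(j)}{r}\right)} - e^{-1}}{1 - e^{-1}}$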
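The stated rates can be checked against the four counts on the slide. The assignment of counts to cells is an assumption: it treats 25 and 9 as the actual-positive row and 1 and 50 as the actual-negative row, which is the reading that reproduces 74% and 98%.

```python
# Counts from the slide, under the assumed orientation.
true_pos, false_neg = 25, 9   # actual 1: predicted 1, predicted 0
false_pos, true_neg = 1, 50   # actual 0: predicted 1, predicted 0

true_positive_rate = true_pos / (true_pos + false_neg)   # 25 / 34
true_negative_rate = true_neg / (true_neg + false_pos)   # 50 / 51

print(f"TPR = {true_positive_rate:.0%}, TNR = {true_negative_rate:.0%}")
```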
  • 13. Random Forest provides a measure of a variable’s importance to classification success. [Chart: variables in the order shown on the slide: Var 6, Var 2, Var 3, Var 4, Var 1, Var 5, Var 7.]
  • 14. Automated review of receipts provides early detection of potential issues. A machine learning algorithm is being used to determine whether a document is a valid receipt. Data lift technology extracts the information.
  • 15. Analytics is one input used to match cost-to-investigate with the anticipated ROI.
  • 16. There are many paths to generating ROI from fraud detection analytics. Business knowledge and a willingness to learn are more important than the tool set; analytics are tools, so keep them sharp; verify that the analyses are relevant, reliable, and responsible; tailor audit investigations to the nature and magnitude of the risk.
  • 17. Jil Tanguay, BSc (Spec), CFI, CRMA, Manager, Claims Assurance Services, Alberta Blue Cross, jtanguay@ab.bluecross.ca; Darryl Humphrey, PhD, PMP, Senior Data Scientist, Claims Assurance Services, Alberta Blue Cross, dhumphrey@ab.bluecross.ca; Yemi Dare-Ode, BSc; Nazanin Tahmasebi, PhD; Wesley Wood, BSc.
  • 18. Random forest classification accuracy stabilizes at approximately 220 trees.
  • 19. Many data sets contain nonlinear relationships which can reduce the effectiveness of some detection methods. Data sets that are linearly separable, even with some noise, work well; some data sets are not linearly separable in their initial state; such data can be mapped to a higher-dimensional space. [Figures: two one-dimensional plots along x; one plot of x² against x.]
  • 20. Map feature space to one of higher dimensionality where the training set is linearly separable. Φ: x → φ(x)
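A small worked example shows why the mapping helps. Here the map φ(x) = (x, x²) is assumed, matching the x² axis that appears in the figure residue on the preceding slide; the data and the threshold are made up.

```python
def phi(x):
    """Assumed feature map: lift a 1-D point onto the parabola (x, x^2)."""
    return (x, x * x)

def separable_1d(xs, labels):
    """True if a single threshold on x splits the two classes."""
    ordered = [lab for _, lab in sorted(zip(xs, labels))]
    changes = sum(a != b for a, b in zip(ordered, ordered[1:]))
    return changes <= 1

# Hypothetical data: the "risky" class sits at both extremes of x.
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
labels = [1 if abs(x) > 1.5 else 0 for x in xs]

before = separable_1d(xs, labels)

# After mapping, a horizontal line in the (x, x^2) plane separates the classes.
mapped = [phi(x) for x in xs]
threshold = 2.25
predicted = [1 if x2 > threshold else 0 for _, x2 in mapped]
```

No single cut on x works because the positive class lies on both sides of the negative class, yet one straight line suffices once x² is available as a second coordinate.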
  • 21. Support Vector Machines find the optimal surface that separates the groups. An SVM maximizes the distance between the hyperplane and the “difficult points” close to the decision boundary; if there are no points near the decision surface, there will be fewer false positives and false negatives; support vectors are the observations near the decision boundary that contribute to determining the boundary; this implies that only support vectors matter, and other training examples are ignorable.
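The notions of margin and support vectors can be made concrete without training an SVM. The sketch below takes a hyperplane as given (a hypothetical one, chosen by hand for this toy data), measures each point's distance to it, and reports the closest points. It checks a boundary rather than finding the optimal one.

```python
import math

def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

# Hypothetical labelled points (label -1 or +1).
points = [((1.0, 1.0), -1), ((0.5, 1.0), -1), ((0.0, 0.0), -1),
          ((3.0, 3.0), 1), ((4.0, 3.5), 1), ((5.0, 5.0), 1)]
w, b = (1.0, 1.0), -4.0  # hand-picked hyperplane x1 + x2 = 4

# A positive value means the point is on the correct side of the boundary.
margins = [y * signed_distance(w, b, x) for x, y in points]
margin = min(margins)
support_vectors = [x for (x, _), m in zip(points, margins)
                   if math.isclose(m, margin)]

# Dropping a point far from the boundary leaves the margin unchanged.
margin_without_far_point = min(margins[:-1])
```

For this data the two closest points lie on opposite sides at equal distance, and removing the far point (5, 5) changes nothing, which is the sense in which only support vectors matter.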
  • 22. Classification accuracy by RD quintile, from the Random Forest and SVM confusion matrices:
      RD quintile  Random Forest accuracy  SVM accuracy
      0-0.20                         0.80          0.85
      0.20-0.40                      0.67          0.54
      0.40-0.60                      0.50          0.54
      0.60-0.80                      0.69          0.67
      0.80-1                         0.91          0.94
  • 23. Neural networks are well-suited to detection tasks. Artificial neural networks are composed of multiple nodes which imitate neurons of the human brain; neurons are connected by links and interact with each other, and each link is associated with a weight; artificial neural networks learn by modifying the weights in response to feedback; deep learning = lots of hidden layers; most often used for images.
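Learning by modifying weights in response to feedback can be shown with a single artificial neuron. This sketch uses the classic perceptron update rule as a stand-in; the slides do not say which training rule or network the team used, and the learning rate, epoch count, and data are arbitrary.

```python
def train_neuron(samples, epochs=20, rate=0.1):
    """Perceptron rule: nudge each weight by rate * error * input."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            activation = sum(w * x for w, x in zip(weights, inputs)) + bias
            output = 1 if activation > 0 else 0
            error = target - output  # feedback: 0 when the neuron is right
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias

def predict(weights, bias, inputs):
    return 1 if sum(w * x for w, x in zip(weights, inputs)) + bias > 0 else 0

# Hypothetical task: fire only when both indicators are present (logical AND).
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_neuron(samples)
outputs = [predict(weights, bias, inputs) for inputs, _ in samples]
```

Weights change only when the output is wrong, so training settles once every sample is classified correctly. A deep network stacks many such units in hidden layers and trains them with a gradient-based rule instead.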
  • 24. Eye movement research indicates that we recognize objects by extracting features.
  • 25. The series of layers between input and output do feature extraction and processing in stages, just as our brains do. [Diagram label: Learning Variables.]
  • 26. Network analysis is used to show the effect of ownership on a pharmacy’s claiming behavior. The assertion is that company policy or implicit guidelines can drive claiming behavior across the pharmacies owned by a single corporate entity; the network is defined by pharmacies registered with the same legal name. Figure key: red = high total $ from risky claiming relative to other pharmacies; large = high proportion of a pharmacy’s $ from risky claiming; close together = similar high total $ at risk.