SlideShare a Scribd company logo
Giridharan Gurumoorthy – Kavi Global
Rajesh Inbasekaran – Kavi Global
Pharmacy Claims – Fraud
Detection
#DS9SAIS
Outline
– Background
• Fraud Waste and Abuse
• Current Status and Challenges
– Project
• Objective
• Steps
– Implementation
– Initial Results
– Next steps
– Q&A
2#DS9SAIS
Background
Healthcare spend
$3.35 Trillion
$10K
Prescriptions
4 billion
11.9%
Pharmacies
67,000
700
Physicians
950K
$6.5B
3#DS9SAIS
Background
4#DS9SAIS
Physician
• Prescribes
Patient
• Fills up
Pharmacy
• Submits Claim
Insurance
Company
• Pays Claim
Fraud Waste and Abuse
• 3% to 10% of healthcare spend
• Possible Actors ( Alone or in collusion)
– Patient
– Physician
– Pharmacy
5#DS9SAIS
Current Status and Challenges
6#DS9SAIS
• Transaction level checks are in place
• Actor level checks are primitive
• Rule based checks are based on historical fraud
– Fraudsters are innovative
• False positives are expensive
Project
Objective
• Identify anomalous / possible
fraudulent actors
• Rank them
– Prioritize investigations
What did we do?
• Identify and rank anomalous
behavior
• Identify and rank anomalous
relationships between actors
• Generate consolidated scores
to rank
7#DS9SAIS
Steps
Data
Cleaning
Data
Summary
Behavior
Scoring
Relationship
Scoring
Consolidated
Score
8#DS9SAIS
Data Summary
9#DS9SAIS
Patient Physician Pharmacy Patient &
Pharmacy
Patient &
Physician
Physician &
Pharmacy
Patient &
Physician &
Pharmacy
Categories
Patient
Count
Physician
Count
Pharmacy
Count
Claim
Volume
Cost
Utilized
Quantity
Dispensed
Drug Spread
Metrics
Timeframe
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
Q1 Q2 Q3 Q4
Anomalous behavior
• Grouping similar behavior
• Group Density
• Distance from Group Center
10#DS9SAIS
K Means Model
11#DS9SAIS
Optimal ‘K’
• Elbow Method
– Compute cluster algorithm for
different values of K
– Calculate WSS for each K and
plot the curve
– Location of the Bend will be
the optimal value of ‘K’
12#DS9SAIS
Anomaly Score
• Density Factor = Size of the cluster / Total Size
• Distance Factor = Distance between data point and
center of cluster / Distance of the farthest point in the
cluster
• Score = Max of two scores
• Actor Score = Sum of level wise scores with weightages
13#DS9SAIS
Example
14#DS9SAIS
Patient Level Score card
Physician Level Score card
Patient ID
Cluster_Size_Factor
Group 1
Cluster_Distance_Factor
Group 1
Cluster_Size_Factor
Group 2
Cluster_Distance_Factor
Group 2
Cluster_Size_Factor
Group 3
Cluster_Distance_Factor
Group 3
Cluster_Size_Factor
Group 4
Cluster_Distance_Factor
Group 4 Anomaly_Score
Patient - 1 1.0000 1.0000 0.9974 1.0000 0.9979 1.0000 0.9979 1.0000 0.9992
Patient - 2 1.0000 0.9985 0.9974 0.9997 0.9979 1.0000 0.9979 1.0000 0.9989
Patient - 3 0.9971 0.9997 0.9974 0.9997 0.9979 0.9998 0.9979 0.9998 0.9987
Patient - 4 0.9971 0.9990 0.9974 0.9997 0.9979 0.9998 0.9979 0.9998 0.9986
Patient - 5 0.9971 0.9972 0.9974 1.0000 0.9979 1.0000 0.9979 1.0000 0.9984
Patient - 6 0.9971 0.9962 0.9974 1.0000 0.9979 1.0000 0.9979 1.0000 0.9983
Physician ID
Cluster_Size_Factor
Group 1
Cluster_Distance_Factor
Group 1
Cluster_Size_Factor
Group 2
Cluster_Distance_Factor
Group 2
Cluster_Size_Factor
Group 3
Cluster_Distance_Factor
Group 3
Cluster_Size_Factor
Group 4
Cluster_Distance_Factor
Group 4 Anomaly_Score
Physician - 1 0.9965 1.0000 0.9973 1.0000 0.9979 1.0000 0.9979 1.0000 1.0000
Physician - 2 0.9997 1.0000 0.9973 1.0000 0.9979 1.0000 0.9979 1.0000 1.0000
Physician - 3 0.9965 1.0000 0.9973 1.0000 0.9979 1.0000 0.9979 1.0000 1.0000
Physician - 4 0.9997 0.9998 0.9973 1.0000 0.9979 1.0000 0.9979 1.0000 1.0000
Physician - 5 0.9965 0.9998 0.9973 1.0000 0.9979 1.0000 0.9979 1.0000 0.9999
Physician - 6 0.9965 0.9998 0.9973 1.0000 0.9979 1.0000 0.9979 1.0000 0.9999
Pharmacy Level Score card
Pharmacy ID
Cluster_Size_Factor
Group 1
Cluster_Distance_Factor
Group 1
Cluster_Size_Factor
Group 2
Cluster_Distance_Factor
Group 2
Cluster_Size_Factor
Group 3
Cluster_Distance_Factor
Group 3
Cluster_Size_Factor
Group 4
Cluster_Distance_Factor
Group 4 Anomaly_Score
Pharmacy - 1 0.9776 1.0000 0.9974 1.0000 0.9973 1.0000 0.9979 1.0000 1.0000
Pharmacy - 2 0.9776 1.0000 0.9974 1.0000 0.9973 1.0000 0.9979 1.0000 1.0000
Pharmacy - 3 0.9776 1.0000 0.9974 1.0000 0.9973 1.0000 0.9979 1.0000 1.0000
Pharmacy - 4 0.9994 0.9998 0.9998 1.0000 0.9998 1.0000 0.9999 1.0000 1.0000
Pharmacy - 5 0.9776 0.9998 0.9974 1.0000 0.9973 1.0000 0.9979 1.0000 0.9999
Pharmacy - 6 0.9776 0.9998 0.9974 1.0000 0.9973 1.0000 0.9979 1.0000 0.9999
Anomalous Relationship
• Analyze
– # of Connected Neighbors
– # of Neighbors’ Neighbors
15#DS9SAIS
GraphX
• Create a graph with
multiple node types
– Physician
– Patient
– Pharmacy
16#DS9SAIS
Calculate Neighbor Count
• First level Neighbor
• Compute Immediate neighbor’s degree from the each vertex
• Second Level Neighbor
• Identify Neighbor’s Neighbor degree and bring to the parent vertex
• Third Level Neighbor
• Identify Second Level Neighbor’s Neighbor degree and bring to the parent
vertex
17#DS9SAIS
Anomaly Score
• Based on position in each count
• Consolidated score = Max of all the counts
18#DS9SAIS
Consolidate Score
• Sum of Behavior Score and Relationship Score with
configurable weightages
19#DS9SAIS
Implementation – How it works?
• Web Application for the use of Analysts
• Features
– Upload Claim Data
– Run models
– View Results – Actor wise ranks
– Action – Tag as False Positive or Initiate Case
– Feedback – Input Investigate status
20#DS9SAIS
Initial Results
• Many findings would have escaped Rule based checks
• Initial investigation results prove less false positives
• Ranking weightages might need some tuning
21#DS9SAIS
Next steps
• Supervised techniques with investigation results
• Use additional data
– Social media – Twitter, Facebook, Review data
– Address and Property data - Zillow
22#DS9SAIS
Q & A
23#DS9SAIS
24#DS9SAIS
Rajesh Inbasekaran : Rajesh@kaviglobal.com
Giridharan Gurumoorthy : Giridharan@kaviglobal.com

More Related Content

Similar to Pharmacy Claims Fraud Detection Using Apache Spark with Rajesh Inbasekaran and Giridharan Gurumoorthy

Alternatives to Call Coverage Per Diems
Alternatives to Call Coverage Per DiemsAlternatives to Call Coverage Per Diems
Alternatives to Call Coverage Per Diems
MD Ranger, Inc.
 
Scan4Safety - Positive Patient ID
Scan4Safety - Positive Patient ID Scan4Safety - Positive Patient ID
Scan4Safety - Positive Patient ID
Carol Read
 
MyRBQM Academy | Webinar Fraud and Sloppiness Detection in Clinical Trials [P...
MyRBQM Academy | Webinar Fraud and Sloppiness Detection in Clinical Trials [P...MyRBQM Academy | Webinar Fraud and Sloppiness Detection in Clinical Trials [P...
MyRBQM Academy | Webinar Fraud and Sloppiness Detection in Clinical Trials [P...
Cyntegrity | Data Science for Clinical Trials
 
Titan Medical Market Potential
Titan Medical Market PotentialTitan Medical Market Potential
Titan Medical Market Potential
NextLevelAnalytics
 
LeanApproach_MedicationOrder_DispensingProcess
LeanApproach_MedicationOrder_DispensingProcessLeanApproach_MedicationOrder_DispensingProcess
LeanApproach_MedicationOrder_DispensingProcessS CG, PMP®, PMI-RMP®
 
Hypothesis testing STATISTICS NOTES ON HONOURS , MAJOR , ACTUARIAL SCIENCE by...
Hypothesis testing STATISTICS NOTES ON HONOURS , MAJOR , ACTUARIAL SCIENCE by...Hypothesis testing STATISTICS NOTES ON HONOURS , MAJOR , ACTUARIAL SCIENCE by...
Hypothesis testing STATISTICS NOTES ON HONOURS , MAJOR , ACTUARIAL SCIENCE by...
SOURAV DAS
 
MyRBQM Academy | Webinar Fraud and Sloppiness Detection in Clinical Trials [P...
MyRBQM Academy | Webinar Fraud and Sloppiness Detection in Clinical Trials [P...MyRBQM Academy | Webinar Fraud and Sloppiness Detection in Clinical Trials [P...
MyRBQM Academy | Webinar Fraud and Sloppiness Detection in Clinical Trials [P...
Cyntegrity | Data Science for Clinical Trials
 
Improving Surgical Safety and Patient Outcomes
Improving Surgical Safety and Patient OutcomesImproving Surgical Safety and Patient Outcomes
Improving Surgical Safety and Patient Outcomes
C Daniel Smith
 
Capstone Presentation.pptx
Capstone Presentation.pptxCapstone Presentation.pptx
Capstone Presentation.pptx
ManikjitChohan
 
San Francisco Bay Area: Top 10 Radiology Clinics & Imaging Centers in 2018
San Francisco Bay Area: Top 10 Radiology Clinics & Imaging Centers in 2018San Francisco Bay Area: Top 10 Radiology Clinics & Imaging Centers in 2018
San Francisco Bay Area: Top 10 Radiology Clinics & Imaging Centers in 2018
GCRclinics
 
Rx15 le tues_330_1_daugherty-simons_2friedman
Rx15 le tues_330_1_daugherty-simons_2friedmanRx15 le tues_330_1_daugherty-simons_2friedman
Rx15 le tues_330_1_daugherty-simons_2friedman
OPUNITE
 
Big data vs the RCT - Derek Angus - SSAI2017
Big data vs the RCT - Derek Angus - SSAI2017Big data vs the RCT - Derek Angus - SSAI2017
Big data vs the RCT - Derek Angus - SSAI2017
scanFOAM
 
Physician Contracting for Small Hospitals
Physician Contracting for Small HospitalsPhysician Contracting for Small Hospitals
Physician Contracting for Small HospitalsMD Ranger, Inc.
 
Physician credentialing process
Physician credentialing processPhysician credentialing process
Physician credentialing process
MGSI - Medical Group Services
 
5 dangerous ideas
5 dangerous ideas5 dangerous ideas
5 dangerous ideasWayne Pan
 
Successehs webinar-stage-2-and-3-mu-130424105245-phpapp02 (1)
Successehs webinar-stage-2-and-3-mu-130424105245-phpapp02 (1)Successehs webinar-stage-2-and-3-mu-130424105245-phpapp02 (1)
Successehs webinar-stage-2-and-3-mu-130424105245-phpapp02 (1)Carla Pitcher
 
Cost savings for insurance
Cost savings for insuranceCost savings for insurance
Cost savings for insurance
Nelson Hendler
 
Measuring & Monitoring Clinical Quality Measures Using Practice Fusion
Measuring & Monitoring Clinical Quality Measures Using Practice FusionMeasuring & Monitoring Clinical Quality Measures Using Practice Fusion
Measuring & Monitoring Clinical Quality Measures Using Practice Fusion
Practice Fusion
 

Similar to Pharmacy Claims Fraud Detection Using Apache Spark with Rajesh Inbasekaran and Giridharan Gurumoorthy (20)

Alternatives to Call Coverage Per Diems
Alternatives to Call Coverage Per DiemsAlternatives to Call Coverage Per Diems
Alternatives to Call Coverage Per Diems
 
A3_Greenbelt_EXAMPLE
A3_Greenbelt_EXAMPLEA3_Greenbelt_EXAMPLE
A3_Greenbelt_EXAMPLE
 
Scan4Safety - Positive Patient ID
Scan4Safety - Positive Patient ID Scan4Safety - Positive Patient ID
Scan4Safety - Positive Patient ID
 
MyRBQM Academy | Webinar Fraud and Sloppiness Detection in Clinical Trials [P...
MyRBQM Academy | Webinar Fraud and Sloppiness Detection in Clinical Trials [P...MyRBQM Academy | Webinar Fraud and Sloppiness Detection in Clinical Trials [P...
MyRBQM Academy | Webinar Fraud and Sloppiness Detection in Clinical Trials [P...
 
Titan Medical Market Potential
Titan Medical Market PotentialTitan Medical Market Potential
Titan Medical Market Potential
 
LeanApproach_MedicationOrder_DispensingProcess
LeanApproach_MedicationOrder_DispensingProcessLeanApproach_MedicationOrder_DispensingProcess
LeanApproach_MedicationOrder_DispensingProcess
 
Hypothesis testing STATISTICS NOTES ON HONOURS , MAJOR , ACTUARIAL SCIENCE by...
Hypothesis testing STATISTICS NOTES ON HONOURS , MAJOR , ACTUARIAL SCIENCE by...Hypothesis testing STATISTICS NOTES ON HONOURS , MAJOR , ACTUARIAL SCIENCE by...
Hypothesis testing STATISTICS NOTES ON HONOURS , MAJOR , ACTUARIAL SCIENCE by...
 
MyRBQM Academy | Webinar Fraud and Sloppiness Detection in Clinical Trials [P...
MyRBQM Academy | Webinar Fraud and Sloppiness Detection in Clinical Trials [P...MyRBQM Academy | Webinar Fraud and Sloppiness Detection in Clinical Trials [P...
MyRBQM Academy | Webinar Fraud and Sloppiness Detection in Clinical Trials [P...
 
NadiaNegron716
NadiaNegron716NadiaNegron716
NadiaNegron716
 
Improving Surgical Safety and Patient Outcomes
Improving Surgical Safety and Patient OutcomesImproving Surgical Safety and Patient Outcomes
Improving Surgical Safety and Patient Outcomes
 
Capstone Presentation.pptx
Capstone Presentation.pptxCapstone Presentation.pptx
Capstone Presentation.pptx
 
San Francisco Bay Area: Top 10 Radiology Clinics & Imaging Centers in 2018
San Francisco Bay Area: Top 10 Radiology Clinics & Imaging Centers in 2018San Francisco Bay Area: Top 10 Radiology Clinics & Imaging Centers in 2018
San Francisco Bay Area: Top 10 Radiology Clinics & Imaging Centers in 2018
 
Rx15 le tues_330_1_daugherty-simons_2friedman
Rx15 le tues_330_1_daugherty-simons_2friedmanRx15 le tues_330_1_daugherty-simons_2friedman
Rx15 le tues_330_1_daugherty-simons_2friedman
 
Big data vs the RCT - Derek Angus - SSAI2017
Big data vs the RCT - Derek Angus - SSAI2017Big data vs the RCT - Derek Angus - SSAI2017
Big data vs the RCT - Derek Angus - SSAI2017
 
Physician Contracting for Small Hospitals
Physician Contracting for Small HospitalsPhysician Contracting for Small Hospitals
Physician Contracting for Small Hospitals
 
Physician credentialing process
Physician credentialing processPhysician credentialing process
Physician credentialing process
 
5 dangerous ideas
5 dangerous ideas5 dangerous ideas
5 dangerous ideas
 
Successehs webinar-stage-2-and-3-mu-130424105245-phpapp02 (1)
Successehs webinar-stage-2-and-3-mu-130424105245-phpapp02 (1)Successehs webinar-stage-2-and-3-mu-130424105245-phpapp02 (1)
Successehs webinar-stage-2-and-3-mu-130424105245-phpapp02 (1)
 
Cost savings for insurance
Cost savings for insuranceCost savings for insurance
Cost savings for insurance
 
Measuring & Monitoring Clinical Quality Measures Using Practice Fusion
Measuring & Monitoring Clinical Quality Measures Using Practice FusionMeasuring & Monitoring Clinical Quality Measures Using Practice Fusion
Measuring & Monitoring Clinical Quality Measures Using Practice Fusion
 

More from Databricks

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
Databricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
Databricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
Databricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
Databricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
Databricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
Databricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
Databricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Databricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
Databricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Databricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
Databricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Databricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
Databricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
Databricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
Databricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
Databricks
 

More from Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

Recently uploaded

一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
mbawufebxi
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
javier ramirez
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
haila53
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
dwreak4tg
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
rwarrenll
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
v3tuleee
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
axoqas
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
ewymefz
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
Oppotus
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
AbhimanyuSinha9
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
g4dpvqap0
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
ahzuo
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
u86oixdj
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
TravisMalana
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
slg6lamcq
 
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptxData_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
AnirbanRoy608946
 
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTESAdjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Subhajit Sahu
 

Recently uploaded (20)

一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
 
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptxData_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
 
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTESAdjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
 

Pharmacy Claims Fraud Detection Using Apache Spark with Rajesh Inbasekaran and Giridharan Gurumoorthy

  • 1. Giridharan Gurumoorthy – Kavi Global Rajesh Inbasekaran – Kavi Global Pharmacy Claims – Fraud Detection #DS9SAIS
  • 2. Outline – Background • Fraud Waste and Abuse • Current Status and Challenges – Project • Objective • Steps – Implementation – Initial Results – Next steps – Q&A 2#DS9SAIS
  • 3. Background Healthcare spend $3.35 Trillion $10K Prescriptions 4 billion 11.9% Pharmacies 67,000 700 Physicians 950K $6.5B 3#DS9SAIS
  • 4. Background 4#DS9SAIS Physician • Prescribes Patient • Fills up Pharmacy • Submits Claim Insurance Company • Pays Claim
  • 5. Fraud Waste and Abuse • 3% to 10% of healthcare spend • Possible Actors ( Alone or in collusion) – Patient – Physician – Pharmacy 5#DS9SAIS
  • 6. Current Status and Challenges 6#DS9SAIS • Transaction level checks are in place • Actor level checks are primitive • Rule based checks are based on historical fraud – Fraudsters are innovative • False positives are expensive
  • 7. Project Objective • Identify anomalous / possible fraudulent actors • Rank them – Prioritize investigations What did we do? • Identify and rank anomalous behavior • Identify and rank anomalous relationships between actors • Generate consolidated scores to rank 7#DS9SAIS
  • 9. Data Summary 9#DS9SAIS Patient Physician Pharmacy Patient & Pharmacy Patient & Physician Physician & Pharmacy Patient & Physician & Pharmacy Categories Patient Count Physician Count Pharmacy Count Claim Volume Cost Utilized Quantity Dispensed Drug Spread Metrics Timeframe Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec Q1 Q2 Q3 Q4
  • 10. Anomalous behavior • Grouping similar behavior • Group Density • Distance from Group Center 10#DS9SAIS
  • 12. Optimal ‘K’ • Elbow Method – Compute cluster algorithm for different values of K – Calculate WSS for each K and plot the curve – Location of the Bend will be the optimal value of ‘K’ 12#DS9SAIS
  • 13. Anomaly Score • Density Factor = Size of the cluster / Total Size • Distance Factor = Distance between data point and center of cluster / Distance of the farthest point in the cluster • Score = Max of two scores • Actor Score = Sum of level wise scores with weightages 13#DS9SAIS
  • 14. Example 14#DS9SAIS Patient Level Score card Physician Level Score card Patient ID Cluster_Size_Factor Group 1 Cluster_Distance_Factor Group 1 Cluster_Size_Factor Group 2 Cluster_Distance_Factor Group 2 Cluster_Size_Factor Group 3 Cluster_Distance_Factor Group 3 Cluster_Size_Factor Group 4 Cluster_Distance_Factor Group 4 Anomaly_Score Patient - 1 1.0000 1.0000 0.9974 1.0000 0.9979 1.0000 0.9979 1.0000 0.9992 Patient - 2 1.0000 0.9985 0.9974 0.9997 0.9979 1.0000 0.9979 1.0000 0.9989 Patient - 3 0.9971 0.9997 0.9974 0.9997 0.9979 0.9998 0.9979 0.9998 0.9987 Patient - 4 0.9971 0.9990 0.9974 0.9997 0.9979 0.9998 0.9979 0.9998 0.9986 Patient - 5 0.9971 0.9972 0.9974 1.0000 0.9979 1.0000 0.9979 1.0000 0.9984 Patient - 6 0.9971 0.9962 0.9974 1.0000 0.9979 1.0000 0.9979 1.0000 0.9983 Physician ID Cluster_Size_Factor Group 1 Cluster_Distance_Factor Group 1 Cluster_Size_Factor Group 2 Cluster_Distance_Factor Group 2 Cluster_Size_Factor Group 3 Cluster_Distance_Factor Group 3 Cluster_Size_Factor Group 4 Cluster_Distance_Factor Group 4 Anomaly_Score Physician - 1 0.9965 1.0000 0.9973 1.0000 0.9979 1.0000 0.9979 1.0000 1.0000 Physician - 2 0.9997 1.0000 0.9973 1.0000 0.9979 1.0000 0.9979 1.0000 1.0000 Physician - 3 0.9965 1.0000 0.9973 1.0000 0.9979 1.0000 0.9979 1.0000 1.0000 Physician - 4 0.9997 0.9998 0.9973 1.0000 0.9979 1.0000 0.9979 1.0000 1.0000 Physician - 5 0.9965 0.9998 0.9973 1.0000 0.9979 1.0000 0.9979 1.0000 0.9999 Physician - 6 0.9965 0.9998 0.9973 1.0000 0.9979 1.0000 0.9979 1.0000 0.9999 Pharmacy Level Score card Pharmacy ID Cluster_Size_Factor Group 1 Cluster_Distance_Factor Group 1 Cluster_Size_Factor Group 2 Cluster_Distance_Factor Group 2 Cluster_Size_Factor Group 3 Cluster_Distance_Factor Group 3 Cluster_Size_Factor Group 4 Cluster_Distance_Factor Group 4 Anomaly_Score Pharmacy - 1 0.9776 1.0000 0.9974 1.0000 0.9973 1.0000 0.9979 1.0000 1.0000 Pharmacy - 2 0.9776 1.0000 0.9974 1.0000 0.9973 1.0000 0.9979 1.0000 1.0000 Pharmacy - 3 0.9776 1.0000 0.9974 1.0000 0.9973 1.0000 0.9979 1.0000 1.0000 Pharmacy - 4 0.9994 0.9998 0.9998 1.0000 0.9998 1.0000 0.9999 1.0000 1.0000 Pharmacy - 5 0.9776 0.9998 0.9974 1.0000 0.9973 1.0000 0.9979 1.0000 0.9999 Pharmacy - 6 0.9776 0.9998 0.9974 1.0000 0.9973 1.0000 0.9979 1.0000 0.9999
  • 15. Anomalous Relationship • Analyze – # of Connected Neighbors – # of Neighbors’ Neighbors 15#DS9SAIS
  • 16. GraphX • Create a graph with multiple node types – Physician – Patient – Pharmacy 16#DS9SAIS
  • 17. Calculate Neighbor Count • First level Neighbor • Compute Immediate neighbor’s degree from the each vertex • Second Level Neighbor • Identify Neighbor’s Neighbor degree and bring to the parent vertex • Third Level Neighbor • Identify Second Level Neighbor’s Neighbor degree and bring to the parent vertex 17#DS9SAIS
  • 18. Anomaly Score • Based on position in each count • Consolidated score = Max of all the counts 18#DS9SAIS
  • 19. Consolidate Score • Sum of Behavior Score and Relationship Score with configurable weightages 19#DS9SAIS
  • 20. Implementation – How it works? • Web Application for the use of Analysts • Features – Upload Claim Data – Run models – View Results – Actor wise ranks – Action – Tag as False Positive or Initiate Case – Feedback – Input Investigate status 20#DS9SAIS
  • 21. Initial Results • Many findings would have escaped Rule based checks • Initial investigation results prove less false positives • Ranking weightages might need some tuning 21#DS9SAIS
  • 22. Next steps • Supervised techniques with investigation results • Use additional data – Social media – Twitter, Facebook, Review data – Address and Property data - Zillow 22#DS9SAIS
  • 24. 24#DS9SAIS Rajesh Inbasekaran : Rajesh@kaviglobal.com Giridharan Gurumoorthy : Giridharan@kaviglobal.com