How Big Data is Reducing Costs and Improving Outcomes in Health Care

© 2017 MapR Technologies
Big Data in Healthcare
Carol McDonald
@caroljmcdonald

The Motivation for Big Data: Poor ROI
•  USA spends a lot more per capita than other countries
•  US health system ranks last among eleven countries (OECD)
–  healthy lives, access, quality, efficiency

Who Knew Healthcare could be so complicated?

Value Based Care & Value Based Reimbursement
Incentives for Technology:
•  Improve coordination and outcomes
•  Shifting from fee-for-service to value-based, data-driven incentives

The Data

Where is the Big Data Opportunity?
According to the McKinsey Global Institute, the big data opportunity lies in:
•  Claims
–  utilization of care
•  Pharmaceutical
–  clinical trials
•  Clinical Data
–  Electronic Medical Records
•  Patient Behavior and Population Health
Data sources (diagram): lab, EMR / EHR, doctor’s notes, claims, images, HL7, social media

Building a Healthcare Data Lake on MapR
Data Lake sources (diagram):
•  Claims: historical procedures, co-morbidities (prof & inst.)
•  Clinical: lab results, vital signs, ...
•  Pharmacy: prescriptions, adherence
•  EMR: Electronic Medical Records, images & text
•  Logs and Notes: Dr. notes, customer call logs, emails
•  3rd Party: licensing, death master, …
•  Additional Data: CB Header data, social, ...

Big Data Use Cases

Patient Data Management
Diagram: data sources (lab, EMR / EHR, doctor’s notes, claims, images, HL7, social media) feed the MapR Converged Data Platform, where unstructured data is analyzed to give analysts and providers a Patient 360 View.

Reducing Fraud Waste and Abuse with Big Data Analytics
•  Healthcare fraud costs more than $60 billion per year
•  UnitedHealthcare:
–  2200% ROI using MapR for fraud
•  Medicare/Medicaid prevented more than $210.7 million in fraud in one year
Diagram: EDI Claim → Machine Learning Model → Fraud Score
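The claim-to-score flow in this slide can be illustrated with a minimal sketch. It assumes a logistic model over a few numeric claim features; the feature names, weights, and bias are hypothetical placeholders, not values from the talk or from any production system, and a real model would learn them from labeled claims.

```python
import math

# Hypothetical weights and bias, for illustration only.
WEIGHTS = {
    "amount_zscore": 0.8,
    "procedures_per_visit": 0.5,
    "provider_prior_flags": 1.2,
}
BIAS = -3.0

def fraud_score(claim_features):
    """Map numeric claim features to a score between 0 and 1 (logistic function)."""
    z = BIAS + sum(w * claim_features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))

# Two made-up claims: one ordinary, one with unusual values on every feature.
typical = {"amount_zscore": 0.1, "procedures_per_visit": 1, "provider_prior_flags": 0}
unusual = {"amount_zscore": 3.0, "procedures_per_visit": 6, "provider_prior_flags": 2}

typical_score = fraud_score(typical)
unusual_score = fraud_score(unusual)
```

Claims whose score exceeds a chosen cutoff would be routed to review; the cutoff is a business decision and is not part of this sketch.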

Predictive Analytics to Improve Outcomes
• Early Diagnosis of sepsis, CHF
• Predicting risk of readmission
• Matching treatments
Early Detection of Congestive Heart Failure
Sun, Jimeng, Large-scale Patient Similarity Learning for health analytics, Georgia Tech

Predictive Analytics/ Machine Learning
•  Aetna Labs predicts the future risk of metabolic syndrome
–  https://www.healthcare-informatics.com/article/how-aetna-using-big-data-give-patients-personalized-care
•  Optum Labs data from 150 million patient records gives insight about what works best
–  http://www.modernhealthcare.com/article/20150926/MAGAZINE/309269979

Real Time Monitoring and Alerts
Diagram: Medical Devices → Streams → Dashboards, Global Analytics & Alerting

Why combine IOT with Machine Learning?
•  Cheaper sensors and machine learning are making it possible for doctors to rapidly apply smart medicine to their patients’ cases
–  https://www.wsj.com/articles/the-smart-medicine-solution-to-the-health-care-crisis-1499443449

•  A Stanford team has shown that a machine-learning model can identify arrhythmias from an EKG better than an expert
–  https://www.technologyreview.com/s/608234/the-machines-are-getting-ready-to-play-doctor/

•  Applying Machine Learning to Live Patient Data
–  https://www.healthitoutcomes.com/doc/applying-machine-learning-to-live-data-0001

Real Time Monitoring Potential
•  CDC: chronic diseases, such as heart disease, are the major causes of sickness and health care costs in the nation
•  McKinsey: Better management of congestive heart failure could reduce treatment costs by a billion dollars annually

•  Connected care ensures quicker sepsis treatment:
–  Blood pressure, pulse rates and oxygen levels from monitoring devices are combined with machine learning to provide alerts
–  http://www.computerweekly.com/news/450422258/Putting-sepsis-algorithms-into-electronic-patient-records

Solution Architecture

What Do We Need to Do?
Diagram: Data Sources (images) → Collect Data (?) → Process Data (?) → Store Data (?) → Serve Data (?)

Collect the Data with NFS mounted on MapR-XD
•  Data Ingest:
–  File based: NFS with MapR-FS
•  Move hot data to $$ storage
•  Move cold data to cheaper MapR-XD
Diagram: Data Sources (images, $$$ storage, RDBMS, data warehouse) → NFS → MapR-FS (Collect Data; unlimited, inexpensive storage)

Collect the Events with MapR Streams
Diagram: Producers → Kafka API → MapR-FS → Kafka API → Consumers
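The producer and consumer roles on this slide can be sketched with an in-memory stand-in for a topic. This is not the Kafka API or MapR Streams; the `Topic` class, the topic name, and the consumer group names are hypothetical and only illustrate that producers append to a log while each consumer group reads from its own offset.

```python
from collections import defaultdict

class Topic:
    """In-memory stand-in for a stream topic: an append-only list of messages."""

    def __init__(self, name):
        self.name = name
        self._log = []
        self._offsets = defaultdict(int)  # next offset to read, per consumer group

    def publish(self, message):
        """Producer side: append a message and return its offset."""
        self._log.append(message)
        return len(self._log) - 1

    def poll(self, group, max_messages=10):
        """Consumer side: return unread messages for a group and advance its offset."""
        start = self._offsets[group]
        batch = self._log[start:start + max_messages]
        self._offsets[group] = start + len(batch)
        return batch

topic = Topic("vitals")  # hypothetical topic name
for reading in ({"patient": "p1", "pulse": 72}, {"patient": "p2", "pulse": 118}):
    topic.publish(reading)

# Two independent consumer groups each see every message.
dashboard_batch = topic.poll("dashboard")
alerting_batch = topic.poll("alerting")
```

Because messages stay in the log after being read, adding a new consumer later does not require any change on the producer side.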

Batch processing
Diagram: Collect Data → MapR-FS → Process Data
•  Spark: parallel processing, high throughput, fast
•  Hive, Pig, MapReduce: slower, but can be simpler for batch file processing

Apache Spark Distributed Datasets
Diagram: a distributed dataset is partitioned across nodes; each node runs an executor that holds one or more partitions (P1 and P3 on one node, P2 and P4 on two others).
Sample partition contents:
Partition 1: 8213034705, 95, 2.927373, jake7870, 0……
Partition 2: 8213034705, 115, 2.943484, Davidbresler2, 1….
Partition 3: 8213034705, 100, 2.951285, gladimacowgirl, 58…
Partition 4: 8213034705, 117, 2.998947, daysrus, 95….
•  Data read into Memory Cache
•  Partitioned across a cluster
•  Operated on in parallel
•  Cached in memory for iterations
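The idea of partitioning a dataset and operating on the partitions in parallel can be sketched in plain Python. This is not Spark code: a thread pool stands in for executors, the rows use only the first four fields visible on the slide, and the column meanings (auction id, bid, bid time, bidder) are an assumption.

```python
from concurrent.futures import ThreadPoolExecutor

# First four fields of the sample rows on the slide; column meanings are assumed.
rows = [
    (8213034705, 95, 2.927373, "jake7870"),
    (8213034705, 115, 2.943484, "Davidbresler2"),
    (8213034705, 100, 2.951285, "gladimacowgirl"),
    (8213034705, 117, 2.998947, "daysrus"),
]

def partition(records, num_partitions):
    """Round-robin split of records into num_partitions lists."""
    return [records[i::num_partitions] for i in range(num_partitions)]

def max_bid(part):
    """Work done independently on one partition: its highest bid (second field)."""
    return max(row[1] for row in part)

def parallel_max_bid(records, num_partitions=2):
    """Compute the highest bid by combining per-partition results."""
    parts = [p for p in partition(records, num_partitions) if p]
    with ThreadPoolExecutor(max_workers=len(parts)) as pool:
        partial_results = list(pool.map(max_bid, parts))
    return max(partial_results)

highest = parallel_max_bid(rows)
```

The pattern is the same one the bullets describe: each partition is processed on its own, and only the small per-partition results are combined at the end.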

Streaming Data
Stream processing
•  Scalable, high-throughput stream processing of live data
Diagram (Process Data): raw → enriched → alerts
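The raw, enriched, and alerts stages can be sketched as chained Python generators. This is a simplified illustration, not a streaming engine; the field names, the patient reference data, and the pulse threshold are hypothetical, and the threshold is not a clinical recommendation.

```python
def enrich(raw_events, patients):
    """Enrichment stage: join each raw reading with patient reference data."""
    for event in raw_events:
        yield {**event, **patients.get(event["patient_id"], {})}

def alerts(enriched_events, max_pulse):
    """Alert stage: emit an alert for each reading above the threshold."""
    for event in enriched_events:
        if event["pulse"] > max_pulse:
            yield {
                "patient_id": event["patient_id"],
                "ward": event.get("ward"),
                "reason": f"pulse {event['pulse']} > {max_pulse}",
            }

# Made-up readings and reference data.
raw = [{"patient_id": "p1", "pulse": 72}, {"patient_id": "p2", "pulse": 131}]
patients = {"p1": {"ward": "A"}, "p2": {"ward": "ICU"}}

alert_list = list(alerts(enrich(iter(raw), patients), max_pulse=120))
```

Each stage consumes one event at a time, so the same functions would work on an unbounded iterator of live readings.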

Streaming Analytics

Store the Data with MapR-DB
Diagram: a table is split into partitions by key range; each partition holds rows with the columns Key, colB, colC.
Fast reads and writes by key! Data is automatically partitioned by key range!
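Routing a row to a partition by key range can be sketched with a sorted list of split keys and a binary search. This is an illustration of the idea only, not MapR-DB's implementation; the `KeyRangeTable` class and the fixed split keys are hypothetical, whereas the slide says the real partitioning is automatic.

```python
import bisect

class KeyRangeTable:
    """Toy table whose rows are spread over partitions by sorted key range."""

    def __init__(self, split_keys):
        self.split_keys = sorted(split_keys)
        # One more partition than split keys: (-inf, s0), [s0, s1), ..., [sN, +inf)
        self.partitions = [dict() for _ in range(len(self.split_keys) + 1)]

    def partition_for(self, key):
        """Binary search for the index of the partition whose range contains key."""
        return bisect.bisect_right(self.split_keys, key)

    def put(self, key, row):
        self.partitions[self.partition_for(key)][key] = row

    def get(self, key):
        """Read by key: only one partition is consulted."""
        return self.partitions[self.partition_for(key)].get(key)

table = KeyRangeTable(split_keys=["h", "p"])  # hypothetical split points
table.put("anna", {"colB": "val", "colC": "val"})
table.put("maria", {"colB": "val", "colC": "val"})
table.put("zoe", {"colB": "val", "colC": "val"})
```

A lookup touches a single partition regardless of table size, which is why reads and writes by key stay fast as data grows.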

Store Lots of Data with NoSQL MapR-DB
Storage model comparison:
•  RDBMS: normalized schema → joins for queries can cause a bottleneck
•  MapR-DB: de-normalized schema → data that is read together is stored together
Diagram: MapR-DB rows (Key, colB, colC) stored together in each partition
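The difference between the two storage models can be sketched with plain Python data structures. The patient and lab records are made up, and the dictionaries only stand in for tables and documents; this is not RDBMS or MapR-DB code.

```python
# Normalized: two "tables" that must be joined at query time.
patients = {"p1": {"name": "Pat Example"}}
lab_results = [
    {"patient_id": "p1", "test": "glucose", "value": 5.4},
    {"patient_id": "p2", "test": "glucose", "value": 4.9},
    {"patient_id": "p1", "test": "hba1c", "value": 6.1},
]

def read_normalized(patient_id):
    """Join: scan every lab row to find the ones for this patient."""
    labs = [
        {"test": r["test"], "value": r["value"]}
        for r in lab_results
        if r["patient_id"] == patient_id
    ]
    return {"name": patients[patient_id]["name"], "labs": labs}

# De-normalized: one document per key holds everything that is read together.
patient_docs = {
    "p1": {
        "name": "Pat Example",
        "labs": [
            {"test": "glucose", "value": 5.4},
            {"test": "hba1c", "value": 6.1},
        ],
    }
}

def read_denormalized(patient_id):
    """Single lookup by key, no join."""
    return patient_docs[patient_id]
```

Both reads return the same result, but the de-normalized one does a single key lookup while the normalized one scans and joins.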

What is Drill?
•  SQL engine on “everything”
•  Files: JSON, CSV, Parquet
•  Structured formats – Ex: parquet
•  Ecosystem components – HBase, MapR-DB, Hive
•  Schema optional
•  Interactive response times

Apache Drill Architecture
•  massively parallel processing execution engine
•  distributed query processing

What Do We Need to Do?
Diagram: Data Sources → Collect Data → Process Data → Store Data → Serve Data, with components filled in: MapR-FS, stream topic

Customer Data Lakes

MapR Healthcare Customers
Customer use cases (diagram):
•  Delivers clinical intelligence to healthcare providers
•  Sepsis control based on real time patient data
•  Genomic data platform
•  Research grant analysis
•  80+ use cases; FWA, …
•  Genomics analysis
•  Radiology analytics
•  Customized solutions for value-based care
•  MRI manufacturer
•  Novartis

MapR Healthcare Architecture

Data Lake Architectures
•  Agile, self-service data exploration
•  ETL into operational reporting formats (e.g., Parquet)
•  Multi-tenancy: job/data placement control, volumes
•  Access controls: file, table, column, column family, doc, sub-doc levels
•  Sources: labs, claims, pharmacy, EHR
•  Auditing: compliance, analyze user accesses
•  Snapshots: track data lineage and history
•  Table Replication: global multi-master, business continuity
MapR Converged Data Platform: Enterprise Storage (MapR-FS), Database (MapR-DB), Event Streaming (MapR Streams)
•  MapR-DB: time series, structured data, JSON
•  MapR-XD: unstructured data, NFS / raw files
•  MapR Event Streams: real-time event data

Valence Health
Population Health SaaS for 85,000 doctors and 135 hospitals
•  3,000 inbound data feeds
–  Labs, EHR, claims…
Business Problem:
•  ETL for 20 million lab records took 22 hours to process.
Solution with MapR:
•  With NFS, 20 million lab records now take 20 minutes with less hardware
•  https://www.cioreview.com/news/valence-health-cuts-down-processing-time-and-drives-customer-satisfaction-with-mapr-nid-11084-cid-15.html

UnitedHealthcare Optum
MapR Data Lake: a single platform to analyze claims, prescriptions…
•  NFS to ingest 1 million claims, 10 terabytes per day
•  2200% ROI from machine learning for Payment Integrity
•  Machine learning for improving outcomes: diabetes, reduce readmissions…

Baptist Health South Florida
Problem:
•  Oracle too expensive for big data
•  Need a common data platform for patient history
Solution:
1.  MapR data lake
2.  Offload cold data from Oracle ($$) to MapR over NFS
3.  Integration with EMR
4.  Admission/Readmission prediction
5.  Early sepsis detection/notification
6.  Real time monitoring

Use Case: Streaming System of Record for Healthcare
•  Objective:
–  Build a flexible, secure healthcare information exchange
•  Challenges:
–  Many different data models
–  Security and privacy issues
–  HIPAA compliance

Solution: Streaming System of Record for Healthcare
•  Solution:
–  Streaming system of record
•  Secure
•  Immutable
•  Rewindable
•  Auditable
•  Materialized views continuously computed
•  Selective cross data center replication
Diagram: applications write records (6 5 4 3 2 1) to a stream topic, the streaming system of record; microservices consume the stream to maintain materialized views (Search, Graph DB, JSON, HBase) that are served through an API.

Streaming System of Record for Healthcare
Case Study: Liaison Technologies
Diagram: raw data enters through an API and a pre-processor and is written to a document log (MapR-FS). Workflows build materialized views from the log in a key/value store (MapR-DB), a search engine, a graph DB, and a time series DB, with CEP alongside. Microservices and apps read from these views.
MapR-ES as Immutable Log
MapR Event Streams (MapR-ES)
•  Immutable log for all data ingested or consumed.
•  Events become system of record, processed by consumers based on their permissions.
MapR-ES powers compliance-ready lineage:
•  Immutability. MapR-ES throws no data away.
•  Auditing. Who wrote/read events?
•  Rewind. What was status of data two days ago?
•  Replay. Rebuild derivative data stores.
Auditors want to see:
•  Data lineage. Where data came from, how it got there.
•  Audit logging. Who wrote to, updated, or read the data.
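Immutability, rewind, replay, and auditing can be illustrated with a small append-only log. This is a sketch of the concepts only, not MapR-ES; the `EventLog` class, the writer names, the keys, and the integer timestamps are hypothetical, and the sketch assumes events are appended in timestamp order.

```python
class EventLog:
    """Append-only log: events are added, never updated or deleted."""

    def __init__(self):
        self._events = []  # (timestamp, writer, key, value), in arrival order

    def append(self, timestamp, writer, key, value):
        self._events.append((timestamp, writer, key, value))

    def replay(self, until=None):
        """Rebuild a key/value view from the log.

        Passing `until` rewinds: only events up to that timestamp are applied.
        Assumes events were appended in timestamp order.
        """
        view = {}
        for timestamp, _writer, key, value in self._events:
            if until is not None and timestamp > until:
                break
            view[key] = value
        return view

    def writers(self, key):
        """Audit: who wrote events for this key, in order."""
        return [writer for _ts, writer, k, _value in self._events if k == key]

log = EventLog()
log.append(1, "admissions", "p1:status", "admitted")
log.append(2, "ehr", "p1:status", "in treatment")
log.append(3, "ehr", "p1:status", "discharged")

current_view = log.replay()         # replay: rebuild the derived store
earlier_view = log.replay(until=2)  # rewind: state as of timestamp 2
audit_trail = log.writers("p1:status")
```

Because the log itself is never modified, any derived store can be discarded and rebuilt, and the state at an earlier time can be answered from the same data.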

Q&A
@mapr
https://www.mapr.com/blog/author/carol-mcdonald
Engage with us!
mapr-technologies
