Bob Rogers, PhD, Chief Scientist and Co-founder at Apixio, and Vishnu Vyas, Principal Scientist at Apixio, will be presenting on October 30, 2013. They will describe use cases in which Apixio is using NoSQL and Hadoop to deliver powerful risk assessment results based on unstructured data in electronic health record systems.
4. Overview
• What is wrong with healthcare?
• What is ObamaCare?
• What does patient data look like?
• Risk Adjustment use case
• Care Network use case
• Apixio’s Big Data solutions
5. Poll
Are you a:
A. Programmer
B. Data scientist
C. Manager
D. Health IT technologist
E. Other?
13. Decision Support Fails Without Access to Required Clinical Data
How is Splenectomy documented?
• Coded: 29%
• Non-coded: 71%
% with Pneumococcal Vaccine
• Coded History: 54%
• No Coded History in EHR: 17% (3x lower)
14. Poll:
What percent of the key clinical data do you think is missing from the coded layer?
A. 10-25%
B. 25-50%
C. 50-75%
D. 75+%
22. How Much Data Is There?
Sources: EHR Structured, EHR Text, EHR Scanned, Claims, RAPS
200,000 patients over 5 years: 10 TB
Structured: 13 M unique codes (4.8 M CPT, 4.8 M ICD9)
Narrative: 338 M unique codes (98 M CPT, 120 M ICD9)
32. Apixio Architecture High Level
Diagram components:
• Sources: Provider files, EHR coded data, EHR text documents, EHR scan documents
• Client Ingest Pipeline: Parse, OCR, Norm., Load
• Patient Object Model, API
• Clinical Knowledge Exchange: General Event Stream, HCC Event Stream, Quality Event Stream, Referral Event Stream, 3rd Party Event Stream
• Application: Care Optimizer, HCC Optimizer, Quality Optimizer
• Claims, Eligibility
33. Apixio Platform Physical Architecture
Diagram components: External Clients; End Users; Apixio Pipeline; Receiver (HTTP); Web Tier; Java/Python; Metrics (Graphite); Apixio REST API; Logging (Hive/Trace CF); Job Control; Compute; Pipeline; Applications; Experimental; Infrastructure; Audit (Trace CF); Logging; Persistence (Cassandra, Hive/HDFS, S3)
34. Apixio Platform Logical Architecture
L0
• Append Only Model in Cassandra
• Document Based
L1
• Event Based Append Only Model
• Transient (Stored in HDFS)
• Used for Inference
L2
• Application Specific Data Model
• Optimized for Quick Retrieval
35. L0 – Document Level
• Stored in Cassandra
• 2 column families per customer
• Append only
Documents Column Family
Row key: ApixioID
Columns: DOCID1, DOCID2, DOCID3, each holding a Partial Patient Object
Indices Column Family (2 types of data)
Row key DocID:<DOCID>, column ApixioID, value APIXIOID
Row key DocHash:<HASH>, column ApixioID, value APIXIOID
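The two column families above can be pictured with a small in-memory sketch. This is not Apixio's code: the class name `L0Store`, its method names, and the use of SHA-256 for the document hash are all assumptions made for illustration, and plain dictionaries stand in for Cassandra.

```python
import hashlib
from collections import defaultdict


class L0Store:
    """In-memory stand-in for the per-customer Documents and Indices column families."""

    def __init__(self):
        # Documents CF: ApixioID -> {DocID: partial patient object}
        self.documents = defaultdict(dict)
        # Indices CF: "DocID:<id>" or "DocHash:<hash>" -> ApixioID
        self.indices = {}

    def append_document(self, apixio_id, doc_id, content, partial_patient):
        """Append a document; existing columns are never overwritten (append only)."""
        if doc_id in self.documents[apixio_id]:
            raise ValueError("append only: document already stored")
        # SHA-256 is an assumption; the slides do not say which hash is used.
        doc_hash = hashlib.sha256(content).hexdigest()
        self.documents[apixio_id][doc_id] = partial_patient
        self.indices[f"DocID:{doc_id}"] = apixio_id
        self.indices[f"DocHash:{doc_hash}"] = apixio_id
        return doc_hash

    def patient_for_doc(self, doc_id):
        """Look up the ApixioID that owns a document ID."""
        return self.indices.get(f"DocID:{doc_id}")

    def patient_for_hash(self, doc_hash):
        """Look up the ApixioID that owns a document with this content hash."""
        return self.indices.get(f"DocHash:{doc_hash}")
```

The hash index would let an ingest step notice that the same document content has already been loaded, which is one plausible reason to keep both index types.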
36. L1 – Event Streams
Cassandra
HIVE/HDFS
An event is an assertion (fact) about a specific subject (patient) at a specific time.
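As a rough illustration of that definition, an event can be modeled as an immutable record with a subject, a fact, and a time. The field names and the `source` field are hypothetical; the slides do not show the actual event schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)  # frozen: an event is never modified once asserted
class Event:
    subject: str      # patient identifier
    fact: str         # the assertion being made
    time: datetime    # when the assertion holds
    source: str = ""  # hypothetical provenance field (e.g. a document ID)


def append_event(stream, event):
    """Append-only stream: events are added, never updated or removed."""
    stream.append(event)
    return stream
```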
37. Event Extraction & Inference
Mapper: converts documents/patients to events
Reducer: combines multiple events to create new events
Data stores in the diagram: Cassandra, Hive/HDFS
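A minimal sketch of the mapper and reducer roles, run in plain Python rather than on Hadoop. The document shape, the function names, and the "vaccine gap" inference rule are assumptions chosen only to echo the splenectomy example from earlier in the deck.

```python
from collections import defaultdict


def map_document(doc):
    """Mapper: convert one document into (patient, event) pairs."""
    for term in doc["terms"]:
        yield doc["patient"], {"fact": term, "time": doc["date"]}


def reduce_events(patient, events):
    """Reducer: combine a patient's events to create new, derived events."""
    facts = {e["fact"] for e in events}
    derived = []
    # Hypothetical rule: splenectomy without a recorded pneumococcal vaccine.
    if "splenectomy" in facts and "pneumococcal vaccine" not in facts:
        latest = max(e["time"] for e in events)  # ISO dates sort as strings
        derived.append({"fact": "vaccine gap", "time": latest})
    return derived


def run(docs):
    """Group mapper output by patient, then apply the reducer to each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for patient, event in map_document(doc):
            grouped[patient].append(event)
    return {p: reduce_events(p, evs) for p, evs in grouped.items()}
```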
38. Event Extraction & Inference
Functional composition of extractors/transformers gives us a scalable, flexible inference engine.
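One way to read "functional composition" here: each extractor or transformer is a function over a list of records, and a pipeline is simply their composition. The stage names below (`extract_terms`, `normalize`, `dedupe`) are hypothetical examples, not Apixio's actual stages.

```python
from functools import reduce


def compose(*stages):
    """Compose stages left to right; each stage maps a list to a list."""
    return lambda items: reduce(lambda acc, stage: stage(acc), stages, items)


def extract_terms(docs):
    """Extractor: turn documents into events."""
    return [{"subject": d["patient"], "fact": t} for d in docs for t in d["terms"]]


def normalize(events):
    """Transformer: trim and lowercase each fact."""
    return [{**e, "fact": e["fact"].strip().lower()} for e in events]


def dedupe(events):
    """Transformer: drop repeated (subject, fact) pairs, keeping first occurrences."""
    seen, unique = set(), []
    for e in events:
        key = (e["subject"], e["fact"])
        if key not in seen:
            seen.add(key)
            unique.append(e)
    return unique


pipeline = compose(extract_terms, normalize, dedupe)
```

Because every stage has the same shape, adding a new inference step means adding one more function to the `compose` call.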
39. Auditing
• Access information stored in a tracing CF in Cassandra
• Append only
• Keyed by document
Audit Column Family
Row key: DocID
Columns: one per timestamp (Timestamp1, Timestamp2, ...), each holding Activity Info
Example timeline: Parsing, User Access, User Access
We can reconstruct the timeline of activity on any document once it hits our system.
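The audit layout above amounts to an append-only log keyed by document, from which a timeline can be rebuilt by sorting on timestamp. The sketch below keeps the rows in memory; the class and method names are assumptions, and integer timestamps are used only to keep the example short.

```python
from collections import defaultdict


class AuditTrail:
    """In-memory stand-in for the audit column family (row key: DocID)."""

    def __init__(self):
        self._rows = defaultdict(list)  # DocID -> [(timestamp, activity info)]

    def record(self, doc_id, timestamp, activity):
        """Append one activity entry; existing entries are never changed."""
        self._rows[doc_id].append((timestamp, activity))

    def timeline(self, doc_id):
        """Return the document's activity ordered by timestamp."""
        return sorted(self._rows.get(doc_id, []), key=lambda row: row[0])
```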
40. What happens when something goes wrong?
• Comprehensive logging through custom appenders (log4j)
• All pipeline-level events are logged to a trace column family
• Real-time metrics logged through Graphite
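The slides name log4j custom appenders, which are Java. As a loose analogue only, the sketch below uses a custom `logging.Handler` from Python's standard library to append every log record to a list that stands in for the trace column family. The handler name, row fields, and logger name are assumptions.

```python
import logging


class TraceHandler(logging.Handler):
    """Custom handler (the Python counterpart of an appender) that appends
    each record to a list standing in for a trace column family."""

    def __init__(self, trace):
        super().__init__()
        self.trace = trace

    def emit(self, record):
        self.trace.append({
            "time": record.created,       # epoch seconds set by the logging module
            "logger": record.name,
            "level": record.levelname,
            "message": record.getMessage(),
        })


trace_rows = []
log = logging.getLogger("pipeline.parse")  # hypothetical logger name
log.setLevel(logging.INFO)
log.propagate = False                      # keep the example's output self-contained
log.addHandler(TraceHandler(trace_rows))
log.info("parsed document %s", "DOCID1")
```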