Today's customers demand applications that integrate intelligently with data from mobile, social media and cloud sources. A system of engagement meets these expectations by applying data and analytics drawn from an array of master systems. The enormous scale and performance required overwhelm relational approaches, but we can use MongoDB to meet the challenge. We'll learn to capture and transmit data changes among disparate systems, expose batch data as interactive operational queries and build systems with strong separation of concerns, agility and flexibility.
Part 2 In The Data Management Series
Engaging with your data – Data integration, Capture data changes
• Part 1 – From Relational to MongoDB
• Part 2 – Conquering Data Proliferation
• Part 3 – Bulletproof Data Management
Result
• Data walled off in "silos"
• Can't get a complete picture
• Have to "swivel chair" from system to system
• Hard to find new avenues to add value
• Frustrated ops
• Frustrated customers
Example
• 20+ million Veterans in the US today
• 250,000+ employees at Veterans Affairs
• $3.9 billion for IT in 2015 budget
• What happens when a Veteran has to change their
address with the VA?
• How does a doctor see a single view of a Veteran's
health record?
Big Wave of Change Happening
Today's Systems of Record were
yesterday's Systems of Engagement!
Enterprise IT Transition From
• Systems of Record
To the Next Stage
• Systems of Engagement
Definition
• Incorporate technologies which encourage peer
interactions
• More decentralized
• More options for infrastructure especially cloud
• Enable new / faster interactions
Notional Architecture
• Systems of Engagement layer: Data Services; Data Processing (Integration, Analytics, etc.)
• Central engagement database: Master Data, Raw Data, Integrated Data
• Built on top of the existing Systems of Record
Many Complexities to Tackle
• Data Extraction (ETL)
• Change Data Capture (CDC)
• Data Governance
• Data Lineage
– Versioning
– Merging changes
• Security / Entitlements
Focus for Today
• Data Extraction (ETL)
• Change Data Capture (CDC)
• Data Governance
• Data Lineage
– Versioning
– Merging changes
• Security / Entitlements
Don't Boil the Ocean
• Information is often spread across multiple systems of
record
• Start with a read-only view of that information
• Target high value/impact data – "moments of
engagement"
Example – Single View of a Health Record
• Veteran's view
• Doctor's view
• Case worker's view
Single View Architecture
• Same notional architecture as before: Data Services and Data Processing on top of a central engagement database
• ETL pulls records from each System of Record into the engagement database
Single View – Why MongoDB?
• Dynamic schema
• Rich querying
• Aggregation framework
• High scale/performance
• Auto-sharding
• Map-reduce capability (Native MR or Hadoop Connector)
• Enterprise Security Features
Systems of Record Data Model
• Continuity of Care (CCR) XML docs
• Pulled some examples from
http://googlehealthsamples.googlecode.com/svn/trunk/CCR_samples
...
<Immunizations>
  <Immunization>
    <CCRDataObjectID>BB0022</CCRDataObjectID>
    <DateTime>
      <Type>
        <Text>Start date</Text>
      </Type>
      <ExactDateTime>1998-06-13T05:00:00Z</ExactDateTime>
    </DateTime>
    <Source>
      <Actor>
        <ActorID>Jane Smith</ActorID>
        <ActorRole>
          <Text>Ordering clinician</Text>
        </ActorRole>
      </Actor>
    </Source>
    ...
Systems of Record Data Model
...
<Medications>
  <Medication>
    <CCRDataObjectID>52</CCRDataObjectID>
    <DateTime>
      <Type>
        <Text>Prescription Date</Text>
      </Type>
      <ExactDateTime>2007-03-09T12:00:00Z</ExactDateTime>
    </DateTime>
    <Type>
      <Text>Medication</Text>
    </Type>
    <Source>
      <Actor>
        <ActorID>Rx History Supplier</ActorID>
      </Actor>
    </Source>
    <Product>
      <ProductName>
        <Text>TIZANIDINE HCL 4 MG TABLET TEV</Text>
        <Code>
          <Value>-1</Value>
          <CodingSystem>omi-coding</CodingSystem>
          <Version>2005</Version>
          ...
Engagement Data Model
• Leverage dynamic schema / flexible data model
• Use an envelope/wrapper pattern
– Source documents wrap Metadata, Master Data / Common Data Model fields and the Source Data
– Integrated documents wrap Metadata and the Integrated Data
Data Flow
1. Read the most recent CCRs from each source system
2. Create a source document for each CCR in our system of engagement database
   a. Transform XML to JSON for the source data
   b. Record the system and date in the metadata
   c. Pull out the patient's identifying information to the common data
   d. Generate an Id for the raw file
3. Store the original CCR XML into GridFS
4. After each source document is created, update the integrated document for the patient
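A minimal sketch of step 2 of the data flow — wrapping one parsed CCR in the envelope pattern. This is an illustrative in-memory version: the `make_source_document` helper, the sample CCR dict, and the generated `raw_id` (standing in for a GridFS file id) are all assumptions, not part of the original system.

```python
from datetime import datetime, timezone
import uuid

def make_source_document(ccr, system):
    """Wrap one parsed CCR in the envelope pattern: meta + common + source."""
    return {
        "meta": {
            "system": system,                          # which system of record it came from
            "lastUpdate": datetime.now(timezone.utc),  # ingestion timestamp
        },
        # common: identifying fields shared across all source systems
        "common": {"patient": ccr["Patient"]["ActorID"]},
        # source: the full CCR, roughly transformed from XML into a dict
        "source": ccr,
        # pointer to the original CCR XML stored in GridFS (here just a generated id)
        "raw_id": str(uuid.uuid4()),
    }

doc = make_source_document(
    {"Patient": {"ActorID": "D6E5D510-592D-C613-DB46"}}, system="EHR1")
```

In a real pipeline the envelope would be inserted into the engagement collection and the raw XML written to GridFS before the pointer is recorded.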
Engagement Data Model - Metadata
{
_id : ObjectId("556b92b83f7e775b8e92b30a"),
meta : {
system : "EHR1",
lastUpdate : ISODate(...)
...
},
common : { ... },
source : { ... },
raw_id : "..."
}
Engagement Data Model - Source
{
  _id : ObjectId("556b92b83f7e775b8e92b30a"),
  ...
  source : {
    ...
    Immunizations : {
      Immunization : {
        CCRDataObjectID : "BB0022",
        DateTime : {
          Type : {
            Text : "Start date"
          },
          ExactDateTime : "1998-06-13T05:00:00Z"
        },
        Source : {
          Actor : {
            ActorID : "Jane Smith",
            ActorRole : {
              Text : "Ordering clinician"
            }
          }
        },
        ...
      },
      ...
    }
  },
  ...
}
Engagement Data Model - Common
{
_id : ObjectId("556b92b83f7e775b8e92b30a"),
...
common : {
patient : "D6E5D510-592D-C613-DB46..."
},
...
}
Engagement Data Model - Integrated
{
_id : ObjectId("556b92b83f7e775b8e92b30d"),
...
meta : {
  lastUpdate : ISODate(...),
  integrated : [
    { _id : ObjectId("...a") },
    { _id : ObjectId("...b") }
  ]
},
common : { ... },
...
}
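Step 4 of the data flow — rolling a patient's source documents up into one integrated document — can be sketched like this. The `build_integrated` helper and its merge rules (newest `lastUpdate` wins; `common` taken from the first document) are illustrative assumptions; a real implementation would reconcile fields per the enterprise data model.

```python
from datetime import datetime, timezone

def build_integrated(source_docs):
    """Roll a patient's source documents up into one integrated document."""
    return {
        "meta": {
            # newest source update wins as the integrated lastUpdate
            "lastUpdate": max(d["meta"]["lastUpdate"] for d in source_docs),
            # record which source documents were integrated (lineage)
            "integrated": [{"_id": d["_id"]} for d in source_docs],
        },
        # identifying fields are the same across the patient's documents
        "common": source_docs[0]["common"],
    }

docs = [
    {"_id": "a", "meta": {"lastUpdate": datetime(2015, 6, 1, tzinfo=timezone.utc)},
     "common": {"patient": "D6E5D510-592D-C613-DB46"}},
    {"_id": "b", "meta": {"lastUpdate": datetime(2015, 6, 2, tzinfo=timezone.utc)},
     "common": {"patient": "D6E5D510-592D-C613-DB46"}},
]
integrated = build_integrated(docs)
```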
Single View Enables New Interactions
• Deliver faster
• Deliver to new applications (mobile, etc.)
• Improve services
• New analytics
Changing Data
• Now that data is easy to get to, users will want to make
changes
• With single view, can change data in the source systems
of record
• Remember the change of address scenario?
Example – Change of Address
• Enter in different systems
• Call different parts of the organization
• What if you have dependents that
live with you?
Capture Data Changes
• Same architecture, now with a message bus (Apache Kafka) alongside the ETL
• The bus propagates record changes between the engagement database and the Systems of Record
Engagement Data Model - Metadata
{
_id : ObjectId("556c1122c9c8f48313553be5"),
meta : {
system : "PatientRecords",
lastUpdate : ISODate(...),
version : 2,
lineage : { ... },
...
},
common : { ... },
source : { ... }
}
Engagement Data Model - Source
{
_id : ObjectId("556c1122c9c8f48313553be5"),
...
source : {
patientId : "D6E5D510-592D-C613-DB46...",
address1 : "John Smith",
address2 : null,
city : "New York",
state : "NY",
zip : "10007"
},
...
}
Engagement Data Model - Common
{
_id : ObjectId("556c1122c9c8f48313553be5"),
...
common : {
patient : "D6E5D510-592D-C613-DB46...",
address : {
addr1 : "John Smith",
city : "New York",
state : "NY",
zip : "10007"
}
},
...
}
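The common model above omits the empty addr2 field rather than storing a null. A tiny illustrative helper for that mapping (the function name and the source field names mirror the example above, but are assumptions, not the actual pipeline code):

```python
def to_common_address(source):
    """Map a source-system address to the common model, dropping null fields."""
    mapping = {
        "addr1": source.get("address1"),
        "addr2": source.get("address2"),
        "city": source.get("city"),
        "state": source.get("state"),
        "zip": source.get("zip"),
    }
    # In a document model we can simply omit missing fields
    # instead of carrying explicit nulls from the source system.
    return {k: v for k, v in mapping.items() if v is not None}

addr = to_common_address({"address1": "John Smith", "address2": None,
                          "city": "New York", "state": "NY", "zip": "10007"})
```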
Systems of Record Data Model
• Address records can be in different systems
• Each system can be notified of the change to the record
Update Process
1. User accesses an application to change their address
2. User updates their address in the System of
Engagement
3. The address change is broadcast to any Systems of
Record that have registered
4. An adapter applies the address change to the System
of Record in an application-specific manner
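Steps 3 and 4 of the update process are a publish/subscribe flow. In production the bus would be Apache Kafka; the sketch below uses an in-memory stand-in, and the two adapter registrations are hypothetical systems of record:

```python
class ChangeBus:
    """Minimal in-memory stand-in for a message bus such as Kafka."""
    def __init__(self):
        self.subscribers = []          # registered system-of-record adapters

    def register(self, adapter):
        self.subscribers.append(adapter)

    def broadcast(self, event):
        # Each adapter applies the change to its system in its own,
        # application-specific manner.
        for adapter in self.subscribers:
            adapter(event)

applied = []
bus = ChangeBus()
# Hypothetical adapters for two systems of record
bus.register(lambda e: applied.append(("PatientRecords", e["address"]["zip"])))
bus.register(lambda e: applied.append(("BenefitsSystem", e["address"]["zip"])))

# Step 3: the address change is broadcast to every registered system
bus.broadcast({"event": "update", "patient": "D6E5D510",
               "address": {"addr1": "John Smith", "city": "New York",
                           "state": "NY", "zip": "10007"}})
```

A real deployment would also need delivery guarantees, retries and conflict handling, which the in-memory version glosses over.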
Tracking Changes
• Add basic document versioning to track what changed
when
• Prefer the separate "current" and "history" collections
approach
– current contains the last updated version
– history contains all previous versions
• Can query history to see the lineage
(See http://askasya.com/post/revisitversions)
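A sketch of the current/history versioning update, simulated here with in-memory dicts in place of the two MongoDB collections (the `apply_update` helper and its merge rule are assumptions; a real implementation would use atomic writes against both collections):

```python
def apply_update(current, history, doc_id, changes, source):
    """Move the current version to history, then write the new version."""
    old = current[doc_id]
    # The history key embeds the version number (like { id, v } in _id),
    # so a range scan on _id returns a document's full lineage.
    history[(doc_id, old["meta"]["version"])] = old
    new = dict(old)
    new.update(changes)
    new["meta"] = {**old["meta"],
                   "version": old["meta"]["version"] + 1,
                   "lineage": {"event": "update", "source": source}}
    current[doc_id] = new

current = {"be5": {"meta": {"system": "PatientRecords", "version": 1,
                            "lineage": {"event": "insert",
                                        "source": "PatientRecords"}},
                   "common": {"zip": "10007"}}}
history = {}
apply_update(current, history, "be5", {"common": {"zip": "10013"}},
             source="ProfileApp")
```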
Engagement Data Model – Current
{
_id : ObjectId("556c1122c9c8f48313553be5"),
meta : {
system : "PatientRecords",
lastUpdate : ISODate(...),
version : 2,
lineage : {
event : "update",
source : "ProfileApp",
},
...
},
...
}
Engagement Data Model - History
{
_id : {
id : ObjectId("556c1122c9c8f48313553be5"), v : 1
},
meta : {
system : "PatientRecords",
lastUpdate : ISODate(...),
version : 1,
lineage : {
event : "update",
source : "PatientRecords",
},
...
},
...
}
Result – New Possibilities
• Change address in one place!
• Other value-add processes can be triggered by changes
• Example: Automated outreach
– health and benefits centers in the new location
– help moving
• Extend address change to Veteran’s dependents
Keep going
• Keep adding valuable processes to improve or provide
new services
• Phase out legacy if desired
– Part 1 – From Relational to MongoDB
• Improve data governance
– Part 3 – Bulletproof Data Management
• Reduce costs
• Innovate
Summary
• Systems of Engagement give users new ways to interact with data
• You can start small and add value quickly
• MongoDB enables Systems of Engagement
– Dynamic schema
– Fast, flexible querying, analysis, & aggregation
– High performance
– Scalable
– Secure
References
• Systems of Engagement and the Future of Enterprise IT: A Sea Change in Enterprise IT
  http://www.aiim.org/futurehistory
• Systems of Engagement & the Enterprise
  http://www-01.ibm.com/software/ebusiness/jstart/systemsofengagement/
• Geoffrey Moore – The Future of Enterprise IT
  http://www.slideshare.net/SAPanalytics/geoffrey-moore-the-future-of-enterprise-it
• Ask Asya
  http://askasya.com/post/trackversions
  http://askasya.com/post/revisitversions
Hello and welcome to Conquering Data Proliferation, the second talk in our three-part data management path to production series today.
My name is James Kerr and I'm a Solutions Architect here at MongoDB. I've been with the company about 2 and a half years now and have been in the NoSQL space building large scale distributed databases for the last 9 years or so. I work primarily with US government agencies building things on MongoDB but I do work with our commercial customers now and again as well.
As I said, this is part 2 in our 3-part data management path to production series. Hopefully you caught Jay's talk on migrating from relational to MongoDB.
In this talk I'll cover one type of system you could possibly migrate your relational databases to.
I'll talk about what's being called systems of engagement (as opposed to systems of record) and how to get your data to and from that system.
Be sure to catch the 3rd part of the series where Buzz will talk about some clever ways of tackling some of the data governance and quality issues we face when building these types of systems
The title of this talk is "Conquering Data Proliferation" which is a pretty far reaching topic. But the fact is that this is a problem that most enterprises, big and small, are facing today.
Cannot link data together across Systems of Record
Systems of Record not designed to have end users interact with them
What happens when a Veteran has to change their address with the VA?
Veteran has to update different parts of the organization manually
May be propagated to other internal systems (or not)
What happens if an address is not up to date?
Benefits mailed to the wrong address (a delay, or maybe worse, in service)
How does a doctor see a single view of a Veteran's health record?
Next to impossible right now
VA has efforts underway to address this (political issues are outside the scope of this talk)
They need a way and are in fact making efforts to simplify basic functions such as this
So how do we tackle this?
Systems of Engagement is a concept that was introduced by Geoffrey Moore a few years back. - NEXT
He said that Systems of Engagement are the next big wave of change in Enterprise IT. And we see this starting to happen across many organizations today.
This is essentially the transition from our existing, mostly passive, systems of record to connected systems of engagement that are more active and encourage and support peer interactions. Our customers and employees want to interact with business systems the way they do in their personal and social lives.
By some accounts, industry spent over $1 trillion on systems of record and though we continue to spend on them, we have reached a point of diminishing returns.
Remember though that Today's Systems of Record were yesterday's Systems of Engagement – the engagement model has changed. People need to be able to do things themselves. Can't wait for hours for things to happen.
"systems of engagement refers to the transition from current enterprise systems designed around discrete pieces of information ("records") to systems which are more decentralized, incorporate technologies which encourage peer interactions, and which often leverage cloud technologies to provide the capabilities to enable those interactions."
You have probably heard terms such as
Data Lake
Operational Data Layer
Data hub
These are all moves towards a system of engagement where data is central
We need to be able to do this transition while still leveraging our investment in Systems of Record
Analogy: Retrofitting old building with new “connectivity” and interfaces (maintain existing architecture)
This enables a new class of collaborative applications that interact with the data
Today we'll touch on getting data out of source systems of record, pushing changes back to those systems and tracking the lineage of data through the system
So how do we start to approach this?
You have heard a lot about the 360 degree view, or single view, of a customer, product or anything that is core to a business
This is a popular concept that a lot of enterprises are putting solutions in place for
This is a good starting point for a System of Engagement
You often need a view across your core business before you can start to find new ways of interacting with it
Remember the Retrofitting analogy? We are trying to build on top of our existing investment in systems of record and the Information about our core business objects is typically spread across these systems.
Let's start by identifying "moments of engagement" that are of high value to our business and customers and that just providing a view for would make major improvements.
These "moments" are the things that customers/users either expect the most from the business or feel the most pain about when they interact with the business.
Let's go back to our examples at the VA. One of the more complex issues they face is providing a single view of a Veteran's health record.
Providing this view is critical to a Veteran receiving quality care from doctors as well as other services provided by the VA. Right now health records are spread across systems (and agencies at that) and it is very difficult to see a Veteran's entire history. The VA has efforts under way to improve this and the approach is in line with the systems of engagement concept I have been talking about.
So let's go back to our notional architecture…
We have a central database fed and orchestrated by data ingestion and processing capabilities.
This is fronted by data services that are consumed by new applications
Let's put some technologies in place to try it out:
1) For the central database, we'll use MongoDB. No big surprise here and we'll talk more about why MongoDB is a great fit for this in a second.
2) For the ETL and data processing, we'll use Pentaho. There are other options for this but Pentaho has a good integration with MongoDB and is fairly easy to use (see part 1 in this series for more details on migrating data from relational databases)
3) Lastly, we'll use Node.js to build our RESTful services to sit on top of the database
** Describe the data flow:
Records in SOR
ETL pulls parts from SOR and updates SOE
This is a picture you have seen many times over the years so what's different?
Let's talk about some of the features of MongoDB and how they enable the single view:
Dynamic schema – can handle vastly different data together and lets you keep improving and fixing issues over time easily (schema on read). Our example shows two systems, but think about the complexities of integrating data across 5, 10 or even more systems
Rich querying – supporting end users directly requires multiple ways of accessing the data; key/value alone is not sufficient
Aggregation framework – database-supported roll-ups for analysis
High scale/performance – directly impacts customer & user experience, so every second counts
Auto-sharding – can automatically add processing power as data is added
Map-reduce capability (native MR or Hadoop Connector) – batch analysis looking for patterns and opportunities in the single view
Enterprise security – provides the security controls necessary to protect the data
So let's jump back into our example…
We have a couple of different electronic health record (EHR) systems that we are going to pull Continuity of Care, or CCR, records from
They happen to be in XML and have a lot of fields so I just pulled a couple of snippets out
CCRs are meant to track many different types of medical interactions and events
Here we have some things about immunizations…
CLICK
And here we have some data about medications that were prescribed
So how are we going to put that together in one place so we can better interact with it and get a single view across the different CCRs for each patient?
CLICK
Leveraging the dynamic schema capabilities, we can readily create a document data model to encapsulate our original source data, common fields across systems as well as metadata
CLICK
We can start with the source data
The source data can be a roughly transformed version of the data from the system of record so that we can interact with it as JSON/BSON
CLICK
The source data can be wrapped in an envelope document
CLICK
We can track metadata about the source data – source system, date, etc
CLICK
We can also extract any master data or the data that fits into a common enterprise data model out of the source
If the raw data is required, it can be stored or maybe just cached as binary data stored directly in MongoDB and the wrapper document can contain a pointer to it
CLICK, CLICK, CLICK
Let's add a few more source data documents
Now, in the process of creating these documents, we could also be creating an integrated view across them
CLICK, CLICK
There's a lot of flexibility here
Maybe you want to keep the original source data objects as individual documents and then integrate them or maybe you just want to keep the integrated data objects
** add pentaho screenshot if you get a chance
The actual process we go through isn't really that interesting
You've all seen or written basic ETL processes before so I'll just cover it at a high level
For this example, we'll create source documents for each of the original CCRs as well as an integrated view for each patient
The last step can either be done incrementally or once you have completed loading the full batch of source documents for all patients, at which time you would create an integrated document for each patient that you updated.
Here, the source data is transformed from XML into JSON so we can work with the structure in MongoDB. Otherwise, we have to just store it raw in binary or text form
The topic of converting XML to JSON is a whole separate discussion but it can range from simple to complex depending on how general a solution you need.
Also keep in mind that this is an optional step and can be done at later stages in the process of rolling out your system. It may be more beneficial to focus on the "common" fields and integrate them initially.
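On the simple end of that range, transforming CCR XML into a document can be a recursive walk over the element tree. This sketch uses Python's standard library; note it deliberately ignores attributes and would overwrite repeated sibling elements, which a real converter must handle:

```python
import xml.etree.ElementTree as ET

def xml_to_dict(elem):
    """Recursively convert an XML element into a nested dict."""
    children = list(elem)
    if not children:
        return elem.text  # leaf node: just the text value
    # NOTE: repeated sibling tags would collide here; a production
    # converter needs to collect them into lists instead.
    return {child.tag: xml_to_dict(child) for child in children}

ccr = ET.fromstring("""
<Immunization>
  <CCRDataObjectID>BB0022</CCRDataObjectID>
  <DateTime>
    <Type><Text>Start date</Text></Type>
    <ExactDateTime>1998-06-13T05:00:00Z</ExactDateTime>
  </DateTime>
</Immunization>
""")
doc = xml_to_dict(ccr)
```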
In this case, we only have one common data element and that's the patient Id that ties the records together
Our integrated data model can contain metadata about what source data objects we put together.
This helps us understand the lineage
The integrated data model can pull in the patient Id as well as the list of continuity of care records
We can just pull in the fields that we want from each CCR
Now that we have a single view, what do we do next? – NEXT, NEXT
Improve care in the case of our VA example
Having a view is a great first step, but now that data is easy to engage with, users will want to be able to make changes to it as well
Worst case, they can go back to the systems of record directly and change the data there
Let's switch gears a bit and go back to the change of address example we talked about
So let's add a component that will propagate changes from the system of engagement back to the systems of record
In addition to the previous components we put in place for the single view, we need some sort of message processing component to receive and publish data changes back to the source systems. For this example, we will use Apache Kafka as it is pretty commonly used these days.
We'll show changing the integrated data in the system of engagement database and propagating that back to the systems of record
Let's start with the data model in our system of engagement this time…
The metadata looks the same as before, but let's now add the version and lineage fields to track when changes are made. We'll talk more about this in a second.
The source data contains the patient Id and their address as it came from the Patient Records system of record.
Notice the address2 field is null as that is how it came from the source system of record.
We can then pull the address into our common / master data model
Notice that we have an "addr1" field but no "addr2" field because in a document model, we can just omit fields rather than set them to null
For the sake of this example, we can have two simple but different address formats in our systems of record.
In some cases, the systems will track overlapping sets of addresses
In others, they may track completely separate sets. They may be different lines of business for example.
At a high level, conceptually, this process is quite simple
1, 2, 3, 4
As we drill into it though, many complexities arise:
How do we track who changed what?
How do we deal with failure, retries and conflict resolution?
Unfortunately, I don't have all day to talk about this so let's just focus on tracking the changes for now
To track the changes that were made to the data and when, let's extend our data model
There are a number of possible approaches to this and Asya's "Ask Asya" blog does a great job of summarizing the tradeoffs
I prefer the separate "current" and "history" collections approach though. In this approach, your current versions are always in your current collection so it's easy to query the current state and your history collection contains all the prior versions
You can easily query your history collection to see the full lineage of changes made to the documents
The version field contains a number indicating what version of the document is current
Our lineage field can contain the type of event that changed the document as well as the source of the change.
Our history collection contains all the previous versions of the document. We can add the version number to the _id field so we can easily use a range query on _id to get all of the versions of a document.
We can then examine the lineage and lastUpdate fields to see the list of events and when they occurred to understand the lineage of the document
So what do we end up with? We have re-implemented a "moment of engagement" that used to be complicated for operations and frustrating to end users to now be simple and painless.
We can now think about additional processes that we could launch in response to this change:
Let's look for dependents in the Veteran's benefits and, if they live with them, update their address too
Let's do some automatic outreach to help the Veteran get settled in their new location
Combining this with the benefits of creating single views, we can see how truly powerful this can be.
So what's next? - NEXT
Just keep going…
Systems of engagement are a new wave of change happening in enterprise IT. These new systems are transforming businesses and allowing both employees and end users to interact with systems in new ways.
MongoDB enables Systems of Engagement
Dynamic schema can handle data from numerous different systems all in one place
Fast, flexible querying, analysis, & aggregation gets maximum value from the data
High performance allows Systems of Engagement to handle load from a new class of users