James Kerr
Solutions Architect
james.kerr@mongodb.com
Conquering Data
Proliferation
2
Part 2 In The Data Management Series
Part 1: From Relational To MongoDB
Part 2: Conquering Data Proliferation
• Data integration
• Capture data changes
• Engaging with your data
Part 3: Bulletproof Data Management
3
Agenda
• Today's Problem
• Systems of Engagement
• Single View of…
• Changing Data
• Summary
Today's Problem
5
Enterprises Today
6
Result
• Data walled off in "silos"
• Can't get a complete picture
• Have to "swivel chair" system to system
• Hard to find new avenues to add value
• Frustrated ops
• Frustrated customers
7
Example
• 20+ million Veterans in the US today
• 250,000+ employees at Veterans Affairs
• $3.9 billion for IT in 2015 budget
• What happens when a Veteran has to change their
address with the VA?
• How does a doctor see a single view of a Veteran's
health record?
Systems of Engagement
9
Big Wave of Change Happening
Today's Systems of Record were
yesterday's Systems of Engagement!
Enterprise IT Transition From
• Systems of Record
To the Next Stage
• Systems of Engagement
10
Definition
• Incorporate technologies which encourage peer
interactions
• More decentralized
• More options for infrastructure especially cloud
• Enable new / faster interactions
11
Notional Architecture
[Diagram: Systems of Engagement; Data Services; Data Processing (Integration, Analytics, etc.); Systems of Record; data stores: Master Data, Raw Data, Integrated Data, …]
12
Many Complexities to Tackle
• Data Extraction (ETL)
• Change Data Capture (CDC)
• Data Governance
• Data Lineage
– Versioning
– Merging changes
• Security / Entitlements
13
Focus for Today
• Data Extraction (ETL)
• Change Data Capture (CDC)
• Data Governance
• Data Lineage
– Versioning
– Merging changes
• Security / Entitlements
Getting Started
15
Don't Boil the Ocean
• Information is often spread across multiple systems of
record
• Start with a read-only view of that information
• Target high value/impact data – "moments of
engagement"
16
Example – Single View of a Health Record
• Veteran's view
• Doctor's view
• Case worker's view
17
Single View Architecture
[Diagram: Systems of Engagement; Data Services; Data Processing (Integration, Analytics, etc.); Systems of Record; data stores: Master Data, Raw Data, Integrated Data, …; ETL flow with two records]
18
Single View – Why MongoDB?
• Dynamic schema
• Rich querying
• Aggregation framework
• High scale/performance
• Auto-sharding
• Map-reduce capability (Native MR or Hadoop Connector)
• Enterprise Security Features
19
Systems of Record Data Model
• Continuity of Care Record (CCR) XML docs
• Pulled some examples from
http://googlehealthsamples.googlecode.com/svn/trunk/CCR_samples
...
<Immunizations>
<Immunization>
<CCRDataObjectID>BB0022</CCRDataObjectID>
<DateTime>
<Type>
<Text>Start date</Text>
</Type>
<ExactDateTime>1998-06-13T05:00:00Z</ExactDateTime>
</DateTime>
<Source>
<Actor>
<ActorID>Jane Smith</ActorID>
<ActorRole>
<Text>Ordering clinician</Text>
</ActorRole>
</Actor>
</Source>
...
20
Systems of Record Data Model
...
<Medications>
<Medication>
<CCRDataObjectID>52</CCRDataObjectID>
<DateTime>
<Type>
<Text>Prescription Date</Text>
</Type>
<ExactDateTime>2007-03-09T12:00:00Z</ExactDateTime>
</DateTime>
<Type>
<Text>Medication</Text>
</Type>
<Source>
<Actor>
<ActorID>Rx History Supplier</ActorID>
</Actor>
</Source>
<Product>
<ProductName>
<Text>TIZANIDINE HCL 4 MG TABLET TEV</Text>
<Code>
<Value>-1</Value>
<CodingSystem>omi-coding</CodingSystem>
<Version>2005</Version>
...
21
Engagement Data Model
• Leverage dynamic schema / flexible data model
• Use an envelope/wrapper pattern
[Diagram: two envelopes. One holds Source Data, Master Data / Common Data Model, and Metadata; the other holds Integrated Data and Metadata.]
22
Data Flow
1. Read most recent CCRs from each source system
2. Create a source document for each CCR in our system of engagement database
   a. Transform XML to JSON for the source data
   b. Record the system and date in the metadata
   c. Pull out the patient's identifying information to the common data
   d. Generate an Id for the raw file
3. Store the original CCR XML into GridFS
4. After each source document is created, update the integrated document for the patient
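The source-document step of this flow (transform XML, record system and date, pull out the patient, generate an Id for the raw file) might look roughly like the Python sketch below. It is only an illustration under stated assumptions: `element_to_dict`, `build_source_document`, and the `SAMPLE_CCR` fragment are hypothetical names, the patient id is passed in rather than parsed from the CCR, and nothing here talks to MongoDB or GridFS.

```python
import uuid
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Hypothetical, heavily trimmed CCR fragment used only for this sketch.
SAMPLE_CCR = """
<ContinuityOfCareRecord>
  <Immunizations>
    <Immunization>
      <CCRDataObjectID>BB0022</CCRDataObjectID>
      <DateTime>
        <Type><Text>Start date</Text></Type>
        <ExactDateTime>1998-06-13T05:00:00Z</ExactDateTime>
      </DateTime>
    </Immunization>
  </Immunizations>
</ContinuityOfCareRecord>
"""


def element_to_dict(elem):
    """Recursively convert an XML element into nested dicts and strings."""
    children = list(elem)
    if not children:
        return (elem.text or "").strip()
    result = {}
    for child in children:
        value = element_to_dict(child)
        if child.tag in result:
            # Repeated sibling tags are collected into a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


def build_source_document(ccr_xml, system, patient_id):
    """Wrap one CCR in the envelope: meta, common, source, raw_id."""
    root = ET.fromstring(ccr_xml.strip())
    return {
        # Record the system and date in the metadata.
        "meta": {"system": system, "lastUpdate": datetime.now(timezone.utc)},
        # Patient identifier goes to the common data (assumed to be supplied).
        "common": {"patient": patient_id},
        # XML transformed to a JSON-like structure.
        "source": element_to_dict(root),
        # Generated Id that would point at the raw XML file.
        "raw_id": str(uuid.uuid4()),
    }


doc = build_source_document(SAMPLE_CCR, "EHR1", "D6E5D510")
print(doc["source"]["Immunizations"]["Immunization"]["CCRDataObjectID"])
```

In a real pipeline the returned dict would be inserted into the engagement database and the raw XML stored under `raw_id`; both are left out so the sketch stays self-contained.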
23
Engagement Data Model - Metadata
{
_id : ObjectId("556b92b83f7e775b8e92b30a"),
meta : {
system : "EHR1",
lastUpdate : ISODate(...)
...
},
common : { ... },
source : { ... },
raw_id : "..."
}
24
Engagement Data Model - Source
{
_id : ObjectId("556b92b83f7e775b8e92b30a"),
...
source : {
...
Immunizations : {
Immunization : {
CCRDataObjectID :"BB0022",
DateTime : {
Type : {
Text :"Start date"
},
ExactDateTime :"1998-06-13T05:00:00Z"
},
Source : {
Actor : {
ActorID :"Jane Smith",
ActorRole : {
Text :"Ordering clinician"
}
}
},
...
},
...
}
25
Engagement Data Model - Common
{
_id : ObjectId("556b92b83f7e775b8e92b30a"),
...
common : {
patient : "D6E5D510-592D-C613-DB46..."
},
...
}
26
Engagement Data Model - Integrated
{
_id : ObjectId("556b92b83f7e775b8e92b30d"),
...
meta : {
lastUpdate : ISODate(...),
integrated : [
{ _id : ObjectId("...a") },
{ _id : ObjectId("...b") }
]
},
common : { ... }
...
}
27
Engagement Data Model - Integrated
{
_id : ObjectId("556b92b83f7e775b8e92b30d"),
...
common : {
patient : "D6E5D510-592D-C613-DB46...",
CCRs : [
{
...
Medication : {
Product : {
ProductName :
"TIZANIDINE HCL 4 MG TABLET TEV"
}
}
...
},
{
...
Immunizations : { ... },
...
}
]
}
...
}
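One way to picture the "update the integrated document for the patient" step is the Python sketch below. It is an assumption-laden illustration, not the deck's implementation: `update_integrated` is a hypothetical helper, documents are plain dicts instead of MongoDB documents, and string ids stand in for ObjectIds.

```python
from datetime import datetime, timezone


def update_integrated(integrated, source_doc):
    """Fold one source document into the patient's integrated document."""
    patient = source_doc["common"]["patient"]
    if integrated is None:
        # First source document seen for this patient.
        integrated = {
            "meta": {"integrated": []},
            "common": {"patient": patient, "CCRs": []},
        }
    if integrated["common"]["patient"] != patient:
        raise ValueError("source document belongs to a different patient")

    ref = {"_id": source_doc["_id"]}
    if ref not in integrated["meta"]["integrated"]:
        # Remember which source documents were merged, then add the data.
        integrated["meta"]["integrated"].append(ref)
        integrated["common"]["CCRs"].append(source_doc["source"])
    integrated["meta"]["lastUpdate"] = datetime.now(timezone.utc)
    return integrated


medication = {
    "_id": "a",
    "common": {"patient": "D6E5D510"},
    "source": {"Medication": {"Product": {"ProductName": "TIZANIDINE HCL 4 MG TABLET TEV"}}},
}
immunization = {
    "_id": "b",
    "common": {"patient": "D6E5D510"},
    "source": {"Immunizations": {"Immunization": {"CCRDataObjectID": "BB0022"}}},
}

view = update_integrated(None, medication)
view = update_integrated(view, immunization)
print(len(view["common"]["CCRs"]))
```

Tracking merged ids in `meta.integrated` makes the merge safe to repeat: replaying the same source document does not duplicate its CCR.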
Engage!
29
Single View Enables New Interactions
• Deliver faster
• Deliver to new applications (mobile, etc.)
• Improve services
• New analytics
30
Changing Data
• Now that data is easy to get to, users will want to make
changes
• With single view, can change data in the source systems
of record
• Remember the change of address scenario?
31
Example – Change of Address
• Enter in different systems
• Call different parts of the organization
• What if you have dependents that
live with you?
32
Capture Data Changes
[Diagram: Systems of Engagement; Data Services; Data Processing (Integration, Analytics, etc.); Systems of Record; data stores: Master Data, Raw Data, Integrated Data, …; ETL; Bus (Apache Kafka) carrying three records]
33
Engagement Data Model - Metadata
{
_id : ObjectId("556c1122c9c8f48313553be5"),
meta : {
system : "PatientRecords",
lastUpdate : ISODate(...),
version : 2,
lineage : { ... },
...
},
common : { ... },
source : { ... }
}
34
Engagement Data Model - Source
{
_id : ObjectId("556c1122c9c8f48313553be5"),
...
source : {
patientId : "D6E5D510-592D-C613-DB46...",
address1 : "John Smith",
address2 : null,
city : "New York",
state : "NY",
zip : "10007"
},
...
}
35
Engagement Data Model - Common
{
_id : ObjectId("556c1122c9c8f48313553be5"),
...
common : {
patient : "D6E5D510-592D-C613-DB46...",
address : {
addr1 : "John Smith",
city : "New York",
state : "NY",
zip : "10007"
}
},
...
}
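The relationship between the source and common sections shown above can be sketched as a small field mapping in Python. This is a hedged illustration: `to_common` and `FIELD_MAP` are hypothetical names, and `addr2` is an assumed common-model name for `address2`, which the slides never show populated.

```python
# Assumed mapping from source-system field names to common-model names.
FIELD_MAP = {
    "address1": "addr1",
    "address2": "addr2",  # assumption: not shown in the common document
    "city": "city",
    "state": "state",
    "zip": "zip",
}


def to_common(source):
    """Build the common section from a source address record, skipping nulls."""
    address = {
        common_name: source[source_name]
        for source_name, common_name in FIELD_MAP.items()
        if source.get(source_name) is not None
    }
    return {"patient": source["patientId"], "address": address}


source = {
    "patientId": "D6E5D510-592D-C613-DB46...",
    "address1": "John Smith",
    "address2": None,
    "city": "New York",
    "state": "NY",
    "zip": "10007",
}
print(to_common(source))
```

Dropping null fields matches the common document above, where `address2 : null` has no counterpart.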
36
Systems of Record Data Model
• Address records can be in different systems
• Each system can be notified of the change to the record
37
Update Process
1. User accesses an application to change their address
2. User updates their address in the System of
Engagement
3. The address change is broadcast to any Systems of
Record that have registered
4. An adapter applies the address change to the System
of Record in an application-specific manner
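Steps 3 and 4 of this process (broadcast to registered systems, then apply through an adapter) can be illustrated with an in-memory sketch. Everything here is hypothetical: `ChangeBus` is a stand-in for the message bus and is not a Kafka client, and the two adapters and their output field names are invented for the example.

```python
class ChangeBus:
    """In-memory stand-in for the bus that carries change events."""

    def __init__(self):
        self._adapters = {}

    def register(self, system_name, adapter):
        """A System of Record registers an adapter to receive changes."""
        self._adapters[system_name] = adapter

    def publish(self, change):
        """Broadcast one change to every registered adapter."""
        return {name: adapter(change) for name, adapter in self._adapters.items()}


def billing_adapter(change):
    """Hypothetical system that stores the address as a single line."""
    addr = change["address"]
    line = f'{addr["addr1"]}, {addr["city"]}, {addr["state"]} {addr["zip"]}'
    return {"PATIENT": change["patient"], "ADDR_LINE": line}


def benefits_adapter(change):
    """Hypothetical system that only cares about region and postal code."""
    addr = change["address"]
    return {"patientId": change["patient"], "region": addr["state"], "postalCode": addr["zip"]}


bus = ChangeBus()
bus.register("Billing", billing_adapter)
bus.register("Benefits", benefits_adapter)

change = {
    "patient": "D6E5D510",
    "address": {"addr1": "100 Main St", "city": "New York", "state": "NY", "zip": "10007"},
}
applied = bus.publish(change)
print(applied["Billing"]["ADDR_LINE"])
```

Keeping the translation inside each adapter means the System of Engagement publishes one common-model event and never needs to know each system's native format.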
38
Tracking Changes
• Add basic document versioning to track what changed
when
• Prefer the separate "current" and "history" collections
approach
– current contains the last updated version
– history contains all previous versions
• Can query history to see the lineage
(See http://askasya.com/post/revisitversions)
39
Engagement Data Model – Current
{
_id : ObjectId("556c1122c9c8f48313553be5"),
meta : {
system : "PatientRecords",
lastUpdate : ISODate(...),
version : 2,
lineage : {
event : "update",
source : "ProfileApp",
},
...
},
...
}
40
Engagement Data Model - History
{
_id : {
id : ObjectId("556c1122c9c8f48313553be5"), v : 1
},
meta : {
system : "PatientRecords",
lastUpdate : ISODate(...),
version : 1,
lineage : {
event : "update",
source : "PatientRecords",
},
...
},
...
}
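The separate "current" and "history" collections approach can be sketched in Python with two dicts standing in for the collections. This is a minimal illustration under assumptions: `VersionedStore` and its methods are hypothetical, and with a real database the two writes would need care so that a failure between them does not lose a version.

```python
import copy
from datetime import datetime, timezone


class VersionedStore:
    """Two dicts stand in for the 'current' and 'history' collections."""

    def __init__(self):
        self.current = {}
        self.history = {}

    def save(self, doc_id, body, event, source):
        """Write a new version, archiving the previous one into history."""
        previous = self.current.get(doc_id)
        version = 1
        if previous is not None:
            old_version = previous["meta"]["version"]
            version = old_version + 1
            archived = copy.deepcopy(previous)
            # History documents use a compound _id of original id and version.
            archived["_id"] = {"id": doc_id, "v": old_version}
            self.history[(doc_id, old_version)] = archived
        doc = {
            "_id": doc_id,
            "meta": {
                "lastUpdate": datetime.now(timezone.utc),
                "version": version,
                "lineage": {"event": event, "source": source},
            },
        }
        doc.update(copy.deepcopy(body))
        self.current[doc_id] = doc
        return version

    def lineage(self, doc_id):
        """Return (version, event, source) for every version, oldest first."""
        docs = [d for (i, _), d in self.history.items() if i == doc_id]
        if doc_id in self.current:
            docs.append(self.current[doc_id])
        docs.sort(key=lambda d: d["meta"]["version"])
        return [
            (d["meta"]["version"], d["meta"]["lineage"]["event"], d["meta"]["lineage"]["source"])
            for d in docs
        ]


store = VersionedStore()
store.save("556c1122", {"common": {"address": {"zip": "10001"}}}, "insert", "PatientRecords")
store.save("556c1122", {"common": {"address": {"zip": "10007"}}}, "update", "ProfileApp")
print(store.lineage("556c1122"))
```

Reads of the latest state touch only `current`, which stays small; `history` is consulted only when someone asks what changed and when.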
41
Result – New Possibilities
• Change address in one place!
• Other value-add processes can be triggered by changes
• Example: Automated outreach
– health and benefits centers in new location
– help moving
• Extend address change to Veteran’s dependents
Next Steps
43
Keep going
• Keep adding valuable processes to improve or provide
new services
• Phase out legacy if desired
– Part 1 – From Relational to MongoDB
• Improve data governance
– Part 3 – Bulletproof Data Management
• Reduce costs
• Innovate
44
Summary
• Systems of Engagement give users new ways to interact with data
• You can start small and add value quickly
• MongoDB enables Systems of Engagement
– Dynamic schema
– Fast, flexible querying, analysis, & aggregation
– High performance
– Scalable
– Secure
45
References
• Systems of Engagement and the Future of Enterprise IT: A Sea Change in Enterprise IT
http://www.aiim.org/futurehistory
• Systems of Engagement & the Enterprise
http://www-01.ibm.com/software/ebusiness/jstart/systemsofengagement/
• Geoffrey Moore - The Future of Enterprise IT
http://www.slideshare.net/SAPanalytics/geoffrey-moore-the-future-of-enterprise-it
• Ask Asya
http://askasya.com/post/trackversions
http://askasya.com/post/revisitversions
Questions?
james.kerr@mongodb.com

MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Data Management 2: Conquering Data Proliferation

  • 2. Part 2 in the Data Management Series: Data integration • Capture data changes • Engaging with your data • Part 1: From Relational To MongoDB • Part 2: Conquering Data Proliferation • Part 3: Bulletproof Data Management
  • 3. Agenda • Today's Problem • Systems of Engagement • Single View of… • Changing Data • Summary
  • 6. Result • Data walled off in "silos" • Can't get a complete picture • Have to "swivel chair" system to system • Hard to find new avenues to add value • Frustrated ops • Frustrated customers
  • 7. Example • 20+ million Veterans in the US today • 250,000+ employees at Veterans Affairs • $3.9 billion for IT in 2015 budget • What happens when a Veteran has to change their address with the VA? • How does a doctor see a single view of a Veteran's health record?
  • 9. Big Wave of Change Happening • Today's Systems of Record were yesterday's Systems of Engagement! • Enterprise IT Transition From Systems of Record To the Next Stage: Systems of Engagement
  • 10. Definition • Incorporate technologies which encourage peer interactions • More decentralized • More options for infrastructure, especially cloud • Enable new / faster interactions
  • 11. Notional Architecture • Systems of Engagement • Data Services • Data Processing (Integration, Analytics, etc.) • Systems of Record • Master Data • Raw Data • Integrated Data • …
  • 12. Many Complexities to Tackle • Data Extraction (ETL) • Change Data Capture (CDC) • Data Governance • Data Lineage – Versioning – Merging changes • Security / Entitlements
  • 13. Focus for Today • Data Extraction (ETL) • Change Data Capture (CDC) • Data Governance • Data Lineage – Versioning – Merging changes • Security / Entitlements
  • 15. Don't Boil the Ocean • Information is often spread across multiple systems of record • Start with a read-only view of that information • Target high value/impact data – "moments of engagement"
  • 16. Example – Single View of a Health Record • Veteran's view • Doctor's view • Case worker's view
  • 17. Single View Architecture • Systems of Engagement • Data Services • Data Processing (Integration, Analytics, etc.) • Systems of Record • Master Data • Raw Data • Integrated Data • … • ETL • record • record
  • 18. Single View – Why MongoDB? • Dynamic schema • Rich querying • Aggregation framework • High scale/performance • Auto-sharding • Map-reduce capability (Native MR or Hadoop Connector) • Enterprise Security Features
  • 19. Systems of Record Data Model • Continuity of Care (CCR) XML docs • Pulled some examples from http://googlehealthsamples.googlecode.com/svn/trunk/CCR_samples ... <Immunizations> <Immunization> <CCRDataObjectID>BB0022</CCRDataObjectID> <DateTime> <Type> <Text>Start date</Text> </Type> <ExactDateTime>1998-06-13T05:00:00Z</ExactDateTime> </DateTime> <Source> <Actor> <ActorID>Jane Smith</ActorID> <ActorRole> <Text>Ordering clinician</Text> </ActorRole> </Actor> </Source> ...
  • 20. Systems of Record Data Model ... <Medications> <Medication> <CCRDataObjectID>52</CCRDataObjectID> <DateTime> <Type> <Text>Prescription Date</Text> </Type> <ExactDateTime>2007-03-09T12:00:00Z</ExactDateTime> </DateTime> <Type> <Text>Medication</Text> </Type> <Source> <Actor> <ActorID>Rx History Supplier</ActorID> </Actor> </Source> <Product> <ProductName> <Text>TIZANIDINE HCL 4 MG TABLET TEV</Text> <Code> <Value>-1</Value> <CodingSystem>omi-coding</CodingSystem> <Version>2005</Version> ...
  • 21. Engagement Data Model • Leverage dynamic schema / flexible data model • Use an envelope/wrapper pattern • Source Data • Master Data / Common Data Model • Metadata • Integrated Data • Metadata
  • 22. Data Flow 1. Read most recent CCRs from each source system 2. Create a source document for each CCR in our system of engagement database: (a) transform XML to JSON for the source data; (b) record the system and date in the metadata; (c) pull out the patient's identifying information to the common data; (d) generate an Id for the raw file 3. Store the original CCR XML into GridFS 4. After each source document is created, update the integrated document for the patient
  • 23. Engagement Data Model - Metadata { _id : ObjectId("556b92b83f7e775b8e92b30a"), meta : { system : "EHR1", lastUpdate : ISODate(...), ... }, common : { ... }, source : { ... }, raw_id : "..." }
  • 24. Engagement Data Model - Source { _id : ObjectId("556b92b83f7e775b8e92b30a"), ... source : { ... Immunizations : { Immunization : { CCRDataObjectID : "BB0022", DateTime : { Type : { Text : "Start date" }, ExactDateTime : "1998-06-13T05:00:00Z" }, Source : { Actor : { ActorID : "Jane Smith", ActorRole : { Text : "Ordering clinician" } } }, ... }, ... }, ... }, ... }
  • 25. Engagement Data Model - Common { _id : ObjectId("556b92b83f7e775b8e92b30a"), ... common : { patient : "D6E5D510-592D-C613-DB46..." }, ... }
  • 26. Engagement Data Model - Integrated { _id : ObjectId("556b92b83f7e775b8e92b30d"), ... meta : { lastUpdate : ISODate(...), integrated : [ { _id : ObjectId("...a") }, { _id : ObjectId("...b") } ] }, common : { ... }, ... }
  • 27. Engagement Data Model - Integrated { _id : ObjectId("556b92b83f7e775b8e92b30d"), ... common : { patient : "D6E5D510-592D-C613-DB46...", CCRs : [ { ... Medication : { Product : { ProductName : "TIZANIDINE HCL 4 MG TABLET TEV" } } ... }, { ... Immunizations : { ... }, ... } ] } ... }
  • 29. Single View Enables New Interactions • Deliver faster • Deliver to new applications (mobile, etc.) • Improve services • New analytics
  • 30. Changing Data • Now that data is easy to get to, users will want to make changes • With single view, can change data in the source systems of record • Remember the change of address scenario?
  • 31. Example – Change of Address • Enter in different systems • Call different parts of the organization • What if you have dependents that live with you?
  • 32. Capture Data Changes • Systems of Engagement • Data Services • Data Processing (Integration, Analytics, etc.) • Systems of Record • Master Data • Raw Data • Integrated Data • … • ETL • Bus (Apache Kafka) • record • record • record
  • 33. Engagement Data Model - Metadata { _id : ObjectId("556c1122c9c8f48313553be5"), meta : { system : "PatientRecords", lastUpdate : ISODate(...), version : 2, lineage : { ... }, ... }, common : { ... }, source : { ... } }
  • 34. Engagement Data Model - Source { _id : ObjectId("556c1122c9c8f48313553be5"), ... source : { patientId : "D6E5D510-592D-C613-DB46...", address1 : "John Smith", address2 : null, city : "New York", state : "NY", zip : "10007" }, ... }
  • 35. Engagement Data Model - Common { _id : ObjectId("556c1122c9c8f48313553be5"), ... common : { patient : "D6E5D510-592D-C613-DB46...", address : { addr1 : "John Smith", city : "New York", state : "NY", zip : "10007" } }, ... }
  • 36. Systems of Record Data Model • Address records can be in different systems • Each system can be notified of the change to the record
  • 37. Update Process 1. User accesses an application to change their address 2. User updates their address in the System of Engagement 3. The address change is broadcast to any Systems of Record that have registered 4. An adapter applies the address change to the System of Record in an application-specific manner
  • 38. Tracking Changes • Add basic document versioning to track what changed when • Prefer the separate "current" and "history" collections approach – current contains the last updated version – history contains all previous versions • Can query history to see the lineage (See http://askasya.com/post/revisitversions)
  • 39. Engagement Data Model – Current { _id : ObjectId("556c1122c9c8f48313553be5"), meta : { system : "PatientRecords", lastUpdate : ISODate(...), version : 2, lineage : { event : "update", source : "ProfileApp" }, ... }, ... }
  • 40. Engagement Data Model - History { _id : { id : ObjectId("556c1122c9c8f48313553be5"), v : 1 }, meta : { system : "PatientRecords", lastUpdate : ISODate(...), version : 1, lineage : { event : "update", source : "PatientRecords" }, ... }, ... }
  • 41. Result – New Possibilities • Change address in one place! • Other value-add processes can be triggered by changes • Example: Automated outreach – health and benefits centers in new location – help moving • Extend address change to Veteran’s dependents
  • 43. Keep going • Keep adding valuable processes to improve or provide new services • Phase out legacy if desired – Part 1 – From Relational to MongoDB • Improve data governance – Part 3 – Bulletproof Data Management • Reduce costs • Innovate
  • 44. Summary • Systems of Engagement give users new ways to interact with data • You can start small and add value quickly • MongoDB enables Systems of Engagement – Dynamic schema – Fast, flexible querying, analysis, & aggregation – High performance – Scalable – Secure
  • 45. References • Systems of Engagement and the Future of Enterprise IT: A Sea Change in Enterprise IT http://www.aiim.org/futurehistory • Systems of Engagement & the Enterprise http://www-01.ibm.com/software/ebusiness/jstart/systemsofengagement/ • Geoffrey Moore - The Future of Enterprise IT http://www.slideshare.net/SAPanalytics/geoffrey-moore-the-future-of-enterprise-it • Ask Asya http://askasya.com/post/trackversions http://askasya.com/post/revisitversions
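
The Data Flow and Engagement Data Model slides above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code used in the talk: plain Python dicts stand in for MongoDB documents, UUID strings stand in for ObjectIds and GridFS file ids, and the function names (xml_to_dict, build_source_document, update_integrated), the sample XML root element, and the patient id value are all hypothetical. The XML conversion is deliberately naive and ignores attributes.

```python
import uuid
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def xml_to_dict(element):
    """Naively convert an XML element to nested dicts; attributes are ignored."""
    children = list(element)
    if not children:
        return (element.text or "").strip()
    result = {}
    for child in children:
        value = xml_to_dict(child)
        if child.tag in result:
            # A repeated tag becomes a list of values.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


def build_source_document(ccr_xml, system, patient_id):
    """Wrap one CCR in the meta / common / source envelope."""
    root = ET.fromstring(ccr_xml)
    return {
        "_id": str(uuid.uuid4()),  # stand-in for a driver-generated ObjectId
        "meta": {"system": system, "lastUpdate": datetime.now(timezone.utc)},
        "common": {"patient": patient_id},
        "source": xml_to_dict(root),
        "raw_id": str(uuid.uuid4()),  # would reference the raw XML in GridFS
    }


def update_integrated(integrated, source_doc):
    """Fold one source document into the patient's integrated document."""
    integrated["meta"]["lastUpdate"] = source_doc["meta"]["lastUpdate"]
    integrated["meta"]["integrated"].append({"_id": source_doc["_id"]})
    integrated["common"]["CCRs"].append(source_doc["source"])
    return integrated


# Hypothetical, heavily trimmed CCR; real records carry many more fields.
SAMPLE_CCR = """<ContinuityOfCareRecord>
  <Immunizations>
    <Immunization>
      <CCRDataObjectID>BB0022</CCRDataObjectID>
    </Immunization>
  </Immunizations>
</ContinuityOfCareRecord>"""

source_doc = build_source_document(SAMPLE_CCR, "EHR1", "patient-123")
integrated = {
    "meta": {"lastUpdate": None, "integrated": []},
    "common": {"patient": "patient-123", "CCRs": []},
}
update_integrated(integrated, source_doc)
```

Keeping the source documents separate from the integrated document means the integrated view can be rebuilt at any time from its sources, at the cost of storing the data twice.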

Editor's Notes

  1. Hello and welcome to Conquering Data Proliferation, the 2nd talk in our 3 part data management prototype to production series today. My name is James Kerr and I'm a Solutions Architect here at MongoDB. I've been with the company about 2 and a half years now and have been in the NoSQL space building large scale distributed databases for the last 9 years or so. I work primarily with US government agencies building things on MongoDB but I do work with our commercial customers now and again as well.
  2. As I said, this is part 2 in our 3 part data management path to production series. Hopefully you caught Jay's talk on migrating from relational to MongoDB. In this talk I'll cover one type of system you could possibly migrate your relational databases to. I'll talk about what's being called systems of engagement (as opposed to systems of record) and how to get your data to and from that system. Be sure to catch the 3rd part of the series, where Buzz will talk about some clever ways of tackling some of the data governance and quality issues we face when building these types of systems.
  3. The title of this talk is "Conquering Data Proliferation" which is a pretty far reaching topic. But the fact is that this is a problem that most enterprises, big and small, are facing today.
  4. Cannot link data together across Systems of Record. Systems of Record are not designed to have end users interact with them.
  5. Cannot link data together across Systems of Record. Systems of Record are not designed to have end users interact with them.
  6. What happens when a Veteran has to change their address with the VA? The Veteran has to update different parts of the organization manually. The change may be propagated to other internal systems (or not). What happens if an address is not up to date? Benefits are mailed to the wrong address (a delay in service, or maybe worse). How does a doctor see a single view of a Veteran's health record? Next to impossible right now. The VA has efforts underway to address this (political issues are outside the scope of this talk). They need a way, and are in fact making efforts, to simplify basic functions such as this.
  7. So how do we tackle this? Systems of Engagement is a concept that was introduced by a fellow by the name of Geoffrey Moore a few years back. - NEXT
  8. He said that Systems of Engagement are the next big wave of change in Enterprise IT. And we see this starting to happen across many organizations today. This is essentially the transition from our existing, mostly passive, systems of record to connected systems of engagement that are more active and encourage/support peer interactions. Essentially, our customers and employees want to interact with business systems the way they do in their personal/social lives. By some accounts, industry spent over $1 trillion on systems of record and, though we continue to spend on them, we have reached a point of diminishing returns. Remember though that today's Systems of Record were yesterday's Systems of Engagement – the engagement model has changed. People need to be able to do things themselves. They can't wait for hours for things to happen.
  9. "systems of engagement refers to the transition from current enterprise systems designed around discrete pieces of information ("records") to systems which are more decentralized, incorporate technologies which encourage peer interactions, and which often leverage cloud technologies to provide the capabilities to enable those interactions." You have probably heard terms such as Data Lake, Operational Data Layer, and Data Hub. These are all moves towards a system of engagement where data is central.
  10. We need to be able to do this transition while still leveraging our investment in Systems of Record. Analogy: retrofitting an old building with new "connectivity" and interfaces (maintain existing architecture). This enables a new class of collaborative applications that interact with the data.
  11. Today we'll touch on getting data out of source systems of record, pushing changes back to those systems and tracking the lineage of data through the system
  12. So how do we start to approach this? You have heard a lot about the 360 degree view of a customer, product or anything that is core to a business. This is a popular concept that a lot of enterprises are putting solutions in place for. It is a good starting point for a System of Engagement: you need a view across your core business before you can start to find new ways of interacting with it.
  13. You have heard a lot about the 360 degree view or single view of a customer, product or anything that is core to a business. This is a popular concept that a lot of enterprises are putting solutions in place for. This is a good starting point for a System of Engagement. You often need a view across your core business before you can start to find new ways of interacting with it. Remember the retrofitting analogy? We are trying to build on top of our existing investment in systems of record, and the information about our core business objects is typically spread across these systems. Let's start by identifying "moments of engagement" that are of high value to our business and customers, where just providing a view would make major improvements. These "moments" are the things that customers/users either expect the most from the business or feel the most pain about when they interact with the business.
  14. Let's go back to our examples at the VA. One of the more complex issues they face is providing a single view of a Veteran's health record. Providing this view is critical to a Veteran receiving quality care from doctors as well as other services provided by the VA. Right now health records are spread across systems (and agencies at that) and it is very difficult to see a Veteran's entire history. The VA has efforts under way to improve this and the approach is in line with the systems of engagement concept I have been talking about.
  15. So let's go back to our notional architecture… We have a central database fed and orchestrated by data ingestion and processing capabilities. This is fronted by data services that are consumed by new applications. Let's put some technologies in place to try it out: 1) For the central database, we'll use MongoDB. No big surprise here, and we'll talk more about why MongoDB is a great fit for this in a second. 2) For the ETL and data processing, we'll use Pentaho. There are other options for this, but Pentaho has a good integration with MongoDB and is fairly easy to use (see part 1 in this series for more details on migrating data from relational databases). 3) Lastly, we'll use Node.js to build our RESTful services to sit on top of the database. ** Describe the data flow: records live in the SOR; ETL pulls parts from the SOR and updates the SOE.
  16. This is a picture you have seen many times over the years, so what's different? Let's talk about some of the features of MongoDB and how they enable this. Dynamic schema: can handle vastly different data together and can keep improving and fixing issues over time easily (schema on read). Our example shows two systems, but think about the complexities of integrating data across 5, 10 or even more systems. Rich querying: supporting end users directly requires multiple ways of access, and key/value is not sufficient. Aggregation framework: database-supported roll-ups for analysis. High scale/performance: directly impacts customer and user experience, so every second counts. Auto-sharding: can automatically add processing power as data is added. Map-reduce capability (native MR or Hadoop Connector): batch analysis looking for patterns and opportunities in the single view. Enterprise security: provides the security controls necessary to protect the data.
  17. So let's jump back into our example… We have a couple of different electronic health record (EHR) systems that we are going to pull Continuity of Care Records, or CCRs, from. They happen to be in XML and have a lot of fields, so I just pulled a couple of snippets out. CCRs are meant to track many different types of medical interactions and events. Here we have some things about immunizations… CLICK
  18. And here we have some data about medications that were prescribed. So how are we going to put that together in one place so we can better interact with it and get a single view across the different CCRs for each patient? CLICK
  19. Leveraging the dynamic schema capabilities, we can readily create a document data model to encapsulate our original source data, common fields across systems, as well as metadata. CLICK We can start with the source data. The source data can be a roughly transformed version of the data from the system of record so that we can interact with it as JSON/BSON. CLICK The source data can be wrapped in an envelope document. CLICK We can track metadata about the source data – source system, date, etc. CLICK We can also extract any master data, or the data that fits into a common enterprise data model, out of the source. If the raw data is required, it can be stored or maybe just cached as binary data stored directly in MongoDB, and the wrapper document can contain a pointer to it. CLICK, CLICK, CLICK Let's add a few more source data documents. Now, in the process of creating these documents, we could also be creating an integrated view across them. CLICK, CLICK There's a lot of flexibility here. Maybe you want to keep the original source data objects as individual documents and then integrate them, or maybe you just want to keep the integrated data objects.
  20. ** Add Pentaho screenshot if you get a chance. The actual process we go through isn't really that interesting. You've all seen or written basic ETL processes before, so I'll just cover it at a high level. For this example, we'll create source documents for each of the original CCRs as well as an integrated view for each patient. The last step can either be done incrementally or once you have completed loading the full batch of source documents for all patients, at which time you would create an integrated document for each patient that you updated.
  21. Here, the source data is transformed from XML into JSON so we can work with the structure in MongoDB. Otherwise, we have to just store it raw in binary or text form. The topic of converting XML to JSON is a whole separate discussion, but it can range from simple to complex depending on how general a solution you need. Also keep in mind that this is an optional step and can be done at later stages in the process of rolling out your system. It may be more beneficial to focus on the "common" fields and integrate them initially.
  22. In this case, we only have one common data element and that's the patient Id that ties the records together
  23. Our integrated data model can contain metadata about what source data objects we put together. This helps us understand the lineage
  24. The integrated data model can pull in the patient Id as well as the list of Continuity of Care records. We can just pull in the fields that we want from each CCR.
  25. Now that we have a single view, what do we do next? – NEXT, NEXT
  26. Improve care in the case of our VA example
  27. Having a view is a great first step, but now that data is easy to engage with, users will want to be able to make changes to it as well. Worst case, they can go back to the systems of record directly and change the data there. Let's switch gears a bit and go back to the change of address example we talked about.
  28. So let's add a component that will propagate changes from the system of engagement back to the systems of record. In addition to the previous components we put in place for the single view, we need some sort of message processing component to receive and publish data changes back to the source systems. For this example, we will use Apache Kafka as it is pretty commonly used these days. We'll show changing the integrated data in the system of engagement database and propagating that back to the systems of record.
  29. Let's start with the data model in our system of engagement this time… The metadata looks the same as before, but let's now add the version and lineage fields to track when changes are made. We'll talk more about this in a second.
  30. The source data contains the patient Id and their address as it came from the Patient Records system of record. Notice the address2 field is null as that is how it came from the source system of record.
  31. We can then pull the address into our common / master data model. Notice that we have an "addr1" field but no "addr2" field because, in a document model, we can just omit fields rather than set them to null.
  32. For the sake of this example, we can have two simple but different address formats in our systems of record. In some cases, the systems will track overlapping sets of addresses. In others, they may track completely separate sets. They may be different lines of business, for example.
  33. At a high level, conceptually, this process is quite simple: 1, 2, 3, 4. As we drill into it though, many complexities arise: How do we track who changed what? How do we deal with failure, retries and conflict resolution? Unfortunately, I don't have all day to talk about this, so let's just focus on tracking the changes for now.
  34. To track the changes that were made to the data and when, let's extend our data model. There are a number of possible approaches to this, and Asya's "Ask Asya" blog does a great job of summarizing the tradeoffs. I prefer the separate "current" and "history" collections approach though. In this approach, your current versions are always in your current collection, so it's easy to query the current state, and your history collection contains all the prior versions. You can easily query your history collection to see the full lineage of changes made to the documents.
  35. The version field contains a number indicating what version of the document is current. Our lineage field can contain the type of event that changed the document as well as the source of the change.
  36. Our history collection contains all the previous versions of the document. We can add the version number to the _id field so we can easily use a range query on _id to get all of the versions of a document. We can then examine the lineage and lastUpdate fields to see the list of events and when they occurred to understand the lineage of the document.
  37. So what do we end up with? We have re-implemented a "moment of engagement" that used to be complicated for operations and frustrating to end users to now be simple and painless. We can now think about additional processes that we could launch in response to this change. Let's look for dependents in the Veteran's benefits and, if they live with them, update their address too. Let's do some automatic outreach to help the Veteran get settled in their new location. Combining this with the benefits of creating single views, we can see how truly powerful this can be.
  38. So what's next? - NEXT
  39. Just keep going…
  40. Systems of engagement are a new wave of change happening in enterprise IT. These new systems are transforming businesses and allowing both employees and end users to interact with systems in new ways. MongoDB enables Systems of Engagement: Dynamic schema can handle data from numerous different systems all in one place. Fast, flexible querying, analysis, & aggregation gets maximum value from the data. High performance allows Systems of Engagement to handle load from a new class of users.
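
The separate "current" and "history" collections approach described in the notes on tracking changes can be sketched as below. This is only an illustration under stated assumptions: two in-memory dicts stand in for the MongoDB collections, the history key (doc_id, version) mirrors the compound _id { id, v } from the slides, and the function names save_new_version and lineage are hypothetical. A real implementation would also need to deal with the failure, retry and conflict cases the talk sets aside.

```python
import copy
from datetime import datetime, timezone


def save_new_version(current, history, doc_id, changes, event, source):
    """Copy the existing version into history, then write the new current version."""
    old = current[doc_id]
    # History is keyed by (id, version), like the compound _id on the slides.
    history[(doc_id, old["meta"]["version"])] = copy.deepcopy(old)
    new = copy.deepcopy(old)
    new["common"].update(changes)
    new["meta"].update({
        "version": old["meta"]["version"] + 1,
        "lastUpdate": datetime.now(timezone.utc),
        "lineage": {"event": event, "source": source},
    })
    current[doc_id] = new
    return new


def lineage(history, doc_id):
    """Return (version, event, source) for every prior version, oldest first."""
    versions = sorted(v for (d, v) in history if d == doc_id)
    return [
        (
            v,
            history[(doc_id, v)]["meta"]["lineage"]["event"],
            history[(doc_id, v)]["meta"]["lineage"]["source"],
        )
        for v in versions
    ]


# Version 1 as it arrived from the (hypothetical) PatientRecords system.
current = {
    "patient-doc": {
        "meta": {
            "system": "PatientRecords",
            "version": 1,
            "lineage": {"event": "update", "source": "PatientRecords"},
        },
        "common": {"address": {"city": "New York", "state": "NY"}},
    }
}
history = {}

# The user changes their address through a profile application.
save_new_version(
    current, history, "patient-doc",
    {"address": {"city": "Boston", "state": "MA"}},
    event="update", source="ProfileApp",
)
```

Because the current collection only ever holds the latest version, reads of the current state stay simple, while the sorted scan in lineage plays the role of the range query on _id over the history collection.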