Transcript of "The Myth of Health Data Integration Complexity"
The Myth of Health Data
There’s nothing special about health IT data that
justifies complex, expensive, or special technology
By Shahid N. Shah, CEO
Who is Shahid?
20+ years of software engineering and multidiscipline complex IT implementations (Gov.,
defense, health, finance, insurance)
12+ years of healthcare IT and medical
devices experience (blog at
15+ years of technology management
experience (government, non-profit,
10+ years as architect, engineer, and
implementation manager on various EMR
and EHR initiatives (commercial and nonprofit)
Author of Chapter 13, “You’re
the CIO of your Own Office”
What’s this talk about?
A deluge of healthcare data is being
created as we digitize biology,
chemistry, and physics.
Data changes the questions we ask
and it can actually democratize and
improve the science of medicine, if we
While cures are the only real miracles
of medicine, data can help solve
intractable problems and lead to more
engineering is going to do more harm
than good (industry-neutral is better).
Applications come and go, data lives
forever. He who owns, integrates,
and uses data wins in the end.
Never leave your data in the hands
of an application/system vendor.
There’s nothing special about
health IT data that justifies
complex, expensive, or special technology.
Spend freely on multiple systems
and integration-friendly solutions.
NEJM believes doctors are trapped
It is a widely accepted myth that medicine requires
complex, highly specialized information-technology (IT)
This myth continues to justify soaring IT costs,
burdensome physician workloads, and stagnation in
innovation — while doctors become increasingly bound
to documentation and communication products that are
functionally decades behind those they use in their
New England Journal of Medicine “Escaping the EHR Trap - The Future of Health IT”, June 2012
Data changes the questions we ask
Simple visual facts
Complex visual facts
Implications for scientific discovery
The old way
The new way
Application focus is biggest mistake
Application-focused IT instead of Data-focused IT is causing business problems.
Silos of information exist across
groups (duplication, little sharing)
Healthcare Provider Systems
Poor data integration across
The Strategy: Modernize Integration
Need to get existing applications to share data through modern integration
Healthcare Provider Systems
Master Data Management, Entity Resolution, and Data Integration
Improved integration by services
that can communicate between applications
Confronting Data Integration Myths
My EHR will handle
everything I need
and push data
aggregated data is
I can’t possibly store
I don’t have to
worry about storing
certain types of data
I only need to store
data for a period of
If I don’t understand
how to synthesize
data now, I’d rather
not store it
Why health IT systems integrate poorly
prevents tinkering and “hacking”
We don't support shared identities,
single sign on (SSO), and industry-neutral authentication and authorization
We're too focused on "structured
data integration" instead of "practical
app integration" in our early project
We focus more on "pushing" versus
"pulling" data than is warranted
early in projects
We have “Inside out” architecture,
not “Outside in”
We're too focused on heavyweight
industry-specific formats instead of
lightweight or micro formats
Data emitted is not tagged using
semantic markup, so it's not
securable or searchable by default
When health IT systems produce
other common outputs, it's not done
in a security- and integration-friendly manner
Encourage clinical “tinkering” and “hacking”
• Clinicians usually go
into medicine because
they’re problem solvers
• Today’s permissions-oriented culture now
prevents “playing” with
data and discovering
Promote “Outside-in” architecture
Think about clinical and
hospital operations and
processes as a collection
of business capabilities or
services that can be
Implement industry-neutral ICAM
Implement shared identities, single sign on (SSO), neutral authentication and authorization
Proprietary identity is hurting us
Most health IT systems create their own
custom identity, credentialing, and access
management (ICAM) in an opaque part of
a proprietary database.
We’re waiting for solutions from health IT
vendors but free or commercial industry-neutral solutions are much better and
Identity exchange is possible
• Follow National Strategy for Trusted Identities
in Cyberspace (NSTIC)
• Use open identity exchange protocols such as
SAML, OpenID, and OAuth
• Use open roles and permissions-management
protocols, such as XACML
• Consider open source tools such as OpenAM,
Apache Directory, OpenLDAP, Shibboleth, or
• Externalize attribute-based access control
(ABAC) and role-based access control (RBAC)
from clinical systems into enterprise systems
like Active Directory or LDAP
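As a rough illustration of what "externalizing" RBAC and ABAC can look like, here is a minimal Python sketch of an access decision made outside the clinical application. Everything in it is an assumption for illustration: the role names, the `ROLE_PERMISSIONS` table, the `department` attribute, and the `is_permitted` function are hypothetical, and in-memory dictionaries stand in for what would normally be a directory (LDAP, Active Directory) and a policy language such as XACML.

```python
from dataclasses import dataclass, field

# Hypothetical role-to-permission table. In a real deployment this would live
# in an enterprise directory or policy server, not in the clinical application.
ROLE_PERMISSIONS = {
    "physician": {"chart:read", "chart:write"},
    "nurse": {"chart:read"},
    "billing_clerk": {"claims:read"},
}


@dataclass
class User:
    user_id: str
    roles: set = field(default_factory=set)
    attributes: dict = field(default_factory=dict)


def is_permitted(user, action, resource):
    """Allow only if a role grants the action (RBAC) and the user's
    department matches the resource's department (ABAC)."""
    role_allows = any(
        action in ROLE_PERMISSIONS.get(role, set()) for role in user.roles
    )
    if not role_allows:
        return False
    department = user.attributes.get("department")
    return department is not None and department == resource.get("department")


# Illustrative data only.
nurse = User("u1", {"nurse"}, {"department": "cardiology"})
cardiology_chart = {"type": "chart", "department": "cardiology"}
oncology_chart = {"type": "chart", "department": "oncology"}

print(is_permitted(nurse, "chart:read", cardiology_chart))
print(is_permitted(nurse, "chart:write", cardiology_chart))
print(is_permitted(nurse, "chart:read", oncology_chart))
```

The clinical system only asks "may this user do this?" and never stores the roles itself, which is the point of moving the decision into shared infrastructure.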
App-focused integration is better than nothing
Structured data dogma gets in the way of faster decision support and real solutions
Dogma is preventing integration
Many think that we shouldn’t integrate
until structured data at detailed machine-computable levels is available.
The thinking is that because mistakes can
be made with semi-structured or hard-to-map
data, we should rely on paper, make
users live with missing data, or just make
educated guesses instead.
App-centric sharing is possible
Instead of waiting for HL7 or other structured
data about patients, we can use simple
techniques like HTML widgets to share
"snippets" of our apps.
• Allow applications immediate access to
portions of data they don't already manage.
• Widgets are portions of apps that can be
embedded or "mashed up" in other apps
without tight coupling.
• Blue Button has demonstrated the power of
app integration versus structured data
integration. It provides immediate benefit to
users while the data geeks figure out what
they need for analytics, computations, etc.
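To make the widget idea concrete, here is a minimal Python sketch of an application rendering a small, self-contained HTML "snippet" that another application could embed. The function name `render_medication_widget`, the CSS class, and the sample data are hypothetical; a real widget would also need authentication and a delivery mechanism, which are left out here.

```python
from html import escape


def render_medication_widget(patient_label, medications):
    """Return an HTML fragment listing medications.

    All values are escaped so the fragment is safe to embed in a host page.
    """
    items = "".join(f"<li>{escape(name)}</li>" for name in medications)
    return (
        '<section class="med-widget">'
        f"<h3>Medications for {escape(patient_label)}</h3>"
        f"<ul>{items}</ul>"
        "</section>"
    )


# Illustrative data only.
print(render_medication_widget("Jane D.", ["Aspirin 81 mg", "Lisinopril 10 mg"]))
```

The host application does not need to understand the medication data model; it only embeds the fragment, which is the loose coupling the slide describes.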
Pushing data is more expensive than pulling it
We focus more on "pushing" versus "pulling" data than is warranted early in projects
Old way to architect:
“What data can you send me?” (push)
Better way to architect:
“What data can I publish safely?” (pull)
The "push" model, where the system that
contains the data is responsible for sending the
data to all those that are interested (or to some
central provider, such as a health information
exchange or HL7 router) shouldn’t be the only
model used for data integration.
• Implement syndicated Atom-like feeds (which
could contain HL7 or other formats).
• Data holders should allow secure
authenticated subscriptions to their data and
not worry about direct coupling with other
• Consider the Open Data Protocol (OData).
• Enable auditing of protected health
information by logging data transfers through
use of syslog and other reliable methods.
• Enable proper access control rules expressed
in standards like XACML.
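The feed-based "pull" model above can be sketched in a few lines of Python using only the standard library. This is an illustration under stated assumptions, not a complete Atom implementation: the function names `build_feed` and `entries_since`, the feed and entry identifiers, and the sample payload (a made-up HL7-style segment) are all hypothetical, and authentication, auditing, and access control are omitted.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def build_feed(feed_id, title, updated, author, entries):
    """Publisher side: serialize entries as an Atom feed string."""
    feed = ET.Element("feed", {"xmlns": ATOM_NS})
    ET.SubElement(feed, "id").text = feed_id
    ET.SubElement(feed, "title").text = title
    ET.SubElement(feed, "updated").text = updated
    ET.SubElement(ET.SubElement(feed, "author"), "name").text = author
    for item in entries:
        entry = ET.SubElement(feed, "entry")
        ET.SubElement(entry, "id").text = item["id"]
        ET.SubElement(entry, "title").text = item["title"]
        ET.SubElement(entry, "updated").text = item["updated"]
        # The content can carry HL7 or any other format as text.
        content = ET.SubElement(entry, "content", {"type": "text"})
        content.text = item["payload"]
    return ET.tostring(feed, encoding="unicode")


def entries_since(feed_xml, last_seen):
    """Subscriber side: return payloads updated after `last_seen`.

    Assumes all timestamps are UTC strings in the same ISO 8601 layout,
    so plain string comparison orders them correctly.
    """
    ns = {"a": ATOM_NS}
    root = ET.fromstring(feed_xml)
    fresh = []
    for entry in root.findall("a:entry", ns):
        if entry.findtext("a:updated", namespaces=ns) > last_seen:
            fresh.append(entry.findtext("a:content", namespaces=ns))
    return fresh


# Illustrative data only.
sample_entries = [
    {"id": "urn:example:1", "title": "Result 1",
     "updated": "2024-01-01T08:00:00Z", "payload": "PID|1||111^^^HOSP||DOE^JANE"},
    {"id": "urn:example:2", "title": "Result 2",
     "updated": "2024-01-02T08:00:00Z", "payload": "PID|1||222^^^HOSP||ROE^RICHARD"},
]
feed_xml = build_feed("urn:example:feed", "Lab results",
                      "2024-01-02T08:00:00Z", "Example Lab System", sample_entries)
print(entries_since(feed_xml, "2024-01-01T12:00:00Z"))
```

The publisher never needs to know who its subscribers are; each subscriber remembers its own `last_seen` value and asks for what is new.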
Industry-specific formats aren’t always necessary
Reliance on heavyweight industry-specific formats instead of lightweight micro formats is bad
HL7 and X.12 aren’t the only formats
The general assumption is that
formats like HL7, CCD, and X.12 are
the only ways to do data integration
in healthcare, but of course that’s
not quite true.
Consider industry-neutral protocols
Consider identity exchange
protocols like SAML for integration
of user profile data and even for
exchange of patient demographics
and related profile information.
Consider iCalendar/ICS publishing
and subscribing for schedule data.
Consider microformats like FOAF
and similar formats from
Consider semantic data formats
like RDF, RDFa, and related family.
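As one example of an industry-neutral format, schedule data can be published as iCalendar text that any calendar client can subscribe to. The Python sketch below is a simplified illustration: the function name `build_calendar`, the `PRODID` value, and the sample appointment are hypothetical, and details such as long-line folding and time zone components from the iCalendar specification are not handled.

```python
from datetime import datetime, timezone


def _escape_text(value):
    # iCalendar TEXT escaping: backslash, semicolon, comma, newline.
    return (value.replace("\\", "\\\\")
                 .replace(";", "\\;")
                 .replace(",", "\\,")
                 .replace("\n", "\\n"))


def _stamp(dt):
    # Expects timezone-aware datetimes; output is UTC "basic" format.
    return dt.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def build_calendar(events, prodid="-//Example Clinic//Schedule Sketch//EN"):
    """Return an iCalendar document (CRLF line endings) for the events."""
    lines = ["BEGIN:VCALENDAR", "VERSION:2.0", f"PRODID:{prodid}"]
    for ev in events:
        lines += [
            "BEGIN:VEVENT",
            f"UID:{ev['uid']}",
            f"DTSTAMP:{_stamp(ev['created'])}",
            f"DTSTART:{_stamp(ev['start'])}",
            f"DTEND:{_stamp(ev['end'])}",
            f"SUMMARY:{_escape_text(ev['summary'])}",
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    return "\r\n".join(lines) + "\r\n"


# Illustrative data only.
appointment = {
    "uid": "appt-1@example.org",
    "created": datetime(2024, 1, 10, 9, 0, tzinfo=timezone.utc),
    "start": datetime(2024, 1, 15, 14, 0, tzinfo=timezone.utc),
    "end": datetime(2024, 1, 15, 14, 30, tzinfo=timezone.utc),
    "summary": "Follow-up, room 2",
}
print(build_calendar([appointment]))
```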
Tag all app data using semantic markup
When data is not tagged using semantic markup, it's not securable or shareable by default
Legacy systems trap valuable data
In many existing contracts, the
vendors of systems that house the
data also ‘own’ the data and it can’t
be easily liberated because the
vendors of the systems actively
prevent it from being shared or are
just too busy to liberate the data.
Semantic markup and tagging is easy
• One easy way to create semantically
meaningful and easier to share and
secure patient data is to have all
HTML tags be generated with
companion RDFa or HTML5 Data
Attributes using industry-neutral
schemas and microformats similar to
the ones defined at Schema.org.
• Google's recent implementation of
its Knowledge Graph is a great
example of the utility of this
semantic mapping approach.
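A small Python sketch can show why tagged HTML is easier to share and secure. It renders a fragment with RDFa attributes using Schema.org terms plus an HTML5 data attribute, then reads the tagged values back with the standard library parser. The function and class names, the `data-sensitivity` attribute, and the sample values are assumptions for illustration; the choice of the Schema.org `Patient` type with `givenName` and `familyName` is one plausible mapping, not a prescribed one.

```python
from html import escape
from html.parser import HTMLParser


def render_patient(given_name, family_name):
    """Render a name with RDFa (vocab/typeof/property) and a data attribute."""
    return (
        '<div vocab="https://schema.org/" typeof="Patient" '
        'data-sensitivity="phi">'  # hypothetical tag a security layer could key on
        f'<span property="givenName">{escape(given_name)}</span> '
        f'<span property="familyName">{escape(family_name)}</span>'
        "</div>"
    )


class PropertyExtractor(HTMLParser):
    """Collect the text of every element that carries a `property` attribute."""

    def __init__(self):
        super().__init__()
        self._current = None
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        self._current = dict(attrs).get("property")

    def handle_data(self, data):
        if self._current and data.strip():
            self.properties[self._current] = data.strip()

    def handle_endtag(self, tag):
        self._current = None


# Illustrative data only.
fragment = render_patient("Jane", "Doe")
extractor = PropertyExtractor()
extractor.feed(fragment)
print(extractor.properties)
```

Because the meaning travels with the markup, a consuming application or search engine can pick out the fields without knowing anything about the system that produced the page.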
Produce data in search-friendly manner
Proprietary data formats limit findability
• Legacy applications only present
through text or windowed
interfaces that can be “scraped”.
• Web-based applications present
other assets but aren’t search
Search engines are great integrators
• Most users need access to
information trapped in existing
applications but sometimes they
don’t need much more than access
that a search engine could easily
• Assume that all pages in an
application, especially web
applications, will be “ingested” by
a securable, protectable, search
engine that can act as the first
method of integration.
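The idea of a securable search engine as a first integration step can be illustrated with a toy inverted index in Python. This is a sketch only: the `SecuredIndex` class, the role names, and the sample pages are hypothetical, and a real deployment would use an existing search engine rather than hand-rolled code.

```python
import re
from collections import defaultdict


class SecuredIndex:
    """Toy inverted index that filters results by the roles allowed per page."""

    def __init__(self):
        self._postings = defaultdict(set)   # token -> set of page ids
        self._allowed_roles = {}            # page id -> roles that may see it

    def ingest(self, page_id, text, allowed_roles):
        self._allowed_roles[page_id] = set(allowed_roles)
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            self._postings[token].add(page_id)

    def search(self, query, user_roles):
        tokens = re.findall(r"[a-z0-9]+", query.lower())
        if not tokens:
            return []
        # A page matches only if it contains every query token.
        hits = set.intersection(*(self._postings.get(t, set()) for t in tokens))
        roles = set(user_roles)
        return sorted(p for p in hits if self._allowed_roles[p] & roles)


# Illustrative data only.
index = SecuredIndex()
index.ingest("page-1", "Discharge summary for cardiology follow-up", {"physician", "nurse"})
index.ingest("page-2", "Billing summary for cardiology visit", {"billing_clerk"})

print(index.search("cardiology summary", {"nurse"}))
print(index.search("cardiology summary", {"billing_clerk"}))
```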
Rely first on open source, then proprietary
“Free” is not as important as open source; you should pay for software but require openness
Healthcare fears open source
• Only the government spends more per
user on antiquated software than we do
• There is a general fear that open source
means unsupported software or lower
quality solutions or unwanted security
Open source can save health IT
• Other industries save billions by using
• Commercial vendors give better pricing,
service, and support when they know
they are competing with open source.
• Open source is sometimes more secure,
higher quality, and better supported
than commercial equivalents.
• Don’t dismiss open source; consider it
the default choice and select commercial
alternatives when they are known to be