Navigating Healthcare's Seas: Unraveling the Power of Data Mining in Healthcare
Hunter Whitney, The Journal of Precision Medicine
PRECISION TOOLS
A review of the current tools and technologies that are enabling
a deeper insight into the mechanism of various diseases.
High Resolution: Bringing a Sharper Focus to Precision Medicine Through Design
With so much attention focused on new ways to collect, combine, and mine health data,
it can be easy to overlook the critical need to clearly display the output. The interfaces and
data visualizations for precision medicine need to be intelligible, meaningful, and actionable
for the people who depend on them. While the many tried-and-true approaches to
displaying quantitative information will continue to play essential roles, they are unlikely to be sufficient to
meet current and future demands. There are several converging forces driving these new needs, including the
massively expanding volume and variety of data and the ever-increasing diversity of people expecting to make
use of it all. Technology may be the main driver but, by definition, medicine’s ultimate impact is on human beings
and not machines. From biomedical researchers engaged in drug discovery to physicians considering treatment
plans to patients making difficult but informed personal decisions, designing user interfaces and visualizations
should not just be a side note. If the goal of precision medicine is to provide more effective, targeted treatments
and tailored preventative care, shouldn’t the information support systems for the participants in these efforts
also be just as tailored and effective? That means thinking deeply about design considerations including:
• Moving toward more fluid, dynamic interactions with data rather than needing to rely on static, fragmentary views.
• Tuning the data presentation for the intended users without stripping out the meaning and value.
• Enabling the representations to reveal, at least to some degree, potential problems in the underlying data.
• Making displays that address questions about data such as: What is the context? What is important here? What can I do with this?
The interviewees in this article have vantage points ranging from patient to health care provider and from a biomedical researcher to a
technology executive whose company has deep roots in Formula One racecars. In each area, there are some emerging commonalities as
well as divergent needs that design will need to address.
The Roads To - and From - Wellness
Joel Dudley, Director of Biomedical Informatics at Mount Sinai School of Medicine, believes that some of the same technological capabilities that are
fueling precision medicine are also the source of its greatest challenges. “This is all really being enabled by an explosion of data and it’s concerning
to me that the things that are making this possible are also presenting real problems in terms of information overload for physicians and patients.”
Dudley believes not enough effort is being put into figuring out how to make all this information digestible and actionable. “I think a lot
of the effort and funding around precision medicine has been spent on collecting more and more data and building algorithms to put data together
and analyze it.” However, he doesn’t see funding put toward designing these systems. He adds, “To my mind, it shows a real lack of appreciation
of the role of design in this area.”
by Hunter Whitney
Dudley believes new generations of tools are
needed to visualize and explore, for example,
the data about patterns of relationships in
patient populations. These tools would enable
users to ask clinical questions about patients
with shared characteristics. “A lot of precision
medicine will involve representing an individual
patient in the context of other patients.”
Dudley continues, “We just completed a project
in which we took the patient population at
Mount Sinai and took all of the lab tests and
created a similarity score with other patients
based on that clinical data.” Dudley says they
had several hundred variables and developed
a patient-patient similarity network and then
highlighted where all the type 2 diabetics were
positioned. It turned out that there were
several different hot spots and they had different
clinical characteristics. For example, even
though all type 2 diabetics had an increased
risk of heart attack, one subgroup of diabetics had a
higher risk than the others. Another hot spot,
he adds, was cancer risk. He wonders, “How
do you efficiently traverse the network space
to move closer to an important cluster of
characteristics in a patient population?” On the
care delivery side, he believes the emphasis
shifts from investigating networks to looking
at health trajectories, inflection points, and
outcomes. In other words, what are the potential
health outcomes for a person following one
course of action versus another? For example,
such tools could look at where the care paths of similar patients
diverge, such as at the introduction of a medication,
and determine what impact that has. These
care pathway tools, Dudley notes, are driven
more by data than by reliance on
guidelines alone.
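The similarity scoring Dudley describes can be illustrated with a minimal sketch. The patient IDs, lab values, the choice of cosine similarity on standardized values, and the 0.8 edge threshold are all assumptions made for illustration; the article does not say which similarity measure or threshold the Mount Sinai project used.

```python
import math
from itertools import combinations
from statistics import mean, pstdev

# Hypothetical lab results per patient: glucose, LDL, creatinine (mg/dL).
labs = {
    "patient_a": [180.0, 160.0, 1.4],
    "patient_b": [175.0, 155.0, 1.3],
    "patient_c": [95.0, 100.0, 0.8],
    "patient_d": [90.0, 105.0, 0.9],
}


def standardize(table):
    """Convert each lab test to z-scores so no single unit dominates."""
    ids = list(table)
    columns = list(zip(*(table[i] for i in ids)))
    means = [mean(c) for c in columns]
    sds = [pstdev(c) or 1.0 for c in columns]  # guard against zero spread
    return {
        i: [(v - m) / s for v, m, s in zip(table[i], means, sds)]
        for i in ids
    }


def cosine_similarity(u, v):
    """Cosine of the angle between two lab-value vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_network(table, threshold=0.8):
    """Return edges (pairs of patients) whose similarity meets the threshold."""
    z = standardize(table)
    edges = {}
    for p, q in combinations(sorted(z), 2):
        score = cosine_similarity(z[p], z[q])
        if score >= threshold:
            edges[(p, q)] = score
    return edges


network = similarity_network(labs)
for (p, q), score in network.items():
    print(p, q, round(score, 3))
```

With several hundred variables per patient, as in the project Dudley mentions, the same structure would apply, but the choice of similarity measure and threshold would shape which clusters or "hot spots" appear.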
What’s in a Number? Data, Metadata, and Design
What does the number 280 mean? When it
appears on Manny Hernandez’s continuous
glucose monitor (CGM), it signifies that his
blood sugar level is too high and he’ll need to
take corrective action very soon. The number
is given additional context by an accompanying
arrow showing an upward, level, or downward trend,
with double up arrows and double down arrows
showing, respectively, a more pronounced
trend. For a person with diabetes, the consequences
of a mistaken assessment of blood
sugar level, and its likely trajectory, can be
severe. Even though he may be taking all the
right actions, the number may still be in a
problematic range or it may be inaccurate for
a host of reasons that are not readily apparent.
The more good context the system can provide,
the better. A CGM is only one component of
a system that Hernandez uses to keep track
of his blood sugar levels. The mainstay device
is called a glucose meter or glucometer, which
measures the level from a single sample of
blood but does not show trends in the data.
[Figure: A map of health risk. Reprinted from The Lancet, Vol. 375 (9725), 2010. Ashley E.A., Butte A.J., Wheeler M.T., Chen R., et al. Clinical assessment incorporating a personal genome, pp. 1525-1535, with permission from Elsevier.]
His CGM needs to be calibrated with a
glucometer on a regular basis, but the glucometer
cannot show multiple readings. (The full
process and nuances of the devices that
Hernandez employs to monitor his data take
some effort and won’t be described in full
here.) However, viewing data and taking action
in real-time when needed is simply part of his
life and something he does to be healthy. He
shares aggregated views of his blood glucose
data with his doctor to plan longer term
strategies. He thinks about this relationship
with data not only for his own wellbeing, but
also as a community advocate. For him and his
community, the design and displays of glucose
monitoring systems are an ongoing, pressing
consideration. Hernandez says, “When you see
a number on a display, it can feel very real even
though there are many factors that can throw
it off.” For example, some monitors may not be
as accurate as others. Even a high-quality device
may present an incorrect number. For instance,
with glucometers, a bit of sugar that happens
to be stuck on his fingertips when he is doing a
blood test can significantly throw off the results.
“The problem is sometimes people see the number
and just don’t question it.” The system display
imparts a kind of seeming validity and weight.
Clearly, having a high degree of distrust
of a monitor is not an optimal strategy either.
So how can design solve some of the problems it
sometimes causes? One approach is to visualize
normal patterns of behavior in certain contexts
and then compare them to a number that might
require a decision. The system can draw upon its
own number-crunching abilities to provide more
context to a single reading. That way the design
of the display can draw upon the strengths of an
analytic tool to minimize the problem of presenting a
key measure with insufficient context. “It’s the
metadata about the data which makes all the
difference in the world about what actions you
take.” Hernandez believes that imbuing systems
with the ability to passively collect and represent
contextual data, such as location, time of day, and the
last time someone ate at a particular place, would
be very useful. He adds, “If you can model and
project into the future, that adds even more
value.” This same idea applies to design
approaches for many areas of precision medicine.
[Figure: Design Plays a Role in Addressing These Key Questions. Where did this come from? What is this showing me? Do I have other options for visualizing this data? What can I do with it?]
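As a rough illustration of giving a single reading more context, the sketch below pairs a value with a trend arrow and with a comparison against the person's own typical readings at that time of day. The rate-of-change cutoffs, the two-standard-deviation rule, and the sample history are hypothetical choices made for this example; they are not taken from any actual glucose monitoring device.

```python
from statistics import mean, pstdev


def trend_arrow(previous, current, minutes):
    """Classify the trend from the rate of change (mg/dL per minute).

    The cutoffs are hypothetical; real devices define their own.
    """
    rate = (current - previous) / minutes
    if rate >= 2.0:
        return "double up"
    if rate >= 1.0:
        return "up"
    if rate <= -2.0:
        return "double down"
    if rate <= -1.0:
        return "down"
    return "level"


def describe_reading(current, previous, minutes, usual_at_this_hour):
    """Attach trend and personal-baseline context to one reading."""
    typical = mean(usual_at_this_hour)
    spread = pstdev(usual_at_this_hour)
    # Flag readings far outside this person's usual range for this hour.
    unusual = spread > 0 and abs(current - typical) > 2 * spread
    return {
        "value": current,
        "trend": trend_arrow(previous, current, minutes),
        "typical": round(typical, 1),
        "unusual_for_this_hour": unusual,
    }


# Hypothetical past readings taken around the same time of day.
history_8am = [110, 125, 118, 130, 122]
context = describe_reading(
    current=280, previous=265, minutes=5, usual_at_this_hour=history_8am
)
print(context)
```

A display built on this kind of summary could show the number alongside its trend and a note that it is far from the usual range, which is one way of surfacing the "metadata about the data" that Hernandez describes.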
Jujitsu and Narratives at Scale
Jon Duke, MD, MS, is Chief Innovation Officer
and Director of Drug Safety Informatics at the
Regenstrief Institute. He says, “The emphasis
has really been placed around creating powerful
methodologies for working with big data, but
there’s been much less progress made with how
to represent that information to the end users.”
Duke observes, “I’ve seen conferences where
absolutely brilliant data scientists are presenting
tremendous work and you can almost see all
of this ‘data jujitsu’ falling flat for the mostly
clinical audiences.” An analysis that includes data
from millions of patients and involves
complex data mining and statistical techniques
to show a particular number for a patient’s
risk of developing diabetes, he says, can still
leave many questions in his own mind. For
example, he might wonder about the source
data and what potential influencing factors are
not being conveyed about the results. Duke
believes there are so many moving parts in
the process of precision medicine that the
challenges of data presentation may get worse
before they get better. “I think we may get to a
place initially where people become more
confused, not less.” He says, “I would argue
that where the biggest need is right now is
for designs that can communicate the outputs
of big data and precision medicine’s analyses.”
Duke believes that “where people are absolutely
struggling is how do we manifest the findings
of, for example, a genomic study so that the
data is graspable.” He adds that designs that
can clarify are “not just for physicians - there
are many decision makers in the health system -
administration, leadership, government
agencies, clinicians, and patients themselves.”
Duke thinks part of the answer to helping
people connect with analysis is by providing
relatable context - “even big data has to have
a story,” he says in what he terms “narratives
at scale.” “The way I think about narratives at
scale is that big data has to connect to a story
that you can tease out. When people start
digging into the findings of a large-scale
analysis, it often ends up with them asking –
can you tell me about a patient that was part
of this?” Duke adds that
he’s seen many times, “when you can explain
this at the individual patient level, it all seems
to register. They see what you’ve done and how
you’ve arrived at the results.” He notes that this
kind of context-setting can be aided with
techniques such as natural language processing
(NLP) or text mining. He says NLP produces a
great deal of information that has previously
been hard to access about patient symptoms,
the reasoning behind certain decision-making,
and even sentiment analysis. “You can do
that at scale, but it has to be part of a story.”
Although precision medicine involves diving
into aspects of patient health data at a very fine
level of detail, Duke says it’s important to keep
sight of the big picture. “We are going to use all
these sharpened tools to look at the individual
but, the reality is, much of the simple stuff still
makes a big difference in terms of translating
into action.” He says the designs that develop
around precision medicine shouldn’t let people
drill down so far they get lost in the details and
lose sight of larger factors that have a greater
impact. “We need to have context so factors
such as socioeconomic situations are also on
our radar.”
The Perils of Data Alchemy
Thomas Wilckens, MD, is CSO/CEO and Founder
of the precision medicine start-up InnVentis.
He worries that data scientists are being
considered “modern alchemists.” Why is that
a potential problem? Wilckens thinks that in
many instances, some of the data that will undergo
transformation is not adequate for how it will
be used, or the reasons for certain underlying
patterns of data will not be taken into account.
He says, “The quality of the data set - how it’s
collected, its reproducibility, and comparability
can be real problems.” Wilckens adds, “When
we come closer to the hardcore biochemistry,
small differences can have a big impact.” He
notes that a colleague who runs a metabolomics
center said that for some metabolomics assays,
it makes a difference if the patient stands or
sits for the measurements. Wilckens asks,
“How many people would have considered
that impact even three or four years ago?” He
says that is just one of the challenges. It’s a
different issue with genomics, in which different
systems doing the sequencing will produce
different patterns of errors. “That’s something
the number crunchers can deal with, but what
they can’t deal with is what I call the ‘batch
factor’.” The way blood is drawn, stored, and
analyzed at various times all have an impact on
the resulting data and metadata. “If you don’t
really know the history of the sample, you have
a significant problem.” Some problems with
flawed data, Wilckens notes, can be mitigated
by pairing effective visualizations with trained
observers who can detect anomalous patterns.
“If you only look at spreadsheets you may
never realize that there is something completely
wrong in the data, unless you have a proper
visualization to reveal it.”
[Figure: Design decisions have a big impact on what people can find in the data. Data Insights: New Ways to Visualize and Make Sense of Data. Published November 2012 by Morgan Kaufmann/Elsevier.]
Cutting Through the Static: Tools For Dynamic Analysis
Brad Friedman, PhD, is a computational biologist
at Genentech who, along with being a researcher,
has also developed a web-based framework
for sharing data and interactive visualizations
called AnalysisPageServer. This tool can be used
for various kinds of analytic work including
the exploration of gene expression data – an
important measure for precision medicine. He
says, “It’s nice to have one foot in each domain
as a tool builder and a tool user - by using
the tools in my analytic work, I become more
aware of what I want from them.” On the other
hand, as a tool builder, Friedman says, “There
are certain kinds of analyses that I may not
be thinking about while I am building the
application or report, but that I still want to
ensure will be possible to do by users.” He
says they want to empower users to bring their
own understanding to the data, even things
he would never initially have considered. In
practical design terms, that means embedding
a lot of interactivity into the interface. For
example, he says there are options that enable
users to bring together and explore different
data sets that haven’t been combined before.
However, as important as user control can
be, it can also introduce more complexity.
Friedman says, “I think controls are really
useful, but sometimes that can go too far as
well. It’s a balancing act.” Friedman adds,
“We put a lot of care into building tools like
the AnalysisPageServer, that support analytic
workflows rather than just making static reports
and plots which, in a way, can be dead-ends.”
The author of this article had the privilege
of working with Friedman on a project for an
analytic tool and saw him firsthand make the
user experience of the application a top
consideration. Among other things, that
meant direct discussions with users during
the development process and incorporation
of their feedback into the interface. It also
included designing the tool to be as easy to
use as possible and making the interactions
that drive the workflow as fluid as possible.
Friedman says, “Every piece of data and every
visualization has another hundred pieces of
metadata attached to it and you have to find a
way to focus on the ones that are important. I
think that’s what the idea of precision is really
getting at. We want to be able to drill down
without being overwhelmed by the data.”
Friedman says, “We’ve done a lot of gene
expression profiling and want to enable our
research collaborators to be able to dive into
that information, so we organized eight
or ten of the most common questions they
want to ask and they can now collect their
data sets through the web interface, select the
other parameters of these analyses and get the
results to see which genes are differentially
expressed and where the outliers are in a really
user-friendly and easy-to-learn way. By putting
the entire workflow within one application,
it makes it a more useful tool for people.”
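To make concrete the kind of question Friedman says research collaborators ask of gene expression data (which genes are differentially expressed), here is a deliberately simplified sketch. It is not a depiction of AnalysisPageServer: the gene names, expression values, and cutoff are hypothetical, and a real differential expression analysis would rely on statistical testing across replicates rather than a bare log2 fold change.

```python
import math
from statistics import mean

# Hypothetical normalized expression values: gene -> (control, treated).
expression = {
    "GENE_A": ([10.0, 12.0, 11.0], [40.0, 44.0, 48.0]),
    "GENE_B": ([20.0, 22.0, 21.0], [21.0, 19.0, 20.0]),
    "GENE_C": ([64.0, 60.0, 68.0], [15.0, 17.0, 16.0]),
}


def log2_fold_change(control, treated):
    """log2 of the ratio of mean treated to mean control expression."""
    return math.log2(mean(treated) / mean(control))


def flag_genes(data, cutoff=1.0):
    """Return genes whose absolute log2 fold change meets the cutoff,
    ordered from the largest change to the smallest."""
    changes = {g: log2_fold_change(c, t) for g, (c, t) in data.items()}
    flagged = {g: fc for g, fc in changes.items() if abs(fc) >= cutoff}
    return dict(
        sorted(flagged.items(), key=lambda item: abs(item[1]), reverse=True)
    )


flagged = flag_genes(expression)
for gene, fc in flagged.items():
    print(f"{gene}: log2 fold change {fc:+.2f}")
```

Exposing the cutoff as a parameter is one small instance of the user control Friedman discusses: it lets an analyst explore the data, at the cost of one more setting to understand.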
Minds, Machines, and Monte Carlo – New Drivers for Medicine
Some of the best drivers of innovation arise by
taking perspectives from one field and applying
them to another. That’s been a recurring theme
for Geoff McGrath, a vice president of McLaren
Applied Technologies, a high performance
design and engineering company. The company
has branched out from its roots in the world
of Formula One (F1) racecars. For example,
McLaren has teamed up with pharmaceutical
company GlaxoSmithKline on a range of
R&D initiatives, including data analytics and
visualization. “For me,” McGrath says, “going
into healthcare was not a great leap – we don’t
want to limit ourselves to racing technology
per se and embrace a range of other technologies
from other sources and bring them to
market in a fast track sort of way.” He says
they don’t separate drivers from the product
that they drive and take a person-centered
design approach. “You have to put the humans
first, then design and engineer around them,
and keep the designers in the process to make
changes based on feedback.” “The way we feed
the models is by weaving intelligence [e.g.,
sensors] into the product and then feeding
that intelligence back to the design team
which, in turn, feeds the models that can then
be more prescriptive for the users. This is the
philosophy we’ve managed to translate to a
whole range of product types, including medical
devices. It doesn’t matter if the product is an
inhaler or other drug delivery system,” he
says. “It has intelligence built in to provide
information about how customers are using the
product and how they might get more from it.”
Design Rules of the Road
McGrath and his colleagues use a number of
design heuristics in their work. For example,
in the racing context, he says, participants don’t
have much time to think so the displays need
to “crisply suggest the best course of action or
intervention.” He adds, “The design challenge
is capturing and then presenting quickly the
answers to questions like - Where am I? How
will this trend affect my outcome? What’s the
course of action?” He notes that automotive
engineers might be interested in a somewhat
different set of questions and might want
to look at the past to see where things went
wrong – looking at the same data, but with a
different level of detail. “The screen you see
depends on the function you fulfill, even if
there’s the same data underpinning all the
various outputs,” he says. “We don’t tend to
present a lot of alarms because they can make
the user stressed rather than informed.”
McGrath continues, “That’s why I like the idea
of predictive or even prescriptive analytics -
predictive meaning if you carry on this way,
this outcome is likely.” There’s a program that
they’re working on that addresses this issue
in the context of neonatal intensive care for
premature infants. “These patients are very
difficult to monitor, so you need to establish a
baseline for the individual child and then
try and spot the anomalies,” he says. Following
from that, anomaly alerts should be designed
so that they optimize the nurse’s time. Most
problems, he says, start with multiple variables
and people have several choices to rapidly
judge to determine the best course of action.
“Our models are based on multiple variables
then we start looking at confidence limits and
probabilities around outcomes once changes
are made.” McGrath adds, “We still frequently
use Monte Carlo simulations, especially in the
race context.” This includes, appropriately
enough, the Monaco Grand Prix. They will run
thousands of simulated scenarios during the
race in the background, McGrath explains, so
when problems occur, the models suggest a
course of action. “You don’t have to sit around
wondering what to do next,” he says, and adds,
“That’s our ambition as we go into the medical
space. We are at the early phase of a very long
road, but we’re achieving very promising results.”
[Figure labels: Probabilities, Prediction, Outcome]
Conclusion
Precision medicine is fueled by emerging
forms of collaboration and new feedback loops
between the processing power of machines
and human intellect. The quality of interface
and visualization designs will have a big impact
on the success or failure of these interactions
and outcomes. By thinking deeply about
design, precision medicine can reach the level
of resolution it needs to fulfill its promise.
Hunter Whitney is a consultant, author, and instructor who
brings a user experience (UX) design perspective to data
visualization. He has advised corporations, start-ups,
government agencies, and NGOs to achieve their goals
through a strategic design approach to digital products and
services. Hunter is the author of “Data Insights: New Ways to
Visualize and Make Sense of Data” and contributed a chapter
in “Designing for Emerging Technologies: UX for Genomics,
Robotics, and the Internet of Things.”
9. Dudley believes new generations of tools are
needed to visualize and explore, for example,
the data about patterns of relationships in
patient populations. These tools would enable
users to ask clinical questions about patients
with shared characteristics. “A lot of precision
medicine will involve representing an individual
patient in the context of other patients.”
Dudley continues, “We just completed a project
in which we took the patient population at
Mount Sinai and took all of the lab tests and
created a similarity score with other patients
based on that clinical data.” Dudley says they
had several hundred variables and developed
a patient-patient similarity network and then
highlighted where all the type 2 diabetics were
positioned. It turned out that there were
several different hot spots and they had different
clinical characteristics. For example, even
though all type 2 diabetics had a heart attack
risk increase, one subgroup of diabetics had a
higher risk than the others. Another hot spot,
he adds, was cancer risk. He wonders, “How
do you efficiently traverse the network space
to move closer to an important cluster of
characteristics in a patient population?” On the
care delivery side, he believes the emphasis
shifts from investigating networks to looking
at health trajectories, inflection points, and
outcomes. In other words, what are the potential
health outcomes for a person following one
course of action versus another? For example,
looking at where care paths of similar patients
diverge, such as the introduction of a medication,
and determining what impact that has. These
care pathway tools, Dudley notes, are more
driven by data rather than only relying on
guidelines.
What’s in a Number? Data, Metadata, and Design
What does the number 280 mean? When it
appears on Manny Hernandez’s continuous
glucose monitor (CGM), it signifies that his
blood sugar level is too high and he’ll need to
take corrective action very soon. The number
is given additional context by an accompanying
Smoking
Exercise
Diet
Prostate
cancer
Hypertension Depression
Type 2
diabetes
Obesity
Osteoarthritis
Coronary
artery disease
Asthma
Abdonimal
aortic aneurysm
Stress
Alcohol
Antihypertensives
Air pollution
Sodium
Interferon-a
Pesticides
Cocaine
NSAID
Anticoagulants
Statins
Myocardial
infarction
MAO inhibitors
Parkinson’s
Disease
Allergens
Antipsychotics
Injury
arrow in either upward, level, or down trend,
with double up arrows and double down arrows
showing, respectively, a more pronounced
trend. As a person with diabetes, the conse-
quences of a mistaken assessment of blood
sugar level, and it’s likely trajectory, can be
severe. Even though he may be taking all the
right actions, the number may still in a
problematic range or it may be inaccurate for
a host of reasons that are not readily apparent.
The more good context the system can provide,
the better. A CGM is only one component of
a system that Hernandez uses to keep track
of his blood sugar levels. The mainstay device
is called a glucose meter or glucometer that
measure the level from a single sample of
blood, but does not show trends in the data.
His CGM needs to be calibrated with a
glucometer on a regular basis but, the glucometer
cannot show multiple readings. (The full
process and nuances of the devices that
Hernandez employs monitor his data takes
some effort and won’t be described in full
A map of health risk. Reprinted from The Lancet,
Vol. 375 (9725), 2010. Ashley E.A., Butte A. J., Wheeler
M. T., Chen R., et al. Clinical assessment incorporating
a personal genome, pp. 1525-1535, with permission
from Elsevier.
10. 15
here.) However, viewing data and taking action
in real-time when needed is simply part of his
life and something he does to be healthy. He
shares aggregated views of his blood glucose
data with his doctor to plan longer term
strategies. He thinks about this relationship
with data not only for his own wellbeing, but
also as a community advocate. For him, and his
community, the design and displays of glucose
monitoring systems is an ongoing pressing
consideration. Hernandez says, “When you see
a number on a display, it can feel very real even
though there are many factors that can throw
it off.” For example some monitors may not be
as accurate as others. Even a high quality device
may present an incorrect number. For instance,
for glucometers, a bit of sugar that might happen
to be stuck on his fingertips when he is doing a
blood test can significantly throw off the results..
“The problem is sometimes people see the number
and just don’t question it.” There’s a kind of
seeming validity and weight imparted by the system
display. Clearly have a high degree of distrust
of a monitor is not an optimal strategy either.
So how can design solve some of the problem it
sometimes causes? One approach is to visualize
normal patters of behavior in certain contexts
and then compare that to a number that might
require a decision. The system can draw upon its
own number crunching abilities to provide more
context to a single reading. That way the design
Where did this come from?
What is this showing me?
Do I have other options for
visualizing this data?
What can I do with it?
100
90
80
70
60
50
40
30
20
10
0
Design Plays a Role in Addressing These Key Questions
of the display can draw upon the strengths of an
analytic tool to minimize the form presenting a
key measure with insufficient context. “It’s the
metadata about the data which makes all the
difference in the world about what actions you
take.” Hernandez believes that imbuing systems
with the ability to passively collect and represent
contextual data such as location, time of day, the
last time someone ate at a particular place, would
be very useful. He adds, “If you can model and
project into the future, that adds even more
value.” This same idea applies to design
approaches for many areas of precision medicine.
Jujitsu and Narratives at Scale
Jon Duke, MD, MS, is Chief Innovation Officer
and Director Drug Safety Informatics at the
Regenstrief Institute. He says, “The emphasis
has really been placed around creating powerful
methodologies for working with big data, but
there’s been much less progress made with how
to represent that information to the end users.”
Duke observes, “I’ve seen conferences where
absolutely brilliant data scientists are presenting
tremendous work and you can almost see all
of this ‘data jujitsu’ falling flat for the mostly
clinical audiences.” An analysis including data
from millions of patients and involving
complex data mining and statistical techniques
that show a particular number for a patient’s
risk of developing diabetes, he says, can still
leave many questions in his own mind. For
example, he might wonder about the source
data and what potential influencing factors are
not being conveyed about the results. Duke
believes there are so many moving parts in
the process of precision medicine that the
challenges of data presentation may get worse
before they get better. “I think we may get to a
place initially where people become more
confused, not less.” He says, “I would argue
that where the biggest need is right now is
for designs that can communicate the outputs
of big data and precision medicine’s analyses.”
Duke believes that “where people are absolutely
struggling is how do we manifest the findings
of, for example, a genomic study so that the
data is graspable.” He adds that designs that
can clarify are “not just for physicians - there
are many decision makers in the health system -
administration, leadership, government
agencies, clinicians, and patients themselves.”
Duke thinks part of the answer to helping
people connect with analysis is by providing
relatable context - “even big data has to have
a story,” he says in what he terms “narratives
at scale.” “The way I think about narratives at
scale is that big data has to connect to a story
that you can tease out. When people start
digging into the findings of a large-scale
analysis, it often ends up with them asking –
can you tell me about a patient that was part
of this?” Duke adds that
he’s seen many times, “when you can explain
this at the individual patient level, it all seems
to register. They see what you’ve done and how
you’ve arrived at the results.” He notes that this
kind of context-setting can be aided with techniques
such as natural language processing
(NLP) or text mining. He says NLP surfaces a
great deal of information that has previously
been hard to access about patient symptoms,
the reasoning behind certain decisions,
and even sentiment. “You can do
that at scale, but it has to be part of a story.”
Although precision medicine involves diving
into aspects of patient health data at a very fine
level of detail, Duke says it’s important to keep
sight of the big picture. “We are going to use all
these sharpened tools to look at the individual
but, the reality is, much of the simple stuff still
makes a big difference in terms of translating
into action.” He says the designs that develop
around precision medicine shouldn’t let people
drill down so far they get lost in the details and
lose sight of larger factors that have a greater
impact. “We need to have context so factors
such as socioeconomic situations are also on
our radar.”
The Perils of Data Alchemy
Thomas Wilckens, MD, is CSO/CEO and Founder
of the precision medicine start-up InnVentis.
He worries that data scientists are being
considered “modern alchemists.” Why is that
a potential problem? Wilckens thinks that in
many instances, some of the data that will undergo
transformation is not adequate for how it will
be used, or the reasons for certain underlying
patterns in the data will not be taken into account.
He says, “The quality of the data set - how it’s
collected, its reproducibility, and comparability
can be real problems.” Wilckens adds, “When
we come closer to the hardcore biochemistry,
small differences can have a big impact.” He
notes that a colleague who runs a metabolomics
center said that for some metabolomics assays,
it makes a difference if the patient stands or
sits for the measurements. Wilckens asks,
“How many people would have considered
that impact even three or four years ago?” He
says that is just one of the challenges. It’s a
different issue with genomics, in which different
systems doing the sequencing will produce
different patterns of errors. “That’s something
the number crunchers can deal with, but what
they can’t deal with is what I call the ‘batch
factor’.” The way blood is drawn, stored, and
analyzed at various times all has an impact on
the resulting data and metadata. “If you don’t
really know the history of the sample, you have
a significant problem.” Some problems with
flawed data, Wilckens notes, can be mitigated
by pairing effective visualizations with trained
observers who can detect anomalous patterns.
“If you only look at spreadsheets you may
never realize that there is something completely
wrong in the data, unless you have a proper
visualization to reveal it.”
[Figure: Design decisions have a big impact on what people can find in the data. Data Insights: New Ways to Visualize and Make Sense of Data. Published November 2012 by Morgan Kaufmann/Elsevier.]
Cutting Through the Static: Tools
For Dynamic Analysis
Brad Friedman, PhD, is a computational biologist
at Genentech who, along with being a researcher,
has also developed a web-based framework
for sharing data and interactive visualizations
called AnalysisPageServer. This tool can be used
for various kinds of analytic work including
the exploration of gene expression data – an
important measure for precision medicine. He
says, “It’s nice to have one foot in each domain
as a tool builder and a tool user - by using
the tools in my analytic work, I become more
aware of what I want from them.” On the other
hand, as a tool builder, Friedman says, “There
are certain kinds of analyses that I may not
be thinking about while I am building the
application or report, but that I still want to
ensure will be possible to do by users.” He
says they want to empower users to bring their
own understanding to the data, even things
he would never initially have considered. In
practical design terms, that means embedding
a lot of interactivity into the interface. For
example, he says there are options that enable
users to bring together and explore different
data sets that haven’t been combined before.
However, as important as user control can
be, it can also introduce more complexity.
Friedman says, “I think controls are really
useful, but sometimes that can go too far as
well. It’s a balancing act.” Friedman adds,
“We put a lot of care into building tools like
the AnalysisPageServer, that support analytic
workflows rather than just making static reports
and plots which, in a way, can be dead-ends.”
The author of this article had the privilege
of working with Friedman on a project for an
analytic tool and saw firsthand how he made the
user experience of the application a top
consideration. Among other things, that
meant direct discussions with users during
the development process and incorporation
of their feedback into the interface. It also
included designing the tool to be as easy to
use as possible and making the interactions
that drive the workflow as fluid as possible.
Friedman says, “Every piece of data and every
visualization has another hundred pieces of
metadata attached to it and you have to find a
way to focus on the ones that are important. I
think that’s what the idea of precision is really
getting at. We want to be able to drill down
without being overwhelmed by the data.”
Friedman says, “We’ve done a lot of gene
expression profiling and want to enable our
research collaborators to be able to dive into
that information, so we organized eight
or ten of the most common questions they
want to ask and they can now collect their
data sets through the web interface, select the
other parameters of these analyses and get the
results to see which genes are differentially
expressed and where the outliers are in a really
user-friendly and easy-to-learn way. By putting
the entire workflow within one application,
it makes it a more useful tool for people.”
Minds, Machines, and Monte Carlo – New
Drivers for Medicine
Some of the best drivers of innovation arise by
taking perspectives from one field and applying
them to another. That’s been a recurring theme
for Geoff McGrath, a vice president of McLaren
Applied Technologies, a high-performance
design and engineering company. The company
has branched out from its roots in the world
of Formula One (F1) racecars. For example,
McLaren has teamed up with pharmaceutical
company GlaxoSmithKline on a range of
R&D initiatives, including data analytics and
visualization. “For me,” McGrath says, “going
into healthcare was not a great leap – we don’t
want to limit ourselves to racing technology
per se and embrace a range of other technologies
from other sources and bring them to
market in a fast track sort of way.” He says
they don’t separate drivers from the product
that they drive and take a person-centered
design approach. “You have to put the humans
first, then design and engineer around them,
and keep the designers in the process to make
changes based on feedback.” “The way we feed
the models is by weaving intelligence [e.g.,
sensors] into the product and then feeding
that intelligence back to the design team
which, in turn, feeds the models that can then
be more prescriptive for the users. This is the
philosophy we’ve managed to translate to a
whole range of product types, including medical
devices. It doesn’t matter if the product is an
inhaler or other drug delivery system,” he
says. “It has intelligence built in to provide
information about how customers are using the
product and how they might get more from it.”
Design Rules of the Road
McGrath and his colleagues use a number of
design heuristics in their work. For example,
in the racing context, he says, participants don’t
have much time to think so the displays need
to “crisply suggest the best course of action or
intervention.” He adds, “The design challenge
is capturing and then presenting quickly the
answers to questions like - Where am I? How
will this trend affect my outcome? What’s the
course of action?” He notes that automotive
engineers might be interested in a somewhat
different set of questions and might want
to look at the past to see where things went
wrong – looking at the same data, but with a
different level of detail. “The screen you see
depends on the function you fulfill, even if
there’s the same data underpinning all the
various outputs,” he says. “We don’t tend to
present a lot of alarms because they can make
the user stressed rather than informed.”
McGrath continues, “That’s why I like the idea
of predictive or even prescriptive analytics -
predictive meaning if you carry on this way,
this outcome is likely.” There’s a program that
they’re working on that addresses this issue
in the context of neonatal intensive care for
premature infants. “These patients are very
difficult to monitor, so you need to establish a
baseline for the individual child and then
try and spot the anomalies,” he says. Following
from that, anomaly alerts should be designed
so that they optimize the nurse’s time. Most
problems, he says, start with multiple variables,
and people have several choices to rapidly
judge to determine the best course of action.
“Our models are based on multiple variables
then we start looking at confidence limits and
probabilities around outcomes once changes
are made.” McGrath adds, “We still frequently
use Monte Carlo simulations, especially in the
race context.” This includes, appropriately
enough, the Monaco Grand Prix. They will run
thousands of simulated scenarios during the
race in the background, McGrath explains, so
when problems occur, the models suggest a
course of action. “You don’t have to sit around
wondering what to do next,” he says, and adds,
“That’s our ambition as we go into the medical
space. We are at the early phase of a very long
road, but we’re achieving very promising results.”
[Figure labels: Probabilities, Prediction, Outcome]
Conclusion
Precision medicine is fueled by emerging
forms of collaboration and new feedback loops
between the processing power of machines
and human intellect. The quality of interface
and visualization designs will have a big impact
on the success or failure of these interactions
and outcomes. By thinking deeply about
design, precision medicine can reach the level
of resolution it needs to fulfill its promise.
Hunter Whitney is a consultant, author, and instructor who
brings a user experience (UX) design perspective to data
visualization. He has advised corporations, start-ups,
government agencies, and NGOs to achieve their goals
through a strategic design approach to digital products and
services. Hunter is the author of “Data Insights: New Ways to
Visualize and Make Sense of Data” and contributed a chapter
in “Designing for Emerging Technologies: UX for Genomics,
Robotics, and the Internet of Things.”