Big healthcare data from electronic medical records can be used to understand drug effectiveness and safety. It is most useful when combined with experimental research throughout the drug lifecycle. Some uses of big data include exploring patterns to identify populations, conducting association studies by linking to genomics data, predicting treatment responses, and assessing causal relationships between drugs and health outcomes. However, healthcare data was primarily collected for administrative purposes, not research, so it has limitations that require careful analysis to draw valid conclusions and produce meaningful evidence on therapeutic safety and effectiveness.
Great article on how to integrate machine learning and optimization techniques.
One group of researchers was able to reduce heart failure readmissions by 35% by combining machine learning with decision science techniques; see "Data-driven decisions for reducing readmissions for heart failure: general methodology and case study" (Bayati et al., 2014).
Learning from a Class Imbalanced Public Health Dataset: a Cost-based Comparis... (IJECEIAES)
Public health care systems routinely collect health-related data from the population. These data can be analyzed using data mining techniques to find novel, interesting patterns that could help formulate effective public health policies and interventions. Chronic illness is rare in the population, and the effect of this class imbalance on the performance of various classifiers was studied. The objective of this work is to identify the best classifiers for class-imbalanced health datasets through a cost-based comparison of classifier performance. The popular, open-source data mining tool WEKA was used to build a variety of core classifiers as well as classifier ensembles and to evaluate the classifiers' performance. The unequal misclassification costs were represented in a cost matrix, and cost-benefit analysis was also performed. In another experiment, various sampling methods such as under-sampling, over-sampling, and SMOTE were applied to balance the class distribution in the dataset, and the costs were compared. The Bayesian classifiers performed well, with high recall and a low number of false negatives, and were not affected by the class imbalance. Results confirm that the total cost of Bayesian classifiers can be further reduced using cost-sensitive learning methods. Classifiers built using the randomly under-sampled dataset showed a dramatic drop in costs and high classification accuracy.
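The two ideas in this abstract, a cost matrix with unequal misclassification costs and random under-sampling of the majority class, can be sketched in a few lines. This is a minimal illustration in plain Python rather than the WEKA workflow the paper actually used; the function names, class encoding (1 = chronic illness, 0 = healthy), and cost values are our assumptions.

```python
import random

# Assumed cost matrix: a missed case of chronic illness (false negative)
# is taken to be ten times as costly as a false alarm (false positive).
COST_FN = 10.0
COST_FP = 1.0

def total_cost(y_true, y_pred, cost_fn=COST_FN, cost_fp=COST_FP):
    """Sum the unequal misclassification costs over all predictions."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return fn * cost_fn + fp * cost_fp

def undersample(pairs, seed=0):
    """Randomly drop majority-class (example, label) pairs until balanced."""
    rng = random.Random(seed)
    pos = [p for p in pairs if p[1] == 1]
    neg = [p for p in pairs if p[1] == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    balanced = minority + rng.sample(majority, len(minority))
    rng.shuffle(balanced)
    return balanced

# A toy imbalanced dataset: 5 ill patients among 100 records.
data = [(i, 1) for i in range(5)] + [(i, 0) for i in range(5, 100)]
balanced = undersample(data)  # 5 positive and 5 negative examples remain
```

With such a cost matrix, a classifier that misses ill patients is penalized heavily even if its raw accuracy is high, which is the motivation for the paper's cost-based comparison.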
Population Health Management (PHM) MLCSU huddle (Matthew Grek)
Andi Orlowski (Director of The Health Economics Unit) gives an overview of Population Health Management (PHM) to the Midlands and Lancashire Commissioning Support Unit Huddle, 25 March 2021.
Teaching issues: ACC and neurotechnology lessons for drug prevention (Jacob Stotler)
Teaching Technique: Functional connectivity of the Anterior Cingulate Cortex (ACC), error awareness, and the effects of inhibition on the ACC from drug use / Neurofeedback approaches to bio-technologies and bio-engineering.
Advanced Laboratory Analytics — A Disruptive Solution for Health Systems (Viewics)
As US healthcare systems grapple with the recent upheavals in care payment and delivery, they are turning to advanced analytics as their “central nervous systems” for driving care and financial performance.
Laboratory information — spanning chemistry, pathology, microbiology and molecular testing, for example — is among the best sources of data for these advanced analytics, including clinician decision support, predictive analytics, population health management, and personalized medicine. When strategically harnessed and integrated to create a patient-centric lab data lake, laboratory information can form an affordable yet competitively powerful advanced analytics solution well suited for many health systems — i.e., a disruptive option.
L. Eleanor J. Herriman, MD, MBA, Chief Medical Informatics Officer of Viewics, explains why laboratory data should be a core strategic component for achieving success in value-based healthcare.
How do you know what to believe when it comes to medical research studies? What sources of information should you trust? What about statistics? Is evidence-based medicine the solution?
Although patient recruitment and retention is highlighted as the key factor in ensuring study success, patient retention in clinical trials is often overlooked. Retaining patients throughout the life of a clinical trial is, however, vital from both a scientific and an economic point of view. Poor recruitment and retention reduces the overall evaluable data for regulatory submissions, and dropped participants must be replaced, which incurs further expenditure and time delays. Subject dropout rates are estimated to range from 15-40% of enrolled participants in clinical trials.
Application of Pharma Economic Evaluation Tools for Analysis of Medical Conditions: A Case Study of an Educational Institution in India (IJREST)
Dr. Debasis Patnaik, Assistant Professor, Department of Economics, BITS-Pilani, K K Birla Goa Campus, Goa, India
Ms. Pranathi Mandadi, Research Scholar, Department of Economics, BITS-Pilani, K K Birla Goa Campus, Goa, India
ABSTRACT
The basic idea of a QALY is straightforward: the amount of time spent in a health state is weighted by the utility score given to that health state. One year of perfect health (utility score of 1) generates one QALY, whereas one year in a health state valued at 0.5 is regarded as equivalent to half a QALY. Thus, an intervention that generates four additional years in a health state valued at 0.75 will generate one more QALY than an intervention that generates four additional years in a health state valued at 0.5. This paper discusses the effect of self-medication on health care, taking an educational-institution population comprising students, teaching staff, and non-teaching staff in 2011.
Keywords: Pharma economics, QALY, measuring clinical and health excellence
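The QALY arithmetic in the abstract is easy to verify directly. A minimal Python sketch (the function name is ours; utilities are assumed to lie on the standard 0-to-1 scale):

```python
def qalys(years, utility):
    """QALYs gained = time spent in a health state, weighted by its utility score."""
    if not 0.0 <= utility <= 1.0:
        raise ValueError("utility scores run from 0 (death) to 1 (perfect health)")
    return years * utility

# One year of perfect health is one QALY; a year at utility 0.5 is half a QALY.
assert qalys(1, 1.0) == 1.0
assert qalys(1, 0.5) == 0.5

# The abstract's comparison: four extra years at 0.75 vs. four at 0.5.
gain = qalys(4, 0.75) - qalys(4, 0.5)  # 3.0 - 2.0 = exactly one extra QALY
```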
Our main involvement with your clinical research recruitment program concludes with processing the responses to your mailer. As our staff members direct the respondents to your site, you can begin conducting final interviews to complete the clinical trial recruitment process.
In the realm of healthcare, data is a critical asset that holds the potential to revolutionise patient care, enhance treatment outcomes, and streamline healthcare operations. One of the most valuable resources in this data-driven landscape is healthcare datasets. These datasets encompass a wide range of information, from patient medical records and clinical trial data to health insurance claims and public health statistics.
Healthcare datasets serve as the foundation for evidence-based medicine, enabling researchers and healthcare professionals to analyse trends, identify patterns, and make informed decisions. By delving into these datasets, medical researchers can uncover new insights into disease progression, treatment efficacy, and patient outcomes. This knowledge is crucial for developing more effective therapies, improving diagnostic accuracy, and tailoring treatment plans to individual patients' needs.
Moreover, healthcare datasets play a pivotal role in public health initiatives. By examining data on disease incidence, vaccination rates, and health behaviours, public health officials can design targeted interventions, allocate resources more efficiently, and monitor the impact of public health policies. This data-driven approach helps in controlling the spread of infectious diseases, promoting healthy lifestyles, and ultimately reducing the burden of illness on society.
The integration of healthcare datasets with advanced analytics and machine learning technologies opens up even more possibilities. Predictive models built on these datasets can forecast disease outbreaks, identify high-risk patient populations, and optimise resource allocation in healthcare facilities. These predictive insights are invaluable for proactive healthcare management and ensuring that patients receive timely and appropriate care.
However, the effective use of healthcare datasets is not without challenges. Issues related to data privacy, security, and interoperability need to be addressed to ensure that sensitive patient information is protected and that data from different sources can be integrated seamlessly. Additionally, the quality and completeness of data are crucial for drawing accurate conclusions, necessitating rigorous data management and validation practices.
In conclusion, healthcare datasets are a vital resource that holds immense potential for advancing medical research, improving patient care, and enhancing public health outcomes. As technology continues to evolve, the ability to harness the power of these datasets will become increasingly important in shaping the future of healthcare.
Navigating Healthcare's Seas: Unraveling the Power of Data Mining in Healthcare (The Lifesciences Magazine)
Here are 5 applications of data mining in healthcare:
1. Clinical Decision Support Systems (CDSS)
2. Disease Surveillance and Outbreak Prediction
3. Fraud Detection and Prevention
4. Personalized Medicine
5. Predictive Analytics for Patient Outcomes
Precision medicine is a rapidly evolving approach to healthcare that uses patient-specific data to tailor medical treatment and therapies to an individual’s unique needs.
The Role of Real-World Data in Clinical Development (Covance)
Healthcare is experiencing an avalanche of electronic data with sources that include social media, smart phones, activity trackers, electronic health records (EHRs), insurance claim databases, patient registries, health surveys, and more. **Disclaimer: This article was previously published. Sciformix is now a Covance company.
A REVIEW OF DATA INTELLIGENCE APPLICATIONS WITHIN HEALTHCARE SECTOR IN THE UN... (ijsc)
Data intelligence technologies have transformed the United States healthcare sector, bringing about transformational advances in patient care, research, and healthcare management. The United States is the focus because many academic and research institutions in the country are at the forefront of healthcare data research, making it an attractive location for in-depth studies. This paper explores the diverse realm of data intelligence in healthcare, examining its applications, challenges, ethical considerations, and emerging trends. Data intelligence applications encompass a spectrum of technologies designed to collect, process, analyze, and interpret data effectively. These applications enable healthcare practitioners to make more educated decisions, forecast health outcomes, manage population health, customize treatment, optimize workflows, assist research, improve data security, and drive healthcare analytics. However, their use raises concerns about data privacy, fairness, transparency, data quality, accountability, fair data access, regulatory compliance, and the balance between automation and human judgment. Emerging themes include the dominance of AI and machine learning, stronger ethical and regulatory frameworks, edge and quantum computing, data democratization, sustainability applications, and developing human-machine collaboration. Data intelligence has an impact that goes beyond healthcare delivery, influencing decision-making, scientific discovery, education, and economic growth. Understanding its potential and ethical responsibilities is paramount as data-driven insights redefine healthcare excellence and extend their influence across sectors.
Post-marketing studies of drug effects must then generally include at least 10,000 exposed persons in a cohort study, or enroll diseased patients from a population of equivalent size for a case–control study. A study of this size would be 95% certain of observing at least one case of any adverse effect that occurs with an incidence of 3 per 10,000 or greater (see Chapter 3). However, studies this large are expensive and difficult to perform. Yet these studies often need to be conducted quickly, to address acute and serious regulatory, commercial, and/or public health crises. For all of these reasons, the past two decades have seen a growing use of computerized databases containing medical care data, so-called "automated databases," as potential data sources for pharmacoepidemiology studies.
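The "95% certain" figure follows from the binomial probability of seeing no cases at all. A minimal Python check (the function name is ours; adverse events are assumed independent across persons):

```python
def prob_at_least_one(n, incidence):
    """Probability of observing at least one case of an adverse effect
    with the given incidence in a cohort of n exposed persons."""
    return 1.0 - (1.0 - incidence) ** n

# 10,000 exposed persons, incidence 3 per 10,000:
p = prob_at_least_one(10_000, 3 / 10_000)
# p is roughly 0.95, matching the "95% certain" figure in the text.
```

The value is close to 1 - e^(-3) ≈ 0.95, since (1 - 3/10000)^10000 approximates e^(-3) for small incidences.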
This white paper offers a detailed perspective on how big data is impacting the healthcare industry and its underlying implication on the industry as a whole. It outlines the role of big data in healthcare, its benefits, core components and challenges faced by the healthcare sector towards full-fledged adoption & implementation.
The Increasing Importance of Patient Reported Outcomes and the Patient Voice ... (Covance)
Over the past few years there has been a paradigm shift in the overall approach to pharmacovigilance from that of pure safety analysis to overall benefit-risk evaluation of products. **Disclaimer: This article was previously published. Sciformix is now a Covance company.
An AI-based decision platform built using a unified data model, incorporating systems-biology topics for unit analysis with semi-supervised learning models.
Chapter 4 Knowledge Discovery, Data Mining, and Practice-Based Evi.docx (christinemaritza)
Chapter 4 Knowledge Discovery, Data Mining, and Practice-Based Evidence
Mollie R. Cummins
Ginette A. Pepper
Susan D. Horn
The next step to comparative effectiveness research is to conduct more prospective large-scale observational cohort studies with the rigor described here for knowledge discovery and data mining (KDDM) and practice-based evidence (PBE) studies.
Objectives
At the completion of this chapter the reader will be prepared to:
1.Define the goals and processes employed in knowledge discovery and data mining (KDDM) and practice-based evidence (PBE) designs
2.Analyze the strengths and weaknesses of observational designs in general and of KDDM and PBE specifically
3.Identify the roles and activities of the informatics specialist in KDDM and PBE in healthcare environments
Key Terms
Comparative effectiveness research
Confusion matrix
Data mining
Knowledge discovery and data mining (KDDM)
Machine learning
Natural language processing (NLP)
Practice-based evidence (PBE)
Preprocessing
Abstract
The advent of the electronic health record (EHR) and other large electronic datasets has revolutionized efficient access to comprehensive data across large numbers of patients and the concomitant capacity to detect subtle patterns in these data even with missing or less than optimal data quality. This chapter introduces two approaches to knowledge building from clinical data: (1) knowledge discovery and data mining (KDDM) and (2) practice-based evidence (PBE). The use of machine learning methods in retrospective analysis of routinely collected clinical data characterizes KDDM. KDDM enables us to efficiently and effectively analyze large amounts of data and develop clinical knowledge models for decision support. PBE integrates health information technology (health IT) products with cohort identification, prospective data collection, and extensive front-line clinician and patient input for comparative effectiveness research. PBE can uncover best practices and combinations of treatments for specific types of patients while achieving many of the presumed advantages of randomized controlled trials (RCTs).
Introduction
Leaders need to foster a shared learning culture for improving healthcare. This extends beyond the local department or institution to a value for creating generalizable knowledge to improve care worldwide. Sound, rigorous methods are needed by researchers and health professionals to create this knowledge and address practical questions about risks, benefits, and costs of interventions as they occur in actual clinical practice. Typical questions are as follows:
•Are treatments used in daily practice associated with intended outcomes?
•Can we predict adverse events in time to prevent or ameliorate them?
•What treatments work best for which patients?
•With limited financial resources, what are the best interventions to use for specific types of patients?
•What types of indi ...
Why Is There A Need For Healthcare Data Aggregation.pptx (Persivia Inc)
Healthcare Data Aggregation is crucial in streamlining information, improving patient care, and enhancing overall healthcare outcomes. Aggregating healthcare data allows for the creation of comprehensive patient profiles by pulling information from various sources such as electronic health records (EHRs), wearable devices, and diagnostic tools. This holistic view enables healthcare professionals to make more informed decisions about patient care.
COMMENTARIES
Improving Therapeutic Effectiveness and Safety Through Big Healthcare Data
S Schneeweiss1

Big healthcare data—electronically recorded longitudinal data generated during the provision and administration of healthcare for millions of patients—have become essential for understanding the effectiveness and safety of therapeutics. They are most effectively used in concert with experimental and laboratory research throughout the life cycle of a drug. Applications range from providing phenotype and health outcomes information in genome-wide association studies to postmarketing studies that assure prescribers of the safety of approved drugs (Figure 1).
USES OF BIG HEALTHCARE DATA
Big healthcare data are characterized by a large number of patients covered, a reflection of the variations in routine care practices, a lack of researcher-designed data capture (yielding inaccurate or missing information), and a lack of uniform data standards.1 These characteristics have different implications depending on the analytic goals of the study and how data are utilized.

Use for population description and pattern exploration
Large insurance claims and electronic medical record databases are very useful in understanding disease burden and medical need, as well as the underuse and guideline-recommended use of therapeutics, because they reflect care outside tightly controlled research environments. Conclusions drawn from electronic medical record-based research may trigger care interventions to optimize the use of drugs (e.g., improve adherence to chronic-use medications), and may be used to monitor the success of such interventions. Hypothesis-free pattern exploration with powerful visualizations may identify populations with particular utilization and outcome patterns, stimulating new lines of inquiry. Data generated from outside the professional healthcare system (e.g., through blogs, smartphone apps, or patient support groups) provide a different type of health information but are less straightforward to interpret for population-level insights, as they lack meaningful denominators and are subject to selected participation. Linkage between these novel data sources and structured healthcare information would create a very useful data asset.
Use for association studies
Big genomics data are increasingly linked to big healthcare data. The latter include phenotypic data and temporality-preserving drug use and health outcomes data, allowing large-scale genome-wide association studies and genome-drug interaction studies. Even if they do not imply causal relationships, such association studies using big healthcare data can be useful when interpreted cautiously.
Use for prediction
Particularly in integrated healthcare systems, it is now possible to program prediction algorithms for treatment effectiveness vs. failure and feed the suggestion back to the provider. Because these are individual-level probabilistic predictions without any implication of causality, such prediction algorithms inform the provider at the point of care; however, they will not culminate in automated prescribing unless their performance improves substantially.2 Improvements are more likely to come from richer data than from new algorithms. For example, predicting the lack of adherence to a medication regimen is an area in which dynamic analyses of big data, including claims and electronic medical records in addition to consumer applications and electronic devices measuring behavioral factors, hold the promise of meaningful improvements in health care.3
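To make the idea of an individual-level probabilistic prediction concrete, the sketch below fits a toy logistic regression to hypothetical claims-derived features (prior refill gaps and daily pill burden) and scores the probability of future non-adherence. The features, data, and model are invented for illustration; they are not the methods of any study cited here.

```python
import math

# Hypothetical claims-derived features per patient:
# [refill gaps in past year, daily pill count]; label 1 = non-adherent next quarter.
DATA = [
    ([0, 1], 0), ([1, 2], 0), ([0, 2], 0), ([1, 1], 0),
    ([4, 5], 1), ([5, 4], 1), ([6, 6], 1), ([3, 5], 1),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(data, lr=0.1, epochs=2000):
    """Plain stochastic gradient descent for a two-feature logistic model."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
            err = p - y                      # gradient of log-loss w.r.t. the logit
            w[0] -= lr * err * x[0]
            w[1] -= lr * err * x[1]
            b -= lr * err
    return w, b

def predict(w, b, x):
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b)

w, b = train(DATA)
high_risk = predict(w, b, [5, 5])   # many prior gaps, heavy pill burden
low_risk = predict(w, b, [0, 1])    # no prior gaps, one pill a day
```

A score like `high_risk` could be fed back to the provider at the point of care, but, as the text notes, it carries no causal implication about why the patient is at risk.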
Use for understanding causal relationships
Ultimately, providers and drug developers need to understand causal relationships between drugs and health outcomes. Understanding causality is arguably even more critical in medicine than in other data-rich fields, because healthcare professionals and regulators are responsible for making decisions about the well-being of patients. Big healthcare data have proven useful for assessing the safety of medications, drug-drug interactions (including the risk of unintended clinical events), and increasingly the comparative effectiveness of different drugs on health outcomes.4 Studies that conduct baseline randomization and follow subjects using secondary healthcare data are of particular interest.5 However, in order to be useful for patient care, evidence needs to cross a quality threshold that allows interpreting associations as causal relationships. When analyses lead to causal interpretations of the effectiveness of therapeutics, they become subject to more scientific scrutiny in terms of transparency, auditability, reproducibility, and replicability (Figure 2).

1Division of Pharmacoepidemiology, Department of Medicine, Brigham & Women's Hospital and Harvard Medical School, Boston, Massachusetts, USA.
Correspondence: S Schneeweiss (schneeweiss@post.harvard.edu)
doi:10.1002/cpt.316
262 VOLUME 99 NUMBER 3 | MARCH 2016 | www.wileyonlinelibrary.com/cpt
PERSPECTIVES
WHAT MAKES HEALTHCARE DATA DIFFERENT?
Despite the potential of big healthcare data, there is concern about the analysis and interpretation of findings. Most issues arise from a fundamental misunderstanding: that secondary healthcare data can be interpreted as research-grade medical information. In reality, big healthcare data are usually filtered through the sociology of healthcare systems and recording practices in light of economic interests and system constraints.1 When analyzing such data, one often works with surrogates for true medical constructs. For example, an elderly patient who received a discharge code for hypertension as one of five allowed diagnosis fields from a tertiary care hospital is probably less sick, because those fields were not used up by more severe diagnoses that would have increased revenues. Conversely, the use of an oxygen canister, which is well measured in secondary data, is a good proxy for advanced disease approaching the end of life.6
When we rely too closely on secondary data, we are frequently faced with a mass of numbers that defy epidemiological interpretation because a population denominator is not clearly defined; we see a mixing of incidence and prevalence values; and we see reversed temporality between patient baseline characteristics and future health outcomes. These numbers are often cloaked in colorful visualizations that mean little in terms of true insights. Attempts to quantify causal relationships are unfortunately often associated with adjustment for causal intermediates, reverse causation, immortal time bias, residual confounding, or neglecting informative censoring, among other biases.7
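Immortal time bias, one of the pitfalls listed above, can be made concrete with a small worked example. In the invented cohort below, treatment does nothing, but classifying patients as "exposed" from cohort entry credits their pre-treatment (immortal) person-time to the treatment and makes it look strongly protective; reallocating person-time to the period actually on treatment removes the artifact. All numbers are hypothetical.

```python
# Hypothetical cohort of 10 patients followed up to 24 months.
# Each record: (death_month or None if alive at 24, treatment_start_month or None).
# Anyone still alive at month 6 starts treatment; treatment has no true effect.
PATIENTS = [(2, None), (4, None),
            (8, 6), (10, 6), (14, 6), (18, 6), (22, 6),
            (None, 6), (None, 6), (None, 6)]
FOLLOW_UP = 24

def naive_rates(patients):
    """Biased analysis: 'ever treated' counts as exposed from cohort entry."""
    ev = {True: 0, False: 0}
    pm = {True: 0.0, False: 0.0}
    for death, treat in patients:
        exposed = treat is not None
        end = death if death is not None else FOLLOW_UP
        pm[exposed] += end
        if death is not None:
            ev[exposed] += 1
    return ev[True] / pm[True], ev[False] / pm[False]

def person_time_rates(patients):
    """Correct analysis: months before treatment initiation count as unexposed."""
    ev = {True: 0, False: 0}
    pm = {True: 0.0, False: 0.0}
    for death, treat in patients:
        end = death if death is not None else FOLLOW_UP
        start = treat if treat is not None else end
        pm[False] += min(end, start)       # immortal pre-treatment time
        pm[True] += max(0, end - start)    # time actually on treatment
        if death is not None:
            ev[end > start] += 1
    return ev[True] / pm[True], ev[False] / pm[False]

naive_treated, naive_untreated = naive_rates(PATIENTS)
correct_treated, correct_untreated = person_time_rates(PATIENTS)
```

With the naive classification the treated death rate appears several-fold lower than the untreated rate, because no patient could be "treated" without first surviving to month 6; the person-time reallocation eliminates that apparent protection.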
The issues above are well described in the literature, and we know how to minimize and avoid them. However, the application of statistics to context-free numbers leads us in the wrong direction. We need investigators who seek to understand data sources, with the help of software that implements principled analyses at scale and in rapid cycles.
SUCCESS FACTORS FOR BIG HEALTHCARE DATA ANALYTICS
Several factors make evidence produced by big healthcare data analyses more likely to succeed in clarifying therapeutic effectiveness and safety and, ultimately, in influencing healthcare decision-making.
Meaningful evidence
In order for any big healthcare data analysis to be meaningful, the appropriate information needs to be available. This may include information on drug exposure, outcomes that matter to patients and providers, measurement of important confounders or proxies thereof, and (increasingly) biomarker information to identify the right patients for highly targeted therapeutics. Because investigators cannot change information already collected through the routine healthcare system, this requires searching for the most appropriate data source, sometimes worldwide, and often necessitates linking several sources. This type of flexibility means working with data that vary in terms of quality, content, and coding.

Figure 1 The uses of big healthcare data and their analysis throughout the life cycle of prescription drugs.
For example, if the goal is a full characterization of medications, their effectiveness should be established in increasingly finely stratified populations. Net benefits are established by ascribing preference or quality weights to absolute treatment effect measures (i.e., difference rather than ratio measures) of intended and unintended effects.
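The weighting of absolute effect measures just described can be written out explicitly. The sketch below combines two hypothetical risk differences, one for the intended effect and one for an unintended harm, into a single preference-weighted net benefit; the numbers and weights are invented for illustration, and the arithmetic is only meaningful on the difference (not ratio) scale.

```python
def net_benefit(rd_intended, rd_harm, w_intended=1.0, w_harm=1.0):
    """Preference-weighted net benefit on the absolute risk-difference scale.

    rd_intended: absolute risk reduction for the intended outcome.
    rd_harm: absolute risk increase for the unintended outcome.
    Weights encode how strongly patients value avoiding each outcome.
    """
    return w_intended * rd_intended - w_harm * rd_harm

# Hypothetical drug: prevents 3 major cardiovascular events per 100 patients
# treated, but causes 1 extra major bleed per 100; a bleed is valued as half
# as bad as a cardiovascular event.
nb = net_benefit(rd_intended=0.03, rd_harm=0.01, w_harm=0.5)
# 1.0 * 0.03 - 0.5 * 0.01 = 0.025 net weighted events prevented per patient
```

Because the same drug with a relative risk of 0.5 can prevent many or few events depending on baseline risk, ratio measures cannot be weighted and summed this way.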
Valid evidence
In order to reach (and surpass) the evidence threshold (Figure 2), it is of course critical to produce valid findings. This may require a variety of different methodological approaches to the same question (e.g., combining randomized studies with secondary data and observational analyses). Historical-controlled studies and time-trend analyses will become even more important in evaluating highly targeted therapies that have well-characterized molecular mechanisms and will be quickly adopted by the provider community. Observational studies based on big healthcare data will generally benefit from data-driven approaches to adjustment for confounding in order to minimize bias.8
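One widely used family of such confounding adjustments is propensity score weighting. The sketch below uses invented stratum counts in which a frailty marker drives both treatment choice and the outcome, so the crude comparison makes a null treatment look harmful; inverse-probability-of-treatment weighting (IPTW) on a propensity score estimated from the same data recovers the null. This is a minimal, hand-built illustration, not the high-dimensional or targeted-learning procedures the literature describes.

```python
# Invented strata: (confounder c, treated t, n patients, n events).
STRATA = [
    (1, 1, 80, 32), (1, 0, 20, 8),   # frail: mostly treated, 40% event risk
    (0, 1, 20, 2),  (0, 0, 80, 8),   # non-frail: rarely treated, 10% risk
]

def crude_rd(strata):
    """Unadjusted risk difference, treated minus untreated."""
    n = {1: 0, 0: 0}
    y = {1: 0, 0: 0}
    for c, t, nn, yy in strata:
        n[t] += nn
        y[t] += yy
    return y[1] / n[1] - y[0] / n[0]

def iptw_rd(strata):
    """Risk difference after inverse-probability-of-treatment weighting."""
    # Propensity score P(T=1 | C=c), estimated from the data themselves.
    treated = {0: 0, 1: 0}
    total = {0: 0, 1: 0}
    for c, t, nn, _ in strata:
        total[c] += nn
        if t == 1:
            treated[c] += nn
    ps = {c: treated[c] / total[c] for c in total}
    # Weight each stratum to the pseudo-population where C and T are unlinked.
    w_n = {1: 0.0, 0: 0.0}
    w_y = {1: 0.0, 0: 0.0}
    for c, t, nn, yy in strata:
        w = 1 / ps[c] if t == 1 else 1 / (1 - ps[c])
        w_n[t] += w * nn
        w_y[t] += w * yy
    return w_y[1] / w_n[1] - w_y[0] / w_n[0]
```

Here `crude_rd(STRATA)` is 0.18 (the null drug looks harmful because frail patients are preferentially treated), while `iptw_rd(STRATA)` is 0: weighting balances the confounder across treatment groups.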
Expedited evidence
Particularly for newly marketed medications (but also for most other applications of big data analyses), it is important that evidence on effectiveness and safety be generated in rapid cycles.9 Even if the precision of safety estimates is somewhat limited for newly marketed medications, providing feedback as early as possible will either serve as reassurance or put regulators on notice. Regularly updating trends in treatment effect estimates through frequently refreshed big data will make treatment recommendations or regulatory decisions less binary and more iterative, as knowledge regarding the effectiveness and safety of a medication evolves throughout its life cycle.
Transparent and reproducible evidence
Ultimately, big data analyses should help inform decision makers, who usually are not the ones generating the evidence. Because of the lack of standardization in secondary data analytics, complete transparency is critically important in the reporting of analytic approaches and all coding details. This will allow reproduction of analyses, replication of findings using different data sources, and ultimately greater confidence in such analyses, possibly approaching the trust we place in highly controlled clinical trials.
What does big data in healthcare mean, other than a lot of data? The three Vs that are often used to characterize big data also apply to healthcare data: volume of data; variety of data types; and velocity of data access.10 The analysis of such data requires three more Vs in order to be impactful: validity of the analytic approach; visibility of methods and results; and the ability to vouch for patient privacy/data security. Computational bottlenecks have largely disappeared because of the dramatically decreased cost of computing capacity and its on-demand availability through cloud computing.
Overall, big healthcare data analytics to improve the therapeutic effectiveness and safety of medications continues both to broaden its scope of use and to gain strength in its impact on population-based evidence generation and population management. The field has matured to develop a clearer understanding of the challenges it faces, ways to improve the meaningfulness of inferences made from massive amounts of data, and approaches to interpreting results for decision-making. However, there is still a strong need for data scientists with thorough training in the conduct of principled analyses that minimize bias. Although a new generation of software products will support this effort, a deep understanding of the source data and how they were generated will remain critical to the success of big healthcare data analytics.

Figure 2 Most decisions in healthcare require insights that are above a certain evidence quality threshold in order to support causal interpretations of associations between drug use and health outcomes.
CONFLICT OF INTEREST
Dr. Schneeweiss is consultant to WHISCON, LLC, and to Aetion, a software company in which he also owns equity. He is principal investigator of investigator-initiated grants to the Brigham and Women's Hospital from Novartis, Genentech, and Boehringer Ingelheim, unrelated to the topic of this study.
© 2015 ASCPT
1. Schneeweiss, S. & Avorn, J. A review of uses of health care utilization databases for epidemiologic research on therapeutics. J. Clin. Epidemiol. 58, 323–337 (2005).
2. Pencina, M.J. & D'Agostino, R.B. Sr. Evaluating discrimination of risk prediction models: the C statistic. JAMA 314, 1063–1064 (2015).
3. Shrank, W.H. A case for why health systems should partner with pharmacies. Harvard Business Review, 14 October 2015.
4. Schneeweiss, S. Developments in post-marketing comparative effectiveness research. Clin. Pharmacol. Ther. 82, 143–156 (2007).
5. Tunis, S.R., Stryer, D.B. & Clancy, C.M. Practical clinical trials: increasing the value of clinical research for decision making in clinical and health policy. JAMA 290, 1624–1632 (2003).
6. Schneeweiss, S., Rassen, J.A., Glynn, R.J., Avorn, J., Mogun, H. & Brookhart, M.A. High-dimensional propensity score adjustment in studies of treatment effects using health care claims data. Epidemiology 20, 512–522 (2009).
7. Suissa, S. Immortal time bias in pharmacoepidemiology. Am. J. Epidemiol. 167, 492–499 (2008).
8. van der Laan, M.J. & Rose, S. Targeted Learning: Causal Inference for Observational and Experimental Data (Springer, New York, NY, 2011).
9. Psaty, B.M. & Breckenridge, A.M. Mini-Sentinel and regulatory science—big data rendered fit and functional. N. Engl. J. Med. 370, 2165–2167 (2014).
10. Douglas, I. The Importance of "Big Data": a Definition (Gartner, Stamford, CT, 2012).
The FDA's Sentinel Initiative—A Comprehensive Approach to Medical Product Surveillance

R Ball1, M Robb1, SA Anderson2 and G Dal Pan1
In May 2008, the Department of Health and Human Services announced the launch of the Sentinel Initiative by the US Food and Drug Administration (FDA) to create the Sentinel System, a national electronic system for medical product safety surveillance.1,2 This system complements existing FDA surveillance capabilities that track adverse events reported after the use of FDA-regulated products by allowing the FDA to proactively assess the safety of these products.
The Sentinel System includes the Active Postmarket Risk Identification and Analysis (ARIA) system mandated by Congress in the US Food and Drug Administration (FDA) Amendments Act (FDAAA) of 2007. In addition, the Sentinel Initiative created focused surveillance efforts around vaccine safety using the Postmarket Rapid Immunization Safety Monitoring (PRISM) system,3 and supports regulatory review of blood and blood products with its Blood Surveillance Continuous Active Network (BloodSCAN).
One of the first stages of the development of the Sentinel System included Mini-Sentinel, a pilot program launched in 2009 to test the feasibility of, and develop the scientific approaches needed for, creating such a national system.2 In 2014, the FDA began transitioning from the Mini-Sentinel pilot to the fully operational Sentinel System. The Sentinel System will build upon the successes of the Mini-Sentinel pilot4 and leverage the Sentinel Infrastructure, a distributed database with a Common Data Model that enables analytical programs to be run remotely, for analysis within each participating data partner's secure data environment. The FDA is also seeking to develop the use of the Sentinel Infrastructure for questions outside of safety surveillance but of importance to the FDA in the protection and promotion of public health. All these elements are defined in Table 1.
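The distributed design described above — a common data model with analysis programs executed behind each partner's firewall — can be sketched schematically. In the toy example below, each site runs the same counting program against its local data and returns only aggregate counts to the coordinating center. The table layout, field names, and counts are all hypothetical and are not the actual Sentinel Common Data Model.

```python
# Hypothetical common-data-model rows held locally at each data partner:
# each row is one dispensing, flagged if a subsequent adverse event occurred.
SITE_A = [{"drug": "X", "event": True}, {"drug": "X", "event": False},
          {"drug": "Y", "event": False}]
SITE_B = [{"drug": "X", "event": False}, {"drug": "X", "event": False},
          {"drug": "X", "event": True}]

def local_summary(rows, drug):
    """Runs inside the partner's environment; only aggregate counts leave the site."""
    exposed = [r for r in rows if r["drug"] == drug]
    return {"n_exposed": len(exposed),
            "n_events": sum(r["event"] for r in exposed)}

def pooled_summary(sites, drug):
    """Coordinating center: sums the aggregates returned by each site."""
    totals = {"n_exposed": 0, "n_events": 0}
    for rows in sites:
        site_counts = local_summary(rows, drug)
        totals["n_exposed"] += site_counts["n_exposed"]
        totals["n_events"] += site_counts["n_events"]
    return totals

pooled = pooled_summary([SITE_A, SITE_B], "X")
```

The key design property illustrated here is that no patient-level record crosses a site boundary: the center receives only the small dictionaries of counts, which is what makes the same query distributable across many partners' secure environments.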
Assessment of the Sentinel System's current capabilities
The Sentinel Program Interim Assessment mandated by the Prescription Drug User Fee Act (PDUFA) V concluded that "In the implementation and execution of Mini-Sentinel, FDA has met or exceeded the requirements of FDAAA and ...PDUFA."5 The report highlights several additional accomplishments: (1) the establishment of the Mini-Sentinel Operations Center; (2) creation of a common data model and distributed-data approach; (3) successful development of processes for turning safety concerns into queries of the Mini-Sentinel data; and (4) making good progress toward building a mature data analytics system.5 Other major accomplishments included exceeding the FDAAA 2007 milestones
1Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, Maryland, USA; 2Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, Maryland, USA. Correspondence: R Ball (Robert.Ball@fda.hhs.gov)
doi:10.1002/cpt.320