This document proposes an extended risk-based monitoring model for clinical trials that incorporates on-demand, query-driven source data verification. The model aims to make monitoring more efficient by focusing source data verification efforts on resolving queries rather than routine checking. Simulation results suggest the model could reduce monitoring costs by 3-35% depending on study size and therapeutic area. Key aspects of the proposed model include distinguishing between data point and site-level monitoring, incorporating data validation and statistical surveillance earlier in the process, and prioritizing non-source data verification activities at higher risk sites over increased source data checking.
TRI was founded as a subsidiary of Triumph Consultancy Services in 2013, following 12 years of consulting to the clinical trial industry. TRI has been evaluating the specific challenges facing the industry when implementing a risk-based monitoring strategy and the various approaches and products being utilized by organizations as they move into the RBM arena. This paper aims to summarize our findings and provide guidance as to how the main challenges can be overcome.
Defining a Central Monitoring Capability: Sharing the Experience of TransCelerate (www.datatrak.com)
Central monitoring, on-site monitoring, and off-site monitoring provide an integrated approach to clinical trial quality management. TransCelerate distinguishes central monitoring from other types of central data review activities and puts it in the context of an overall monitoring strategy. Any organization seeking to implement central monitoring will need people with the right skills, technology options that support a holistic review of study-related information, and adaptable processes. There are different approaches actively being used to implement central monitoring. This article provides a description of how companies are deploying central monitoring, as well as samples of the workflows that illustrate how some have implemented it. The desired outcomes include earlier, more predictive detection of quality issues. This paper describes the initial implementation steps designed to learn what organizational capabilities are necessary.
Technology Considerations to Enable the Risk-Based Monitoring Methodology (www.datatrak.com)
TransCelerate BioPharma Inc developed a methodology based on the notion that shifting monitoring processes from an excessive concentration on source data verification to comprehensive risk-driven monitoring will increase efficiencies and enhance patient safety and data integrity while maintaining adherence to good clinical practice regulations. This philosophical shift in monitoring processes employs the addition of centralized and off-site mechanisms to monitor important trial parameters holistically, and it uses adaptive on-site monitoring to further support site processes, subject safety, and data quality. The main tenet is to use available data to monitor, assess, and mitigate the overall risk associated with clinical trials. Having the right technology is critical to collect and aggregate data, provide analytical capabilities, and track issues to demonstrate that a thorough quality management framework is in place. This paper lays out the high-level considerations when designing and building an integrated technology solution that will aid in scaling the methodology across an organization's portfolio.
Dale W. Usner, Ph.D., President of SDC, co-authored the article "The Clinical Data Management Process," which was published in the November/December 2014 issue of Retina Today.
The article reviews the clinical data management (CDM) process in its entirety - from protocol review and CRF design through database lock. Describing the roles of various CDM team members and tips for efficient data management practices, "The Clinical Data Management Process" provides a comprehensive yet concise summary of this essential function in clinical trial research, specifically with respect to retina trials.
Great article on how to integrate machine learning and optimization techniques.
One group of researchers was able to reduce heart failure readmissions by 35% by combining machine learning and decision science techniques; see "Data-driven decisions for reducing readmissions for heart failure: general methodology and case study" (Bayati et al., 2014).
Who needs fast data? - Journal for Clinical Studies (KCR)
How is "no news" during the life of a trial bad news, and what can data management (among other things) do to help ensure access to fast data? Learn this and more about smart e-solutions in the latest article by Kaia Koppel, Associate Director, Biometrics & Clinical Trial Data Execution Systems at KCR, in the recent issue of the Journal for Clinical Studies (pp. 40-41).
Classification Scoring for Cleaning Inconsistent Survey Data (CSCJournals)
Data engineers are often asked to detect and resolve inconsistencies within data sets. For some data sources with problems, there is no option to ask for corrections or updates, and the processing steps must do their best with the values in hand. Such circumstances arise in processing survey data, in constructing knowledge bases or data warehouses [1] and in using some public or open data sets.
The goal of data cleaning, sometimes called data editing or integrity checking, is to improve the accuracy of each data record and by extension the quality of the data set as a whole. Generally, this is accomplished through deterministic processes that recode specific data points according to static rules based entirely on data from within the individual record. This traditional method works well for many purposes. However, when high levels of inconsistency exist within an individual respondent's data, classification scoring may provide better results.
Classification scoring is a two-stage process that makes use of information from more than the individual data record. In the first stage, population data is used to define a model, and in the second stage the model is applied to the individual record. The author and colleagues turned to a classification scoring method to resolve inconsistencies in a key value from a recent health survey. Drawing records from a pool of about 11,000 survey respondents for use in training, we defined a model and used it to classify the vital status of the survey subject, since in the case of proxy surveys, the subject of the study may be a different person from the respondent. The scoring model was tested on the next several months' receipts and then applied on a flow basis during the remainder of data collection to the scanned and interpreted forms for a total of 18,841 unique survey subjects. Classification results were confirmed through external means to further validate the approach. This paper provides methodology and algorithmic details and suggests when this type of cleaning process may be useful.
Bayesian random effects meta-analysis model for normal data (Pubrica)
(1) Choosing the Right Priorities
(2) Current Evidence
(3) Posterior
(4) Recapitulating
Continue Reading: https://bit.ly/3i7AMQ4
Sample size for survival analysis - a guide to planning successful clinical trials (nQuery)
Determining the appropriate number of events needed for survival analysis is a complex task as study planners try to predict what sample size will be needed after accounting for the complications of unequal follow-up, drop-out and treatment crossover.
Statistical, logistical, and ethical considerations all complicate life for biostatisticians as issues to balance when planning a survival analysis. However, this complexity has created a need for new analyses and procedures to help the planning process for survival analysis trials.
The wider move from fixed to flexible designs has opened up opportunities for advanced methods such as adaptive design and Bayesian analysis to help deal with the unique complications of planning for survival data but these methods have their own complications that need to be explored too.
Large amounts of heterogeneous medical data have become available in various healthcare organizations (payers, providers, pharmaceuticals). Those data could be an enabling resource for deriving insights for improving care delivery and reducing waste. The enormity and complexity of these datasets present great challenges in analyses and subsequent applications to a practical clinical environment. More details are available here http://dmkd.cs.wayne.edu/TUTORIAL/Healthcare/
2020 Trends in Biostatistics - What you should know about study design (slides) (nQuery)
2020 Trends In Biostatistics - What you should know about study design.
In this free webinar you will learn about:
-Adaptive designs in confirmatory trials
-Using external data in study planning
-Innovative designs in early-stage trials
To watch the full webinar:
https://www.statsols.com/webinar/2020-trends-in-biostatistics-what-you-should-know-about-study-design
Comparative Study of Classification Method on Customer Candidate Data to Pred... (IJECEIAES)
Vehicle leasing companies are engaged in the field of vehicle loans. Purchase by credit is a mainstay because it can attract potential customers and generate more profit. But if there is a mistake in approving a customer candidate, the risk of stalled credit payments can arise. To minimize this risk, data mining techniques can be applied to predict the future behavior of customers. In this study, data mining techniques such as C4.5 and Naive Bayes are explored for this purpose. The customer attributes used in this study are salary, age, marital status, other installments, and worthiness. The experiments are performed using the Weka software. Based on the evaluation criterion of accuracy, the C4.5 algorithm outperforms Naive Bayes. The percentage-split experiment scenarios provide a precision value of 89.16% and an accuracy value of 83.33%, whereas the cross-validation experiment scenarios give higher accuracy values for all k-folds used. The C4.5 experiment results also confirm that the most influential data attribute in this research is salary.
Lung cancer disease analysis using PSO-based fuzzy logic system (eSAT Journals)
Abstract
The main objective of this paper is to improve the accuracy of lung cancer disease investigation using particle swarm optimization (PSO) in combination with a fuzzy expert system. This paper briefly introduces fuzzy expert systems, and the proposed scheme is compared with related methods. Experimental results of the proposed system were simulated in MATLAB 2014.
Innovative Strategies For Successful Trial Design - Webinar Slides (nQuery)
Full webinar available here: https://www.statsols.com/webinar/innovative-strategies-for-successful-trial-design
[Webinar] Innovative Strategies For Successful Trial Design- In this free webinar, you will learn about:
- The challenges facing your trials
- How to calculate the correct sample size
- Worked examples including Mixed/Hierarchical Models
- Posterior Error
- Adaptive Designs For Survival
www.statsols.com
Non-inferiority and Equivalence Study design considerations and sample size (nQuery)
About the webinar
This webinar examines the role of non-inferiority and equivalence in study design
In this free webinar, you will learn about:
-Regulatory information on this type of study design
-Considerations for study design and your sample size
-Practical worked examples of
--Non-inferiority Testing
--Equivalence Testing
Duration - 60 minutes
Speaker: Ronan Fitzpatrick, Head of Statistics, Statsols
Watch the video at: https://www.statsols.com/webinars
Designing studies with recurrent events | Model choices, pitfalls and group sequential design (nQuery)
In this free webinar, we will examine the important design considerations for analyzing recurring events and counts.
Watch the webinar at: https://www.statsols.com/en/webinar/designing-studies-with-recurrent-events
Designing studies with recurrent events (Model choices, pitfalls and group sequential design)
A Two-Step Self-Evaluation Algorithm On Imputation Approaches For Missing Categorical Data (CSCJournals)
Missing data are often encountered in data sets and a common problem for researchers in different fields of research. There are many reasons why observations may have missing values. For instance, some respondents may not report some of the items for some reason. The existence of missing data brings difficulties to the conduct of statistical analyses, especially when there is a large fraction of data which are missing. Many methods have been developed for dealing with missing data, numeric or categorical. The performances of imputation methods on missing data are key in choosing which imputation method to use. They are usually evaluated on how the missing data method performs for inference about target parameters based on a statistical model. One important parameter is the expected imputation accuracy rate, which, however, relies heavily on the assumptions of missing data type and the imputation methods. For instance, it may require that the missing data is missing completely at random. The goal of the current study was to develop a two-step algorithm to evaluate the performances of imputation methods for missing categorical data. The evaluation is based on the re-imputation accuracy rate (RIAR) introduced in the current work. A simulation study based on real data is conducted to demonstrate how the evaluation algorithm works.
Webinar slides - Alternatives to the p-value and power (nQuery)
What are the alternatives to the p-value & power? What is the next step for sample size determination? We will explore these issues in this free webinar presented by nQuery
Data Management: Alternative Models for Source Data Verification (KCR)
KCR's presentation on alternative models for Source Data Verification (SDV). Risk-Based Monitoring (RBM) is evolving into a standard expectation for SDV and study management in general.
Overview of Risk-Based Monitoring in Clinical Trial Processes (IJTSRD)
Risk-based monitoring (RBM) has emerged as a transformative approach in clinical trial processes. This paper provides an overview of RBM and its impact on the field of clinical research. By moving away from traditional on-site monitoring and adopting a targeted and efficient approach, RBM has demonstrated numerous benefits in terms of cost effectiveness, data quality, and patient safety. This abstract summarizes the key findings discussed in the conclusion. The proactive identification and management of risks throughout the trial lifecycle have led to improved decision making, increased study participant compliance, and enhanced overall trial success rates. With advancing technology, RBM approaches are expected to evolve further, allowing for greater optimization and streamlining of clinical trial processes. The abstract concludes by emphasizing the potential of risk-based monitoring to shape the future of clinical research and contribute to the development of safe and effective therapies for patients worldwide. Kelam Himasri | Sankara Narayanan. K, "Overview of Risk-Based Monitoring in Clinical Trial Processes," International Journal of Trend in Scientific Research and Development (IJTSRD), ISSN 2456-6470, Volume 7, Issue 3, June 2023. URL: https://www.ijtsrd.com/papers/ijtsrd58586.pdf Paper URL: https://www.ijtsrd.com/pharmacy/pharmacy-practice/58586/overview-of-riskbased-monitoring-in-clinical-trial-processes/kelam-himasri
Risk Based Monitoring in Clinical Trials (ClinosolIndia)
Risk-based monitoring (RBM) is a monitoring strategy in clinical trials that aims to improve the quality and efficiency of data collection while reducing costs and burden on study participants. Rather than conducting monitoring activities at fixed intervals, RBM utilizes a risk assessment approach to identify areas of the study that are at higher risk of errors or deviations from the protocol and focuses monitoring efforts on those areas.
The RBM process begins with a risk assessment, which involves identifying potential risks to the study's data integrity, participant safety, and study conduct. This may include risks related to patient enrollment, data collection, adverse event reporting, or protocol compliance. Based on the risk assessment, the study team creates a risk management plan that outlines the monitoring strategy to be employed throughout the trial.
In RBM, monitoring activities are targeted to focus on the areas of the study that present the highest risk. For example, if a study has a high risk of data entry errors, the monitoring plan may include a more intensive review of data entry activities or require that data be entered in real-time, so errors can be identified and corrected more quickly.
RBM can be facilitated through several tools, such as centralized monitoring, key risk indicator (KRI) dashboards, or data analytics. Centralized monitoring allows for remote review of study data by a team of experts who can identify trends and issues more efficiently. KRIs are pre-defined metrics used to track performance and detect areas of concern, allowing for proactive management of risks. Data analytics can identify unusual patterns or outliers in the data, enabling the study team to focus on those areas of concern.
RBM is a dynamic process that involves ongoing evaluation of the study's risk profile and adjusting the monitoring strategy accordingly. By focusing monitoring efforts on the areas of the study that pose the highest risk, RBM can improve data quality and participant safety, while reducing monitoring costs and burden.
Best Practices to Risk Based Data Integrity at Data Integrity Conference, London (Bhaswat Chakraborty)
Data integrity (DI) can be implemented using several approaches. One of the most effective ways to implement DI is a risk-based approach. The speaker elaborates on this.
Journal for Clinical Studies: Close Cooperation Between Data Management and Biostatistics (KCR)
Every clinical trial is a source of multidimensional data, analyzed to answer questions on safety, efficacy, and more. Invalid or incomplete data may lead to invalid conclusions and wrong decisions. KCR's Biostatistician, Adrian Olszewski, highlights the importance of cooperation between data management and biostatistics in improving data quality by introducing both statistical knowledge and the ability to create specialized programmatic tools and advanced queries, giving a good foundation for deeper and faster data investigations. Read more in the article published in the October issue of the Journal for Clinical Studies (pp. 42-46).
Using Investigative Analytics to Speed New Drugs to Market (Cognizant)
Investigative analytics - covering exploratory data analysis (EDA) and inferential statistics - is a powerful, data-driven methodology for uncovering discrepancies in reports from clinical trials, and thus can help streamline and improve the trial process and accelerate the transition from molecule to medicine.
How To Optimize Your EDC Solution For Risk Based Monitoring (www.datatrak.com)
This presentation presents best training practices for leveraging EDC technology and risk-based monitoring to effectively and efficiently monitor clinical research.
Our focus is on the practical process of preparing your team to optimize the tools made available through an EDC solution.
This presentation is applicable to CRAs, clinical project managers, clinical data managers, regulatory compliance professionals, and those involved in the design and implementation of risk-based monitoring plans.
Risk Based Monitoring in Clinical Trials - Aishwarya Janjale (ClinosolIndia)
Risk-Based Monitoring (RBM) in clinical trials represents a departure from traditional, one-size-fits-all monitoring approaches. This innovative strategy tailors monitoring activities to the specific risks associated with a trial, optimizing resource utilization and enhancing data quality. This article explores the key principles, benefits, and challenges of RBM, illustrating its transformative impact on the landscape of clinical trial oversight.
Key Principles:
Risk Identification and Assessment:
RBM begins with a comprehensive assessment of potential risks to data integrity, patient safety, and study endpoints. These risks are identified based on factors such as study complexity, patient population, and investigational product characteristics.
Role of Clinical Data Management in Risk-Based Monitoring (ClinosolIndia)
Clinical Data Management (CDM) plays a significant role in the implementation of Risk-Based Monitoring (RBM) within clinical trials. RBM is an approach that focuses monitoring efforts on the areas of highest risk, thereby optimizing resource allocation, enhancing data quality, and ensuring patient safety. Here is how CDM contributes to RBM.
How to establish and evaluate clinical prediction models (Statswork)
A clinical prediction model can be used in various clinical contexts, including screening for asymptomatic illness, forecasting future events such as disease, and assisting doctors in their decision-making and health education. Despite the positive effects of clinical prediction models on practice, prediction modeling is a difficult process that necessitates meticulous statistical analysis and sound clinical judgment.
Read More With Us: https://bit.ly/3dxn32c
ICU Patient Deterioration Prediction: A Data-Mining Approach (cscpconf)
A huge amount of medical data is generated every day, which presents a challenge in analysing these data. The obvious solution to this challenge is to reduce the amount of data without information loss. Dimension reduction is considered the most popular approach for reducing data size and also to reduce noise and redundancies in data. In this paper, we investigate the effect of feature selection in improving the prediction of patient deterioration in ICUs. We consider lab tests as features. Thus, choosing a subset of features would mean choosing the most important lab tests to perform. If the number of tests can be reduced by identifying the most important tests, then we could also identify the redundant tests. By omitting the redundant tests, observation time could be reduced and early treatment could be provided to avoid the risk. Additionally, unnecessary monetary cost would be avoided. Our approach uses state-of-the-art feature selection for predicting ICU patient deterioration using the medical lab results. We apply our technique on the publicly available MIMIC-II database and show the effectiveness of the feature selection. We also provide a detailed analysis of the best features identified by our approach.
Data Management and Analysis in Clinical Trials (ijtsrd)
Data management and analysis play a critical role in the successful conduct of clinical trials. Proper collection, validation, and handling of data are essential for ensuring the reliability and integrity of study findings. Data management involves the design and implementation of data capture tools, such as electronic case report forms (eCRFs), to efficiently collect and store clinical data. Additionally, data analysis is a crucial step that involves applying statistical methods to extract meaningful insights from the collected data. This paper provides an overview of the key components of data management and analysis in clinical trials, highlighting the importance of adherence to data standards, ensuring data quality, and maintaining data security. Effective data management and analysis not only lead to robust study outcomes but also contribute to the overall advancement of medical knowledge and patient care. S. Reddemma | Chetana Menda | Manoj Kumar, "Data Management and Analysis in Clinical Trials," International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN 2456-6470, Volume 7, Issue 4, August 2023. URL: https://www.ijtsrd.com/papers/ijtsrd59667.pdf Paper URL: https://www.ijtsrd.com/pharmacy/pharmacology-/59667/data-management-and-analysis-in-clinical-trials/s-reddemma
Original Research

Extended Risk-Based Monitoring Model, On-Demand Query-Driven Source Data Verification, and Their Economic Impact on Clinical Trial Operations

Vadim Tantsyura, MS, MA, DrPH[1]; Imogene McCanless Dunn, PhD[2]; Joel Waters, MSCR, MBA[3]; Kaye Fendt, MSBS[4]; Yong Joong Kim, MS[1]; Deborah Viola, PhD[5]; and Jules Mitchel, MBA, PhD[1]

[1] Target Health Inc, New York, NY, USA
[2] vTv Therapeutics LLC, High Point, NC, USA
[3] PAREXEL International, Durham, NC, USA
[4] Duke Clinical Research Institute, Durham, NC, USA
[5] Center for Regional Healthcare Innovation, Westchester Medical Center, Hawthorne, NY, USA

Submitted 11-May-2015; accepted 22-Jun-2015.
Corresponding Author: Vadim Tantsyura, MS, MA, DrPH, Target Health Inc, 261 Madison Avenue, NY, USA. Email: vtantsyura@targethealth.com
Therapeutic Innovation & Regulatory Science, 1-9. © The Author(s) 2015. Reprints and permission: sagepub.com/journalsPermissions.nav. DOI: 10.1177/2168479015596020. tirs.sagepub.com
Abstract
Background: Computer-aided data validation enhanced by centralized monitoring algorithms is a more powerful tool for data cleaning compared to manual source document verification (SDV). This fact led to the growing popularity of risk-based monitoring (RBM) coupled with reduced SDV and centralized statistical surveillance. Since RBM models are new and immature, there is a lack of consensus on practical implementation. Existing RBM models' weaknesses include (1) mixing data monitoring and site process monitoring (ie, micro vs macro level), making the models more complex, obscure, and less practical; and (2) artificial separation of RBM from data cleaning, leading to resource overutilization. The authors view SDV as an essential part (and extension) of the data-validation process.

Methods: This report offers an efficient and scientifically grounded model for SDV. The innovative component of this model is in making SDV ultimately a part of the query management process. Cost savings from reduced SDV are estimated using a proprietary budget simulation tool, with percent cost reductions presented for four study sizes in four therapeutic areas.

Results: It has been shown that an "on-demand" (query-driven) SDV model implemented in clinical trial monitoring could result in cost savings from 3% to 14% for smaller studies and 25% to 35% or more for large studies.

Conclusions: (1) High-risk sites (identified via analytics) do not necessarily require a higher percent SDV. While high-risk sites require additional resources to assess and mitigate risks, in many cases these resources are likely to be allocated to non-SDV activities such as GCP, training, etc. (2) It is not necessary to combine SDV with GCP compliance monitoring. Data validation and query management must be at the heart of SDV, as this makes the RBM system more effective and efficient. Thus, focusing SDV effort on queries is a promising strategy. (3) The study size effect must be considered in designing the monitoring plan, since the law of diminishing returns dictates focusing SDV on "high-value" data points. The relatively lower impact of individual errors on the study results leads to the realization that larger studies require less data cleaning, and most data (including most critical data points) do not require SDV. Subsequently, the most significant economy is expected in larger studies.
Keywords
risk-based monitoring, RBM, source document verification, SDV, data quality, site monitoring, clinical trials
Background: Current RBM Process and Its Main Flaws
According to TransCelerate, the current RBM (Figure 1) "approach includes early and recurrent risk assessment, identification of Critical Data to be monitored for risk mitigation, Off-site and Central Monitoring as the foundation, and targeting of On-site Monitoring visits."[1]

Suspicious ("high risk") subjects or sites are determined by statistical algorithms as a part of the central monitoring process, based on the FDA-recommended approach to focus on "sites with data anomalies or a higher frequency of errors, protocol violations, or dropouts relative to other sites."[2] Grimes,[3] Burgess,[4] Landray,[5] and Dudley[6] provided examples of data visualization tools facilitating the search for high-risk sites and subjects. Lindblad et al[7] also provided a list of criteria for high-risk sites. Ning Li[8] reported details of statistical data processing in a typical case that involves central monitoring.
Methods underlying the current RBM discussion require additional implementation details to support consistency and optimization of the SDV processes. The weaknesses of the current methodology are as follows:

1. Separation of RBM from data cleaning. In actuality, the monitoring process is part of data-cleaning efforts and should be examined as such.

2. Mixing data monitoring and site process monitoring (ie, micro vs macro level) makes the model more complex, obscure, and less practical. "For example, if the Monitor identifies a potential issue with lack of Investigator involvement, there is no need to escalate the amount of SDV since it is not a transcription issue" (TransCelerate[1]).

3. Too many risk and quality factors make it difficult to implement. Furthermore, analytical tools for site-specific risk identification generate "lots of false positives," which is an additional source of inefficiency (Torche[9]). Perfection and complexity are enemies of good and practicality in this case.

4. "Risk scores" are often calculated as continuous variables when dichotomous or categorical assessment is more practical. Continuous variables create a focus on outliers and cause subjective categorization into "high concern," "moderate concern," "mild concern," and "no concern" data, and these probabilistic "risk" models are not cross-validated (Lindblad et al[7]).

5. Current probabilistic models focus on site-level risks. However, to eliminate waste, a more granular discussion of data point-level risks is necessary.

6. The majority of current central monitoring algorithms are heavily dependent on sample size, making them too insensitive to be practical for smaller trials, especially at the beginning of a trial, when the data quality (DQ) assessment is expected to occur (even for studies with hundreds of subjects) (Torche[9]). (Even statistics has its limits when the amount of data is limited!) This inherent limitation of risk identification algorithms will limit their usefulness and impede their "market penetration." These algorithms are used primarily because (1) no better alternative is available, (2) clinical research operations staff are not trained to detect these fundamental deficiencies (Alsumidaie[10]), and (3) clinical operations are resistant to relying on statisticians (Eric[11]).

7. Prospective knowledge of "what is / what is not subject to SDV" leads to reduced effort by study site personnel and lowers data quality.
Extended RBM Model

With the advent of the RBM paradigm, it was no longer necessary to combine SDV with GCP compliance monitoring. The TransCelerate white paper was an important step in this direction and welcomed the separation of "critical data" and "processes," the main building blocks of site monitoring.[1] We agree with TransCelerate's argument that such division "enables companies to prioritize the high-value task of compliance checks and de-prioritize the low-value task of checking for transcription errors." However, this division at the data and process level is not trivial and adds complexity.

In a similar fashion, the Extended RBM model (Figure 2) differentiates between the data point (micro) level and the site/process (macro) level of monitoring. At the data point level (illustrated by the feedback loop at the top of Figure 2), data monitoring is driven by queries and frequently results in data point-level changes. At the site level (illustrated by the feedback loop at the bottom of Figure 2), on the other hand, monitoring is substantially different: it results in (1) identifying "high-risk" sites and (2) mitigating such risks via additional site personnel training or process modification.

As Figure 2 illustrates, the process flow includes three major steps and three sub-steps; the first three constitute central monitoring. (For a detailed discussion of the relevant terminology, see Appendix A. A minimal illustrative sketch of the flow follows the list.)

1. Central monitoring (team effort), including:
   a. data validation/edit checks (by data managers [DMs]),
   b. statistical data surveillance (by statisticians and DMs), and
   c. medical review (by qualified medical personnel);
2. Query management (by DMs, CRAs, and site staff); and
3. On-site monitoring (by CRAs).
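To make the loop structure concrete, the following is a minimal, illustrative Python sketch of the flow described above. All names (DataPoint, central_monitoring, the rule stubs, the 8% threshold) are assumptions made for this example; they are not taken from the paper or from any specific RBM tool, and the stubs stand in for real edit checks, statistical surveillance, and medical review.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class DataPoint:
    site_id: str
    field_name: str
    value: str
    queried: bool = False        # set by central monitoring (levels 1-3)
    sdv_required: bool = False   # "on-demand" SDV: only queried points are verified

def fails_edit_check(dp: DataPoint) -> bool:
    # Level 1 stub: e.g., a required value is missing
    return dp.value == ""

def flagged_by_surveillance(dp: DataPoint) -> bool:
    # Level 2 stub: statistical surveillance would flag outliers/anomalies here
    return False

def flagged_by_medical_review(dp: DataPoint) -> bool:
    # Level 3 stub: a medical reviewer would raise a manual query here
    return False

def central_monitoring(data: List[DataPoint]) -> None:
    """Levels 1-3: each level may raise a query against an individual data point."""
    for dp in data:
        if fails_edit_check(dp) or flagged_by_surveillance(dp) or flagged_by_medical_review(dp):
            dp.queried = True

def query_management(data: List[DataPoint]) -> None:
    """SDV becomes a sub-step of query resolution: verify on-site only the queried points."""
    for dp in data:
        dp.sdv_required = dp.queried

def site_level_monitoring(data: List[DataPoint], threshold: float = 0.08) -> Dict[str, bool]:
    """Macro level: flag 'high-risk' sites; mitigation is training or process change, not more SDV."""
    risk = {}
    for site in {dp.site_id for dp in data}:
        points = [dp for dp in data if dp.site_id == site]
        risk[site] = sum(dp.queried for dp in points) / len(points) > threshold
    return risk

if __name__ == "__main__":
    study = [DataPoint("site-01", "SBP", "120"), DataPoint("site-01", "DBP", "")]
    central_monitoring(study)
    query_management(study)
    print([dp.sdv_required for dp in study], site_level_monitoring(study))
```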
Figure 1. Traditional risk-based monitoring (RBM) model: planning/risk factor identification, centralized monitoring, targeting of sites/data points, and on-site monitoring.
This model demonstrates the role and power of data validation, statistical data surveillance, clinical/medical review, query resolution, and on-site monitoring leading to optimal resource allocation for data error correction. By incorporating data validation into the data quality assessment at the earliest stages of review and SDV, this model helps to uncover unnecessary redundancies and provides justification for scaling down the SDV effort toward an optimal level.[12] The three central monitoring levels depicted in Figure 2 aim not only to reduce data errors but also to identify protocol deviations, scientific misconduct, and GCP noncompliance, and to ensure that the protocol is being followed and the collected data are in accordance with protocol objectives. The distinction between these central monitoring levels is not in their objectives but in the utilization of different tools and skill sets to accomplish the goals.
Finally, the Extended RBM model reflects the more complex and intelligent nature of RBM. It demonstrates the increasing role of those who are trained in interpreting "errors that matter" (data experts) in planning and executing monitoring activities. Thus, to take full advantage of RBM, some job roles need to be redefined and training is required. Furthermore, this model prompts a change in how quality metrics are used: the query rate will likely lose its appeal as less informative relative to such metrics as query effectiveness rates and the rates of data modifications (for multiple categories: [1] overall data modification rate, [2] SDV-induced data modifications, and [3] changes in key variables of analysis). Most importantly, the Extended RBM model also demonstrates the importance of the "query" in the data cleaning and monitoring process and suggests limiting SDV to the data points that are subject to query.
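As a rough illustration of the quality metrics mentioned above, the sketch below computes a query effectiveness rate and data modification rates from a simple query log. The record layout and field names are assumptions made for this example, not a format defined by the authors.

```python
from typing import Dict, List

def query_metrics(query_log: List[Dict], n_data_points: int) -> Dict[str, float]:
    """Compute illustrative quality metrics from a query log.
    Each record is assumed to look like:
    {"field": "SBP", "led_to_data_change": True, "sdv_induced": False, "key_variable": True}
    """
    changes = [q for q in query_log if q["led_to_data_change"]]
    return {
        # share of issued queries that actually produced a data correction
        "query_effectiveness_rate": len(changes) / len(query_log) if query_log else 0.0,
        # modification rates relative to all data points, per the categories in the text
        "overall_data_modification_rate": len(changes) / n_data_points,
        "sdv_induced_modification_rate": sum(q["sdv_induced"] for q in changes) / n_data_points,
        "key_variable_modification_rate": sum(q["key_variable"] for q in changes) / n_data_points,
    }

example_log = [
    {"field": "SBP", "led_to_data_change": True, "sdv_induced": False, "key_variable": True},
    {"field": "AE term", "led_to_data_change": False, "sdv_induced": False, "key_variable": False},
]
print(query_metrics(example_log, n_data_points=1000))
```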
The subsequent discussion focuses primarily on the SDV and other data point-level error identification/correction components of the monitoring activities illustrated by the Extended RBM model, while leaving the other (process/site-level) components out of scope.
New/Simplified Risk-Based SDV Method: Laser Precision/Minimum Invasion

Our earlier paper (Tantsyura et al[12]) and the evidence above suggest that data validation and query management must be at the heart of SDV, as it makes the RBM system effective and efficient. Computer-aided data validation (enhanced by centralized monitoring algorithms) is an inherently and appreciably more powerful tool for data cleaning relative to manual SDV (Tantsyura et al,[12] Scheetz et al[13]). Furthermore, since data point-level issue identification by the CRA is not critical (TransCelerate,[1] FDA,[2] Mitchel 2011,[14] Bakobaki,[15] Mitchel 2014[16]), we advocate for a model in which SDV serves as the QC step for the "highly suspicious" data points that are identified during the previous (centralized monitoring) steps, such as data validation, statistical data review and surveillance, and medical review.
Figure 2. Data cleaning and monitoring flow: Extended risk-based monitoring (RBM) model. (The diagram shows study data feeding central monitoring at three levels: Level 1, data validation; Level 2, statistical data surveillance (not limited to DQ indicators); Level 3, medical review, including AE-dedicated review. A single discrepancy leads to a query and an expected data change, whereas a trend identifies a target ("high-risk") site and an expected site/process change, addressed through site training and process changes. Site monitoring comprises SDV and the non-SDV components of on-site monitoring, including GCP and SDR.)
Finally, since queries typically involve 7% to 8% of data points (TransCelerate[1]), focusing SDV efforts on queries has the potential to reduce the SDV effort by 92% to 93% without a noticeable increase in risk.
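The arithmetic behind this estimate is straightforward; the short sketch below simply restates it, assuming (as the text does) that on-demand SDV touches only the queried data points.

```python
def on_demand_sdv_fraction(query_rate: float) -> float:
    """Fraction of data points still subject to SDV when SDV is limited to queried points."""
    return query_rate

for query_rate in (0.07, 0.08):
    reduction = 1.0 - on_demand_sdv_fraction(query_rate)
    print(f"query rate {query_rate:.0%} -> SDV effort reduced by about {reduction:.0%}")
# query rate 7% -> SDV effort reduced by about 93%
# query rate 8% -> SDV effort reduced by about 92%
```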
The proposed process and implementation approach is presented in Tables 1 and 2 and Appendix B. The first step in the proposed approach is "planning," which includes documenting the key data points. The next step is examination of these data points from the potential data discrepancy perspective and dividing them into three categories: (a) those that can and will be "cleaned" exclusively via edit checks and other statistical, computer-aided methods; (b) those that will require manual review and manual queries; and (c) those that cannot be cleaned using methods a and b but are important enough that they will require SDV to identify potential discrepancies (some study eligibility criteria, for example). In cases where errors are easily detectable by computer algorithms (edit checks, method a), there is no need for SDV other than of queried data points. Subsequently, if "medical review" (method b) is perceived as being the most effective in identifying data discrepancies for particular data points, then these data points should be crossed off the SDV list and SDV performed only for queried data. In cases where error detection is noncomputerizable or errors cannot be identified via remote medical review (various types of protocol violations, for example; method c), the study team should develop (prospectively) a list of data points that require thorough investigation (including manual SDV). Finally, the impact of the study size must be assessed.
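Below is a minimal sketch of this three-way categorization, assuming the study team has already tagged each critical data point with how its discrepancies can be detected; the function and argument names are illustrative only, not part of the published model.

```python
def sdv_approach(critical: bool,
                 detectable_by_edit_checks: bool,
                 detectable_by_medical_review: bool) -> str:
    """Return the SDV approach for one data point, mirroring methods (a), (b), (c) in the text."""
    if not critical:
        return "0% SDV, or SDV of queried points only (edit checks suffice)"
    if detectable_by_edit_checks:          # method (a)
        return "SDV only if queried (cleaned via edit checks / statistical surveillance)"
    if detectable_by_medical_review:       # method (b)
        return "SDV only if queried (cleaned via medical review)"
    return "100% SDV (noncomputerizable but important, e.g., some eligibility criteria)"  # method (c)

# Example: an eligibility criterion that no programmatic check or remote review can confirm
print(sdv_approach(critical=True,
                   detectable_by_edit_checks=False,
                   detectable_by_medical_review=False))
```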
Contrary to the historical monitoring approach, in which SDV preceded data validation, in the Extended RBM model SDV serves primarily as a logical extension, or a sub-step, of the query management process.
Table 1. Categories of data by risk of nonidentifying discrepancies without SDV.

Method | Critical Data | Primary Method of Data Cleaning | SDV Approach | Examine Study Size Effect on DQ
A | Yes | Edit checks and statistical data surveillance | SDV only queries | Consider further reduction of SDV if study size is large enough
B | Yes | Medical review | SDV only queries | Consider further reduction of SDV if study size is large enough
C | Yes | Noncomputerizable but important enough (some inclusion criteria, some protocol violations) | 100% SDV | -
- | No | Only edit checks | 0% SDV or queries | Not applicable

Abbreviations: DQ, data quality; SDV, source document verification.
Table 2. Proposed SDV approach.

Study Size, N (Patients Enrolled)^a | Recommended % SDV^b | SDV Targets
Ultra-small (0-30) | 100 | 100% SDV of all data
Small (31-100) | Typically 10-20 | All queries; 100% SDV of screening and baseline visits^c; AEs/SAEs (TBD)
Medium (101-1000) | Typically 5-7 | All queries (limiting SDV to queries leading to data changes could be considered); ICF, Incl/Excl, TBD; SAEs (TBD)^d
Large (1000+) | Typically 0-1 | TBD (‘‘SDV of key queries’’^e is recommended; ‘‘remote SDV’’^f and ‘‘no SDV’’ are viable alternatives)

Abbreviations: AEs, adverse events; ICF, informed consent form; SAEs, serious adverse events; SDV, source document verification; TBD, to be decided on a case-by-case basis.
^a Ranges are illustrations only.
^b With the exception of ultra-small studies, the % SDV can be estimated as 1000/N (%).
^c Monitoring of screening and baseline visits for small and medium-sized studies is driven by these factors: (a) ICF and eligibility criteria data are captured at screening; (b) losing the baseline means losing the patient for analysis; and (c) early error detection allows early interventions, such as additional training or adjustment of the process.
^d The perceived value of this step is expected to diminish over time.
^e Key queries must be determined by study teams on a case-by-case basis (prospectively, if possible). For example, the team might decide to SDV only the queries issued on the primary and secondary analysis variables.
^f Remote SDV is a less expensive monitoring technique in which images of (certain) source documents are reviewed remotely; cost savings come primarily from the reduction of travel and on-site monitoring cost and time. This SDV approach, in which study site personnel upload the ICF for remote access by the CRA, is rarely used but is gaining popularity (Dillon and Zhao18).
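A minimal sketch of the study-size-driven % SDV estimate from footnote b is shown below, assuming the illustrative size ranges above; the cap at 100% and the example study sizes are assumptions for illustration.

```python
def recommended_sdv_percent(n_enrolled: int) -> float:
    """Estimate the recommended % SDV from Table 2.

    Ultra-small studies (<= 30 subjects) retain 100% SDV; otherwise the
    percentage is approximated as 1000/N (footnote b), capped at 100%.
    """
    if n_enrolled <= 30:
        return 100.0
    return min(100.0, 1000.0 / n_enrolled)

for n in (25, 80, 150, 5000):
    print(n, round(recommended_sdv_percent(n), 1))
# 25 -> 100.0, 80 -> 12.5 (small: typically 10-20),
# 150 -> 6.7 (medium: typically 5-7), 5000 -> 0.2 (large: typically 0-1)
```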
At the same time, this proposal does not contradict the GCDMP recommendation that ‘‘source data verification (SDV) may be used to identify errors that are difficult to catch with programmatic checks’’ (GCDMP17); it simply reduces the weight given to this step in line with its real value. Table 2 provides the implementation details for the proposed on-demand, query-driven approach to SDV.
First, driven by the law of diminishing returns, this model reduces SDV from 100% for ultra-small studies to virtually 0% for large studies, intelligently eliminating waste. (For a detailed discussion of the study size effect, see Tantsyura et al.12) Second, the model addresses monitors’ concern that reduced-SDV approaches (especially when the SDV plan is prospectively known to site staff) might lead to lower data quality: queries (which drive the SDV process) arise fairly unpredictably, so site staff cannot anticipate which data points will be monitored, and the model therefore does not weaken the incentive for careful data collection. Moreover, human error is a limitation inherent in any system; reducing SDV means less reliance on human review, lower variability, and higher data quality.
Third, reduced SDV creates an opportunity for ‘‘remote data review’’ for medium to large studies, leading to reduced travel time and cost. Fourth, this model provides the flexibility to adopt the recommendations of TransCelerate’s position paper: ‘‘use of Risk Indicators and Thresholds—identification of key performance indicators in a process control system (PCS) environment to track site and study performance; and adjustment of monitoring activities based on the issues and risks identified throughout the study—adaptive, real-time modification of SDV and other monitoring tools.’’1
Finally, a tailored prospective monitoring plan, which constitutes a significant change relative to existing processes and organizational habits, is the most crucial component of successful RBM implementation. Cross-functional collaboration, education, and formal change management are essential to overcome organizational resistance and accelerate RBM adoption.
Economic Impact
Figure 3 lays out the overall monitoring effort reduction asso-
ciated with the proposed SDV method relative to 100% SDV.
Monitoring effort is displayed as a combination of (1) GCP
compliance/process monitoring, (2) source document review
(SDR), (3) SDV, and (4) nominal increase in central monitor-
ing planning efforts and training, including assurance that the
site personnel are trained and following the protocol. The SDV
category shows the most dramatic reduction. In addition to travel expenses, the savings include CRA travel time and CRA on-site time that is saved because fewer visits are required to SDV the reduced percentage of data (less than 8% on average).
Table 3 presents projected cost savings estimated using a leading contract research organization’s (CRO’s) proprietary price estimation tool. Cost savings relative
to 100% SDV were modeled for the lower and upper limits of
3 study subject sample size ranges in 4 therapeutic areas
(oncology, cardiovascular, neurology, and endocrine). Study
variables for the 48 scenarios included the following: screening
factor, enrollment rate, CRF pages per subject, SDV time per
CRF page, study subjects per site, study timeline periods (eg,
treatment period), etc. Only percentage reductions are reported.
For more details, please contact the authors.
The cost simulations presented in Table 3 (together with data presented by DiMasi,19 Adams,20 and Kaitin21) suggest total industry savings in excess of 18% of total US pharmaceutical clinical research spending (US$9 billion per year). Contact the lead author for calculation details.
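For readers who want to reproduce the scenario grid behind Table 3 (footnote d: 2 range limits × 3 size ranges × 4 therapeutic areas × 2 SDV levels = 48 scenarios), a sketch of the enumeration is shown below. The per-scenario cost model itself is a CRO’s proprietary tool, so no savings function is included; the upper limit assumed for the ‘‘Large’’ range is illustrative.

```python
from itertools import product

# Enumerate the 48 scenarios described in Table 3, footnote d. The savings
# themselves were produced with a proprietary CRO price estimation tool,
# which is not reproduced here.

size_ranges = {
    "Small (31-100)": (31, 100),
    "Medium (101-1000)": (101, 1000),
    "Large (1000+)": (1000, 5000),   # upper limit for "Large" is an assumption
}
therapeutic_areas = ["oncology", "cardiovascular", "neurology", "endocrine"]
sdv_levels = ["reduced SDV", "100% SDV"]

scenarios = [
    {"size": label, "n_subjects": n, "therapeutic_area": ta, "sdv_level": sdv}
    for (label, limits), ta, sdv in product(size_ranges.items(),
                                            therapeutic_areas, sdv_levels)
    for n in limits                   # lower and upper range limits
]
print(len(scenarios))                 # 3 x 2 x 4 x 2 = 48
```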
Figure 3. Monitoring effort reduction (large hypothetical study). [Bar chart comparing total monitoring effort under the traditional (100% SDV) model and the new (query-driven SDV) model, broken down by category: GCP compliance/process monitoring (no effort reduction), SDV, SDR, and additional effort (planning/central monitoring). The vertical axis shows effort from 0% to 100%.]
Conclusion
The proposed model differs from traditional RBM in that it merges data validation and centralized monitoring into a 3-level process of data validation, statistical data surveillance, and clinical/medical review, designed to ensure high data quality, identify protocol deviations and signs of scientific misconduct or Good Clinical Practice (GCP) noncompliance, and ensure that the data accord with the protocol objectives. These three levels utilize different tools and skill sets to accomplish these goals.
It is important to realize that ‘‘high-risk’’ sites (identified via analytics) do not necessarily require a higher percentage of SDV. High-risk sites will require additional resources to assess and mitigate risks; however, in many cases these resources are likely to be allocated to non-SDV activities (such as GCP compliance monitoring, SDR, and training).
Under a ‘‘hierarchy of errors’’ and an ‘‘absence of errors that matter’’ definition of data quality, data points identified as potentially discrepant (ie, subject to queries) carry the highest data-point-level value. Focusing SDV effort on queries is therefore a promising strategy, and further optimization is possible by reducing the number of ‘‘noncritical’’ queries once DMs and clinical operations staff are sufficiently trained and understand the query source and content.
The prevailing belief that all critical data require SDV is unfounded. The study size effect must be considered when designing a monitoring plan, since the law of diminishing returns dictates focusing SDV on ‘‘high-value’’ data points.
As with the recommended SDV percentage, the most significant economies are expected in large studies: expected savings from the proposed method reach 43% to 63% of monitoring cost (22%-35% of the total study budget). For small studies (approximately 100 subjects), the expected savings are smaller, 16% to 33% of monitoring cost (3%-14% of the total study budget).
There is plenty of important work left for monitors. The new paradigm offers less travel and more focus on the science and the site while keeping the CRA accountable for the site’s overall quality and productivity. In addition to queries, focusing monitoring effort on training and protocol adherence, identification of protocol violations, identification of missing data and unreported events, and other data that are not easy to review by computer (eg, ICF and some eligibility criteria) is a better use of a CRA’s time. This proposal is consistent with the FDA RBM guidance2 and will ultimately lead to overall higher data quality.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to
the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, author-
ship, and/or publication of this article.
Note
i. For example, one of the popular tools on the market today requires 3 subjects per site, with 15 to 20 centers as a minimum (with the exception of oncology).
References
1. TransCelerate. Position paper: Risk-based monitoring methodology.
2013. http://www.transceleratebiopharmainc.com/wp-content/
uploads/2013/10/TransCelerateRBM-Position-Paper-FINAL-
30MAY2013.pdf. Accessed April 16, 2015.
2. Food and Drug Administration. Guidance for industry: oversight
of clinical investigations—a risk-based approach to monitoring.
August 2013. http://www.fda.gov/downloads/Drugs/ . . . /Gui-
dances/UCM269919.pdf. Accessed April 16, 2015.
Table 3. Estimated cost savings.

Simulated cost savings relative to 100% SDV, %^c,d,e (monitoring cost reduction / total trial cost reduction)

Study Size, N^a | Recommended % SDV^b | Hypothetical Typical Oncology Study | Hypothetical Typical CV Study | Hypothetical Typical CNS Study | Hypothetical Typical Endocrine/Metabolic Study
Ultra-small (0-30) | 100 | 0 | 0 | 0 | 0
Small (31-100) | 15 | 26-29 / 7-14 | 24-33 / 5-12 | 21-30 / 4-11 | 16-23 / 3-8
Medium (101-1000) | 5-7 | 49-52 / 22-31 | 46-53 / 14-26 | 40-44 / 13-21 | 38-42 / 12-23
Large (1000+) | 0-1 | 62-63 / 34-35 | 58-59 / 29-30 | 50-51 / 26-27 | 43-44 / 22-23

Abbreviations: CNS, central nervous system; CV, cardiovascular; SDV, source document verification.
^a Ranges are illustrations only.
^b Midpoints were used to calculate cost savings. An exclusively ‘‘paper source’’ is assumed for the calculations. When ePRO, DDE, EMR, or other types of e-Source are used, SDV is considered to be eliminated for them.
^c Travel expenses are the primary cost driver for site monitoring.
^d Low/high range limits (n = 2) × sample size ranges (n = 3) × therapeutic areas (n = 4) × SDV levels (n = 2) = 48 scenarios.
^e Scenario variables were developed as simple averages from a sample of studies available to the research team. Note: the total study cost excluded investigator grants, central laboratory, and study product preparation and distribution.
3. Grimes I. Leveraging statistical tools when designing risk-based
monitoring plans. Presented at: CBI’s Risk-Based Approaches
to Clinical Investigations; April 11, 2012.
4. Burgess M. Less is more: risk-based monitoring of site performance.
ICON Insight, Vol 13, May 2013. http://www.iconplc.com/icon-
files/insight-newsletter/June13/lessismore.html. Accessed April
16, 2015.
5. Landray M. Clinical trials: rethinking how we ensure quality. Pre-
sented at: DIA/FDA webinar; July 22, 2013.
6. Dudley B. Risk-based monitoring: operational application. Pre-
sented at: DIA webinar; May 15, 2014.
7. Lindblad AS, Manukyan Z, Purohit-Sheth T, et al. Central site
monitoring: results from a test of accuracy in identifying trials and
sites failing Food and Drug Administration inspection. Clin
Trials. 2014;11:205-217.
8. Ning L. Clinical trial monitoring, auditing and inspection work-
shop—FDA, SFDA and industry perspective. Presented at: 2nd
DIA China Annual Meeting; May 16-19, 2010. http://www.
diahome.org/productfiles/22993/ws/3/w3%2004_ning%20li.pdf.
Accessed April 16, 2015.
9. Francois T. CluePoint. Presented at: Annual DIA Conference
(session 109); June 16, 2014.
10. Alsumidaie M. The emergence of the centralized monitor.
Appl Clin Trials. 2013 Nov. http://www.appliedclinicaltrial-
sonline.com/emergence-centralized-monitor. Accessed April
16, 2015.
11. Eric A. Presented at: Annual DIA Conference (session 109); June
16, 2014.
12. Tantsyura V, McCanless Dunn I, Fendt K, Kim YJ, Waters J,
Mitchel J. Risk-based monitoring: a closer look at source docu-
ment verification (SDV), queries, study size effects and data qual-
ity. Therapeutic Innovation Regulatory Science. DOI:10.1177/
2168479015586001. Published online May 25, 2015.
13. Sheetz N, Wilson B, Benedict J, et al. Evaluating source data ver-
ification as a quality control measure in clinical trials. Therapeu-
tic Innovation Regulatory Science. 2014;48:671-80.
14. Mitchel JT, Kim YJ, Choi J, et al. Evaluation of data entry errors
and data changes to an electronic data capture clinical trial data-
base. Drug Inf J. 2011;45:421-430.
15. Bakobaki JM, Rauchenberger M, Joffe N, McCormack S, Stenning
S, Meredith S. The potential for central monitoring techniques
to replace on-site monitoring: findings from an international
multi-centre clinical trial. Clin Trials. 2012;9:257-264.
16. Mitchel JT, Kim JY, Hamrell MR, et al. Time to change the clin-
ical trial monitoring paradigm: results from a multicenter clinical
trial using a quality by design methodology, risk-based monitor-
ing and real-time direct data entry. Appl Clin Trials. 2014.
http://www.appliedclinicaltrialsonline.com/time-change-clinical-
trial-monitoring-paradigm. Accessed April 16, 2015.
17. Society for Clinical Data Management. Good Clinical Data Management Practices (GCDMP): Measuring Data Quality chapter. 2008.
18. Dillon C, Zhao W. A comparison of the effectiveness of on-site
and central monitoring activities across six phase III multi-
center clinical trials. Presented at: SCT conference; May 20, 2014.
19. DiMasi JA, Hansen RW, Grabowski HG. The price of innovation:
new estimates of drug development costs. J Health Econ. 2003;
22:151-185.
20. Adams CP, Brantner W. Estimating the cost of new drug develop-
ment: is it really 802 million dollars? Health Aff (Millwood).
2006;25:420-428.
21. Kaitin KI. Deconstructing the drug development process: the new
face of innovation. Clin Pharmacol Ther. 2010;87:356-361.
22. ICH Harmonised Tripartite Guideline: Statistical Principles for
Clinical Trials E9. February 1998. http://www.ich.org/filead
min/Public_Web_Site/ICH_Products/Guidelines/Efficacy/E9/
Step4/E9_Guideline.pdf. Accessed November 09, 2014.
Appendix A. Terminology
Data Validation utilizes real-time on-line edit checks (approx-
imately 90%) and off-line post hoc edit checks (approximately
10%) programmed in SAS or other reporting or visualization
tools. The process is owned by DM from specifications through
execution with some help from other functions. It is the most
traditional and standard part of the process.
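For illustration, a real-time edit check of the kind described above might look like the sketch below; the field names and the plausible range are hypothetical, not drawn from any particular study.

```python
from datetime import date

# Hypothetical real-time edit checks; field names and ranges are illustrative.
def edit_checks(record: dict) -> list:
    queries = []
    sbp = record.get("systolic_bp")
    if sbp is not None and not 60 <= sbp <= 260:
        queries.append(f"Systolic BP {sbp} mmHg is outside the plausible range (60-260); please verify.")
    visit_date = record.get("visit_date")
    if visit_date is not None and visit_date > date.today():
        queries.append("Visit date is in the future; please correct or clarify.")
    return queries

print(edit_checks({"systolic_bp": 400, "visit_date": date(2015, 1, 10)}))
```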
Statistical Data Surveillance is scientifically based and can be viewed as data validation on steroids. Complex data mining algorithms are identified and programmed under the leadership of statisticians. Interpretation of the results might require statistical and clinical expertise before a query is issued (by the DM) or other actions are recommended.
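A toy example of the kind of aggregate check that falls under statistical data surveillance is sketched below; the real algorithms are study specific and statistician led, and the site rates and the 1.5 standard deviation threshold here are invented for illustration.

```python
from statistics import mean, stdev

# Toy surveillance check: flag sites whose AE reporting rate per subject
# deviates markedly from the study-wide mean. Data and threshold are invented.
ae_rate_per_subject = {"site_01": 1.8, "site_02": 2.1, "site_03": 0.2,
                       "site_04": 1.9, "site_05": 2.0, "site_06": 4.5}

rates = list(ae_rate_per_subject.values())
mu, sd = mean(rates), stdev(rates)
flagged = {site: rate for site, rate in ae_rate_per_subject.items()
           if abs(rate - mu) > 1.5 * sd}
print(flagged)   # {'site_06': 4.5} -> candidate for statistical/clinical review
```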
Medical Review is used to address the specific areas where even complex algorithms fall short. These are areas of highly specialized medical knowledge, where modern programming capabilities are insufficient and the task requires review of the data by a medical expert. The most common examples of such review are review of Suspected Adverse Drug Reaction reports (also called CIOMS forms) and review of listings of adverse events (AEs) against concomitant medications and medical history in order to identify underreported AEs.
Query management is defined (in CDISC Clinical Research
Glossary, version 6) as ‘‘ongoing process of data review, discre-
pancy generation, and resolving errors and inconsistencies that
arise in the entry and transcription of clinical trial data.’’ Query
itself is defined by CDISC Clinical Research Glossary (version
6) as ‘‘a request for clarification on a data item collected for a
clinical trial; specifically a request from a sponsor or sponsor’s
representative to an investigator to resolve an error or inconsis-
tency discovered during data review.’’ For the purpose of this
discussion, it is important to realize that not all queries are cre-
ated equal—some of them lead to change from ‘‘erroneous’’ to a
‘‘true’’ value and some lead to no change. Some of them involve
critical data points that might impact the study results, and some have no impact on the study results. Regardless of this distinction, there are four reasons why the query is the most critical instrument
of the modern data cleaning process. First, query is a focal point
of the detective work by DMs and CRAs. If a data point is ‘‘lead-
ing to query,’’ it is 30 to 100 times (our estimate) more likely to
be erroneous (aka ‘‘risky’’) than non–‘‘leading-to-query’’ data
points. Second, query is a very effective data correction tool.
If implemented properly, it leads to data corrections in 40%-
80% of cases. Third, it is an efficient mechanism, and the cost
of query in EDC is low (relative to other clinical trial operational
costs). Also, on average, only a tiny portion of key data (Trans-
Celerate,1
Mitchel 201416
) are queried. Finally, documentation
of data changes (what was changed, when, by whom, and why, as well as the preservation of the original entry and sign-off by the investigator) is required by regulation (21 CFR 11.10(e)). EDC-enabled query management systems provide an efficient means for such documentation. Together, these 4 reasons make the query, a small part of the clinical trial process, the crown jewel of the data cleaning and monitoring process. In this context, one may borrow a phrase from Sherlock Holmes: ‘‘the little things are infinitely the most important!’’
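Using the figures cited above (a queried data point is an estimated 30 to 100 times more likely to be erroneous than an unqueried one), a back-of-envelope comparison of error yield per verified data point is sketched below; the 1% baseline error rate is an assumption for illustration only.

```python
# Back-of-envelope illustration of why concentrating SDV on queried data
# points is efficient. The 1% baseline error rate is assumed; the 30-100x
# likelihood ratio comes from the estimate in the text.
baseline_error_rate = 0.01

for likelihood_ratio in (30, 100):
    queried_rate = min(1.0, baseline_error_rate * likelihood_ratio)
    print(f"Errors expected per 100 data points verified: "
          f"~{100 * queried_rate:.0f} if queried vs "
          f"~{100 * baseline_error_rate:.0f} if sampled at random")
```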
On-site monitoring is the last step of the process, which at a
minimum includes ‘‘targeted on-site visits to higher risk clinical
investigators (eg, where centralized monitoring suggests prob-
lems at a site)’’ (FDA2
). The following are typically tracked dur-
ing on-site monitoring:
- Compliance with GCP
- Compliance with protocol requirements and identification of reasons for protocol violations (including proper equipment)
- Reasons for high or low dropout rates
- Training and quality of staff; staff turnover
- Systematic deficiencies and solutions to resolve them
- Fraud
- Data quality
The first 5 items cannot be performed by computer and thus will remain largely unchanged for the near future. On the other hand, the last 2 items, fraud identification and checking of data quality, produce appreciably better results at a tiny fraction of the cost when facilitated by computers leveraging the power of statistical algorithms.
On-site monitoring could be viewed as a combination of
3 discrete activities: Source Data Verification (SDV), Source
Data Review (SDR), and GCP Compliance/(Site) Process Mon-
itoring. The TransCelerate position paper1
helps to show the
distinction between SDV and SDR. ‘‘SDV is the process by
which data within the CRF or other data collection systems are
compared to the original source of information (and vice versa)
to confirm that the data were transcribed accurately (ie, data
from source matches data in the CRF or other system and vice
versa). SDR involves review of source documentation to check
quality of source, review protocol compliance, ensure the crit-
ical processes and source documentation (eg, accurate, legible,
complete, timely, dated) are adequate, ascertain investigator
involvement and appropriate delegation, and assess compli-
ance to other areas (eg, SOPs, ICH GCPs). SDR is not a com-
parison of source data against CRF data. SDR is necessary to
evaluate areas that do not have an associated data field in the
CRF or system available for more timely remote review’’
(TransCelerate1
).
Finally, one might reasonably ask: what are the roles of ‘‘blind review’’22 and ‘‘centralized monitoring’’ (FDA2) in this model? Here is our response.
‘‘Centralized monitoring is a remote evaluation carried out
by sponsor personnel or representatives (eg, clinical monitors,
data management personnel, or statisticians) at a location
other than the sites at which the clinical investigation is being
conducted. Centralized monitoring processes can provide many
of the capabilities of on-site monitoring as well as additional
capabilities’’ (FDA2
). In all the proposed RBM methods, a sta-
tistical/aggregate look at the inconsistencies is the most critical
step of the process (very much as long advocated by ICH E9
[1998]22
‘‘blind review’’). Thus, in our ‘‘extended RBM
model,’’ centralized monitoring is a combination of Level 2
‘‘Statistical Data Surveillance’’ and Level 3 ‘‘Clinical
Review.’’
Blind review is defined in ICH E9 as ‘‘The checking and
assessment of data during the period of time between trial com-
pletion (the last observation on the last subject) and the break-
ing of the blind, for the purpose of finalizing the planned
analysis.’’22
Based on this definition, centralized monitoring
could be viewed as an ongoing ‘‘blind review’’ process that starts
long before trial completion.
Appendix B. SDV Target Identification
Figure A1. SDV data point selection decision tree.
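Figure A1 itself is not reproduced here. The sketch below is a plausible coded restatement of the selection logic described in the body of the paper (categories a-c plus the study-size rule), not a transcription of the figure; the attribute names are hypothetical.

```python
# Plausible restatement of the SDV target selection logic described in the
# paper; attribute names are hypothetical and this is not Figure A1 itself.
def is_sdv_target(data_point: dict, study_size: str) -> bool:
    if study_size == "ultra-small":
        return True                                   # 100% SDV for ultra-small studies
    if data_point.get("queried"):
        return True                                   # query-driven SDV applies at every study size
    if data_point.get("critical") and data_point.get("noncomputerizable"):
        return True                                   # category (c): eg, ICF, some eligibility criteria
    return False

print(is_sdv_target({"queried": True}, "large"))                                   # True
print(is_sdv_target({"critical": True, "noncomputerizable": True}, "medium"))      # True
print(is_sdv_target({"queried": False}, "large"))                                  # False
```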