negligence or other flaws born of ignorance or innocence, the
problems ultimately arise from a misguided system of
incentives. Proposal funding and manuscript publi-
cation rely upon peer review according to a series of
key questions: What is the prospective impact of the
proposal/report? How novel is the study? What is the
reputation of the authors? How feasible/believable is
the study?
Although seemingly appropriate, these adjudication
metrics are clearly no longer adequate for safeguard-
ing the integrity of our scientific knowledge base. In
recent years, peer-evaluated scientific research has been scrutinized against the all-important arena of real-world outcomes and found to harbor nearly endemic flaws. Precisely quantifying irreproducibility rates remains a challenge in the absence of universal evaluation metrics. However, a variety of recent
surveys have emerged, which suggest that between 50
and 90% of key findings in basic science studies have
questionable reproducibility [1,9], and similar statistics
apply to clinical observational assessments [10].
Even discounting the economic cost, there are seri-
ous societal implications to this failure rate. For exam-
ple, in clinical settings, irreproducibility could mean
the difference between a drug that is safe and effective and one with questionable efficacy or safety.
In the preclinical arena, the long-term consequences may be even worse: a tainted study that is exposed may erode public confidence (and potentially public funding) in science, but one that remains undetected is even more deleterious, degrading the knowledge base upon which countless future studies may unwittingly
attempt to build. Until systemic trust can be re-estab-
lished, we are bound to miss many tantalizing oppor-
tunities to advance the human condition because our
jaded society simply does not know what to believe.
Distinguishing sense from nonsense thus becomes an
exercise in futility.
The recent surge of public commentary decrying
methodological failures in funded, published science
has prompted a spate of recommendations aimed
at slowing and correcting this trend. A sample of
commonsense recommendations includes:
• Ensure global accessibility to detailed protocol
information, raw data, metadata and numerical
manipulations [4]
• Increase the amount of funding available for
reproducibility studies [11]
• Implement more exacting standards for statistical evaluation [10] (a minimal sketch follows this list)
• Update and disseminate standard best practices
for the design and validation of experiments [12]
[Vaux D, pers. comm.]
• Digitally scrutinize gel blots, microscopic images,
etc., for tampering and misuse [4]
• Evaluate and curate technical protocols
independently of manuscripts and proposals [4]
• Provide effective safeguards to protect those who
report suspected fraud [4,5]
• Make principal investigators more accountable for
what they publish [4]
• Empower law enforcement to investigate suspected
fraud and prosecute fittingly [4]
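To make the statistical-evaluation recommendation concrete, here is a minimal sketch, not drawn from any of the cited reports: it applies the Benjamini-Hochberg false-discovery-rate procedure, one plausible 'more exacting standard' when many hypotheses are tested at once. The function name and example p-values are our own illustrative inventions.

```python
# Minimal sketch of one 'more exacting standard' for statistical
# evaluation: Benjamini-Hochberg false-discovery-rate control for
# multiple hypothesis tests. All names and numbers are illustrative.

def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean list marking which p-values survive
    FDR correction at level alpha (hypothetical helper)."""
    m = len(p_values)
    # Indices of p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= k * alpha / m.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            largest_k = rank
    # Declare significant every hypothesis ranked <= largest_k.
    survives = [False] * m
    for rank, idx in enumerate(order, start=1):
        survives[idx] = rank <= largest_k
    return survives

# A naive 0.05 cutoff would call three of these four tests significant;
# the corrected procedure retains only the two strongest findings.
print(benjamini_hochberg([0.001, 0.020, 0.040, 0.200]))
# -> [True, True, False, False]
```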
While such corrective measures make sense, apply-
ing each measure individually is like pasting many
small bandages over a gaping wound. Why not directly
integrate a more rigorous mindset into the mechanisms
for funding and publishing research results? To illus-
trate this, let us scrutinize some key examples of the
kinds of considerations that often play substantial roles
in manuscript or proposal review, with an eye toward finding opportunities to refine the peer-review mentality in
ways that better foster reproducibility.
Scientific impact
Funding panels assess proposals for relevance to key
topics and applications identified by the agency, but
most final funding decisions hinge on the question
of how interesting and important the project sounds,
and what chance there is for producing big break-
throughs. Similarly, manuscripts with sensational
prospective implications tend to rise above those that
are merely relevant, realistic and interesting.
The impact measure is familiar to most research-
ers and probably sounds superficially reasonable, but
it is not optimally aligned to the fundamental sci-
entific objective of advancing our core knowledge.
Proposal and publication approval slants excessively
toward studies that confirm original hypotheses,
overlooking the fact that honest reporting of negative
research results is invaluable in guiding future gen-
erations of scientists away from unproductive lines of
investigation and instead toward alternative paths of
study. Moreover, the focus on well-articulated hypotheses may itself be damaging our scientific prospects.
While many NIH (US) program officers once deni-
grated the ‘look and see’ type of proposal (e.g., a
broader-based screening platform capable of produc-
ing relevant insight without preconceived notions of
eventual findings), stated hypotheses often become
self-fulfilling, although not necessarily in a replicable
manner.
Aspects of the above impact concept remain use-
ful, but evaluation should be broadened to assess the
chance that a proposed study will shape the landscape
of scientific understanding, regardless of whether
the study is based on explicit hypotheses, and even
if hypotheses are ultimately disproven. Similarly, in
an ideal world, manuscripts reporting negative find-
ings would be commended and advanced to publica-
tion as long as they are likely to provide beneficial
guidance to subsequent research.
Novelty
Innovation is justifiably fetishized in the arts, but
does an obsession with funding and publishing
transformative science actually reflect sound strat-
egy? On a very basic level, if publication in the highest-echelon journals is conditional upon reporting
unprecedented conclusions, might this not slant the
unwitting scientist toward data interpretations that
are more unusual, even if more mundane principles
could have been inferred? Likewise, if a proposed new
study employs techniques or applications that differ
radically from prior studies, does that make it any
more likely to produce practical new knowledge
than would incremental progress based more directly
on prior understanding? If a manuscript reports a
truly transformative discovery or the application of
a wholly new technique, how many suitable review-
ers will actually be able to produce reliable, insight-
ful peer assessments of the technical fidelity of the
work? If a high degree of apparent novelty introduces
greater risks in achieving reasonable data interpreta-
tion and rigorous validation, then how does this not
exacerbate the irreproducibility problem?
Novelty stirs the human imagination and can
engender great enthusiasm, but one need look
no further than the stock market to understand the
perils of unchecked enthusiasm. Science sometimes
does advance in quantum leaps that vault past exist-
ing building blocks, and we should encourage such
prospects, just as we may hold speculative penny
stocks in our retirement portfolio. However, if we
aspire to live comfortably into our old age, we should
look for healthy balance both in our personal finan-
cial portfolio and in our global biomedical science
investments.
Investigator reputation
Despite a strong commitment to objectivity, human
nature frequently plays a subtle role in favoring par-
ties with whom we have familiarity. If a reviewer has
read papers by scientist X, but is unfamiliar with sci-
entist Y, it may be human nature to unconsciously
assume X has produced higher quality research than
Y. By this measure, a scientist who averages 20 publica-
tions per year is likely, by virtue of sheer name recognition, to inspire more karmic confidence than one who publishes only twice per year. But is it logi-
cal to expect that the hyperpublishing scientist will
truly dedicate to every publication the intensive self-
scrutiny necessary to foster consistently reproducible
research? Is the hypopublisher much more careful, or
just unproductive?
The intelligent response is: who knows? Degree of
exposure, number of publications and funding his-
tory are no guarantees of reproducible research. The
best metric for projecting future reproducibility is
a carefully articulated validation plan. If one truly wants to encode a track-record metric, then
our discipline should begin objectively tracking how
reproducible a scientist’s past work has proven to be!
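As a thought experiment, such a track-record metric might look like the sketch below. Everything in it is hypothetical: no registry of replication attempts currently exists, and the data structure and scoring rule are our own invention.

```python
# Hypothetical sketch of the track-record metric proposed above:
# score an investigator by how often independent groups have
# reproduced their past findings. No such registry exists today;
# all names and data below are invented for illustration.

from dataclasses import dataclass

@dataclass
class Finding:
    doi: str                    # identifier of the original publication
    replication_attempts: int   # independent attempts recorded
    replication_successes: int  # attempts that confirmed the finding

def reproducibility_score(findings, min_attempts=1):
    """Mean confirmed fraction across adequately tested findings.

    Findings with fewer than min_attempts recorded attempts are
    excluded, so an untested publication neither helps nor hurts.
    """
    tested = [f for f in findings if f.replication_attempts >= min_attempts]
    if not tested:
        return None  # no evidence either way
    return sum(
        f.replication_successes / f.replication_attempts for f in tested
    ) / len(tested)

# Example: one finding fully confirmed, one never reproduced,
# one never tested (and therefore excluded from the score).
record = [
    Finding("10.1000/a", replication_attempts=2, replication_successes=2),
    Finding("10.1000/b", replication_attempts=3, replication_successes=0),
    Finding("10.1000/c", replication_attempts=0, replication_successes=0),
]
print(reproducibility_score(record))  # -> 0.5
```

Such a score would reward investigators whose work holds up under independent scrutiny, rather than those who simply publish most often.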
Feasibility & believability
The metrics of feasibility and believability encoded
into current proposal and manuscript reviewing do
have strong relationships with reproducibility. Unfor-
tunately, these criteria rarely carry the weight of the
first three factors, yet we ignore them at our own peril.
If a graduate student across the hall reported that he
had just created new pluripotent stem cells by dosing
adult mouse cells with an acidic solution, who would
believe him? So how could a prestigious journal, such
as Nature, have accepted an article by a prominent
research laboratory making precisely that claim [13,14]?
Might this arise because too many reviewers value
impact, novelty and reputation above believability?
In addition to being afforded more weight, feasibil-
ity and believability metrics should be more rigorously
specified. Clear and correct reporting of statistics,
universally available raw data and rigorous, well-artic-
ulated protocols for external validation are all essen-
tial. Such rigorous documentation is a profound dem-
onstration of good faith, and can salvage value from
studies in which errors do crop up, by providing the
means for others to readily detect or correct problems
rather than leaving them to fester in our knowledge
base.
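What might such rigorous documentation look like in practice? Below is a minimal, hypothetical sketch of a machine-readable study manifest; the field names and values are our own invention, meant only to illustrate the kinds of detail (raw data, protocol, statistics, materials) that external validators need, and do not reflect any existing standard.

```python
# Hypothetical sketch of a machine-readable study manifest bundling
# the documentation external validators need. Field names and values
# are invented for illustration; no standard schema is implied.

import json

manifest = {
    "study": "Example compound screen (illustrative only)",
    "raw_data": {
        "location": "https://example.org/depository/screen-001",
        "format": "CSV, one row per assay well",
    },
    "protocol": [
        "Plate cells at the stated density (full SOP deposited with data)",
        "Dose compounds in triplicate",
        "Read fluorescence at 24 h and 48 h",
    ],
    "statistics": {
        "test": "two-sided t-test",
        "multiple_testing_correction": "Benjamini-Hochberg, alpha = 0.05",
        "replicates_per_condition": 3,
    },
    "materials": {
        "cell_line": "authenticated; passage number recorded",
        "reagent_lots": "recorded per experiment",
    },
}

# Serialize for deposition alongside the manuscript.
print(json.dumps(manifest, indent=2))
```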
The above revisions to evaluation criteria amount
to recognition of a problem, and to a preliminary
attempt to address it. The well-intended reward sys-
tem for fostering biomedical research has been too
slow in adjusting to the psychological, statistical and
analytical complexities of 21st century science and is
beginning to fail. The aforementioned recommenda-
tions for modifying evaluation criteria are intended
to spur dialog to re-introduce a healthy balance into
the prioritization process. On a more basic level, this
balance includes the following exhortations:
Publish meaningful negative results!
The obsession with reporting transformative break-
throughs has been denying us the benefits of the
foundational ‘process of elimination.’ Discourse on
disproven hypotheses and the quirky means by which
experiments may fail is shunned by journals and incurs deep stigma in grant progress reports. Ultimately, if the
culture shift required to promote the valuable disclo-
sure of negative findings proves to be too monumental
for our current publishing enterprise to foster, perhaps
a vigorous promotion of alternative dissemination
modes such as centrally supported self-publication
portals may prove adequate.
Solid fundamentals
In our era of hyperspecialized subdisciplines, the cult
of scientific transformation is producing research tech-
niques that are so esoteric that we risk losing any real
expectation of rigorous peer review. Junior scientists
spend more and more of their training on sophisticated equipment, at the expense of simple core laboratory protocols and a fundamental grounding in the principles of experimental design, data interpretation and validation. Is it any wonder
that faulty materials and reagents are the single greatest
cause of research irreproducibility [1]? Ultimately, greater
sophistication in research methodology will be critical
to future achievements, but technological advance only
serves us as long as those who apply new technology are
fully able to grasp not only its strengths but also its limitations and the intricate ways in which it may fail or produce
misleading results.
Show your work
The ever-shrinking research proposal/application length
has degraded the ability of reviewers to reliably assess
fundamental procedural and numerical detail, so who
truly knows whether the proposed methodology is via-
ble? Competition for print space in many high-impact
journals is squeezing publications into a capsular format
that glosses over critical technical information, thus defy-
ing external validation. Many scientific validation stud-
ies fail not because of mistakes in the original study, but
rather because of incomplete protocol specifications available
for those trying to reproduce the study [1,13]. Journals that
are addressing this issue should be lauded, while those
that still lag in this rigor should be gently encouraged to
enhance the informational content of their publications.
Furthermore, journal editors and program officers
would do our community a great informational service
by publishing original critiques and author responses
for future reference [15,16].
The next time a retraction catches our attention, we can certainly examine the study to see what the authors did wrong. But we should also ask ourselves: why were they not given adequate encouragement to do it right?
To illustrate the precarious situation we find our-
selves in, imagine the collapse of a high-rise housing
complex and the associated risk to human life. Who is to blame: the building contractor, faulty
materials, poor workmanship, the handyman, the
building inspector or the imprecise building codes?
Applying this scenario to the current biomedical
research ‘house of cards’ should send tremors down
the spines of all those involved. The house has not
yet completely collapsed, but there is no time like the
present to set it right.
Financial & competing interests disclosure
The authors have no relevant affiliations or financial involve-
ment with any organization or entity with a financial inter-
est in or financial conflict with the subject matter or mate-
rials discussed in the manuscript. This includes employment,
consultancies, honoraria, stock ownership or options, expert
testimony, grants or patents received or pending, or royalties.
No writing assistance was utilized in the production of this
manuscript.
References
1 Freedman LP, Cockburn IM, Simcoe TS. The economics
of reproducibility in preclinical research. PLoS Biol. 13(6),
e1002165 (2015).
2 Report for selected country groups and subjects. World
Economic Outlook. International Monetary Fund.
http://probeinternational.org
3 Lushington GH, Chaguturu R. A systemic malady: the
pervasive problem of misconduct in the biomedical sciences.
Part I: issues and causes. Drug Discovery World 16, 79–90
(2015).
4 Lushington GH, Chaguturu R. A systemic malady: the
pervasive problem of misconduct in the biomedical sciences.
Part II: detection and prevention. Drug Discovery World 15,
70–82 (2015).
5 Gunn W. Reproducibility: fraud is not the big problem.
Nature 505, 483 (2014).
6 Aschwanden C. Science isn’t broken. It’s just a hell of a lot
harder than we give it credit for. Five Thirty Eight.
http://fivethirtyeight.com/features/science-isnt-broken
7 von Bubnoff A. Special Report. Biomedical research: are all
the results correct? Burroughs Wellcome Fund.
www.bwfund.org
8 Fang FC, Steen RG, Casadevall A. Misconduct accounts for
the majority of retracted scientific publications. Proc. Natl
Acad. Sci. USA 109(42), 17028–17033 (2012).
9 Pritsker M. Studies show only 10% of published science
articles are reproducible. What is happening?
www.jove.com
10 Young S, Karr A. Deming, data and observational studies.
A process out of control and needing fixing. Significance 8,
116–120 (2011).
11 Iorns E.
http://blog.scienceexchange.com
12 Begley CG, Ellis LM. Drug development: raise standards for preclinical cancer research. Nature 483(7391), 531–533 (2012).
13 von Bubnoff A.
www.bwfund.org
14 Obokata H, Wakayama T, Sasai Y et al. Nature 505, 641–647 and 676–680 (2014).
15 Chaguturu R. Scientific misconduct, editorial. Comb. Chem.
High Throughput Screen. 17(1), 1 (2014).
16 Chaguturu R. Collaborative Innovation in Drug Discovery:
Strategies for Public and Private Partnerships. Wiley and Sons,
NY, USA (2014).