Reproducible research: First steps.

Richard Layton
May 6, 2015
First steps towards reproducible research

Credibility turns on the success or failure
of attempts to reproduce findings.
Kenneth Rogoff &
Carmen Reinhart
In economic models
• coding errors
• selective exclusion of available data
• unconventional weighting of
summary statistics
Thomas Herndon, Michael Ash, & Robert Pollin (2013). Does high public debt
consistently stifle economic growth? A critique of Reinhart and Rogoff. Working
paper series 322. Political Economy Research Institute, U Mass Amherst.

Jason deBruyn (Jan 23, 2015) Trial involving disgraced scientist and
bunk Duke research to begin Monday. Triangle Business Journal.
In cancer therapy models
• data falsification
• retracted journal articles
• terminated clinical trials
• civil suit by patients
Anil Potti

1000 years of temperature variation: the
“hockey stick” graph by Michael Mann
In climate science models
• flawed research methods
• evasion of FOIA requests
• leaked emails
• media hype
Fred Pearce (2010-02-09) Climate change debate overheated
after sceptic grasped 'hockey stick'. The Guardian.

“Computational science today faces a credibility crisis.”
Victoria Stodden, UIUC
Without access to the code and
data that underlie scientific
discoveries, published findings
are all but impossible to verify.

What can reproducible research do for you?
Your closest collaborator
is you six months ago,
but you don't reply to emails.
Paul Wilson
Engineering Physics
UW–Madison

This workflow is probably familiar.

Karl Broman
Biostatistics & Medical Informatics
UW–Madison
If you do anything “by hand”
once, you’ll do it 100 times.

Principle 1.
Blend computing, results, and narrative.
1. Open a script.
2. Write content.
3. Embed the code that creates output.
4. Render the text and code outputs.

The script:
Some narrative.
<<>>=
hist(co2)
@
Discuss result.
More narrative.

The rendered report:
Report title
Introduction.
Some narrative.
Discuss result.
More narrative.

Changes in the script? Render a new report.

The same report in Markdown: render the .Rmd file.

Edit the output option.
No change to the rest of the file.
Render the .Rmd file again: same report with a different output format.
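
The render step can be sketched in R. This is a minimal sketch, assuming the rmarkdown package and pandoc are installed; the file name report.Rmd and its contents are hypothetical stand-ins for the report on the slides. The slides edit the output option inside the file; here the format is passed through the `output_format` argument of `rmarkdown::render()` instead, which leaves the file untouched.

```r
# Hypothetical example: write a tiny .Rmd file, then render it twice.
fence <- strrep("`", 3)  # the three backticks that delimit an R Markdown chunk
rmd_lines <- c(
  "---",
  "title: \"Report title\"",
  "output: html_document",
  "---",
  "",
  "Some narrative.",
  "",
  paste0(fence, "{r}"),
  "hist(co2)",
  fence,
  "",
  "Discuss result."
)
rmd_file <- file.path(tempdir(), "report.Rmd")  # hypothetical file name
writeLines(rmd_lines, rmd_file)

# Rendering needs the rmarkdown package and pandoc; skip if either is missing.
can_render <- requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available()

if (can_render) {
  # Same source file, two output formats; only the format argument changes.
  html_out <- rmarkdown::render(rmd_file, output_format = "html_document",
                                quiet = TRUE)
  word_out <- rmarkdown::render(rmd_file, output_format = "word_document",
                                quiet = TRUE)
}
```

Each call returns the path of the file it wrote, so `html_out` and `word_out` point at the two versions of the same report.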

Principle 2. Organize for reproducibility
from the beginning.
1. Everything is a script
2. Every script is connected
3. File management is planned

Data script:
# gather data
read(xlsx)
# wrangle data
write(csv)

Design script:
# analysis
read(csv)
# create graph
write(PDF)
write(PNG)

Narrative script (.Rmd):
An R code chunk runs source(gather) and source(design).
Narrative.
include(graph)

Report:
Render the .Rmd.
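
The connected scripts can be sketched in base R. This is a minimal sketch under stated assumptions, not the author's project: the directory layout, the file names (01_data.R, 02_design.R, co2_annual.csv, co2_annual.pdf) and the annual-mean step are hypothetical, and the built-in `co2` series stands in for the .xlsx file that the data script on the slide reads.

```r
# Hypothetical project layout inside a temporary directory.
proj <- file.path(tempdir(), "project")
for (d in c("data", "results", "scripts")) {
  dir.create(file.path(proj, d), recursive = TRUE, showWarnings = FALSE)
}

# Data script: gather the raw data, wrangle it, write a CSV.
# co2 is monthly, so each starting year is repeated 12 times.
writeLines(c(
  "years <- seq(start(co2)[1], length.out = length(co2) / 12)",
  "raw <- data.frame(year = rep(years, each = 12), ppm = as.numeric(co2))",
  "tidy <- aggregate(ppm ~ year, data = raw, FUN = mean)",
  "write.csv(tidy, file.path(proj, 'data', 'co2_annual.csv'), row.names = FALSE)"
), file.path(proj, "scripts", "01_data.R"))

# Design script: read the CSV written above and create the graph as a PDF.
writeLines(c(
  "tidy <- read.csv(file.path(proj, 'data', 'co2_annual.csv'))",
  "pdf(file.path(proj, 'results', 'co2_annual.pdf'))",
  "plot(tidy$year, tidy$ppm, type = 'l', xlab = 'Year', ylab = 'CO2 (ppm)')",
  "invisible(dev.off())"
), file.path(proj, "scripts", "02_design.R"))

# One script connects the others: run them in order with source().
for (script in c("01_data.R", "02_design.R")) {
  source(file.path(proj, "scripts", script), local = TRUE)
}
```

In a report, the two `source()` calls would sit in a code chunk of the .Rmd file, so that rendering the report reruns the whole chain from raw data to graph.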

.Rmd
reproducible report
non-reproducible documents
Your future self thanks you.

Summary: two principles.
Organize for reproducibility
from the beginning.
Explicitly link computing,
results, and narrative.

To learn more:
Victoria Stodden, Friedrich Leisch, & Roger D. Peng (2014)
Christopher Gandrud (2015)
Yihui Xie (2013)

One Script to rule them all,
One Script to find them,
One Script to bring them all
And in the Markdown bind them.

Image credits
1. Image of Reinhart and Rogoff, reprinted under Creative Commons license, courtesy of The Commentator,
http://www.thecommentator.com/privacy_policy.
2. Image of Anil Potti, from WPDE.com, http://www.carolinalive.com/ © 2015 Sinclair Communications, LLC.
3. “Hockey stick” graph from Mann, Bradley, & Hughes, Nature, 1998. Reprinted from The Guardian, © 2015 Guardian
News and Media Limited, http://www.theguardian.com/environment/2010/feb/02/hockey-stick-graph-climate-change.
4. Image of Victoria Stodden, from YouTube, speaking on "Reproducible Research: A Digital Curation Agenda" at the 7th
International Digital Curation Conference, University of Bath, Bristol, UK, Dec 6, 2011. Creative Commons attribution
license.
5. Bing images for the MATLAB logo, Microsoft Word, Excel, & PowerPoint, and for Adobe PDF are reprinted under
Creative Commons license.
6. Other unattributed clipart courtesy of https://openclipart.org/, used with permission.
