11-12 May 2015 AAAS Workshop on Reproducibility in the Field Sciences © 2015 Aaron M. Ellison
Challenges for Reproducibility in the Field Sciences
or
It’s déjà vu all over again
Aaron M. Ellison
Harvard University, Harvard Forest
Founding Editor, Ecological Archives
Editor-in-Chief, Ecological Monographs
[Photo © Lon Schleining]
“It will be observed, then, that our efforts are not merely to
accumulate as great a mass of animal remains as possible. On the
contrary, we are expending even more time than would be required
for the collection of specimens alone, in rendering what we do obtain
as permanently valuable as we know how, to the ecologist as well as
the systematist. . . . I wish to emphasize what I believe will ultimately
prove to be the greatest value of our museum. This value will not,
however, be realized until the lapse of many years, possibly a
century. . . the student of the future will have access to the original
record of faunal conditions. . .”
Joseph Grinnell (1910) The methods and uses of a research museum. Popular Science Monthly 77: 163-169.
“…It is only upon the distribution of a database that its far-reaching
research, educational, and other socioeconomic values
are recognized. … The contribution of any of these products to
scientific and technical knowledge might well assume a value far
greater than the costs of database production and dissemination.”
NRC (1999) A question of balance: private rights and the public interest in scientific and technical databases
Reproducibility: Key challenges
• Cultural
– US
– International
• Technological
– Data storage
– Descriptive metadata
– Process metadata (Provenance)
“Ecology and evolutionary biology stand virtually alone among the
environmental and environment-related sciences in the lack of
some agency- or community-mandated data archiving and data
sharing policy.”
Porter & Callahan (1994) Circumventing a dilemma: historical approaches to data sharing in ecological research. In:
Environmental Information Management and Analysis: Ecosystem to Global Scales
1907 Annie Alexander and Joseph Grinnell found the MVZ at Berkeley
1988-1991 ESA Sustainable Biosphere Initiative (Jane Lubchenco)
1994-1995 ESA Committee: Future of Long-term Ecological Data (Kay Gross)
1995-1996 ESA Special Committee: Communications in the Electronic Age (Rob Colwell)
1995-1996 ESA Special Committee: Data Sharing and Archiving (Steward Pickett/Aaron Ellison)
1998- Ecological Archives (Aaron Ellison/William Michener)
2001 Ecological Metadata Language, version 1.0
2005 LTER Data Policy approved: requires data archiving within two years of collection
2008 ILTER Data Policy approved: encourages data archiving commensurate with publication
2009 NEON Data Policy developed: open access to all NEON data, on request
2011 NSF Data Management Requirement implemented for proposals
2015 Ecological Metadata Language, version 2.1
Journal | Data | Code
Ecology (est. 1920) | Available to SME on request | Required (2014)
Ecological Applications (est. 1991) | Required (2014) | Required (2014)
Ecological Monographs (est. 1930) | Required (2011) | Required (2014)
Ecosphere (est. 2010) | Available to SME on request | Available to SME on request
Ecosystem Health & Sustainability (est. 2015) | NO | NO
Journal of Ecology (est. 1913) | Required (2014) | For computer models (2014)
Journal of Animal Ecology (est. 1932) | Required (2014) | For computer models (2014)
Journal of Applied Ecology (est. 1964) | Required (2014) | For computer models (2014)
Functional Ecology (est. 1987) | Required (2014) | For computer models (2014)
Methods in Ecology & Evolution (est. 2010) | Required (2014) | Encouraged
Oikos (est. 1949) | Strongly encouraged (2015) | Strongly encouraged (2015)
Ecography (Holarctic Ecology) (est. 1978) | Strongly encouraged (2015) | Strongly encouraged (2015)
Oecologia (est. 1968) | Available to SME on request | Available to SME on request
http://www.forestgeo.si.edu/
Reproducibility: Key challenges
• Cultural
– US
– International
• Technological
– Data storage
– Descriptive metadata
– Process metadata (Provenance)
changes to the master.txt
Assigned 15 months to cellulose 1bx,3bx,6ad,6bd,7ac, and 3 months to
cellulose 7bd with missing month value. This means each subplot is
sampled in every time period.
Changed all "dc" subplots to "bc", since those were missing from all
plot 5.
[Figure labels: 2008-2010, 2014, 2012]
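Free-text notes such as the "changes to the master.txt" entries record what was edited, but they cannot be re-run or checked. One alternative is to make each edit a scripted step that writes its own log entry. The sketch below is hypothetical throughout: the data frame `master`, its column names, its values, and the helper `recode_subplot()` are invented for illustration and are not taken from the original analysis.

```r
# Hypothetical stand-in for master.txt; columns and values are assumptions.
master <- data.frame(
  plot    = c(5, 5, 6, 7),
  subplot = c("dc", "ac", "dc", "bd"),
  months  = c(15, 15, NA, 3),
  stringsAsFactors = FALSE
)

# Empty change log that each scripted edit appends to.
change_log <- data.frame(step = character(), rows_changed = integer(),
                         reason = character(), stringsAsFactors = FALSE)

# Recode one subplot label and append a log row describing the edit.
recode_subplot <- function(data, log, from, to, reason) {
  hit <- data$subplot == from
  data$subplot[hit] <- to
  entry <- data.frame(step = sprintf("subplot %s -> %s", from, to),
                      rows_changed = sum(hit),
                      reason = reason,
                      stringsAsFactors = FALSE)
  list(data = data, log = rbind(log, entry))
}

result <- recode_subplot(master, change_log, "dc", "bc",
                         "label correction described in the notes file")
master     <- result$data
change_log <- result$log
print(change_log)
```

Because the edit lives in code, rerunning the script on the raw file reproduces both the corrected data and the record of why it changed.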
filled.contour3 <-
  function (x = seq(0, 1, length.out = nrow(z)),
            y = seq(0, 1, length.out = ncol(z)), z, xlim = range(x, finite = TRUE),
            ylim = range(y, finite = TRUE), zlim = range(z, finite = TRUE),
            levels = pretty(zlim, nlevels), nlevels = 20, color.palette = cm.colors,
            col = color.palette(length(levels) - 1), plot.title, plot.axes,
            key.title, key.axes, asp = NA, xaxs = "i", yaxs = "i", las = 1,
            axes = TRUE, frame.plot = axes, mar, ...)
{
  # modification by Ian Taylor of the filled.contour function
  # to remove the key and facilitate overplotting with contour()
  # further modified by Carey McGilliard and Bridget Ferris
  # to allow multiple plots on one page
  if (missing(z)) {
    if (!missing(x)) {
      if (is.list(x)) {
        z <- x$z
        y <- x$y
        x <- x$x
      }
      else {
        z <- x
        x <- seq.int(0, 1, length.out = nrow(z))
      }
    }
    else stop("no 'z' matrix specified")
  }
  else if (is.list(x)) {
    y <- x$y
    x <- x$x
  }
  if (any(diff(x) <= 0) || any(diff(y) <= 0))
    stop("increasing 'x' and 'y' values expected")
  # mar.orig <- (par.orig <- par(c("mar", "las", "mfrow")))$mar
  # on.exit(par(par.orig))
  # w <- (3 + mar.orig[2]) * par("csi") * 2.54
  # par(las = las)
  # mar <- mar.orig
  plot.new()
  # par(mar=mar)
  plot.window(xlim, ylim, "", xaxs = xaxs, yaxs = yaxs, asp = asp)
  if (!is.matrix(z) || nrow(z) <= 1 || ncol(z) <= 1)
    stop("no proper 'z' matrix specified")
  if (!is.double(z))
    storage.mode(z) <- "double"
  .Internal(filledcontour(as.double(x), as.double(y), z, as.double(levels),
                          col = col))
  #AME 1/15/2014: in R 3.0, should be
  # .filled.contour(as.double(x), as.double(y), z, as.double(levels),
  #                 col = col)
  if (missing(plot.axes)) {
    if (axes) {
      title(main = "", xlab = "", ylab = "")
      Axis(x, side = 1)
      Axis(y, side = 2)
    }
  }
  else plot.axes
  if (frame.plot)
    box()
  if (missing(plot.title))
    title(...)
  else plot.title
  invisible()
}
This happens if you use a non-standard API.
You are allowed to do that, but cannot expect
that it is maintained.
The C code underlying base graphics has been
migrated to the graphics package (and hence
no longer uses .Internal() calls).
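The `.Internal(filledcontour(...))` call in `filled.contour3` is the part that broke, and the dated comment inside that function already points at `.filled.contour()` as the replacement. That function is described on the `filled.contour` help page of the graphics package as a bare-bones interface for programmatic use. As a minimal sketch of a keyless filled-contour helper built only on documented functions, something like the following may serve; `draw_filled_contour()` and its defaults are illustrative names, not part of the original script.

```r
# Sketch: filled contours without a colour key, using only documented
# graphics functions. The helper name and defaults are assumptions.
draw_filled_contour <- function(x, y, z,
                                levels = pretty(range(z, finite = TRUE), 20),
                                col = cm.colors(length(levels) - 1)) {
  # .filled.contour() leaves argument checking to the caller.
  stopifnot(is.matrix(z), nrow(z) == length(x), ncol(z) == length(y))
  stopifnot(all(diff(x) > 0), all(diff(y) > 0))
  storage.mode(z) <- "double"

  # Set up the plot region, then add the contours to it.
  plot.new()
  plot.window(xlim = range(x), ylim = range(y), xaxs = "i", yaxs = "i")
  .filled.contour(as.double(x), as.double(y), z, as.double(levels), col = col)

  Axis(x, side = 1)
  Axis(y, side = 2)
  box()
  invisible(list(levels = levels, col = col))
}
```

For example, `draw_filled_contour(seq_len(nrow(volcano)), seq_len(ncol(volcano)), volcano)` draws the built-in `volcano` matrix on the current device. Staying on the documented interface does not guarantee the script will run forever, but it removes the dependence on an internal entry point that R makes no promise to maintain.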
From R Scripts to Provenance Graphs
[Diagram components: R Script; Instrumented R Script; R Interpreter; RData Tracker; Textual DDG; DDG Database; DDG Explorer; Visual DDG. Label: "Instrumented by scientist". Legend: R Scripts, R Environment, Provenance.]
“Ecology and evolutionary biology stand virtually alone among the
environmental and environment-related sciences in the lack of
some agency- or community-mandated data archiving and data
sharing policy.”
Porter & Callahan (1994) Circumventing a dilemma: historical approaches to data sharing in ecological research. In:
Environmental Information Management and Analysis: Ecosystem to Global Scales
1907 Annie Alexander and Joseph Grinnell found the MVZ at Berkeley
1988-1991 ESA Sustainable Biosphere Initiative (Jane Lubchenco)
1994-1995 ESA Committee: Future of Long-term Ecological Data (Kay Gross)
1995-1996 ESA Special Committee: Communications in the Electronic Age (Rob Colwell)
1995-1996 ESA Special Committee: Data Sharing and Archiving (Steward Pickett/Aaron Ellison)
1998- Ecological Archives (Aaron Ellison/William Michener)
2001 Ecological Metadata Language, version 1.0
2005 LTER Data Policy approved: requires data archiving within two years of collection
2009 NEON Data Policy developed: open access to all NEON data, on request
2011 NSF Data Management Requirement implemented for proposals
2015 Ecological Metadata Language, version 2.1
… ???
Goal:
Create reproducible science by documenting data provenance: the processes used to create, modify, visualize, analyze, and synthesize data
Challenges:
• Standard tools (e.g., R) do not collect provenance
• Specialized tools (e.g., Kepler) have a steep learning curve
• Computer scientists are interested in control flow, data flow, and abstraction; ecologists are interested in other things
• Lack of community standards
• How much information to collect, manage, store, and use
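Since standard R does not collect provenance by itself, one low-tech option is to wrap each processing step so that it records what ran, on what kind of input, and under which R version. The sketch below is an illustration under stated assumptions, not the RDataTracker API: `run_step()` and the `provenance` list are hypothetical names, and the recorded fields are only one possible answer to the question of how much information to keep.

```r
# Sketch: a hand-rolled provenance log. All names here are hypothetical.
provenance <- list()

# Run one processing step and record a provenance entry for it.
run_step <- function(name, fun, input) {
  output <- fun(input)
  provenance[[length(provenance) + 1]] <<- list(
    step      = name,
    time      = format(Sys.time(), "%Y-%m-%d %H:%M:%S"),
    r_version = R.version.string,
    input     = paste(class(input)[1], length(input)),
    output    = paste(class(output)[1], length(output))
  )
  output
}

# Toy data standing in for field measurements.
raw     <- c(2.1, NA, 3.4, 5.0)
cleaned <- run_step("drop missing values", function(v) v[!is.na(v)], raw)
average <- run_step("compute mean", mean, cleaned)

# One line per step: a textual trace of how `average` was produced.
for (p in provenance) cat(p$step, ":", p$input, "->", p$output, "\n")
```

A log like this captures only the data flow between steps. It says nothing about control flow inside a step, which is one reason dedicated provenance tools exist.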
Reproducible
Traceable
Usable
Comparable