1) The document describes a workshop on research synthesis and reproducibility.
2) It discusses challenges with reproducibility in science and proposes provenance and conceptual tools like PRIMAD to help address these challenges.
3) The document presents a case study where an intern was able to reproduce results from a 2006 ecological niche modeling paper using the Whole Tale environment and MaxEnt software, demonstrating computational reproducibility.
2. All-in-One (Teaser)
• Reproducibility Crisis in Science
• A conceptual tool: Provenance
• Transparency? Explanation? Provenance !
• … why-, how-, where-, why-not-, data-, workflow- ... provenance ...
• Terminological Chaos Reigns
– … replicability … reproducibility … repeatability …
• A modest proposal and (evolving) conceptual tool: PRIMAD
– What’s fixed? What varies? (X → X’, Y → Y’, …)
– What is the information gain when succeeding, failing to reproduce?
• Tool Tools (cf. audio-book, e-book, book-book)
– Computational Reproducibility? Whole-Tale (vms++) !
– Modeling (Dataflow) Dependencies? YesWorkflow !
– Terminological Confusion? EulerX ! (“Semantics”)
• A Case Study
– Whole Tale Summer Internship (Santiago Núñez-Corrales):
– Reproducibility in Ecological Niche Models: the case of Phillips et al (2006)
Ludäscher & Núñez-Corrales
Whole Tale
6. Computational Provenance …
• Origin, processing history of artifacts
– data products, figures, ...
– also: underlying workflow
→ understand methods, dataflow, and dependencies
→ role of computational provenance in HoH !?
(Figure: report cover, “Climate Change Impacts in the United States”, U.S. National Climate Assessment, U.S. Global Change Research Program)
9. Provenance in DataONE
A DataONE search (here: “grass”) yields different packages with Data Provenance
(not covered: Semantic Search)
10. Exploring Provenance in DataONE
• Let’s go there → Mark Carls. 2017. Analysis of hydrocarbons following
the Exxon Valdez oil spill, Gulf of Alaska, 1989 - 2014. Gulf of Alaska
Data Portal. urn:uuid:3249ada0-afe3-4dd6-875e-0f7928a4c171.
13. Adding YesWorkflow to DataONE
• Yaxing’s script with inputs & output products
• Christopher’s YesWorkflow model
• Christopher using Yaxing’s outputs as inputs for his script
• Christopher’s results can be traced back all the way to Yaxing’s input
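The chain described on this slide can be sketched as a small lineage graph. The Python sketch below is illustrative only: the artifact and script names are hypothetical placeholders, not the contents of the actual DataONE packages, and it uses a plain dictionary rather than any YesWorkflow or DataONE API.

```python
# Hypothetical provenance records: each derived artifact maps to the
# script that produced it and the inputs that script consumed.
PROVENANCE = {
    "yaxing_output.csv": {
        "script": "yaxing_script",
        "inputs": ["yaxing_input.csv"],
    },
    "christopher_result.png": {
        "script": "christopher_script",
        "inputs": ["yaxing_output.csv"],
    },
}


def trace_back(artifact, provenance):
    """Return every upstream artifact of `artifact`, nearest first."""
    upstream = []
    frontier = [artifact]
    while frontier:
        current = frontier.pop(0)
        record = provenance.get(current)
        if record is None:  # a raw input: nothing is recorded upstream
            continue
        for source in record["inputs"]:
            if source not in upstream:
                upstream.append(source)
                frontier.append(source)
    return upstream


lineage = trace_back("christopher_result.png", PROVENANCE)
print(lineage)
```

Walking the records backwards from Christopher’s result reaches Yaxing’s output first and then Yaxing’s input, which is the "traced back all the way" property the slide points to.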
17. To succeed or to fail? What do we gain?
• Successful reproducibility study:
– increases trust in prior study ☺
– … but no surprises ☹
• Failed reproducibility study:
– decreases trust in (or falsifies) prior study ☹
– … but surprising failure yields new info/knowledge ☺
• Learning from failures!
– not really a totally new idea..
– What does a positive vs negative result mean anyways?
– When developing s/w, tools: fail early, fail often ...
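One way to make "what do we gain?" concrete is Shannon surprisal: an outcome with probability p carries -log2(p) bits of information. The sketch below is one possible reading of the slide, not the authors' formalism, and the prior of 0.9 is an assumed, illustrative value.

```python
import math


def surprisal_bits(probability):
    """Shannon information content, in bits, of an outcome with this probability."""
    return -math.log2(probability)


# Assumed prior belief that the reproduction will succeed (illustrative value).
p_success = 0.9

gain_if_success = surprisal_bits(p_success)
gain_if_failure = surprisal_bits(1.0 - p_success)

print(f"success: {gain_if_success:.2f} bits")  # success: 0.15 bits
print(f"failure: {gain_if_failure:.2f} bits")  # failure: 3.32 bits
```

Under this assumption an expected success tells us little, while an unexpected failure is far more informative, which matches the "no surprises" versus "new info/knowledge" contrast above.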
21. But first: Some Tools (“Cyberinfrastructure”)
• SKOPE: system and tools to discover, access, analyze, visualize paleoenvironmental data
– unprecedented ability to explore provenance (detailed, comprehensible record of computational derivation of results)
– for researchers, tinkerers, and modelers
• Whole Tale:
– leverage & contribute to existing CI to support the whole tale (“living paper”), from workflow run to scholarly publication
– integrate tools & CI (DataONE, Globus, iRODS, NDS, ...) to simplify use and promote best practices
– driven by science WGs (Archaeology/SKOPE, materials science, astro, bio ...)
25. Project Goals (… Reproducibility in Ecological Niche Models … )
● Try to reproduce one set of results reported in the literature
using maximum entropy methods (MaxEnt) within The Whole
Tale environment
○ Phillips, S. J., Anderson, R. P., & Schapire, R. E. (2006). Maximum
entropy modeling of species geographic distributions. Ecological
Modelling, 190(3-4), 231-259.
● Determine whether existing software tools focus more on the
scientific modeling problem instead of on software usage
while covering reproducibility concerns
○ Finding: no. Existing tools are either incomplete or desktop-based,
and are not comparable
● Build scientific software for ecological niche modeling that
helps users diversify and trace their stories
○ Introspection-based model
26. intros-MaxEnt: view in PRIMAD++
Actions | Parameter | Raw data | Platform / Stack | Implem. | Method | Research Objective | Actor | Gain
Re-code (x) x Run MaxEnt models in the Whole Tale
Validate (x) (x) (x) (x) x Determine MaxEnt robustness factors
Re-use x Increase the user base for MaxEnt methods
Independent x x Collectively verify MaxEnt experiments
Introspect (x) (x) x Explore and adjust model contents
Diff (x) (x) (x) x Test hypotheses dependent on state-change
Trace (log) (x) (x) (x) (x) x Capture time-dep decision modeling pathways
Package (x) (x) (x) (x) x Provide a zero cold-start entry for experiments
Freire, J., Fuhr, N., & Rauber, A. (2016). Reproducibility of data-oriented experiments in e-Science (Dagstuhl Seminar 16041). In Dagstuhl Reports (Vol. 6, No. 1). Schloss Dagstuhl-Leibniz-Zentrum fuer
Informatik.
27. Ecological niche models ..
1. Positive observations (i.e. presence-only data) suffice to
compute a distribution of a species
2. The likelihood of the presence of an individual depends on
biologically relevant environmental factors
3. Interactions between species can be abstracted as
environmental factors, hence not modeled explicitly
4. The distribution is stated in terms of the probability of finding
a member of the species at the locations of interest
5. An exact fit is not a good fit, but rather an overfit
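The assumptions above can be illustrated with a toy presence-only fit. This is a minimal sketch, not the MaxEnt software or the algorithm of Phillips et al. (2006): the landscape, the presence records, and all function names are hypothetical, and a simple L2 penalty is swapped in as the regularizer to show how a penalty keeps the model from fitting the presence data exactly (point 5).

```python
import math


def gibbs(weights, features):
    """Gibbs distribution q(x) proportional to exp(weights . features[x])."""
    scores = [sum(w * f for w, f in zip(weights, row)) for row in features]
    top = max(scores)  # subtract the max for numerical stability
    unnorm = [math.exp(s - top) for s in scores]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def fit_maxent(features, presence_cells, penalty=0.01, rate=1.0, steps=2000):
    """Gradient ascent on mean presence log-likelihood minus an L2 penalty."""
    n_features = len(features[0])
    # Empirical feature means over the presence records only (point 1).
    empirical = [
        sum(features[c][j] for c in presence_cells) / len(presence_cells)
        for j in range(n_features)
    ]
    weights = [0.0] * n_features
    for _ in range(steps):
        q = gibbs(weights, features)
        expected = [
            sum(q[i] * features[i][j] for i in range(len(features)))
            for j in range(n_features)
        ]
        # Gradient: empirical mean - model mean - penalty * weight.
        weights = [
            w + rate * (empirical[j] - expected[j] - penalty * w)
            for j, w in enumerate(weights)
        ]
    return weights, gibbs(weights, features)


# Toy landscape: five cells, one environmental feature scaled to [0, 1].
features = [[0.0], [0.25], [0.5], [0.75], [1.0]]
presence_cells = [3, 4, 4]  # hypothetical presence-only records (cell indices)

weights, distribution = fit_maxent(features, presence_cells)
print([round(p, 3) for p in distribution])
```

The result is a probability distribution over locations (point 4) driven only by the environmental feature (point 2). Raising `penalty` pulls the fitted distribution toward uniform, and lowering it lets the model track the presence records more closely.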
38. Summary of Outcomes
1. Able to execute a version of MaxEnt with original data from
Phillips et al (2006) within The Whole Tale
a. Stated in terms of a regularized support vector machine
(complex code!)
b. Discovered problems with reproducibility and how to
evaluate it
2. A tool for batch georeferencing DarwinCore based on minimal
location data was implemented
a. Helpful to assign geolocation data after taxonomy
alignment
b. Discovered data is much less clean than expected
3. A new “introspective” software version of MaxEnt
a. Available in PyPI
b. Based on a state machine
39. ... now what?
• PRIMAD ++
– PRIMAD is built on the idea of
– … keeping some things the same
– ... and “wiggling” some things
– We can start from the “execution stack”:
• Hardware … Operating System … Libraries ... PLs ... IDEs ..
– Then going into the domain:
• … varying datasets, parameters, assumptions ...
– Experimental Design ++ !
• PRIMAD ++ HoH (v2?)
• Tools to support
– “higher order” {data, parameter, method, …} sweeps
– Automate these (workflow tools!)
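A "higher order" sweep of this kind can be sketched in a few lines. The example below is a hedged illustration, not a Whole Tale or PRIMAD tool: the component values are placeholders, and `sweep` and `varied` are hypothetical helper names. It keeps a baseline fixed, "wiggles" selected PRIMAD components (Platform, Research objective, Implementation, Method, Actor, Data), and reports what changed in each run.

```python
from itertools import product

# Baseline experiment described by the six PRIMAD components.
# All values are illustrative placeholders.
BASELINE = {
    "platform": "linux-container",
    "research_objective": "species distribution",
    "implementation": "maxent-v1",
    "method": "maximum entropy",
    "actor": "original authors",
    "data": "dataset-A",
}


def sweep(baseline, alternatives):
    """Yield one variant per combination of the listed alternatives."""
    keys = list(alternatives)
    for values in product(*(alternatives[k] for k in keys)):
        variant = dict(baseline)
        variant.update(zip(keys, values))
        yield variant


def varied(baseline, variant):
    """Names of the components that differ from the baseline."""
    return sorted(k for k in baseline if baseline[k] != variant[k])


# "Wiggle" two components at once; everything else stays fixed.
ALTERNATIVES = {
    "data": ["dataset-A", "dataset-B"],
    "implementation": ["maxent-v1", "maxent-v2"],
}

runs = list(sweep(BASELINE, ALTERNATIVES))
for run in runs:
    print(varied(BASELINE, run) or ["nothing (plain repeat)"])
```

Recording the varied components alongside each run makes it explicit which kind of reproducibility study a given result belongs to.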
43. Taxonomic concept alignment, Andropogon glomeratus-virginicus complex, spanning across 11 classifications authored 1889-2015
• 36 unique taxonomic names
• 88 taxonomic concept labels ⇒ name sec. author strings
• Alignment by A.S. Weakley ⇒ row position = congruence
• 1/36 names with unique 1 : 1 name : meaning cardinality across all classifications
• Andropogon virginicus
• Source: Franz et al. 2016 [1]
[1] Franz et al. 2016. Names are not good enough: reasoning over taxonomic change in the Andropogon complex. Semantic Web Journal (IOS). doi:10.3233/SW-160220
45. Half-Smokes in DC: Typical for the Northeast?
… or the South !? (A tale of two taxonomies: NDC vs CEN)
“…in the face of incompatible information or data structures among users or among those specifying
the system, attempts to create unitary knowledge categories are futile. Rather, parallel or multiple
representational forms are required” [Bowker & Star, 2000, p.159]
National Diversity Council map (NDC) regions: West, Southwest, Southeast, Midwest, Northeast
US Census Bureau map (CEN) regions: West, South, Midwest, Northeast
Source: Yi-Yun (Jessica) Cheng (PhD student, iSchool @ Illinois)