'Open' as a strategy for durability,
reproducibility, and scalability
Jonathan A. Rees
National Evolutionary Synthesis Cen...
Outline
● The challenge
● Open Tree of Life
● Threats
● 'Open' as defense
● General remarks
● The challenge
(c) NingWang et al CC-BY
Problems
● Information about tree of life is hard to fnd
● Technical incompatibilities make
integration hard (e.g. names /...
●
● Open Tree of Life
Stephen Smith
● Name gathering (taxonomy)
● Study upload
● Curation interface (OTU matching etc.)
● Deposit to study repository
● Synthe...
But...
● Trees are hard to obtain (Drew et al. 2013)
– ~4% relatively easily available (Treebase, Dryad)
– Another ~12% av...
●
●
● Threats
Threats (1)
● Under- and non-funding
● Software and/or data could be lost or
captured
– volunteers may not want to invest ...
Threats (2)
● Scientifc decision making could be opaque
– volunteers could turn away if system doesn't
make sense or if it...
●
●
●
● 'Open' as defense
Legal 'open'
● Hybrid CC0 + 'facts are free' (open data)
– aids persistence (link to Dryad)
– makes corpus more valuable
●...
Technical 'open'
● Data as JSON
– 'open' to larger part of ecosystem
● Trees as NeXML (in JSON syntax)
● Data on github ('...
Process 'open'
● Open process ('open science')
– public issue trackers
– public discussion groups
– routing user feedback ...
●
●
●
●
● General remarks
life = free software ?
"Three cell growth types". Licensed under Creative Commons Attribution-Share Alike 3.0 via Wikimedi...
● I hereby acknowledge technical debt
● tree.opentreeofife.org
● opentreeofife.org
● github.com/opentreeofife
● twitter.com/opentreeofife
● freenode #opentreeof...
Software:
Jim Allman
Joseph Brown
Karen Cranston
Cody Hinchlif
Mark Holder
Jonathan Leto
Emily McTavish
Peter Midford
Rick...
© 2014 Jonathan A Rees /CC-BY 4.0
'Open' as a strategy for durability, reproducibility, and scalability (BOSC 2014)
'Open' as a strategy for durability, reproducibility, and scalability (BOSC 2014)
'Open' as a strategy for durability, reproducibility, and scalability (BOSC 2014)
'Open' as a strategy for durability, reproducibility, and scalability (BOSC 2014)
'Open' as a strategy for durability, reproducibility, and scalability (BOSC 2014)
Upcoming SlideShare
Loading in …5
×

'Open' as a strategy for durability, reproducibility, and scalability (BOSC 2014)

315 views
266 views

Published on

About the 'Open' in the Open Tree of Life project.

Published in: Science, Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
315
On SlideShare
0
From Embeds
0
Number of Embeds
3
Actions
Shares
0
Downloads
1
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

'Open' as a strategy for durability, reproducibility, and scalability (BOSC 2014)

  1. 1. 'Open' as a strategy for durability, reproducibility, and scalability Jonathan A. Rees National Evolutionary Synthesis Center BOSC 12 July 2014
  2. 2. Outline ● The challenge ● Open Tree of Life ● Threats ● 'Open' as defense ● General remarks
  3. 3. ● The challenge
  4. 4. (c) NingWang et al CC-BY
  5. 5. Problems ● Information about tree of life is hard to fnd ● Technical incompatibilities make integration hard (e.g. names / identifers) ● Scientifc diferences are rampant but hard to see ● No single view on what is known
  6. 6. ● ● Open Tree of Life
  7. 7. Stephen Smith
  8. 8. ● Name gathering (taxonomy) ● Study upload ● Curation interface (OTU matching etc.) ● Deposit to study repository ● Synthesis ● Access (browser & API)
  9. 9. But... ● Trees are hard to obtain (Drew et al. 2013) – ~4% relatively easily available (Treebase, Dryad) – Another ~12% available in response to email – Others available only as JPEGs (cf. Mounce) ● Lots of manual labor (curation) – Obtaining trees – Ingroup and outgroup – OTU matching
  10. 10. ● ● ● Threats
  11. 11. Threats (1) ● Under- and non-funding ● Software and/or data could be lost or captured – volunteers may not want to invest if they think work might be lost or captured ● Software could become brittle and unimprovable – volunteers may not want to invest if they think the data is 'captured' by the software
  12. 12. Threats (2) ● Scientifc decision making could be opaque – volunteers could turn away if system doesn't make sense or if its claims are hard to confrm
  13. 13. ● ● ● ● 'Open' as defense
  14. 14. Legal 'open' ● Hybrid CC0 + 'facts are free' (open data) – aids persistence (link to Dryad) – makes corpus more valuable ● Free software (including marked Javascript) – aids persistence – encourages experiments ● Open access publications (CC-BY)
  15. 15. Technical 'open' ● Data as JSON – 'open' to larger part of ecosystem ● Trees as NeXML (in JSON syntax) ● Data on github ('phylesystem' repo + index) – what you see is everything – not locked up in a database – helps assure community that data won't be lost ● Scripting
  16. 16. Process 'open' ● Open process ('open science') – public issue trackers – public discussion groups – routing user feedback to github issue tracker ● Reproducibility is another kind of 'open' – hard to reproduce → not so open – links to source material – scripting
  17. 17. ● ● ● ● ● General remarks
  18. 18. life = free software ? "Three cell growth types". Licensed under Creative Commons Attribution-Share Alike 3.0 via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Three_cell_growth_types.png#mediaviewer/File:Three_cell_growth_types.png. copy interpret disseminate compete change merge
  19. 19. ● I hereby acknowledge technical debt
  20. 20. ● tree.opentreeofife.org ● opentreeofife.org ● github.com/opentreeofife ● twitter.com/opentreeofife ● freenode #opentreeofife ● opentreeofife google group ● opentreeofife-software google group
  21. 21. Software: Jim Allman Joseph Brown Karen Cranston Cody Hinchlif Mark Holder Jonathan Leto Emily McTavish Peter Midford Rick Ree Jonathan Rees Stephen Smith PIs: Gordon Burleigh Keith Crandall Karen Cranston Karl Gude David Hibbett Mark Holder Laura Katz Rick Ree Stephen Smith Doug Soltis Tifani Williams Funding: NSF Curators: Joseph Brown Bryan Drew Romina Gazis Jessica Grant Cody Hinchlif Dail Laughinghouse Chris Owen
  22. 22. © 2014 Jonathan A Rees /CC-BY 4.0

×