SCIENCE IN THE
OPEN,
WHAT DOES IT
TAKE?
MELISSA HAENDEL, DATA JAMBOREE
MARCH 3RD, 2017
@ontowonka
THERE ARE OVER 1500 PUBLIC
DATABASES IN THE NUCLEIC ACIDS
RESEARCH DATABASE COLLECTION
https://doi.org/10.1093/nar/gkw1188
HOW MANY OF THESE ARE
TRULY OPEN?
OPENNESS IS AN NAR
REQUIREMENT, BUT …
OPEN DATA IS FAIR DATA
http://www.nature.com/articles/sdata201618
Findable Accessible Interoperable Reusable
ANATOMY OF FAIRNESS
Metadata
Identifiers
Registration
Preservation
Standards
McMurry et al., Identifiers for the 21st century, bit.ly/identifiers-revision-2017
WHAT MAKES DATA FAIR:
FINDABLE
F1. (meta)data are assigned a globally unique and
persistent identifier
F2. data are described with rich metadata (defined
by R1 below)
F3. metadata clearly and explicitly include the
identifier of the data it describes
F4. (meta)data are registered or indexed in a
searchable resource
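A minimal sketch of how F1 to F4 could be checked mechanically. The field names (`identifier`, `data_identifier`), the "absolute URI" test for F1, and the "at least two descriptive fields" threshold for F2 are illustrative assumptions, not part of the FAIR principles. Persistence cannot be verified from a string alone.

```python
from typing import Dict, List, Set


def findability_gaps(record: Dict[str, str], index: Set[str]) -> List[str]:
    """Return the Findable principles (F1-F4) that a metadata record misses."""
    gaps: List[str] = []
    identifier = record.get("identifier", "")

    # F1: a globally unique identifier (assumed here: an absolute HTTP(S) URI).
    if not identifier.startswith(("http://", "https://")):
        gaps.append("F1")

    # F2: rich metadata (assumed here: at least two non-empty descriptive fields).
    descriptive = [key for key, value in record.items()
                   if key not in ("identifier", "data_identifier") and value]
    if len(descriptive) < 2:
        gaps.append("F2")

    # F3: the metadata names the identifier of the data it describes.
    if not record.get("data_identifier"):
        gaps.append("F3")

    # F4: the record is registered in a searchable index.
    if identifier not in index:
        gaps.append("F4")

    return gaps
```

A record with an HTTPS identifier, a `data_identifier`, two descriptive fields, and an entry in `index` yields an empty list. An empty record yields all four codes.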
WHAT MAKES DATA FAIR:
ACCESSIBLE
A1. (meta)data are retrievable by their identifier
using a standardized communications protocol
A1.1 the protocol is open, free, and universally
implementable
A1.2 the protocol allows for an authentication
and authorization procedure, where necessary
A2. metadata are accessible, even when the
data are no longer available
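A1 can be illustrated with a DOI and plain HTTPS. The sketch below only builds the request and does not send it. It assumes the `https://doi.org/` resolver honours content negotiation for the CSL JSON media type; the function name `metadata_request` is hypothetical.

```python
from urllib.request import Request

DOI_RESOLVER = "https://doi.org/"
CSL_JSON = "application/vnd.citationcsl+json"  # assumed to be supported by the resolver


def metadata_request(doi: str) -> Request:
    """Build (but do not send) an HTTPS GET request for a DOI's metadata."""
    return Request(DOI_RESOLVER + doi, headers={"Accept": CSL_JSON})


# The DOI cited earlier in the slides, addressed through an open protocol.
req = metadata_request("10.1093/nar/gkw1188")
```

Passing `req` to `urllib.request.urlopen` would perform the retrieval. The point is that the identifier alone, plus an open protocol, is enough to ask for the metadata.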
WHAT MAKES DATA FAIR:
INTEROPERABLE
I1. (meta)data use a formal, accessible, shared,
and broadly applicable language for knowledge
representation.
I2. (meta)data use vocabularies that follow FAIR
principles
I3. (meta)data include qualified references to
other (meta)data
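One way to picture I3: a qualified reference says not only that two records are linked but how. The sketch below stores references as (subject, relation, target) and expands compact identifiers (CURIEs) through a prefix map. The `ex` prefix, the relation name `ex:describes`, and the helper names are hypothetical.

```python
from typing import Dict, NamedTuple

# Prefix map; "ex" is a made-up namespace used only for illustration.
PREFIXES: Dict[str, str] = {
    "doi": "https://doi.org/",
    "ex": "https://example.org/vocab/",
}


class QualifiedReference(NamedTuple):
    subject: str   # CURIE of the referring record
    relation: str  # CURIE naming the kind of link (the "qualification")
    target: str    # CURIE of the referenced record


def expand(curie: str, prefixes: Dict[str, str] = PREFIXES) -> str:
    """Expand a CURIE such as 'doi:10.1093/nar/gkw1188' into a full URI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return prefixes[prefix] + local


ref = QualifiedReference("ex:dataset1", "ex:describes", "doi:10.1093/nar/gkw1188")
```

An unqualified link would carry only `subject` and `target`; the `relation` field is what makes the reference usable by software.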
WHAT MAKES DATA FAIR:
REUSABLE
R1. (meta)data are richly described with a plurality of
accurate and relevant attributes
R1.1. (meta)data are released with a clear and
accessible data usage license
R1.2. (meta)data are associated with detailed
provenance
R1.3. (meta)data meet domain-relevant community
standards
LET'S CALL OUT A FEW MORE
THINGS.
https://zenodo.org/record/203295
Findable Accessible Interoperable Reusable
FAIR-TLC
Traceable Licensed Connected
FAIR-TLC:
TRACEABILITY
T1: Provenance
The provenance of the data within the resource is well documented and
attributed.
T2: Attribution
The contributions to the content (data, tools, algorithms, sources, etc.) are
clearly declared.
Documentation explains how to cite a single record from a source or the
whole resource.
FAIR-TLC: LICENSURE
http://peterdesmet.com/posts/analyzing-gbif-data-licenses.html
Not all data resources are free to use, derive from,
and redistribute, even if they are publicly funded
and seemingly publicly available.
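The three freedoms named above (use, derive, redistribute) can be checked against a license table. This is a deliberately simplified sketch: the table covers three Creative Commons licenses, ignores conditions such as attribution, and treats a missing or unrecognised license as not reusable. The function name is hypothetical.

```python
from typing import Dict, Optional

# Simplified permissions per license identifier (SPDX-style names).
LICENSE_TERMS: Dict[str, Dict[str, bool]] = {
    "CC0-1.0": {"use": True, "derive": True, "redistribute": True},
    "CC-BY-4.0": {"use": True, "derive": True, "redistribute": True},
    "CC-BY-ND-4.0": {"use": True, "derive": False, "redistribute": True},
}


def freely_reusable(license_id: Optional[str]) -> bool:
    """True only if the license is known and grants all three freedoms."""
    terms = LICENSE_TERMS.get(license_id or "")
    return terms is not None and all(terms.values())
```

A publicly downloadable resource with no stated license fails this check, which is the situation the slide warns about.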
FAIR-TLC: CONNECTED
BECAUSE AGGREGATED != INTEGRATED
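A toy sketch of the difference. Two sources describe the same entity under different identifiers; aggregation stacks the records, while integration merges them through an explicit equivalence mapping. All identifiers, field values, and function names are made up for illustration.

```python
from typing import Dict, List

Record = Dict[str, str]

source_a: List[Record] = [{"id": "A:1", "name": "gene X"}]
source_b: List[Record] = [{"id": "B:9", "phenotype": "short stature"}]

# Declared equivalence between identifiers: B:9 denotes the same entity as A:1.
same_as: Dict[str, str] = {"B:9": "A:1"}


def aggregate(*sources: List[Record]) -> List[Record]:
    """Aggregation: put every record in one list, unchanged."""
    return [record for source in sources for record in source]


def integrate(sources: List[List[Record]], same_as: Dict[str, str]) -> Dict[str, Record]:
    """Integration: merge records that refer to the same entity."""
    merged: Dict[str, Record] = {}
    for record in aggregate(*sources):
        canonical = same_as.get(record["id"], record["id"])
        entry = merged.setdefault(canonical, {"id": canonical})
        entry.update({k: v for k, v in record.items() if k != "id"})
    return merged
```

`aggregate(source_a, source_b)` returns two unrelated records. `integrate([source_a, source_b], same_as)` returns one record for `A:1` carrying both the name and the phenotype.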
FAIR-TLC AS AN EVAL RUBRIC
Room for improvement
bit.ly/open-science-prize
Open imaging
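A rubric of this kind can be sketched as a seven-item checklist. The pass/fail scoring and the function name `evaluate` are assumptions; the slides do not state how the actual evaluation was scored.

```python
from typing import Dict, List, Tuple

CRITERIA: Tuple[str, ...] = (
    "Findable", "Accessible", "Interoperable", "Reusable",
    "Traceable", "Licensed", "Connected",
)


def evaluate(marks: Dict[str, bool]) -> Tuple[int, List[str]]:
    """Return (criteria met, criteria with room for improvement)."""
    missing = [name for name in CRITERIA if not marks.get(name, False)]
    return len(CRITERIA) - len(missing), missing
```

A criterion that is not mentioned in `marks` counts as unmet, so an unassessed resource scores zero rather than passing by default.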
DISCUSSION:
HOW DO WE DO BETTER?
Make the right thing the easy thing:
- Carrots:
- Tenure & promotion cycles
- Dedicated funding for increasing FAIR-TLC
- Sticks:
- Publication requirements
- Funding requirements
- Tools:
- Tracking tools
- Documentation tools
FAIR-TLC = OPEN SCIENCE
Findable Accessible Interoperable Reusable
Traceable Licensed Connected
Coming soon reusabledata.org
