| 1
Anita de Waard, VP Research Data Collaborations
Elsevier RDM Services
a.dewaard@elsevier.com
December 1, 2016
Elsevier’s RDM Program:
Ten Habits of Highly Effective Data
| 2
A Maslow Hierarchy for Research Data:
https://www.elsevier.com/connect/10-aspects-of-highly-effective-research-data
Save, Share, Use
10. Integrate upstream and downstream: make metadata to serve use.
9. Re-usable (allow tools to run on it)
8. Reproducible
7. Trusted (e.g. reviewed)
6. Comprehensible (description / method is available)
5. Citable
4. Discoverable (data is indexed or data is linked from article)
3. Accessible
2. Preserved (long-term & format-independent)
1. Stored (existing in some form)
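The ten habits on this slide are ordered like a Maslow pyramid, with each level building on the ones below it. A minimal sketch of that reading follows. It assumes a strict bottom-up interpretation (a level only counts once every lower level is met), which the slide implies but does not state. The names `DataHabit` and `highest_level_reached` are hypothetical and used for illustration only.

```python
from enum import IntEnum


class DataHabit(IntEnum):
    """The ten habits from the slide, numbered as in the hierarchy."""
    STORED = 1
    PRESERVED = 2
    ACCESSIBLE = 3
    DISCOVERABLE = 4
    CITABLE = 5
    COMPREHENSIBLE = 6
    TRUSTED = 7
    REPRODUCIBLE = 8
    REUSABLE = 9
    INTEGRATED = 10


def highest_level_reached(satisfied):
    """Return the highest habit for which it and every lower habit are met.

    Assumption: the hierarchy is strictly cumulative. Returns None when
    even the first level (STORED) is missing.
    """
    reached = None
    for habit in DataHabit:  # iterates in definition order, 1 through 10
        if habit in satisfied:
            reached = habit
        else:
            break  # a gap stops the climb, even if higher habits are met
    return reached
```

For example, a dataset that is stored, preserved and citable, but not accessible or discoverable, would only reach the "Preserved" level under this reading.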
| 3
Store, Preserve: Data Rescue Award
| 4
Store: Hivebench
www.hivebench.com
| 5
Store, Access: Mendeley Data
https://data.mendeley.com/
• Linked to published papers, or not
• Linked to GitHub, or not
• Versioning and provenance tracking
• Different licenses: GNU GPL, CC BY, CC0, etc.
| 6
Access, Cite: Data Linking
• Integrated in paper submission process
• Supplementary data is never behind a firewall
• Closely integrated with more than 150 databases
| 7
Access, Discover: Scholix/DLIs
• ICSU-WDS/RDA Publishing Data Service Working Group, merged with National Data Service pilot
• Cross-stakeholder, with input from CrossRef, DataCite, OpenAIRE, Europe PubMed Central, ANDS, PANGAEA, Thomson Reuters, Elsevier, and others
• Proposed long-term architecture and interoperability framework: www.scholix.org
• Operational prototype at http://dliservice.research-infrastructures.eu/#/api (including 1.4 million links from various sources)
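The data-literature links exchanged under such a framework can be pictured as small records that pair a source object with a target object. The sketch below is illustrative only. The class and field names (`ResearchObject`, `DataLiteratureLink`, `links_for`) are hypothetical simplifications, not the Scholix schema or the DLI service API, and the identifiers are placeholders.

```python
from dataclasses import dataclass, asdict
import json


@dataclass(frozen=True)
class ResearchObject:
    """One end of a link: an article or a dataset (simplified, hypothetical)."""
    identifier: str   # e.g. a DOI or accession number
    id_scheme: str    # e.g. "doi"
    object_type: str  # e.g. "literature" or "dataset"


@dataclass(frozen=True)
class DataLiteratureLink:
    """A single link between two research objects (simplified, hypothetical)."""
    source: ResearchObject
    target: ResearchObject
    relationship: str   # e.g. "references"
    link_provider: str  # who asserted the link


def links_for(identifier, links):
    """Return every link that has the identifier as its source or target."""
    return [
        link for link in links
        if identifier in (link.source.identifier, link.target.identifier)
    ]


# Placeholder identifiers, for illustration only.
article = ResearchObject("10.1234/example-article", "doi", "literature")
dataset = ResearchObject("10.1234/example-dataset", "doi", "dataset")
link = DataLiteratureLink(article, dataset, "references", "Example Publisher")

print(json.dumps(asdict(link), indent=2))
```

Keeping the link as a standalone record, separate from both the article and the dataset, is what lets several stakeholders contribute links to a shared pool.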
| 8
Cite: Force11
https://www.elsevier.com/connect/data-citation-is-becoming-real-with-force11-and-elsevier
| 9
Discover: Datasearch
https://datasearch.elsevier.com
| 10
Review: Research Elements
https://www.elsevier.com/books-and-journals/research-elements
[Figure: article types surrounding the full research paper: data articles, software articles, method articles, protocols, video articles, hardware articles, lab resources]
• Brief article types designed to communicate a specific element of the research cycle
• Complementary to full research papers
• Easy to prepare and submit
• Peer-reviewed and indexed
• Receive a DOI and are fully citable
• Allow citable post-publication updates
• Primarily Open Access (CC-BY)
• Published in multidisciplinary and domain-specific journals
| 11
Reuse: Cortex Registered Reports
• Two-step submission process:
  - Method and proposed analysis are submitted for pre-registration
  - Paper is conditionally accepted
  - Research is executed
  - Full paper submitted, accepted provided that protocol is followed
• All experimental data made available Open Access
• Featured in the Guardian
| 12
Metrics for Institutions: Data Lighthouse
[Figure: Data Lighthouse workflow: research article published; initial inquiry; share, publish and link data; monitor progress and provide guidance; generate reports]
What?
• Service for research institutes (esp. librarians) to engage with researchers throughout the research data life cycle.
How?
Offer a service for librarians to interact with researchers regarding the RDM process to:
• Offer solutions to store, share, link and publish data
• Monitor progress and report on posting, citations, and downloads of datasets
• Provide monthly reporting
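The monthly reporting on posting, citation and downloads described on this slide amounts to grouping dataset events by month and counting them by kind. A minimal sketch under that assumption follows. `DatasetEvent` and `monthly_report` are hypothetical names and do not describe how Data Lighthouse is actually implemented.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DatasetEvent:
    """One tracked event for a dataset (hypothetical structure)."""
    dataset_id: str
    kind: str  # assumed values: "posting", "citation", "download"
    day: date


def monthly_report(events):
    """Count events per month ("YYYY-MM") and per kind.

    Returns a plain dict ordered by month, e.g.
    {"2016-11": {"posting": 1, "download": 2}}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for event in events:
        month = event.day.strftime("%Y-%m")
        counts[month][event.kind] += 1
    return {month: dict(kinds) for month, kinds in sorted(counts.items())}
```

A librarian-facing report could then be rendered from this dictionary, one row per month.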
| 13
A Maslow Hierarchy for Research Data:
https://www.elsevier.com/connect/10-aspects-of-highly-effective-research-data
Save, Share, Use
10. Integrate upstream and downstream: make metadata to serve use.
9. Re-usable
8. Reproducible
7. Trusted
6. Comprehensible
5. Citable
4. Discoverable
3. Accessible
2. Preserved
1. Stored
Data at Risk, Reproducibility Papers, Data Lighthouse
| 14
Links:
• Original Materials:
- The original research paper: Kinnings et al., 2010
- The paper describing the earlier reproducibility effort: Garijo et al., 2013
- A wiki with the reproduction attempt: Gil/Garijo, 2012
- Background materials on the reproduction efforts: Garijo, 2012
- SMAP Tool: Xie, 2010
• Our rebuild:
- Protocol in Hivebench: https://www.hivebench.com/protocols/16483
- Experiment in Hivebench: https://www.hivebench.com/notebooks/8524/experiments/20562
- Data in Mendeley Data: https://data.mendeley.com/datasets/r69mvkckmn/draft?preview=1
- MethodsX Paper, with links to protocols and data: http://www.articleofthefuture.com/methodsx.html

Editor's Notes

  • #7, #8 IUPAC has recommendations for which word to use to describe a given property, but the vocabulary is not very accessible or usable, so it is not universally implemented. Each site decides how it wants to label a given property, which hinders indexing and reuse of the data across silos. Structured capture of information using an ELN such as Hivebench enables the researcher to report data using a consistent vocabulary without extra effort.
  • #10 Chemistry data are retrievable from NIST, but only by going to their page in a browser and using their search tools. What about access from within other applications, or through assistive devices for those with vision impairment? What guarantee do we have that the data will remain accessible in case of government funding problems?
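The note on consistent vocabulary (slides #7 and #8) can be illustrated with a small label-normalization step of the kind a structured capture tool might apply. This is a sketch under stated assumptions. The mapping in `CANONICAL_LABELS` is a made-up example, not IUPAC's recommended vocabulary, and `normalize_record` is a hypothetical helper, not a Hivebench function.

```python
# Hypothetical mapping from site-specific labels to one canonical label.
# These entries are illustrative and are NOT taken from IUPAC recommendations.
CANONICAL_LABELS = {
    "melting point": "melting point",
    "mp": "melting point",
    "m.p.": "melting point",
    "boiling point": "boiling point",
    "bp": "boiling point",
    "b.p.": "boiling point",
}


def normalize_record(record):
    """Rewrite property labels in a record to their canonical form.

    Returns (normalized, unmapped): a dict keyed by canonical labels, and
    a list of original labels that had no known mapping.
    """
    normalized = {}
    unmapped = []
    for label, value in record.items():
        key = label.strip().lower()  # ignore case and stray whitespace
        if key in CANONICAL_LABELS:
            normalized[CANONICAL_LABELS[key]] = value
        else:
            unmapped.append(label)  # keep for manual review, do not guess
    return normalized, unmapped
```

Reporting unmapped labels instead of silently dropping them keeps the step auditable, which matters when records from different silos are indexed together.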