1.
brian.hole@ubiquitypress.com www.ubiquitypress.com / @ubiquitypress
Brian Hole, Founder and CEO
ISI CODATA Workshop, Bangalore, 9th March 2015
Publishing (open) data
2.
Overview
Why publish open data?
How to publish
When open isn’t possible
3.
About Ubiquity Press
Mission
To return control of publishing to researchers, providing them with the infrastructure and support to advance publishing in ways that legacy publishers are not willing to do.
Background
Spun out of University College London in 2012
Researcher-led
Extensive publishing background as well (BioMed Central, PLoS, Elsevier, IOP)
Based in London
Comprehensive approach: journals, books, data, software, hardware, wetware…
4.
The Social Contract of Science
• Validation
• Dissemination
• Further development
Scientific Malpractice
• Data
• Results
• Software
• Hardware, wetware…
Source: http://www.smbc-comics.com/index.php?db=comics&id=2015
5.
why publish open data?
12.
Openness incentivises rigour in research
Wicherts JM, Bakker M, Molenaar D (2011) Willingness to Share Research Data Is Related to the Strength of the Evidence and the Quality of Reporting of Statistical Results. PLoS ONE 6(11): e26828. doi:10.1371/journal.pone.0026828
13.
how to publish
14.
Supplementary data
Searchable: published metadata allows Google search for data files ✗
Confirmable: author can confirm descriptive metadata terms used ✗
Citable: unique identifiers (DOIs) permit citation of data files ?
Increased exposure of source journal articles through data citation ?
Permanent: data files securely archived in perpetuity ?
Linked: datasets linked to article based on them ✔
Metadata will be available as RDF: part of the “web of linked data” ?
Curated: quality verified, stable formats used, content virus-checked ?
Ease of deposit: authors can upload multiple or zipped files ?
Updatable: new versions of data files can be added, with provenance ✗
Embargo: can delay release of data up to one year after publication ✗
Open access: no restrictions for users, no subscription required ?
Scalable: many journals and societies can leverage economies of scale ✗
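The "Citable" criterion on this slide can be illustrated with a minimal Python sketch. It assumes a hypothetical dataset record (the authors, title, repository name and DOI below are invented placeholders, not real data) and shows how a DOI turns a data file into something that can be cited and resolved through the public resolver at https://doi.org/. The citation layout is only one plausible style, not a format prescribed by the talk.

```python
from __future__ import annotations

from dataclasses import dataclass

# Public DOI resolver: appending a DOI to this prefix gives a resolvable link.
DOI_RESOLVER = "https://doi.org/"


@dataclass
class DatasetRecord:
    """Descriptive metadata for a deposited dataset (field names are illustrative)."""

    authors: list[str]
    year: int
    title: str
    repository: str
    doi: str

    def doi_url(self) -> str:
        # Persistent link that keeps working even if the files move.
        return DOI_RESOLVER + self.doi

    def citation(self) -> str:
        # One possible citation style; journals and repositories define their own.
        names = "; ".join(self.authors)
        return f"{names} ({self.year}). {self.title}. {self.repository}. {self.doi_url()}"


# Hypothetical record: every value here is a placeholder.
record = DatasetRecord(
    authors=["Doe, J.", "Roe, R."],
    year=2015,
    title="Example survey dataset",
    repository="Example Repository",
    doi="10.1234/example.dataset",
)
print(record.citation())
```

Because the citation is built from the DOI rather than from a file location, the same string stays valid when a new version of the landing page or storage backend is introduced.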
15.
Data repositories
• Use of a data repository alone works very well in some disciplines
• Genes & gene sequences: GenBank
• Drosophila: FlyBase
• Clinical trials: clinicaltrials.gov
16.
• But a repository alone isn’t always the best fit for other disciplines, especially for long-tail data
17.
Data repositories that work closely with journals
• Provide a home for long-tail data
• Dryad: datadryad.org
• Zenodo: zenodo.org
• Figshare: figshare.com
• Integrate with publisher systems
• Dataverse: dataverse.org
18.
The data publishing landscape
Data journals have begun to appear over the past two years
20.
The basics of the model
Data papers are short
1) Low barrier data publication
Peer review is quick and objective
2) Online authoring
Low APC: £100 / ₹1,000
Lower cost (straight to XML)
Encourages shorter form
3) Open access only (CC-BY)
4) The publisher is not the repository
No-questions-asked waivers
22.
Peer review
1. The paper contents
a. The methods section of the paper must provide sufficient detail that a reader can understand how the resource was created.
b. The resource must be correctly described.
c. The reuse section must provide concrete and useful suggestions for reuse of the resource.
2. The deposited resource
a. The repository must be suitable for the resource and have a sustainability model.
b. The open license must permit unrestricted access (e.g. CC0), or guarantee access if criteria are met (must qualify).
c. A version must be available in an open, non-proprietary format.
d. The resource must be labeled in such a way that a third party can make sense of it.
e. The resource must be actionable.
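The peer review criteria on this slide are objective enough to be treated as a checklist. The Python sketch below is one way a reviewer form might encode them; the dictionary keys and the `unmet_criteria` helper are hypothetical names chosen for illustration, and the labels are paraphrased from the slide rather than taken from any actual review system.

```python
from __future__ import annotations

# Criteria paraphrased from the peer review slide; the keys are hypothetical.
CRITERIA: dict[str, str] = {
    "methods_detailed": "1a. Methods explain how the resource was created",
    "resource_described": "1b. Resource is correctly described",
    "reuse_suggestions": "1c. Reuse section gives concrete suggestions",
    "repository_suitable": "2a. Repository is suitable and sustainable",
    "open_license": "2b. Open license or guaranteed access",
    "open_format": "2c. Version in an open, non-proprietary format",
    "clearly_labeled": "2d. Labeled so a third party can make sense of it",
    "actionable": "2e. Resource is actionable",
}


def unmet_criteria(review: dict[str, bool]) -> list[str]:
    """Return the labels of all criteria not marked as satisfied.

    A criterion missing from the review counts as unmet, so an
    incomplete review cannot pass by omission.
    """
    return [label for key, label in CRITERIA.items() if not review.get(key, False)]


# Example review: everything passes except the open-format requirement.
review = {key: True for key in CRITERIA}
review["open_format"] = False

for label in unmet_criteria(review):
    print("Not met:", label)
```

Treating each criterion as a yes/no answer reflects the slide's point that review of a data paper can be quick and objective: the reviewer checks whether the resource is described and deposited properly, not whether the findings are novel.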
32.
Integrating data publication within universities
33.
when open isn’t possible
34.
Frequently cited reasons not to make data open
• Consent issues
• The data cannot be sufficiently anonymised
• The data is commercially valuable
• More work to be done and published from the data
• Reuse of data could be dangerous
• It would take too much work to make it usable by others