This talk is part of the MIT Program on Information Science brown bag series (http://informatics.mit.edu).
This talk discusses findings from an analysis of data sharing and citation policies in Open Access journals and describes a set of novel tools for open data publication in open access journal workflows. Bring your lunch and enjoy a discussion fit for scholars, Open Access fans, and students alike.
Dr Micah Altman is Director of Research and Head/Scientist, Program on Information Science for the MIT Libraries, at the Massachusetts Institute of Technology.
2. Integrating Open Data into Open Access Journals
Micah Altman
Director of Research
MIT Libraries
Prepared for
Program on Information Science Brown Bag Series
MIT
October 2015
3. Roadmap
Motivation
• Reproducibility
Intervention
• Integrating journal and data publication workflow
Future
• Changing policies & uses
5. DISCLAIMER
These opinions are my own; they are not the
opinions of MIT, Brookings, any of the project
funders, nor (with the exception of co-authored
previously published work) my collaborators.
Secondary disclaimer:
“It’s tough to make predictions, especially about
the future!”
-- Attributed to Woody Allen, Yogi Berra, Niels Bohr, Vint Cerf, Winston
Churchill, Confucius, Disreali [sic], Freeman Dyson, Cecil B. Demille, Albert
Einstein, Enrico Fermi, Edgar R. Fiedler, Bob Fourer, Sam Goldwyn, Allan
Lamport, Groucho Marx, Dan Quayle, George Bernard Shaw, Casey Stengel,
Will Rogers, M. Taub, Mark Twain, Kerr L. White, etc.
6. Collaborators & Co-Conspirators
IQSS, Harvard: Eleni Castro, Mercè Crosas, Phil Durbin
PKP Project: Alex Garnett, Jenn Whitney
Research Support
Supported by the Sloan Foundation
7. Related Work
Project Website
projects.iq.harvard.edu/ojs-dvn
Related publications:
(Reprints available from: informatics.mit.edu)
Altman M, Castro E, Crosas M, Durbin P, Garnett A, Whitney J. Open
Journal Systems and Dataverse Integration-- Helping Journals to
Upgrade Data Publication for Reusable Research. Code4Lib
Journal. Forthcoming.
Altman M, Avery M. Information wants someone else to pay for
it: laws of information economics and scholarly publishing.
Information Services and Use. 2015;35(1-2):57-70.
Altman M, Borgman C, Crosas M, Martone M. An Introduction to the
Joint Principles for Data Citation. Bulletin of the Association for
Information Science and Technology [Internet]. 2015;41(3):43-44.
Brand A, Allen L, Altman M, Hlava M, Scott J. Beyond authorship:
attribution, contribution, collaboration, and credit. Learned
Publishing. 2015;28(2):151-155.
9. New Initiatives to Improve Scientific Reliability
Retraction monitoring
Data citation
Clinical trial preregistration
Registered replication
Open data
Badges
12. Irreproducible Results
Many journals have no replication policy
Even in journals with a clear policy, the success rate is low
13. The File Drawer Problem
Daniel Shechtman's lab notebook providing initial evidence of quasicrystals
• Null results are less likely to be published, so published results as a whole are biased toward positive findings
• Outliers are routinely discarded, so unexpected patterns of evidence across studies remain hidden
14. Potential Interventions
Trustworthy Science
Access to Scholarly Record
Reproducible Processes
Attribution & Provenance
Management and Governance of the Evidence Base
Measurement and Evaluation
15. Integrating Open Data into Open Access Journals
The Project
Citation to Data
Citation to Article
• Technical Integration
• Socio-Technical Intervention
16. Integrating Open Journals and Data Publishing
Who? Address the needs of journal publishers and editors.
What? Enable journals to seamlessly manage the submission, review, citation, and publication of data associated with published articles.
How? Integrate existing technologies and workflows; promote adoption through outreach and involvement.
Why? Increase replicability of science, facilitate peer review of data, and promote long-term access to the scientific evidence base.
17. Introduction to Dataverse
Provides incentives for researchers to share:
• Recognition & credit via data citations
• Control over data & branding
• Fulfill journal data availability and
funder requirements.
Software framework for publishing, citing and preserving research data
(open source on GitHub for others to install)
Harvard Dataverse (open to all; repository instance at Harvard) has:
1,290 Dataverses
59,346 Datasets
248,351 Files
> 1 Million Downloads
18. Open Journal System (OJS)
Open source journal management and publishing system
created by the Public Knowledge Project (PKP) to expand & improve access
to research.
19. About OJS
OJS software hosts almost 10,000 active journals
Used on all continents
Particularly popular in developing countries
OJS Model
Open Access Publication
Open Software
Services
Journal hosting
Crossref integration
PLOS Article Level Metrics
LOCKSS Integration
OJS is part of a suite of products
for automating workflow:
Journal Publishing
Monograph Publishing
Conference Hosting
21. Actor Roles
The Author submits their article and research data to the journal's
OJS article submission system. (Note that the article and data do not
have to be submitted at the same time. Authors can also submit data
at a later time, or they can just provide a persistent link with a data
citation pointing to the repository that their data is currently in.)
Editors and/or Peer Reviewers review the article and data.
If the article and corresponding research data are approved for
publication, the Authors' research data and corresponding
metadata are automatically deposited from OJS into the Dataverse
through the API. No redundant information need be entered. A
permanent identifier (DOI) will be automatically included that allows
the data to be cited and tracked. There will be a data citation
included in the journal article page in OJS (and ideally within the
Reference section of the article) enabling readers of the article to
quickly access the data.
The Dataverse stores the dataset metadata and files (including raw
data, documentation, code, etc). There will also be a permanent
publication citation link within the Dataverse for researchers to
access the article in OJS that corresponds to this research data.
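As a rough illustration of the data citation mentioned above, the following sketch assembles a citation string from a few metadata fields. The function name, the field order, and the placeholder DOI are assumptions made for this example; they are not the exact citation format emitted by OJS or Dataverse.

```shell
#!/bin/sh
# Hypothetical helper: format a data citation from basic metadata.
# The field order is an illustrative assumption, not the exact format
# produced by OJS or Dataverse.
format_data_citation() {
    author=$1   # e.g. "Doe, Jane"
    year=$2     # publication year of the dataset
    title=$3    # dataset title
    doi=$4      # persistent identifier, without a resolver prefix
    repo=$5     # name of the hosting repository
    printf '%s, %s, "%s", https://doi.org/%s, %s\n' \
        "$author" "$year" "$title" "$doi" "$repo"
}

# Placeholder values only; the DOI below does not point at a real dataset.
format_data_citation "Doe, Jane" 2015 \
    "Replication Data for: Example Article" \
    "10.5072/FK2/EXAMPLE" "Example Dataverse"
```

Because the persistent identifier is embedded in the citation, the same string can appear on the article page and in the reference list while still resolving to the deposited data.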
22. Developing a Data Submission & Review Workflow
23. OJS Plugin Architecture
Plugins Extend Back-End Functionality and User Interface
Data Publication now part of OJS distribution
Can target any Dataverse repository
24. Author Submission
Extends supplementary file submission
Can provide extended metadata
Can provide data citation
25. Data Publication
Data published through Dataverse
Provides on-line exploration, reformatting, etc.
Linked through citations, DOIs and author IDs
Data updates managed in repository
26. Integration Through SWORD
Full SWORD 2 deposit interface
The core supported functions:
Retrieve SWORD service document
Create a dataset with an Atom entry
Dublin Core Terms (DC Terms) Qualified Mapping - Dataverse DB Element Crosswalk
List datasets in a dataverse
Add files to a dataset with a zip file
Display a dataset atom entry
Display a dataset statement
Delete a file by database id
Replacing metadata for a dataset
Delete a dataset
Determine if a dataverse has been published
Publish a dataverse
Publish a dataset
Complementary Dataverse APIs
Search
Data Access
Data Analysis
Native Harvesting
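To make the "create a dataset with an Atom entry" step more concrete, here is a minimal sketch of an Atom entry carrying Dublin Core Terms metadata. The metadata values and the temporary file name are placeholders, and which dcterms fields a particular Dataverse installation requires is an assumption to check against its documentation. The deposit command is printed rather than executed, so nothing is sent by accident.

```shell
#!/bin/sh
# Write a minimal Atom entry with Dublin Core Terms metadata.
# All metadata values are placeholders chosen for illustration.
ENTRY_FILE="${TMPDIR:-/tmp}/atom-entry-example.xml"

cat > "$ENTRY_FILE" <<'EOF'
<?xml version="1.0"?>
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:dcterms="http://purl.org/dc/terms/">
  <dcterms:title>Replication Data for: Example Article</dcterms:title>
  <dcterms:creator>Doe, Jane</dcterms:creator>
  <dcterms:description>Placeholder description of the dataset.</dcterms:description>
</entry>
EOF

# Print (do not run) a deposit command for review.
# Assumption: the entry is POSTed to the same collection URL used for
# listing datasets; API_TOKEN, HOSTNAME and DATAVERSE_ALIAS are supplied
# by the reader.
echo "curl -u \$API_TOKEN: --data-binary @$ENTRY_FILE" \
     "-H 'Content-Type: application/atom+xml'" \
     "https://\$HOSTNAME/dvn/api/data-deposit/v1.1/swordv2/collection/dataverse/\$DATAVERSE_ALIAS"
```

Keeping the metadata in a standalone Atom file mirrors how a journal system could generate it from fields the author has already entered, which is what avoids re-keying information at deposit time.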
27. REST-ful-ness
List datasets in a dataverse
curl -u $API_TOKEN: https://$HOSTNAME/dvn/api/data-deposit/v1.1/swordv2/collection/dataverse/$DATAVERSE_ALIAS
Add files to a dataset with a zip file
curl -u $API_TOKEN: --data-binary @path/to/example.zip -H "Content-Disposition: filename=example.zip" -H "Content-Type: application/zip" -H "Packaging: http://purl.org/net/sword/package/SimpleZip" https://$HOSTNAME/dvn/api/data-deposit/v1.1/swordv2/edit-media/study/doi:TEST/12345
Display a dataset atom entry
curl -u $API_TOKEN: https://$HOSTNAME/dvn/api/data-deposit/v1.1/swordv2/edit/study/doi:TEST/12345
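The three commands above share a common URL prefix and differ only in the resource path. A small sketch of helper functions that assemble those URLs may make the pattern easier to see. The function names and the host name `demo.example.org` are hypothetical; only the path shapes are taken from the commands above.

```shell
#!/bin/sh
# Hypothetical helpers that assemble the endpoint URLs shown above.
# Host names and identifiers passed in are placeholders.
sword_base() {
    # $1 = host name of the Dataverse installation
    printf 'https://%s/dvn/api/data-deposit/v1.1/swordv2' "$1"
}

collection_url() {   # list datasets in a dataverse: $1 = host, $2 = alias
    printf '%s/collection/dataverse/%s\n' "$(sword_base "$1")" "$2"
}

edit_media_url() {   # add files to a dataset: $1 = host, $2 = identifier
    printf '%s/edit-media/study/%s\n' "$(sword_base "$1")" "$2"
}

edit_url() {         # display a dataset Atom entry: $1 = host, $2 = identifier
    printf '%s/edit/study/%s\n' "$(sword_base "$1")" "$2"
}

# Print the URLs only; no request is made.
collection_url demo.example.org journal-data
edit_media_url demo.example.org doi:TEST/12345
edit_url demo.example.org doi:TEST/12345
```

Each resource is addressed by a stable URL and acted on with standard HTTP verbs, which is the sense in which the interface is REST-ful.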
28. Results
First complete journal + data publishing workflow
Successful integration of two major OSS systems
Leverages a standard, open protocol; plugin architecture
Released and supported by existing development communities
30. Policy Changes Since We Started
Preregistration Requirements
Data Sharing Requirements
Open Access Requirements
Data Citation Requirements
31. Evaluating Current Open Journal Policies
Self-selected sample of OJS publishers
Random samples of OJS journals and DOAJ journals
Coding of data sharing policy by strength
32. Substantial Interest
(In a self-selected sample)
>200 OJS Journals
95% -- Data citation is important
75% -- Data sharing is important
72% -- Replicability is important
33. Limited Adoption of Data Policies in Open Access Journals
35. Future Integrations
PKP
Push into Archivematica (experimental)
Deposit into other SWORD endpoints
Dataverse
Accept deposits from OSF
Accept deposits from other SWORD suppliers
36. Additional References
● Crosas M. "A Data Sharing Story." Journal of
eScience Librarianship. 1(3), 173-179. 2013.
● Crosas, M. "The Dataverse Network™: An Open-
Source Application for Sharing, Discovering and
Preserving Data," D-lib Magazine 17(1/2). 2011.
● Willinsky, J. "Open Journal Systems: An example of
open source software for journal management and
publishing." Library Hi-Tech 23 (4), 504-519. 2005.
38. Creative Commons License
This work, Managing Confidential
Information in Research, by Micah Altman
(http://redistricting.info), is licensed under
the Creative Commons Attribution-Share
Alike 3.0 United States License. To view a
copy of this license, visit
http://creativecommons.org/licenses/by-sa/3.0/us/
or send a letter to Creative
Commons, 171 Second Street, Suite 300,
San Francisco, California, 94105, USA.