Integrating Open Data into Open Access Journals
Micah Altman
Director of Research
MIT Libraries
Prepared for
Program on Information Science Brown Bag Series
MIT
October 2015
Roadmap
Motivation
• Reproducibility
Intervention
• Integrating journal and
data publication workflow
Future
• Changing policies & uses
Credits & Disclaimers
DISCLAIMER
These opinions are my own; they are not the
opinions of MIT, Brookings, any of the project
funders, nor (with the exception of co-authored
previously published work) my collaborators
Secondary disclaimer:
“It’s tough to make predictions, especially about
the future!”
-- Attributed to Woody Allen, Yogi Berra, Niels Bohr, Vint Cerf, Winston
Churchill, Confucius, Disreali [sic], Freeman Dyson, Cecil B. Demille, Albert
Einstein, Enrico Fermi, Edgar R. Fiedler, Bob Fourer, Sam Goldwyn, Allan
Lamport, Groucho Marx, Dan Quayle, George Bernard Shaw, Casey Stengel,
Will Rogers, M. Taub, Mark Twain, Kerr L. White, etc.
Collaborators & Co-Conspirators
• IQSS, Harvard: Eleni Castro, Mercè Crosas, Phil Durbin
• PKP Project: Alex Garnett, Jenn Whitney
Research Support
• Supported by the Sloan Foundation
Related Work
Project Website
projects.iq.harvard.edu/ojs-dvn
Related publications:
(Reprints available from: informatics.mit.edu)
• Altman M, Castro E, Crosas M, Durbin P, Garnett A, Whitney J. Open Journal Systems and Dataverse Integration-- Helping Journals to Upgrade Data Publication for Reusable Research. Code4Lib Journal. Forthcoming.
• Altman M, Avery M. Information wants someone else to pay for it: laws of information economics and scholarly publishing. Information Services and Use. 2015;35(1-2):57-70.
• Altman M, Borgman C, Crosas M, Martone M. An Introduction to the Joint Principles for Data Citation. Bulletin of the Association for Information Science and Technology [Internet]. 2015;41(3):43-44.
• Brand A, Allen L, Altman M, Hlava M, Scott J. Beyond authorship: attribution, contribution, collaboration, and credit. Learned Publishing. 2015;28(2):151-155.
Concerns for Reliable Science
New Initiatives to Improve Scientific Reliability
• Retraction monitoring
• Data citation
• Clinical trial preregistration
• Registered replication
• Open data
• Badges
What can go wrong?
Misconduct & Lies
Irreproducible Results
• Many journals have no replication policy
• Even in journals with a clear policy, the success rate is low
The File Drawer Problem
Figure: Daniel Shechtman's lab notebook, providing initial evidence of quasicrystals
• Null results are less likely to be published → published results as a whole are biased toward positive findings
• Outliers are routinely discarded → unexpected patterns of evidence across studies remain hidden
Potential Interventions
Trustworthy Science
• Access to Scholarly Record
• Reproducible processes
• Attribution & Provenance
• Management and Governance of the Evidence Base
• Measurement and Evaluation
The Project
Citation to Data
Citation to Article
• Technical Integration
• Socio-Technical Intervention
Integrating Open Journals and Data Publishing
• Who? Address the needs of journal publishers and editors.
• What? Enable journals to seamlessly manage the submission, review, citation, and publication of data associated with published articles.
• How? Integrate existing technologies and workflows; promote adoption through outreach and involvement.
• Why? Increase replicability of science, facilitate peer review of data, promote long-term access to the scientific evidence base.
Introduction to Dataverse
Provides incentives for researchers to share:
• Recognition & credit via data citations
• Control over data & branding
• Fulfill journal data availability and funder requirements
Software framework for publishing, citing and preserving research data
(open source on GitHub for others to install)
Harvard Dataverse (open to all; repository instance at Harvard) has:
• 1,290 Dataverses
• 59,346 Datasets
• 248,351 Files
• > 1 Million Downloads
Open Journal Systems (OJS)
Open source journal management and publishing system created by the Public Knowledge Project (PKP) to expand & improve access to research.
About OJS
• OJS software hosts almost 10,000 active journals
  • Used on all continents
  • Particularly popular in developing countries
• OJS Model
  • Open Access Publication
  • Open Software
• Services
  • Journal hosting
  • Crossref integration
  • PLOS Article Level Metrics
  • LOCKSS Integration
• OJS is part of a suite of products for automating workflow:
  • Journal Publishing
  • Monograph Publishing
  • Conference Hosting
Integrating Workflow Across the Lifecycle
Actor Roles
• The Author submits their article and research data to the journal's OJS article submission system. (The article and data do not have to be submitted at the same time. Authors can also submit data at a later time, or they can just provide a persistent link with a data citation pointing to the repository that currently holds their data.)
• Editors and/or Peer Reviewers review the article and data.
• If the article and corresponding research data are approved for publication, the Authors' research data and corresponding metadata are automatically deposited from OJS into the Dataverse through the API. No redundant information need be entered. A permanent identifier (DOI) will be automatically included that allows the data to be cited and tracked. There will be a data citation included in the journal article page in OJS (and ideally within the Reference section of the article), enabling readers of the article to quickly access the data.
• The Dataverse stores the dataset metadata and files (including raw data, documentation, code, etc.). There will also be a permanent publication citation link within the Dataverse for researchers to access the article in OJS that corresponds to this research data.
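The hand-off between roles described above can be summarized as a small decision table. The following is only an illustrative sketch: the function name `data_step` and the decision labels are hypothetical and are not OJS or Dataverse identifiers.

```shell
#!/usr/bin/env bash
# Illustrative sketch of the workflow above; names are hypothetical.
# data_step DECISION DATA_SUBMITTED -> prints what happens to the data.

data_step() {
  local decision="$1" data_submitted="$2"
  case "$decision/$data_submitted" in
    accept/yes) echo "deposit data and metadata into Dataverse; show data citation on article page" ;;
    accept/no)  echo "publish article; author may add data or a data citation later" ;;
    *)          echo "no deposit" ;;
  esac
}

data_step accept yes
data_step accept no
data_step decline yes
```

The point of the sketch is that deposit is triggered by the editorial decision, not by the author's initial upload, so nothing reaches the repository before review.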
Developing a Data Submission & Review Workflow
OJS Plugin Architecture
• Plugins extend back-end functionality and the user interface
• Data publication is now part of the OJS distribution
• Can target any Dataverse repository
Author Submission
• Extends supplementary file submission
• Can provide extended metadata
• Can provide data citation
Data Publication
• Data published through Dataverse
• Provides on-line exploration, reformatting, etc.
• Linked through citations, DOIs and author IDs
• Data updates managed in repository
Integration Through SWORD
• Full SWORD 2 deposit interface. The core supported functions:
  • Retrieve SWORD service document
  • Create a dataset with an Atom entry
    • Dublin Core Terms (DC Terms) Qualified Mapping - Dataverse DB Element Crosswalk
  • List datasets in a dataverse
  • Add files to a dataset with a zip file
  • Display a dataset Atom entry
  • Display a dataset statement
  • Delete a file by database id
  • Replace metadata for a dataset
  • Delete a dataset
  • Determine if a dataverse has been published
  • Publish a dataverse
  • Publish a dataset
• Complementary Dataverse APIs
  • Search
  • Data Access
  • Data Analysis
  • Native Harvesting
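To make the "create a dataset" function more concrete, here is a minimal sketch of an Atom entry carrying Dublin Core Terms metadata. The title, creator, and description values are placeholders, and the temporary file path is arbitrary. The deposit command is only printed, not executed. Its `Content-Type` header and collection URL are assumptions modeled on the URL pattern used for listing datasets, so check them against the Dataverse SWORD documentation before use.

```shell
#!/usr/bin/env bash
# Sketch only: writes a minimal Atom entry with DC Terms metadata and
# prints (does not run) a deposit command. All metadata values are placeholders.

entry_file="$(mktemp "${TMPDIR:-/tmp}/atom-entry.XXXXXX")"

cat > "$entry_file" <<'EOF'
<?xml version="1.0"?>
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:dcterms="http://purl.org/dc/terms/">
  <dcterms:title>Replication data for: Example Article</dcterms:title>
  <dcterms:creator>Author, Example</dcterms:creator>
  <dcterms:description>Data and code underlying the article.</dcterms:description>
</entry>
EOF

# Assumed request shape: POST the entry to the dataverse's collection URL.
printf '%s\n' "curl -u \$API_TOKEN: --data-binary @$entry_file -H \"Content-Type: application/atom+xml\" https://\$HOSTNAME/dvn/api/data-deposit/v1.1/swordv2/collection/dataverse/\$DATAVERSE_ALIAS"
```

Each `dcterms:` element is what the crosswalk maps onto a Dataverse metadata field, which is why the journal system does not need to re-enter information it already holds about the article.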
REST-ful-ness
List datasets in a dataverse
curl -u $API_TOKEN: https://$HOSTNAME/dvn/api/data-deposit/v1.1/swordv2/collection/dataverse/$DATAVERSE_ALIAS
Add files to a dataset with a zip file
curl -u $API_TOKEN: --data-binary @path/to/example.zip \
  -H "Content-Disposition: filename=example.zip" \
  -H "Content-Type: application/zip" \
  -H "Packaging: http://purl.org/net/sword/package/SimpleZip" \
  https://$HOSTNAME/dvn/api/data-deposit/v1.1/swordv2/edit-media/study/doi:TEST/12345
Display a dataset Atom entry
curl -u $API_TOKEN: https://$HOSTNAME/dvn/api/data-deposit/v1.1/swordv2/edit/study/doi:TEST/12345
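The three commands above share one URL prefix and differ only in the final path. A small helper makes that pattern explicit. This is a sketch: `sword_url`, `DV_HOST`, the host `demo.example.org`, and the alias `myjournal` are hypothetical names introduced here. A separate host variable is used because bash normally sets `HOSTNAME` to the local machine's name, so it would need to be overridden before running the commands as written.

```shell
#!/usr/bin/env bash
# Hypothetical helper: composes SWORD v2 URLs following the pattern in the
# commands above. DV_HOST and the example host are assumed names.

sword_url() {
  local path="$1"
  printf 'https://%s/dvn/api/data-deposit/v1.1/swordv2/%s\n' \
    "${DV_HOST:-demo.example.org}" "$path"
}

DV_HOST="demo.example.org"
sword_url "collection/dataverse/myjournal"    # list datasets in a dataverse
sword_url "edit-media/study/doi:TEST/12345"   # add files to a dataset
sword_url "edit/study/doi:TEST/12345"         # display a dataset Atom entry
```

The resource type (`collection`, `edit-media`, `edit`) selects the operation's target, while the HTTP method and headers passed to curl select what is done to it.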
Results
• First complete journal + data publishing workflow
• Successful integration of two major OSS systems
• Leverages a standard, open protocol; plugin architecture
• Released and supported by existing development communities
The Future
Policy Changes Since We Started
• Preregistration Requirements
• Data Sharing Requirements
• Open Access Requirements
• Data Citation Requirements
Evaluating Current Open Journal Policies
• Self-selected sample of OJS publishers
• Random samples of
  • OJS journals
  • DOAJ journals
• Coding of data sharing policy by strength
Substantial Interest
(In a self-selected sample)
• >200 OJS Journals
• 95% -- Data citation is important
• 75% -- Data sharing is important
• 72% -- Replicability is important
Limited Adoption of Data Policies in Open Access Journals
Comparison to Other Fields
Future Integrations
• PKP
  • Push into Archivematica (experimental)
  • Deposit into other SWORD endpoints
• Dataverse
  • Accept deposits from OSF
  • Accept deposits from other SWORD suppliers
Additional References
● Crosas M. "A Data Sharing Story." Journal of
eScience Librarianship. 1(3), 173-179. 2013.
● Crosas, M. "The Dataverse Network™: An Open-
Source Application for Sharing, Discovering and
Preserving Data," D-lib Magazine 17(1/2). 2011.
● Willinsky, J. "Open Journal Systems: An example of
open source software for journal management and
publishing." Library Hi-Tech 23 (4), 504-519. 2005.
Questions?
E-mail: escience@mit.edu
Web: informatics.mit.edu
Creative Commons License
This work. Managing Confidential
information in research, by Micah Altman
(http://redistricting.info) is licensed under
the Creative Commons Attribution-Share
Alike 3.0 United States License. To view a
copy of this license, visit
http://creativecommons.org/licenses/by-sa/3.0/us/
or send a letter to Creative
Commons, 171 Second Street, Suite 300,
San Francisco, California, 94105, USA.
Integrating Open Data into Open Access Journals

More Related Content

What's hot

Big Data & Privacy -- Response to White House OSTP
Big Data & Privacy -- Response to White House OSTPBig Data & Privacy -- Response to White House OSTP
Big Data & Privacy -- Response to White House OSTP
Micah Altman
 
Data Science, Data Curation, and Human-Data Interaction
Data Science, Data Curation, and Human-Data InteractionData Science, Data Curation, and Human-Data Interaction
Data Science, Data Curation, and Human-Data Interaction
University of Washington
 

What's hot (20)

Executing the Research Paper
Executing the Research PaperExecuting the Research Paper
Executing the Research Paper
 
June2014 brownbag privacy
June2014 brownbag privacyJune2014 brownbag privacy
June2014 brownbag privacy
 
State of the Art Informatics for Research Reproducibility, Reliability, and...
 State of the Art  Informatics for Research Reproducibility, Reliability, and... State of the Art  Informatics for Research Reproducibility, Reliability, and...
State of the Art Informatics for Research Reproducibility, Reliability, and...
 
The NIH as a Digital Enterprise: Implications for PAG
The NIH as a Digital Enterprise: Implications for PAGThe NIH as a Digital Enterprise: Implications for PAG
The NIH as a Digital Enterprise: Implications for PAG
 
DataONE Education Module 08: Data Citation
DataONE Education Module 08: Data CitationDataONE Education Module 08: Data Citation
DataONE Education Module 08: Data Citation
 
Managing confidential data
Managing confidential dataManaging confidential data
Managing confidential data
 
Big Data & Privacy -- Response to White House OSTP
Big Data & Privacy -- Response to White House OSTPBig Data & Privacy -- Response to White House OSTP
Big Data & Privacy -- Response to White House OSTP
 
Comments to FTC on Mobile Data Privacy
Comments to FTC on Mobile Data PrivacyComments to FTC on Mobile Data Privacy
Comments to FTC on Mobile Data Privacy
 
Data Science, Data Curation, and Human-Data Interaction
Data Science, Data Curation, and Human-Data InteractionData Science, Data Curation, and Human-Data Interaction
Data Science, Data Curation, and Human-Data Interaction
 
DataONE Education Module 10: Legal and Policy Issues
DataONE Education Module 10: Legal and Policy IssuesDataONE Education Module 10: Legal and Policy Issues
DataONE Education Module 10: Legal and Policy Issues
 
McGeary Data Curation Network: Developing and Scaling
McGeary Data Curation Network: Developing and ScalingMcGeary Data Curation Network: Developing and Scaling
McGeary Data Curation Network: Developing and Scaling
 
Privacy tool osha comments
Privacy tool osha commentsPrivacy tool osha comments
Privacy tool osha comments
 
DataONE Education Module 03: Data Management Planning
DataONE Education Module 03: Data Management PlanningDataONE Education Module 03: Data Management Planning
DataONE Education Module 03: Data Management Planning
 
DataONE Education Module 02: Data Sharing
DataONE Education Module 02: Data SharingDataONE Education Module 02: Data Sharing
DataONE Education Module 02: Data Sharing
 
Nicole Nogoy: GigaScience...how licensing can change the way we do research
Nicole Nogoy: GigaScience...how licensing can change the way we do researchNicole Nogoy: GigaScience...how licensing can change the way we do research
Nicole Nogoy: GigaScience...how licensing can change the way we do research
 
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
 
July IAP: Confidential Information - Storage, Sharing, & Publication - with M...
July IAP: Confidential Information - Storage, Sharing, & Publication - with M...July IAP: Confidential Information - Storage, Sharing, & Publication - with M...
July IAP: Confidential Information - Storage, Sharing, & Publication - with M...
 
Meeting Federal Research Requirements
Meeting Federal Research RequirementsMeeting Federal Research Requirements
Meeting Federal Research Requirements
 
ICPSR Data Exploration Tools
ICPSR Data Exploration ToolsICPSR Data Exploration Tools
ICPSR Data Exploration Tools
 
Case Study Big Data: Socio-Technical Issues of HathiTrust Digital Texts
Case Study Big Data: Socio-Technical Issues of HathiTrust Digital TextsCase Study Big Data: Socio-Technical Issues of HathiTrust Digital Texts
Case Study Big Data: Socio-Technical Issues of HathiTrust Digital Texts
 

Similar to BROWN BAG TALK WITH MICAH ALTMAN INTEGRATING OPEN DATA INTO OPEN ACCESS JOURNALS

Mtsr2015 goble-keynote
Mtsr2015 goble-keynoteMtsr2015 goble-keynote
Mtsr2015 goble-keynote
Carole Goble
 

Similar to BROWN BAG TALK WITH MICAH ALTMAN INTEGRATING OPEN DATA INTO OPEN ACCESS JOURNALS (20)

Data Publishing Workflows with Dataverse
Data Publishing Workflows with DataverseData Publishing Workflows with Dataverse
Data Publishing Workflows with Dataverse
 
Why would a publisher care about open data?
Why would a publisher care about open data?Why would a publisher care about open data?
Why would a publisher care about open data?
 
Mtsr2015 goble-keynote
Mtsr2015 goble-keynoteMtsr2015 goble-keynote
Mtsr2015 goble-keynote
 
Open Science: Research Data Management
Open Science: Research Data ManagementOpen Science: Research Data Management
Open Science: Research Data Management
 
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
 
Mduke sagecite-jisc-march11
Mduke sagecite-jisc-march11Mduke sagecite-jisc-march11
Mduke sagecite-jisc-march11
 
How can we ensure research data is re-usable? The role of Publishers in Resea...
How can we ensure research data is re-usable? The role of Publishers in Resea...How can we ensure research data is re-usable? The role of Publishers in Resea...
How can we ensure research data is re-usable? The role of Publishers in Resea...
 
Mendeley Data: Enhancing Data Discovery, Sharing and Reuse
Mendeley Data: Enhancing Data Discovery, Sharing and ReuseMendeley Data: Enhancing Data Discovery, Sharing and Reuse
Mendeley Data: Enhancing Data Discovery, Sharing and Reuse
 
Crediting informatics and data folks in life science teams
Crediting informatics and data folks in life science teamsCrediting informatics and data folks in life science teams
Crediting informatics and data folks in life science teams
 
Wilson-npg-scientific data-nfdp13
Wilson-npg-scientific data-nfdp13Wilson-npg-scientific data-nfdp13
Wilson-npg-scientific data-nfdp13
 
THOR Workshop - Data Publishing Elsevier
THOR Workshop - Data Publishing ElsevierTHOR Workshop - Data Publishing Elsevier
THOR Workshop - Data Publishing Elsevier
 
David Shotton - Research Integrity: Integrity of the published record
David Shotton - Research Integrity: Integrity of the published recordDavid Shotton - Research Integrity: Integrity of the published record
David Shotton - Research Integrity: Integrity of the published record
 
Managing, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital EnvironmentManaging, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital Environment
 
Perspectives on the Role of Trustworthy Repository Standards in Data Journal ...
Perspectives on the Role of Trustworthy Repository Standards in Data Journal ...Perspectives on the Role of Trustworthy Repository Standards in Data Journal ...
Perspectives on the Role of Trustworthy Repository Standards in Data Journal ...
 
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific DataNIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
 
Oxford DTP - Sansone - Data publications and Scientific Data - Dec 2014
Oxford DTP - Sansone - Data publications and Scientific Data - Dec 2014Oxford DTP - Sansone - Data publications and Scientific Data - Dec 2014
Oxford DTP - Sansone - Data publications and Scientific Data - Dec 2014
 
Data publishing at the UQ Library
Data publishing at the UQ LibraryData publishing at the UQ Library
Data publishing at the UQ Library
 
A Clean Slate?
A Clean Slate?A Clean Slate?
A Clean Slate?
 
Scientific Data and peer review session at Dryad event, May 2015
Scientific Data and peer review session at Dryad event, May 2015 Scientific Data and peer review session at Dryad event, May 2015
Scientific Data and peer review session at Dryad event, May 2015
 
Networked Science, And Integrating with Dataverse
Networked Science, And Integrating with DataverseNetworked Science, And Integrating with Dataverse
Networked Science, And Integrating with Dataverse
 

More from Micah Altman

SAFETY NETS: RESCUE AND REVIVAL FOR ENDANGERED BORN-DIGITAL RECORDS- Program ...
SAFETY NETS: RESCUE AND REVIVAL FOR ENDANGERED BORN-DIGITAL RECORDS- Program ...SAFETY NETS: RESCUE AND REVIVAL FOR ENDANGERED BORN-DIGITAL RECORDS- Program ...
SAFETY NETS: RESCUE AND REVIVAL FOR ENDANGERED BORN-DIGITAL RECORDS- Program ...
Micah Altman
 
Creative Data Literacy: Bridging the Gap Between Data-Haves and Have-Nots
Creative Data Literacy: Bridging the Gap Between Data-Haves and Have-NotsCreative Data Literacy: Bridging the Gap Between Data-Haves and Have-Nots
Creative Data Literacy: Bridging the Gap Between Data-Haves and Have-Nots
Micah Altman
 
SOLARSPELL: THE SOLAR POWERED EDUCATIONAL LEARNING LIBRARY - EXPERIENTIAL LEA...
SOLARSPELL: THE SOLAR POWERED EDUCATIONAL LEARNING LIBRARY - EXPERIENTIAL LEA...SOLARSPELL: THE SOLAR POWERED EDUCATIONAL LEARNING LIBRARY - EXPERIENTIAL LEA...
SOLARSPELL: THE SOLAR POWERED EDUCATIONAL LEARNING LIBRARY - EXPERIENTIAL LEA...
Micah Altman
 
Making Decisions in a World Awash in Data: We’re going to need a different bo...
Making Decisions in a World Awash in Data: We’re going to need a different bo...Making Decisions in a World Awash in Data: We’re going to need a different bo...
Making Decisions in a World Awash in Data: We’re going to need a different bo...
Micah Altman
 

More from Micah Altman (20)

Selecting efficient and reliable preservation strategies
Selecting efficient and reliable preservation strategiesSelecting efficient and reliable preservation strategies
Selecting efficient and reliable preservation strategies
 
Well-Being - A Sunset Conversation
Well-Being - A Sunset ConversationWell-Being - A Sunset Conversation
Well-Being - A Sunset Conversation
 
Matching Uses and Protections for Government Data Releases: Presentation at t...
Matching Uses and Protections for Government Data Releases: Presentation at t...Matching Uses and Protections for Government Data Releases: Presentation at t...
Matching Uses and Protections for Government Data Releases: Presentation at t...
 
Privacy Gaps in Mediated Library Services: Presentation at NERCOMP2019
Privacy Gaps in Mediated Library Services: Presentation at NERCOMP2019Privacy Gaps in Mediated Library Services: Presentation at NERCOMP2019
Privacy Gaps in Mediated Library Services: Presentation at NERCOMP2019
 
Well-being A Sunset Conversation
Well-being A Sunset ConversationWell-being A Sunset Conversation
Well-being A Sunset Conversation
 
Can We Fix Peer Review
Can We Fix Peer ReviewCan We Fix Peer Review
Can We Fix Peer Review
 
Academy Owned Peer Review
Academy Owned Peer ReviewAcademy Owned Peer Review
Academy Owned Peer Review
 
Redistricting in the US -- An Overview
Redistricting in the US -- An OverviewRedistricting in the US -- An Overview
Redistricting in the US -- An Overview
 
A Future for Electoral Districting
A Future for Electoral DistrictingA Future for Electoral Districting
A Future for Electoral Districting
 
A History of the Internet :Scott Bradner’s Program on Information Science Talk
A History of the Internet :Scott Bradner’s Program on Information Science Talk  A History of the Internet :Scott Bradner’s Program on Information Science Talk
A History of the Internet :Scott Bradner’s Program on Information Science Talk
 
SAFETY NETS: RESCUE AND REVIVAL FOR ENDANGERED BORN-DIGITAL RECORDS- Program ...
SAFETY NETS: RESCUE AND REVIVAL FOR ENDANGERED BORN-DIGITAL RECORDS- Program ...SAFETY NETS: RESCUE AND REVIVAL FOR ENDANGERED BORN-DIGITAL RECORDS- Program ...
SAFETY NETS: RESCUE AND REVIVAL FOR ENDANGERED BORN-DIGITAL RECORDS- Program ...
 
Labor And Reward In Science: Commentary on Cassidy Sugimoto’s Program on Info...
Labor And Reward In Science: Commentary on Cassidy Sugimoto’s Program on Info...Labor And Reward In Science: Commentary on Cassidy Sugimoto’s Program on Info...
Labor And Reward In Science: Commentary on Cassidy Sugimoto’s Program on Info...
 
Utilizing VR and AR in the Library Space:
Utilizing VR and AR in the Library Space:Utilizing VR and AR in the Library Space:
Utilizing VR and AR in the Library Space:
 
Creative Data Literacy: Bridging the Gap Between Data-Haves and Have-Nots
Creative Data Literacy: Bridging the Gap Between Data-Haves and Have-NotsCreative Data Literacy: Bridging the Gap Between Data-Haves and Have-Nots
Creative Data Literacy: Bridging the Gap Between Data-Haves and Have-Nots
 
SOLARSPELL: THE SOLAR POWERED EDUCATIONAL LEARNING LIBRARY - EXPERIENTIAL LEA...
SOLARSPELL: THE SOLAR POWERED EDUCATIONAL LEARNING LIBRARY - EXPERIENTIAL LEA...SOLARSPELL: THE SOLAR POWERED EDUCATIONAL LEARNING LIBRARY - EXPERIENTIAL LEA...
SOLARSPELL: THE SOLAR POWERED EDUCATIONAL LEARNING LIBRARY - EXPERIENTIAL LEA...
 
Ndsa 2016 opening plenary
Ndsa 2016 opening plenaryNdsa 2016 opening plenary
Ndsa 2016 opening plenary
 
Making Decisions in a World Awash in Data: We’re going to need a different bo...
Making Decisions in a World Awash in Data: We’re going to need a different bo...Making Decisions in a World Awash in Data: We’re going to need a different bo...
Making Decisions in a World Awash in Data: We’re going to need a different bo...
 
Software Repositories for Research-- An Environmental Scan
Software Repositories for Research-- An Environmental ScanSoftware Repositories for Research-- An Environmental Scan
Software Repositories for Research-- An Environmental Scan
 
The Open Access Network: Rebecca Kennison’s Talk for the MIT Prorgam on Infor...
The Open Access Network: Rebecca Kennison’s Talk for the MIT Prorgam on Infor...The Open Access Network: Rebecca Kennison’s Talk for the MIT Prorgam on Infor...
The Open Access Network: Rebecca Kennison’s Talk for the MIT Prorgam on Infor...
 
Gary Price, MIT Program on Information Science
Gary Price, MIT Program on Information ScienceGary Price, MIT Program on Information Science
Gary Price, MIT Program on Information Science
 

Recently uploaded

Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 

Recently uploaded (20)

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery

BROWN BAG TALK WITH MICAH ALTMAN INTEGRATING OPEN DATA INTO OPEN ACCESS JOURNALS

  • 1. Integrating Open Data into Open Access Journals
  • 2. Integrating Open Data into Open Access Journals Micah Altman Director of Research MIT Libraries Prepared for Program on Information Science Brown Bag Series MIT October 2015
  • 3. Roadmap: Motivation (reproducibility); Intervention (integrating journal and data publication workflow); Future (changing policies & uses)
  • 4. Credits & Disclaimers
  • 5. DISCLAIMER: These opinions are my own; they are not the opinions of MIT, Brookings, any of the project funders, nor (with the exception of co-authored previously published work) my collaborators. Secondary disclaimer: “It’s tough to make predictions, especially about the future!” (attributed to Woody Allen, Yogi Berra, Niels Bohr, Vint Cerf, Winston Churchill, Confucius, Disreali [sic], Freeman Dyson, Cecil B. Demille, Albert Einstein, Enrico Fermi, Edgar R. Fiedler, Bob Fourer, Sam Goldwyn, Allan Lamport, Groucho Marx, Dan Quayle, George Bernard Shaw, Casey Stengel, Will Rogers, M. Taub, Mark Twain, Kerr L. White, etc.)
  • 6. Collaborators & Co-Conspirators. IQSS, Harvard: Eleni Castro, Mercè Crosas, Phil Durbin. PKP Project: Alex Garnett, Jenn Whitney. Research support: supported by the Sloan Foundation.
  • 7. Related Work. Project website: projects.iq.harvard.edu/ojs-dvn. Related publications (reprints available from informatics.mit.edu): Altman M, Castro E, Crosas M, Durbin P, Garnett A, Whitney J. Open Journal Systems and Dataverse Integration-- Helping Journals to Upgrade Data Publication for Reusable Research. Code4Lib Journal. Forthcoming. Altman M, Avery M. Information wants someone else to pay for it: laws of information economics and scholarly publishing. Information Services and Use. 2015;35(1-2):57-70. Altman M, Borgman C, Crosas M, Martone M. An Introduction to the Joint Principles for Data Citation. Bulletin of the Association for Information Science and Technology [Internet]. 2015;41(3):43-44. Brand A, Allen L, Altman M, Hlava M, Scott J. Beyond authorship: attribution, contribution, collaboration, and credit. Learned Publishing. 2015;28(2):151-155.
  • 8. Concerns for Reliable Science
  • 9. New Initiatives to Improve Scientific Reliability: retraction monitoring; data citation; clinical trial preregistration; registered replication; open data; badges
  • 10. What can go wrong?
  • 11. Misconduct & Lies
  • 12. Irreproducible Results: Many journals have no replication policy. Even in journals with a clear policy, the success rate is low.
  • 13. The File Drawer Problem. [Image: Dan Shechtman’s lab notebook providing initial evidence of quasicrystals.] Null results are less likely to be published, so published results as a whole are biased toward positive findings. Outliers are routinely discarded, so unexpected patterns of evidence across studies remain hidden.
  • 14. Potential Interventions: Trustworthy Science; Access to Scholarly Record; Reproducible Processes; Attribution & Provenance; Management and Governance of the Evidence Base; Measurement and Evaluation
  • 15. The Project: Citation to Data; Citation to Article; Technical Integration; Socio-Technical Intervention
  • 16. Integrating Open Journals and Data Publishing. Who? Address the needs of journal publishers and editors. What? Enable journals to seamlessly manage the submission, review, citation, and publication of data associated with published articles. How? Integrate existing technologies and workflows; promote adoption through outreach and involvement. Why? Increase replicability of science, facilitate peer review of data, and promote long-term access to the scientific evidence base.
  • 17. Introduction to Dataverse: Software framework for publishing, citing and preserving research data (open source on GitHub for others to install). Provides incentives for researchers to share: recognition & credit via data citations; control over data & branding; fulfillment of journal data availability and funder requirements. Harvard Dataverse (open to all; repository instance at Harvard) has 1,290 Dataverses, 59,346 Datasets, 248,351 Files, and > 1 Million Downloads.
  • 18. Open Journal System (OJS): Open source journal management and publishing system created by the Public Knowledge Project (PKP) to expand & improve access to research.
  • 19. About OJS. OJS software hosts almost 10,000 active journals; it is used on all continents and is particularly popular in developing countries. OJS model: open access publication; open software; services (journal hosting, Crossref integration, PLOS Article Level Metrics, LOCKSS integration). OJS is part of a suite of products for automating workflow: journal publishing, monograph publishing, conference hosting.
  • 20. Integrating Workflow Across the Lifecycle
  • 21. Actor Roles. The Author submits their article and research data to the journal's OJS article submission system. (Note that the article and data do not have to be submitted at the same time. Authors can also submit data at a later time, or they can just provide a persistent link with a data citation pointing to the repository that their data is currently in.) Editors and/or Peer Reviewers review the article and data. If the article and corresponding research data are approved for publication, the Author's research data and the corresponding metadata are automatically deposited from OJS into the Dataverse through the API. No redundant information need be entered. A permanent identifier (DOI) will be automatically included that allows the data to be cited and tracked. There will be a data citation included in the journal article page in OJS (and ideally within the Reference section of the article), enabling readers of the article to quickly access the data. The Dataverse stores the dataset metadata and files (including raw data, documentation, code, etc.). There will also be a permanent publication citation link within the Dataverse for researchers to access the article in OJS that corresponds to this research data.
  • 22. Developing a Data Submission & Review Workflow
  • 23. OJS Plugin Architecture: Plugins extend back-end functionality and the user interface. Data publication is now part of the OJS distribution. Can target any Dataverse repository.
  • 24. Author Submission: Extends supplementary file submission. Can provide extended metadata. Can provide data citation.
  • 25. Data Publication: Data published through Dataverse. Provides on-line exploration, reformatting, etc. Linked through citations, DOIs and author IDs. Data updates managed in repository.
  • 26. Integration Through SWORD: Full SWORD 2 deposit interface. The core supported functions: retrieve SWORD service document; create a dataset with an Atom entry; Dublin Core Terms (DC Terms) Qualified Mapping - Dataverse DB Element Crosswalk; list datasets in a dataverse; add files to a dataset with a zip file; display a dataset atom entry; display a dataset statement; delete a file by database id; replace metadata for a dataset; delete a dataset; determine if a dataverse has been published; publish a dataverse; publish a dataset. Complementary Dataverse APIs: Search; Data Access; Data Analysis; Native Harvesting.
  • 27. REST-ful-ness
      List datasets in a dataverse:
      curl -u $API_TOKEN: https://$HOSTNAME/dvn/api/data-deposit/v1.1/swordv2/collection/dataverse/$DATAVERSE_ALIAS
      Add files to a dataset with a zip file:
      curl -u $API_TOKEN: --data-binary @path/to/example.zip -H "Content-Disposition: filename=example.zip" -H "Content-Type: application/zip" -H "Packaging: http://purl.org/net/sword/package/SimpleZip" https://$HOSTNAME/dvn/api/data-deposit/v1.1/swordv2/edit-media/study/doi:TEST/12345
      Display a dataset atom entry:
      curl -u $API_TOKEN: https://$HOSTNAME/dvn/api/data-deposit/v1.1/swordv2/edit/study/doi:TEST/12345
  • 28. Results: First complete journal + data publishing workflow. Successful integration of two major OSS systems. Leverages a standard, open protocol and a plugin architecture. Released and supported by existing development communities.
  • 29. The Future
  • 30. Policy Changes Since We Started: preregistration requirements; data sharing requirements; open access requirements; data citation requirements
  • 31. Evaluating Current Open Journal Policies: Self-selected sample of OJS publishers. Random samples of OJS journals and DOAJ journals. Coding of data sharing policy by strength.
  • 32. Substantial Interest (in a self-selected sample): >200 OJS journals. 95%: data citation is important. 75%: data sharing is important. 72%: replicability is important.
  • 33. Limited Adoption of Data Policies in Open Access Journals
  • 34. Comparison to Other Fields
  • 35. Future Integrations. PKP: push into Archivematica (experimental); deposit into other SWORD endpoints. Dataverse: accept deposits from OSF; accept deposits from other SWORD suppliers.
  • 36. Additional References: Crosas M. "A Data Sharing Story." Journal of eScience Librarianship 1(3), 173-179. 2013. Crosas M. "The Dataverse Network™: An Open-Source Application for Sharing, Discovering and Preserving Data." D-Lib Magazine 17(1/2). 2011. Willinsky J. "Open Journal Systems: An example of open source software for journal management and publishing." Library Hi-Tech 23(4), 504-519. 2005.
  • 38. Creative Commons License: This work, Managing Confidential information in research, by Micah Altman (http://redistricting.info), is licensed under the Creative Commons Attribution-Share Alike 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/us/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.
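
The curl calls on slide 27 can also be assembled programmatically, for example by a journal-side script. The following is a minimal sketch, not the plugin's actual code: the function names and the host `demo.example.org` are hypothetical, the URL path and headers are copied from the slide, and the path may differ in other Dataverse versions. The requests are only built, never sent.

```python
import base64
import urllib.request

# Path prefix copied from the slide 27 curl examples (may differ in other versions).
SWORD_BASE = "/dvn/api/data-deposit/v1.1/swordv2"


def sword_url(hostname, *parts):
    """Join the SWORD base path with further path segments."""
    return "https://" + hostname + SWORD_BASE + "/" + "/".join(parts)


def basic_auth_header(api_token):
    """Mirror `curl -u $API_TOKEN:`: token as user name, empty password."""
    raw = (api_token + ":").encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def list_datasets_request(hostname, api_token, dataverse_alias):
    """Build (but do not send) the 'list datasets in a dataverse' request."""
    url = sword_url(hostname, "collection", "dataverse", dataverse_alias)
    headers = {"Authorization": basic_auth_header(api_token)}
    return urllib.request.Request(url, headers=headers)


def add_zip_request(hostname, api_token, doi, zip_bytes, filename):
    """Build (but do not send) the 'add files to a dataset with a zip file' request."""
    url = sword_url(hostname, "edit-media", "study", doi)
    headers = {
        "Authorization": basic_auth_header(api_token),
        "Content-Disposition": "filename=" + filename,
        "Content-Type": "application/zip",
        "Packaging": "http://purl.org/net/sword/package/SimpleZip",
    }
    # curl --data-binary sends a POST, so the method is set explicitly here.
    return urllib.request.Request(url, data=zip_bytes, headers=headers, method="POST")


if __name__ == "__main__":
    # Hypothetical host, token, and alias, used only to show the resulting URL.
    req = list_datasets_request("demo.example.org", "abc", "myjournal")
    print(req.get_method(), req.full_url)
```

Passing either request object to `urllib.request.urlopen` would perform the call against a real server; that step is left out because it needs a live repository and a valid API token.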