Open Science is a movement to make scientific research, its data, and its dissemination accessible to all levels of society. This movement considers aspects such as Open Access, Open Data, Reproducible Research and Open Software.
Each of these aspects presents particularities that need to be evaluated and discussed by the scientific community so that guidelines can be established to facilitate the dissemination of scientific information.
The great challenge is to establish effective and efficient practices that allow journals to incorporate these demands into their editorial processes, not only making data, software and methods accessible, but also encouraging the community to do so.
Considering these questions, this panel proposes to discuss important aspects of the advancement of research communication. Some of these aspects are included in the SciELO indexing criteria, such as referencing research materials in favor of transparency and reproducibility.
FAIR criteria, concepts and implementation; challenges for the publication of data and methods; institutional policies for open data; adoption of TOP guidelines (Transparency and Openness Promotion); software repositories; thematic areas data repositories.
Jonathan David Crabtree - The Dataverse Community: Supporting Open Science and Reproducibility
SciELO International Conference 2018
Director of Cyberinfrastructure
Founded in 1924, the Odum Institute provides core research infrastructure for the social
sciences to support the research, teaching, and service mission of UNC. We define social
science broadly to include the health sciences, and we serve faculty and students from every
corner of UNC’s campus.
Home of the Lou Harris Data Center and the UNC Dataverse
An ongoing 12-year collaboration around repository solutions and tools
Partnering on projects to promote data sharing and publication
Leading efforts to promote Open and Reproducible Science
An open-source platform to share and archive data
Developed at Harvard’s Institute for Quantitative Social Science since 2006
Gives credit and control to data authors and producers
Builds a community to define standards and best practices and foster new
research in data sharing and research reproducibility
Has brought data publishing into the hands of data authors
๏ Data Citation with global persistent IDs:
  - DOIs generated automatically
  - attribution to data authors and the repository
  - registration with DataCite
๏ Rich Metadata:
  - citation metadata
  - domain-specific descriptive metadata
  - variable and file metadata (extracted automatically)
๏ Access and usage controls:
  - open data by default, with a CC0 waiver
  - data can be restricted, but citation & metadata remain publicly accessible
๏ APIs and standards:
  - SWORD, OAI-PMH, and a native API to search and retrieve data and metadata
  - Dublin Core and DDI
  - PROV ontology standard to capture the provenance of a dataset (coming soon)
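The endpoints above can be exercised over plain HTTP. A minimal sketch of building request URLs for the native search API, the dataset-metadata API, and the OAI-PMH harvesting endpoint, using only the Python standard library; the hostname (demo.dataverse.org) and the DOI are illustrative placeholders, not real records:

```python
# Sketch: constructing URLs for a Dataverse installation's public APIs.
# The host and DOI below are placeholders for illustration only.
import urllib.parse

HOST = "https://demo.dataverse.org"  # assumed installation; substitute your own

def search_url(query, item_type="dataset"):
    """Native search API: full-text search across an installation."""
    params = urllib.parse.urlencode({"q": query, "type": item_type})
    return f"{HOST}/api/search?{params}"

def dataset_url(doi):
    """Dataset metadata (citation + domain metadata) by persistent ID."""
    params = urllib.parse.urlencode({"persistentId": f"doi:{doi}"})
    return f"{HOST}/api/datasets/:persistentId?{params}"

def oai_pmh_url(metadata_prefix="oai_dc"):
    """OAI-PMH harvesting endpoint (here requesting Dublin Core records)."""
    params = urllib.parse.urlencode(
        {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    )
    return f"{HOST}/oai?{params}"

print(search_url("reproducibility"))
print(dataset_url("10.5072/FK2/EXAMPLE"))
print(oai_pmh_url())
```

Because citation and metadata stay publicly accessible even for restricted data, these read-only calls need no API token; authenticated operations (deposit via SWORD, file upload) would add a key header.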
Has grown to 33 installations around the world
Thousands of scientific studies archived
Across many disciplines
Supporting many metadata standards
The Global Dataverse Community Consortium (GDCC) is dedicated to
providing international organization to existing Dataverse community
efforts, and will provide a collaborative venue for institutions to leverage
economies of scale in support of Dataverse repositories around the world.
But are these shared data reusable?
For this, we need well-documented, well-organized data and code, as well as
tools that facilitate replication and reuse.
More than 50% of the top 50 journals in political science have data policies that either encourage or require sharing the data associated with a publication.
Crosas, Gautier, Karcher, Kirilova, Otalora, Schwartz, 2018. Data Policies of Highly-Ranked Social Science Journals
With funding from the Sloan Foundation, our organizations plan to address data reuse and reproducibility by:
– Improving curation through educational materials, a user-friendly interface, and services
– Integrating replication tools with Dataverse repositories:
• Encapsulator to pack your data and code in a self-contained, documented capsule (IQSS Harvard)
• Code Ocean to easily run scientific code online (IQSS Harvard)
• CoRe2 to connect systems in order to streamline the verification workflow (Odum Institute)
Linking Tools to Promote
Support for this research was provided by the
Alfred P. Sloan Foundation (2018-11121). The
views expressed here do not necessarily reflect
the views of the Foundation.
Manuscript Publication & Data Curation + Verification
Given current constraints and the need for iterative review, data curation and
successful verification of a replication package for a single manuscript requires
six hours of labor on average.
COMPUTATION COORDINATION ADMINISTRATION
Promote and support computational reproducibility by
integrating and streamlining manuscript publication and
data curation + verification workflows
● Facilitate access to and adoption of tools and platforms that support reproducibility
● Coordinate manuscript submission and data curation + verification
workflow processes across key stakeholders
● Promote the adoption of standards and best practices for data access and
transparency as part of normative research practice.