The document discusses recommendations from a workshop on peer review of research data. It focuses on three key areas:
1. Connecting data review with data management planning by requiring data sharing plans, ensuring adequate funding for data management, and refusing publication without clear data access.
2. Connecting scientific and technical review with data curation by linking articles and data with versioning, avoiding duplicate review efforts, and addressing issues found in data.
3. Connecting data review with article review by requiring methods/software information, providing review checklists, ensuring data access for reviewers, and permanent dataset identifiers from repositories.
This slideshow was used in a Preparing Your Research Data for the Future course taught in the Medical Sciences Division, University of Oxford, on 2015-06-08. It provides an overview of some key issues, focusing on long-term data management, sharing, and curation.
This slideshow was used at a lunchtime session delivered at the Humanities Division, University of Oxford, on 2014-05-12. It provides a general overview of some key data management topics, plus some pointers on where to find further information.
IDCC Workshop: Analysing DMPs to inform research data services: lessons from..., by Amanda Whitmire
A workshop as part of the International Digital Curation Conference 2016 on DMP development and support. This presentation demonstrates how we can use data management plans as a source of information to better understand researcher data stewardship practices and how to support them. Be sure to see the slide notes to better understand the presentation (most slides are just photos/icons).
This slideshow was used in an Introduction to Research Data Management course for the Social Sciences Division, University of Oxford, on 2015-05-27. It provides an overview of some key issues, looking at both day-to-day data management, and longer term issues, including sharing, and curation.
Workshop - finding and accessing data - Cambridge August 22 2016, by Fiona Nielsen
Finding and accessing human genomic data for research
University of Cambridge, United Kingdom | Seminar Room G
Monday, 22 August 2016 from 10:00 to 12:00 (BST)
Charlotte, Nadia and Fiona presented an overview of data sources around the world where you can find genomics data for your research and gave examples of the data access application for dbGaP and EGA with specific details relevant for University of Cambridge researchers.
Our regular Introduction to Data Management (DM) workshop (90-minutes). Covers very basic DM topics and concepts. Audience is graduate students from all disciplines. Most of the content is in the NOTES FIELD.
Data Publishing Models by Sünje Dallmeier-Tiessen (datascienceiqss)
Data Publishing is becoming an integral part of scholarly communication today. Thus, it is indispensable to understand how data publishing works across disciplines. Are there best practices others can learn from or even data publishing standards? How do they impact interoperability in the Open Science landscape? The presentation will look at a range of examples, and the main building blocks of data publishing today. The work has been conducted as part of the RDA Data Publishing Workflows group.
This slideshow was used in an Introduction to Research Data Management course taught for the Mathematical, Physical and Life Sciences Division, University of Oxford, on 2015-02-09. It provides an overview of some key issues, looking at both day-to-day data management, and longer term issues, including sharing, and curation.
Enabling Precise Identification and Citability of Dynamic Data: Recommendatio..., by LEARN Project
Enabling Precise Identification and Citability of Dynamic Data: Recommendations of the RDA Working Group, by Andreas Rauber – 2nd LEARN Workshop, Vienna, 6th April 2016
Introduction to research data management, by Michael Day
Slides from a presentation given at the JIBS User Group / RLUK joint event "Demystifying research data: don't be scared, be prepared" held at the SOAS Brunei Gallery, London, 17 July 2012.
This presentation was delivered at the Elsevier Library Connect Seminar on 6 October 2014 in Johannesburg, 7 October 2014 in Durban and 9 October 2014 in Cape Town and gives an overview of the potential role that librarians can play in research data management
This presentation was provided by Melissa Levine of the University of Michigan during a NISO Virtual Conference on the topic of data curation, held on Wednesday, August 31, 2016
Managing Ireland's Research Data - 3 Research Methods, by Rebecca Grant
Slides providing an overview of the research methods used in the author's thesis, "Managing Ireland's Research Data: Recognising Roles for Recordkeepers". The methods discussed are online surveys, comparative case studies, and autoethnography.
Licensed as CC-BY.
Preparing your data for sharing and publishing, by Varsha Khodiyar
Talk given as part of the MRC Cognition and Brain Sciences Unit Open Science Day on 20th November 2018, University of Cambridge (https://www.eventbrite.co.uk/e/open-science-day-at-the-mrc-cbu-tickets-50363553745)
Creating a sustainable business model for a digital repository: the Dryad exp..., by ASIS&T
Creating a sustainable business model for a digital repository: the Dryad experience
Peggy Schaeffer
Datadryad.org
Presentation at Research Data Access & Preservation Summit
22 March 2012
Presentation on data sharing that outlines five layers which must be addressed to enable data to be located, obtained, accessed, understood and used, and cited.
Transparency and reproducibility in research, by Louise Corti
Talk given at the ESS Summer School: An introduction to using big data in the social sciences, 20-24 July 2020, University of Essex, Colchester, UK.
In the morning we look at publishing and sharing data and the importance of research replication, code sharing, examining what methodological issues peer reviewers might look for in a published paper using big data. An increasing number of journals in the sciences and social sciences expect a high degree of transparency and knowing how best to publish high quality raw (or processed data), methodology and code is a useful skill. We show how ‘data papers’ help to elucidate how datasets were constructed, compiled and processed, and help to showcase the value of data beyond the original research.
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving..., by Sarah Anna Stewart
Presentation given at the M25 Consortium of Academic Libraries, CPD25 Event on 'The Role of the Library in Supporting Research'. Provides an introduction to data, software and PIDs and a brief look at how libraries can enable researchers to gain impact and credit for their research data and software.
Presentation slides on Open Science and research reproducibility. Presented by Gareth Knight (LSHTM Research Data Manager) on 18th September 2018, as part of an Open Science event for LSHTM Week 2018.
Paper was presented at European Survey Research Association 2013, in the session Research Data Management for Re-use: Bringing Researchers and Archivists closer.
Presentation by Ruth Wilson on Nature Publishing Group's Scientific Data journal given at the Now and Future of Data Publishing Symposium, 22 May 2013, Oxford, UK
An introduction to the FAIR principles and a discussion of key issues that must be addressed to ensure data is findable, accessible, interoperable and re-usable. The session explored the role of the CDISC and DDI standards for addressing these issues.
Presented by Gareth Knight at the ADMIT Network conference, organised by the Association for Data Management in the Tropics, in Antwerp, Belgium on December 1st 2015.
dkNET Office Hours - "Are You Ready for 2023: New NIH Data Management and Sha...", by dkNET
For all proposals submitted on/after January 25 2023, NIH requires data sharing from all NIH-funded studies. Do you have appropriate data management practices and sharing plans in place to meet these requirements? Have questions or need some help? Join the dkNET office hours to learn about NIH’s policy (NOT-OD-21-013) and available resources that could help.
In our upcoming session on March 3, 2023, we are pleased to invite Dr. Jeffrey Grethe, dkNET co-PI and expert on Data Management and Sharing, Dr. Rebecca Rodriguez, Repository Program Director at NIDDK, Ms. Reaya Reuss, Chief of Staff to the Deputy Director at NIDDK, and the support team members from the NIDDK Central Repository. They will be available to answer any questions you may have.
*Previous Office Hours Slides and Recording: https://dknet.org/about/blog/2535
Upcoming Webinars Schedule: https://dknet.org/about/webinar
Researchers require infrastructures that ensure a maximum of accessibility, stability and reliability to facilitate working with and sharing of research data. Such infrastructures are being increasingly summarised under the term Research Data Repositories (RDR). The project re3data.org – Registry of Research Data Repositories – began to index research data repositories in 2012 and offers researchers, funding organisations, libraries and publishers an overview of the heterogeneous research data repository landscape. In December 2014 re3data.org listed more than 1,030 research data repositories, which are described in detail using the re3data.org schema (http://dx.doi.org/10.2312/re3.003). Information icons help researchers to identify easily an adequate repository for the storage and reuse of their data. This talk describes the heterogeneous RDR landscape and presents a typology of institutional, disciplinary, multidisciplinary and project-specific RDR. Further, it outlines the features of re3data.org and shows current developments for integration into data management planning tools and other services.
By the end of 2015 re3data.org and Databib (Purdue University, USA) will merge their services, which will then be managed under the auspices of DataCite. The aim of this merger is to reduce duplication of effort and to serve the research community better with a single, sustainable registry of research data repositories. The talk will present this organisational development as a best practice example for the development of international research information services.
Similar to Perspectives on the Role of Trustworthy Repository Standards in Data Journal Publication
Results from Digital Curation Centre's 2015 survey of UK universities and Higher Education Institutions on development of RDM (research data management) support services
Outline of some of the opportunities and risks driving universities in the UK to develop services aiming to help researchers manage their data. Gives an overview of funder policies and service components, and suggests where these are heading in the short term.
Presentation from the "Institutional Repositories Dealing with Data" OR2013 Workshop, 8th July 2013, Prince Edward Island. Outlines UK programmes to help Higher Education Institutions develop Research Data Management services, gives background on the Digital Curation Centre and its role in developing services, and outlines emerging RDM services based on this experience, on projects in the JISC Managing Research Data programmes, and on two recent surveys of library plans & priorities. Then outlines examples in 'new' universities of how repository managers are enabling new roles for subject librarians to take shape in their institutions.
Presented to "Managing the Material: Tackling Visual Arts as Research Data" workshop, organised by Visual Arts Data Service (VADS) in conjunction with the Digital Curation Centre (DCC), through the JISC-funded KAPTUR project. London, 14 September 2012
Perspectives on the Role of Trustworthy Repository Standards in Data Journal Publication
1. Perspectives on the Role of Trustworthy Repository Standards in Data Journal Publication
IASSIST Cologne, 31 May 2013
Angus Whyte, Sarah Callaghan, Jonathan Tedds, Matthew S. Mayernik, and the PREPARDE project team
#preparde
a.whyte@ed.ac.uk
2. Aims
1. Introduce the PREPARDE project: data journal and repository links; data peer review; repository trust accreditation
2. Repository certification background: why relevant to data journals; standards developed; issues being discussed
3. PREPARDE Guidelines: input to them from the IDCC workshop, and hopefully also your comments…
Q. What should repositories, depositors and journals expect from one another?
Q. What are use cases for data journals in the social sciences?
Q. What support should institutions offer?
3. PREPARDE: Peer REview for Publication & Accreditation of Research Data in the Earth sciences
Lead Institution: University of Leicester
Partners:
– British Atmospheric Data Centre (BADC)
– US National Center for Atmospheric Research (NCAR)
– California Digital Library (CDL)
– Digital Curation Centre (DCC)
– University of Reading
– Wiley-Blackwell
– Faculty of 1000 Ltd
Project Lead: Dr Jonathan Tedds (University of Leicester, jat26@le.ac.uk)
Project Manager: Dr Sarah Callaghan (BADC, sarah.callaghan@stfc.ac.uk)
Length of Project: 12 months
Project Start Date: 1st July 2012
Project End Date: 30th June 2013
4. PREPARDE topics: 3 main areas of interest (in orange)
1. Workflows and cross-linking between journal and repository
2. Repository accreditation
3. Scientific peer review of data
Main aim: to put in place the policies and procedures needed for data publication in the Geoscience Data Journal, and to generalise those policies for application outside the Earth sciences.
5. Why: Reasons for citing and publishing data
• Pressure from (UK) government to make data from publicly funded research available for free
• Scientists want attribution and credit for their work
• Public want to know what the scientists are doing
• Research funders want reassurance that they're getting value for money
• Relies on peer review of science publications (well established) and data (not done yet!)
• Allows the wider research community to find and use datasets, and understand the quality of the data
• Extra incentive for scientists to submit their data to data centres in appropriate formats and with full metadata
http://www.evidencebased-management.com/blog/2011/11/04/new-evidence-on-big-bonuses/
6. How: Geoscience Data Journal, Wiley-Blackwell and the Royal Meteorological Society
• Partnership to develop a mechanism for the formal publication of data in the Open Access Geoscience Data Journal
• GDJ publishes short data articles cross-linked to and citing datasets that have been deposited in approved data centres and awarded DOIs or other permanent identifiers
• A data article describes a dataset's collection, processing, software, file formats, etc., without the requirement of novel analyses or ground-breaking conclusions
• i.e. the when, how and why the data was collected, and what the data product is
7. Dataset submission
"Authors must complete the following two-tiered process: The dataset, along with supporting metadata, must be formally archived in a Geoscience Data Journal approved repository or data centre (and preferably have been assigned a digital object identifier (DOI))… An approved repository is one that is commonly used by the scientific community it supports, has a formal data management policy in place, and can mint a DOI or provide a stable URL and unique identifier for the dataset." (Author Guidelines)
Current approved repositories:
3TU.Datacentrum
British Atmospheric Data Centre (BADC)
British Oceanographic Data Centre (BODC)
CSIRO Data Access Portal
Environmental Information Data Centre (EIDC)
Figshare
National Geoscience Data Centre (NGDC)
NERC Earth Observation Data Centre (NEODC)
PANGAEA
Polar Data Centre (PDC)
8. Dataset submission
"…Subject to satisfactory reviews of both dataset and paper, Geoscience Data Journal will publish the data description paper, along with a link to the underlying dataset (usually by means of the dataset's DOI)." (Author Guidelines)
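The cross-link described above normally works through the dataset's DOI. As a minimal sketch of how a journal system or reviewer could follow that link, the snippet below builds a DOI content-negotiation request for citation metadata (the doi.org resolver supports an Accept header for CSL JSON); the DOI shown is hypothetical and stands in for a real dataset DOI.

```python
# Sketch: resolving a dataset DOI to citation metadata via DOI content
# negotiation. The DOI used below is hypothetical, for illustration only.
import json
import urllib.request

def citation_request(doi: str):
    """Build the URL and headers for a CSL JSON metadata request."""
    url = "https://doi.org/" + doi
    headers = {"Accept": "application/vnd.citationstyles.csl+json"}
    return url, headers

def fetch_citation(doi: str) -> dict:
    """Fetch citation metadata for a DOI (requires network access)."""
    url, headers = citation_request(doi)
    req = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

if __name__ == "__main__":
    url, headers = citation_request("10.1234/example-dataset")  # hypothetical
    print(url)
```

The same mechanism lets a data paper's landing page render an up-to-date citation for the underlying dataset rather than hard-coding one.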
9. How we publish data
The traditional online journal model:
1) Author prepares the paper using word processing software with the journal template.
2) Author submits the paper as a PDF/Word file to a journal (any online journal system).
3) Reviewer reviews the PDF file against the journal's acceptance criteria.
Overlay journal model for publishing data (Geoscience Data Journal):
1) Author prepares the data paper using word processing software and the dataset using appropriate tools.
2a) Author submits the data paper to the journal.
2b) Author submits the dataset to a repository (e.g. BADC, BODC).
3) Reviewer reviews the data paper and the dataset it points to against the journal's acceptance criteria.
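The two-tiered submission in the overlay model can be sketched as a small data model: a data paper is only ready for review once its dataset is archived in an approved repository under a permanent identifier. This is a toy illustration, not GDJ's actual systems; the class and function names, and the trimmed repository set, are assumptions.

```python
# Toy sketch of the overlay model's paper-dataset cross-link.
# Names and the repository subset are illustrative assumptions.
from dataclasses import dataclass

APPROVED_REPOSITORIES = {"BADC", "BODC", "PANGAEA"}  # subset, for illustration

@dataclass
class DatasetRecord:
    identifier: str   # DOI or stable URL minted by the repository
    repository: str

@dataclass
class DataPaper:
    title: str
    dataset: DatasetRecord

def ready_for_review(paper: DataPaper) -> bool:
    """Both halves of the two-tiered submission must be in place."""
    ds = paper.dataset
    return bool(ds.identifier) and ds.repository in APPROVED_REPOSITORIES

paper = DataPaper(
    title="Example data description paper",
    dataset=DatasetRecord(identifier="10.5072/example", repository="BADC"),
)
print(ready_for_review(paper))  # True: archived, identified, approved repository
```

The point of the check is the workflow dependency: review of the paper cannot begin until step 2b (dataset deposit) has produced a resolvable identifier.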
10. Peer Review: GDJ reviewers consider three sets of questions
Review I – Data description document
1. Is the method used to create the data of a high scientific standard?
2. Is enough information provided (in the metadata also) to enable the data to be re-used or the experiment to be repeated?
3. Does the document provide a comprehensive description of all the data that is there?
4. Does the data make an important and unique contribution to the meteorological sciences?
5. What range of applications to the meteorological sciences does it have?
6. Are all contributors and existing work acknowledged?
11. Peer Review: GDJ reviewers consider three sets of questions
Review II – Metadata
7. Does the metadata establish the ownership of the data fairly?
8. Is enough information provided (in the data description document also) to enable the data to be re-used or the experiment to be repeated?
9. Are the data present as described, and accessible from a registered repository using the software provided?
Overlaps with repository appraisal, curation processes… and trust certification?
12. GDJ Reviewers consider three sets of questions
Review III – Data themselves
10. Are the data easily readable, e.g. across
different platforms such as Linux Mac and
Windows?
11. Are the data of high quality e.g. are error
limits and quality statements adequate to assess
fitness for purpose, is spatial or temporal
coverage good enough to make the data useable?
12. Are the data values physically possible and
plausible?
13. Are there missing data that might
compromise its usefulness?
Peer Review
Overlaps with repository
appraisal, curation
processes…and trust
certification?
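Parts of the "data themselves" review above lend themselves to automation. A minimal sketch, assuming a flat list of numeric values and an illustrative valid range and missing-data threshold (these are assumptions, not documented GDJ practice):

```python
# Illustrative sketch (assumptions, not GDJ tooling): automating parts of
# Review III -- are values physically plausible, and how much data is missing?

def review_data(values, valid_range, max_missing_fraction=0.05):
    """Return simple technical-review findings for one variable."""
    present = [v for v in values if v is not None]
    n_missing = len(values) - len(present)
    lo, hi = valid_range
    out_of_range = [v for v in present if not (lo <= v <= hi)]
    return {
        "n_values": len(values),
        "n_missing": n_missing,
        # Flag if missing data might compromise usefulness (threshold assumed)
        "missing_ok": n_missing / len(values) <= max_missing_fraction,
        "out_of_range": out_of_range,
        "plausible": not out_of_range,
    }

# Example: air temperatures in kelvin; -5.0 K is physically impossible.
report = review_data([280.1, 281.4, None, 279.8, -5.0],
                     valid_range=(180.0, 330.0))
```

A reviewer would still judge fitness for purpose; the sketch only surfaces the mechanical checks (ranges, gaps) so human attention goes to the scientific questions.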
13. Repository accreditation schemes
European Framework for Audit and Certification of Digital
Repositories.
Three levels, in increasing trustworthiness:
1. Basic Certification is granted to repositories which obtain
DSA (Data Seal of Approval) certification;
2. Extended Certification is granted to Basic Certification
repositories which in addition perform a
structured, externally reviewed and publicly available self-
audit based on ISO 16363 or DIN 31644;
3. Formal Certification is granted to repositories which in
addition to Basic Certification obtain full external audit and
certification based on ISO 16363 or equivalent DIN 31644.
14. Data
Centre
Repository accreditation – IDCC workshop
Link between data paper and
dataset is crucial!
• How can data journal
editors know a repository
is trustworthy?
• How can repositories
prove they’re trustworthy?
• What does “trustworthy”
mean for data journal
peer review?
What guidelines can journals use?
• General, cross-disciplinary
and concrete
• How far do certification
standards help?
15. IDCC Workshop background
PREPARDE Workshop, Amsterdam 17 Jan. 2013
• Research Data Alliance - Working Group on
Repository Accreditation
http://forum.rd-alliance.org/viewtopic.php?f=3&t=31
• Previous work on integrating data and
publications e.g. DRIVER project and
Opportunities for Data Exchange report
• Innovation in data integration
E.g. PANGAEA – Elsevier since 2010
• New data journals e.g. Journal of Open
Psychology Data (Ubiquity Press, DANS)
16. Workshop perspectives
36 Participants – range of roles
Data Centres - UKDA, PANGAEA,
BADC
Learned Society - Royal Society of
Chemistry
Publisher - Elsevier
Institutions - UK, US, Germany, Australia,
the Netherlands, Switzerland
National Libraries & Orgs -
STM Assoc., DANS (NL), NRF
(SA), BL, DCC (UK)
Common Ground
• Data journals offer reuse
and citation
But a passing phase?
• Data journals offer credit
to data managers
• Certification yes, it offers
journals some assurances
• Collaboration key as roles
& infrastructure evolve
17. For data publication “trust” means…
What certification standards say it
is…
Collections policy
Active curation & mgmt
Long-term preservation plans
Persistent Links
Landing pages
Continuity plan
Support for multi-stage review
Repository – QA, appraisal
Peer – open or closed
User – e.g. DANS study
Journals can plan how to integrate
more data into articles
Don’t have to look at process detail
for each dataset reviewed
Data centres can support policy
compliance – track outputs against
grants (e.g. IDEA Data Compliance
Reporting Tool) or data sharing
statements
18. Cloudier issues
How do repository accreditation and
data quality relate to each other?
What about quality of service to
depositors, users?
Researchers’ and other stakeholder
roles …
e.g. advocacy, tool support to gather
provenance info for publication
earlier?
Repository directories – informing
decisions on trust?
Indicators of repository value…not covered
in certification?
•Funding
•Community acceptance
•Alt-metrics – access and reuse metrics
Service level agreements, memorandums of
understanding may better meet some
needs than certification
19. Draft guidelines for journal editors
For data publication, repositories must:
1. Ensure persistence and stability of published datasets
2. Have a clear and public commitment to preserve the data or take
responsibility for providing access to the data over the long term
3. Assign globally unique persistent IDs to the published datasets and
maintain all URLs associated with those IDs
4. Provide persistent, actionable links to enable citations to data
5. Ensure that data will be accessible (open data, or info on license terms)
6. Actively manage and curate the data in their archive
7. Have an appropriate, formal succession plan, plus contingency plans/escrow in
case the repository ceases operation
8. Provide info on numbers of deposits and frequency of user access
20. Draft guidelines for journal editors
Repositories can ‘prove’ capabilities to provide persistent access by…
1. Certification on any of 3 levels in TrustedDigitalRepository.eu
2. Regular or network membership of ICSU World Data System
3. Data Centre accreditation via MEDIN
4. Contractual arrangement with DataCite managing agent to mint DOIs
5. Operation in line with the OAIS reference model
6. Clear intent in mission statement, institutional data mgmt policy, data
preservation plan, collections policy
7. Evidence of community take-up e.g. user numbers, service level
agreements, partnership agreements with well established journals, a
learned society or equivalent body.
Use a directory, e.g. re3data, for reference on some of the above.
21. Landing page requirements
Permanent IDs for the dataset must resolve to a
publicly accessible landing page which must:
• be open and human readable (can also be
provided in a format which is machine readable)
• describe the data object and include metadata and
permanent identifier
• be maintained, even if the data is no longer
available.
Metadata:
• Must be human readable, where possible machine
readable (e.g. DataCite metadata schema)
• Freely available for discovery purposes
• Repo must develop and implement suitable quality
control measures to ensure the metadata is correct
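One such quality-control measure could be a completeness check on each metadata record before the landing page is published. The sketch below uses the DataCite schema's mandatory properties as the required-field list; representing the record as a plain dict is an assumption for illustration:

```python
# Illustrative sketch: check a landing-page metadata record for the
# mandatory DataCite properties (identifier, creator, title, publisher,
# publication year, resource type). The dict format is an assumption.

REQUIRED_FIELDS = ("identifier", "creator", "title", "publisher",
                   "publicationYear", "resourceType")

def missing_mandatory_fields(record):
    """Return the mandatory fields that are absent or empty."""
    return [f for f in REQUIRED_FIELDS if not record.get(f)]

# Hypothetical record for illustration only.
record = {
    "identifier": "doi:10.1234/example",
    "title": "Example dataset",
    "publisher": "Example Data Centre",
}
missing = missing_mandatory_fields(record)
```

A repository could run this at ingest and refuse to mint the permanent identifier until the list comes back empty.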
22. Social Science Use Cases?
Data centres – increase reuse
Funders, data centres, researchers,
learned societies – improve transparency
Data centres, researchers, learned
societies, institutions – improve visibility
Data managers – publication route, get credit
Researchers – provide snapshot of rich
content, sensitive data
Reusers – support meta-analysis,
mine structured description,
visualisation
23. Thank you
And please! Tell us what you think
data-publication@jiscmail.ac.uk
sarah.callaghan@stfc.ac.uk
Project website: http://proj.badc.rl.ac.uk/preparde/wiki
Project blog: http://proj.badc.rl.ac.uk/preparde/blog
24. Peer-review of data
Summary Recommendations from
Workshop at the British Library, 11 March
2013
Workshop attendees included
funders, publishers, repository managers
and other interested parties.
Draft recommendations put up for
discussion and feedback from audience
captured.
Feedback from the community still
welcome!
http://libguides.luc.edu/content.php?pid=5464&sid=164619
25. Connecting data review with data management
planning
1. All research funders should at least require a “data sharing plan” as part of all
funding proposals, and if a submitted data sharing plan is inadequate, appropriate
amendments should be proposed.
2. Research organisations should manage research data according to recognised
standards, providing relevant assurance to funders so that additional technical
requirements do not need to be assessed as part of the funding application peer
review. (Additional note: Research organisations need to provide adequate
technical capacity to support the management of the data that the researchers
generate.)
3. Research organisations and funders should ensure that adequate funding is
available within an award to encourage good data management practice.
4. Data sharing plans should indicate how the data can and will be shared and
publishers should refuse to publish papers which do not clearly indicate how
underlying data can be accessed, where appropriate.
26. 1. Articles and their underlying data or metadata (by the same or other authors)
should be multi-directionally linked, with appropriate management for data
versioning.
2. Journal editors should check data repository ingest policies to avoid duplication of
effort, but provide further technical review of important aspects of the data
where needed. (Additional note: A map of ingest/curation policies of the different
repositories should be generated.)
3. If there is a practical/technical issue with data access (e.g. files don’t open or
exist), then the journal should inform the repository of the issue. If there is a
scientific issue with the data, then the journal should inform the author in the first
instance; if the author does not respond adequately to serious issues, then the
journal should inform the institution, which should take the appropriate action.
Repositories should have a clear policy in place to deal with any feedback.
Connecting scientific, technical review and curation
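Recommendation 1 above (multi-directional links with version management) can be sketched as a simple link record that pins each article to a specific dataset version and supports lookup in both directions. All names and DOIs here are assumptions for illustration, not a published linking schema:

```python
# Illustrative sketch: article <-> dataset links, each pinned to a
# dataset version so the article always cites a fixed snapshot.

from dataclasses import dataclass

@dataclass(frozen=True)
class DataLink:
    article_doi: str
    dataset_doi: str
    dataset_version: str

def articles_citing(links, dataset_doi):
    """Traverse dataset -> article (one direction of the link)."""
    return sorted({l.article_doi for l in links if l.dataset_doi == dataset_doi})

def datasets_underlying(links, article_doi):
    """Traverse article -> dataset (the other direction)."""
    return sorted({(l.dataset_doi, l.dataset_version)
                   for l in links if l.article_doi == article_doi})

# Hypothetical DOIs for illustration only.
links = [
    DataLink("10.1234/article-1", "10.5678/dataset-A", "v1"),
    DataLink("10.1234/article-2", "10.5678/dataset-A", "v2"),
]
```

Keeping the version in the link record is what lets a dataset evolve after publication without silently changing what the article's readers see.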
27. 1. For all articles where the underlying data is being submitted, authors need to
provide adequate methods and software/infrastructure information as part of
their article. Publishers of these articles should have a clear data peer review
process for authors and referees.
2. Publishers should provide simple and, where appropriate, discipline-specific data
review (technical and scientific) checklists as basic guidance for reviewers.
3. Authors should clearly state the location of the underlying data. Publishers should
provide a list of known trusted repositories or, if necessary, provide advice to
authors and reviewers of alternative suitable repositories for the storage of their
data.
4. For data peer review, the authors (and journal) should ensure that the data
underpinning the publication, and any tools required to view it, should be fully
accessible to the referee. The referees and the journal need to then ensure
appropriate access is in place following publication.
5. Repositories need to provide clear terms and conditions for access, and ensure
that datasets have permanent and unique identifiers.
Connecting data review with article review
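For recommendation 5 (permanent and unique identifiers), a journal or repository can at least verify syntactically that a submitted identifier looks like a DOI. The pattern below follows Crossref's recommended regular expression for modern DOIs; it checks form only, not whether the DOI actually resolves:

```python
# Illustrative sketch: syntactic DOI check (form only, no resolution).

import re

DOI_PATTERN = re.compile(r"^10\.\d{4,9}/[-._;()/:A-Za-z0-9]+$")

def looks_like_doi(identifier):
    """True if `identifier` is syntactically a DOI, e.g. 10.1234/abc-123."""
    return bool(DOI_PATTERN.match(identifier))

# Hypothetical identifiers for illustration.
ok = looks_like_doi("10.1234/abc-123")        # well-formed DOI
bad = looks_like_doi("http://example.org/x")  # a plain URL is not a DOI
```

A full check would also dereference the DOI and confirm it lands on the public, human-readable landing page the guidelines require.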
Editor's Notes
Note that Wiley-Blackwell and Faculty of 1000 are commercial partners