FAIR Data in Trustworthy Data Repositories Webinar - 12-13 December 2016 (EUDAT)
www.eudat.eu | This webinar was co-organised by DANS, EUDAT and OpenAIRE and was held on 12th and 13th December 2016.
Everybody wants to play FAIR, but how do we put the principles into practice?
There is a growing demand for quality criteria for research datasets. In this webinar we will argue that the DSA (Data Seal of Approval for data repositories) and FAIR principles get as close as possible to giving quality criteria for research data. They do not do this by trying to make value judgements about the content of datasets, but rather by qualifying the fitness for data reuse in an impartial and measurable way. By bringing the ideas of the DSA and FAIR together, we will be able to offer an operationalization that can be implemented in any certified Trustworthy Digital Repository.
In 2014 the FAIR Guiding Principles (Findable, Accessible, Interoperable and Reusable) were formulated. The well-chosen FAIR acronym is highly attractive: it is one of those ideas that almost automatically gets stuck in your mind once you have heard it. Within a relatively short time, the FAIR data principles have been adopted by many stakeholder groups, including research funders.
The FAIR principles are remarkably similar to the underlying principles of DSA (2005): the data can be found on the Internet, are accessible (clear rights and licenses), in a usable format, reliable and are identified in a unique and persistent way so that they can be referred to. Essentially, the DSA presents quality criteria for digital repositories, whereas the FAIR principles target individual datasets.
In this webinar the two sets of principles will be discussed and compared and a tangible operationalization will be presented.
An introduction to the FAIR principles and a discussion of key issues that must be addressed to ensure data is findable, accessible, interoperable and re-usable. The session explored the role of the CDISC and DDI standards for addressing these issues.
Presented by Gareth Knight at the ADMIT Network conference, organised by the Association for Data Management in the Tropics, in Antwerp, Belgium on December 1st 2015.
The document discusses metadata and its uses in information retrieval. It describes how metadata is important for understanding information in data warehouses and is used in content management and web applications. It also discusses different types of metadata, such as descriptive, structural and administrative metadata, and the functions of metadata: enabling resource discovery, organizing e-resources, facilitating interoperability, digital identification, archiving and preservation. Finally, it discusses metadata elements, tools and organizations, and concludes by stating that metadata is essential in the digital era to improve access to information.
This document summarizes a session from the Force 11 Scholarly Communications Institute Summer School on data discovery. The session covered metadata, including what it is, types of metadata, and standards. It discussed how people search for and find data through various sources. The session also explored the FAIR data principles of findable, accessible, interoperable and reusable data and had breakout groups discuss applying these principles in practice.
Overview of the Research on Open Educational Resources for Development (ROER4D) Open Data initiative, highlighting data management principles, the five pillars of the ROER4D data publication approach and the project de-identification approach.
Workshop session delivered alongside the 'Making your thesis legal' workshop in July and September 2013 to PhD, MPhil and DrPH students who are completing their theses. Discusses standards for sharing data, issues that need addressing, formats, data protection, usability and licenses.
This document provides an overview of FAIR data principles and the FAIR data ecosystem. It discusses what FAIR data is, including that FAIR data aims to support communities in publishing and utilizing scientific data and knowledge in a findable, accessible, interoperable, and reusable manner. It then describes the different levels of the FAIR data ecosystem, including normative principles, standards in the FAIR data protocol, FAIR data resources that comply with these standards, and systems/tools that use FAIR data. It provides examples of converting raw data into FAIR data resources and the potential applications of a FAIR data ecosystem.
Essentials 4 Data Support: a fine course in FAIR Data Support (Ellen Verbakel)
The document summarizes the Essentials 4 Data Support (E4DS) course, which teaches people how to support researchers in storing, managing, archiving, and sharing research data according to FAIR principles. The course covers topics like data documentation, identifiers, formats, metadata, and licensing. It is offered online or in a blended format over 6 weeks. The goal is to educate data supporters so that researchers can find, access, interoperate with, and reuse each other's data in a fair manner.
Increasing the Reputation of your Published Data on the Web (Eric Stephan)
The document discusses how following FAIR principles can increase the reputation of published research data on the web. It provides an overview of FAIR principles, including making data findable, accessible, interoperable, and reusable. It also describes a FAIR data maturity model that uses indicators to evaluate how well digital objects, datasets, and repositories follow each of the FAIR principles. The document concludes that FAIR principles and maturity indicators can be used to assess data reputation by researchers, repositories, domains, and institutions working together to share responsibilities in managing research data.
LIBER Webinar: Are the FAIR Data Principles really fair? (LIBER Europe)
The FAIR Data Principles are a hot topic in research data management. Their adoption within the H2020 funding programme means researchers now have to pay much more attention to how they share, publish and archive their data.
In this light, how can libraries help their research communities implement the FAIR principles? And write better data management plans?
These questions were addressed in a LIBER webinar containing some guidance and reflections on the principles themselves. Presented by Alastair Dunning, Head of Research Data Services at TU Delft (hosts of the 4TU.Centre for Research Data), it is based on a study of 37 data repositories (from subject-specific repositories, to generic data archives, to national infrastructures), examining how far they comply with each of the individual facets of the FAIR Data Principles.
Presentation on data sharing that outlines five layers that must be addressed to enable data to be located, obtained, accessed, understood, used and cited.
The FAIR principles have been introduced as a guideline for good scientific data stewardship. They have gained momentum at a management level and are now for example part of the project template for EU Horizon 2020 projects. This raises the question what research groups and projects can do to implement them. Hugo Besemer will introduce the ideas behind the FAIR principles.
FAIR Data Management and FAIR Data Sharing (Merce Crosas)
Presentation at the Critical Perspectives on the Practice of Digital Archaeology symposium: http://archaeology.harvard.edu/critical-perspectives-practice-digital-archaeology
The document discusses the FAIR principles for findable, accessible, interoperable, and reusable scientific data. It provides a timeline for the development of the FAIR principles from 2014 to the present. It describes each of the FAIR components and proposes indicators for evaluating compliance. For each principle, it discusses challenges in implementing them at Wageningen University in the Netherlands. Overall, the document aims to help researchers and institutions understand and apply the FAIR principles to improve data management and sharing.
Providing support and services for researchers in good data governance (Robin Rice)
The University of Edinburgh provides support and services to help researchers with good data governance. This includes a research data policy, research data service with various tools across the data lifecycle, and a data safe haven for sensitive data. The research data service offers centralized storage, version control, collaboration tools, and repositories for sharing data openly or long-term retention. Training and outreach aim to educate researchers on topics like data management plans, sensitive data, and GDPR compliance.
The document discusses principles for FAIR (Findable, Accessible, Interoperable, Reusable) data and metadata. It outlines the FAIR data principles for metadata, data, and supporting infrastructure. The principles state that metadata and data should be assigned persistent and unique identifiers, be described richly, be registered and searchable, be retrievable through open standards, use formal knowledge representation and vocabularies, include references to other metadata, and be released with licenses and provenance. The document also discusses extending metadata layers with additional vocabularies like DCAT, PROV, and DATS to better structure and connect metadata.
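To make the metadata elements listed above concrete, the following is a minimal, hypothetical sketch of a DCAT-style dataset record expressed as JSON-LD. All identifiers, titles and URLs are placeholders invented for illustration; the real vocabularies (DCAT, Dublin Core terms, PROV) define many more properties than are shown here.

```python
import json

# Hypothetical DCAT/PROV metadata record as JSON-LD; every value below is a
# placeholder, not a real dataset. It illustrates the FAIR metadata elements
# from the text: a persistent identifier, a rich description, a license,
# provenance, and an open distribution format.
record = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dct": "http://purl.org/dc/terms/",
        "prov": "http://www.w3.org/ns/prov#",
    },
    "@id": "https://example.org/id/dataset-1",  # placeholder persistent ID
    "@type": "dcat:Dataset",
    "dct:title": "Example survey dataset",
    "dct:description": "Rich, human- and machine-readable description.",
    "dct:license": "https://creativecommons.org/licenses/by/4.0/",
    "prov:wasGeneratedBy": {
        "@type": "prov:Activity",
        "dct:description": "Fictitious field survey",
    },
    "dcat:distribution": {
        "@type": "dcat:Distribution",
        "dcat:mediaType": "text/csv",  # sustainable, open format
        "dcat:downloadURL": "https://repository.example.org/data.csv",
    },
}

# Serialise for deposit alongside the data files.
print(json.dumps(record, indent=2))
```

A repository would typically generate such a record automatically at deposit time and expose it for harvesting, so that the dataset is findable and its license and provenance travel with it.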
Horizon 2020 open access and open data mandates (Martin Donnelly)
This document summarizes the key requirements for open access and open data under the Horizon 2020 framework. It outlines the mandate for open access to publications, requiring deposit in a repository and granting open access rights. It also describes the open data pilot, defining research data and the FAIR principles of findable, accessible, interoperable and reusable data. Projects must submit a data management plan addressing data collection, sharing and preservation. Compliance involves depositing data in a repository and applying an open license.
This presentation gives an overview of the key things to consider before deciding to set up a data repository. It briefly talks about data repositories, the software behind them, and their limitations and merits. Additionally, the presenters shared IFPRI's experiences with Harvard Dataverse.
This presentation gives an overview of the key things to consider before publishing data from a repository. It briefly discusses research data management, the research data lifecycle and the FAIR principles of research data management, and then moves on to key elements that should be considered when preparing datasets for publishing through a repository.
URM concept for sharing information inside of communities (Karel Charvat)
The document describes the Uniform Resource Management (URM) concept for sharing information within communities. URM provides a framework for standardized description of information using metadata schemes and controlled vocabularies to improve discovery. It is implemented through various portals and tools that allow users to manage and discover knowledge according to context. Initial implementations included portals for nature, sustainability and rural information in the Czech Republic and Latvia. URM supports collaborative knowledge sharing through interoperable systems based on open standards.
Presentation by Luiz Olavo Bonino, Dutch Techcentre for Life Sciences & Vrije Universiteit Amsterdam.
As one of the organisations present at the Lorentz workshop in January 2014 where the concept of FAIR Data was created, the Dutch Techcentre for Life Sciences has since worked on a number of solutions to support the adoption and dissemination of the FAIR Data Principles. This presentation describes an ecosystem for supporting FAIR data.
This presentation introduces the basics of the Dataverse including preparing the submission to the Dataverse, creating an account and logging in, adding datasets to the Dataverse account, and metadata.
PARTHENOS Common Policies and Implementation Strategies (Parthenos)
Presentation by Hella Hollander for the PARTHENOS workshop "Introducing PARTHENOS - Integrating the Digital Humanities" on 14 December 2016 in Prato, Italy.
Data Management Planning for researchers (Sarah Jones)
This document provides information about creating a data management plan (DMP) for researchers. It begins by defining what a DMP is: a short plan that outlines what data will be created, how it will be managed and stored, and plans for sharing and preservation. It then discusses the common components of a DMP, including describing the data, standards and methodologies, ethics and intellectual property, data sharing plans, and preservation strategies. The document provides examples of DMP requirements and recommendations from funders. It offers tips for creating a good DMP, including thinking about the needs of future data re-users, consulting stakeholders, grounding plans in reality, and planning for sharing from the outset. Finally, it discusses tools and resources to support data management planning.
OSFair2017 Training | FAIR metrics - Starring your data sets (Open Science Fair)
Peter Doorn, Marjan Grootveld & Elly Dijk talk about FAIR data principles and present the assessment tool that DANS is developing for data repositories | OSFair2017 Workshop
Workshop title: FAIR metrics - Starring your data sets
Workshop overview:
Do you want to join our effort to put the FAIR data principles into practice? Come and explore the assessment tool that DANS, Data Archiving and Networked Services in the Netherlands, is developing for data repositories.
The aim of our work is to implement the FAIR principles in a data assessment tool, so that every dataset deposited in or reused from any digital repository can be scored on the principles Findable, Accessible, Interoperable, and Reusable, using a 'FAIRness' scale from 1 to 5 stars. In this interactive session participants can explore the pilot version of FAIRdat: the FAIR data assessment tool. The organisers would like to inform you about the project, and look forward to all feedback to improve the tool or the metrics that are used.
DAY 3 - PARALLEL SESSION 7
FAIR data in trustworthy repositories: the basics (OpenAIRE)
This video illustrates how certified digital repositories contribute to making and keeping research data findable, accessible, interoperable and reusable (FAIR). Trustworthy repositories support Open Access to data, as well as Restricted Access when necessary, and they offer support for metadata, sustainable and interoperable file formats, and persistent identifiers for future citation. Presented by Marjan Grootveld (DANS, OpenAIRE).
Main references
• Core Trust Seal for trustworthy digital repositories: https://www.coretrustseal.org/
• EUDAT FAIR checklist: https://doi.org/10.5281/zenodo.1065991
• European Commission’s Guidelines on FAIR data management: http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
• FAIR data principles: www.force11.org/group/fairgroup/fairprinciples
• Overview of metadata standards and tools: https://rdamsc.dcc.ac.uk/
OSFair2017 workshop | Monitoring the FAIRness of data sets - Introducing the ... (Open Science Fair)
Elly Dijk & Peter Doorn present the DANS approach to FAIR metrics
Workshop title: Open Science Monitor
Workshop overview:
What are the measurable components of Open Science? How do we build a trustworthy, global Open Science monitor? This workshop will discuss a potential framework for measuring Open Science, including the path from the publishing of an open policy (registries of policies and how these are represented or machine-read), to the use of open methodologies, and the opening up of research results, their recording and measurement.
DAY 2 - PARALLEL SESSION 5
Access to biomedical data is increasingly important to enable data driven science in the research community.
The Linked Open Data (LOD) principles (by Tim Berners-Lee) have been suggested as a way to judge the quality of data by its accessibility (open data access), by its format and structure, and by its interoperability with other data sources.
The objective is to use interoperable data sources across the Web with ease.
The FAIR (findable, accessible, interoperable, reusable) data principles have been introduced for similar reasons with a stronger emphasis on achieving reusability.
In this presentation we assess the FAIR principles against the LOD principles to determine to what degree the FAIR principles reuse the LOD principles, and to what degree they extend them.
This assessment helps to clarify the relationship between the two schemes and gives a better understanding of the extension that FAIR represents in comparison to LOD.
We conclude that LOD gives a clear mandate for the openness of data, whereas FAIR asks for a stated license for access and thus includes the concept of reusability under consideration of the license agreement.
Furthermore, FAIR makes strong reference to the contextual information required to improve reuse of the data, e.g., provenance information.
According to the LOD principles, such metadata would be considered interoperable data as well; however, the requirement to extend data with metadata does indicate that FAIR is an extension of LOD (rather than the inverse).
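The contrast drawn in this abstract can be made concrete in code. The sketch below is illustrative only and not taken from the presentation: it models a dataset record as a plain dictionary, with field names loosely following Dublin Core/DCAT conventions, and separates the checks LOD emphasises (resolvable identifier, open format, links) from the license and provenance elements the abstract attributes to FAIR.

```python
# Illustrative sketch: field names and helper functions are invented for
# this example, not drawn from the LOD or FAIR specifications themselves.

dataset = {
    "identifier": "https://doi.org/10.5072/example-dataset",   # resolvable URI
    "format": "text/turtle",                                   # open format
    "links_to": ["https://example.org/related-dataset"],       # links to other data
    "license": "https://creativecommons.org/licenses/by/4.0/", # stated license (FAIR)
    "provenance": "Derived from survey X, processed 2016-11",  # provenance (FAIR)
}

def meets_lod(record: dict) -> bool:
    """Rough LOD check: resolvable URI, a declared format, links to other data."""
    return (record.get("identifier", "").startswith("http")
            and record.get("format") is not None
            and bool(record.get("links_to")))

def meets_fair_extras(record: dict) -> bool:
    """The extensions the abstract attributes to FAIR: stated license and provenance."""
    return bool(record.get("license")) and bool(record.get("provenance"))

print(meets_lod(dataset), meets_fair_extras(dataset))
```

A record can pass the LOD-style check while failing the FAIR extras, which mirrors the abstract's conclusion that FAIR extends rather than replaces LOD.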
FAIR in Astronomy Research - Slides. In this webinar ARDC is partnering with the ADACS project to explore the FAIR data principles in the context of Astronomy research, with the ASVO and IVOA as community exemplars of the implementation of the FAIR data principles.
These slides from: Keith Russell (ARDC): Looking at FAIR
In this talk Keith will provide an overview of the FAIR principles and how they were applied in astronomy before they were formally established. He will conclude the talk by discussing what other disciplines can learn from this approach.
This document discusses guidelines for making archaeological data findable, accessible, interoperable, and reusable (FAIR). It is funded by the European Commission and aims to support the creation and sharing of FAIR archaeological data. The document introduces the FAIR data principles, which are designed to make data more easily discoverable and usable. It also discusses the role of metadata, identifiers, standards, and repositories in making archaeological data FAIR. Guidelines are provided around each letter of FAIR to help researchers and institutions implement good research data management practices.
Towards metrics to assess and encourage FAIRness - Michel Dumontier
With the increased interest in FAIR metrics, there is a need to develop tools and approaches that can assess the FAIRness of a digital resource. This talk begins to explore some ideas in this space and invites people to participate in a working group focused on the development, application, and evaluation of FAIR metrics efforts.
The document outlines plans for the VODAN Africa FAIR data project. It discusses the FAIR principles of findability, accessibility, interoperability, and reusability and how they will guide the project. The architecture will include tools like CEDAR for machine-readable data production and a triple store for exposing metadata. An initial minimal viable product will integrate clinical data from DHIS2 to validate the approach before full deployment.
Open Data: Strategies for Research Data Management (and Planning) - Martin Donnelly
The document provides information about facilitating open science training for European research. It discusses the Digital Curation Centre (DCC), which provides guidance and services on research data management and open science. The FOSTER project aims to spread open science practices through training resources, events, and online courses. The presentation then discusses research data management (RDM), including the benefits of managing data according to FAIR principles to make it findable, accessible, interoperable, and reusable. It also covers the importance of developing data management plans (DMPs) to document how research data will be handled and preserved over its lifecycle.
FAIR data: what it means, how we achieve it, and the role of RDA - Sarah Jones
Presentation on FAIR data, the FAIR Data Action Plan developed by the European Commission Expert Group, and the role of the Research Data Alliance in implementing FAIR. The presentation was given at the RDA Finland workshop held on 6th June - https://www.csc.fi/web/training/-/rda_and_fair_supporting_finnish_researchers
Making Data FAIR (Findable, Accessible, Interoperable, Reusable) - Tom Plasterer
What to do About FAIR…
In the experience of most pharma professionals, FAIR remains fairly abstract, bordering on inconclusive. This session will outline specific case studies (real problems with real data) and address opportunities and real concerns.
Why making data Findable, Accessible, Interoperable and Reusable is important.
Talk presented at the Data Driven Drug Development (D4) conference on March 20th, 2019.
This presentation introduced participants to the DC 101 course and was given at the Digital Curation and Preservation Outreach and Capacity Building Workshop in Belfast on September 14-15 2009.
http://www.dcc.ac.uk/events/workshops/digital-curation-and-preservation-outreach-and-capacity-building-workshop
A presentation on FAIR, FAIRsharing and the FAIR ecosystem for the ENVRI-FAIR community on the 13th December 2019. This presentation covers the basics of what FAIR is, how FAIRsharing can help 'FAIRify' standards, repositories, knowledgebases and data policies, and then the connections FAIRsharing has with other initiatives, such as the FAIR Evaluator, Data Stewardship Wizard, our RDA WG, GO-FAIR and EOSC-Life.
The agenda outlines an introductory meeting to discuss FAIR technology and tools. It includes:
- Welcome and goal setting from 13:00-13:10
- Short introductions from 13:10-13:45
- A presentation on FAIR technology and tools from 13:45-14:30
- A question and answer session from 14:30-15:00
- A wrap up from 15:00
The meeting aims to introduce various organizations to FAIR principles and related technologies through a presentation and discussion.
Garret McMahon - Research Data Preservation - dri_ireland
Presentation given by Garret McMahon, DRI Research Data Specialist at DRI Community Forum June 2018 on 'Planning for the long term preservation of humanities and social science research data'.
FAIR sample and data access - David van Enckevort - Data Science NIH
This document discusses making biobank data and samples FAIR (Findable, Accessible, Interoperable, and Reusable).
It explains the four FAIR principles and provides examples of how to apply each one. To make resources findable, they need unique and persistent identifiers, rich metadata, and to be findable through other systems. To make them accessible, they need to be retrievable using open standards. To make them interoperable, standardized languages and vocabularies should be used. And to make them reusable, they need to be richly described and released with clear usage terms and provenance.
The document recommends three steps to make samples and data FAIR: include sufficient metadata using common
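As a hedged illustration of the checklist described in this abstract, one can sketch a validator that reports which FAIR-relevant elements a sample or dataset record still lacks. The field names and helper function below are invented for illustration, not drawn from a biobank standard.

```python
import re

# Hypothetical record fields: a persistent identifier (DOI), rich descriptive
# metadata, clear usage terms (license), and provenance information.
REQUIRED = ("title", "description", "license", "provenance")
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")

def missing_for_fair(record: dict) -> list:
    """Return the FAIR-relevant elements a record still lacks."""
    gaps = [field for field in REQUIRED if not record.get(field)]
    if not DOI_RE.match(record.get("doi", "")):
        gaps.append("persistent identifier (DOI)")
    return gaps

record = {"doi": "10.5072/FK2-sample", "title": "Cohort A serum samples",
          "description": "Serum samples collected 2015-2016", "license": "CC-BY-4.0"}
print(missing_for_fair(record))  # → ['provenance']
```

A report like this makes the "include sufficient metadata" advice actionable: gaps are listed explicitly rather than left implicit.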
Similar to CARARE: Can I use this data? FAIR into practice (20)
This document discusses challenges of sharing 3D cultural heritage content through Europeana. It notes that while file sizes were previously too large, HTML5 and WebGL have enabled real-time streaming of 3D models in browsers. However, the 50+ file formats used present standardization challenges. Europeana relies on adoption of standard media players to provide a good user experience. The document advocates for high-quality metadata and dissemination strategies to make 3D cultural heritage models more findable, accessible, and reusable.
3D reconstructions for story telling and understanding - CARARE
This slide deck was prepared for a webinar exploring some of the ways that 3D reconstructions are being used for story telling and to aid understanding. Following an introduction to the webinar, Daniel Pletinckx of Visual Dimension bvba gave a presentation on 'Interactive storytelling in virtual worlds', which is followed by a presentation by Catherine Cassidy of the Open Virtual Worlds group at the University of St Andrews on 'Dissemination Methods for 3D Historical Virtual Environments'.
Speaking one language: how vocabularies can help organise information - CARARE
This document discusses how vocabularies, taxonomies, thesauri, and ontologies can help organize information by providing standardized, controlled terms. It defines each type of terminology structure and provides examples. Vocabularies are lists of agreed-upon terms, while taxonomies add hierarchical relationships. Thesauri further add cross-references and relationship information. Ontologies specify additional precise relationships between different types of entities. Together, these structured vocabularies can help disambiguate terms, provide access points, and clarify relationships to make sense of varied language and terminology.
Exploiting vocabularies and Linked Data: in practice - CARARE
Presentation by Kate Fernie about how controlled vocabularies and linked data can be used in systems and services, with demonstrations of the Share3D metadata capture tool, the Europeana Archaeology Vocabulary service, and how the data looks in Europeana's EDM format and on the Europeana Collections portal.
Sharing 3D Cultural Heritage: Standards and metadata - CARARE
This document discusses standards and metadata for sharing 3D cultural heritage content. It notes that 3D cultural heritage assets vary in size and scale, and digitization uses vary from conservation to tourism. There is a lack of international standards for 3D cultural heritage, though there are industry standards and best practices. High quality metadata is important for understanding 3D content, discovery, and preservation/reuse, yet metadata for 3D is often minimal. The document advocates for 3D cultural heritage content to be findable, accessible, interoperable, and reusable (FAIR) by using common file formats, appropriate platforms, clear rights information, and adequate metadata.
CARARE is a non-profit organization that aims to advance the use of digital cultural heritage. Some of its members create 3D models of cultural artifacts and share them with Europeana. While 3D technology has advanced, standards for sharing 3D content need improvement to ensure the findability, accessibility, and reusability of 3D cultural heritage models. The document discusses challenges in sharing 3D online and provides examples of how 3D is used in research applications. It emphasizes the need for comprehensive metadata and use of open formats to maximize discovery and reuse of 3D cultural heritage content.
European databases in cultural heritage: making connections - CARARE
This document summarizes information about several European databases and initiatives for sharing cultural heritage data online. It introduces CARARE, which helps institutions share digital content with Europeana. It then discusses Europeana, a platform for over 50 million digital cultural heritage items, including 1.5 million archaeology items. The document outlines challenges of aggregating data from different sources and standards into Europeana, and how CARARE and other aggregators work to map metadata into a common format. It also introduces the ARIADNE Plus research infrastructure, which aims to support archaeology researchers through an online catalogue of datasets and related services and tools.
3D content in Europeana: the challenges of providing access - CARARE
1) Europeana is a digital platform containing over 50 million cultural heritage items from European institutions. It includes some 3D content.
2) Providing access to 3D content online has been challenging due to large file sizes and a lack of standard formats and viewers. However, technologies like Sketchfab now allow users to interact with 3D models within Europeana.
3) For 3D content to be truly accessible and reusable, standards for formats, metadata, and interoperability need to be improved so users have a consistent experience across platforms.
CARARE is a non-profit association whose main objective is advancing professional practice and fostering appreciation of the digital archaeological and architectural heritage.
Archaeology in Europeana’s publishing framework - CARARE
This document discusses Europeana's publishing framework for archaeology content. It outlines four tiers of participation, with increasing levels of content contribution and reuse potential. Tier 2 allows inclusion in showcases with attribution licensing. Tier 3 permits non-commercial reuse with licenses like CC BY-NC. Tier 4 enables commercial reuse by requiring open licenses like CC BY. The framework aims to improve discovery through high-quality thematic collections and galleries showcasing archaeology works. It encourages contributions to expand the "Archaeology" collection and engage partners.
Archaeology in Europeana quality assurance, enrichment and publishing - CARARE
The document discusses quality assurance for archaeological content being published in Europeana. It describes Europeana's publishing framework which specifies requirements for metadata, content format and rights labelling. The process of monitoring quality involves validating metadata at various stages, from mapping records to schemas through transformation and enrichment. Common issues found are missing mandatory metadata fields, and unnecessary or duplicate data. Quality can be improved through reviewing errors and making updates.
Europeana Archaeology is a project running from February 2019 to July 2020 with the objectives of increasing the quality and quantity of archaeology collections available through Europeana. It involves 16 partners from 14 European countries who will add new open access collections containing objects, sites, archives and multimedia resources from across prehistory to the post-medieval period. The project will develop vocabularies and enrichment services to improve discovery of the collections and make them more multilingual. All content will be added to Europeana's Archaeology thematic collection and showcased through exhibitions to promote reuse of the cultural heritage resources.
CARARE is a non-profit association established in 2016 to advance digital archaeological and architectural heritage by providing tools and support for members to publish datasets to Europeana, participate in workshops and events, join an expertise register, and shape the activities and use of income of the association. Membership is open to institutions at different fee levels from 50 to 200 euros depending on size, as well as individual members at 75 euros, and provides access to software, workshops, and ability to vote at general meetings.
The document discusses the development of the CARARE 2.0 metadata schema. The schema was updated based on lessons learned from supplying data to Europeana and requirements for documenting 3D content from projects like 3D-ICONS. The main changes in CARARE 2.0 include broadening the "Heritage Asset" scope, simplifying references and provenance, and adding elements to document activities, provenance, and paradata needed for quality 3D models. The schema distinguishes heritage assets, digital resources, and activities, and allows them to be related to fully document objects and their digital surrogates.
Achieving interoperability between the CARARE schema for monuments and sites ... - CARARE
Presentation by:
Valentine Charles, Kate Fernie, Antoine Isaac, Dimitris Gavrilis, Stavros Angelis and Costis Dallas
EuropeanaTech Conference
February 2015
How and why people today engage with the archaeological heritage and scholarl... - CARARE
Presentation given by Rimvydas Laužikas, Costis Dallas, Suzie Thomas, Ingrida Kelpšienė, Isto Huvila, Pedro Luengo, Helena Nobre, Marina Toumpouri, Vykintas Vaitkevičius
at:
Archaeology and Architecture in Europeana
28 June 2019, Amersfoort, Netherlands
An introduction to the PARTHENOS guidelines to FAIRify data management and ma... - CARARE
The document introduces guidelines created by the PARTHENOS project to help make research data more FAIR (Findable, Accessible, Interoperable, Reusable). The PARTHENOS project aims to strengthen cooperation in various humanities fields. The guidelines were developed based on interviews, surveys, and a review of over 100 data management policies. They contain 20 recommendations structured around the FAIR principles to help researchers and repositories better manage research data. The guidelines are available online and will be translated and used to develop workshops.
The everyday reality behind the iron curtain - CARARE
The document examines over 2,000 images from Lithuanian archaeological surveys between 1949-1967 to understand everyday life. It describes the difficult conditions faced by archaeologists during the Soviet occupation, including a lack of tools, equipment and funding. The images show how archaeologists improvised and collaborated with local communities, facing challenges like damaged sites but also finding benefits like fresh vegetables grown on excavation plots. The conclusions are that the images provide valuable insights while more remains to be uncovered about everyday realities during this period.
Presentation by Henk Alkemade of the Dutch national heritage agency describing how past adaptations to the natural challenges of water management in the Netherlands, and historic records are inspiring solutions to present problems. Climate change is leading to both summer droughts, which can reveal archaeological sites through cropmarks, and seasonal downpours causing flooding. Changes in sea-level are causing soil subsidence and at the same time the land is becoming wetter; images of historical sites show how the land surface has dropped and the water level has risen. Past adaptations to the natural environment included settlement of high points still visible in the landscape today and the creation of a series of dikes, canals and windmills to move water from the land. Series of historical maps show the creation of water systems in cities in the Netherlands - and the infilling of canals in more recent times. Contemporary responses to rising sea levels include re-instating historic canals, drainage systems and water cellars/cisterns to hold back flood water. Measures to increase the biodiversity and to adapt the landuse to the new water levels are all playing a part in management of the historical landscapes of the Netherlands.
Archaeology in the Europeana publishing framework - CARARE
This presentation describes Europeana's publishing framework, the tiers for content and metadata, and what this means for organisations providing digital content for archaeology and architecture.
In our second session, we shall learn all about the main features and fundamentals of UiPath Studio that enable us to use the building blocks for any automation project.
📕 Detailed agenda:
Variables and Datatypes
Workflow Layouts
Arguments
Control Flows and Loops
Conditional Statements
💻 Extra training through UiPath Academy:
Variables, Constants, and Arguments in Studio
Control Flow in Studio
Must-Know Postgres Extensions for DBAs and Developers during Migration - Mydbops
Mydbops Opensource Database Meetup 16
Topic: Must-Know PostgreSQL Extensions for Developers and DBAs During Migration
Speaker: Deepak Mahto, Founder of DataCloudGaze Consulting
Date & Time: 8th June | 10 AM - 1 PM IST
Venue: Bangalore International Centre, Bangalore
Abstract: Discover how PostgreSQL extensions can be your secret weapon! This talk explores how key extensions enhance database capabilities and streamline the migration process for users moving from other relational databases like Oracle.
Key Takeaways:
* Learn about crucial extensions like oracle_fdw, pgtt, and pg_audit that ease migration complexities.
* Gain valuable strategies for implementing these extensions in PostgreSQL to achieve license freedom.
* Discover how these key extensions can empower both developers and DBAs during the migration process.
* Don't miss this chance to gain practical knowledge from an industry expert and stay updated on the latest open-source database trends.
Mydbops Managed Services specializes in taking the pain out of database management while optimizing performance. Since 2015, we have been providing top-notch support and assistance for the top three open-source databases: MySQL, MongoDB, and PostgreSQL.
Our team offers a wide range of services, including assistance, support, consulting, 24/7 operations, and expertise in all relevant technologies. We help organizations improve their database's performance, scalability, efficiency, and availability.
Contact us: info@mydbops.com
Visit: https://www.mydbops.com/
Follow us on LinkedIn: https://in.linkedin.com/company/mydbops
For more details and updates, please follow the links below.
Meetup Page : https://www.meetup.com/mydbops-databa...
Twitter: https://twitter.com/mydbopsofficial
Blogs: https://www.mydbops.com/blog/
Facebook(Meta): https://www.facebook.com/mydbops/
In the realm of cybersecurity, offensive security practices act as a critical shield. By simulating real-world attacks in a controlled environment, these techniques expose vulnerabilities before malicious actors can exploit them. This proactive approach allows manufacturers to identify and fix weaknesses, significantly enhancing system security.
This presentation delves into the development of a system designed to mimic Galileo's Open Service signal using software-defined radio (SDR) technology. We'll begin with a foundational overview of both Global Navigation Satellite Systems (GNSS) and the intricacies of digital signal processing.
The presentation culminates in a live demonstration. We'll showcase the manipulation of Galileo's Open Service pilot signal, simulating an attack on various software and hardware systems. This practical demonstration serves to highlight the potential consequences of unaddressed vulnerabilities, emphasizing the importance of offensive security practices in safeguarding critical infrastructure.
"Scaling RAG Applications to serve millions of users", Kevin Goedecke - Fwdays
How we managed to grow and scale a RAG application from zero to thousands of users in 7 months. Lessons from technical challenges around managing high load for LLMs, RAGs and Vector databases.
High performance Serverless Java on AWS - GoTo Amsterdam 2024 - Vadym Kazulkin
Java is for many years one of the most popular programming languages, but it used to have hard times in the Serverless community. Java is known for its high cold start times and high memory footprint, comparing to other programming languages like Node.js and Python. In this talk I'll look at the general best practices and techniques we can use to decrease memory consumption, cold start times for Java Serverless development on AWS including GraalVM (Native Image) and AWS own offering SnapStart based on Firecracker microVM snapshot and restore and CRaC (Coordinated Restore at Checkpoint) runtime hooks. I'll also provide a lot of benchmarking on Lambda functions trying out various deployment package sizes, Lambda memory settings, Java compilation options and HTTP (a)synchronous clients and measure their impact on cold and warm start times.
How information systems are built or acquired puts information, which is what they should be about, in a secondary place. Our language adapted accordingly, and we no longer talk about information systems but applications. Applications evolved in a way to break data into diverse fragments, tightly coupled with applications and expensive to integrate. The result is technical debt, which is re-paid by taking even bigger "loans", resulting in an ever-increasing technical debt. Software engineering and procurement practices work in sync with market forces to maintain this trend. This talk demonstrates how natural this situation is. The question is: can something be done to reverse the trend?
ScyllaDB is making a major architecture shift. We’re moving from vNode replication to tablets – fragments of tables that are distributed independently, enabling dynamic data distribution and extreme elasticity. In this keynote, ScyllaDB co-founder and CTO Avi Kivity explains the reason for this shift, provides a look at the implementation and roadmap, and shares how this shift benefits ScyllaDB users.
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc... - DanBrown980551
This LF Energy webinar took place June 20, 2024. It featured:
-Alex Thornton, LF Energy
-Hallie Cramer, Google
-Daniel Roesler, UtilityAPI
-Henry Richardson, WattTime
In response to the urgency and scale required to effectively address climate change, open source solutions offer significant potential for driving innovation and progress. Currently, there is a growing demand for standardization and interoperability in energy data and modeling. Open source standards and specifications within the energy sector can also alleviate challenges associated with data fragmentation, transparency, and accessibility. At the same time, it is crucial to consider privacy and security concerns throughout the development of open source platforms.
This webinar will delve into the motivations behind establishing LF Energy’s Carbon Data Specification Consortium. It will provide an overview of the draft specifications and the ongoing progress made by the respective working groups.
Three primary specifications will be discussed:
-Discovery and client registration, emphasizing transparent processes and secure and private access
-Customer data, centering around customer tariffs, bills, energy usage, and full consumption disclosure
-Power systems data, focusing on grid data, inclusive of transmission and distribution networks, generation, intergrid power flows, and market settlement data
Main news related to the CCS TSI 2023 (2023/1695) - Jakub Marek
An English 🇬🇧 translation of the presentation accompanying the speech I gave about the main changes brought by CCS TSI 2023 at the biggest Czech conference on Communications and signalling systems on Railways, which was held in Clarion Hotel Olomouc from 7th to 9th November 2023 (konferenceszt.cz). Attended by around 500 participants and 200 online followers.
The original Czech 🇨🇿 version of the presentation can be found here: https://www.slideshare.net/slideshow/hlavni-novinky-souvisejici-s-ccs-tsi-2023-2023-1695/269688092 .
The videorecording (in Czech) from the presentation is available here: https://youtu.be/WzjJWm4IyPk?si=SImb06tuXGb30BEH .
From Natural Language to Structured Solr Queries using LLMs - Sease
This talk draws on experimentation to enable AI applications with Solr. One important use case is to use AI for better accessibility and discoverability of the data: while User eXperience techniques, lexical search improvements, and data harmonization can take organizations to a good level of accessibility, a structural (or “cognitive” gap) remains between the data user needs and the data producer constraints.
That is where AI – and most importantly, Natural Language Processing and Large Language Model techniques – could make a difference. This natural language, conversational engine could facilitate access and usage of the data leveraging the semantics of any data source.
The objective of the presentation is to propose a technical approach and a way forward to achieve this goal.
The key concept is to enable users to express their search queries in natural language, which the LLM then enriches, interprets, and translates into structured queries based on the Solr index’s metadata.
This approach leverages the LLM’s ability to understand the nuances of natural language and the structure of documents within Apache Solr.
The LLM acts as an intermediary agent, offering a transparent experience to users automatically and potentially uncovering relevant documents that conventional search methods might overlook. The presentation will include the results of this experimental work, lessons learned, best practices, and the scope of future work that should improve the approach and make it production-ready.
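The translation step described above can be sketched at its final stage. In this hypothetical snippet the LLM call itself is omitted: we assume the model has already interpreted the user's natural-language question and emitted structured JSON, and only show how such output might be turned into Solr query parameters. The index field names are invented for illustration.

```python
from urllib.parse import urlencode

def to_solr_params(llm_output: dict) -> str:
    """Map hypothetical LLM-produced structure to Solr q/fq/sort parameters."""
    params = [("q", llm_output.get("q", "*:*"))]
    # Each extracted filter becomes a Solr filter query (fq).
    for field, value in llm_output.get("filters", {}).items():
        params.append(("fq", f'{field}:"{value}"'))
    if "sort" in llm_output:
        params.append(("sort", llm_output["sort"]))
    return urlencode(params)

# e.g. for "recent material about data interoperability in archaeology":
structured = {"q": "data interoperability",
              "filters": {"subject": "archaeology"},
              "sort": "publish_date desc"}
print(to_solr_params(structured))
```

Keeping this mapping layer deterministic (only the interpretation is delegated to the LLM) makes the resulting queries auditable against the index's metadata.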
Essentials of Automations: Exploring Attributes & Automation Parameters - Safe Software
Building automations in FME Flow can save time, money, and help businesses scale by eliminating data silos and providing data to stakeholders in real-time. One essential component to orchestrating complex automations is the use of attributes & automation parameters (both formerly known as “keys”). In fact, it’s unlikely you’ll ever build an Automation without using these components, but what exactly are they?
Attributes & automation parameters enable the automation author to pass data values from one automation component to the next. During this webinar, our FME Flow Specialists will cover leveraging the three types of these output attributes & parameters in FME Flow: Event, Custom, and Automation. As a bonus, they’ll also be making use of the Split-Merge Block functionality.
You’ll leave this webinar with a better understanding of how to maximize the potential of automations by making use of attributes & automation parameters, with the ultimate goal of setting your enterprise integration workflows up on autopilot.
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an... - Jason Yip
The typical problem in product engineering is not bad strategy, so much as “no strategy”. This leads to confusion, lack of motivation, and incoherent action. The next time you look for a strategy and find an empty space, instead of waiting for it to be filled, I will show you how to fill it in yourself. If you’re wrong, it forces a correction. If you’re right, it helps create focus. I’ll share how I’ve approached this in the past, both what works and lessons for what didn’t work so well.
This talk will cover ScyllaDB Architecture from the cluster-level view and zoom in on data distribution and internal node architecture. In the process, we will learn the secret sauce used to get ScyllaDB's high availability and superior performance. We will also touch on the upcoming changes to ScyllaDB architecture, moving to strongly consistent metadata and tablets.
QA or the Highway - Component Testing: Bridging the gap between frontend appl... - zjhamm304
These are the slides for the presentation, "Component Testing: Bridging the gap between frontend applications" that was presented at QA or the Highway 2024 in Columbus, OH by Zachary Hamm.
"What does it really mean for your system to be available, or how to define w... - Fwdays
We will talk about system monitoring from a few different angles. We will start by covering the basics, then discuss SLOs, how to define them, and why understanding the business well is crucial for success in this exercise.
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf - Chart Kalyan
A Mix Chart displays historical data of numbers in a graphical or tabular form. The Kalyan Rajdhani Mix Chart specifically shows the results of a sequence of numbers over different periods.
CARARE: Can I use this data? FAIR into practice
1. Can I access and use this data?
FAIR into practice.
Hella Hollander, Head Data Archive DANS
2. What I will present:
• Quality (trustworthiness) of data repositories
• Quality (fitness for use) of datasets
• FAIR into practice
• Europeana and re-use of Cultural Heritage Data
Fit for Purpose?
3. Data Archiving and Networked Services
• Established in 2005; predecessors dating back to 1964 (Steinmetz Foundation)
• Institute of the Royal Netherlands Academy of Arts and Sciences (KNAW)
• Co-founded by the Netherlands Organization for Scientific Research (NWO)
• Objective: permanent preservation of, and enabling access to, scientific research data
4. DANS is about keeping data FAIR
• Institute of the Dutch Academy and Research Funding Organisation (KNAW & NWO) since 2005
• First predecessor dates back to 1964 (Steinmetz Foundation); Historical Data Archive 1989
• Mission: promote and provide permanent access to digital research resources
5. DANS key services
• DataverseNL: supports data storage during research and until 10 years after
• NARCIS: portal aggregating research information and institutional repositories
• EASY: certified long-term archive
https://dans.knaw.nl
6. DANS and DSA
• 2005: DANS to promote and provide permanent access to
digital research resources
• Formulate quality guidelines for digital repositories including
DANS
• 2006: 5 basic principles as basis for 16 DSA guidelines
• 2009: international DSA Board
• Almost 70 seals acquired around the globe, but with a focus
on Europe
7. The Certification Pyramid
• ISO 16363:2012 – Audit and certification of trustworthy digital repositories: http://www.iso16363.org/
• DIN 31644 – “Criteria for trustworthy digital archives”: http://www.langzeitarchivierung.de
• Data Seal of Approval: http://www.datasealofapproval.org/
• ICSU World Data System: https://www.icsu-wds.org/
8. DSA and WDS: lookalikes
Commonalities:
• Lightweight, self-assessment, community review
Complementarity:
• Geographical spread
• Disciplinary spread
9. Partnership
Goals:
• Realizing efficiencies
• Simplifying assessment options
• Stimulating more certifications
• Increasing impact on the community
Outcomes:
• Common catalogue of requirements for core repository
assessment
• Common procedures for assessment
• Shared testbed for assessment
10. New common requirements: CoreTrustSeal
18 requirements:
• Context (1)
• Organizational infrastructure (6)
• Digital object management (8)
• Technology (2)
• Additional information and
applicant feedback (1)
11. Requirements (indirectly) dealing with data quality
R2. The repository maintains all applicable licenses covering data access and use and
monitors compliance.
R3. The repository has a continuity plan to ensure ongoing access to and preservation
of its holdings.
R4. The repository ensures, to the extent possible, that data are created, curated,
accessed, and used in compliance with disciplinary and ethical norms.
R7. The repository guarantees the integrity and authenticity of the data.
12. Requirements (indirectly) dealing with data quality
R8. The repository accepts data and metadata based on defined criteria to
ensure relevance and understandability for data users.
R10. The repository assumes responsibility for long-term preservation and
manages this function in a planned and documented way.
R11. The repository has appropriate expertise to address technical data and
metadata quality and ensures that sufficient information is available for end
users to make quality-related evaluations.
R13. The repository enables users to discover the data and refer to them in a
persistent way through proper citation.
R14. The repository enables reuse of the data over time, ensuring that
appropriate metadata are available to support the understanding and use of the
data.
13. Resemblance DSA – FAIR principles

DSA principles (for data repositories)    FAIR principles (for datasets)
data can be found on the internet         Findable
data are accessible                       Accessible
data are in a usable format               Interoperable
data are reliable                         Reusable
data can be referred to (citable)         –

The resemblance is not perfect:
• usable format (DSA) is an aspect of interoperability (FAIR)
• FAIR explicitly addresses machine readability
• etc.
A certified TDR already offers a baseline data quality level
14. Implementing FAIR Principles
See: http://datafairport.org/fair-principles-living-document-menu and
https://www.force11.org/group/fairgroup/fairprinciples
15. FAIR Data Principles
In the FAIR Data approach, data should be:
Findable – Easy to find by both humans and computer systems and based on
mandatory description of the metadata that allow the discovery of interesting
datasets;
Accessible – Stored for long term such that they can be easily accessed and/or
downloaded with well-defined license and access conditions (Open Access when
possible), whether at the level of metadata, or at the level of the actual data
content;
Interoperable – Ready to be combined with other datasets by humans as well as
computer systems;
Reusable – Ready to be used for future research and to be processed further
using computational methods.
16. Accessible: Implementing FAIR
Examples:
• (Meta)data should be as open as possible and as closed as necessary
• Protected data and personal data must be available through a controlled and documented procedure. Information that needs to be protected, for example for privacy reasons, should not be part of the publicly accessible (meta)data but should be recorded as part of the documentation of the resource in restricted contexts.
• To be fully accessible, research data should be available via (free) exchange protocols.
• Maintain the integrity and quality of data. This is a general principle that emerged in particular from the interviews with historians. It refers to the necessity of maintaining the richness and the context of the data created and collected over time.
17. Combine and operationalize: DSA & FAIR
• Growing demand for quality criteria for
research datasets and ways to assess their
fitness for use
• Combine the principles of core repository
certification and FAIR
• Use the principles as quality criteria:
• Core certification – digital repositories
• FAIR principles – research data (sets)
• Operationalize the principles as an
instrument to assess FAIRness of existing
datasets in certified TDRs
18. Different implementations of FAIR
• Requirements for new data creation
• Establishing the profile for existing data
• Transformation tools to make data FAIR (Go-FAIR initiative)
19. FAIR badge scheme
• Proxy for data “quality” or “fitness for (re-)use”
• Prevent interactions among dimensions to ease scoring
• Consider Reusability as the resultant of the other three:
  – the average FAIRness as an indicator of data quality
  – (F + A + I) / 3 = R
• Manual and automatic scoring
[Badge mock-up: F A I R scores, 2 user reviews, 1 archivist assessment, 24 downloads]
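The (F + A + I) / 3 = R rule is plain arithmetic. A minimal Python sketch, assuming each dimension is scored on the 1–5 scale used later in the talk (the function name and rounding are illustrative assumptions):

```python
def fair_badge(f: int, a: int, i: int) -> float:
    """Compute R as the average of F, A and I, per (F + A + I) / 3 = R."""
    for score in (f, a, i):
        if not 1 <= score <= 5:
            raise ValueError("each dimension is scored on a 1-5 scale")
    return round((f + a + i) / 3, 1)

# Example: F=4 (PID with sufficient metadata), A=5 (open access),
# I=3 (preferred format)
print(fair_badge(4, 5, 3))  # 4.0
```

Treating R as a derived value is the design choice the slide argues for: it avoids scoring Reusability directly, which the later slides note is partly subjective.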
20. Some unresolved FAIR complications:
1. Dependencies among dimensions, difficulty to measure the criteria,
no rank order from “low” to “high” FAIRness, grouping of criteria
under dimensions is disputable
2. Do we need or want additional dimensions, principles or criteria?
• Is “openness” a separate dimension, not included in FAIR?
• Is it desirable/possible to say something about “substantive” data quality, such as
the accuracy/precision or correctness of the data?
• What about the long-term access? For how long does data remain FAIR?
• Should data security be included?
3. Several FAIR criteria can be solved at the level of the repository
4. Do we need separate FAIR criteria for different disciplines?
• e.g. machine-actionable data are more important in some fields than in others;
note that data accessibility by machines is partly defined by technical specs (A1),
partly by licenses (R1.1)
21. First we attempted to operationalise R – Reusable as well… but we changed our mind

Reusable – is it a separate dimension? Partly subjective: it depends on what you want to use the data for!

Idea for operationalization                                   Solution
R1. plurality of accurate and relevant attributes             F
    (≈ F2: “data are described with rich metadata”)
R1.1. clear and accessible data usage license                 A
R1.2. provenance (for replication and reuse)                  F
R1.3. meet domain-relevant community standards                I
Data is in a TDR – unsustained data will not remain usable    Aspect of repository (Data Seal of Approval)
Explication of how data was or can be used is available       F
Data is automatically usable by machines                      I
22. Findable (defined by metadata (PID included) and documentation)
1. No PID nor metadata/documentation
2. PID without or with insufficient metadata
3. Sufficient/limited metadata without PID
4. PID with sufficient metadata
5. Extensive metadata and rich additional documentation available
Accessible (defined by presence of user license)
1. Neither metadata nor data are accessible
2. Metadata are accessible but data are not (no clear terms of reuse in license)
3. User restrictions apply (i.e. privacy, commercial interests, embargo period)
4. Public access (after registration)
5. Open access unrestricted
Interoperable (defined by data format)
1. Proprietary (privately owned), non-open format data
2. Proprietary format, accepted by Certified Trustworthy Data Repository
3. Non-proprietary, open format = ‘preferred format’
4. As well as in the preferred format, data is standardised using a standard
vocabulary format (for the research field to which the data pertain)
5. Data additionally linked to other data to provide context
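The three five-level scales above can be encoded as a small lookup table. A sketch with labels abbreviated from the slide; the encoding itself is an illustration, not part of the FAIRDAT tool:

```python
# Abbreviated labels for the five levels of each operationalised dimension.
FAIR_LEVELS = {
    "F": ["no PID, no metadata", "PID, insufficient metadata",
          "metadata without PID", "PID with sufficient metadata",
          "extensive metadata and rich documentation"],
    "A": ["neither metadata nor data accessible", "metadata only",
          "user restrictions apply", "public access after registration",
          "open access, unrestricted"],
    "I": ["proprietary, non-open format", "proprietary, repository-accepted",
          "non-proprietary open (preferred) format",
          "preferred format plus standard vocabulary",
          "linked to other data to provide context"],
}

def describe(dimension: str, score: int) -> str:
    """Return the rubric label for a 1-5 score on F, A or I."""
    return FAIR_LEVELS[dimension][score - 1]

print(describe("I", 3))  # non-proprietary open (preferred) format
```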
23. Creating a FAIR data assessment tool
Using an online questionnaire system
Prototype:
https://www.surveymonkey.com/r/fairdat
24. Website FAIRDAT
• To contain FAIR data
assessments from any
repository or website,
linking to the location of
the data set via
(persistent) identifier
• The repository can show
the resultant badge,
linking back to the
FAIRDAT website
[Badge mock-up: F A I R scores, 2 user reviews, 1 archivist assessment, 24 downloads]
Neutral, Independent
Analogous to DSA website
25. Display FAIR badges in any repository (Zenodo,
Dataverse, Mendeley Data, figshare, B2SAFE, …)
26. Can FAIR Data Assessment be automatic?

Criterion                        Automatic?  Subjective?  Comments
                                 (Y/N/Semi)  (Y/N/Semi)
F1 No PID / No metadata          Y           N            Solved by repository
F2 PID / Insufficient metadata   S           S            “Insufficient metadata” is subjective
F3 No PID / Sufficient metadata  S           S            “Sufficient metadata” is subjective
F4 PID / Sufficient metadata     S           S            “Sufficient metadata” is subjective
F5 PID / Rich metadata           S           S            “Rich metadata” is subjective
A1 No license / No access        Y           N            Solved by repository
A2 Metadata accessible           Y           N            Solved by repository
A3 User restrictions             Y           N            Solved by repository
A4 Public access                 Y           N            Solved by repository
A5 Open access                   Y           N            Solved by repository
I1 Proprietary format            S           N            Depends on list of proprietary formats
I2 Accepted format               S           S            Depends on list of accepted formats
I3 Archival format               S           S            Depends on list of archival formats
I4 + Harmonized                  N           S            Depends on domain vocabularies
I5 + Linked                      S           N            Depends on semantic methods used

Optional: qualitative assessment / data review
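The “Solved by Repository” and “Depends on list” rows suggest which parts of the score a repository could compute unattended. A hypothetical sketch of such automatic checks; the DOI pattern, format lists, and function names are illustrative assumptions, not from the talk:

```python
import re
from typing import Optional

# Hypothetical format lists; the real lists are repository-specific.
PREFERRED_FORMATS = {"text/csv", "application/xml", "application/pdf"}
ACCEPTED_FORMATS = PREFERRED_FORMATS | {"application/vnd.ms-excel"}

# Simplified DOI shape, e.g. "10.17026/dans-xyz".
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

def auto_findable(pid: Optional[str], has_metadata: bool) -> int:
    """Automatic part of the F score: PID presence is machine-checkable;
    metadata *sufficiency* still needs a human (the 'Semi' rows)."""
    has_pid = bool(pid and DOI_PATTERN.match(pid))
    if not has_pid and not has_metadata:
        return 1
    if has_pid and not has_metadata:
        return 2
    if not has_pid:
        return 3
    return 4  # levels 4-5 require a subjective metadata review

def auto_interoperable(mime_type: str) -> int:
    """Automatic part of the I score, from the format lists."""
    if mime_type in PREFERRED_FORMATS:
        return 3
    if mime_type in ACCEPTED_FORMATS:
        return 2
    return 1  # levels 4-5 depend on vocabularies and linking

print(auto_findable("10.17026/dans-xyz", True))  # 4
print(auto_interoperable("text/csv"))            # 3
```

This mirrors the table's split: Y rows are fully decided by the code, S rows only get a machine-computed lower bound that a reviewer can raise.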
27. Open and FAIR Data in Trusted Data Repositories
Data does not only need to be Open
Data must also be FAIR
– Findable, Accessible, Interoperable, Reusable
– And must remain so, and therefore should be preserved in a DSA
Certified Trusted Digital Repository
28. Perfect Couple
• FAIR principles for data quality: a minimal set of community-agreed guiding principles to make data more easily findable, accessible, appropriately integrated and re-usable, and adequately citable
• DSA criteria for the quality of TDRs
• A perfect couple for quality assessment of research data and trustworthy data repositories
• Ideally, a DSA-certified archive will contain FAIR data
34. Comparison Europeana and DANS

Europeana                               DANS
Public domain                           CC0
CC licenses                             CC-BY is very close to “Open Access Registered Users”
In copyright                            All categories except CC0
Orphan work                             Formally not existing
Unknown                                 Exists
No Copyright – NonCommercial Use only   Not applicable
35. Europeana
In Copyright:
This work is protected by copyright and/or related rights.
Access/re-use: You are free to use this work in any way that is
permitted by the copyright and related rights legislation that
applies to your use. For other use you need to obtain
permission from the rights holder(s).
36. DANS
Open Access for registered users:
The objects/data are, without further restrictions, only made
available to all registered EASY users. Any existing copyrights
and/or database rights are respected.
Access/re-use: You are free to use this work in any way that is
allowed by the copyright and related rights legislation that
applies to your use, but only after user registration. A
registered user is permitted to cite the data to a limited extent
in publications. For other use you need to obtain permission from
the rights holder(s).
37. Can I access and use it?
Clarification of the DANS licence agreement:
It is allowed to:
• copy a dataset for your own use
• cite from the dataset to a limited degree in publications, with a bibliographic reference to the dataset
It is not allowed to:
• distribute the dataset, i.e. (re)publish the dataset as a whole
40. Thank you for listening!
Hella.hollander@dans.knaw.nl
www.dans.knaw.nl
http://www.dtls.nl/go-fair/
https://eudat.eu/events/webinar/fair-data-in-trustworthy-data-repositories-
webinar