This document discusses the FAIR data principles and their increasing adoption. It begins by explaining the 15 FAIR principles for findable, accessible, interoperable and reusable data. It then discusses how adoption is growing through funder requirements, the role of FAIR within EOSC, and related projects. However, it notes that most data is still not managed or shared according to FAIR principles, owing to barriers such as the time and effort required and the lack of incentives and rewards. The document argues that both cultural and technical aspects must be addressed to fully implement FAIR.
An overview of FAIR Data and FAIR Data stewardship, and the roadmap for FAIR Data solutions coordinated by the Dutch Techcentre for Life Sciences. This presentation was given at the Netherlands eScience Center's "Essential skills in data-intensive research" course week.
University of Liverpool Researcher KnowHow session presented by Judith Carr.
At the end of this session you will know what the FAIR data principles are, what is required and be in a position to think how these would relate to your research practice.
An introduction to the FAIR principles and a discussion of key issues that must be addressed to ensure data is findable, accessible, interoperable and reusable. The session explored the role of the CDISC and DDI standards for addressing these issues.
Presented by Gareth Knight at the ADMIT Network conference, organised by the Association for Data Management in the Tropics, in Antwerp, Belgium on December 1st 2015.
FAIR data: what it means, how we achieve it, and the role of RDA - Sarah Jones
Presentation on FAIR data, the FAIR Data Action Plan developed by the European Commission Expert Group and the role of the Research Data Alliance in implementing FAIR. The presentation was given at the RDA Finland workshop held on 6th June - https://www.csc.fi/web/training/-/rda_and_fair_supporting_finnish_researchers
Presentation given at Macquarie University in support of the ARDC 'institutional role in the data commons' project on "Implementing FAIR: Standards in Research Data Management" https://ardc.edu.au/news/data-and-services-discovery-activities-successful-applicants/
FAIRy stories: the FAIR Data principles in theory and in practice - Carole Goble
https://ucsb.zoom.us/meeting/register/tZYod-ippz4pHtaJ0d3ERPIFy2QIvKqjwpXR
FAIRy stories: the FAIR Data principles in theory and in practice
The ‘FAIR Guiding Principles for scientific data management and stewardship’ [1] launched a global dialogue within research and policy communities and started a journey to wider accessibility and reusability of data and preparedness for automation-readiness (I am one of the army of authors). Over the past 5 years FAIR has become a movement, a mantra and a methodology for scientific research and increasingly in the commercial and public sector. FAIR is now part of NIH, European Commission and OECD policy. But just figuring out what the FAIR principles really mean and how we implement them has proved more challenging than one might have guessed. To quote the novelist Rick Riordan “Fairness does not mean everyone gets the same. Fairness means everyone gets what they need”.
As a data infrastructure wrangler I lead and participate in projects implementing forms of FAIR in pan-national European biomedical Research Infrastructures. We apply web-based, industry-led approaches like Schema.org; work with big pharma on specialised FAIRification pipelines for legacy data; promote FAIR by Design methodologies and platforms in the research lab; and expand the principles of FAIR beyond data to computational workflows and digital objects. Many use Linked Data approaches.
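To make the Schema.org approach mentioned above concrete, here is a minimal sketch of a Schema.org `Dataset` description emitted as JSON-LD, the kind of markup that makes a dataset findable by web-scale search. Every field value below is an invented placeholder, not taken from any project named in this abstract.

```python
import json

# A minimal Schema.org "Dataset" description for web findability.
# All values are illustrative placeholders.
dataset = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example assay results",
    "description": "An illustrative dataset described for web findability.",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "keywords": ["FAIR", "biomedicine"],
}

# Serialise to JSON-LD, ready to embed in a landing page's <script> tag.
jsonld = json.dumps(dataset, indent=2)
print(jsonld)
```

Crawlers and dataset search engines read exactly this structure from a dataset's landing page, which is why lightweight markup of this kind is attractive for retrofitting findability onto existing repositories.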
In this talk I’ll use some of these projects to shine some light on the FAIR movement. Spoiler alert: although there are technical issues, the greatest challenges are social. FAIR is a team sport. Knowledge Graphs play a role – not just as consumers of FAIR data but as active contributors. To paraphrase another novelist, “It is a truth universally acknowledged that a Knowledge Graph must be in want of FAIR data.”
[1] Wilkinson, M., Dumontier, M., Aalbersberg, I. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018 (2016). https://doi.org/10.1038/sdata.2016.18
Data Profiling, Data Catalogs and Metadata Harmonisation - Alan McSweeney
These notes discuss the related topics of Data Profiling, Data Catalogs and Metadata Harmonisation. They describe a detailed structure for data profiling activities and identify various open source and commercial tools and data profiling algorithms. Data profiling is a necessary prerequisite for constructing a data catalog: a data catalog makes an organisation's data more discoverable, and the data collected during profiling forms the metadata contained in the catalog. This assists with ensuring data quality and is also a necessary activity for Master Data Management initiatives. These notes describe a metadata structure and provide details on metadata standards and sources.
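The profiling-to-catalog flow described above can be sketched as a small function that computes per-column statistics and returns them as a catalog entry. This is a minimal illustration with invented column names, not any tool's actual API; real profilers collect many more measures (patterns, ranges, dependencies).

```python
from collections import Counter

def profile_column(name, values):
    """Profile a single column; a sketch of the statistics a real
    profiling tool would collect as catalog metadata."""
    non_null = [v for v in values if v is not None]
    types = Counter(type(v).__name__ for v in non_null)
    return {
        "column": name,
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(set(non_null)),
        "inferred_type": types.most_common(1)[0][0] if types else "unknown",
    }

# The resulting dictionary becomes one catalog entry's technical metadata.
entry = profile_column("customer_id", [101, 102, 102, None, 103])
```

Here `entry` records five rows, one null and three distinct integer values, which is exactly the kind of metadata a catalog surfaces to make the column discoverable and its quality assessable.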
EUDAT & OpenAIRE Webinar: How to write a Data Management Plan - July 14, 2016... - EUDAT
www.eudat.eu | 2nd Session: July 14, 2016.
In this webinar, Sarah Jones (DCC) and Marjan Grootveld (DANS) talked through the aspects that Horizon 2020 requires from a DMP. They discussed examples from real DMPs and also touched upon the Software Management Plan, which for some projects can be a sensible addition.
FAIR data in trustworthy repositories: the basics - OpenAIRE
This video illustrates how certified digital repositories contribute to making and keeping research data findable, accessible, interoperable and reusable (FAIR). Trustworthy repositories support Open Access to data, as well as Restricted Access when necessary, and they offer support for metadata, sustainable and interoperable file formats, and persistent identifiers for future citation. Presented by Marjan Grootveld (DANS, OpenAIRE).
Main references
• Core Trust Seal for trustworthy digital repositories: https://www.coretrustseal.org/
• EUDAT FAIR checklist: https://doi.org/10.5281/zenodo.1065991
• European Commission’s Guidelines on FAIR data management: http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
• FAIR data principles: www.force11.org/group/fairgroup/fairprinciples
• Overview of metadata standards and tools: https://rdamsc.dcc.ac.uk/
How to Use a Semantic Layer to Deliver Actionable Insights at Scale - DATAVERSITY
Learn about using a semantic layer to enable actionable insights for everyone and streamline data and analytics access throughout your organization. This session will offer practical advice based on a decade of experience making semantic layers work for Enterprise customers.
Attend this session to learn about:
- Delivering critical business data to users faster than ever at scale using a semantic layer
- Enabling data teams to model and deliver a semantic layer on data in the cloud
- Maintaining a single source of governed metrics and business data
- Achieving speed-of-thought query performance and consistent KPIs across any BI/AI tool, including Excel, Power BI, Tableau, Looker, DataRobot, Databricks and more
- Providing dimensional analysis capability that accelerates performance with no need to extract data from the cloud data warehouse
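The "single source of governed metrics" idea above can be illustrated with a tiny sketch: a metric is defined once and rendered into the same SQL for every consuming tool. The class, metric name and SQL template below are assumptions for illustration, not any vendor's actual semantic-layer API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Metric:
    """One governed metric, defined once and rendered identically
    for every consuming BI tool."""
    name: str
    expression: str  # aggregate expression over the modeled table
    table: str

    def to_sql(self, dimensions):
        # Every tool that asks for this metric gets the same definition.
        dims = ", ".join(dimensions)
        return (f"SELECT {dims}, {self.expression} AS {self.name} "
                f"FROM {self.table} GROUP BY {dims}")

revenue = Metric("revenue", "SUM(amount)", "orders")
sql = revenue.to_sql(["region"])
```

Because every dashboard and spreadsheet resolves "revenue" through the same definition, KPI numbers stay consistent across tools; that is the governance benefit the session describes.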
Who should attend this session?
Data & Analytics leaders and practitioners (e.g., Chief Data Officers, data scientists, data literacy, business intelligence, and analytics professionals).
Data Catalog for Better Data Discovery and Governance - Denodo
Watch full webinar here: https://buff.ly/2Vq9FR0
Data catalogs are in vogue, answering critical data governance questions such as "Where does my data reside?", "What other entities are associated with my data?", "What are the definitions of the data fields?" and "Who accesses the data?" Data catalogs maintain the necessary business metadata to answer these questions and many more. But that's not enough: to be useful, data catalogs need to deliver these answers to business users right within the applications they use.
In this session, you will learn:
*How data catalogs enable enterprise-wide data governance regimes
*What key capability requirements should you expect in data catalogs
*How data virtualization combines dynamic data catalogs with delivery
Enabling a Data Mesh Architecture with Data Virtualization - Denodo
Watch full webinar here: https://bit.ly/3rwWhyv
The Data Mesh architectural design was first proposed in 2019 by Zhamak Dehghani, principal technology consultant at Thoughtworks, a technology company that is closely associated with the development of distributed agile methodology. A data mesh is a distributed, decentralized data infrastructure in which multiple autonomous domains manage and expose their own data, called "data products," to the rest of the organization.
Organizations adopt a data mesh architecture when they experience the shortcomings of highly centralized architectures, such as the lack of domain-specific expertise in data teams, the inflexibility of centralized data repositories in meeting the specific needs of different departments within large organizations, and the slowness of centralized data infrastructures in provisioning data and responding to changes.
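The "data product" contract at the heart of a data mesh can be sketched as a small published interface: each domain owns its pipeline but registers its product in an organisation-wide listing. All names, fields and the endpoint URI scheme below are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class DataProduct:
    """Contract a domain publishes for its data product.
    Field names are assumptions for illustration."""
    domain: str      # owning domain, e.g. "sales"
    name: str
    owner: str       # accountable team contact
    schema: dict     # published output schema
    endpoint: str    # where consumers read the data

registry = []  # organisation-wide listing of published products

def publish(product):
    """Domains register products themselves; no central team in the loop."""
    registry.append(product)

publish(DataProduct(
    domain="sales",
    name="daily_orders",
    owner="sales-data-team",
    schema={"order_id": "string", "amount": "decimal"},
    endpoint="warehouse://sales/daily_orders",  # invented URI scheme
))
```

The key design point is that the registry holds only the contract (owner, schema, endpoint), so domains stay autonomous in how they produce the data while consumers still get a uniform way to discover and read it.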
In this session, Pablo Alvarez, Global Director of Product Management at Denodo, explains how data virtualization is your best bet for implementing an effective data mesh architecture.
You will learn:
- How data mesh architecture not only enables better performance and agility, but also self-service data access
- The requirements for “data products” in the data mesh world, and how data virtualization supports them
- How data virtualization enables domains in a data mesh to be truly autonomous
- Why a data lake is not automatically a data mesh
- How to implement a simple, functional data mesh architecture using data virtualization
How to Make a Data Governance Program that Lasts - DATAVERSITY
Traditional data governance initiatives fail by focusing too heavily on policies, compliance, and enforcement, which quickly lose business interest and support. This leaves data management and governance leaders having to continually make the case for data governance to secure business adoption. Join Cameron, VP, Product Management, Precisely, as he shares a lean, business-first data governance approach that connects key initiatives to governance capabilities and quickly delivers business value for the long term. He will give examples of organizations worldwide that have successfully implemented a data governance program by engaging with key stakeholders using innovative techniques such as gamification and data catalog scavenger hunts.
Research data: concepts and typologies.
Relevance of research data management for the researcher.
How to respond to funding bodies on data management: data management plans.
How to find research data.
Legal and ethical aspects of research data.
Organising and documenting data.
Storage and security.
How to share research data, and its relevance to the researcher's career.
Research Data Management Plans
Data Catalogues - Architecting for Collaboration & Self-Service - DATAVERSITY
The interest in Data Catalogs is growing as more business & technical users are looking to gain insight from data using a self-service approach. Architectural techniques for Data Provisioning and Metadata Cataloging have evolved to cater to these new audiences and ways of working. This webinar provides concrete methods of architecting your Self-service BI & Analytics environment to foster collaboration while at the same time maintaining Data Quality and reducing risk.
Data Catalog as the Platform for Data Intelligence - Alation
Data catalogs are in wide use today across hundreds of enterprises as a means to help data scientists and business analysts find and collaboratively analyze data. Over the past several years, customers have increasingly used data catalogs in applications beyond their search & discovery roots, addressing new use cases such as data governance, cloud data migration, and digital transformation. In this session, the founder and CEO of Alation will discuss the evolution of the data catalog, the many ways in which data catalogs are being used today, the importance of machine learning in data catalogs, and discuss the future of the data catalog as a platform for a broad range of data intelligence solutions.
Phar Data Platform: From the Lakehouse Paradigm to the Reality - Databricks
Despite the increased availability of ready-to-use generic tools, more and more enterprises are deciding to build in-house data platforms. This practice, common for some time in research labs and digital-native companies, is now making waves across large enterprises that traditionally used proprietary solutions and outsourced most of their IT. The availability of large volumes of data, coupled with ever more complex analytical use cases driven by innovations in data science, has rendered these traditional, on-premise architectures obsolete in favor of cloud architectures powered by open source technologies.
Building an in-house platform at a larger enterprise comes with many challenges of its own: building an architecture that combines the best elements of data lakes and data warehouses to accommodate all kinds of use cases, from BI to ML; the need to interoperate with all the company's data and technology, including legacy systems; and cultural transformation, including a commitment to adopt agile processes and data-driven approaches.
This presentation describes a success story of building a Lakehouse at LIDL, a successful chain of grocery stores operating in 32 countries worldwide. We will dive into the cloud-based architecture for batch and streaming workloads fed by many different source systems of the enterprise, and how we applied security to the architecture and data. We will detail the creation of a curated Data Lake comprising several layers, from a raw ingestion layer up to a layer that presents cleansed and enriched data to the business units as a kind of Data Marketplace.
A lot of focus and effort went into building a semantic Data Lake as a sustainable and easy-to-use basis for the Lakehouse, as opposed to just dumping source data into it. The first use case applied to the Lakehouse is the Lidl Plus Loyalty Program. It is already deployed to production in 26 countries, with more than 30 million customers' data being analyzed on a daily basis. In parallel to productionizing the Lakehouse, a cultural and organizational change process was undertaken to get all involved units to buy into the new data-driven approach.
OSFair2017 workshop | Monitoring the FAIRness of data sets - Introducing the ... - Open Science Fair
Elly Dijk & Peter Doorn present the DANS approach to FAIR metrics
Workshop title: Open Science Monitor
Workshop overview:
What are the measurable components of Open Science? How do we build a trustworthy, global open science monitor? This workshop will discuss a potential framework to measure Open Science, including the path from the publishing of an open policy (registries of policies and how these are represented or machine-read), to the use of open methodologies, and the opening up of research results, their recording and measurement.
DAY 2 - PARALLEL SESSION 5
Presentation investigating the state of FAIR practice and what is needed to turn FAIR data into reality, given at the Danish FAIR conference in Copenhagen on 20th November 2018. https://vidensportal.deic.dk/en/Programme/FAIR_Toolbox_Nov2018 The presentation reflects on recent FAIR studies and international initiatives and outlines the recommendations emerging from the European Commission's FAIR Data Expert Group report - http://tinyurl.com/FAIR-EG
FAIRy stories: tales from building the FAIR Research Commons - Carole Goble
Plenary Lecture Presented at INCF Neuroinformatics 2019 https://www.neuroinformatics2019.org
Title: FAIRy stories: tales from building the FAIR Research Commons
Findable, Accessible, Interoperable, Reusable. The “FAIR Principles” for research data, software, computational workflows, scripts, or any kind of Research Object are a mantra; a method; a meme; a myth; a mystery. For the past 15 years I have been working on FAIR in a range of projects and initiatives in the Life Sciences as we try to build the FAIR Research Commons. Some are top-down, like the European Research Infrastructures ELIXIR, ISBE and IBISBA, and the NIH Data Commons. Some are bottom-up, supporting FAIR for investigator-led projects (FAIRDOM), biodiversity analytics (BioVel), and FAIR drug discovery (Open PHACTS, FAIRplus). Some have become movements, like Bioschemas, the Common Workflow Language and Research Objects. Others focus on cross-cutting approaches in reproducibility, computational workflows, metadata representation and scholarly sharing & publication. In this talk I will relate a series of FAIRy tales. Some of them are Grimm. There are villains and heroes. Some have happy endings; all have morals.
Making Data FAIR (Findable, Accessible, Interoperable, Reusable) - Tom Plasterer
What to do About FAIR…
In the experience of most pharma professionals, FAIR remains fairly abstract, bordering on inconclusive. This session will outline specific case studies – real problems with real data, and address opportunities and real concerns.
Why making data Findable, Accessible, Interoperable and Reusable is important.
Talk presented at the Data Driven Drug Development (D4) conference on March 20th, 2019.
How are we Faring with FAIR? (and what FAIR is not) - Carole Goble
Keynote presented at the workshop FAIRe Data Infrastructures, 15 October 2020
https://www.gmds.de/aktivitaeten/medizinische-informatik/projektgruppenseiten/faire-dateninfrastrukturen-fuer-die-biomedizinische-informatik/workshop-2020/
Remarkably it was only in 2016 that the ‘FAIR Guiding Principles for scientific data management and stewardship’ appeared in Scientific Data. The paper was intended to launch a dialogue within the research and policy communities: to start a journey to wider accessibility and reusability of data and prepare for automation-readiness by supporting findability, accessibility, interoperability and reusability for machines. Many of the authors (including myself) came from biomedical and associated communities. The paper succeeded in its aim, at least at the policy, enterprise and professional data infrastructure level. Whether FAIR has impacted the researcher at the bench or bedside is open to doubt. It certainly inspired a great deal of activity, many projects, a lot of positioning of interests and raised awareness. COVID has injected impetus and urgency to the FAIR cause (good) and also highlighted its politicisation (not so good).
In this talk I’ll make some personal reflections on how we are faring with FAIR: as one of the original principles authors; as a participant in many current FAIR initiatives (particularly in the biomedical sector and for research objects other than data) and as a veteran of FAIR before we had the principles.
LIBER Webinar: Are the FAIR Data Principles really fair? - LIBER Europe
The FAIR Data Principles are a hot topic in research data management. Their adoption within the H2020 funding programme means researchers now have to pay much more attention to how they share, publish and archive their data.
In this light, how can libraries help their research communities implement the FAIR principles? And write better data management plans?
These questions were addressed in a LIBER webinar containing some guidance and reflections on the principles themselves. Presented by Alastair Dunning, Head of Research Data Services at TU Delft (hosts of the 4TU.Centre for Research Data), it is based on a study of 37 data repositories (from subject-specific repositories, to generic data archives, to national infrastructures), seeing how far they comply with each of the individual facets of the FAIR Data Principles.
Turning FAIR into Reality: Final outcomes from the European Commission FAIR D... - Sarah Jones
A multi-speaker presentation given by the European Commission FAIR Data Expert Group at SciDataCon as part of International Data Week in Botswana in November 2018.
Simon Hodson, Chair of the Group explained the remit and background. Natalie Harrower outlined key concepts. Francoise Genova spoke on the recommendations related to research data culture. Daniel Mietchen addressed the infrastructure needed and our proposals for a FAIR ecosystem, and Sarah Jones spoke to the cultural aspects needed to drive change and outlined the FAIR Action Plan.
The report has been revised in light of the 500+ comments received as part of the open consultation and will be formally released on 23rd November as part of the Austrian Presidency events.
A presentation on FAIR, FAIRsharing and the FAIR ecosystem for the ENVRI-FAIR community on the 13th December 2019. This presentation covers the basics of what FAIR is, how FAIRsharing can help 'FAIRify' standards, repositories, knowledgebases and data policies, and then the connections FAIRsharing has with other initiatives, such as the FAIR Evaluator, Data Stewardship Wizard, our RDA WG, GO-FAIR and EOSC-Life.
Results from the FAIR Expert Group Stakeholder Consultation on the FAIR Data ... - EOSCpilot .eu
Turning FAIR into Reality report and action plan by Simon Hodson, Executive Director of CODATA, delivered during the FAIR Data Session at the EOSC Stakeholders Forum 2018
FAIR in Astronomy Research - Slides. In this webinar ARDC is partnering with the ADACS project to explore the FAIR data principles in the context of Astronomy research, with the ASVO and IVOA as community exemplars of the implementation of the FAIR data principles.
These slides are from Keith Russell (ARDC): Looking at FAIR
In this talk Keith will provide an overview of the FAIR principles and how they were applied in astronomy before they were formally articulated. He will conclude the talk by discussing what other disciplines can learn from astronomy's approach.
The FAIR principles have been introduced as a guideline for good scientific data stewardship. They have gained momentum at a management level and are now, for example, part of the project template for EU Horizon 2020 projects. This raises the question of what research groups and projects can do to implement them. Hugo Besemer will introduce the ideas behind the FAIR principles.
OSFair2017 Training | FAIR metrics - Starring your data sets - Open Science Fair
Peter Doorn, Marjan Grootveld & Elly Dijk talk about FAIR data principles and present the assessment tool that DANS is developing for data repositories | OSFair2017 Workshop
Workshop title: FAIR metrics - Starring your data sets
Workshop overview:
Do you want to join our effort to put the FAIR data principles into practice? Come and explore the assessment tool that DANS, Data Archiving and Networked Services in the Netherlands, is developing for data repositories.
The aim of our work is to implement the FAIR principles into a data assessment tool so that every dataset which is deposited or reused from any digital repository can be assessed in terms of a score on the principles Findable, Accessible, Interoperable, and Reusable, using a ‘FAIRness’ scale from 1 to 5 stars. In this interactive session participants can explore the pilot version of FAIRdat: the FAIR data assessment tool. The organisers would like to inform you about the project, and look forward to all feedback to improve the tool, or to improve the metrics that are used.
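The star-scale idea described above can be sketched as a simple scoring function. Note this is an illustrative sketch only: the facet names below are invented, and FAIRdat's actual metrics are more nuanced than a count of booleans.

```python
# Hypothetical sketch of a star-style FAIRness score: count how many
# F/A/I/R facets a dataset record satisfies and map that onto 1-5 stars.
def fairness_stars(record: dict) -> int:
    facets = [
        bool(record.get("persistent_id")),      # Findable: has a PID
        bool(record.get("metadata_indexed")),   # Findable: indexed in a searchable resource
        bool(record.get("retrievable_by_id")),  # Accessible: standard retrieval protocol
        bool(record.get("standard_format")),    # Interoperable: common, open format
        bool(record.get("licence")),            # Reusable: clear usage licence
    ]
    # The scale bottoms out at 1 star rather than 0.
    return max(1, sum(facets))

record = {"persistent_id": "doi:10.5072/example", "licence": "CC-BY-4.0",
          "metadata_indexed": True, "retrievable_by_id": True,
          "standard_format": False}
print(fairness_stars(record))  # 4 facets satisfied -> 4 stars
```

The interesting design question, which the workshop invites feedback on, is which facets to measure and how to weight them, not the arithmetic itself.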
DAY 3 - PARALLEL SESSION 7
Towards metrics to assess and encourage FAIRness - Michel Dumontier
With an increased interest in FAIR metrics, there is a need to develop tools and approaches that can assess the FAIRness of a digital resource. This talk begins to explore some ideas in this space, and invites people to participate in a working group focused on the development, application, and evaluation of FAIR metric efforts.
Keynote presentation given at the Data Fellows 2023 workshop in Berlin on 22-23 June. Presentation gives examples of good communication to explain data management concepts and how to use games and other forms of interactivity in training events
Presentation given at the DMPonline 10 year anniversary week, reflecting on lessons learned developing the business model. See https://www.dcc.ac.uk/events/dmponline-10th-year-anniversary-celebration-week and #10yearsDMPonline
Keynote presentation given at the 10th anniversary of the 4TU.researchdata repository https://data.4tu.nl/info/en/news-events/training-events/news-item/4turesearchdatas-role-in-fostering-open-science-10th-anniversary-celebration-29-sep-2020-1530-1730-c/
The future of FAIR
1. www.geant.org
The Future of FAIR
Sarah Jones
EOSC Engagement Manager
sarah.jones@geant.org
Twitter: @sarahroams
N8 library managers workshop
Wednesday 30th June 2021
3. What is FAIR?
A set of principles to ensure that data are shared in a way that enables & enhances reuse, by humans and machines
Image CC-BY-SA by SangyaPundir
4. What FAIR means: 15 principles

Findable
F1. (meta)data are assigned a globally unique and eternally persistent identifier.
F2. data are described with rich metadata.
F3. (meta)data are registered or indexed in a searchable resource.
F4. metadata specify the data identifier.

Accessible
A1. (meta)data are retrievable by their identifier using a standardized communications protocol.
A1.1 the protocol is open, free, and universally implementable.
A1.2 the protocol allows for an authentication and authorization procedure, where necessary.
A2. metadata are accessible, even when the data are no longer available.

Interoperable
I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
I2. (meta)data use vocabularies that follow FAIR principles.
I3. (meta)data include qualified references to other (meta)data.

Reusable
R1. meta(data) have a plurality of accurate and relevant attributes.
R1.1. (meta)data are released with a clear and accessible data usage license.
R1.2. (meta)data are associated with their provenance.
R1.3. (meta)data meet domain-relevant community standards.

Slide CC-BY by Erik Schultes, Leiden UMC
doi: 10.1038/sdata.2016.18
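As a concrete illustration of what several of these principles look like in practice, here is a minimal machine-readable dataset description. This is a sketch using Schema.org-style keys; the identifiers and values are invented (10.5072 is the DataCite test prefix), not a prescribed FAIR serialisation.

```python
import json

# Sketch of a machine-actionable dataset description (invented example values).
# F1/F4: globally unique persistent identifier; F2/R1: rich descriptive metadata;
# R1.1: explicit licence; R1.2: provenance; I1: common format; I3: qualified
# reference to related (meta)data.
dataset = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "identifier": "https://doi.org/10.5072/example-dataset",     # F1 / F4
    "name": "Example field measurements 2020",
    "description": "Hypothetical dataset illustrating FAIR metadata.",
    "license": "https://creativecommons.org/licenses/by/4.0/",   # R1.1
    "creator": {"@type": "Person", "name": "A. Researcher"},     # R1.2 provenance
    "isBasedOn": "https://doi.org/10.5072/raw-source",           # I3 qualified reference
    "encodingFormat": "text/csv",                                # I1 common, open format
}
print(json.dumps(dataset, indent=2))
```

Because the record is plain structured data with resolvable identifiers, it can be indexed by a searchable resource (F3) and processed by machines as well as read by humans.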
5. The FAIR data principles explained
• Clarifications from the Dutch Techcentre for Life Sciences
• Each principle is a link to further clarification, examples and context
https://www.dtls.nl/fair-data/fair-principles-explained

R1. Meta(data) are richly described with a plurality of accurate and relevant attributes
• By giving data many ‘labels’, it will be much easier to find and reuse the data
• Provide not just metadata that allows discovery, but also metadata that richly describes the context under which that data was generated
• “plurality” indicates that metadata should be as generous as possible, even to the point of providing information that may seem irrelevant
6. FAIR is nothing new
• Various research communities have been sharing their data in a ‘FAIR’ way long before the term emerged
• Meaningful and memorable articulation of concepts
• Natural desire to want to be ‘fair’
• FAIR is gaining significant international traction
7. Open data and FAIR data
[Venn diagram: overlapping circles of FAIR data and Open data]
FAIR and Open are not synonymous. Data can be both, one or neither.
And both are on a scale: there are degrees of Open and FAIR.
8. How do Open, FAIR & RDM intersect?
[Diagram: managed data, FAIR data and Open data, spanning from internal self-interest to external community benefit]
9. Open, FAIR and RDM
• Paper explores overlaps between the concepts of Open, FAIR and RDM.
• Proposes using Open and FAIR as ways to engage researchers in managing data well, as this is a prerequisite for both.
• Recommends making data FAIR and Open wherever possible
Higman, R., Bangert, D. and Jones, S., 2019. Three camps, one destination: the intersections of research data management, FAIR and Open. Insights, 32(1), p.18. DOI: http://doi.org/10.1629/uksg.468
12. FAIR is a central part of EOSC
• FAIR is the glue which enables us to federate data and services
• Principles have been used as the basis for the EOSC Interoperability Framework
13. Turning FAIR into Reality: Report and Action Plan - https://doi.org/10.2777/1524
Report and Action Plan: takes a holistic approach to lay out what needs to be done to make FAIR a reality, in general and for EOSC.
Addresses the following key areas:
1. Concepts for FAIR
2. Creating a FAIR culture
3. Creating a technical ecosystem for FAIR
4. Skills and capacity building
5. Incentives and metrics
6. Investment and sustainability
Recommendations and Actions: 27 clear recommendations, structured by these topics, are supported by precise actions for stakeholders.
Report is out! Produced by the FAIR Data Expert Group (2016-2018).
14. FAIR WG of EOSC Exec Board (2018-2020)
• Six recommendations for implementation of FAIR practice
• EOSC Interoperability Framework
• Persistent identifier policy for EOSC
• FAIR metrics for EOSC
• Recommendations on certifying services to enable FAIR
https://www.eoscsecretariat.eu/working-groups/fair-working-group
15. EOSC Association Task Forces… (2021 on)
• FAIR metrics & data quality
• Semantic interoperability
• Interoperability of data & services
• Research careers, recognition & credit
• Data stewardship curricula and career paths
• And more….
https://www.eosc.eu/news/call-members-eosc-association-task-forces
Call for members closes on Friday 30 July 2021 at 18.00 CEST
16. Many FAIR-related projects
All funded by the European Commission: cluster projects and national initiatives
• EOSC-Nordic
• EOSC-Pillar
• EOSC-synergy
• ExPaNDS
• NI4OS-Europe
17. It’s all going swimmingly!
Image CC-BY by Brian Matangelo https://unsplash.com/photos/gRof2_Ftu7A
19. It takes a lot of time and effort
For his most recent paper:
1. Double checking the main dataset and reformatting to submit to Dryad: 5 hours
2. Creating complementary file and preparing metadata: 3 hours
3. Submission of these two files and the metadata to Dryad: 45 minutes
4. Preparing a map of the locations: 1 hour
5. Submission of map to Figshare: 15 minutes
6. Cleaning up and documenting the code, uploading it to GitHub: 25 hours
7. Cost of archiving in Dryad: US$90
8. Page charges: $600
By Emilio Bruna
http://brunalab.org/blog/2014/09/04/the-opportunity-cost-of-my-openscience-was-35-hours-690
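The figures above can be totted up to verify the "35 hours, $690" in the blog post's title:

```python
# Summing Emilio Bruna's reported time and money costs of sharing one paper's data.
hours = 5 + 3 + 0.75 + 1 + 0.25 + 25   # Dryad prep, metadata, submissions, map, code
dollars = 90 + 600                      # Dryad archiving fee + page charges
print(hours, dollars)                   # 35.0 hours and $690
```

Note that 25 of the 35 hours went on cleaning and documenting code, which is exactly the kind of unrewarded effort the next slide is about.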
20. Recognition and rewards are not there yet
“It’s hard to overcome your personal investment… It’s like giving away your baby”
Quote from a researcher at Glasgow University as part of the Incremental project
21. Communities are at different stages of maturity
• Some, like astronomy and physics, are well-organised
• Others still need to develop standards
• Tension between domain approaches and cross-domain interoperability
https://doi.org/10.5281/zenodo.1246815
https://doi.org/10.2777/986252
22. How do we implement FAIR?
Image Israel Palacio https://unsplash.com/photos/P6FgiDNe6W4
23. FAIR data checklist
• Findable
- Persistent ID
- Metadata online
• Accessible
- Data online
- Restrictions where needed
• Interoperable
- Use standards, controlled vocabs
- Common (open) formats
• Reusable
- Rich documentation
- Clear usage licence
https://doi.org/10.5281/zenodo.1065991
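One way to operationalise such a checklist is as a simple validation pass over a dataset record. This is only a sketch: the field names are invented for illustration, and a real check would query a repository's API rather than a local dict.

```python
# Sketch: check a dataset record against the four checklist headings.
# Field names are hypothetical, not from any real repository schema.
CHECKS = {
    "Findable":      ["persistent_id", "metadata_url"],
    "Accessible":    ["data_url"],
    "Interoperable": ["format"],
    "Reusable":      ["documentation", "licence"],
}

def check_fair(record: dict) -> dict:
    """Return, per FAIR facet, the checklist fields the record is missing."""
    return {facet: [f for f in fields if not record.get(f)]
            for facet, fields in CHECKS.items()}

record = {"persistent_id": "hdl:0000/example",
          "metadata_url": "https://example.org/md",
          "data_url": "https://example.org/data.csv",
          "format": "CSV",
          "licence": "CC0"}
print(check_fair(record))  # only "documentation" is missing, under Reusable
```

A checklist pass like this catches the easy gaps; the harder, judgement-based parts (is the metadata rich enough? do the standards fit the community?) still need a human reviewer.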
24. Adopt a model for FAIR Digital Objects
• Digital objects can include data, software, and other research resources
• Universal use of PIDs
• Use of common formats
• Data accompanied by code
• Rich metadata
• Clear licensing
25. Address culture AND technology
Two sides of one whole:
• Cultural and social aspects that drive the ecosystem and enact change: incentives, metrics, skills, investment
• Technical side: a cloud of registries
26. Support interoperability
● Support research communities to develop and maintain their interoperability frameworks for FAIR sharing
● Engage in international collaboration fora to do this
● Exchange good practices, define case studies and success stories
● Common standards to support disciplinary frameworks and promote interoperability and reuse across disciplines
27. FAIR is a joint responsibility…
Principle | Researcher role | Service role
F1. Assign a PID | Choose a relevant service | Assign PIDs
F2. Rich metadata | Create appropriate metadata | Link data and metadata
F3. Indexed, searchable resource | Choose a relevant service | Ensure metadata search
F4. Metadata specify PID | Choose a relevant service | Link metadata and PID
A1. Standard protocol for retrieval | Choose a relevant service | Use standard protocols
A1.1 Open, free protocol | Choose a relevant service | Use open, free protocols
A1.2 Authenticated access if needed | Choose a relevant service | Provide authenticated access
A2. Metadata remain accessible | Choose a relevant service | Provide tombstone records
I1. Use of formal language (standards) | Adopt standards | Support appropriate standards
I2. Metadata vocabularies are FAIR | Advocate for FAIR metadata | Support FAIR metadata
I3. Qualified references (linked data) | Cross-reference resources | Cross-reference resources
R1. Rich metadata (plurality of attributes) | Enrich metadata/documentation | Advocate for good metadata
R1.1 Clear data usage licence | Choose appropriate licence | Require licences
R1.2 Metadata covers provenance | Say where data came from | Require provenance
R1.3 Community standards | Adopt community standards | Support community standards
Equal, if not more, responsibility on data services
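A division of responsibility like the one in this table is easy to capture as structured data, for example to drive tooling that tells a researcher what they must do themselves versus what to look for when choosing a repository. A sketch (principle keys follow the slide; only a few rows are shown):

```python
# The responsibility matrix as (researcher role, service role) pairs.
# Illustrative subset only; the remaining principles follow the same pattern.
RESPONSIBILITIES = {
    "F1":   ("Choose a relevant service", "Assign PIDs"),
    "A2":   ("Choose a relevant service", "Provide tombstone records"),
    "I1":   ("Adopt standards", "Support appropriate standards"),
    "R1.1": ("Choose appropriate licence", "Require licences"),
}

def roles_for(principle: str) -> dict:
    """Look up who does what for a given FAIR principle."""
    researcher, service = RESPONSIBILITIES[principle]
    return {"researcher": researcher, "service": service}

print(roles_for("A2")["service"])  # Provide tombstone records
```

Notice how often the researcher's role reduces to "choose a relevant service": much of FAIR is delivered by the infrastructure, which is the point of the final line of the slide.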
28. Researcher role
1. Adopt relevant standards as you create data
2. Create rich metadata and documentation which
• conforms to community standards
• explains provenance
• assigns a clear usage licence
• cross-links data, metadata, code and other resources
3. Choose appropriate data services which
• assign Persistent Identifiers
• enhance discoverability via indexes / catalogues
• use standard protocols for (authenticated) access
4. Advocate for / contribute to community standards
29. Institutional role
1. Raise awareness of community standards
2. Help researchers select appropriate data services
3. If running a repository:
• assign Persistent Identifiers
• ensure metadata specifies the PID
• expose metadata via indexes / catalogues / harvesting…
• use standard protocols for (authenticated) access
• cross-reference resources
• keep metadata accessible, even when data aren’t
4. Set requirements / advocate for good practice
30. Inherent link: data and services
In order for data to be FAIR, you need services that enable FAIR
31. And a community responsibility….
As the global community adopts & becomes dependent on FAIR, the principles themselves need to be community-owned and governed