FAIR Workflows and
Research Objects get a Workout
Carole Goble
The University of Manchester, UK
carole.goble@manchester.ac.uk
DataVerse Community Conference 2021, 15th June 2021
EOSC-Life pan-national data & method thematic commons for
bioscience data and methods
Using and sharing data, tools and workflows in the cloud
Infrastructure Zoo
Flows around a Federated & Diverse System
1466 data repositories / archives
916 data format and metadata
standards*
Not including the institutional or
national repositories like
DataVerse
https://fairsharing.org/ accessed May 2021
From compounds to clinical trials
Primary data - Secondary use
Community domain enclaves
fragmented resources
flow across platforms & sovereignties
Workflows as an entry point and
integration mechanism
Legacy
• data repositories & data platforms
• processing and workflow
platforms
CryoEM Image Analysis Metagenomic Pipelines Drug Discovery
Quality control
Replication
Scrutiny
Shared know-how
Repetition
SARS-CoV-2 pre-processing, monitoring, analysis
https://elixir-europe.org/news/covid-19-variants-galaxy
Beyond Data: Computational Workflows as method objects
to be shared, ported, reused and repurposed
Multi-step
Leverage third party codes
Scalable processing of data
Transparent research
Computational Workflows
A special kind of software
Separation of the workflow specification from its execution
Specification: precise description of a procedure, a multi-step process coordinated by input/output data relationships (data types).
Execution: running computational processes (run a code, invoke a service…).
Data is consumed and produced by each step.
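The separation of specification from execution can be sketched in a few lines of Python. This is only an illustration, not how any particular workflow system works: the step names, data names and toy implementations are all hypothetical. The specification lists what each step consumes and produces, and a small engine derives the execution order from those data relationships alone.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Specification: each step names the data it consumes and produces.
# Step and data names are hypothetical.
SPEC = {
    "trim": {"inputs": ["raw_reads"], "outputs": ["clean_reads"]},
    "align": {"inputs": ["clean_reads", "reference"], "outputs": ["alignment"]},
    "call": {"inputs": ["alignment"], "outputs": ["variants"]},
}


def execution_order(spec):
    """Derive a valid step order purely from input/output data relationships."""
    producer = {out: step for step, io in spec.items() for out in io["outputs"]}
    # A step depends on whichever step produces each of its inputs.
    deps = {
        step: {producer[name] for name in io["inputs"] if name in producer}
        for step, io in spec.items()
    }
    return list(TopologicalSorter(deps).static_order())


def run(spec, implementations, data):
    """Execution: a toy engine that runs every step in dependency order."""
    data = dict(data)
    for step in execution_order(spec):
        args = [data[name] for name in spec[step]["inputs"]]
        results = implementations[step](*args)
        data.update(zip(spec[step]["outputs"], results))
    return data


# Toy implementations, kept apart from the specification above.
# Each returns a tuple with one value per declared output.
IMPLEMENTATIONS = {
    "trim": lambda reads: (reads.strip("N"),),
    "align": lambda reads, ref: (ref.find(reads),),
    "call": lambda pos: ("hit" if pos >= 0 else "miss",),
}

result = run(SPEC, IMPLEMENTATIONS, {"raw_reads": "NNACGTNN", "reference": "TTACGTTT"})
print(execution_order(SPEC), result["variants"])
```

Because `SPEC` never mentions how a step is run, the same description could in principle be handed to a different engine or bound to different tools.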
<my scripts>
A Zoo of Workflow Systems and “systems”*
Native repositories
*https://s.apache.org/existing-workflow-systems
EMBL-EBI MGnify
Metagenomics
pipelines
Command line tools
Sub-Workflows
Containers
Beyond Data: Multi-part Research Objects
dependencies and associates scattered across repositories and within repositories
made at different times by different people
Workflow itself Workflow associated Objects
Specification
descriptions
Parameters
Input datasets
Output datasets
Runtime details & Provenance
Documentation
Bind to Dependencies
- Containers
- Codes
- Sub-workflows
Bind to particular test engines
Publications
Image
Other workflows
Sub workflows
Software
Execution
Inputs and outputs
Author
Beyond Data: Computational Workflows as multi-part method objects
to be shared, ported, reused and repurposed
Services for FAIR Workflows
• Describe workflows with PIDs and metadata
• Flow: Move workflows between services and
platforms
• Parts: Package (scattered) objects linked
together by context (metadata files with their objects)
Honouring
• the legacy and diverse ecosystem
• buy-in from platforms
Be KISSy
• practical and developer friendly standards,
and webby mechanisms
• extensible openendedness – unknown
unknowns & diversity….
Workflow
Registry
Workflow
Systems
Repos Containers Deploys
Testing
Monitoring
Open Registry for Workflows
Perpetual Development in the open by an open community
https://workflowhub.eu
Towards FAIR workflows and FAIR registry
• Find and Access Workflows
– Workflows may remain in their native repositories in
their native form, or they can be deposited.
– Register (push) / Harvest (pull)
• Workflow interoperability and reusability
– Using a metadata standards framework
Makers are the custodians
• people organisation: spaces, teams, organisations …
• workflow organisation: collections, tagging, facets ...
• credit: for submitters and authors
Open to any platform,
any subject, any person
WorkflowHub Club
Access: TRS (Tool Registry Service) API
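As a rough sketch of what TRS access looks like, the GA4GH Tool Registry Service v2 API addresses a workflow descriptor by tool ID, version ID and descriptor type. The helper below only builds that URL and makes no network call. The base URL is an assumption about where WorkflowHub serves TRS, and the workflow ID and version are hypothetical.

```python
from urllib.parse import quote

# Assumption: WorkflowHub exposes GA4GH TRS v2 under this base path.
TRS_BASE = "https://workflowhub.eu/ga4gh/trs/v2"


def trs_descriptor_url(tool_id, version_id, descriptor_type, base=TRS_BASE):
    """Build the TRS v2 URL for the primary descriptor of a workflow version."""
    # IDs are path segments, so any "/" inside them must be percent-encoded.
    return "{}/tools/{}/versions/{}/{}/descriptor".format(
        base.rstrip("/"),
        quote(str(tool_id), safe=""),
        quote(str(version_id), safe=""),
        quote(descriptor_type, safe=""),
    )


# Hypothetical workflow ID and version, used only to show the URL shape.
url = trs_descriptor_url("123", "1", "GALAXY")
print(url)
```

A client would then issue an HTTP GET on that URL; which descriptor types a registry actually serves depends on the registry.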
FAIR Workflows are FAIR Software
living and with dependencies … workflow history/provenance
Indicators of Status
Workflow
monitoring
Register versions
(Support GitHub Actions)
Incremental metadata and
supplementary materials
(Tracking & Lifting
out subworkflows)
Which Workflow Objects are FAIR?
• workflow specification with test or
exemplar data?
• implementation of that design in a
particular WfMS?
• instantiation of that implementation
ready to run with input data, parameters
set, computational services spun up?
• run result with intermediate/final data
products and provenance logs?
• In practice this is a bit blurry.
A metadata
framework
extensible
enough to cope
FAIR Workflows are FAIR Digital Objects
Descriptive, machine actionable metadata framework from the community
practical and developer friendly standards, extensible openendedness
Standardised
metadata about the
workflows
for registration,
discovery
Schema.org profile and types
ComputationalWorkflow
FormalParameter
ComputationalTool
Canonical workflow
description of the
workflow itself
Executable and
Abstract form
Type the input and
output data formats
of the steps
Ontology of types of data
and data identifiers, data
formats, operations in life
sciences
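A minimal sketch of what such standardised workflow metadata might look like, written as JSON-LD entities in Python. `ComputationalWorkflow` and `FormalParameter` are the types named above. The property names `input`, `output` and `encodingFormat`, the file name, and the EDAM format IRIs are assumptions for illustration and should be checked against the Bioschemas profiles and the EDAM ontology.

```python
import json


def formal_parameter(local_id, name, format_iri):
    """Describe one typed workflow input or output."""
    return {
        "@id": local_id,
        "@type": "FormalParameter",
        "name": name,
        # Assumed property: points at a data format identifier (e.g. an EDAM IRI).
        "encodingFormat": {"@id": format_iri},
    }


# Hypothetical parameters; the EDAM IRIs are illustrative (assumed FASTQ and VCF).
reads = formal_parameter("#reads", "reads", "http://edamontology.org/format_1930")
variants = formal_parameter("#variants", "variants", "http://edamontology.org/format_3016")

workflow = {
    "@id": "workflow.cwl",  # hypothetical file name
    "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
    "name": "Variant calling (example)",
    "input": [{"@id": reads["@id"]}],
    "output": [{"@id": variants["@id"]}],
}

graph = [workflow, reads, variants]


def signature(workflow, graph):
    """Resolve the workflow's parameter references into name -> format maps."""
    by_id = {entity["@id"]: entity for entity in graph}

    def formats(refs):
        return {
            by_id[ref["@id"]]["name"]: by_id[ref["@id"]]["encodingFormat"]["@id"]
            for ref in refs
        }

    return formats(workflow["input"]), formats(workflow["output"])


inputs, outputs = signature(workflow, graph)
print(json.dumps({"inputs": inputs, "outputs": outputs}, indent=2))
```

Typing the inputs and outputs this way is what would let a registry answer questions such as "which workflows accept this data format?".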
Upload and Download the parts?
Exchange between services & platforms?
Sharing & archiving the components of science
Let's step back!
Beyond Data: Multi-part Research = Multi-part ROs
Each object has its own
metadata and repositories
Integrated view & context over
fragmented resources using
their PIDs and metadata
Need a way of packaging up,
describing the package and
parts, citing, shipping around,
storing, archiving, sharing.
Reference real things. Like
people, mice and equipment.
Beyond Data: Multi-part Research Objects
Describing a Dataset as a
Digital Object
A way of packaging up,
describing the package and
parts, citing, shipping around,
storing, archiving, sharing.
Even reference real things. Like
people, mice and equipment.
Image Courtesy of Peter Sefton: https://arkisto-platform.github.io/standards/ro-crate/
The dataset may contain any kind of
data resource, about anything, in any
format as a file or URL. They can be
scattered across repositories.
Each resource can have a machine
readable description in JSON-LD
format
A human-readable description and
preview can be in an HTML file
that lives alongside the metadata
Provenance and workflow information
can be included - to assist in data and
research-process re-use
RO-Crate Digital Objects may be
packaged for distribution, e.g. via Zip,
BagIt and OCFL objects
Courtesy Peter Sefton, https://arkisto-platform.github.io/standards/ro-crate/
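The packaging described above can be sketched with the Python standard library alone. The structure follows the RO-Crate 1.1 layout: a `ro-crate-metadata.json` file whose `@graph` holds a metadata descriptor, a root dataset and one entry per file. The crate name, description, date and file contents are hypothetical. A complete crate would normally carry more metadata (for example a license) and may include an `ro-crate-preview.html` file, both omitted here.

```python
import json
import tempfile
import zipfile
from pathlib import Path


def minimal_crate(name, description, date_published, files):
    """Return RO-Crate 1.1 style metadata (as a dict) describing the given files."""
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {   # Metadata descriptor: identifies the root dataset of the crate.
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
                "about": {"@id": "./"},
            },
            {   # Root dataset: the package as a whole.
                "@id": "./",
                "@type": "Dataset",
                "name": name,
                "description": description,
                "datePublished": date_published,
                "hasPart": [{"@id": f} for f in files],
            },
            # One data entity per packaged file.
            *({"@id": f, "@type": "File", "name": f} for f in files),
        ],
    }


def write_zipped_crate(zip_path, metadata, payload):
    """Write the metadata file and the payload files into a single Zip archive."""
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.writestr("ro-crate-metadata.json", json.dumps(metadata, indent=2))
        for filename, content in payload.items():
            zf.writestr(filename, content)


# Hypothetical one-file dataset.
payload = {"results.csv": "sample,value\nA,1\n"}
metadata = minimal_crate(
    "Example crate", "A hypothetical one-file dataset.", "2021-06-15", list(payload)
)
zip_path = Path(tempfile.mkdtemp()) / "example-crate.zip"
write_zipped_crate(zip_path, metadata, payload)

with zipfile.ZipFile(zip_path) as zf:
    print(sorted(zf.namelist()))
```

The metadata travels inside the archive next to the data, so the package stays self-describing wherever it is moved.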
A data
repository
perspective
Not just for workflows!
For any kind of object
data, publications, SOPs, software …
and data repositories!
especially data repositories!
Aggregate files, any URI-addressable content, another
RO-Crate, along with contextual information, into a citable
RO-Crate which has its own metadata.
Can use as a bag of references:
large/sensitive datasets
citation aggregator
Unbounded Research Objects
Anything referenceable that may be scattered
across different repositories and/or different
datasets in the same repository.
Self describing integrated view spanning over
fragmented resources using PIDs and metadata
Metadata held alongside heterogeneous data
Infrastructure independent
• Exchange between repositories, registries and
services.
• Avoid vendor lock-in
Practical, lightweight approach: machine
and human readable, search engine friendly
and developer familiar, blah blah
FAIR Object middleware/underware
Standard Web Native PIDs + JSON-LD +
Schema.org, off the shelf archiving formats
Self-describing, Typed by profiles + add
more schema.org and domain ontologies
Extensible, descriptive and content
openendedness, honouring legacy, diversity,
and known and unknown unknowns - one size
does not fit all, blah blah
A Graph inside the RO-Crate
PIDs connect the Graph to the
outside world
http://www.researchobject.org/ro-crate/
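To make the "graph inside, PIDs to the outside" idea concrete, the sketch below walks the `@graph` of a small crate and separates references that stay inside the crate from those that point outwards via web identifiers. The crate content and all example.org identifiers are hypothetical, and treating any http(s) `@id` as "outside" is a simplifying assumption.

```python
from urllib.parse import urlparse

# Hypothetical crate metadata with local and remote references.
crate = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {"@id": "ro-crate-metadata.json", "@type": "CreativeWork", "about": {"@id": "./"}},
        {
            "@id": "./",
            "@type": "Dataset",
            "author": {"@id": "https://example.org/people/alice"},
            "hasPart": [
                {"@id": "workflow.cwl"},
                {"@id": "https://example.org/data/big.bam"},
            ],
        },
        {"@id": "workflow.cwl", "@type": "File"},
        {"@id": "https://example.org/people/alice", "@type": "Person", "name": "Alice"},
    ],
}


def references(value):
    """Yield every @id referenced from a property value (dict, list or scalar)."""
    if isinstance(value, dict):
        if set(value) == {"@id"}:
            yield value["@id"]
        else:
            for nested in value.values():
                yield from references(nested)
    elif isinstance(value, list):
        for item in value:
            yield from references(item)


def split_references(crate):
    """Split referenced @ids into local ones and ones pointing outside the crate."""
    internal, external = set(), set()
    for entity in crate["@graph"]:
        for key, value in entity.items():
            if key == "@id":
                continue  # the entity's own identifier is not a reference
            for ref in references(value):
                if urlparse(ref).scheme in ("http", "https"):
                    external.add(ref)
                else:
                    internal.add(ref)
    return internal, external


internal, external = split_references(crate)
print(sorted(internal))
print(sorted(external))
```

The external set is where persistent identifiers do their work: the crate can describe a person or a large remote dataset without containing it.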
RO-Crate variants: Profiles are extensible typing
RO-Crates collect metadata
Workflow-RO-Crate
Workflow-Testing-RO-Crate
Workflow-Run-RO-Crate
BioComputeObject-RO-Crate
Galaxy-Workflow-RO-Crate
maDMP RO-Crate*
DataRepo-RO-Crate
DataRepo-DataCube-RO-Crate
Aggregated DataCitation RO-Crate
Secure Bags of PIDs to sensitive / large data
*https://repository.publisso.de/resource/frl:6423291
https://www.researchobject.org/ro-crate/profiles.html
A step towards FAIR Digital Objects*
“To be FAIR each digital object
type has its own metadata
requirements,
and may have its own repositories
and registries”
FAIR Digital Objects for Science: From Data Pieces to Actionable
Knowledge Units: https://doi.org/10.3390/publications8020021
https://fairdo.org
FAIR Digital Objects
Actionable knowledge unit
Digital butterfly – digital twins
Bags of references
courtesy Dimitris Koureas
Coordinator DiSSCo EU
Research Infrastructure
Specimen object image
courtesy of Alex Hardisty
Specimen Data Refinery
Workflows to Digitise Natural History Specimens
FAIR Digital Objects -> Packaged + Actionable
+
FAIR Digital Object
Framework
Open Digital Specimen
Workflow Infrastructure
courtesy of Alex Hardisty and Laurence Livermore
Real Use Cases Considered Essential!
• Building out in the open accelerated progress
RO-Crate is metadata middleware
• smart use of wheels already invented
• it takes a village: get tools, services on board
• developer friendly, firm best practice
A little bit of semantics goes a long way…
• Schema.org + JSON-LD
…prepare for more
Known and Unknown unknowns, One size does not fit all
• descriptive open-endedness, multi-interpretation
Metadata sucks
• auto-curation is the way forward folks!
What about
the workout?
What about
FAIR?
FAIR at multiple levels & granularities
• Workflows & RO-Crates are composite and
nested, with dependencies
• FAIR all the way down
• Not always compatible – e.g. licenses
FAIR+
• Reusable and Usable workflows: testing &
parameter validation. Documentation.
FAIR software paradigm is pervasive
• Applies to RO-Crate Research Objects
FAIR takes a village, of course
C. Goble, S. Cohen-Boulakia, S. Soiland-Reyes, D. Garijo, Y. Gil, M.R. Crusoe, K. Peters & D. Schober.
FAIR computational workflows. Data Intelligence 2 (2020), 108–121.
doi: 10.1162/dint_a_00033
What about DataVerse?
Workflows have data and software
characteristics
RO-Crate preserves metadata and the objects
– workflow, data, datasets whatever…
• Archive/republish independent of
WorkflowHub
• Move content from one repository to
another, one service to another
• Point to content and don’t move it
• Sharing reproducible results & methods
Set data and
workflows and their
metadata free!
RO-Crate RepositoryCollection, RepositoryObject
represent records in a repository, to describe an export from a repository or
digital library
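A rough sketch of how a repository export might use these two types. The records, identifiers and titles are hypothetical. Linking the collection to its members with `hasMember` is an assumption here and should be checked against the RO-Crate specification before use.

```python
import json

# Hypothetical records exported from a data repository.
records = [
    {"id": "https://example.org/dataset/1", "title": "Survey responses 2020"},
    {"id": "https://example.org/dataset/2", "title": "Survey codebook"},
]


def export_graph(collection_id, collection_name, records):
    """Build JSON-LD entities for a collection and the records it contains."""
    collection = {
        "@id": collection_id,
        "@type": "RepositoryCollection",
        "name": collection_name,
        # Assumed property linking the collection to its member records.
        "hasMember": [{"@id": r["id"]} for r in records],
    }
    objects = [
        {"@id": r["id"], "@type": "RepositoryObject", "name": r["title"]}
        for r in records
    ]
    return [collection, *objects]


graph = export_graph(
    "https://example.org/collection/surveys", "Survey collection", records
)
print(json.dumps(graph, indent=2))
```

These entities would sit in the `@graph` of a crate's metadata file, alongside the root dataset, so that the export keeps the repository's own record structure.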
https://www.researchobject.org/ro-crate/community
https://about.workflowhub.eu/community/

Reproducibility (and the R*) of Science: motivations, challenges and trends
 
Introduction to FAIRDOM
Introduction to FAIRDOMIntroduction to FAIRDOM
Introduction to FAIRDOM
 



FAIR Workflows and Research Objects get a Workout

  • 1. FAIR Workflows and Research Objects get a Workout Carole Goble The University of Manchester, UK carole.goble@manchester.ac.uk DataVerse Community Conference 2021, 15th June 2021
  • 2. EOSC-Life pan-national data & method thematic commons for bioscience data and methods Using and sharing data, tools and workflows in the cloud
  • 3. Infrastructure Zoo Flows around a Federated & Diverse System 1466 data repositories / archives 916 data format and metadata standards* Not including the institutional or national repositories like DataVerse https://fairsharing.org/ accessed May 2021 From compounds to clinical trials Primary data - Secondary use
  • 4. Infrastructure Zoo Flows around a Federated & Diverse System https://fairsharing.org/ accessed May 2021 Community domain enclaves fragmented resources flow across platforms & sovereignties Workflows as an entry point and integration mechanism Legacy • data repositories & data platforms • processing and workflow platforms
  • 5. CryoEM Image Analysis Metagenomic Pipelines Drug Discovery Quality control Replication Scrutiny Shared know-how Repetition
  • 6. SARS-CoV-2 pre-processing, monitoring, analysis https://elixir-europe.org/news/covid-19-variants-galaxy
  • 7. Beyond Data: Computational Workflows as method objects to be shared, ported and reused & repurposed Multi-step Leverage third party codes Scalable processing of data Transparent research Computational Workflows Specification description Software Execution A special kind of software Separation of the workflow specification from its execution Precise description of a procedure: multi-step process coordinated by input/output data relationships (data types). Execution of computational processes (run a code, invoke a service…). Data is consumed and produced by each step.
  • 8. Beyond Data: Computational Workflows as method objects to be shared, ported and reused & repurposed Multi-step Leverage third party codes Scalable processing of data Transparent research Computational Workflows <my scripts> A Zoo of Workflow Systems and “systems”* Native repositories *https://s.apache.org/existing-workflow-systems
  • 10. Beyond Data: Multi-part Research Objects dependencies and associates scattered across repositories and within repositories made at different times by different people Workflow itself Workflow associated Objects Specification descriptions Parameters Input Datasets Output Datasets Runtime details & Provenance Documentation Bind to Dependencies - Containers - Codes - Sub-workflows Bind to particular test engines Publications Image Other workflows Sub workflows Software Execution Inputs and outputs Author
  • 11. Beyond Data: Computational Workflows as multi-part method objects to be shared, ported and reused & repurposed Services for FAIR Workflows • Describe workflows with PIDs and metadata • Flow: Move workflows between services and platforms • Parts: Package (scattered) objects linked together by context (metadata files with their objects) Honouring • the legacy and diverse ecosystem • buy-in from platforms Be KISSy • practical and developer friendly standards, and webby mechanisms • extensible openendedness – unknown unknowns & diversity…. Workflow Registry Workflow Systems Repos Containers Deploys Testing Monitoring
  • 12. Open Registry for Workflows Perpetual Development in the open by an open community https://workflowhub.eu Towards FAIR workflows and FAIR registry • Find and Access Workflows – Workflows may remain in their native repositories in their native form. Or can deposit. – Register (push) / Harvest (pull) • Workflows interoperability and reusability – Using metadata standards framework Makers are the custodians • people organisation: spaces, teams, organisations … • workflow organisation: collections, tagging, facets ... • credit: for submitters and authors Open to any platform, any subject, any person WorkflowHub Club
  • 13. TRS - Tool Registry Service API Access:
  • 14. FAIR Workflows are FAIR Software living and with dependencies…workflow history/provenance Indicators of Status Workflow monitoring Register versions (Support GitHub actions) Incremental metadata and supplementary materials (Tracking & Lifting out subworkflows)
  • 15. Which Workflow Objects are FAIR? • workflow specification with test or exemplar data? • implementation of that design in a particular WfMS? • instantiation of that implementation ready to run with input data, parameters set, computational services spun up? • run result with intermediate/final data products and provenance logs? • In practice this is a bit blurry. A metadata framework extensible enough to cope
  • 16. FAIR Workflows are FAIR Digital Objects Descriptive, machine actionable metadata framework from the community practical and developer friendly standards, extensible openendedness Standardised metadata about the workflows for registration, discovery Schema.org profile and types ComputationalWorkflow FormalParameter ComputationalTool Canonical workflow description of the workflow itself Executable and Abstract form Type the input and output data formats of the steps Ontology of types of data and data identifiers, data formats, operations in life sciences Upload and Download the parts? Exchange between services & platforms? Sharing & archiving the components of science
  • 17. Let's step back! Beyond Data: Multi-part Research = Multi-part ROs Each object has its own metadata and repositories Integrated view & context over fragmented resources using their PIDs and metadata Need a way of packaging up, describing the package and parts, citing, shipping around, storing, archiving, sharing. Reference real things. Like people, mice and equipment.
  • 18. Beyond Data: Multi-part Research Objects Describing a Dataset as a Digital Object A way of packaging up, describing the package and parts, citing, shipping around, storing, archiving, sharing. Even reference real things. Like people, mice and equipment. Image Courtesy of Peter Sefton: https://arkisto-platform.github.io/standards/ro-crate/
  • 19. The dataset may contain any kind of data resource, about anything, in any format as a file or URL. They can be scattered across repositories. Each resource can have a machine readable description in JSON-LD format A human-readable description and preview can be in an HTML file that lives alongside the metadata Provenance and workflow information can be included - to assist in data and research-process re-use RO-Crate Digital Objects may be packaged for distribution e.g. via Zip, BagIt and OCFL Objects Courtesy Peter Sefton, https://arkisto-platform.github.io/standards/ro-crate/ A data repository perspective
  • 20. Not just for workflows! For any kind of object data, publications, SOPs, software … and data repositories! especially data repositories! Aggregate files, any URI-addressable content, another RO-Crate, along with contextual information, into a citable RO-Crate which has its own metadata. Can use as a bag of references: large/sensitive datasets citation aggregator FAIR here FAIR here
  • 21. Unbounded Research Objects Anything referenceable that may be scattered across different repositories and/or different datasets in the same repository. Self describing integrated view spanning over fragmented resources using PIDs and metadata Metadata held alongside heterogeneous data Infrastructure independent • Exchange between repositories, registries and services. • Avoid vendor lock-in
  • 22. Practical, lightweight approach Machine and human readable, search engine friendly and developer familiar, blah blah FAIR Object middleware/underware Standard Web Native PIDs + JSON-LD + Schema.org, off the shelf archiving formats Self-describing, Typed by profiles + add more schema.org and domain ontologies Extensible, descriptive and content openendedness, honouring legacy, diversity, and known and unknown unknowns - one size does not fit all, blah blah A Graph inside the RO-Crate PIDs connect the Graph to the outside world http://www.researchobject.org/ro-crate/
  • 23. RO-Crate variants: Profiles are extensible typing RO-Crates collect metadata Workflow-RO-Crate Workflow-Testing-RO-Crate Workflow-Run-RO-Crate *https://repository.publisso.de/resource/frl:6423291 https://www.researchobject.org/ro-crate/profiles.html BioComputeObject-RO-Crate Galaxy-Workflow-RO-Crate maDMP RO-Crate* DataRepo-RO-Crate DataRepo-DataCube-RO-Crate Aggregated DataCitation RO-Crate Secure Bags of PIDs to sensitive / large data
  • 24. A step towards FAIR Digital Objects* “To be FAIR each digital object type has its own metadata requirements, and may have its own repositories and registries” FAIR Digital Objects for Science: From Data Pieces to Actionable Knowledge Units: https://doi.org/10.3390/publications8020021 https://fairdo.org
  • 25. FAIR Digital Objects Actionable knowledge unit Digital butterfly – digital twins Bags of references courtesy Dimitris Koureas Coordinator DiSSCo EU Research Infrastructure Specimen object image courtesy of Alex Hardisty
  • 26. Specimen Data Refinery Workflows to Digitise Natural History Specimens FAIR Digital Objects -> Packaged + Actionable + FAIR Digital Object Framework Open Digital Specimen Workflow Infrastructure courtesy of Alex Hardisty and Laurence Livermore
  • 27. Real Use Cases Considered Essential! • Building out in the open accelerated progress RO-Crate is metadata middleware • smart use of wheels already invented • it takes a village: get tools, services on board • developer friendly, firm best practice A little bit of semantics goes a long way… • Schema.org + JSON-LD …prepare for more Known and Unknown unknowns, One size does not fit all • descriptive openendedness, multi-interpretation Metadata sucks • auto-curation is the way forward folks! What about the workout?
  • 28. What about FAIR? FAIR at multiple levels & granularities • Workflows & RO-Crates are composite and nested, with dependencies • FAIR all the way down • Not always compatible – e.g. licenses FAIR+ • Reusable and Usable workflows - testing & parameter validation. Documentation. FAIR software paradigm is pervasive • Applies to RO-Crate Research Objects FAIR takes a village, of course C. Goble, S. Cohen-Boulakia, S. Soiland-Reyes, D. Garijo, Y. Gil, M.R. Crusoe, K. Peters & D. Schober. FAIR computational workflows. Data Intelligence 2 (2020), 108–121. doi: 10.1162/dint_a_00033
  • 29. What about DataVerse? Workflows have data and software characteristics RO-Crate preserves metadata and the objects – workflow, data, datasets whatever… • Archive/republish independent of WorkflowHub • Move content from one repository to another, one service to another • Point to content and don’t move it • Sharing reproducible results & methods Set data and workflows and their metadata free! RO-Crate RepositoryCollection, RepositoryObject represents records in a repository to describe an export from a repository or digital library
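
The slides describe an RO-Crate as a JSON-LD graph that sits alongside the data, types a workflow with Schema.org terms (ComputationalWorkflow, FormalParameter), and can act as a bag of references to content held elsewhere. A minimal sketch of such a metadata file, written in Python, might look like the following. The file name `workflow.ga`, the `example.org` URL, the entity names and the helper functions are hypothetical and chosen only for illustration; the result is not claimed to satisfy every requirement of the Workflow-RO-Crate profile.

```python
import json

# Hypothetical identifiers, used only for illustration.
WORKFLOW_FILE = "workflow.ga"
REMOTE_DATA = "https://example.org/data/reads.fastq.gz"


def build_crate_metadata():
    """Return a minimal RO-Crate 1.1 style metadata document as a dict."""
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {   # Metadata descriptor: says which entity is the crate root.
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
                "about": {"@id": "./"},
            },
            {   # Root dataset: aggregates a packaged file and a remote URL.
                "@id": "./",
                "@type": "Dataset",
                "name": "Example workflow crate",
                "hasPart": [{"@id": WORKFLOW_FILE}, {"@id": REMOTE_DATA}],
                "mainEntity": {"@id": WORKFLOW_FILE},
            },
            {   # The workflow itself, typed with the Schema.org/Bioschemas term.
                "@id": WORKFLOW_FILE,
                "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
                "name": "Example workflow",
                "input": [{"@id": "#reads"}],
            },
            {"@id": "#reads", "@type": "FormalParameter", "name": "reads"},
            {   # Referenced by URL rather than copied into the package.
                "@id": REMOTE_DATA,
                "@type": "File",
                "name": "Input reads (referenced, not packaged)",
            },
        ],
    }


def referenced_ids(value):
    """Yield every @id used as a reference ({"@id": ...}) inside a value."""
    if isinstance(value, dict):
        if set(value) == {"@id"}:
            yield value["@id"]
        else:
            for inner in value.values():
                yield from referenced_ids(inner)
    elif isinstance(value, list):
        for item in value:
            yield from referenced_ids(item)


def external_references(doc):
    """Return referenced @ids that are not described by an entity in the graph."""
    described = {entity["@id"] for entity in doc["@graph"]}
    used = set()
    for entity in doc["@graph"]:
        used.update(referenced_ids(entity))
    return used - described


if __name__ == "__main__":
    crate = build_crate_metadata()
    print(json.dumps(crate, indent=2))
    print("Points outside the crate graph:", sorted(external_references(crate)))
```

In this sketch the only reference that leaves the graph is the specification identifier in `conformsTo`, which loosely illustrates the point that identifiers connect the graph inside the crate to the outside world. The remote dataset is described inside the graph but its bytes stay where they are.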