EOSC-Life
Workflow
Collaboratory
IEEE e-Science Conference 2021, 22nd Sept 2021
Carole Goble
The University of Manchester
Joint Head of Node ELIXIR-UK
EOSC-Life Cluster
carole.goble@manchester.ac.uk
The European Open Science Cloud
• seamless access to data, tools,
compute and services
• FAIR management
• reliable reuse of all research
digital objects produced along the
research life cycle
• Web of FAIR Data and services for
science & value-added services.
A federated and open multidisciplinary
environment where users can
publish, find and re-use
data, tools and services for
research, innovation and
educational purposes.
Figure: EPOS
13 Research Infrastructures
350+ institutions
45+ partners in project
https://lifescience-ri.eu/
https://www.eosc-life.eu/
An open collaborative
space for digital
biology in Europe
Since 2019
Computational Workflows for Data intensive Bioscience
prepare, analyze, and share increasing volumes of complex data
CryoEM Image Analysis
Metagenomic Pipelines
Protein Ligand
Simulation
[Adam Hospital]
[Rob Finn]
[Carlos Oscar Sorzano Sanchez]
Nature 573, 149-150 (2019)
https://doi.org/10.1038/d41586-019-02619-z
Computational Workflows
Multi-step processes to
coordinate and execute
multiple codes and handle
data and processing
dependencies
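As a minimal sketch of that definition, the Python below models a workflow as named steps with declared dependencies and runs them in dependency order. The step names and the `run_workflow` helper are hypothetical illustrations, not taken from any workflow management system named in this talk.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(steps, dependencies):
    """Run each step after the steps it depends on.

    steps: dict mapping step name -> callable taking a dict of upstream results
    dependencies: dict mapping step name -> set of upstream step names
    Returns the execution order and the result of every step.
    """
    results = {}
    order = []
    # static_order() yields each step only after all of its predecessors
    for name in TopologicalSorter(dependencies).static_order():
        upstream = {dep: results[dep] for dep in dependencies.get(name, set())}
        results[name] = steps[name](upstream)
        order.append(name)
    return order, results


# Hypothetical three-step pipeline: fetch -> clean -> summarise
steps = {
    "fetch": lambda up: [3, 1, 2, None],
    "clean": lambda up: [x for x in up["fetch"] if x is not None],
    "summarise": lambda up: sum(up["clean"]),
}
dependencies = {"fetch": set(), "clean": {"fetch"}, "summarise": {"clean"}}

order, results = run_workflow(steps, dependencies)
print(order, results["summarise"])
```

A real WfMS adds scheduling, containers, provenance and failure handling on top of this ordering idea.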
https://covid19.galaxyproject.org
https://covid19beacon.crg.eu
https://bit.ly/cog-uk-monitoring
SARS-CoV-2 pre-processing, monitoring, analysis
Automated monitoring of structured
data from the EU COVID-19 Data Portal
Managed central service and deployable
infrastructure
Improved data quality, uniformly
analysed data, submission to public
archives
Basis for new National French COVID-19
surveillance platform
Accelerating knowledge exchange
through workflow and data product
exchange
Take EOSC to the users’ tools
Workflows are an entry point to
the tools and datasets
functions for production quality
FAIR data processing
access to secure data processing
democratising resources
Figure Credit: Romain Dallet
A data and method commons / collaboratory
A portable environment of interoperable tools
RIs publish data, methods & services for management,
storage and reuse
Collaboratory stakeholders
Figure labels: Workflow System as a Platform; Workflow System as a Service; Labour; Reach
Stakeholders:
• Sys admin
• Tool developer
• Workflow developer & custodian
• Workflow user
• Workflow application user
• Computational user
Needs:
• infrastructure & services
• tools to be wrapped & maintained
• workflows to be developed, tested, run & maintained
• to find and understand workflows, with explanations to use them properly and safely
Principles of the Collaboratory: Honour legacy & diversity
Workflow management system agnostic
• WfMS
• Jupyter Notebooks
• Scripts
• Common Workflow Language
Different degrees of support
Buy-in & On-boarding of WfMS:
• Popular WfMS: Galaxy, Nextflow,
Snakemake, CWL
• Specialised WfMS: SCIPION, NMRPipeline
Workflow lifecycle support
Workflows as FAIR Digital Objects
https://fairdo.org
https://fairdo.org/wg/fdo-cwfr/
EOSC interoperability framework 2021
Workflows as FAIR Digital Objects
Encourage workflow communities to
make workflows data-FAIR
Support what we already have and
communities actually use
Towards adoption and sustainability
Open federated ecosystem of services
Open-ended standards and metadata
exchange for glue
Open communities
The EOSC-Life Workflow Collaboratory Infra Roadmap
People, workflows, services and standards for FAIR Workflows.
CONTAINERS
WF REPOS
REGISTRIES
<my script>
Dedicated Workflow
Testing and Monitoring
services
Workflow Registry
Existing EOSC, community,
commercial computational
infrastructure
Existing Workflow Mgt & Execution Systems
Community Wf Repos
FAIR Workflow Metadata & Standards Framework
Describe: self-describing workflows with PIDs and metadata.
Flow: move workflows between services and platforms. Conduits, not silos.
Parts: package (scattered) objects linked together by context (metadata files + their objects)
RO-Crate https://www.researchobject.org/ro-crate/
Bioschemas https://bioschemas.org/
Common Workflow Language https://www.commonwl.org
GA4GH TRS https://ga4gh.github.io/tool-registry-service-schemas/
Practical, lightweight approach: machine
and human readable, search engine friendly
and developer familiar
FAIR Object Underware
Standard Web Native PIDs + JSON-LD +
Schema.org, off the shelf archiving formats
Self-describing, duck-typed by profiles +
add more schema.org and domain
ontologies
Extensible, descriptive and content
open-endedness, honouring legacy, diversity,
and known and unknown unknowns: one size
does not fit all
A Graph inside the RO-Crate
PIDs connect the Graph to the
outside world
http://www.researchobject.org/ro-crate/
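To make the "graph inside the RO-Crate" concrete, here is a minimal sketch in Python that builds an `ro-crate-metadata.json` document for a single workflow file. The `@context`, the metadata descriptor and the workflow entity types follow the RO-Crate 1.1 conventions as far as I know them; the helper names, the file name `sort.cwl`, the title and the licence are hypothetical, and the result is not a complete or validated crate.

```python
import json


def make_workflow_crate(workflow_path, name, license_id):
    """Build a minimal RO-Crate style metadata document as a Python dict."""
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {   # metadata descriptor: points at the root of the crate
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
                "about": {"@id": "./"},
            },
            {   # root dataset: the crate itself, linking to its parts
                "@id": "./",
                "@type": "Dataset",
                "name": name,
                "license": {"@id": license_id},
                "hasPart": [{"@id": workflow_path}],
                "mainEntity": {"@id": workflow_path},
            },
            {   # the workflow file, typed with Schema.org / Bioschemas terms
                "@id": workflow_path,
                "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
                "name": name,
            },
        ],
    }


def entity_ids(crate):
    """List the identifiers of every entity in the crate's graph."""
    return [entity["@id"] for entity in crate["@graph"]]


# Hypothetical example values
crate = make_workflow_crate(
    "sort.cwl", "Sort lines", "https://spdx.org/licenses/Apache-2.0"
)
print(json.dumps(crate, indent=2))
```

Entities refer to each other by `@id`, which is how the flat `@graph` list becomes a graph, and how external PIDs such as the licence URL connect it to the outside world.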
https://workflowhub.eu
https://workflowhub.org
Workflows
• May remain in their host
repositories
• Organised by teams, collections
& properties
• Linked with data, docs …
WorkflowHub
• GA4GH TRS API
• RO-Crate for import/export
• Bioschemas for metadata
• CWL for canonical workflow
description
• Full GitHub integration Fall 2021
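As a rough illustration of what the GA4GH TRS API offers a client, the sketch below builds request URLs following the TRS v2 path layout (`/tools/{id}` and `/tools/{id}/versions/{version_id}/{type}/descriptor`). The base URL, the helper names and the identifiers 29 and 2 are assumptions for illustration; no request is sent.

```python
from urllib.parse import quote

# Assumed base URL for a TRS v2 endpoint; check the registry's own documentation.
TRS_BASE = "https://workflowhub.eu/ga4gh/trs/v2"


def trs_tool_url(tool_id, base=TRS_BASE):
    """URL describing one registered tool or workflow."""
    return f"{base}/tools/{quote(str(tool_id), safe='')}"


def trs_descriptor_url(tool_id, version_id, descriptor_type, base=TRS_BASE):
    """URL of the workflow descriptor for a given version and language type."""
    version = quote(str(version_id), safe="")
    return f"{trs_tool_url(tool_id, base)}/versions/{version}/{descriptor_type}/descriptor"


# Hypothetical workflow id and version
print(trs_tool_url(29))
print(trs_descriptor_url(29, 2, "GALAXY"))
```

Because the path layout is shared across TRS implementations, an execution platform can fetch a workflow from any compliant registry by changing only the base URL.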
Mixed depth of support for WfMS
• Lifting metadata from systems
• RO-Crate / TRS support
• Coupling to execution platforms
Linking up providers and users
Building visibility & reputation
Reciprocity to close the
“Find – Get – Use – Credit” loop
Canonical workflows, workflow
blocks and libraries
DOIs, Citation
Companion objects
Versioning
Knowledge Graphs linking out to
OpenAIRE, DataCite etc
Deposit workflows in Zenodo
Workflow Collaboratory
and Collections
Workflow Services: Testing and monitoring
Uses RO-Crates to exchange. Enriches RO-Crates & their metadata
Integrated with WorkflowHub
Central aggregation point for workflow test statuses and outputs from various testing
services (e.g., Travis CI, GitHub Actions, Jenkins, etc.).
Facilitate the periodic automated execution of workflow tests.
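The aggregation idea can be sketched as follows: collect the latest test outcome reported by each testing service and reduce them to one overall status per workflow. The function, the status labels and the input shape are hypothetical and do not describe the real monitoring service's data model or API.

```python
def aggregate_status(latest_results):
    """Reduce per-service test outcomes to one overall workflow status.

    latest_results: dict mapping a testing-service name to the outcome of
    its most recent run, where "passed" means success and anything else
    counts as not passing.
    """
    if not latest_results:
        return "not_available"
    passed = sum(1 for outcome in latest_results.values() if outcome == "passed")
    if passed == len(latest_results):
        return "all_passing"
    if passed == 0:
        return "all_failing"
    return "some_passing"


# Hypothetical latest outcomes from three testing services
status = aggregate_status(
    {"github_actions": "passed", "jenkins": "failed", "travis_ci": "passed"}
)
print(status)
```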
Benchmarking and Technical monitoring of bioinformatics tools.
Check workflow performance, provenance on containers, memory
usage …
https://openebench.bsc.es/dashboard
https://lifemonitor.eu/
Enable new services
https://github.com/inab/WfExS-backend
Workflow execution service for handling
sensitive human data & analysis
Consumes and creates RO-Crates
UI to start computational tasks based
on containerised software
[Jose Maria Fernandez, Laura Rodriguez-Navas, Salvador Capella, BSC]
Beyond Biology, Beyond Our Infrastructures
Specimen Data Refinery
Natural History Collection
Digitalisation Pipelines
EOSC-Life Workflow Collaboratory
Bringing EOSC to users through workflows
Making a picture out of a jigsaw through
metadata and APIs
Essential to get WfMS on board and adopt
active community efforts
Mainly built through virtual hackathons
and open development
Being rolled out into big EU projects on
infectious diseases and cancer
Consultancy and
training are essential in
infrastructures and
habitually under- or
mis-resourced
Workflow best practice
Delegate to WfMS
communities
PEOPLE
ARE
INFRASTRUCTURE
WorkflowHub Club: an open community effort
Join us on
https://about.workflowhub.eu/community/
EOSC-Life https://www.eosc-life.eu/
RO-Crate https://www.researchobject.org/ro-crate/
WorkflowHub https://workflowhub.eu/
Galaxy Europe https://galaxyproject.eu/
Bioschemas https://bioschemas.org/
Common Workflow Language https://www.commonwl.org/

All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PPRINCE C P
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 

EOSC-Life Workflow Collaboratory

  • 1. EOSC-Life Workflow Collaboratory IEEE e-Science Conference 2021, 22nd Sept 2021 Carole Goble The University of Manchester Joint Head of Node ELIXIR-UK EOSC-Life Cluster carole.goble@manchester.ac.uk
  • 2. The European Open Science Cloud • seamless access to data, tools, compute and services • FAIR management • reliable reuse of all research digital objects produced along the research life cycle • Web of FAIR Data and services for science & value-added services. A federated and open multi-disciplinary environment where they can publish, find and re-use data, tools and services for research, innovation and educational purposes.
  • 3. The European Open Science Cloud Figure: EPOS
  • 4. 13 Research Infrastructures 350+ institutions 45+ partners in project https://lifescience-ri.eu/ https://www.eosc-life.eu/ An open collaborative space for digital biology in Europe Since 2019
  • 5. 13 Research Infrastructures 350+ institutions 45+ partners in project An open collaborative space for digital biology in Europe https://lifescience-ri.eu/ https://www.eosc-life.eu/ Since 2019
  • 6. Computational Workflows for Data intensive Bioscience prepare, analyze, and share increasing volumes of complex data CryoEM Image Analysis Metagenomic Pipelines Protein Ligand Simulation [Adam Hospital] [Rob Finn] [Carlos Oscar Sorzano Sanchez] Nature 573, 149-150 (2019) https://doi.org/10.1038/d41586-019-02619-z Computational Workflows Multi-step processes to coordinate and execute multiple codes and handle data and processing dependencies
  • 7. https://covid19.galaxyproject.org https://covid19beacon.crg.eu https://bit.ly/cog-uk-monitoring SARS-CoV-2 pre-processing, monitoring, analysis Automated monitoring of structured data from the EU COVID-19 Data Portal Managed central service and deployable infrastructure Improved data quality, uniformly analysed data, submission to public archives Basis for new National French COVID-19 surveillance platform Accelerating knowledge exchange through workflow and data product exchange
  • 8. Take EOSC to the users’ tools Workflows are an entry point to the tools and datasets functions for production quality FAIR data processing access to secure data processing democratising resources Figure Credit: Romain Dallet A data and method commons / collaboratory A portable environment of interoperable tools RIs publish data, methods & services for management, storage and reuse
  • 9. WORKFLOW APPLICATION USER Collaboratory stakeholders TOOL DEVELOPER WORKFLOW USER SYS ADMIN WORKFLOW DEVELOPER & CUSTODIAN COMPUTATIONAL USER Workflow System as a Platform Workflow System as a Service Labour Reach need infrastructure & services need tools to be wrapped & maintained need workflows to be developed, tested, run & maintained need to find and understand workflows, with explanations to use properly and safely.
  • 10. Principles of the Collaboratory: Honour legacy & diversity Workflow management system agnostic • WfMS • Jupyter Notebooks • Scripts • Common Workflow Language Different degrees of support Buy-in & On-boarding of WfMS: • popular WfMS: Galaxy, nextflow, snakemake, CWL • Specialised WfMS: SCIPION, NMRPipeline Workflow lifecycle support Workflows as FAIR Digital Objects
  • 11. Principles of the Collaboratory: Honour legacy & diversity https://fairdo.org https://fairdo.org/wg/fdo-cwfr/ EOSC interoperability framework 2021 Workflows as FAIR Digital Objects Encourage workflow communities to make workflows data-FAIR Support what we already have and communities actually use Towards adoption and sustainability Open federated ecosystem of services Open ended standards and metadata exchange for glue Open communities
  • 12. The EOSC-Life Workflow Collaboratory Infra Roadmap People, workflows, services and standards for FAIR Workflows.
  • 13. CONTAINERS WF REPOS REGISTRIES <my script> Dedicated Workflow Testing and Monitoring services Workflow Registry Existing EOSC, community, commercial computational infrastructure Existing Workflow Mgt & Execution Systems Community Wf Repos
  • 14. FAIR Workflow Metadata & Standards Framework Describe: self-describe workflows with PIDs and metadata. Flow: move workflows between services and platforms. Conduits, not silos. Parts: package (scattered) objects linked together by context (metadata files + their objects) RO-Crate https://www.researchobject.org/ro-crate/ Bioschemas https://bioschemas.org/ Common Workflow Language https://www.commonwl.org GA4GH TRS https://ga4gh.github.io/tool-registry-service-schemas/
  • 15. Practical, lightweight approach Machine and human readable, search engine friendly and developer familiar FAIR Object Underware Standard Web Native PIDs + JSON-LD + Schema.org, off the shelf archiving formats Self-describing, duck-typed by profiles + add more schema.org and domain ontologies Extensible, descriptive and content openendedness, honouring legacy, diversity, and known and unknown unknowns - one size does not fit all A Graph inside the RO-Crate PIDs connect the Graph to the outside world http://www.researchobject.org/ro-crate/
  • 16. https://workflowhub.eu https://workflowhub.org Workflows • May remain in their host repositories • Organised by teams, collections & properties • Linked with data, docs … WorkflowHub • GA4GH TRS API • RO-Crate for import/export • Bioschemas for metadata • CWL for canonical workflow description • Full GitHub integration Fall 2021 Mixed depth of support for WfMS • Lifting metadata from systems • RO-Crate / TRS support • Coupling to execution platforms
  • 17. Linking up providers and users Building visibility & reputation Reciprocity to close the “Find – Get – Use – Credit” loop Canonical workflows, workflow blocks and libraries DOIs, Citation Companion objects Versioning Knowledge Graphs linking out to OpenAIRE, DataCite etc Deposit workflows in Zenodo Workflow Collaboratory and Collections
  • 18. Workflow Services: Testing and monitoring Uses RO-Crates to exchange. Enriches RO-Crates & their metadata Integrated with WorkflowHub Central aggregation point for workflow test statuses and outputs from various testing services (e.g., Travis CI, GitHub Actions, Jenkins, etc.). Facilitate the periodic automated execution of workflow tests. Benchmarking and Technical monitoring of bioinformatics tools. Check workflow performance, provenance on containers, memory usage … https://openebench.bsc.es/dashboard https://lifemonitor.eu/
  • 19. Enable new services https://github.com/inab/WfExS-backend Workflow execution service for handling sensitive human data & analysis Consumes and creates RO-Crates UI to start computational tasks based on containerised software [Jose Maria Fernandez, Laura Rodrigues-Navas, Salvador Capella, BSC]
  • 20. Beyond Biology, Beyond Our Infrastructures Specimen Data Refinery Natural History Collection Digitalisation Pipelines
  • 21. EOSC-Life Workflow Collaboratory Bringing EOSC to users through workflows Making a picture out of a jigsaw through metadata and APIs Essential to get WfMS on board and adopt active community efforts Mainly built through virtual hackathons and open development Being rolled out into big EU projects on infectious diseases and cancer Consultancy and training essential in infrastructures and habitually under- or mis-resourced Workflow best practice Delegate to WfMS communities PEOPLE ARE INFRASTRUCTURE
  • 22. WorkflowHub Club: an open community effort Join us on https://about.workflowhub.eu/community/ EOSC-Life https://www.eosc-life.eu/ RO-Crate https://www.researchobject.org/ro-crate/ WorkflowHub https://workflowhub.eu/ Galaxy Europe https://galaxyproject.eu/ Bioschemas https://bioschemas.org/ Common Workflow Language https://www.commonwl.org/
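
Slides 14 and 15 describe packaging a workflow as an RO-Crate: Web-native PIDs, JSON-LD and Schema.org, with a graph of metadata inside the crate. As an illustration only, the sketch below builds what a minimal `ro-crate-metadata.json` for a single workflow file might look like. It assumes the RO-Crate 1.1 context and the `ComputationalWorkflow` type used by Bioschemas; the function name `build_minimal_workflow_crate`, the file name `workflow.cwl`, and the local identifier `#cwl` are hypothetical and do not come from the slides. It is not the WorkflowHub implementation.

```python
import json


def build_minimal_workflow_crate(workflow_path, name, language_id="#cwl"):
    """Return a dict shaped like a minimal RO-Crate metadata file (a sketch).

    workflow_path, name and language_id are illustrative inputs, not values
    taken from the talk.
    """
    # Metadata descriptor: identifies this file and points at the root dataset.
    descriptor = {
        "@id": "ro-crate-metadata.json",
        "@type": "CreativeWork",
        "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
        "about": {"@id": "./"},
    }
    # Root dataset: the crate itself, with the workflow as its main entity.
    root = {
        "@id": "./",
        "@type": "Dataset",
        "name": name,
        "hasPart": [{"@id": workflow_path}],
        "mainEntity": {"@id": workflow_path},
    }
    # The workflow file, typed so that registries can recognise it.
    workflow = {
        "@id": workflow_path,
        "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
        "name": name,
        "programmingLanguage": {"@id": language_id},
    }
    # Contextual entity naming the workflow language (assumed here to be CWL).
    language = {
        "@id": language_id,
        "@type": "ComputerLanguage",
        "name": "Common Workflow Language",
        "url": {"@id": "https://www.commonwl.org/"},
    }
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [descriptor, root, workflow, language],
    }


if __name__ == "__main__":
    crate = build_minimal_workflow_crate("workflow.cwl", "Example workflow")
    print(json.dumps(crate, indent=2))
```

Every entity in the `@graph` is addressed by its `@id`, which is how the graph inside the crate can link to identifiers outside it, as slide 15 notes. A real crate would normally carry more metadata (licence, authors, test definitions) than this sketch shows.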