2. The European Open Science Cloud
• seamless access to data, tools,
compute and services
• FAIR management
• reliable reuse of all research
digital objects produced along the
research life cycle
• Web of FAIR Data and services for
science & value-added services.
A federated and open multi-disciplinary environment where
European researchers, innovators, companies and citizens can
publish, find and re-use data, tools and services for
research, innovation and educational purposes.
4. 13 Research Infrastructures
350+ institutions
45+ partners in project
https://lifescience-ri.eu/
https://www.eosc-life.eu/
An open collaborative
space for digital
biology in Europe
Since 2019
6. Computational Workflows for Data intensive Bioscience
prepare, analyze, and share increasing volumes of complex data
CryoEM Image Analysis [Carlos Oscar Sorzano Sanchez]
Metagenomic Pipelines [Rob Finn]
Protein Ligand Simulation [Adam Hospital]
Nature 573, 149-150 (2019)
https://doi.org/10.1038/d41586-019-02619-z
Computational Workflows
Multi-step processes to
coordinate and execute
multiple codes and handle
data and processing
dependencies
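As a minimal sketch of the definition above (not taken from any of the workflow systems named in this deck), the following Python runs a toy multi-step process in dependency order. The step names, the toy data and the `run_workflow` helper are all hypothetical and chosen only for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each step maps to the set of steps it depends on.
DEPENDENCIES = {
    "fetch_reads": set(),
    "quality_filter": {"fetch_reads"},
    "assemble": {"quality_filter"},
    "report": {"quality_filter", "assemble"},
}

# Hypothetical step implementations; each receives the outputs of its dependencies.
STEPS = {
    "fetch_reads": lambda inputs: ["ACGT", "AC", "GGTTA"],
    "quality_filter": lambda inputs: [r for r in inputs["fetch_reads"] if len(r) >= 4],
    "assemble": lambda inputs: "".join(inputs["quality_filter"]),
    "report": lambda inputs: {
        "reads_kept": len(inputs["quality_filter"]),
        "assembly_length": len(inputs["assemble"]),
    },
}


def run_workflow(dependencies, steps):
    """Run every step after the steps it depends on; return order and outputs."""
    results = {}
    order = []
    for name in TopologicalSorter(dependencies).static_order():
        # Hand each step only the outputs it declared a dependency on.
        inputs = {dep: results[dep] for dep in dependencies[name]}
        results[name] = steps[name](inputs)
        order.append(name)
    return order, results


order, results = run_workflow(DEPENDENCIES, STEPS)
print(order)
print(results["report"])
```

A real workflow management system adds scheduling, containers, retries and provenance on top of this ordering; the sketch only shows the coordination of data and processing dependencies.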
8. Take EOSC to the users’ tools
Workflows are an entry point to
the tools and datasets
functions for production quality
FAIR data processing
access to secure data processing
democratising resources
Figure Credit: Romain Dallet
A data and method commons / collaboratory
A portable environment of interoperable tools
RIs publish data, methods & services for management,
storage and reuse
9. Collaboratory stakeholders
[Diagram: stakeholder roles placed between "Workflow System as a Platform" and "Workflow System as a Service", against axes of Labour and Reach]
Roles: Tool Developer; Workflow Developer & Custodian; Sys Admin; Workflow User; Workflow Application User; Computational User
Needs:
• infrastructure & services
• tools to be wrapped & maintained
• workflows to be developed, tested, run & maintained
• to find and understand workflows, with explanations to use properly and safely
10. Principles of the Collaboratory: Honour legacy & diversity
Workflow management system agnostic
• WfMS
• Jupyter Notebooks
• Scripts
• Common Workflow Language
Different degrees of support
Buy-in & On-boarding of WfMS:
• Popular WfMS: Galaxy, Nextflow,
Snakemake, CWL
• Specialised WfMS: SCIPION, NMRPipeline
Workflow lifecycle support
Workflows as FAIR Digital Objects
11. Principles of the Collaboratory: Honour legacy & diversity
https://fairdo.org
https://fairdo.org/wg/fdo-cwfr/
EOSC interoperability framework 2021
Workflows as FAIR Digital Objects
Encourage workflow communities to
make workflows data-FAIR
Support what we already have and
communities actually use
Towards adoption and sustainability
Open federated ecosystem of services
Open ended standards and metadata
exchange for glue
Open communities
12. The EOSC-Life Workflow Collaboratory Infra Roadmap
People, workflows, services and standards for FAIR Workflows.
13. [Diagram: components of the Workflow Collaboratory infrastructure]
• Workflow Registry
• Dedicated Workflow Testing and Monitoring services
• Existing Workflow Mgt & Execution Systems
• Community Wf Repos, containers, registries and user scripts (<my script>)
• Existing EOSC, community, commercial computational infrastructure
14. FAIR Workflow Metadata & Standards Framework
Describe: self-describing workflows with PIDs and metadata.
Flow: move workflows between services and platforms. Conduits, not silos.
Parts: package (scattered) objects linked together by context (metadata files + their objects)
RO-Crate https://www.researchobject.org/ro-crate/
Bioschemas https://bioschemas.org/
Common Workflow Language https://www.commonwl.org
GA4GH TRS https://ga4gh.github.io/tool-registry-service-schemas/
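To make the "Parts" idea concrete, here is a hedged sketch of the metadata file that packages a workflow as an RO-Crate. It follows my reading of the RO-Crate 1.1 conventions (an `ro-crate-metadata.json` descriptor, a root dataset `./`, and a workflow typed as `ComputationalWorkflow`). The file name `workflow.cwl`, the workflow name, the licence and the `build_workflow_crate` helper are hypothetical.

```python
import json

RO_CRATE_CONTEXT = "https://w3id.org/ro/crate/1.1/context"


def build_workflow_crate(workflow_file, name, license_id):
    """Return a minimal JSON-LD document describing one workflow file."""
    metadata_descriptor = {
        "@id": "ro-crate-metadata.json",
        "@type": "CreativeWork",
        "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
        "about": {"@id": "./"},
    }
    root_dataset = {
        "@id": "./",
        "@type": "Dataset",
        "name": name,
        "license": {"@id": license_id},
        "hasPart": [{"@id": workflow_file}],
        "mainEntity": {"@id": workflow_file},
    }
    workflow = {
        "@id": workflow_file,
        "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
        "name": name,
        "programmingLanguage": {"@id": "#cwl"},
    }
    language = {
        "@id": "#cwl",
        "@type": "ComputerLanguage",
        "name": "Common Workflow Language",
        "url": {"@id": "https://www.commonwl.org/"},
    }
    return {
        "@context": RO_CRATE_CONTEXT,
        "@graph": [metadata_descriptor, root_dataset, workflow, language],
    }


# Hypothetical values, for illustration only.
crate = build_workflow_crate(
    workflow_file="workflow.cwl",
    name="Example metagenomics workflow",
    license_id="https://spdx.org/licenses/Apache-2.0",
)
metadata_json = json.dumps(crate, indent=2)
print(metadata_json)
```

Writing `metadata_json` to `ro-crate-metadata.json` next to `workflow.cwl` would give a directory that tools reading RO-Crates can interpret; a registry-ready crate would normally carry more metadata (authors, versions, tests).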
15. Practical, lightweight approach
Machine and human readable, search engine friendly
and developer familiar
FAIR Object Underware
Standard Web Native: PIDs + JSON-LD +
Schema.org, off-the-shelf archiving formats
Self-describing: duck-typed by profiles +
add more schema.org and domain
ontologies
Extensible: descriptive and content
open-endedness, honouring legacy, diversity,
and known and unknown unknowns - one size
does not fit all
A Graph inside the RO-Crate
PIDs connect the Graph to the
outside world
http://www.researchobject.org/ro-crate/
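The "graph inside, PIDs pointing outside" picture can be sketched as a small traversal: walk the `@graph` of a crate and list every reference whose `@id` is a web URL rather than a path inside the crate. The sample crate and the helper names (`iter_references`, `external_links`) are hypothetical, and treating any `http(s)` identifier as "outside" is a simplifying assumption.

```python
from urllib.parse import urlparse

# Hypothetical, minimal crate used only to exercise the traversal.
SAMPLE_CRATE = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {
            "@id": "./",
            "@type": "Dataset",
            "license": {"@id": "https://spdx.org/licenses/Apache-2.0"},
            "hasPart": [{"@id": "workflow.cwl"}],
        },
        {"@id": "workflow.cwl", "@type": ["File", "ComputationalWorkflow"]},
    ],
}


def iter_references(value):
    """Yield the @id of every {"@id": ...} reference found in a property value."""
    if isinstance(value, dict):
        if "@id" in value:
            yield value["@id"]
    elif isinstance(value, list):
        for item in value:
            yield from iter_references(item)


def is_external(identifier):
    """Assumption: identifiers with an http(s) scheme point outside the crate."""
    return urlparse(identifier).scheme in ("http", "https")


def external_links(crate):
    """Return sorted (entity id, property, target) triples that leave the crate."""
    links = set()
    for entity in crate["@graph"]:
        for prop, value in entity.items():
            if prop == "@id":
                continue  # the entity's own identifier is not a link
            for target in iter_references(value):
                if is_external(target):
                    links.add((entity["@id"], prop, target))
    return sorted(links)


for link in external_links(SAMPLE_CRATE):
    print(link)
```

References such as `hasPart` to `workflow.cwl` stay inside the crate and are not reported; only the licence and the specification link leave it.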
16. https://workflowhub.eu
https://workflowhub.org
Workflows
• May remain in their host
repositories
• Organised by teams, collections
& properties
• Linked with data, docs …
WorkflowHub
• GA4GH TRS API
• RO-Crate for import/export
• Bioschemas for metadata
• CWL for canonical workflow
description
• Full GitHub integration Fall 2021
Mixed depth of support for WfMS
• Lifting metadata from systems
• RO-Crate / TRS support
• Coupling to execution platforms
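As an illustration of what a GA4GH TRS API offers clients, the sketch below builds request URLs following the TRS v2 path layout (`/tools/{id}/versions/{version_id}/{type}/descriptor`). The base URL is an assumption about where WorkflowHub mounts its TRS endpoint, and the tool id, version id and helper names are hypothetical; no network call is made.

```python
from urllib.parse import quote

# Assumption: the registry exposes TRS v2 under this prefix.
TRS_BASE = "https://workflowhub.eu/ga4gh/trs/v2"


def trs_tool_url(tool_id, base=TRS_BASE):
    """URL describing one registered workflow ("tool" in TRS terms)."""
    return f"{base}/tools/{quote(str(tool_id), safe='')}"


def trs_versions_url(tool_id, base=TRS_BASE):
    """URL listing the versions of one workflow."""
    return f"{trs_tool_url(tool_id, base)}/versions"


def trs_descriptor_url(tool_id, version_id, descriptor_type, base=TRS_BASE):
    """URL of the workflow descriptor for a given version and language type."""
    version = quote(str(version_id), safe="")
    return f"{trs_versions_url(tool_id, base)}/{version}/{descriptor_type}/descriptor"


# Hypothetical identifiers, for illustration only.
print(trs_tool_url("123"))
print(trs_descriptor_url("123", "1", "CWL"))
```

An execution platform could fetch the descriptor URL with any HTTP client and hand the result to its workflow engine, which is the kind of coupling the last bullet refers to.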
17. Linking up providers and users
Building visibility & reputation
Reciprocity to close the
“Find – Get – Use – Credit” loop
Canonical workflows, workflow
blocks and libraries
DOIs, Citation
Companion objects
Versioning
Knowledge Graphs linking out to
OpenAIRE, DataCite etc
Deposit workflows in Zenodo
Workflow Collaboratory
and Collections
18. Workflow Services: Testing and monitoring
Uses RO-Crates to exchange. Enriches RO-Crates & their metadata
Integrated with WorkflowHub
Central aggregation point for workflow test statuses and outputs from various testing
services (e.g., Travis CI, GitHub Actions, Jenkins, etc.).
Facilitate the periodic automated execution of workflow tests.
Benchmarking and Technical monitoring of bioinformatics tools.
Check workflow performance, provenance on containers, memory
usage …
https://openebench.bsc.es/dashboard
https://lifemonitor.eu/
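The "central aggregation point" idea can be sketched as follows: collect test reports from several testing services and reduce them to one status per workflow. This is not the LifeMonitor API; the record fields, the status vocabulary and the worst-status-wins rule are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumed status vocabulary, ordered from best to worst.
SEVERITY = {"passing": 0, "unknown": 1, "failing": 2}

# Hypothetical reports, as they might arrive from different testing services.
REPORTS = [
    {"workflow": "wf-a", "service": "GitHub Actions", "status": "passing"},
    {"workflow": "wf-a", "service": "Jenkins", "status": "failing"},
    {"workflow": "wf-b", "service": "Travis CI", "status": "passing"},
]


def aggregate_status(reports):
    """Summarise reports per workflow; the worst reported status wins."""
    by_workflow = defaultdict(list)
    for report in reports:
        by_workflow[report["workflow"]].append(report)

    summary = {}
    for workflow, items in by_workflow.items():
        worst = max(items, key=lambda r: SEVERITY[r["status"]])
        summary[workflow] = {
            "status": worst["status"],
            "services": sorted({r["service"] for r in items}),
        }
    return summary


summary = aggregate_status(REPORTS)
for workflow, info in sorted(summary.items()):
    print(workflow, info["status"], info["services"])
```

A periodic job could rebuild this summary after each scheduled test run and attach it to the workflow's registry entry.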
19. Enable new services
https://github.com/inab/WfExS-backend
Workflow execution service for handling
sensitive human data & analysis
Consumes and creates RO-Crates
UI to start computational tasks based
on containerised software
[Jose Maria Fernandez, Laura Rodrigues-Navas, Salvador Capella, BSC]
20. Beyond Biology, Beyond Our Infrastructures
Specimen Data Refinery
Natural History Collection
Digitalisation Pipelines
21. EOSC-Life Workflow Collaboratory
Bringing EOSC to users through workflows
Making a picture out of a jigsaw through
metadata and APIs
Essential to get WfMS on board and adopt
active community efforts
Mainly built through virtual hackathons
and open development
Being rolled out into big EU projects on
infectious diseases and cancer
Consultancy and
training are essential in
infrastructures, and
habitually under- or
mis-resourced
Workflow best practice
Delegate to WfMS
communities
PEOPLE
ARE
INFRASTRUCTURE
22. WorkflowHub Club: an open community effort
Join us on
https://about.workflowhub.eu/community/
EOSC-Life https://www.eosc-life.eu/
RO-Crate https://www.researchobject.org/ro-crate/
WorkflowHub https://workflowhub.eu/
Galaxy Europe https://galaxyproject.eu/
Bioschemas https://bioschemas.org/
Common Workflow Language https://www.commonwl.org/