Wf4Ever:
Preserving workflows as
digital Research Objects
       Stian Soiland-Reyes
  myGrid, University of Manchester

          EGI Community Forum 2012, Workflow Systems workshop
           Leibniz Supercomputing Centre, Munich, 2012-03-28
My background

Taverna - Scientific Workflow Management System
http://www.taverna.org.uk/
~85000 downloads
EU projects: SCAPE, BioVeL, HELIO, e-Lico, VPH-SHARE, EGI-INSPiRE…

myExperiment - Web 3.0 virtual environment, library and social network for workflows
http://www.myexperiment.org/
~5000 registered users
~2200 workflows
~21 different systems

“A biologist would rather share their
 toothbrush than their gene name”




                                  Mike Ashburner and others
                                Professor in Dept of Genetics,
                                 University of Cambridge, UK
http://www.myexperiment.org/

“Facebook for Scientists” ...but different to Facebook!
• A repository of research methods
• A social network of people and things
• A Social Virtual Research Environment

• A probe into researcher behaviour
• Open source (BSD) Ruby on Rails app
• REST and SPARQL, Linked Data
• Influenced BioCatalogue, MethodBox and SysMO-SEEK

myExperiment currently has 5378 members, 292 groups, 2273 workflows, 534 files and 217 packs
 Workflow Preservation
    Research Objects
       Provenance
    Recommendation
 Astronomy and Genomics
                           http://www.wf4ever-project.org/
Wf4Ever
Challenges
Preservation of scientific workflows in data-intensive science
» Scientific workflows enable automation of scientific methods and encourage
  best practices to be shared
» Workflows need to be preserved for
     › Reuse, fundamental for incremental scientific development
     › Method reproducibility, key for credit and publication
» Workflow preservation is complex!
» Heterogeneous types of information need to be aggregated, including
  workflows and related resources forming research objects
» Research objects need to be trusted and understandable n years from now
» Social aspects need to be addressed in order to support reuse in scientific
  communities
The R.* dimensions


Reusable. The key tenet of Research Objects is to support the sharing and reuse of data, methods and processes.
Repurposeable. Reuse may also involve the reuse of constituent parts of the Research Object.
Repeatable. There should be sufficient information in a Research Object to be able to repeat the study, perhaps years later.
Reproducible. A third party can start with the same inputs and methods and see if a prior result can be confirmed.
Replayable. Studies might involve single investigations that happen in milliseconds or protracted processes that take years.
Referenceable. If research objects are to augment or replace traditional publication methods, then they must be referenceable or citeable.
Revealable. Third parties must be able to audit the steps performed in the research in order to be convinced of the validity of results.
Respectful. Explicit representations of the provenance, lineage and flow of intellectual property.
“Replacing the Paper: The Twelve Rs of the e-Research Record” on http://blogs.nature.com/eresearch/
Wf4Ever
                                   Forms of decay
Workflow Decay
• Service decay
     • Flux/decay/unavailability
• Data decay
     • Formats/ids/standards
• Infrastructure decay
     • platform/resources


Experiment Decay
•   Methodological changes
•   New technologies
•   New resources/components
•   New data
Preservation, Conservation, Recreating

Preserving
Archived Record
Fixed Snapshots
Review
Rerun & Replay

Conserving
Active Instrument
Live
Rerun & Reuse
Repair & Restore

Recreating
Archived Record
Active Instrument
Live
Rebuild Recycle Repurpose

Workflow Decay
Decay at different abstraction levels
Figure (not reproduced): labels Redo and Flux (three times)
http://www.gridworkflow.org/kwfgrid/gwes/docs/
Research objects




Research Objects as Social Objects




http://purl.org/wf4ever/ro#
Research Object model core (simplified)
Figure (not reproduced). Classes shown: ro:ResearchObject, ro:Resource, ro:Manifest, wfdesc:Workflow, ro:AggregatedAnnotation. Properties shown: ore:aggregates, ore:isDescribedBy, ro:annotatesAggregatedResource.
Note: This figure shows a simplified view of the RO core.
RO specification: http://wf4ever.github.com/ro/
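
The terms named on this slide can be illustrated with a small sketch in plain Python, with no RDF library. It is not the Wf4Ever implementation. It assumes the usual reading of the diagram: a ro:ResearchObject ore:aggregates its resources and annotations, ore:isDescribedBy a ro:Manifest, and a ro:AggregatedAnnotation points at its target with ro:annotatesAggregatedResource. The example.org identifiers, the helper function names and the "dangling target" check are hypothetical and exist only for illustration.

```python
# Namespaces: RO and WFDESC come from the slides; ORE is the OAI-ORE terms namespace.
RO = "http://purl.org/wf4ever/ro#"
WFDESC = "http://purl.org/wf4ever/wfdesc#"
ORE = "http://www.openarchives.org/ore/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical identifiers, used only for illustration.
BASE = "http://example.org/ro/demo/"


def build_research_object(base=BASE):
    """Return (subject, predicate, object) triples for one small Research Object."""
    ro = base
    manifest = base + "manifest"
    workflow = base + "workflow"
    annotation = base + "annotation-1"
    return [
        (ro, RDF_TYPE, RO + "ResearchObject"),
        # Assumed direction: the Research Object is described by its manifest.
        (ro, ORE + "isDescribedBy", manifest),
        (manifest, RDF_TYPE, RO + "Manifest"),
        # The workflow is an aggregated resource that is also a wfdesc:Workflow.
        (ro, ORE + "aggregates", workflow),
        (workflow, RDF_TYPE, RO + "Resource"),
        (workflow, RDF_TYPE, WFDESC + "Workflow"),
        # The annotation is aggregated too, and points at the workflow.
        (ro, ORE + "aggregates", annotation),
        (annotation, RDF_TYPE, RO + "AggregatedAnnotation"),
        (annotation, RO + "annotatesAggregatedResource", workflow),
    ]


def aggregated_resources(triples, ro):
    """List everything the given Research Object aggregates, sorted by URI."""
    return sorted(o for s, p, o in triples if s == ro and p == ORE + "aggregates")


def dangling_annotation_targets(triples, ro):
    """Return annotation targets that the Research Object does not aggregate."""
    aggregated = set(aggregated_resources(triples, ro))
    return sorted(
        o for s, p, o in triples
        if p == RO + "annotatesAggregatedResource" and o not in aggregated
    )


if __name__ == "__main__":
    triples = build_research_object()
    for resource in aggregated_resources(triples, BASE):
        print("aggregates:", resource)
    print("dangling targets:", dangling_annotation_targets(triples, BASE))
```

Plain tuples keep the sketch dependency-free. A real Research Object would normally be serialised as RDF and handled with an RDF toolkit.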
http://purl.org/wf4ever/ro#
Research Object model core




http://purl.org/wf4ever/wfdesc#
RO model: Workflow Description




http://purl.org/wf4ever/wfprov#
Workflow Provenance (wfprov)




Technical infrastructure


• Models  Semantic Web Encoding
    •   Research Object
    •   Annotation
    •   Provenance
    •   Evolution and Versioning
• Services Web APIs, REST services
    • Foundational, Extension, User
    • APIs, Architecture
• Principles
    • Map into standards
    • Adopt standards
    • Lightweight components
• Ecosystem
    • Command line
    • Portal
    • Third party systems
The Wf4Ever Proposal
Services
• User Clients
• Extension Services
• Foundation Services
Wf4Ever Reference Implementation
Prototype, Dec 2011
• Access & Usage Clients: RO Portal, RO Manager Tool, Dropbox Client ROBox
• Data Management & Analysis Services: Stability Evaluation, Completeness Evaluation, Recommender
• Storage Services: RO Digital Library
• Lifecycle Services: Taverna Workflow Mgmt System
Roadmap
                              Year 1 (Dec 2010 – Dec 2011)


» Exploration (2011)
   › Problem specification and requirements identification
   › Better understanding of workflow preservation needs
     from the domains (what does it mean to preserve a
     scientific workflow?)
   › Proofs of concept
   › Preliminary models, components, and integrated
     reference implementation
   › Result identification
Roadmap
                                   Year 2 (Dec 2011 – Dec 2012)


» Realization/validation (2012)
   › Validate the models, architectures and software in practice
   › Distributed components with different access/security
     arrangements – forming REST APIs and specifications
   › RO Content Campaign: Generate 1000s of ROs
   › First productization phase: Stable releases of models and
     reference implementation
   › Decay monitoring and notification (why is my workflow no longer
     stable?), reacting to decay, attribution and credit support
     beyond recommendation. Detailed use of provenance
   › Execution and interoperability support (SHIWA integration)
Roadmap
                                 Year 3 (Dec 2012 – Dec 2013)


» Exploitation (2013)
   › Final productization phase
   › Deployment in user environments and systems, enhanced with
     workflow preservation capabilities
   › RO-enabled myExperiment
   › RO-enabled Galaxy
   › RO-enabled Dataverse
   › … and more!
   › Deployment at publishers, e.g. Elsevier, Digital Science,
     GigaScience

Collaborations and impact
»   SHIWA – Sharing Interoperable Workflows
»   Publishers/journals: Elsevier, GigaScience (by BGI)
»   OpenPHACTS (nanopublications)
»   SCAPE (dataset preservation)
»   BioVeL (biodiversity - species preservation!)
»   Dataverse (data repository)
»   Galaxy (workflow system for genomics)
»   GenomeSpace (data integration platform)




Thank you!




                                      Any Questions?

                     http://www.wf4ever-project.org/




This work is licensed under the Creative Commons Attribution 3.0
Unported License. To view a copy of this license, visit
http://creativecommons.org/licenses/by/3.0/ or send a letter to Creative
Commons, 444 Castro Street, Suite 900, Mountain View, California,
94041, USA.

Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 

2012 03-28 Wf4ever, preserving workflows as digital research objects

  • 1. Wf4Ever: Preserving workflows as digital Research Objects
      Stian Soiland-Reyes, myGrid, University of Manchester
      EGI Community Forum 2012, Workflow Systems workshop, Leibniz Supercomputing Centre, Munich, 2012-03-28
  • 2. My background
      Taverna - Scientific Workflow Management System (http://www.taverna.org.uk/)
        ~85000 downloads
        EU projects: SCAPE, BioVeL, HELIO, e-Lico, VPH-SHARE, EGI-INSPiRE…
      myExperiment - Web 3.0 virtual environment, library and social network for workflows (http://www.myexperiment.org/)
        ~5000 registered users
        ~2200 workflows
        ~21 different systems
  • 3. “A biologist would rather share their toothbrush than their gene name”
      Mike Ashburner and others. Professor in Dept of Genetics, University of Cambridge, UK
  • 4. myExperiment (http://www.myexperiment.org/)
      • “Facebook for Scientists” ...but different to Facebook!
      • A repository of research methods
      • A social network of people and things
      • A Social Virtual Research Environment
      • A probe into researcher behaviour
      • Open source (BSD) Ruby on Rails app
      • REST and SPARQL, Linked Data
      • Influenced BioCatalogue, MethodBox and SysMO-SEEK
      myExperiment currently has 5378 members, 292 groups, 2273 workflows, 534 files and 217 packs
  • 5.
  • 6. Wf4Ever (http://www.wf4ever-project.org/)
      • Workflow Preservation
      • Research Objects
      • Provenance
      • Recommendation
      • Astronomy and Genomics
  • 7. Wf4Ever Challenges: Preservation of scientific workflows
      » Scientific workflows enable automation of scientific methods in data-intensive science and encourage best practices to be shared
      » Workflows need to be preserved for
        › Reuse, fundamental for incremental scientific development
        › Method reproducibility, key for credit and publication
      » Workflow preservation is complex!
      » Heterogeneous types of information need to be aggregated, including workflows and related resources, forming research objects
      » Research objects need to be trusted and understandable n years from now
      » Social aspects need to be addressed in order to support reuse in scientific communities
  • 8. The R.* dimensions
      Reusable. The key tenet of Research Objects is to support the sharing and reuse of data, methods and processes.
      Repurposeable. Reuse may also involve the reuse of constituent parts of the Research Object.
      Repeatable. There should be sufficient information in a Research Object to be able to repeat the study, perhaps years later.
      Reproducible. A third party can start with the same inputs and methods and see if a prior result can be confirmed.
      Replayable. Studies might involve single investigations that happen in milliseconds or protracted processes that take years.
      Referenceable. If research objects are to augment or replace traditional publication methods, then they must be referenceable or citeable.
      Revealable. Third parties must be able to audit the steps performed in the research in order to be convinced of the validity of results.
      Respectful. Explicit representations of the provenance, lineage and flow of intellectual property.
      Source: “Replacing the Paper: The Twelve Rs of the e-Research Record” on http://blogs.nature.com/eresearch/
  • 9. Wf4Ever: Forms of decay
      Workflow Decay
        • Service decay: flux/decay/unavailability
        • Data decay: formats/ids/standards
        • Infrastructure decay: platform/resources
      Experiment Decay
        • Methodological changes
        • New technologies
        • New resources/components
        • New data
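The three workflow decay categories above can be read as a simple classification of a workflow's external dependencies. The following is a minimal sketch under stated assumptions, not the Wf4Ever implementation: the `Dependency` class, the `decay_report` function and the injected `is_available` probe are all hypothetical names introduced for illustration, and only the category labels come from the slide.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

# Category labels taken from the slide; everything else is hypothetical.
SERVICE = "service decay"
DATA = "data decay"
INFRASTRUCTURE = "infrastructure decay"


@dataclass(frozen=True)
class Dependency:
    """Something a workflow relies on, tagged with the decay category it can cause."""
    name: str
    kind: str


def decay_report(
    dependencies: Iterable[Dependency],
    is_available: Callable[[Dependency], bool],
) -> Dict[str, List[str]]:
    """Group the dependencies that fail the probe by decay category."""
    report: Dict[str, List[str]] = {}
    for dep in dependencies:
        if not is_available(dep):
            report.setdefault(dep.kind, []).append(dep.name)
    return report


# Example with a fake probe: a real probe might call a service or resolve an identifier.
deps = [
    Dependency("sequence-alignment web service", SERVICE),
    Dependency("input dataset identifier", DATA),
    Dependency("execution platform", INFRASTRUCTURE),
]
unavailable = {"sequence-alignment web service"}
print(decay_report(deps, lambda d: d.name not in unavailable))
```

Passing the probe in as a function keeps the classification separate from how availability is actually checked, which would differ per dependency type.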
  • 10. Preservation, Conservation, Recreating
      Preserving: Archived Record; Fixed Snapshots; Review; Rerun & Replay
      Conserving: Active Instrument; Live; Rerun & Reuse; Repair & Restore
      Recreating: Archived Record; Active Instrument; Live; Rebuild; Recycle; Repurpose
  • 11. Workflow Decay: Decay at different abstraction levels
      [Figure; labels: Redo, Flux. Source: http://www.gridworkflow.org/kwfgrid/gwes/docs/]
  • 13. Research Objects as Social Objects
  • 14. Research Object model core (simplified)
      Namespace: http://purl.org/wf4ever/ro#
      Classes shown: ro:ResearchObject, ro:Resource, ro:Manifest, ro:AggregatedAnnotation, wfdesc:Workflow
      Properties shown: ore:aggregates, ore:isDescribedBy, ro:annotatesAggregatedResource
      Note: This figure shows a simplified view of the RO core.
      RO specification: http://wf4ever.github.com/ro/
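The RO core terms on this slide can be illustrated with a handful of RDF triples. This is a rough sketch under stated assumptions: the figure's arrows are lost in the transcript, so the subject/object direction of each property is inferred from the term names; the ORE, wfdesc and RDF namespace IRIs are assumed from their public vocabularies; and the example.org paths and function names are hypothetical. It uses plain tuples rather than an RDF library.

```python
# Namespace IRIs. RO is quoted on the slide; the ORE, wfdesc and RDF
# namespaces are assumed from their public vocabularies.
RO = "http://purl.org/wf4ever/ro#"
ORE = "http://www.openarchives.org/ore/terms/"
WFDESC = "http://purl.org/wf4ever/wfdesc#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def build_research_object(base, workflow_name):
    """Return (subject, predicate, object) IRI triples for one small RO.

    The paths under `base` are hypothetical, chosen only for illustration.
    """
    ro = base
    manifest = base + "manifest.rdf"
    workflow = base + workflow_name
    annotation = base + "annotations/1"
    return {
        (ro, RDF_TYPE, RO + "ResearchObject"),
        (ro, ORE + "isDescribedBy", manifest),
        (manifest, RDF_TYPE, RO + "Manifest"),
        (ro, ORE + "aggregates", workflow),
        (workflow, RDF_TYPE, RO + "Resource"),
        (workflow, RDF_TYPE, WFDESC + "Workflow"),
        (annotation, RDF_TYPE, RO + "AggregatedAnnotation"),
        (annotation, RO + "annotatesAggregatedResource", workflow),
    }


def aggregated_resources(triples, ro):
    """List the IRIs that `ro` aggregates, sorted for stable output."""
    return sorted(o for s, p, o in triples if s == ro and p == ORE + "aggregates")


def to_ntriples(triples):
    """Serialise IRI-only triples as N-Triples lines."""
    return "\n".join("<%s> <%s> <%s> ." % t for t in sorted(triples))


triples = build_research_object("http://example.org/ro/demo/", "workflow.t2flow")
print(to_ntriples(triples))
```

The printed lines are N-Triples, so they could be loaded into any RDF store and queried; the RO specification linked on the slide remains the authoritative description of the model.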
  • 18. Technical infrastructure
      • Models: Semantic Web encoding
        • Research Object
        • Annotation
        • Provenance
        • Evolution and Versioning
      • Services: Web APIs, REST services
        • Foundational, Extension, User
        • APIs, Architecture
      • Principles
        • Map into standards
        • Adopt standards
        • Lightweight components
      • Ecosystem
        • Command line
        • Portal
        • Third party systems
  • 19. The Wf4Ever Proposal: Services
      Layers shown: User Clients; Extension Services; Foundation Services
  • 20. Wf4Ever Reference Implementation: Prototype, Dec 2011
      Access & Usage Clients: Dropbox Client, RO Portal, RO Manager Tool, ROBox
      Data Management & Analysis Services: Stability Evaluation, Completeness Evaluation, Recommender
      Storage Services and Lifecycle Services: Taverna Workflow Mgmt System, RO Digital Library
  • 21. Roadmap Year 1 (Dec 2010 to Dec 2011)
      » Exploration (2011)
        › Problem specification and requirements identification
        › Better understanding of workflow preservation needs from the domains (what does it mean to preserve a scientific workflow?)
        › Proofs of concepts
        › Preliminary models, components, and integrated reference implementation
        › Result identification
  • 22. Roadmap Year 2 (Dec 2011 to Dec 2012)
      » Realization/validation (2012)
        › Validate the models, architectures and software in practice
        › Distributed components with different access/security arrangements, forming REST APIs and specifications
        › RO Content Campaign: Generate 1000s of ROs
        › First productization phase: Stable releases of models and reference implementation
        › Decay monitoring and notification (why is my workflow no longer stable?), reacting to decay, attribution and credit support beyond recommendation. Detailed use of provenance
        › Execution and interoperability support (SHIWA integration)
  • 23. Roadmap Year 3 (Dec 2012 to Dec 2013)
      » Exploitation (2013)
        › Final productization phase
        › Deployment in user environments and systems, enhanced with workflow preservation capabilities
        › RO-enabled myExperiment
        › RO-enabled Galaxy
        › RO-enabled Dataverse
        › … and more!
        › Deployment at publishers, e.g. Elsevier, Digital Science, GigaScience
  • 24. Collaborations and impact
      » SHIWA: Sharing Interoperable Workflows
      » Publishers/journals: Elsevier, GigaScience (by BGI)
      » OpenPHACTS (nanopublications)
      » SCAPE (dataset preservation)
      » BioVeL (biodiversity - species preservation!)
      » Dataverse (data repository)
      » Galaxy (workflow system for genomics)
      » GenomeSpace (data integration platform)
  • 25. Thank you! Any Questions?
      http://www.wf4ever-project.org/
      This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/ or send a letter to Creative Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041, USA.