The Taverna Software Suite

  1. 1. The Taverna Software Suite Prof Carole Goble FREng FBCS CITP The University of Manchester, UK carole.goble@manchester.ac.uk http://www.mygrid.org.uk http://www.taverna.org.uk
  2. 2. The Taverna Suite of Tools Client User Interfaces Workflow Repository Service Catalogue Third Party Tools Web Portals / Gateways Activity and Service Plug-in Manager Workflow Provenance Workflow Server Secure Service Access (OAuth 1 & 2, username/password, certificates) Workflow Engine Virtual Machine Prog APIs Command Line Player Workflow Components Workbench Taverna Lite Interaction Server
  3. 3. VPH-Share Project Models of Human Physiology Eagle Genomics & NHS Next Generation Sequencing based Patient Diagnostics Astronomy & HelioPhysics Library Doc Preservation Systems Biology of Micro-Organisms OpenTox Project Chemistry Development Kit Drug Toxicity BioDiversity Invasive Species Modelling Metagenomics
  4. 4. 5820 members, 304 groups, 2415 workflows, 604 files and 229 packs (research objects)
  5. 5. biovel.myexperiment.org
  6. 6. 5820 members, 304 groups, 2415 workflows, 604 files and 229 packs (research objects)
  7. 7. The Wf4Ever Components http://www.wf4ever-project.org Models Encoded in Standards Contributed to Standards Services Foundational, Extension, User APIs, Architecture Web protocols/services Policy and Planning Leveraging established protocols Preservation planning, policies Best workflow design practices Reference Systems Command line+ Third party systems User Driver
  8. 8. The Research Object www.researchobjects.org Execution Platform
  9. 9. Using and Making Standards Standard id for each component ORCID, DOI, URI OAI-ORE Structuring and Bundling descriptions and components. W3C Open Annotation Data Model (AO) Wf4Ever instrumental and hosting rollout meeting in Manchester Transferable annotations Structured and semantically tagged packs for exchange and for linking across repositories Semantic Web Encoding Aggregation Annotation Identity ro Ontology
  10. 10. Preservation Checklist Monitoring environment Metadata Completeness Release, not Publish Software release practice for workflows and scripts, services, data, articles, research objects Gamble, Zhao, Klyne, Goble. MIM: A Minimum Information Model Vocabulary and Framework for Scientific Linked Data, 8th IEEE e-Science 2012, Chicago, USA W3C PROV Repair record Preserved record of execution Gil, Miles, Belhajjame, Deus, Garijo, Klyne, Missier, Soiland-Reyes, Zednik. Primer for the PROV Provenance Model. World Wide Web Consortium (W3C). 2012. Belhajjame, Goble, Soiland-Reyes, De Roure. Fostering Scientific Workflow Preservation Through Discovery of Substitute Services. Proc 7th IEEE eScience 2011 Stockholm Sweden Schopf, Treating Data Like Software: A Case for Production Quality Data, JCDL 2012 minim wfprov roevo
  11. 11. Preservation Model Experiment Descriptions Organise workflows into structured studies wfdesc Inputs, outputs, dependencies Workflow Decay Component, Data & Infrastructure unavailability or inaccessibility Taverna Components Experiment Decay Methodological changes New technologies, resources, components, data Workflow Motifs IEEE e-Science 2012 FGCS submission Best Practices SWAT4LS
  12. 12. http://www.researchobject.org/ W3C Research Object for Scholarly Communication (ROSC) Community Group http://www.w3.org/community/rosc/
  13. 13. Taverna Engine Execution • Scufl2 language • Functional dataflow, simple control flows, implicit iteration • Linking services and tools • Data movement, monitoring, staging, reference • “In Workflow Programming” Beanshell scripting • Provenance collection: W3C PROV(+) format • Plug-in Framework – Infrastructures: Grid, HPC, Web Services (SOAP, REST) – Domain: CDK, BioMart, VOTable, SADI – Common Tools: Excel Spreadsheets, Google Refine, R • OAuth security plug-in
  14. 14. Taverna Pro-Workbench • Desktop application • GUI • Intermediate results views • Gateway to BioCatalogue and myExperiment • Plug-in Framework
  15. 15. Workflow Blocks made of a workflow • Well described • Well behaved • Well looked after • Agreed fail • Agreed formats in and out • Agreed provenance Deposited in myExperiment Grouped into families Components
  16. 16. Workflow Blocks made of a workflow • Well described • Well behaved • Well looked after • Agreed fail • Agreed formats in and out • Agreed provenance Deposited in myExperiment Grouped into families Components
  17. 17. Workflow Blocks made of a workflow • Well described • Well behaved • Well looked after • Agreed fail • Agreed formats in and out • Agreed provenance Deposited in myExperiment Grouped into families Components
  18. 18. Desktop Client http://www.xworx.org/ Data Centric Interface BIFI (Beautiful Interfaces for Inputs) Taverna Workbench Plug-in, GUI definition language
  19. 19. Data services • Vanilla Taverna – Domain data type neutral • AstroTaverna plug-in – IVOA data services – VOTables • PyWPS plug-in – Exposes OGC-compliant Web Processing Services that can handle large data
  20. 20. Taverna Server • Multiple clients, Multi-user • SOAP and REST API [Diagram: Server Host, Taverna Server “Client”, Taverna Server Front End, TavServ Back End, Service]
  21. 21. Taverna Server Family • Taverna Server – Multiple clients, Multi-user – SOAP and REST API • Taverna Server Amazon Machine Image – Bundled R server, Atom feed server – Multiple instances in Amazon Cloud and as required, for multiple users/uses and different security scenarios • Taverna Virtual Machine • Taverna Command Line • Bundled Servers
  22. 22. Calling DCI Grid/Cloud Services • Expose services/tools as WSDL/REST services – HELIO: Fixed host name – VPH-Share: Services running on dynamically started instances – SZTAKI Desktop Grid – BOINC/Debian Package • Specific service/extension to Taverna – UNICORE plugin: Ask grid what services are available, Include services in a workflow, Invoke services on the grid see talk by Shahbaz Memon • Library to control job submission to grid – PBS plugin: beanshells in a workflow include invocations of jobs – KnowARC plugin: Advanced Resource Connector to submit jobs to NorduGrid
  23. 23. Web interface Input SNPs Results Storage (S3) Ensembl (mySQL) Cache (S3) Taverna Server Taverna Server Taverna Server Workflow engine orchestrator e-Hive other Taverna Application specific tools and Web Services Application specific tools and Web Services Application specific tools and Web Services WS WS Tool Tool WS All user interaction via web interface User data stored in the Cloud Data for all tools and Web Services stored in the Cloud Unified access to different workflow engines with our common REST API Tools and Web Services for each workflow are installed together for easy replication Cloud Analytics for Life Sciences
  24. 24. Tavoop: Taverna & Hadoop • Compiles Taverna Workflow to collection of Hadoop jobs • Designed for handling very large amounts of data – Overhead to using Hadoop, but wins if enough data – Data ingest (expensive step) must have already been done • Supports Taverna Platform Execution interface • Parallelisable service types • http://wiki.opf-labs.org/display/SP/PPL [Diagram: Hadoop Cluster, Taverna Execution Interface, Tavoop Compiler, Portal (Taverna Player), GUI Application (Workbench)]
  25. 25. Interacting with a workflow • Many workflows need user interaction • A workflow on a server does not need to be “press a button and wait” – VPH-Share opens a VNC connection to the spawned instance. • Taverna Interaction Service – Users interact with a workflow (wherever it is running) in a web browser. – Interaction Service Plug-in in workbench
  26. 26. URLs and Frames
  27. 27. Taverna Tool Spectrum Technical Computational Scientist Domain Scientist Workbench Workbench Components Lite Domain-Specific Website / Tool / Portal Workflow Visibility Concept Knowledge Taverna Domain High Low Player Command Line
  28. 28. Taverna Client Family • Java library / Ruby GEM • Run a Taverna workflow in another workflow system e.g. Galaxy tools • Command line • Simple Taverna “player” – Fixed workflow • Upload & run workflows and choose data – Universitat Pompeu Fabra’s “Soaplab MajorDomo” – Taverna Lite
  29. 29. Taverna-Lite Generic Web-based Client Hide complexity Access to datasets Upload and interact with workflows Build Portal • Homepage • User-Sessions • Workflow Management • Run Management • Server Credentials Uses Components for simpler assembly and workflow edits
  30. 30. Web apps to create and run workflows Service Chaining Editor Pete Walker et al Plymouth Marine Laboratory For chaining OGC Web Processing Service geospatial Web services
  31. 31. Web apps to create and run workflows Online Taverna • Dr Vadim Surpin and Vitaly Sharanutsa • Institute for Information Transmission Problems of Russian Academy of Sciences (IITP RAS) An online, in-browser application for assembling and running Taverna Workflows over a HPC platform Software Sustainability Institute Booth Dr Vadim Surpin
  32. 32. Upload workflow by URL Online Taverna
  33. 33. Taverna 3 Beta July 2013
  34. 34. Summary • Taverna Suite for interactive and batch workflows • Flexible Plug-ins and Flexibly Plugged-in • Themed Taverna • Establishing Taverna Foundation • We welcome collaboration/contribution • http://www.taverna.org.uk
  35. 35. Learn more…. • myGrid – http://www.mygrid.org.uk • Taverna – http://www.taverna.org.uk • myExperiment – http://www.myexperiment.org • BioCatalogue – http://www.biocatalogue.org • Wf4ever – http://www.wf4ever-project.org • SCAPE – http://www.scape-project.eu • Software Sustainability Institute – http://www.software.ac.uk • BioVeL – http://www.biovel.eu
  36. 36. • Virtual data objects – Johan • MOU – Portals for BioVeL – DCI platforms • myExperiment – SHIWA repository (execution) – How can we interchange
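
Slide 13 lists implicit iteration among the Taverna Engine's execution features: when a service that expects a single item is handed a list, the engine calls it once per element rather than requiring an explicit loop. The following is a minimal Python sketch of that idea, not Taverna's implementation. The names `depth`, `invoke` and `to_upper` are hypothetical, and the sketch covers only a single input (combining several inputs is omitted).

```python
def depth(value):
    """Nesting depth: 0 for a single item, 1 for a list of items, and so on."""
    if not isinstance(value, list):
        return 0
    if not value:
        return 1
    return 1 + depth(value[0])


def invoke(service, value, expected_depth=0):
    """Call `service` directly when the depth matches what it expects;
    otherwise iterate over the elements and collect the results (hypothetical helper)."""
    if depth(value) <= expected_depth:
        return service(value)
    return [invoke(service, item, expected_depth) for item in value]


def to_upper(sequence):
    """Stand-in for a service that handles exactly one string."""
    return sequence.upper()


single = invoke(to_upper, "acgt")
flat = invoke(to_upper, ["acgt", "ttga"])
nested = invoke(to_upper, [["a", "b"], ["c"]])
print(single, flat, nested)
```

The result keeps the shape of the input: a list in gives a list out, and a list of lists gives a list of lists.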

Editor's Notes

  • Mature workflow platform – since 2004
  • 880 unique IP addresses called home for updates in March.
  • http://www.myexperiment.org/packs/231.html
    EXECUTION ENVIRONMENT environment in – SHIWA portal
    So you can execute multiple applications.
    Rep
    Different views.
  • http://www.myexperiment.org/packs/231.html
  • SIOC and VoID
  • HELIO and the interop - talk at EGI
    myExperiment doesn’t manage workflow engines or DCIs
    And only really supports Taverna
    Galaxy engine
  • Identity
    URI for each component
    Aggregation
    OAI-ORE
    Bundling descriptions
    Annotation
    Annotation Ontology + OAC
    Release of the OAC model
    Wf4Ever instrumental and hosting rollout meeting in Manchester
  • Context specific tabs
    Structured annotation
    Transferable annotations
    Structured and semantically tagged packs
    For exchange
    For linking across repositories
  • Component level:
    - flux/decay/unavailability
    Data level:
    - formats/ids/standards
    Infrastructure level:
    - platform/resources
    Log
    Woodman, et al. Achieving Reproducibility by Combining Provenance with Service and Workflow Versioning. In: The 6th Workshop on Workflows in Support of Large-Scale Science. 2011, Seattle
    Track
    Versions and retractions
    Error propagation
    Contributions and credits
    Fix
    Workflow repair, alternate component discovery, Black box annotation
    ReRun and Replay
    Partial reproducibility: Replay some of the workflow
    A verifiable, reviewable trace in people terms
    Analyse
    Calculate data quality & trust,
    Decide what data to keep or release
    Compare to find differences and discrepancies
    Instrument systems and data that people actually use. The 95%.
    Attract people to use provenance-enabled systems
    Developer toolkits to make Useful Provenance Apps
    Layers, presentation, simplification, polishing for publication, tracking, repairing
    Multi-component provenance: Workflow + GoogleRefine + R + …
    PROV: a language for exchanging provenance information among applications. http://www.w3.org/2011/prov/wiki/Main_Page
    Automated provenance-rich publications http://reproducibleresearch.org
  • Infrastructures: Grid (UNICORE, caGrid), Supercomputing (PBS, OPAL), Web Services (SOAP, REST)
  • Interactions are specified by the URL of an HTML page (or PHP or whatever). When an interaction is shown, the page is included inside a frame. The frame handles getting and sending data and telling the engine that the interaction has finished. When the interaction page is loaded, it makes a JavaScript call to the frame to get the input values. It then fills in the page and lets the user do the actual interacting. The interaction page can send values back by making JavaScript calls to the frame. The page can also fail or cancel the interaction.
  • Java library and Ruby gem: both talk to the server’s REST API
  • Open Geospatial Consortium
  • The components of the architecture:
    An OSGi platform, with the Taverna Platform API
    implemented by Taverna Core 
    executes a workflow using the Taverna Engine
    uses Activity plugins for the different service types (WSDL, REST, Biomart, R scripts, command line tools, etc)
    also implemented by the Taverna Server client which uses the Java Client library to proxy running of a workflow on the Taverna Server
    The Taverna workbench to design and run workflows
    UI plugins for each service type
    executes workflows using the Taverna platform API
    The Taverna command line which executes workflows using the Taverna platform API
    A Taverna Server, which exposes the Taverna platform API as a REST API and SOAP API for executing workflows
    Taverna Player, which uses the Ruby client library to execute workflows on the Taverna Server
    Taverna Lite, which also uses the Ruby client library to execute workflows, and in addition manages a repository of workflows and allows user interactions.
    The OSGi framework (OSGi being an acronym for "Open Services Gateway initiative") is a module system and service platform for the Java programming language that implements a complete and dynamic component model, something that does not exist in standalone Java/VM environments. Applications or components (coming in the form of bundles for deployment) can be remotely installed, started, stopped, updated, and uninstalled without requiring a reboot; management of Java packages/classes is specified in great detail. Application life cycle management (start, stop, install, etc.) is done via APIs that allow for remote downloading of management policies. The service registry allows bundles to detect the addition of new services, or the removal of services, and adapt accordingly.
    The OSGi specifications have moved beyond the original focus of service gateways, and are now used in applications ranging from mobile phones to the open source Eclipse IDE. Other application areas include automobiles, industrial automation, building automation, PDAs, grid computing, entertainment, fleet management and application servers.
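
The architecture notes above describe one Taverna Platform API with two implementations (the local core, and a server client that proxies runs to a Taverna Server), with front ends such as the command line written against the API alone. A minimal Python sketch of that arrangement follows. Every name in it (`PlatformAPI`, `LocalCore`, `FakeServer`, `ServerClient`, `command_line`) is hypothetical, and the in-memory `FakeServer` stands in for a server that would really be reached over REST or SOAP.

```python
from abc import ABC, abstractmethod


class PlatformAPI(ABC):
    """Hypothetical stand-in for the platform API the front ends depend on."""

    @abstractmethod
    def run(self, workflow, inputs):
        """Execute `workflow` on `inputs` and return its outputs."""


class LocalCore(PlatformAPI):
    """Runs the workflow in-process, as a local engine would."""

    def run(self, workflow, inputs):
        return workflow(inputs)


class FakeServer:
    """In-memory substitute for a remote server (assumption: no network involved)."""

    def __init__(self):
        self._engine = LocalCore()
        self.runs = []

    def submit(self, workflow, inputs):
        self.runs.append(dict(inputs))  # record the run, then execute it
        return self._engine.run(workflow, inputs)


class ServerClient(PlatformAPI):
    """Implements the same API by forwarding each run to a server."""

    def __init__(self, server):
        self._server = server

    def run(self, workflow, inputs):
        return self._server.submit(workflow, inputs)


def command_line(platform, workflow, inputs):
    """A front end: it only knows the API, not where the workflow executes."""
    return platform.run(workflow, inputs)


def reverse_workflow(inputs):
    """Toy workflow with one input port and one output port."""
    return {"out": inputs["in"][::-1]}


server = FakeServer()
local_result = command_line(LocalCore(), reverse_workflow, {"in": "abc"})
remote_result = command_line(ServerClient(server), reverse_workflow, {"in": "abc"})
print(local_result, remote_result, len(server.runs))
```

Because the front end depends only on the interface, swapping local execution for server execution needs no change to `command_line`.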
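
The interaction mechanism described in the notes (a page shown inside a frame, which supplies input values, accepts values sent back, and tells the engine when the interaction has finished, failed or been cancelled) can be modelled as a small state machine. The sketch below is in Python rather than JavaScript and is only an illustration of the protocol as described. The class `InteractionFrame`, its method names, the `choose_first_page` function and the example URL are all assumptions, not Taverna's actual interface.

```python
from enum import Enum


class State(Enum):
    PENDING = "pending"
    FINISHED = "finished"
    FAILED = "failed"
    CANCELLED = "cancelled"


class InteractionFrame:
    """Hypothetical model of the frame that wraps an interaction page."""

    def __init__(self, page_url, inputs):
        self.page_url = page_url
        self._inputs = dict(inputs)
        self.outputs = {}
        self.state = State.PENDING

    def get_inputs(self):
        """Called by the page on load to fetch the input values."""
        return dict(self._inputs)

    def send_output(self, name, value):
        """Called by the page to send a value back to the workflow."""
        self._require_pending()
        self.outputs[name] = value

    def finish(self):
        self._close(State.FINISHED)

    def fail(self):
        self._close(State.FAILED)

    def cancel(self):
        self._close(State.CANCELLED)

    def _require_pending(self):
        if self.state is not State.PENDING:
            raise RuntimeError("interaction already closed")

    def _close(self, state):
        self._require_pending()
        self.state = state


def choose_first_page(frame):
    """Stand-in for an interaction page where the user picks the first option."""
    options = frame.get_inputs()["options"]
    frame.send_output("choice", options[0])
    frame.finish()


frame = InteractionFrame("http://example.org/choose.html", {"options": ["blastn", "blastp"]})
choose_first_page(frame)
print(frame.state.name, frame.outputs)
```

Once the frame has been closed, further calls from the page are rejected, which mirrors the note that the frame is what tells the engine the interaction is over.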