-
1.
Integrating Taverna Player
into Scratchpads
www.taverna.org.uk | scratchpads.eu
Robert Haines+, Simon Rycroft*,
Vince Smith*, Carole Goble+
+University of Manchester, UK; *Natural History Museum, UK
robert.haines@manchester.ac.uk
-
2.
Scratchpads and Taverna Player
• What are Scratchpads?
• Taverna and Taverna Player
• Why integrate the two?
• Lightweight integration (embed)
• Tight integration (Web Service)
• Scratchpads developed in ViBRANT
• Taverna Player developed in BioVeL
-
3.
What are Scratchpads?
• Virtual Research Environments
• Hosted websites for biodiversity data
• Virtual research & publication platform
• Curated data and analysis
• Completely open access & open source
• Modular & flexible
-
4.
The Scratchpads concept
A Scratchpad is a website that holds data for you and your community
Your data External data & services
-
5.
The Scratchpads concept
-
6.
Scratchpads details
• Drupal CMS (7.26) with both custom and
Drupal.org contributed modules.
• Over 500 Scratchpads sites managed by Aegir
– www.aegirproject.org
• Hosted on two application servers
– Load balancing and caching performed by Varnish
• 2 MySQL database servers
– master-master configuration
• 1 Apache Solr search server
-
7.
Taverna
• Scientific Workflow Management System
• Workbench
– Desktop application
• Command-line tool
– Batch
• Server
– Multi-user
– Secure separation of workflow runs
– REST and SOAP interfaces
-
8.
Taverna Player
• A Ruby on Rails plugin library
– Hooks into host application’s
• Workflow model
• Authentication and authorization system
– Provides a REST interface
• Talks to Taverna Server’s REST interface
– Uploads the workflow, sets inputs
– Presents workflow interactions to the user
– Retrieves results, logs and provenance data
-
9.
Taverna Player
• Surfaces a workflow run in three ways:
– As a Web interface in the browser
• In the host application
– As an embeddable widget
• In any Web page (cf. YouTube videos)
– As a REST-based Web Service
• All look-and-feel and styling is derived from
the host application
– Rails’s hierarchical layouts and views
-
10.
Taverna Player
• Total workflow run isolation
– A worker per run
– State passed via database
• Scaling
[Diagram labels: Host Application, Taverna Player, Workers, Taverna Server (×3)]
-
11.
Taverna all together
-
12.
Taverna Player details
• Ruby on Rails
– Version 3.2 (released) and 4.x (testing)
– Plugin
• Delayed Job for workflow run isolation
– Manages workflow run queues
– Start workers to match Taverna Server capacity
– Loss of a workflow run will not affect any others
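The isolation idea above (one worker per run, state passed via the database, a failed run not affecting the others) can be sketched as follows. This is an illustration only, not Taverna Player's code: a thread pool and an in-memory SQLite table stand in for Delayed Job and the Rails database, and `execute_run`, `make_db`, `process_runs` and the `runs` table layout are hypothetical names.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def execute_run(run_id, workflow):
    """Hypothetical stand-in for executing one workflow run on Taverna Server."""
    if workflow == "broken":
        raise RuntimeError("workflow failed")
    return f"results of {workflow}"


def worker(run_id, workflow):
    # One worker handles exactly one run; any error is contained here
    # so that it cannot affect the other runs.
    try:
        return run_id, "finished", execute_run(run_id, workflow)
    except Exception as exc:
        return run_id, "failed", str(exc)


def make_db(workflows):
    # Assumed schema: the only shared state is a table of runs.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE runs (id INTEGER PRIMARY KEY, workflow TEXT, "
        "state TEXT DEFAULT 'pending', detail TEXT)"
    )
    db.executemany("INSERT INTO runs (workflow) VALUES (?)",
                   [(w,) for w in workflows])
    db.commit()
    return db


def process_runs(db, capacity):
    # 'capacity' mirrors starting workers to match Taverna Server capacity.
    pending = db.execute(
        "SELECT id, workflow FROM runs WHERE state = 'pending' ORDER BY id"
    ).fetchall()
    with ThreadPoolExecutor(max_workers=capacity) as pool:
        outcomes = list(pool.map(lambda row: worker(*row), pending))
    # State is passed back through the database, not between workers.
    for run_id, state, detail in outcomes:
        db.execute("UPDATE runs SET state = ?, detail = ? WHERE id = ?",
                   (state, detail, run_id))
    db.commit()
```

Calling `process_runs(make_db(["a", "broken", "b"]), capacity=2)` marks the second run as failed while the other two still finish.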
-
13.
Player in the BioVeL Portal
https://www.youtube.com/watch?v=s3D8JXc-tSM
-
14.
Why integrate?
• Join two communities
• BioVeL Portal
– Good for the “day job”, collaboration with others
in, or close to, the project.
• Scratchpads
– Dissemination; wide reach but focussed area
– Move science into the public domain
– Lots of data compatible with BioVeL pipelines
-
15.
Workflows in Scratchpads I
• Lightweight embedding of a workflow
• Scratchpads updated to expose public data
– As CSV
– For each individual taxon
• Workflow is run as “guest” user
– Embedding only available for “public” workflows
• Results stay in Taverna Player
-
16.
Workflows in Scratchpads I
• Embed like a YouTube video
• Embedded workflow is passed the URI of data
• This level of integration is lightweight
– Science showcases
– One off analyses
<iframe src="http://portal.org/runs/new?embedded=true&workflow_id=1&input_uri=http://scratchpad.org/taxa/1234/data">
</iframe>
https://github.com/myGrid/taverna-player/wiki/Embedding
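A host page could generate the embed snippet rather than hand-writing it. The sketch below assumes only the query parameters shown on the slide (`embedded`, `workflow_id`, `input_uri`); the function name `embed_iframe` is hypothetical. Unlike the slide, it percent-encodes the data URI and HTML-escapes the attribute value, which is the safer choice when the URI itself may contain `&` or `?`.

```python
from html import escape
from urllib.parse import urlencode


def embed_iframe(portal_base, workflow_id, input_uri):
    """Build an <iframe> tag that embeds a new run of the given workflow."""
    query = urlencode({
        "embedded": "true",
        "workflow_id": workflow_id,
        "input_uri": input_uri,  # percent-encoded by urlencode
    })
    src = f"{portal_base}/runs/new?{query}"
    # Escape '&' and quotes so the URL is valid inside an HTML attribute.
    return f'<iframe src="{escape(src, quote=True)}"></iframe>'


if __name__ == "__main__":
    print(embed_iframe("http://portal.org", 1,
                       "http://scratchpad.org/taxa/1234/data"))
```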
-
17.
Workflows in Scratchpads I
-
18.
Workflows in Scratchpads II
• Tighter integration of analysis pipelines
– Scratchpads directly controls Taverna Player
• Scratchpads has ‘offsite computation’ modules
– Large scale batch operations, etc
– Workflow runs added to this group
• Scratchpads uses the Taverna Player REST API
– Within the host application/portal
-
19.
Workflows in Scratchpads II
[Diagram labels: Scratchpads, Host Application, Taverna Player, Taverna Server; links labelled Data and Control (JSON REST API)]
https://github.com/myGrid/taverna-player/wiki/JSON-API-Documentation
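As a rough sketch of what the control channel might look like from the Scratchpads side, the code below builds (but does not send) a JSON request that creates a workflow run. The `/runs.json` path, the payload shape (`run`, `workflow_id`, `inputs_attributes`) and the `auth_headers` parameter are assumptions made for illustration; the JSON API documentation linked above is the authority on the real interface.

```python
import json
from urllib.request import Request


def new_run_request(base_url, workflow_id, inputs, auth_headers=None):
    """Build a POST request asking the host application to create a run.

    'inputs' maps workflow input port names to values. The endpoint and
    payload layout are assumed, not taken from the API documentation.
    """
    payload = {
        "run": {
            "workflow_id": workflow_id,
            "inputs_attributes": [
                {"name": name, "value": value}
                for name, value in inputs.items()
            ],
        }
    }
    headers = {"Content-Type": "application/json",
               "Accept": "application/json"}
    # Authentication is delegated to the host application, so the caller
    # supplies whatever headers that application expects (hypothetical).
    headers.update(auth_headers or {})
    return Request(f"{base_url}/runs.json",
                   data=json.dumps(payload).encode("utf-8"),
                   headers=headers, method="POST")
```

The request can then be sent with `urllib.request.urlopen(request)`; keeping construction separate from sending makes the payload easy to inspect first.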
-
20.
Workflows in Scratchpads II
• Scratchpads can
– Authenticate to Taverna Player
– Get a list of workflows and show these to the user
– Set up the workflow run, inputs, etc
– Present interactions to the user
– Retrieve results for further analysis
• This level of integration is more suited to
– Long-running workflows
– Larger, repeated studies
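For long-running workflows, the controlling site would typically poll for completion rather than block. A minimal sketch, assuming the caller supplies a `fetch_state` function that returns the run's current state as a string; the state names used here are placeholders, not taken from the Taverna Player API.

```python
import time


def wait_for_run(fetch_state,
                 finished_states=("finished", "failed", "cancelled"),
                 interval=5.0, max_polls=100, sleep=time.sleep):
    """Poll until the run reaches a terminal state, then return that state."""
    for _ in range(max_polls):
        state = fetch_state()
        if state in finished_states:
            return state
        sleep(interval)  # wait before asking again
    raise TimeoutError("run did not finish within the polling budget")
```

Passing `sleep` as a parameter keeps the loop testable without real delays.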
-
21.
Workflows in Scratchpads II
-
22.
Workflows in Scratchpads II
-
23.
Workflows in Scratchpads II
-
24.
Workflows in Scratchpads II
-
25.
Integration comparison
Lightweight embedding
• Run a specified workflow
– Chosen by the Scratchpads owner
• Results are not stored in the Scratchpads itself
• Workflow run retains host app look and feel
Tight integration
• Run any workflow
– That the Scratchpads is authorized to see
• Results are available for further analysis
• Workflow appears as part of the Scratchpads
Common
• Workflows are run within Taverna Player in the host app
• Interactions are presented to the user
• Results can be downloaded
-
26.
Thank you!
• Taverna Player
– Robert Haines: robert.haines@manchester.ac.uk
– Web: www.taverna.org.uk
– Code: github.com/myGrid/taverna-player
– Licence: BSD
• Scratchpads
– Simon Rycroft: s.rycroft@nhm.ac.uk
– Web: scratchpads.eu
– Code: git.scratchpads.eu/git/scratchpads-2.0.git
– Licence: GPL2
This work was enabled by BioVeL (grant no. 283359) and ViBRANT (grant no. 261532) that received funding
from the European Union’s Seventh Framework Programme for research, technological development and demonstration.
www.biovel.eu | www.vbrant.eu
The Scratchpads platform has been developed over the last 5 years under this framework, to provide researchers with the necessary tools to make taxonomy digital, open and linked, and to facilitate the development of virtual research environments.
What are Scratchpads? They are hosted websites for biodiversity data, but the specifics and the kind of data are entirely up to the user.
Scratchpads are designed to be a virtual research and publication platform, a place where you should be able to store, save, publish and reuse your work.
They are completely open access, open source and free to use.
They are modular and flexible. It is possible for other developers to work with us to write modules for specific functions or communities.
Scratchpads are fed by your data. They help you structure your data in a way that makes it both human and machine readable, allow you to contribute to global biodiversity databases, and aggregate information related to your data from external resources.
Taverna Server spawns the command-line tool to keep users' workflow runs separate.
The components of the architecture:
• An OSGi platform, with the Taverna Platform API
– implemented by Taverna Core, which executes a workflow using the Taverna Engine and uses Activity plugins for the different service types (WSDL, REST, BioMart, R scripts, command-line tools, etc.)
– also implemented by the Taverna Server client, which uses the Java client library to proxy the running of a workflow on the Taverna Server
• The Taverna Workbench, to design and run workflows
– UI plugins for each service type
– executes workflows using the Taverna Platform API
• The Taverna command-line tool, which executes workflows using the Taverna Platform API
• A Taverna Server, which exposes the Taverna Platform API as a REST API and a SOAP API for executing workflows
• Taverna Player, which uses the Ruby client library to execute workflows on the Taverna Server
• Taverna Lite, which also uses the Ruby client library to execute workflows, but also manages a repository of workflows and allows user interactions
The OSGi framework (OSGi being an acronym for "Open Services Gateway initiative") is a module system and service platform for the Java programming language that implements a complete and dynamic component model, something that does not exist in standalone Java/VM environments. Applications or components (coming in the form of bundles for deployment) can be remotely installed, started, stopped, updated, and uninstalled without requiring a reboot; management of Java packages/classes is specified in great detail. Application life cycle management (start, stop, install, etc.) is done via APIs that allow for remote downloading of management policies. The service registry allows bundles to detect the addition of new services, or the removal of services, and adapt accordingly.
The OSGi specifications have moved beyond the original focus of service gateways, and are now used in applications ranging from mobile phones to the open source Eclipse IDE. Other application areas include automobiles, industrial automation, building automation, PDAs, grid computing, entertainment, fleet management and application servers.
Examples direct from BioVeL: Killer Whales (population modelling), Moths (ecological niche modelling)