Overview of the Taverna 2 Platform - Presentation Transcript
An introduction to the T2 Platform. Tom Oinn, [email_address], 7th October 2008. Updated 4th November 2008.
In this presentation ‘Taverna’ is distinct from ‘The Taverna Workbench’.
‘Taverna’ should be interpreted as ‘the functionality provided by the core components’:
Workflow creation and modification
Workflow enactment, monitoring, provenance
... and many more
‘Taverna’ in this context refers to version 2 of the codebase in cases where the distinction is significant.
‘Service’ is used in the sense of a software component providing a service, rather than any particular technology such as SOAP or REST. ‘Services’ here are plain Java objects.
Taverna (both 1.x and 2) is complex
Inherent complexity
Wide ranging functionality
Mechanistic complexity
Loaded through Raven to resolve dependency issues
Plug-in mechanism
Many different extension points
Requires unusual knowledge of classloading to resolve issues
Taverna is valuable
Developers want to use Taverna to:
Extend Taverna through plug-in implementations
Embed and use Taverna in other systems
The complexity obscures the value
Guiding principles:
Do not require knowledge of classloading
Do not require a specific application environment
Allow minimal code
Provide ‘exactly as much’ functionality as required
Explicitly support common usage patterns
Prevent dependency on implementation
Extensibility and openness
An application using Taverna will contain:
The Taverna API jars
Support jars used by the application
The application code itself
Applies to all applications
Command Line
GUI
Web app
Axis service
...
[Diagram: Your Application Code and its Application Specific Libraries, alongside the T2 Core API, Platform API (Spring etc.), Platform service APIs, Platform Libraries and T2 interfaces.]
The application must also contain...
Implementations of the enactor, workflow model and similar.
Loaded through Raven
Inherit and implement Taverna APIs from the parent class loader
[Diagram: Raven Class Loaders containing the T2 Core Impl, Implementation Dependencies and Implementations of T2 Interfaces, above the T2 Core API, Platform API (Spring etc.), Platform service APIs, Platform Libraries and T2 interfaces.]
Taverna is plug-in based
Each plug-in is isolated in its own class loader and contains:
Plug-in specific libraries
The extension point implementation code
The plug-in inherits the Taverna APIs from the parent class loader
These APIs include the extension point interfaces
A plug-in can contain many extension point implementations
[Diagram: Plug-in Class Loaders 1..n, each containing plug-in specific libraries (not T2 APIs) and the plug-in implementation, above the T2 Core API, Platform API (Spring etc.), Platform service APIs, Platform Libraries and T2 interfaces.]
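The parent/child arrangement described above can be sketched with the standard `java.net.URLClassLoader`. This is only an illustration of the delegation idea, not the platform's actual loader: the class name `PluginLoaderSketch` and both helper methods are hypothetical, and the plug-in jar lists are left empty so the example runs anywhere.

```java
import java.net.URL;
import java.net.URLClassLoader;

/** Hypothetical sketch: one isolated loader per plug-in, sharing API types via the parent. */
class PluginLoaderSketch {

    /** Each plug-in gets its own loader; anything not in its jars is resolved by the parent. */
    static URLClassLoader newPluginLoader(URL[] pluginJars, ClassLoader apiLoader) {
        return new URLClassLoader(pluginJars, apiLoader);
    }

    /** True when both loaders resolve the name to the very same Class object. */
    static boolean shareType(ClassLoader a, ClassLoader b, String className) {
        try {
            return a.loadClass(className) == b.loadClass(className);
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // The loader that defined this class stands in for the one holding the Taverna APIs.
        ClassLoader apiLoader = PluginLoaderSketch.class.getClassLoader();
        try (URLClassLoader pluginA = newPluginLoader(new URL[0], apiLoader);
             URLClassLoader pluginB = newPluginLoader(new URL[0], apiLoader)) {
            // Two separate plug-in loaders still agree on types inherited from the parent.
            System.out.println(shareType(pluginA, pluginB, PluginLoaderSketch.class.getName()));
        }
    }
}
```

Because both plug-in loaders delegate to the same parent, an extension point interface loaded there is the same type for every plug-in, while each plug-in's own libraries stay private to its loader.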
Configured instances
Implement service APIs
Use T2 implementation classes from Raven
Act as a bridge between Raven driven implementations and application code
Think of them as a ‘toolbox’ for working with Taverna facilities
Configured, linked and instantiated through Spring
XML configuration
Set shown here is indicative
We will provide more than this
3 rd parties can also provide components here
[Diagram: service beans (Workflow Exporter, Workflow Parser, Reference Service, Edit Kit, Activity Kit, Enactor, Monitor Factory, ..., Plug-in Service) defined by the XML Context Configuration, bridging the Raven Class Loaders (T2 Core Impl, Implementation Dependencies, Implementations of T2 Interfaces) and the T2 Core API, Platform API (Spring etc.), Platform service APIs, Platform Libraries and T2 interfaces.]
[Diagram: the complete picture. The Root Class Loader (from Application, Web-App, Web Service ...) holds the classes available to all code: Your Application Code, Application Specific Libraries, the service beans, and the T2 Core API, Platform API (Spring etc.), Platform service APIs, Platform Libraries and T2 interfaces. Classes only available through services sit in the Raven Class Loaders (T2 Core Impl, Implementation Dependencies, Implementations of T2 Interfaces) and the Plug-in Class Loaders 1..n (plug-in specific libraries, plug-in implementation). The service beans are defined by the XML Context Configuration.]
Initialize platform (once per application)
Downloads implementation code where required
Configures and instantiates service beans
Configures plug-in manager, loads plug-ins
Specified by Spring XML based configuration files
We will provide base configurations as part of the platform
Fetch appropriate service bean(s) by name
The standard Taverna platform will define a set of base services and their service names
Use service bean(s) to access Taverna functionality
Code purely against interface based APIs – service beans perform any object instantiation such as creation of new workflow model instances.
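The three steps above (initialize, fetch by name, use through an interface) can be sketched as follows. Everything here is an assumption made for illustration: `PlatformSketch`, `WorkflowParser` and the bean name `t2.workflowParser` are hypothetical, and a plain map stands in for the Spring context that the real platform would build from XML configuration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical service interface; the real set of platform services was not yet defined. */
interface WorkflowParser {
    String parse(String xml);
}

/** Hypothetical stand-in for the Spring-configured platform: a registry of named beans. */
class PlatformSketch {
    private final Map<String, Object> beans = new HashMap<>();

    /** In the real platform this wiring would come from Spring XML, not from code. */
    PlatformSketch() {
        WorkflowParser parser = xml -> "workflow(" + xml.length() + " chars)";
        beans.put("t2.workflowParser", parser);
    }

    /** Fetch a bean by name, exposing it only through the requested interface. */
    <T> T getBean(String name, Class<T> api) {
        Object bean = beans.get(name);
        if (bean == null) {
            throw new IllegalArgumentException("No service bean named " + name);
        }
        return api.cast(bean);
    }

    public static void main(String[] args) {
        // 1. Initialize the platform once per application.
        PlatformSketch platform = new PlatformSketch();
        // 2. Fetch the service bean by name.
        WorkflowParser parser = platform.getBean("t2.workflowParser", WorkflowParser.class);
        // 3. Use it purely through the interface; no implementation type is named here.
        System.out.println(parser.parse("<workflow/>"));
    }
}
```

The application code names only the interface and the bean name, so the implementation behind the bean can change without touching the caller.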
An example use of the platform to load a workflow, acquire input data, run the workflow and handle the output.
I’m using pseudo-code here rather than Java. In general one line of pseudo-code corresponds to one line of Java; standard Java constructs such as casting and type declarations are omitted for brevity.
Also consider that the final ‘base’ set of service beans is not yet defined, so service names and interfaces are illustrative rather than definitive.
The ‘loader’ here is a platform service bean, a tool in the toolbox defined by the platform.
This particular tool can be used to create a new workflow model from an XML definition, in this case from a URL.
Important point – only one explicit object instantiation: the platform itself.
In a web application or web service this would already have been provided in the servlet context, but for conventional applications we need it explicitly.
Aside – the workflow model in Taverna 2 is read-only. If we wanted to modify the model we just loaded we would need an appropriate ‘workflow editor’ service. In our current code this is the Edits interface.
‘ref_service’ is a Reference Service. This is the component of Taverna used to register data and obtain an internal identifier for it. This identifier is then used as input when running the workflow.
Two inputs are registered:
‘input1’ is data held in a file locally
‘input2’ is a literal string value
The real Reference Service interface is slightly more complex as it registers data asynchronously but the principle is the same. As before there is no direct object construction (other than the file), so no direct link to any implementation classes.
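A minimal sketch of the register-then-reference idea, under stated assumptions: the `ReferenceService` interface below is a hypothetical, synchronous simplification (the slide notes that the real one registers data asynchronously), the `ref-N` identifier format is invented, and the in-memory implementation exists only so the example runs.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, synchronous simplification of a reference service API. */
interface ReferenceService {
    String register(Object data);      // returns an opaque internal identifier
    String getAsString(String id);     // renders the referenced data as a string
}

/** In-memory stand-in; in the platform this would be a bean fetched by name. */
class InMemoryReferenceService implements ReferenceService {
    private final Map<String, String> store = new HashMap<>();
    private int next = 0;

    @Override
    public String register(Object data) {
        String value;
        if (data instanceof File) {
            try {
                value = new String(Files.readAllBytes(((File) data).toPath()), StandardCharsets.UTF_8);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        } else {
            value = String.valueOf(data);
        }
        String id = "ref-" + next++;
        store.put(id, value);
        return id;
    }

    @Override
    public String getAsString(String id) {
        String value = store.get(id);
        if (value == null) {
            throw new IllegalArgumentException("Unknown reference " + id);
        }
        return value;
    }
}

class ReferenceDemo {
    public static void main(String[] args) throws IOException {
        ReferenceService refService = new InMemoryReferenceService();
        File file = File.createTempFile("input1", ".txt");
        file.deleteOnExit();
        Files.write(file.toPath(), "file contents".getBytes(StandardCharsets.UTF_8));

        String input1 = refService.register(file);               // data held in a local file
        String input2 = refService.register("a literal string"); // literal string value
        System.out.println(input1 + " -> " + refService.getAsString(input1));
        System.out.println(input2 + " -> " + refService.getAsString(input2));
    }
}
```

The caller only ever holds identifiers, which is what lets the workflow inputs and outputs be passed around without committing to how or where the data is stored.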
Following the same pattern as before, we obtain an enactor service bean from the platform.
This tool creates workflow instances from workflow definitions. In this particular example we’re ignoring the workflow context. Following the ‘minimal code’ principle, the service bean API would include this simple version and use some sensible default context.
Having obtained the workflow instance we can push the previously registered data into its (in this case two) inputs. Workflows in T2 are pipeline based, so the act of pushing data in starts the enactment; there is no explicit ‘start’ operation (except for cases where there are no workflow inputs).
I’m taking some liberty with the ‘real’ API here but the principle is the same. We don’t have a blocking fetch method at the moment but maybe we should!
We call a method on the workflow instance to fetch the output which will block until it’s available.
The result is in the form of an internal identifier in the reference system, so we then use the reference service to render the data referenced by that identifier to a string and print it to the console.
result = wf_instance.blocking_fetch("output");
System.out.println(ref_service.getAsString(result));
Going back to the ‘guiding principles’, does this code satisfy them?
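The push-to-start and blocking-fetch behaviour described above can be sketched in Java. This is a hedged illustration only: as the slides say, the real API had no blocking fetch at the time, so `WorkflowInstanceSketch`, `push` and `blockingFetch` are hypothetical names, the workflow body is a plain function, and the sketch assumes a single output port called `output`.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical workflow instance: pushing the last required input starts enactment. */
class WorkflowInstanceSketch {
    private final Set<String> inputPorts;
    private final Function<Map<String, String>, String> body;
    private final Map<String, String> received = new ConcurrentHashMap<>();
    private final CompletableFuture<String> output = new CompletableFuture<>();

    WorkflowInstanceSketch(Set<String> inputPorts, Function<Map<String, String>, String> body) {
        this.inputPorts = inputPorts;
        this.body = body;
    }

    /** No explicit start(): the run begins once every input port has received data. */
    void push(String port, String referenceId) {
        if (!inputPorts.contains(port)) {
            throw new IllegalArgumentException("No input port " + port);
        }
        received.put(port, referenceId);
        if (received.keySet().containsAll(inputPorts)) {
            CompletableFuture.supplyAsync(() -> body.apply(received))
                    .whenComplete((value, error) -> {
                        if (error != null) {
                            output.completeExceptionally(error);
                        } else {
                            output.complete(value);
                        }
                    });
        }
    }

    /** Blocks the caller until the result reference for the output port is available. */
    String blockingFetch(String port) {
        if (!"output".equals(port)) {
            throw new IllegalArgumentException("No output port " + port);
        }
        return output.join();
    }

    public static void main(String[] args) {
        WorkflowInstanceSketch wfInstance = new WorkflowInstanceSketch(
                Set.of("input1", "input2"),
                in -> in.get("input1") + "+" + in.get("input2"));
        wfInstance.push("input1", "ref-0");
        wfInstance.push("input2", "ref-1"); // completes the input set, so enactment starts here
        System.out.println(wfInstance.blockingFetch("output"));
    }
}
```

Running the body on another thread and handing back a future is one simple way to give a pipeline-style instance a blocking fetch; a real enactor would stream data through the workflow rather than wait for all inputs.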
There’s no reference to classloading, and the application would just run from the normal command line.
There’s no more code than required to do the job. This is a simple example and didn’t address various issues such as context configuration, but those issues weren’t in the summary of the application either so their corresponding absence in the code is another sign of success.
The code never refers to implementation types; it is therefore resilient in the face of implementation updates and changes.
The code is almost self-documenting.
We can’t make any statements about the extensibility of the platform, but all the other requirements are satisfied.
Plug-in code is used to extend Taverna
Includes all ‘base’ functionality such as web service invocation, http references etc.
Plug-in development issues
No tooling support
Testing is problematic
Problems with optional maven dependencies
Can’t use libraries which are not maven artifacts
Plug-in usage issues
Insufficient metadata to properly describe plug-in functionality
Plug-in management service API missing
Plug-ins have no presence in the code once loaded
Provide tool support
Plug-in description generator
Plug-in verifier
Plug-in registry
Modified class loading strategy
Single class loader per plug-in
Allow mix of artifact and non-artifact jars
Use Raven to fetch artifact jars, but not to obtain class loaders
Testing is a special case of application development
See previous section
We can provide archetypes for integration tests using the platform to simplify testing
Plain text description
Short name
Version, triple numeric with periods
Development status (enumeration, for example ‘alpha | beta | stable’ to be decided)
Author list
Author name
Author affiliation (optional)
Author email (optional)
Author URL (optional)
Author Icon (optional, 128x128 true colour PNG with transparency)
Tag list
Free text tag
Homepage URL (can be generated by tooling along with the page itself)
Related link list
Link description
Link URI (most likely HTTP but potentially other reference)
Icon, 128x128 true colour PNG with transparency
Plugin manager background image (arbitrary size, true colour PNG with transparency, used as a background for the plugin manager panel for this plugin)
Targeted core API version, specified as an artifact (group, artifact, version)
SPI implementation list
Plain text description
Short name
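Part of the metadata list above could be modelled as a small value type. This is a sketch under assumptions, not the platform's descriptor class: `PluginDescriptor` and its field names are hypothetical, only a subset of the listed fields is included, and the status values are placeholders because the slides leave that enumeration to be decided.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical value type covering a subset of the plug-in metadata fields. */
final class PluginDescriptor {
    /** Placeholder values; the actual enumeration was still to be decided. */
    enum Status { ALPHA, BETA, STABLE }

    /** "Triple numeric with periods", for example 1.0.2. */
    private static final Pattern VERSION = Pattern.compile("\\d+\\.\\d+\\.\\d+");

    final String shortName;
    final String description;
    final String version;
    final Status status;
    final List<String> tags;

    PluginDescriptor(String shortName, String description, String version,
                     Status status, List<String> tags) {
        if (shortName == null || shortName.isEmpty()) {
            throw new IllegalArgumentException("Short name is required");
        }
        if (!isValidVersion(version)) {
            throw new IllegalArgumentException("Version must be three numbers separated by periods: " + version);
        }
        this.shortName = shortName;
        this.description = description;
        this.version = version;
        this.status = status;
        this.tags = Collections.unmodifiableList(new ArrayList<>(tags));
    }

    static boolean isValidVersion(String version) {
        return version != null && VERSION.matcher(version).matches();
    }

    public static void main(String[] args) {
        PluginDescriptor d = new PluginDescriptor(
                "example-plugin", "An example plug-in", "1.0.2",
                Status.BETA, Arrays.asList("example", "demo"));
        System.out.println(d.shortName + " " + d.version + " " + d.status + " " + d.tags);
    }
}
```

Validating the version in the constructor is the kind of check a plug-in verifier tool could also run before a plug-in is published.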
[Diagram: the Plugin Manager Service Bean sits between the Download Manager (which gets artifacts and jars from the Raven Repository / Artifact Cache and the Non-artifact Repository / Jar Cache) and the SPI Registries and Instance Registries. Consumer code, i.e. the Taverna Workbench, uses the plug-in manager to install, list, enable and disable plug-ins, and uses the registries to consume SPI instances and receive notification of addition / removal.]
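The consumer side of this diagram (consume SPI instances, receive notification of addition / removal) can be sketched as a listener-based registry. The names `InstanceRegistrySketch`, `RegistryListener`, `pluginEnabled` and `pluginDisabled` are all hypothetical, chosen for this illustration rather than taken from the Taverna code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical listener notified as SPI implementations come and go. */
interface RegistryListener<T> {
    void added(T instance);
    void removed(T instance);
}

/** Hypothetical instance registry keyed by the plug-in that contributed each instance. */
class InstanceRegistrySketch<T> {
    private final Map<String, List<T>> byPlugin = new LinkedHashMap<>();
    private final List<RegistryListener<T>> listeners = new CopyOnWriteArrayList<>();

    void addListener(RegistryListener<T> listener) {
        listeners.add(listener);
    }

    /** Called by the plug-in manager when a plug-in is installed or enabled. */
    synchronized void pluginEnabled(String pluginId, List<T> instances) {
        byPlugin.put(pluginId, new ArrayList<>(instances));
        for (T instance : instances) {
            for (RegistryListener<T> l : listeners) l.added(instance);
        }
    }

    /** Called by the plug-in manager when a plug-in is disabled on the fly. */
    synchronized void pluginDisabled(String pluginId) {
        List<T> gone = byPlugin.remove(pluginId);
        if (gone == null) return;
        for (T instance : gone) {
            for (RegistryListener<T> l : listeners) l.removed(instance);
        }
    }

    /** Snapshot of every instance currently contributed by enabled plug-ins. */
    synchronized List<T> instances() {
        List<T> all = new ArrayList<>();
        for (List<T> list : byPlugin.values()) all.addAll(list);
        return all;
    }

    public static void main(String[] args) {
        InstanceRegistrySketch<String> registry = new InstanceRegistrySketch<>();
        registry.addListener(new RegistryListener<String>() {
            @Override public void added(String s) { System.out.println("added " + s); }
            @Override public void removed(String s) { System.out.println("removed " + s); }
        });
        registry.pluginEnabled("example-plugin", Arrays.asList("activityA", "activityB"));
        registry.pluginDisabled("example-plugin");
        System.out.println(registry.instances());
    }
}
```

Consumer code registers a listener once and never refers to artifacts or Raven directly; enabling or disabling a plug-in is the only thing that changes what the registry reports.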
Most places in application code explicitly referencing artifacts and raven will now reference the plug-in manager
Code using SPI and instance registries is unchanged
Workflow serializer changes to reference plugin rather than artifact specifier
Plug-ins can be disabled on the fly
Plug-in manager appears as infrastructure bean in platform
Plug-in management functionality is generic
Part of the generic platform rather than the Taverna specific part
The plug-in manager is instantiated and configured through Spring
Plug-in manager and related components are available in the root class loader for an application
Exposed as named beans in the Spring context
Applications access infrastructure properties through the Spring context
Access through interfaces not implementation types
Next slide shows actual implementation components in the current code
[Diagram: implementation components in the current code: Download Manager, Raven Jar Manager, POM.xml Parser, Plug-in Parser, SPI Registry and Plug-in Manager, with properties including ‘system artifact set’, ‘base’ and ‘classname’. File paths shown relative to ‘base’: /artifacts/agrp/aid-aver.jar, /artifacts/agrp/aid-aver.pom, /pgrp/pid-pver.xml and /pgrp/pid-pver/jars/jarname.jar. Key: component, file path.]
1st November – Initial generic platform implementation
Early version released to public CVS 4th November
1st December – Specification of Taverna specific platform services
Early implementation of enactor, workflow loader and reference service components expected mid-November
1st January – Initial implementation of Taverna services
Mid February – 2 day Developer Workshop in Manchester
Sign up at http://spreadsheets.google.com/a/crypticsquid.com/viewform?key=pGRSW--IktWUFVQVLleNzIw&hl=en