Data
• Access through DataManager interface locally, DataPeer remotely

Document
• Identifier with namespace
• Zero or more reference scheme instances pointing to identical immutable data

File reference
• Has unique namespace within peer group

List
• Identifier with namespace
• Depth, list of child IDs

Reference Scheme Plugins (extension point)
• No context required?

Configuration
• Local network name, subnet mask
• File system name and mount point
• Whatever you need here
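A minimal sketch of what such a reference scheme plugin could look like; the interface and method names here are illustrative assumptions, not the real t2core API:

```java
import java.util.Set;

// Hypothetical reference-scheme plugin contract: a reference is only
// resolvable where its required context (e.g. a mounted file system or
// a reachable subnet) is present.
interface ReferenceScheme {
    boolean isResolvableIn(Set<String> context);
    String dereference(); // fetch the immutable data behind the reference
}

// A file reference is only valid where the named file system is mounted.
class FileReferenceScheme implements ReferenceScheme {
    private final String fileSystemName;
    private final String path;

    FileReferenceScheme(String fileSystemName, String path) {
        this.fileSystemName = fileSystemName;
        this.path = path;
    }

    public boolean isResolvableIn(Set<String> context) {
        return context.contains("fs:" + fileSystemName);
    }

    public String dereference() {
        return "contents of " + path; // placeholder for a real file read
    }
}
```

The point of the context set is that a peer can filter out references it cannot dereference before attempting resolution.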
Example nested list structure:
List1 ( List2 ( Leaf1, Leaf2 ), List3 ( Leaf3 ) )
• Downstream process filters on the event depth it needs:
• If the minimum depth is too high it iterates, discarding all but the finest grained events
• If the maximum depth is too low it wraps in a new single element collection, discarding all but the root event
• Identifiers in the boxes are those from the previous slide
Appears on data link as:
Leaf3[1,0] List3 Leaf2[0,1] Leaf1[0,0] List2 List1
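The iterate-or-wrap rule above can be sketched as follows; the class and method names are made up for the example, and nested `java.util.List`s stand in for the identifier-based collections:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the depth-matching rules: depth 0 is a single
// leaf, 1 a list of leaves, 2 a list of lists, and so on.
class DepthMatcher {
    static int depthOf(Object data) {
        if (!(data instanceof List)) return 0;
        List<?> list = (List<?>) data;
        return list.isEmpty() ? 1 : 1 + depthOf(list.get(0));
    }

    // The sequence of events a consumer expecting depth `wanted` sees.
    static List<Object> eventsFor(Object data, int wanted) {
        int actual = depthOf(data);
        if (actual > wanted) {
            // Too deep: iterate, emitting each child as a separate event.
            List<Object> events = new ArrayList<>();
            for (Object child : (List<?>) data) {
                events.addAll(eventsFor(child, wanted));
            }
            return events;
        }
        // Too shallow: wrap in single element collections until it fits.
        Object wrapped = data;
        for (int d = actual; d < wanted; d++) {
            wrapped = List.of(wrapped);
        }
        return List.of(wrapped);
    }
}
```

For the example structure, a consumer wanting depth 0 would receive Leaf1, Leaf2 and Leaf3 as three separate events, while a consumer wanting depth 3 would receive the whole structure once, wrapped in an extra collection.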
Processors (or, more accurately, service proxies) can now emit results piece by piece:
• Sensor proxy that can emit a temperature reading / cell count / image every ten seconds
• Database query that returns rows one at a time from the data server
Management of collection events is handled by the framework.
Taverna 2 opens up the per-processor dispatch logic.
Dispatch layers can ignore, pass unmodified, block, modify or act on any message and can
communicate with adjacent layers.
Each processor contains a single stack of arbitrarily many dispatch layers.
Single dispatch layer
• Job specification messages (job queue & service list, or single job & service list) arrive from the layer above
• Data and error messages (Result, Fault) arrive from the layer below
• DispatchLayer is an extensibility point – use it to implement dynamic binding, caching, recursive behaviour…?
• Dispatch layer composition allows for complex control flow within a processor

Parallelize
• Ensures that at least ‘n’ jobs are pulled from the queue and sent to the layer below
• Reacts to faults and results by pulling more jobs off the queue and sending them down, passing the fault or result message back up to the stack manager

Failover
• Responds to job events from above by storing the job, removing all but one service from the service list and passing the job down
• Responds to faults by fetching the corresponding job, rewriting the original service set to include only the next service and resending the job down. If no more services are available the fault is propagated upwards
• Responds to results by discarding any failover state for that job

Retry
• Responds to jobs by storing the job along with an initial retry count of zero
• Responds to faults by checking the retry count, and either incrementing and resending the job or propagating the fault message if the count is exceeded

Invoke
• Responds to jobs by invoking the first concrete service in the service list with the specified input data
• Sends fault and result messages to the layer above

This dispatch stack replicates the current Taverna 1 processor logic in that retry is within failover and both are within the parallelize layer.
Layers can occur multiple times – you could easily have retry both above and below the failover layer, for example.
‘Service’ in this case means ‘Taverna 2 proxy to something we can invoke’ – the name might change.
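The retry behaviour described above can be sketched against a hypothetical layer contract (the real t2core interfaces differ; all names here are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical message-passing contract for a dispatch layer: jobs
// travel down the stack, results and faults travel back up.
interface Layer {
    void job(String jobId);     // from the layer above
    void result(String jobId);  // from the layer below
    void fault(String jobId);   // from the layer below
}

// Retry layer: resend a failed job until the retry count is exceeded,
// then propagate the fault upwards.
class RetryLayer implements Layer {
    private final Layer above;
    private final Layer below;
    private final int maxRetries;
    private final Map<String, Integer> counts = new HashMap<>();

    RetryLayer(Layer above, Layer below, int maxRetries) {
        this.above = above;
        this.below = below;
        this.maxRetries = maxRetries;
    }

    public void job(String jobId) {
        counts.put(jobId, 0);       // store job with retry count zero
        below.job(jobId);
    }

    public void result(String jobId) {
        counts.remove(jobId);       // success: discard retry state
        above.result(jobId);
    }

    public void fault(String jobId) {
        int n = counts.merge(jobId, 1, Integer::sum);
        if (n <= maxRetries) {
            below.job(jobId);       // increment and resend
        } else {
            counts.remove(jobId);
            above.fault(jobId);     // count exceeded: propagate up
        }
    }
}

// Minimal harness layers so the sketch is self-contained: a bottom
// layer that fails a fixed number of times, and a recording top layer.
class FlakyInvoke implements Layer {
    Layer up;
    int failuresLeft;
    FlakyInvoke(int failures) { this.failuresLeft = failures; }
    public void job(String jobId) {
        if (failuresLeft-- > 0) up.fault(jobId); else up.result(jobId);
    }
    public void result(String jobId) {}
    public void fault(String jobId) {}
}

class StackTop implements Layer {
    String outcome;
    public void job(String jobId) {}
    public void result(String jobId) { outcome = "result"; }
    public void fault(String jobId) { outcome = "fault"; }
}
```

Because each layer only sees the three message types, a failover or parallelize layer composes with this one without either knowing the other exists.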
Service invocation is asynchronous by default – all AsynchronousService implementations should
return control immediately and, ideally, use thread pooling amongst instances of that type.
Results and failure messages are pushed to an AsynchronousServiceCallback object which also
provides the necessary context to the invocation:

DataManager
• Resolve input data references
• Register result data to get an identifier

SecurityManager
• Provides a set of security agents available to manage authentication against protected resources

Provenance
• Allows explicit push of actor state P-assertions to a connected provenance store for invocation specific metadata

Message Push
• Used to push fault and result messages back to the invocation layer of the dispatch stack
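A minimal sketch of the return-immediately, push-to-callback pattern; the callback interface here is a simplified stand-in for AsynchronousServiceCallback (which additionally exposes the context objects listed above), and all names are assumptions:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Simplified stand-in for the callback: results can arrive piece by
// piece, followed by either completion or a fault.
interface Callback {
    void partialResult(String data);
    void fault(Exception e);
    void done();
}

// Sketch of an asynchronous service proxy: invoke() returns control
// immediately; readings are pushed to the callback from a thread pool
// shared amongst instances of this service type.
class SensorProxy {
    private static final ExecutorService POOL =
        Executors.newFixedThreadPool(4, r -> {
            Thread t = new Thread(r);
            t.setDaemon(true); // don't keep the JVM alive for the pool
            return t;
        });

    void invoke(int readings, Callback cb) {
        POOL.submit(() -> {
            try {
                for (int i = 0; i < readings; i++) {
                    cb.partialResult("reading-" + i);
                }
                cb.done();
            } catch (Exception e) {
                cb.fault(e);
            }
        });
    }
}
```

This is what lets the sensor proxy above emit a reading every ten seconds, or the database query push rows one at a time, without holding an enactor thread for the duration.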
In this scenario the agent is discovered based on the service, a message is passed to the agent to be signed and that signed message is returned. Credentials never leave the agent.
Taverna 2 combines data managers, workflow enactors and security agents into transient
collaborative virtual experiments within a peer group. These groups can be shared and
membership managed over time and can persist beyond a single workflow run.
[Diagram: a peer group (i.e. JXTA group) forming a Virtual Experiment Session – User 1 and User 2 each with a set of data managers (DM), alongside external services and external data stores such as SRB]
Define a workflow as nested boundaries of control.
Each boundary pushes its identifier onto an ID stack on data entering it and pops it when exiting.
When a new ID is created the controlling entity registers with a singleton monitor tree, attaching to
the parent identified by the path defined by the previous value of the ID stack on that data.
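The ID-stack and monitor-tree registration can be sketched as follows; the class and method names are made up for the example:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Node in the singleton monitor tree.
class MonitorNode {
    final Map<String, MonitorNode> children = new LinkedHashMap<>();

    // Walk (creating as needed) the path given by the ID stack,
    // outermost boundary first.
    MonitorNode locate(Deque<String> idStack) {
        MonitorNode node = this;
        for (Iterator<String> it = idStack.descendingIterator(); it.hasNext(); ) {
            node = node.children.computeIfAbsent(it.next(), k -> new MonitorNode());
        }
        return node;
    }
}

class Boundary {
    // Data enters a boundary of control: attach a monitor node under the
    // parent identified by the previous value of the ID stack, then push
    // the new identifier onto the stack.
    static void enter(Deque<String> idStack, MonitorNode root, String id) {
        root.locate(idStack).children.computeIfAbsent(id, k -> new MonitorNode());
        idStack.push(id);
    }

    // Data exits the boundary: pop its identifier.
    static void exit(Deque<String> idStack) {
        idStack.pop();
    }
}
```

Entering a workflow and then a processor inside it (say WF1 then P2) therefore yields a monitor tree with P2 attached under WF1, mirroring the nesting of the boundaries.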
[Diagram: monitor tree with nodes such as WF1 and P2]
• Each node defines a set of properties.
• If a property is mutable it can be used to steer the workflow.
• Properties could include parallelism setting, service binding criteria, current job queue length, queue consumption, number of failures in the last minute…
Due December 2007 in ‘visible to end user’ form.
This release will probably not include everything, especially steering agents and virtual experiment management.
Early tech preview real soon now [tm].
Complete code rewrite; current status is around 90% complete on enactor and data manager core.
Java code in CVS on SourceForge; project name is ‘taverna’, CVS module is ‘t2core’.
Licensed under LGPL at present.
Hands-on session later if anyone’s interested?
Investigators
• Matthew Addis, Andy Brass, Alvaro Fernandes, Rob Gaizauskas, Carole Goble, Chris Greenhalgh, Luc Moreau, Norman Paton, Peter Rice, Alan Robinson, Robert Stevens, Paul Watson, Anil Wipat

Postgraduates
• Tracy Craddock, Keith Flanagan, Antoon Goderis, Alastair Hampshire, Duncan Hull, Martin Szomszor, Kaixuan Wang, Qiuwei Yu, Jun Zhao

Pioneers
• Hannah Tipney, May Tassabehji, Medical Genetics team at St Marys Hospital, Manchester, UK; Simon Pearce, Claire Jennings, Institute of Human Genetics, School of Clinical Medical Sciences, University of Newcastle, UK; Doug Kell, Peter Li, Manchester Centre for Integrative Systems Biology, UoM, UK; Andy Brass, Paul Fisher, Bio-Health Informatics Group, UoM, UK; Simon Hubbard, Faculty of Life Sciences, UoM, UK

Funding and Industrial
• EPSRC
• Wellcome Trust
• OMII-UK
• Dennis Quan, Sean Martin, Michael Niemi (IBM), Mark

Core Research and Development
• Nedim Alpdemir, Pinar Alper, Khalid Belhajjame, Tim Carver, Rich Cawley, Justin Ferris, Matthew Gamble, Kevin Glover, Mark Greenwood, Ananth Krishna, Matt Lee, Peter Li, Phillip Lord, Darren Marvin, Simon Miles, Arijit Mukherjee, Tom Oinn, Stuart Owen, Juri Papay, Matthew Pocock, Stefan Roberts, Martin Senger, Nick Sharman, Stian Soiland, Victor Tan, Franck Tanoh, Daniele Turi, Alan R. Williams, David Withers, Katy Wolstencroft
Additional T2 thanks to Matthew Pocock, Thomas Down & David DeRoure amongst others!
Please see http://www.mygrid.org.uk/wiki/Mygrid/Acknowledgements for most up to date list