Taverna + ARC: Workflow management, command line tools and the grid. Saturday, June 27, 2009. Hajo Nils Krabbenhöft, University of Lübeck
Why use a grid? cpu power; storage space; node redundancy
Why use a grid? resource sharing; tit-for-tat; buy cpu power; shared maintenance; software development; package deployment; configuration
Why use Taverna 2? knowledge sharing; web services; plug-ins; myExperiment
Why use Taverna 2? myExperiment
Why use Taverna 2? dependency management; data management & conversion; easy tweaking; database integration
shell script pitfalls: command line ambiguities; no specific program version; unaware of changes to syntax
manual grid usage: upload & download data; xRSL difficult to write; no failure recovery
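To illustrate why hand-written xRSL is error-prone, here is a minimal sketch that assembles a job description from Python values. The attribute names (executable, arguments, inputFiles, outputFiles, stdout, stderr, runTimeEnvironment, jobName) are standard xRSL; the helper names, file names and the runtime environment string are hypothetical, and this is not how the Taverna plug-in itself generates xRSL.

```python
def quote(value):
    """Wrap a value in double quotes; embedded quotes are not handled in this sketch."""
    if '"' in value:
        raise ValueError("double quotes are not handled in this sketch")
    return '"' + value + '"'


def relation(name, *values):
    """Build one xRSL relation such as (arguments="a" "b")."""
    return "(" + name + "=" + " ".join(values) + ")"


def pair(name, location=""):
    """Build one ("name" "location") entry for inputFiles / outputFiles."""
    return "(" + quote(name) + " " + quote(location) + ")"


def build_xrsl(executable, arguments=(), inputs=None, outputs=(),
               runtime_environment=None, job_name=None):
    """Assemble a single-job xRSL string (hypothetical helper).

    inputs maps a file name on the node to its source; an empty source
    means the file is uploaded from the submitting machine.
    """
    clauses = [relation("executable", quote(executable))]
    if arguments:
        clauses.append(relation("arguments", *[quote(a) for a in arguments]))
    if inputs:
        entries = "".join(pair(n, src) for n, src in inputs.items())
        clauses.append(relation("inputFiles", entries))
    if outputs:
        clauses.append(relation("outputFiles", "".join(pair(n) for n in outputs)))
    clauses.append(relation("stdout", quote("stdout.txt")))
    clauses.append(relation("stderr", quote("stderr.txt")))
    if runtime_environment:
        clauses.append(relation("runTimeEnvironment", quote(runtime_environment)))
    if job_name:
        clauses.append(relation("jobName", quote(job_name)))
    return "&" + "".join(clauses)


if __name__ == "__main__":
    # File names and the runtime environment below are made up for illustration.
    print(build_xrsl("run.sh", inputs={"run.sh": ""}, outputs=["out.txt"],
                     runtime_environment="APPS/EXAMPLE-1.0", job_name="demo"))
```

Even this small job needs nested parentheses and careful quoting, which is the kind of detail a workflow plug-in can hide.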
proposed solution (diagram labels): Workflow; plug-in submits; ARC executes; use cases; runtime environments; out of; through
proposed solution (ARC): grid middleware; homogeneous interface; security certificates; data management; scalable
proposed solution (runtime environments): specify program & version; installation on demand; use cases require RE; no command line; shared repository
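As a rough sketch of the "use case requires RE, no command line" idea: a use case can be described once as data (which runtime environment it needs, how its command is built), so the workflow author only supplies parameters. The UseCase class, its fields, the program name and the RE string are all hypothetical; the real use-case format of the plug-in is not shown in these slides.

```python
import shlex
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class UseCase:
    """Hypothetical description of one command line tool as a reusable use case."""
    name: str
    runtime_environment: str  # pins the program and its version on the grid node
    command_template: str     # $placeholders are filled from workflow inputs

    def command(self, **params):
        """Fill the template; every parameter is shell-quoted to avoid ambiguities."""
        quoted = {key: shlex.quote(str(value)) for key, value in params.items()}
        # substitute() raises KeyError if a placeholder has no value
        return Template(self.command_template).substitute(quoted)


if __name__ == "__main__":
    # Program name and runtime environment are made up for illustration.
    align = UseCase(
        name="align",
        runtime_environment="APPS/BIO/EXAMPLE-1.0",
        command_template="example-align --in $input --out $output",
    )
    print(align.runtime_environment)
    print(align.command(input="my reads.fa", output="out.sam"))
```

Keeping the command template next to the required runtime environment is one way to address the shell script pitfalls of unpinned program versions and quoting ambiguities.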
proposed solution (Taverna plug-in): job submission; use cases => run in parallel; storage management; data references => fast; silent failover; SSH + local for testing
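The combination of parallel execution and silent failover can be sketched as follows. The slide does not say whether failover applies to job submission or to storage; this sketch assumes job submission, and the backend names and functions are hypothetical stand-ins, not the plug-in's API.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_failover(job, backends):
    """Try each (name, submit) backend in order and return the first success."""
    errors = []
    for name, submit in backends:
        try:
            return name, submit(job)
        except Exception as exc:  # failover is silent: remember the error, move on
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all backends failed: " + "; ".join(errors))


def run_all(jobs, backends, max_workers=4):
    """Run independent jobs in parallel; results keep the order of the inputs."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda job: run_with_failover(job, backends), jobs))


def flaky_grid(job):
    """Stand-in for a grid backend that rejects odd-numbered jobs."""
    if job % 2:
        raise ConnectionError("node unavailable")
    return job * job


def local(job):
    """Stand-in for the local test backend."""
    return job * job


if __name__ == "__main__":
    print(run_all([1, 2, 3], [("grid", flaky_grid), ("local", local)]))
```

Errors only surface when every backend has failed, which matches the "silent" part; a real implementation would probably also log the intermediate failures.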
proposed solution (Taverna): easy to present; embedded workflows; parameter tweaking; managed dependencies; easy retry; easy parallelization. X. Zhou et al.: An Easy Setup for Parallel Medical Image Processing: Using Taverna and ARC
reality check: dynamic RE still experimental; use common tools; send binaries; call administrators; firewall; need LDAP and GSIFTP ports; proxy support
reality check: disk caching since Taverna 2.0; programs not locally installable; use as web service; upload is slow; upload static files to SE
neat toys: Taverna as web service; Taverna on grid node; embedded Taverna; NestedVM; package arbitrary C program into JAR
neat toys: Amazon S3; upload from Taverna; grid URL; good for static data; Amazon EC2; on-demand grid nodes; control from Taverna
neat toys: use case Java API; submit, receive, monitor; data references; silent failover; but NO dependency management; GridRunnable; e.g. clustering based on different criteria
Thank you for your attention. http://taverna.nordugrid.org

Krabbenhoft_TavernaARC_BOSC2009
