Krabbenhoft_TavernaARC_BOSC2009
Usage Rights

CC Attribution License

Krabbenhoft_TavernaARC_BOSC2009 Presentation Transcript

  • Taverna + ARC
    • Workflow management, command line tools and the grid
    Saturday, June 27, 2009 Hajo Nils Krabbenhöft, University of Lübeck
  • Why use a grid?
    • cpu power
    • storage space
    • node redundancy
  • Why use a grid?
    • resource sharing
      • tit-for-tat
      • buy cpu power
    • shared maintenance
      • software development
      • package deployment
      • configuration
  • Why use Taverna 2?
    • knowledge sharing
      • web services
      • plug-ins
      • myExperiment
  • Why use Taverna 2?
      • myExperiment
  • Why use Taverna 2?
    • dependency management
    • data management & conversion
    • easy tweaking
    • database integration
  • shell script pitfalls
    • command line ambiguities
      • no specific program version
      • unaware of changes to syntax
  • manual grid usage
    • upload & download data
    • xRSL difficult to write
    • no failure recovery
  • proposed solution
    • [diagram; box labels: Workflow, plug-in, ARC, use cases, runtime environments; arrow labels: submits, executes, out of, through]
  • proposed solution
    • ARC grid middleware
      • homogeneous interface
      • security certificates
      • data management
      • scalable
  • proposed solution
    • runtime environments
      • specify program & version
      • installation on demand
    • use cases
      • require RE
      • no command line
      • shared repository
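The idea of a use case that names a runtime environment instead of spelling out a raw command line can be sketched as follows. This is only an illustration in Python, not the plug-in's actual code: the `UseCase` class, its fields, and the RE name `APPS/BIO/EXAMPLE-1.0` are hypothetical. Only the xRSL attribute names (`jobName`, `runTimeEnvironment`, `executable`, `arguments`, `inputFiles`, `outputFiles`, `stdout`) come from ARC's job description language.

```python
from dataclasses import dataclass, field


@dataclass
class UseCase:
    """Hypothetical tool description; not the plug-in's real schema."""
    name: str
    runtime_environment: str   # program and version, resolved by the grid site
    executable: str
    arguments: list = field(default_factory=list)
    inputs: dict = field(default_factory=dict)   # file name -> source URL ("" = local file)
    outputs: list = field(default_factory=list)  # files to keep after the job


def quote(value: str) -> str:
    # This sketch refuses embedded double quotes instead of guessing an escaping rule.
    if '"' in value:
        raise ValueError(f"double quote not supported in this sketch: {value!r}")
    return f'"{value}"'


def to_xrsl(uc: UseCase) -> str:
    """Render a use case as a single xRSL job description string."""
    parts = [
        f"(jobName={quote(uc.name)})",
        f"(runTimeEnvironment={quote(uc.runtime_environment)})",
        f"(executable={quote(uc.executable)})",
    ]
    if uc.arguments:
        parts.append("(arguments=" + " ".join(quote(a) for a in uc.arguments) + ")")
    if uc.inputs:
        pairs = " ".join(f"({quote(n)} {quote(u)})" for n, u in uc.inputs.items())
        parts.append(f"(inputFiles={pairs})")
    if uc.outputs:
        pairs = " ".join(f'({quote(n)} "")' for n in uc.outputs)
        parts.append(f"(outputFiles={pairs})")
    parts.append('(stdout="stdout.txt")')
    return "&" + "".join(parts)
```

With a description like this, the workflow author fills in named fields, and the job description that is "difficult to write" by hand is generated rather than typed.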
  • proposed solution
    • Taverna plug-in
      • job submission
        • use cases => run in parallel
      • storage management
        • data references => fast
      • silent failover
      • SSH + local for testing
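The combination of parallel job submission and silent failover named on this slide can be sketched generically. The following Python is an assumption-laden illustration, not the plug-in's implementation: a "backend" here is simply any callable that runs a job and either returns a result or raises, and the two stand-in backends (`unavailable_grid`, `run_locally`) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_failover(job, backends):
    """Try each backend in order and return the first successful result."""
    errors = []
    for backend in backends:
        try:
            return backend(job)
        except Exception as exc:  # a real plug-in would catch narrower error types
            errors.append(exc)    # stay silent and fall through to the next backend
    raise RuntimeError(f"all {len(backends)} backends failed for {job!r}: {errors}")


def run_all(jobs, backends, max_workers=4):
    """Run independent jobs in parallel; results keep the order of `jobs`."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda job: run_with_failover(job, backends), jobs))


# Hypothetical stand-ins for a grid site that is down and a local fallback.
def unavailable_grid(job):
    raise ConnectionError("site unavailable")


def run_locally(job):
    return f"{job} done locally"
```

Calling `run_all(["a", "b"], [unavailable_grid, run_locally])` tries the grid backend first for each job and falls back to the local one without surfacing the intermediate error; an error is raised only when every backend has failed.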
  • proposed solution
    • Taverna
      • easy to present
      • embedded workflows
      • parameter tweaking
      • managed dependencies
      • easy retry
      • easy parallelization
    X. Zhou et al.: An Easy Setup for Parallel Medical Image Processing: Using Taverna and ARC
  • reality check
    • dynamic RE still experimental
      • use common tools
      • send binaries
      • call administrators
    • firewall
      • need LDAP and GSIFTP ports
      • proxy support
  • reality check
    • disk caching since Taverna 2.0
    • programs not locally installable
      • use as web service
    • upload is slow
      • upload static files to SE
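One way to read the advice to upload static files to a storage element is: upload each file once, then hand jobs a reference instead of the bytes. The sketch below illustrates that idea in Python under stated assumptions. The `ReferenceCache` class and the `upload` callable are hypothetical, and keying the cache by a SHA-256 content hash is a choice made for this example, not something the slides specify.

```python
import hashlib


class ReferenceCache:
    """Upload each distinct payload once and reuse its reference afterwards."""

    def __init__(self, upload):
        self._upload = upload  # hypothetical callable: bytes -> reference string (e.g. a URL)
        self._refs = {}        # content hash -> reference returned by the upload

    def reference_for(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        if key not in self._refs:
            # Slow path: only taken the first time this content is seen.
            self._refs[key] = self._upload(data)
        return self._refs[key]
```

Subsequent jobs that need the same static input then pay only for a dictionary lookup, while the slow transfer happens a single time.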
  • neat toys
    • Taverna as web service
    • Taverna on grid node
    • embedded Taverna
    • NestedVM
      • package arbitrary C program into JAR
  • neat toys
    • Amazon S3
      • upload from Taverna
      • grid URL
      • good for static data
    • Amazon EC2
      • on-demand grid nodes
      • control from Taverna
  • neat toys
    • use case java API
      • submit, receive, monitor
      • data references
      • silent failover
      • but NO dependency management
    • GridRunnable
      • e.g. clustering based on different criteria
  • Thank you for your attention
    http://taverna.nordugrid.org