Endofday: A Container Workflow Engine for Scalable, Reproducible Computation
1. Endofday: A Container Workflow Engine for Scalable, Reproducible Computation
Joe Stubbs, Stephen Talley, Walter Moreira, Rion Dooley, and Ann Stapleton
8th International Workshop on Science Gateways (IWSG 2016), 8-10 June 2016
Presented at Taylor Lab journal club (Aug 24, 2017)
2. Motivation
Computational experiments leverage workflows as a way to
package multiple steps and their linkage
Software necessary to run workflows can be challenging to
install and configure
Docker and application containers can solve some of these
issues by encapsulating application installation, configuration,
and dependencies
4. Endofday aims
Execute entire workflows on any Linux machine with Docker
installed without requiring installation of additional software
Use YAML for defining workflow dependencies, inputs, and
outputs
Provide integration with the Agave platform’s application catalog
5. Agave
A set of RESTful APIs for registering and leveraging storage and
execution servers on the Internet
Users can move data, register applications, and launch jobs
against remote schedulers that have been registered
Endofday can leverage apps registered with Agave
6. Using endofday
Ships as a Docker container:
docker run -v $(pwd):/staging jstubbs/endofday --setup
Create a YAML file defining a workflow
Run with
./endofday.sh my_workflow.yml
7. Workflow definition: inputs & outputs
inputs:
[- label <- path/to/source]
- ped_input <- data/toydata.ped
- map_input <- data/toydata.map
outputs:
- demonstrate.comptable
- demonstrate.TPhist
The outputs section is optional and serves mainly as documentation
In the future, it will be possible to reference global outputs while
composing workflows
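The `label <- path` convention in the inputs list can be illustrated with a small parser. This is a minimal sketch, not Endofday's own code: the helper names `parse_input_entry` and `parse_inputs` are hypothetical, and the sketch assumes each entry is a plain string in which the first `<-` separates the label from the source path.

```python
def parse_input_entry(entry):
    """Split a 'label <- path' string into (label, path).

    Hypothetical helper for illustration; not part of Endofday.
    """
    label, sep, path = entry.partition("<-")
    label, path = label.strip(), path.strip()
    if not sep or not label or not path:
        raise ValueError(f"expected 'label <- path', got {entry!r}")
    return label, path


def parse_inputs(entries):
    """Map each label to its source path, rejecting duplicate labels."""
    inputs = {}
    for entry in entries:
        label, path = parse_input_entry(entry)
        if label in inputs:
            raise ValueError(f"duplicate input label: {label!r}")
        inputs[label] = path
    return inputs


# The two entries shown on the slide.
workflow_inputs = parse_inputs([
    "ped_input <- data/toydata.ped",
    "map_input <- data/toydata.map",
])
print(workflow_inputs)
```

Rejecting duplicate labels is a design choice of this sketch: a label is how later steps would refer to an input, so two entries sharing one label would be ambiguous.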
9. Execution via Agave
Slight modification to the workflow definition is required to
point to Agave resources
Endofday will submit job requests to the Agave job service
Possible to mix local & remote execution
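As a rough illustration of what a job request to the Agave job service might contain, the sketch below assembles a JSON body. The field names (`name`, `appId`, `inputs`, `parameters`) follow the Agave v2 jobs API as commonly documented. The helper `build_job_request`, the app ID, and the storage URI are hypothetical placeholders, and the slides do not show how Endofday itself constructs the request.

```python
import json


def build_job_request(task_name, app_id, inputs, parameters=None):
    """Assemble a job request body (hypothetical helper, not Endofday code)."""
    return {
        "name": task_name,
        "appId": app_id,
        "inputs": dict(inputs),
        "parameters": dict(parameters or {}),
    }


request = build_job_request(
    task_name="example-task",
    app_id="example-app-0.1.0",  # placeholder app ID, assumed for illustration
    inputs={
        # placeholder storage URI, assumed for illustration
        "ped_input": "agave://example.storage/data/toydata.ped",
    },
)

# Serialize the request; actually submitting it is left out of this sketch.
body = json.dumps(request, indent=2)
print(body)
```

Submitting the body would be an authenticated HTTP POST to the tenant's jobs endpoint. That step is omitted here because it depends on tenant and credential details not given in the slides.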
10. Sample usage
Use the Validate workflow to
- Validate appropriateness of a tool for the given dataset, and
- Integrate existing tools into a workflow
Four tools: Simulate, FaST-LMM, Winnow, and Demonstrate
Also demonstrates usage via Agave on Stampede
Required tools had to be registered as apps within Agave’s tenant
Moving to another host would require (complicated) tool installation