Bioinformatics alchemy 101: transmuting dark script matter into reusable tools - Ross Lazarus
Reproducibility is a fundamental goal of good experimental science. Despite the increasing availability and deployment of analytic frameworks such as Galaxy, readily reproducible bioinformatic analysis remains difficult to achieve. Mature, complex workflows often require small tweaks to accommodate the idiosyncrasies of new datasets, but integrating the required new capabilities into the framework is prohibitively complex and expensive. As a result, when problems are encountered in an existing pipeline, data may be temporarily diverted for manual processing outside the framework. These manual steps typically involve relatively trivial, transient, undocumented and poorly curated programs or scripts: "dark script matter" that rarely reaches the local version control or archiving systems where production code is maintained, which threatens the goal of reproducible analysis. The Galaxy Toolfactory is a Galaxy tool that allows scripts (R, Perl, Python, Bash...) to be run directly and repeatably through the normal Galaxy interface. The Toolfactory optionally generates all the boilerplate code needed for a new Galaxy tool that permanently wraps the script for reuse. Newly generated tools can be uploaded to a local or remote Galaxy Toolshed. Tools can then be installed in a running Galaxy server from any Toolshed through the administrative interface for subsequent use in workflows and analyses. The conversion of a trivial script into a working, shareable Galaxy tool will be demonstrated during the presentation.
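To give a rough idea of what "boilerplate for a new Galaxy tool" can look like, the sketch below builds a minimal Galaxy-style tool XML wrapper around a one-input, one-output script. It is an illustration only and not the Toolfactory's actual code or output: the function name make_tool_xml, the tool id, the script name reverse_lines.py, the tabular format and the parameter names input1 and output1 are all assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET


def make_tool_xml(tool_id, name, interpreter, script_name, description=""):
    """Build a minimal Galaxy-style tool wrapper for a one-input, one-output script.

    Hypothetical helper for illustration; not part of Galaxy or the Toolfactory.
    """
    tool = ET.Element("tool", id=tool_id, name=name, version="0.1.0")
    ET.SubElement(tool, "description").text = description

    # Command line Galaxy would run; the script is assumed to sit next to the XML.
    command = ET.SubElement(tool, "command")
    command.text = (
        f"{interpreter} '$__tool_directory__/{script_name}' '$input1' '$output1'"
    )

    # One dataset goes in, one dataset comes out (tabular is an assumed format).
    inputs = ET.SubElement(tool, "inputs")
    ET.SubElement(
        inputs, "param",
        name="input1", type="data", format="tabular", label="Input dataset",
    )
    outputs = ET.SubElement(tool, "outputs")
    ET.SubElement(outputs, "data", name="output1", format="tabular")

    ET.SubElement(tool, "help").text = f"Runs {script_name} on the selected dataset."
    return ET.tostring(tool, encoding="unicode")


if __name__ == "__main__":
    # Wrap a hypothetical Python script that reverses the lines of its input.
    print(make_tool_xml("reverse_lines", "Reverse lines", "python", "reverse_lines.py"))
```

A real wrapper produced for a Toolshed would typically carry more than this (dependencies, tests, citations); the point here is only that the wrapper is mechanical enough to be generated from a handful of values supplied alongside the script.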