The Galaxy ToolShed
The software repository of Galaxy

This presentation provides some technical details on the function of the Galaxy toolshed. It was prepared for a group (Biobix at UGent), during my previous job.

The Galaxy toolshed

  1. 1. The Galaxy ToolShed The software repository of Galaxy
  2. 2. Galaxy is an interface to a system
      [Diagram labels: cpu, storage, binaries, libraries, GALAXY, GUI]
  3. 3. Toolshed: software repository
      ● In the long term: start from an empty Galaxy and install only the wanted tools, on a per-user basis.
  4. 4. Toolshed: get your own?...
      ● Code is part of the main distribution
      ● ./run_tool_shed.sh
      ● Very easy to run locally...
  5. 5. Code is shared through hg
      [Diagram labels: Galaxy main code on Bitbucket, hg, Galaxy server + Toolshed]
  6. 6. Toolshed: run your own?
      ● The Toolshed is a completely separate process from Galaxy
      ● It uses its own pg database: you need to create a new user account
      ● The Toolshed's files need to be stored separately, next to the Galaxy root
  7. 7. Sharing a tool is basically simple
      All you have to share (if it's a simple script) is:
          tool_conf.xml
          tool.pl
      This can be distributed using the Tool Shed.
      Dependencies have to be installed separately.
  8. 8. Sharing through the toolshed
      Galaxy moves to installing everything through the Tool Shed: see shed_tool_conf.xml

          <?xml version="1.0"?>
          <toolbox tool_path="/shed_tools">
              <section id="textutil" name="Text Manipulation" version="">
                  <tool file="/shed_tools/toolshed.g2.bx.psu.edu/repos/bjoern-gruening/sed_wrapper/e850a63e5aed/sed_wrapper/sed.xml"
                        guid="toolshed.g2.bx.psu.edu/repos/bjoern-gruening/sed_wrapper/sed_stream_editor/0.0.1">
                      <tool_shed>toolshed.g2.bx.psu.edu</tool_shed>
                      <repository_name>sed_wrapper</repository_name>
                      <repository_owner>bjoern-gruening</repository_owner>
                      <installed_changeset_revision>e850a63e5aed</installed_changeset_revision>
                      <id>toolshed.g2.bx.psu.edu/repos/bjoern-gruening/sed_wrapper/sed_stream_editor/0.0.1</id>
                      <version>0.0.1</version>
                  </tool>
              </section>
          </toolbox>
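To make the structure of shed_tool_conf.xml more concrete, here is a minimal Python sketch that reads a trimmed copy of the file above and lists where each installed tool came from. The function name list_shed_tools and the dictionary keys are illustrative choices, not part of Galaxy.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the shed_tool_conf.xml shown on the slide.
SHED_TOOL_CONF = """<?xml version="1.0"?>
<toolbox tool_path="/shed_tools">
  <section id="textutil" name="Text Manipulation" version="">
    <tool file="/shed_tools/toolshed.g2.bx.psu.edu/repos/bjoern-gruening/sed_wrapper/e850a63e5aed/sed_wrapper/sed.xml"
          guid="toolshed.g2.bx.psu.edu/repos/bjoern-gruening/sed_wrapper/sed_stream_editor/0.0.1">
      <tool_shed>toolshed.g2.bx.psu.edu</tool_shed>
      <repository_name>sed_wrapper</repository_name>
      <repository_owner>bjoern-gruening</repository_owner>
      <installed_changeset_revision>e850a63e5aed</installed_changeset_revision>
      <version>0.0.1</version>
    </tool>
  </section>
</toolbox>
"""


def list_shed_tools(xml_text):
    """Return one dict per <tool> entry, with the section it is shown in."""
    root = ET.fromstring(xml_text)
    tools = []
    for section in root.iter("section"):
        for tool in section.iter("tool"):
            tools.append({
                "section": section.get("name"),
                "shed": tool.findtext("tool_shed", "").strip(),
                "owner": tool.findtext("repository_owner", "").strip(),
                "repository": tool.findtext("repository_name", "").strip(),
                "revision": tool.findtext("installed_changeset_revision", "").strip(),
                "version": tool.findtext("version", "").strip(),
            })
    return tools


for entry in list_shed_tools(SHED_TOOL_CONF):
    print(entry)
```

Each entry records the Tool Shed, the repository owner and the installed changeset, which is the information Galaxy needs to fetch updates for that tool later.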
  9. 9. Tasks of the toolshed
      ● Communicate with any Galaxy that wants to install a tool from it (the Galaxy admin who accepts the tool needs to add your Toolshed)
      ● Periodically run functional tests on the tools
      ● Allow people to update the tools
      ● Co-develop tools
  10. 10. Philosophy: task of a tool
      Some functionality is encoded redundantly in tools.
      An example is visualising data: some call R, some call GNUplot.
      I really think that the preferred output of Galaxy needs to be text. A versatile, strong visualisation tool can then draw graphs as needed from the output.
      (PNG, PDF and other visual formats are supported.)
      BTW: the 2 different repository types comply with this view.
  11. 11. My original aim...
      [Diagram labels: Prod gal, test gal, Dev 1, Dev 2, Dev 3, Tool Shed (BITS?), Update, Offic. gal-dist]
  12. 12. 'Official' advice
      ● Run Galaxy and the Toolshed locally
      ● Develop your tool in your local Galaxy
      ● If everything runs, wrap it up as .tar
      ● Upload everything to the Toolshed of your choice.
      ● Test the download in a test Galaxy from the Toolshed
      ● Debug...
      Do not use the Toolshed as a development environment.
  13. 13. All code is shared through hg
      [Diagram labels: Galaxy main code on Bitbucket, hg, Galaxy server + Toolshed server]
  14. 14. All code is shared through hg
      [Diagram labels: Galaxy main code on Bitbucket, hg, Galaxy server + Toolshed server, FancyTool (hg repo), SuperTool (hg repo), PowerTool (hg repo), your uploaded .tar balls]
  15. 15. Code is shared through hg
  17. 17. Tips ● To test installation: empty your local toolbox
  18. 18. What is mercurial?
      A version/source control system.
      Without mercurial
  20. 20. What is mercurial?
      A version/source control system.
      Without mercurial: continuously adding changes.
  21. 21. What is mercurial?
      A version/source control system.
      With mercurial: fix certain states of your file (“commits”).
  22. 22. What is mercurial?
      1. keep track of the changes YOU make to your files, scripts, folders,...

          joachim@joachim-laptop:~/Projects/hgprojects$ hg log
          changeset:   2:726fa53bcd7d
          tag:         tip
          user:        Joachim Jacob <joachim.jacob@gmail.com>
          date:        Fri Nov 16 11:24:09 2012 +0100
          summary:     Third change, playing with copy and remove

          changeset:   1:744894cb4ee6
          user:        Joachim Jacob <joachim.jacob@gmail.com>
          date:        Fri Nov 16 11:09:49 2012 +0100
          summary:     I have added a small change to hello.txt

          changeset:   0:b84e0105967f
          user:        Joachim Jacob <joachim.jacob@gmail.com>
          date:        Fri Nov 16 11:08:01 2012 +0100
  23. 23. What is mercurial?
      You can go back to a previous revision (e.g. hg update 2).
      You can make some changes to the files (creating multiple heads).
      [Diagram labels: “head”, “head”]
  24. 24. What is mercurial?
      You can go back to a previous revision.
      You can make some changes to the files.

          joachim@joachim-laptop:~/Projects/hgprojects$ hg update 1
          1 files updated, 0 files merged, 3 files removed, 0 files unresolved
          joachim@joachim-laptop:~/Projects/hgprojects$ nano hello.txt
          joachim@joachim-laptop:~/Projects/hgprojects$ hg commit -m "Bug fix"
          created new head
          joachim@joachim-laptop:~/Projects/hgprojects$ hg summary
          parent: 3:2d1d80bd0124 tip
           Bug fix
          branch: default
          commit: (clean)
          update: 1 new changesets, 2 branch heads (merge)
  25. 25. What is mercurial?
      When you are done with a change, you can merge the heads together again into one tip. (“merge”)

          joachim@joachim-laptop:~/Projects/hgprojects$ hg merge
          merging hello.txt and another.txt to another.txt
          merging hello.txt and mvtest.txt to mvtest.txt
          1 files updated, 2 files merged, 0 files removed, 0 files unresolved
          (branch merge, don't forget to commit)
  26. 26. What is mercurial?
      When you are done with a change, you can merge the heads together again into one tip. (“commit”)

          joachim@joachim-laptop:~/Projects/hgprojects$ hg commit -m 'Commit the bug fix permanently'

      In case of conflicts, use 'hg resolve --list' to view the conflicting files. Fix them by hand.
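As a rough illustration of what "heads" and "merge" mean in the slides above, the history can be modelled as a mapping from each revision number to its parents. This is a toy model in Python, not how Mercurial stores its data; revisions 0 to 3 mirror the example session, and revision 4 for the merge commit is an assumption.

```python
def heads(parents):
    """Return the revisions that no other revision lists as a parent."""
    with_children = {p for ps in parents.values() for p in ps}
    return sorted(rev for rev in parents if rev not in with_children)


# Linear history after three commits: 0 <- 1 <- 2
history = {0: [], 1: [0], 2: [1]}
heads_linear = heads(history)

# "hg update 1" followed by a commit creates revision 3 on top of revision 1.
history[3] = [1]
heads_after_bugfix = heads(history)

# "hg merge" followed by "hg commit" records a changeset with two parents.
history[4] = [3, 2]
heads_after_merge = heads(history)

print(heads_linear, heads_after_bugfix, heads_after_merge)
# prints: [2] [2, 3] [4]
```

The second result is the "2 branch heads" situation reported by hg summary; committing the merge brings the history back to a single tip.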
  27. 27. What is mercurial?
      1. keep track of the changes YOU make to your files, scripts, folders,...
      2. clone your working directory to a new directory (e.g. to work on another feature). (“clone”)
  28. 28. What is mercurial?
      You can compare two different repositories with incoming.
      If you want to merge the changes, you can use pull. (“incoming”)
  29. 29. What is mercurial?
      You can compare two different repositories with incoming.
      If you want to merge the changes, you can use pull. (“pull”)
  30. 30. What is mercurial?
      You can compare two different repositories with incoming.
      If you want to merge the changes, you can use pull. (“merge”)
  31. 31. What is mercurial?
      hg commit to fix the change! (“commit”)
  32. 32. What is mercurial?
      So, in your directory,
      EITHER you change or add files yourself,
      OR mercurial does this for you (during a merge; undo with 'rollback').
      Both need to be followed by a commit.
  33. 33. What is mercurial?
      1. keep track of the changes YOU make to your files, scripts, folders,...
      2. clone your working directory to a new directory (e.g. to work on another feature).
      3. Share changes with other users.
  34. 34. Sharing in mercurial?
      The directories might be located
      - in local directories: “pull /path/to/directory”
      - on your intranet (hg serve): “pull http://10.10.10.100:8000”
      - on the internet: “pull” (hg clone http://joachim@toolshed.bits.vib.be/repos/joachim/clcaligner)
      You can also export a commit, send it through email, and import it.
      You can also set up a push repository online on BitBucket.
  35. 35. What is mercurial? Guide! http://mercurial.selenic.com/guide/ http://hginit.com/
  36. 36. Galaxy Toolshed Galaxy Toolshed contains a bunch of Mercurial repositories you can clone
  37. 37. Getting ready for Galaxy development How I develop for Galaxy:
  38. 38. Getting ready for Galaxy development
      How I develop for Galaxy:
      [Diagram labels: template, set tool name, Toolshed upload, hg clone, Dev Galaxy, hg push]
  39. 39. Getting ready for Galaxy development
      And the last step:
      [Diagram labels: template, set tool name, Toolshed upload, hg clone, Dev Galaxy, hg push, Galaxy.bits.vib.be]
  40. 40. Getting ready for Galaxy development
      How I develop for Galaxy:
      - you need a personal Galaxy (hg clone …)
      - you might use a Toolshed repository
      1. Get a template (right): a tarball with some files.
         ● README
         ● tool_data_table_conf.xml.sample
         ● tool_dependencies.xml
         ● tool_indices.loc.sample
         ● tool_wrapper_template.pl
         ● tool_wrapper.xml
  41. 41. Getting ready for Galaxy development
      2. Rename the files:
      - replace 'tool' with your tool name

          [galaxy@joagal razers]$ ls
          razers3_wrapper.xml  README  tool_data_table_conf.xml.sample  tool_indices.loc.sample  tool_wrapper_template.pl
  42. 42. Getting ready for Galaxy development
      3. Edit the wrapper.xml: the <tool> section.
  43. 43. Getting ready for Galaxy development
      4. Pack everything into a tarball again and upload it to the test Toolshed in a new repository
  45. 45. Getting ready for Galaxy development
      5. hg clone your repository to a folder in your development Galaxy.
  46. 46. Getting ready for Galaxy development
      5. hg clone your repository to a folder in your development Galaxy.

          [galaxy@joagal GalaxyHangar]$ hg clone http://joachim@192.168.10.23:9009/repos/joachim/fastqseqlen
          destination directory: fastqseqlen
          requesting all changes
          adding changesets
          adding manifests
          adding file changes
          added 1 changesets with 2 changes to 2 files
          updating to branch default
          resolving manifests
          getting README
          getting fastqseqlen.xml
          2 files updated, 0 files merged, 0 files removed, 0 files unresolved
  47. 47. Getting ready for Galaxy development
      5. hg clone your repository to a folder in your development Galaxy.

          [galaxy@joagal GalaxyHangar]$ cd fastqseqlen/
          [galaxy@joagal fastqseqlen]$ ls
          fastqseqlen.xml  README
          [galaxy@joagal fastqseqlen]$ hg summary
          parent: 0:3f22736718ef tip
           Uploaded files
          branch: default
          commit: (clean)
          update: (current)
  48. 48. Getting ready for Galaxy development
      6. Link the complete directory to a directory under $GALAXY_HOME/tools/ and make Galaxy aware of it by modifying tool_conf.xml
  49. 49. Getting ready for Galaxy development
      7. (Re)start your Galaxy

          $ ./run.sh --reload

      And check if the tool loads:
  50. 50. Getting ready for Galaxy development
      8. Get your tool's parameters to display correctly:
      Fill in the rest of the tool's XML file.
      Also add the .loc file (which contains your reference data) if needed.
      (When modifying the XML, you have to restart Galaxy to see the changes. Kill Galaxy and run ./run.sh --reload again.)
  51. 51. Starting Galaxy tools development
      9. Fun! Start developing your tool
      Development happens in the development Galaxy, committing changes from time to time (possibly pushing to the Toolshed).

          $ hg commit -m "Alpha version of RazerS3 wrapper"
          $ hg push --debug
          $ hg commit -m "Some small changes"
          $ hg push --debug
  52. 52. Starting Galaxy tools development
      Mercurial credentials should be stored in ~/.hgrc (Mercurial.ini on Windows)

          [ui]
          username = "joachim <joachim.jacob@vib.be>"
          verbose = True

          [extensions]
          hgext.graphlog =

          [auth]
          bb.prefix = http://192.168.10.26:9009/repos/joachim/razers
          bb.username = joachim
          bb.password = ********
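The [auth] section groups its keys by a label (here bb) and is selected by URL prefix. The sketch below, in Python, mimics that lookup with the standard configparser module. It is a simplified illustration and an assumption about the matching rule (plain string prefix, longest match wins), not Mercurial's own implementation; the function name auth_for is made up.

```python
import configparser

# Copy of the hgrc from the slide; the password stays masked.
HGRC = """
[ui]
username = "joachim <joachim.jacob@vib.be>"
verbose = True

[extensions]
hgext.graphlog =

[auth]
bb.prefix = http://192.168.10.26:9009/repos/joachim/razers
bb.username = joachim
bb.password = ********
"""


def auth_for(hgrc_text, url):
    """Return the [auth] group whose prefix matches url, or None."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(hgrc_text)
    groups = {}
    for key, value in parser.items("auth"):
        label, _, field = key.partition(".")
        groups.setdefault(label, {})[field] = value
    # Keep only groups whose prefix is a leading substring of the URL.
    matching = [g for g in groups.values()
                if g.get("prefix") and url.startswith(g["prefix"])]
    if not matching:
        return None
    return max(matching, key=lambda g: len(g["prefix"]))


match = auth_for(HGRC, "http://192.168.10.26:9009/repos/joachim/razers")
print(match["username"])
```

With this in place, hg push to the repository under that prefix can pick up the stored username instead of prompting for it.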
  53. 53. When development is ready... Push the last changes to the Galaxy test Toolshed. Export from the Galaxy Test Toolshed and import in BITS Toolshed. Install in Galaxy.bits.vib.be
  55. 55. Galaxy manages scripts (tools)
      1. Galaxy knows the location of tools, as this is set in one or more XML files
      2. The tool referenced by an XML file can be
         - a script that does all calculations by itself (e.g. bash script, python script,...)
         - a script that does calculations by using 3rd party libraries (e.g. R)
         - a script that does calculations by calling a 3rd party binary
  56. 56. 4 different XML files
      ● integrated_tool_panel.xml - layout of panel
      ● shed_tool_conf.xml - tools from shed
      ● tool_conf.xml - tools from install or own
      ● migrated_tools_conf.xml - tools removed from tool_conf.xml upon updating
      Note: these XML files only came into effect with the latest update!
  57. 57. Galaxy installation directory
      ● Galaxy is installed as the user galaxy in /home/galaxy/galaxy-dist
      ● Installation and version control of this directory is done by Mercurial (config in the .hg directory; the file .hgignore excludes certain files from updates)
      ● Installation for production required some changes: Postgres DB, Apache serving static content, network settings, running Galaxy as a daemon in the background
      http://wiki.g2.bx.psu.edu/Admin/Get%20Galaxy
  58. 58. Galaxy installation directory
      ● Galaxy is installed on Linux as the user galaxy in /home/galaxy/galaxy-dist
      ● Important locations under this directory:
         - universe_wsgi.ini → general config file
         - *.xml → 'embedding' of tools and types
         - tools/ → location of the scripts
         - database/ → location of the datasets
      http://wiki.g2.bx.psu.edu/Admin/Get%20Galaxy
  59. 59. integrated_tool_panel.xml

          <toolbox>
              <section id="fasta_manipulation" name="FASTA manipulation" version="">
                  <tool id="fasta_compute_length" />
                  <tool id="fasta_filter_by_length" />
                  <tool id="fasta_concatenate0" />
                  <tool id="fasta2tab" />
                  <tool id="tab2fasta" />
                  <tool id="cshl_fasta_formatter" />
                  <tool id="cshl_fasta_nucleotides_changer" />
                  <tool id="cshl_fastx_collapser" />
              </section>
          </toolbox>

      Compiled by Galaxy from shed_tool_conf.xml and tool_conf.xml. The ID of a tool refers to the ID value in the other *.xml files. ONLY to be edited when changing a tool's position in the tool panel.
  60. 60. tool_conf.xml

          <?xml version="1.0"?>
          <toolbox>
              <section name="FASTA manipulation" id="fasta_manipulation">
                  <tool file="fasta_tools/fasta_compute_length.xml" />
                  <tool file="fasta_tools/fasta_filter_by_length.xml" />
                  <tool file="fasta_tools/fasta_concatenate_by_species.xml" />
                  <tool file="fasta_tools/fasta_to_tabular.xml" />
                  <tool file="fasta_tools/tabular_to_fasta.xml" />
                  <tool file="fastx_toolkit/fasta_formatter.xml" />
                  <tool file="fastx_toolkit/fasta_nucleotide_changer.xml" />
                  <tool file="fastx_toolkit/fastx_collapser.xml" />
              </section>
          </toolbox>

      To be edited by developers to add new tools: you refer to the location of the tool XML, relative to the tools directory (tools/, as set in universe_wsgi.ini).
  61. 61. tool.xml, the tool definition file

          <tool id="fasta_compute_length" name="Compute sequence length">
              <description></description>
              <command interpreter="python">
                  fasta_compute_length.py $input $output $keep_first
              </command>
              <inputs>
                  <param name="input" type="data" format="fasta" label="Compute length for these sequences"/>
                  <param name="keep_first" type="integer" size="5" value="0" label="How many title characters to keep?" help="'0' = keep the whole thing"/>
              </inputs>
              <outputs>
                  <data name="output" format="tabular"/>
              </outputs>
              <tests/>
              <help/>
          </tool>

      Every tool has an XML file that points to the script, builds the interface, and passes the parameters to the tool.
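To show how the <command> template turns into an actual command line, here is a small Python sketch. Galaxy fills the template with its own templating engine; the standard library's string.Template is used here only as a stand-in because it happens to share the $name syntax. The file paths and the function name build_command_line are hypothetical.

```python
import xml.etree.ElementTree as ET
from string import Template

# Shortened copy of the tool definition from the slide.
TOOL_XML = """<tool id="fasta_compute_length" name="Compute sequence length">
  <command interpreter="python">
    fasta_compute_length.py $input $output $keep_first
  </command>
  <inputs>
    <param name="input" type="data" format="fasta" label="Compute length for these sequences"/>
    <param name="keep_first" type="integer" size="5" value="0" label="How many title characters to keep?"/>
  </inputs>
  <outputs>
    <data name="output" format="tabular"/>
  </outputs>
</tool>
"""


def build_command_line(tool_xml, values):
    """Fill the <command> template with concrete parameter values."""
    command = ET.fromstring(tool_xml).find("command")
    template = " ".join(command.text.split())  # collapse whitespace and newlines
    filled = Template(template).substitute(values)
    return "{} {}".format(command.get("interpreter"), filled)


# Hypothetical dataset paths; Galaxy would supply the real ones.
line = build_command_line(TOOL_XML, {
    "input": "/tmp/in.fasta",
    "output": "/tmp/out.tabular",
    "keep_first": 0,
})
print(line)
# prints: python fasta_compute_length.py /tmp/in.fasta /tmp/out.tabular 0
```

The names used in the template ($input, $keep_first, $output) are exactly the name attributes of the <param> and <data> elements, which is what ties the interface to the script's arguments.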
  62. 62. Tool interface is built from xml
  63. 63. The tool XML points to a script
      ./tools/fasta_tools/fasta_compute_length.py :

          #!/usr/bin/env python
          """
          Uses fasta_to_len converter code.
          """
          import sys
          from galaxy.datatypes.converters.fasta_to_len import compute_fasta_length
          compute_fasta_length( sys.argv[1], sys.argv[2], sys.argv[3] )

      In this case the computation takes place in Python itself. Sometimes, however, third-party libraries have to be installed.
  64. 64. The tool XML points to a binary

          #!/usr/bin/env python
          """
          Runs BWA on single-end or paired-end data.
          Produces a SAM file containing the mappings.
          Works with BWA version 0.5.9.

          usage: bwa_wrapper.py [options]
          See below for options
          """

          import optparse, os, shutil, subprocess, sys, tempfile

          def stop_err( msg ):
              sys.stderr.write( '%s\n' % msg )
              sys.exit()

          def check_is_double_encoded( fastq ):
              # check that first read is bases, not one base followed by numbers
              bases = [ 'A', 'C', 'G', 'T', 'a', 'c', 'g', 't', 'N' ]
              nums = [ '0', '1', '2', '3' ]
              for line in file( fastq, 'rb'):
                  if not line.strip() or line.startswith( '@' ):

      (The code continues beyond the slide.)
  65. 65. Options for building interfaces
      Overview of the tags on http://wiki.g2.bx.psu.edu/Admin/Tools/Tool%20Config%20Syntax
      The parameters to construct the interface are placed within <inputs> </inputs> tags.
      The tags you use in the <inputs> section define a lot of the syntax to use in other tag sets, such as <outputs> and <command>.
      BASIC USE

          <param name=[param_name] type="text" value="default" label="Explanation of the parameter" help="help"/>
  66. 66. Select a dataset from history
      If the type of the input is "data", a dropdown list of history items appears.
      The accepted format should be included as format="format".

          <param name="input" type="data" format="tabular" label="Dataset"/>
  67. 67. Choose from a list

          <param name="detection_thresh" type="select" multiple="true" label="Detection thresholds">
              <option value="0.001">0.001</option>
              <option value="0.002">0.002</option>
              <option value="0.003">0.003</option>
              <option value="0.004">0.004</option>
          </param>
  68. 68. Select reference data

          <param name="indices" type="select" label="Select a reference genome">
              <options from_data_table="bwa_indexes">
                  <filter type="sort_by" column="2" />
                  <validator type="no_options" message="No indexes are available" />
              </options>
              <!-- is not option -->
          </param>

      For some tools, indexed data can be made available (e.g. BLAST, NGS mappers, …). To pass indexed sets, they can be referenced via tool_data_table_conf.xml: its entries point to ./tool-data/<toolname>.loc files
  69. 69. Select reference data
      ./tool_data_table_conf.xml:

          <table name="bwa_indexes" comment_char="#">
              <columns>value, dbkey, name, path</columns>
              <file path="tool-data/bwa_index.loc" />
          </table>

      ./tool-data/<toolname>.loc:

          hg19_chr21   hg19   Human chrom 21 bld 37 (hg19)   /mnt/genomes/hg19_chrom21/bwa/base/build37_chr21.fa
          hg18         hg18   Human genome bld 36 (hg18)     /mnt/genomes/hg18/bwa/base/build36.fa
          hg19         hg19   Human genome bld 37 (hg19)     /mnt/genomes/hg19/bwa/base/build37.fa

      The reference data is on a disk mounted on /mnt/genomes
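The link between the <columns> declaration and the .loc file can be sketched in a few lines of Python. This assumes the .loc file is tab-separated with one entry per line and that lines starting with the comment_char are skipped; parse_loc is an illustrative name, not a Galaxy function.

```python
# Column names as declared in <columns> of tool_data_table_conf.xml.
COLUMNS = ["value", "dbkey", "name", "path"]

# Two of the entries from the slide, written with explicit tabs,
# plus a comment line to show the comment_char handling.
LOC_TEXT = (
    "# value\tdbkey\tname\tpath\n"
    "hg19_chr21\thg19\tHuman chrom 21 bld 37 (hg19)\t"
    "/mnt/genomes/hg19_chrom21/bwa/base/build37_chr21.fa\n"
    "hg18\thg18\tHuman genome bld 36 (hg18)\t"
    "/mnt/genomes/hg18/bwa/base/build36.fa\n"
)


def parse_loc(text, columns=COLUMNS, comment_char="#"):
    """Turn the lines of a .loc file into dicts keyed by column name."""
    rows = []
    for line in text.splitlines():
        if not line.strip() or line.startswith(comment_char):
            continue
        fields = line.split("\t")
        if len(fields) != len(columns):
            continue  # skip malformed lines instead of guessing
        rows.append(dict(zip(columns, fields)))
    return rows


for row in parse_loc(LOC_TEXT):
    print(row["name"], "->", row["path"])
```

In the select parameter of the previous slide, the name column is what the user sees in the dropdown, and sort_by column="2" refers to that same column counted from zero.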
  70. 70. Select reference data
      The reference data is on a disk mounted on /mnt/genomes

          /mnt/genomes/ (800GB)
          |-- hg18
          |   |-- bfast
          |   |-- bowtie
          |   |-- bwa
          |-- hg19
          |   |-- bfast
          |   |-- bowtie
          |   |-- bwa
  71. 71. Other useful input: conditional

          <conditional name=[name]>
              <param name=[name] type="select" … >
                  <option value="no">No</option>
                  <option value="yes">Yes</option>
              </param>
              <when value="no">
                  <param name=[name] … />  <!-- e.g. ask for input -->
              </when>
              <when value="yes"/>
          </conditional>
  72. 72. Other useful input: conditional
  73. 73. Output section
      It is easiest if your script accepts the name of the output file to write the results to. The actual output file names are then passed by Galaxy to your program.

          <outputs>
              <data format="fasta" name="trim_fasta" label="${tool.name} on ${on_string} seq"/>
          </outputs>
          <command … >
              myscript.pl -i $input -o $trim_fasta
          </command>

      Important: set the format to the correct type!
      Optional output files can be handled with the <conditional> tag set, linking it to the <filter> tag in the outputs section.
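A script that fits this pattern only needs to take the input and output paths as arguments. The slide uses a Perl script (myscript.pl); the following is a hypothetical Python equivalent, where process() is a placeholder for the real computation and simply passes lines through.

```python
import argparse


def process(line):
    """Placeholder for the real computation; here lines pass through unchanged."""
    return line


def main(argv=None):
    parser = argparse.ArgumentParser(
        description="Read the input dataset and write the result to the path Galaxy provides.")
    parser.add_argument("-i", "--input", required=True, help="path of the input dataset")
    parser.add_argument("-o", "--output", required=True, help="path Galaxy chose for the output dataset")
    args = parser.parse_args(argv)

    # Galaxy decides where the output lives; the script only writes to that path.
    with open(args.input) as source, open(args.output, "w") as target:
        for line in source:
            target.write(process(line))
    return 0
```

Saved as a script with the usual call to main() at the bottom, it would be invoked from <command> the same way as the Perl example: -i $input -o $trim_fasta.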
  74. 74. How to integrate a tool?
      You have: a script that accepts parameters and writes the results to a text file.
      TODO
      1. put your script in ~/galaxy-dist/tools/mytools/
      2. in that directory, create a mytool.xml file, pointing to that tool, with all tag sets set correctly.
      3. in ~/galaxy-dist/tool_conf.xml, enter a line with your tool xml file
      4. restart galaxy: # service galaxyd restart
      (4'. optional: change the location of your tool in integrated_tool_panel.xml and restart again)
      5. There's the magic. Enjoy your tool!
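Step 3 is normally a one-line manual edit. For illustration, the same change can be sketched in Python; the section id "mytools", the section name "My tools" and the function add_tool are assumptions made for this example, and the file path follows the mytools/mytool.xml naming used in the steps above.

```python
import xml.etree.ElementTree as ET

# Minimal tool_conf.xml with one existing section.
TOOL_CONF = """<?xml version="1.0"?>
<toolbox>
  <section name="FASTA manipulation" id="fasta_manipulation">
    <tool file="fasta_tools/fasta_compute_length.xml" />
  </section>
</toolbox>
"""


def add_tool(tool_conf_xml, section_id, section_name, tool_file):
    """Return tool_conf.xml text with tool_file registered in the given section."""
    root = ET.fromstring(tool_conf_xml)
    section = None
    for candidate in root.iter("section"):
        if candidate.get("id") == section_id:
            section = candidate
            break
    if section is None:
        # Create the section when it does not exist yet.
        section = ET.SubElement(root, "section", {"name": section_name, "id": section_id})
    registered = [tool.get("file") for tool in section.iter("tool")]
    if tool_file not in registered:
        ET.SubElement(section, "tool", {"file": tool_file})
    return ET.tostring(root, encoding="unicode")


updated = add_tool(TOOL_CONF, "mytools", "My tools", "mytools/mytool.xml")
print(updated)
```

The file attribute is relative to the tools directory, so mytools/mytool.xml corresponds to ~/galaxy-dist/tools/mytools/mytool.xml from step 1.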
  75. 75. Wrapping Binaries
      Things get a bit difficult with wrapper scripts: scripts that drive a third-party binary, which needs to be available on the system. I have installed 3rd party binaries in: /opt
      (In one case, I found myself writing a python script to drive a 3rd party bash script that consecutively executed a JAVA binary and an R command to generate a PDF document. The correct implementation: execute the JAVA binary, generate text. Let visualisation tools in Galaxy generate graphs.)
  76. 76. Tool dependencies
      Some tools in the Toolshed require a common code base: e.g. R, samtools, GATK
      In your .xml you specify these requirements:
  77. 77. Tool dependencies In your .xml these requirements must match the tool_dependencies.xml
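The slides show the matching rule as screenshots only. As a sketch, assuming the tool XML lists packages as <requirement type="package" version="…">name</requirement> and tool_dependencies.xml declares <package name="…" version="…"> elements, the following Python reports requirements that have no matching package. The tool id, package versions and the function name unmatched_requirements are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical tool XML: two package requirements.
TOOL_XML = """<tool id="example_tool" name="Example">
  <requirements>
    <requirement type="package" version="0.1.18">samtools</requirement>
    <requirement type="package" version="2.15.0">R</requirement>
  </requirements>
</tool>
"""

# Hypothetical tool_dependencies.xml: only one of them is defined.
TOOL_DEPENDENCIES_XML = """<?xml version="1.0"?>
<tool_dependency>
  <package name="samtools" version="0.1.18" />
</tool_dependency>
"""


def unmatched_requirements(tool_xml, dependencies_xml):
    """Return (name, version) pairs required by the tool but not defined as packages."""
    required = {
        (req.text.strip(), req.get("version"))
        for req in ET.fromstring(tool_xml).iter("requirement")
        if req.get("type") == "package"
    }
    provided = {
        (pkg.get("name"), pkg.get("version"))
        for pkg in ET.fromstring(dependencies_xml).iter("package")
    }
    return sorted(required - provided)


print(unmatched_requirements(TOOL_XML, TOOL_DEPENDENCIES_XML))
# prints: [('R', '2.15.0')]
```

Both the name and the version have to agree; a requirement for samtools 0.1.19 would also be reported here as unmatched.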
  80. 80. Tool_dependencies.xml
      1, define a dependency as a repository of a toolshed containing a tool dependency definition type
      2, or write directly in the tool_dependencies.xml the instructions to install the dependency, and make it available system wide. Galaxy aims to be platform independent, so A HELL OF A JOB.
      http://wiki.galaxyproject.org/ToolShedToolFeatures#Automatic_third-party_tool_dependency_installa
  81. 81. Tool_dependencies.xml This is the simplest you can get. Really.
  82. 82. Tool_dependencies.xml A more complex example
  84. 84. Lesson 1
      It pays off to use / build on repositories started by others.
  85. 85. The problem is the testing
      1, build your tool and make it work in your Galaxy
      2, define your dependencies
      3, search the (test) Toolshed for repositories you can use: tool dependency definitions (“just installing packages, without providing an interface”).
      4, put them as requirements in your tool.xml
      5, for the ones you do not find: decide whether to create a separate tool dependency definition and integrate them, OR
      5' add them to your tool_dependencies.xml file.
      6' Upload to a Toolshed
      7' Fire up a test Galaxy, and plug the tool in to see whether it works.
  86. 86. The problem is the testing
      You might consider a virtual test machine, e.g. in VirtualBox.
      1, install your OS
      2, fetch Galaxy
      3, get universe_wsgi.ini ready (admin, location,...)
      4, plug in your repository
      5, SNAPSHOT your machine
      6, graphically install your tool
      7, determine what went wrong
      7` update the repository
      7`` and restore the snapshot
      8, iterate until SUCCESS!
  87. 87. Tool dependencies
      Dependencies
      IGENOMES (http://support.illumina.com/sequencing/sequencing_software/igenome.ilmn)
          gtf file: $IGENOMES_ROOT/Mus_musculus/Ensembl/GRCm38/Annotation/Genes/genes.gtf
          reference whole genome sequence: $IGENOMES_ROOT/Mus_musculus/Ensembl/GRCm38/Sequence/WholeGenomeFasta/
          reference chromosome sequences: $IGENOMES_ROOT/Mus_musculus/Ensembl/GRCm38/Sequence/Chromosomes/
          PHIX-control sequences: $IGENOMES_ROOT/Mus_musculus/Ensembl/GRCm38/Sequence/AbundantSequences/phix.fa
          TopHat2 (Bowtie2) and STAR indexes:
              $IGENOMES_ROOT/Mus_musculus/Ensembl/GRCm38/Sequence/Bowtie2Index
              $IGENOMES_ROOT/Mus_musculus/Ensembl/GRCm38/Sequence/STARIndex
          Chr size file: $IGENOMES_ROOT/Mus_musculus/Ensembl/GRCm38/Annotation/Genes/ChromInfo.txt
      Binaries
          STAR (https://code.google.com/p/rna-star/)
          TOPHAT2 (http://tophat.cbcb.umd.edu/)
          BLASTP (ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/) or USEARCH (http://www.drive5.com/usearch/download.html)
          R (http://www.r-project.org/)
          SAMTOOLS (http://sourceforge.net/projects/samtools/files/samtools/)
          GATK (http://www.broadinstitute.org/gatk/download)
          PICARD (http://sourceforge.net/projects/picard/files/picard-tools/)
          SQLITE3 (http://www.sqlite.org/download.html)
      Custom Ensembl SQLite DB
          tables included: coord_system exon_transcript intergene (made by the intergenic TIScalling script based on gene) transcript exon gene data
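Because these reference files are not installed by the Tool Shed, it can help to check them before running the tool. Below is a minimal Python sketch that resolves the paths listed above against IGENOMES_ROOT and reports what is missing; the labels in the dictionary and the function name missing_references are illustrative.

```python
import os

# Paths from the slide, relative to $IGENOMES_ROOT.
GENOME = "Mus_musculus/Ensembl/GRCm38"
REFERENCE_FILES = {
    "gtf": GENOME + "/Annotation/Genes/genes.gtf",
    "whole_genome_fasta": GENOME + "/Sequence/WholeGenomeFasta",
    "chromosomes": GENOME + "/Sequence/Chromosomes",
    "phix": GENOME + "/Sequence/AbundantSequences/phix.fa",
    "bowtie2_index": GENOME + "/Sequence/Bowtie2Index",
    "star_index": GENOME + "/Sequence/STARIndex",
    "chrom_info": GENOME + "/Annotation/Genes/ChromInfo.txt",
}


def missing_references(igenomes_root, reference_files=REFERENCE_FILES):
    """Return the labels of reference files or directories that do not exist."""
    missing = []
    for label, relative in reference_files.items():
        full_path = os.path.join(igenomes_root, *relative.split("/"))
        if not os.path.exists(full_path):
            missing.append(label)
    return sorted(missing)


root = os.environ.get("IGENOMES_ROOT")
if root:
    print("missing:", missing_references(root))
```

An empty list means every path from the slide resolves under the configured root.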
