Toolfactory foam mar21_2013

Bah! Humbug! The embedded movies do not work. Gak.
This slide show was NOT presented during the FOAM meeting, as the PC was being used to futz with the new Cloudman instance so I could use it for the demo.

  1. Bioinformatic Alchemy 101: Transmuting dark script matter into reusable tools. Ross Lazarus, BakerIDI
  2. Context: bioinformatic analyses. Big data; complex analyses. Repeatable, automated pipelines. Reproducibility real goal. Reproducibility is hard.
  3. Frameworks. E.g. VGL. Local SOPs for biologists. Tools, canned workflows. Minimise opportunities for error. Maximise reproducibility.
  4. In real life. 90/10 rule. Need to tweak SOPs. Trivial disposable scripts. Not documented or curated. Not reliably available to re-run. “Dark script matter”.
  5. Dark Script Matter. Outside usual VCS/pipelines. Manual =/= reproducible. Necessary evil? Platform extensions complex. E.g. Galaxy – hours of work.
  6. Plan. Context: Reproducible analyses. Frameworks vs Dark Scripts. Alchemy: script to Galaxy tool. Demonstration. Summary. Conclusions.
  7. Galaxy Tool Factory. An installable Galaxy tool. Runs scripts: Python, R, Perl, sh. Generates new Galaxy tools. Tool code wraps the script. Minutes – not hours.
  8. Galaxy Tool Shed. Separate server. Stores/serves Galaxy tools. Admin can install to Galaxy. Mercurial VCS archives. Explicit tool versioning. Sharing and reproducibility.
  9. Demo 1: Install the Tool Factory
  10. Demo 2: Create a new tool
  11. Demo 3: Quick install and test
  12. Prepare script. Python; R; Perl; sh. Parse command-line params: 1 = input, 2 = output. Typically workflow transformations. Arbitrary complexity. Simple example: write transpose of a tabular file.
  13. Prepare/upload test data. SMALL sample input. Becomes functional test case.
          h1   h2   h3   h4
          r11  r12  r13  r14
          r21  r22  r23  r24
  14. Example R script:
          # transpose a tabular input file and write it as a tabular output file
          ourargs = commandArgs(TRUE)
          inf = ourargs[1]    # first argument: input file path
          outf = ourargs[2]   # second argument: output file path
          inp = read.table(inf, header=FALSE, row.names=NULL, sep="\t")
          outp = t(inp)
          write.table(outp, outf, quote=FALSE, sep="\t", row.names=FALSE, col.names=FALSE)
  15. Demo part 1: As an admin, test run the code
  16. Use Redo button; Generate. When working right: Use Redo to save retyping. Select Generate option. Provide tool ID, help text. Execute. Expect a toolfactory.gz in history. Copy link (floppy disk icon).
  17. What's in the toolshed.gz? A gzipped Mercurial tool repository (!). Auto-generated tool XML file. Auto-generated tool Python wrapper. Functional test case: the sample data. Familiar Galaxy tool for all users. Executes your script over their data. Interoperably inside Galaxy.
  18. Upload TS gzip to new repository. Upload to any tool shed. Create new repo; sensible name! Choose Upload files to new repo. Paste URL (floppy disk save icon). New tool ready to install.
  19. Install and Test New Tool. Back to Galaxy admin interface. Browse local tool shed. Choose new tool. Install to local Galaxy. Try it out. Run functional test.
  20. Summary. GTF = script to tool in minutes. Integrated with Galaxy and TS. Simple workflow components. If needed, generate simple tool. Then add parameters manually.
  21. Tool Factory Operation Guide.
          Inputs: Script (Python, R, Perl, sh); Sample Input for functional test.
          Tool Factory Tool Form: Paste script; Upload/paste sample input; Test run; Check outputs; Rerun/fix; Generate TS gzip; Copy download link for Tool Shed pasting.
          Repository: Create new repository. Upload files – paste TS gzip link and upload.
          Galaxy: Install new tool from toolshed from Galaxy admin page; Test; Functional test.
  22. GALAXY http://usegalaxy.org
  23. Galaxy Tool Factory. Generate a new Galaxy tool. From a Python, R, Perl or bash script. Using a Galaxy tool. Via a Tool Shed.
          # transpose a tabular input file and write as a tabular output file
          ourargs = commandArgs(TRUE)
          inf = ourargs[1]
          outf = ourargs[2]
          inp = read.table(inf, header=FALSE, row.names=NULL, sep="\t")
          outp = t(inp)
          write.table(outp, outf, quote=FALSE, sep="\t", row.names=FALSE, col.names=FALSE)
  24. Tool Factory Operation Guide.
          Inputs: Script – R, Perl, Python; Sample Input for functional test.
          Tool Factory Tool Form: Paste script; Upload/paste sample input; Test run; Check outputs; Rerun/fix; Generate TS gzip; Copy download link for Tool Shed pasting.
          Repository: Create new repository. Upload files – paste TS gzip link and upload.
          Galaxy: Install new tool from toolshed from Galaxy admin page; Test; Functional test.
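
The slides give the transpose example in R only, but the same command-line convention (argument 1 is the input file, argument 2 is the output file) should carry over to the other script languages the Tool Factory runs. Below is a minimal Python sketch of the same transformation. It is an illustration, not the author's script: the function name `transpose_tabular` is hypothetical, and it assumes a rectangular, tab-separated table like the small sample input in the transcript.

```python
import sys


def transpose_tabular(in_path, out_path):
    """Read a tab-separated file and write its transpose, also tab-separated.

    Hypothetical helper for illustration; assumes every row has the same
    number of columns (zip() would silently truncate to the shortest row).
    """
    with open(in_path) as handle:
        # Skip blank lines; split each remaining line into its tab-separated cells
        rows = [line.rstrip("\n").split("\t") for line in handle if line.strip()]
    with open(out_path, "w") as handle:
        # zip(*rows) yields the columns of the input, which become output rows
        for column in zip(*rows):
            handle.write("\t".join(column) + "\n")


# Convention from the slides: argument 1 is the input, argument 2 is the output.
# The length check keeps the file importable without command-line arguments.
if __name__ == "__main__" and len(sys.argv) == 3:
    transpose_tabular(sys.argv[1], sys.argv[2])
```

Run as `python transpose.py input.tab output.tab` (both file names are placeholders). With the three-row sample input from the slides, the output has four rows, the first being `h1`, `r11`, `r21` separated by tabs.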
