Developing an open source community for cloud bioinformatics

Talk for Amazon workshop:

http://aws.amazon.com/genomics_workshop/

Transcript

  • 1. Developing an open source community for cloud bioinformatics
       Brad Chapman
       http://bcbio.wordpress.com/
       8 June 2010
  • 2. Overview
       1. Building open source bioinformatics communities is hard.
       2. Developer resources are a productive target.
       3. Framework: collaborative software images and data snapshots.
  • 3. Motivation
       Open source
         OpenBio, Biopython
         Graduate school – developed distributed algorithm. Never reused.
       Work
         Startup: Automated biological pipelines.
         Research hospital: Democratization of analysis.
  • 4. Filters in biological computing
       Working in same biological area
       Interest in developing open source code
       Technical abilities
       Your software is good enough
  • 5. Successful bioinformatics
       Sean Eddy, HMMER
       "...the best software in the field is often an unplanned labor of love from a single investigator."
       http://selab.janelia.org/people/eddys/blog/?p=313
  • 6. Recognizing contributions
  • 7. Successful community projects
       OpenBio: BioPerl, Biopython, BioJava
       Bioconductor
       Common theme
         Aimed at developers.
         Biologists benefit indirectly.
  • 8. Lowering activation energy
  • 9. Establishing common platform
       The solution = to all our problems
       Remove install and distribution barriers
       Building block for scaling
  • 10. Existing cloud bioinformatics work
        JCVI Cloud BioLinux
        bioperl-max
        MachetEC2
        Debian Med
        Overlapping set of useful functionality.
  • 11. Integrated community solution
        Inclusive but configurable
        Easy to contribute
        Automated
        Bootstrap bare machine to fully ready distributed AMI.
        http://github.com/chapmanb/bcbb/tree/master/ec2/biolinux/
  • 12. Inclusive but configurable
        # Top level YAML configuration file specifying
        # groups of programs to be installed.
        packages:
          - python
          - r
          - erlang
          - databases
          - viz
          - bio_search
          - bio_alignment
          - bio_nextgen
          - bio_sequencing
          - bio_visualization
          - phylogeny
        libraries:
          - r-libs
          - python-libs
  • 13. Easy to contribute
        # Configuration file defining R specific libraries that
        # are installed via CRAN and Bioconductor.
        cranrepo: http://software.rc.fas.harvard.edu/mirrors/R/
        cran:
          - ggplot2
          - rjson
          - sqldf
          - NMF
          - ape
        biocrepo: http://bioconductor.org/biocLite.R
        bioc:
          - ShortRead
          - BSgenome
          - edgeR
          - GOstats
          - biomaRt
          - Rsamtools
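  • 14. Automated
        # Slide excerpt: the helper functions called below are not shown here.
        from fabric.api import sudo

        def install_biolinux():
            ec2_ubuntu_environment()
            pkg_install, lib_install = _read_main_config()
            _apt_packages(pkg_install)
            _do_library_installs(lib_install)

        def _ruby_library_installer(config):
            # Install each configured Ruby gem on the remote machine.
            for gem in config['gems']:
                sudo("gem install %s" % gem)

        Fabric: http://docs.fabfile.org/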
  • 15. Ready to use biological data
        % ls /referenceGenomes/
        Athaliana  Celegans  Dmelanogaster  Ecoli  Hsapiens  Mmusculus  Msmegmatis
        Mtuberculosis_H37Rv  Paeruginosa_UCBPP-PA14  phiX174  Rnorvegicus
        Scerevisiae  Xtropicalis
        % ls Hsapiens/hg18
        arachne  bowtie  bwa  eland  maq  seq  snps  ucsc
        http://github.com/chapmanb/bcbb/blob/master/galaxy/galaxy_fabfile.py
  • 16. Organization: Codefest 2010
        www.open-bio.org/wiki/Codefest_2010
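
Slides 12 to 14 describe a configuration-driven install: plain YAML files name what to install, and a Fabric script turns those names into commands. The sketch below illustrates that idea only and is not the bcbb code. The function names `parse_simple_config`, `cran_install_statement`, and `gem_commands` are hypothetical, and so is the `example_gem` placeholder. The parser handles just the flat `key: value` and `- item` forms that appear on the slides (a real script would more likely use a YAML library), and commands are returned as strings instead of being run through Fabric's `sudo`.

```python
# Hypothetical sketch of the config-driven install idea from slides 12-14.
# None of these function names come from the bcbb repository.

# Excerpt of the R library configuration shown on slide 13.
R_LIBS_CONFIG = """
# Configuration file defining R specific libraries that
# are installed via CRAN and Bioconductor.
cranrepo: http://software.rc.fas.harvard.edu/mirrors/R/
cran:
  - ggplot2
  - rjson
  - sqldf
  - NMF
  - ape
"""


def parse_simple_config(text):
    """Parse flat 'key: value' and 'key:' + '- item' lines into a dict.

    Assumption: only the small YAML subset seen on the slides is supported.
    """
    config = {}
    current = None  # key of the list currently being filled, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and full-line comments
        if line.startswith("- "):
            if current is None:
                raise ValueError("list item outside a list: %r" % raw)
            config[current].append(line[2:].strip())
            continue
        # Split on the first colon only, so URLs in values stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError("unsupported line: %r" % raw)
        key, value = key.strip(), value.strip()
        if value:
            config[key] = value
            current = None
        else:
            config[key] = []
            current = key
    return config


def cran_install_statement(config):
    """Build one R install.packages() call for the configured CRAN packages."""
    names = ", ".join('"%s"' % name for name in config["cran"])
    return 'install.packages(c(%s), repos="%s")' % (names, config["cranrepo"])


def gem_commands(config):
    """Mirror the loop on slide 14, returning commands instead of running them."""
    return ["gem install %s" % gem for gem in config.get("gems", [])]


if __name__ == "__main__":
    r_config = parse_simple_config(R_LIBS_CONFIG)
    print(cran_install_statement(r_config))
    # "example_gem" is a placeholder, not a package named in the talk.
    for command in gem_commands({"gems": ["example_gem"]}):
        print(command)
```

Returning command strings keeps the sketch runnable without a remote machine; passing each string to Fabric's `sudo`, as slide 14 does for gems, would be the step that actually installs anything.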
