Ntino Krampis GSC 2011

“Cloud BioLinux: Standardized, Pre-Configured and On-Demand Computing for Genomics and Beyond”. Genomics Standards Consortium Conference 2010, European Bioinformatics Institute, Hinxton, UK



  • 1. Cloud BioLinux: Standardized, Pre-Configured and On-Demand Computing for Genomics and Beyond
    Ntino Krampis, PhD
    GSC 2011, Hinxton, UK
  • 2. Expensive sequencing and large organizations
      ● large sequencing centers, multi-million, broad-impact sequencing projects
      ● dedicated bioinformatics department, coordination with other centers
    Commodity sequencing and small labs
      ● small-factor, bench-top sequencer available: GS Junior by 454
      ● sequencing as a standard technique in basic biology and genetics research
      ● RNA-seq and ChIP-seq, and each biologist will be tackling a metagenome
  • 3. “Bioinformatics nation is a land of city-states” (Lincoln Stein)
      ● smaller labs building small-scale bioinformatics infrastructures
      ● duplication of effort in compiling and installing software tools
      ● some labs have no hardware, expertise, or time to install and run software
      ● early pioneer in this area was NEBC BioLinux (tinyurl.com/BioLinux-NEBC)
      ● desktop Linux with 100+ pre-configured bioinformatics tools
      ● examples: glimmer, hmmer, phylip, rasmol, genespring, clustalw, EMBOSS
    How about large-scale sequence datasets?
  • 4. Cloud BioLinux: standardized, pre-configured and on-demand bioinformatics computing on the cloud
      ● JCVI's cloud computing expertise
      ● NEBC's bioinformatics software repository
      ● community effort – ISMB / BOSC 2010
      ● standardized, pre-configured Virtual Machine (VM, image)
      +
      ● VM: emulates a computer server, encapsulates operating system, software libraries and bioinformatics tools
      ● Amazon EC2: computational capacity as a utility, on-demand
      ● rich interface through a remote desktop client
      =
    tinyurl.com/CloudBioLinux-JCVI
    http://cloudbiolinux.com
  • 5. Cloud BioLinux and Genomic Standards
    framework to distribute bioinformatics tools, data and analysis results
    create cloud VM / images with standardized software configurations
      ● customize Cloud BioLinux VMs, based on community requirements
      ● share customized VMs with collaborators, avoiding effort duplication
      ● mix and match software from NEBC or other (DebianMed, Scientific Linux etc.)
    whole system snapshot exchange (Dudley and Butte 2010)
      ● capture the state of the computing system and data
      ● software execution parameters and “massaged” input datasets
      ● save into cloud VM / image and share along with analysis results
    democratize access to computing resources
      ● large-scale computing independently of institutional or geographic boundaries
      ● only need a desktop computer with internet access
  • 6. Cloud BioLinux and Genomic Standards
    create cloud VM / images with standard software configurations
      ● framework to describe software components in cloud VM / image
      ● based on python-fabric automated deployment tool
      ● software components listed in simple text files
      ● edit the files to mix and match software according to your community needs
      ● community members use files to share descriptions of customized systems
      ● start with a bare-bones VM, fabric downloads and installs specified software
      ● labs with sensitive data and capacity for private clouds: works identically on Amazon EC2 or Eucalyptus open-source cloud
    tinyurl.com/python-fabric
    open.eucalyptus.com
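The idea of listing software components in simple text files can be illustrated with a minimal sketch. The file format below (one package name per line, `#` for comments), the function names, and the example package selection are assumptions made for illustration; they are not taken from the Cloud BioLinux scripts themselves. The sketch only builds the install command for a Debian/Ubuntu-style VM; a deployment tool such as python-fabric would then be responsible for running it on the remote machine.

```python
def parse_package_list(text):
    """Return package names from a plain-text list.

    Assumed (hypothetical) format: one package per line, '#' starts a comment.
    Duplicates are dropped while keeping the original order.
    """
    packages = []
    for line in text.splitlines():
        name = line.split("#", 1)[0].strip()
        if name and name not in packages:
            packages.append(name)
    return packages


def install_command(packages):
    """Build a single apt-get command string (assumes a Debian/Ubuntu VM)."""
    if not packages:
        return ""
    return "apt-get install -y " + " ".join(packages)


# Hypothetical package list, as a community member might edit and share it.
EXAMPLE_LIST = """\
# sequence analysis tools (illustrative selection)
phylip
clustalw
hmmer   # profile HMM searches
"""

if __name__ == "__main__":
    print(install_command(parse_package_list(EXAMPLE_LIST)))
```

Because the list is plain text, sharing a customized system amounts to sharing the edited file.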
  • 7. Software domains in bioinformatics: next-gen sequencing, de novo assembly, annotation, phylogeny, molecular structures, gene expression analysis
    high-level configuration describing software groups
    for each group, individual bioinformatics tools
    tinyurl.com/CloudBioLinux-github
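The two-level layout on this slide (software groups at the top, individual tools under each group) can be sketched as a small lookup. The group names follow the domains named on the slide, but the assignment of tools to groups, the dictionary, and the function name are hypothetical; the actual configuration files are in the repository linked from the slide.

```python
# Hypothetical high-level configuration: group name -> individual tools.
SOFTWARE_GROUPS = {
    "phylogeny": ["phylip", "clustalw"],
    "annotation": ["glimmer", "hmmer"],
    "molecular_structures": ["rasmol"],
}


def resolve_tools(selected_groups, groups=SOFTWARE_GROUPS):
    """Expand the selected groups into a flat, de-duplicated list of tools."""
    tools = []
    for group in selected_groups:
        if group not in groups:
            raise KeyError(f"unknown software group: {group}")
        for tool in groups[group]:
            if tool not in tools:
                tools.append(tool)
    return tools


if __name__ == "__main__":
    # A lab interested only in phylogeny and annotation selects two groups.
    print(resolve_tools(["phylogeny", "annotation"]))
```

Keeping group selection separate from the per-group tool lists lets a community switch whole domains on or off without touching the tool lists.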
  • 8. Cloud BioLinux and Genomic Standards
    whole system snapshot exchange
    simply sign up at aws.amazon.com, then go to aws.amazon.com/console and http://tinyurl.com/cloud-biolinux-tutorial
  • 9. Cloud BioLinux and Genomic Standards
    whole system snapshot exchange
    find Cloud BioLinux using ID
    enter desired password for remote desktop login
    all other default
    http://tinyurl.com/cloud-biolinux-tutorial
  • 10. free remote desktop client: nomachine.com/download.php
    simply enter VM IP address and your password
  • 11. What if I want to share my alignments with a collaborator?
    save your data as a new VM
    $0.10 / GB / month; at 15 GB, it costs $1.50 / month
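The storage arithmetic on this slide is a single multiplication, shown here as a minimal sketch. The rate is the one quoted on the slide and may not reflect current pricing; the function and constant names are illustrative.

```python
from decimal import Decimal

# Storage rate quoted on the slide, in USD per GB per month.
PRICE_PER_GB_MONTH = Decimal("0.10")


def monthly_storage_cost(size_gb, price=PRICE_PER_GB_MONTH):
    """Monthly cost in USD of storing a saved VM image of the given size."""
    return Decimal(str(size_gb)) * price


if __name__ == "__main__":
    # The 15 GB example from the slide.
    print(monthly_storage_cost(15))
```

`Decimal` is used instead of floats so that currency amounts are not affected by binary rounding.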
  • 12. Cloud BioLinux and Genomic Standards
    whole system snapshot exchange
    share your analysis results: publicly or only with your collaborators
    authorized users can access the cloud VM / image with all the software, data, analysis results
  • 13. Cloud BioLinux and Genomic Standards
    whole system snapshot exchange
    [diagram] researcher A: start VM / image, perform analysis, snapshot, share
    [diagram] researcher B: start VM / image, perform analysis, snapshot, share
  • 14. Cloud BioLinux: the future
      ● expand community, receive feedback, add more software to the VM
      ● analysis pipelines that are used by large sequencing centers
      ● actively seeking funding to put major effort in development
      ● 2011 ISMB/BOSC in Vienna, Austria, http://metalab.at/
      ● tinyurl.com/cloudbiolinux-lists or community@cloudbiolinux.com
  • 15. Acknowledgments & Credits
    Brad Chapman - development of the fabric scripts and community organizer
    Tim Booth, Bela Tiwari, Dawn Field – BioLinux 6.0 development and EC2 documentation
    Deepak Singh and AWS - education grant supporting ISMB / BOSC workshop
    Justin Johnson – community and sponsorship of cloudbiolinux.com
    J. Craig Venter Inst. - time allowed to work on an open-source project
    D. Gomez, E. Navarro, J. Shao, I. Singh – JCVI technology innovation
    Members of the Cloud BioLinux community: Enis Afgan, Michael Heuer, Richard Holland, Mark Jensen, Dave Messina, Steffen Möller, Roman Valls
    Thank you!