Cloud ntino-krampis
 

Usage Rights

© All Rights Reserved

    Cloud ntino-krampis Presentation Transcript

    • Cloud BioLinux: pre-configured and on-demand computing for genomics, independent of institutional, geographic or economic boundaries. Ntino Krampis, PhD. JCVI-NIAID workshop 2011, S. Africa
    • Expensive sequencing and large organizations
      ● large sequencing centers; multi-million, broad-impact sequencing projects
      ● dedicated bioinformatics department, coordination with other centers
      Commodity sequencing and small labs
      ● small form-factor, bench-top sequencers available: GS Junior by 454
      ● sequencing as a standard technique in basic biology and genetics research
      ● RNA-seq and ChIP-seq; each biologist will be tackling a metagenome
    • Acquiring the sequence data is only the first step
      ● downstream bioinformatics analysis for scientific discovery
      ● many commonly used bioinformatics tools are difficult to install
      ● usually available only as source code, which needs technical expertise
      ● large-scale sequence data analysis requires high-performance and expensive computing hardware
    • Alternative: computational capacity on the cloud
      ● Cloud Computing: large-scale, high-performance computers accessible through the Internet
      ● Example: using Gmail, Google Docs, Yahoo! Mail, Facebook etc. you store and access data on a remote computer
      ● Cloud Computing services such as Amazon EC2 (http://aws.amazon.com/ec2) rent high computational and data storage capacity on remote computers
    • How does Cloud Computing work?
      ● operating system, bioinformatics software and data are installed in a Virtual Machine (VM)
      ● a VM is uploaded and executed on a cloud computing service
      ● run a practically unlimited number of VMs for large-scale sequence data analysis
      ● access the VM from a desktop computer through the Internet
      [Diagram: local desktop computers connect through the Internet to VMs on the remote Amazon EC2 Cloud Computing service]
    • Cloud BioLinux
      ● Cloud BioLinux leverages VM technology and the cloud, offering pre-configured bioinformatics computing
      ● allows setting up a high-performance data analysis environment without any technical expertise
      ● researchers can perform large-scale data analysis by simply using a desktop computer with Internet access
      ● accessible without any institutional, economic or national boundaries
    • Launching Cloud BioLinux
      1. sign up for an Amazon EC2 cloud account: http://aws.amazon.com/ec2
         You can also connect an existing account from the main Amazon.com website for the cloud usage charges. We have an account ready for you: Username: aws_nhgri@jcvi.org Password: Nhg4|CL0ud!
      2. using the account credentials, sign in to the EC2 cloud console (select EC2 in the dropdown menu below the sign-in button): http://aws.amazon.com/console
      3. launch Cloud BioLinux through the cloud console wizard
    • Launching Cloud BioLinux: click the button at http://aws.amazon.com/console
    • Launch instance wizard: steps 1 & 2
      1. specify the Cloud BioLinux identifier under the “Community AMIs” tab
      2. choose the computational capacity: memory, processor, CPU cores
    • Launch instance wizard: step 3
      3. specify a login password for the Cloud BioLinux desktop in the “User Data” box
      4. remaining steps: leave everything as default and keep clicking the “Continue” button until the wizard finishes and you are back at the console
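The wizard choices on these slides (image identifier, computational capacity, User Data password) correspond to the parameters of a programmatic launch. The following is a minimal Python sketch, assuming the boto3 library and its `run_instances` call; the AMI ID `ami-00000000`, the instance type, the region and both helper names are hypothetical placeholders that do not come from the slides, and passing the bare password as User Data is an assumption about the format the image expects.

```python
def build_launch_request(ami_id, instance_type, desktop_password):
    """Collect the wizard choices as keyword arguments for EC2 run_instances."""
    if not desktop_password:
        raise ValueError("a desktop login password is required")
    return {
        "ImageId": ami_id,                # wizard step 1: Community AMI identifier
        "InstanceType": instance_type,    # wizard step 2: computational capacity
        "MinCount": 1,                    # launch exactly one instance
        "MaxCount": 1,
        "UserData": desktop_password,     # wizard step 3: assumed User Data format
    }


def launch(request, region="us-east-1"):
    """Send the request to EC2; needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder above works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.run_instances(**request)
    return response["Instances"][0]["InstanceId"]


# Placeholder values: substitute the real Cloud BioLinux AMI identifier.
request = build_launch_request("ami-00000000", "m1.large", "workshop")
```

Calling `launch(request)` would start a billable instance, so the sketch only builds the request by default.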
    • Launching Cloud BioLinux: back at the console after completing the wizard
      Pick a running instance, select it with your mouse and copy its “Public DNS” address (the Cloud BioLinux server address on the cloud)
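Copying the Public DNS address by hand can also be scripted. A small Python sketch, assuming a response shaped like the one returned by boto3's `describe_instances` (nested `Reservations` and `Instances` with `State` and `PublicDnsName` fields); the function name and the sample host names are illustrative only.

```python
def running_public_dns(response):
    """Return the public DNS names of all running instances in the response."""
    names = []
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            is_running = instance.get("State", {}).get("Name") == "running"
            dns = instance.get("PublicDnsName")
            if is_running and dns:
                names.append(dns)
    return names


# Hand-written sample mimicking the response structure; not real hosts.
sample = {
    "Reservations": [
        {"Instances": [
            {"State": {"Name": "running"},
             "PublicDnsName": "ec2-203-0-113-10.compute-1.amazonaws.com"},
            {"State": {"Name": "pending"}, "PublicDnsName": ""},
        ]},
    ]
}
addresses = running_public_dns(sample)
```

An instance that is still booting has no usable address yet, which is why only instances in the `running` state are kept.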
    • While waiting for Cloud BioLinux to boot up...
      ● examples of NCBI public datasets on EC2
      ● bringing the data to the compute
    • Final step: connecting remotely to Cloud BioLinux
      Click the NX client icon on your computer's desktop
      A. paste the DNS address in the “Host” box
      B. select “Unix”, “Gnome”, and the remote desktop size
      C. at login, “ubuntu” is the default user and “workshop” is the password we set
    • What if I want to share my alignments with a collaborator?
      Save your data as a new VM: $0.10 / GB / month; at 15 GB, it costs $1.50 / month
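The storage cost on this slide is a simple product of size and price. A short Python sketch of that arithmetic, using the slide's rate of $0.10 per GB per month as the default; the function name is illustrative and current cloud prices may differ.

```python
def monthly_storage_cost(size_gb, price_per_gb_month=0.10):
    """Monthly cost in dollars of storing a saved VM of the given size."""
    if size_gb < 0:
        raise ValueError("size_gb must be non-negative")
    return size_gb * price_per_gb_month


cost = monthly_storage_cost(15)          # the slide's 15 GB example
label = f"${cost:.2f} per month"         # "$1.50 per month"
```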
    • Cloud BioLinux whole system snapshot exchange
      ● share your analysis results: publicly or only with your collaborators
      ● authorized users can access the cloud VM/image with all the software, data and analysis results
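The choice between sharing publicly and sharing only with collaborators can be expressed as a launch-permission setting on the saved image. A hedged Python sketch, assuming boto3's `modify_image_attribute` call and its `LaunchPermission` parameter; the helper names, the region and the 12-digit account ID are hypothetical placeholders.

```python
def launch_permission(public=False, account_ids=()):
    """Build a LaunchPermission payload for a saved machine image."""
    if public:
        # Anyone with an AWS account may launch the image.
        return {"Add": [{"Group": "all"}]}
    if not account_ids:
        raise ValueError("give at least one collaborator account ID")
    # Only the listed collaborator accounts may launch the image.
    return {"Add": [{"UserId": account_id} for account_id in account_ids]}


def share_image(image_id, permission, region="us-east-1"):
    """Apply the permission; needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder above works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    ec2.modify_image_attribute(ImageId=image_id, LaunchPermission=permission)


private = launch_permission(account_ids=["111122223333"])  # placeholder ID
everyone = launch_permission(public=True)
```

Requiring an explicit `public=True` keeps an image from being shared with everyone by accident.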
    • Cloud BioLinux and Genomic Standards: whole system snapshot exchange
      [Diagram: researcher A and researcher B each cycle through start VM / image, perform analysis, snapshot, share]
    • Acknowledgments & Credits
      Brad Chapman - development of the fabric scripts and community organizer
      Tim Booth, Bela Tiwari, Dawn Field - BioLinux 6.0 development and EC2 documentation
      Deepak Singh and AWS - education grant supporting ISMB / BOSC workshop
      Justin Johnson - community and sponsorship of cloudbiolinux.com
      J. Craig Venter Inst. - time allowed to work on an open-source project
      D. Gomez, E. Navarro, J. Shao, I. Singh - JCVI technology innovation
      Members of the Cloud BioLinux community: Enis Afgan, Michael Heuer, Richard Holland, Mark Jensen, Dave Messina, Steffen Möller, Roman Valls
      Thank you!