Cloud ntino-krampis
Published in: Technology
Transcript

  • 1. Cloud BioLinux: pre-configured and on-demand computing for genomics, independent of institutional, geographic or economic boundaries
      Ntino Krampis, PhD
      JCVI-NIAID workshop 2011, S. Africa
  • 2. Expensive sequencing and large organizations
      ● large sequencing centers; multi-million, broad-impact sequencing projects
      ● dedicated bioinformatics department, coordination with other centers
     Commodity sequencing and small labs
      ● small form-factor, bench-top sequencers are available: GS Junior by 454
      ● sequencing as a standard technique in basic biology and genetics research
      ● RNA-seq and ChIP-seq, and each biologist will be tackling a metagenome
  • 3. Acquiring the sequence data is only the first step
      ● downstream bioinformatics analysis is needed for scientific discovery
      ● many commonly used bioinformatics tools are difficult to install
      ● tools are usually available only as source code, which requires technical expertise
      ● large-scale sequence data analysis requires high-performance, expensive computing hardware
  • 4. Alternative: computational capacity on the cloud
      ● Cloud Computing: large-scale, high-performance computers accessible through the Internet
      ● Example: using Gmail, Google Docs, Yahoo! Mail, Facebook etc. you store and access data on a remote computer
      ● Cloud Computing services such as Amazon EC2 (http://aws.amazon.com/ec2) rent high computational and data storage capacity on remote computers
  • 5. How does Cloud Computing work?
      ● operating system, bioinformatics software and data are installed in a Virtual Machine (VM)
      ● a VM is uploaded and executed on a cloud computing service
      ● run a practically unlimited number of VMs for large-scale sequence data analysis
      ● access the VM from a desktop computer through the Internet
      [Diagram: local desktop computers connect through the Internet to VMs running on the remote Amazon EC2 Cloud Computing service]
  • 6. Cloud BioLinux
      ● Cloud BioLinux leverages VM technology and the cloud, offering pre-configured bioinformatics computing
      ● allows setting up a high-performance data analysis environment without any technical expertise
      ● researchers can perform large-scale data analysis simply by using a desktop computer with Internet access
      ● accessible without any institutional, economic or national boundaries
  • 7. Launching Cloud BioLinux
      1. sign up for an Amazon EC2 cloud account: http://aws.amazon.com/ec2
         You can also link an existing account from the main Amazon.com website to pay the cloud usage charges.
         We have an account ready for you:
         Username: aws_nhgri@jcvi.org
         Password: Nhg4|CL0ud!
      2. using the account credentials, sign in to the EC2 cloud console (select EC2 in the dropdown menu below the sign-in button): http://aws.amazon.com/console
      3. launch Cloud BioLinux through the cloud console wizard
  • 8. Launching Cloud BioLinux
      Click the button: http://aws.amazon.com/console
  • 9. Launch instance wizard: steps 1 & 2
      1. specify the Cloud BioLinux identifier under the “Community AMIs” tab
      2. choose the computational capacity: memory, processor, CPU cores
  • 10. Launch instance wizard: step 3
      3. specify a login password for the Cloud BioLinux desktop in the “User Data” box
      4. remaining steps: leave everything as default and keep clicking the “Continue” button until the wizard finishes and you are back at the console
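The wizard choices above (AMI identifier, capacity, password in “User Data”) can also be written down as a launch request. The following is a minimal sketch, not the workshop's method: the slides only use the web console. The function name, the placeholder AMI ID `ami-00000000`, the instance type `m1.large`, and the idea of passing the password as plain User Data text are all assumptions for illustration.

```python
def build_launch_request(ami_id, instance_type, desktop_password):
    """Collect the launch-wizard choices into one request dictionary.

    The keys follow the parameter names of the EC2 RunInstances call.
    How Cloud BioLinux reads the password from User Data is not stated
    in the slides, so passing it as raw text is an assumption.
    """
    if not ami_id.startswith("ami-"):
        raise ValueError("AMI identifiers start with 'ami-'")
    if not desktop_password:
        raise ValueError("a desktop login password is required")
    return {
        "ImageId": ami_id,             # step 1: Cloud BioLinux identifier
        "InstanceType": instance_type, # step 2: computational capacity
        "UserData": desktop_password,  # step 3: desktop login password
        "MinCount": 1,                 # launch exactly one instance
        "MaxCount": 1,
    }


# Hypothetical values; replace with the real Cloud BioLinux AMI identifier.
request = build_launch_request("ami-00000000", "m1.large", "workshop")
```

If the boto3 library is available (an assumption, it is not part of the workshop material), such a dictionary could be passed on as `boto3.client("ec2").run_instances(**request)`.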
  • 11. Launching Cloud BioLinux
      Back at the console after completing the wizard: pick a running instance, select it with your mouse and copy its “Public DNS” address (the Cloud BioLinux server address on the cloud)
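Picking a running instance and copying its Public DNS address can be sketched in code as well. This example assumes instance data shaped like the response of the EC2 DescribeInstances call (`Reservations`, `Instances`, `State`, `PublicDnsName`); the helper name and the sample host names are made up for illustration.

```python
def running_public_dns(describe_response):
    """Return the Public DNS names of all instances in the 'running' state."""
    names = []
    for reservation in describe_response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            state = instance.get("State", {}).get("Name")
            dns = instance.get("PublicDnsName")
            # Instances that are still booting or stopped have no usable address.
            if state == "running" and dns:
                names.append(dns)
    return names


# Made-up sample data in the DescribeInstances response shape.
sample = {
    "Reservations": [
        {"Instances": [
            {"State": {"Name": "running"},
             "PublicDnsName": "ec2-203-0-113-25.compute-1.amazonaws.com"},
            {"State": {"Name": "pending"}, "PublicDnsName": ""},
        ]},
    ]
}
hosts = running_public_dns(sample)
```

The first entry of `hosts` is the address to paste into the remote desktop client.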
  • 12. While waiting for Cloud BioLinux to boot up...
      ● examples of NCBI public datasets on EC2
      ● bringing the data to the compute
  • 13. Final step: connecting remotely to Cloud BioLinux
      Click the NX client icon on your computer's desktop, then:
      A. paste the DNS address in the “Host” box
      B. select “Unix”, “Gnome” and the remote desktop size
      C. Login: “ubuntu” is the default user; “workshop” is the password we set
  • 14. What if I want to share my alignments with a collaborator?
      ● save your data as a new VM
      ● storage costs $0.10 / GB / month; at 15 GB, it costs $1.50 / month
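The storage arithmetic on this slide is a single multiplication. A small sketch, using the $0.10 per GB per month rate quoted on the slide (current prices may differ); the function name is illustrative:

```python
from decimal import Decimal

# Rate quoted on the slide, in US dollars per GB per month.
RATE_PER_GB_MONTH = Decimal("0.10")


def monthly_storage_cost(size_gb, rate=RATE_PER_GB_MONTH):
    """Monthly cost in dollars of keeping a saved VM of the given size."""
    return Decimal(str(size_gb)) * rate


cost = monthly_storage_cost(15)  # the 15 GB example from the slide
```

`Decimal` is used instead of floating point so that currency amounts stay exact.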
  • 15. Cloud BioLinux whole system snapshot exchange
      ● share your analysis results: publicly or only with your collaborators
      ● authorized users can access the cloud VM/image with all the software, data and analysis results
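The two sharing modes above (public, or only authorized collaborators) map onto the launch permissions of a saved image. A hedged sketch that only builds the request, using the parameter names of the EC2 ModifyImageAttribute call; the helper name, the image ID and the 12-digit account ID are placeholders, and the slides themselves do not show this step in code.

```python
def build_share_request(image_id, account_ids=(), public=False):
    """Build launch-permission settings for sharing a saved VM image.

    public=True grants access to everyone; otherwise only the listed
    AWS account IDs (the collaborators) are authorized.
    """
    if public:
        grants = [{"Group": "all"}]
    elif account_ids:
        grants = [{"UserId": account} for account in account_ids]
    else:
        raise ValueError("name at least one collaborator or set public=True")
    return {"ImageId": image_id, "LaunchPermission": {"Add": grants}}


# Placeholder identifiers for illustration only.
private_share = build_share_request("ami-00000000", account_ids=["123456789012"])
public_share = build_share_request("ami-00000000", public=True)
```

Requiring an explicit `public=True` avoids publishing results by accident. With boto3 (an assumption), the dictionary could be passed as `boto3.client("ec2").modify_image_attribute(**private_share)`.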
  • 16. Cloud BioLinux and Genomic Standards: whole system snapshot exchange
      [Diagram: researcher A and researcher B exchange whole-system snapshots; labeled steps on each side: start VM / image, perform analysis, snapshot, share]
  • 17. Acknowledgments & Credits
      Brad Chapman: development of the fabric scripts and community organizer
      Tim Booth, Bela Tiwari, Dawn Field: BioLinux 6.0 development and EC2 documentation
      Deepak Singh and AWS: education grant supporting ISMB / BOSC workshop
      Justin Johnson: community and sponsorship of cloudbiolinux.com
      J. Craig Venter Inst.: time allowed to work on an open-source project
      D. Gomez, E. Navarro, J. Shao, I. Singh: JCVI technology innovation
      Members of the Cloud BioLinux community: Enis Afgan, Michael Heuer, Richard Holland, Mark Jensen, Dave Messina, Steffen Möller, Roman Valls
      Thank you!
