Introduction to CloudBurst
Published in: Technology, Business

Transcript

  • 1. CloudBurst
      • CloudBurst: Highly Sensitive Short Read Mapping with MapReduce
      • A new parallel read-mapping algorithm optimized for mapping next-generation sequencing (NGS) data to the human genome and other reference genomes
      • Applications: SNP discovery, genotyping, and personal genomics
  • 2. CloudBurst
      • It is modeled after the short read mapping program RMAP.
      • It reports either all alignments or the unambiguous best alignment for each read, with any number of mismatches or differences.
      • This level of sensitivity could be prohibitively time consuming, but CloudBurst uses the open-source Hadoop implementation of MapReduce to parallelize execution across multiple compute nodes.
  • 3. CloudBurst
      • Running time
          – scales linearly with the number of reads mapped
          – shows near-linear speedup as the number of processors increases
      • CloudBurst reduces the running time from hours to mere minutes for typical jobs that map millions of short reads to the human genome.
  • 4. Algorithm Overview
      • CloudBurst uses a seed-and-extend algorithm to map reads to a reference genome.
      • Seed
          – For a read of length r to align with at most k differences, the alignment must contain a region of length s = r/(k+1), called a seed, that exactly matches the reference.
      • Extend
          – CloudBurst attempts to extend each seed match into an end-to-end alignment with at most k mismatches or differences.
  • 5. Algorithm Overview
      • CloudBurst uses the Hadoop implementation of MapReduce to catalog and extend the seeds.
      • Map phase emits
          – all length-s k-mers from the reference sequences
          – all non-overlapping length-s k-mers from the reads
      • Shuffle phase
          – read and reference k-mers are brought together
      • Reduce phase
          – the seeds are extended into end-to-end alignments
  • 6. Algorithm Overview
  • 7. Demo
      • See Getting Started.docx
  • 8. Related Tools
      • Bowtie: Ultrafast short read alignment
      • SOAPsnp: Accurate SNP/consensus calling
      • TopHat: RNA-Seq splice junction mapper
      • Cufflinks: Isoform assembly, quantitation
      • Hadoop: Open Source MapReduce
      • CloudBurst: Sensitive MapReduce alignment
      • Crossbow: Read Mapping and SNP calling in the clouds
      • Jnomics: Cloud-Scale Sequence Analysis
      • Contrail: Cloud-based de novo assembly
      • Myrna: Cloud-Scale differential expression of RNA-seq
  • 9. Q&A
  • 10. Figure 1: A MapReduce approach for detecting genetic variants from high-throughput genome sequencing. Source: http://www.nature.com/nbt/journal/v30/n3/fig_tab/nbt.2134_F1.html
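
The seed-and-extend scheme described on slides 4 and 5 can be sketched in a few lines of Python. This is a toy, single-process illustration and not CloudBurst's Hadoop code: the function names are hypothetical, all reads are assumed to have the same length, and only mismatches (not insertions or deletions) are counted during extension. The shuffle is simulated with an in-memory dictionary keyed by seed.

```python
from collections import defaultdict


def seed_length(read_len, k):
    """Seed length s = floor(r / (k + 1)) for read length r and at most k mismatches."""
    return read_len // (k + 1)


def map_reference(ref, s):
    """Map step (reference): emit every length-s k-mer with its offset."""
    for i in range(len(ref) - s + 1):
        yield ref[i:i + s], ("ref", i)


def map_read(read_id, read, s):
    """Map step (read): emit only the non-overlapping length-s k-mers."""
    for j in range(0, len(read) - s + 1, s):
        yield read[j:j + s], ("read", read_id, j)


def shuffle(pairs):
    """Shuffle step: group all emitted values by their seed sequence."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_seed(values, ref, reads, k):
    """Reduce step: extend each shared seed into an end-to-end alignment."""
    ref_hits = [v[1] for v in values if v[0] == "ref"]
    read_hits = [(v[1], v[2]) for v in values if v[0] == "read"]
    for ref_pos in ref_hits:
        for read_id, read_off in read_hits:
            read = reads[read_id]
            start = ref_pos - read_off  # where the whole read would begin
            if start < 0 or start + len(read) > len(ref):
                continue
            window = ref[start:start + len(read)]
            mismatches = sum(a != b for a, b in zip(read, window))
            if mismatches <= k:
                yield read_id, start, mismatches


def map_reads_to_reference(ref, reads, k):
    """Run map, shuffle, and reduce in memory; return sorted unique alignments."""
    read_len = len(next(iter(reads.values())))  # assumes equal-length reads
    s = seed_length(read_len, k)
    pairs = list(map_reference(ref, s))
    for read_id, read in reads.items():
        pairs.extend(map_read(read_id, read, s))
    alignments = set()  # the same alignment can be found through several seeds
    for values in shuffle(pairs).values():
        alignments.update(reduce_seed(values, ref, reads, k))
    return sorted(alignments)


if __name__ == "__main__":
    reference = "ACGTACGTTAGCCGATAGGCTA"
    reads = {"r1": "ACGTTAGC", "r2": "CTATAGGC"}
    for read_id, start, mismatches in map_reads_to_reference(reference, reads, k=1):
        print(read_id, start, mismatches)
```

Because a read with at most k mismatches is cut into at least k+1 non-overlapping seeds, at least one seed must match the reference exactly, which is why grouping by seed is enough to find every candidate alignment. In a real Hadoop job each seed group would be handled by a separate reducer, so the extension work is spread across compute nodes.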