Alignment of raw reads in Avadis NGS



Avadis NGS provides support for aligning raw reads for small RNA, ChIP-Seq and DNA-Seq analysis. The alignment algorithm "CoBWeb" integrated with Avadis NGS is a new proprietary algorithm based on the Burrows-Wheeler Transform.

Published in: Technology, Business


  1. 1. Pioneering Scientific Intelligence. DNA/Small RNA Alignment in Avadis NGS 1.3. Strictly Confidential © Strand Life Sciences
  2. 2. Questions we will seek to answer in this presentation: What is an Alignment algorithm? What issues must an Alignment algorithm consider? How do Alignment algorithms work? How does CoBWeb work? How does CoBWeb compare with other algorithms? How is CoBWeb exposed in Avadis NGS? What is the future evolution of CoBWeb? © Strand
  3. 3. What is an Alignment algorithm? © Strand
  4. 4. Subject’s Genome: AGGCTACGCATTTCCCATAAAGACCCACGCTTAAGTTC. Reference Genome: AGGCTACGCATGTCCCATAATGACCCACACTTAAGTTC. The Reference Genome is close but not quite the same as the Subject’s Genome. © Strand
  5. 5. What issues must an Alignment algorithm consider? © Strand
  6. 6. Mismatches and Gaps. [Figure labels: Reference Genome, Reads, Deletion, SNP] © Strand
  7. 7. Handling paired reads. [Figure labels: Subject’s Genome, ×, Reference Genome, Repeat Region, Repeat Region] © Strand
  8. 8. A variety of Read Lengths. Short reads: ~50, few mismatches and gaps. Long reads: few hundreds to thousands, many more mismatches and gaps. © Strand
  9. 9. Speed and Memory. Billions of reads. Run in 4GB RAM. Allow use of multiple cores/processors. Scale speed with more memory. © Strand
  10. 10. How do Alignment algorithms work? © Strand
  11. 11. Indexing the Genome to find Seed Matches. Scanning the Reference for each Read takes too long. The Reference Index: the Index very quickly yields locations in the Reference where some part (seed) of the Read matches. [Figure labels: This Seed occurs at Reference locations x1, x2…; This Seed occurs at Reference locations x3, x4…] © Strand
  12. 12. Detailed Alignment at Seed Match Locations. [Figure labels: Reference, Seed Match, Read] How many Mismatches and Gaps are needed for the Read to match around the Seed? Smith-Waterman or Dynamic Programming. © Strand
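
The detailed alignment step on slide 12 can be sketched as follows. This is a minimal Smith-Waterman local alignment in Python with a linear gap penalty. The function name `smith_waterman` and the scoring values (match +2, mismatch -1, gap -1) are assumptions chosen for illustration; they are not taken from CoBWeb or Avadis NGS.

```python
def smith_waterman(read, ref, match=2, mismatch=-1, gap=-1):
    """Best local alignment of read against ref.

    Returns (score, aligned_read, aligned_ref), with '-' marking a gap.
    Scoring values are illustrative assumptions, not CoBWeb's.
    """
    rows, cols = len(read) + 1, len(ref) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic programming table; scores never drop below zero,
    # which is what makes the alignment local.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if read[i - 1] == ref[j - 1] else mismatch
            diag = score[i - 1][j - 1] + sub   # match or mismatch
            up = score[i - 1][j] + gap         # read base opposite a gap
            left = score[i][j - 1] + gap       # reference base opposite a gap
            score[i][j] = max(0, diag, up, left)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    a_read, a_ref = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if read[i - 1] == ref[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            a_read.append(read[i - 1])
            a_ref.append(ref[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            a_read.append(read[i - 1])
            a_ref.append("-")
            i -= 1
        else:
            a_read.append("-")
            a_ref.append(ref[j - 1])
            j -= 1
    return best, "".join(reversed(a_read)), "".join(reversed(a_ref))


if __name__ == "__main__":
    print(smith_waterman("ACGTACGT", "ACGTTACGT"))
```

In a seed-and-extend aligner this routine would typically be run only on the Reference window around each Seed Match, not on the whole Reference, since the table has one cell per (read base, reference base) pair.
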
  13. 13. The Burrows-Wheeler based Index. The original Reference: C G A C $. All its circular shifts, sorted lexicographically (with $ sorting after the letters). The leading numbers are the Circular Shift Indices; the last column is the BWT:
        2  A C $ C G
        0  C G A C $
        3  C $ C G A
        1  G A C $ C
        4  $ C G A C
      The Index comprises these along with some housekeeping data structures. These can be sampled to fit into reduced memory at the expense of speed without sacrificing correctness. © Strand
  14. 14. The Burrows-Wheeler based Index. [Figure labels: Reference, EXACT Match, Read] All Exact Matches of a Read (NO Mismatches or Gaps) in the Reference can be found in time proportional to the length of the Read and largely independent of the size of the Reference. © Strand
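
A minimal sketch of the index from slides 13 and 14, assuming the slide's ordering in which $ sorts after the letters and a Reference containing only A, C, G and T. `build_index` and `find_exact` are hypothetical names. The sketch keeps every Circular Shift Index and a full occurrence table instead of sampling them, so it illustrates the search itself, not the memory layout of a production index such as CoBWeb's.

```python
ALPHABET = "ACGT$"  # assumption: '$' sorts last, as in the slide's example
RANK = {c: i for i, c in enumerate(ALPHABET)}


def build_index(reference):
    """Build (shift indices, BWT, first-column offsets, occurrence table)."""
    text = reference + "$"
    n = len(text)
    # Sort the start positions of all circular shifts of the text.
    shifts = sorted(
        range(n), key=lambda i: [RANK[c] for c in text[i:] + text[:i]]
    )
    # The BWT is the last column of the sorted shifts.
    bwt = "".join(text[i - 1] for i in shifts)
    # first[c]: number of characters in the text that sort before c.
    first, total = {}, 0
    for c in ALPHABET:
        first[c] = total
        total += text.count(c)
    # occ[c][k]: number of occurrences of c in bwt[:k].
    occ = {c: [0] * (n + 1) for c in ALPHABET}
    for k, ch in enumerate(bwt):
        for c in ALPHABET:
            occ[c][k + 1] = occ[c][k] + (ch == c)
    return shifts, bwt, first, occ


def find_exact(pattern, index):
    """Return sorted Reference positions where pattern matches exactly."""
    shifts, bwt, first, occ = index
    lo, hi = 0, len(bwt)  # current range of sorted shifts, half-open
    # Backward search: one constant-time range update per pattern character.
    for c in reversed(pattern):
        if c not in first:
            return []
        lo = first[c] + occ[c][lo]
        hi = first[c] + occ[c][hi]
        if lo >= hi:
            return []
    return sorted(shifts[lo:hi])


if __name__ == "__main__":
    index = build_index("CGAC")
    print(index[0], index[1])
    print(find_exact("C", index))
```

For the slide's Reference `CGAC`, `build_index` yields the shift order 2, 0, 3, 1, 4 and the BWT `G$ACC`, matching the table on slide 13. The search loop does a fixed amount of work per Read character, which is where the "time proportional to the length of the Read" property comes from.
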
  15. 15. How does CoBWeb work? © Strand
  16. 16. Seeding Strategy. Use the BW based index, augmented with additional data structures for speed, to find one or more Long Seed Matches in the Reference. [Figure labels: This 15-mer occurs at locations x1, x2…; This 15-mer occurs at locations x3, x4…; This whole 30-mer occurs at location x5] Justification: Most long Reads do not have Mismatches and Gaps strewn across their length; there are usually long stretches that match exactly. And Long Seeds will have few matching locations. © Strand
  17. 17. Advantages. Separating the Smith-Waterman phase from the BW Index search allows an unlimited number of gaps and mismatches. Seed length is not specified in advance, so Long and Short reads can be handled seamlessly. © Strand
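
The idea of seeds whose length is not fixed in advance can be illustrated with a greedy sketch. The slides do not describe CoBWeb's actual seeding procedure, so everything here is an assumption: the function names `occurrences` and `long_seeds`, the right-to-left greedy extension, and the `min_len` cutoff are all hypothetical. Plain substring search stands in for the BW based index.

```python
def occurrences(ref, s):
    """Naive stand-in for an index lookup: all start positions of s in ref."""
    found, i = [], ref.find(s)
    while i != -1:
        found.append(i)
        i = ref.find(s, i + 1)
    return found


def long_seeds(read, ref, min_len=15):
    """Greedily cut the read into exact-match stretches, right to left.

    Each stretch is extended leftwards for as long as it still occurs
    somewhere in ref, so seed length is decided by the data and not by a
    preset k. Returns (start, end, ref_positions) tuples in read order,
    keeping only stretches of at least min_len bases.
    """
    seeds = []
    end = len(read)
    while end > 0:
        start = end
        # Extend the candidate seed one base to the left while it still matches.
        while start > 0 and read[start - 1:end] in ref:
            start -= 1
        if start == end:
            # This base does not occur in ref at all (for example 'N').
            end -= 1
            continue
        if end - start >= min_len:
            seeds.append((start, end, occurrences(ref, read[start:end])))
        end = start
    seeds.reverse()
    return seeds


if __name__ == "__main__":
    reference = "AGGCTACGCATGTCCCATAATGACCCACACTTAAGTTC"
    subject_read = "AGGCTACGCATTTCCCATAAAGACCCACGCTTAAGTTC"
    for start, end, locations in long_seeds(subject_read, reference, min_len=5):
        print(start, end, subject_read[start:end], locations)
```

Because each seed is grown until it stops matching, a Read with few Mismatches yields a small number of long seeds, and longer seeds tend to have fewer candidate locations to pass on to the Smith-Waterman phase.
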
  18. 18. How does CoBWeb compare with other algorithms? © Strand
  19. 19. Comparison with BWA. [Chart; legible labels: CoBWeb: 94%; BWA: 4%; Read Length 50; Read Length 150; Alignment Score with up to 2 Gaps; error + 1 gap of possibly multiple length] A little faster than BWA with comparable results. © Strand
  20. 20. How is CoBWeb exposed in Avadis NGS? © Strand
  21. 21. Entry. Two new experiment types, DNA Alignment and Small-RNA Alignment. © Strand
  22. 22. The Alignment Workflow. Run Alignment, and then create a DNA Variant or ChIP-Seq Experiment from the results. © Strand
  23. 23. Alignment Parameters. Specify number of Mismatches and Gaps, and handling of Multiple Matching. Specify Adaptor Trimming (only for Small RNA) and 3’, 5’ trimming based on quality. Screen against Contaminant Databases. © Strand
  24. 24. What is the future evolution of CoBWeb? © Strand
  25. 25. ToDos: Chimeric Reads; RNA-Seq Alignment; Base Quality recalibration; Affine Gap Costs. © Strand
  26. 26. © Strand