Alignment of raw reads in Avadis NGS


Avadis NGS provides support for aligning raw reads for small RNA, ChIP-Seq and DNA-Seq analysis. The alignment algorithm "CoBWeb" integrated with Avadis NGS is a new proprietary algorithm based on the Burrows-Wheeler Transform.

Transcript

  • 1. Pioneering Scientific Intelligence. DNA/Small RNA Alignment in Avadis NGS 1.3. Strictly Confidential © Strand Life Sciences
  • 2. Questions we will seek to answer in this presentation: What is an Alignment algorithm? What issues must an Alignment algorithm consider? How do Alignment algorithms work? How does CoBWeb work? How does CoBWeb compare with other algorithms? How is CoBWeb exposed in Avadis NGS? What is the future evolution of CoBWeb?
  • 3. What is an Alignment algorithm?
  • 4. Subject’s Genome: AGGCTACGCATTTCCCATAAAGACCCACGCTTAAGTTC. Reference Genome, close but not quite the same as the Subject’s Genome: AGGCTACGCATGTCCCATAATGACCCACACTTAAGTTC
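The two 38-base sequences on slide 4 can be compared position by position to see how "close but not quite the same" they are. The sketch below is illustrative only: the function name `mismatch_positions` is hypothetical, and it assumes the first sequence on the slide is the subject and the second is the reference.

```python
def mismatch_positions(a, b):
    """Return the 0-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Sequences copied from slide 4; which one is the subject is an assumption.
subject = "AGGCTACGCATTTCCCATAAAGACCCACGCTTAAGTTC"
reference = "AGGCTACGCATGTCCCATAATGACCCACACTTAAGTTC"

for i in mismatch_positions(subject, reference):
    print(f"position {i}: subject {subject[i]}, reference {reference[i]}")
```

This direct comparison only works because the two sequences are already lined up and have no gaps; an alignment algorithm has to discover that placement for each read.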
  • 5. What issues must an Alignment algorithm consider?
  • 6. Mismatches and Gaps. [Figure labels: Reference Genome, Reads, Deletion, SNP]
  • 7. Handling paired reads. [Figure labels: Subject’s Genome, ×, Reference Genome, Repeat Region, Repeat Region]
  • 8. A variety of Read Lengths. Short reads: ~50, few mismatches and gaps. Long reads: few hundreds to thousands, many more mismatches and gaps.
  • 9. Speed and Memory. Run in 4GB RAM. Allow use of multiple cores/processors. Billions of reads. Scale speed with more memory.
  • 10. How do Alignment algorithms work?
  • 11. Indexing the Genome to find Seed Matches. Scanning the Reference for each Read takes too long. The Reference Index: the Index very quickly yields locations in the Reference where some part (seed) of the Read matches. [Figure: one Seed occurs at Reference locations x1, x2…; another Seed occurs at Reference locations x3, x4…]
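The idea on slide 11, looking up seeds in a prebuilt index instead of scanning the reference for every read, can be sketched with a plain dictionary of fixed-length k-mers. This is a deliberately simpler stand-in for the Burrows-Wheeler index described on the later slides, not how CoBWeb does it. The names `build_kmer_index` and `seed_hits`, the choice of k = 8, and the example read are all assumptions made for illustration.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer of the reference to the list of positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def seed_hits(index, read, k):
    """Return (offset in read, position in reference) for each k-mer of the read found in the index."""
    hits = []
    for offset in range(len(read) - k + 1):
        for pos in index.get(read[offset:offset + k], []):
            hits.append((offset, pos))
    return hits


# Reference sequence from slide 4; the read is a hypothetical 16-base read
# that differs from the reference at two bases.
reference = "AGGCTACGCATGTCCCATAATGACCCACACTTAAGTTC"
read = "CATTTCCCATAAAGAC"

index = build_kmer_index(reference, k=8)
for offset, pos in seed_hits(index, read, k=8):
    print(f"read offset {offset} matches reference position {pos}")
```

Each hit suggests a candidate placement of the whole read (reference position minus read offset), which is then checked in detail.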
  • 12. Detailed Alignment at Seed Match Locations. [Figure labels: Reference, Seed Match, Read] How many Mismatches and Gaps are needed for the Read to match around the Seed? Smith-Waterman or Dynamic Programming.
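The question on slide 12 can be answered with a small dynamic program. The sketch below is an assumption-laden simplification: it counts every mismatch and every gap base as cost 1 (a unit-cost edit distance, a simpler relative of Smith-Waterman scoring) and lets the read start and end anywhere inside a reference window. The function name `min_edits` is hypothetical.

```python
def min_edits(read, window):
    """Minimum number of mismatches plus gap bases needed to align the
    whole read somewhere inside the reference window (unit costs)."""
    # Row 0 is all zeros: the read may start at any window position for free.
    prev = [0] * (len(window) + 1)
    for i in range(1, len(read) + 1):
        # Column 0: the first i read bases have no reference partner.
        curr = [i] + [0] * len(window)
        for j in range(1, len(window) + 1):
            substitution = prev[j - 1] + (read[i - 1] != window[j - 1])
            extra_read_base = prev[j] + 1       # gap in the reference
            skipped_ref_base = curr[j - 1] + 1  # gap in the read
            curr[j] = min(substitution, extra_read_base, skipped_ref_base)
        prev = curr
    # The read may end at any window position, so take the best final cell.
    return min(prev)


print(min_edits("ACGT", "TTACGTTT"))  # read occurs exactly in the window
print(min_edits("ACGT", "TTAGTTT"))   # read has one base the window lacks
```

In a seed-based aligner the window would be a stretch of the reference around a seed match rather than the whole genome, which keeps this quadratic step affordable.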
  • 13. The Burrows-Wheeler based Index. The original Reference: C G A C $. All its circular shifts, sorted lexicographically, each preceded by its Circular Shift Index:
      2   A C $ C G
      0   C G A C $
      3   C $ C G A
      1   G A C $ C
      4   $ C G A C
    The last column (G $ A C C) is the BWT. The Circular Shift Indices can be sampled to fit into reduced memory at the expense of speed without sacrificing correctness. The Index comprises these along with some housekeeping data structures.
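The construction on slide 13 can be reproduced directly by sorting all circular shifts. This is a minimal sketch for tiny inputs, not a practical index builder (it materialises every shift). The function name `bwt_by_sorted_rotations` is hypothetical, and the ordering rule, with `$` sorting after the letters, is inferred from the sorted shifts shown on the slide.

```python
def bwt_by_sorted_rotations(text):
    """Return (bwt, shift_indices) for a text that ends with the '$' terminator."""

    def rank(ch):
        # '$' sorts after every other character, matching the slide's order.
        return (ch == "$", ch)

    n = len(text)
    # Sort the starting positions of the circular shifts by the shifted text.
    shifts = sorted(range(n), key=lambda i: [rank(c) for c in text[i:] + text[:i]])
    # The last character of the shift starting at i is the character before i.
    bwt = "".join(text[i - 1] for i in shifts)
    return bwt, shifts


bwt, shifts = bwt_by_sorted_rotations("CGAC$")
print(bwt)     # last column of the sorted shifts
print(shifts)  # circular shift indices, in sorted order
```

For the slide's reference `CGAC$`, this yields the last column G $ A C C and the shift indices 2, 0, 3, 1, 4.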
  • 14. The Burrows-Wheeler based Index. [Figure: an EXACT Match of a Read in the Reference] All Exact Matches of a Read (NO Mismatches or Gaps) in the Reference can be found in time proportional to the length of the Read and largely independent of the size of the Reference.
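The claim on slide 14 is usually realised with the standard backward search over a Burrows-Wheeler index: the read is consumed one character at a time from its end, and each step narrows a range of sorted shifts using two lookups. The sketch below shows that textbook technique with uncompressed tables; it is not CoBWeb's implementation, and the names `build_index` and `find_exact` are hypothetical. It assumes reads contain no `$`.

```python
def build_index(reference):
    """Build a toy Burrows-Wheeler index: shift indices, first-column offsets, occurrence counts."""
    text = reference + "$"
    n = len(text)

    def rank(ch):
        return (ch == "$", ch)  # '$' sorts last

    shifts = sorted(range(n), key=lambda i: [rank(c) for c in text[i:] + text[:i]])
    bwt = [text[i - 1] for i in shifts]

    # first[c]: number of characters in the text that sort before c.
    first, total = {}, 0
    for ch in sorted(set(text), key=rank):
        first[ch] = total
        total += text.count(ch)

    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {ch: [0] * (n + 1) for ch in first}
    for i, b in enumerate(bwt):
        for ch in occ:
            occ[ch][i + 1] = occ[ch][i] + (b == ch)
    return shifts, first, occ


def find_exact(index, read):
    """Return sorted reference positions where the read occurs exactly."""
    shifts, first, occ = index
    lo, hi = 0, len(shifts)  # current range of sorted shifts
    for ch in reversed(read):  # one constant-time step per read character
        if ch not in first:
            return []
        lo = first[ch] + occ[ch][lo]
        hi = first[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(shifts[lo:hi])


index = build_index("CGAC")
print(find_exact(index, "C"))
print(find_exact(index, "GA"))
```

The search loop runs once per read character regardless of the reference length, which is where the "proportional to the length of the Read" behaviour comes from; only reporting the positions depends on how many matches there are.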
  • 15. How does CoBWeb work?
  • 16. Seeding Strategy. [Figure: one 15-mer occurs at locations x1, x2…; another 15-mer occurs at locations x3, x4…; the whole 30-mer occurs at location x5] Use the BW based index, augmented with additional data structures for speed, to find one or more Long Seed Matches in the Reference. Justification: Most long Reads do not have Mismatches and Gaps strewn across their length; there are usually long stretches that match exactly. And Long Seeds will have few matching locations.
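The justification on slide 16 can be illustrated by finding the longest stretch of a read that occurs exactly in the reference. CoBWeb is proprietary and its actual seeding procedure is not given in the slides, so the sketch below is only an illustration of the idea: it uses Python's built-in substring search as a stand-in for the BW based index, and the names `find_all` and `longest_exact_seed` are hypothetical.

```python
def find_all(reference, seed):
    """Return every position at which seed occurs in reference."""
    positions, start = [], reference.find(seed)
    while start != -1:
        positions.append(start)
        start = reference.find(seed, start + 1)
    return positions


def longest_exact_seed(read, reference):
    """Return (offset in read, length, reference positions) of the longest
    stretch of the read that matches the reference exactly."""
    best = (0, 0, [])
    for offset in range(len(read)):
        # Only seeds longer than the current best are interesting.
        length = best[1] + 1
        while offset + length <= len(read) and read[offset:offset + length] in reference:
            length += 1
        length -= 1  # last length that still matched (or the old best)
        if length > best[1]:
            best = (offset, length, find_all(reference, read[offset:offset + length]))
    return best


# Reference from slide 4 and a hypothetical read with two mismatches.
reference = "AGGCTACGCATGTCCCATAATGACCCACACTTAAGTTC"
read = "CATTTCCCATAAAGAC"
print(longest_exact_seed(read, reference))
```

In this example the short prefix of the read matches at more than one reference position, while the longest exact stretch matches at only one, which is the point of preferring long seeds.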
  • 17. Advantages. Separating the Smith-Waterman phase from the BW Index search allows an unlimited number of gaps and mismatches. Seed length is not specified in advance, so Long and Short reads can be handled seamlessly.
  • 18. How does CoBWeb compare with other algorithms?
  • 19. Comparison with BWA. [Chart labels: Read Length 50; Read Length 150; CoBWeb: 94%; BWA: 4%; Alignment Score with up to 2 Gaps; error + 1 gap of possibly multiple length] A little faster than BWA with comparable results.
  • 20. How is CoBWeb exposed in Avadis NGS?
  • 21. Entry. Two new experiment types: DNA Alignment and Small-RNA Alignment.
  • 22. The Alignment Workflow. Run Alignment, and then create a DNA Variant or ChIP-Seq Experiment from the results.
  • 23. Alignment Parameters. Specify number of Mismatches and Gaps, and handling of Multiple Matching. Specify Adaptor Trimming (only for Small RNA) and 3’, 5’ trimming based on quality. Screen against Contaminant Databases.
  • 24. What is the future evolution of CoBWeb?
  • 25. ToDos: Chimeric Reads; RNA-Seq Alignment; Base Quality recalibration; Affine Gap Costs.
  • 26. http://www.avadis-ngs.com