Dual index adapters with UMIs resolve index hopping
and increase sensitivity of variant detection
Nick Downey, PhD
Applications Scientist
Outline
• Description of new xGen® Dual Index UMI Adapters
• Overview of NGS workflow that includes sample multiplexing
• Discussion of cross-talk
– Sources
– Reduction with dual index adapters
• Discussion of accurate variant detection with UMI error correction
• Additional resources
UMI = unique molecular identifier
New xGen Dual Index UMI Adapter
3-in-1 design
• Designed for Illumina sequencers
• Compatible with standard end repair and A-tailing library construction,
including PCR-free library methods
• Reduced sample cross-talk with dual, unique sample indexes
• Ideal for error correction and/or counting applications through use of a
degenerate, 9 bp UMI
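As a rough illustration of what a degenerate 9 bp UMI provides, the sketch below counts the possible UMI sequences and estimates how often two molecules would receive the same UMI by chance. It assumes all four bases are equally likely at every position and that UMIs are drawn independently, which is an idealization and not a statement about the actual adapter synthesis. The function names are illustrative only.

```python
import math

UMI_LENGTH = 9  # from the slide: degenerate, 9 bp UMI


def umi_space(length: int = UMI_LENGTH) -> int:
    """Number of distinct UMIs if every position can be A, C, G, or T."""
    return 4 ** length


def collision_probability(n_molecules: int, length: int = UMI_LENGTH) -> float:
    """Chance that at least two of n molecules share a UMI.

    Assumption: UMIs are drawn uniformly and independently (birthday problem).
    """
    space = umi_space(length)
    # log P(all UMIs distinct) = sum over i of log(1 - i / space)
    log_p_unique = sum(math.log1p(-i / space) for i in range(n_molecules))
    return 1.0 - math.exp(log_p_unique)


if __name__ == "__main__":
    print(umi_space())
    print(collision_probability(100))
```

Under these assumptions, collisions matter mainly when many molecules share the same mapping position, which is why consensus tools usually group reads by UMI and alignment position together.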
xGen Dual Index UMI Adapter
3-in-1 design
NGS workflow
Library construction
Fragmentation
End repair and A-tailing
Adapter ligation
Bead cleanup
Library amplification
Bead cleanup
• Y-adapters: 13 bp are complementary
• Library conversion of top and bottom strand
NGS target capture enrichment
• IDT xGen Lockdown® Probes
– Individually synthesized
– Individual QC for every probe
– Individually normalized
– Pooled
What is sample cross-talk?
• Reads are assigned to the wrong sample
Applications that can be impacted by cross-talk
• Low-frequency somatic variant detection: false positives from other samples
• Ancient DNA research: a single sequence may support DNA survival or contamination
• Viral detection: false positives from other samples
• Gene expression: bleed-over from one sample to another
• Microbial profiling: bleed-over from one sample to another
Kircher M, Sawyer S, Meyer M. (2012) Double indexing overcomes inaccuracies in multiplex sequencing on the
Illumina platform. Nucleic Acids Res, 40(1):e3.
D’Amore R, Ijaz UZ, et al. (2016) A comprehensive benchmarking study of protocols and sequencing platforms for
16S rRNA community profiling. BMC Genomics, 17:55.
Sources of sample cross-talk
• Index contamination
• Index hopping
• Misread bases within sample index
• Sample carryover from previous runs
• Index mis-assignment
D’Amore R, Ijaz UZ, et al. (2016) BMC Genomics, 17:55.
Combinatorial indexing is susceptible to cross-talk
Unique, dual indexes reduce contamination
mis-assignment
• Index contamination may occur on any sequencing platform
Unique, dual indexes mitigate index hopping
during multiplexed target enrichment
• There are low levels of index hopping in multiplexed target enrichment
• Index hopping reads are filtered out using unique dual indexes
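The filtering step can be sketched as a lookup on the (i7, i5) index pair. This is a minimal illustration, not a demultiplexer: the index sequences and sample names are made up, and real software also tolerates sequencing errors in the index reads. With unique dual indexes each index appears in exactly one sample, so a read whose pair is not on the sample sheet is discarded. In a combinatorial scheme, the same hopped pair could match another valid sample and be mis-assigned.

```python
from typing import Dict, Optional, Tuple

# Hypothetical sample sheet: every sample owns a unique i7 AND a unique i5.
SAMPLE_SHEET: Dict[Tuple[str, str], str] = {
    ("ACGTTGCA", "TTGACCAG"): "sample_A",
    ("GTCAAGTC", "CAGTGTAC"): "sample_B",
}


def assign_sample(
    i7: str, i5: str, sheet: Dict[Tuple[str, str], str] = SAMPLE_SHEET
) -> Optional[str]:
    """Return the sample for an exact (i7, i5) match, or None to filter the read."""
    return sheet.get((i7, i5))


if __name__ == "__main__":
    # Expected pair: assigned to its sample.
    print(assign_sample("ACGTTGCA", "TTGACCAG"))
    # Hopped pair (i7 from sample_A, i5 from sample_B): no match, read is dropped.
    print(assign_sample("ACGTTGCA", "CAGTGTAC"))
```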
Index hopping during multiplexed sequencing
• Patterned flow cells use exclusion amplification (ExAmp) chemistry
• ExAmp is associated with higher levels of index hopping than bridge amplification
• PCR-free libraries exhibit higher levels of index hopping than amplified libraries
Sinha R, Stanley G, et al. (2017) Index switching causes “spreading-of-signal” among multiplexed samples in Illumina HiSeq 4000
DNA sequencing. bioRxiv, doi: https://doi.org/10.1101/125724.
Illumina (2017) Minimize index hopping in multiplexed runs. www.illumina.com/science/education/minimizing-index-hopping.html
[accessed November 9, 2017].
Unique, dual indexes mitigate cross-talk for
PCR-free libraries
• PCR-free libraries were sequenced on an Illumina MiSeq with bridge amplification
• Index hopping or adapter contamination reads are correctly filtered out with unique dual index adapters
Unique, dual indexes reduce sample cross-talk
Co-development of dual index adapters with Illumina
IDT manufactures Illumina UDI adapters
xGen Dual Index UMI Adapter
3-in-1 design
Circulating cell-free DNA as a “liquid biopsy”
Bettegowda C, Sausen M, et al. (2014) Detection of circulating tumor DNA in early- and late-stage human malignancies.
Sci Transl Med, 6(224):224ra24.
• Less invasive than performing a tissue biopsy
• Theoretically represents tumor heterogeneity better than a localized biopsy sample
• Facilitates ongoing, highly personalized monitoring
Levels of error correction and sensitivity
Consensus analysis
[Figure, built up over four slides: TP shown for total reads, UMI reads, and consensus reads (Min3)]
Tumor model system
• 25 ng of a 1% mixture (0.5% minimum allelic frequency) was used to
assess sensitivity and positive predictive value (PPV)
• Libraries were captured with a set of custom xGen® Lockdown® Probes
covering 288 common SNP sites for a total target area of ~35 kb
• Variant calling was performed with VarDict
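For reference, sensitivity and PPV follow the standard definitions: sensitivity = TP / (TP + FN) and PPV = TP / (TP + FP). The short sketch below computes both from variant counts. The counts in the usage example are placeholders chosen for easy arithmetic, not results from this experiment.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of expected variants that were called: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def ppv(true_pos: int, false_pos: int) -> float:
    """Fraction of calls that are real variants: TP / (TP + FP)."""
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Placeholder counts for illustration only.
    tp, fp, fn = 9, 1, 1
    print(f"sensitivity = {sensitivity(tp, fn):.2f}")
    print(f"PPV         = {ppv(tp, fp):.2f}")
```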
Consensus analysis increases variant calling accuracy
All expected variants
0.2% variant calling threshold
Consensus analysis increases variant calling accuracy
Low frequency variant calling accuracy (≤1%)
54 low frequency variants
Oxidative errors are removed using consensus analysis
Costello M, Pugh TJ, et al. (2013) Discovery and characterization of artifactual mutations in deep coverage targeted
capture sequencing data due to oxidative DNA damage during sample preparation. Nucleic Acids Res, 41(6):e67.
Accurate variant detection with UMI error correction
• Without molecular barcoding, it is difficult to distinguish true and
false positives at frequencies below ~1%
• Using UMIs to build consensus reads dramatically increases variant
calling accuracy down to 0.5% using 25 ng of input
– With minimal changes to sensitivity, the number of false positives
dropped ~30-fold using Dual Index UMI adapters
– Considering low frequency variants from 1% to 0.5%, PPV was
increased from 26% to 92%, while keeping sensitivity above 87% using a
variant calling threshold of 0.2%
– Oxidative errors were removed using consensus analysis
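The idea behind consensus reads can be sketched as follows. This is a simplified illustration, not the analysis pipeline used for these results: it assumes reads arrive as (UMI, start position, sequence) tuples, that reads in a family have equal length, and that "Min3" means a family needs at least 3 reads to yield a consensus. Real tools also handle base qualities, UMI sequencing errors, and paired strands.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

MIN_FAMILY_SIZE = 3  # assumption: "Min3" = at least 3 reads per UMI family


def build_consensus(
    reads: Iterable[Tuple[str, int, str]],
    min_family_size: int = MIN_FAMILY_SIZE,
) -> Dict[Tuple[str, int], str]:
    """Group reads by (UMI, start position) and take a per-base majority vote."""
    families = defaultdict(list)
    for umi, start, seq in reads:
        families[(umi, start)].append(seq)

    consensus = {}
    for key, seqs in families.items():
        if len(seqs) < min_family_size:
            continue  # too few reads to correct errors; family is dropped
        # zip(*seqs) walks the family column by column; ties go to the base
        # seen first, which a production tool would instead mask or score.
        consensus[key] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


if __name__ == "__main__":
    reads = [
        ("AAACCCGGG", 100, "ACGT"),
        ("AAACCCGGG", 100, "ACGT"),
        ("AAACCCGGG", 100, "ACTT"),  # one read carries an error at base 3
        ("TTTGGGCCC", 100, "ACGA"),  # singleton family, below the minimum
    ]
    print(build_consensus(reads))
```

An error present in only one read of a family is outvoted, while a true variant is present in every read derived from the original molecule and survives the vote.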
Ordering – idtdna.com
Ordering
Index sequences
Index sequences
Distinct, unique dual indexes
Conclusions
3-in-1 design
• Dual index UMI adapters resolve index hopping and enable accurate
low frequency variant detection
• IDT has 384 predesigned unique indexes for custom adapters
– Edit distance ≥3, color balanced in sets of 4, designed for 2- and 4-color
sequencers, with 50% GC content
– Adapters are made-to-order
• Submit orders to customquotes@idtdna.com
For help with custom adapter designs, contact Application Support
applicationsupport@idtdna.com
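The edit distance and GC content criteria above can be checked on any candidate index set with a short script. The sketch below is an assumption-laden illustration: it uses Levenshtein distance as the edit distance, the example sequences are made up and are not IDT index sequences, and color balance is not checked.

```python
from itertools import combinations
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(b) + 1))
    for i, base_a in enumerate(a, start=1):
        cur = [i]
        for j, base_b in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,                       # deletion
                cur[j - 1] + 1,                    # insertion
                prev[j - 1] + (base_a != base_b),  # match or substitution
            ))
        prev = cur
    return prev[-1]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def min_pairwise_distance(indexes: Sequence[str]) -> int:
    """Smallest edit distance between any two indexes in the set."""
    return min(edit_distance(a, b) for a, b in combinations(indexes, 2))


if __name__ == "__main__":
    candidate_indexes = ["ACGTACGT", "TGCATGCA", "GATCCTAG"]  # hypothetical
    print(min_pairwise_distance(candidate_indexes) >= 3)
    print([gc_fraction(seq) for seq in candidate_indexes])
```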
Additional resources
• Analysis guidelines
• Webinars
• Posters
THANK YOU
