Unique, dual-matched adapters mitigate
index hopping between NGS samples
Kristina Giorda, PhD
1
Outline
• NGS workflow and cross-talk
• Sources of sample cross-talk and mitigation strategies
• Adapter recommendations
2
NGS workflow
3
Library construction
4
Fragmentation
End repair and A-tailing
Adapter ligation
Bead cleanup
Library amplification
Bead cleanup
• Y-adapters: 13 bp are complementary
• Library conversion of top and bottom strand
Indexing strategies
5
Ligation
Library amplification
NGS target capture enrichment
6
• IDT xGen® Lockdown® Probes
– Individually synthesized
– Individual QC for every probe
– Individually normalized
– Pooled
Sequencing
7
What is sample cross-talk?
8
Reads are assigned to the wrong sample
Applications that might be impacted by cross-talk
• Low-frequency somatic variant detection—false positives from other
samples
• Ancient DNA research—a single sequence may support DNA
survival or contamination
• Viral detection—false positives from other samples
• Gene expression—bleed over from one sample to another
• Microbial profiling—bleed over from one sample to another
9
Kircher M, Sawyer S, et al. (2012) Double indexing overcomes inaccuracies in multiplex sequencing on the Illumina
platform. Nucleic Acids Research, 40(1):e3–e3.
D’Amore R, Ijaz UZ, et al. (2016) A comprehensive benchmarking study of protocols and sequencing platforms for 16S
rRNA community profiling. BMC Genomics, 17(1):55.
Custom dual indices
10
AGBT 2014
IDT has been making custom dual indices for a long time
Sources of sample cross-talk
• Contamination
• Index hopping during multiplex capture
• Index hopping during cluster amplification
• Misread bases within index sequences
• Sample carryover from previous sequencing runs
11
D’Amore R, Ijaz UZ, et al. (2016) A comprehensive benchmarking study of protocols and sequencing
platforms for 16S rRNA community profiling. BMC Genomics, 17(1):55.
Sources of sample cross-talk
12
Index contamination
Combinatorial indexing
13
Unique, dual-matched indices reduce contamination mis-assignment
14
• P5 and P7 ligations are independent
• Index contamination may occur on any sequencing platform
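Because the P5 and P7 ligations are independent, the benefit of matching both indices can be sketched with simple arithmetic. The function below is a rough illustration, not a measured model: the contamination fractions `c_i5` and `c_i7` are hypothetical inputs, and the combinatorial case assumes the worst case in which every single-index error lands on another index pair that is in use.

```python
def expected_misassignment(c_i5: float, c_i7: float) -> dict:
    """Rough expected fraction of mis-assigned reads (illustrative assumptions only).

    c_i5, c_i7: hypothetical fractions of molecules carrying a wrong i5 or i7 index,
    treated as independent events.
    """
    # Combinatorial indexing: one wrong index is enough to produce another valid pair.
    combinatorial = 1 - (1 - c_i5) * (1 - c_i7)
    # Unique, dual-matched indexing: both indices must be wrong (and match the same
    # other sample), so the product is an upper bound.
    unique_dual = c_i5 * c_i7
    return {"combinatorial": combinatorial, "unique_dual": unique_dual}


rates = expected_misassignment(0.001, 0.001)
print(rates)
```

With both fractions set to 0.1%, the combinatorial bound is about 0.2% of reads, while the unique, dual-matched bound is about 0.0001%.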
Unique, dual-matched adapters reduce contamination mis-assignment
exponentially
15
• 16 libraries were prepared
and captured with the IDT
xGen® AML Cancer Panel
• Libraries were sequenced on
the NextSeq® System
(Illumina)
• 0.09% of reads were filtered
out; these reads would have
been mis-assigned with
combinatorial indices
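The filtering described above can be sketched as a lookup of each read's index pair against the sample sheet. This is a minimal sketch, not the pipeline used for the data on this slide: the 8 nt index sequences, the sample names, and the `demultiplex` helper are all hypothetical.

```python
from collections import Counter
from itertools import product


def demultiplex(reads, sample_sheet):
    """Count reads per sample; index pairs absent from the sheet go to 'unmatched'."""
    counts = Counter()
    for i7, i5 in reads:
        counts[sample_sheet.get((i7, i5), "unmatched")] += 1
    return counts


# Hypothetical index sequences, for illustration only.
I7 = ["ATCACGTT", "CGATGTAC"]
I5 = ["TTAGGCAT", "GCCAATGG"]

# Unique, dual-matched: each i7 and each i5 is used by exactly one sample.
unique_dual = {(I7[0], I5[0]): "sample_1", (I7[1], I5[1]): "sample_2"}
# Combinatorial: every i7/i5 combination is assigned to some sample.
combinatorial = {pair: f"sample_{n}" for n, pair in enumerate(product(I7, I5), start=1)}

# A sample_1 molecule whose i5 index was swapped for the other i5.
hopped_read = (I7[0], I5[1])

print(demultiplex([hopped_read], unique_dual))    # the read lands in 'unmatched'
print(demultiplex([hopped_read], combinatorial))  # the read is credited to sample_2
```

With the unique, dual-matched sheet the swapped read is discarded, whereas the combinatorial sheet silently assigns it to another sample.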
Sources of sample cross-talk
16
Index contamination
Index hopping
Index swapping
Index cross-talk
Spreading of signal
Multiplexed target enrichment index hopping
17
• Target enrichment index
hopping primarily occurs
during post-capture PCR
• Index hopping may occur on
the P5 and P7 side
Approach for measuring multiplex capture index
hopping
• 16 libraries were prepared, and
1-, 4-, 8-, or 16-plex captures
were performed with the IDT
xGen® AML Cancer Panel
using 500 ng per library
• Libraries were sequenced on the
NextSeq® System (Illumina)
18
Unique, dual-matched indices mitigate index
hopping during multiplexed target enrichment
19
• There are low levels of index hopping in multiplexed target enrichment
• Index hopping reads are filtered out using unique dual-matched indices
Index hopping reads are effectively filtered out
with unique dual-matched indices
• Index hopping reads are filtered
out with unique dual-matched
adapters
• These reads would have been
mis-assigned with combinatorial
indices
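One way to estimate the level of index hopping from a run that used unique, dual-matched indices is to count reads whose two indices are both known but do not belong together. The sketch below assumes a hypothetical table of read counts per (i7, i5) pair; the function name and the placeholder index labels are illustrative and are not taken from the experiment above.

```python
def hopped_fraction(pair_counts, expected_pairs):
    """Fraction of reads with two known indices in an unexpected combination."""
    known_i7 = {i7 for i7, _ in expected_pairs}
    known_i5 = {i5 for _, i5 in expected_pairs}
    matched = 0
    hopped = 0
    for (i7, i5), n in pair_counts.items():
        if (i7, i5) in expected_pairs:
            matched += n
        elif i7 in known_i7 and i5 in known_i5:
            hopped += n  # both indices are real, but the pairing is wrong
        # Anything else (e.g. sequencing errors in the index) is ignored here.
    total = matched + hopped
    return hopped / total if total else 0.0


# Hypothetical counts, for illustration only.
expected = {("i7_A", "i5_A"), ("i7_B", "i5_B")}
counts = {
    ("i7_A", "i5_A"): 9990,
    ("i7_B", "i5_B"): 9990,
    ("i7_A", "i5_B"): 12,
    ("i7_B", "i5_A"): 8,
    ("i7_X", "i5_Y"): 50,  # unknown indices, not counted as hopping
}
print(hopped_fraction(counts, expected))
```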
20
Sources of sample cross-talk
21
Index contamination
Index hopping
Index swapping
Index cross-talk
Spreading of signal
Index hopping during multiplexed sequencing
• Patterned flow cells use exclusion amplification (ExAmp) chemistry,
which is associated with more index mis-assignment than bridge amplification
22
Sinha R, Stanley G, et al. (2017) Index switching causes “spreading-of-signal” among multiplexed samples in
Illumina HiSeq 4000 DNA sequencing. bioRxiv.
www.illumina.com/science/education/minimizing-index-hopping.html
Index hopping during multiplexed sequencing
23
www.illumina.com/content/dam/illumina-marketing/documents/products/whitepapers/index-hopping-white-paper-770-2017-004.pdf?linkId=36607862
• Index hopping can occur on
the P5 and P7 side
Illumina’s recommendations
24
www.illumina.com/content/dam/illumina-marketing/documents/products/whitepapers/index-hopping-white-paper-770-2017-004.pdf?linkId=36607862
Sources of sample cross-talk
25
Index contamination
Index hopping
Index swapping
Index cross-talk
Spreading of signal
Index mis-assignment
Demultiplexing noise
Index design
• Errors can be introduced during synthesis, library
preparation, amplification, and sequencing
• Need unique sequence tags for each sample
• Consider errors (insertions, deletions, and substitutions)
• Indices based on edit metric or Levenshtein distance are
used to account for substitutions and indels
• Need to: avoid homopolymers, match GC content, exclude
self-complements, color balance, and consider sequencing
platform (4 vs. 2 color)
26
Faircloth BC, Glenn TC. (2012) Not all sequence tags are created equal: Designing and validating sequence
identification tags robust to indels. PLOS ONE, 7(8):e42543.
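The design criteria listed above can be sketched as simple per-sequence checks. The thresholds below (maximum homopolymer length and GC range) are hypothetical defaults chosen for illustration. The per-cycle base diversity is only a crude stand-in for color balance, since the actual rules depend on the sequencer chemistry.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    return sum(base in "GC" for base in seq) / len(seq)


def longest_homopolymer(seq: str) -> int:
    # Each match is a maximal run of one repeated base.
    return max(len(m.group(0)) for m in re.finditer(r"(.)\1*", seq))


def is_self_complementary(seq: str) -> bool:
    return seq == seq.translate(_COMPLEMENT)[::-1]


def passes_design_checks(seq, max_homopolymer=2, gc_range=(0.375, 0.625)):
    """Apply the hypothetical thresholds to one candidate index sequence."""
    return (
        longest_homopolymer(seq) <= max_homopolymer
        and gc_range[0] <= gc_fraction(seq) <= gc_range[1]
        and not is_self_complementary(seq)
    )


def min_bases_per_cycle(indexes) -> int:
    """Smallest number of distinct bases seen at any cycle across a set of indexes."""
    return min(len(set(column)) for column in zip(*indexes))


print(passes_design_checks("ACTGTCAG"))         # balanced GC, no repeats
print(passes_design_checks("ACGTACGT"))         # fails: equals its reverse complement
print(min_bases_per_cycle(["ACGT", "CATG"]))    # every cycle shows two different bases
```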
Edit distance considers substitutions and indels
27
Faircloth BC, Glenn TC. (2012) Not all sequence tags are created equal: Designing and validating sequence
identification tags robust to indels. PLOS ONE, 7(8):e42543.
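To make the difference between substitution-only (Hamming) distance and edit (Levenshtein) distance concrete, the sketch below implements both with standard textbook algorithms. It is not code from the cited paper, and the example sequences are hypothetical.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of mismatched positions; only defined for equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a: str, b: str) -> int:
    """Minimum number of substitutions, insertions, and deletions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # match or substitution
            ))
        prev = cur
    return prev[-1]


def min_pairwise_distance(indexes) -> int:
    """Smallest edit distance between any two indexes in a set."""
    return min(levenshtein(a, b) for a, b in combinations(indexes, 2))


# A one-base shift looks very different by Hamming distance,
# yet it is only two edits away (one deletion plus one insertion).
print(hamming("ACGTACGT", "CGTACGTA"))
print(levenshtein("ACGTACGT", "CGTACGTA"))
```

Two indices that differ at every position can still be only two edits apart, which is why a set robust to indels is screened by edit distance rather than by mismatches alone.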
Index quality filtering
28
Wright ES, Vetsigian KH. (2016) Quality filtering of Illumina index reads mitigates sample
cross-talk. BMC Genomics, 17(1):876.
• Quality filtering index reads
minimizes cross-talk while
preserving the majority of reads
• Unique, dual indexing is required
for highly sensitive applications
• Run-specific and
application-specific thresholds
can be used to minimize cross-talk
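A quality filter on index reads can be sketched as follows. This is a simplified illustration rather than the method of the cited study: it assumes Phred+33 encoded quality strings, uses the minimum base quality of each index read as the criterion, and the threshold `min_q=26` is a hypothetical value that would need to be tuned per run and per application.

```python
def phred_scores(quality: str, offset: int = 33):
    """Decode a FASTQ quality string (Phred+33 assumed) into integer scores."""
    return [ord(ch) - offset for ch in quality]


def passes_index_quality(index_qualities, min_q: int = 26) -> bool:
    """Keep a read only if every base of every index read meets the threshold."""
    return all(min(phred_scores(q)) >= min_q for q in index_qualities)


# Hypothetical quality strings for the i7 and i5 index reads of two reads.
good_read = ["IIIIIIII", "IIIIIIII"]   # 'I' decodes to Q40
poor_read = ["IIII#III", "IIIIIIII"]   # '#' decodes to Q2

print(passes_index_quality(good_read))
print(passes_index_quality(poor_read))
```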
Unique, dual-matched indices reduce sample
cross-talk
29
Illumina and IDT partner on NGS multiplexing and
exome enrichment
• The proprietary index kits will be compatible with Illumina library
prep products and sequencers
– Highly optimized for use on platforms with Illumina’s two-channel
chemistry and patterned flow cells such as the NovaSeq™ Series
– Extend the number of unique dual indexes (UDI) from 8 UDIs to 24
– The new 24 UDI kits can be preordered from Illumina now
– Future expansion to 96 UDI kits planned in partnership with Illumina
• The optimized index codes are now available through IDT for
incorporation into custom adapter orders
30
xGen® Dual Index UMI Adapter
31
3-in-1 design
Conclusions
• All sequencing platforms are susceptible to cross-talk for multiplexed
sensitive applications
• Unique dual indices mitigate sample cross-talk and enable sensitive
applications
• IDT has 384 predesigned unique indices for custom adapters
– Edit distance ≥3, color balanced as sets of 4, designed for 2- and 4-color
sequencers, and with 50% GC content
Contact Application Support applicationsupport@idtdna.com for
xGen® Dual Index UMI Adapters or help with custom designs
32
Additional resources
• Kircher M, Sawyer S, et al. (2012) Double indexing overcomes inaccuracies in multiplex
sequencing on the Illumina platform. Nucleic Acids Research, 40(1):e3–e3.
• D’Amore R, Ijaz UZ, et al. (2016) A comprehensive benchmarking study of protocols and
sequencing platforms for 16S rRNA community profiling. BMC Genomics, 17(1):55.
• Sinha R, Stanley G, et al. (2017) Index switching causes “spreading-of-signal” among
multiplexed samples in Illumina HiSeq 4000 DNA sequencing. bioRxiv.
• Faircloth BC, Glenn TC. (2012) Not all sequence tags are created equal: Designing and
validating sequence identification tags robust to indels. PLOS ONE, 7(8):e42543.
• Wright ES, Vetsigian KH. (2016) Quality filtering of Illumina index reads mitigates sample
cross-talk. BMC Genomics, 17(1):876.
• enseqlopedia.com/2016/12/index-mis-assignment-between-samples-on-hiseq-4000-and-x-ten/
• www.illumina.com/science/education/minimizing-index-hopping.html
• www.idtdna.com/pages/support/technical-vault/video-library/next-generation-sequencing/accurate-detection-of-low-frequency-genetic-variants-using-novel-molecular-tagged-sequencing-adapters
33
THANK YOU
34


Editor's Notes

  • #25 Using Picard’s demultiplex tool, unexpected index combinations end up in the unmatched file
  • #27 Errors can be introduced during synthesis, library preparation, amplification, and sequencing