Accurate detection of low frequency
genetic variants using novel, molecular
tagged sequencing adapters
Mirna Jarosz, PhD
Integrated DNA Technologies, Inc
Webinar—November 16, 2016
Outline
• Review
– The growing need for accurate detection of low frequency variants
– Liquid biopsies: what are they, why are they important, and what makes them
challenging?
– Library preparation and target enrichment
• Experimental results
– New adapters containing unique molecular identifiers
– Model system for assessing accuracy of low frequency variant detection
– Analysis methods and accuracy results
Precision health and oncology
• White House Precision Health Initiative mission statement:
– To enable a new era of medicine through research, technology, and
policies that empower patients, researchers, and providers to work
together toward development of individualized care.
• NCI and cancer.gov define precision medicine as:
– Discovering unique therapies that treat an individual’s cancer based on
the specific abnormalities of their tumor.
From www.cancer.gov
Critical-to-know mutation profile to treat lung cancer
Li T, Kung H-J, et al. (2013) Genotyping and genomic profiling of non–small-cell lung
cancer: Implications for current and future therapies. J Clin Oncol, 31(8):1039–1049.
Sufficient DNA is a challenge, and lung biopsies are invasive
Hagemann IS, Devarakonda S, et al. (2015) Clinical next-generation sequencing in patients with
non–small cell lung cancer. Cancer, 121(4):631–639.
Circulating cell-free DNA as a “liquid biopsy”
Bettegowda C, Sausen M, et al. (2014) Detection of circulating tumor DNA in early- and late-stage
human malignancies. Sci Transl Med, 6(224):224ra24.
• Less invasive than performing a
tissue biopsy
• Theoretically represents the full
tumor heterogeneity better than a
localized biopsy sample
• Facilitates ongoing, highly
personalized monitoring
Demand for higher sensitivity
Early detection, monitoring for residual disease, detecting resistance
mutations, tumor profiling when biopsies are not possible
Clin Cancer Res 2014, 20(17):4613–4624
Sample -> sequencer-ready = library construction
Library construction
• Fragmentation
• End repair and A-tailing
• Adapter ligation
• Bead cleanup
• Library amplification
• Bead cleanup
NGS target capture enrichment
• Detecting low frequency variants requires ultra-deep coverage
– Whole genome sequencing
– Whole exome sequencing
– Focused targeted panels
• IDT xGen® Lockdown® Probes
– Individually synthesized
– Individual QC for every probe
– Individually normalized
– Pooled
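A rough way to see why low frequency variants call for deep unique coverage is a simple binomial sampling model. The sketch below is illustrative only: it assumes each unique molecule covering a site independently carries the variant with probability equal to the variant frequency, and it ignores sequencing and PCR error. The function name, the example depths, and the requirement of three supporting molecules are assumptions made for illustration, not values from the webinar.

```python
from math import comb


def prob_at_least(k, depth, vaf):
    """Probability that at least k of `depth` unique molecules carry the variant.

    Assumes a binomial model: each molecule independently carries the
    variant with probability `vaf` (the variant allele frequency).
    """
    # Sum the probabilities of seeing 0 .. k-1 variant molecules,
    # then take the complement.
    p_fewer = sum(
        comb(depth, i) * vaf**i * (1 - vaf) ** (depth - i) for i in range(k)
    )
    return 1.0 - p_fewer


if __name__ == "__main__":
    # Hypothetical depths; 0.25% matches the threshold used later in the talk.
    for depth in (500, 2000, 8000):
        p = prob_at_least(3, depth, 0.0025)
        print(f"depth {depth}: P(>= 3 variant molecules) = {p:.3f}")
```

Under this model, the chance of sampling enough variant-bearing molecules rises steeply with unique depth, which is the motivation for focusing sequencing on a small targeted panel.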
Target enrichment using hybridization
xGen® Lockdown® Probes are individually synthesized and QCed
Each xGen® Lockdown® Probe receives an individual ESI-MS analysis
[Figure: ESI-MS traces in two panels, "Failed" and "Remade", with peaks labeled "Full length" and "Truncated".]
Individual synthesis and QC means uniform and complete coverage
Levels of error correction and sensitivity
Adapter structures
[Figure: schematic of adapter structures. Labels: "Standard" and "Dual sample indexes"; P5 and P7 arms; A/T overhangs at the ligation junction; UMI positions.]
Experimental details
Analysis summary
• Libraries were captured with a set of custom xGen® Lockdown® Probes
covering 288 common SNP sites, for a total target area of ~35 kb
• Variant calling was performed with VarDict using a threshold variant frequency
of 0.25%
• The no-UMI analysis uses standard start/stop information to remove apparent
PCR duplicates
• The UMI analysis adds back unique molecules that happened to share
start/stop sites
• The consensus analysis requires at least three reads from a unique molecule
and uses their consensus as input to variant calling
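The three analysis modes above can be sketched in a few lines. This is a minimal illustration, not the pipeline used in the webinar: reads are modeled as plain dictionaries with hypothetical keys (`chrom`, `start`, `end`, `umi`, `seq`), UMIs are matched exactly with no error tolerance, and the consensus is an unweighted per-position majority vote that ignores base qualities.

```python
from collections import Counter, defaultdict


def dedup_start_stop(reads):
    """No-UMI mode: keep one read per (chrom, start, end) position."""
    seen = {}
    for r in reads:
        seen.setdefault((r["chrom"], r["start"], r["end"]), r)
    return list(seen.values())


def group_by_umi(reads):
    """Group reads into families sharing position and UMI."""
    families = defaultdict(list)
    for r in reads:
        families[(r["chrom"], r["start"], r["end"], r["umi"])].append(r)
    return families


def dedup_umi(reads):
    """UMI mode: keep one read per family, so distinct molecules that
    share start/stop sites are no longer collapsed together."""
    return [family[0] for family in group_by_umi(reads).values()]


def consensus_reads(reads, min_family_size=3):
    """Consensus mode: one majority-vote read per family that has at
    least `min_family_size` members; smaller families are dropped."""
    out = []
    for family in group_by_umi(reads).values():
        if len(family) < min_family_size:
            continue
        length = min(len(r["seq"]) for r in family)
        seq = "".join(
            Counter(r["seq"][i] for r in family).most_common(1)[0][0]
            for i in range(length)
        )
        out.append({**family[0], "seq": seq})
    return out


if __name__ == "__main__":
    pos = {"chrom": "chr1", "start": 100, "end": 250}
    reads = [
        {**pos, "umi": "AAC", "seq": "ACGT"},
        {**pos, "umi": "AAC", "seq": "ACGT"},
        {**pos, "umi": "AAC", "seq": "ACTT"},  # one copy carries an error
        {**pos, "umi": "GGT", "seq": "ACGT"},  # different molecule, same ends
    ]
    print(len(dedup_start_stop(reads)), len(dedup_umi(reads)))
    print([r["seq"] for r in consensus_reads(reads)])
```

In the toy input, start/stop de-duplication collapses two distinct molecules into one, UMI de-duplication keeps both, and the consensus step outvotes the single erroneous base while discarding the family that has only one read.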
Sensitivity and positive predictive value (PPV) with
consensus analysis with 0.25% variant frequency threshold
[Figure: sensitivity (%) plotted against PPV (%) for the No UMI (start/stop) and UMI analyses, with a bar chart of mean de-duped coverage for each. Diagram labels: TP in total reads, after de-dup with start/stop, and after de-dup with UMIs.]
Sensitivity and positive predictive value (PPV) with
consensus analysis with 0.25% variant frequency threshold
[Figure: sensitivity (%) plotted against PPV (%) for the No UMI (start/stop), UMI, and Consensus analyses, with a bar chart of mean de-duped coverage for each. Diagram label: TP in consensus reads.]
Summary: sensitivity and specificity for SNVs
Analysis             FP called  FP filtered  TP called  TP filtered  TP missing  Sensitivity  PPV
No UMI (start/stop)  641        2            241        0            1           99.59%       27.32%
UMI                  368        0            239        0            3           98.76%       39.37%
Consensus only       2          13           239        0            3           98.76%       99.17%
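The sensitivity and PPV columns can be reproduced from the counts in the table. The short sketch below assumes sensitivity is TP called divided by TP called plus TP missing (TP filtered is 0 in every row) and PPV is TP called divided by TP called plus FP called, with filtered false positives not counted as calls. The function and variable names are illustrative.

```python
def sensitivity(tp_called, tp_missing):
    """Fraction of known true variants that were called."""
    return tp_called / (tp_called + tp_missing)


def ppv(tp_called, fp_called):
    """Fraction of calls that are true variants."""
    return tp_called / (tp_called + fp_called)


# Counts copied from the summary table.
rows = {
    "No UMI (start/stop)": {"fp": 641, "tp": 241, "missing": 1},
    "UMI": {"fp": 368, "tp": 239, "missing": 3},
    "Consensus only": {"fp": 2, "tp": 239, "missing": 3},
}

if __name__ == "__main__":
    for name, c in rows.items():
        s = sensitivity(c["tp"], c["missing"])
        p = ppv(c["tp"], c["fp"])
        print(f"{name}: sensitivity {s:.2%}, PPV {p:.2%}")
    # Ratio of false positives called without UMIs to those called with consensus.
    fold = rows["No UMI (start/stop)"]["fp"] / rows["Consensus only"]["fp"]
    print(f"FP reduction: {fold:.1f}-fold")
```

The false positive ratio of 641 to 2 is the basis for the roughly 300-fold reduction quoted in the conclusions.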
Consensus calling reduces false positives
Error reduction by base
[Figure: error rate (%) by base substitution (A>C, A>G, A>T, C>A, C>G, C>T) for the No UMI (start/stop), UMI, and Consensus analyses.]
Deeper consensus data
[Figure: normalized counts by family size for the original sample and for deeper consensus coverage.]
Deeper consensus drives down C>A / G>T error rate
[Figure: error rate (%) by error type (C>A, C>T, A>T, A>C, A>G, C>G) for No UMI and for consensus minimum family sizes of 2 to 10.]
Conclusions
• Without molecular barcoding, it is difficult to distinguish true and false
positives at frequencies below ~5%
• The addition of UMIs to the ligation adapters increases unique coverage
by rescuing “false” PCR duplicates
• Using UMIs to build consensus reads dramatically increases variant calling
accuracy
– With minimal changes to sensitivity, the number of false positives dropped
~300-fold
– Considering variants down to 0.25% frequency, PPV was increased from
27% to >99%, while keeping sensitivity above 98.5%
Questions?
TALK TO A PERSON.
Our experts are available for consultation.
Contact us by web chat, email, or phone.
Find local contact details at: www.idtdna.com
Or email: applicationsupport@idtdna.com
“Best tech support ever, @idtdna!” Lauren Sakowski
“The people at @idtdna are awesome. A+ for customer service.” Nikolai Braun
THANK YOU!
We will email you the webinar recording
and slides next week.
