Developing a framework for detection of low-frequency somatic genetic alterations in targeted sequencing data
Background
BACKGROUND: Cancer is a complex, heterogeneous disease of the genome. Most cancers result
from an accumulation of multiple genetic alterations that lead to dysfunction of cancer-associated
genes and pathways. Recent advances in sequencing technology have enabled comprehensive
profiling of genetic alterations in cancer. We have established a targeted sequencing platform
(IMPACT: Integrated Mutation Profiling of Actionable Cancer Targets) using hybridization capture and
next-generation sequencing (NGS) technology, which can reveal mutations, indels, and copy number
alterations involving 340 cancer-related genes.
METHOD: To identify mutations, indels, and copy number alterations, we present a unified analytic
framework developed in Perl to discover and genotype variation among multiple samples
simultaneously with high sensitivity and specificity. Our framework incorporates many elements that
have become standard practice for NGS data analysis such as i) adaptor trimming, ii) mapping and
duplicate masking, iii) local realignment around indels, iv) base quality score recalibration, v) SNV
and indel calling, vi) annotation, and vii) filtering. Importantly, we utilize a tumor-normal pair approach,
where each tumor is always processed with a matched normal sample in order to distinguish somatic
mutations from inherited variants. Local realignment is performed jointly for all samples from the
same patient to maximize the sensitivity and specificity for detecting somatic indels. To distinguish
true low-frequency somatic mutations from systematic sequencing artifacts, we genotype each
candidate sequence variant in a collection of unmatched normal samples from multiple sequencing
runs. Filtering based on genotyping and genomic annotation not only eliminates sequencing artifacts
but also provides confidence in the calls that are made. We have applied this framework to analyze
deep coverage targeted sequencing data from >1,000 archived tumor specimens and have
implemented it for the prospective characterization of patient samples in the Molecular Diagnostics
Service at Memorial Sloan-Kettering Cancer Center.
*Abstract altered after submission.
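The filtering logic described in the Method paragraph (matched tumor-normal comparison plus genotyping in unmatched normals) can be sketched as follows. This is an illustrative Python sketch, not the authors' Perl implementation: the `Candidate` record, the function names, and every threshold value are assumptions chosen for the example, since the poster does not state the cutoffs it uses.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """Hypothetical record for one candidate variant (not the poster's format)."""
    name: str
    tumor_alt: int       # reads supporting the variant in the tumor
    tumor_depth: int     # total reads at the site in the tumor
    normal_alt: int      # reads supporting the variant in the matched normal
    normal_depth: int    # total reads at the site in the matched normal
    panel_vafs: List[float]  # variant frequency in each unmatched normal


def vaf(alt: int, depth: int) -> float:
    """Variant allele frequency: fraction of reads supporting the variant."""
    return alt / depth if depth > 0 else 0.0


def is_somatic(c: Candidate,
               min_alt_reads: int = 5,        # assumed threshold
               min_tumor_vaf: float = 0.02,   # assumed threshold
               max_normal_vaf: float = 0.01,  # assumed threshold
               panel_cutoff: float = 0.01,    # assumed threshold
               max_panel_hits: int = 0) -> bool:
    # 1. Require enough support in the tumor (allele depth and frequency).
    if c.tumor_alt < min_alt_reads or vaf(c.tumor_alt, c.tumor_depth) < min_tumor_vaf:
        return False
    # 2. Reject inherited variants: the matched normal should not carry it.
    if vaf(c.normal_alt, c.normal_depth) > max_normal_vaf:
        return False
    # 3. Reject systematic artifacts: the variant should not recur in
    #    unmatched normal samples from other sequencing runs.
    hits = sum(1 for f in c.panel_vafs if f >= panel_cutoff)
    return hits <= max_panel_hits


low_freq = Candidate("low_freq_somatic", 40, 800, 0, 600, [0.0, 0.0, 0.001])
germline = Candidate("germline", 400, 800, 300, 600, [0.0, 0.5, 0.0])
artifact = Candidate("artifact", 30, 800, 1, 600, [0.03, 0.02, 0.04])
kept = [c.name for c in (low_freq, germline, artifact) if is_somatic(c)]
```

In this toy input only the 5% tumor-only variant passes; the germline call fails the matched-normal check and the artifact fails the unmatched-normal check.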
Overview of the Framework
Results
Table 1: Change in number of mutation calls with the application of
filters. These mutations come from 30 samples sequenced in a single pool.
Genome Informatics 2013 Meeting, 10/30/2013-
11/03/2013, Cold Spring Harbor, NY
Targeted Sequencing
Ronak H. Shah, A. Rose Brannon, Donavan T. Cheng, Helen H. Won, Sasinya N. Scott, Ahmet Zehir, Talia Mitchell,
Ryma Benayed, Catherine O Reilly, Aijazuddin Syed, Nancy Bouvier, Michael F. Berger
Department of Pathology, Memorial Sloan-Kettering Cancer Center, New York, NY
Conclusions
• This analysis framework helps to identify low frequency, high
confidence somatic alterations, making our targeted sequencing
platform suitable for clinical use.
• This platform may provide important individual information regarding
tumor initiation and progression and a more reliable prediction of
personalized cancer therapies.
Acknowledgements
Berger Lab and the Diagnostic Molecular Pathology Laboratory Staff
Targeted sequencing workflow (IMPACT assay): prepare 24-48 libraries; hybridize and select (NimbleGen SeqCap: probes for 340 cancer genes); sequence to 500-1000X (HiSeq 2500); align to genome and analyze.
Image 1 (panels: normal, tumor): KRAS p.Q61H exon 3 mutation found at 8%
allele frequency in a patient with liver cancer.
Image 2 (panels: normal, tumor): EGFR p.M766_A767insASV exon 20 insertion
found at 7% allele frequency in a patient with lung cancer.
Image 3 (panels: normal, tumor): EGFR p.K745_A750del exon 19 deletion
found at 6% allele frequency in a patient with lung cancer.
Image 4 (panels: normal, tumor): EML4-ALK fusion detected as an inversion, with
3% of reads supporting the fusion, in a patient with lung cancer.
Image 5: EGFR amplification observed with a positive fold change of 21, and CDKN2A/CDKN2B deletion observed
with a negative fold change of 5, in a patient with glioblastoma.
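One common way to express copy number changes as signed fold changes, as in the Image 5 caption, is sketched below. The poster does not say how its fold changes are computed, so the convention here (gains as tumor/normal ratio, losses as the negated inverse ratio) and the function name `signed_fold_change` are assumptions; the coverage values are taken to be already normalized for overall sequencing depth.

```python
def signed_fold_change(tumor_cov: float, normal_cov: float) -> float:
    """Signed fold change of normalized tumor coverage relative to normal.

    Assumed convention: a ratio >= 1 is reported as-is (gain), and a
    ratio < 1 is reported as the negated inverse ratio (loss).
    """
    if tumor_cov <= 0 or normal_cov <= 0:
        raise ValueError("coverage values must be positive")
    ratio = tumor_cov / normal_cov
    return ratio if ratio >= 1.0 else -1.0 / ratio


# Made-up coverage values chosen to reproduce the magnitudes in the caption.
gain = signed_fold_change(2100.0, 100.0)   # amplified gene
loss = signed_fold_change(100.0, 500.0)    # deleted gene
```

Under this convention a gene covered 21 times more deeply in the tumor than in the normal gives +21, and a gene covered at one fifth of the normal depth gives -5.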
Effect of filters on mutation calling
Sensitivity to detect mutations at all frequencies
Type of Mutation   Raw Calls   Filter A       Filter B     Filter C
SNVs               9674        652 (15%)      165 (25%)    137 (83%)
Indels             1900660     1644 (0.08%)   102 (6%)     66 (64%)
Filter A: filter using allele depth and variant frequency
Filter B: filter using annotation, genotyping information, and variant frequency
Filter C: filter from genotyping information for other normals
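The percentages in the table appear to give the share of calls retained relative to the preceding column. A short Python check of that reading, using only the counts printed in the table (the names `counts` and `step_retention` are hypothetical):

```python
# Call counts per stage, copied from the table: raw calls, then after each filter.
counts = {
    "SNV": [9674, 652, 165, 137],
    "INDEL": [1900660, 1644, 102, 66],
}


def step_retention(stages):
    """Percent of calls kept at each filter, relative to the previous stage."""
    return [100.0 * after / before for before, after in zip(stages, stages[1:])]


snv_pct = step_retention(counts["SNV"])
indel_pct = step_retention(counts["INDEL"])
```

Under this reading the indel percentages and the last two SNV percentages agree with the printed values up to rounding or truncation. The first SNV step computes to roughly 6.7% rather than the printed 15%, so one of those printed figures may be a transcription error; the poster gives no way to tell which.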
Image 7: 99% correlation is achieved between expected and observed variant
frequency at SNP sites for mixed normal samples vs. each normal sample on its own.
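The comparison behind Image 7 can be illustrated with a small Python sketch. It assumes that the expected variant frequency at a SNP site in a mixture is the proportion-weighted average of each normal sample's own allele fraction (0 for homozygous reference, 0.5 for heterozygous, 1 for homozygous variant), and that agreement is measured with a Pearson correlation. The poster states neither detail, and all names and numbers below are illustrative.

```python
import math


def expected_vaf(allele_fractions, proportions):
    """Proportion-weighted allele fraction expected in a mixture of samples."""
    total = sum(proportions)
    return sum(f * p for f, p in zip(allele_fractions, proportions)) / total


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Three normals mixed 1:1:2; each row holds one SNP site's per-sample fractions.
proportions = [1, 1, 2]
sites = [[0.5, 0.0, 1.0], [0.0, 0.0, 0.5], [1.0, 0.5, 0.0], [0.5, 0.5, 0.5]]
expected = [expected_vaf(site, proportions) for site in sites]
observed = [0.61, 0.26, 0.37, 0.51]  # made-up observed frequencies
r = pearson(expected, observed)
```

A correlation close to 1 between `expected` and `observed` would indicate that measured variant frequencies track the known mixing proportions.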
Image 6 (variant frequency vs. total depth): True-positive mutations were detected with a 98% recall rate, and
many hotspot mutations were found at varying variant frequencies. Hotspot
mutations are recurrent mutations in COSMIC and TCGA.
Coverage uniformity: 98% of targets at >50% of median coverage; 99% of targets at >20% of median coverage.
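A uniformity figure of this kind can be computed from per-target coverage as in the Python sketch below. It assumes the statistic is the fraction of targets whose mean coverage exceeds a given fraction of the median across all targets; the function name and the toy coverage values are illustrative, not taken from the poster.

```python
from statistics import median


def fraction_above(target_coverage, fraction_of_median):
    """Fraction of targets with coverage above fraction_of_median * median."""
    cutoff = fraction_of_median * median(target_coverage)
    passing = sum(1 for c in target_coverage if c > cutoff)
    return passing / len(target_coverage)


# Toy per-target mean coverage values (median is 700).
coverage = [100, 300, 700, 800, 900]
at_50 = fraction_above(coverage, 0.50)  # targets above 350x
at_20 = fraction_above(coverage, 0.20)  # targets above 140x
```

With the toy values, 3 of 5 targets exceed 50% of the median and 4 of 5 exceed 20% of the median.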