ANALYSIS OF STRUCTURAL VARIANTS
FROM NEXT GENERATION SEQUENCING
Hemang Parikh, Ph.D.
NIST
Challenges for identifying true SVs
This Venn diagram shows the
numbers of unique and shared
structural variants (SVs) found
by different sequencing-based
discovery approaches that have
been used in the 1000
Genomes Project
Hence we decided to develop
methods to look for evidence
of SVs in mapped sequencing
reads from multiple
sequencing technologies
From Alkan et al. (2011)
Validation parameters for each SV
• Coverage (mean and standard deviation)
• Paired-end distance/insert size (mean and standard deviation)
• # of discordant paired-end reads
• Soft clipping of the reads (mean and standard deviation)
• Mapping quality (mean and standard deviation)
• # of heterozygous and homozygous SNP genotype calls
• % of GC content
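A few of the parameters listed for each SV can be illustrated with a minimal Python sketch. This is not the Perl script used in this work. The function names, the toy inputs (per-base depths, per-read mapping qualities, a reference sequence string) and the use of the population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def mean_sd(values):
    """Return (mean, population standard deviation); (0.0, 0.0) if empty."""
    if not values:
        return (0.0, 0.0)
    return (mean(values), pstdev(values))


def gc_percent(seq):
    """Percent of G/C among unambiguous bases (A, C, G, T); N is ignored."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def annotate_region(depths, mapqs, ref_seq):
    """Summarize one region (hypothetical helper, not the original script).

    depths  -- per-base read depth across the region
    mapqs   -- mapping quality of each read overlapping the region
    ref_seq -- reference bases of the region
    """
    cov_mean, cov_sd = mean_sd(depths)
    mq_mean, mq_sd = mean_sd(mapqs)
    return {
        "coverage_mean": cov_mean,
        "coverage_sd": cov_sd,
        "mapq_mean": mq_mean,
        "mapq_sd": mq_sd,
        "gc_percent": gc_percent(ref_seq),
    }
```

In practice the depths and mapping qualities would be read from the BAM file with a BAM library; that step is left out here so the sketch stays self-contained.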
Inputs:
• Reference sequence
• RepeatMasker data
• Aligned sequence data (BAM file)
• List of structural variants (BED file)
Perl script
Output: about 180 annotations per SV
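The first step of such a pipeline, reading the SV list and deriving the regions to annotate, might look like the Python sketch below. It is an illustration only, not the Perl script itself. The window names L, M and R (left flank, SV body, right flank) and the 1,000 bp flank size are assumptions; the slides do not define them.

```python
def read_bed(lines):
    """Yield (chrom, start, end) from BED lines, skipping header lines.

    BED coordinates are 0-based and half-open: [start, end).
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split("\t")[:3]
        yield chrom, int(start), int(end)


def windows(chrom, start, end, flank=1000):
    """Return hypothetical annotation windows around one SV.

    L -- left flank, M -- the SV interval itself, R -- right flank.
    The flank size is an assumed value, not one taken from the slides.
    """
    return {
        "L": (chrom, max(0, start - flank), start),
        "M": (chrom, start, end),
        "R": (chrom, end, end + flank),
    }
```

For example, `windows("chr1", 500, 2000)` clips the left flank at position 0 because the SV starts fewer than 1,000 bp from the start of the chromosome.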
NA12878 Data Sets—RM for GIAB
• Illumina (250 bp long sequences with 50X coverage)
• Illumina NIST (150 bp long sequences with 300X coverage)
• Illumina Platinum Genome (100 bp long sequences with
200X coverage)
• Illumina Moleculo
• Pacific Biosciences
Deletions Gold Sets for NA12878
• Personalis (n=2,306)
• The 1000 Genomes pilot (n=2,773)
• Complete Genomics (n=2,032)
• Conrad et al. (n=515)
• Kidd et al. (n=317)
• McCarroll et al. (n=128)
• The 1000 Genomes—aCGH array based (n=3,901)
• Roche NimbleGen 42 million—aCGH array based (n=719)
• Randomly generated (n=2,306)
Personalis deletions call set (n=2,306)
[Histogram: Counts vs. Log10 (SV Size)]
• BAM-level evidence in the vicinity
of each SV, in most of the 19 CEPH
pedigree samples
• SV breakpoints were identified
• Some SVs were validated with PCR
Illumina NIST
[Histograms: Counts vs. Log10 (M coverage); panels: Personalis, Random genome]
Identifying likely SVs and likely non-SVs
[Histogram: Counts vs. Log10 (M coverage); Random genome]
• Identify the 99th percentile value of an annotation parameter in the random genome set
• Compare this value with the same annotation parameter from the SV Gold Set
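The thresholding step can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the nearest-rank percentile definition, the function names, and the choice to flag gold-set values above the threshold are not taken from the slides, which do not state the percentile method or the direction of the comparison.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct is 1..100)."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def flag_above(gold_values, random_values, pct=99):
    """Flag gold-set values beyond the random-set percentile.

    Returns (threshold, flags), where flags[i] is True when
    gold_values[i] is strictly greater than the threshold.
    """
    threshold = percentile(random_values, pct)
    return threshold, [value > threshold for value in gold_values]
```

With random-set values 1 to 100, the 99th percentile under this definition is 99, so gold-set values of 100 or more are flagged.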
Annotating with Illumina NIST and Illumina Moleculo
Personalis SV Gold Set for Illumina NIST annotation parameters
• L Insert size
• L Soft Clipped
• L # of discordant paired-end reads
• M Coverage
• M Coverage SD
• M Mapping quality
• M Insert size
• M Soft Clipped
• M # of discordant paired-end reads
Personalis SV Gold Set for Illumina Moleculo annotation parameters
• L Soft Clipped
• M Coverage
• M Coverage SD
• M Mapping quality
• M Soft Clipped
(A) Personalis
Rows: Illumina Moleculo; columns: Illumina NIST

        0     1     2     3     4     5     6     7     8     9    10
 0     21    96   323   350   231   126    80    40    10     2     1
 1      4    19    45    59    61    29    16     9     9     0     1
 2      1    22   108   200   214   111    69    36     8     3     0
 3      0     0     0     1     1     0     0     0     0     0     0
 4      0     0     0     0     0     0     0     0     0     0     0
 5      0     0     0     0     0     0     0     0     0     0     0

(B) Random genome
Rows: Illumina Moleculo; columns: Illumina NIST

        0     1     2     3     4     5     6     7     8     9    10
 0   2059    94    18     6     2     3     1     0     0     0     0
 1     62    15    12     5     1     3     2     0     0     1     0
 2     13     3     5     0     0     0     0     1     0     0     0
 3      0     0     0     0     0     0     0     0     0     0     0
 4      0     0     0     0     0     0     0     0     0     0     0
 5      0     0     0     0     0     0     0     0     0     0     0
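As a consistency check on the two tables, the short Python sketch below sums each one and compares the share of calls in the cell at row 0, column 0. The cell values are copied from the slide; the variable and function names are illustrative. Reading the row and column indices as the number of annotation parameters flagged per dataset is an interpretation, not something the slides state.

```python
# Rows: Illumina Moleculo (0-5); columns: Illumina NIST (0-10).
personalis = [
    [21, 96, 323, 350, 231, 126, 80, 40, 10, 2, 1],
    [4, 19, 45, 59, 61, 29, 16, 9, 9, 0, 1],
    [1, 22, 108, 200, 214, 111, 69, 36, 8, 3, 0],
    [0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0],
    [0] * 11,
    [0] * 11,
]
random_genome = [
    [2059, 94, 18, 6, 2, 3, 1, 0, 0, 0, 0],
    [62, 15, 12, 5, 1, 3, 2, 0, 0, 1, 0],
    [13, 3, 5, 0, 0, 0, 0, 1, 0, 0, 0],
    [0] * 11,
    [0] * 11,
    [0] * 11,
]


def total(table):
    """Sum of all cells in a table given as a list of rows."""
    return sum(sum(row) for row in table)


def share_at_origin(table):
    """Fraction of all calls that fall in the (row 0, column 0) cell."""
    return table[0][0] / total(table)


for name, table in (("Personalis", personalis), ("Random genome", random_genome)):
    print(f"{name}: n={total(table)}, share at (0, 0)={share_at_origin(table):.3f}")
```

Both tables sum to 2,306, matching the sizes of the Personalis and randomly generated sets, while the (0, 0) cell holds under 1% of the Personalis calls and about 89% of the random regions.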
Conclusions
• Graphical visualization of the annotation parameters shows a clear
distinction between true positive and false positive SVs
• A key advantage of the proposed method is its simplicity and flexibility to
generate various annotation parameters from aligned sequence data based on
different sequencing datasets from the same genome
• This allows integration of multiple sequencing datasets to identify high-
confidence SV and non-SV calls that can be used as a benchmark to assess
false positive and false negative rates
• We are currently testing classification methods based on the annotation
parameters to generate both high-confidence SV calls and high-confidence
non-SV calls for NA12878
Acknowledgements
NIST
Marc Salit
Justin Zook
Hariharan Iyer
Desu Chen
Sumona Sarkar
Jennifer McDaniel
Lindsay Vang
David Catoe
Nathanael Olson
Genome in a Bottle Consortium
Personalis Inc.
Mark Pratt
Gabor Bartha
Jason Harris
Illumina Inc.
Michael Eberle
Stanford University
Michael Snyder
Amin Zia
Somalee Datta
Cuiping Pan
Sean Michael Boyle
Rajini Haraksingh
Natalie Jaeger


Aug2014 nist structural variant integration
