SV Data Jamboree
Justin Zook and Ali Bashir
With the Genome in a Bottle Consortium
September 15, 2016
Sequencing technologies and bioinformatics pipelines disagree
O’Rawe et al. Genome Medicine 2013, 5:28
Candidate NIST Reference Materials
Genome | PGP ID | Coriell ID | NIST ID | NIST RM #
CEPH Mother/Daughter | N/A | GM12878 | HG001 | RM8398
AJ Son | huAA53E0 | GM24385 | HG002 | RM8391 (son)/RM8392 (trio)
AJ Father | hu6E4515 | GM24149 | HG003 | RM8392 (trio)
AJ Mother | hu8E87A9 | GM24143 | HG004 | RM8392 (trio)
Asian Son | hu91BD69 | GM24631 | HG005 | RM8393
Asian Father | huCA017E | GM24694 | N/A | N/A
Asian Mother | hu38168C | GM24695 | N/A | N/A
Data for GIAB PGP Trios
Dataset | Characteristics | Coverage | Availability | Most useful for…
Illumina Paired-end WGS | 150x150bp; 250x250bp | ~300x/individual; ~50x/individual | on SRA/FTP | SNPs/indels/some SVs
Complete Genomics | | 100x/individual | on SRA/FTP | SNPs/indels/some SVs
SOLiD 5500W WGS | 50bp single end | 70x/son | on FTP | SNPs
Illumina Paired-end WES | 100x100bp | ~300x/individual | on SRA/FTP | SNPs/indels in exome
Ion Proton Exome | | 1000x/individual | on SRA/FTP | SNPs/indels in exome
Illumina Mate pair | ~6000 bp insert | ~30x/individual | on FTP | SVs
Illumina “moleculo” | Custom library | ~30x by long fragments | on FTP | SVs/phasing/assembly
Complete Genomics LFR | | 100x/individual | on SRA/FTP | SNPs/indels/phasing
10X | Pseudo-long reads | 30-45x/individual | on FTP | SVs/phasing/assembly
PacBio | ~10kb reads | ~70x on AJ son, ~30x on each AJ parent | on SRA/FTP | SVs/phasing/assembly/STRs
Oxford Nanopore | 5.8kb 2D reads | 0.02x on AJ son | on FTP | SVs/assembly
Nabsys 2.0 | ~100kbp N50 nanopore maps | 70x on AJ son | | SVs/assembly
BioNano Genomics | 200-250kbp optical map reads | ~100x/AJ individual; 57x on Asian son | on FTP | SVs/assembly
Paper describing data…
51 authors
14 institutions
12 datasets
7 genomes
Data described in ISA-tab
Integration Methods to Establish Benchmark Small Variant Calls
• Candidate variants
• Concordant variants
• Find characteristics of bias
• Arbitrate using evidence of bias
• Confidence Level
Zook et al., Nature Biotechnology, 2014.
How can we extend this approach to SVs?
Similarities to small variants
• Collect callsets from multiple technologies
• Compare callsets to find calls supported by multiple technologies
Differences from small variants
• Callsets generally are not sufficiently sensitive to assume that regions without calls are homozygous reference
– SVs of different types/sizes are not always detected easily
• Variants are often imprecisely characterized
– breakpoints, size, type, etc.
• Representation of variants is poorly standardized, especially when complex
• Comparison tools in infancy
Callsets Contributed so far
Short reads
• Illumina
– Spiral Genetics
– cortex
– Commonlaw
– MetaSV
• Complete Genomics
– CG-SV
– CG-CNV
– CG-vcfBeta
Long reads and Linked reads
• PacBio
– CSHL-assembly
– Sniffles
– PBHoney-spots and -tails
– Parliament/pacbio
– Parliament/assembly
– MultibreakSV
– smrt-sv.dip
– Assemblytics-Falcon and -MHAP
– NHGRI assembly-based
• Nanopore mapping
– Nabsys force calls
• Optical mapping
– BioNano with and without haplotype-aware assembly
• 10X Genomics Chromium
– Deletions
– Large SVs
AJ Trio Assemblies
On FTP
• PacBio
– Falcon
– Canu
• BioNano
– Haploid
– Diploid
In Process
• Illumina
– DISCOVAR – contig N50 ~100k
• PacBio
– Falcon diploid in process
• Dovetail scaffolding
– With PacBio-falcon
– With PacBio-Canu
– With DISCOVAR
• 10X?
– By itself
– Phasing PacBio
APPROACH #1: FIND DELETIONS WITH SUPPORT FROM MULTIPLE TECHS AND CONCORDANT BREAKPOINTS
Step 1: Merging calls
• Process
– Find union of calls >19bp from all deletion callsets and merge any regions if within 1000 bp (results in 28460 regions)
– Annotate each merged region with fraction covered by calls from each callset
– Split out those overlapping tandem repeats longer than 200bp by >25% (2715 regions)
• Helps mitigate different representations of calls in repetitive regions and imprecision of breakpoints from many callers
• Limitations
– May not appropriately call compound heterozygous SVs
– Ignores other types of SVs in the region
– Loses genotype information
[Figure: calls from Callset #1 and Callset #2]
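The merging in Step 1 can be sketched roughly as follows. This is a minimal illustration rather than the pipeline actually used: the tuple layout `(chrom, start, end, callset)`, the function names, and the reading of "within 1000 bp" as a gap of at most 1000 bp between sorted intervals are all assumptions.

```python
from collections import defaultdict


def merge_deletions(calls, max_gap=1000, min_size=20):
    """Merge deletion calls (chrom, start, end, callset) into regions.

    Calls shorter than min_size (i.e. not >19bp) are dropped; calls on the
    same chromosome separated by at most max_gap bp are merged (assumption).
    """
    kept = sorted(
        (c for c in calls if c[2] - c[1] >= min_size),
        key=lambda c: (c[0], c[1]),
    )
    regions = []
    for chrom, start, end, callset in kept:
        last = regions[-1] if regions else None
        if last and last["chrom"] == chrom and start - last["end"] <= max_gap:
            last["end"] = max(last["end"], end)
            last["calls"].append((start, end, callset))
        else:
            regions.append(
                {"chrom": chrom, "start": start, "end": end,
                 "calls": [(start, end, callset)]}
            )
    return regions


def covered_fraction(region):
    """Fraction of the merged region covered by calls from each callset."""
    length = region["end"] - region["start"]
    by_callset = defaultdict(list)
    for start, end, callset in region["calls"]:
        by_callset[callset].append((start, end))
    fractions = {}
    for callset, intervals in by_callset.items():
        covered, cursor = 0, region["start"]
        for start, end in sorted(intervals):
            start = max(start, cursor)  # do not count overlapping bases twice
            if end > start:
                covered += end - start
                cursor = end
        fractions[callset] = covered / length
    return fractions
```

As the limitations above note, a merged region of this kind keeps no genotype information and cannot represent two different deletions on the two haplotypes.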
Step 2: Find size prediction accuracy
• Find “size prediction accuracy” of each callset by calculating the difference from the median predicted size for regions with calls from >3 callers, and rank callers for <3kb and >3kb size ranges
Spiral 0.00%
Cortex 0.24%
CGSV 0.65%
AssemblyticsFalcon 0.79%
CGvcf 1.09%
fermikit 1.28%
smrtsvdip 1.43%
MetaSV 1.57%
MultibreakSV 1.62%
PBHoneySpots 2.13%
AssemblyticsMHAP 2.21%
ParliamentAssemblyForce 2.26%
CSHLassembly 2.29%
ParliamentPacBio 2.92%
ParliamentAssembly 3.00%
Spiral 0.04%
AssemblyticsFalcon 0.06%
CGSV 0.06%
CSHLassembly 0.08%
AssemblyticsMHAP 0.08%
MultibreakSV 0.10%
fermikit 0.11%
PBHoneyTails 0.38%
CommonLaw 0.48%
ParliamentPacBio 0.58%
smrtsvdip 0.62%
MetaSV 1.12%
sniffles 1.57%
Nabsys2tech01Force 3.02%
BioNano 3.67%
Size >3kb | Size <3kb
IMPORTANT NOTE: These stats are intended for integration and to help developers improve their methods, not to compare methods, since they likely do not reflect actual size prediction accuracy for all methods.
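One way the Step 2 ranking could be computed is sketched below. The slides do not say how the per-region differences are summarized, so using the median relative difference per callset is an assumption, as are the function name and the input layout (one dict of `callset -> predicted size` per region). Regions would be split into <3kb and >3kb groups before calling it.

```python
from collections import defaultdict
from statistics import median


def rank_callsets(regions, min_callers=4):
    """Rank callsets by how close their sizes are to the per-region median.

    regions: list of dicts mapping callset name -> predicted size (bp).
    Only regions with calls from >3 callers (min_callers=4) are used.
    Returns (callset, median relative difference) pairs, best first.
    """
    diffs = defaultdict(list)
    for sizes in regions:
        if len(sizes) < min_callers:
            continue
        mid = median(sizes.values())
        for callset, size in sizes.items():
            diffs[callset].append(abs(size - mid) / mid)
    # Summarizing by the median difference is an assumption of this sketch.
    summary = {callset: median(values) for callset, values in diffs.items()}
    return sorted(summary.items(), key=lambda item: item[1])
```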
Step 3: Find calls supported by 2 techs
1. Find calls supported by calls from 2 or more technologies with size prediction within 20%
2. Find sensitivity of each caller to these calls in size ranges 20-50, 50-100, 100-1000, 1000-3000, and >3000 bp
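The two parts of Step 3 can be sketched as below. The names and data layouts are hypothetical, and two details are assumptions: "within 20%" is measured relative to the size of the call selected for the region, and the size ranges are treated as half-open bins.

```python
from collections import defaultdict

# Size ranges from the slide, treated as half-open [low, high) bins (assumption).
SIZE_BINS = [(20, 50), (50, 100), (100, 1000), (1000, 3000), (3000, float("inf"))]


def supporting_technologies(selected_size, calls, tolerance=0.20):
    """Technologies with a call whose size is within 20% of the selected size.

    calls: list of (technology, predicted_size) in one candidate region.
    """
    return {
        tech for tech, size in calls
        if abs(size - selected_size) <= tolerance * selected_size
    }


def is_supported(selected_size, calls, min_techs=2):
    """True if at least min_techs technologies support the selected call."""
    return len(supporting_technologies(selected_size, calls)) >= min_techs


def size_bin(size):
    """Return the (low, high) bin containing size, or None if below 20 bp."""
    for low, high in SIZE_BINS:
        if low <= size < high:
            return (low, high)
    return None


def sensitivity_by_bin(supported_calls, callset):
    """Sensitivity of one callset to multi-technology calls, per size bin.

    supported_calls: list of (size, set of callsets that made the call).
    """
    found, total = defaultdict(int), defaultdict(int)
    for size, callsets in supported_calls:
        bin_key = size_bin(size)
        if bin_key is None:
            continue
        total[bin_key] += 1
        found[bin_key] += callset in callsets
    return {bin_key: found[bin_key] / total[bin_key] for bin_key in total}
```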
Step 4: Filter questionable calls supported by 2+ technologies
• 316 calls covered >25% by segmental duplication >10kb
• 631 calls with at least one caller predicting a size >2x different from the consensus size
• 34 calls where the callsets missing the call come from multiple technologies and the product of their (1 - sensitivity) values in this size tranche is < 2%
• 87 calls that overlap Ns in the reference
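The Step 4 filters could be combined along the following lines. This is a sketch under assumptions: the dictionary keys are hypothetical precomputed annotations, ">2x different" is read as a size ratio above 2 in either direction, and the (1 - sensitivity) product is taken over every callset that missed the call once those callsets span at least two technologies.

```python
from math import prod


def is_questionable(call, miss_threshold=0.02):
    """Return True if a multi-technology call should be filtered.

    call is a dict with hypothetical keys:
      segdup_fraction: fraction covered by segmental duplications >10kb
      consensus_size:  consensus predicted size (bp, > 0)
      sizes:           predicted sizes from each supporting caller (bp, > 0)
      missing:         (technology, sensitivity in this size tranche) for
                       each callset that did NOT make the call
      overlaps_n:      True if the call overlaps Ns in the reference
    """
    in_segdup = call["segdup_fraction"] > 0.25

    consensus = call["consensus_size"]
    size_outlier = any(
        max(size, consensus) / min(size, consensus) > 2 for size in call["sizes"]
    )

    # If sensitive callsets from several technologies all missed the call,
    # the chance of that happening for a true call is the product of their
    # (1 - sensitivity) values; a very small product makes the call suspect.
    missing = call["missing"]
    multi_tech = len({tech for tech, _ in missing}) >= 2
    unlikely_miss = multi_tech and prod(1 - s for _, s in missing) < miss_threshold

    return in_segdup or size_outlier or unlikely_miss or call["overlaps_n"]
```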
Overview of process
1. Merge deletions within 1kb
2. Rank calls by closeness of predicted size to median size and select call in each region from best callset
3. Find calls supported by 2+ technologies with size within 20%
4. Filter calls overlapping seg dups, reference N’s, or with call with predicted size 2x larger
Number of Calls Supported by 2 Technologies by Size Range
Calls | <50bp | 50-100bp | 100-1000bp | 1kb-3kb | >3kb
pre-filtered | 2542 | 1567 | 2447 | 731 | 730
filtered | 2427 | 1415 | 2207 | 638 | 524
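The effect of filtering can be read off these counts by computing the fraction of calls retained in each size range. The short script below does only that arithmetic; the variable and function names are illustrative.

```python
SIZE_RANGES = ["<50bp", "50-100bp", "100-1000bp", "1kb-3kb", ">3kb"]
PRE_FILTERED = [2542, 1567, 2447, 731, 730]
FILTERED = [2427, 1415, 2207, 638, 524]


def retention_by_range(pre, post):
    """Fraction of calls kept after filtering, one value per size range."""
    return [kept / total for kept, total in zip(post, pre)]


for label, fraction in zip(SIZE_RANGES, retention_by_range(PRE_FILTERED, FILTERED)):
    print(f"{label:>10}: {fraction:.1%} retained")
print(f"   overall: {sum(FILTERED)} of {sum(PRE_FILTERED)} calls retained")
```

Retention is lowest in the >3kb range and highest for calls under 50bp.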
Size distributions
Support for all candidate regions (by # of callsets and # of technologies)
Support for benchmark calls (by # of callsets and # of technologies)
Approach #2: svcompare (NCBI hackathon)
Builds on SURVIVOR
• Compares each new callset to the first and adds new calls not within 1kb of existing calls
• Outputs multi-sample VCF with type, size, and breakpoints from each callset in each candidate region
• Integrates multiple types, but doesn’t currently output size of insertions or exact sequence
• Developed by Fritz Sedlazeck, JHU
Output stats
• 130k input regions from calls >19bp
• 876 regions have >1 type within a callset
• 2276 regions have >1 type across callsets
• How to integrate discordant types in same region?
https://github.com/NCBI-Hackathons/svcompare
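The incremental comparison described above can be illustrated with a small sketch. This is not the svcompare or SURVIVOR code: the function names, the call layout `(chrom, start, svtype, size)`, and the use of start-position distance to decide "within 1kb" are assumptions made for illustration.

```python
def build_candidates(callsets, max_dist=1000):
    """Grow a candidate list callset by callset.

    callsets: dict mapping callset name -> list of (chrom, start, svtype, size),
    in the order the callsets should be compared. A call within max_dist bp of
    an existing candidate is attached to it; otherwise it starts a new one.
    """
    candidates = []
    for name, calls in callsets.items():
        for chrom, start, svtype, size in calls:
            match = next(
                (c for c in candidates
                 if c["chrom"] == chrom and abs(c["start"] - start) <= max_dist),
                None,
            )
            if match is None:
                match = {"chrom": chrom, "start": start, "calls": {}}
                candidates.append(match)
            match["calls"].setdefault(name, []).append((svtype, size, start))
    return candidates


def type_discordance(candidate):
    """Return (within, across): >1 SV type within a callset / across callsets."""
    within = any(
        len({svtype for svtype, _, _ in calls}) > 1
        for calls in candidate["calls"].values()
    )
    all_types = {
        svtype for calls in candidate["calls"].values() for svtype, _, _ in calls
    }
    return within, len(all_types) > 1
```

Counting candidates where `type_discordance` returns `True` in either position gives the kind of tallies listed under Output stats, and those are the regions the open question about discordant types refers to.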
Example start position distance from median start by callset (400-1000bp)
Approach #3: “Type” candidate calls in each dataset
svviz
• Looks for whether reads support REF or ALT allele
– Can often easily infer genotype
• Also generates other stats about mapping reads
• Generates visualization of mapped reads as well
• Nabsys has developed a similar approach for their mapping data
Compatible datasets
• PacBio
• Illumina 150bp and 250bp paired end
• Illumina 6kb mate-pair
• 10X haplotype-separated
10X SV analyses with svviz
• Find reads supporting ref and alt alleles in each haplotype
• Verify support for ref and alt is on different haplotypes for hets
• Verify support from both haplotypes for confident hom var or hom ref call
[Figure panels: Son, Dad, Mom]
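The haplotype checks above amount to a small decision rule once ref- and alt-supporting read counts are available for each haplotype. The sketch below assumes those counts were already produced by a tool such as svviz; it does not call svviz itself, and the function names and thresholds (`min_reads`, `min_fraction`) are illustrative assumptions.

```python
def haplotype_allele(ref_reads, alt_reads, min_reads=3, min_fraction=0.8):
    """Decide which allele one haplotype carries, or None if unclear."""
    total = ref_reads + alt_reads
    if total < min_reads:
        return None
    if alt_reads / total >= min_fraction:
        return "alt"
    if ref_reads / total >= min_fraction:
        return "ref"
    return None


def genotype_from_haplotypes(hap1, hap2):
    """Genotype from (ref_reads, alt_reads) counts on each haplotype.

    A het needs ref and alt support on different haplotypes; a hom var or
    hom ref call needs consistent support from both haplotypes.
    """
    allele1 = haplotype_allele(*hap1)
    allele2 = haplotype_allele(*hap2)
    if allele1 is None or allele2 is None:
        return "uncertain"
    if allele1 == allele2:
        return "hom_var" if allele1 == "alt" else "hom_ref"
    return "het"
```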
Goals for Data Jamboree
Share progress in algorithm development
• New technologies
• New analysis methods
• Visualization methods
• Integration/comparison methods
Outstanding questions to discuss
• Integration
– How to form high-confidence calls, breakpoints, and genotypes from multiple calls?
– What is the minimum viable product for a practical benchmark set?
• Is this a good criterion: “When an individual callset is compared to ours, most FPs/FNs should be errors in the individual callset”?
– How to handle non-deletions?
• SV typing
• Future work
– How to form high-confidence regions?
– SV phasing
– Is anyone developing SV benchmarking tools?
Things to resolve
Integration
• How to compare events with variable breakpoints across callsets?
– Tandem repeats
• How to compare non-deletions?
– Start with insertions?
• Distinguish precise breakpoints when possible
Typing
• Leverage long-range information to type with short reads?
• How to deal with imprecise breakpoints?
• At what point is something validated?
– Potentially high-confidence variants (or reference?)
– Haplotype-separated
Acknowledgements
• NIST
– Marc Salit
– Jenny McDaniel
– Lindsay Vang
– David Catoe
– Hemang Parikh
• Genome in a Bottle Consortium
• GA4GH Benchmarking Team
• FDA
– Liz Mansfield
• SV Callset Contributors
– CSHL/JHU
– Mt Sinai
– 10X
– Nabsys
– Spiral Genetics/Stanford
– Heng Li/Mike Lin
– DNAnexus
– Complete Genomics
– Baylor
– Bina/Roche
– BioNano Genomics
– Mark Chaisson
– NIH/NCBI
– NIH/NHGRI
– Can Alkan/Stanford

More Related Content

What's hot

NGS Targeted Enrichment Technology in Cancer Research: NGS Tech Overview Webi...
NGS Targeted Enrichment Technology in Cancer Research: NGS Tech Overview Webi...NGS Targeted Enrichment Technology in Cancer Research: NGS Tech Overview Webi...
NGS Targeted Enrichment Technology in Cancer Research: NGS Tech Overview Webi...
QIAGEN
 
Characterizing Alzheimer’s Disease candidate genes and transcripts with targe...
Characterizing Alzheimer’s Disease candidate genes and transcripts with targe...Characterizing Alzheimer’s Disease candidate genes and transcripts with targe...
Characterizing Alzheimer’s Disease candidate genes and transcripts with targe...
Integrated DNA Technologies
 
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
GenomeInABottle
 
Digital RNAseq for Gene Expression Profiling: Digital RNAseq Webinar Part 2
Digital RNAseq for Gene Expression Profiling: Digital RNAseq Webinar Part 2Digital RNAseq for Gene Expression Profiling: Digital RNAseq Webinar Part 2
Digital RNAseq for Gene Expression Profiling: Digital RNAseq Webinar Part 2
QIAGEN
 
How giab fits in the rest of the world seqc2 tumor normal
How giab fits in the rest of the world   seqc2 tumor normalHow giab fits in the rest of the world   seqc2 tumor normal
How giab fits in the rest of the world seqc2 tumor normal
GenomeInABottle
 
GIAB Sep2016 Lightning chen sun varmatch
GIAB Sep2016 Lightning chen sun varmatchGIAB Sep2016 Lightning chen sun varmatch
GIAB Sep2016 Lightning chen sun varmatch
GenomeInABottle
 
GIAB and long reads for bio it world 190417
GIAB and long reads for bio it world 190417GIAB and long reads for bio it world 190417
GIAB and long reads for bio it world 190417
GenomeInABottle
 
Giab agbt small_var_2019
Giab agbt small_var_2019Giab agbt small_var_2019
Giab agbt small_var_2019
GenomeInABottle
 
GIAB Sep2016 Lightning megan cleveland targeted seq
GIAB Sep2016 Lightning megan cleveland targeted seqGIAB Sep2016 Lightning megan cleveland targeted seq
GIAB Sep2016 Lightning megan cleveland targeted seq
GenomeInABottle
 
Jan2016 pac bio giab
Jan2016 pac bio giabJan2016 pac bio giab
Jan2016 pac bio giab
GenomeInABottle
 
Discovery and annotation of variants by exome analysis using NGS
Discovery and annotation of variants by exome analysis using NGSDiscovery and annotation of variants by exome analysis using NGS
Discovery and annotation of variants by exome analysis using NGS
cursoNGS
 
How to cluster and sequence an ngs library (james hadfield160416)
How to cluster and sequence an ngs library (james hadfield160416)How to cluster and sequence an ngs library (james hadfield160416)
How to cluster and sequence an ngs library (james hadfield160416)
James Hadfield
 
rnaseq_from_babelomics
rnaseq_from_babelomicsrnaseq_from_babelomics
rnaseq_from_babelomics
Francisco Garc
 
Aug2015 analysis team spiral genetics
Aug2015 analysis team spiral geneticsAug2015 analysis team spiral genetics
Aug2015 analysis team spiral genetics
GenomeInABottle
 
Sequence assembly
Sequence assemblySequence assembly
Sequence assembly
Ramya P
 
140127 platinum genomes pedigree analyses
140127 platinum genomes pedigree analyses140127 platinum genomes pedigree analyses
140127 platinum genomes pedigree analyses
GenomeInABottle
 
Jan2016 bio nano han cao
Jan2016 bio nano han caoJan2016 bio nano han cao
Jan2016 bio nano han cao
GenomeInABottle
 
SNP genotyping on qPCR platforms: Troubleshooting for amplification and clust...
SNP genotyping on qPCR platforms: Troubleshooting for amplification and clust...SNP genotyping on qPCR platforms: Troubleshooting for amplification and clust...
SNP genotyping on qPCR platforms: Troubleshooting for amplification and clust...
Integrated DNA Technologies
 
Getting the most from the reference assembly
Getting the most from the reference assemblyGetting the most from the reference assembly
Getting the most from the reference assembly
Genome Reference Consortium
 
Target capture of DNA from FFPE samples— recommendations for generating robus...
Target capture of DNA from FFPE samples— recommendations for generating robus...Target capture of DNA from FFPE samples— recommendations for generating robus...
Target capture of DNA from FFPE samples— recommendations for generating robus...
Integrated DNA Technologies
 

What's hot (20)

NGS Targeted Enrichment Technology in Cancer Research: NGS Tech Overview Webi...
NGS Targeted Enrichment Technology in Cancer Research: NGS Tech Overview Webi...NGS Targeted Enrichment Technology in Cancer Research: NGS Tech Overview Webi...
NGS Targeted Enrichment Technology in Cancer Research: NGS Tech Overview Webi...
 
Characterizing Alzheimer’s Disease candidate genes and transcripts with targe...
Characterizing Alzheimer’s Disease candidate genes and transcripts with targe...Characterizing Alzheimer’s Disease candidate genes and transcripts with targe...
Characterizing Alzheimer’s Disease candidate genes and transcripts with targe...
 
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
 
Digital RNAseq for Gene Expression Profiling: Digital RNAseq Webinar Part 2
Digital RNAseq for Gene Expression Profiling: Digital RNAseq Webinar Part 2Digital RNAseq for Gene Expression Profiling: Digital RNAseq Webinar Part 2
Digital RNAseq for Gene Expression Profiling: Digital RNAseq Webinar Part 2
 
How giab fits in the rest of the world seqc2 tumor normal
How giab fits in the rest of the world   seqc2 tumor normalHow giab fits in the rest of the world   seqc2 tumor normal
How giab fits in the rest of the world seqc2 tumor normal
 
GIAB Sep2016 Lightning chen sun varmatch
GIAB Sep2016 Lightning chen sun varmatchGIAB Sep2016 Lightning chen sun varmatch
GIAB Sep2016 Lightning chen sun varmatch
 
GIAB and long reads for bio it world 190417
GIAB and long reads for bio it world 190417GIAB and long reads for bio it world 190417
GIAB and long reads for bio it world 190417
 
Giab agbt small_var_2019
Giab agbt small_var_2019Giab agbt small_var_2019
Giab agbt small_var_2019
 
GIAB Sep2016 Lightning megan cleveland targeted seq
GIAB Sep2016 Lightning megan cleveland targeted seqGIAB Sep2016 Lightning megan cleveland targeted seq
GIAB Sep2016 Lightning megan cleveland targeted seq
 
Jan2016 pac bio giab
Jan2016 pac bio giabJan2016 pac bio giab
Jan2016 pac bio giab
 
Discovery and annotation of variants by exome analysis using NGS
Discovery and annotation of variants by exome analysis using NGSDiscovery and annotation of variants by exome analysis using NGS
Discovery and annotation of variants by exome analysis using NGS
 
How to cluster and sequence an ngs library (james hadfield160416)
How to cluster and sequence an ngs library (james hadfield160416)How to cluster and sequence an ngs library (james hadfield160416)
How to cluster and sequence an ngs library (james hadfield160416)
 
rnaseq_from_babelomics
rnaseq_from_babelomicsrnaseq_from_babelomics
rnaseq_from_babelomics
 
Aug2015 analysis team spiral genetics
Aug2015 analysis team spiral geneticsAug2015 analysis team spiral genetics
Aug2015 analysis team spiral genetics
 
Sequence assembly
Sequence assemblySequence assembly
Sequence assembly
 
140127 platinum genomes pedigree analyses
140127 platinum genomes pedigree analyses140127 platinum genomes pedigree analyses
140127 platinum genomes pedigree analyses
 
Jan2016 bio nano han cao
Jan2016 bio nano han caoJan2016 bio nano han cao
Jan2016 bio nano han cao
 
SNP genotyping on qPCR platforms: Troubleshooting for amplification and clust...
SNP genotyping on qPCR platforms: Troubleshooting for amplification and clust...SNP genotyping on qPCR platforms: Troubleshooting for amplification and clust...
SNP genotyping on qPCR platforms: Troubleshooting for amplification and clust...
 
Getting the most from the reference assembly
Getting the most from the reference assemblyGetting the most from the reference assembly
Getting the most from the reference assembly
 
Target capture of DNA from FFPE samples— recommendations for generating robus...
Target capture of DNA from FFPE samples— recommendations for generating robus...Target capture of DNA from FFPE samples— recommendations for generating robus...
Target capture of DNA from FFPE samples— recommendations for generating robus...
 

Viewers also liked

Sept2016 plenary goldfeder_clinical_accuracy
Sept2016 plenary goldfeder_clinical_accuracySept2016 plenary goldfeder_clinical_accuracy
Sept2016 plenary goldfeder_clinical_accuracy
GenomeInABottle
 
Sept2016 plenary nist_intro
Sept2016 plenary nist_introSept2016 plenary nist_intro
Sept2016 plenary nist_intro
GenomeInABottle
 
SeRC: de novo assembly workshop. Francesco Vezzi
SeRC: de novo assembly workshop. Francesco VezziSeRC: de novo assembly workshop. Francesco Vezzi
SeRC: de novo assembly workshop. Francesco Vezzi
Francesco Vezzi
 
Aug2015 steve lincoln analytical validation
Aug2015 steve lincoln analytical validationAug2015 steve lincoln analytical validation
Aug2015 steve lincoln analytical validation
GenomeInABottle
 
Integration of single molecule, genome mapping data in a web-based genome bro...
Integration of single molecule, genome mapping data in a web-based genome bro...Integration of single molecule, genome mapping data in a web-based genome bro...
Integration of single molecule, genome mapping data in a web-based genome bro...
William Chow
 
161115 precision fda giab
161115 precision fda giab161115 precision fda giab
161115 precision fda giab
GenomeInABottle
 

Viewers also liked (7)

Anne geddes ildy
Anne geddes ildyAnne geddes ildy
Anne geddes ildy
 
Sept2016 plenary goldfeder_clinical_accuracy
Sept2016 plenary goldfeder_clinical_accuracySept2016 plenary goldfeder_clinical_accuracy
Sept2016 plenary goldfeder_clinical_accuracy
 
Sept2016 plenary nist_intro
Sept2016 plenary nist_introSept2016 plenary nist_intro
Sept2016 plenary nist_intro
 
SeRC: de novo assembly workshop. Francesco Vezzi
SeRC: de novo assembly workshop. Francesco VezziSeRC: de novo assembly workshop. Francesco Vezzi
SeRC: de novo assembly workshop. Francesco Vezzi
 
Aug2015 steve lincoln analytical validation
Aug2015 steve lincoln analytical validationAug2015 steve lincoln analytical validation
Aug2015 steve lincoln analytical validation
 
Integration of single molecule, genome mapping data in a web-based genome bro...
Integration of single molecule, genome mapping data in a web-based genome bro...Integration of single molecule, genome mapping data in a web-based genome bro...
Integration of single molecule, genome mapping data in a web-based genome bro...
 
161115 precision fda giab
161115 precision fda giab161115 precision fda giab
161115 precision fda giab
 

Similar to Sept2016 sv nist_intro

160627 giab for festival sv workshop
160627 giab for festival sv workshop160627 giab for festival sv workshop
160627 giab for festival sv workshop
GenomeInABottle
 
Genome in a bottle for amp GeT-RM 181030
Genome in a bottle for amp GeT-RM 181030Genome in a bottle for amp GeT-RM 181030
Genome in a bottle for amp GeT-RM 181030
GenomeInABottle
 
Genome in a bottle for ashg grc giab workshop 181016
Genome in a bottle for ashg grc giab workshop 181016Genome in a bottle for ashg grc giab workshop 181016
Genome in a bottle for ashg grc giab workshop 181016
GenomeInABottle
 
Fruit breedomics workshop wp6 from marker assisted breeding to genomics assis...
Fruit breedomics workshop wp6 from marker assisted breeding to genomics assis...Fruit breedomics workshop wp6 from marker assisted breeding to genomics assis...
Fruit breedomics workshop wp6 from marker assisted breeding to genomics assis...
fruitbreedomics
 
GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511
GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511
GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511
GenomeInABottle
 
Aug2013 illumina platinum genomes
Aug2013 illumina platinum genomesAug2013 illumina platinum genomes
Aug2013 illumina platinum genomes
GenomeInABottle
 
GIAB for AMP GeT-RM Forum
GIAB for AMP GeT-RM ForumGIAB for AMP GeT-RM Forum
GIAB for AMP GeT-RM Forum
GenomeInABottle
 
Aug2014 abrf interlaboratory study plans
Aug2014 abrf interlaboratory study plansAug2014 abrf interlaboratory study plans
Aug2014 abrf interlaboratory study plans
GenomeInABottle
 
20150601 bio sb_assembly_course
20150601 bio sb_assembly_course20150601 bio sb_assembly_course
20150601 bio sb_assembly_course
hansjansen9999
 
Massively Parallel Sequencing - integrating the Ion PGM™ sequencer into your ...
Massively Parallel Sequencing - integrating the Ion PGM™ sequencer into your ...Massively Parallel Sequencing - integrating the Ion PGM™ sequencer into your ...
Massively Parallel Sequencing - integrating the Ion PGM™ sequencer into your ...
Thermo Fisher Scientific
 
NAIMA method
NAIMA methodNAIMA method
NAIMA method
dandandany
 
Sept2016 sv 10_x
Sept2016 sv 10_xSept2016 sv 10_x
Sept2016 sv 10_x
GenomeInABottle
 
Hong_Celine_ES_workshop.pptx
Hong_Celine_ES_workshop.pptxHong_Celine_ES_workshop.pptx
Next Generation Diagnostics: Potential Clinical Applications of Illumina’sTec...
Next Generation Diagnostics: Potential Clinical Applications of Illumina’sTec...Next Generation Diagnostics: Potential Clinical Applications of Illumina’sTec...
Next Generation Diagnostics: Potential Clinical Applications of Illumina’sTec...
Ilya Klabukov
 
Ngs introduction
Ngs introductionNgs introduction
Ngs introduction
Alagar Suresh
 
Giab agbt small_var_2020
Giab agbt small_var_2020Giab agbt small_var_2020
Giab agbt small_var_2020
GenomeInABottle
 
Expanding Your Research Capabilities Using Targeted NGS
Expanding Your Research Capabilities Using Targeted NGSExpanding Your Research Capabilities Using Targeted NGS
Expanding Your Research Capabilities Using Targeted NGS
Integrated DNA Technologies
 
BioSB meeting 2015
BioSB meeting 2015BioSB meeting 2015
BioSB meeting 2015
hansjansen9999
 
Evaluation of the impact of error correction algorithms on SNP calling.
Evaluation of the impact of error correction algorithms on SNP calling.Evaluation of the impact of error correction algorithms on SNP calling.
Evaluation of the impact of error correction algorithms on SNP calling.
Nathan Olson
 
Cufflinks
CufflinksCufflinks
Cufflinks
Ravi Gandham
 

Similar to Sept2016 sv nist_intro (20)

160627 giab for festival sv workshop
160627 giab for festival sv workshop160627 giab for festival sv workshop
160627 giab for festival sv workshop
 
Genome in a bottle for amp GeT-RM 181030
Genome in a bottle for amp GeT-RM 181030Genome in a bottle for amp GeT-RM 181030
Genome in a bottle for amp GeT-RM 181030
 
Genome in a bottle for ashg grc giab workshop 181016
Genome in a bottle for ashg grc giab workshop 181016Genome in a bottle for ashg grc giab workshop 181016
Genome in a bottle for ashg grc giab workshop 181016
 
Fruit breedomics workshop wp6 from marker assisted breeding to genomics assis...
Fruit breedomics workshop wp6 from marker assisted breeding to genomics assis...Fruit breedomics workshop wp6 from marker assisted breeding to genomics assis...
Fruit breedomics workshop wp6 from marker assisted breeding to genomics assis...
 
GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511
GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511
GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511
 
Aug2013 illumina platinum genomes
Aug2013 illumina platinum genomesAug2013 illumina platinum genomes
Aug2013 illumina platinum genomes
 
GIAB for AMP GeT-RM Forum
GIAB for AMP GeT-RM ForumGIAB for AMP GeT-RM Forum
GIAB for AMP GeT-RM Forum
 
Aug2014 abrf interlaboratory study plans
Aug2014 abrf interlaboratory study plansAug2014 abrf interlaboratory study plans
Aug2014 abrf interlaboratory study plans
 
20150601 bio sb_assembly_course
20150601 bio sb_assembly_course20150601 bio sb_assembly_course
20150601 bio sb_assembly_course
 
Massively Parallel Sequencing - integrating the Ion PGM™ sequencer into your ...
Massively Parallel Sequencing - integrating the Ion PGM™ sequencer into your ...Massively Parallel Sequencing - integrating the Ion PGM™ sequencer into your ...
Massively Parallel Sequencing - integrating the Ion PGM™ sequencer into your ...
 
NAIMA method
NAIMA methodNAIMA method
NAIMA method
 
Sept2016 sv 10_x
Sept2016 sv 10_xSept2016 sv 10_x
Sept2016 sv 10_x
 
Hong_Celine_ES_workshop.pptx
Hong_Celine_ES_workshop.pptxHong_Celine_ES_workshop.pptx
Hong_Celine_ES_workshop.pptx
 
Next Generation Diagnostics: Potential Clinical Applications of Illumina’sTec...
Next Generation Diagnostics: Potential Clinical Applications of Illumina’sTec...Next Generation Diagnostics: Potential Clinical Applications of Illumina’sTec...
Next Generation Diagnostics: Potential Clinical Applications of Illumina’sTec...
 
Ngs introduction
Ngs introductionNgs introduction
Ngs introduction
 
Giab agbt small_var_2020
Giab agbt small_var_2020Giab agbt small_var_2020
Giab agbt small_var_2020
 
Expanding Your Research Capabilities Using Targeted NGS
Expanding Your Research Capabilities Using Targeted NGSExpanding Your Research Capabilities Using Targeted NGS
Expanding Your Research Capabilities Using Targeted NGS
 
BioSB meeting 2015
BioSB meeting 2015BioSB meeting 2015
BioSB meeting 2015
 
Evaluation of the impact of error correction algorithms on SNP calling.
Evaluation of the impact of error correction algorithms on SNP calling.Evaluation of the impact of error correction algorithms on SNP calling.
Evaluation of the impact of error correction algorithms on SNP calling.
 
Cufflinks
CufflinksCufflinks
Cufflinks
 

More from GenomeInABottle

2023 GIAB AMP Update
2023 GIAB AMP Update2023 GIAB AMP Update
2023 GIAB AMP Update
GenomeInABottle
 
GIAB Tumor Normal ASHG 2023
GIAB Tumor Normal ASHG 2023GIAB Tumor Normal ASHG 2023
GIAB Tumor Normal ASHG 2023
GenomeInABottle
 
Stratomod ASHG 2023
Stratomod ASHG 2023Stratomod ASHG 2023
Stratomod ASHG 2023
GenomeInABottle
 
GIAB_ASHG_JZook_2023.pdf
GIAB_ASHG_JZook_2023.pdfGIAB_ASHG_JZook_2023.pdf
GIAB_ASHG_JZook_2023.pdf
GenomeInABottle
 
Using accurate long reads to improve Genome in a Bottle Benchmarks 220923
Using accurate long reads to improve Genome in a Bottle Benchmarks 220923Using accurate long reads to improve Genome in a Bottle Benchmarks 220923
Using accurate long reads to improve Genome in a Bottle Benchmarks 220923
GenomeInABottle
 
Benchmarking with GIAB 220907
Benchmarking with GIAB 220907Benchmarking with GIAB 220907
Benchmarking with GIAB 220907
GenomeInABottle
 
Genome in a Bottle- reference materials to benchmark challenging variants and...
Genome in a Bottle- reference materials to benchmark challenging variants and...Genome in a Bottle- reference materials to benchmark challenging variants and...
Genome in a Bottle- reference materials to benchmark challenging variants and...
GenomeInABottle
 
GIAB Technical Germline Benchmark roadmap discussion




Sept2016 sv nist_intro

  • 1. SV Data Jamboree Justin Zook and Ali Bashir With the Genome in a Bottle Consortium September 15, 2016
  • 2. Sequencing technologies and bioinformatics pipelines disagree O’Rawe et al. Genome Medicine 2013, 5:28
  • 4. Candidate NIST Reference Materials
      Genome | PGP ID | Coriell ID | NIST ID | NIST RM #
      CEPH Mother/Daughter | N/A | GM12878 | HG001 | RM8398
      AJ Son | huAA53E0 | GM24385 | HG002 | RM8391 (son)/RM8392 (trio)
      AJ Father | hu6E4515 | GM24149 | HG003 | RM8392 (trio)
      AJ Mother | hu8E87A9 | GM24143 | HG004 | RM8392 (trio)
      Asian Son | hu91BD69 | GM24631 | HG005 | RM8393
      Asian Father | huCA017E | GM24694 | N/A | N/A
      Asian Mother | hu38168C | GM24695 | N/A | N/A
  • 5. Data for GIAB PGP Trios
      Dataset | Characteristics | Coverage | Availability | Most useful for…
      Illumina Paired-end WGS | 150x150bp; 250x250bp | ~300x/individual; ~50x/individual | on SRA/FTP | SNPs/indels/some SVs
      Complete Genomics | | 100x/individual | on SRA/FTP | SNPs/indels/some SVs
      SOLiD 5500W WGS | 50bp single end | 70x/son | on FTP | SNPs
      Illumina Paired-end WES | 100x100bp | ~300x/individual | on SRA/FTP | SNPs/indels in exome
      Ion Proton Exome | | 1000x/individual | on SRA/FTP | SNPs/indels in exome
      Illumina Mate pair | ~6000 bp insert | ~30x/individual | on FTP | SVs
      Illumina “moleculo” | Custom library | ~30x by long fragments | on FTP | SVs/phasing/assembly
      Complete Genomics LFR | | 100x/individual | on SRA/FTP | SNPs/indels/phasing
      10X | Pseudo-long reads | 30-45x/individual | on FTP | SVs/phasing/assembly
      PacBio | ~10kb reads | ~70x on AJ son, ~30x on each AJ parent | on SRA/FTP | SVs/phasing/assembly/STRs
      Oxford Nanopore | 5.8kb 2D reads | 0.02x on AJ son | on FTP | SVs/assembly
      Nabsys 2.0 | ~100kbp N50 nanopore maps | 70x on AJ son | | SVs/assembly
      BioNano Genomics | 200-250kbp optical map reads | ~100x/AJ individual; 57x on Asian son | on FTP | SVs/assembly
  • 7. Paper describing data… 51 authors; 14 institutions; 12 datasets; 7 genomes. Data described in ISA-tab
  • 8. Integration Methods to Establish Benchmark Small Variant Calls
      • Candidate variants
      • Concordant variants
      • Find characteristics of bias
      • Arbitrate using evidence of bias
      • Confidence Level
      Zook et al., Nature Biotechnology, 2014.
  • 9. How can we extend this approach to SVs?
      Similarities to small variants
      • Collect callsets from multiple technologies
      • Compare callsets to find calls supported by multiple technologies
      Differences from small variants
      • Callsets generally are not sufficiently sensitive to assume that regions without calls are homozygous reference
        – SVs of different types/sizes are not always detected easily
      • Variants are often imprecisely characterized
        – breakpoints, size, type, etc.
      • Representation of variants is poorly standardized, especially when complex
      • Comparison tools in infancy
  • 10. Callsets Contributed so far
      Short reads
      • Illumina
        – Spiral Genetics
        – cortex
        – Commonlaw
        – MetaSV
      • Complete Genomics
      • CG-SV
      • CG-CNV
      • CG-vcfBeta
      Long reads and Linked reads
      • PacBio
      • CSHL-assembly
      • Sniffles
      • PBHoney-spots and -tails
      • Parliament/pacbio
      • Parliament/assembly
      • MultibreakSV
      • smrt-sv.dip
      • Assemblytics-Falcon and -MHAP
      • NHGRI assembly-based
      • Nanopore mapping
      • Nabsys force calls
      • optical mapping
      • BioNano with and without haplotype-aware assembly
      • 10X Genomics Chromium
      • Deletions
      • Large SVs
  • 11. AJ Trio Assemblies
      On FTP
      • PacBio
        – Falcon
        – Canu
      • BioNano
        – Haploid
        – Diploid
      In Process
      • Illumina
        – DISCOVAR – contig N50 ~100k
      • PacBio
        – Falcon diploid in process
      • Dovetail scaffolding
        – With PacBio-Falcon
        – With PacBio-Canu
        – With DISCOVAR
      • 10X?
        – By itself
        – Phasing PacBio
  • 12. APPROACH #1: FIND DELETIONS WITH SUPPORT FROM MULTIPLE TECHS AND CONCORDANT BREAKPOINTS
  • 13. Step 1: Merging calls
      • Process
        – Find union of calls >19bp from all deletion callsets and merge any regions if within 1000 bp (results in 28460 regions)
        – Annotate each merged region with fraction covered by calls from each callset
        – Split out those overlapping tandem repeats longer than 200bp by >25% (2715 regions)
      • Helps mitigate different representations of calls in repetitive regions and imprecision of breakpoints from many callers
      • Limitations
        – May not appropriately call compound heterozygous SVs
        – Ignores other types of SVs in the region
        – Loses genotype information
      (Figure labels: Callset #1, Callset #2)
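The merging process on slide 13 can be sketched roughly as follows. This is only an illustration, not the consortium's script: the tuple layout `(chrom, start, end, callset)`, the function names `merge_calls` and `fraction_covered`, and the reading of "within 1000 bp" as a gap of at most 1000 bp between sorted intervals are all assumptions.

```python
from collections import defaultdict


def merge_calls(calls, min_size=20, max_gap=1000):
    """Merge deletion calls (chrom, start, end, callset) into candidate regions.

    Assumed reading of the slide: keep calls >19 bp, then merge any calls on
    the same chromosome whose gap to the running region is <= max_gap.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, callset in calls:
        if end - start >= min_size:  # calls >19 bp
            by_chrom[chrom].append((start, end, callset))

    regions = []
    for chrom in sorted(by_chrom):
        current = None
        for start, end, callset in sorted(by_chrom[chrom]):
            if current is not None and start - current["end"] <= max_gap:
                # Extend the running region and remember the contributing call
                current["end"] = max(current["end"], end)
                current["calls"].append((start, end, callset))
            else:
                if current is not None:
                    regions.append(current)
                current = {"chrom": chrom, "start": start, "end": end,
                           "calls": [(start, end, callset)]}
        if current is not None:
            regions.append(current)
    return regions


def fraction_covered(region, callset):
    """Fraction of a merged region covered by calls from one callset."""
    length = region["end"] - region["start"]
    covered, last_end = 0, region["start"]
    for start, end in sorted((s, e) for s, e, c in region["calls"] if c == callset):
        start = max(start, last_end)  # avoid double-counting overlapping calls
        if end > start:
            covered += end - start
            last_end = end
    return covered / length
```

Annotating each region with `fraction_covered` for every callset gives the per-callset coverage annotation the slide describes; the tandem-repeat split would need a separate repeat annotation that is not sketched here.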
  • 14. Step 2: Find size prediction accuracy
      • Find “size prediction accuracy” of each callset by calculating the difference from the median predicted size for regions with calls from >3 callers, and rank callers for <3kb and >3kb size ranges
      First ranked list: Spiral 0.00%, Cortex 0.24%, CGSV 0.65%, AssemblyticsFalcon 0.79%, CGvcf 1.09%, fermikit 1.28%, smrtsvdip 1.43%, MetaSV 1.57%, MultibreakSV 1.62%, PBHoneySpots 2.13%, AssemblyticsMHAP 2.21%, ParliamentAssemblyForce 2.26%, CSHLassembly 2.29%, ParliamentPacBio 2.92%, ParliamentAssembly 3.00%
      Second ranked list: Spiral 0.04%, AssemblyticsFalcon 0.06%, CGSV 0.06%, CSHLassembly 0.08%, AssemblyticsMHAP 0.08%, MultibreakSV 0.10%, fermikit 0.11%, PBHoneyTails 0.38%, CommonLaw 0.48%, ParliamentPacBio 0.58%, smrtsvdip 0.62%, MetaSV 1.12%, sniffles 1.57%, Nabsys2tech01Force 3.02%, BioNano 3.67%
      (Panel labels: Size >3kb, Size <3kb)
      IMPORTANT NOTE: These stats are intended for integration and to help developers improve their methods, not to compare methods, since they likely do not reflect actual size prediction accuracy for all methods.
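One way to compute a per-callset size statistic like the one on slide 14 is sketched below. The slide does not say how the differences are normalized or summarized, so the relative difference from the region median and the per-callset median summary are assumptions, as is the name `size_accuracy`.

```python
from collections import defaultdict
from statistics import median


def size_accuracy(regions, min_callers=4):
    """Rank callsets by how far their predicted sizes sit from the region median.

    regions: list of dicts mapping callset name -> predicted deletion size (bp).
    Only regions with calls from >3 callers are used, as on the slide.
    """
    diffs = defaultdict(list)
    for sizes in regions:
        if len(sizes) < min_callers:
            continue
        consensus = median(sizes.values())
        for callset, size in sizes.items():
            # Assumption: relative absolute difference from the median size
            diffs[callset].append(abs(size - consensus) / consensus)
    # Assumption: summarise each callset by its median relative difference
    scores = {callset: median(values) for callset, values in diffs.items()}
    return sorted(scores.items(), key=lambda item: item[1])
```

To reproduce the two rankings, the same function would be called once on regions below 3 kb and once on regions above 3 kb.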
  • 15. Step 3: Find calls supported by 2 techs
      1. Find calls supported by calls from 2 or more technologies with size prediction within 20%
      2. Find sensitivity of each caller to these calls in size ranges 20-50, 50-100, 100-1000, 1000-3000, and >3000 bp
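The two parts of slide 15 can be sketched as below. The slide does not define how "within 20%" is measured, so comparing against the larger of the two sizes is an assumption; the function names and input shapes are hypothetical as well.

```python
SIZE_BINS = [(20, 50), (50, 100), (100, 1000), (1000, 3000), (3000, float("inf"))]


def supported_by_two_techs(calls, tolerance=0.20):
    """calls: list of (technology, predicted_size) within one merged region.

    True if two calls from different technologies agree in size within
    `tolerance` (assumed to be relative to the larger of the two sizes).
    """
    for i, (tech_a, size_a) in enumerate(calls):
        for tech_b, size_b in calls[i + 1:]:
            if tech_a == tech_b:
                continue
            if abs(size_a - size_b) <= tolerance * max(size_a, size_b):
                return True
    return False


def size_bin(size):
    """Return the (low, high) size range containing `size`, or None."""
    for low, high in SIZE_BINS:
        if low <= size < high:
            return (low, high)
    return None


def sensitivity_by_bin(benchmark, caller_hits):
    """benchmark: dict region_id -> size; caller_hits: region_ids a caller found."""
    totals, found = {}, {}
    for region_id, size in benchmark.items():
        key = size_bin(size)
        if key is None:
            continue
        totals[key] = totals.get(key, 0) + 1
        if region_id in caller_hits:
            found[key] = found.get(key, 0) + 1
    return {key: found.get(key, 0) / totals[key] for key in totals}
```

The per-bin sensitivities produced here are the inputs that the later filtering step would use.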
  • 16. Step 4: Filter questionable calls supported by 2+ technologies
      • 316 calls covered >25% by segmental duplication >10kb
      • 631 calls with at least one caller predicting a size >2x different from the consensus size
      • 34 calls where callsets missing this call from multiple technologies have a multiplied (1 - sensitivity) < 2% in this size tranche
      • 87 calls that overlap Ns in the reference
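The third filter on slide 16 is terse, so the sketch below states one possible reading: for the callsets that did not report a call, multiply (1 - sensitivity) in that size tranche, and treat the call as questionable if the product falls below 2% and the missing callsets span more than one technology. The function name and input layout are hypothetical.

```python
from math import prod


def is_questionable(missing, threshold=0.02):
    """missing: list of (technology, sensitivity) for callsets that did NOT report the call.

    Sensitivity is the callset's sensitivity in the call's size tranche.
    Assumed reading: if several sensitive callsets from multiple technologies
    all missed the call, the chance that they all missed a real variant is the
    product of (1 - sensitivity); below `threshold` the call is flagged.
    """
    technologies = {tech for tech, _ in missing}
    if len(technologies) < 2:
        return False
    p_all_miss = prod(1.0 - sensitivity for _, sensitivity in missing)
    return p_all_miss < threshold
```

For example, two missing callsets with 90% sensitivity each give 0.1 × 0.1 = 0.01, which is below 2%, so under this reading the call would be flagged.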
  • 17. Overview of process
      1. Merge deletions within 1kb
      2. Rank calls by closeness of predicted size to median size and select call in each region from best callset
      3. Find calls supported by 2+ technologies with size within 20%
      4. Filter calls overlapping seg dups or reference N’s, or with a call whose predicted size is >2x different from the consensus
  • 18. Number of Calls Supported by 2 Technologies by Size Range
      Size range | <50bp | 50-100bp | 100-1000bp | 1kb-3kb | >3kb
      pre-filtered | 2542 | 1567 | 2447 | 731 | 730
      filtered | 2427 | 1415 | 2207 | 638 | 524
  • 20. Support for all candidate regions # of callsets # of technologies
  • 21. Support for benchmark calls # of callsets # of technologies
  • 22. Approach #2: svcompare (NCBI hackathon)
      Builds on SURVIVOR
      • Compares each new callset to the first and adds new calls not within 1kb of existing calls
      • Outputs multi-sample VCF with type, size, and breakpoints from each callset in each candidate region
      • Integrates multiple types, but doesn’t currently output size of insertions or exact sequence
      • Developed by Fritz Sedlazeck, JHU
      Output stats
      • 130k input regions from calls >19bp
      • 876 regions have >1 type within a callset
      • 2276 regions have >1 type across callsets
      • How to integrate discordant types in same region?
      https://github.com/NCBI-Hackathons/svcompare
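The incremental comparison described on slide 22 can be illustrated with a small sketch. This is not svcompare or SURVIVOR code and does not use their interfaces; `add_new_calls` is a hypothetical helper, calls are reduced to `(chrom, pos)` pairs, and "within 1kb" is read as a position difference of at most 1000 bp.

```python
def add_new_calls(candidates, callset, max_dist=1000):
    """Add calls from `callset` that are not within max_dist of an existing candidate.

    candidates, callset: lists of (chrom, pos).
    Returns a new list; the inputs are left unchanged.
    """
    merged = list(candidates)
    for chrom, pos in callset:
        near = any(c == chrom and abs(p - pos) <= max_dist for c, p in merged)
        if not near:
            # A newly added call also becomes a candidate for later comparisons
            merged.append((chrom, pos))
    return merged
```

Starting from the first callset and folding each further callset in with `add_new_calls` yields a growing list of candidate regions; a real implementation would also have to carry type, size, and both breakpoints, which is where the discordant-type question on the slide arises.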
  • 23. Example start position distance from median start by callset (400-1000bp)
  • 24. Approach #3: “Type” candidate calls in each dataset
      svviz
      • Looks for whether reads support REF or ALT allele
        – Can often easily infer genotype
      • Also generates other stats about mapping reads
      • Generates visualization of mapped reads as well
      • Nabsys has developed a similar approach for their mapping data
      Compatible datasets
      • PacBio
      • Illumina 150bp and 250bp paired end
      • Illumina 6kb mate-pair
      • 10X haplotype-separated
  • 25. 10X SV analyses with svviz
      • Find reads supporting ref and alt alleles in each haplotype
      • Verify support for ref and alt is on different haplotypes for hets
      • Verify support from both haplotypes for a confident hom var or hom ref call
      (Figure panels: Son, Dad, Mom, Son, Dad, Mom)
  • 26. Goals for Data Jamboree
      Share progress in algorithm development
      • New technologies
      • New analysis methods
      • Visualization methods
      • Integration/comparison methods
      Outstanding questions to discuss
      • Integration
        – How to form high-confidence calls, breakpoints, and genotypes from multiple calls?
        – What is the minimum viable product for a practical benchmark set?
          • Is this a good criterion: “When an individual callset is compared to ours, most FPs/FNs should be errors in the individual callset”
        – How to handle non-deletions?
      • SV typing
      • Future work
        – How to form high-confidence regions?
        – SV phasing
        – Is anyone developing SV benchmarking tools?
  • 27. Things to resolve
      Integration
      • How to compare events with variable breakpoints across callsets?
        – Tandem repeats
      • How to compare non-deletions?
        – Start with insertions?
      • Distinguish precise breakpoints when possible
      Typing
      • Leverage long-range information to type with short reads?
      • How to deal with imprecise breakpoints?
      • At what point is something validated?
        – Potentially high-confidence variants (or reference?)
        – Haplotype-separated
  • 28. Acknowledgements
      • NIST
        – Marc Salit
        – Jenny McDaniel
        – Lindsay Vang
        – David Catoe
        – Hemang Parikh
      • Genome in a Bottle Consortium
      • GA4GH Benchmarking Team
      • FDA
        – Liz Mansfield
      • SV Callset Contributors
        – CSHL/JHU
        – Mt Sinai
        – 10X
        – Nabsys
        – Spiral Genetics/Stanford
        – Heng Li/Mike Lin
        – DNAnexus
        – Complete Genomics
        – Baylor
        – Bina/Roche
        – BioNano Genomics
        – Mark Chaisson
        – NIH/NCBI
        – NIH/NHGRI
        – Can Alkan/Stanford

Editor's Notes

  1. Per incoming students – we constantly get proven wrong; add stem cell body map slide; talk about single-cell normalization with ERCCs; calibration with ERCCs; touch on GTC connection and Ron Davis’ remarks; can we build a sufficiently large technology development portfolio?; per Carolyn Bertozzi – “Anything can be Interesting.”