An assessment of computational
genotyping of Structural Variations for
clinical diagnosis
Fritz Sedlazeck
Oct 16, 2018
Scientific interests
Detection of Variants
Clairvoyante
Luo et al. (in review)
Sniffles
Sedlazeck et al. (2018)
SURVIVOR
Jeffares et al. (2017)
Mapping/Assembly of reads
NextGenMap-LR
Sedlazeck et al. (2018)
Falcon Unzip
Chin et al. (2016)
NextGenMap
Sedlazeck et al. (2013)
Benchmarking
Teaser
Smolka et al. (2015)
Sequencing
Jünemann et al. (2013)
Applications
Model organisms:
- Cancer (SKBR3) (in preparation)
- miRNA editing (Vesely et al. 2012)
Non-model organisms:
- Cottus transposons (Dennenmoser et al. 2017)
- Clunio (Kaiser et al. 2016)
- Seabass (Vij et al. 2016)
- Pineapple (Ming et al. 2015)
“moonlight”
How to detect Structural Variations
Structural Variations
Genomic Disorders
Evolution
Impact on regulation
Impact on phenotypes
[Figure, two panels: RegulatoryState by Cell Line, with a scale from 0 to 2000 labelled "affected#". Regulatory states combine a feature (CTCF binding site, enhancer, open chromatin region, promoter, promoter flanking region, TF binding site) with an activity (ACTIVE, INACTIVE, POISED, REPRESSED, NA). Cell lines and tissues include A549, Aorta, GM12878, H1ESC, H9, HeLa_S3, HepG2, HUVEC, IMR90, K562 and others.]
Remaining Challenges for SV calling
1. Accuracy of the calls
   1. False positives
   2. False negatives
2. Functional interpretation?
   1. Population frequencies / Curation
[Figure panels: Illumina data, PacBio data, ONT data]
How to call SVs in routine scans?
SV genotyping
• Advantages
  • Low/no false positives
  • False negatives ??
  • Focus on variants that are known to have an impact.
• Disadvantages
  • We cannot find novel SVs
Varuna Chander
Approaches
• DELLY: SV caller that also supports genotyping
• STIX: SV genotyper
• SVTyper: SV genotyper
• SV2: SV genotyper
Simulated data
1. We simulated SVs of different types and sizes
2. Called SVs with Delly, Manta and Lumpy
3. Merged the calls with SURVIVOR
4. Used the merged calls as input to the SV genotypers
5. Evaluated each genotyper's results for the SV types it supports.
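Steps 3 and 4 above can be pictured with a small sketch. This is not SURVIVOR's implementation. It is a minimal illustration, assuming two calls are merged when they share a chromosome and SV type and both breakpoints lie within a hypothetical `max_dist` of each other. The `SV` record, `merge_calls`, and all parameter names and coordinates are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    chrom: str
    start: int
    end: int
    svtype: str  # e.g. "DEL", "DUP", "INV"
    caller: str  # tool that reported the call


def merge_calls(calls, max_dist=1000, min_callers=2):
    """Greedily group calls of the same type whose breakpoints agree
    within max_dist; keep groups reported by at least min_callers tools."""
    groups = []
    for sv in sorted(calls, key=lambda s: (s.chrom, s.svtype, s.start)):
        for group in groups:
            rep = group[0]  # first call of a group acts as its representative
            if (rep.chrom == sv.chrom and rep.svtype == sv.svtype
                    and abs(rep.start - sv.start) <= max_dist
                    and abs(rep.end - sv.end) <= max_dist):
                group.append(sv)
                break
        else:
            groups.append([sv])  # no compatible group found: start a new one
    return [g for g in groups if len({s.caller for s in g}) >= min_callers]


# Toy input (assumed values): three callers agree on one deletion,
# one caller alone reports a duplication.
calls = [
    SV("chr1", 10_000, 12_000, "DEL", "delly"),
    SV("chr1", 10_050, 12_020, "DEL", "manta"),
    SV("chr1", 10_100, 11_950, "DEL", "lumpy"),
    SV("chr2", 50_000, 50_800, "DUP", "delly"),
]
merged = merge_calls(calls)
for group in merged:
    print(group[0].chrom, group[0].start, group[0].svtype,
          sorted(s.caller for s in group))
```

Requiring support from at least two callers is one common way to trade sensitivity for precision before genotyping; the threshold here is an assumption, not the setting used in the talk.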
GIAB v0.5.0 deletions
• Most genotypers handle only deletions (DEL)
• Constraints on the input format/fields
• Lack of sensitivity
Paragraph
• Graph-based SV genotyper
• GIAB, all SV types:
  • Sensitivity: 82%
  • Precision: 99%
  • GT concordance: 80%
• Available: github.com/Illumina/paragraph
P. Krusche et al. (in preparation)
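For reference, the three figures quoted above are commonly computed as in the sketch below. This is not the evaluation code behind the slide. It assumes variants have already been matched by a shared ID, and the `truth` and `calls` dictionaries (SV ID to genotype string) are invented for illustration. A genotype of "0/0" or "./." is treated here as "SV not reported".

```python
def is_present(gt):
    """True if the genotype string carries at least one alternate allele call."""
    return gt not in ("0/0", "./.")


def evaluate(truth, calls):
    """Return (sensitivity, precision, genotype concordance).

    truth, calls: dict mapping an SV id to a genotype such as "0/1" or "1/1".
    """
    true_ids = {k for k, gt in truth.items() if is_present(gt)}
    called_ids = {k for k, gt in calls.items() if is_present(gt)}
    tp = true_ids & called_ids  # SVs present in both sets
    sensitivity = len(tp) / len(true_ids) if true_ids else 0.0
    precision = len(tp) / len(called_ids) if called_ids else 0.0
    # Concordance is measured only on SVs that both sets report as present.
    concordant = sum(1 for k in tp if truth[k] == calls[k])
    gt_concordance = concordant / len(tp) if tp else 0.0
    return sensitivity, precision, gt_concordance


# Toy data (assumed): four true SVs, three reported by the genotyper.
truth = {"sv1": "0/1", "sv2": "1/1", "sv3": "0/1", "sv4": "1/1"}
calls = {"sv1": "0/1", "sv2": "0/1", "sv3": "0/0", "sv5": "0/1"}
sens, prec, conc = evaluate(truth, calls)
print(f"sensitivity={sens:.2f} precision={prec:.2f} GT concordance={conc:.2f}")
```

In this toy case two of the four true SVs are recovered, one reported SV is not in the truth set, and one of the two recovered SVs has the wrong zygosity.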
Remaining Challenges for SV calling
1. Accuracy of the calls
   1. False positives
   2. False negatives
2. Functional interpretation?
   1. Population frequencies / Curation
STIX: Population frequency
• Online framework to annotate your SVs with allele frequencies.
• ~0.1 sec / SV
• Stores only informative reads (0.18% of BAM)
• Currently ~9000 samples
• Multiple ethnicities
Layer et al. (in preparation)
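As a reminder of what an allele frequency annotation means, the sketch below shows the generic arithmetic on per-sample genotypes. It does not reproduce how STIX derives its numbers and does not use any STIX interface. The function name and the genotype lists are assumptions made for illustration.

```python
def allele_frequency(genotypes):
    """Fraction of called alleles that are non-reference.

    genotypes: iterable of genotype strings such as "0/1", "1|1" or "./.".
    Returns None when no allele was called at all.
    """
    alt = total = 0
    for gt in genotypes:
        for allele in gt.replace("|", "/").split("/"):
            if allele == ".":
                continue  # missing allele: not counted in the denominator
            total += 1
            if allele != "0":
                alt += 1
    return alt / total if total else None


# Toy cohort (assumed): one hom-ref, one het, one hom-alt, one uncalled sample.
cohort = ["0/0", "0/1", "1/1", "./."]
print(allele_frequency(cohort))
```

Skipping uncalled samples keeps the denominator honest: a frequency computed over all samples would understate how common the SV is when coverage is uneven.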
Acknowledgments
Varuna Chander
William Salerno
Richard Gibbs
Peter Krusche
Sai Chen
Mike Eberle
Ryan Layer

Editor's Notes

  1. Welcome everyone. My name is Fritz Sedlazeck and I am currently working at the Human Genome Sequencing Center at Baylor in Houston. Today I am going to give you an update on our efforts to improve long-read mapping as well as SV calling. Before I dive into that, let me briefly introduce myself and my scientific interests.
  2. I am a computational biologist, mainly focusing on method development for mapping short and long reads and for detecting SVs. To get better insight into which signals are artifacts and which are true, I initiated and contributed to benchmarking studies of sequencers and mappers. I am also happy to collaborate with many people around the world on multiple organisms.
  3. Evolution: main driver, e.g. gene gain or loss; hybrid genome architecture (Cottus). Genomic disorders: cancer (in prep.) and other diseases. Impact on regulation: we are currently studying this with ENTEx. Impact on phenotypes: I just published work showing the contribution of CNVs and rearrangements to traits.
  4. Establish a catalog of SVs that we already understand. Use genotyping to scan for these SVs. Report the SVs found per sample.
  5. Display slide during questions. Check with Will!!