Structural Variation Characterization
Across the Human Genome and
Populations
Fritz Sedlazeck
October 17, 2017
Scientific interests
Detection of Variants
Sniffles (in bioRxiv)
SURVIVOR, Jeffares et al. (2017)
BOD-Score, Sedlazeck et al. (2013)
Mapping/Assembly reads
NextGenMap-LR (in bioRxiv)
Falcon Unzip, Chin et al. (2016)
NextGenMap, Sedlazeck et al. (2013)
Benchmarking/Biases
DangerTrack, Dolgalev et al. (2017)
Teaser, Smolka et al. (2015)
Sequencing, Jünemann et al. (2013)
Applications
Model organisms:
- Cancer (SKBR3) (in bioRxiv)
- miRNA editing (Vesely et al. 2012)
Non-model organisms:
- Cottus transposons (Dennenmoser et al. 2017)
- Clunio (Kaiser et al. 2016)
- Seabass (Vij et al. 2016)
- Pineapple (Ming et al. 2015)
“moonlight”
Structural Variations
Genomic Disorders / Evolution
Impact on regulation / Impact on phenotypes
[Figure: heatmap of regulatory state vs. cell line; color scale: # affected (0 to 2000)]
Diploid genome
• Impact on Regulation
• Variability of genes
• Need to understand the full structure
Challenges: Pursuing the diploid genome
1. Accurate prediction of SVs
2. Comparison of SVs
3. Annotation and interpretation of SVs
4. Population analysis
5. Diploid Genome
Layer et al. (2014)
1.1 How to detect Structural Variations (SVs)
• (+) SVs in repetitive regions
• (+) Span SVs
• (+) Uniform coverage
• (+) Can identify more complex SVs
• (-) Higher seq. error rate
• (-) Hard to align
1.1 Long Read Technologies
1.1 Accurate mapping and SV calling
NextGenMap-LR (NGMLR):
• Long read mapper
• Convex gap costs
• Faster than BWA-MEM
Sniffles:
• SV caller for long reads
• All types of SVs
• Phasing of SVs
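The "convex gap costs" bullet can be illustrated with a minimal sketch. It compares an affine gap model (every gap base costs the same) with a convex one in which each additional gap base costs less, so a single long gap from a structural variant is penalized less than many short gaps from sequencing errors. The function names and all parameter values here are illustrative assumptions, not NGMLR's actual scoring scheme.

```python
def affine_gap_cost(length, gap_open=5.0, gap_extend=1.0):
    """Affine model: fixed opening cost plus a constant cost per gap base."""
    if length <= 0:
        return 0.0
    return gap_open + gap_extend * length


def convex_gap_cost(length, gap_open=5.0, extend_max=5.0,
                    extend_min=1.0, decay=0.15):
    """Convex model (hypothetical parameters): the per-base extension cost
    starts at extend_max and shrinks by `decay` per base until it reaches
    extend_min, so long gaps become relatively cheap."""
    if length <= 0:
        return 0.0
    cost = gap_open
    for i in range(length):
        cost += max(extend_min, extend_max - decay * i)
    return cost


if __name__ == "__main__":
    # One 100 bp gap versus one hundred separate 1 bp gaps.
    print("one long gap:   ", convex_gap_cost(100))
    print("100 short gaps: ", 100 * convex_gap_cost(1))
```

Under this kind of scoring, an aligner prefers to explain a large insertion or deletion as one long gap rather than scattering it into many small ones.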
1.2 NA12878: SV calling
Tech.                      Coverage  Avg read len  Method               SVs     TRA
PacBio                     55x       4,334         Sniffles             22,877  119
Oxford Nanopore (@Baylor)  34x       4,982         Sniffles             12,596  46
Illumina                   50x       2 x 101       Manta, Delly, Lumpy  7,275   2,247
Sedlazeck et al. (2017)
1.1 NA12878: SV calling
Tech.                      Coverage  Avg read len  Method               SVs     TRA    DEL    INS
PacBio                     55x       4,334         Sniffles             22,877  119    9,933  12,052
Oxford Nanopore (@Baylor)  34x       4,982         Sniffles             12,596  46     7,102  5,166
Illumina                   50x       2 x 101       Manta, Delly, Lumpy  7,275   2,247  3,744  0
Sedlazeck et al. (2017)
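To make the contrast in the table concrete, the following small sketch computes what share of each call set consists of translocations (TRA). The counts are copied from the table above; the dictionary layout and the function name are only illustrative.

```python
# SV and TRA counts per technology, copied from the NA12878 table.
calls = {
    "PacBio": {"SVs": 22877, "TRA": 119},
    "Oxford Nanopore": {"SVs": 12596, "TRA": 46},
    "Illumina": {"SVs": 7275, "TRA": 2247},
}


def tra_share(counts):
    """Percentage of all SV calls that are translocations."""
    return 100.0 * counts["TRA"] / counts["SVs"]


if __name__ == "__main__":
    for tech, counts in calls.items():
        print(f"{tech}: {tra_share(counts):.2f}% of SV calls are TRA")
```

The short-read call set has a far higher translocation share than either long-read call set, which is what motivates the closer check of the 2,247 Illumina TRA calls.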
1.1 NA12878: check 2,247 vs 119 TRA
[Figure: Illumina, PacBio, and ONT data at a translocation call; truncated reads; insertion in a repetitive region]
Overlap          Illumina TRA (%)
Insertions       53.05
Deletions        12.06
Duplications     0.57
Nested           0.31
High coverage    1.87
Low complexity   9.79
Explained        77.65
Sedlazeck et al. (2017)
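The "Explained" row is the sum of the individual overlap categories, which the short check below reproduces. It assumes the percentages refer to the 2,247 Illumina TRA calls and that the categories do not overlap (which the matching sum suggests); the approximate call count is derived from that assumption and is not stated on the slide.

```python
# Overlap categories for Illumina TRA calls (percent), copied from the table.
overlap_pct = {
    "Insertions": 53.05,
    "Deletions": 12.06,
    "Duplications": 0.57,
    "Nested": 0.31,
    "High coverage": 1.87,
    "Low complexity": 9.79,
}

ILLUMINA_TRA_CALLS = 2247  # from the NA12878 SV calling table

explained_pct = sum(overlap_pct.values())
# Assumption: percentages are relative to all Illumina TRA calls.
explained_calls = round(ILLUMINA_TRA_CALLS * explained_pct / 100)

if __name__ == "__main__":
    print(f"Explained: {explained_pct:.2f}% (about {explained_calls} calls)")
```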
1.1 NA12878: check 2,247 vs 119 TRA
[Figure: ONT, PacBio, and Illumina data; inversion and translocation calls; truncated reads; insertion in a repetitive region]
Sedlazeck et al. (2017)
1.2 More complex SVs
Inverted tandem duplication:
• Pelizaeus-Merzbacher disease
• MECP2
• VIPR2
Sedlazeck et al. (2017)
[Figure: PacBio data vs. Illumina data]
1.2 More complex SVs
Inversion flanked by deletions:
• Haemophilia A
• Only found via long-range PCR! (2007)
Sedlazeck et al. (2017)
[Figure: Illumina data vs. PacBio data]
Challenges
1. Accurate prediction of SVs: Sniffles (talk on Thursday!)
2. Comparison of SVs
3. Annotation and interpretation of SVs
4. Population analysis
5. Diploid Genome
Layer et al. (2014)
2. Comparison of SVs
SURVIVOR Framework:
• Compare SVs
• GiaB: 95 VCF files in 1 minute
• Simulate SVs
• Simulate long reads
• Summarize SVs results
Jeffares et al. (2017)
[Figure: new SVs vs. observed SVs]
2. Genome in a Bottle: merging 95 vcfs (1 min)
10x Genomics
BioNano
Complete Genomics
Illumina
PacBio
SV caller comparison (minimum 2 callers):
Using PCR + Sanger to validate SVs from multiple categories.
Join CSHL + Baylor to help with validations!
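The comparison step (merging calls from several callers and keeping SVs supported by a minimum of 2 callers) can be sketched as follows. This is a simplified illustration, not SURVIVOR's actual algorithm: the `SV` record, the 1,000 bp breakpoint tolerance, and the greedy clustering are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    chrom: str
    start: int
    end: int
    svtype: str
    caller: str


def same_event(a, b, max_dist=1000):
    """Treat two calls as the same SV if the type and chromosome match and
    both breakpoints lie within max_dist bp (hypothetical threshold)."""
    return (a.chrom == b.chrom and a.svtype == b.svtype
            and abs(a.start - b.start) <= max_dist
            and abs(a.end - b.end) <= max_dist)


def merge_calls(calls, max_dist=1000, min_callers=2):
    """Greedily cluster calls, then keep clusters seen by enough callers."""
    clusters = []
    for sv in sorted(calls, key=lambda s: (s.chrom, s.svtype, s.start)):
        for cluster in clusters:
            if same_event(cluster[0], sv, max_dist):
                cluster.append(sv)
                break
        else:
            clusters.append([sv])
    return [c for c in clusters
            if len({sv.caller for sv in c}) >= min_callers]


if __name__ == "__main__":
    calls = [
        SV("chr1", 1000, 2000, "DEL", "callerA"),
        SV("chr1", 1100, 2050, "DEL", "callerB"),
        SV("chr1", 50000, 51000, "DEL", "callerA"),
    ]
    for cluster in merge_calls(calls):
        print(cluster[0].chrom, cluster[0].svtype,
              sorted(sv.caller for sv in cluster))
```

Comparing every call against the first member of each cluster keeps the sketch short; a production tool would need an indexed search to scale to many VCF files.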
Challenges
1. Accurate prediction of SVs: Sniffles (talk on Thursday!)
2. Comparison of SVs: SURVIVOR
3. Annotation and interpretation of SVs
4. Population analysis
5. Diploid Genome
[Figure: histogram over genes impacted; x-axis: # genes hit by SVs (0 to 80), y-axis: frequency (0 to 6000)]
3. Annotation: SURVIVOR_ant
Annotating SVs with:
• Multiple GTF, BED, VCF
Genome in a Bottle:
• 63,677 genes (GTF)
• 1,733,686 regions (3 bed files)
• Runtime: 22 seconds
• 8,314 genes impacted
Sedlazeck et al. (2017)
[Figure: genes impacted by SVs; axes: # genes, # SVs hitting a gene]
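The annotation step (intersecting SVs with gene or region intervals and counting how many SVs hit each gene) can be sketched as below. This is a naive illustration with made-up tuples standing in for parsed GTF/BED/VCF records; it is not the SURVIVOR_ant implementation, and it makes no attempt at the speed reported on the slide.

```python
from collections import Counter


def overlaps(start_a, end_a, start_b, end_b):
    """True if two closed intervals share at least one position."""
    return start_a <= end_b and start_b <= end_a


def count_sv_hits(svs, genes):
    """Count, per gene, how many SVs overlap it.

    svs:   iterable of (chrom, start, end)
    genes: iterable of (name, chrom, start, end)
    """
    hits = Counter()
    for sv_chrom, sv_start, sv_end in svs:
        for name, g_chrom, g_start, g_end in genes:
            if sv_chrom == g_chrom and overlaps(sv_start, sv_end,
                                                g_start, g_end):
                hits[name] += 1
    return hits


if __name__ == "__main__":
    # Hypothetical toy data.
    genes = [("geneA", "chr1", 100, 500), ("geneB", "chr1", 1000, 2000)]
    svs = [("chr1", 400, 1200), ("chr1", 450, 460)]
    hits = count_sv_hits(svs, genes)
    print("genes impacted:", len(hits))
    # Histogram: how many genes are hit by n SVs.
    print("SVs per gene -> number of genes:", dict(Counter(hits.values())))
```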
Challenges
1. Accurate prediction of SVs: Sniffles (talk on Thursday!)
2. Comparison of SVs: SURVIVOR
3. Annotation and interpretation of SVs: SURVIVOR_ANT
4. Population analysis
5. Diploid Genome
4. SVs in Population: SURVIVOR
• Birth defect study (Karyn Meltz Steinberg, WashU: Wed. 9am, Room 310A)
  • 4 callers, 114 samples
• CCDG (William Salerno, HGSC: poster on Friday, #1281)
  • 5 callers, 22,600 samples
• Non-human:
  • S. pombe: 3 callers, 161 samples
  • Tomato: 3 callers, 846 samples
4. SVs in 22,600 Individuals
We need large SV studies:
• Common vs. rare SVs
• Inform GWAS studies
• Ethnicity specific SVs
• Catalog variability of regions
• MHC, LPA, etc.
[Figure: CHR6: average SV allele frequency per 100 kb; x-axis: position (0 to 1.5e+08), y-axis: allele frequency (0.00 to 0.20); MHC and LPA marked; # SVs shared across individuals]
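A per-window summary like the one in the chromosome 6 plot can be computed along these lines. The sketch assumes each SV is reduced to a (position, allele frequency) pair and that the plotted value is the plain mean over the SVs in each 100 kb window; both are assumptions, since the slide does not say how the average was computed.

```python
from collections import defaultdict


def mean_af_per_window(svs, window=100_000):
    """Average allele frequency of the SVs falling into each window.

    svs: iterable of (position, allele_frequency) on one chromosome.
    Returns {window_start: mean_allele_frequency}, ordered by position.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for pos, af in svs:
        idx = pos // window
        sums[idx] += af
        counts[idx] += 1
    return {idx * window: sums[idx] / counts[idx] for idx in sorted(sums)}


if __name__ == "__main__":
    # Hypothetical toy data: (position, allele frequency).
    svs = [(10, 0.1), (99_999, 0.3), (100_000, 0.5)]
    for start, mean_af in mean_af_per_window(svs).items():
        print(f"{start}-{start + 100_000 - 1}: {mean_af:.2f}")
```

Windows with a high mean allele frequency point to regions where SVs are shared across many individuals, such as the MHC and LPA regions named on the slide.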
Challenges
1. Accurate prediction of SVs: Sniffles (talk on Thursday!)
2. Comparison of SVs: SURVIVOR
3. Annotation and interpretation of SVs: SURVIVOR_ANT
4. Population analysis: SURVIVOR
5. Diploid Genome
5.1 Diploid Genome
Challenges:
• Sequencing technology
• Computational methods
• Money
HGSC Approach: GADGET
1. Sequence 100 individuals: PacBio + 10x Genomics
2. SV detection/genotyping
3. Phasing of SVs + SNPs
4. Population-based genotyping of SVs using short reads
5.2 Diploid Genome
Selecting 100 samples
• We want to maximize the outcome per $ spent
• Selection of samples (red)
• Select top 100 (red)
• Random selection of samples (boxplot)
[Figure: histogram of # SVs per patient; x-axis: # SVs (2e+04 to 1e+05), y-axis: # patients (0 to 250)]
[Figure: Random vs. informed choice of samples (CCDG); x-axis: number of chosen samples (1 to 96), y-axis: SVs in population (%), 0 to 100; series: Informed, Top100, Random]
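The three selection strategies compared on this slide (informed, top 100, random) can be sketched as follows. The slide does not describe how the informed choice was made; the greedy rule used here (repeatedly pick the sample that adds the most SVs not yet covered) is an assumption for illustration, as are all function names and the toy data.

```python
import random


def covered_svs(samples, chosen):
    """Union of the SV sets carried by the chosen samples."""
    covered = set()
    for name in chosen:
        covered |= samples[name]
    return covered


def informed_choice(samples, k):
    """Greedy pick (assumed strategy): always add the sample that
    contributes the most SVs not yet covered."""
    chosen, covered = [], set()
    remaining = set(samples)
    for _ in range(min(k, len(samples))):
        best = max(sorted(remaining),
                   key=lambda name: len(samples[name] - covered))
        chosen.append(best)
        covered |= samples[best]
        remaining.remove(best)
    return chosen


def top_choice(samples, k):
    """Pick the k samples with the most SVs, ignoring redundancy."""
    return sorted(samples, key=lambda name: (-len(samples[name]), name))[:k]


def random_choice(samples, k, seed=0):
    """Pick k samples uniformly at random (seeded for reproducibility)."""
    return random.Random(seed).sample(sorted(samples), k)


if __name__ == "__main__":
    # Hypothetical toy data: sample name -> set of SV identifiers.
    samples = {"s1": {1, 2, 3, 4}, "s2": {1, 2, 3}, "s3": {5, 6}, "s4": {4}}
    total = len(covered_svs(samples, samples))
    for label, pick in [("Informed", informed_choice(samples, 2)),
                        ("Top", top_choice(samples, 2)),
                        ("Random", random_choice(samples, 2))]:
        pct = 100 * len(covered_svs(samples, pick)) / total
        print(f"{label}: {pick} covers {pct:.0f}% of SVs")
```

In the toy data, the two samples with the most SVs are largely redundant, so the greedy pick covers more of the population's SVs with the same number of samples.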
5.3 Diploid Genome (Prototype)
Challenges/ Summary
1. Accurate prediction of SVs: Sniffles (Talk on Thursday!)
2. Comparison of SVs: SURVIVOR
3. Annotation and interpretation of SVs: SURVIVOR_ANT
4. Population analysis: SURVIVOR
5. Diploid Genome: GADGET
All methods are available:
https://github.com/fritzsedlazeck
https://fritzsedlazeck.github.io/
Acknowledgments
William Salerno
Stephen Richards
Richard Gibbs
Michael Schatz
Schatz lab
Daniel Jeffares
Jürg Bähler
Christophe Dessimoz
Justin Zook
GiaB consortium

Different drug regularity bodies in different countries.kishan singh tomar
 
Pregnacny, Parturition, and Lactation.pdf
Pregnacny, Parturition, and Lactation.pdfPregnacny, Parturition, and Lactation.pdf
Pregnacny, Parturition, and Lactation.pdfMedicoseAcademics
 
How to cure cirrhosis and chronic hepatitis naturally
How to cure cirrhosis and chronic hepatitis naturallyHow to cure cirrhosis and chronic hepatitis naturally
How to cure cirrhosis and chronic hepatitis naturallyZurück zum Ursprung
 

Recently uploaded (20)

Male Infertility Panel Discussion by Dr Sujoy Dasgupta
Male Infertility Panel Discussion by Dr Sujoy DasguptaMale Infertility Panel Discussion by Dr Sujoy Dasgupta
Male Infertility Panel Discussion by Dr Sujoy Dasgupta
 
Trustworthiness of AI based predictions Aachen 2024
Trustworthiness of AI based predictions Aachen 2024Trustworthiness of AI based predictions Aachen 2024
Trustworthiness of AI based predictions Aachen 2024
 
QUESTIONS & ANSWERS FOR QUALITY ASSURANCE, RADIATIONBIOLOGY& RADIATION HAZARD...
QUESTIONS & ANSWERS FOR QUALITY ASSURANCE, RADIATIONBIOLOGY& RADIATION HAZARD...QUESTIONS & ANSWERS FOR QUALITY ASSURANCE, RADIATIONBIOLOGY& RADIATION HAZARD...
QUESTIONS & ANSWERS FOR QUALITY ASSURANCE, RADIATIONBIOLOGY& RADIATION HAZARD...
 
Cone beam CT: concepts and applications.pptx
Cone beam CT: concepts and applications.pptxCone beam CT: concepts and applications.pptx
Cone beam CT: concepts and applications.pptx
 
Bulimia nervosa ( Eating Disorders) Mental Health Nursing.
Bulimia nervosa ( Eating Disorders) Mental Health Nursing.Bulimia nervosa ( Eating Disorders) Mental Health Nursing.
Bulimia nervosa ( Eating Disorders) Mental Health Nursing.
 
historyofpsychiatryinindia. Senthil Thirusangu
historyofpsychiatryinindia. Senthil Thirusanguhistoryofpsychiatryinindia. Senthil Thirusangu
historyofpsychiatryinindia. Senthil Thirusangu
 
FDMA FLAP - The first dorsal metacarpal artery (FDMA) flap is used mainly for...
FDMA FLAP - The first dorsal metacarpal artery (FDMA) flap is used mainly for...FDMA FLAP - The first dorsal metacarpal artery (FDMA) flap is used mainly for...
FDMA FLAP - The first dorsal metacarpal artery (FDMA) flap is used mainly for...
 
Breast cancer -ONCO IN MEDICAL AND SURGICAL NURSING.pptx
Breast cancer -ONCO IN MEDICAL AND SURGICAL NURSING.pptxBreast cancer -ONCO IN MEDICAL AND SURGICAL NURSING.pptx
Breast cancer -ONCO IN MEDICAL AND SURGICAL NURSING.pptx
 
Red Blood Cells_anemia & polycythemia.pdf
Red Blood Cells_anemia & polycythemia.pdfRed Blood Cells_anemia & polycythemia.pdf
Red Blood Cells_anemia & polycythemia.pdf
 
Good Laboratory Practice (GLP) in Pharma-LikeWays.pptx
Good Laboratory Practice (GLP) in Pharma-LikeWays.pptxGood Laboratory Practice (GLP) in Pharma-LikeWays.pptx
Good Laboratory Practice (GLP) in Pharma-LikeWays.pptx
 
Moving Forward After Uterine Cancer Treatment: Surveillance Strategies, Testi...
Moving Forward After Uterine Cancer Treatment: Surveillance Strategies, Testi...Moving Forward After Uterine Cancer Treatment: Surveillance Strategies, Testi...
Moving Forward After Uterine Cancer Treatment: Surveillance Strategies, Testi...
 
Physiology of Smooth Muscles -Mechanics of contraction and relaxation
Physiology of Smooth Muscles -Mechanics of contraction and relaxationPhysiology of Smooth Muscles -Mechanics of contraction and relaxation
Physiology of Smooth Muscles -Mechanics of contraction and relaxation
 
The Importance of Mental Health: Why is Mental Health Important?
The Importance of Mental Health: Why is Mental Health Important?The Importance of Mental Health: Why is Mental Health Important?
The Importance of Mental Health: Why is Mental Health Important?
 
ANATOMICAL FAETURES OF BONES FOR NURSING STUDENTS .pptx
ANATOMICAL FAETURES OF BONES  FOR NURSING STUDENTS .pptxANATOMICAL FAETURES OF BONES  FOR NURSING STUDENTS .pptx
ANATOMICAL FAETURES OF BONES FOR NURSING STUDENTS .pptx
 
Using Data Visualization in Public Health Communications
Using Data Visualization in Public Health CommunicationsUsing Data Visualization in Public Health Communications
Using Data Visualization in Public Health Communications
 
DNA nucleotides Blast in NCBI and Phylogeny using MEGA Xi.pptx
DNA nucleotides Blast in NCBI and Phylogeny using MEGA Xi.pptxDNA nucleotides Blast in NCBI and Phylogeny using MEGA Xi.pptx
DNA nucleotides Blast in NCBI and Phylogeny using MEGA Xi.pptx
 
Neurological history taking (2024) .
Neurological  history  taking  (2024)  .Neurological  history  taking  (2024)  .
Neurological history taking (2024) .
 
Different drug regularity bodies in different countries.
Different drug regularity bodies in different countries.Different drug regularity bodies in different countries.
Different drug regularity bodies in different countries.
 
Pregnacny, Parturition, and Lactation.pdf
Pregnacny, Parturition, and Lactation.pdfPregnacny, Parturition, and Lactation.pdf
Pregnacny, Parturition, and Lactation.pdf
 
How to cure cirrhosis and chronic hepatitis naturally
How to cure cirrhosis and chronic hepatitis naturallyHow to cure cirrhosis and chronic hepatitis naturally
How to cure cirrhosis and chronic hepatitis naturally
 

Ashg sedlazeck grc_share

  • 1. Structural Variation Characterization Across the Human Genome and Populations. Fritz Sedlazeck, October 17, 2017
  • 2. Scientific interests. Detection of variants: Sniffles (in bioRxiv); SURVIVOR (Jeffares et al. 2017); BOD-Score (Sedlazeck et al. 2013). Mapping/assembly of reads: NextGenMap-LR (in bioRxiv); Falcon Unzip (Chin et al. 2016); NextGenMap (Sedlazeck et al. 2013). Benchmarking/biases: DangerTrack (Dolgalev et al. 2017); Teaser (Smolka et al. 2015); Sequencing (Jünemann et al. 2013). Applications, model organisms: Cancer (SKBR3) (in bioRxiv); miRNA editing (Vesely et al. 2012). Applications, non-model organisms: Cottus transposons (Dennenmoser et al. 2017); Clunio (Kaiser et al. 2016); Seabass (Vij et al. 2016); Pineapple (Ming et al. 2015). “moonlight”
  • 3. Structural Variations: Genomic Disorders, Evolution, Impact on regulation, Impact on phenotypes. [Figure: regulatory state (ACTIVE, INACTIVE, POISED, REPRESSED, NA) of CTCF binding sites, enhancers, open chromatin regions, promoters, promoter flanking regions and TF binding sites, by cell line; scale: number affected, 0 to 2000]
  • 4. Diploid genome • Impact on Regulation • Variability of genes • Need to understand the full structure
  • 5. Challenges: Pursuing the diploid genome 1. Accurate prediction of SVs 2. Comparison of SVs 3. Annotation and interpretation of SVs 4. Population analysis 5. Diploid Genome Layer et.al. (2014)
  • 6. 1.1 How to detect Structural Variations (SVs)
  • 7. 1.1 Long Read Technologies • (+) SVs in repetitive regions • (+) Span SVs • (+) Uniform coverage • (+) Can identify more complex SVs • (-) Higher seq. error rate • (-) Hard to align
  • 8. 1.1 Accurate mapping and SV calling NextGenMap-LR (NGMLR): • Long read mapper • Convex gap costs • Faster than BWA-MEM Sniffles: • SV caller for long reads • All types of SVs • Phasing of SVs
  • 9. 1.2 NA12878: SV calling
      Tech.                    Coverage  Avg read len  Method               SVs     TRA
      PacBio                   55x       4,334         Sniffles             22,877  119
      Oxford Nanopore @Baylor  34x       4,982         Sniffles             12,596  46
      Illumina                 50x       2 x 101       Manta, Delly, Lumpy  7,275   2,247
      Sedlazeck et al. (2017)
  • 10. 1.1 NA12878: SV calling
      Tech.                    Coverage  Avg read len  Method               SVs     TRA    DEL    INS
      PacBio                   55x       4,334         Sniffles             22,877  119    9,933  12,052
      Oxford Nanopore @Baylor  34x       4,982         Sniffles             12,596  46     7,102  5,166
      Illumina                 50x       2 x 101       Manta, Delly, Lumpy  7,275   2,247  3,744  0
      Sedlazeck et al. (2017)
  • 11. 1.1 NA12878: check 2,247 vs 119 TRA
      [Figure panels: Illumina data, PacBio data, ONT data. Labels: Translocation; Truncated reads; Insertion in rep. region]
      Overlap         Illumina TRA (%)
      Insertions      53.05
      Deletions       12.06
      Duplications    0.57
      Nested          0.31
      High coverage   1.87
      Low complexity  9.79
      Explained       77.65
      Sedlazeck et al. (2017)
  • 12. 1.1 NA12878: check 2,247 vs 119 TRA [Figure panels: ONT data, PacBio data, Illumina data. Labels: Inversion; Translocation; Truncated reads; Insertion in rep. region] Sedlazeck et al. (2017)
  • 13. 1.2 More complex SVs Inverted tandem duplication: • Pelizaeus-Merzbacher disease • MECP2 • VIPR2 Sedlazeck et.al. (2017) PacBio data Illumina data
  • 14. 1.2 More complex SVs Inversion flanked by deletions: • Haemophilia A • Only found over long range PCR! (2007) Sedlazeck et.al. (2017) Illumina data PacBio data
  • 15. Challenges 1. Accurate prediction of SVs: Sniffles (talk on Thursday!) 2. Comparison of SVs 3. Annotation and interpretation of SVs 4. Population analysis 5. Diploid Genome Layer et.al. (2014)
  • 16. 2. Comparison of SVs SURVIVOR Framework: • Compare SVs • GiaB: 95 VCF files: 1 minute • Simulate SVs • Simulate long reads • Summarize SV results Jeffares et al. (2017) [Figure labels: New SVs; Observed SVs]
  • 17. 2. Genome in a Bottle: merging 95 VCFs (1 min). Technologies: 10x Genomics, BioNano, Complete Genomics, Illumina, PacBio. Minimum 2 callers. SV caller comparison: using PCR + Sanger to validate SVs from multiple categories. Join CSHL + Baylor to help with validations!
  • 18. Challenges 1. Accurate prediction of SVs: Sniffles (talk on Thursday!) 2. Comparison of SVs: SURVIVOR 3. Annotation and interpretation of SVs 4. Population analysis 5. Diploid Genome
  • 19. 3. Annotation: SURVIVOR_ant Annotating SVs with: • Multiple GTF, BED, VCF Genome in a Bottle: • 63,677 genes (GTF) • 1,733,686 regions (3 BED files) • 22 seconds: • 8,314 genes impacted Sedlazeck et al. (2017) [Figure: histogram of genes impacted by SVs; x-axis: number of SVs hitting a gene (0 to 80), y-axis: number of genes (0 to 6000)]
  • 20. Challenges 1. Accurate prediction of SVs: Sniffles (talk on Thursday!) 2. Comparison of SVs: SURVIVOR 3. Annotation and interpretation of SVs: SURVIVOR_ANT 4. Population analysis 5. Diploid Genome
  • 21. 4. SVs in Population: SURVIVOR • Birth defect study (Karyn Meltz Steinberg, WashU: Wed. 9am: Room 310A) • 4 callers, 114 samples • CCDG (William Salerno, HGSC: poster on Friday, #1281) • 5 callers, 22,600 samples • Non-human: • S. pombe: 3 callers, 161 samples • Tomato: 3 callers, 846 samples
  • 22. 4. SVs in 22,600 Individuals We need large SV studies: • Common vs. rare SVs • Inform GWAS studies • Ethnicity-specific SVs • Catalog variability of regions • MHC, LPA, etc. [Figure: CHR6: Average SV Allele Frequency per 100kb; x-axis: position, y-axis: allele frequency; MHC and LPA marked] [Figure: #SVs shared across individuals]
  • 23. Challenges 1. Accurate prediction of SVs: Sniffles (talk on Thursday!) 2. Comparison of SVs: SURVIVOR 3. Annotation and interpretation of SVs: SURVIVOR_ANT 4. Population analysis: SURVIVOR 5. Diploid Genome
  • 24. 5.1 Diploid Genome Challenges: • Sequencing technology • Computational methods • Money HGSC Approach: GADGET 1. Sequence 100 individuals: PacBio + 10x Genomics 2. SV detection/genotyping 3. Phasing of SVs + SNPs 4. Population-based genotyping of SVs in short reads.
  • 25. 5.2 Diploid Genome Selecting 100 samples • We want to maximize the outcome per $ spent • Selection of samples (red) • Select top 100 (red) • Random selection of samples (boxplot) [Figure: histogram of # SVs per sample; x-axis: # SVs, y-axis: # patients] [Figure: Random vs. informed choice of samples (CCDG); x-axis: number of chosen samples, y-axis: SVs in population (%); series: Informed, Top100, Random]
  • 26. 5.3 Diploid Genome (Prototype)
  • 27. Challenges/ Summary 1. Accurate prediction of SVs: Sniffles (Talk on Thursday!) 2. Comparison of SVs: SURVIVOR 3. Annotation and interpretation of SVs: SURVIVOR_ANT 4. Population analysis: SURVIVOR 5. Diploid Genome: GADGET All methods are available: https://github.com/fritzsedlazeck https://fritzsedlazeck.github.io/ [Figure: Random vs. informed choice of samples (CCDG); x-axis: number of chosen samples, y-axis: SVs in population (%); series: Informed, Top100, Random]
  • 28. William Salerno Stephen Richards Richard Gibbs Michael Schatz Schatz lab Acknowledgments Daniel Jeffares Jürg Bähler Christophe Dessimoz Justin Zook GiaB consortium
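
Slide 17 keeps only SVs reported by a minimum of 2 callers. The sketch below illustrates that idea of a consensus call set. It is not SURVIVOR's implementation: the function names, the 1,000 bp distance threshold, the single-linkage clustering and the toy calls are all assumptions made for illustration.

```python
from collections import defaultdict


def _emit(chrom, sv_type, cluster, min_callers):
    """Return one consensus call if enough distinct callers support the cluster."""
    callers = sorted({caller for _, caller in cluster})
    if len(callers) < min_callers:
        return []
    start = min(pos for pos, _ in cluster)
    return [(chrom, start, sv_type, callers)]


def merge_calls(calls_by_caller, max_dist=1000, min_callers=2):
    """Merge SV calls from several callers (hypothetical helper).

    calls_by_caller maps a caller name to a list of (chrom, pos, sv_type).
    Calls of the same type on the same chromosome are chained into one
    cluster while neighbouring positions are at most max_dist apart.
    """
    grouped = defaultdict(list)
    for caller, calls in calls_by_caller.items():
        for chrom, pos, sv_type in calls:
            grouped[(chrom, sv_type)].append((pos, caller))

    consensus = []
    for (chrom, sv_type), items in sorted(grouped.items()):
        items.sort()
        cluster = [items[0]]
        for pos, caller in items[1:]:
            if pos - cluster[-1][0] <= max_dist:
                cluster.append((pos, caller))
            else:
                consensus.extend(_emit(chrom, sv_type, cluster, min_callers))
                cluster = [(pos, caller)]
        consensus.extend(_emit(chrom, sv_type, cluster, min_callers))
    return consensus


# Toy input: three callers, one deletion seen by all of them.
example_calls = {
    "manta": [("chr1", 10000, "DEL"), ("chr2", 500, "INS")],
    "delly": [("chr1", 10200, "DEL")],
    "lumpy": [("chr1", 10450, "DEL"), ("chr3", 700, "INV")],
}
merged = merge_calls(example_calls)
print(merged)  # [('chr1', 10000, 'DEL', ['delly', 'lumpy', 'manta'])]
```

Requiring support from several callers trades sensitivity for a lower false discovery rate: the single-caller insertion and inversion in the toy input are dropped.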

Editor's Notes

  1. Welcome everyone. My name is Fritz Sedlazeck and I am currently working at the Human Genome Sequencing Center @ Baylor in Houston. Today I am going to talk about challenges in SV calling and our pursuit of the diploid genome. Before I dive into that, let me briefly introduce myself and my scientific interests.
  2. I am a computational biologist mainly focusing on method development for mapping and assembly of short and long reads; detection of genomic variation, focusing on SVs; benchmarking and detecting biases in methods and sequencing technologies; and applying all of these to obtain more insights into molecular biology. The focus of the talk today is structural variation.
  3. Only when we account for all variations will we be able to obtain deeper insights.
  4. However there are certain challenges to get there.
  5. Look at the Venn again. Probably each caller has some fraction of true and false positives, reflecting the complexity of calling SVs.
  6. Structural variations are in general loosely defined as 50bp+ ….. Short-read-based calling is often said to lack sensitivity and to have a large FDR!
  7. Avg len for PacBio nowadays much higher! Many deletions on Nanopore -> 11,394 (96.19%) were deletions, and the majority (89.72%) were within a homopolymer. Check indel for significance @ONT INS: probably missing repetitive regions due to caller??
  8. 3 times sequenced in different labs!
  9. This highlights a huge bias in short reads and explains why Illumina is not enough!
  10. Look at the Venn again. Probably each caller has some fraction of true and false positives. So one possible solution could be to combine (make a consensus call).
  11. So now that we can call SVs across different callers. How can we annotate and rank these calls?
  12. AC131097.3: Long non coding SMYD3: Histone methyltransferase
  13. Now we have the calls and annotation. Are these SVs common in the population or unique to my sample??
  14. CCDG: 2 hours 30 min. Tomato: 6 min. CCDG: NHGRI program; 30x WGS
  15. #Singletons/stats. Fraction of rare vs. common. Patients -> Individuals.
  16. Now that we have methods to identify SVs, reduce FDR, annotate, and have a mechanism to know if they are rare or common, we need to understand their context -> Diploid genome.
  17. #SV singletons, #SV two samples? Put greedy curve without crossing out. Interesting: Do the curve for 3 sd. More time!
  18. Display slide during questions. Check with Will!!
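
Slide 25 compares an informed choice of samples with a random one, and note 17 mentions a greedy curve. The sketch below shows one way such an informed choice could work: a greedy pick of the sample that adds the most SVs not yet covered. Whether the talk used exactly this procedure is an assumption, and the function names and toy data are hypothetical.

```python
import random


def greedy_select(sample_svs, k):
    """Pick up to k samples, each time taking the one adding most new SVs.

    sample_svs maps a sample name to the set of SV ids it carries.
    Ties are broken by sample name so the result is deterministic.
    """
    covered = set()
    chosen = []
    remaining = dict(sample_svs)
    for _ in range(min(k, len(remaining))):
        best = max(sorted(remaining), key=lambda s: len(remaining[s] - covered))
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered


def random_select(sample_svs, k, seed=0):
    """Baseline: choose k samples uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(sorted(sample_svs), min(k, len(sample_svs)))


def coverage_fraction(sample_svs, chosen):
    """Fraction of all SVs in the population carried by the chosen samples."""
    all_svs = set().union(*sample_svs.values())
    got = set().union(*(sample_svs[s] for s in chosen))
    return len(got) / len(all_svs)


# Toy population: four samples, six distinct SVs.
toy = {"A": {1, 2, 3, 4}, "B": {3, 4, 5}, "C": {6}, "D": {1, 2}}
picked, picked_svs = greedy_select(toy, 2)
print(picked, coverage_fraction(toy, picked))
```

Plotting coverage_fraction against the number of chosen samples for both strategies would give curves of the kind shown on the slide.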