SlideShare a Scribd company logo
Variant calling
while accounting for
alternate haplotypes
Aaron Quinlan
University of Virginia
!
quinlanlab.org
Genome Reference Consortium Workshop
Sept 21, 2014
Motivation: HG has long had alt loci, but we don’t handle them properly
Deanna Church and Brad Holmes
Regions w/ alternate loci
(261 in total)
http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/human/
A case study: The MAPT locus (17q21.31)
doi:10.1038/nrneurol.2012.169
Disease phenotypes associated with each haplotype
H2 positively selected in Europeans, carriers predisposed to 17q21.31
microdeletion syndrome via NAHR
H1 associated with Alzheimer and Parkinson diseases
A case study: The MAPT locus (17q21.31)
Scale
chr17:
7 Vert. El
Multiz Align
RepeatMasker
1 Mb hg38
45,500,000 46,000,000 46,500,000 47,000,000
Contigs New to GRCh38/(hg38), Not Carried Forward from GRCh37/(hg19)
GRCh38 Alignments to the Alternate Sequences/Haplotypes
GRCh38 Haplotype to Reference Sequence Mapping Correspondence
RefSeq Genes
Multiz Alignment & Conservation (7 Species)
Repeating Elements by RepeatMasker
chr17_GL000258v2_alt
chr17_KI270908v1_alt
chr17_GL000258v2_alt
chr17_KI270908v1_alt
chr17_GL000258v2_alt
KIF18B
KIF18B
MIR6783
C1QL1
DCAKD
DCAKD
DCAKD
DCAKD
NMT1
PLCD3
MIR6784
ACBD4
ACBD4
ACBD4
ACBD4
ACBD4
HEXIM1
HEXIM2
FMNL1
MAP3K14-AS1
MAP3K14-AS1
MAP3K14-AS1
SPATA32
MAP3K14-AS1
MAP3K14-AS1
MAP3K14
ARHGAP27
ARHGAP27
ARHGAP27
PLEKHM1
PLEKHM1
PLEKHM1
MIR4315-2
MIR4315-1
LRRC37A4P
LOC644172
CRHR1
MGC57346
MGC57346
CRHR1-IT1
CRHR1-IT1
CRHR1
CRHR1
CRHR1
CRHR1
MAPT-AS1
SPPL2C
MAPT
MAPT
MAPT
MAPT
MAPT
MAPT
MAPT
MAPT
MAPT-IT1
STH
KANSL1
KANSL1
KANSL1
KANSL1-AS1
LOC644172
LRRC37A
ARL17B
ARL17A
ARL17B
NSFP1
ARL17A
LRRC37A2
ARL17A
ARL17A
ARL17A
ARL17A
ARL17B
NSF
NSF
NSFP1
WNT3
WNT9B
GOSR2
GOSR2
GOSR2
MIR5089
RPRML
H1 and inverted H2 don’t recombine
Extensive LD owing to
suppressed recombination
between H1 and inverted H2
Several SNPs distiguish H1 from H2
…
Kitzman et al: NA10847 is heterozygous H1/H2
Kitzman et al: NA10847 is heterozygous H1/H2NA10847
How do we support
variant detection with
alternate alleles using the
VCF format?
A G A C A C T A G T C T
H1 (chr17) H2 (GL000258v2_alt)
NA10847 is heterozygous H1/H2
A G A C A C T A G T C T
H1 (chr17) H2 (GL000258v2_alt)
A G A C A C
T A G T C T
NA10847 (het. H1/H2)
NA10847 is heterozygous H1/H2
A G A C A C T A G T C T
H1 (chr17) H2 (GL000258v2_alt)
A G A C A C
T A G T C T
NA10847 (het. H1/H2)
NA10847 is heterozygous H1/H2
A G A C A C
T A G T C T
A G A C A C
T A G T C T
Requires that we align
reads to both loci*
* We need to be able to distinguish multiple alignments arising from ALT loci versus multiple
mappings arising from segdups, repetitive elements. Else, MAPQs penalized and/or alignments
not reported, depending on the behavior of the aligner.
!
Heng Li has started a discussion about how best to make BWA alt-aware
Colin Hercus has started a discussion about how best to make Novoalign alt-aware
chr17 A T 0/1
chr17 G A 0/1
chr17 A G 0/1
chr17 C T 0/1
chr17 A C 0/1
chr17 C T 0/1
GL000258v2_alt T A 0/1
GL000258v2_alt A G 0/1
GL000258v2_alt G A 0/1
GL000258v2_alt T C 0/1
GL000258v2_alt C A 0/1
GL000258v2_alt T C 0/1
VCF entries for chr17 (H1) VCF entries for alt locus (H2)
Variant calling
NA10847 is heterozygous H1/H2
chr17 A T 0/1
chr17 G A 0/1
chr17 A G 0/1
chr17 C T 0/1
chr17 A C 0/1
chr17 C T 0/1
GL000258v2_alt T A 0/1
GL000258v2_alt A G 0/1
GL000258v2_alt G A 0/1
GL000258v2_alt T C 0/1
GL000258v2_alt C A 0/1
GL000258v2_alt T C 0/1
VCF entries for chr17 (H1) VCF entries for alt locus (H2)
Variant calling
NA10847 is heterozygous H1/H2
Note: Allelic
relationship is
not reflected in
“raw” VCF
Proposal: Develop a downstream
tool that leverages informative SNPs
to distinguish and assign haplotypes
at alt loci via a standard VCF file.
Intermediate solution until variant callers handle this
complexity natively
Overview
“Raw”
V/BCF file
“database” of
alt loci &
inform.
SNPs
Overview
“Raw”
V/BCF file
“database” of
alt loci &
inform.
SNPs
alt_locus main_locus
GL000258.2 (H2) chr17:45309498-46836265 (H1)
infor. markers GRCh38 position H1 H2
rs241039 45637307 A T
rs2049515 45684490 C T
. . .
Overview
“Raw”
V/BCF file
“database” of
alt loci &
inform.
SNPs
New tool(s)
alt_locus main_locus
GL000258.2 (H2) chr17:45309498-46836265 (H1)
infor. markers GRCh38 position H1 H2
rs241039 45637307 A T
rs2049515 45684490 C T
. . .
Overview
“Raw”
V/BCF file
“database” of
alt loci &
inform.
SNPs
New tool(s)
Updated
V/BCF file
Haplotype
predictions
per indiv.
alt_locus main_locus
GL000258.2 (H2) chr17:45309498-46836265 (H1)
infor. markers GRCh38 position H1 H2
rs241039 45637307 A T
rs2049515 45684490 C T
. . .
Augment VCF with assembly information
##seq-info=<name=chr17, id=CM000679.2>
!
##region-info=<name=MAPT, id=GL000258.2,
assoc_id=CM000679.2, reg=45309498-46836265>
Based on ideas from Deanna Church
Introduce new (reserved) VCF INFO tags
##INFO=<ID=ALTLOCS,
Number=.,
Type=String,
Description=“A list of the alternate
loci in the reference
genome that are
associated with this
locus”>
##INFO=<ID=ALTHAPS,
Number=.,
Type=String,
Description=“A list of the known
haplotypes that are
associated with this
locus”>
##FORMAT=<ID=HT,Number=1,Type=String,Description=“Haplotype
combination based on ALTHAPS">
(Draft) example VCF output, post “correction”
##seq-info=<name=chr17, id=CM000679.2>
##region-info=<name=MAPT,id=GL000258.2,assoc_id=CM000679.2,reg=45309498-46836265>
##INFO=<ID=ALTLOC,Number=.,Type=String,Description=“A list of the alternate loci is
the reference genome that are associated with this locus”>
##INFO=<ID=ALTHAP,Number=.,Type=String,Description=“A list of the known haplotypes
that are associated with this locus”>
!
#CHROM POS REF ALT INFO FORMAT NA10847
chr17 111 A T ALTLOCS=GL000258.2;ALTHAPs=H1,H2 GT:HT 0/1:0/1
chr17 222 G A ALTLOCS=GL000258.2;ALTHAPs=H1,H2 GT:HT 0/1:0/1
chr17 333 A G ALTLOCS=GL000258.2;ALTHAPS=H1,H2 GT:HT 0/1:0/1
. . .
GL000258.2 111 A T ALTLOCS=chr17;ALTHAPs=H1,H2 GT:HT 0/1:0/1
GL000258.2 222 G A ALTLOCS=chr17;ALTHAPs=H1,H2 GT:HT 0/1:0/1
GL000258.2 333 A G ALTLOCS=chr17;ALTHAPS=H1,H2 GT:HT 0/1:0/1
. . .
(Draft) example VCF output, post “correction”
##seq-info=<name=chr17, id=CM000679.2>
##region-info=<name=MAPT,id=GL000258.2,assoc_id=CM000679.2,reg=45309498-46836265>
##INFO=<ID=ALTLOC,Number=.,Type=String,Description=“A list of the alternate loci is
the reference genome that are associated with this locus”>
##INFO=<ID=ALTHAP,Number=.,Type=String,Description=“A list of the known haplotypes
that are associated with this locus”>
!
#CHROM POS REF ALT INFO FORMAT NA10847
chr17 111 A T ALTLOCS=GL000258.2;ALTHAPs=H1,H2 GT:HT 0/1:0/1
chr17 222 G A ALTLOCS=GL000258.2;ALTHAPs=H1,H2 GT:HT 0/1:0/1
chr17 333 A G ALTLOCS=GL000258.2;ALTHAPS=H1,H2 GT:HT 0/1:0/1
. . .
GL000258.2 111 A T ALTLOCS=chr17;ALTHAPs=H1,H2 GT:HT 0/1:0/1
GL000258.2 222 G A ALTLOCS=chr17;ALTHAPs=H1,H2 GT:HT 0/1:0/1
GL000258.2 333 A G ALTLOCS=chr17;ALTHAPS=H1,H2 GT:HT 0/1:0/1
. . .
Solely two different haplotypes is the base case.
For example NA10847 is actually H1/H2D
H2D derived from
ancestral H2.
Markers that distinguish
H2D from H2
chr17 A T 0/1
chr17 G A 0/1
chr17 A G 0/1
chr17 C T 0/1
chr17 A C 0/1
chr17 C T 0/1
chr17 C T 0/1
chr17 C T 0/1
chr17 G A 0/1
chr17 A G 0/1
chr17 C T 0/1
GL000258v2_alt T A 0/1
GL000258v2_alt A G 0/1
GL000258v2_alt G A 0/1
GL000258v2_alt T C 0/1
GL000258v2_alt C A 0/1
GL000258v2_alt T C 0/1
GL000258v2_alt C T 0/1
GL000258v2_alt C T 0/1
GL000258v2_alt G A 0/1
GL000258v2_alt A G 0/1
GL000258v2_alt C T 0/1
VCF entries for chr17 VCF entries for alt locus
Interpretation is much harder w/ many haplotypes
H1 H2 H2D
(Draft) example VCF output, post “correction”
##seq-info=<name=chr17, id=CM000679.2>
##region-info=<name=MAPT,id=GL000258.2,assoc_id=CM000679.2,reg=45309498-46836265>
##INFO=<ID=ALTLOC,Number=.,Type=String,Description=“A list of the alternate loci is the reference genome that
are associated with this locus”>
##INFO=<ID=ALTHAP,Number=.,Type=String,Description=“A list of the known haplotypes that are associated with
this locus”>
!
#CHROM POS REF ALT INFO FORMAT NA10847 NA12878 NA21599
chr17 111 A T ALTLOCS=GL000258.2;ALTHAPs=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2
chr17 222 G A ALTLOCS=GL000258.2;ALTHAPs=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2
chr17 333 A G ALTLOCS=GL000258.2;ALTHAPS=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2
. . .
GL000258.2 111 A T ALTLOCS=chr17;ALTHAPs=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2
GL000258.2 222 G A ALTLOCS=chr17;ALTHAPs=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2
GL000258.2 333 A G ALTLOCS=chr17;ALTHAPS=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2
. . .
(Draft) example VCF output, post “correction”
##seq-info=<name=chr17, id=CM000679.2>
##region-info=<name=MAPT,id=GL000258.2,assoc_id=CM000679.2,reg=45309498-46836265>
##INFO=<ID=ALTLOC,Number=.,Type=String,Description=“A list of the alternate loci is the reference genome that
are associated with this locus”>
##INFO=<ID=ALTHAP,Number=.,Type=String,Description=“A list of the known haplotypes that are associated with
this locus”>
!
#CHROM POS REF ALT INFO FORMAT NA10847 NA12878 NA21599
chr17 111 A T ALTLOCS=GL000258.2;ALTHAPs=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2
chr17 222 G A ALTLOCS=GL000258.2;ALTHAPs=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2
chr17 333 A G ALTLOCS=GL000258.2;ALTHAPS=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2
. . .
GL000258.2 111 A T ALTLOCS=chr17;ALTHAPs=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2
GL000258.2 222 G A ALTLOCS=chr17;ALTHAPs=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2
GL000258.2 333 A G ALTLOCS=chr17;ALTHAPS=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2
. . .
NA10847 chr17 45309498 46836265 H1 H2D rs241039,rs434428,…
NA12878 chr17 45309498 46836265 H1 H1 rs241039,rs434428,…
NA21599 chr17 45309498 46836265 H2D H2D rs241039,rs434428,…
sample chrom start end hap1 hap2 markers
(Draft) example VCF output, post “correction”
##seq-info=<name=chr17, id=CM000679.2>
##region-info=<name=MAPT,id=GL000258.2,assoc_id=CM000679.2,reg=45309498-46836265>
##INFO=<ID=ALTLOC,Number=.,Type=String,Description=“A list of the alternate loci is the reference genome that
are associated with this locus”>
##INFO=<ID=ALTHAP,Number=.,Type=String,Description=“A list of the known haplotypes that are associated with
this locus”>
!
#CHROM POS REF ALT INFO FORMAT NA10847 NA12878 NA21599
chr17 111 A T ALTLOCS=GL000258.2;ALTHAPs=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2
chr17 222 G A ALTLOCS=GL000258.2;ALTHAPs=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2
chr17 333 A G ALTLOCS=GL000258.2;ALTHAPS=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2
. . .
GL000258.2 111 A T ALTLOCS=chr17;ALTHAPs=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2
GL000258.2 222 G A ALTLOCS=chr17;ALTHAPs=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2
GL000258.2 333 A G ALTLOCS=chr17;ALTHAPS=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2
. . .
NA10847 chr17 45309498 46836265 H1 H2D rs241039,rs434428,…
NA12878 chr17 45309498 46836265 H1 H1 rs241039,rs434428,…
NA21599 chr17 45309498 46836265 H2D H2D rs241039,rs434428,…
sample chrom start end hap1 hap2 markers
The good and the bad
Good
• No burden on existing
variant callers to adapt to
calling w.r.t. alt loci
• Tool for updating VCF can
be updated and improved in
parallel with variant callers.
Bad
• One more step / file in the
variant interpretation pipeline
• This strategy is only
applicable to cases where
informative markers exist.
Use CNVs in WGS: e.g.,
KANSL1 partial duplications
to distinguish MAPT alt loci
The good and the bad
Good
• No burden on existing
variant callers to adapt to
calling w.r.t. alt loci
• Tool for updating VCF can
be updated and improved in
parallel with variant callers.
Bad
• One more step / file in the
variant interpretation pipeline
• This strategy is only
applicable to cases where
informative markers exist.
Use CNVs in WGS: e.g.,
KANSL1 partial duplications
to distinguish MAPT alt loci
To Do / Discuss
• How best to improve alignment strategies to facilitate variant
and haplotype detection?
• How to best represent the resolved alternate loci in VCF format?
Many thanks for helpful discussions with:
Deanna Church
Brad Holmes
Karyn Meltz-Steinberg
Heng Li
Colin Hercus

More Related Content

What's hot

Ashg grc workshop2015_tg
Ashg grc workshop2015_tgAshg grc workshop2015_tg
Ashg grc workshop2015_tg
Genome Reference Consortium
 
Alignment Approaches II: Long Reads
Alignment Approaches II: Long ReadsAlignment Approaches II: Long Reads
Alignment Approaches II: Long Reads
Genome Reference Consortium
 
GRCWorkshop_geval_1KG_slides
GRCWorkshop_geval_1KG_slidesGRCWorkshop_geval_1KG_slides
GRCWorkshop_geval_1KG_slides
Genome Reference Consortium
 
150224 grc kms
150224 grc kms150224 grc kms
Getting the most from the reference assembly
Getting the most from the reference assemblyGetting the most from the reference assembly
Getting the most from the reference assembly
Genome Reference Consortium
 
Ashg grc workshop2014_tg
Ashg grc workshop2014_tgAshg grc workshop2014_tg
Ashg grc workshop2014_tg
Genome Reference Consortium
 
Schneider_AGBT2014
Schneider_AGBT2014Schneider_AGBT2014
Schneider_AGBT2014
vaschn
 
Creating Reference-Grade Human Genome Assemblies
Creating Reference-Grade Human Genome AssembliesCreating Reference-Grade Human Genome Assemblies
Creating Reference-Grade Human Genome Assemblies
Genome Reference Consortium
 
ABGT 2016 Workshop Schneider
ABGT 2016 Workshop SchneiderABGT 2016 Workshop Schneider
ABGT 2016 Workshop Schneider
Genome Reference Consortium
 
Variation graphs and population assisted genome inference copy
Variation graphs and population assisted genome inference copyVariation graphs and population assisted genome inference copy
Variation graphs and population assisted genome inference copy
Genome Reference Consortium
 
hg19 (GRCh37) vs. hg38 (GRCh38)
hg19 (GRCh37) vs. hg38 (GRCh38)hg19 (GRCh37) vs. hg38 (GRCh38)
hg19 (GRCh37) vs. hg38 (GRCh38)
Shaojun Xie
 
Explaining the assembly model
Explaining the assembly modelExplaining the assembly model
Explaining the assembly model
Genome Reference Consortium
 
Previewing GRCm39: Assembly Updates from the GRC
Previewing GRCm39: Assembly Updates from the GRCPreviewing GRCm39: Assembly Updates from the GRC
Previewing GRCm39: Assembly Updates from the GRC
Genome Reference Consortium
 
Grc workshop agbt2015_tg
Grc workshop agbt2015_tgGrc workshop agbt2015_tg
Grc workshop agbt2015_tg
Genome Reference Consortium
 
Creating Reference-Grade Human Genome Assemblies
Creating Reference-Grade Human Genome AssembliesCreating Reference-Grade Human Genome Assemblies
Creating Reference-Grade Human Genome Assemblies
Genome Reference Consortium
 
Ashg2017 workshop tg
Ashg2017 workshop tgAshg2017 workshop tg
Ashg2017 workshop tg
Genome Reference Consortium
 
agbt 2016 workshop lindsay
agbt 2016 workshop lindsayagbt 2016 workshop lindsay
agbt 2016 workshop lindsay
Genome Reference Consortium
 
Ashg2017 workshop schneider
Ashg2017 workshop schneiderAshg2017 workshop schneider
Ashg2017 workshop schneider
Genome Reference Consortium
 
[2020-09-01] IIBMP2020 Generating annotation texts of HLA sequences with anti...
[2020-09-01] IIBMP2020 Generating annotation texts of HLA sequences with anti...[2020-09-01] IIBMP2020 Generating annotation texts of HLA sequences with anti...
[2020-09-01] IIBMP2020 Generating annotation texts of HLA sequences with anti...
Eli Kaminuma
 
AGBT2017 Reference Workshop: Schneider
AGBT2017 Reference Workshop: SchneiderAGBT2017 Reference Workshop: Schneider
AGBT2017 Reference Workshop: Schneider
Genome Reference Consortium
 

What's hot (20)

Ashg grc workshop2015_tg
Ashg grc workshop2015_tgAshg grc workshop2015_tg
Ashg grc workshop2015_tg
 
Alignment Approaches II: Long Reads
Alignment Approaches II: Long ReadsAlignment Approaches II: Long Reads
Alignment Approaches II: Long Reads
 
GRCWorkshop_geval_1KG_slides
GRCWorkshop_geval_1KG_slidesGRCWorkshop_geval_1KG_slides
GRCWorkshop_geval_1KG_slides
 
150224 grc kms
150224 grc kms150224 grc kms
150224 grc kms
 
Getting the most from the reference assembly
Getting the most from the reference assemblyGetting the most from the reference assembly
Getting the most from the reference assembly
 
Ashg grc workshop2014_tg
Ashg grc workshop2014_tgAshg grc workshop2014_tg
Ashg grc workshop2014_tg
 
Schneider_AGBT2014
Schneider_AGBT2014Schneider_AGBT2014
Schneider_AGBT2014
 
Creating Reference-Grade Human Genome Assemblies
Creating Reference-Grade Human Genome AssembliesCreating Reference-Grade Human Genome Assemblies
Creating Reference-Grade Human Genome Assemblies
 
ABGT 2016 Workshop Schneider
ABGT 2016 Workshop SchneiderABGT 2016 Workshop Schneider
ABGT 2016 Workshop Schneider
 
Variation graphs and population assisted genome inference copy
Variation graphs and population assisted genome inference copyVariation graphs and population assisted genome inference copy
Variation graphs and population assisted genome inference copy
 
hg19 (GRCh37) vs. hg38 (GRCh38)
hg19 (GRCh37) vs. hg38 (GRCh38)hg19 (GRCh37) vs. hg38 (GRCh38)
hg19 (GRCh37) vs. hg38 (GRCh38)
 
Explaining the assembly model
Explaining the assembly modelExplaining the assembly model
Explaining the assembly model
 
Previewing GRCm39: Assembly Updates from the GRC
Previewing GRCm39: Assembly Updates from the GRCPreviewing GRCm39: Assembly Updates from the GRC
Previewing GRCm39: Assembly Updates from the GRC
 
Grc workshop agbt2015_tg
Grc workshop agbt2015_tgGrc workshop agbt2015_tg
Grc workshop agbt2015_tg
 
Creating Reference-Grade Human Genome Assemblies
Creating Reference-Grade Human Genome AssembliesCreating Reference-Grade Human Genome Assemblies
Creating Reference-Grade Human Genome Assemblies
 
Ashg2017 workshop tg
Ashg2017 workshop tgAshg2017 workshop tg
Ashg2017 workshop tg
 
agbt 2016 workshop lindsay
agbt 2016 workshop lindsayagbt 2016 workshop lindsay
agbt 2016 workshop lindsay
 
Ashg2017 workshop schneider
Ashg2017 workshop schneiderAshg2017 workshop schneider
Ashg2017 workshop schneider
 
[2020-09-01] IIBMP2020 Generating annotation texts of HLA sequences with anti...
[2020-09-01] IIBMP2020 Generating annotation texts of HLA sequences with anti...[2020-09-01] IIBMP2020 Generating annotation texts of HLA sequences with anti...
[2020-09-01] IIBMP2020 Generating annotation texts of HLA sequences with anti...
 
AGBT2017 Reference Workshop: Schneider
AGBT2017 Reference Workshop: SchneiderAGBT2017 Reference Workshop: Schneider
AGBT2017 Reference Workshop: Schneider
 

Viewers also liked

Presentatie Prof. de Graaf en Dr. Willems van Dijk
Presentatie Prof. de Graaf en Dr. Willems van DijkPresentatie Prof. de Graaf en Dr. Willems van Dijk
Presentatie Prof. de Graaf en Dr. Willems van Dijk
CVON
 
Variant (SNP) calling - an introduction (with a worked example, using FreeBay...
Variant (SNP) calling - an introduction (with a worked example, using FreeBay...Variant (SNP) calling - an introduction (with a worked example, using FreeBay...
Variant (SNP) calling - an introduction (with a worked example, using FreeBay...
Manikhandan Mudaliar
 
Peripheral Neurological Disorders & Central Nervous Center
Peripheral Neurological Disorders & Central Nervous CenterPeripheral Neurological Disorders & Central Nervous Center
Peripheral Neurological Disorders & Central Nervous Center
jben501
 
Can we exploit the power of NGS to move towards personalized medicine?, Cente...
Can we exploit the power of NGS to move towards personalized medicine?, Cente...Can we exploit the power of NGS to move towards personalized medicine?, Cente...
Can we exploit the power of NGS to move towards personalized medicine?, Cente...
Copenhagenomics
 
Exome sequencing for disease gene identification and patient diagnostics, Gen...
Exome sequencing for disease gene identification and patient diagnostics, Gen...Exome sequencing for disease gene identification and patient diagnostics, Gen...
Exome sequencing for disease gene identification and patient diagnostics, Gen...
Copenhagenomics
 
Introduction to next generation sequencing
Introduction to next generation sequencingIntroduction to next generation sequencing
Introduction to next generation sequencing
VHIR Vall d’Hebron Institut de Recerca
 

Viewers also liked (7)

Presentatie Prof. de Graaf en Dr. Willems van Dijk
Presentatie Prof. de Graaf en Dr. Willems van DijkPresentatie Prof. de Graaf en Dr. Willems van Dijk
Presentatie Prof. de Graaf en Dr. Willems van Dijk
 
Variant (SNP) calling - an introduction (with a worked example, using FreeBay...
Variant (SNP) calling - an introduction (with a worked example, using FreeBay...Variant (SNP) calling - an introduction (with a worked example, using FreeBay...
Variant (SNP) calling - an introduction (with a worked example, using FreeBay...
 
Peripheral Neurological Disorders & Central Nervous Center
Peripheral Neurological Disorders & Central Nervous CenterPeripheral Neurological Disorders & Central Nervous Center
Peripheral Neurological Disorders & Central Nervous Center
 
Ngs ppt
Ngs pptNgs ppt
Ngs ppt
 
Can we exploit the power of NGS to move towards personalized medicine?, Cente...
Can we exploit the power of NGS to move towards personalized medicine?, Cente...Can we exploit the power of NGS to move towards personalized medicine?, Cente...
Can we exploit the power of NGS to move towards personalized medicine?, Cente...
 
Exome sequencing for disease gene identification and patient diagnostics, Gen...
Exome sequencing for disease gene identification and patient diagnostics, Gen...Exome sequencing for disease gene identification and patient diagnostics, Gen...
Exome sequencing for disease gene identification and patient diagnostics, Gen...
 
Introduction to next generation sequencing
Introduction to next generation sequencingIntroduction to next generation sequencing
Introduction to next generation sequencing
 

Similar to Variant Calling II

Church_GenomeAccess_2013_genome2013
Church_GenomeAccess_2013_genome2013Church_GenomeAccess_2013_genome2013
Church_GenomeAccess_2013_genome2013
Deanna Church
 
Using the GRCh38 reference assembly for clinical interpretation in VSClinical
 Using the GRCh38 reference assembly for clinical interpretation in VSClinical Using the GRCh38 reference assembly for clinical interpretation in VSClinical
Using the GRCh38 reference assembly for clinical interpretation in VSClinical
Golden Helix
 
Concordance_of_HTA_array_and_real_time_qPCR_results
Concordance_of_HTA_array_and_real_time_qPCR_resultsConcordance_of_HTA_array_and_real_time_qPCR_results
Concordance_of_HTA_array_and_real_time_qPCR_resultsAndrea Ujvari
 
Bioinformatica 20-10-2011-t3-scoring matrices
Bioinformatica 20-10-2011-t3-scoring matricesBioinformatica 20-10-2011-t3-scoring matrices
Bioinformatica 20-10-2011-t3-scoring matrices
Prof. Wim Van Criekinge
 
Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic...
Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic...Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic...
Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic...
Elia Brodsky
 
A Genome Sequence Analysis System Built with Hypertable
A Genome Sequence Analysis System Built with HypertableA Genome Sequence Analysis System Built with Hypertable
A Genome Sequence Analysis System Built with Hypertable
DATAVERSITY
 
VCF and RDF
VCF and RDFVCF and RDF
VCF and RDF
Jeremy Carroll
 
The Clinical Significance of Transcript Alignment Discrepancies … and tools t...
The Clinical Significance of Transcript Alignment Discrepancies … and tools t...The Clinical Significance of Transcript Alignment Discrepancies … and tools t...
The Clinical Significance of Transcript Alignment Discrepancies … and tools t...Human Variome Project
 
The Clinical Significance of Transcript Alignment Discrepancies
The Clinical Significance of Transcript Alignment DiscrepanciesThe Clinical Significance of Transcript Alignment Discrepancies
The Clinical Significance of Transcript Alignment Discrepancies
Reece Hart
 
2011 Rna Course Part 1
2011 Rna Course Part 12011 Rna Course Part 1
2011 Rna Course Part 1
ICGEB
 
Data analysis pipelines for NGS applications
Data analysis pipelines for NGS applicationsData analysis pipelines for NGS applications
Data analysis pipelines for NGS applications
Vall d'Hebron Institute of Research (VHIR)
 
RNA-Seq with R-Bioconductor
RNA-Seq with R-BioconductorRNA-Seq with R-Bioconductor
artificial or synthetic transcription factor for regulation of gene expression
artificial or synthetic transcription factor for regulation of gene expressionartificial or synthetic transcription factor for regulation of gene expression
artificial or synthetic transcription factor for regulation of gene expression
Balaji Rathod
 
Jason Chin MHC diploid assembly
Jason Chin MHC diploid assemblyJason Chin MHC diploid assembly
Jason Chin MHC diploid assembly
GenomeInABottle
 
Approach for limited cell ChIP-Seq on a semiconductor-based sequencing platform
Approach for limited cell ChIP-Seq on a semiconductor-based sequencing platformApproach for limited cell ChIP-Seq on a semiconductor-based sequencing platform
Approach for limited cell ChIP-Seq on a semiconductor-based sequencing platform
Thermo Fisher Scientific
 
Building Genomic Data Processing and Machine Learning Workflows Using Apache ...
Building Genomic Data Processing and Machine Learning Workflows Using Apache ...Building Genomic Data Processing and Machine Learning Workflows Using Apache ...
Building Genomic Data Processing and Machine Learning Workflows Using Apache ...
Databricks
 

Similar to Variant Calling II (20)

Church_GenomeAccess_2013_genome2013
Church_GenomeAccess_2013_genome2013Church_GenomeAccess_2013_genome2013
Church_GenomeAccess_2013_genome2013
 
Using the GRCh38 reference assembly for clinical interpretation in VSClinical
 Using the GRCh38 reference assembly for clinical interpretation in VSClinical Using the GRCh38 reference assembly for clinical interpretation in VSClinical
Using the GRCh38 reference assembly for clinical interpretation in VSClinical
 
Concordance_of_HTA_array_and_real_time_qPCR_results
Concordance_of_HTA_array_and_real_time_qPCR_resultsConcordance_of_HTA_array_and_real_time_qPCR_results
Concordance_of_HTA_array_and_real_time_qPCR_results
 
Bioinformatica 20-10-2011-t3-scoring matrices
Bioinformatica 20-10-2011-t3-scoring matricesBioinformatica 20-10-2011-t3-scoring matrices
Bioinformatica 20-10-2011-t3-scoring matrices
 
20161021_master_lesson_no_feedback
20161021_master_lesson_no_feedback20161021_master_lesson_no_feedback
20161021_master_lesson_no_feedback
 
Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic...
Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic...Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic...
Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic...
 
A Genome Sequence Analysis System Built with Hypertable
A Genome Sequence Analysis System Built with HypertableA Genome Sequence Analysis System Built with Hypertable
A Genome Sequence Analysis System Built with Hypertable
 
VCF and RDF
VCF and RDFVCF and RDF
VCF and RDF
 
Bioinformatica 27-10-2011-t4-alignments
Bioinformatica 27-10-2011-t4-alignmentsBioinformatica 27-10-2011-t4-alignments
Bioinformatica 27-10-2011-t4-alignments
 
The Clinical Significance of Transcript Alignment Discrepancies … and tools t...
The Clinical Significance of Transcript Alignment Discrepancies … and tools t...The Clinical Significance of Transcript Alignment Discrepancies … and tools t...
The Clinical Significance of Transcript Alignment Discrepancies … and tools t...
 
The Clinical Significance of Transcript Alignment Discrepancies
The Clinical Significance of Transcript Alignment DiscrepanciesThe Clinical Significance of Transcript Alignment Discrepancies
The Clinical Significance of Transcript Alignment Discrepancies
 
2011 Rna Course Part 1
2011 Rna Course Part 12011 Rna Course Part 1
2011 Rna Course Part 1
 
Gateway
GatewayGateway
Gateway
 
Data analysis pipelines for NGS applications
Data analysis pipelines for NGS applicationsData analysis pipelines for NGS applications
Data analysis pipelines for NGS applications
 
Abrf poster2007
Abrf poster2007Abrf poster2007
Abrf poster2007
 
RNA-Seq with R-Bioconductor
RNA-Seq with R-BioconductorRNA-Seq with R-Bioconductor
RNA-Seq with R-Bioconductor
 
artificial or synthetic transcription factor for regulation of gene expression
artificial or synthetic transcription factor for regulation of gene expressionartificial or synthetic transcription factor for regulation of gene expression
artificial or synthetic transcription factor for regulation of gene expression
 
Jason Chin MHC diploid assembly
Jason Chin MHC diploid assemblyJason Chin MHC diploid assembly
Jason Chin MHC diploid assembly
 
Approach for limited cell ChIP-Seq on a semiconductor-based sequencing platform
Approach for limited cell ChIP-Seq on a semiconductor-based sequencing platformApproach for limited cell ChIP-Seq on a semiconductor-based sequencing platform
Approach for limited cell ChIP-Seq on a semiconductor-based sequencing platform
 
Building Genomic Data Processing and Machine Learning Workflows Using Apache ...
Building Genomic Data Processing and Machine Learning Workflows Using Apache ...Building Genomic Data Processing and Machine Learning Workflows Using Apache ...
Building Genomic Data Processing and Machine Learning Workflows Using Apache ...
 

More from Genome Reference Consortium

What's new and what's next for the human reference assembly?
What's new and what's next for the human reference assembly?What's new and what's next for the human reference assembly?
What's new and what's next for the human reference assembly?
Genome Reference Consortium
 
Advancements in the human genome reference assembly (GRCh38)
Advancements in the human genome reference assembly (GRCh38)Advancements in the human genome reference assembly (GRCh38)
Advancements in the human genome reference assembly (GRCh38)
Genome Reference Consortium
 
Telomere-to-telomere assembly of a complete human chromosomes
Telomere-to-telomere assembly of a complete human chromosomesTelomere-to-telomere assembly of a complete human chromosomes
Telomere-to-telomere assembly of a complete human chromosomes
Genome Reference Consortium
 
Genome variation graphs with the vg toolkit
Genome variation graphs with the vg toolkitGenome variation graphs with the vg toolkit
Genome variation graphs with the vg toolkit
Genome Reference Consortium
 
The Matched Annotation from NCBI and EMBL-EBI (MANE) Project
The Matched Annotation from NCBI and EMBL-EBI (MANE) ProjectThe Matched Annotation from NCBI and EMBL-EBI (MANE) Project
The Matched Annotation from NCBI and EMBL-EBI (MANE) Project
Genome Reference Consortium
 
Why graph genome storage and updating wakes me up at 4 am
Why graph genome storage and updating wakes me up at 4 amWhy graph genome storage and updating wakes me up at 4 am
Why graph genome storage and updating wakes me up at 4 am
Genome Reference Consortium
 
Schneider grc workshop_final
Schneider grc workshop_finalSchneider grc workshop_final
Schneider grc workshop_final
Genome Reference Consortium
 
Mane v2 final
Mane v2 finalMane v2 final
Lrg and mane 16 oct 2018
Lrg and mane   16 oct 2018Lrg and mane   16 oct 2018
Lrg and mane 16 oct 2018
Genome Reference Consortium
 
20181016 grc presentation-pa
20181016 grc presentation-pa20181016 grc presentation-pa
20181016 grc presentation-pa
Genome Reference Consortium
 
2018 1016 trio_binning_ashg_arhie_final
2018 1016 trio_binning_ashg_arhie_final2018 1016 trio_binning_ashg_arhie_final
2018 1016 trio_binning_ashg_arhie_final
Genome Reference Consortium
 
Ashg sedlazeck grc_share
Ashg sedlazeck grc_shareAshg sedlazeck grc_share
Ashg sedlazeck grc_share
Genome Reference Consortium
 
171017 giab for giab grc workshop
171017 giab for giab grc workshop171017 giab for giab grc workshop
171017 giab for giab grc workshop
Genome Reference Consortium
 
101717.kh miga ashg_grc
101717.kh miga ashg_grc101717.kh miga ashg_grc
101717.kh miga ashg_grc
Genome Reference Consortium
 
AGBT2017 Reference Workshop: Fulton
AGBT2017 Reference Workshop: FultonAGBT2017 Reference Workshop: Fulton
AGBT2017 Reference Workshop: Fulton
Genome Reference Consortium
 
AGBT2017 Reference Workshop: Lindsay
AGBT2017 Reference Workshop: LindsayAGBT2017 Reference Workshop: Lindsay
AGBT2017 Reference Workshop: Lindsay
Genome Reference Consortium
 
Haplotype resolved structural variation assembly with long reads
Haplotype resolved structural variation assembly with long readsHaplotype resolved structural variation assembly with long reads
Haplotype resolved structural variation assembly with long reads
Genome Reference Consortium
 
Everyday de novo diploid assembly
Everyday de novo diploid assemblyEveryday de novo diploid assembly
Everyday de novo diploid assembly
Genome Reference Consortium
 
Genome in a Bottle
Genome in a BottleGenome in a Bottle
Genome in a Bottle
Genome Reference Consortium
 
ClinVar: Getting the most from the reference assembly and reference materials
ClinVar: Getting the most from the reference assembly and reference materialsClinVar: Getting the most from the reference assembly and reference materials
ClinVar: Getting the most from the reference assembly and reference materials
Genome Reference Consortium
 

More from Genome Reference Consortium (20)

What's new and what's next for the human reference assembly?
What's new and what's next for the human reference assembly?What's new and what's next for the human reference assembly?
What's new and what's next for the human reference assembly?
 
Advancements in the human genome reference assembly (GRCh38)
Advancements in the human genome reference assembly (GRCh38)Advancements in the human genome reference assembly (GRCh38)
Advancements in the human genome reference assembly (GRCh38)
 
Telomere-to-telomere assembly of a complete human chromosomes
Telomere-to-telomere assembly of a complete human chromosomesTelomere-to-telomere assembly of a complete human chromosomes
Telomere-to-telomere assembly of a complete human chromosomes
 
Genome variation graphs with the vg toolkit
Genome variation graphs with the vg toolkitGenome variation graphs with the vg toolkit
Genome variation graphs with the vg toolkit
 
The Matched Annotation from NCBI and EMBL-EBI (MANE) Project
The Matched Annotation from NCBI and EMBL-EBI (MANE) ProjectThe Matched Annotation from NCBI and EMBL-EBI (MANE) Project
The Matched Annotation from NCBI and EMBL-EBI (MANE) Project
 
Why graph genome storage and updating wakes me up at 4 am
Why graph genome storage and updating wakes me up at 4 amWhy graph genome storage and updating wakes me up at 4 am
Why graph genome storage and updating wakes me up at 4 am
 
Schneider grc workshop_final
Schneider grc workshop_finalSchneider grc workshop_final
Schneider grc workshop_final
 
Mane v2 final
Mane v2 finalMane v2 final
Mane v2 final
 
Lrg and mane 16 oct 2018
Lrg and mane   16 oct 2018Lrg and mane   16 oct 2018
Lrg and mane 16 oct 2018
 
20181016 grc presentation-pa
20181016 grc presentation-pa20181016 grc presentation-pa
20181016 grc presentation-pa
 
2018 1016 trio_binning_ashg_arhie_final
2018 1016 trio_binning_ashg_arhie_final2018 1016 trio_binning_ashg_arhie_final
2018 1016 trio_binning_ashg_arhie_final
 
Ashg sedlazeck grc_share
Ashg sedlazeck grc_shareAshg sedlazeck grc_share
Ashg sedlazeck grc_share
 
171017 giab for giab grc workshop
171017 giab for giab grc workshop171017 giab for giab grc workshop
171017 giab for giab grc workshop
 
101717.kh miga ashg_grc
101717.kh miga ashg_grc101717.kh miga ashg_grc
101717.kh miga ashg_grc
 
AGBT2017 Reference Workshop: Fulton
AGBT2017 Reference Workshop: FultonAGBT2017 Reference Workshop: Fulton
AGBT2017 Reference Workshop: Fulton
 
AGBT2017 Reference Workshop: Lindsay
AGBT2017 Reference Workshop: LindsayAGBT2017 Reference Workshop: Lindsay
AGBT2017 Reference Workshop: Lindsay
 
Haplotype resolved structural variation assembly with long reads
Haplotype resolved structural variation assembly with long readsHaplotype resolved structural variation assembly with long reads
Haplotype resolved structural variation assembly with long reads
 
Everyday de novo diploid assembly
Everyday de novo diploid assemblyEveryday de novo diploid assembly
Everyday de novo diploid assembly
 
Genome in a Bottle
Genome in a BottleGenome in a Bottle
Genome in a Bottle
 
ClinVar: Getting the most from the reference assembly and reference materials
ClinVar: Getting the most from the reference assembly and reference materialsClinVar: Getting the most from the reference assembly and reference materials
ClinVar: Getting the most from the reference assembly and reference materials
 

Recently uploaded

Hemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptxHemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptx
muralinath2
 
What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.
moosaasad1975
 
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
NathanBaughman3
 
Structures and textures of metamorphic rocks
Structures and textures of metamorphic rocksStructures and textures of metamorphic rocks
Structures and textures of metamorphic rocks
kumarmathi863
 
Hemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptxHemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptx
muralinath2
 
in vitro propagation of plants lecture note.pptx
in vitro propagation of plants lecture note.pptxin vitro propagation of plants lecture note.pptx
in vitro propagation of plants lecture note.pptx
yusufzako14
 
Mammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also FunctionsMammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also Functions
YOGESH DOGRA
 
erythropoiesis-I_mechanism& clinical significance.pptx
erythropoiesis-I_mechanism& clinical significance.pptxerythropoiesis-I_mechanism& clinical significance.pptx
erythropoiesis-I_mechanism& clinical significance.pptx
muralinath2
 
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Sérgio Sacani
 
Structural Classification Of Protein (SCOP)
Structural Classification Of Protein  (SCOP)Structural Classification Of Protein  (SCOP)
Structural Classification Of Protein (SCOP)
aishnasrivastava
 
EY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptxEY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptx
AlguinaldoKong
 
Nutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technologyNutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technology
Lokesh Patil
 
Multi-source connectivity as the driver of solar wind variability in the heli...
Multi-source connectivity as the driver of solar wind variability in the heli...Multi-source connectivity as the driver of solar wind variability in the heli...
Multi-source connectivity as the driver of solar wind variability in the heli...
Sérgio Sacani
 
Citrus Greening Disease and its Management
Citrus Greening Disease and its ManagementCitrus Greening Disease and its Management
Citrus Greening Disease and its Management
subedisuryaofficial
 
general properties of oerganologametal.ppt
general properties of oerganologametal.pptgeneral properties of oerganologametal.ppt
general properties of oerganologametal.ppt
IqrimaNabilatulhusni
 
NuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final versionNuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final version
pablovgd
 
Cancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate PathwayCancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate Pathway
AADYARAJPANDEY1
 
extra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdfextra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdf
DiyaBiswas10
 
Richard's entangled aventures in wonderland
Richard's entangled aventures in wonderlandRichard's entangled aventures in wonderland
Richard's entangled aventures in wonderland
Richard Gill
 
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Ana Luísa Pinho
 

Recently uploaded (20)

Hemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptxHemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptx
 
What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.
 
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
 
Structures and textures of metamorphic rocks
Structures and textures of metamorphic rocksStructures and textures of metamorphic rocks
Structures and textures of metamorphic rocks
 
Hemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptxHemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptx
 
in vitro propagation of plants lecture note.pptx
in vitro propagation of plants lecture note.pptxin vitro propagation of plants lecture note.pptx
in vitro propagation of plants lecture note.pptx
 
Mammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also FunctionsMammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also Functions
 
erythropoiesis-I_mechanism& clinical significance.pptx
erythropoiesis-I_mechanism& clinical significance.pptxerythropoiesis-I_mechanism& clinical significance.pptx
erythropoiesis-I_mechanism& clinical significance.pptx
 
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
 
Structural Classification Of Protein (SCOP)
Structural Classification Of Protein  (SCOP)Structural Classification Of Protein  (SCOP)
Structural Classification Of Protein (SCOP)
 
EY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptxEY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptx
 
Nutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technologyNutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technology
 
Multi-source connectivity as the driver of solar wind variability in the heli...
Multi-source connectivity as the driver of solar wind variability in the heli...Multi-source connectivity as the driver of solar wind variability in the heli...
Multi-source connectivity as the driver of solar wind variability in the heli...
 
Citrus Greening Disease and its Management
Citrus Greening Disease and its ManagementCitrus Greening Disease and its Management
Citrus Greening Disease and its Management
 
general properties of oerganologametal.ppt
general properties of oerganologametal.pptgeneral properties of oerganologametal.ppt
general properties of oerganologametal.ppt
 
NuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final versionNuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final version
 
Cancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate PathwayCancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate Pathway
 
extra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdfextra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdf
 
Richard's entangled aventures in wonderland
Richard's entangled aventures in wonderlandRichard's entangled aventures in wonderland
Richard's entangled aventures in wonderland
 
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
 

Variant Calling II

  • 1. Variant calling while accounting for alternate haplotypes Aaron Quinlan University of Virginia ! quinlanlab.org Genome Reference Consortium Workshop Sept 21, 2014
  • 2. Motivation: HG has long had alt loci, but we don’t handle them properly Deanna Church and Brad Holmes Regions w/ alternate loci (261 in total) http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/human/
  • 3. A case study: The MAPT locus (17q21.31) doi:10.1038/nrneurol.2012.169 Disease phenotypes associated with each haplotype H2 positively selected in Europeans, carriers predisposed to 17q21.31 microdeletion syndrome via NAHR H1 associated with Alzheimer and Parkinson diseases
  • 4. A case study: The MAPT locus (17q21.31) Scale chr17: 7 Vert. El Multiz Align RepeatMasker 1 Mb hg38 45,500,000 46,000,000 46,500,000 47,000,000 Contigs New to GRCh38/(hg38), Not Carried Forward from GRCh37/(hg19) GRCh38 Alignments to the Alternate Sequences/Haplotypes GRCh38 Haplotype to Reference Sequence Mapping Correspondence RefSeq Genes Multiz Alignment & Conservation (7 Species) Repeating Elements by RepeatMasker chr17_GL000258v2_alt chr17_KI270908v1_alt chr17_GL000258v2_alt chr17_KI270908v1_alt chr17_GL000258v2_alt KIF18B KIF18B MIR6783 C1QL1 DCAKD DCAKD DCAKD DCAKD NMT1 PLCD3 MIR6784 ACBD4 ACBD4 ACBD4 ACBD4 ACBD4 HEXIM1 HEXIM2 FMNL1 MAP3K14-AS1 MAP3K14-AS1 MAP3K14-AS1 SPATA32 MAP3K14-AS1 MAP3K14-AS1 MAP3K14 ARHGAP27 ARHGAP27 ARHGAP27 PLEKHM1 PLEKHM1 PLEKHM1 MIR4315-2 MIR4315-1 LRRC37A4P LOC644172 CRHR1 MGC57346 MGC57346 CRHR1-IT1 CRHR1-IT1 CRHR1 CRHR1 CRHR1 CRHR1 MAPT-AS1 SPPL2C MAPT MAPT MAPT MAPT MAPT MAPT MAPT MAPT MAPT-IT1 STH KANSL1 KANSL1 KANSL1 KANSL1-AS1 LOC644172 LRRC37A ARL17B ARL17A ARL17B NSFP1 ARL17A LRRC37A2 ARL17A ARL17A ARL17A ARL17A ARL17B NSF NSF NSFP1 WNT3 WNT9B GOSR2 GOSR2 GOSR2 MIR5089 RPRML
  • 5. H1 and inverted H2 don’t recombine Extensive LD owing to suppressed recombination between H1 and inverted H2
  • 6. Several SNPs distiguish H1 from H2 …
  • 7. Kitzman et al: NA10847 is heterozygous H1/H2
  • 8. Kitzman et al: NA10847 is heterozygous H1/H2NA10847
  • 9. How do we support variant detection with alternate alleles using the VCF format?
  • 10. A G A C A C T A G T C T H1 (chr17) H2 (GL000258v2_alt) NA10847 is heterozygous H1/H2
  • 11. A G A C A C T A G T C T H1 (chr17) H2 (GL000258v2_alt) A G A C A C T A G T C T NA10847 (het. H1/H2) NA10847 is heterozygous H1/H2
  • 12. A G A C A C T A G T C T H1 (chr17) H2 (GL000258v2_alt) A G A C A C T A G T C T NA10847 (het. H1/H2) NA10847 is heterozygous H1/H2 A G A C A C T A G T C T A G A C A C T A G T C T Requires that we align reads to both loci* * We need to be able to distinguish multiple alignments arising from ALT loci versus multiple mappings arising from segdups, repetitive elements. Else, MAPQs penalized and/or alignments not reported, depending on the behavior of the aligner. ! Heng Li has started a discussion about how best to make BWA alt-aware Colin Hercus has started a discussion about how best to make Novoalign alt-aware
  • 13. chr17 A T 0/1 chr17 G A 0/1 chr17 A G 0/1 chr17 C T 0/1 chr17 A C 0/1 chr17 C T 0/1 GL000258v2_alt T A 0/1 GL000258v2_alt A G 0/1 GL000258v2_alt G A 0/1 GL000258v2_alt T C 0/1 GL000258v2_alt C A 0/1 GL000258v2_alt T C 0/1 VCF entries for chr17 (H1) VCF entries for alt locus (H2) Variant calling NA10847 is heterozygous H1/H2
  • 14. chr17 A T 0/1 chr17 G A 0/1 chr17 A G 0/1 chr17 C T 0/1 chr17 A C 0/1 chr17 C T 0/1 GL000258v2_alt T A 0/1 GL000258v2_alt A G 0/1 GL000258v2_alt G A 0/1 GL000258v2_alt T C 0/1 GL000258v2_alt C A 0/1 GL000258v2_alt T C 0/1 VCF entries for chr17 (H1) VCF entries for alt locus (H2) Variant calling NA10847 is heterozygous H1/H2 Note: Allelic relationship is not reflected in “raw” VCF
  • 15. Proposal: Develop a downstream tool that leverages informative SNPs to distinguish and assign haplotypes at alt loci via a standard VCF file. Intermediate solution until variant callers handle this complexity natively
  • 17. Overview “Raw” V/BCF file “database” of alt loci & inform. SNPs alt_locus main_locus GL000258.2 (H2) chr17:45309498-46836265 (H1) infor. markers GRCh38 position H1 H2 rs241039 45637307 A T rs2049515 45684490 C T . . .
  • 18. Overview “Raw” V/BCF file “database” of alt loci & inform. SNPs New tool(s) alt_locus main_locus GL000258.2 (H2) chr17:45309498-46836265 (H1) infor. markers GRCh38 position H1 H2 rs241039 45637307 A T rs2049515 45684490 C T . . .
  • 19. Overview “Raw” V/BCF file “database” of alt loci & inform. SNPs New tool(s) Updated V/BCF file Haplotype predictions per indiv. alt_locus main_locus GL000258.2 (H2) chr17:45309498-46836265 (H1) infor. markers GRCh38 position H1 H2 rs241039 45637307 A T rs2049515 45684490 C T . . .
  • 20. Augment VCF with assembly information ##seq-info=<name=chr17, id=CM000679.2> ! ##region-info=<name=MAPT, id=GL000258.2, assoc_id=CM000679.2, reg=45309498-46836265> Based on ideas from Deanna Church
  • 21. Introduce new (reserved) VCF INFO tags ##INFO=<ID=ALTLOCS, Number=., Type=String, Description=“A list of the alternate loci in the reference genome that are associated with this locus”> ##INFO=<ID=ALTHAPS, Number=., Type=String, Description=“A list of the known haplotypes that are associated with this locus”> ##FORMAT=<ID=HT,Number=1,Type=String,Description=“Haplotype combination based on ALTHAPS">
  • 22. (Draft) example VCF output, post “correction” ##seq-info=<name=chr17, id=CM000679.2> ##region-info=<name=MAPT,id=GL000258.2,assoc_id=CM000679.2,reg=45309498-46836265> ##INFO=<ID=ALTLOC,Number=.,Type=String,Description=“A list of the alternate loci is the reference genome that are associated with this locus”> ##INFO=<ID=ALTHAP,Number=.,Type=String,Description=“A list of the known haplotypes that are associated with this locus”> ! #CHROM POS REF ALT INFO FORMAT NA10847 chr17 111 A T ALTLOCS=GL000258.2;ALTHAPs=H1,H2 GT:HT 0/1:0/1 chr17 222 G A ALTLOCS=GL000258.2;ALTHAPs=H1,H2 GT:HT 0/1:0/1 chr17 333 A G ALTLOCS=GL000258.2;ALTHAPS=H1,H2 GT:HT 0/1:0/1 . . . GL000258.2 111 A T ALTLOCS=chr17;ALTHAPs=H1,H2 GT:HT 0/1:0/1 GL000258.2 222 G A ALTLOCS=chr17;ALTHAPs=H1,H2 GT:HT 0/1:0/1 GL000258.2 333 A G ALTLOCS=chr17;ALTHAPS=H1,H2 GT:HT 0/1:0/1 . . .
  • 23. (Draft) example VCF output, post “correction” ##seq-info=<name=chr17, id=CM000679.2> ##region-info=<name=MAPT,id=GL000258.2,assoc_id=CM000679.2,reg=45309498-46836265> ##INFO=<ID=ALTLOC,Number=.,Type=String,Description=“A list of the alternate loci is the reference genome that are associated with this locus”> ##INFO=<ID=ALTHAP,Number=.,Type=String,Description=“A list of the known haplotypes that are associated with this locus”> ! #CHROM POS REF ALT INFO FORMAT NA10847 chr17 111 A T ALTLOCS=GL000258.2;ALTHAPs=H1,H2 GT:HT 0/1:0/1 chr17 222 G A ALTLOCS=GL000258.2;ALTHAPs=H1,H2 GT:HT 0/1:0/1 chr17 333 A G ALTLOCS=GL000258.2;ALTHAPS=H1,H2 GT:HT 0/1:0/1 . . . GL000258.2 111 A T ALTLOCS=chr17;ALTHAPs=H1,H2 GT:HT 0/1:0/1 GL000258.2 222 G A ALTLOCS=chr17;ALTHAPs=H1,H2 GT:HT 0/1:0/1 GL000258.2 333 A G ALTLOCS=chr17;ALTHAPS=H1,H2 GT:HT 0/1:0/1 . . . Solely two different haplotypes is the base case.
  • 24. For example NA10847 is actually H1/H2D H2D derived from ancestral H2. Markers that distinguish H2D from H2
  • 25. chr17 A T 0/1 chr17 G A 0/1 chr17 A G 0/1 chr17 C T 0/1 chr17 A C 0/1 chr17 C T 0/1 chr17 C T 0/1 chr17 C T 0/1 chr17 G A 0/1 chr17 A G 0/1 chr17 C T 0/1 GL000258v2_alt T A 0/1 GL000258v2_alt A G 0/1 GL000258v2_alt G A 0/1 GL000258v2_alt T C 0/1 GL000258v2_alt C A 0/1 GL000258v2_alt T C 0/1 GL000258v2_alt C T 0/1 GL000258v2_alt C T 0/1 GL000258v2_alt G A 0/1 GL000258v2_alt A G 0/1 GL000258v2_alt C T 0/1 VCF entries for chr17 VCF entries for alt locus Interpretation is much harder w/ many haplotypes H1 H2 H2D
  • 26. (Draft) example VCF output, post “correction” ##seq-info=<name=chr17, id=CM000679.2> ##region-info=<name=MAPT,id=GL000258.2,assoc_id=CM000679.2,reg=45309498-46836265> ##INFO=<ID=ALTLOC,Number=.,Type=String,Description=“A list of the alternate loci is the reference genome that are associated with this locus”> ##INFO=<ID=ALTHAP,Number=.,Type=String,Description=“A list of the known haplotypes that are associated with this locus”> ! #CHROM POS REF ALT INFO FORMAT NA10847 NA12878 NA21599 chr17 111 A T ALTLOCS=GL000258.2;ALTHAPs=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2 chr17 222 G A ALTLOCS=GL000258.2;ALTHAPs=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2 chr17 333 A G ALTLOCS=GL000258.2;ALTHAPS=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2 . . . GL000258.2 111 A T ALTLOCS=chr17;ALTHAPs=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2 GL000258.2 222 G A ALTLOCS=chr17;ALTHAPs=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2 GL000258.2 333 A G ALTLOCS=chr17;ALTHAPS=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2 . . .
  • 27. (Draft) example VCF output, post “correction” ##seq-info=<name=chr17, id=CM000679.2> ##region-info=<name=MAPT,id=GL000258.2,assoc_id=CM000679.2,reg=45309498-46836265> ##INFO=<ID=ALTLOC,Number=.,Type=String,Description=“A list of the alternate loci is the reference genome that are associated with this locus”> ##INFO=<ID=ALTHAP,Number=.,Type=String,Description=“A list of the known haplotypes that are associated with this locus”> ! #CHROM POS REF ALT INFO FORMAT NA10847 NA12878 NA21599 chr17 111 A T ALTLOCS=GL000258.2;ALTHAPs=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2 chr17 222 G A ALTLOCS=GL000258.2;ALTHAPs=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2 chr17 333 A G ALTLOCS=GL000258.2;ALTHAPS=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2 . . . GL000258.2 111 A T ALTLOCS=chr17;ALTHAPs=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2 GL000258.2 222 G A ALTLOCS=chr17;ALTHAPs=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2 GL000258.2 333 A G ALTLOCS=chr17;ALTHAPS=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2 . . . NA10847 chr17 45309498 46836265 H1 H2D rs241039,rs434428,… NA12878 chr17 45309498 46836265 H1 H1 rs241039,rs434428,… NA21599 chr17 45309498 46836265 H2D H2D rs241039,rs434428,… sample chrom start end hap1 hap2 markers
  • 28. (Draft) example VCF output, post “correction” ##seq-info=<name=chr17, id=CM000679.2> ##region-info=<name=MAPT,id=GL000258.2,assoc_id=CM000679.2,reg=45309498-46836265> ##INFO=<ID=ALTLOC,Number=.,Type=String,Description=“A list of the alternate loci is the reference genome that are associated with this locus”> ##INFO=<ID=ALTHAP,Number=.,Type=String,Description=“A list of the known haplotypes that are associated with this locus”> ! #CHROM POS REF ALT INFO FORMAT NA10847 NA12878 NA21599 chr17 111 A T ALTLOCS=GL000258.2;ALTHAPs=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2 chr17 222 G A ALTLOCS=GL000258.2;ALTHAPs=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2 chr17 333 A G ALTLOCS=GL000258.2;ALTHAPS=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2 . . . GL000258.2 111 A T ALTLOCS=chr17;ALTHAPs=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2 GL000258.2 222 G A ALTLOCS=chr17;ALTHAPs=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2 GL000258.2 333 A G ALTLOCS=chr17;ALTHAPS=H1,H2,H2D GT:HT 0/1:0/2 0/0:0/0 1/1:2/2 . . . NA10847 chr17 45309498 46836265 H1 H2D rs241039,rs434428,… NA12878 chr17 45309498 46836265 H1 H1 rs241039,rs434428,… NA21599 chr17 45309498 46836265 H2D H2D rs241039,rs434428,… sample chrom start end hap1 hap2 markers
  • 29. The good and the bad Good • No burden on existing variant callers to adapt to calling w.r.t. alt loci • Tool for updating VCF can be updated and improved in parallel with variant callers. Bad • One more step / file in the variant interpretation pipeline • This strategy is only applicable to cases where informative markers exist. Use CNVs in WGS: e.g., KANSL1 partial duplications to distinguish MAPT alt loci
  • 30. The good and the bad Good • No burden on existing variant callers to adapt to calling w.r.t. alt loci • Tool for updating VCF can be updated and improved in parallel with variant callers. Bad • One more step / file in the variant interpretation pipeline • This strategy is only applicable to cases where informative markers exist. Use CNVs in WGS: e.g., KANSL1 partial duplications to distinguish MAPT alt loci To Do / Discuss • How best to improve alignment strategies to facilitate variant and haplotype detection? • How to best represent the resolved alternate loci in VCF format?
  • 31. Many thanks for helpful discussions with: Deanna Church Brad Holmes Karyn Meltz-Steinberg Heng Li Colin Hercus