Variation reference graphs and
the variation graph toolkit vg
Erik Garrison, Jouni Sirén, Eric Dawson, Richard Durbin
Wellcome Trust Sanger Institute
Adam Novak, Benedict Paten et al., UCSC
and many others
Variation Reference
• Go beyond a linear reference
– Why a (quasi)-linear reference and a catalog
of variants which we keep finding again?
• Local variation: graph reference
– Map to a structure including known variation
– >99% variants per person already seen
• Long range variation: haplotype structure
– Exploit variation sharing – support phasing
– Recombination rate ~ mutation rate
– >99% recombination breakpoints per person
Variation graphs: “Pan Genome”
A variation graph represents many genomes in
one non-redundant structure.
Nodes contain sequence and edges between the ends
of nodes represent potential links between successive
sequences
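A minimal sketch of the structure described on this slide, assuming a simple in-memory representation: nodes map an id to a sequence, edges are ordered pairs of node ids, and a genome is spelled out by a walk through the graph. The class name `VariationGraph` and its methods are hypothetical and are not vg's API.

```python
from typing import Dict, List, Set, Tuple


class VariationGraph:
    """Toy variation graph: sequence on nodes, edges between successive nodes."""

    def __init__(self) -> None:
        self.nodes: Dict[int, str] = {}          # node id -> sequence
        self.edges: Set[Tuple[int, int]] = set() # (from id, to id)

    def add_node(self, node_id: int, sequence: str) -> None:
        self.nodes[node_id] = sequence

    def add_edge(self, from_id: int, to_id: int) -> None:
        self.edges.add((from_id, to_id))

    def spell(self, walk: List[int]) -> str:
        """Concatenate node sequences along a walk, checking every link exists."""
        for a, b in zip(walk, walk[1:]):
            if (a, b) not in self.edges:
                raise ValueError(f"no edge from {a} to {b}")
        return "".join(self.nodes[node_id] for node_id in walk)


# A single SNP (C/T) stored once as a bubble instead of as two full genomes.
graph = VariationGraph()
for node_id, seq in [(1, "AGCT"), (2, "C"), (3, "T"), (4, "GGA")]:
    graph.add_node(node_id, seq)
for edge in [(1, 2), (1, 3), (2, 4), (3, 4)]:
    graph.add_edge(*edge)

genome_a = graph.spell([1, 2, 4])  # "AGCTCGGA"
genome_b = graph.spell([1, 3, 4])  # "AGCTTGGA"
```

Both genomes share nodes 1 and 4, which is the sense in which the structure is non-redundant.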
Variation graphs and train
tracks
The links in a variation graph are bidirectional.
They behave in many ways like train tracks.
Nodes have positive and negative strands, allowing them to be
traversed in either direction, and can be connected to form loops
(repeats), inversions and translocations.
NB: There are other ways to do this. One can put sequence on edges,
or use unidirectional graphs that are (nearly) twice as big.
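The train-track behaviour can be sketched as follows, assuming each step of a walk is a (node id, strand) pair and that an edge stored in one direction may also be followed in the opposite direction on the opposite strands. The names `BidirectedGraph`, `flip` and `reverse_complement` are illustrative, not taken from vg.

```python
from typing import Dict, List, Set, Tuple

Step = Tuple[int, str]  # (node id, "+" or "-")

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def flip(step: Step) -> Step:
    node_id, strand = step
    return (node_id, "-" if strand == "+" else "+")


class BidirectedGraph:
    def __init__(self) -> None:
        self.nodes: Dict[int, str] = {}
        self.edges: Set[Tuple[Step, Step]] = set()

    def add_node(self, node_id: int, sequence: str) -> None:
        self.nodes[node_id] = sequence

    def add_edge(self, a: Step, b: Step) -> None:
        self.edges.add((a, b))

    def connected(self, a: Step, b: Step) -> bool:
        # An edge a -> b can also be read backwards as flip(b) -> flip(a).
        return (a, b) in self.edges or (flip(b), flip(a)) in self.edges

    def spell(self, walk: List[Step]) -> str:
        for a, b in zip(walk, walk[1:]):
            if not self.connected(a, b):
                raise ValueError(f"no edge from {a} to {b}")
        return "".join(
            self.nodes[n] if strand == "+" else reverse_complement(self.nodes[n])
            for n, strand in walk
        )


g = BidirectedGraph()
for node_id, seq in [(1, "ACG"), (2, "TTC"), (3, "GGA")]:
    g.add_node(node_id, seq)
g.add_edge((1, "+"), (2, "+"))
g.add_edge((2, "+"), (3, "+"))
# Two extra edges let node 2 be entered on its negative strand: an inversion.
g.add_edge((1, "+"), (2, "-"))
g.add_edge((2, "-"), (3, "+"))

forward = g.spell([(1, "+"), (2, "+"), (3, "+")])   # "ACGTTCGGA"
inverted = g.spell([(1, "+"), (2, "-"), (3, "+")])  # "ACGGAAGGA"
backward = g.spell([(3, "-"), (2, "-"), (1, "-")])  # "TCCGAACGT"
```

The last walk uses no additional edges: it is the forward walk read on the other strand, which is why a unidirectional encoding of the same information needs roughly twice as many nodes.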
Essential operations on pan-genomes
“Computational Pan-Genomics: Status, Promises and Challenges.”
Computational Pan-Genomics Consortium. Briefings in Bioinformatics (2016), in press.
github.com/vgteam/vg
Operations
implemented in vg
Implementation in vg
• Nodes with sequence, Edges, Paths,
Mappings
• Alignment tools and .gam format
• Serialisation to disk via protobuf, succinct
representation xg, graph building/editing,
extraction, unrolling and DAGification of local
graphs etc.
https://github.com/vgteam/vg
Alignment
• k-mer based alignment of short reads to a variation graph
• Store results in Graph Alignment Map (GAM) format
• Figure labels: read, k-mers, find k-mer subgraphs, node ids, hit clusters, cluster ids, target subgraph, partial order alignment
• Example sequence from the figure:
AGCTCTCCTTGTCCCTCCTACGATCTCTTCACTGGCCTCTTATCTTTACTGTTACCAAATCTTTCCGGAAGCTGCTCTTTC
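The seeding stages named on this slide (read k-mers, node id hits, hit clusters) can be sketched roughly as below. This is an illustration under stated assumptions, not vg's mapper: k-mers are indexed only along an explicit list of walks, node ids are assumed to be roughly sorted along the graph so that nearby ids mean nearby nodes, and the final partial order alignment step is left out. All function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def index_walks(nodes: Dict[int, str], walks: List[List[int]], k: int) -> Dict[str, Set[int]]:
    """Map each k-mer to the ids of the nodes it overlaps on any given walk."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for walk in walks:
        seq = ""
        owner: List[int] = []  # owner[i] is the node that contributed base i
        for node_id in walk:
            seq += nodes[node_id]
            owner.extend([node_id] * len(nodes[node_id]))
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].update(owner[i:i + k])
    return index


def seed_hits(index: Dict[str, Set[int]], read: str, k: int) -> Dict[int, int]:
    """Count, for each node id, how many k-mers of the read touch it."""
    hits: Dict[int, int] = defaultdict(int)
    for i in range(len(read) - k + 1):
        for node_id in index.get(read[i:i + k], ()):
            hits[node_id] += 1
    return dict(hits)


def cluster_hits(hits: Dict[int, int], max_gap: int = 2) -> List[List[int]]:
    """Group hit node ids whose ids differ by at most max_gap."""
    clusters: List[List[int]] = []
    for node_id in sorted(hits):
        if clusters and node_id - clusters[-1][-1] <= max_gap:
            clusters[-1].append(node_id)
        else:
            clusters.append([node_id])
    return clusters


nodes = {1: "AGCTC", 2: "A", 3: "G", 4: "TCCTT", 10: "CATGCATG"}
walks = [[1, 2, 4], [1, 3, 4], [10]]
kmer_index = index_walks(nodes, walks, k=4)

read = "CTCGTCC"  # drawn from the walk 1 -> 3 -> 4
hits = seed_hits(kmer_index, read, k=4)   # {1: 3, 3: 4, 4: 3}
clusters = cluster_hits(hits)             # [[1, 3, 4]]
```

The nodes in the best cluster would define the target subgraph handed to the alignment step.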
Alternative index: GCSA2
• Generalised Compressed Suffix Array
– Jouni Sirén, Niko Välimäki, Veli Mäkinen
• Natural extension of BWT to graphs
– Essentially set of minimal unique k-mers
with one base prefix extension
– Supports compression, FM-index style
search etc.
• Now implemented for vg graph search
– <20GB index, fast SMEM (Super-Maximal Exact Match) seed and extend
Jouni Sirén talk tomorrow
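For readers unfamiliar with "FM-index style search", here is a sketch of backward search over the BWT of an ordinary linear string. It is not GCSA2 and does not handle graphs; it only illustrates the search primitive that GCSA2 generalises. The index is built naively (sorting suffixes, storing full occurrence tables), which is only reasonable for toy inputs, and the function names are made up for this example.

```python
from typing import Dict, List, Tuple


def build_fm_index(text: str) -> Tuple[List[int], Dict[str, int], Dict[str, List[int]]]:
    """Return (suffix array, C table, occurrence table) for text + '$'."""
    text += "$"  # sentinel, sorts before A, C, G, T
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)  # text[-1] is '$' when i == 0
    alphabet = sorted(set(text))
    # c_table[c]: number of characters in text strictly smaller than c
    c_table: Dict[str, int] = {}
    total = 0
    for c in alphabet:
        c_table[c] = total
        total += text.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i]
    occ: Dict[str, List[int]] = {c: [0] for c in alphabet}
    for ch in bwt:
        for c in alphabet:
            occ[c].append(occ[c][-1] + (ch == c))
    return sa, c_table, occ


def locate(pattern: str, sa: List[int], c_table: Dict[str, int],
           occ: Dict[str, List[int]]) -> List[int]:
    """Backward search: narrow the suffix array interval one base at a time."""
    lo, hi = 0, len(sa)  # half-open interval of matching suffixes
    for ch in reversed(pattern):
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[i] for i in range(lo, hi))


sa, c_table, occ = build_fm_index("AGCTCTCCTTG")
ct_positions = locate("CT", sa, c_table, occ)  # [2, 4, 7]
tc_positions = locate("TC", sa, c_table, occ)  # [3, 5]
```

Each step of the loop prepends one base to the match, so a pattern of length m is found in m steps regardless of text length.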
Pilot alignment and
variant calling
evaluation
Slides from Benedict Paten and collaborators
Genotyper output
The genotyper considers support for every bubble based on
embedded paths and emits genotypes as Locus records that
are each a set of alleles represented as paths relative to the
base graph.
Most variants are within the reference.
Also consider new variants by (temporarily)
augmenting the graph to include repeatedly seen
alignment alternatives.
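The idea of scoring a bubble by the reads that follow each allele path can be sketched as below. This is an assumption-laden toy, not the vg genotyper: alleles and read alignments are plain lists of node ids, a read supports an allele only if it contains the whole allele path contiguously, the call is diploid, and the `min_fraction` threshold is an arbitrary illustrative value. The returned dictionary only loosely mirrors the idea of a Locus record.

```python
from typing import Dict, List, Optional


def contains(path: List[int], sub: List[int]) -> bool:
    """True if sub occurs as a contiguous run inside path."""
    n = len(sub)
    return any(path[i:i + n] == sub for i in range(len(path) - n + 1))


def allele_support(alleles: List[List[int]], read_paths: List[List[int]]) -> List[int]:
    """Number of reads whose path through the graph includes each allele path."""
    return [sum(contains(read, allele) for read in read_paths) for allele in alleles]


def call_genotype(alleles: List[List[int]], read_paths: List[List[int]],
                  min_fraction: float = 0.2) -> Optional[Dict]:
    support = allele_support(alleles, read_paths)
    total = sum(support)
    if total == 0:
        return None  # no informative reads at this bubble
    ranked = sorted(range(len(alleles)), key=lambda i: -support[i])
    best = ranked[0]
    second = ranked[1] if len(ranked) > 1 else best
    if second != best and support[second] / total >= min_fraction:
        genotype = tuple(sorted((best, second)))  # heterozygous
    else:
        genotype = (best, best)                   # homozygous
    return {"alleles": alleles, "support": support, "genotype": genotype}


# One bubble with two alleles expressed as paths relative to the graph.
alleles = [[1, 2, 4], [1, 3, 4]]
reads = [[1, 2, 4], [1, 2, 4], [1, 3, 4], [2, 4], [1, 3, 4, 5]]
het_call = call_genotype(alleles, reads)
hom_call = call_genotype(alleles, [[1, 2, 4]] * 5)
```

Here the read `[2, 4]` does not span the bubble and so supports neither allele; a real genotyper would weigh partial support and base qualities rather than discard it.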
Genotype evaluation
mix CHM1/13 Illumina reads – truth from PacBio
Panels: MHC, BRCA1
An architecture supporting stable coordinates
Coordinates in vg are not stable across graph edits.
But we can retain a mapping from new to old coordinates when editing.
This translation provides a stable coordinate system for VGs, solves the
surjection problem, and enables a virtuous feedback loop!
Diagram labels: Reference Graph; Aligned Reads; Augmented Graph & Alignments;
Translation; Alignments, Paths, Genotypes, and Annotations Relative to the
Augmented Graph
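A small sketch of the translation idea, assuming the only edit is splitting a node and that positions are (node id, offset) pairs on the forward strand. Each node created by an edit remembers which original node and offset it came from, so positions in the edited graph can be mapped back to the original coordinates. The class `TranslatingGraph` is hypothetical and is not vg's Translation structure; nodes carrying novel sequence would need a separate convention.

```python
from typing import Dict, Tuple


class TranslatingGraph:
    def __init__(self, nodes: Dict[int, str]) -> None:
        self.nodes = dict(nodes)
        # current node id -> (original node id, offset within the original node)
        self.translation: Dict[int, Tuple[int, int]] = {n: (n, 0) for n in nodes}
        self.next_id = max(nodes) + 1

    def split(self, node_id: int, offset: int) -> Tuple[int, int]:
        """Split a node in two at offset, recording where each half came from."""
        seq = self.nodes[node_id]
        if not 0 < offset < len(seq):
            raise ValueError("offset must fall strictly inside the node")
        del self.nodes[node_id]
        orig_id, orig_offset = self.translation.pop(node_id)
        left, right = self.next_id, self.next_id + 1
        self.next_id += 2
        self.nodes[left] = seq[:offset]
        self.nodes[right] = seq[offset:]
        self.translation[left] = (orig_id, orig_offset)
        self.translation[right] = (orig_id, orig_offset + offset)
        return left, right

    def to_original(self, node_id: int, offset: int) -> Tuple[int, int]:
        """Translate a position in the edited graph back to original coordinates."""
        orig_id, base = self.translation[node_id]
        return orig_id, base + offset


tg = TranslatingGraph({1: "ACGTACGT"})
left, right = tg.split(1, 3)           # nodes 2 ("ACG") and 3 ("TACGT")
first = tg.to_original(right, 2)       # (1, 5)
left2, right2 = tg.split(right, 2)     # nodes 4 ("TA") and 5 ("CGT")
second = tg.to_original(right2, 1)     # (1, 6)
```

Because the table is updated at every edit, repeated edits still resolve to the original node, which is what lets results computed on an augmented graph be reported against the base graph.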
Thank you
Erik Garrison, Jouni Sirén, Eric Dawson,
Jerven Bolleman, Adam Novak, Glen
Hickey, Benedict Paten, Will Jones, Jordan
Eizenga, Toshiaki Katayama, Orion Buske,
Raoul Bonnal, Mike Lin, and many others
who have helped us understand, design,
implement and evaluate vg.