Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Genome variation graphs with the vg toolkit

69 views

Published on

Presentation at 2019 ASHG GRC/GIAB workshop describing features and recent updates to the vg toolkit, including examples of comparisons to other methods used for alignment and variant detection.

Published in: Science
  • Be the first to comment

  • Be the first to like this

Genome variation graphs with the vg toolkit

  1. 1. Genome variation graphs with the vg toolkit Jean Monlong Updates from the GRC & GIAB Oct 15, 2019
  2. 2. Variation Graphs An approach to incorporating information on human diversity into the genomic reference. 2
  3. 3. Sequencing reads map better on variation graphs 3
  4. 4. is a complete, open source solution for graph construction, read mapping, and variant calling. https://github.com/vgteam/vg 4
  5. 5. Garrison et al (2018). Variation graph toolkit improves read mapping by representing genetic variation in the reference. Nature Biotechnology. Better read mapping in regions with variants 5
  6. 6. Better allele balances at heterozygous sites Deletion Size Insertion Size SNP AltAlleleFraction Garrison et al (2018). Variation graph toolkit improves read mapping by representing genetic variation in the reference. Nature Biotechnology. 6
  7. 7. Structural Variants (SVs) Genomic variants >50bp. E.g. insertions, deletions, inversions. 7
  8. 8. Graph-based SV genotypers: vg, Paragraph, BayesTyper Traditional SV genotypers: Delly, SVTyper Genotyping SVs from short-read data High-quality SV catalogs from long-read sequencing studies (HGSVC 2019, GIAB 2019, SVPOP 2019). Hickey et al (2019). Genotyping structural variants in pangenome graphs using the vg toolkit. bioRxiv. 8
  9. 9. Graph from de novo assemblies Experiment with 12 yeast strains. - better read mapping. - SV better supported by reads. Hickey et al (2019). Genotyping structural variants in pangenome graphs using the vg toolkit. bioRxiv. 9
  10. 10. Variant normalization construct graph realign Robin Rounthwaite 10
  11. 11. Transcriptome analysis with variation graphs Jonas Sibbesen https://github.com/jonassibbesen/rpvg 11
  12. 12. New graph formats optimized for memory and speed VG: Legacy PG (PackedGraph): Small HG (HashGraph): Fast OG (Optimized Dynamic Graph Index): Balanced XG: Small and fast, but static Jordan Eizenga, Emily Fujimoto, Erik Garrison https://github.com/vgteam/libbdsg 12
  13. 13. 13 Graph Burrows-Wheeler Transform Stores 2,504 haplotypes from 1K Genomes Project in 14.6 GiB Can scale past tens of thousands of haplotypes Sirén et al. (2018). Haplotype-aware graph indexes. In 18th International Workshop on Algorithms in Bioinformatics (WABI) https://github.com/jltsiren/gbwtgraph
  14. 14. Faster short read mapping with giraffe Xian Chang, Jouni Siren, Adam Novak Assuming that most indels are in the graph and a large haplotype database. - Minimizer-based indexing. - Restrict to known haplotypes. - Gapless extension. - Faster clustering using a snarl tree. 14
  15. 15. https://github.com/vgteam/sequenceTubeMap Visualization Beyer et al. (2019). Sequence tube maps: making graph genomes intuitive to commuters. Bioinformatics (Oxford, England). 15
  16. 16. Pipelines in Toil and WDL https://github.com/vgteam/toil-vg https://github.com/vgteam/vg_wdl toil-vg WDL Graph construction x Read mapping x x Variant calling( SNV/indels & SVs) x x Charles Markello, Glenn Hickey, Mike Lin, Eric T. Dawson 16
  17. 17. Acknowledgements Benedict Paten Adam Novak Erik Garrison Jordan Eizenga Glenn Hickey Jouni Siren David Heller Check out Charles Markello’ Platform Talk (PgmNr 15) on applying vg for rare variant discovery! https://github.com/vgteam/vg 17 Jonas Sibbesen Xian Chang Charles Markello Yohei Rosen Robin Rounthwaite Susanna Morin Emily Fujimoto Eric Dawson Mike Lin Wolfgang Beyer Richard Durbin Daniel Zerbino

×