GRCWorkshop_geval_1KG_slides

Genome Reference Consortium
gEVAL Browser and the GRC TrackHub 
Laura Clarke 
http://geval.sanger.ac.uk/
• gEVAL Browser 
• GRC TrackHub 
• 1000 Genomes and GRCh38
gEVAL Browser 
• Evaluation of Genome Assemblies 
• Based on Ensembl Framework 
• Run by Sanger Institute 
• Navigation of current assemblies 
• View different annotation 
• Punchlists of Assembly Issues 
gEVAL Browser - Annotation 
• Intra-species alignments 
• Optical Map data 
• Clone End mapping 
• Black Tags/Clone sequence anomalies 
• Self Comparisons 
• CCDS Alignments 
• RefSeq Alignments 
• Repeat Annotation 
– Centromeric repeats 
– Telomeric repeats 
– 1000 Genomes identified Mobile Element Insertions 
Human Build Info 
• Primary Reference 
– GRCh38 
• Current Build 
– Human 20140826 
• Older Versions 
– GRCh37.p13 
– NCBI36 
• Other Assemblies 
– CHM1_1.1 
– NA12878 
– HuRef 
– YH2.0 
Intra-species comparisons
GRCh37 vs HuRef (panels, top to bottom: GRCh38, HuRef, GRCh37) 
GRCh38 fix (panels, top to bottom: GRCh38, HuRef, GRCh37) 
HG-1312 (add wgs to PATH)
Add fix to GRCh38 
Clone End Library Mappings 
19 Human Clone Libraries 
• Reveals state of assembly 
• Orientation 
• Mis-assemblies 
• Incorrect location 
• Source of new sequence 
Mapped 1 time 
Mapped multiple times 
Wrong direction (<<, <>, >>) 
Wrong distance from partner 
Spanning partner in the vicinity 
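The categories above can be approximated with a small classifier over the two end placements of one clone. This is only a sketch under stated assumptions, not the gEVAL implementation: the `EndHit` record, the `classify_pair` function, the insert-size bounds, and the rule that a consistent pair has its ends pointing toward each other ("><") are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class EndHit:
    """One placement of a clone end (hypothetical record layout)."""
    chrom: str
    pos: int     # leftmost mapped coordinate
    strand: str  # "+" or "-"


def classify_pair(left_hits, right_hits, min_insert, max_insert):
    """Classify one clone from the placements of its two ends.

    min_insert and max_insert are the assumed insert-size bounds
    for the clone library, in base pairs.
    """
    if not left_hits or not right_hits:
        return "partner unmapped"
    if len(left_hits) > 1 or len(right_hits) > 1:
        return "mapped multiple times"

    a, b = left_hits[0], right_hits[0]
    if a.chrom != b.chrom:
        return "incorrect location"

    # Order the two ends along the sequence before checking orientation.
    first, second = sorted((a, b), key=lambda hit: hit.pos)
    arrows = {"+": ">", "-": "<"}
    direction = arrows[first.strand] + arrows[second.strand]
    if direction != "><":
        # Remaining combinations are "<<", "<>" and ">>".
        return f"wrong direction ({direction})"

    distance = second.pos - first.pos
    if not (min_insert <= distance <= max_insert):
        return "wrong distance from partner"
    return "consistent"


if __name__ == "__main__":
    left = [EndHit("chr1", 1_000, "+")]
    right = [EndHit("chr1", 151_000, "+")]
    print(classify_pair(left, right, 100_000, 200_000))
```

Checking multiplicity first keeps ambiguous placements out of the orientation and distance tests, which only make sense for uniquely placed ends.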
Clone End Library Mappings
Using Clone End Library Mappings 
before 
after 
Optical Maps 
• ...ordered restriction maps from single stained molecules of DNA. 
• D. Schwartz (UW-Madison) 
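To compare an assembly against an optical map, the assembled sequence is typically digested in silico so that its predicted fragment sizes can be set beside the measured ones. The sketch below shows that idea only; the function name `in_silico_fragments` is hypothetical, and the enzyme (BamHI, G^GATCC) is an illustrative choice, not necessarily the one used for the maps shown in gEVAL.

```python
def in_silico_fragments(sequence, site, cut_offset=0):
    """Return ordered fragment lengths from cutting `sequence` at every `site`.

    cut_offset is the position of the cut within the recognition site
    (1 for BamHI, which cuts G^GATCC). The site is assumed to be
    palindromic, so only one strand needs to be searched.
    """
    seq = sequence.upper()
    site = site.upper()

    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)

    # Fragment boundaries are the sequence ends plus every cut position.
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:]) if right > left]


if __name__ == "__main__":
    toy = "AAAA" + "GGATCC" + "TTTTTT" + "GGATCC" + "CC"
    print(in_silico_fragments(toy, "GGATCC", cut_offset=1))  # prints [5, 12, 7]
```

A run of predicted fragments that is shorter or longer than the measured fragments over the same stretch is the kind of discrepancy that can point to a deletion or insertion in a clone.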
TrackHub 
http://ngs.sanger.ac.uk/production/grit/track_hub/hub.txt 
Display GRC issues in Ensembl and UCSC 
• Genome issues under review by the GRC 
• Genomic regions defined by the GRC 
• Alignments between the primary assembly and alternate loci or patches 
• Clone sequence anomalies 
• Human regions with clones from the CHORI-17 library (CHM1tert)
Adding a TrackHub to Ensembl 
http://ngs.sanger.ac.uk/production/grit/track_hub/hub.txt
TrackHub Display 1:12956267-13431650
Adding a TrackHub to UCSC
TrackHub Display 1:12956267-13431650
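Besides the browser menus, a hub can be attached to a UCSC Genome Browser session directly through a link that carries the hub address in the `hubUrl` parameter. The sketch below builds such a link for the region shown on the slides. The function name `ucsc_hub_link` is hypothetical, and the `hg38` database name and the `chr1` prefix are assumptions: the slides give the region only as `1:12956267-13431650` and do not state which assembly the coordinates belong to.

```python
from urllib.parse import urlencode

# Hub address as given on the slides.
HUB_URL = "http://ngs.sanger.ac.uk/production/grit/track_hub/hub.txt"


def ucsc_hub_link(hub_url, db, chrom, start, end):
    """Build a UCSC Genome Browser link that loads a track hub at a region.

    db is the UCSC assembly name (assumed here, e.g. "hg38"); chrom uses
    UCSC naming, i.e. with the "chr" prefix.
    """
    query = urlencode({
        "db": db,
        "hubUrl": hub_url,
        "position": f"{chrom}:{start}-{end}",
    })
    return f"https://genome.ucsc.edu/cgi-bin/hgTracks?{query}"


if __name__ == "__main__":
    print(ucsc_hub_link(HUB_URL, "hg38", "chr1", 12956267, 13431650))
```

Whether the hub address still resolves is not something this sketch checks; it only assembles the link.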
The 1000 Genomes Project 
• IGSR established to maintain 1000 Genomes Data 
• Phase 3 release is out on GRCh37 
• Reads will be remapped in 1Q 2015 
• dbSNP has GRCh38 mapping for most sites now 
• Plan to recall/map in 2Q 2015
Acknowledgments 
GRIT (Team135) 
• Kate Auger 
• Joanna Collins 
• Guy Griffith 
• Glenn Harden 
• Paul Heath 
• Britt Kilian 
• Kerstin Howe 
• Sarah Pelan 
• Glen Threadgold 
• James Torrance 
• Jo Wood 
Alumni 
• Kim Brugger 
• Mario Caccamo 
• Ian Sealy 
• Tina Eyre 
• Ed Zuiderwijk 
• James Smith 
• Paul Bevan 
• Simon Brent 
• Harpreet Riat 
• David Harper 
• David Schwartz (UW Madison) 
• Steve Goldstein (UW Madison) 
• Evan Eichler (UWashington) 
• Jeff Kidd (UWashington) 

Editor's Notes

  1. The top box is GRCh38, the middle box is the HuRef human assembly, and the bottom box is GRCh37.p13. In the GRCh37.p13 window there is a gap between the components AC144441/AC147553 and AC232988. Aligning to the HuRef assembly makes it possible to size the gap from sequence (even though there appears to be a small gap). This lets the user define a specific strategy for closing the gap (e.g. PCR for a small gap, inserting WGS sequence, or choosing a fosmid or BAC clone). The gap is roughly 10 kb. RESOLUTION: Ticket HG-1312 (https://ncbijira.ncbi.nlm.nih.gov/browse/HG-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel) shows that Milinn sees this as well, and that the region is important because of the gene HIPK2 (GeneID:28996). Milinn initially edits the TPF to include the HuRef components (see next slide).
  2. Revisiting the first slide, the scaffold is now placed in the gapped region in GRCh38. The scaffold itself is 60 kb, as seen in the clone track (yellow box: scaf00473_reg01_ctg01), but contributes only 9948 bp. The gap is now closed, and the associated gene that was split by the gap, HIPK2 (GeneID:28996), is now fixed.
  3. As seen above, the ticket entries show the TPF with the WGS added. Later, however, Deanna Church finds an unplaced scaffold in the RP11 WGS assembly, and Milinn proceeds to accession it and fit it into the next release (see next slide).
  4. As predicted from the first slide, the gap was roughly 10 kb.
  5. Clone end placements reveal sequence that can be placed in the gap region. The updated assembly shows the newly sequenced clone in the path.
  6. The clone component AL596089 contains a deletion, which is highlighted by the optical map analysis of three cell lines (right). The deletion would not have been captured otherwise, because the clone overlaps do not extend far enough to show it. The issue is tagged and reported in GRC ticket HG-1482.
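The gap-sizing reasoning in note 1 comes down to simple coordinate arithmetic: if the two components flanking the gap both align to a comparison assembly, the distance between those alignments estimates the gap size. The sketch below illustrates this under stated assumptions. The function name `estimate_gap_size` is hypothetical, and the coordinates in the usage lines are invented purely so that the result matches the 9948 bp mentioned in note 2; they are not the real alignment positions.

```python
def estimate_gap_size(left_end, right_start):
    """Estimate a gap from flanking alignments on a comparison assembly.

    left_end:    last aligned base of the component left of the gap
    right_start: first aligned base of the component right of the gap
    Both are 1-based, inclusive coordinates on the comparison assembly.
    A negative result means the flanking alignments overlap.
    """
    return right_start - left_end - 1


if __name__ == "__main__":
    # Invented coordinates, chosen only for illustration.
    size = estimate_gap_size(left_end=500_000, right_start=509_949)
    print(f"estimated gap: {size} bp")  # prints "estimated gap: 9948 bp"
```

An estimate of this kind is what allows a closure strategy to be chosen before any new sequencing is done.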