Identifying structural variation, component issues and other sequence
artifacts by integrating long range genome maps in a web-based genome
browser
William Chow and Kerstin Howe
Wellcome Trust Sanger Institute, Cambridge, UK.
Applications
Identifying and Capturing Variation
The Ashkenazim and CHS trio maps were generated using Bionano Haplotype Aware software (unpublished). The potential inheritance pattern of the child can be deduced from the maps of the parents.
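
As a minimal sketch of how such a deduction could work, suppose each haplotype-aware map contig is reduced to a pattern name per allele. The pattern names (`orange`, `red`) and the function `consistent_transmissions` are illustrative assumptions, not part of gEVAL or the Bionano software. The child's two alleles are checked against every combination of one paternal and one maternal allele:

```python
from itertools import product


def consistent_transmissions(father, mother, child):
    """Return (paternal, maternal) allele pairs that explain the child's alleles.

    Each argument is a pair of pattern names for one individual,
    e.g. ("orange", "red"). The names are hypothetical labels for map patterns.
    """
    child_sorted = sorted(child)
    return [
        (pat, mat)
        # sorted(set(...)) removes duplicate alleles and keeps the order stable
        for pat, mat in product(sorted(set(father)), sorted(set(mother)))
        if sorted((pat, mat)) == child_sorted
    ]


# A heterozygous child with a heterozygous father and a homozygous mother:
print(consistent_transmissions(("orange", "red"), ("orange", "orange"), ("orange", "red")))
```

An empty result would mean that no combination of parental alleles explains the child's maps, which could point to a map assembly problem or a mislabelled sample.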
  
[Figure labels: Primary Assembly (NC_00018); Alternate Locus Representation (NT_187618)]
Two nickase-labeled groups are seen:

Group           # blocks   Total size
Orange blocks   2          ~32kb
Red blocks      4          ~38kb

[Figure labels: Allele 1 with Primary Assembly (NC_00018); Allele 2 with Alternate Assembly (NT_187618). Region shown: Human (GRCh38), Chr 18: 43,724,697-43,768,880]
[Figure labels: CAST/EiJ; BALB/cJ; GRCm38; PWK/PhJ. Region shown: Mouse (PWK/PhJ), Chr 8: 106,309,962-106,509,961]
Assembly Evaluation
There are two observations from this region of PWK/PhJ (above):
A.  There is genome map discordance (Map → ~10kb; BspQI → ~60kb).
B.  All the transcript mappings are green, but pmfbp1 appears to contain a very long, suspicious intron caused by the middle component ScRybd3_121_120.
Comparative alignments between PWK/PhJ and CAST/EiJ, BALB/cJ, GRCm38 (right):
C.  ScRybd3_121_120 has no alignments to the other mouse assemblies, which suggests that this component is creating an expansion of the region.
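
The discordance in observation A amounts to a size comparison between the genome map and the in silico digest over the same interval. The sketch below is illustrative only: the function name and the 20% relative tolerance are assumptions, not the criteria gEVAL actually applies.

```python
def is_discordant(map_size_kb, digest_size_kb, tolerance=0.2):
    """Flag an interval whose genome-map size and in silico digest size disagree.

    tolerance is the allowed relative difference (an assumed value).
    """
    larger = max(map_size_kb, digest_size_kb)
    if larger == 0:
        return False
    return abs(map_size_kb - digest_size_kb) / larger > tolerance


# The PWK/PhJ interval described in A: ~10kb on the map, ~60kb in the digest.
print(is_discordant(10, 60))  # True
```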
Comparative Assembly View: Mouse (PWK/PhJ) against 3 other Mouse Assemblies
The gEVAL Browser
gEVAL is a modern, scrollable and dynamic genome browser, allowing the user to view pre-calculated analyses or attach data as tracks specifically tailored for
assembly evaluation. It also includes comparative analyses of different assembly builds for each species as well as automated lists created to facilitate
identification of and navigation to issues or regions of interest.
[Figure labels: Public Repository; De novo assembly consensus maps; Align with RefAligner; Datasets; Align]
Example of data used: Clone Library Ends, Transcripts/cDNAs, Assembly Self Comparisons, GRC Issues tracker, Markers, PacBio reads.
Long range genome maps, either generated by the Irys instrument or obtained from public sources (A), and genomic datasets (B) are aligned/mapped against the assembly using the appropriate tools and loaded into the browser for visualization (C).
1  Genome Map Contig and BspQI In silico Digest track: discordance of map size between nickase labels and digest is coloured in red.
2  Transcript(s) track: complete mappings in green, incomplete in orange.
3  Clone end(s) track: concordant paired-end mappings in green; insert size/orientation issues are coloured orange/red.
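
An in silico digest like the one in track 1 can be approximated by locating the enzyme's recognition sites in the assembly sequence and measuring the distances between them. The sketch below assumes the BspQI recognition motif GCTCTTC, searches both strands, and ignores the exact nick offset. The function names are hypothetical and this is not the code used to build the gEVAL track.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def label_positions(sequence, motif="GCTCTTC"):
    """Return sorted 0-based start positions of the motif on either strand."""
    sequence = sequence.upper()
    positions = set()
    for pattern in (motif, reverse_complement(motif)):
        start = sequence.find(pattern)
        while start != -1:
            positions.add(start)
            start = sequence.find(pattern, start + 1)
    return sorted(positions)


def fragment_sizes(sequence, motif="GCTCTTC"):
    """Return the distances (bp) between consecutive label positions."""
    positions = label_positions(sequence, motif)
    return [b - a for a, b in zip(positions, positions[1:])]


# Toy sequence with one site on each strand.
toy = "AA" + "GCTCTTC" + "AAAAA" + "GAAGAGC" + "TT"
print(label_positions(toy), fragment_sizes(toy))
```

The resulting inter-label distances are what would be compared against the block sizes observed in the genome map contigs.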
Mouse Genome Project Strain-specific Genome Maps
129S1/SvImJ, A/J, AKR/J, BALB/cJ, C3H/HeJ, C57BL/6NJ, CAST/EiJ, CBA/J, DBA/2J, FVB/NJ, LP/J, NOD/ShiLtJ, NZO/HiLtJ, PWK/PhJ, SPRET/EiJ, WSB/EiJ
•  Maps generated by the Sanger Institute.
Mice images courtesy of JAX creative division, The Jackson Laboratory
Genome Reference Consortium Species Genome Maps

Human
•  Ashkenazim Trio †: NA24149 (father), NA24143 (mother), NA24385 (son)
•  Southern Han Chinese (CHS) Trio ¤: HG00514 (daughter), HG00512 (father), HG00513 (mother)
•  Yoruba (YRI) Trio ¤: NA19240 (daughter), NA19239 (father), NA19238 (mother)
•  Yan Huang (YH) §: PRJNA42199
•  Central Europe Hapmap (CEPH) Trio ‡: NA12878 (daughter), NA12891 (father), NA12892 (mother)
•  Puerto Rican Trio ¤: HG00733 (daughter), HG00731 (father), HG00732 (mother)
•  Han Chinese Trio †: NA24631 (son only)
•  Haploid Hydatidiform mole (CHM1) ^: PRJNA176729

Zebrafish
•  Sanger AB Tübingen (SAT): generated by the Sanger Institute

A Trackhub is available: http://bit.ly/25b7Tqg

† Zook, J., et al. Extensive sequencing of seven human genomes to characterize benchmark reference materials. BioRxiv (2015)
‡ Mak AC et al. Genome-Wide Structural Variation Detection by Genome Mapping on Nanochannel Arrays. Genetics (2015)
§ Cao, H., et al. Rapid Detection of Structural Variation in a Human Genome using Nanochannel-based Genome Mapping Technology. GigaScience (2014); 3(December 2014): 34
¤ Human Genome Structural Variation Consortium (HGSV) | 1000 Genomes. Currently under publication embargo.
^ Courtesy of T.Graves (MGI), E.Lam (Bionano Genomics). (2014)

Genome Maps Available in gEVAL
To aid assembly evaluation, we have incorporated long range, single-molecule genome mapping datasets from both in-house (Sanger Institute) and public sources (Bionano Genomics, Genome in a Bottle). Along with the wide range of data already aligned to each genome, these long range data can help identify structural variation and confirm assembly irregularities such as insertions, deletions and mis-assemblies, whilst providing suitable information to resolve them.

In the image on the left, the genome maps (Ashkenazi, CHS and CEPH trios, Han Chinese son, YH and CHM1) are aligned to GRCh38. The maps show two distinct patterns of nickase labels, providing evidence of an alternate locus that captures ~6kb of unique sequence compared to the primary reference assembly.
A.  Within some family trios, the maps were assembled in a haplotype-aware manner, creating two map contigs per individual; these can be used to elicit inheritance patterns within the family.
B.  Across all maps, the two patterns illustrate the variation between two groups, one consisting of 2 blocks (~11.8kb + 21kb ≈ 32kb) and the other of 4 blocks (~9.8kb + 7kb + 3kb + 18kb ≈ 38kb). Note that the former agrees with the BspQI digest of the primary assembly (11.8kb + 21kb).
C.  Comparing the BspQI digest tracks of the primary assembly (NC_00018) and of the assembly representing the alternate locus (NT_187618) with the maps shows the concordance.
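
The comparison described in B and C can be sketched as a block-by-block check of a map pattern against a digest pattern. The function name and the 1kb tolerance are assumptions made for illustration; the block sizes are the approximate values quoted above.

```python
def patterns_agree(map_blocks_kb, digest_blocks_kb, tolerance_kb=1.0):
    """Return True if both patterns have the same number of blocks and each
    pair of corresponding block sizes differs by at most tolerance_kb."""
    if len(map_blocks_kb) != len(digest_blocks_kb):
        return False
    return all(
        abs(m - d) <= tolerance_kb
        for m, d in zip(map_blocks_kb, digest_blocks_kb)
    )


two_block_maps = [11.8, 21.0]            # approximate sizes quoted in B
four_block_maps = [9.8, 7.0, 3.0, 18.0]  # approximate sizes quoted in B
primary_digest = [11.8, 21.0]            # BspQI digest of the primary assembly (B)

print(patterns_agree(two_block_maps, primary_digest))   # True
print(patterns_agree(four_block_maps, primary_digest))  # False
```

Maps that fail the check against the primary assembly digest would then be compared with the digest of the alternate locus in the same way.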
  
	
  
gEVAL - A web based browser for evaluating genome assemblies
Chow W, Brugger K, Caccamo M, Sealy I, Torrance J, Howe K
Bioinformatics 2016 Apr 7. pii: btw159. PMID: 27153597

http://geval.sanger.ac.uk

Genome Informatics 2016 poster
