Transitioning to GRCh38

Genome Reference Consortium
© 2014 Personalis, Inc. All rights reserved.
Pioneering Genome-Guided Medicine
Deanna M. Church
Senior Director of Genomics and Content
Transitioning to GRCh38
Who we are
Inherited
Disease
Diagnostics
Cancer
Services
ACE Platform
Research
Services
Reference assembly influence
Sample: Gene1, Gene2
Ref Assembly: Gene1
Excitement about GRCh38
GGAACGCAG
GGAACACAG
DPYD
R->C
Alt loci
Model Centromere Sequences
Miga et al., 2014
CCL3 region: GRCh37
NC_000017.10 (chr17): 34,442,621-35,005,379
CCL5-TBC1D3 region: GRCh38
NC_000017.11 (chr17): 36,032,574-36,269,924
NT_187661.1
100 Kb deletion on chromosome
Steinberg et al., 2014 http://dx.doi.org/10.1101/006841
Alternate Loci and Genes
3.6 Mb of novel sequence
153 genes not on primary assembly
Unique sequence in alternate loci
Total: 3.6 Mb; 153 genes only on alts
Alt Loci and Genes
25% Medically Interpretable Genes (MIG)
Primary Assembly
Alt Locus
6.4%
6.2%
0.18%
Alt Loci and Genes
NT_167246.2: MHC alternate locus
No SNP annotation
Sparse SNP annotation
Analysis challenges
Primary Assembly
Paralogous duplication
Allelic duplication
Alt Locus
MapQ
https://github.com/GenomeRef/SoftwareDevTracking
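One way to see why adding alt loci affects MapQ: a read that fits the primary assembly and an allelic alt locus equally well looks, to an alt-unaware aligner, like a read from a paralogous duplication. The sketch below is a deliberately simplified model (a softmax over hypothetical alignment scores, Phred-scaled); it is not the MAPQ calculation of BWA or any other aligner, and the function name, `scale`, and `cap` are assumptions made for illustration.

```python
import math


def mapq_from_scores(scores, scale=1.0, cap=60):
    """Toy MAPQ: Phred-scaled probability that the best placement is wrong.

    scores: hypothetical alignment scores, one per candidate placement of a
    single read (higher is better). This is an illustrative model only.
    """
    best = max(scores)
    # Softmax over candidate placements; the best placement has weight 1.
    weights = [math.exp(scale * (s - best)) for s in scores]
    p_best = 1.0 / sum(weights)
    p_wrong = 1.0 - p_best
    if p_wrong <= 0.0:
        return cap  # a single (or overwhelmingly best) placement
    return min(cap, round(-10.0 * math.log10(p_wrong)))


# Read placed only on the primary assembly: confident mapping.
print(mapq_from_scores([100]))
# Same read once an identical alt locus is added to the index:
# two equally good placements, so the model can no longer choose.
print(mapq_from_scores([100, 100]))
```

Under this model the second read drops to a very low quality even though the two placements are allelic, which is the case an alt-aware aligner has to recognize and treat differently from a true paralog.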
Analysis challenges: variant representation
Primary Assembly
Alt Locus
G>C
1/1 Only valid if homozygous for Alt
1/. Correct if heterozygous for Alt
Waiting for graph representations?
Credit: UC Santa Cruz Genomics Institute
Analysis challenges
chr19 vs 19
GenBank: CM000681.2
RefSeq: NC_000019.10
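A minimal sketch of how a pipeline might reconcile these names before comparing files from different sources. The alias table covers only chromosome 19, the GenBank accession is written as CM000681.2, and the choice of the RefSeq accession as the canonical name is an assumption made for illustration; `normalize_seq_name` is a hypothetical helper, not part of any library.

```python
# Names used for GRCh38 chromosome 19 by different sources.
# Canonical form (RefSeq accession) is an arbitrary choice for this sketch.
CHR19_ALIASES = {
    "chr19": "NC_000019.10",         # UCSC-style name
    "19": "NC_000019.10",            # bare chromosome number
    "CM000681.2": "NC_000019.10",    # GenBank accession
    "NC_000019.10": "NC_000019.10",  # RefSeq accession
}


def normalize_seq_name(name, aliases=CHR19_ALIASES):
    """Map any known alias to one canonical name; fail loudly otherwise."""
    try:
        return aliases[name]
    except KeyError:
        raise KeyError(f"unknown sequence name: {name!r}") from None


# Records from two files now compare equal on the sequence name.
print(normalize_seq_name("chr19") == normalize_seq_name("19"))
```

Failing on unknown names, rather than passing them through, is intended to catch files built against a different assembly or naming scheme early.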
Analysis challenges
chr19_KI270938v1_alt
CHR_HSCHR19KIR_G248_BA2_HAP_CTG3_1
GenBank: KI270886.1
RefSeq: NT_187640.1
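UCSC-style alt locus names embed the chromosome and the GenBank accession, with the version dot replaced by `v`. The sketch below recovers both parts from such a name; the regular expression and the helper name `parse_ucsc_alt_name` are assumptions for illustration, and the pattern is only meant to cover names of the form shown on this slide.

```python
import re

# chr<chromosome>_<accession>v<version>_alt, e.g. chr19_KI270938v1_alt
_UCSC_ALT = re.compile(
    r"^chr(?P<chrom>[0-9]{1,2}|X|Y)_(?P<acc>[A-Z]{2}[0-9]+)v(?P<ver>[0-9]+)_alt$"
)


def parse_ucsc_alt_name(name):
    """Return (chromosome, 'ACCESSION.VERSION') or None if not an alt name."""
    m = _UCSC_ALT.match(name)
    if m is None:
        return None
    return m.group("chrom"), f"{m.group('acc')}.{m.group('ver')}"


print(parse_ucsc_alt_name("chr19_KI270938v1_alt"))
print(parse_ucsc_alt_name("chr19"))  # primary chromosome, not an alt locus
```

Names in other schemes, such as CHR_HSCHR19KIR_G248_BA2_HAP_CTG3_1, carry no accession and would still need a lookup table.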
Analysis challenges MICB
Reporting formats (GFF, VCF, etc.) don’t manage multiple locations easily
NW_003871068.1
NC_000006.12 BestRefSeq gene 31494881 31511124 . + . ID=gene13336;Name=MICB;Dbxref=GeneID:4277
NT_167244.2 BestRefSeq gene 2827449 2843674 . + . ID=gene42005;Name=MICB;Dbxref=GeneID:4277
NT_113891.3 BestRefSeq gene 2972222 2988464 . + . ID=gene43669;Name=MICB;Dbxref=GeneID:4277
NT_167245.2 BestRefSeq gene 2742492 2758910 . + . ID=gene44377;Name=MICB;Dbxref=GeneID:4277
NT_167246.2 BestRefSeq gene 2810648 2816200 . + . ID=gene44827;Name=MICB;Dbxref=GeneID:4277
NT_167247.2 BestRefSeq gene 2836836 2853071 . + . ID=gene45127;Name=MICB;Dbxref=GeneID:4277
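The records above can be grouped by gene identifier to make the multiple-location problem explicit. This is a minimal sketch: it splits on whitespace because the lines here are space-separated (real GFF3 is tab-delimited), it assumes a single `Dbxref` value per record, and `parse_gff_line` and `placements_by_gene` are hypothetical helper names.

```python
from collections import defaultdict

GFF_LINES = """\
NC_000006.12 BestRefSeq gene 31494881 31511124 . + . ID=gene13336;Name=MICB;Dbxref=GeneID:4277
NT_167244.2 BestRefSeq gene 2827449 2843674 . + . ID=gene42005;Name=MICB;Dbxref=GeneID:4277
NT_113891.3 BestRefSeq gene 2972222 2988464 . + . ID=gene43669;Name=MICB;Dbxref=GeneID:4277
NT_167245.2 BestRefSeq gene 2742492 2758910 . + . ID=gene44377;Name=MICB;Dbxref=GeneID:4277
NT_167246.2 BestRefSeq gene 2810648 2816200 . + . ID=gene44827;Name=MICB;Dbxref=GeneID:4277
NT_167247.2 BestRefSeq gene 2836836 2853071 . + . ID=gene45127;Name=MICB;Dbxref=GeneID:4277
""".splitlines()


def parse_gff_line(line):
    """Split one GFF3-style record into its nine columns."""
    seqid, source, ftype, start, end, score, strand, phase, attrs = line.split(None, 8)
    attributes = dict(pair.split("=", 1) for pair in attrs.split(";") if pair)
    return {
        "seqid": seqid,
        "type": ftype,
        "start": int(start),
        "end": int(end),
        "strand": strand,
        "attributes": attributes,
    }


def placements_by_gene(lines):
    """Group (seqid, start, end) placements under each Dbxref gene identifier."""
    placements = defaultdict(list)
    for line in lines:
        rec = parse_gff_line(line)
        key = rec["attributes"]["Dbxref"]
        placements[key].append((rec["seqid"], rec["start"], rec["end"]))
    return dict(placements)


for gene, locs in placements_by_gene(GFF_LINES).items():
    print(gene, "has", len(locs), "placements")
    for seqid, start, end in locs:
        print("  ", seqid, start, end)
```

Each placement has its own `ID`, so a tool keyed on `ID` sees six unrelated genes, while one keyed on `Name` or `GeneID` sees one gene with six coordinates; neither view says which placements are allelic.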
Analysis challenges
• Need aligners that can distinguish allelic and paralogous duplication
• Need variant callers/modules that can correctly assign genotypes in complex regions
• Need to extend file formats to accommodate new assembly model

Editor's Notes

  1. Missing and misassembled sequence in the reference assembly can have dire consequences for genome interpretation. In this example, Gene2 is missing from the reference but present in the sample we are analyzing. Regardless of whether Gene2 is missing because of an assembly error or because it is polymorphic in the population, the outcome can be the same. In the best-case scenario, reads from Gene2 don’t align to the reference and we just can’t analyze Gene2. However, if Gene2 is related to Gene1, we can get off-target alignments that confound analysis of Gene1 as well, leading either to under-calling in the region or to inappropriately calling paralogous sequence variants as allelic sequence variants. If we take sequences we know to be missing in GRCh37, simulate reads and then align these reads to GRCh37, we see that 75% of these find an off-target alignment, regardless of the alignment method used. This is why Heng Li created decoy sequences for the 1000 Genomes Project: to reduce off-target alignments. However, we still lack the ability to analyze Gene2 in this scenario. This underscores the importance of representing all common human sequences in the reference assembly.
  2. Mutations in DPYD result in dihydropyrimidine dehydrogenase deficiency, an error in pyrimidine metabolism associated with thymine-uraciluria and an increased risk of toxicity in cancer patients receiving 5-fluorouracil. Replace this with protein coding info and stats? And Valerie’s poster
  3. The CCL3 region on chromosome 17 allows us to explore two major updates seen in GRCh38, and hopefully will underscore the importance of representing missing paralogs in the reference. This region is known to be copy number variant, with individuals having 0-4 copies of a 90 Kb repeat unit. In GRCh37, the region was assembled from several sources that contain different structural variants. This led to the creation of a false gap and a genomic representation that likely does not exist in anyone on the planet. Being able to correctly represent the genomic architecture of this region is important, as there is some evidence, albeit conflicting, that the number of copies of CCL3L1 correlates with HIV infection and progression to AIDS.
  4. To better represent this region, the GRC made a new clone tiling path from a single-haplotype resource derived from a hydatidiform mole. An additional allele, representing a 100 Kb insertion, was also generated and placed in the assembly as an alternate locus. The reference assembly now has two correct representations of this region, though we may need more.
  5. For this reason, many people have just ignored these sequences, but doing so in GRCh38 means losing 3.6 Mb of sequence unique to the alternate loci, sequence that contains 153 genes. This graph shows the distribution of the amount of unique sequence per alternate locus. While it is clear they do not all contribute equal amounts of novel sequence, in aggregate the amount is significant. The GRC recently held a workshop to encourage development of new tools that can handle the full assembly, and Heng Li has already distributed a version of BWA-MEM that is alt-locus aware, but we need to do considerable testing and additional development to make sure we are using these sequences correctly. We also need to assess the ramifications of this new structure on other parts of the tool chain.