
Transitioning to GRCh38

GRC Workshop: Advancing the Human Reference Assembly
Speaker: Deanna Church

Published in: Health & Medicine


  1. © 2014 Personalis, Inc. All rights reserved. Pioneering Genome-Guided Medicine. Transitioning to GRCh38. Deanna M. Church, Senior Director of Genomics and Content
  2. Who we are: Inherited Disease Diagnostics; Cancer Services; ACE Platform; Research Services
  3. Reference assembly influence (figure labels: Gene1, Gene2, Gene1, Sample, Ref Assembly)
  4. Excitement about GRCh38: GGAACGCAG / GGAACACAG; DPYD R->C; Alt loci; Model Centromere Sequences (Miga et al., 2014)
  5. CCL3 region: GRCh37. NC_000017.10 (chr17): 34,442,621-35,005,379
  6. CCL5-TBC1D3 region: GRCh38. NC_000017.11 (chr17): 36,032,574-36,269,924; NT_187661.1. 100 Kb deletion on chromosome (Steinberg et al., 2014, http://dx.doi.org/10.1101/006841)
  7. Alternate Loci and Genes: 3.6 Mb of novel sequence; 153 genes not on primary assembly. Unique sequence in alternate loci. Total: 3.6 Mb; 153 genes only on alts
  8. Alt Loci and Genes: 25% Medically Interpretable Genes (MIG). Figure labels: Primary Assembly; Alt Locus; 6.4%; 6.2%; 0.18%
  9. Alt Loci and Genes. NT_167246.2: MHC alternate locus. No SNP annotation; sparse SNP annotation
  10. Analysis challenges: Primary Assembly; Paralogous duplication; Allelic duplication; Alt Locus; MapQ. https://github.com/GenomeRef/SoftwareDevTracking
  11. Analysis challenges: variant representation. Primary Assembly; Alt Locus; G>C. 1/1: only valid if homozygous for Alt. 1/.: correct if heterozygous for Alt
  12. Waiting for graph representations? Credit: UC Santa Cruz Genomics Institute
  13. Analysis challenges: chr19 vs 19. GenBank: CM000681.2; RefSeq: NC_000019.10
  14. Analysis challenges: chr19_KI270938v1_alt; CHR_HSCHR19KIR_G248_BA2_HAP_CTG3_1; GenBank: KI270886.1; RefSeq: NT_187640.1
  15. Analysis challenges: MICB. Reporting formats (GFF, VCF, etc.) don’t manage multiple locations easily
  16. NW_003871068.1
      NC_000006.12 BestRefSeq gene 31494881 31511124 . + . ID=gene13336;Name=MICB;Dbxref=GeneID:4277
      NT_167244.2 BestRefSeq gene 2827449 2843674 . + . ID=gene42005;Name=MICB;Dbxref=GeneID:4277
      NT_113891.3 BestRefSeq gene 2972222 2988464 . + . ID=gene43669;Name=MICB;Dbxref=GeneID:4277
      NT_167245.2 BestRefSeq gene 2742492 2758910 . + . ID=gene44377;Name=MICB;Dbxref=GeneID:4277
      NT_167246.2 BestRefSeq gene 2810648 2816200 . + . ID=gene44827;Name=MICB;Dbxref=GeneID:4277
      NT_167247.2 BestRefSeq gene 2836836 2853071 . + . ID=gene45127;Name=MICB;Dbxref=GeneID:4277
  17. Analysis challenges:
      • Need aligners that can distinguish allelic and paralogous duplication
      • Need variant callers/modules that can correctly assign genotypes in complex regions
      • Need to extend file formats to accommodate the new assembly model
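
The naming problem on slide 13 (the same chromosome appearing as "chr19", "19", and a RefSeq accession) can be illustrated with a small alias table. This is only a sketch: the table, `build_lookup`, and `normalize` are hypothetical names introduced here, not part of any tool mentioned in the talk, and the table holds only the chr19 names and RefSeq accession shown on the slide.

```python
# Hypothetical alias table: canonical RefSeq accession -> other names in use.
# Only the chr19 names from slide 13 are included; a real table would need
# one entry per sequence in the assembly.
ALIASES = {
    "NC_000019.10": {"19", "chr19"},
}


def build_lookup(aliases):
    """Invert the alias table so any known name maps to its canonical accession."""
    lookup = {}
    for canonical, names in aliases.items():
        for name in {canonical, *names}:
            # Refuse ambiguous tables where one name points at two sequences.
            if name in lookup and lookup[name] != canonical:
                raise ValueError(f"ambiguous sequence name: {name!r}")
            lookup[name] = canonical
    return lookup


LOOKUP = build_lookup(ALIASES)


def normalize(name, lookup=LOOKUP):
    """Return the canonical accession for a sequence name, or raise KeyError."""
    try:
        return lookup[name]
    except KeyError:
        raise KeyError(f"unknown sequence name: {name!r}") from None


if __name__ == "__main__":
    for name in ("19", "chr19", "NC_000019.10"):
        print(name, "->", normalize(name))
```

Failing loudly on an unknown name is a deliberate choice in this sketch: silently passing through an unrecognized name is how records keyed by "19" and "chr19" end up treated as different sequences.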

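
Slides 15 and 16 show MICB annotated once on the primary assembly and again on several alternate loci, each with its own GFF record. One way to see the multiple-location problem is to group gene records by their GeneID cross-reference. The sketch below assumes tab-separated nine-column GFF3 lines; the three sample records are taken from slide 16, and the function names are hypothetical.

```python
from collections import defaultdict

# Three of the MICB records from slide 16, rebuilt as tab-separated GFF3 lines.
GFF_LINES = [
    "\t".join(["NC_000006.12", "BestRefSeq", "gene", "31494881", "31511124",
               ".", "+", ".", "ID=gene13336;Name=MICB;Dbxref=GeneID:4277"]),
    "\t".join(["NT_167244.2", "BestRefSeq", "gene", "2827449", "2843674",
               ".", "+", ".", "ID=gene42005;Name=MICB;Dbxref=GeneID:4277"]),
    "\t".join(["NT_113891.3", "BestRefSeq", "gene", "2972222", "2988464",
               ".", "+", ".", "ID=gene43669;Name=MICB;Dbxref=GeneID:4277"]),
]


def parse_attributes(field):
    """Split a GFF3 attribute column (key=value;key=value) into a dict."""
    attrs = {}
    for pair in field.split(";"):
        if pair:
            key, _, value = pair.partition("=")
            attrs[key] = value
    return attrs


def gene_id(attrs):
    """Return the GeneID from the Dbxref attribute, or None if absent."""
    for xref in attrs.get("Dbxref", "").split(","):
        db, _, accession = xref.partition(":")
        if db == "GeneID":
            return accession
    return None


def group_by_gene(lines):
    """Map each GeneID to every (seqid, start, end, strand) where it is placed."""
    placements = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and GFF comments/directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "gene":
            continue  # only well-formed gene features are considered
        gid = gene_id(parse_attributes(cols[8]))
        if gid is not None:
            placements[gid].append((cols[0], int(cols[3]), int(cols[4]), cols[6]))
    return dict(placements)


if __name__ == "__main__":
    for gid, locations in group_by_gene(GFF_LINES).items():
        print(f"GeneID {gid}: {len(locations)} placements")
        for seqid, start, end, strand in locations:
            print(f"  {seqid}:{start}-{end} ({strand})")
```

Grouping on GeneID rather than on the record `ID` reflects what the slide shows: each placement gets a distinct `ID` (gene13336, gene42005, ...), so only the cross-reference ties the records back to one gene.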