GeT-RM Project and Browser
Deanna M. Church
@deannachurch GIAB 2013
Project Team
Guidance and Direction
Lisa Kalman, CDC
Birgit Funke, Harvard Partners
Madhuri Hegde, Emory
Implementation
Chen Chao, NCBI
Douglas Slotta, NCBI
Jonathon Trow, NCBI
Peter Meric, NCBI
Victor Ananiev, NCBI
Daniel Frishberg, NCBI
Chunlei Liu, NCBI
Maryam Halavi, NCBI
Wendy Rubinstein, NCBI
Deanna Church, NCBI
Submitting Labs
ARUP Laboratories
Baylor College of Medicine Medical Genetics Laboratory
Broad Institute of MIT and Harvard
Emory Genetics Laboratory
GeneDx
Genomics and Pathology Services at Washington University in St. Louis
Harvard School of Public Health
Illumina
Laboratory for Molecular Medicine
National Institute of Standards and Technology
University of California, San Francisco Department of Laboratory Medicine
University of Chicago
http://www.ncbi.nlm.nih.gov/variation/tools/get-rm/details/
http://www.ncbi.nlm.nih.gov/variation/tools/get-rm
[Figure labels: Calls, Tests, cSRA; Concordant, Discordant, NA]
Target audience: Clinical testing labs
Submissions from: Clinical and Research labs
Twelve submitting labs to date
Twelve custom scripts to regularize data
Defined formats here:
http://www.ncbi.nlm.nih.gov/projects/variation/get-rm
Platforms
[Figure: NA12878 tests by platform. Bar chart (axis 0 to 30) covering HiSeq 2000, HiSeq 2500, MiSeq, Ion Torrent, Sanger, and 454.]
Lab-Provided Validation
Variants validated in this sample using another platform
Variants validated in another sample using another platform
Variants seen in other samples from submitting lab using this platform
Variants seen in public data set
Variants that are novel
Variants that were not assessed
Supporting Read Counts
[Figure: number of variants (axis 0 to 250) per read count bin (0, 10, 50, 100, 500, 1000, 5000).]
Based on May 2013 data release
http://www.ncbi.nlm.nih.gov/variation/tools/get-rm
Gene level concordance
$$C_{\text{gene}} = \frac{\sum_{v} \max_{i} x_{v,i}}{\sum_{v} T_{v}}$$
i = genotype call
v = variant within the gene
x_{v,i} = count of call i for variant v
T_v = total genotype calls for variant v
Sums are taken over all variants in a gene.
Tested regions taken into account
Phasing ignored
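
The gene-level concordance above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the GeT-RM implementation: the function names, the input layout (a dict mapping each variant to the genotype calls from tests that covered it), and the genotype string format (`A/G`, `A|G`) are all hypothetical. Phasing is ignored by sorting alleles, and tested regions are accounted for by listing calls only from tests that covered the variant.

```python
from collections import Counter


def normalize(call):
    """Drop phasing: treat 'G|A', 'G/A', and 'A/G' as the same genotype."""
    alleles = call.replace("|", "/").split("/")
    return "/".join(sorted(alleles))


def gene_concordance(calls_by_variant):
    """Sum of the most common call count per variant over the sum of all calls.

    calls_by_variant (hypothetical layout): variant id -> list of genotype
    calls, one per test whose tested region covered that variant.
    Returns None when the gene has no calls at all.
    """
    agreeing = 0  # sum over variants of max_i x_i
    total = 0     # sum over variants of T
    for calls in calls_by_variant.values():
        if not calls:
            continue  # no test covered this variant
        counts = Counter(normalize(c) for c in calls)
        agreeing += max(counts.values())
        total += len(calls)
    return agreeing / total if total else None


# Illustrative data only: two variants in one gene.
example = {
    "variant_1": ["A/G", "G|A", "A/G", "A/A"],  # 3 of 4 calls agree
    "variant_2": ["C/C", "C/C"],                # 2 of 2 calls agree
}
print(gene_concordance(example))  # (3 + 2) / (4 + 2)
```

Because the denominator is the same for every variant in the gene, summing the per-variant maxima first and dividing once gives the same value as the slide's form of the expression.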
Looking forward
Analysis
Genotype support analysis based on alignments
Development of consensus genotype set
Investigation of discordant regions
Comparison to paralogous sequence variant (PSV) sites
Comparison to GRCh38
Calculation of FP and FN rates
Web tools
Link to browser for review
Improved gene navigation
Addition of PSV data tracks

Aug 2013 GeT-RM project and genome browser
