Church gia13

  • The reference is not just the chromosome sequences of the primary assembly unit; it also includes the alternate loci and patches, which provide additional sequence representations at selected genomic regions. The GRC has been releasing patches to the human assembly on a quarterly cycle, and we’re now at GRCh37.p12. There are two varieties of patches. FIX patches correct existing assembly problems: the chromosome will update, and the patches will be integrated in GRCh38. NOVEL patches add new sequence representations: they will become alternate loci. This ideogram shows the current distribution of patches and alternate loci, and you can see that many regions have changed since GRCh37. Note that approximately 3% of the current public human assembly, GRCh37, is associated with a region that is represented by a patch or alternate locus.


  • 1. Converting from Analog to Digital: Integrating the historical archive of human variation in an NGS world. Deanna M. Church, Staff Scientist, NCBI, @deannachurch. Genome Informatics Alliance 2013
  • 2. Acknowledgements. GeT-RM: Lisa Kalman (CDC), Birgit Funke (Harvard), Mahduri Hegde (Emory), Maryam Halavi, Chao Chen, Jon Trow, Douglas Slotta, Peter Meric, Daniel Frishberg, Victor Ananiev. ClinVar: Alex Astashyn, Shanmuga Chitipiralla, Douglas Hoffman, Wonhee Jang, Brandi Kattman, Melissa Landrum, Jennifer Lee, Adriana Malheiro, Wendy Rubinstein, George Riley, Amanjeev Sethi, Ricardo Villamarin. ISCA: Christa Lese Martin (Geisinger), Erin Riggs (Geisinger), Jose Mena, Mike Feolo, Tim Hefferon, John Garner, John Lopez. GRC: Valerie Schneider (NCBI), The Genome Institute at Washington University, The Wellcome Trust Sanger Institute, The European Bioinformatics Institute
  • 3. Variation; Phenotypes
  • 4. Phenotypes
  • 5. Diagram labels: Variant Call (dbVar submission); Array data files; Clinical Labs; QC Analysis; Curation; Data regularization; dbGaP; Controlled Access; Web access; FTP Access; Assembly Remapping; dbVar; ISCA; UCSC; DGV; DGVa; NCBI; Approved Users; BioProject ID; ClinVar. dbGaP projects need a sponsoring NIH institute to run the DAC (NICHD)
  • 6. ASD: Atrial Septum Defect? Autism Spectrum Disorder? No HPO: 1,814. HPO: 6,770. Riggs et al, 2012. ~2 HPO terms/case (max of 16). The Human Phenotype Ontology
  • 7. http://www.ncbi.nlm.nih.gov/medgen
  • 8. Variation
  • 9. Chart of size (gigabytes; axis ticks from 1 to 100,000) by component: sequences (FASTQ), alignments (BAM), genotype likelihoods (VCF), individual variants (VCF). 1092 genomes (low coverage + exome): 38.2M SNPs, 3.9M Short Indels and 14K Deletions. Steve Sherry, NCBI
  • 10. http://www.bioplanet.com/gcat
  • 11. http://www.ncbi.nlm.nih.gov/variation/tools/1000genomes
  • 12. http://genomereference.org GRCh37
  • 13. Dennis et al., 2012. 1q32, 1q21, 1p21. 1p21 patch alignment to chromosome 1
  • 14. Hydin: chr16 (16q22.2). Hydin2: chr1 (1q21.1). Missing in NCBI35; Unlocalized in NCBI36/GRCh37; Finished in GRCh38. Alignment to Hydin2 Genomic, 300 Kb, 99.4% ID. Alignment to Hydin1 CHM1_1.0, >99.9% ID. Doggett et al., 2006
  • 15. Kidd et al, 2007. APOBEC cluster. Part of chr22 assembly; Alternate locus for chr22. White: Insertion; Black: Deletion
  • 16. http://www.ncbi.nlm.nih.gov/variation/tools/1000genomes
  • 17. Human Resolved for GRCh38. http://genomereference.org
  • 18. GRCh38 is coming (September, 2013)
  • 19. http://www.ncbi.nlm.nih.gov/variation/tools/get-rm Calls; Tests; cSRA; Concordant; Discordant; NA. Target audience: Clinical testing labs. Submissions from: Clinical and Research labs
  • 20. Reporting Standards: Not standard. Twelve submitting labs to date. Twelve custom scripts to regularize data, despite defined formats here: http://www.ncbi.nlm.nih.gov/projects/variation/get-rm What are the issues?
  • 21. Reporting Standards: Not standard. What are the issues? Better Example: QUAL** (**Required sixth column in VCF file). 10.01-18357.112.6-21.20-21.220-3070Allele string34.79-44624.03None20-46006
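The QUAL problem on slide 21 can be illustrated with a small check. This is a minimal sketch, not the GeT-RM regularization scripts: the function names `parse_qual` and `check_qual` are hypothetical, and it assumes the usual VCF convention that QUAL is a single number or `.` for a missing value. The non-negative check is an extra assumption based on QUAL being Phred-scaled.

```python
import math

def parse_qual(field):
    """Return QUAL as a float, None for the VCF missing value '.', else raise ValueError."""
    field = field.strip()
    if field == ".":
        return None
    # float() raises ValueError for entries such as "None", "20-30" or "Allele string".
    value = float(field)
    # Assumption: a Phred-scaled score should be finite and non-negative.
    if not math.isfinite(value) or value < 0:
        raise ValueError(f"QUAL must be a finite, non-negative number: {field!r}")
    return value

def check_qual(vcf_line):
    """Return True if a tab-delimited VCF data line has a usable sixth (QUAL) column."""
    columns = vcf_line.rstrip("\n").split("\t")
    if len(columns) < 8:  # CHROM POS ID REF ALT QUAL FILTER INFO
        return False
    try:
        parse_qual(columns[5])
    except ValueError:
        return False
    return True
```

A range or a free-text entry in the sixth column fails this check, which is the kind of submission that would otherwise need a custom script per lab.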
  • 22. Reporting Standards: Not standard. What are the issues? Lab reporting a single nucleotide change (C->T) het change as: c.1956+15C>CT. HGVS standards say this should be reported as: c.1956+15C>T[=]. Lab reporting a single nucleotide change (A->G) hom change as: c.670+9A>G. HGVS standards say this should be reported as: c.[670+9A>G];[670+9A>G]
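The zygosity notation on slide 22 can be generated rather than typed by hand. The sketch below is illustrative only: `hgvs_alleles` is a hypothetical helper, the homozygous form follows the pattern shown on the slide, and the heterozygous form `c.[...];[=]` (with `[=]` marking the unchanged allele) is an assumption about the bracketed HGVS allele syntax rather than something the slide spells out.

```python
def hgvs_alleles(change, zygosity):
    """Format a coding-DNA change with explicit alleles.

    change   -- the change without the 'c.' prefix, e.g. '670+9A>G'
    zygosity -- 'hom' (both alleles changed) or 'het' (one allele unchanged)
    """
    if zygosity == "hom":
        return f"c.[{change}];[{change}]"
    if zygosity == "het":
        # Assumption: '[=]' denotes the allele that matches the reference.
        return f"c.[{change}];[=]"
    raise ValueError(f"unknown zygosity: {zygosity!r}")
```

For example, `hgvs_alleles("670+9A>G", "hom")` gives the string shown on the slide for the homozygous case.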
  • 23. Defining a reference sequence: Data validation. Reported as: NM_007171.3:c.942T>C. Base in transcript is a ‘C’ not a ‘T’
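The validation on slide 23 amounts to comparing the reported reference base with the base actually present in the transcript. The following is a rough sketch under stated assumptions: `check_reference_base` is a hypothetical name, the caller supplies the coding sequence with position 1 at the first base of the start codon, and only simple substitutions at plain coding positions (no intronic offsets such as `+15`) are handled. The accession and sequence in the usage note are toy values, not real records.

```python
import re

# Simple coding substitutions only, e.g. 'NM_007171.3:c.942T>C'.
SUBSTITUTION = re.compile(
    r"^(?P<accession>[A-Z]{2}_\d+\.\d+):c\.(?P<pos>\d+)(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)

def check_reference_base(hgvs, cds_sequence):
    """Return (matches, observed_base) for a simple c. substitution.

    cds_sequence is assumed to be the coding sequence of the named
    transcript, so that c.1 is cds_sequence[0].
    """
    match = SUBSTITUTION.match(hgvs)
    if match is None:
        raise ValueError(f"not a simple coding substitution: {hgvs!r}")
    pos = int(match.group("pos"))
    if not 1 <= pos <= len(cds_sequence):
        raise ValueError(f"position {pos} is outside the coding sequence")
    observed = cds_sequence[pos - 1].upper()
    return observed == match.group("ref"), observed
```

With a toy coding sequence `"ATGGCC"`, the call `check_reference_base("NM_000000.0:c.3T>C", "ATGGCC")` reports a mismatch, because the third base of the toy sequence is `G`, not `T`.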
  • 24. http://www.ncbi.nlm.nih.gov/clinvar
  • 25. Standardize data: what is the variation? 607008.0001; 985A>G; 985A>G (K304E); A985G; ACADM, LYS304GLU; K304E; K304E (985 A->G); K304E (K329E); K304E only; K329E; K329E(985A>G); LYS304GLU; Mutation c.985A>G (p.K304E); c.985A>G; c.985A>G (p.K304E); c.985A>G (p.Lys304Glu; includes: K304E (985A>G); p.K304E; p.Lys329Glu; previously known as p.Lys329Glu; Analysis of ACADM 985A>G mutation; NC_000001.10:g.76226846A>G; NG_007045.1:g.41804A>G; NM_000016.4:c.985A>G; NP_000007.1:p.Lys329Glu; rs77931234
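One way to picture the standardization on slide 25 is a lookup from every submitted name to a single preferred expression. This is a sketch, not ClinVar's actual pipeline: the choice of `NM_000016.4:c.985A>G` as the preferred name, the `standardize` helper, and the whitespace and case folding are all assumptions made for illustration, and the synonym list is only a subset of the names on the slide.

```python
PREFERRED = "NM_000016.4:c.985A>G"  # assumption: use the transcript-level name

# A subset of the submitted names listed on the slide.
SYNONYMS = [
    "985A>G",
    "A985G",
    "K304E",
    "K304E (985 A->G)",
    "K329E",
    "c.985A>G",
    "c.985A>G (p.K304E)",
    "p.K304E",
    "p.Lys329Glu",
    "NC_000001.10:g.76226846A>G",
    "rs77931234",
]

def _key(name):
    """Fold case and drop all whitespace so trivial variants of a name collide."""
    return "".join(name.split()).casefold()

LOOKUP = {_key(name): PREFERRED for name in SYNONYMS}
LOOKUP[_key(PREFERRED)] = PREFERRED

def standardize(name):
    """Return the preferred name for a known synonym, or None if it is unknown."""
    return LOOKUP.get(_key(name))
```

A curated table like this only covers names that have already been seen; a new spelling returns `None` and still needs review.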
  • 26. Miki et al, 1994