ClinVar: A Central Repository for Clinically Relevant Variants - Melissa J Landrum



Thousands of new variants are being identified thanks to advances in sequencing technologies. However, much of the data is stored in separate and sometimes private databases, which makes it difficult to use for evaluating the clinical significance of variants, especially rare ones. To improve access to this type of data, ClinVar maintains a freely available, public archive of human variation and its relationship to disease. The data can be used interactively on the web; a monthly full release in XML format and weekly summary files of genes and variants are also available for incorporation into analysis pipelines. Submissions include variants identified by direct testing in clinical or research labs, as well as reviewed variant-phenotype relationships from expert groups, such as InSiGHT and CFTR2, and professional societies, such as ACMG. In addition to the variant and phenotype, individual submissions may also provide a clinical assertion and evidence for that interpretation. The data model is flexible for many data elements: a variant may be defined by sequence or cytogenetic nomenclature; the phenotype may be a diagnostic term or features of a disease; and evidence for the interpretation may be structured as counts or provided as free text. For submitters who maintain their own website for variants, such as locus-specific databases (LSDBs), ClinVar links to the submitter’s site for each submitted variant, so that users who start at ClinVar become aware of the LSDB’s curated variants and can reach any further information on the variant that is available at the LSDB. Each individual submission is accessioned and versioned, in the format SCV000000000.1, to allow the submitter to update their record as the interpretation of the variant is re-evaluated over time. ClinVar uses standard terminologies, such as those for variant nomenclature, phenotypes, and pathogenicity, to avoid data ambiguity and to promote comparison of information from multiple sources.
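As a minimal sketch of how a pipeline might handle the accession.version format mentioned above, the following Python splits an accession into its parts and compares versions. The pattern is inferred only from the example SCV000000000.1 (and the analogous RCV form used for aggregate records); the nine-digit width and the helper names `parse_accession` and `is_newer` are assumptions for illustration, not part of any ClinVar tooling.

```python
import re
from typing import NamedTuple

# Assumed shape, inferred from the example SCV000000000.1:
# prefix (SCV or RCV), nine digits, a dot, then an integer version.
ACCESSION_RE = re.compile(r"^(SCV|RCV)(\d{9})\.(\d+)$")


class Accession(NamedTuple):
    prefix: str   # "SCV" (submitted record) or "RCV" (aggregate record)
    number: int   # numeric part of the accession
    version: int  # incremented when the record is updated


def parse_accession(text: str) -> Accession:
    """Split 'SCV000000010.2' into prefix, number, and version."""
    match = ACCESSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a versioned SCV/RCV accession: {text!r}")
    prefix, number, version = match.groups()
    return Accession(prefix, int(number), int(version))


def is_newer(candidate: str, current: str) -> bool:
    """True if both strings name the same record and candidate has a higher version."""
    a, b = parse_accession(candidate), parse_accession(current)
    if (a.prefix, a.number) != (b.prefix, b.number):
        raise ValueError("accessions refer to different records")
    return a.version > b.version
```

For example, `is_newer("SCV000000010.2", "SCV000000010.1")` would let a pipeline decide whether a locally cached interpretation has been superseded by an updated submission.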
ClinVar also adds related variant data, such as allele frequencies and HGVS expressions mapped across molecule types. While ClinVar staff members provide some curation of variants and phenotypes represented in ClinVar, clinical significance values are provided by submitters. As part of the submission process, ClinVar provides feedback to submitters. This feedback flags invalid HGVS expressions, as well as submissions whose clinical significance conflicts with an existing record for the same variant and phenotype and which may therefore warrant further curation. Submissions for the same variant-phenotype pair from different submitters are aggregated into a record that is accessioned and versioned in the format RCV000000000.1. Aggregation allows ClinVar to indicate when multiple submitters agree or conflict in the clinical interpretation of the variant, which can help clinical labs and curation groups identify high-confidence interpretations as well as those that should be prioritized for curation efforts.
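The aggregation described above can be illustrated with a short Python sketch that groups submissions by variant-phenotype pair and labels each group as agreeing or conflicting. This is not ClinVar's actual logic: the `Submission` fields, the status labels, and the rule that any difference in significance terms counts as a conflict are simplifying assumptions, and the significance values in the example are invented.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    """One submitted record (SCV). Field names are illustrative only."""
    scv: str
    variant: str
    phenotype: str
    submitter: str
    significance: str


def aggregate(submissions):
    """Group submissions by (variant, phenotype) and label agreement."""
    groups = defaultdict(list)
    for sub in submissions:
        groups[(sub.variant, sub.phenotype)].append(sub)

    summary = {}
    for key, subs in groups.items():
        # Compare significance terms case-insensitively.
        terms = sorted({s.significance.strip().lower() for s in subs})
        submitters = {s.submitter for s in subs}
        if len(terms) > 1:
            status = "conflict"          # simplification: any difference conflicts
        elif len(submitters) > 1:
            status = "agree"
        else:
            status = "single submitter"
        summary[key] = {
            "scvs": sorted(s.scv for s in subs),
            "significance": terms,
            "status": status,
        }
    return summary


# Variant, phenotype, and accessions follow the slide example;
# the significance values are invented for illustration.
example = [
    Submission("SCV000000010.1", "FBN1:c.4786C>T", "Marfan syndrome", "Lab A", "Pathogenic"),
    Submission("SCV000000020.1", "FBN1:c.4786C>T", "Marfan syndrome", "Lab B", "Pathogenic"),
]
for key, record in aggregate(example).items():
    print(key, record["status"], record["scvs"])
```

Keying on the pair rather than on the variant alone mirrors the RCV model; keying on the variant only would instead pool interpretations made for different phenotypes.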

Published in: Science
  • Describe standardized names as HGVS in several coordinate systems

    1. ClinVar: A Central Repository for Interpretations of Clinically Relevant Variants. Melissa Landrum, HVP 2014, May 21, 2014
    2. ClinVar
    3. ClinVar stats
    4. ClinVar integrates four domains of information
        • Domains: Variation, Phenotype, Interpretation, Evidence
        • Resources shown: dbSNP, dbVar, Gene, MedGen (HPO, OMIM), PubMed, ACMG, Sequence Ontology, GTR
    5. ClinVar – Standardized data
        • Names in use: 607008.0001; 985A>G; 985A>G (K304E); 985A>G (K329E); A985G; ACADM, LYS304GLU; K304E; K304E (985 A->G); K304E (K329E); K304E only; K329E; K329E(985A>G); LYS304GLU; Mutation c.985A>G (p.K304E); c.985A>G; c.985A>G (p.K304E); c.985A>G (p.Lys304Glu; c985A>G; includes: K304E (985A>G); p.K304E; p.Lys329Glu; previously known as p.Lys329Glu
        • Analysis of ACADM 985A>G mutation
        • Standardized names: NC_000001.10:g.76226846A>G; NG_007045.1:g.41804A>G; NM_000016.4:c.985A>G; ACADM:c.985A>G; NP_000007.1:p.Lys329Glu
    6. ClinVar aggregates by variant and phenotype
        • SCV – submitted ClinVar record (Variant; Phenotype; Submitter; SCV)
            – FBN1:c.4786C>T; Marfan syndrome; Lab A; SCV000000010
            – FBN1:c.4786C>T; Marfan syndrome; Lab B; SCV000000020
        • RCV – reference ClinVar record (Variant; Phenotype; RCV)
            – FBN1:c.4786C>T; Marfan syndrome; RCV000000050
    7. ClinVar web display
        • Allele summary: Gene; Variant type; Genomic location; HGVS expressions*; Molecular consequence*; Links*; Frequency*
        • Phenotype summary: Names; Links*; Age of onset*; Prevalence*
        • Interpretation: Significance; Review status*; Accession.version*
        • * May be provided by NCBI
    8. ClinVar web display
    9. ClinVar web display
    10. ClinVar Review Status
        • Status values: classified by single submitter; classified by multiple submitters; conflicting data from submitters; reviewed by expert panel; reviewed by professional society
        • Expert panels – both medical and research experts with published criteria and process for evaluating variant pathogenicity
            – CFTR2, InSiGHT
        • Professional society – groups that provide practice guidelines
            – American College of Medical Genetics (ACMG)
    11. ClinVar aggregates by variant
        • Submitted records (Variant; Phenotype; Submitter; SCV)
            – PTPN11:c.205G>C; Noonan syndrome; Lab A; SCV000000010
            – PTPN11:c.205G>C; Noonan syndrome; Lab B; SCV000000020
            – PTPN11:c.205G>C; Rasopathy; Lab C; SCV000000030
        • Aggregated by variant and phenotype (Variant; Phenotype; RCV)
            – PTPN11:c.205G>C; Noonan syndrome; RCV000000050
            – PTPN11:c.205G>C; Rasopathy; RCV000000050
        • Aggregated by variant (Variant)
            – PTPN11:c.205G>C
    12. ClinVar – new web display
    13. Accessing ClinVar data
        • Interactively on the web, updated weekly
        • Monthly full releases
            – Comprehensive XML extraction
            – VCF files
            – Tab-delimited summary files for genes, variants
        • E-utilities as web service or via command line
        • Annotation on graphic sequence displays
        • Variation Viewer
        • Variation Reporter
    14. Submitting data to ClinVar
        • Minimal or data-rich submissions are accepted
        • Multiple submission formats
            – Excel spreadsheet templates
            – tsv, csv files
            – XML
        • Online documentation
        • And contact us with questions -
    15. Acknowledgements
        • ClinVar/GTR/RefSeqGene/Gene/MedGen staff: Alex Astashyn, Chao Chen, Shanmuga Chitipiralla, Baoshan Gu, Douglas Hoffman, Wonhee Jang, Brandi Kattman, Ken Katz, Jennifer Lee, Donna Maglott, Adriana Malheiro, Michael Ovetsky, George Riley, Wendy Rubinstein, Amanjeev Sethi, Ray Tully, Ricardo Villamarin
        • dbSNP/dbVar/dbGaP: Michael Feolo, John Garner, Tim Hefferon, Brad Holmes, John Lopez, Rama Maiti, Jose Mena, Lon Phan, David Shao, Ming Ward
        • All of NCBI, Jim Ostell, Steve Sherry
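
The "E-utilities as web service" access route listed on slide 13 can be sketched as follows. The snippet only builds an ESearch request URL against the `clinvar` database and does not send it; the helper name `clinvar_esearch_url`, the default `retmax`, and the example query `ACADM[gene]` are assumptions for illustration, so the current NCBI documentation should be checked for supported fields and usage limits.

```python
from urllib.parse import urlencode

# Base address of the NCBI Entrez E-utilities web service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def clinvar_esearch_url(term: str, retmax: int = 20) -> str:
    """Build an ESearch URL that looks up ClinVar record IDs for a query term."""
    params = {
        "db": "clinvar",     # search the ClinVar database
        "term": term,        # Entrez query, e.g. a gene symbol with a field tag
        "retmode": "json",   # ask for a JSON response instead of XML
        "retmax": retmax,    # maximum number of IDs to return
    }
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Hypothetical query: records for the ACADM gene used in the slide 5 example.
print(clinvar_esearch_url("ACADM[gene]"))
```

The resulting URL can be fetched with any HTTP client; the returned IDs would then be passed to a follow-up E-utilities call to retrieve record summaries.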