Before assessing the clinical significance of a somatic mutation, one must determine if the mutation is likely to be a driver mutation (i.e. a mutation that provides a selective growth advantage, thereby promoting cancer development). To aid clinicians in this process, VSClinical provides an oncogenicity scoring system, which uses a variety of metrics to classify a given somatic mutation into one of the following categories: oncogenic, likely oncogenic, benign, likely benign, or uncertain significance. This scoring system is heavily inspired by the ACMG Guidelines for the interpretation of germline mutations but has several important differences to make it more applicable in the context of somatic variant interpretation.
Our oncogenicity scoring system relies on an additive point system in which points are assigned to a given variant based on several criteria. Many of the criteria are shared by the ACMG Guidelines for germline variant interpretation, such as population frequency information, variant effect on protein function, and nearby pathogenic variants in catalogs such as ClinVar. However, other criteria are specific to the world of somatic variant interpretation. These include the variant’s presence in somatic catalogs such as COSMIC, the effect of other known oncogenic variants in the same gene, and the variant’s presence in known cancer hotspots or active binding sites. These criteria are combined by summing over the scores for all applicable scoring criteria. Scores exceeding 3 indicate an oncogenic or likely oncogenic classification, while scores falling below -3 indicate a benign or likely benign classification.
In this webcast, we discuss how these scoring criteria are combined to obtain an oncogenicity classification. This includes a discussion of the considerations taken into account during the development of this scoring system and a detailed analysis of several example mutations to illustrate the system in practice.
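The additive point system described above can be sketched in a few lines of Python. This is a minimal illustration, not the VSClinical implementation: the function name `classify` and the criterion names in the demo are hypothetical. Only the cutoffs at +3 and -3 come from the abstract, which does not state where oncogenic is separated from likely oncogenic (or benign from likely benign), so the sketch returns the combined bands.

```python
from typing import Dict, Tuple


def classify(points: Dict[str, int]) -> Tuple[int, str]:
    """Sum per-criterion points and map the total to a coarse band.

    Assumption: only the +3 / -3 cutoffs from the abstract are used; the
    finer oncogenic vs. likely oncogenic split is not given in the source.
    """
    total = sum(points.values())
    if total > 3:
        band = "oncogenic or likely oncogenic"
    elif total < -3:
        band = "benign or likely benign"
    else:
        band = "uncertain significance"
    return total, band


if __name__ == "__main__":
    # Hypothetical criterion names, for illustration only.
    demo = {"in_somatic_catalogs": 2, "hotspot": 1, "computational": 1}
    print(classify(demo))
```

Keeping the per-criterion points in a dictionary makes it easy to show which criteria contributed to a total when reviewing a classification.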
Evaluating Oncogenicity in VSClinical
1. Evaluating Oncogenicity in VSClinical
Nathan Fortier, Ph.D., Director of Research
20 Most Promising Biotech Technology Providers
Top 10 Analytics Solution Providers
4. NIH Grant Funding Acknowledgments
• Research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under:
• Award Number R43GM128485-01
• Award Number R43GM128485-02
• Award Number 2R44 GM125432-01
• Award Number 2R44 GM125432-02
• Montana SBIR/STTR Matching Funds Program Grant Agreement Number 19-51-RCSBIR-005
• PI is Dr. Andreas Scherer, CEO of Golden Helix.
• The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
5. Filtering and Annotation
ACMG & AMP Guidelines
Clinical Reports
CNV Analysis
Pipeline: Run Workflows
Variant Warehouse
Centralized Annotations
Hosted Reports
Sharing and Integration
CNV Analysis
GWAS | Genomic Prediction
Large-N Population Studies
RNA-Seq
Large-N CNV Analysis
Who Are We?
Golden Helix is a global bioinformatics company
founded in 1998
8. SIMPLE, SUBSCRIPTION-BASED BUSINESS MODEL
o Yearly fee
o Unlimited training & support
SOFTWARE IS VETTED
o 20,000+ users at 400+ organizations
o Quality & feedback
DEEPLY ENGRAINED IN SCIENTIFIC COMMUNITY
o Give back to the community
o Contribute content and support
INNOVATIVE SOFTWARE SOLUTIONS
o Cited in 1,000s of publications
When you choose Golden Helix, you receive more than just the software
9. Clinical workflow, from input data to report
• BAM: Calling of CNVs
• VCF: Annotating, filtering & prioritizing of clinically relevant SNPs and CNVs
- Clinical interpretation of SNPs & CNVs
- ACMG & AMP guidelines assessing germline and somatic variations
- Clinical reporting
• Outputs: PDF report, Word report, Excel table
10. VSClinical - AMP Guidelines: Analyzing Biomarkers
Haroche J. et al. Dramatic efficacy of vemurafenib in both multisystemic and refractory Erdheim-Chester disease and Langerhans cell histiocytosis harboring the BRAF V600E mutation. Blood 2013 121
• Biomarker Definition
- Biological states with indications for treatments, prognostic, or diagnostic outcomes
- Presence or absence of proteins, antigens, and specific genomic attributes of the tumor
• Common Cancer Biomarkers
- HER2+: High levels of HER2 receptor protein
- MSI-H: Microsatellite instability-high
- BRAF V600E: Activating mutation V600E
- ERBB2 Amp: Amplification of ERBB2
- BCR-ABL1: Activation of ABL1 with BCR fusion
- TP53 WT: No significant alterations of critical TSG
11. VSClinical – AMP and ACMG Guidelines: One Suite
• Increased lab throughput
• Consistent results
• Shorten learning curve
• Staying abreast of new developments
Germline
Somatic
13. Oncogenicity Scoring
Score columns: -5B, -3B, -2B, -1B, +1O, +2O, +3O
Applies To | Criteria | Possible Scores
All | Population Frequency | -5, -3, -1
 | Homozygous in Controls | -2, -1
 | In Somatic Catalogs | +1, +2, +3
 | Relevant Variant Assessments | -1, +2, +3
Null | Damaging LoF | +1, +2
 | LoF are Oncogenic Mutations in Gene | +1
Missense | Nearby Pathogenic Missense Variants | +2
 | In-Frame not in Repeat Region | +1
 | Somatic Hotspot & Active Binding Sites | +1, +2
Non-Null | Computational Evidence | -1, +1
All | Splice Site Prediction | +1, +2
Non-Coding | Silent, Intronic, UTR, Intergenic Variants w/ No Splice Effect | -3
14. Germline Population Frequency
• The maximum sub-population frequency is used.
• We use gnomAD and 1000 Genomes (choosing the maximum frequency between both catalogs)
• Our thresholds are equivalent to those used in the ACMG Guideline automation for BA1/BS1 but there is no PM2 (+2) for being novel (not in germline catalogs)
• Recessive genes allow for higher frequency (two-hit)
Possible Scores: -5, -3, -1
Score | Recessive | Dominant
-5B | 1.00% | 0.50%
-3B | 0.15% | 0.05%
-1B | 16 individuals (all) | 16 individuals (all)
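The threshold table above can be read as a small lookup. The sketch below is one possible reading, not the VSClinical implementation: the function and parameter names are hypothetical, the comparisons are assumed to be "at or above", and "16 individuals (all)" is interpreted as at least 16 carriers counted across all populations.

```python
def population_frequency_score(max_subpop_freq: float,
                               total_carriers: int,
                               recessive: bool) -> int:
    """Return the germline population frequency points for a variant.

    max_subpop_freq: highest sub-population frequency across gnomAD and
        1000 Genomes, as a fraction (0.01 means 1.00%).
    total_carriers: carriers across all populations (assumed meaning of
        "16 individuals (all)" on the slide).
    recessive: True for recessive genes, which tolerate higher frequencies.
    """
    # Thresholds from the slide: recessive 1.00% / 0.15%, dominant 0.50% / 0.05%.
    strong, moderate = (0.01, 0.0015) if recessive else (0.005, 0.0005)
    if max_subpop_freq >= strong:  # assumption: inclusive comparison
        return -5
    if max_subpop_freq >= moderate:
        return -3
    if total_carriers >= 16:
        return -1
    return 0
```

For instance, a frequency of 0.6% scores -5 in a dominant gene but only -3 in a recessive gene, which reflects the two-hit allowance mentioned above.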
15. Present in Controls
• Controls include 1000 Genomes and the gnomAD “Controls” subset.
• Scores counts of individuals homozygous for the variant in a recessive gene
• Scores counts of individuals heterozygous / hemizygous for the variant in a dominant / X-linked gene respectively
Possible Scores: -2, -1
Score | Number of Individuals
-2B | Multiple individuals
-1B | Exactly one individual
16. In Somatic Catalogs
• Looks at COSMIC, ICGC, and MSK-IMPACT
• Total sample count (tumor type agnostic)
• Thresholds chosen to match the power law of mutation occurrence in somatic catalogs
• +2D/+3D only apply if the variant has an allele count (AC) below 16 in germline catalogs
Possible Scores: +1, +2, +3
Score | # Samples (At Least) | Variants in COSMIC
+1D | 1 | 3,296,000 (100%)
+2D | 5 | 43,000 (1.4%)
+3D | 35 | 1,000 (0.03%)
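As a rough illustration of how the sample-count thresholds and the germline gate interact, here is a minimal Python sketch. The function and parameter names are hypothetical, and the fallback to +1 when a frequently observed variant fails the germline gate is an assumption; the slide only says that +2 and +3 require an allele count below 16.

```python
def somatic_catalog_score(sample_count: int, germline_allele_count: int) -> int:
    """Return the somatic catalog points for a variant.

    sample_count: total samples carrying the variant across the somatic
        catalogs (tumor type agnostic).
    germline_allele_count: allele count in germline catalogs.
    """
    if sample_count < 1:
        return 0
    rare_in_germline = germline_allele_count < 16
    if sample_count >= 35 and rare_in_germline:
        return 3
    if sample_count >= 5 and rare_in_germline:
        return 2
    # Assumption: variants seen in the catalogs still earn +1 when the
    # germline gate blocks the higher scores.
    return 1
```

With these thresholds, the 28,263 COSMIC samples reported for BRAF V600E yield +3 and the single sample for SLX4 A1461Pfs*2 yields +1, matching the worked examples later in the deck.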
17. Relevant Variant Assessments
Possible Scores: +3, +2, -1
Classified variants
- Internal Knowledge-Base of classified variants
- ClinVar 1+ star Likely Pathogenic / Pathogenic
- CIViC 1+ star variants
- Other “Consortium” sources
Score
- +3 if Pathogenic Same Change
- +2 if Pathogenic Missense Same Codon
- -1 if Benign Scored
18. Variant Type Specific Criteria
Groups of Variant Types:
• Null variants: frameshift, stop gain, start loss
- Previously classified mutation?
- Does the mutation result in a null / truncated gene product?
- Are null variants shown to be drivers in cancer for this gene?
• Missense variants: amino acid substitutions and length polymorphisms
- Previously classified amino acid (same codon)?
- In local region of previously classified variants?
- In active binding site or mutation hotspot?
- In-silico evidence: functional prediction and splicing?
• Non-coding variants: silent mutations, intronic, UTR
- Predicted to disrupt canonical splice site?
Sequence Ontology on Current Transcript (Selectable)
19. Damaging LoF
The p.K1358Dfs variant occurs in the last exon of MSH6. There are no other pathogenic loss of function variants downstream of the variant p.K1358Dfs.
Possible Scores: +2, +1
Truncating / Null Variant Evidence:
+1 Relative position in protein coding sequence
- Not within 50bp of penultimate exon
- Not on last exon
+1 Previously classified variant downstream
- Any LoF variant downstream of this variant’s position
- Sources of previously classified variants:
  - Internal KnowledgeBase of classified / interpreted variants
  - ClinVar 1+ star Likely Pathogenic / Pathogenic
  - CIViC variants with certain evidence threshold / star-rating
  - Other “consortium” sources
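The two +1 components above can be sketched as independent checks. This is an illustrative reading only: the function and parameter names are hypothetical, "downstream" is assumed to mean a strictly greater residue position, and the penultimate-exon check is passed in as a precomputed flag because the slide does not spell out how the 50bp window is measured.

```python
from typing import Sequence


def damaging_lof_score(in_last_exon: bool,
                       in_penultimate_exon_50bp_window: bool,
                       variant_residue: int,
                       pathogenic_lof_residues: Sequence[int]) -> int:
    """Return the Damaging LoF points (0, +1 or +2) for a null variant."""
    score = 0
    # +1: position suggests a truncated or absent product rather than a
    # near-complete protein.
    if not in_last_exon and not in_penultimate_exon_50bp_window:
        score += 1
    # +1: at least one previously classified pathogenic LoF variant lies
    # downstream (assumption: strictly greater residue position).
    if any(pos > variant_residue for pos in pathogenic_lof_residues):
        score += 1
    return score
```

Under this reading, the MSH6 p.K1358Dfs example above scores 0, since it is in the last exon and has no pathogenic loss of function variants downstream.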
20. LoF are Oncogenic Mutations in Gene
Possible Scores: +1
Affinity with Gene:
Classified variants
- 1 or more LoF Pathogenic / Likely Pathogenic
Proportion of COSMIC mutations:
- 5% of variants are LoF
LoF CIViC Evidence
- Statement about null variants in CIViC
- 1 Star+ rating
21. Nearby Pathogenic Missense Variants
Possible Scores: +2
Using Previously Classified Variants:
• There are no benign missense variants within three amino acids of the variant
• There are at least two pathogenic missense variants within six amino acids of the variant
• The number of pathogenic missense variants within six amino acids exceeds the number of benign missense variants
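The three conditions above translate into a window count around the variant's residue. The sketch below assumes all three conditions must hold for the +2, and that a variant at the same residue counts as being within the window; neither detail is stated on the slide, and the function and parameter names are hypothetical.

```python
from typing import Sequence


def nearby_pathogenic_missense_score(variant_pos: int,
                                     pathogenic_positions: Sequence[int],
                                     benign_positions: Sequence[int]) -> int:
    """Return +2 if the variant sits in a cluster of pathogenic missense variants."""

    def count_within(positions: Sequence[int], window: int) -> int:
        # Count classified variants whose residue is within `window` amino acids.
        return sum(1 for p in positions if abs(p - variant_pos) <= window)

    benign_within_3 = count_within(benign_positions, 3)
    pathogenic_within_6 = count_within(pathogenic_positions, 6)
    benign_within_6 = count_within(benign_positions, 6)

    # Assumption: all three slide conditions are required together.
    if (benign_within_3 == 0
            and pathogenic_within_6 >= 2
            and pathogenic_within_6 > benign_within_6):
        return 2
    return 0
```

Requiring the pathogenic count to exceed the benign count keeps a region with mixed evidence from being treated as a hotspot.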
22. In-Frame Not in Repeat Region
For In-Frame Insertions / Deletions:
• +1 if the inserted sequence is not repeated two or more times
• Considering a version of “Nearby Pathogenic Inframe Variants” for another +1 to boost variants in in-frame indel hotspots (e.g. EGFR exon 19)
The p.A3571_V3572del variant is an in-frame deletion of an amino acid sequence that is repeated 2 times in the surrounding region.
Possible Scores: +1
23. Somatic Hotspot & Active Binding Site
Exon 15 of BRAF shows regions designated as somatic missense mutation hotspots as well as key activating sites and binding site annotations
Possible Scores: +2, +1
Region Tracks:
+1 Cancer hotspots
- Single residue and in-frame indel mutation hotspots identified in 24,592 tumor samples by the algorithm described in [Chang et al. 2017] and [Chang et al. 2016]
+1 Binding sites / activating / active sites
- Curated through InterPro
- Residue annotations from CDD
- More specific than large domain annotations
24. Computational Evidence
In-Silico Evidence (for Non-LoF Variants)
• +2: 3 or 4 out of 4 splice site predictors call the variant damaging
• +1: In-silico predictions agree that the variant is damaging & conserved
• -1: Variant amino acid is present in mammalian species
• -1: In-silico predictions agree that the variant is tolerated & not conserved
Synonymous / UTR / Intronic Variants
• -3: Not predicted to disrupt a canonical splice site and no Pathogenic clinical assessment
Possible Scores: +3-1
25. Example: BRAF V600E
General Scoring
• +0: novel in gnomAD
• +3: Somatic catalog of 28,263 samples in COSMIC
• +3: In ClinVar as Pathogenic, in CIViC 1+ star
Missense/Computational Evidence
• +2: Nearby pathogenic variants
• +2: In Cancer Hotspot and Active Binding Site
• +1: Functional & Conservation all agree
Final Score: +11
26. Example: SLX4 A1461Pfs*2
General Scoring
• +0: 0.0009% (1 of 109874 European) in gnomAD
• +1: Somatic catalog of 1 sample in COSMIC
• +0: Not in ClinVar or CIViC
Loss of Function
• +2: Not at end of gene, downstream pathogenic LoF
- There are 2 downstream pathogenic loss of function variants, with the furthest variant being 283 residues downstream of the variant p.A1461Pfs*2.
• +1: LoF are Driver Mutations in Gene
- The p.A1461Pfs*2 variant is a loss of function variant in the gene SLX4, which is intolerant of loss of function variants, as indicated by the presence of the existing pathogenic loss of function variant NP_115820.2:p.Leu20Argfs*24 and 5 others
Final Score: +4 (Likely Oncogenic)
27. Example: PTCH1 C454Y
General Scoring
• +0: novel in gnomAD
• +0: Not in COSMIC
• +0: Not in ClinVar or CIViC
Missense / Computational Evidence
• +0: Nearby pathogenic variants
- There are no classified pathogenic variants within 6 amino acid positions of the variant p.C454Y, providing no evidence of being in a mutation hot spot.
• +0: In Cancer Hotspot and Active Binding Site
• +1: Functional & Conservation all agree
Final Score: +1 (VUS)
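The arithmetic behind the three worked examples can be checked by summing the per-criterion points exactly as listed on the slides. The snippet below is only a tally; the dictionary layout and variable names are illustrative.

```python
# Per-criterion points copied from the three example slides, in slide order.
examples = {
    "BRAF V600E": [0, 3, 3, 2, 2, 1],
    "SLX4 A1461Pfs*2": [0, 1, 0, 2, 1],
    "PTCH1 C454Y": [0, 0, 0, 0, 0, 1],
}

# Additive scoring: the final score is the plain sum of the points.
totals = {name: sum(points) for name, points in examples.items()}

if __name__ == "__main__":
    for name, total in totals.items():
        print(f"{name}: {total:+d}")
```

The totals are +11, +4 and +1. The first two exceed the +3 cutoff given in the abstract and the third does not, which agrees with the Likely Oncogenic and VUS labels shown for SLX4 and PTCH1.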
33. COVID-19 Resources
• Bundle discounts will be ending on June 15th
• SVS Imputation Module w/CADD & OMIM
• VSClinical, AMP, CNV, Sentieon Tier 1
• Small Warehouse License: VS-CNV, VSClinical+ AMP, Sentieon Tier 1,
VSReports, VSPipeline
• If you are interested in reserving one of these bundles, you can
mention this in the Questions pane now.
36. Thank you for attending!
Please let us know if you have any further questions by emailing info@goldenhelix.com.
We look forward to seeing you on the next webcast.
Editor's Notes
Delaina’s intro – click when Q&A is mentioned
Delaina highlights Q&A, click when she passes it over to me
First and foremost, we recently received grant funding from the NIH, for which we are incredibly grateful. The research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under the listed awards. We are also grateful for local grant funding from the state of Montana. Our PI is Dr. Andreas Scherer, who is also the CEO of Golden Helix, and the content described today is the responsibility of the authors and does not officially represent the views of the NIH. Again, we are thankful for grants such as these, which provide huge momentum in developing the quality software we provide. Now let’s learn more about Golden Helix as a company.
Golden Helix is a global bioinformatics software and analytics company that enables research and clinical practices to analyze large genomic datasets. We were founded in 1998 based on pharmacogenomics work performed at GlaxoSmithKline, which was and still is a primary investor in our company.
We currently have two flagship products: VarSeq and SNP & Variation Suite, or SVS for short.
SVS is our research application platform that enables researchers to perform complex analysis and visualizations on genomic and phenotypic data.
SVS has a broad range of tools to easily perform GWAS, Genomic Prediction, Differential expression analysis on RNA-Seq Data and has the ability to process CNV analysis, which we will demonstrate today.
VarSeq, on the other hand, is our clinical application platform that is used for filtering and annotating variants of interest.
We can also evaluate variants according to the ACMG guidelines with VSClinical and have the option to automatically create clinical reports from the results of various workflows.
Using the same software, we can also perform CNV analysis on targeted gene panels, whole exome, and whole genome sequencing data.
We also have an add-on feature called VSPipeline, which takes workflows created in VarSeq and automates the process of variant annotation and filtering.
Now all of the information produced from VarSeq can be stored in our Warehouse solution, which is designed to be installed on a server location and serve as a repository for your variants evaluations, annotations, and hosted reports. Lastly, VSWarehouse can also be used for sharing and integration between license holders.
Our software has been very well received by the industry. We have been cited in thousands of peer-reviewed publications, and that’s a testament to our customer base.
We work with over 400 organizations all over the globe, including:
pharmaceutical companies: Bayer and Lilly
top-tier institutions: Stanford and Yale
government organizations: NCI
clinics: SickKids
genetic testing labs: PreventionGenetics and Lineage
We now have well over 20,000 installs of our products and thousands of unique users.
So why is this relevant to you?
This means that over the course of 20 years our products have received a lot of user feedback, which we are always very receptive to when developing and releasing newer versions of our products. This user feedback allows our software to stay relevant and well vetted in its capabilities and quality, which builds our products’ reputation, trust, and client experience.
We also stay at the forefront of the needs of the industry and community by regularly attending conferences and providing useful product information via eBooks, tutorials, and blog posts. Your access to the software is a simple subscription-based model where we don’t charge per sample or per version. You also maintain full access to our support and training staff to get you up to speed quickly with your analysis.
The Golden Helix clinical stack supports the entire workflow for NGS genetic testing of cancer. (review each step in summary)
We utilize the AMP guidelines not only to create a full understanding of a biomarker’s impact, but also to investigate and report on a variety of biomarker types. This could include single nucleotide variants, insertions or deletions, copy number variants, gene fusions, and considerations for wild type genes. The fundamental goal here is to leverage known clinical interpretations, drug sensitivity and resistance, as well as prognostic and diagnostic information from cancer databases, which will inform treatment. An example can be seen in the image on the right, demonstrating the efficacy of vemurafenib in a patient harboring the BRAF V600E mutation. VSClinical’s framework follows the tier system suggested by the AMP Guidelines to analyze the available clinical evidence and clinical significance of any given variant.
The major hurdle VSClinical overcomes is the inherent limitation with manually accounting for all evolving knowledge for any given variant. This is especially true regarding cancer databases which evolve and grow exponentially, and it is unrealistic for a single person to promptly keep track of all the information. The need for automation is critical and VSClinical is the solution. Not only is consistency upheld through handling all available evidence across multiple databases, but it also removes any issue with underlying workflow fatigue. Moreover, the improved efficiency provided by VSClinical allows less experienced users to process more data more quickly while still maintaining interpretation consistency. A more subtle value but just as critical is the educational benefit VSClinical provides in familiarizing new users with the AMP and ACMG guidelines. Lastly, users benefit from the fact that we integrate advances in these guidelines directly into the software so users spend more time performing variant analysis and less time having to modify or update their bioinformatic pipeline.
Because VSClinical serves as the ACMG and AMP guideline interpretation hub it is important to discuss the cancer-based annotations that much of the interpretation is based on.
In-frame variants are handled much like missense variants, but they don’t get nearby pathogenic variants. They do look at hotspots, active binding sites, and exact vs. same-aa relevant assessments, and get an extra +2 if not in a repeat region (or -1 if they are).
The +1 / -1 relevant variant assessment actually relates to all
Concern: missense should get a -1 if in a benign variant / benign aa
Color code our criteria here
571,000 (19%) for 2 or more
~1,400 1+ star CIViC variants; all evidence levels add something worth investigating in the interpretation
Pathogenic / Likely and Benign / Likely
alanine valine repeated twice more
Pathogenic / Likely Pathogenic
PTCH1 is a tumor suppressor gene
ACMG gets to Likely Pathogenic because of PM2 (novel), PP2 (missense in gene), and PP3 (in-silico)