With our recent launch of VarSeq 2.5.0, expedited somatic analysis is more accessible to NGS labs than ever before. Our recent webcasts have shown our range of updates, including our new oncogenicity classifier and carrier status workflows:
Identifying Oncogenic Variants in VarSeq
VarSeq 2.5.0: Empowering Family Planning through Carrier Screening Analysis
In this user perspective webcast, we will highlight how the combination of our new oncogenicity classifier and the updates to our CancerKB database streamline the interpretation of oncogenic variants. In addition, as NGS labs progress from gene panels to WES analysis for ideal genomic signature generation, we will demonstrate how a VarSeq somatic workflow can scale with these increased scopes of data analysis with ease.
Our user perspective webcast will cover:
Application of virtual panels to WES tumor/normal workflows.
Use of the oncogenicity classifier to streamline filter chains.
Updates to our CancerKB database to include the CancerKB gene track.
Including parallel germline secondary findings for the whole NGS workflow.
3. VarSeq 2.5.0: VSClinical AMP Workflow
from the User Perspective
November 29, 2023
Presented by: Jennifer Dankoff, PhD
4. NIH Grant Funding Acknowledgments
• Research reported in this publication was supported by the National Institute Of General Medical Sciences of the
National Institutes of Health under:
o Award Number R43GM128485-01
o Award Number R43GM128485-02
o Award Number 2R44 GM125432-01
o Award Number 2R44 GM125432-02
o Award Number 1R43HG013456-01
o Montana SBIR/STTR Matching Funds Program Grant Agreement Number 19-51-RCSBIR-005
• PI is Dr. Andreas Scherer, CEO of Golden Helix.
• The content is solely the responsibility of the authors and does not necessarily represent the official views of the National
Institutes of Health.
5. Who Are We?
Golden Helix is a global bioinformatics company founded in 1998
Filtering and Annotation
ACMG & AMP Guidelines
Clinical Reports
CNV Analysis
CNV Analysis
GWAS | Genomic Prediction
Large-N Population Studies
RNA-Seq
Large-N CNV-Analysis
Variant Warehouse
Centralized Annotations
Hosted Reports
Sharing and Integration
Pipeline: Run Workflows
8. The Golden Helix Difference
FLEXIBLE DEPLOYMENT
On premise or in a private
cloud
BUSINESS MODEL
Annual fee for software,
training and support
CLIENT CENTRIC
Unlimited support from the
very beginning
SINGLE SOLUTION
Comprehensive cancer and
germline diagnostics
SCALABILITY
Gene panels to whole
exomes or genomes
THROUGHPUT
Automated pipeline
capabilities
QUALITY
Clinical reports correct the
first time
11. Insight on Somatic workflows: Moving from Gene Panel to WES
• Lower sequencing costs result in more affordable whole exome samples.
• Movement towards whole exome sequencing.
• Increase in diagnostic yield per sequenced sample.
• Tumor/Normal projects can be used for ‘normal subtraction.’
• Application of Virtual Panels for flexible analysis.
• According to research by Parikh et al., tumor-only workflows can overestimate
the TMB score, adversely affecting patient treatment outcomes.
• Tumor/Normal projects result in the most accurate TMB calculations.
• While genomic signatures have to be externally calculated, VarSeq supports
both WES and Tumor/Normal somatic workflows.
13. Secondary Findings
Adding incidental germline findings to any somatic workflow
Single sample
somatic
analysis
Tumor / normal
subtraction
Oncogenic
Workflow
Secondary
Findings
1. Oncogenicity Scoring
2. AMP Tier Level Prioritization
3. Somatic Gene Panel Application
1. ACMG
Classification
2. ClinVar Reviews
3. Secondary
Findings Gene Panel
• Secondary Findings workflows can be applied in
parallel to any somatic workflow.
• Single sample gene panels, WES, or tumor /
normal workflows.
• Secondary Findings ACMG Classifications and
ACMG Secondary Finding Gene Panels can be
supplemented to any filter chain.
• All applicable variants can be brought into the
VSClinical AMP module as ‘germline suspected’
for final reporting.
14. More than one type of genetic mutation can drive oncogenesis
• Mutations that activate oncogenes:
o Missense
o In-frame insertions/deletions
o Fusions
o Copy number amplifications
• Mutations that inhibit tumor suppressor genes:
o Gene deletions
o Loss of function nonsense, frameshift indels
o Disabling fusions, structural variants
• Genomic signatures that describe the overall state of the
somatic genome
Comprehensive Genomic Profiling Tests
The next step for precision medicine
Copy Number
Rearrangements
Base Substitutions
Deletions
Insertions
Genomic Signatures
15. VSClinical AMP Toolbox
SNVs/Indels, CNVs & Fusion
Analysis, Genomic Signatures
Import standard spectrum of variant
types from modern callers
Golden Helix CancerKB
Expert-curated knowledgebase, ultimate
somatic variant interpretation engine
Combination Biomarkers
More complex interpretations
for combinations, negative findings
Global Clinical Trials
Worldwide trials with advanced
search & filter
Evaluation Scripts
Import directly from TSO-500, Archer, Ion
Torrent & others to support clinical analysis
Oncogenicity Classifier
Apply filter to identify driver mutations in
panels, exomes and genomes
16. The Updated CancerKB
Bringing cancer hallmarks to your reporting pipeline.
• Emerging cancer genes are now being captured
with CancerKB.
• Functional roles in cancer are defined with the
CancerKB Gene Track.
• Annotate with the CancerKB gene track to define
cancer hallmarks.
• Cancer roles are leveraged in the shipped
comprehensive cancer report template.
• CancerKB is now more comprehensive than ever.
• 800+ fully curated genes
• 17,000+ treatments
• Almost 60,000 global trials, and more!
17. VSClinical: Comprehensive Clinical Reporting
• VSClinical conducts the clinical variant analysis
based on the AMP guidelines
- Automated population of the clinical report based on the
workflow outcome
- Standardizing of variant level interpretation based on
customizable assessment catalogs
- For all somatic variant types (SNVs, CNVs, fusions),
GHI provides predefined clinical assessments via our
CancerKB catalog
- Include oncogenicity scores, biomarker interpretations,
drugs, trials, and more!
• Rendering of clinical reports within seconds
• Supported output formats
- Word
- PDF
- JSON
18. Examples for Product Demonstration
• Workflow One- Tumor/Normal analysis
• Demonstrate reduction in variant number due to normal ‘subtraction.’
• CancerKB Gene Track is used to isolate a cancer hallmark [TENT5C; Resisting cell
death]
• Workflow Two- Singleton with Parallel Filter Chains
• Variants in IDH1 and EIF2AK4
• Using the Cancer Classifier for known and novel variants.
• Leveraging parallel filter chains to capture VUS and Secondary Findings
• VSClinical:
• Tier Level Drug matches
• CancerKB interpretation
• Final Reporting
21. NIH Grant Funding Acknowledgments
• Research reported in this publication was supported by the National Institute Of General Medical Sciences of the
National Institutes of Health under:
o Award Number R43GM128485-01
o Award Number R43GM128485-02
o Award Number 2R44 GM125432-01
o Award Number 2R44 GM125432-02
o Award Number 1R43HG013456-01
o Montana SBIR/STTR Matching Funds Program Grant Agreement Number 19-51-RCSBIR-005
• PI is Dr. Andreas Scherer, CEO of Golden Helix.
• The content is solely the responsibility of the authors and does not necessarily represent the official views of the
National Institutes of Health.
23. Innovation Awards
The competition will run from
Dec. 1st, 2023 - Feb. 29th, 2024
So if you answer YES to one or more of the questions below, or have
great examples of your workflows, then the 2024 Golden Helix
Innovation Awards are for you!
• Do you use Golden Helix software?
• Do you use NGS analysis to treat patients?
• Are you studying a particular disease category, or are you zeroing in
on a specific population?
• Have you incorporated the ACMG or AMP guidelines into your clinical
workflow?
• Do you leverage our research platform for plants, animals, or
humans?
• Do you work with CNVs?
Thanks Casey! We can’t wait to dive into this subject
Thank you, Casey! I am excited to talk about the new 2.5.0 features!
Before we start diving into the subject, I wanted to mention our appreciation for our grant funding from the NIH.
The research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under the listed awards.
We are also grateful to have received local grant funding from the state of Montana. Our PI is Dr. Andreas Scherer who is also the CEO at Golden Helix and the content described today is the responsibility of the authors and does not officially represent the views of the NIH.
So with that covered, let's take just a few minutes to talk a little bit about our company Golden Helix.
Golden Helix is a global bioinformatics software and analytics company that enables research and clinical practices to analyze large genomic datasets. We were originally founded in 1998 based on pharmacogenomics work performed at GlaxoSmithKline, which is still a primary investor in our company.
VarSeq, our flagship product, serves as a clinical tertiary analysis tool. At its core, it serves as a variant annotation and filtration engine. Additionally, users can have access to automated AMP or ACMG variant guidelines. VarSeq also has the capability to detect copy number variations, scaling from single-exon to large aneuploidy events. Lastly, the finalization of variant interpretation and classification is further optimized with the VarSeq clinical reporting capability. Users can integrate all of these features into a standardized workflow.
Paired with VarSeq are VSWarehouse and VSPipeline. VSWarehouse serves as a repository for the large amount of useful genomic data wrangled by our customers. Warehouse not only solves the issue of data storage for ever-increasing genomic content, but is also fully queryable and auditable, and lets you define user access for project managers or collaborators. In tandem with this, VSPipeline allows for the automated execution of routine workflows, further optimizing users' abilities to handle large amounts of data and throughput.
Lastly, our research platform, SVS, enables researchers to perform complex analysis and visualizations on genomic and phenotypic data.
Our software has been very well received by the NGS industry. We have been cited in thousands of peer-reviewed publications, and that’s a testament to our customer base.
We work with over 400 organizations all over the globe. This includes top-tier institutions like Stanford and Yale, government organizations like the NCI and NIH, clinics such as Sick Kids, and many other genetic testing labs. We now have well over 20,000 installs of our products, with thousands of unique users.
So how is this relevant to you?
At Golden Helix, we focus on the seven pillars of customer success. Golden Helix offers a single software solution that encompasses germline, somatic, and CNV analysis. Our software is also highly scalable, supporting gene panel to whole genome sequencing workflows. With our complete automation capabilities, we now offer a FASTQ or VCF to report pipeline. Our software can be locally deployed or installed in the cloud, and our business model of annual subscription per user means you are able to increase your workload without increasing analysis fees. And it goes without saying that our FAS team is here to support you on your analysis journey.
Today I will be taking you through a workflow perspective of some of our updates in VarSeq 2.5.0. We will be taking a look at our project scalability, ability to integrate parallel workflows in a single project, our new cancer classifier, and some updates to our CancerKB database.
Now before we get started, I wanted to give you a high level overview of the VarSeq suite workflow. VarSeq facilitates handling of all your variant types for both somatic and germline analysis. The utility of the software can be broken into stages.
The first stage is the import of your SNVs/indels, CNVs and fusions, which are then passed through a user-defined variant filter coupled with many annotations and algorithms to isolate the clinically relevant variants. These filters and project structure are saved as templates to facilitate automation with our VSPipeline command line tool. Once the clinically relevant variants are isolated, they are imported into VSClinical, which serves as the interpretation hub to collect all relevant evidence for germline or somatic variants via the ACMG and AMP guidelines. Once the variants have been evaluated, they are saved locally in a user database and VSClinical is used to generate a final clinical report. So now that you have a high-level understanding of the tool’s purpose, let’s move into discussing today’s topics.
The reality of the NGS space is, both genome and exome sequencing are becoming very cost effective, perhaps a couple hundred dollars for sequencing an exome. With this increase in affordability, we are naturally seeing a trend toward whole exome sequencing, away from small gene panels. Now, another advantage over cost when moving to whole exome sequencing is the increase in diagnostic yield. More diagnostic information can be extracted from whole exome workflows, and with the application of virtual panels, the scope of analysis can be changed without having to re-sequence a sample.
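To make the virtual panel idea concrete, here is a minimal sketch in Python. It is not VarSeq's implementation or API; the record layout, the function name `apply_virtual_panel`, the panel contents, and the variant identifiers are all hypothetical. It only shows that changing the gene list changes the scope of analysis on the same sequenced sample.

```python
from typing import Iterable

# Hypothetical variant record: (gene symbol, variant identifier).
Variant = tuple[str, str]


def apply_virtual_panel(variants: Iterable[Variant], panel: set[str]) -> list[Variant]:
    """Return only the variants whose gene belongs to the virtual panel."""
    return [v for v in variants if v[0] in panel]


# Toy exome call set; the identifiers are placeholders, not real calls.
exome_variants = [
    ("IDH1", "var1"),
    ("EIF2AK4", "var2"),
    ("BRAF", "var3"),
    ("TENT5C", "var4"),
]

# Two hypothetical panels applied to the same sample, with no re-sequencing.
small_panel = {"IDH1", "BRAF"}
wider_panel = {"IDH1", "BRAF", "TENT5C"}

small_hits = apply_virtual_panel(exome_variants, small_panel)
wider_hits = apply_virtual_panel(exome_variants, wider_panel)
```

Widening the panel is just a second call with a larger gene set, which is the point of keeping the exome data and narrowing it virtually.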
The decrease in sample sequencing cost has also resulted in a surge in tumor normal workflows. These workflows are highly effective at screening out germline variants in what clinicians affectionately call ‘normal subtraction.’ There are other benefits to using a tumor normal set up. A study by Parikh and colleagues in 2020 demonstrated that tumor only workflows can overestimate TMB scores, which can adversely affect patient outcomes. They saw that tumor normal analysis was able to provide the most accurate TMB calculations.
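As a rough illustration of why normal subtraction matters for TMB, the sketch below treats each call set as a Python set and takes TMB as mutations per covered megabase. This is an assumption-laden toy, not how VarSeq or any specific TMB caller works: the variant keys, the covered region size, and the function names are hypothetical, and real pipelines apply further filters.

```python
# Hypothetical variant key: (chromosome, position, ref allele, alt allele).
VariantKey = tuple[str, int, str, str]


def normal_subtraction(tumor: set[VariantKey], normal: set[VariantKey]) -> set[VariantKey]:
    """Keep only variants seen in the tumor but not in the matched normal."""
    return tumor - normal


def tumor_mutation_burden(n_mutations: int, covered_megabases: float) -> float:
    """Mutations per megabase of covered sequence."""
    if covered_megabases <= 0:
        raise ValueError("covered_megabases must be positive")
    return n_mutations / covered_megabases


# Toy call sets; coordinates are placeholders, not real data.
tumor_calls = {
    ("chr1", 100, "A", "G"),
    ("chr2", 200, "C", "T"),
    ("chr3", 300, "G", "A"),
    ("chr4", 400, "T", "C"),
    ("chr5", 500, "G", "C"),
}
normal_calls = {
    ("chr1", 100, "A", "G"),
    ("chr3", 300, "G", "A"),
    ("chr5", 500, "G", "C"),
}
covered_mb = 2.0  # hypothetical size of the covered region

somatic_calls = normal_subtraction(tumor_calls, normal_calls)
tmb_tumor_only = tumor_mutation_burden(len(tumor_calls), covered_mb)
tmb_subtracted = tumor_mutation_burden(len(somatic_calls), covered_mb)
```

In this toy case the germline variants left in the tumor-only count inflate the score, which is the direction of bias described above.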
Now while these comprehensive genomic signatures, like tumor mutation burden, have to be externally calculated, these scores can be imported into VSClinical to supplement a somatic analysis. Needless to say, VarSeq does not care about the source of the VCFs; the goal is to accurately leverage the highest level of detection for the tertiary analysis. That means scaling from gene panel all the way to whole genome, running a tumor only analysis or a tumor normal project, and bringing in copy number variants, structural variants, and other comprehensive genomic signatures for a complete analysis.
Here I wanted to highlight one of the tools that help expedite variant filtration and prioritization, whether you are working with a tumor only workflow, or a tumor and normal pair. The biggest gain in our variant filtering is our new oncogenicity classifier, which was added in VarSeq version 2.5.0. Similar to how our ACMG classifier works, the cancer classifier will leverage various data sources, from population catalogs to in-silico functional prediction tools, to help you easily identify driver mutations and filter out variants that are likely benign. This classifier is based on an additive scoring system in which a variant is categorized into one of four classifications based on a set of thresholds. Specifically, scores exceeding a threshold of 3 are classified as likely oncogenic or oncogenic, while scores falling below a threshold of -3 are classified as benign or likely benign. If you would like to learn more about our cancer classifier, please see our September 2023 webcast, which does an excellent job breaking down these scoring metrics.
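The additive scoring idea can be sketched in a few lines of Python. Only the two thresholds (3 and -3) come from the description above; the evidence point values, the function name `classify_oncogenicity`, and the label for scores between the thresholds are assumptions. The cutoffs that separate "likely" from definite calls are not given here, so each pair is grouped into one bin.

```python
ONCOGENIC_THRESHOLD = 3    # scores above this: likely oncogenic or oncogenic
BENIGN_THRESHOLD = -3      # scores below this: likely benign or benign


def classify_oncogenicity(evidence_points: list[int]) -> str:
    """Sum the evidence points and bin the total by the two thresholds."""
    score = sum(evidence_points)
    if score > ONCOGENIC_THRESHOLD:
        return "oncogenic / likely oncogenic"
    if score < BENIGN_THRESHOLD:
        return "benign / likely benign"
    # Assumed label: the middle range is not named in the talk.
    return "uncertain"


# Hypothetical evidence point sets for three variants.
driver_like = classify_oncogenicity([4, 1])
benign_like = classify_oncogenicity([-4, -1])
in_between = classify_oncogenicity([2, -1])
```

A filter chain could then keep only variants in the first bin, which is how a classifier like this shortens the chain.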
With our different tools, including the new cancer classifier, we are able to develop any workflow we need with VarSeq.
From the workflow perspective, once again, we can construct any combinations of filters for any workflow. Our various tools, such as the ACMG auto-classifier, the cancer classifier, virtual gene panels, phenotypic prioritization, and more, can be used to meet these goals. Any of these combinations of filter chains can be saved as a project template, and applied to the next round of samples.
For example, with my figure on the right, I have either a hypothetical tumor only or tumor and normal workflow. As a workflow option, I could have an oncogenic workflow that prioritizes tier I and tier II oncogenic variants. I could also have a bin specifically for capturing VUS variants that I may want to collect for reanalysis down the line. With this workflow I could apply the new cancer classifier, leverage AMP tier level prioritization, or build in virtual gene panels.
Then on the right, I have a parallel filter chain for specifically isolating secondary findings. For this workflow I may leverage the ACMG classification, ClinVar, or the Secondary Findings gene panels.
Now all of these variant types can be brought into VSClinical for analysis then final reporting.
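One way to picture parallel filter chains is as independent lists of predicates run over the same variant table, each producing its own bin. The sketch below is only a mental model, not VarSeq's filter engine; the field names (`origin`, `oncogenicity`, `acmg`), the gene placeholders, and the panel contents are hypothetical.

```python
from typing import Callable

Variant = dict[str, str]
Filter = Callable[[Variant], bool]

# Hypothetical secondary findings gene panel.
SECONDARY_FINDINGS_PANEL = {"GENE_C"}


def run_chain(variants: list[Variant], filters: list[Filter]) -> list[Variant]:
    """Keep the variants that pass every filter in the chain."""
    return [v for v in variants if all(f(v) for f in filters)]


# Three chains evaluated independently against the same input.
chains: dict[str, list[Filter]] = {
    "oncogenic": [
        lambda v: v["origin"] == "somatic",
        lambda v: v["oncogenicity"] in {"oncogenic", "likely oncogenic"},
    ],
    "vus": [
        lambda v: v["origin"] == "somatic",
        lambda v: v["oncogenicity"] == "uncertain",
    ],
    "secondary_findings": [
        lambda v: v["origin"] == "germline",
        lambda v: v["acmg"] in {"pathogenic", "likely pathogenic"},
        lambda v: v["gene"] in SECONDARY_FINDINGS_PANEL,
    ],
}

# Toy variant table with placeholder genes.
variants = [
    {"gene": "GENE_A", "origin": "somatic", "oncogenicity": "oncogenic", "acmg": "n/a"},
    {"gene": "GENE_B", "origin": "somatic", "oncogenicity": "uncertain", "acmg": "n/a"},
    {"gene": "GENE_C", "origin": "germline", "oncogenicity": "n/a", "acmg": "pathogenic"},
]

bins = {name: run_chain(variants, filters) for name, filters in chains.items()}
```

Because each chain reads the full input, adding a secondary findings chain does not change what the oncogenic or VUS chains return.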
Moving into the variant analysis portion of our workflow, inside of VSClinical I may want to leverage externally called comprehensive genomic signatures, which can influence which drug recommendations are made to patients.
Traditionally genetic tests in cancer have focused on small gene panels that include just the most common 50 or so genes, where the test is only looking at small gene mutations such as BRAF V600E. However, there are many other classes of variants that we can’t capture with these small gene panels. This shortcoming is being addressed by a new generation of genomic tests in cancer called Comprehensive Genomic Profiling Tests. These tests are looking at more than one type of mutation that can drive oncogenesis. In some cases we are looking for specific mutations that activate oncogenes. These can be missense mutations and in-frame insertions or deletions but can also include gene fusions and copy number amplifications. These tests also look for mutations that inhibit tumor suppressor genes, such as full gene deletions, nonsense or frameshift mutations, and disabling fusions. Finally, there is the question of what is going on at the genome scale for this tumor. This information is quantified using metrics called genomic signatures, which provide useful supporting information about how the cancer can be treated. This includes metrics like tumor mutational burden, which was mentioned earlier, and microsatellite instability. All of these can play into the final drug recommendations to patients and be rendered into the final clinical report.
Let’s take a look at the other levels of diagnostic information we are going to apply inside of VSClinical. We are going to want to leverage all of the high-quality biomarker data at our disposal. This means bringing in the variants that have made it through our filter chains, importing comprehensive genomic signatures, and applying our proprietary CancerKB databases for expert level interpretations.
As you can imagine, capturing all of the relevant evidence to determine the tier level of a given biomarker is a large undertaking, which is why having a tool like VSClinical is vital to performing these evaluations in a streamlined way. Our developers always endeavor to stay ahead of the curve with updates in the cancer genomics field, so users don’t have to worry about building bioinformatic pipelines to keep up with the latest tests and technology. As mentioned, VSClinical AMP now supports the range of comprehensive genomic profiling and we provide custom evaluation scripts built in with the software to bring in the output from your TSO-500 kits and others such as Ion Torrent, Archer and more.
VSClinical presents the relevant clinical evidence in a streamlined user-friendly interface and our cancer classifier guides the user through an assessment of each variant’s oncogenicity through a series of easy-to-understand questions which will be automatically answered whenever possible.
CancerKB is our variant interpretation database containing high-quality expert-curated gene, and biomarker interpretations in various human cancers along with clinical information such as drug sensitivity, resistance, prognostic and diagnostic implications for those biomarkers. Coupled with our CancerKB driven biomarker analysis, VSClinical also incorporates automated matching for global clinical trials.
Let’s take a moment to review some updates to our proprietary CancerKB database. With the upgrades to VarSeq 2.5.0 we are now able to use the CancerKB gene track. This gene track can provide information on functional roles and cancer hallmarks. These cancer roles are then leveraged in the shipped cancer report template. Moreover, CancerKB is more comprehensive than ever before. With 800+ fully curated genes, more than 17 thousand drug treatments, almost 60 thousand global clinical trials, CancerKB is your clinical partner in creating a complete clinical report for your patients.
Speaking of clinical reports: following variant interpretation and classification, the user can easily and quickly create standardized clinical reports. All the relevant annotation and cataloged evidence for a variant is automatically populated into the report, which is critical for eliminating copy and paste errors, and the reports are entirely customizable. As mentioned in the previous slide, the time to reporting is reduced even further when leveraging the CancerKB catalog's predefined clinical assessments.
For our demonstration today, we are going to start off by looking at a basic tumor-normal workflow. With this workflow, we can demonstrate the dramatic reduction in variant numbers due to normal ‘subtraction,’ isolating the variants unique to the tumor. With this project we can also take a look at how the new CancerKB Gene Track can be used to investigate tumor hallmarks.
For our next project, we are going to take a look at maximizing a single workflow by having several workflows set up in parallel. One of these will isolate oncogenic variants we want to bring into a clinical report, one workflow will isolate variants of uncertain significance, where they can be stored for later interpretation, and one workflow will isolate a germline secondary pathogenic finding.
Last, we will move into VSClinical to show how genomic signatures can contribute to drug matches, how CancerKB interpretations are automatically rendered for use, and creating that final clinical report.
In summary, today we touched on the highlights of what’s new in VS 2.4.0 and gave you a glimpse of these updates from the user perspective. Overall we wanted to assure our users that as germline structural variant analysis and long read tech are becoming more mainstream, we’ve made sure that VarSeq can handle these data types, so we look forward to working with you and helping you analyze your data! So, thank you for tuning in, and now I will hand it back over to Casey to wrap up.
Question – Do you have secondary support for structural variant calling with short read sequencing? Solomon will answer this one (pair star fusion with Sentieon)
Question – Can you import phenotype data in the form of phenopackets from PacBio?
Question – Has GHI seen many labs utilizing long read data? Yes, we have seen customers using long read at a much higher frequency than expected, both for germline and somatic.
Question – Does your business model compensate for rerun of samples when setting up validation of workflows? You can run as many samples as you need to validate your pipeline.
Before wrapping up, we'd like to again state our appreciation for the grants included here. And with that, I'll hand things back to Casey to talk about some exciting marketing updates and take us through a Q&A session.
Again, I want to mention how grateful we are for grants such as these, which support the advancement and development of our software and help create the high-quality software you'll see today.
So with that covered, let's take a few minutes to talk a little bit about our company Golden Helix.
Secondary findings not validated for general population screening (very little overlap ~ 4 genes) (DMD for cardiomyopathy, primary ovarian failure in FMR1)
What if carrier frequency not established: don’t report reproductive risk
VUS variants? If one partner has a Pathogenic variant, may consider a weak VUS in the same gene in the other partner with more scrutiny
Genes with multiple disease associations? – report them and describe, also possible some variants result in autosomal dominant disease
Does not replace newborn screening – does not include de novo
Higher risk in similar or shared genetic lineage in reproductive partners