Golden Helix is a single testing paradigm that allows users to start with next-generation sequencing data and finish with a clinical report. Our solutions are comprehensive as they are all performed in one software suite, which can save time and money as they prevent the need to outsource to different companies. Furthermore, our software is fully transparent in that you have full control over the steps performed in your analysis. Golden Helix is also on the forefront in the clinical workspace as we have implemented the ACMG and AMP guidelines to evaluate single nucleotide variants, insertions and deletions, as well as structural variants.
Beyond these functionalities, Golden Helix provides the ability to perform family-based analysis. Our ACMG and Exome trio templates give users a starting point to understand the different inheritance models ranging from transmitted to de novo variants, but we also have features that can provide additional evidence for both traditional and nontraditional family-based workflows. Specifically, we have algorithms that can be implemented to look at extended pedigree information and sample relatedness as well as options for examining whether a given variant such as a CNV segregates among similarly affected family members. In this webcast, we would like to demonstrate these features and show some solutions for the analysis of different family structures.
As an overview, this webcast will cover:
- Implementing filter logics for different family structures
- Algorithms that can be used for establishing clinical significance in family-based workflows
- Visualization capabilities for further understanding of inheritance models
- Confirming and evaluating transmitted CNVs
3. Family-Based Workflows in VarSeq and VSClinical
February 10th, 2021
Presented by Eli Sward, PhD: Field Application Scientist Manager
4. NIH Grant Funding Acknowledgments
• Research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under:
o Award Number R43GM128485-01
o Award Number R43GM128485-02
o Award Number 2R44 GM125432-01
o Award Number 2R44 GM125432-02
o Montana SBIR/STTR Matching Funds Program Grant Agreement Number 19-51-RCSBIR-005
• PI is Dr. Andreas Scherer, CEO of Golden Helix.
• The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
5. Who Are We?
Golden Helix is a global bioinformatics company founded in 1998
Filtering and Annotation
ACMG & AMP Guidelines
Clinical Reports
CNV Analysis
Pipeline: Run Workflows
CNV Analysis
GWAS | Genomic Prediction
Large-N Population Studies
RNA-Seq
Large-N CNV-Analysis
Variant Warehouse
Centralized Annotations
Hosted Reports
Sharing and Integration
8. When you choose Golden Helix, you receive more than just the software
Software is Vetted
• 20,000+ users at 400+ organizations
• Quality & feedback
Simple, Subscription-Based Business Model
• Yearly fee
• Unlimited training & support
Deeply Ingrained in Scientific Community
• Give back to the community
• Contribute content and support
Innovative Software Solutions
• Cited in 1,000s of publications
• Recipient of numerous NIH grants and funding from other bodies
10. Quick Poll
What technique do you currently use to call copy number variants?
1. Chromosomal microarray
2. Multiplex ligation-dependent probe amplification (MLPA)
3. VS-CNV
4. Internal pipeline
5. Other
13. VS-CNV Implementation
• Application
o 100s of users around the world
• Validation
o 15+ publications
o 100% concordance with MLPA (Iacocca et al. 2017)
• Value
o Improves resources
o Cost
o Analysis time
14. VS-CNV Detection
• Based on existing BAM coverage data
o Prevents the need to outsource
• Uses data normalization
o Accounts for systematic biases
o Reference set comparison
• Requirements
o ≥ 30 ref samples
o Same library preparation method
o Gene panel & exome ~100X coverage
15. VS-CNV CNV Calling
• Metrics
o Ratio, Z-score, VAF (Supporting)
o Documentation in manual
• Quality flags
o CNV event flags
o Sample flags
• Confidence
o P-value
• Clinical Interpretation
o Annotations & algorithms
16. Support ACMG & ClinGen Guidelines for the interpretation of CNVs
• Provides framework for evaluating CNVs
• Incorporates decision trees for CNV scoring system
• Scoring criteria composed into 5 sections
• Comprehensive workflow:
o Score
o Classify
o Interpret
o Report
17. VS-CNV Annotations
• Updated regularly
• Notifications for track updates
• Public annotations
• Frequency tracks
o 1KG Phase 3 / ExAC
o DGV / gnomAD
• Clinical submission databases
o ClinVar
• Dosage sensitivity
o ClinGen
18. Algorithms: Copy Number Probability and Segregation
• Displays expected copy number
• Computes probability CNV is present in Mother or Father
• Provides confidence that the estimated copy number is accurate
• Identifies other samples sharing event
19. Algorithms: PhoRank for CNVs
• Ranks CNVs and genes based on relevance to user-specified phenotypes
• Provides names, scores, and paths for highest ranking genes
• Modeled on Phevor algorithm
20. Project Workflow
• Trio with extended pedigree
• Proband (male) is associated with intellectual disability, seizures, obesity
• Identify SNVs and CNVs associated with disorder
• Generate clinical report
22. VSClinical: Automating best practice guidelines
1. Enables consistency
2. Offers simplicity
3. Highly educational
4. Provides scalability
25. eBook Update: Precision Medicine
Want a free copy? Request one in the questions or chat panel and our team will follow up with you!
26. End of Year Bundles
o 1: SVS Imputation Module w/CADD & OMIM (2-users) – $7,995
o 2: VSClinical, CNV, Sentieon Tier 1 (2-users) - $19,995
o 0: VSClinical, AMP, CNV, Sentieon Tier 1 (2-users) - $29,995
o 1: Small Warehouse License: VS-CNV, VSClinical+ AMP, Sentieon Tier 1, VSReports, VSPipeline - (2-users) - $48K
o 1: Large Warehouse License: VS-CNV, VSClinical + AMP, Sentieon Tier 1, VSReports, VSPipeline (<10 users) - $120K
Also offering temporary remote licenses for Golden Helix customers who are unable to access their machines. Please contact our team to learn more.
Extended through February 15th
27. Abstract Competition
• Official or unofficial abstracts are accepted (you do not need to be published; we are just asking to share a story)
• Winners will receive:
• 1st place:
• One-year Single-Named User (SNU) license* of either SNP & Variation Suite (SVS) or VarSeq
• Dell Latitude 5000 series laptop
• Opportunity to present research to the Golden Helix community in the form of a webcast and blog post
• 2nd and 3rd place:
• One-year Single-Named User (SNU) license* of either SNP & Variation Suite (SVS) or VarSeq
• Opportunity to present research to the Golden Helix community in the form of a webcast and blog post
• Competition will end on March 9th
• Submit your abstract to marketing@goldenhelix.com
Show us how you use, or plan to use, Golden Helix software in your clinical or research work!
28. COVID-19 Publications & Articles
Investigating the Global Spread of SARS-CoV-2 Leveraging Next-Gen Sequencing and Principal Component Analysis
European Journal of Clinical and Biomedical Sciences
Christiane Scherer, James Grover, Darby Kammeraad, Gabe Rudy, Andreas Scherer
Diagnosing and Tracking COVID-19 Infections Leveraging Next-Gen Sequencing
The Journal of Precision Medicine Feature | Andreas Scherer, Christiane Scherer
Golden Helix: Enabling Precision Medicine with Cutting-Edge NGS Technologies
Clinical OMICs Feature
Leveraging Next-Generation Sequencing Technology in the Fight Against COVID-19
Clinical Lab Manager Feature | Andreas Scherer
SARS-CoV-2 Global Spreading Investigation using Principal Component Analysis of Sequence Variants
Journal of Genetics and Genome Research
Christiane Scherer, James Grover, Darby Kammeraad, Gabe Rudy, Andreas Scherer
Analysis of 46,046 SARS-CoV-2 whole-genomes leveraging principal component analysis (PCA)
Pre-Release | Submitted for Publication
Christiane Scherer, James Grover, Darby Kammeraad, Gabe Rudy, Andreas Scherer
Before we dive into the subject, I want to mention our appreciation for our grant funding from the NIH.
The research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under the listed awards.
We are also grateful for the local grant funding we received from the state of Montana. Our PI is Dr. Andreas Scherer, who is also the CEO of Golden Helix. The content described today is the responsibility of the authors and does not officially represent the views of the NIH.
Again, we are thankful for grants such as these, which support the development of the high-quality software you'll see today.
So with that covered, let's take a few minutes to talk a little bit about our company, Golden Helix.
Golden Helix was founded back in 1998, and we are one of the few bioinformatics companies that can say we have more than 20 years of experience building solutions in the research and clinical space.
These solutions have a very broad range of capabilities, and we won’t be covering all of these today, but they do cover a lot of use cases in both the clinical and research realms.
What we will focus on today is our VarSeq product suite, which is aimed primarily at the clinical application of the software. VarSeq supports clinical analysis workflows in which you can filter and annotate variants and then evaluate those variants according to the ACMG and AMP guidelines.
However, if you are interested in any of these other capabilities, please visit our website. We have recorded numerous webinars and other material for you to browse at your own pace.
We have been cited in thousands of peer-reviewed publications, and we are always happy to celebrate the success of our customers in performing their research with our tools. And actually, if you check out our blog, you will often see us highlighting those publications!
We serve over 400 customers globally, and these customers span many different industries, from academic institutions, to research hospitals, to commercial testing labs and government institutions.
That being said, as a company, our software is being vetted by our large and established customer base.
We have had many years to perfect the development process, and are always incorporating feedback from our customers and their diverse set of needs and use cases.
We are extremely integrated with the scientific community, providing content in the form of ebooks and webinars like this one.
As a business, we want to be aligned with the success of our customers. One way we do that is by providing our software on a simple per-user subscription model that comes with unlimited training and support. We are invested in supporting your research or setting up your test so that you can run as many samples through our software as possible.
Since we are an established and trusted business in this space, you might already know someone who uses Golden Helix, but if not, we are happy to find you a reference in your area.
So before I jump into today's topic, I want to provide a conceptual overview of how VarSeq, and more specifically VSClinical, fits into your NGS analysis workflows.
Our software can detect CNV events using the existing coverage data stored in your BAM files. To best explain the value of VS-CNV detection, we can compare it against the traditional best methods: MLPA and chromosomal microarrays (CMA).
MLPA is ideally suited to detecting smaller events in a single gene or perhaps a few genes. In addition to being expensive, at around $80 per gene, it cannot detect larger events, which chromosomal microarrays can handle. However, CMA typically detects events of 10 kbp or larger, up to the aneuploidy level, and cannot accurately detect smaller events. This presents a conundrum.
That is where VS-CNV comes in. CNV detection with your NGS data in VarSeq accurately detects not only the 10 kbp and larger events, but also events down to a single gene and even a single exon. VarSeq breaks down the restrictions of the CMA and MLPA methods and gives the user the full-scale capability to process everything from small gene panels up to whole genome datasets in one software suite.
This can potentially save you a fortune on assays, as well as time, because CNV detection is performed by you.
Our CNV software has been adopted by hundreds of users and cited in over 15 publications, including well-regarded journals focusing on a variety of different topics.
Most importantly, a paper by Iacocca et al. from the Robarts Research Institute compared our software to the traditional best methods, MLPA and CMA, and found 100% concordance in CNV event detection, highlighting that our software is an accurate and cost-effective tool in the CNV workspace.
Now that we have discussed the CNV caller, let's briefly talk about the VSClinical AMP feature.
NGS-based detection of CNVs starts with the coverage data in the BAM file.
This is not a straightforward task, and our development team has worked extremely hard to develop an algorithm that addresses the challenges associated with coverage data.
One challenge is that coverage can vary between samples; not all samples are created equal. Another is that coverage can fluctuate between targets, and the vast majority of these fluctuations are caused not by CNV events but by systematic biases in the data. To account for this bias and variability, we normalize the coverage data with mean values, which also simplifies presentation of the coverage.
Another issue is that looking at a single sample's coverage data alone is not enough to make the calls. To detect a CNV from a change in coverage over any region, we need to compare the loss or gain in coverage to a normal diploid region. To establish what a normal region looks like and to account for the systematic bias across coverage regions, we use reference samples to calculate a normalized, averaged coverage value that represents a normal diploid region. Keep in mind that the reference set does not need to consist solely of control samples with no CNV events. The benefit of averaging the normalized coverage over multiple reference samples is that an event in any single reference sample cannot skew the reference-based normal region overall. For this approach to work effectively, there are some requirements: an adequate number of references, no fewer than 30; references that come from the same platform and prep methods, though not necessarily the same run; and adequate coverage, ideally 100x or greater.
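To make the normalization and reference-set idea concrete, here is a minimal Python sketch. It assumes the mean read depth per target has already been extracted from each BAM file. The function names (normalize_sample, reference_profile) and the simple divide-by-sample-mean normalization are illustrative assumptions, not the actual VS-CNV algorithm; only the 30-sample minimum is taken from the requirements described above.

```python
from statistics import mean


def normalize_sample(coverage):
    """Scale one sample's per-target depths by that sample's mean depth.

    This removes overall depth differences between samples, so a sample
    sequenced at 200x and one sequenced at 100x become comparable.
    """
    sample_mean = mean(coverage)
    if sample_mean == 0:
        raise ValueError("sample has no coverage")
    return [depth / sample_mean for depth in coverage]


def reference_profile(reference_samples, min_refs=30):
    """Average the normalized coverage of each target across references.

    reference_samples is a list of per-target depth lists, one list per
    reference sample, all covering the same targets in the same order.
    Averaging over many references keeps a CNV in any single reference
    from dominating the "normal diploid" baseline.
    """
    if len(reference_samples) < min_refs:
        raise ValueError(f"need at least {min_refs} reference samples")
    normalized = [normalize_sample(sample) for sample in reference_samples]
    # zip(*...) regroups the values target by target.
    return [mean(target_values) for target_values in zip(*normalized)]
```

In this sketch the per-target values returned by reference_profile play the role of the reference set's normalized coverage, against which each sample's normalized coverage is later compared.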
This image is a great example of the need for reference samples. Looking at the 3 samples on the right and their coverage data across BRCA2, we might guess that sample 11 has a heterozygous deletion, since its coverage is nearly half that of samples 12 and 13. Unfortunately, it isn’t this simple: detecting a CNV by eye is essentially impossible, since a single sample's coverage doesn't provide enough information on its own. In this example, there is no deletion event in sample 11, but we do find a duplication in sample 13.
We also seek to avoid black-box approaches to CNV output, so users can easily answer the question “how does the coverage for any region in my sample compare to the mean coverage values of my reference set?” In VarSeq's coverage region table, the user can view the normalized coverage value for both the sample and the reference set. The detection of a CNV event is based on the comparison of these two normalized values, and the computation yields additional metrics the user can assess when interpreting the call. Let's explore these CNV metrics in the next slide.
Here is a snapshot of the detected duplication in sample 13 spanning 4 exons in BRCA2. The comparison between the normalized coverage of sample 13 and that of the reference set is the basis for the CNV detection and is represented by key metrics that support the event.
These key metrics are:
Ratio: this is simply the sample's normalized coverage divided by the reference set's normalized coverage. A ratio of ~1 means the region has equal coverage in your sample and the reference set, whereas a ratio of 0.5 is indicative of a heterozygous deletion and a ratio of 1.5 of a possible duplication. You can see this at the top of the image, where the ratios for each region in this CNV are near 1.5 for this duplication.
Z-score: the other critical metric is the number of standard deviations by which the sample's normalized coverage differs from that of the controls. The next plot in the image shows Z-scores ranging from over 6 down to 3.5, which is strong evidence for the call.
Variant Allele Frequency: while this is a secondary metric, it allows us to reduce our rate of false positives and exclude problematic regions from the algorithm.
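The ratio and Z-score described above can be sketched in a few lines of Python. This is a textbook formulation under stated assumptions, not the exact VS-CNV computation (which is documented in the manual): the function names are hypothetical, the inputs are assumed to be already-normalized coverage values for a single target, and the Z-score uses the sample standard deviation of the reference values.

```python
from statistics import mean, stdev


def coverage_ratio(sample_value, reference_values):
    """Sample normalized coverage divided by the reference-set mean.

    Roughly 1.0 suggests no change, 0.5 a heterozygous deletion, and
    1.5 a possible duplication.
    """
    return sample_value / mean(reference_values)


def coverage_z_score(sample_value, reference_values):
    """Number of reference standard deviations between sample and mean."""
    spread = stdev(reference_values)  # needs at least two reference values
    if spread == 0:
        raise ValueError("reference values have no variance")
    return (sample_value - mean(reference_values)) / spread


# Hypothetical normalized coverage for one target: three references
# and a sample that looks like a duplication.
references = [0.8, 1.0, 1.2]
ratio = coverage_ratio(1.5, references)
z = coverage_z_score(1.5, references)
print(f"ratio={ratio:.2f}, z-score={z:.2f}")
```

A high ratio on its own is weak evidence if the reference values for that target are noisy, which is why the two metrics are read together: the Z-score expresses the same difference in units of the reference set's own variability.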
In addition to exploring these metrics, the CNV caller also provides the user with an introspective capability of adding certainty to the detected event.
The new ACMG CNV guidelines are the first to provide a robust set of rules for the interpretation of small intragenic deletions and duplications. There are also additional considerations that have been elaborated on by the ClinGen working group, which provides a deep dive into each section of the new guidelines as well as several examples and case studies, one of which will be discussed today.
The guidelines can be broken down into 80 distinct criteria, which we have simplified into five sections. Once these sections have been evaluated, the points they contribute are summed, and the total is compared against thresholds to determine whether the CNV event is pathogenic or benign.
If your team has been troubled by the complexity of these guidelines, then Golden Helix is the solution for you. Our team has spent long hours reading the guidelines, watching the webinars, and reading all available supplementary material to provide you with an intuitive interface to classify CNV events according to the ACMG CNV guidelines.
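As a rough illustration of how a point total maps to a classification, the Python sketch below sums per-section points and applies the cutoffs published in the ACMG/ClinGen CNV technical standard as I understand them (0.99 and above pathogenic, 0.90 to 0.98 likely pathogenic, -0.89 to 0.89 uncertain significance, -0.90 to -0.98 likely benign, -0.99 and below benign). The function name and the example section scores are hypothetical, and this is not a description of how VSClinical implements the scoring.

```python
def classify_cnv(section_points):
    """Sum the points from the evaluated sections and classify the CNV.

    section_points is an iterable of numbers, one per evaluated section;
    positive points support pathogenicity, negative points support a
    benign classification.
    """
    # Round to two decimals so float noise cannot shift a borderline total.
    total = round(sum(section_points), 2)
    if total >= 0.99:
        return "Pathogenic"
    if total >= 0.90:
        return "Likely pathogenic"
    if total > -0.90:
        return "Uncertain significance"
    if total > -0.99:
        return "Likely benign"
    return "Benign"


# Hypothetical scores for the five sections of one CNV.
example_scores = [0.0, 0.45, 0.0, 0.30, 0.15]
print(classify_cnv(example_scores))
```

Keeping the thresholds in one small function makes it easy to check a manual tally against the same cutoffs for every CNV in a case.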
Darby
New PSV1 scoring
More modifiers for scoring criteria
Option to include custom frequency
Catalogs
Application of batch script with breakdown of parameters
Eli
Clinical reports are the final product of laboratory testing and often are integrated into a patient’s electronic health record. Therefore, effective reports are concise, yet easy to understand. Reports should be written in clear language that avoids medical genetics jargon or defines such terms when used. The report should contain all of the essential elements of the test performed, including structured results, an interpretation, references, methodology, and appropriate disclaimers.
The results section should list variants using HGVS nomenclature (see Nomenclature). Given the increasing number of variants found in genetic tests, presenting the variants in tabular form with essential components may best convey the information. These components include nomenclature at both the nucleotide (genomic and complementary DNA) and protein level, gene name, disease, inheritance, exon, zygosity, and variant classification.
The interpretation should contain the evidence supporting the variant classification, including its predicted effect on the resultant protein and whether any variants identified are likely to fully or partially explain the patient’s indication for testing. The interpretation section should address all variants described in the results section but may contain additional information. It should be noted whether the variant has been reported previously in the literature or in disease or control databases. The references, if any, that contributed to the classification should be cited where discussed and listed at the end of the report.
The additional information described in the interpretation section may include a summarized conclusion of the results of in silico analyses and evolutionary conservation analyses.
The methods and types of variants detected by the assay and those refractory to detection should be provided in the report.
A large number of patient advocacy groups and clinical trials are now available for support and treatment of many diseases. Laboratories may choose to add this information to the body of the report or attach the information so it is sent to the health-care provider along with the report.
Enables Consistency: Directly follows ACMG, AMP, and now ACGS guidelines
Offers Simplicity: Provides auto recommendations for guidelines using intuitive GUI
Highly Educational: Easy to learn and understand guidelines and rule logics for classification
Provides Scalability: Allows labs to expedite variant analysis and reporting
Again, we want to mention how grateful we are for grants such as these, which provide huge momentum in developing our software.
At this point I'll turn things back over to Delaina, who will talk about some Golden Helix updates, and then we will go into the Q&A period.
Over the course of 2020, we worked with a clinical lab in Germany to analyze 46k samples and break down the population structure underlying the genomic variability among them. We are excited for this upcoming publication to be available for everyone to read.