Golden Helix introduced new capabilities in VarSeq 2.5.0 to support carrier screening analysis through NGS data. Key features include the ability to import partnered samples, detect shared carrier variants between samples at the gene level using a couples carrier screening workflow, and generate clinical reports that calculate reproductive risks and include variant interpretations. The software is designed to scale from gene panels to whole exomes/genomes. A demo showed how these new features streamline the carrier screening analysis process from data import and filtering to clinical reporting.
3. VarSeq 2.5.0: Empowering Family Planning
Through Carrier Screening Analysis
October 25, 2023
Presented by: Gabe Rudy, VP of Product & Engineering
4. NIH Grant Funding Acknowledgments
• Research reported in this publication was supported by the National Institute of General Medical Sciences of the
National Institutes of Health under:
o Award Number R43GM128485-01
o Award Number R43GM128485-02
o Award Number 2R44 GM125432-01
o Award Number 2R44 GM125432-02
o Montana SBIR/STTR Matching Funds Program Grant Agreement Number 19-51-RCSBIR-005
• PI is Dr. Andreas Scherer, CEO of Golden Helix.
• The content is solely the responsibility of the authors and does not necessarily represent the official views of the National
Institutes of Health.
5. Who Are We?
Golden Helix is a global bioinformatics company founded in 1998
Filtering and Annotation
ACMG & AMP Guidelines
Clinical Reports
CNV Analysis
CNV Analysis
GWAS | Genomic Prediction
Large-N Population Studies
RNA-Seq
Large-N CNV-Analysis
Variant Warehouse
Centralized Annotations
Hosted Reports
Sharing and Integration
Pipeline: Run Workflows
8. The Golden Helix Difference
FLEXIBLE DEPLOYMENT
On premise or in a private
cloud
BUSINESS MODEL
Annual fee for software,
training and support
CLIENT CENTRIC
Unlimited support from the
very beginning
SINGLE SOLUTION
Comprehensive cancer and
germline diagnostics
SCALABILITY
Gene panels to whole
exomes or genomes
THROUGHPUT
Automated pipeline
capabilities
QUALITY
Clinical reports correct the
first time
11. Introduction to Carrier Screening
• Carrier Screening started 50 years ago for screening conditions prevalent in
defined racial/ethnic groups.
• Tay-Sachs Disease, Sickle Cell Disease
• Introduction of pan-ethnic carrier screening
• Cystic fibrosis and spinal muscular atrophy
• Purpose: Carrier screening helps individuals and reproductive partners
understand their reproductive risk of passing along autosomal recessive and X-
linked conditions to their children and allows parents to make more informed
reproductive decisions.
12. Objectives of Carrier Screening
IDENTIFY CARRIERS
Individuals carrying one
mutated copy of a gene
associated with a disorder
and are asymptomatic
PUBLIC HEALTH
Identifying populations
wherein certain disorders
may be more prevalent
GENETIC COUNSELING
Targeted healthcare
interventions and family
planning
RISK ASSESSMENT
Calculations showing the
likelihood that a child will be
affected with a disorder
PREVENTION
Options for managing
genetic disorders
13. Calculating Reproductive Risk
• Reproductive Risk is the likelihood that a couple may have a child with
a genetic disorder or condition.
• Carrier Frequency: The proportion of individuals in a specific
population who carry a single copy of a mutated or altered gene
responsible for that disorder without showing any symptoms of the
disease
• Detection Rate: Calculated per disorder from literature
publications or lab-generated data. It represents the proportion of
disease-causing mutations that the test can detect for a given
disorder
• Residual Risk: Non-zero risk associated with a negative test
• Unknown Risk: Risk associated with an untested partner
Shuji Ogino, Robert B. Wilson: Bayesian Analysis and Risk Assessment in Genetic Counseling and Testing, The Journal of Molecular Diagnostics, Vol 6, No. 1, February 2004,
1. Calculate Residual Risk per individual
2. Multiply the residual risks for each
partner together
3. Multiply by 1/4, which is the
probability that two carriers would both
pass a disease allele
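The three steps above can be sketched in Python. This is a minimal illustration, not VarSeq's implementation: the function names `residual_risk` and `reproductive_risk` are hypothetical, and the residual risk is a simple Bayesian update that assumes a negative result is certain for non-carriers (no false positives).

```python
from fractions import Fraction

def residual_risk(carrier_frequency, detection_rate):
    """Step 1: carrier probability that remains after a negative test.

    Assumption: non-carriers always test negative, so only missed
    carriers (1 - detection_rate) contribute to the residual risk.
    """
    prior = Fraction(carrier_frequency)
    miss = 1 - Fraction(detection_rate)  # P(negative | carrier)
    return prior * miss / (prior * miss + (1 - prior))

def reproductive_risk(risk_a, risk_b):
    """Steps 2 and 3: multiply both partners' risks, then by 1/4."""
    return Fraction(risk_a) * Fraction(risk_b) * Fraction(1, 4)

# Untested couple: use the raw carrier frequencies.
before = reproductive_risk(Fraction(1, 25), Fraction(1, 40))

# Same couple after both test negative at a 99% detection rate.
a = residual_risk(Fraction(1, 25), Fraction(99, 100))
b = residual_risk(Fraction(1, 40), Fraction(99, 100))
after = reproductive_risk(a, b)
```

Exact fractions are used so the results can be read directly as "1 in N" figures. Passing floats such as `0.99` would introduce binary rounding, so `Fraction(99, 100)` is the safer input.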
14. Cystic Fibrosis Example
Detection Rate (DR): 99%
White/Caucasian Carrier Frequency (CF): 1/25
Southeast Asian CF: 1/40
Calculate Residual Risk for Negative Test Results
[Pedigree diagram: parents Aa and Aa; possible offspring Aa, Aa, AA, aa]
Before Testing (1/25 and 1/40): 1/25 x 1/40 x ¼ = 1/4,000
After Testing Negative (1/2401 and 1/3900): 1/2401 x 1/3900 x ¼ = 1/37,455,600
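The residual risks of 1/2401 and 1/3900 can be derived with a Bayesian update. The derivation below is a sketch that assumes a non-carrier always tests negative; the symbols CF and DR follow the slide, and the second value comes out as 1/3901, which the slide appears to round to 1/3900.

```latex
\begin{align*}
P(\text{carrier} \mid \text{negative})
  &= \frac{CF\,(1 - DR)}{CF\,(1 - DR) + (1 - CF)} \\[4pt]
CF = \tfrac{1}{25},\ DR = 0.99:\quad
  &\frac{\tfrac{1}{25}\cdot 0.01}{\tfrac{1}{25}\cdot 0.01 + \tfrac{24}{25}}
   = \frac{0.01}{24.01} = \frac{1}{2401} \\[4pt]
CF = \tfrac{1}{40},\ DR = 0.99:\quad
  &\frac{\tfrac{1}{40}\cdot 0.01}{\tfrac{1}{40}\cdot 0.01 + \tfrac{39}{40}}
   = \frac{0.01}{39.01} = \frac{1}{3901} \approx \frac{1}{3900}
\end{align*}
```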
15. α-Thalassemia
Detection Rate (DR): 75%
White/Caucasian Carrier Frequency (CF): 1/10,000
Southeast Asian CF: 1/20
Calculate Residual Risk for Female
[Pedigree diagram: parents Aa and AA; possible offspring AA, Aa, AA, Aa]
Before Testing (1/10,000 and 1/20): 1/10,000 x 1/20 x ¼ = 1/800,000
After Testing (1/40,000 and 1/2): 1/40,000 x 1/2 x ¼ = 1/320,000
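To see how much the lower detection rate matters, the sketch below recomputes the residual risk for the 1/10,000 carrier frequency at the slide's 75% detection rate and, purely for comparison, at a hypothetical 99% rate. The helper `residual_risk` is an illustrative name, and the Bayesian update assumes no false positives.

```python
from fractions import Fraction

def residual_risk(carrier_frequency, detection_rate):
    """Carrier probability remaining after a negative test (no false positives assumed)."""
    prior = Fraction(carrier_frequency)
    miss = 1 - Fraction(detection_rate)  # P(negative | carrier)
    return prior * miss / (prior * miss + (1 - prior))

white_cf = Fraction(1, 10_000)
at_75 = residual_risk(white_cf, Fraction(75, 100))  # slide's detection rate
at_99 = residual_risk(white_cf, Fraction(99, 100))  # hypothetical comparison

print(f"DR 75%: 1/{round(1 / at_75):,}")  # DR 75%: 1/39,997
print(f"DR 99%: 1/{round(1 / at_99):,}")  # DR 99%: 1/999,901
```

The 75% result, 1/39,997, is consistent with the slide's rounded 1/40,000.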
16. Hydrocephalus due to congenital stenosis of aqueduct of Sylvius
Detection Rate (DR): 95%
White/Caucasian Carrier Frequency (CF): 1/30,000
[Pedigree diagram: father XY and carrier mother Xx; possible offspring Xx, XX, xY, XY]
Before Testing (carrier risk 1/30,000): 1/30,000 x ½ = 1/60,000
After Testing (carrier risk ½): ½ x ½ = ¼
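The two X-linked figures can be reproduced with a short sketch. The function names are hypothetical, and the reading of the slide is an assumption: the "before testing" value is treated as the risk that a son is affected, and the "after testing" value as the per-pregnancy risk once the mother is a confirmed carrier.

```python
from fractions import Fraction

TRANSMIT = Fraction(1, 2)  # chance a carrier mother passes on the affected X
MALE = Fraction(1, 2)      # chance a given pregnancy is male

def risk_for_son(mother_carrier_probability):
    """Chance that a son is affected, given the mother's carrier probability."""
    return Fraction(mother_carrier_probability) * TRANSMIT

def risk_per_pregnancy(mother_carrier_probability):
    """Chance that any one pregnancy results in an affected son."""
    return risk_for_son(mother_carrier_probability) * MALE

before = risk_for_son(Fraction(1, 30_000))  # untested mother, population frequency
after = risk_per_pregnancy(Fraction(1))     # mother confirmed as a carrier
print(before, after)  # 1/60000 1/4
```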
17. NGS-based Carrier Screening Tests and Panels
• ACMG Carrier Screening panel
• Custom Commercial Panels or Whole Exome Sequencing
Gregg, A.R., Aarabi, M., Klugman, S. et al. Screening for autosomal recessive and X-linked conditions during
pregnancy and preconception: a practice resource of the American College of Medical Genetics and Genomics
(ACMG). Genet Med 23, 1793–1806 (2021).
18. NGS Clinical Workflow
Golden Helix provides comprehensive data analytics software that scales across gene panels, whole exomes, and whole genomes
Primary
• DNA Extraction in Wet Lab and Sequence Generation
Secondary
• Read Processing and Quality Filtering
• Alignment and Variant Calling
• *Golden Helix provides Secondary Analysis through a reseller agreement
Tertiary (Golden Helix’s software and primary focus)
• Filtering and Annotation
• Interpretation and Result Reporting
• Data Warehousing
• Workflow Automation
Comprehensive secondary and tertiary analysis solutions for primary data aggregated by all commercially available sequencers
Golden Helix works with all major sequencers and scales across multiple data set sizes for cancer and hereditary use cases
Type: Size
• Gene Panel: Small (100MB)
• Whole Exome: Medium (1GB)
• Whole Genome: Large (100GB)
[Diagram labels: Process Analysis; Cancer use case; Hereditary use case; Topic of Validation]
20. Carrier Screening Analysis with VarSeq
• New VarSeq capabilities supporting carrier screening workflow!
• Partner import: Match samples as “Primary” and “Partner” at the
time of import
• Couples Carrier Screening Project Template: Designed to
filter to pathogenic variants wherein each sample has a variant in
genes associated with recessive disorders
• Shared Carrier Gene Detection Algorithm: Detects variants
shared between samples at a gene level.
• ACMG Carrier Screening Panel: Based on genes
recommended by the ACMG Professional Practice and
Guidelines Committee for patient carrier screening.
[Workflow diagram: Import Partnered Samples; Set Project Workflow; Detect Carrier Variants; Evaluate with VSClinical]
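The idea behind gene-level shared carrier detection can be sketched as follows. This is an illustration only, not the VarSeq algorithm: the function `shared_carrier_genes`, the `include_same_variant` flag, and the `(gene, variant_id)` input format are all hypothetical, and the variant IDs are placeholders.

```python
from collections import defaultdict

def shared_carrier_genes(primary_variants, partner_variants, include_same_variant=True):
    """Return {gene: (primary_ids, partner_ids)} for genes hit in both samples.

    Each input is an iterable of (gene, variant_id) pairs that already
    passed the per-sample filter. When include_same_variant is False,
    variants found in both samples are dropped before the gene-level check.
    """
    def by_gene(variants):
        grouped = defaultdict(set)
        for gene, variant_id in variants:
            grouped[gene].add(variant_id)
        return grouped

    primary, partner = by_gene(primary_variants), by_gene(partner_variants)
    shared = {}
    for gene in sorted(primary.keys() & partner.keys()):
        p, q = primary[gene], partner[gene]
        if not include_same_variant:
            common = p & q
            p, q = p - common, q - common
        if p and q:  # both partners still carry a variant in this gene
            shared[gene] = (sorted(p), sorted(q))
    return shared

primary = [("CFTR", "v1"), ("HBA1", "v2")]
partner = [("CFTR", "v3"), ("GENE_X", "v4")]
result = shared_carrier_genes(primary, partner)
```

Here only CFTR is reported, because it is the only gene with a variant in both samples. Matching at the gene level rather than the variant level is what lets two different variants in the same gene count as a shared finding.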
21. Carrier Screening Analysis with VSClinical
• New VSClinical capabilities supporting partnered analysis
• Multi-sample Evaluations: Create evaluations
and add variants for partners, trios, or extended
families
• Gene-Level View: See all variants added from
all samples in the evaluation at a gene level
• Phenotypes: Add family history, phenotypes
and disorders for multiple samples individually
• Cataloging variants: Save a variant
interpretation across multiple samples into
assessment catalogs
22. Carrier Screening Reporting with VSClinical
• Carrier Screening Report Template
• Multi-sample Reporting: Per-patient and sample info, per-
patient results section with variant details
• Gene-Disease Reproductive Risk Calculations: The
summary table shows the detected variants from each sample
per gene-disease and a risk calculation based on those
findings. The table also shows confirmed negative findings.
• Descriptions of variants detected: Variant interpretation,
which can be saved and re-used in assessment catalogs
• Disease Descriptions from OMIM: Built-in look-up to OMIM
for disease descriptions for gene-disease associations in the
defined panel.
26. Conferences
ASHG 2023, Booth #506
• November 2 – 4, 2023
• Washington, D.C.
• Friday, November 3, 12:20 – CoLab Session (CoLab
Theater 2) From Panels to Whole Genomes with VarSeq: The
Complete tertiary platform for short and long-read NGS data
AMP 2023, Booth #1500
• November 16 – 18, 2023
• Salt Lake City, Utah
• Friday, November 17, 12:40 – Innovations Spotlight (Stage
2) TSO500 in VarSeq
Thanks, Casey! We can’t wait to dive into this subject.
Before we start diving into the subject, I want to mention our appreciation for our grant funding from the NIH.
The research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under the listed awards.
We are also grateful to have received local grant funding from the state of Montana. Our PI is Dr. Andreas Scherer who is also the CEO at Golden Helix and the content described today is the responsibility of the authors and does not officially represent the views of the NIH.
So with that covered, let's take just a few minutes to talk a little bit about our company, Golden Helix.
Golden Helix is a global bioinformatics software and analytics company that enables research and clinical practices to analyze large genomic datasets. We were originally founded in 1998 based off pharmacogenomics work performed at GlaxoSmithKline, who is still a primary investor in our company.
VarSeq, our flagship product, serves as a clinical tertiary analysis tool. At its core, it is a variant annotation and filtration engine. In addition, users have access to automated AMP or ACMG variant guidelines. VarSeq also has the capability to detect copy number variations, scaling from single-exon events to large aneuploidy events. Lastly, the finalization of variant interpretation and classification is further optimized with the VarSeq clinical reporting capability. Users can integrate all of these features into a standardized workflow.
Paired with VarSeq are VSWarehouse and VSPipeline. VSWarehouse serves as a repository for the large amount of useful genomic data wrangled by our customers. Warehouse not only solves the issue of data storage for ever-increasing genomic content, but is also fully queryable and auditable and supports configurable user access for project managers and collaborators. In tandem with this, VSPipeline, which will be a large part of today's discussion, allows for the automated execution of routine workflows, further optimizing users' ability to handle large amounts of data and throughput.
Lastly, our research platform, SVS, enables researchers to perform complex analysis and visualizations on genomic and phenotypic data. SVS has a range of tools to perform GWAS, genomic prediction, and RNA-Seq analysis, among other common research applications.
Our software has been very well received by the industry. We have been cited in thousands of peer-reviewed publications, and that’s a testament to our customer base.
We work with over 400 organizations all over the globe. This includes top-tier institutions like Stanford and Yale, government organizations like the NCI and NIH, clinics such as Sick Kids, and many other genetic testing labs. We now have well over 20,000 installs of our products, with thousands of unique users.
So how is this relevant to you?
At Golden Helix, we focus on the seven pillars of customer success. Golden Helix offers a single software solution that encompasses germline, somatic, and CNV analysis. Our software is also highly scalable, supporting gene panel to whole genome sequencing workflows. With our complete automation capabilities, we now offer a FASTQ or VCF to report pipeline. Our software can be deployed locally or installed in the cloud, and our business model of an annual subscription per user means you are able to increase your workload without increasing analysis fees. And it goes without saying that our FAS team is here to support you on your analysis journey.
Today, Dr. Rana Smalling, a member of our Field Application Science team, and myself, Solomon Reinman, our technical field application scientist, have the pleasure of presenting. Not only are we delighted to be presenting the user perspective in VarSeq 2.4.0, but we look forward to showing off these capabilities to our current and future customers.
Carrier testing is the most common genetic test performed in health care today. Its clinical utility lies in supporting reproductive decision making.
Tay-Sachs disease in the Ashkenazi Jewish population and Sickle Cell disease in African Americans.
Because NGS offers an affordable, high-throughput solution, carrier screening has become a common practice in healthcare systems.
With the affordability of carrier testing, all populations benefit from testing.
The clinical utility of carrier screening is to assist in reproductive decision making
Carriers are asymptomatic – heterozygous for an autosomal recessive gene, or for an X-linked gene in a female
Prevention – taking proactive actions to reduce or remove the chance of passing on the gene variant you carry
If both individuals are carriers of pathogenic mutations in a disease-associated gene, then the risk of having an affected child is 1 in 4.
But there are other risks that are worth calculating and reporting in some cases: Unknown risk, residual risk. These require knowing the carrier frequency and detection rate.
The gene detection rate can vary depending on several factors:
The type of disorder: Unlike cystic fibrosis (a monogenic disorder), some clinical disorders can be caused by several genes. Others have variability in expression or partial penetrance
Population-specific considerations: Can be lower in specific populations, or have variable expression in different populations
Genetic complexity: Some genetic conditions may be caused by multiple genes or involve complex genetic interactions.
Technological sensitivity: Hard to sequence genes, known pathogenic variants in intronic regions, repeat expansions, CNVs or fusions
[Discussion] Typically labs develop their own metrics for CF and DR and these numbers often differ if different ethnicities or populations are considered.
Carrier screening results are probabilistic, and the actual outcome can vary.
X-linked disorders also tend to have more complexity, e.g. Fragile X premutation (55-200 CGG repeats) or full mutation.
Typical testing facilities have ethnicity specific genes and disorders (i.e. for Ashkenazi Jewish or African American ancestry).
It's important to note that while a high gene detection rate is desirable, no genetic test results in zero residual risk. That is why it is important to try to report a residual risk, if possible, or at least emphasize that a negative test does not mean there is zero risk.
There may be limitations in detecting rare or novel mutations, and false negatives or false positives can occur. CFTR detection maxes out at 99%.
Talk through three examples, we will see these in the demo
Unknown risk: 1 / 4,000
Ben-Shachar, R. et al. A data-driven evaluation of the size and content of expanded carrier screening panels. Genet Med 21, 1931–1939 (2019). https://doi.org/10.1038/s41436-019-0466-5
In the demo, we will have example pathogenic carrier alleles in these individuals, resulting in a ¼ chance that the offspring would have a compound het state and the disorder.
A lower detection rate is associated with alpha-thalassemia. According to the ACMG carrier screening paper, alpha-thalassemia is more prevalent in Southeast Asian (SEA) populations, which have a high carrier frequency.
Parent is a carrier, 50% chance that a child will be carrier.
HBA1
50% chance daughter will be a carrier
50% chance son will be affected
With each pregnancy, 25% a child will be affected
The male is not in this calculation because we are starting from the assumption that the parents are unaffected, so the male should be negative by definition (a male can’t be a carrier for an X-linked disorder)
113 genes in Tier 1 – Tier 3
Expanded panels exist to cover the long tail of recessive disorders that may not be population neutral
Let's start with a bird's-eye view of an NGS clinical workflow and explore how VarSeq fits in. When validating a workflow, it is important to plan with the beginning and end in mind: start from sample collection and primary analysis to get your samples sequenced, run them through the secondary stage handling alignment and variant calling, and lastly move through the tertiary stage paired with data warehousing. VarSeq mainly encompasses the tertiary analysis steps of filtering, annotation, interpretation and result reporting. However, its modular and flexible design makes it compatible with a variety of inputs coming from many secondary pipelines. Golden Helix software functions with all major sequencers, and our partnership with Sentieon allows users to establish industry-leading secondary analysis. Moreover, VarSeq tackles the issue of scalability quite well, allowing users to automate workflows for increasing sizes of datasets from small gene panels to the increasingly affordable genome. For this webcast, we will be focusing on key points of validating the tertiary analysis stage in VarSeq.
VarSeq facilitates handling of all your variant types for both somatic and germline analysis. The utility of the software can be broken into stages. The first is the import of your SNVs/indels, CNVs and fusions or breakends, which are then passed through a user-defined variant filter coupled with many annotations and algorithms to isolate the clinically relevant variant. These filters and project structure are saved as templates to facilitate automation with our VSPipeline command line tool. Once the clinically relevant variant is isolated, it is moved into stage 2, VSClinical, which serves as the interpretation hub to collect all relevant evidence for germline or somatic variants via the ACMG and AMP guidelines. Once the variant is evaluated, it is saved locally in a user database and carried into the final report stage. You’ll learn today that the reporting feature comes with quite expansive options for the user to customize, but overall, think of VarSeq as the one software suite solution that handles everything from the full import of all variants to isolating the reported findings of clinically relevant variants. So now that you have a high-level understanding of the tool's purpose, let's move into discussing today’s topic.
VSClinical’s oncogenicity scoring system was developed in consultation with the GA4GH Variant Interpretation for Cancer Consortium (VICC) as a criteria-based system that parallels the ACMG guidelines to rank variants according to their pathogenicity in the context of cancers, and it is useful for determining whether a variant is likely to be a cancer driver mutation. The oncogenicity scoring algorithm is also now available in VarSeq to use as a filter for prioritizing the most impactful cancer mutations in any somatic workflow. Our oncogenicity score is additive, such that scores exceeding ‘3’ indicate an ‘oncogenic’ or ‘likely oncogenic’ effect.
0. Define and import reproductive partners – can have many partners in a project
Filter logic: for each sample, first filter to Conflicting VUS, LP and P variants of good quality that are heterozygous in either sample (optionally include homozygous variants too, e.g. for penetrance considerations)
Shared Carrier Gene Detection algorithm: with consanguinity in mind, an algorithm setting includes or excludes the same variant detected in each sample. See all variants between samples at a gene level; this transfers to the new VSClinical view in the gene tab. Explain the workflow, which avoids having to flip between samples. “All carrier variants” works into the existing filter logic, so it runs on the filtered set of variants, not all variants within the project.
Can define a set of genes/disorders that always get reported whether or not variants are detected: the ability to capture the “tested negative” use case.
Can customize and expand the detection rate and carrier frequency as defined by your lab and target population (add per-ethnicity risks) in a JavaScript file
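The "tested negative" use case and lab-defined, per-population values could fit together roughly as sketched below. Everything here is hypothetical: the `LAB_TABLE` layout, the function names, and the choice of Python are for illustration only and do not reflect the format of the JavaScript file VarSeq uses. The cystic fibrosis values are the ones from the earlier slide.

```python
from fractions import Fraction

# Hypothetical lab-defined table:
# (gene, population) -> (carrier frequency, detection rate)
LAB_TABLE = {
    ("CFTR", "White/Caucasian"): (Fraction(1, 25), Fraction(99, 100)),
    ("CFTR", "Southeast Asian"): (Fraction(1, 40), Fraction(99, 100)),
}

def carrier_probability(gene, population, tested_positive):
    """Return 1 for a detected carrier, else the residual risk, else None."""
    if tested_positive:
        return Fraction(1)
    entry = LAB_TABLE.get((gene, population))
    if entry is None:
        return None  # carrier frequency not established: report no risk
    cf, dr = entry
    miss = 1 - dr  # P(negative | carrier), assuming no false positives
    return cf * miss / (cf * miss + (1 - cf))

def report_row(gene, primary, partner):
    """Build one summary row; primary/partner are (population, tested_positive)."""
    probs = [carrier_probability(gene, pop, pos) for pop, pos in (primary, partner)]
    if any(p is None for p in probs):
        risk = None
    else:
        risk = probs[0] * probs[1] * Fraction(1, 4)
    label = lambda pos: "carrier" if pos else "tested negative"
    return {"gene": gene, "primary": label(primary[1]),
            "partner": label(partner[1]), "risk": risk}

row = report_row("CFTR", ("White/Caucasian", False), ("Southeast Asian", False))
```

A gene on the always-report list still yields a row when neither partner has a variant, and the risk is left empty when the lab has no established carrier frequency for that gene and population.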
In summary, today we touched on the highlights of what’s new in VS 2.4.0 and gave you a glimpse of these updates from the user perspective. Overall we wanted to assure our users that as germline structural variant analysis and long read tech are becoming more mainstream, we’ve made sure that VarSeq can handle these data types, so we look forward to working with you and helping you analyze your data! So, thank you for tuning in, and now I will hand it back over to Casey to wrap up.
Question – Do you have secondary support for structural variant calling with short read sequencing? Solomon will answer this one (pair star fusion with Sentieon)
Question – Can you import phenotype data in the form of phenopackets from PacBio?
Question – Has GHI seen many labs utilizing long read data? Yes, we have seen customers using long read data at a much higher frequency than expected, both for germline and somatic.
Question – Does your business model compensate for rerun of samples when setting up validation of workflows? You can run as many samples as you need to validate your pipeline.
Before wrapping up, we'd like to again state our appreciation for the grants included here. And with that, I'll hand things back to Casey to talk about some exciting marketing updates and take us through a Q&A session.
Again, I want to mention how grateful we are for grants such as this, which support the advancement and development of our software to create the high-quality software you'll see today.
So with that covered, let's take a few minutes to talk a little bit about our company, Golden Helix.
Secondary findings not validated for general population screening (very little overlap ~ 4 genes) (DMD for cardiomyopathy, primary ovarian failure in FMR1)
What if carrier frequency not established: don’t report reproductive risk
VUS variants? If one partner has a Pathogenic variant, may consider a weak VUS in the same gene in the other partner with more scrutiny
Genes with multiple disease associations? – report them and describe, also possible some variants result in autosomal dominant disease
Does not replace newborn screening – does not include de novo
Higher risk in similar or shared genetic lineage in reproductive partners