Best Practices for Validating a Next-Gen Sequencing Workflow

August 16, 2023
Presented by Darby Kammeraad, Director of Field Application Services and
Rana Smalling, PhD, Field Application Scientist

NIH Grant Funding Acknowledgments
• Research reported in this publication was supported by the National Institute of General Medical Sciences of
the National Institutes of Health under:
o Award Number R43GM128485-01
o Award Number 2R44 GM125432-01
o Montana SBIR/STTR Matching Funds Program Grant Agreement Number 19-51-RCSBIR-005
• PI is Dr. Andreas Scherer, CEO of Golden Helix.
• The content is solely the responsibility of the authors and does not necessarily represent the official views of the
National Institutes of Health.

Who Are We?
Golden Helix is a global bioinformatics company founded in 1998
Filtering and Annotation
ACMG & AMP Guidelines
Clinical Reports
CNV Analysis
CNV Analysis
GWAS | Genomic Prediction
Large-N Population Studies
RNA-Seq
Large-N CNV-Analysis
Variant Warehouse
Centralized Annotations
Hosted Reports
Sharing and Integration
Pipeline: Run Workflows

Cited in 1,000s of Peer-Reviewed Publications

The Golden Helix Difference
FLEXIBLE DEPLOYMENT: On premise or in a private cloud
BUSINESS MODEL: Annual fee for software, training and support
CLIENT CENTRIC: Unlimited support from the very beginning
SINGLE SOLUTION: Comprehensive cancer and germline diagnostics
SCALABILITY: Gene panels to whole exomes or genomes
THROUGHPUT: Automated pipeline capabilities
QUALITY: Clinical reports correct the first time

Today’s Presenters
Rana Smalling, PhD, Field Application Scientist
Darby Kammeraad, Director of Field Application Services

NGS Clinical Workflow
Golden Helix provides comprehensive data analytics software that scales across gene panels, whole exomes, and whole genomes
Workflow stages:
• Primary: DNA Extraction in Wet Lab and Sequence Generation
• Secondary*: Read Processing and Quality Filtering; Alignment and Variant Calling
• Tertiary (Golden Helix’s software and primary focus; topic of validation): Filtering and Annotation; Interpretation and Result Reporting; Data Warehousing; Workflow Automation
*Golden Helix provides Secondary Analysis through a reseller agreement
Golden Helix works with all major sequencers: comprehensive secondary and tertiary analysis solutions for primary data aggregated by all commercially available sequencers
… and scales across multiple data set sizes for cancer and hereditary use cases:
Type            Size
Gene Panel      Small (100MB)
Whole Exome     Medium (1GB)
Whole Genome    Large (100GB)

Content Overview
• Preparation for NGS workflow validation
▪ Adequate controls
▪ Defining expected outcomes
• Design of the NGS workflow in VarSeq
▪ Types of workflows needed
▪ Sample-related search terms
▪ Automation
• Expert tips
▪ Unique methods in VarSeq to expedite validation
• Use cases
1) Somatic workflow
2) Germline workflow scenario
3) CNV validation example

Validation begins with well-characterized sample controls
Collection of case/control data
o Insightful: Kit with generic controls or a catalog (sample or database file with numerous pathogenic variants)
▪ Pros: Useful when testing the accuracy of a classifier or benchmarking algorithms
▪ Cons: Does not suitably test the efficacy of the overall assay/filter for real-world application
o Practical: Designed controls or real-world data with established results (more suitable for workflow design/validation)
o Determine the number of samples needed to establish statistical robustness
▪ Example for the GHI CNV caller
• Minimum 30 controls, read depth 100X (panels and exomes), consistent library prep method
▪ Potentially hundreds of samples with repeat runs for robustness
▪ Handle the spectrum of variant types (SNVs, indels, CNVs, fusions)
▪ Handle each workflow design/template (tumor-normal (TN), tumor-only (T-only), single germline, trio/duo)
▪ Sample collection (blood, saliva, solid tumor, FFPE)

Example control sources
Horizon Molecular reference standards:
https://horizondiscovery.com/en/reference-standards
o Mimic patient material from sample prep to downstream analysis
▪ Platform agnostic
▪ Oncology focused with >370 clinically relevant variants
▪ SNVs/Indels/CNVs/Fusions
▪ Various DNA source types

Example validation process
• Phase 1: Software installation and verification of user access
• Phase 2: Definition of all deliverables: clinical reports, exported data... (outputs)
• Phase 3: Initial workflow design tested with controls (inputs)
• Phase 4: Peer-review and verification of workflow design
• Phase 5: Analytical verification (expected outcomes)
• Phase 6: Finalization of all SOPs
• Phase 7: Training of key employees
• Phase 8: Pipeline approval and go-live
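The analytical verification in Phase 5 amounts to comparing what the workflow returns against what the controls are known to contain. The following is a minimal sketch of that comparison, assuming variants can be keyed by chromosome, position, reference allele, and alternate allele. The `Variant` type, the `concordance` function, and the coordinates are hypothetical and made up for illustration; they are not VarSeq output.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def concordance(expected: set, observed: set) -> dict:
    """Compare observed calls with the expected (truth) set for one control."""
    true_pos = len(expected & observed)   # expected and found
    false_neg = len(expected - observed)  # expected but missed
    false_pos = len(observed - expected)  # found but not expected
    sensitivity = true_pos / (true_pos + false_neg) if expected else float("nan")
    ppv = true_pos / (true_pos + false_pos) if observed else float("nan")
    return {
        "true_pos": true_pos,
        "false_neg": false_neg,
        "false_pos": false_pos,
        "sensitivity": sensitivity,
        "ppv": ppv,
    }


# Made-up control data: three expected variants, one missed, one unexpected call.
expected = {Variant("1", 1000, "A", "G"), Variant("2", 2000, "C", "T"), Variant("3", 3000, "G", "A")}
observed = {Variant("1", 1000, "A", "G"), Variant("2", 2000, "C", "T"), Variant("4", 4000, "T", "C")}
print(concordance(expected, observed))
```

Recording these counts per control and per run gives the expected-outcome evidence that the later SOP and approval phases can refer back to.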

Optimal NGS Workflow
Workflow design: simplifying the workflow upfront streamlines automation later
o Variant filtering: To finalize the filter chain, develop a clear understanding of the applicable cut-offs that are being modeled within the workflow.
▪ Variant quality (unique to each bioinformatic pipeline, but VarSeq is agnostic and can handle any VCF)
▪ Alt allele frequency in the population (typically 1-5% or less, but easily adjusted for disorders more prevalent in the population)
▪ Ontology (missense, LOF effect, or predicted to impact a canonical or novel splice site)
▪ Sample-specific information (phenotypes/panels or tumor type)
▪ Classifier (default cutoffs are adjustable, for example to accommodate founder populations)
o Implementation tip: Testing filter accuracy with flags
▪ Use variant flag sets to test the efficacy of filtering strategies (Where does my known pathogenic variant get lost? Adjust the filter cutoffs/thresholds accordingly.)
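The question "Where does my known pathogenic variant get lost?" can be pictured as walking a flagged variant through the filter chain and noting the first step that rejects it. The sketch below illustrates that idea in plain Python. It is not VarSeq's flagging feature, and the field names (`gq`, `pop_af`, `effect`) and thresholds are assumptions made for illustration.

```python
from typing import Callable, Optional

# Hypothetical filter chain: (step name, predicate that returns True to keep the variant).
FILTER_CHAIN: list = [
    ("variant_quality", lambda v: v["gq"] >= 20),
    ("population_frequency", lambda v: v["pop_af"] <= 0.01),
    ("ontology", lambda v: v["effect"] in {"missense", "lof", "splice"}),
]


def first_failed_filter(variant: dict, chain: list) -> Optional[str]:
    """Return the name of the first filter that removes the variant, or None if it survives."""
    for name, keep in chain:
        if not keep(variant):
            return name
    return None


# Made-up flagged controls: known pathogenic variants that should reach the end of the chain.
flagged_controls = {
    "control_A": {"gq": 60, "pop_af": 0.0001, "effect": "missense"},
    "control_B": {"gq": 55, "pop_af": 0.03, "effect": "lof"},  # more common in the population
}

for label, variant in flagged_controls.items():
    lost_at = first_failed_filter(variant, FILTER_CHAIN)
    print(label, "passes all filters" if lost_at is None else f"lost at: {lost_at}")
```

In this made-up example, `control_B` is dropped at the population frequency step, which suggests relaxing that cutoff for disorders that are more prevalent in the population.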

Reporting: Desired Output
It is crucial to define the scope of reportable findings, as it creates novel workflow designs
o Establish with the clinical stakeholders the scope of genomic data to report on
▪ For example, should the report include incidental/secondary findings?
• In a somatic test, report germline findings such as hereditary risk later in life or findings related to relatedness
• Perhaps an opt-in and opt-out policy
o Format choices: Exportable .json or a simple visual as a PDF or Word document

Inheritance Models: family-based analysis
Leveraging sample relationships
Types of potential germline workflows
o Carrier risk analysis
▪ Pro: Interesting findings can facilitate early genomic investigation for a future prenatal situation
▪ Con: A list of “risky alleles” requires careful reporting language
o Trios/Duos/Extended pedigrees
▪ Pro: Highly efficient filtering strategy
▪ Con: Requires sequencing data from other family members, which may not always be available and adds cost
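To illustrate why trio data makes filtering so efficient, the sketch below checks two common inheritance patterns from genotype strings alone: a de novo candidate and a homozygous recessive candidate. It is a generic, simplified illustration that assumes VCF-style diploid genotypes; the function names are hypothetical, and it is not the inheritance logic implemented in VarSeq.

```python
# VCF-style diploid genotype strings (unphased "/" and phased "|").
HOM_REF = {"0/0", "0|0"}
HET = {"0/1", "1/0", "0|1", "1|0"}
HOM_ALT = {"1/1", "1|1"}


def is_de_novo_candidate(proband: str, mother: str, father: str) -> bool:
    """Proband carries the alternate allele while both parents are homozygous reference."""
    return proband in HET | HOM_ALT and mother in HOM_REF and father in HOM_REF


def is_recessive_candidate(proband: str, mother: str, father: str) -> bool:
    """Proband is homozygous alternate and both parents are heterozygous carriers."""
    return proband in HOM_ALT and mother in HET and father in HET


# Made-up trio genotypes, ordered as (proband, mother, father).
trio_calls = {
    "variant_1": ("0/1", "0/0", "0/0"),
    "variant_2": ("1/1", "0/1", "0/1"),
    "variant_3": ("0/1", "0/1", "0/0"),
}

for name, (proband, mother, father) in trio_calls.items():
    print(
        name,
        "de novo" if is_de_novo_candidate(proband, mother, father) else "",
        "recessive" if is_recessive_candidate(proband, mother, father) else "",
    )
```

A variant inherited from one heterozygous parent, such as `variant_3`, matches neither pattern, which is how parental genotypes shrink the candidate list. Missing or low-quality parental genotypes are not handled here.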

Leveraging sample specific search terms: PhoRank
o Leveraging sample phenotypes
Single sample: Phenotype-based search, or use of a panel when disorders are consistent in the lab
• Pro: Phenotype search expands beyond the limits of a panel, so potentially interesting pathogenic variants are not missed
• Con: User may need to research a novel gene that falls outside the current version of the panel
▪ Approach supported with the VarSeq PhoRank algorithm
• Can be set up to be deployed alongside a panel to reinforce the variant search
• Can be automated with VSPipeline on a per-sample basis if each case disorder is unique

Lab scenarios: Somatic
Somatic Workflow Strategy
• Priority Lists
• Project design
Workflow Template design
• Parallel filters by priority
• Reporting vs. tracking decision tree
Workflow automation
• VSPipeline deployment strategy
• LIMS/API integration
o Somatic Workflow Strategy for TSO500 or other panels
▪ Priority 1 list: known Tier I oncogenic variants with treatments
▪ Priority 2 list: user explores Tier II variants to collect available knowledge
▪ Priority 3 list: VUS for future review
o Workflow Template design insights
▪ Parallel filters for
• Report: Rapid discovery and report of Priority 1 variants
• Report: User investigation of Priority 2 variants
• No Report on VUS: tracking of VUS for future reclassification via VSWarehouse
i. Bulk upload of entire variant cohorts
ii. Easily track changing classifications
iii. Track and filter out artifacts
o Workflow automation with VSPipeline
▪ Streamline deployment of validated template
▪ Integration of VarSeq with existing LIMS and external genomic software via APIs
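The parallel-filter design above can be pictured as a routing table that sends each classified variant to a priority list with either a "report" or a "track" action. The sketch below is a plain-Python illustration of that decision tree; the classification labels, variant identifiers, and function name are hypothetical, and it does not use any VarSeq, VSPipeline, or VSWarehouse API.

```python
from collections import defaultdict

# Hypothetical routing table: classification -> (priority list, action).
ROUTES = {
    "Tier I": ("Priority 1", "report"),
    "Tier II": ("Priority 2", "report"),
    "VUS": ("Priority 3", "track"),
}


def route_variants(variants: list) -> dict:
    """Group variant IDs by (priority list, action); unknown classifications are skipped."""
    buckets = defaultdict(list)
    for variant in variants:
        route = ROUTES.get(variant["classification"])
        if route is None:
            continue  # for example benign calls or artifacts, which are not routed
        buckets[route].append(variant["id"])
    return dict(buckets)


# Made-up variants for a single tumor sample.
sample_variants = [
    {"id": "var_001", "classification": "Tier I"},
    {"id": "var_002", "classification": "VUS"},
    {"id": "var_003", "classification": "Tier II"},
    {"id": "var_004", "classification": "Benign"},
]

for (priority, action), ids in sorted(route_variants(sample_variants).items()):
    print(priority, action, ids)
```

Keeping the routing table separate from the code that applies it mirrors the template idea: the lists can change without altering the validated workflow logic.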
Figure: Somatic workflow (Tier1, Tier2, Tier3)

Lab scenarios: Germline
Germline workflow
Routine workflow filters applied to almost all sample scenarios
o The following scenarios are currently deployed across our global customer base
▪ Scenario 1: University lab running genomes with designated panels
▪ Scenario 2: Commercial lab running genomes for any unique disorder (case-by-case basis); the report must include any interesting incidental findings for risk alleles
o Workflow design insights
▪ Parallel filters for
• Focused search for variants under the “Standard Diagnostic” filter
• Parallel filter to capture the scope of reportable “Incidental Findings”
o Second project: CNV calling with VarSeq
▪ Reviewing the quality of the CNV reference set
▪ Comparison of findings against the truth set
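Comparing CNV findings against a truth set usually needs a rule for when two events count as the same, because breakpoints rarely match exactly. The sketch below uses reciprocal overlap, a common generic criterion. It is not the comparison method of the VarSeq CNV caller, and the 50% threshold, the half-open coordinates, the `CNV` type, and the function names are assumptions for illustration.

```python
from typing import NamedTuple


class CNV(NamedTuple):
    chrom: str
    start: int  # assumed half-open interval [start, end)
    end: int
    kind: str   # "DEL" or "DUP"


def reciprocal_overlap(a: CNV, b: CNV) -> float:
    """Smaller of the two overlap fractions; 0.0 if chromosome or event type differ."""
    if a.chrom != b.chrom or a.kind != b.kind:
        return 0.0
    overlap = min(a.end, b.end) - max(a.start, b.start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a.end - a.start), overlap / (b.end - b.start))


def compare_to_truth(calls: list, truth: list, min_overlap: float = 0.5):
    """Split truth-set events into those recovered by at least one call and those missed."""
    found = [t for t in truth if any(reciprocal_overlap(c, t) >= min_overlap for c in calls)]
    missed = [t for t in truth if t not in found]
    return found, missed


# Made-up truth set and calls.
truth = [CNV("1", 100, 200, "DEL"), CNV("2", 5000, 9000, "DUP")]
calls = [CNV("1", 150, 250, "DEL"), CNV("2", 8900, 12000, "DUP")]

found, missed = compare_to_truth(calls, truth)
print("found:", found)
print("missed:", missed)
```

Calls that match no truth-set event can be counted the same way in the other direction to estimate false positives.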
Figure labels: Standard Diagnostic, Incidental Findings (Scenario 1, Scenario 2)

25 Licenses for 25 Months
Celebrating 25 Years in Business
• Limited quantity
• Licenses are 25-month license periods
• Available to new customers only
• Orders must be received by Sept 15, 2023
• Visit goldenhelix.com/forms/25-for-25 or
scan the QR code below

Conferences
European Human Genetics Conference, Booth #566
• June 10 – 13, 2023
• Glasgow, UK
• Monday, June 12, 12:00 - Corporate Satellite Talk (ALSH 1, Level 0): Achieving Economic Success as an NGS Lab: Strategy and Implementation
AMP Europe, Milan, Italy, Booth #14
• June 18 – 20, 2023
• Milan, Italy
• Monday, June 19, 1:00 - Industry Symposium: Achieving Economic Success as an NGS Lab: Strategy and Implementation
