Evaluating Cloud vs On-Premises
for NGS Clinical Workflows
NGS lab infrastructure choices to meet requirements for cybersecurity,
patient data privacy, and scalable unit economics
Presented by Gabe Rudy, VP of Product & Engineering
NIH Grant Funding Acknowledgments
• Research reported in this publication was supported by the National Institute of General Medical Sciences of
the National Institutes of Health under:
o Award Number R43GM128485-01
o Award Number R43GM128485-02
o Award Number 2R44 GM125432-01
o Award Number 2R44 GM125432-02
o Montana SBIR/STTR Matching Funds Program Grant Agreement Number 19-51-RCSBIR-005
• PI is Dr. Andreas Scherer, CEO of Golden Helix.
• The content is solely the responsibility of the authors and does not necessarily represent the official views of the
National Institutes of Health.
Who Are We?
Golden Helix is a global bioinformatics company founded in 1998
Filtering and Annotation
ACMG & AMP Guidelines
Clinical Reports
CNV Analysis
Pipeline: Run Workflows
CNV Analysis
GWAS | Genomic Prediction
Large-N Population Studies
RNA-Seq
Large-N CNV-Analysis
Variant Warehouse
Centralized Annotations
Hosted Reports
Sharing and Integration
Cited in 1,000s of Peer-Reviewed Publications
Over 400 Customers Globally
When you choose Golden Helix, you receive
more than just the software
Software is Vetted
• 20,000+ users at 400+ organizations
• Quality & feedback
Simple, Subscription-Based Business Model
• Yearly fee
• Unlimited training & support
Deeply Ingrained in the Scientific Community
• Give back to the community
• Contribute content and support
Innovative Software Solutions
• Cited in 1,000s of publications
• Recipient of numerous grants from NIH and other
funding bodies
Genetic Testing Process
Sample Prep -> Sequencing -> Align & Call -> Annotate & Filter -> Variant Interpretation -> Report
Sentieon & VS-CNV | VarSeq | VSReports | VSClinical
Golden Helix Clinical Suite
VSWarehouse
Aggregate Variants, Reports, Knowledgebase
VSClinical (ACMG)
• Decision support for 33 criteria to classify a variant
from Benign to Pathogenic for Mendelian disorders
• Recommendation engine auto-scores 18 criteria
covering population catalogs, functional
predictions, and clinical annotations
• Interpretations saved into assessment catalogs and
re-used next time a variant is seen
VSClinical (AMP)
• Guided workflows for interpretation of somatic
variants in cancer
• Includes coverage QC, somatic variant scoring,
biomarker evaluation of AMP evidence tier levels,
drug and trial evaluation, and clinical report
building
• Golden Helix CancerKB included with pre-written,
report-ready evaluations of the most common
biomarkers in the most common cancers
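The re-use of saved interpretations described for VSClinical (ACMG) can be pictured as a lookup keyed by variant. The sketch below is only an illustration of that idea, not the VSClinical catalog format: the class names, fields, and the example variant are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VariantKey:
    """Identifies a variant; frozen so it can be used as a dictionary key."""
    chrom: str
    pos: int
    ref: str
    alt: str


@dataclass
class Assessment:
    """A saved interpretation (hypothetical fields)."""
    classification: str  # e.g. "Benign" through "Pathogenic"
    notes: str = ""


class AssessmentCatalog:
    """In-memory stand-in for a catalog of prior interpretations."""

    def __init__(self) -> None:
        self._entries: dict[VariantKey, Assessment] = {}

    def save(self, key: VariantKey, assessment: Assessment) -> None:
        # Store or overwrite the interpretation for this variant.
        self._entries[key] = assessment

    def lookup(self, key: VariantKey) -> Optional[Assessment]:
        # Return the earlier interpretation, or None if the variant is new.
        return self._entries.get(key)


catalog = AssessmentCatalog()
key = VariantKey("chr1", 12345, "A", "G")  # made-up coordinates
catalog.save(key, Assessment("Likely Benign", "hypothetical note"))

# The next time the same variant is seen, the saved interpretation is found.
previous = catalog.lookup(VariantKey("chr1", 12345, "A", "G"))
print(previous.classification if previous else "not seen before")
```

A production catalog would also need versioning and an audit trail, which this sketch leaves out.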
• Widely adopted over the last five years
serving clinical labs in calling CNVs from
NGS data
• Validated as a replacement for MLPA as
more cost-effective, with no reduced
sensitivity, and with wider coverage of
evaluated genes
• Calls include metrics for QC, augmented by
an annotation library that can be used for
further filtering and prioritization
Building an NGS Analysis Workflow
• Laboratory Developed Tests must adapt and
customize workflows to fit the test uniquely
• Each part of the workflow requires tuning against
validation data:
o Levels of detection thresholds at each stage
o Filters that reduce hands-on interpretation but retain
sensitivity
o End-to-end validation of samples of positive and
negative controls
• A validated pipeline is ready for scaling to high sample
volumes
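Tuning a filter against validation data can be illustrated with a small threshold sweep: pick the strictest cutoff that still passes every known-positive control. This is a minimal sketch under stated assumptions. The scores, candidate thresholds, and function names are hypothetical, and a real validation would also measure specificity on negative controls.

```python
from typing import Iterable, Optional, Sequence


def sensitivity(scores: Iterable[float], threshold: float) -> float:
    """Fraction of known-positive control variants that pass the filter."""
    scores = list(scores)
    if not scores:
        raise ValueError("need at least one positive control")
    return sum(s >= threshold for s in scores) / len(scores)


def strictest_threshold(
    positive_scores: Sequence[float],
    candidates: Sequence[float],
    min_sensitivity: float = 1.0,
) -> Optional[float]:
    """Highest candidate threshold that keeps sensitivity at or above the target."""
    passing = [
        t for t in candidates
        if sensitivity(positive_scores, t) >= min_sensitivity
    ]
    return max(passing) if passing else None


# Hypothetical quality scores for variants known to be present in positive controls.
positive_control_scores = [42.0, 55.5, 31.0, 60.2, 38.7]
candidate_thresholds = [10, 20, 30, 35, 40]

best = strictest_threshold(positive_control_scores, candidate_thresholds)
print(best)  # 30: the next candidate (35) would drop the control scored 31.0
```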
[Workflow diagram: Seq, Reads, Alignment; Variant, CNVs, SVs; Annotate; Filter; Interpret; Report; Warehouse; Cohort Analysis; Re-Analysis]
Scaling an NGS Analysis Workflow
• Automated workflow:
o Align & Call
o Annotate & Filter
• Interactive workflow:
o QC samples and variants
o Classify and interpret variants
o Review and report
• Constraints
o Compute resources
o Personnel
o Turn-around time
Scaling Batch Processing
• Parallelize by sample
• Strategies:
o Many machines running single process
o Many processes on large machines
o The strategy should match the storage and infrastructure
of the organization
o Existing infrastructure and policy for data security
o Cloud infrastructure requires security engineering
o On-premises Linux clusters, if available, are a simpler
compute model to orchestrate
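The "many processes on large machines" strategy can be sketched as a pool that runs samples independently, several at a time. This is an illustrative sketch only. `process_sample` is a hypothetical stub standing in for one sample's automated workflow, and a thread pool is used on the assumption that each worker would mostly wait on an external pipeline process.

```python
from concurrent.futures import ThreadPoolExecutor


def process_sample(sample_id: str) -> tuple[str, str]:
    """Placeholder for one sample's automated workflow (align, call, annotate, filter)."""
    # A real job would launch the pipeline here; this stub only reports success.
    return sample_id, "done"


def run_batch(sample_ids: list[str], max_workers: int = 4) -> dict[str, str]:
    """Run samples independently, up to max_workers at a time, on one machine."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with sample_ids.
        return dict(pool.map(process_sample, sample_ids))


results = run_batch([f"sample_{i:03d}" for i in range(8)])
print(results)
```

The "many machines running single process" strategy has the same shape, with a cluster scheduler or cloud batch service taking the place of the local pool.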
Scaling Scenarios
• Region Testing Lab Scaling Exomes
o New contracts increase volume to hundreds of
exomes a week
o Batch work must complete over the weekend
o Windows-centric IT infrastructure
• Pharmaceutical Services Co Scaling Genomes
o High volume, batch-oriented WGS processing
o Fully automated locked-down services
deliverables
o Docker-centric pipeline construction and
automation
Region Testing Lab Scaling Exomes
• Prefers fewer large servers
• Target time limit of 24 hours to process 300 exomes
(buffer time for re-run if failure)
• Windows server with PowerShell scripts
• Analysis workflow includes:
o Coverage stats and QC
o VS-CNV calling
o Annotation and filtering
o Gene panel and phenotype automation
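The 24-hour target for 300 exomes translates into a required number of concurrent jobs once the per-sample runtime is known. The arithmetic below is a rough sizing sketch. The 1.5 hours per exome is a made-up figure for illustration, not a measured runtime, and the model assumes samples run in equal-length waves.

```python
import math


def batch_hours(n_samples: int, hours_per_sample: float, concurrent_jobs: int) -> float:
    """Wall-clock time when samples run in waves of concurrent_jobs."""
    waves = math.ceil(n_samples / concurrent_jobs)
    return waves * hours_per_sample


def min_concurrency(n_samples: int, hours_per_sample: float, deadline_hours: float) -> int:
    """Smallest number of concurrent jobs that finishes within the deadline."""
    waves_allowed = math.floor(deadline_hours / hours_per_sample)
    if waves_allowed < 1:
        raise ValueError("a single sample already exceeds the deadline")
    return math.ceil(n_samples / waves_allowed)


# Hypothetical per-exome runtime; 300 samples and 24 hours come from the slide.
jobs = min_concurrency(n_samples=300, hours_per_sample=1.5, deadline_hours=24)
print(jobs)                          # 19 concurrent jobs
print(batch_hours(300, 1.5, jobs))   # 24.0 hours
print(batch_hours(300, 1.5, 18))     # 25.5 hours, one job short of the target
```

Finishing the first pass in 24 hours leaves the rest of the weekend as the buffer for a re-run.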
Pharma Services Scaling Genomes
• Deliverables are locked down through versioned
pipelines
• All steps inside read-only Docker containers
• VSPipeline + Annotations + Project Templates
• Local Linux cluster and cloud deployment scenarios
• Secure Cloud deployments in China with no
outbound internet access
• Batch node job:
o Download Docker containers
o [Cloud only] Download BAM, VCF from storage
o Docker run pipeline
o [Cloud only] Upload results to storage
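The batch node job steps can be sketched as an ordered list of commands, with the cloud-only steps added when a storage location is given. This is a sketch under several assumptions: the image name, directory layout, and bucket paths are hypothetical, the AWS CLI stands in for whatever object storage a deployment actually uses, and the container is assumed to run the pipeline from its default entrypoint. The commands are built but not executed.

```python
from typing import Optional


def batch_node_commands(
    image: str,
    sample: str,
    work_dir: str,
    cloud_bucket: Optional[str] = None,
) -> list[list[str]]:
    """Return the ordered commands for one batch node job as argv lists."""
    # Step 1: download the versioned pipeline container.
    cmds = [["docker", "pull", image]]

    # Step 2 (cloud only): fetch inputs from object storage (assumed S3-style).
    if cloud_bucket:
        for suffix in ("bam", "vcf.gz"):
            cmds.append(["aws", "s3", "cp",
                         f"{cloud_bucket}/input/{sample}.{suffix}",
                         f"{work_dir}/input/"])

    # Step 3: run the pipeline with a read-only root filesystem;
    # inputs are mounted read-only, and only the output directory is writable.
    cmds.append(["docker", "run", "--rm", "--read-only",
                 "-v", f"{work_dir}/input:/data/input:ro",
                 "-v", f"{work_dir}/output:/data/output",
                 image])

    # Step 4 (cloud only): upload results back to object storage.
    if cloud_bucket:
        cmds.append(["aws", "s3", "cp", f"{work_dir}/output/",
                     f"{cloud_bucket}/output/{sample}/", "--recursive"])
    return cmds


local_plan = batch_node_commands("example/pipeline:1.0", "S1", "/scratch/S1")
cloud_plan = batch_node_commands("example/pipeline:1.0", "S1", "/scratch/S1",
                                 cloud_bucket="s3://example-bucket")
for cmd in cloud_plan:
    print(" ".join(cmd))
```

On a local Linux cluster the plan reduces to the pull and run steps, since inputs and outputs already live on shared storage.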
Golden Helix Deployment Diagram
Windows/Linux Server Running Desktop App
Network Attached Server (5+TB)
VSWarehouse Server (Linux)
Automated Workflow Servers
• Sequencer Output (FASTQ)
• Secondary Analysis Data (VCFs/BAMs)
• VarSeq Projects
• Annotation Data
• User Preferences
VSPipeline
• VS-CNV
• BAM/VCF => VarSeq Projects
• Aggregate Variant Projects
• Interpretation Catalogs
• Monthly ClinVar Changes
• Sample Reports
Users Interact through Remote Desktop
ESHG 2022
Attend our Corporate Satellite!
Develop repeatable cancer and germline interpretation workflows
that scale from panels to whole exomes and genomes
Maximizing Profitability in your NGS Testing Lab
Monday, June 13, 12:00-13:00 | Room -2.32 & -2.33, level -2
Seating is limited, light refreshments will be offered
Visit our booth #X5-476
• Discussions with Golden Helix team about your lab’s specific needs
• See our product in action during our in-booth demos
• Don’t leave ESHG without one of our famous t-shirts
