Automation in the Cloud with
VSWarehouse 3.0: A User’s Perspective
January 29, 2024
Presented by:
Director of Customer Success - Darby Kammeraad &
Technical FAS - Solomon Reinman
NIH Grant Funding Acknowledgments
• Research reported in this publication was supported by the National Institute of General Medical Sciences of
the National Institutes of Health under:
o Award Number R43GM128485-01
o Award Number R43GM128485-02
o Award Number 2R44 GM125432-01
o Award Number 2R44 GM125432-02
o Montana SBIR/STTR Matching Funds Program Grant Agreement Number 19-51-RCSBIR-005
o NIH SBIR Grant 1R43HG013456-01
• PI is Dr. Andreas Scherer, CEO of Golden Helix.
• The content is solely the responsibility of the authors and does not necessarily represent the official views of
the National Institutes of Health.
Golden Helix at-a-Glance
Company Snapshot: Leading SaaS provider of tertiary genomic analysis solutions for NGS labs
• Golden Helix is a SaaS bioinformatics solution provider specializing in next-gen sequencing (“NGS”) data analysis
• The Company’s software enables automated workflows and variant analysis for gene panels, exomes, and whole genomes
• Golden Helix’s solutions allow clients to increase throughput, ensure consistent quality, maximize revenue, and save time
Company Founded: 1998
Headquarters: Bozeman, Montana
Key Clinical Applications
• Prenatal testing
• Hereditary disease testing
• Reproductive testing
• Pharmacogenetics testing
• Oncology
Marquee Global Clients
• Government Research
• Pharmaceuticals
• Agrigenomics
• Testing Labs
• Translational Labs
• Human Genetics Research
• Hospitals
• Academia
Recognitions
Publications
Content & Resources
NGS Clinical Workflow
Golden Helix provides comprehensive data analytics software that scales across gene panels, whole exomes, and whole genomes
• Primary: DNA Extraction in Wet Lab and Sequence Generation
• Secondary*: Read Processing and Quality Filtering; Alignment and Variant Calling
• Tertiary (Golden Helix’s software and primary focus): Filtering and Annotation; Interpretation and Result Reporting
• Data Warehousing
• Workflow Automation
*Golden Helix provides Secondary Analysis through a reseller agreement
Comprehensive secondary and tertiary analysis solutions for primary data aggregated by all commercially available sequencers
Golden Helix works with all major sequencers…
… and scales across multiple data set sizes for cancer and hereditary use cases
Type            Size
Gene Panel      Small (100MB)
Whole Exome     Medium (1GB)
Whole Genome    Large (100GB)
Medical Device Certification
Secured CE Mark for EU
• VarSeq Dx
• VarSeq Dx is designed with compliance and reliability for your
clinical analysis.
• VarSeq Dx is our flagship software, VarSeq, that is CE marked
to meet the European In Vitro Diagnostic Regulation (IVDR
2017/746) requirements. VarSeq Dx satisfies the IVDR
requirements within the European Economic Area (EEA).
• Verification
• CE MARK
• ISO Certification
• Our customers will work with our Field Application Scientist to
verify the installation and ensure proper usage of the
software. This can be used for ISO QMS software validation
documentation.
Topics for Presentation
• Deployment: Local versus cloud
• Define your constraint. Cloud needed to:
• Accommodate increasing sample throughput
• Broadening test and necessary resource (ex. panel -> genome)
• Components of an NGS pipeline
• Many tools to construct the secondary and tertiary
pipeline
• Need for a solution to simplify setup and deployment
• Example workflow with VSWarehouse 3.0
• Recent Webcast: Bring your own Cloud: Clinical Testing at
Scale with VSWarehouse 3
• Today’s presentation provides a user perspective
demonstrating the simplicity of the new cloud platform
Growth and Success of NGS Testing in Clinical Genomics
• NGS testing becoming go-to solution for many applications:
o Rare disease diagnosis
o Hereditary cancers and inherited disorders prognosis
o Prenatal Screening
o Carrier screening
o Cancer: Dx, Prognostics and Therapeutics
o Pharmacogenomics
• Going from single-gene to panels and now exomes/genomes
• The cost of instruments and cost-per-Gb of sequence data is
going down
• But the scale and complexity of the data analysis is going up
Data Center Infrastructure
CLOUD
Strengths
• Unbounded storage: provisioned on demand
• Dynamic compute resources
• Efficient cloud-to-cloud transfer of data
• Faster onboarding and setup with less configuration
Weaknesses
• Data hosted and controlled by a third-party organization
• The vendor often controls the upgrade cycle in multi-tenant hosting
• Downloading required for complex (non-supported) workflows
• Less flexible and customizable; may not have integration options
ON-PREMISE
Strengths
• Sovereignty of data: retained with the institution
• Control of software and hardware upgrade cycle
• Locality of data allows for easy manipulation
• May be supported by existing IT and institution
Weaknesses
• Up-front capital expenses
• Generally over-provisioning resources
• IT support is needed for maintaining solution
• Security requirements may be a high bar to meet
Poll: What do you most encounter as a local constraint?
• Limitations on local bioinformatic or IT support
• Slow network
• Limitations on computational resources (RAM, CPU)
• Increasing sample throughput
• Increasing data scale (Panels -> Genomes)
Bring Your Own Cloud
• Can build on the strength of both strategies
• Run Golden Helix’s existing solutions
• You control the upgrade and maintenance cycle
• Deployed on “Your Cloud”
o Direct relationship with the cloud vendor
o Direct ownership and control of the cloud-hosted data
o Match configuration to the needs of your institution
o Choose your cloud vendor and deploy to the closest data
center to your institution
o Automated deployment with no cloud expertise required
VSWarehouse 3 Supports Bring Your Own Cloud
VSWarehouse 3 is our complete server platform for genomic analysis
• Flexible Deployment: Deployable as Bring Your Own Cloud: Amazon, Azure or On-Premises
• Cloud Application Streaming: Stream VarSeq, VSClinical, GenomeBrowse, and other custom applications
• Workflow Automation: Run VSPipeline, custom workflows, and other bioinformatics tools. Integrations with other cloud vendors and institutional stores
• File Management: Built-in file management system to easily upload, download, and preview files and manage directories
Unifying Hybrid Bioinformatics
• Complexity of building bioinformatic pipelines manually
o Requires expertise to integrate diverse tools and workflows
• Diverse testing creates added complexity
o Example scenario: Lab runs both Illumina DRAGEN WES and
Archer VariantPlex somatic tests
• Scaling magnifies infrastructure constraints
o Increased test volume demands robust computational
resources
• Golden Helix VSW3.0 Solution
o Preloaded pipeline designs for seamless integration
o Cloud-based option to overcome local resource limitations
o Supports hybrid workflow with accuracy and compliance
DOI:10.1038/s41598-020-77218-4
VSWarehouse 3 Components: Archer Pipeline example
FASTQ -> VCF + BAM -> VarSeq variant analysis and reporting
Task 1. Push FASTQ data to Archer for alignment/calling
Task 2. Pull VCF/BAM data back to server for analysis
Task 3. Automated variant filtering and annotation with VSPipeline
Entire workflow done with one click
Product Demo
Golden Helix Free eBooks
Explore Our Comprehensive
Genomic eBook Library
Discover a wide range of eBooks covering the latest in genomic
analysis, from Pharmacogenetics to Cancer Variant Analysis.
These resources offer valuable insights into cutting-edge
techniques, best practices, and innovations in clinical genomics.
Whether you're focused on precision medicine, NGS-based testing,
or data warehousing, our expertly curated eBooks provide the
knowledge to support your research and clinical workflows.
Browse through our collection to stay ahead in the rapidly evolving
field of genomic medicine.


Editor's Notes

  • #1 Casey’s intro
  • #3 Thank you for the introduction, Casey, and we’re glad everyone could attend. The focus of our presentation is to follow up on some of the latest feature developments in our VSWarehouse platform while considering deployment options in the cloud versus on-premise.
  • #4 Before we get started, I want to first express our appreciation for grant funding from the NIH. The research and development efforts for a number of our software capabilities have been supported by the National Institute of General Medical Sciences of the National Institutes of Health under the listed awards, as well as local grant funding from the state of Montana. Our PI is Dr. Andreas Scherer, who is also the CEO at Golden Helix. It’s worth mentioning that the content described today is the responsibility of us, the authors, and does not officially represent the views of the NIH.
  • #5 Let’s take a moment to also provide some background on who we are as a company. Golden Helix has been developing bioinformatics software for over 25 years and is based out of Bozeman, Montana. We have been serving a global customer base over that time, and development began by providing research-focused software for array-based analysis. However, there was an eventual shift to focus on next-generation sequencing applications, which has resulted in Golden Helix now being positioned as a market leader in NGS clinical and research applications. Our tertiary software solutions are scalable for routine work from gene panels up to whole genomes and are automated to facilitate high-throughput operations where large numbers of samples are being processed. This, combined with our subscription-based business model, allows users to freely process an unlimited number of samples as needed, without the concerns about scaling costs that would be experienced with most per-sample applications on the market. The tests designed in our software are flexible and user-defined, spanning a wide spectrum of applications, including somatic workflows for oncology-based analyses, germline workflows for hereditary cancer, inherited and rare diseases, prenatal testing, carrier screening, family-based analysis, and pharmacogenomics. Taking advantage of these capabilities is our widespread global customer base: our users span government and testing labs, hospitals, universities, and many research and pharmaceutical labs. Our communication with our customers informs our software development as we aim to stay abreast of the most important features to develop, as well as edge cases and different data types that we can support. Through this partnership, our software has been regularly cited in reputable scientific journals, which is a testament to the work of our customer base.
  • #6 Now that we have discussed some use cases for our software, let’s review where our tools fit into the bigger picture of an NGS workflow. Generally speaking, the NGS workflow is divided into three stages: primary analysis encompasses everything from sample collection to sequencing, secondary analysis describes the processes for read alignment and variant calling, and the tertiary stage is where variant evaluation and reporting take place. VarSeq is a tertiary analysis tool that is designed to be agnostic to upstream sequencing platforms and secondary analysis pipelines, which means we accept NGS variant calling and alignment files from the various platforms and pipelines that are commonly used, provided that these adhere to standard VCF and alignment file formats. These upstream pipelines include Illumina and ThermoFisher and some of the emerging sources like MGI, and we accommodate PacBio and Oxford Nanopore long-read technologies. We also have a long-standing partnership with Sentieon to provide labs with a secondary analysis solution as needed. VarSeq is one of the few platforms that can handle the range of variant calling outputs, tackling both short- and long-read data and scaling from small targeted panels up to complete whole genomes. That scaling increases computational and storage demands, which is relevant to today’s topic. The graphical user interface of VarSeq serves as the front end for variant annotation and filtering, as well as clinical interpretation and reporting for small variants, CNVs, and fusions. However, we couple this GUI with our command-line workflow automation tool, VSPipeline, for higher-throughput processing of each component of tertiary analysis. Lastly, we provide robust data warehousing solutions via VSWarehouse, which serves as a repository for aggregating and storing variant frequency data from your own cohorts over time. VSWarehouse facilitates efficient data management and enables easy retrieval of variant assessments or interpretations that can be applied to a growing cohort of samples, and it has historically been deployed locally in your environment.
  • #7 Each of the applications we just discussed has been diligently developed by our team here at Golden Helix, and we adhere to a highly structured and thoroughly documented manufacturing process. As a result of this quality commitment, Golden Helix is now an ISO 13485 certified medical device manufacturer as of January 2024, and VarSeq Dx is a CE-marked medical device under IVDR as of April 2024. This certification holds significant value for laboratories seeking their own ISO certification and IVDR compliance, especially those within the EU or those processing European samples, as the software can more easily be incorporated into a lab’s quality management system. Our certification and continued adherence to a robust QMS assure reproducibility of our quality specifications and manuals, thus simplifying the validation process for any lab using current and future versions of our software. It is important to note that VarSeq is not CE marked for users by default; if a user desires to use VarSeq as a CE-marked medical device, we have developed VarSeq Dx Mode, which is available in VarSeq 2.6.1 and all future versions. When implementing this feature, we have a certification process which users must complete, and our support staff is ready to guide you through our user onboarding, installation and verification, and proficiency certification processes.
  • #8 Today we’re going to demonstrate a couple of layers of software interface: one is a higher-level view of NGS workflow deployments, which Solomon will take you through, and the other is our variant analysis UI in VarSeq. The analytical process in VarSeq can be separated into three steps. Step 1 is to import the full list of variants: SNPs and indels, CNVs, and other structural variants from both long- and short-read pipelines. This step will actually be automated and prepared for me in advance, as I’ll play the role of a clinical user focused on variant reporting. Step 2 is where I’ll more or less begin analysis; it serves as the variant evaluation process following the integrated ACMG and AMP guidelines for germline and somatic workflows. Step 3 is then the streamlined process of creating the clinical report of findings, all with the click of a single button. So now let’s direct our focus more specifically to today’s topic.
  • #9 The purpose of today’s webcast is to demonstrate some of the latest features of the Golden Helix software stack. At the same time, we encourage our audience to consider their current deployment architecture. One important aspect of our latest features is defining the potential need to move from a local architecture to the cloud. There may be more than one reason that justifies your need for the cloud, which we will cover. There is also a need to break down what components make up an NGS pipeline, so as to demonstrate the value of VSWarehouse 3.0 in simplifying the deployment of commonly used pipelines and ultimately streamlining the user experience for both bioinformatics and variant analysis. Our VP of Product Development, Gabe Rudy, introduced us to these new capabilities on a recent webcast. You can access his webcast and many others from our Golden Helix site, so I encourage you to explore all the available content. Today, we will reiterate some of the talking points from Gabe’s presentation but orient the audience to a hypothetical use case from the user’s perspective.
  • #10 Let’s kick things off by addressing some high-level considerations for upgrading the underlying resources necessary for any NGS pipeline. First off, the spectrum of reportable tests is rapidly expanding. Historically, reportable outcomes may only have fallen under the scope of a targeted hereditary cancer panel, as just one example. These panels are still commonly utilized, but many of our users are expanding to whole exome and genome tests that broaden the diagnostic yield for various tests such as carrier screening, family-based analysis, and even pharmacogenomics. So not only do you need a tool like VarSeq that gives you the freedom to build these unique workflows, but you also need to deploy the tools in an environment that allows you to run them in a consistent and timely manner. At the same time, the cost of the genome is regularly decreasing, only increasing its potential as the standard input for many of these listed tests for an ever-increasing number of samples. Overall, both the scale and the analytical complexity are increasing, which requires us to make strategic decisions early on about how to handle the associated computational burden. I’d like to hand things over to our Technical Field Application Scientist, Solomon Reinman, to take us through the pros and cons of deployment strategies and what our latest developments aim to simplify for our users.
  • #11 Thanks for the overview, Darby. We’re excited about the growth we’re seeing in the industry and among our clients. As we’ve seen demand for higher throughput continue to rise, we’ve striven to create flexible, scalable solutions with the same framework that our users have come to cherish. A fundamental shift that we’ve sought to accommodate is seamless integration with the cloud, while maintaining the building blocks that make the Golden Helix software suite unique in the market. Let’s briefly touch on the strengths and weaknesses of cloud versus on-premise deployment, bearing in mind that while we’ll be showing off some of our cloud-focused capabilities today, we still absolutely support and prioritize being just as competitive on-premise. So why bring cloud into the picture at all? One of the key components is growth. Cloud deployments provide unbounded, on-demand compute and storage resources, allowing institutions to ramp up production as demand increases without changing their infrastructure. Cloud also generally allows for a more rapid, flexible deployment and requires less configuration by IT and bioinformatics teams. Of course, there is a trade-off. While there are options for ultra high-security cloud deployments, the cloud paradigm still requires users to allow third-party organizations to host data. Furthermore, costs can be less predictable than with an on-premise deployment, and a cloud deployment may not support the same level of customizability in terms of workflow development and infrastructure. Conversely, on-premise deployments offer a concrete upfront cost, complete data sovereignty and control over the software and hardware upgrade cycle, and, for new customers, can sometimes be supported by and integrate directly with existing on-premise infrastructure. This comes at the cost of less flexible resource allocation and generally higher demands on IT. Choosing between the cloud and on-premise deployment is a big decision, and we’re proud to support both options, or even a mix of the two.
  • #12 Having discussed the pros and cons of cloud deployments, we’re eager to hear from you. We’ve got a poll that we’d love for you to give some thought to: if you are currently running a local deployment for your NGS workflow, what are some constraints that are affecting your ability to meet demand in an efficient manner? Feel free to chime in on our poll and provide some feedback.
  • #13 Customize resources and lifecycle. Currently (AWS, Azure). With some of the disadvantages of on-premise solutions fresh in our minds, let’s touch on how we can make the most of cloud deployments. Our solution to enabling users to flexibly and efficiently deploy the Golden Helix software suite is what we call the “Bring Your Own Cloud” strategy. A cornerstone of our development and customer service philosophy has always been enabling our users to mold our software to fit their needs. The Bring Your Own Cloud model exemplifies this by pairing our highly customizable software with the freedom for users to define their own cloud environment. In this model, users have a direct relationship with the cloud vendor of their choice, maintaining direct ownership of cloud-hosted data and having full jurisdiction over resource utilization and upgrade and maintenance cycles. Meanwhile, we’ve developed VSWarehouse 3 to be deployable without any cloud expertise.
  • #14 So what is VSWarehouse 3, and how does it support the Bring Your Own Cloud model? Simply put, VSWarehouse 3 is a hub that handles workflow automation, file management, and cloud application streaming within the framework of a flexible deployment. The intricacies of file management, whether that be locally on premise, coordinated across multiple servers, hosted on various cloud storage platforms, or a combination of all of the above, are abstracted from the user for a seamless file management experience. Similarly, workflow automation is managed behind the scenes with an intuitive infrastructure handling file management, API calls, and locally run and cloud-run pipeline components. Lastly, open-ended application streaming allows for hosting VarSeq for end users to analyze clinical results alongside VSCode to get under the hood and manage workflow and file configurations, all in the same hosted environment. What can we accomplish with all the tools in the VSWarehouse 3 toolkit?
  • #15 Building and maintaining bioinformatic pipelines is a complex task. Users need to understand both the intricacies of a given workflow as well as the computational resources and infrastructure needed to run it efficiently and reliably. Furthermore, many labs utilize disparate pipelines for various tests and may need to nimbly offer new tests as demand flows in different directions. For instance, consider a lab that runs an Illumina pipeline using BaseSpace and DRAGEN for whole-exome data alongside Archer’s VariantPlex platform for somatic tests. While the result of both pipelines can be a VarSeq project for tertiary analysis, they require separate secondary analysis and data transfer steps dealing with distinct APIs and sample management. In VSWarehouse 3, not only can users host both pipelines under the same infrastructure, but they can dynamically allocate resources to both pipelines in tandem and respond to varying throughput. Nor are these examples exhaustive. While we host some standard common workflows that users can run right off the bat, virtually any bioinformatic pipeline can be brought under the umbrella of VSWarehouse 3. This entails not only running them seamlessly in the VSWarehouse 3 environment but integrating them directly with our built-in workflow management and app streaming.
  • #16 Building and maintaining a bioinformatics pipeline is a highly complex task, requiring a deep understanding of various tools, workflows, and infrastructure needs. Labs face significant challenges when managing multiple unique pipelines, especially as testing volumes increase and workflows diversify. Take, for example, a laboratory that utilizes Illumina's DRAGEN secondary analysis for whole-exome sequencing tests while simultaneously running Archer's VariantPlex platform for somatic workflows. Each of these workflows demands its own specialized pipeline, which adds layers of complexity in integration, optimization, and infrastructure management. This complexity is further amplified by the need to scale, as labs must accommodate increasing test volumes without overburdening local computational resources. For many labs, the limitations of on-premises infrastructure create bottlenecks, restricting growth and efficiency. Golden Helix VSWarehouse 3.0 addresses these challenges by providing preloaded designs for diverse bioinformatic pipelines, offering seamless integration of hybrid workflows. By computing these pipelines in the cloud, VSWarehouse 3.0 eliminates local infrastructure constraints, ensuring labs can scale efficiently while maintaining accuracy and regulatory compliance. This solution empowers labs to focus on delivering clinical insights rather than being bogged down by technical and logistical hurdles.
  • #17 Part of the seamless environment is breaking workflows down into tasks, which can be visualized and run independently. The user interface for running these tasks is abstracted to the point where bioinformaticians can take a step back and let end users intuitively schedule, run, and view the results of hosted pipelines. As an example, the Archer VariantPlex pipeline can be broken down into three tasks: pushing FASTQ data to Archer’s secondary analysis platform, pulling the generated VCF and BAM data back to the VSWarehouse 3 environment, and parsing this data out into a VarSeq project for the end user to perform tertiary analysis. Once the pipeline is set up with the correct credentials and user-defined inputs, a run can be kicked off at the click of a button or scheduled to run automatically, all within a simple but powerful user interface.
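    The three-task breakdown of the Archer VariantPlex pipeline described in #17 can be sketched roughly as follows. This is only an illustrative Python sketch of the "pipeline as independently runnable tasks" idea, not VSWarehouse 3 code: the `Task` and `Pipeline` classes, the task functions, and every value they produce are hypothetical placeholders, and no real Archer or VSWarehouse API is called.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Context = Dict[str, object]


@dataclass
class Task:
    """One independently runnable step of a pipeline (hypothetical)."""
    name: str
    run: Callable[[Context], Context]


@dataclass
class Pipeline:
    """An ordered list of tasks that share a context dictionary (hypothetical)."""
    name: str
    tasks: List[Task] = field(default_factory=list)

    def run(self, context: Context, only: Optional[str] = None) -> Context:
        """Run every task in order, or just the task named by `only`."""
        completed = list(context.get("completed", []))
        for task in self.tasks:
            if only is not None and task.name != only:
                continue
            context = task.run(dict(context))
            completed.append(task.name)
        return {**context, "completed": completed}


def push_fastq(ctx: Context) -> Context:
    # Placeholder: a real task would upload FASTQ data to the
    # secondary analysis platform and record the job it started.
    return {**ctx, "job_id": f"job-for-{ctx['run_id']}"}


def pull_results(ctx: Context) -> Context:
    # Placeholder: a real task would download the generated VCF and BAM files.
    return {**ctx, "vcf": f"{ctx['job_id']}.vcf", "bam": f"{ctx['job_id']}.bam"}


def build_project(ctx: Context) -> Context:
    # Placeholder: a real task would create a project for tertiary analysis.
    return {**ctx, "project": f"{ctx['run_id']}-project"}


pipeline = Pipeline(
    name="variantplex-sketch",
    tasks=[
        Task("push_fastq", push_fastq),
        Task("pull_results", pull_results),
        Task("build_project", build_project),
    ],
)

result = pipeline.run({"run_id": "demo"})
print(result["completed"])
```

    Because each task only reads from and writes to the shared context, a single step can be rerun on its own (for example `pipeline.run(result, only="build_project")`), which mirrors the idea of tasks that can be visualized and run independently.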
  • #18 Without further ado, let’s jump into the VSWarehouse 3 interface and walk through some of the capabilities we’ve been discussing.
  • #19 Before we start diving into the subject, I wanted to mention our appreciation for our grant funding from NIH. The research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under the listed awards. We are also grateful to have received local grant funding from the state of Montana. Our PI is Dr. Andreas Scherer, who is also the CEO at Golden Helix, and the content described today is the responsibility of the authors and does not officially represent the views of the NIH. So with that covered, before diving into today's topic, I'd like to offer some background and context on what Golden Helix brings to the table as a company.