In this webcast, we provide an overview of our complete end-to-end clinical stack. First, we walk through our secondary analysis pipeline, which allows you to call SNVs and CNVs. We then demonstrate how various types of CNVs are called and discuss metrics that express the confidence associated with each call.
From there, we show you our powerful tertiary analysis capabilities for gene panels, exome, and whole genome data. We show how our users can move seamlessly from the variant interpretation stage to a clinical report. Lastly, we demonstrate how our genetic data warehouse, VSWarehouse, can be used in the clinic. We also demonstrate various use cases and show how a comprehensive assessment catalog can be utilized to ensure consistent analysis across multiple labs.
We hope you enjoy our first presentation on Golden Helix's entire end-to-end solution for clinical labs!
1. Golden Helix’s End-to-End Solution for Clinical Labs
Steven Hystad – Field Application Scientist
Nathan Fortier – Senior Software Engineer
20 Most Promising Biotech Technology Providers
Top 10 Analytics Solution Providers
Hype Cycle for Life Sciences
2. Questions during the presentation
Use the Questions pane in your GoToWebinar window
3. Golden Helix
Golden Helix is a global bioinformatics company founded in 1998.
GWAS
Genomic Prediction
Large-N-Population Studies
RNA-Seq
Large-N CNV-Analysis
Variant Warehouse
Centralized Annotations
Hosted Reports
Sharing and Integration
Variant Calling
Filtering and Annotation
Clinical Reports
CNV Analysis
Pipeline: Run Workflows
6. Golden Helix – Who We Are
When you choose a Golden Helix solution, you get more than just software
REPUTATION
TRUST
EXPERIENCE
INDUSTRY FOCUS
THOUGHT LEADERSHIP
COMMUNITY
TRAINING
SUPPORT
RESPONSIVENESS
INNOVATION and SPEED
CUSTOMIZATIONS
9. Why Sentieon? – Pipeline Improvements
Why Improve Broad’s GATK Haplotype Caller?
GATK Haplotype Caller is the most accurate DNA analysis tool
Problem: GATK Haplotype Caller is too slow
Solution
Consistent: No down-sampling or thread dependency = no run-to-run differences = More accurate
No Java: written in low-level C
More Robust: Can handle joint genotyping on over 100K samples simultaneously without needing a gVCF
Software-only Solution: Easily deployable, easily scalable, and easily upgraded.
Two Product Lines
DNAseq – BWA-GATK Haplotype Caller
TNseq – MuTect and MuTect2
11. Proven Accuracy – precisionFDA
Consistency Challenge
- Processed single sample consistency
- Single biological sample x two library preps
- Top overall performance
- Highest reproducibility
Truth Challenge
- Determine accuracy in Truth Set
- Highest indel precision
- Highest SNP recall
14. CNV and LoH Detection
VarSeq Calls CNVs on
- Gene Panel Data
- Exome Data
- Whole Genome Data
Faster than CMA and MLPA
Utilizes existing coverage data
Calls CNVs of every size
- Small Single Exon Events
- Large Cytogenetic Events
15. CNV Detection via NGS
CNVs are called from coverage data
Challenges
- Coverage varies between samples
- Coverage fluctuates between targets
- Systematic biases impact coverage
Solutions
- Data Normalization
- Reference Sample Comparison
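The data normalization step above can be illustrated with a minimal sketch. It assumes the simplest possible scheme, dividing each target's depth by the sample's mean depth, so that samples sequenced to different depths become comparable. The function name `normalize_coverage` is hypothetical, and this is not a description of how VarSeq normalizes coverage internally.

```python
def normalize_coverage(depths):
    """Scale one sample's per-target depths so that they average to 1.0.

    Illustrative assumption: mean-depth scaling. Real pipelines usually
    also correct for systematic biases such as GC content.
    """
    if not depths:
        raise ValueError("no targets supplied")
    mean_depth = sum(depths) / len(depths)
    if mean_depth == 0:
        raise ValueError("sample has no coverage")
    return [d / mean_depth for d in depths]


# Two samples with the same coverage profile but different sequencing depth
deep_sample = normalize_coverage([100, 200, 300])
shallow_sample = normalize_coverage([50, 100, 150])
print(deep_sample == shallow_sample)
```

After this scaling, the remaining target-to-target differences are what the reference sample comparison has to account for.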
16. CNV calling in VarSeq
Reference samples used for normalization
Metrics
- Z-score: number of standard deviations from reference sample mean
- Ratio: sample coverage divided by reference sample mean
- VAF: Variant Allele Frequency
For Gene Panels and Exomes
- Probabilistic model used to call CNVs
- Segmentation identifies large cytogenetic events
For Whole Genome Data
- Targets segmented using Z-scores
- Events called based on Z-score and Ratio thresholds
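As a rough illustration of the Z-score and Ratio metrics defined above, the sketch below computes both for a single target and then applies threshold-based calling. The function names and the cutoff values (`z_cut`, `dup_ratio`, `del_ratio`) are hypothetical choices for the example, not VarSeq defaults, and the probabilistic model used for gene panels and exomes is not reproduced here.

```python
from statistics import mean, stdev


def target_metrics(sample_cov, reference_covs):
    """Return (z_score, ratio) for one target.

    sample_cov: normalized coverage of the sample of interest.
    reference_covs: normalized coverage of the same target in the
    reference samples (at least two values are needed for stdev).
    """
    ref_mean = mean(reference_covs)
    ref_sd = stdev(reference_covs)
    if ref_mean == 0 or ref_sd == 0:
        raise ValueError("reference coverage is zero or has no variation")
    z_score = (sample_cov - ref_mean) / ref_sd
    ratio = sample_cov / ref_mean
    return z_score, ratio


def call_target(z_score, ratio, z_cut=3.0, dup_ratio=1.25, del_ratio=0.75):
    """Label a target using illustrative (assumed) thresholds."""
    if z_score >= z_cut and ratio >= dup_ratio:
        return "duplication"
    if z_score <= -z_cut and ratio <= del_ratio:
        return "deletion"
    return "normal"


# A target with roughly half the reference coverage
z, r = target_metrics(0.5, [0.9, 1.0, 1.1])
print(call_target(z, r))
```

Requiring both metrics to agree is one simple way to avoid calling targets where the ratio looks extreme only because the reference samples themselves are noisy.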
18. VarSeq Suite
Variant annotation, filtering,
and interpretation
Rich visualizations with
GenomeBrowse built-in
Powerful GUI and
command-line interfaces
Repeatable workflows
VarSeq
Simple
Flexible
Scalable
19. Example Workflows
Illumina TruSight Cancer Sequencing Panel
- Oncogenes & Tumor Suppressor Genes
- Drug Targeting Information & Ongoing Clinical Trials
Exome Trio Workflow
- Causal Variants Associated with Phenotype
- Incidental Findings
Hereditary Risk Sequencing Panel
- Comprehensive coverage of 175 genes with known associations to inherited cardiac conditions, cancers, and other inherited diseases.
20. Sample Data for Cancer Gene Panels
Illumina TruSight Cancer Sequencing Panel
- Comprehensive coverage of 154 genes designed to target exons of key tumor suppressor genes and frequently cited oncogenes.
- BAM and VCF files for each replicate are available
VSReports
24. VSWarehouse – Variant Warehouse Server
A place to archive full VCFs of every sequenced sample
Centralized genomic data hosting, integration with other systems
Ask the Variant Warehouse:
- Have I ever seen this variant in my previous test samples?
- At what frequency? (counts as well)
- Does this gene contain other rare variants in my cohort?
- Did I provide a pathogenicity assessment for this variant? Has that changed?
- Has ClinVar changed since that assessment was initially made?
- Have I put this variant into a clinical report for any previous samples?
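The first two questions above ("have I seen this variant, and at what frequency?") amount to a lookup across archived samples. The sketch below shows the idea on a small in-memory dictionary. It is purely illustrative: the data layout and the function `variant_history` are assumptions for this example and do not represent the VSWarehouse API or its storage model.

```python
def variant_history(variant, samples):
    """Report which archived samples carry a variant.

    variant: a (chrom, pos, ref, alt) tuple.
    samples: dict mapping sample name to a set of such tuples
             (a hypothetical stand-in for archived VCF content).
    """
    carriers = sorted(
        name for name, variants in samples.items() if variant in variants
    )
    count = len(carriers)
    frequency = count / len(samples) if samples else 0.0
    return {"carriers": carriers, "count": count, "frequency": frequency}


# Made-up example data, not real variants
archive = {
    "S1": {("chr1", 100, "A", "G"), ("chr2", 500, "C", "T")},
    "S2": {("chr1", 100, "A", "G")},
    "S3": {("chr3", 42, "G", "A")},
}
print(variant_history(("chr1", 100, "A", "G"), archive))
```

A production warehouse would answer the same question with an indexed database query rather than a scan over every sample.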
25. Integration with VarSeq
VarSeq can “Export to Warehouse”
- Create a new Project
- Current project template is used
- Add Variants to Existing (data only)
Annotate with Warehouse Counts
- Allele / Genotype counts computed on warehouse variants (or other sources in template).
Remote Assessment Catalogs
- Customize databases of variant classifications or internal flags like false-positives.
Remote Reports
- Central versioning of report
- Queryable database of all saved reports, with all data in form ready for EMR/LIMS integration
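To make the allele and genotype counts mentioned on this slide concrete, here is a minimal sketch that tallies them from VCF-style `GT` strings for one variant site. The function name `count_genotypes` and the output layout are assumptions for illustration, not the annotation that VarSeq or VSWarehouse actually produces.

```python
from collections import Counter


def count_genotypes(genotypes):
    """Tally genotype classes and alternate-allele counts.

    genotypes: VCF-style GT strings such as "0/1", "1|1" or "./.".
    Calls with any missing allele (".") are skipped entirely.
    """
    genotype_counts = Counter()
    alt_alleles = 0
    called_alleles = 0
    for gt in genotypes:
        alleles = gt.replace("|", "/").split("/")
        if "." in alleles:
            continue
        if len(set(alleles)) > 1:
            genotype_counts["het"] += 1
        elif alleles[0] == "0":
            genotype_counts["hom_ref"] += 1
        else:
            genotype_counts["hom_alt"] += 1
        alt_alleles += sum(a != "0" for a in alleles)
        called_alleles += len(alleles)
    frequency = alt_alleles / called_alleles if called_alleles else 0.0
    return {
        "genotype_counts": dict(genotype_counts),
        "alt_allele_count": alt_alleles,
        "called_alleles": called_alleles,
        "alt_allele_frequency": frequency,
    }


print(count_genotypes(["0/1", "1/1", "0/0", "./.", "0|1"]))
```

Skipping missing calls keeps the denominator limited to alleles that were actually genotyped, which matters when the cohort includes samples with uneven coverage.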
26. VSWarehouse – Centralized Collaboration
Project A
Project B
VSWarehouse Server
Group A
Group B
Collaborators