Bioinformatics
Emphasis on “Production”
Chris Dwan, SVP Production Bioinformatics
Bio-IT World, 2022
Chris Dwan
Late 90’s: Built AI systems for Military R&D
2000 - 2004: Jumped to Genomics, started frantically learning biology, built my first HPC system
2004 - 2011: Built and led BioTeam’s consulting team
Built a bunch of systems for science (mostly genomics), including petabyte scale storage at NASA in 2007
iNquiry
2011 - 2014: Built IT organization and infrastructure for NYGC
2014 - 2017: Led research computing and IT through a data / cloud transformation.
2017 - 2020: Independent Consulting
2020: Built and led production bioinformatics at Sema4
Computer Science & Engineering
Domain Expertise Matters
“Bioinformatics is full of pitfalls for those who look for patterns or
make predictions without a thorough understanding of where biological
data come from and what they mean”
Nevin Young – Distinguished McKnight Professor at UMN
“People who are more than casually interested in computers should
have at least some idea of what the underlying hardware is like.
Otherwise the programs they write will be pretty weird.”
Donald Knuth – Professor Emeritus at Stanford
Sema4
A patient-centered health intelligence company dedicated to
advancing healthcare through data-driven insights.
Sema4 is transforming healthcare by applying artificial intelligence
(AI) and machine learning to multidimensional, longitudinal
clinical and genomic data to build dynamic models of human
health and define optimal, individualized health trajectories.
Centrellis®, our innovative health intelligence platform, is enabling
us to generate a more complete understanding of disease and
wellness and to provide science-driven solutions to the most
pressing medical needs.
We believe that patients should be treated as partners, and that
data should be shared for the benefit of all
~50PB data
Genomic Testing at Sema4
Comprehensive molecular profiling
insights to help providers identify
therapies and clinical trials for their
patients today, and take advantage of
the therapies and trials of tomorrow.
One of the most comprehensive
carrier screens available, providing
accurate, actionable insights into
carrier status for a broad range of
hereditary conditions to help
patients make more informed
pregnancy planning choices.
Comprehensively screens for
common chromosome aneuploidies,
sex chromosome aneuploidies, and
microdeletion syndromes, with the
flexibility to order select
components of the test.
Detects 193 childhood conditions, many
of which can’t be detected by carrier
screening, standard prenatal tests, or
state newborn screening alone. Also
includes a pharmacogenomic (PGx)
analysis of a child’s response to more
than 40 medications that may be
prescribed during childhood, including
common antibiotics
Offers a menu of panels to help you
and your patients understand their
individual risk for developing cancer
or to inform treatment decisions.
These tests are enabled through a
set of digital tools and services to
support easy identification,
ordering, resulting, counseling, and
access to testing.
We are hiring: https://sema4.com/careers
Production Bioinformatics (aka “Data Ops”)
Chief Information Officer
Production Bioinformatics
Chief Data Officer
Chief Technology Officer Chief Health Informatics Officer
Production Data:
• LIMS / Lab Informatics
• Off instrument, to cloud, through analysis pipelines
• Delivered to downstream systems
• FAIR (Findable, Accessible, Interoperable, and Reusable) for R&D
Production Informatics:
• Support for lab validations, verifications, process equivalencies
• Sample tracking / status
How we work:
• 24/7 front-line operations team
• Teams for data engineering, lab informatics,
analysis & methods, and portfolio / quality.
Adjacencies:
• Clinical
• Product
• R&D
• Lab Operations
Chief Information
Security Officer
Genomic Solutions
The support model changes as a process matures
Research and Development
R&D Research Development
Embed software quality analysts
and software testers
R&D Research creates
prototype capabilities
Development creates reusable tools and processes.
Production delivers
correct / complete data
Research
Academic Style Research
produces knowledge,
insights, and manuscripts.
Offer both bioinformatics and
computational biology support
Embed experienced
computational biologists
Production
Less structure, more
experience / seniority
More structure, easier on-ramp for early career folks
Software and Data
From Sample to Report (Simplest Version)
Receive Sample(s) Lab Process Data Process Clinical Review
Variant Annotation Deliver Report
Confirmation and QC
From Sample to Report (moderately complex version)
Lanes: Lab Ops, Data Ops
Samples → Extracted DNA → Capture → Molecular Barcodes → Pooling / Multiplex → Flowcell → Sequencer
Biorepository; Ancillary and Confirmatory Assays
Run Folder (BCL): ~1TB per S4 flowcell. QC: Run Level, Lane level
Demultiplex → Reads (FASTQ): many files per sample, ~1TB per S4 flowcell
Alignment → Aligned Reads (BAM): few files per sample, ~1TB per S4 flowcell. QC: Sample Level
Variant Calling → Variants (gVCF): few files per sample, significantly smaller files (~2% of BAM). QC: Batch
On-capture variants → Masking → In-test variants (only variants that were ordered by a physician)
Annotation → Annotated Variants
Post Analytical: ROI QC & Confirmation → Confirmed Variants
Clinical Review → Reportable Variants → Reporting
From Sample to Report (most complex version)
Samples are received at the lab as blood, saliva, biopsy or other tissue.
Unused sample is stored for 60 days.
DNA is extracted from the sample.
Extracted DNA is sent for biobanking.
Undesired parts of the genome are washed away, leaving us with targeted regions.
The DNA fragments are tagged with a molecular “bar-code”.
The tagged DNA is then mixed (“pooled” or “multiplexed”).
Pools are loaded onto Illumina flowcells.
Flowcells are run on Illumina instruments.
Runs (BCL): Data are uploaded to the Amazon cloud.
FASTQ: The run folder is demultiplexed into individual reads.
BAM: Reads are mapped or aligned to a reference genome.
VCF: The BAM is analyzed to call genetic variants. These are stored in the variant call file (VCF).
Masked VCF: The VCF is masked so that the curators only see genes that were ordered by a physician.
Annotations: Curators assign text narratives to variants, describing their likely clinical impact.
Quality Control: Laboratory Directors review data, order confirmatory assays, and export reportable variants.
Identity Assays detect sample swaps.
Ancillary Assays detect variants not accessible to NGS.
Confirmatory Assays support (or disprove) variants with marginal evidence.
Reporting and Sign Out: Laboratory Directors create and finalize the report.
Clinical Systems for Business, Lab, and Data (high level)
The Lab Information System (LIS) is the system of record for business logic like test ordering and report delivery.
The Lab Information Management System (LIMS) drives laboratory workflows and is the system of record for samples and assays.
The Lab Data Lake is the indexed repository of all instrument and derived data.
Diagram elements: Portal Orders, EMR Orders, Paper Orders, Manifests, Client Onboarding, Accessioning, Digital Accessioning, Individual Samples, Batches of Samples, Samples, Orders, “Message Bus”, Status updates, Billing, Clinical Reports, 3rd Party Data, Data Delivery (internal), Data Delivery (external)
Colocation
Lab to Cloud and Primary Analysis
Branford Lab
Stamford Lab
Cheetara Snarf
Panthro
Drax
Data streams from lab to colocation site over a high-performance private network.
Electro Iris
Titan Ultron Watson
Zeus
Tygra Lion-O
Kit
Kat
AWS Cloud
AWS Direct Connect
Lambda functions trigger demultiplexing,
alignment, and variant calling.
Metadata for all lab and analytical results
are indexed in a MongoDB data lake
Instrument
Buckets (BCL)
Clinical & R&D
Buckets (FASTQ)
Data Lake
Data is cached at the colocation site
and streamed to the cloud concurrent
with the instrument runs
High performance
private network
Near future: DRAGEN accelerator boards from Illumina to
perform accelerated demultiplexing, alignment, and variant
calling at the colocation site
Reduces network burden and cloud spend
Enables rapid QC (~1 hour post run completion)
Commodifies instrument adjacent work
This will reduce me from 99.9% cloud to a mere 99% cloud.
Test
Nonclinical
Lab Local
Storage
Sequencer
Bucket
Cloud Upload: Custom
software copies files from
lab local storage to
Amazon’s cloud.
Clinical
Demultiplexing: Custom software demultiplexes and moves the FASTQ and other sample-level files into appropriate locations
Bioinformatics Sample Processing & Management: Job dispatch (“Gondor”) and sample tracking database (“Zion”) trigger analysis tasks
Modern / Cost-effective: “Valinor”
Workflows in WDL, custom code in Docker, executed by Cromwell.
Advantage: Separate workflow logic from underlying code.
Highly available / durable: “Nexus”
Commercial platform for executing WDL workflows
Advantage: Platform & Engineering Support, profiling & optimization
Legacy: “Mordor”
Monolithic Docker image, triggered by AWS Lambda.
Advantage: Got us to the cloud
Gondor: Bioinformatics Pipeline Management
Gondor: Routes data and tracks /
triggers workflows based on open
requests and available data.
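The slides do not describe Gondor's internals, so the following is only a minimal sketch of the stated rule: a workflow is triggered when a request is open and its input data are available. The `Request` shape, the data type names, and the `ready_to_trigger` function are hypothetical illustrations, not Gondor's actual schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """An open analysis request (hypothetical shape, not Gondor's schema)."""
    sample_id: str
    workflow: str        # e.g. "alignment"
    needs: frozenset     # data types that must exist before launch


def ready_to_trigger(open_requests, available):
    """Return the open requests whose required inputs are all available.

    `available` maps sample_id -> set of data types already in the data lake.
    """
    ready = []
    for req in open_requests:
        have = available.get(req.sample_id, set())
        if req.needs <= have:   # subset test: every needed input is present
            ready.append(req)
    return ready


requests = [
    Request("S1", "alignment", frozenset({"FASTQ"})),
    Request("S1", "variant_calling", frozenset({"BAM"})),
    Request("S2", "alignment", frozenset({"FASTQ"})),
]
available = {"S1": {"BCL", "FASTQ"}, "S2": {"BCL"}}

for req in ready_to_trigger(requests, available):
    print(f"trigger {req.workflow} for {req.sample_id}")
```

Because the function only compares requests against data, running it again after new data lands would pick up the remaining requests, which is one way to read "trigger (and re-trigger)".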
LIS
LIMS
Lab Data Lake
Bioinformatics Infrastructure (99% cloud)
Legacy Pipelines
NGS
Demux
Upload
Legacy pipelines are monolithic Docker containers that run on raw AWS.
Modern Pipelines
Masking Service
The Masking Service reduces
VCFs down to those that were
ordered for a particular report
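As an illustration of the masking idea only, here is a minimal sketch that keeps VCF header lines plus the records whose position falls inside an ordered region. It assumes that "what was ordered" has already been translated into genomic intervals. The `mask_vcf` function, the region list, and the coordinates are hypothetical, and the real Masking Service may work quite differently.

```python
def mask_vcf(lines, ordered_regions):
    """Keep header lines and records whose POS falls in an ordered region.

    `ordered_regions` is a list of (chrom, start, end), 1-based inclusive.
    Only the POS column is checked, so a variant that starts before a
    region and extends into it would be dropped by this simple version.
    """
    kept = []
    for line in lines:
        if line.startswith("#"):
            kept.append(line)          # meta-information and column header
            continue
        chrom, pos = line.split("\t", 2)[:2]
        pos = int(pos)
        if any(c == chrom and s <= pos <= e for c, s, e in ordered_regions):
            kept.append(line)
    return kept


# Toy records with made-up coordinates.
vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr7\t150\t.\tA\tG\t50\tPASS\t.",
    "chr7\t900\t.\tC\tT\t50\tPASS\t.",
    "chr17\t150\t.\tG\tA\t50\tPASS\t.",
]
masked = mask_vcf(vcf, [("chr7", 100, 200)])
print("\n".join(masked))
```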
GermLine Dashboard
NIPT Dashboard
Oncology Suite
Variant Curation & Annotation
Confirmatory
Assays & Re-work
Certain legacy
processes still rely
on file drops.
Gondor tracks orders
& data to trigger (and
re-trigger) pipelines
Modern pipelines are coded in WDL and run using Cromwell or DNAnexus
The Lab Data Lake is a
large collection of objects
stored in S3 and indexed in
MongoDB
Reporting
Dashboards allow the clinical team to track results, perform QC and confirmation, and order re-work as necessary.
The data platform
The work of building out an integrated data
platform is grinding and detail-oriented
• It starts with executive sponsorship and
resourcing
• It insists on clear, shared ontologies and data
models
• It is built around committees of stakeholders
• It rests on governance.
Good Data Starts With Strong Definitions
Definitions must be detailed
Time zones and start points for days and
weeks are essential
Queries must be under version control
Limit redundant views of the “same” data
Multiple sources must agree
Data hygiene is a practice, not a project
The data committee meets twice a week
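To make the point about time zones and start points concrete, here is a minimal sketch of one shared definition of "day" and "week" that every query could import. The fixed UTC-5 offset and the Monday week start are assumptions chosen for illustration. A real definition would more likely use a named time zone so that daylight saving time is handled.

```python
from datetime import datetime, timedelta, timezone

LAB_TZ = timezone(timedelta(hours=-5))  # assumption: fixed UTC-5, no DST handling
WEEK_STARTS_ON = 0                      # assumption: Monday (date.weekday() == 0)


def lab_day(ts_utc):
    """Calendar date of a timezone-aware timestamp in the lab's time zone."""
    return ts_utc.astimezone(LAB_TZ).date()


def lab_week_start(ts_utc):
    """First day of the reporting week that contains the timestamp."""
    day = lab_day(ts_utc)
    offset = (day.weekday() - WEEK_STARTS_ON) % 7
    return day - timedelta(days=offset)


# 02:30 UTC on Monday 2022-05-02 is still Sunday evening at UTC-5,
# so it counts toward the previous lab day and the previous week.
ts = datetime(2022, 5, 2, 2, 30, tzinfo=timezone.utc)
print(lab_day(ts))         # 2022-05-01
print(lab_week_start(ts))  # 2022-04-25
```

Two dashboards that disagree on either constant will report different daily and weekly counts from the same events, which is why the definition belongs in one version-controlled place.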
Track KPIs and metrics station by station
Good metrics apply at many levels
The overall turnaround time should be
the sum of the component turnaround
times.
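One way to check that component turnaround times really do add up to the overall figure is sketched below. The station names, timestamps, and the `tat_report` function are hypothetical. The idea is simply to surface any time that no station accounts for.

```python
from datetime import datetime, timedelta


def tat_report(stations):
    """Compute per-station and overall turnaround time.

    `stations` is a list of (name, start, end) tuples in processing order.
    Returns (components, overall, unaccounted), where `unaccounted` is the
    part of the overall time that no station claims. Zero means the
    component times sum to the overall time.
    """
    components = {name: end - start for name, start, end in stations}
    overall = stations[-1][2] - stations[0][1]
    unaccounted = overall - sum(components.values(), timedelta())
    return components, overall, unaccounted


stations = [
    ("accessioning", datetime(2022, 5, 2, 9), datetime(2022, 5, 2, 11)),
    ("lab", datetime(2022, 5, 2, 11), datetime(2022, 5, 4, 11)),
    # Analysis starts two hours after the lab hands off: a gap nobody owns.
    ("analysis", datetime(2022, 5, 4, 13), datetime(2022, 5, 4, 19)),
]
components, overall, unaccounted = tat_report(stations)
print(overall, unaccounted)
```

A non-zero `unaccounted` value suggests that two stations define their start and end points differently, or that a handoff is not being measured at all.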
Post Analytical Dashboards
User interfaces need to change at scale …
People
Production Bioinformatics
Amateurs practice until they can play
it right
Professionals practice until they
cannot play it wrong.
How We Work: Transparency and Accountability
Production meetings twice a week
• Leads and managers required
• Everybody at Sema4 welcome
• Minutes on department wiki
Priorities:
• Current clinical samples (patients waiting)
• Commercial / research samples (client waiting)
• TAT / COGS improvements
• R&D obligations
The “must inform” principle
• Broccoli in the teeth? Anything askew?
• Boss seems factually incorrect?
• Something not working / going to break?
How do you make it hard to play wrong? (DevOps / CI version)
• All infrastructure is defined by version-controlled code
• All infrastructure is deployed entirely from code without manual intervention
• If the environment wanders, our first reaction is to redeploy.
• Dev/test/prod must be identical. No manual changes whatsoever.
• All software is containerized. The container is the delivery is the artifact.
• Containers are built once, then promoted through environments - not rebuilt
• Commits that fail unit tests are rejected, commits that fail integration tests are
rolled back.
• Never change prod manually. Not even once. Not even if it’s on fire.
• Monitoring is ubiquitous. All systems have observability hooks.
• Retro everything. Root cause everything. Write everything down.
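The "redeploy when the environment wanders" rule can be sketched as a comparison between the version-controlled definition and what is actually running. The config keys, the `fingerprint` and `next_action` helpers, and the image reference below are hypothetical. Real tooling would read both sides from the infrastructure-as-code system rather than from literal dictionaries.

```python
import hashlib
import json


def fingerprint(config):
    """Stable SHA-256 hash of a config dict, independent of key order."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def next_action(declared, observed):
    """Redeploy from code on any drift instead of patching by hand."""
    if fingerprint(declared) == fingerprint(observed):
        return "none"
    return "redeploy"


# Declared state comes from version control; observed state from the environment.
declared = {"image": "pipeline@sha256:abc123", "memory_gb": 16}
observed = {"memory_gb": 32, "image": "pipeline@sha256:abc123"}  # someone bumped RAM

print(next_action(declared, observed))  # redeploy
```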
Don’t set traps for the next shift!
It timed out:
• Rather than increasing the timeout
• Figure out why it is slowing down.
It needed a reboot:
• Rather than waiting for that to happen again
• Schedule automatic reboots or periodic
maintenance.
It ran out of memory:
• Okay fine, increase RAM on the VM but also
• Fix the memory leak
The configs were out of sync:
• Sure, get them in sync, but also
• Pick one as the source of truth and stop using the
other
It’s just a little change:
• This is a validated environment
• Also, Code Freeze Friday is a thing
No alerts or warnings!
• It’s quiet. Too quiet
• Build better, more sensitive alerts
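One common way to treat silence as a signal is a heartbeat check: every source is expected to report at some interval, and a missing report is itself an alert. The sketch below assumes hypothetical source names and a one-hour threshold. It is an illustration of the idea, not a description of the monitoring used at Sema4.

```python
from datetime import datetime, timedelta, timezone


def stale_sources(last_seen, now, max_silence):
    """Return the names of sources that have been quiet for too long.

    `last_seen` maps source name -> timestamp of its most recent heartbeat.
    """
    return sorted(name for name, ts in last_seen.items() if now - ts > max_silence)


now = datetime(2022, 5, 4, 12, 0, tzinfo=timezone.utc)
last_seen = {
    "uploader": now - timedelta(minutes=5),
    "demux": now - timedelta(hours=3),      # quiet. Too quiet.
    "job-dispatch": now - timedelta(minutes=50),
}

print(stale_sources(last_seen, now, max_silence=timedelta(hours=1)))
```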
“Perception of the risk associated with an activity often
decreases over a period of time when no losses occur even
though the real risk has not changed at all. This misperception
leads to reducing the very factors that are preventing accidents.”
Nancy Leveson - “Technical and Managerial Factors in the
NASA Challenger and Columbia Losses”
Closing Thoughts
• It’s an amazing time to be in this field.
• The next few years are going to be transformative.
• We will forge practical, data-driven connections
between clinical care and genomic / biomedical
research – leading to improved health outcomes and
lower costs.
• Let’s build it together: https://sema4.com/careers
• I’m interested in your thoughts:
chris.dwan@sema4.com
