Genome in a Bottle Consortium
August 2014
NIST, Gaithersburg, MD
Reference Materials for Clinical Applications of Human Genome
Sequencing
Marc Salit, Ph.D. and Justin Zook, Ph.D.
National Institute of Standards and Technology
Advances in Biological/Medical Measurement Science
(ABMS @ Stanford)
Genome in a Bottle
Consortium Development
• NIST met with sequencing technology
developers to assess standards needs
– Stanford, June 2011
• Open, exploratory workshop
– ASHG, Montreal, Canada
– October 2011
• Small, invitational workshop at NIST
to develop consortium for human
genome reference materials
– FDA, NCBI, NHGRI, NCI, CDC, Wash U,
Broad, technology developers, clinical
labs, CAP, PGP, Partners, ABRF, others
– developed draft work plan
– April 2012
• Open, public meeting at NIST to
formally establish consortium,
present draft work plan
– formed working groups
– identified candidate genomes
– established principles of:
• reference material selection
• characterization
• informatics
• performance metrics
– August 2012
• Open, public workshop at XGen
Congress
– March 2013
• Biannual workshops
– August 2013 at NIST
– January 2014 at Stanford
– August 2014 at NIST
– January 29-30, 2015 at Stanford
• Website
– www.genomeinabottle.org
Well-characterized, stable RMs
• Obtain metrics for validation,
QC, QA, PT
• Determine sources and types
of bias/error
• Learn to resolve difficult
structural variants
• Improve reference genome
assembly
• Optimization
– integration of data from
multiple platforms
– sequencing and analysis
• Enable regulated applications
Figure: Comparison of SNP Calls for
NA12878 on 2 platforms, 3
analysis methods
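A comparison like the one in this figure can be sketched as set operations over variant calls. The sketch below is illustrative only: the call-set names and variant records are made up, and variants are matched by exact (chrom, pos, ref, alt) identity, which is an assumption rather than the method behind the figure.

```python
from itertools import combinations

def concordance(call_sets):
    """Return the variants shared by every call set, plus pairwise Jaccard overlap.

    call_sets maps a label to a set of (chrom, pos, ref, alt) tuples.
    """
    names = list(call_sets)
    shared = set.intersection(*(call_sets[n] for n in names))
    pairwise = {}
    for a, b in combinations(names, 2):
        union = call_sets[a] | call_sets[b]
        overlap = call_sets[a] & call_sets[b]
        pairwise[(a, b)] = len(overlap) / len(union) if union else 1.0
    return shared, pairwise

# Hypothetical call sets; none of these records come from real data.
calls = {
    "platformA_caller1": {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr2", 50, "G", "A")},
    "platformA_caller2": {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T")},
    "platformB_caller1": {("chr1", 100, "A", "G"), ("chr2", 50, "G", "A"), ("chr3", 7, "T", "C")},
}

shared, pairwise = concordance(calls)
print(sorted(shared))
for pair, score in pairwise.items():
    print(pair, round(score, 2))
```

Exact matching is the simplest choice here; real comparisons usually need variant representations to be normalized first, since the same indel can be written in more than one way.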
Measurement Process
Sample
gDNA isolation
Library Prep
Sequencing
Alignment/Mapping
Variant Calling
Confidence Estimates
Downstream Analysis
• gDNA reference
materials will be
developed to
characterize
performance of a part
of process
– materials will be
certified for their
variants against a
reference sequence,
with confidence
estimates
• NIST working with GiaB
to select genomes
• Current plan
– NA12878 HapMap
sample as Pilot sample
• part of 17-member
pedigree
– trios from PGP as more
complete set
• 2 trios, focus on children
• varying biogeographic
ancestry
12889 12890 12891 12892
12877 12878
12879 12880 12881 12882 12883 12884 12885 12886 12887 12888 12893
CEPH Utah Pedigree 1463
Putting “Genomes” in Bottles
11 children, Birth Order Redacted
Genome in a Bottle Working Groups
Reference Material
Selection
& Design
Andrew Grupe,
Celera
•Develop prioritized list
of whole human
genomes for Reference
Materials
•Identify candidate
approaches and
materials for artificial
RMs
•Develop prioritized
list
Measurements for
Reference Material
Characterization
Mike Eberle, Illumina
•Develop consensus
plan for experimental
characterization of
Reference Materials
Bioinformatics,
Data Integration,
and Data
Representation
Steve Sherry, NCBI
•Develop plan for
integrating
experimental data and
forming consensus
variant calls and
confidence estimates
•Develop consensus
plan for data
representation
Performance Metrics
& Figures of Merit
Deanna Church,
Personalis
•User interface to the
Genome-in-a-Bottle
Reference Material
•“Dashboard”
•what an end user will
see and report to
understand and
describe the
performance of their
experiment
•variant call accuracy
•process performance
measures to enable
optimization
Update
Zook et al., Nature Biotechnology, 2014.
• methods to develop
SNP/indel call set
described in manuscript
• broad and quick
adoption of call set for
benchmarking
– struck nerve
• use scenarios a
highlight of this
workshop
Preliminary uses of high-confidence
NIST-GIAB genotypes for NA12878
• NIST has released
several versions of
high-confidence genotypes
for its pilot RM
• These data are
presently being used for
benchmarking
– prior to release of RMs
– SNPs & indels
• ~77% of the genome
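Benchmarking against a high-confidence call set restricted to confident regions can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the region intervals, and all variant records are hypothetical, positions are 0-based with half-open intervals, and matching is by exact (chrom, pos, ref, alt) identity without genotype comparison. It is not the consortium's benchmarking method.

```python
from bisect import bisect_right

def in_regions(regions, chrom, pos):
    """True if pos falls in a half-open [start, end) interval for chrom.

    Intervals for each chromosome must be sorted and non-overlapping.
    """
    intervals = regions.get(chrom, [])
    starts = [start for start, _ in intervals]
    i = bisect_right(starts, pos) - 1
    return i >= 0 and intervals[i][0] <= pos < intervals[i][1]

def benchmark(truth, query, regions):
    """Count TP/FP/FN inside the confident regions and derive two metrics."""
    t = {v for v in truth if in_regions(regions, v[0], v[1])}
    q = {v for v in query if in_regions(regions, v[0], v[1])}
    tp, fp, fn = len(t & q), len(q - t), len(t - q)
    return {
        "tp": tp,
        "fp": fp,
        "fn": fn,
        "sensitivity": tp / (tp + fn) if tp + fn else None,
        "precision": tp / (tp + fp) if tp + fp else None,
    }

# Hypothetical confident regions and calls, for illustration only.
regions = {"chr1": [(0, 1000), (2000, 3000)]}
truth = {("chr1", 100, "A", "G"), ("chr1", 500, "G", "C"), ("chr1", 2500, "C", "T")}
query = {
    ("chr1", 100, "A", "G"),
    ("chr1", 2500, "C", "T"),
    ("chr1", 700, "T", "C"),   # inside regions, not in truth: false positive
    ("chr1", 1500, "G", "A"),  # outside regions: not assessed
}

result = benchmark(truth, query, regions)
print(result)
```

Calls outside the confident regions are left out of both counts, so the metrics describe performance only on the characterized part of the genome.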
Highlights
This workshop
• release plans for pilot RM
• characterization plans for
HG-002 – HG-005
• prioritization of next
genomes
• collaborations
– ABRF
– Global Alliance
Upcoming
• materials and methods to
support somatic variant
calling
• integrating structural
variants into the GIAB call
sets
• crowdsourcing analysis
• data analysis/code
jamboree
Agenda
Thursday
• Welcome and Status Update
• Charge to Working Groups
• Break
• Working Group Breakout
Discussions
• Lunch (in NIST cafeteria)
• FDA Perspective on Future
Needs
• ABRF interlaboratory NGS
study
• Break
• Working Group Breakout
Discussions continued
Friday
• Working groups meet if
needed
• Working Group leaders
present plans and discussion
• Break
• Discussion of NIST Reference
Material Development plans
• Discussion of Steering
Committee agenda items
• Everyone adjourn except
steering committee
• Steering committee meet over
lunch
Agenda
Monday
• Breakfast and registration
• Welcome and Context Setting
• NIST RM Update and Status Report
• Charge to Working Groups
• Coffee Break
• Working Group Breakout Discussions
• Lunch (provided)
• Informal Working Group Reports
• Coffee Break
• Breakout Topical Discussions
– Topic #1: Moving beyond the 'easy'
variants and regions of the genome
– Topic #2: Selecting future genomes for
Reference Materials
Tuesday
• Breakfast and registration
• Use cases: Experiences using the pilot
Reference Material
• Discussion of plans to release pilot
Reference Material
• Coffee Break
• Working Group Breakout discussions
• Lunch (provided)
• Working Group leaders present plans
and discussion
• Steering committee Overview
• First meeting of the Steering
Committee (others adjourn)
Please Note
Slides will be made available on SlideShare after
the workshop (see genomeinabottle.org).
Tweets are welcome unless the speaker requests
otherwise. Please use #giab as the hashtag.
Plenary sessions are being broadcast as a
webinar. Questions from webinar attendees can
be submitted via chat
NIST Reference Materials
Pilot RM - NA12878
• 8300 vials (10 µg each) of NA12878
gDNA @ NIST 4/2013
– Available for sequencing by
GIAB participants
– target for release as NIST RM
12/2014
• SNPs, small indels
• Sequenced at 6 labs
– 4 technologies, multiple modes
• Received “Human Subjects
Approval” for release of
NA12878 as NIST RM
Personal Genome Project
• Ashkenazim trio DNA
received Feb 2014
• Asian son DNA received Feb
2014
– Parents’ cell lines at Coriell
• “Human subjects review”
approval for release of PGP
genomes as NIST RMs
• What future RMs are
needed?
Consenting Genomes for use as
Reference Materials
• Risk of re-identification
– this is a real risk
– privacy
– implications for family members
• Meaning of possibility of withdrawal
• Commercial application
– indirect, research
– direct, derived products
• PGP project currently state-of-art
– broad and direct
– test to demonstrate understanding
• “Wild West”
• Coriell MTA for PGP genomes now
explicitly permits commercial
redistribution/modification/…
AMP Survey Using Pilot RM
AMP Members International
AMP Survey Future RM needs
AMP Members International
