genomeinabottle.org
Genome in a Bottle Consortium
GIAB/GRC Pre-ASHG Workshop
October 5, 2015
Reference Materials for Clinical Applications of
Human Genome Sequencing
Justin Zook and Marc Salit
National Institute of Standards and Technology
Sequencing technologies and
bioinformatics pipelines disagree
O’Rawe et al. Genome Medicine 2013, 5:28
Who is right?
Is anyone right?
GIAB Scope
• The Genome in a Bottle Consortium is
developing the reference materials, reference
methods, and reference data needed to
assess confidence in human genome variant
calls.
• A principal motivation for this consortium is to
enable performance assessment of
sequencing and science-based regulatory
oversight of clinical sequencing.
Well-characterized, stable RMs
• Obtain metrics for validation,
QC, QA, PT
• Determine sources and types
of bias/error
• Learn to resolve difficult
structural variants
• Improve reference genome
assembly
• Optimization
– integration of data from
multiple platforms
– sequencing and analysis
• Enable regulated applications
[Figure: Comparison of SNP Calls for NA12878 on 2 platforms, 3 analysis methods]
NGS Validation Process using
Genomes in Bottles
[Figure: workflow steps: Sample, gDNA isolation, Library Prep, Sequencing, Alignment/Mapping, Variant Calling, Confidence Estimates, Downstream Analysis. Labels: Pre-Analytical Process, Analytical Process, Clinical Interpretation, Genome in a Bottle Scope, GIAB Data]
Genome in a Bottle Consortium (GIAB)
Hosted by US National Institute of Standards and Technology
Goal: Provide infrastructure to assess
confidence in human variant calls
• Appropriately consented, widely
available DNA samples, distributed by
the Coriell Institute
– Also, QCed Reference Material (RM)
versions from controlled lots will be
available from NIST
– Also, PGP samples are commercially
available
• High-accuracy reference data for these
samples
• Tools to facilitate their use
– With the Global Alliance Data Working
Group Benchmarking Team
ga4gh.org
GIAB Selected Samples
CEPH/Utah Pedigree 1463
– Grandparents: NA12889, NA12890, NA12891, NA12892
– Parents: NA12877, NA12878 (NA12878 available as NIST RM8398)
– Children: NA12879, NA12880, NA12881, NA12882, NA12883, NA12884, NA12885, NA12886, NA12887, NA12888, NA12893
Ashkenazi Jewish Trio (new; Personal Genome Project)
– Parents: NA24149, NA24143
– Son: NA24385
Asian (Han Chinese) Trio (new; Personal Genome Project)
– Parents: NA24694, NA24695
– Son: NA24631
Note: Illumina and RTG have used data from the pedigree
to improve variant calls in the specific GIAB samples.
NIST Human Genome
Reference Materials (RMs)
• NIST RM 8398 is available!
– tinyurl.com/giabpilot
– DNA isolated from large
growth cell cultures
– Stable, homogeneous
– Best for regulated uses
– DNA from same cell line at
Coriell (NA12878)
• New AJ and Asian Samples
– Available from Coriell now
– NIST RM available in 2016
Integrated 14 datasets from 5 platforms
to establish Reference SNP/indel Calls for
NA12878
Zook et al., Nature Biotechnology, 2014.
Integration Methods to Establish
Reference Variant Calls for NA12878
Candidate Variants from Each Platform
Identify Concordant Variants
Identify Characteristics of Systematic Error
Arbitrate Using Evidence of Systematic Error
Exclude regions potentially biased for all short
reads (e.g., repeats, SVs)
Zook et al., Nature Biotechnology, 2014.
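The first two steps above (collect candidate variants from each platform, then identify the concordant ones) can be illustrated with a minimal sketch. It assumes each platform's calls have already been reduced to a mapping from a variant key to a genotype string. The function name, the data layout, and the "all platforms must agree" rule are illustrative assumptions, not the method of Zook et al.; the published integration goes on to arbitrate discordant sites using evidence of systematic error, which this sketch does not attempt.

```python
from collections import Counter


def split_by_concordance(callsets):
    """Split candidate sites into concordant and discordant sets.

    callsets: dict mapping a platform name to a dict of
              {(chrom, pos, ref, alt): genotype string}.
    A site counts as concordant only if every platform reports it
    with the same genotype (a hypothetical rule for illustration).
    """
    # Union of all candidate sites seen on any platform.
    all_sites = set()
    for calls in callsets.values():
        all_sites.update(calls)

    concordant, discordant = {}, {}
    for site in sorted(all_sites):
        # None marks a platform that made no call at this site.
        genotypes = [calls.get(site) for calls in callsets.values()]
        if None not in genotypes and len(set(genotypes)) == 1:
            concordant[site] = genotypes[0]
        else:
            # Keep the per-genotype tally for later arbitration.
            discordant[site] = Counter(genotypes)
    return concordant, discordant
```

The discordant sites returned here are the ones that would feed the arbitration step; regions that are difficult for all short-read platforms would still need to be excluded separately.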
Assigning confidence to genomic
regions for NA12878
High-confidence (77%)
• Platforms agree or we
understand the systematic
biases causing
disagreement
• At least some methods have
no evidence of systematic
errors
• Mendelian inheritance
consistent
Lower confidence (23%)
• In a region known to be
difficult for current
technologies
– Segmental Dups
– Repeats, Low Complexity
– High/Low GC
– Etc.
• Evidence of systematic error
across many platforms
• Inconsistent inheritance
Zook et al., Nature Biotechnology, 2014.
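The Mendelian inheritance criterion can be sketched as a simple check on a trio of genotypes. This is a minimal illustration, assuming diploid genotype strings such as "0/1" and ignoring de novo mutations, sex chromosomes, and missing alleles; the function name is hypothetical and this is not presented as the check used to build the NA12878 calls.

```python
from itertools import product


def is_mendelian_consistent(child, father, mother):
    """Return True if the child's genotype can be formed by taking
    one allele from the father and one allele from the mother.

    Genotypes are diploid strings such as "0/1" or "1|2".
    """
    def alleles(genotype):
        # Treat phased ("|") and unphased ("/") separators alike.
        return genotype.replace("|", "/").split("/")

    child_alleles = sorted(alleles(child))
    # Try every combination of one paternal and one maternal allele.
    return any(
        sorted(pair) == child_alleles
        for pair in product(alleles(father), alleles(mother))
    )
```

For example, a child called 1/1 with parents 0/0 and 0/1 fails the check, which flags either a genotyping error in one of the three samples or a site in a difficult region.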
Using high-confidence NIST-GIAB
genotypes for NA12878
• NIST has released several versions of
high-confidence genotypes
for its pilot RM
• These data are
presently being used for
benchmarking
– prior to release of RMs
– SNPs & indels
• ~77% of the genome
• Data on FTP now well-organized
GeT-RM Browser from NCBI and CDC
• http://www.ncbi.nlm.nih.gov/variation/tools/get-rm/
• Allows visualization of the data underlying each call
Uses of GIAB NA12878
Oncology – Molecular and Cellular Tumor Markers
“Next Generation” Sequencing (NGS) guidelines for
somatic genetic variant detection
www.bioplanet.com/gcat
Global Alliance for Genomics and Health
Benchmarking Task Team
• Formed June 2014 to develop
methods and tools for comparing
variant calls to a benchmark
• Developed standardized definitions
for performance metrics like TP, FP,
and FN.
• Initial focus on germline SNPs/indels
• Developing benchmarking tools
• Comparison engine
• Pluggable web interface with
modules for:
• Reporting/calculation of metrics
• Visualization/user interface
• Working with Genome in a Bottle
Consortium to host data and calls
from their well-characterized
genomes
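Once TP, FP, and FN counts are defined, summary metrics follow directly. The sketch below uses the usual definitions of precision and recall (sensitivity); the function name and the choice of returned metrics are assumptions for illustration and are not taken from the Benchmarking Task Team's tools.

```python
def benchmark_metrics(tp, fp, fn):
    """Compute precision, recall, and F1 from benchmark counts.

    tp: calls matching the benchmark (true positives)
    fp: calls absent from the benchmark (false positives)
    fn: benchmark variants that were missed (false negatives)
    Returns None for a metric whose denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    if precision is None or recall is None or (precision + recall) == 0:
        f1 = None
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}
```

These numbers are only meaningful inside the high-confidence regions, since calls outside them cannot be classified as true or false against the benchmark.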
www.bioplanet.com/gcat
Example User Interface
Global Alliance for Genomics and Health
Benchmarking Task Team
Credit: Rebecca Truty, Complete Genomics
How should we interpret this complex variant on chr21?
Global Alliance for Genomics and Health
Benchmarking Task Team
Credit: Rebecca Truty, Complete Genomics
Beyond simple T/F classification: Genotype errors
Truth | Callset | Description | Proposed Name(s) | CM#1 region match | CM#2 allele match | CM#3 genotype match
0/1 | 1/1 | zygosity/genotype error | GE | TP | 1TP, 1GE | FN
1/1 | 0/1 | zygosity/genotype error | GE | TP | 1TP, 1GE | FN
1/2 | 0/1, 1/1, 0/2, 2/2 | common allele, FN allele | GE_FN | TP | 1TP, 1GE, 1FN | FN
0/1 | 1/2 | common allele, FP allele | GE_FP | TP | 1TP, 1GE, 1FP | FP, FN
1/1 | 1/2 | common allele, FP allele | GE_FP | TP | 1TP, 1GE, 1FP | FP, FN
1/2 | 1/3 | common allele, FP allele, FN allele | GE_FP_FN | TP | 1TP, 1GE, 1FP, 1FN | FP, FN
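The allele-match column (CM#2) of the genotype-error table can be reproduced with a short sketch. The rule used here is inferred from the table rows: each non-reference truth allele is a TP if the callset contains it and an FN otherwise, each non-reference called allele missing from the truth is an FP, and a genotype error (GE) is added when at least one allele matches but the genotypes differ. The function name is hypothetical, and no-calls or half-calls (".") are not handled.

```python
def allele_match_counts(truth, call):
    """Count allele-level TP/FP/FN and genotype errors (GE) at one site.

    truth, call: unphased diploid genotype strings such as "0/1" or "1/2",
                 where "0" is the reference allele.
    Returns a dict containing only the nonzero counts.
    """
    truth_alleles = truth.split("/")
    call_alleles = call.split("/")

    # Distinct non-reference alleles on each side.
    truth_alt = {a for a in truth_alleles if a != "0"}
    call_alt = {a for a in call_alleles if a != "0"}

    counts = {
        "TP": len(truth_alt & call_alt),  # alleles found in both
        "FN": len(truth_alt - call_alt),  # truth alleles that were missed
        "FP": len(call_alt - truth_alt),  # called alleles not in the truth
        "GE": 0,
    }
    # A shared allele with a different overall genotype is a genotype error.
    if counts["TP"] and sorted(truth_alleles) != sorted(call_alleles):
        counts["GE"] = 1
    return {name: n for name, n in counts.items() if n}
```

Under this rule, truth 1/2 against call 1/3 yields one TP, one FP, one FN, and one GE, matching the GE_FP_FN row.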
Global Alliance for Genomics and Health
Benchmarking Task Team
Credit: Rebecca Truty, Complete Genomics
Beyond simple T/F classification: no-calls and half-calls
Truth | Callset | Description | Proposed Name(s) | CM#1 region match | CM#2 allele match | CM#3 genotype match
0/1 | ./1 | half-call, TP allele | HC_TP | NC, NCV, TP | 1NC, 1NCV, 1TP, 1GE | TP
1/1 | ./1 | half-call, TP allele | HC_TP | NC, NCV, TP | 1NC, 1NCV, 1TP, 1GE | FN
0/1, 1/1 | ./0 | half-call, FN allele(s) | HC_FN | NC, NCV, TP | 1NC, 1NCV, 1FN | FN
1/2 | ./0 | half-call, FN allele(s) | HC_FN | NC, NCV, TP | 1NC, 2NCV, 2FN | FN
1/2 | ./1, ./2 | half-call, TP allele, FN allele | HC_TP_FN | NC, NCV, TP | 1NC, 1NCV, 1TP, 1GE, 1FN | FN
Stratifying False Positives
[Figure panels: GC Content; TR Unit 1; TR Unit 2; TR Unit 3; TR Unit 4; TR Unit <7; TR Unit >=7]
Credit: Abby Beeler, Ellie Wood
GA4GH - Stratification
Public data from GIAB AJ PGP Trio
Long reads/“Linked” reads
• ~70/30/30x PacBio
– ~11kb N50
• BioNano
• 10X Genomics
• Moleculo
• Complete Genomics LFR
• Oxford Nanopore
Short reads
• 300x Illumina paired-end
• 15x Illumina 6kb mate-pair
• Complete Genomics
• SOLiD 5500W
• Ion Proton Exome
http://biorxiv.org/content/early/2015/09/15/026468
GIAB Analysis Group – New Data Sets
Leaders
• Francisco de la Vega
– Annai Systems
• Chris Mason
– Weill Cornell Medical Center
• Tina Graves
– Washington University
• Valerie Schneider
– NCBI
• and Justin and Marc
Status
• Analysis Group Responsibilities:
– https://docs.google.com/document/d/10eA0DwB4iYTSFM_LPO9_2LyyN2xEqH49OXHhtNH1uzw/edit?usp=sharing
• Analysis Milestones:
– https://docs.google.com/spreadsheets/d/1Pj4nSzH742g40wJz2fA6f8kFtZYAToZpSZYVPiC5st4/edit?usp=sharing
• Analysis Methods
– https://docs.google.com/spreadsheets/d/1Je2g85H7oK6kMXbBOoqQ1FMNrvGnFuUJTJn7deyYiS8/edit?usp=sharing
• Analysis Plan:
– https://drive.google.com/file/d/0B7Ao1qqJJDHQdnVEaVdqbWdEdkE/view?usp=sharing
• Collecting Data and analyses on GIAB
FTP Site
• Recruiting people to help with the
work.
Goal: Establish and distribute a set of authoritative benchmark variant calls of all
types and sizes, as well as homozygous reference regions, on GIAB PGP trios
Data Release Policy: Real-time,
Open, Public Release
Individual Datasets
• Uploaded to GIAB FTP site
as it is collected
• Includes raw reads, aligned
reads, and
variant/reference calls
Integrated High-confidence Calls
• First develop SNP, indel,
and homozygous reference
calls
• Then develop SV and non-
SV calls
• Released calls are versioned
• Preliminary callsets will be
made available to be
critiqued
Analysis Progress: AJ Trio
• SNPs/indels
– Several candidate callsets
– NIST working on integration
– Plan to use 10X/Moleculo/PacBio for difficult-to-map regions
• Assembly
– 2 de novo assemblies of AJ trio (MHAP/PBcR and Falcon/Bionano)
– Will be used by at least 2 groups for SV calling
• Structural variants
– Candidate calls being generated by 15+ groups with >20 different
algorithms and 6 datasets
– 3 integration methods: Bina-MetaSV, DNAnexus/Baylor-
Parliament, NIST-svclassify
– Parliament: ~7k SVs with evidence in PacBio and Illumina
• Long-range Phasing
– 2 phased calls so far (CG LFR and 10X)
– Integration methods needed
Proposed approach to form high-
confidence SV (and non-SV) calls
Generate candidate calls from multiple
methods
Compare/evaluate calls using
Parliament/MetaSV/svclassify/others?;
manually inspect discordant calls
Integrate new and revised calls
Combine integrated calls (with heuristics
and/or machine learning) to generate high-
confidence calls
August 30, 2015
Nov 1, 2015
Jan 1, 2016
Jan 26, 2016
Acknowledgments
• FDA – Elizabeth
Mansfield, Computing
staff
• Many members of
Genome in a Bottle
– New members
welcome!
– Sign up on website for
email newsletters
Steering Committee
– Marc Salit
– Justin Zook
– David Mittelman
– Andrew Grupe
– Michael Eberle
– Steve Sherry
– Deanna Church
– Francisco De La Vega
– Christian Olsen
– Monica Basehore
– Lisa Kalman
– Christopher Mason
– Elizabeth Mansfield
– Liz Kerrigan
– Leming Shi
– Melvin Limson
– Alexander Wait Zaranek
– Nils Homer
– Fiona Hyland
– Steve Lincoln
– Don Baldwin
– Robyn Temple-Smolkin
– Chunlin Xiao
– Kara Norman
– Luke Hickey
For More Information
www.genomeinabottle.org - sign up for general GIAB and Analysis
Team google group emails
www.bioplanet.com/gcat - exome comparison tool
www.ncbi.nlm.nih.gov/variation/tools/get-rm/ - Get-RM Browser
Data: http://biorxiv.org/content/early/2015/09/15/026468
Global Alliance Benchmarking work group
– ga4gh.org/#/benchmarking-team
Twice yearly workshop
– Winter: January 28-29, 2016 at Stanford University, California, USA
– Summer at NIST, Maryland, USA
Public Meetings
Justin Zook: jzook@nist.gov
Marc Salit: salit@nist.gov

GIAB-GRC workshop oct2015 giab introduction 151005

  • 1. genomeinabottle.org Genome in a Bottle Consortium GIAB/GRC Pre-ASHG Workshop October 5, 2015 Reference Materials for Clinical Applications of Human Genome Sequencing Justin Zook and Marc Salit National Institute of Standards and Technology
  • 2. genomeinabottle.org Sequencing technologies and bioinformatics pipelines disagree O’Rawe et al. Genome Medicine 2013, 5:28
  • 3. genomeinabottle.org Sequencing technologies and bioinformatics pipelines disagree O’Rawe et al. Genome Medicine 2013, 5:28 Who is right? Is anyone right?
  • 4. genomeinabottle.org GIAB Scope • The Genome in a Bottle Consortium is developing the reference materials, reference methods, and reference data needed to assess confidence in human genome variant calls. • A principal motivation for this consortium is to enable performance assessment of sequencing and science-based regulatory oversight of clinical sequencing.
  • 5. genomeinabottle.org Well-characterized, stable RMs • Obtain metrics for validation, QC, QA, PT • Determine sources and types of bias/error • Learn to resolve difficult structural variants • Improve reference genome assembly • Optimization – integration of data from multiple platforms – sequencing and analysis • Enable regulated applications Comparison of SNP Calls for NA12878 on 2 platforms, 3 analysis methods
  • 6. genomeinabottle.org NGS Validation Process using Genomes in Bottles Sample gDNA isolation Library Prep Sequencing Alignment/Mapping Variant Calling Confidence Estimates Downstream Analysis Analytical Process Genome in a Bottle Scope Pre-Analytical Process Clinical Interpretation GIAB Data
  • 7. genomeinabottle.org Genome in a Bottle Consortium (GIAB) Hosted by US National Institute of Standards and Technology Goal: Provide infrastructure to assess confidence in human variant calls • Appropriately consented widely available DNA samples, distributed by the Coriell Institute – Also, QCed Reference Material (RM) versions from controlled lots will be available from NIST – Also, PGP samples are commercially available • High-accuracy reference data for these samples • Tools to facilitate their use – With the Global Alliance Data Working Group Benchmarking Team ga4gh.org
  • 8. genomeinabottle.org GIAB Selected Samples CEPH/Utah Pedigree 1463 ✔ NA1288 9 NA12879 NA12890 NA12880 NA12881 NA12882 NA12883 NA12884 NA12885 NA12886 NA12887 NA12888 NA12893 NA12877 NA12878 NA12891 NA12892 ✔ ✔ NA24149 NA24143 NA24385 Ashkenazi Jewish Trio ✔ NA24694 NA24695 NA24631 Asian (Han Chinese) Trio ✔ Note: Illumina and RTG have used data from the pedigree to improve variant calls in the specific GIAB samples. New New Personal Genome Project Available as NIST RM8398
  • 9. genomeinabottle.org NIST Human Genome Reference Materials (RMs) • NIST RM 8398 is available! – tinyurl.com/giabpilot – DNA isolated from large growth cell cultures – Stable, homogeneous – Best for regulated uses – DNA from same cell line at Coriell (NA12878) • New AJ and Asian Samples – Available from Coriell now – NIST RM available in 2016
  • 10. genomeinabottle.org Integrated 14 datasets from 5 platforms to establish Reference SNP/indel Calls for NA12878 Zook et al., Nature Biotechnology, 2014.
  • 11. genomeinabottle.org Integration Methods to Establish Reference Variant Calls for NA12878 • Candidate Variants from Each Platform → Identify Concordant Variants → Identify Characteristics of Systematic Error → Arbitrate Using Evidence of Systematic Error → Exclude regions potentially biased for all short reads (e.g., repeats, SVs) • Zook et al., Nature Biotechnology, 2014.
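The integration steps on this slide can be illustrated with a toy sketch. This is not the pipeline from Zook et al.; it only mirrors the order of the steps. The function name `integrate_calls`, the site keys, the `suspect` set of (platform, site) pairs, and the choice to treat a missing call as homozygous reference (`0/0`) are all assumptions made for illustration.

```python
def integrate_calls(calls_by_platform, suspect, excluded_sites):
    """Toy integration of per-platform genotype calls.

    calls_by_platform: dict of platform -> {site: genotype string}
    suspect: set of (platform, site) pairs showing evidence of systematic error
    excluded_sites: sites in regions considered biased for all short reads
    Returns (high_confidence, uncertain): a dict of site -> genotype, and a set
    of sites that could not be arbitrated.
    """
    # Step 1: gather candidate sites from every platform.
    all_sites = set()
    for calls in calls_by_platform.values():
        all_sites.update(calls)

    high_confidence, uncertain = {}, set()
    for site in sorted(all_sites):
        # Final step on the slide: drop regions biased for all short reads.
        if site in excluded_sites:
            continue
        # ASSUMPTION: a platform with no call at a site is treated as 0/0.
        genotypes = {
            platform: calls.get(site, "0/0")
            for platform, calls in calls_by_platform.items()
        }
        # Step 2: concordant sites are accepted directly.
        if len(set(genotypes.values())) == 1:
            high_confidence[site] = next(iter(genotypes.values()))
            continue
        # Steps 3-4: arbitrate by ignoring platforms flagged as suspect here.
        trusted = {g for p, g in genotypes.items() if (p, site) not in suspect}
        if len(trusted) == 1:
            high_confidence[site] = trusted.pop()
        else:
            uncertain.add(site)
    return high_confidence, uncertain


calls = {
    "A": {"chr1:100": "0/1", "chr1:200": "1/1", "chr1:300": "0/1"},
    "B": {"chr1:100": "0/1", "chr1:200": "0/1", "chr1:300": "0/1", "chr1:400": "0/1"},
}
high, unsure = integrate_calls(calls, {("B", "chr1:200")}, {"chr1:300"})
print(high)    # {'chr1:100': '0/1', 'chr1:200': '1/1'}
print(unsure)  # {'chr1:400'}
```

Sites that remain discordant after arbitration are reported separately rather than forced into a consensus, which corresponds to the lower-confidence category described on the next slide.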
  • 12. genomeinabottle.org Assigning confidence to genomic regions for NA12878 High-confidence (77%) • Platforms agree or we understand the systematic biases causing disagreement • At least some methods have no evidence of systematic errors • Mendelian inheritance consistent Lower confidence (23%) • In a region known to be difficult for current technologies – Segmental Dups – Repeats, Low Complexity – High/Low GC – Etc. • Evidence of systematic error across many platforms • Inconsistent inheritance Zook et al., Nature Biotechnology, 2014.
  • 13. genomeinabottle.org Using high-confidence NIST-GIAB genotypes for NA12878 • NIST has released several versions of high-confidence genotypes for its pilot RM • These data are presently being used for benchmarking – prior to release of RMs – SNPs & indels • ~77% of the genome • Data on FTP now well-organized
  • 14. genomeinabottle.org GeT-RM Browser from NCBI and CDC • http://www.ncbi.nlm.nih.gov/variation/tools/get-rm/ • Allows visualization of the data underlying each call
  • 15. genomeinabottle.org Uses of GIAB NA12878 Oncology – Molecular and Cellular Tumor Markers “Next Generation” Sequencing (NGS) guidelines for somatic genetic variant detection www.bioplanet.com/gcat
  • 16. genomeinabottle.org Global Alliance for Genomics and Health Benchmarking Task Team • Formed June 2014 to develop methods and tools for comparing variant calls to a benchmark • Developed standardized definitions for performance metrics like TP, FP, and FN. • Initial focus on germline SNPs/indels • Developing benchmarking tools • Comparison engine • Pluggable web interface with modules for: • Reporting/calculation of metrics • Visualization/user interface • Working with Genome in a Bottle Consortium to host data and calls from their well-characterized genomes www.bioplanet.com/gcat Example User Interface
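Once TP, FP, and FN counts are defined consistently, the usual summary statistics follow directly. The sketch below is not the Task Team's tool; the function name `summarize` and the example counts are hypothetical.

```python
def summarize(tp: int, fp: int, fn: int) -> dict:
    """Derive precision, recall (sensitivity), and F1 from benchmark counts."""
    nan = float("nan")
    precision = tp / (tp + fp) if (tp + fp) else nan
    recall = tp / (tp + fn) if (tp + fn) else nan
    denom = precision + recall
    # A NaN or zero denominator fails the comparison, leaving F1 undefined.
    f1 = 2 * precision * recall / denom if denom > 0 else nan
    return {"precision": precision, "recall": recall, "f1": f1}


stats = summarize(tp=90, fp=10, fn=30)
print(stats["precision"], stats["recall"])  # 0.9 0.75
```

Undefined ratios are returned as NaN rather than zero, so an empty callset is not mistaken for a perfectly wrong one.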
  • 17. genomeinabottle.org Global Alliance for Genomics and Health Benchmarking Task Team Credit: Rebecca Truty, Complete Genomics How should we interpret this complex variant on chr21?
  • 18. genomeinabottle.org Global Alliance for Genomics and Health Benchmarking Task Team Credit: Rebecca Truty, Complete Genomics Beyond simple T/F classification: Genotype errors • Columns: Truth vs. Callset | Description | Proposed Name(s) | CM#1 region match | CM#2 allele match | CM#3 genotype match • Truth 0/1 vs. callset 1/1, or truth 1/1 vs. callset 0/1 | zygosity/genotype error | GE | TP | 1TP, 1GE | FN • Truth 1/2 vs. callset 0/1, 1/1, 0/2, or 2/2 | common allele, FN allele | GE_FN | TP | 1TP, 1GE, 1FN | FN • Truth 0/1 or 1/1 vs. callset 1/2 | common allele, FP allele | GE_FP | TP | 1TP, 1GE, 1FP | FP, FN • Truth 1/2 vs. callset 1/3 | common allele, FP allele, FN allele | GE_FP_FN | TP | 1TP, 1GE, 1FP, 1FN | FP, FN
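The proposed names on this slide (GE, GE_FN, GE_FP, GE_FP_FN) follow from comparing the sets of non-reference alleles in the truth and callset genotypes. The sketch below is one way to express that rule, assuming VCF-style genotype strings with no missing alleles. The function names and the `MATCH` and `NO_COMMON_ALLELE` labels are hypothetical and are not part of the proposal.

```python
import re


def _alleles(genotype: str) -> list:
    """Split a VCF-style genotype such as '0/1' or '1|2' into allele indices."""
    return [int(a) for a in re.split(r"[/|]", genotype)]


def classify_genotype_error(truth: str, call: str) -> str:
    """Name the mismatch between a truth genotype and a callset genotype."""
    if sorted(_alleles(truth)) == sorted(_alleles(call)):
        return "MATCH"  # hypothetical label for identical genotypes
    truth_alts = {a for a in _alleles(truth) if a != 0}
    call_alts = {a for a in _alleles(call) if a != 0}
    if not (truth_alts & call_alts):
        return "NO_COMMON_ALLELE"  # hypothetical label; case not on the slide
    name = "GE"
    if call_alts - truth_alts:   # allele called but absent from truth
        name += "_FP"
    if truth_alts - call_alts:   # truth allele missing from the call
        name += "_FN"
    return name


print(classify_genotype_error("0/1", "1/1"))  # GE
print(classify_genotype_error("1/2", "1/3"))  # GE_FP_FN
```

No-calls (`.`) are deliberately not handled here and raise a `ValueError`, since the slide treats them as a separate category.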
  • 19. genomeinabottle.org Global Alliance for Genomics and Health Benchmarking Task Team Credit: Rebecca Truty, Complete Genomics Beyond simple T/F classification: no-calls and half-calls • Columns: Truth vs. Callset | Description | Proposed Name(s) | CM#1 region match | CM#2 allele match | CM#3 genotype match • Half-call, TP allele | HC_TP | NC, NCV, TP – Truth 0/1 vs. callset ./1 | 1NC, 1NCV, 1TP, 1GE | TP – Truth 1/1 vs. callset ./1 | 1NC, 1NCV, 1TP, 1GE | FN • Half-call, FN allele(s) | HC_FN | NC, NCV, TP – Truth 0/1 or 1/1 vs. callset ./0 | 1NC, 1NCV, 1FN | FN – Truth 1/2 vs. callset ./0 | 1NC, 2NCV, 2FN | FN • Half-call, TP allele, FN allele | HC_TP_FN | NC, NCV, TP – Truth 1/2 vs. callset ./1 or ./2 | 1NC, 1NCV, 1TP, 1GE, 1FN | FN
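The half-call names on this slide (HC_TP, HC_FN, HC_TP_FN) can be derived from the single called allele and the truth genotype. The following is a rough sketch under the assumption of diploid VCF-style genotypes. The function name and the `OUTSIDE_TABLE` label are hypothetical, and cases the slide does not list are deliberately left unclassified.

```python
def classify_half_call(truth: str, call: str) -> str:
    """Name a half-call (one allele called, one missing) against the truth."""
    call_alleles = call.replace("|", "/").split("/")
    called = [a for a in call_alleles if a != "."]
    if len(call_alleles) != 2 or len(called) != 1:
        raise ValueError("expected a diploid half-call such as './1'")
    allele = int(called[0])
    truth_alts = {int(a) for a in truth.replace("|", "/").split("/") if a != "0"}

    if allele == 0:
        # Only the reference allele was called, so every truth variant is missed.
        return "HC_FN" if truth_alts else "OUTSIDE_TABLE"
    if allele not in truth_alts:
        return "OUTSIDE_TABLE"  # hypothetical label; case not on the slide
    # The called allele is correct; any remaining truth allele was missed.
    return "HC_TP_FN" if truth_alts - {allele} else "HC_TP"


print(classify_half_call("0/1", "./1"))  # HC_TP
print(classify_half_call("1/2", "./0"))  # HC_FN
print(classify_half_call("1/2", "./2"))  # HC_TP_FN
```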
  • 20. genomeinabottle.org Stratifying False Positives [Figure: false positives stratified by GC content and by TR unit size (1, 2, 3, 4, <7, >=7)] Credit: Abby Beeler, Ellie Wood • GA4GH - Stratification
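As a minimal sketch of stratification by GC content, false-positive calls can be binned by the GC fraction of their surrounding sequence. This is not the GA4GH stratification code. The function names, the bin edges (0.30 and 0.55), and the idea of passing in a flanking-sequence string per call are assumptions chosen for illustration.

```python
from collections import Counter


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_stratum(seq: str, low: float = 0.30, high: float = 0.55) -> str:
    """Assign a sequence context to a GC bin (bin edges are hypothetical)."""
    gc = gc_fraction(seq)
    if gc < low:
        return "low_gc"
    if gc < high:
        return "mid_gc"
    return "high_gc"


def stratify_false_positives(contexts) -> Counter:
    """Count false-positive calls per GC stratum, given their flanking sequences."""
    return Counter(gc_stratum(seq) for seq in contexts)


fp_contexts = ["ATATATAT", "GCGCGCGC", "ATGCATGC", "AAAAAT"]
print(stratify_false_positives(fp_contexts))
```

The same counting pattern would apply to tandem-repeat unit size, with the bin function replaced by one that reports the repeat unit length.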
  • 21. genomeinabottle.org Public data from GIAB AJ PGP Trio • Long reads/"Linked" reads: ~70/30/30x PacBio (~11kb N50), BioNano, 10X Genomics, Moleculo, Complete Genomics LFR, Oxford Nanopore • Short reads: 300x Illumina paired-end, 15x Illumina 6kb mate-pair, Complete Genomics, SOLiD 5500W, Ion Proton Exome • http://biorxiv.org/content/early/2015/09/15/026468
  • 22. genomeinabottle.org GIAB Analysis Group – New Data Sets • Goal: Establish and distribute a set of authoritative benchmark variant calls of all types and sizes, as well as homozygous reference regions, on GIAB PGP trios • Leaders: Francisco de la Vega – Annai Systems; Chris Mason – Weill Cornell Medical Center; Tina Graves – Washington University; Valerie Schneider – NCBI; and Justin and Marc • Status • Analysis Group Responsibilities: https://docs.google.com/document/d/10eA0DwB4iYTSFM_LPO9_2LyyN2xEqH49OXHhtNH1uzw/edit?usp=sharing • Analysis Milestones: https://docs.google.com/spreadsheets/d/1Pj4nSzH742g40wJz2fA6f8kFtZYAToZpSZYVPiC5st4/edit?usp=sharing • Analysis Methods: https://docs.google.com/spreadsheets/d/1Je2g85H7oK6kMXbBOoqQ1FMNrvGnFuUJTJn7deyYiS8/edit?usp=sharing • Analysis Plan: https://drive.google.com/file/d/0B7Ao1qqJJDHQdnVEaVdqbWdEdkE/view?usp=sharing • Collecting data and analyses on GIAB FTP site • Recruiting people to help with the work.
  • 23. genomeinabottle.org Data Release Policy: Real-time, Open, Public Release Individual Datasets • Uploaded to GIAB FTP site as they are collected • Includes raw reads, aligned reads, and variant/reference calls Integrated High-confidence Calls • First develop SNP, indel, and homozygous reference calls • Then develop SV and non-SV calls • Released calls are versioned • Preliminary callsets will be made available to be critiqued
  • 24. genomeinabottle.org Analysis Progress: AJ Trio • SNPs/indels – Several candidate callsets – NIST working on integration – Plan to use 10X/Moleculo/PacBio for difficult-to-map regions • Assembly – 2 de novo assemblies of AJ trio (MHAP/PBcR and Falcon/Bionano) – Will be used by at least 2 groups for SV calling • Structural variants – Candidate calls being generated by 15+ groups with >20 different algorithms and 6 datasets – 3 integration methods: Bina-MetaSV, DNAnexus/Baylor-Parliament, NIST-svclassify – Parliament: ~7k SVs with evidence in PacBio and Illumina • Long-range Phasing – 2 phased calls so far (CG LFR and 10X) – Integration methods needed
  • 25. genomeinabottle.org Proposed approach to form high-confidence SV (and non-SV) calls • Generate candidate calls from multiple methods • Compare/evaluate calls using Parliament/MetaSV/svclassify/others?; manually inspect discordant calls • Integrate new and revised calls • Combine integrated calls (with heuristics and/or machine learning) to generate high-confidence calls • Timeline: August 30, 2015; Nov 1, 2015; Jan 1, 2016; Jan 26, 2016
  • 26. genomeinabottle.org Acknowledgments • FDA – Elizabeth Mansfield, Computing staff • Many members of Genome in a Bottle – New members welcome! – Sign up on website for email newsletters Steering Committee – Marc Salit – Justin Zook – David Mittelman – Andrew Grupe – Michael Eberle – Steve Sherry – Deanna Church – Francisco De La Vega – Christian Olsen – Monica Basehore – Lisa Kalman – Christopher Mason – Elizabeth Mansfield – Liz Kerrigan – Leming Shi – Melvin Limson – Alexander Wait Zaranek – Nils Homer – Fiona Hyland – Steve Lincoln – Don Baldwin – Robyn Temple-Smolkin – Chunlin Xiao – Kara Norman – Luke Hickey
  • 27. genomeinabottle.org For More Information • www.genomeinabottle.org - sign up for general GIAB and Analysis Team Google group emails • www.bioplanet.com/gcat - exome comparison tool • www.ncbi.nlm.nih.gov/variation/tools/get-rm/ - GeT-RM Browser • Data: http://biorxiv.org/content/early/2015/09/15/026468 • Global Alliance Benchmarking work group – ga4gh.org/#/benchmarking-team • Public meetings: twice-yearly workshops – Winter: January 28-29, 2016 at Stanford University, California, USA – Summer at NIST, Maryland, USA • Justin Zook: jzook@nist.gov • Marc Salit: salit@nist.gov