Creating Reference-Grade
Human Genome Assemblies
Tina Graves Lindsay
Reference Genome Workshop at AGBT
Feb 13, 2017
The Human Reference is a Work in Progress!
• The current reference, GRCh38, is not optimal for some
regions of the genome and/or some individuals/ancestries.
• GRCh38 is composed of DNA from several individual humans.
• Allelic diversity and structural variation present major
challenges when assembling a representative diploid genome.
• New technologies, methods, and resources since 2003 have
allowed for substantial improvements in the reference genome.
• Additional high-quality reference sequences are needed to
represent the full range of genetic diversity in humans
UGT2B17 – Conflicting Alleles
[Figure: chr4 tiling paths across the UGT2B17 region (Xue Y et al, 2008); a GAP label appears in the figure]
• NCBI36 NC_000004.10 (chr4) tiling path: AC074378.4, AC079749.5, AC134921.2, AC147055.2, AC140484.1, AC019173.4, AC093720.2, AC021146.7 (genes shown: TMPRSS11E, TMPRSS11E2)
• GRCh37 NC_000004.11 (chr4) tiling path: AC074378.4, AC079749.5, AC134921.1, AC147055.2, AC093720.2, AC021146.7 (gene shown: TMPRSS11E)
• GRCh37 NT_167250.1 (UGT2B17 alternate locus): AC074378.4, AC140484.1, AC019173.4, AC226496.2, AC021146.7 (gene shown: TMPRSS11E2)
Samples to be Sequenced
Sequencing Plan
Definitions of Genome Level
• Platinum Genome
• Haploid genome source
• Contiguous, haplotype-resolved representation of entire genome
• BAC library available
• Gold Genome
• Diploid genome source
• Part of a trio
• Parents will be sequenced to help haplotype-resolve some
regions
• BAC libraries available
• Targeted regions sequenced using these BAC libraries
• Will contain some haplotype-resolved regions
CHM1: A Key Resource for Improving the Reference
• CHM1 cell line established from a haploid hydatidiform
mole (complete, paternal; 46XX) (U.Surti)
• CHORI-17 BAC library (P. deJong)
• CHORI-17 BAC end sequences (n=325,659)
• CHORI-17 multiple enzyme fingerprint map (1,560 fpc contigs)
• CHORI-17 BACs
• >750 have been sequenced
• 664 of them in Genbank as phase 3 sequence
• CHM1 WGS assembly
• Initial assembly produced from >100X coverage of Illumina data
• Initial PacBio assembly produced using ~54X of P5/C3 PacBio data
• Latest PacBio assembly produced using ~60X of P6/C4 PacBio data
Assembly Assessment Methods
• Assemblies run through NCBI QA pipeline
• Assessed for contiguity, annotation, and concordance with the
finished BACs
• Assembly-to-assembly alignments can be generated between each PB
assembly and GRCh38
• BioNano Genome Map
• SV calls generated from comparing the BioNano data to each of the
assemblies
• Hybrid scaffolding conflicts will also point out potential assembly
errors
• Alignment of the Illumina reads back to each of the
assemblies
• Heterozygous calls are likely indicative of a collapse in the
assembly (for the haploid genomes)
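A minimal sketch of the heterozygous-call check described above, assuming the calls are already available as (contig, position) pairs. The function name, window size, and call threshold are illustrative assumptions, not values from the talk.

```python
from collections import Counter


def flag_collapse_windows(het_calls, window=10_000, min_calls=5):
    """Return sorted (contig, window_start) pairs holding >= min_calls het calls.

    het_calls: iterable of (contig_name, 0-based position) for heterozygous
    calls made from reads aligned back to a haploid assembly.
    window and min_calls are hypothetical thresholds chosen for illustration.
    """
    counts = Counter(
        (contig, (pos // window) * window) for contig, pos in het_calls
    )
    return sorted(key for key, n in counts.items() if n >= min_calls)


# Hypothetical input: a dense cluster on contig_1 plus a few scattered calls.
calls = [("contig_1", p) for p in (20_100, 20_450, 21_300, 23_900, 27_000, 29_999)]
calls += [("contig_1", 95_000), ("contig_2", 5_000), ("contig_2", 64_000)]

flagged = flag_collapse_windows(calls)
print(flagged)
```

In a haploid source such as CHM1, windows flagged this way are candidates for collapsed duplications and would still need manual review.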
Hybrid Scaffolds – PacBio and BioNano
                                                                   Seq Assem                                        BN Hybrid
Assembly / BN map                                                  # of Contigs  Contig N50 (Mb)  Total Size (Gb)   # of Scaffolds  Scaff N50 (Mb)  Total Size (Gb)
CHM1 (P6) GCA_001297185, MGI CHM1 map (Jason’s version)            3641          26.9             2.99              161             47.6            2.84
CHM1 (P6) GCA_001307025, MGI CHM1 Map (Adam’s version)             4850          20.6             2.94              221             40.04           2.82
[Figure: hybrid scaffolds with tracks labeled PacBio Contigs and BioNano Contigs]
1q21 Region – GRCh38 vs GCA_001297185
[Figure: two views of GRCh38 aligned to GCA_001297185 with a Seg Dup track; scale bar 1 Megabase; identity labels 99.9+% and 99.1%]
CHM1 – Next Steps
• Currently running Pilon on GCA_001297185, for improved
base pair accuracy
• Based on alignment of BioNano data as well as
comparisons to GRCh38, we will make additional breaks
where needed
• Incorporate all finished BACs
• Final alignment to GRCh38 in order to produce
chromosome AGPs and submit
Samples to be Sequenced
Genome Status
Data Source   Origin            Level of Coverage   Status
CHM1          NA                Platinum            Assembly Improvement
CHM13         NA                Platinum            In Assembly Queue
NA19240       Yoruban           Gold                Assembly Submission
HG00733       Puerto Rican      Gold                Assessing New Assembly
HG00514       Han Chinese       Gold                Assessing New Assembly**
NA12878       European          Gold                Assessing New Assembly
HG01352       Colombian         Gold                Assessing New Assembly
HG02818       Gambian           Gold                Assembly Underway
HG02059       Kinh-Vietnamese   Gold                In Assembly Queue
NA19434       Luhya             Gold                In Assembly Queue
HG04217       Telugu            Gold                Data Production Underway
**100x coverage was generated for the Han Chinese sample
Assembly Stats
Genome    Total Size              # Contigs               Contig N50              Contig N50
          (older version Falcon)  (older version Falcon)  (older version Falcon)  (newer version Falcon)
NA19240   2.75 Gb                 3569                    6.0 Mb                  26.4 Mb
HG00733   2.84 Gb                 3715                    7.6 Mb                  22-23 Mb
NA12878   2.80 Gb                 4412                    4.49 Mb                 14-15 Mb
HG01352   2.85 Gb                 4080                    8.22 Mb                 20-24 Mb
HG00514   2.85 Gb                 2808                    10.0 Mb                 22-24 Mb
HG02818   2.82 Gb                 3300                    7.24 Mb                 Assembly underway
First Gold Genome - NA19240
• NA19240 – Yoruban sample
• Generated >70X raw P6/C4 RSII PacBio data
                      Initial Assembly Stats   Latest Assembly Stats
# Seq Contigs         3569                     2889
Max Contig Length     20,393,869 bp            75,769,079 bp
Total Assembly Size   2,745,634,789 bp         2,874,720,146 bp
N50                   6,003,115 bp             26,385,265 bp
N90                   848,151 bp               2,559,914 bp
N95                   345,457 bp               710,070 bp
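For readers unfamiliar with the N50, N90, and N95 rows, the sketch below shows the standard definition: the contig length at which the cumulative length, counted from the longest contig down, first reaches x% of the total assembly size. The helper name `nx` and the toy lengths are illustrative assumptions, not data from the talk.

```python
def nx(lengths, x=50):
    """Return the Nx statistic for a list of contig lengths.

    Contigs are visited from longest to shortest; the result is the length of
    the contig at which the running total first reaches x% of the assembly.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * x:
            return length


# Toy assembly of seven contigs, 300 units in total.
toy = [80, 70, 50, 40, 30, 20, 10]
n50, n90, n95 = nx(toy, 50), nx(toy, 90), nx(toy, 95)
print(n50, n90, n95)
```

Because N90 and N95 depend on the short tail of the contig list, they are more sensitive than N50 to small fragmented contigs.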
Assembly QC and Submission Steps
• Multiple Falcon assemblies
• Using stats and alignment to Bionano, pick the best assembly
• Quiver and Pilon on best assembly
• Use Bionano to identify misassemblies and scaffold assembly
• Submit scaffold-level AGPs to Genbank
• Run through NCBI assembly QA pipeline
• Evaluate and curate output of QA pipeline
• Generate final chromosome-level AGPs and submit
• Annotation of chromosome-level assembly
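To make the scaffold-level AGP step concrete, here is a small sketch that writes AGP v2-style rows for one scaffold: component rows use type W (WGS contig) and gap rows use type N with `scaffold` / `yes` / `map` as gap type, linkage, and linkage evidence. The scaffold and contig names, the fixed gap length, and the helper name are hypothetical; this is not the pipeline used in the talk.

```python
def scaffold_agp_lines(scaffold_id, contigs, gap_len=100):
    """Build tab-delimited AGP v2-style rows for a single scaffold.

    contigs: list of (contig_id, length, orientation) in scaffold order.
    gap_len: hypothetical estimated gap size placed between adjacent contigs.
    """
    rows, pos, part = [], 1, 1
    for i, (contig_id, length, orientation) in enumerate(contigs):
        if i > 0:
            # Gap row: gap_length, gap_type, linkage, linkage_evidence.
            rows.append([scaffold_id, pos, pos + gap_len - 1, part,
                         "N", gap_len, "scaffold", "yes", "map"])
            pos += gap_len
            part += 1
        # Component row: component_id, component_beg, component_end, orientation.
        rows.append([scaffold_id, pos, pos + length - 1, part,
                     "W", contig_id, 1, length, orientation])
        pos += length
        part += 1
    return ["\t".join(str(field) for field in row) for row in rows]


# Hypothetical scaffold made of two contigs joined across one sized gap.
lines = scaffold_agp_lines("scaf1", [("ctg1", 1000, "+"), ("ctg2", 500, "-")])
print("\n".join(lines))
```

Coordinates are 1-based and inclusive, so each row starts one base after the previous row ends.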
Hybrid Stats
          Seq Assem                                        BN Hybrid
Genome    # of Contigs  Contig N50 (Mb)  Total Size (Gb)   # of Scaffolds  Scaffold N50 (Mb)  Total Size (Gb)
NA19240   2889          26.3             2.87              218             39.9               2.82
NA12878   3551          15.1             2.86              270             28.7               2.83
HG00514   3190          24.2             2.88              208             37.0               2.83
NA19240 Assembly Assessment
                    Initial Calls   Breaks made
Conflicts           51              35
Translocation SV    321             16
Complex             123             9
Nucmer Alignments   -               9
Total breaks made                   69

                Contig #   Contig N50   Total Assembly Size
Before Breaks   2889       26.4 Mb      2.87 Gb
After Breaks    2951       25.7 Mb      2.87 Gb
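As an illustration of what "making a break" means computationally, the sketch below splits one contig sequence at a set of breakpoints. The function name, the `_partN` naming scheme, and the toy sequence are assumptions for illustration only.

```python
def break_contig(name, sequence, breakpoints):
    """Split a contig at 0-based breakpoints and return (new_name, piece) pairs.

    Breakpoints at or beyond the contig ends are ignored, and duplicate
    breakpoints are applied once.
    """
    cuts = sorted({b for b in breakpoints if 0 < b < len(sequence)})
    edges = [0, *cuts, len(sequence)]
    return [
        (f"{name}_part{i + 1}", sequence[start:end])
        for i, (start, end) in enumerate(zip(edges, edges[1:]))
    ]


# Hypothetical chimeric contig broken at two positions.
pieces = break_contig("ctgA", "AAAACCCCGGGG", [4, 8])
print(pieces)
```

Breaking never changes the total assembly size, which matches the unchanged 2.87 Gb in the table, while contig N50 can only stay the same or drop.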
NA19240 contig break
[Figure: a chimeric PacBio contig from NA19240 aligned to GRCh38 Chr 1 and GRCh38 Chr 4, with Segmental Duplications tracks]
NA19240 Bionano Map Compared to GRCh38
SV Type          Number of Calls
Insertion        1795
Deletion         756
End              71
Inversions       8
Complex          62
Translocations   6
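NA19240 Inversion Compared to GRCh38
[Figure: NA19240 Bionano contigs aligned to GRCh38]
NA19240 MHC Region
[Figure: Bionano contigs aligned to GRCh38 across the MHC region; a second panel compares NA19240 with the Reference and Alts and marks a ~65 kb insertion]
Finished BACs Resolve This Region
[Figure: tracks for GRCh38, PB Assembly, BAC Alignments, and Seg Dup]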
Spanning Reference Gaps
• HG00514 80X assembly
• Initial assessment had 75 potential gap spanning contigs
• On closer inspection, only 32 are true gap-spanning contigs, which
together span 40 gaps
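True Gap Spanner
[Figure: an HG00514 contig aligned to GRCh38 across a reference gap]
False Gap Spanner
[Figure: a contig with a True Alignment on one side and a False Alignment within a Seg Dup on the other; labels 7kb, 3 kb, 10 kb]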
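One way to separate true from false gap spanners is sketched below, under stated assumptions: a contig counts as spanning a gap only if it has a sufficiently long alignment outside segmental duplications on both flanks. The `Aln` record, the `min_anchor` threshold, and the coordinates are hypothetical and are not the criteria used in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Aln:
    """One contig-to-reference alignment block (reference coordinates only)."""
    ref_start: int
    ref_end: int
    in_seg_dup: bool = False


def spans_gap(alignments, gap_start, gap_end, min_anchor=10_000):
    """Return True if trusted alignments anchor the contig on both gap flanks."""
    def trusted(a):
        # Hypothetical rule: ignore seg-dup hits and short alignments.
        return not a.in_seg_dup and (a.ref_end - a.ref_start) >= min_anchor

    left = any(trusted(a) and a.ref_end <= gap_start for a in alignments)
    right = any(trusted(a) and a.ref_start >= gap_end for a in alignments)
    return left and right


# Hypothetical gap at 150,000-160,000 on the reference.
true_case = [Aln(100_000, 150_000), Aln(160_000, 220_000)]
false_case = [Aln(100_000, 150_000), Aln(160_000, 163_000, in_seg_dup=True)]

print(spans_gap(true_case, 150_000, 160_000))
print(spans_gap(false_case, 150_000, 160_000))
```

A filter like this would only shortlist candidates; each remaining contig would still need a visual check such as the alignment views on the slides.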
Short Term Future Plans
• Lots of assemblies to analyze!
• Generate the latest Falcon assemblies for all samples
• Improve those assemblies
• Identifying misassemblies
• Making the breaks where needed
• Scaffolding the assemblies
• Incorporating BACs as they are finished
• Create Chromosomal AGPs
• Submit to Genbank
Longer Term Future Work
• Better Utilization of the Reference
• Mapping Strategies
• Graph based alignments
• Other alt-aware read mapping strategies
• Alternative reference data display challenges – When and how to
present data
• Alt alleles?
• Full reference sequences
• Haplo-resolved (10X)?
• Wet Lab Improvements
• Haplo-resolved strategies (10X)
• Clone-based work replacements? - Hyb 10X or Pac Bio?
• New long read technologies
• PacBio Sequel
• Oxford Nanopore
Acknowledgements
The McDonnell Genome Institute at
Washington University in St. Louis
Susan Dutcher
Bob Fulton
Wes Warren
Karyn Meltz Steinberg
Derek Albracht
Milinn Kremitzki
Susan Rock
Chad Tomlinson
Patrick Minx
Chris Markovic
Eddie Belter
Lee Trani
Sara Kohlberg
University of Washington
Evan Eichler
NCBI
Valerie Schneider
University of Pittsburgh
School of Medicine
(CHM1 and CHM13 cell line)
Urvashi Surti
BioNano Genomics
Alex Hastie
Pacific Biosciences
Jason Chin
Nick Sisneros
UCSF
Pui-Yan Kwok
Yvonne Lai
Chin Lin
Catherine Chu
NHGRI
Adam Phillippy
Sergey Koren
10X Genomics
Deanna Church
Nationwide Children’s Hospital
Richard Wilson
Vince Magrini
Sean McGrath
AGBT2017 Reference Workshop: Lindsay

Lecture 6 -- Memory 2015.pptlearning occurs when a stimulus (unconditioned st...
AyushGadhvi1
 
Osteoporosis - Definition , Evaluation and Management .pdf
Osteoporosis - Definition , Evaluation and Management .pdfOsteoporosis - Definition , Evaluation and Management .pdf
Osteoporosis - Definition , Evaluation and Management .pdf
Jim Jacob Roy
 
Cell Therapy Expansion and Challenges in Autoimmune Disease
Cell Therapy Expansion and Challenges in Autoimmune DiseaseCell Therapy Expansion and Challenges in Autoimmune Disease
Cell Therapy Expansion and Challenges in Autoimmune Disease
Health Advances
 
Cosmetology and Trichology Courses at Kosmoderma Academy PRP (Hair), DR Growt...
Cosmetology and Trichology Courses at Kosmoderma Academy PRP (Hair), DR Growt...Cosmetology and Trichology Courses at Kosmoderma Academy PRP (Hair), DR Growt...
Cosmetology and Trichology Courses at Kosmoderma Academy PRP (Hair), DR Growt...
Kosmoderma Academy Of Aesthetic Medicine
 
CHEMOTHERAPY_RDP_CHAPTER 2 _LEPROSY.pdf1
CHEMOTHERAPY_RDP_CHAPTER 2 _LEPROSY.pdf1CHEMOTHERAPY_RDP_CHAPTER 2 _LEPROSY.pdf1
CHEMOTHERAPY_RDP_CHAPTER 2 _LEPROSY.pdf1
rishi2789
 
Cervical Disc Arthroplasty ORSI 2024.pptx
Cervical Disc Arthroplasty ORSI 2024.pptxCervical Disc Arthroplasty ORSI 2024.pptx
Cervical Disc Arthroplasty ORSI 2024.pptx
LEFLOT Jean-Louis
 
Acute Gout Care & Urate Lowering Therapy .pdf
Acute Gout Care & Urate Lowering Therapy .pdfAcute Gout Care & Urate Lowering Therapy .pdf
Acute Gout Care & Urate Lowering Therapy .pdf
Jim Jacob Roy
 
10 Benefits an EPCR Software should Bring to EMS Organizations
10 Benefits an EPCR Software should Bring to EMS Organizations   10 Benefits an EPCR Software should Bring to EMS Organizations
10 Benefits an EPCR Software should Bring to EMS Organizations
Traumasoft LLC
 
Recent advances on Cervical cancer .pptx
Recent advances on Cervical cancer .pptxRecent advances on Cervical cancer .pptx
Recent advances on Cervical cancer .pptx
DrGirishJHoogar
 
Top Travel Vaccinations in Manchester
Top Travel Vaccinations in ManchesterTop Travel Vaccinations in Manchester
Top Travel Vaccinations in Manchester
NX Healthcare
 
pharmacology for dummies free pdf download.pdf
pharmacology for dummies free pdf download.pdfpharmacology for dummies free pdf download.pdf
pharmacology for dummies free pdf download.pdf
KerlynIgnacio
 
CHEMOTHERAPY_RDP_CHAPTER 4_ANTI VIRAL DRUGS.pdf
CHEMOTHERAPY_RDP_CHAPTER 4_ANTI VIRAL DRUGS.pdfCHEMOTHERAPY_RDP_CHAPTER 4_ANTI VIRAL DRUGS.pdf
CHEMOTHERAPY_RDP_CHAPTER 4_ANTI VIRAL DRUGS.pdf
rishi2789
 

Recently uploaded (20)

Nano-gold for Cancer Therapy chemistry investigatory project
Nano-gold for Cancer Therapy chemistry investigatory projectNano-gold for Cancer Therapy chemistry investigatory project
Nano-gold for Cancer Therapy chemistry investigatory project
 
CHEMOTHERAPY_RDP_CHAPTER 3_ANTIFUNGAL AGENT.pdf
CHEMOTHERAPY_RDP_CHAPTER 3_ANTIFUNGAL AGENT.pdfCHEMOTHERAPY_RDP_CHAPTER 3_ANTIFUNGAL AGENT.pdf
CHEMOTHERAPY_RDP_CHAPTER 3_ANTIFUNGAL AGENT.pdf
 
CBL Seminar 2024_Preliminary Program.pdf
CBL Seminar 2024_Preliminary Program.pdfCBL Seminar 2024_Preliminary Program.pdf
CBL Seminar 2024_Preliminary Program.pdf
 
LOW BIRTH WEIGHT. PRETERM BABIES OR SMALL FOR DATES BABIES
LOW BIRTH WEIGHT. PRETERM BABIES OR SMALL FOR DATES BABIESLOW BIRTH WEIGHT. PRETERM BABIES OR SMALL FOR DATES BABIES
LOW BIRTH WEIGHT. PRETERM BABIES OR SMALL FOR DATES BABIES
 
Pollen and Fungal allergy: aeroallergy.pdf
Pollen and Fungal allergy: aeroallergy.pdfPollen and Fungal allergy: aeroallergy.pdf
Pollen and Fungal allergy: aeroallergy.pdf
 
Ageing, the Elderly, Gerontology and Public Health
Ageing, the Elderly, Gerontology and Public HealthAgeing, the Elderly, Gerontology and Public Health
Ageing, the Elderly, Gerontology and Public Health
 
vonoprazan A novel drug for GERD presentation
vonoprazan A novel drug for GERD presentationvonoprazan A novel drug for GERD presentation
vonoprazan A novel drug for GERD presentation
 
Ophthalmic drugs latest. Xxxxxxzxxxxxx.pdf
Ophthalmic drugs latest. Xxxxxxzxxxxxx.pdfOphthalmic drugs latest. Xxxxxxzxxxxxx.pdf
Ophthalmic drugs latest. Xxxxxxzxxxxxx.pdf
 
Lecture 6 -- Memory 2015.pptlearning occurs when a stimulus (unconditioned st...
Lecture 6 -- Memory 2015.pptlearning occurs when a stimulus (unconditioned st...Lecture 6 -- Memory 2015.pptlearning occurs when a stimulus (unconditioned st...
Lecture 6 -- Memory 2015.pptlearning occurs when a stimulus (unconditioned st...
 
Osteoporosis - Definition , Evaluation and Management .pdf
Osteoporosis - Definition , Evaluation and Management .pdfOsteoporosis - Definition , Evaluation and Management .pdf
Osteoporosis - Definition , Evaluation and Management .pdf
 
Cell Therapy Expansion and Challenges in Autoimmune Disease
Cell Therapy Expansion and Challenges in Autoimmune DiseaseCell Therapy Expansion and Challenges in Autoimmune Disease
Cell Therapy Expansion and Challenges in Autoimmune Disease
 
Cosmetology and Trichology Courses at Kosmoderma Academy PRP (Hair), DR Growt...
Cosmetology and Trichology Courses at Kosmoderma Academy PRP (Hair), DR Growt...Cosmetology and Trichology Courses at Kosmoderma Academy PRP (Hair), DR Growt...
Cosmetology and Trichology Courses at Kosmoderma Academy PRP (Hair), DR Growt...
 
CHEMOTHERAPY_RDP_CHAPTER 2 _LEPROSY.pdf1
CHEMOTHERAPY_RDP_CHAPTER 2 _LEPROSY.pdf1CHEMOTHERAPY_RDP_CHAPTER 2 _LEPROSY.pdf1
CHEMOTHERAPY_RDP_CHAPTER 2 _LEPROSY.pdf1
 
Cervical Disc Arthroplasty ORSI 2024.pptx
Cervical Disc Arthroplasty ORSI 2024.pptxCervical Disc Arthroplasty ORSI 2024.pptx
Cervical Disc Arthroplasty ORSI 2024.pptx
 
Acute Gout Care & Urate Lowering Therapy .pdf
Acute Gout Care & Urate Lowering Therapy .pdfAcute Gout Care & Urate Lowering Therapy .pdf
Acute Gout Care & Urate Lowering Therapy .pdf
 
10 Benefits an EPCR Software should Bring to EMS Organizations
10 Benefits an EPCR Software should Bring to EMS Organizations   10 Benefits an EPCR Software should Bring to EMS Organizations
10 Benefits an EPCR Software should Bring to EMS Organizations
 
Recent advances on Cervical cancer .pptx
Recent advances on Cervical cancer .pptxRecent advances on Cervical cancer .pptx
Recent advances on Cervical cancer .pptx
 
Top Travel Vaccinations in Manchester
Top Travel Vaccinations in ManchesterTop Travel Vaccinations in Manchester
Top Travel Vaccinations in Manchester
 
pharmacology for dummies free pdf download.pdf
pharmacology for dummies free pdf download.pdfpharmacology for dummies free pdf download.pdf
pharmacology for dummies free pdf download.pdf
 
CHEMOTHERAPY_RDP_CHAPTER 4_ANTI VIRAL DRUGS.pdf
CHEMOTHERAPY_RDP_CHAPTER 4_ANTI VIRAL DRUGS.pdfCHEMOTHERAPY_RDP_CHAPTER 4_ANTI VIRAL DRUGS.pdf
CHEMOTHERAPY_RDP_CHAPTER 4_ANTI VIRAL DRUGS.pdf
 

AGBT2017 Reference Workshop: Lindsay

  • 1. Creating Reference-Grade Human Genome Assemblies Tina Graves Lindsay Reference Genome Workshop at AGBT Feb 13, 2017
  • 2. The Human Reference is a Work in Progress! • The current reference – GRCh38 - is not optimal for some regions of the genome and/or some individuals/ancestries. • GRCh38 is comprised of DNA from several individual humans. • Allelic diversity and structural variation present major challenges when assembling a representative diploid genome. • New technologies, methods, and resources since 2003 have allowed for substantial improvements in the reference genome. • Additional high-quality reference sequences are needed to represent the full range of genetic diversity in humans
  • 3. UGT2B17 – Conflicting Alleles (Xue Y et al, 2008) • NCBI36 NC_000004.10 (chr4) Tiling Path: AC074378.4, AC079749.5, AC134921.2, AC147055.2, AC140484.1, AC019173.4, AC093720.2, AC021146.7 (TMPRSS11E, TMPRSS11E2) • GRCh37 NC_000004.11 (chr4) Tiling Path: AC074378.4, AC079749.5, AC134921.1, AC147055.2, AC093720.2, AC021146.7 (TMPRSS11E) • GRCh37: NT_167250.1 (UGT2B17 alternate locus): AC074378.4, AC140484.1, AC019173.4, AC226496.2, AC021146.7 (TMPRSS11E2) • Figure label: GAP
  • 4. Samples to be Sequenced
  • 6. Definitions of Genome Level • Platinum Genome • Haploid genome source • Contiguous, haplotype-resolved representation of entire genome • BAC library available • Gold Genome • Diploid genome source • Part of a trio • Parents will be sequenced to help haplotype resolve some regions • BAC libraries available • Targeted regions sequenced using these BAC libraries • Will contain some haplotype resolved regions
  • 7. CHM1: A Key Resource for Improving the Reference • CHM1 cell line established from a haploid hydatidiform mole (complete, paternal; 46XX) (U.Surti) • CHORI-17 BAC library (P. deJong) • CHORI-17 BAC end sequences (n=325,659) • CHORI-17 multiple enzyme fingerprint map (1,560 fpc contigs) • CHORI-17 BACs • >750 have been sequenced • 664 of them in Genbank as phase 3 sequence • CHM1 WGS assembly • Initial assembly produced from >100X coverage of Illumina data • Initial PacBio assembly produced using ~54X of P5/C3 PacBio data • Latest PacBio assembly produced using ~60X of P6/C4 PacBio data
  • 8. Assembly Assessment Methods • Assemblies run through NCBI QA pipeline • Assessed for contiguity, annotation, and concordance with the finished BACs • Assembly-to-assembly alignments can be generated between each PB assembly and GRCh38 • BioNano Genome Map • SV calls generated from comparing the BioNano data to each of the assemblies • Hybrid scaffolding conflicts will also point out potential assembly errors • Alignment of the Illumina reads back to each of the assemblies • Heterozygous calls are likely indicative of a collapse in the assembly (for the haploid genomes)
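The last point on slide 8 can be sketched in a few lines: in a haploid source such as CHM1, apparent heterozygous calls from the Illumina read alignments suggest that two or more copies of a sequence were collapsed into one. The function name, the `(contig, position, genotype)` tuple layout, the 10 kb window, and the threshold of 5 calls below are illustrative assumptions, not the pipeline used in the talk.

```python
from collections import Counter


def flag_collapse_windows(calls, window=10_000, min_het=5):
    """Count heterozygous calls per fixed-size window and report dense windows.

    calls: iterable of (contig, position, genotype) with VCF-style genotype
    strings such as "0/1" or "1|1". Window size and threshold are
    hypothetical defaults chosen only for illustration.
    """
    counts = Counter()
    for contig, pos, genotype in calls:
        alleles = genotype.replace("|", "/").split("/")
        if len(set(alleles)) > 1:  # more than one distinct allele = heterozygous
            counts[(contig, pos // window)] += 1
    return sorted(
        (contig, idx * window, (idx + 1) * window, n)
        for (contig, idx), n in counts.items()
        if n >= min_het
    )


# Toy input: six het calls clustered on ctg1, one homozygous call, one stray het call.
example_calls = [("ctg1", 1_000 + i * 100, "0/1") for i in range(6)]
example_calls += [("ctg1", 50_000, "1/1"), ("ctg2", 500, "0/1")]
flagged = flag_collapse_windows(example_calls)
print(flagged)
```

Only the window on `ctg1` with the cluster of heterozygous calls is reported; isolated calls fall below the threshold and are more likely sequencing or mapping noise.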
  • 9. Hybrid Scaffolds – PacBio and BioNano
                                                                 Seq Assem     Seq Assem        Seq Assem        BN Hybrid       BN Hybrid       BN Hybrid
                                                                 # of Contigs  Contig N50 (Mb)  Total Size (Gb)  # of Scaffolds  Scaff N50 (Mb)  Total Size (Gb)
       CHM1 (P6) GCA_001297185, MGI CHM1 map (Jason’s version)   3641          26.9             2.99             161             47.6            2.84
       CHM1 (P6) GCA_001307025, MGI CHM1 Map (Adam’s version)    4850          20.6             2.94             221             40.04           2.82
  • 10. Hybrid Scaffold Hybrid Scaffold PacBio Contigs BioNano Contigs
  • 11. 1q21 Region – GRCh38 vs GCA_001297185 1 Megabase GRCh38 GCA_001297185 Seg Dup Track
  • 12. 1q21 Region - GRCh38 vs GCA_001297185 GRCh38 GCA_001297185 Seg Dup Track 99.9+% identity 99.1% identity 1 Megabase
  • 13. CHM1 – Next Steps • Currently running Pilon on GCA_001297185, for improved base pair accuracy • Based on alignment of BioNano data as well as comparisons to GRCh38, we will make additional breaks where needed • Incorporate all finished BACs • Final alignment to GRCh38 in order to produce chromosome AGPs and submit
  • 14. Samples to be Sequenced
  • 15. Genome Status
        Data Source  Origin           Level of Coverage  Status
        CHM1         NA               Platinum           Assembly Improvement
        CHM13        NA               Platinum           In Assembly Queue
        NA19240      Yoruban          Gold               Assembly Submission
        HG00733      Puerto Rican     Gold               Assessing New Assembly
        HG00514      Han Chinese      Gold               Assessing New Assembly**
        NA12878      European         Gold               Assessing New Assembly
        HG01352      Colombian        Gold               Assessing New Assembly
        HG02818      Gambian          Gold               Assembly Underway
        HG02059      Kinh-Vietnamese  Gold               In Assembly Queue
        NA19434      Luhya            Gold               In Assembly Queue
        HG04217      Telugu           Gold               Data Production Underway
        **100x coverage was generated for the Han Chinese sample
  • 16. Assembly Stats
        Genome   Total Size              # Contigs               Contig N50              Contig N50
                 (older version Falcon)  (older version Falcon)  (older version Falcon)  (newer version Falcon)
        NA19240  2.75 Gb                 3569                    6.0 Mb                  26.4 Mb
        HG00733  2.84 Gb                 3715                    7.6 Mb                  22-23 Mb
        NA12878  2.80 Gb                 4412                    4.49 Mb                 14-15 Mb
        HG01352  2.85 Gb                 4080                    8.22 Mb                 20-24 Mb
        HG00514  2.85 Gb                 2808                    10.0 Mb                 22-24 Mb
        HG02818  2.82 Gb                 3300                    7.24 Mb                 Assembly underway
  • 17. First Gold Genome - NA19240 • NA19240 – Yoruban sample • Generated >70X raw P6/C4 RSII PacBio data
                             Initial Assembly Stats  Latest Assembly Stats
        # Seq Contigs        3569                    2889
        Max Contig Length    20,393,869 bp           75,769,079 bp
        Total Assembly Size  2,745,634,789 bp        2,874,720,146 bp
        N50                  6,003,115 bp            26,385,265 bp
        N90                  848,151 bp              2,559,914 bp
        N95                  345,457 bp              710,070 bp
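For readers unfamiliar with the N50, N90, and N95 figures on slide 17, here is a minimal sketch of the usual definition: the Nx value is the length of the contig at which the running total, taken from the longest contig downward, first reaches x% of the assembly size. The function name and the toy contig lengths are illustrative assumptions; the slide does not say which tool produced its numbers.

```python
def nx(lengths, percent):
    """Return the Nx contig length for an integer percent (50 -> N50).

    Contigs are added from longest to shortest until they cover at least
    `percent` of the total assembly size; the last length added is Nx.
    Integer arithmetic avoids floating-point edge cases at the threshold.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 100 >= percent * total:
            return length


# Toy assembly of seven contigs totalling 300 bp (not real data).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = nx(toy_lengths, 50)
toy_n90 = nx(toy_lengths, 90)
toy_n95 = nx(toy_lengths, 95)
print(toy_n50, toy_n90, toy_n95)
```

Because Nx depends on the total size as well as the length distribution, comparisons such as the initial and latest NA19240 assemblies are easiest to read alongside the total assembly size, which the slide also reports.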
  • 18. Assembly QC and Submission Steps • Multiple Falcon Assemblies • Using stats and alignment to Bionano, pick the best assembly • Quiver and Pilon on best assembly • Use Bionano to identify misassemblies and scaffold assembly • Submit scaffold-level AGPs to Genbank • Run through NCBI assembly QA pipeline • Evaluate and curate output of QA pipeline • Generate final chromosome level AGPs and Submit • Annotation of chromosome level assembly
  • 19. Hybrid Stats
                 Seq Assem     Seq Assem        Seq Assem        BN Hybrid       BN Hybrid          BN Hybrid
                 # of Contigs  Contig N50 (Mb)  Total Size (Gb)  # of Scaffolds  Scaffold N50 (Mb)  Total Size (Gb)
        NA19240  2889          26.3             2.87             218             39.9               2.82
        NA12878  3551          15.1             2.86             270             28.7               2.83
        HG00514  3190          24.2             2.88             208             37.0               2.83
  • 20. NA19240 Assembly Assessment
                            Initial Calls  Breaks made
        Conflicts           51             35
        Translocation SV    321            16
        Complex             123            9
        Nucmer Alignments                  9
        Total breaks made                  69

                        Contig #  Contig N50  Total Assembly Size
        Before Breaks   2889      26.4 Mb     2.87 Gb
        After Breaks    2951      25.7 Mb     2.87 Gb
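Slide 20 reports that breaking suspected misassemblies raises the contig count and slightly lowers the contig N50 while leaving the total size unchanged. The sketch below shows that bookkeeping on toy data. The function name, the naming of the resulting pieces, and the coordinates are hypothetical, and the one-cut-one-new-contig behaviour is a simplification that does not reproduce the slide's exact counts.

```python
def break_contigs(contigs, breaks):
    """Split contigs at the given interior coordinates.

    contigs: dict mapping contig name -> length in bp.
    breaks:  dict mapping contig name -> list of cut positions, each
             strictly between 0 and the contig length.
    Returns a dict of piece name -> piece length; unbroken contigs keep
    their original name.
    """
    pieces = {}
    for name, length in contigs.items():
        cuts = sorted(set(breaks.get(name, [])))
        if any(cut <= 0 or cut >= length for cut in cuts):
            raise ValueError(f"break outside contig {name}")
        edges = [0] + cuts + [length]
        for i, (start, end) in enumerate(zip(edges, edges[1:]), start=1):
            piece_name = f"{name}_{i}" if cuts else name
            pieces[piece_name] = end - start
    return pieces


# Toy example: one chimeric contig broken twice, one contig left intact.
toy_contigs = {"ctgA": 100, "ctgB": 60}
toy_breaks = {"ctgA": [40, 70]}
toy_pieces = break_contigs(toy_contigs, toy_breaks)
print(toy_pieces)
```

The total length is conserved (160 bp before and after), while the contig count grows by the number of distinct cuts, which is the same direction of change the slide shows for NA19240.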
  • 22. Chimeric PacBio Contig GRCh38 – Chr 1 GRCh38 – Chr 4 NA19240 Contig NA19240 Contig Segmental Duplications Segmental Duplications
  • 23. NA19240 Bionano Map Compared to GRCh38
        SV Type         Number of Calls
        Insertion       1795
        Deletion        756
        End             71
        Inversions      8
        Complex         62
        Translocations  6
  • 24. NA19240 Inversion Compared to GRCh38 GRCh38 NA19240 Bionano Contigs
  • 27. Finished BACs Resolve This Region GRCh38 PB Assembly BAC Alignments Seg Dup
  • 28. Spanning Reference Gaps • HG00514 80X assembly • Initial assessment had 75 potential gap-spanning contigs • On closer inspection, only 32 are real gap-spanning contigs, which together span 40 gaps
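One simple way to screen candidate gap spanners like those on slide 28 is to require a solid alignment anchor on both sides of the reference gap. The sketch below is an assumption about how such a filter could look, not the method used in the talk: the function name, the half-open `(ref_start, ref_end)` block format, and the 10 kb anchor threshold are all hypothetical. It does not detect false alignments caused by segmental duplications, which still need the kind of manual review shown on slide 30.

```python
def spans_gap(blocks, gap_start, gap_end, min_anchor=10_000):
    """Return True if one contig is anchored on both flanks of a reference gap.

    blocks: non-overlapping (ref_start, ref_end) half-open intervals where
            the contig aligns to the chromosome containing the gap.
    The gap occupies [gap_start, gap_end) on the reference.
    min_anchor is a hypothetical minimum of aligned bases per flank.
    """
    # Aligned bases strictly to the left of the gap.
    left = sum(min(end, gap_start) - start for start, end in blocks if start < gap_start)
    # Aligned bases strictly to the right of the gap.
    right = sum(end - max(start, gap_end) for start, end in blocks if end > gap_end)
    return left >= min_anchor and right >= min_anchor


# Toy coordinates: a 50 kb gap at 100,000-150,000.
well_anchored = spans_gap([(80_000, 100_000), (150_000, 175_000)], 100_000, 150_000)
weak_left_flank = spans_gap([(97_000, 100_000), (150_000, 160_000)], 100_000, 150_000)
print(well_anchored, weak_left_flank)
```

A contig with only a short alignment on one flank is rejected here, on the reasoning that short flanking matches are more likely to be spurious than long ones.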
  • 30. False Gap Spanner False Alignment Seg Dup True Alignment 7kb 3 kb 10 kb
  • 31. Short Term Future Plans • Lots of assemblies to analyze! • Generate the latest Falcon assemblies for all samples • Improve those assemblies • Identifying misassemblies • Making the breaks where needed • Scaffolding the assemblies • Incorporating BACs as they are finished • Create Chromosomal AGPs • Submit to Genbank
  • 32. Longer Term Future Work • Better Utilization of the Reference • Mapping Strategies • Graph based alignments • Other alt-aware read mapping strategies • Alternative reference data display challenges – When and how to present data • Alt alleles? • Full reference sequences • Haplo-resolved (10X)? • Wet Lab Improvements • Haplo-resolved strategies (10X) • Clone-based work replacements? - Hyb 10X or Pac Bio? • New long read technologies • PacBio Sequel • Oxford Nanopore
  • 33. Acknowledgements • The McDonnell Genome Institute at Washington University in St. Louis: Susan Dutcher, Bob Fulton, Wes Warren, Karyn Meltz Steinberg, Derek Albracht, Milinn Kremitzki, Susan Rock, Chad Tomlinson, Patrick Minx, Chris Markovic, Eddie Belter, Lee Trani, Sara Kohlberg • University of Washington: Evan Eichler • NCBI: Valerie Schneider • University of Pittsburgh School of Medicine (CHM1 and CHM13 cell line): Urvashi Surti • BioNano Genomics: Alex Hastie • Pacific Biosciences: Jason Chin, Nick Sisneros • UCSF: Pui-Yan Kwok, Yvonne Lai, Chin Lin, Catherine Chu • NHGRI: Adam Phillippy, Sergey Koren • 10X Genomics: Deanna Church • Nationwide Children’s Hospital: Richard Wilson, Vince Magrini, Sean McGrath