Telomere-to-telomere assembly of a
complete human chromosome
Karen Miga
UC Davis Genetics Seminar
Sept 30, 2019
@khmiga
New Era in Genetics and Genomics
We are finally reaching complete, high-quality
telomere-to-telomere chromosome assemblies
Human reference genome is incomplete.
• 368 unresolved issues, 102 gaps
• Segmental duplications, gene families, satellite
arrays, centromeres, rDNAs
• Uncharacterized sequence variation in the human
population
Our current understanding of
genome biology and function
30 Mb
chr21
~20 Mb ?
Challenge:
Generating assemblies across repetitive regions that
span hundreds of kilobases.
[Figure: repeats (100 kb+) flanked by unique variants]
Can high-coverage ultra-long sequencing resolve
complete assemblies of the human genome?
MinION
100kb+
It’s time to finish the human genome
The Telomere-to-Telomere (T2T) consortium is an
open, community-based effort to generate the
first complete assembly of a human genome.
Our target: CHM13hTERT
Cell line from Urvashi Surti, Pitt; SKY karyotype from Jennifer Gerton and Tamara Potapova, Stowers
N=46; XX
Intramural Sequencing Center
CHM13 Sequencing
94 MinION/GridION flow cells
11.1M reads
155 Gb (1.6 Gb / flow cell) (50x)
99 Gb in reads >50 kb (32x)
78 Gb in reads >70 kb (25x)
Max mapped read length 1.04 Mb
From May 1, 2018 to Jan 8, 2019
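The coverage figures on this slide follow from dividing sequencing yield by genome size. The sketch below reproduces that arithmetic, assuming a genome size of about 3.1 Gb; that value is not stated on the slide and is used here only for illustration.

```python
# Rough depth-of-coverage arithmetic for the yields listed on the slide.
# ASSUMPTION: a genome size of ~3.1 Gb; the slide does not state the value used.
GENOME_SIZE_GB = 3.1


def coverage(yield_gb: float, genome_size_gb: float = GENOME_SIZE_GB) -> float:
    """Return approximate fold coverage for a sequencing yield in gigabases."""
    return yield_gb / genome_size_gb


# Yields taken from the slide (in Gb).
yields_gb = {"all reads": 155, "reads >50 kb": 99, "reads >70 kb": 78}

for label, gb in yields_gb.items():
    print(f"{label}: {gb} Gb is roughly {coverage(gb):.0f}x")

# Average yield per flow cell, using the 94 flow cells listed on the slide.
print(f"per flow cell: {155 / 94:.1f} Gb")
```

Under this assumed genome size, the three yields round to the 50x, 32x and 25x quoted on the slide.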
50x Nanopore ultra-long: contig building
60x PacBio: polishing
50x 10x Genomics: polishing
BioNano: structural validation
• 2.94 Gbp assembly, NG50: 75 Mbp
• Exceeds the continuity of the reference genome GRCh38 (56 Mbp NG50 contig size).
• A subset of chromosome assemblies breaks only at the centromere.
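For readers unfamiliar with the NG50 statistic quoted above, here is a minimal sketch of how it is usually computed: the contig length at which the running total of contigs, sorted longest first, reaches half of an assumed genome size. The function name and the toy numbers are illustrative, not taken from the talk.

```python
def ng50(contig_lengths, genome_size):
    """Return the contig length L such that contigs of length >= L
    together cover at least half of genome_size.

    Returns 0 if the assembly covers less than half of genome_size.
    """
    half = genome_size / 2
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return 0


# Toy example (hypothetical lengths, arbitrary units).
contigs = [40, 30, 20, 10]
print(ng50(contigs, genome_size=120))           # NG50 against an assumed genome size
print(ng50(contigs, genome_size=sum(contigs)))  # same formula with assembly size gives N50
```

Because NG50 is normalized by genome size rather than assembly size, it stays comparable between assemblies of different total length.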
Roadmap for completing the genome
Canu
Orthogonal Validation
Jo and Valerie
2.2 - 3.7 Mb
mean of 3010 kb (S.D. = 429; n = 49)
[Figure: structural variant (SV) positions within the X centromere array, Xp to Xq]
Assemble contigs using overlapping SV patterns
Scaffold Assembly of XCEN
Rel3 Assembly: ~3.1 Mb
The assembly is a hypothesis(!)
Beth Sullivan, Jennifer Gerton, Edmund Howe
@NanoporeConf | #NanoporeConf
Marker-assisted mapping
Adam Phillippy, Arang Rhie, Sergey Koren
Create a scaffold of unique (single-copy)
k-mers genome-wide
Anchor high-confidence
long-read alignments to
repeat assemblies
Confident mapping of long reads
using a single-copy k-mer strategy
Identify and mark all sites of unique anchors across the chromosome
chrX
• 21-mers that appear ~c times in Illumina data
• Also found in PacBio/Nanopore reads
• Less frequent in the centromere, but still there
• (Validated with Duplex-Seq)
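The marker selection described in these bullets can be sketched as a k-mer counting step: count every k-mer in the short-read data and keep those whose count is close to the expected coverage c. This is a simplified illustration, not the consortium's pipeline. The function names and the `tolerance` parameter are hypothetical, and the sketch ignores reverse complements and sequencing errors.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every k-mer of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def single_copy_kmers(reads, k, expected_cov, tolerance=0.5):
    """Return k-mers whose read count lies within +/- tolerance of expected_cov.

    ASSUMPTION: a k-mer seen about `expected_cov` times in the reads occurs
    once in the genome; `tolerance` is an illustrative cutoff, not a value
    from the talk.
    """
    counts = Counter(km for read in reads for km in kmers(read, k))
    low = expected_cov * (1 - tolerance)
    high = expected_cov * (1 + tolerance)
    return {km for km, n in counts.items() if low <= n <= high}


# Toy data: "ACGT" is sampled twice (coverage ~2), while the repeat "TTTT" is
# sampled four times, so its k-mer "TTT" is far above the expected coverage.
reads = ["ACGT", "ACGT", "TTTT", "TTTT", "TTTT", "TTTT"]
markers = single_copy_kmers(reads, k=3, expected_cov=2)
print(sorted(markers))
```

The talk uses k = 21 on Illumina data; k = 3 is used here only so the toy example stays readable.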
Filter long-read alignments, retaining those with unique k-mer anchoring
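One simplified way to picture this filtering step: keep an alignment only if its reference interval covers at least one single-copy marker position. The tuple layout, function names and `min_anchors` parameter below are assumptions for illustration. A real implementation would also check that the read itself carries the marker k-mer at the aligned position.

```python
from bisect import bisect_left


def anchors_in_interval(sorted_anchors, start, end):
    """Count anchor positions p with start <= p < end (anchors must be sorted)."""
    return bisect_left(sorted_anchors, end) - bisect_left(sorted_anchors, start)


def filter_alignments(alignments, anchor_positions, min_anchors=1):
    """Keep alignments (read_id, ref_start, ref_end) that cover enough anchors."""
    anchors = sorted(anchor_positions)
    return [
        aln for aln in alignments
        if anchors_in_interval(anchors, aln[1], aln[2]) >= min_anchors
    ]


# Toy example: three marker positions on a hypothetical chromosome.
anchor_positions = [100, 5_000, 60_000]
alignments = [
    ("r1", 0, 200),         # covers the marker at 100
    ("r2", 1_000, 4_000),   # covers no marker, so it is dropped
    ("r3", 4_500, 61_000),  # covers markers at 5,000 and 60,000
]
kept = filter_alignments(alignments, anchor_positions)
print([read_id for read_id, _, _ in kept])
```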
Spacing of single-copy k-mers can be irregular in
repeat-dense regions
X centromere array (CENX): 3.1 Mbp
Number of k-mers: 2,034
Spacing N50: 6,879 bp
Longest distance between k-mers: 53,798 bp
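The spacing statistics above can be derived from a sorted list of marker positions. The sketch below assumes that "spacing N50" is defined by analogy with contig N50 (the gap length at which gaps, sorted longest first, account for half of the total gap length). The slide does not spell out the definition, so treat this as one plausible reading.

```python
def spacing_stats(positions):
    """Return (spacing_n50, longest_gap) for a collection of marker positions.

    ASSUMPTION: spacing N50 is computed like contig N50, but over the gaps
    between consecutive markers.
    """
    pos = sorted(positions)
    gaps = sorted((b - a for a, b in zip(pos, pos[1:])), reverse=True)
    if not gaps:
        return 0, 0
    half = sum(gaps) / 2
    running = 0
    for gap in gaps:
        running += gap
        if running >= half:
            return gap, gaps[0]
    return gaps[-1], gaps[0]  # not reached when gaps are non-negative


# Toy example with hypothetical marker positions (bp).
n50, longest = spacing_stats([0, 40, 70, 90, 100])
print(n50, longest)
```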
Unique k-mer-based filtering, Nanopore reads: nanopolish (two rounds)
Unique k-mer-based filtering, PacBio (CLR) reads: arrow (two rounds)
10XG polishing: longranger + freebayes (two rounds)
GAGE pre-polishing
ChrX GAGE array: 19 tandemly arrayed ~9.4 kb repeats
[Plot: coverage (0 to 250) by base position; most frequent base vs. second most frequent base (error)]
GAGE with marker-assisted polishing
ChrX GAGE array: 19 tandemly arrayed ~9.4 kb repeats
[Plot: coverage (0 to 250) by base position; most frequent base vs. second most frequent base (error)]
CCS/HiFi Evaluation
chrX
HiFi Alignments to Evaluate Polishing
CENTROMERE X: BEFORE POLISHING
DXZ1: 3.1 Mb
CENTROMERE X: AFTER POLISHING
NOTE: Underlying satellite array structure remains the same.
DXZ1: 3.1 Mb
Opens the whole genome to analysis
Ariel Gershman
Winston Timp’s
Laboratory
Finished T2T X Chromosome:
High Accuracy and High Continuity
1. Structurally validated assembly from telomere to telomere, including a
3.1 Mb tandem repeat at the X centromere, and providing a complete
assessment across tandemly repeated gene families.
2. Novel polishing strategy capable of improving the quality of large
repeat-rich regions, demonstrating dramatic improvements in quality over
the entirety of the X chromosome.
3. Statistics of CHM13 full-length BAC alignments to polished assembly:
275/341 (81%) QV 37.4 QV 27.9
153/341 (45%) QV 37.7 QV 27.4
Vollger M, Logsdon G, et al. bioRxiv doi.org/10.1101/635037
Mean, Median, BACs Aligned
HiFi
UL-asm
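The QV values in this table can be read as per-base error rates, assuming they are Phred-scaled in the usual way (QV = -10 log10 of the error probability). The slide does not state how QV was computed, so the conversion below is a sketch under that assumption.

```python
import math


def qv_to_error_rate(qv: float) -> float:
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(p: float) -> float:
    """Convert a per-base error probability to a Phred-scaled quality value."""
    return -10 * math.log10(p)


# QV values quoted on the slide.
for qv in (37.4, 27.9, 37.7, 27.4):
    p = qv_to_error_rate(qv)
    print(f"QV {qv}: about 1 error per {1 / p:,.0f} bases")
```

Each 10-point increase in QV corresponds to a tenfold lower error rate, so QV 30 is one error per thousand bases and QV 40 is one per ten thousand.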
It is time to finish the
human genome
NEW! Rel3 open data release
• github.com/nanopore-wgs-consortium/chm13
• 120x Nanopore reads: NHGRI, UW, Nottingham, UC Davis (PromethION, Megan Dennis)
• 50x 10x Genomics linked reads (NHGRI)
• 70x PacBio CLR reads (WashU)
• 24x PacBio HiFi reads (UW)
• 40x Hi-C (Arima Genomics)
• BioNano optical map (WashU)
• Unpolished Canu assemblies
Additional ultra-long ONT data
from Glennis Logsdon (UW)
Read length    Coverage    Percent of data
>50 kbp        12X         86%
>100 kbp       9.1X        66%
>150 kbp       6.8X        49%
>200 kbp       4.9X        35%
>250 kbp       3.4X        24%
[Histogram: number of reads (0 to 20,000) by read length (kbp, log scale from 0.1 to 10,000); N50 = 147.1 kbp, N1 = 649.6 kbp, Max = 1538.3 kbp]
13.9X coverage
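A table like the one above can be produced from a list of read lengths by summing the bases in reads longer than each threshold. The sketch below uses made-up read lengths and an illustrative genome size; neither comes from the talk.

```python
def coverage_above(read_lengths_bp, threshold_bp, genome_size_bp):
    """Return (fold coverage, percent of all bases) contributed by reads
    strictly longer than threshold_bp."""
    total = sum(read_lengths_bp)
    kept = sum(length for length in read_lengths_bp if length > threshold_bp)
    return kept / genome_size_bp, 100 * kept / total


# Toy data: four hypothetical reads and a 100 kbp "genome".
reads = [10_000, 60_000, 120_000, 210_000]
genome_size = 100_000

for threshold in (50_000, 100_000, 200_000):
    cov, pct = coverage_above(reads, threshold, genome_size)
    print(f">{threshold // 1000} kbp: {cov:.1f}X, {pct:.0f}% of data")
```

As a consistency check on the slide's own numbers, 12X at 86% of the data implies a total of about 14X, in line with the 13.9X quoted.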
Triple the coverage, what changed?
• github.com/nanopore-wgs-consortium/chm13
• Minimal change in continuity
• 79.5 Mbp (rel2) vs. 71.8 Mbp (rel3) NG50
• Don’t judge assemblies based on continuity
• Tricky regions are fixed
• GAGE and more SegDups automatically resolved
• Improved BAC validation
• 288 (rel2) vs. 310 (rel3) of 341 BACs resolved
• 1 chromosome down, 23 to go…
Goal of a complete human genome in the next two years.
Challenges in front of us:
• Acrocentric p-arms
• Large segmental duplications
• Classical human satellites 2 and 3
Establishing new benchmarking standards (XChr)
Pioneering new pipelines: polishing, repeat assembly, and array structural validation.
Setting the bar higher for quality and completeness.
dkNET Webinar "The Multi-Omic Response to Exercise Training Across Rat Tissue...
 
Solid waste management_13_409_U1_2024.pptx
Solid waste management_13_409_U1_2024.pptxSolid waste management_13_409_U1_2024.pptx
Solid waste management_13_409_U1_2024.pptx
 
Non equilibrium Molecular Simulations of Polymers under Flow Saving Energy th...
Non equilibrium Molecular Simulations of Polymers under Flow Saving Energy th...Non equilibrium Molecular Simulations of Polymers under Flow Saving Energy th...
Non equilibrium Molecular Simulations of Polymers under Flow Saving Energy th...
 
The deconstructed Standard Model equation _ - symmetry magazine.pdf
The deconstructed Standard Model equation _ - symmetry magazine.pdfThe deconstructed Standard Model equation _ - symmetry magazine.pdf
The deconstructed Standard Model equation _ - symmetry magazine.pdf
 
Preparation of enterprise budget for integrated fish farming
Preparation of enterprise budget for integrated fish farmingPreparation of enterprise budget for integrated fish farming
Preparation of enterprise budget for integrated fish farming
 
1David Andress - The Oxford Handbook of the French Revolution-Oxford Universi...
1David Andress - The Oxford Handbook of the French Revolution-Oxford Universi...1David Andress - The Oxford Handbook of the French Revolution-Oxford Universi...
1David Andress - The Oxford Handbook of the French Revolution-Oxford Universi...
 
Theory of indicators: Ostwald's and Quinonoid theories
Theory of indicators: Ostwald's and Quinonoid theoriesTheory of indicators: Ostwald's and Quinonoid theories
Theory of indicators: Ostwald's and Quinonoid theories
 
structure of proteins and its type I PPT
structure of proteins and its type I PPTstructure of proteins and its type I PPT
structure of proteins and its type I PPT
 
Geometric New Earth, Solarsystem, projection
Geometric New Earth, Solarsystem, projectionGeometric New Earth, Solarsystem, projection
Geometric New Earth, Solarsystem, projection
 
SHAMPOO : OVERVIEW OF SHAMPOO AND IT'S TYPES.
SHAMPOO : OVERVIEW OF SHAMPOO AND IT'S TYPES.SHAMPOO : OVERVIEW OF SHAMPOO AND IT'S TYPES.
SHAMPOO : OVERVIEW OF SHAMPOO AND IT'S TYPES.
 
Zoogeographical regions In the World.pptx
Zoogeographical regions In the World.pptxZoogeographical regions In the World.pptx
Zoogeographical regions In the World.pptx
 

Telomere-to-telomere assembly of a complete human chromosomes

  • 1. Telomere-to-telomere assembly of a complete human chromosomes Karen Miga UC Davis Genetics Seminar Sept 30, 2019 @khmiga
  • 6. New Era in Genetics and Genomics. We are finally reaching complete, high-quality telomere-to-telomere chromosome assemblies. Human reference genome is incomplete. • 368 unresolved issues, 102 gaps • Segmental duplications, gene families, satellite arrays, centromeres, rDNAs • Uncharacterized sequence variation in the human population. Our current understanding of genome biology and function (figure labels: chr21, 30 Mb, ~20 Mb ?)
  • 7. Challenge: Generating assemblies across repetitive regions that span hundreds of kilobases. (Figure labels: repeats (100 kb+) flanked by unique variants.) Can high-coverage ultra-long sequencing resolve complete assemblies of the human genome?
  • 9. It’s time to finish the human genome. The Telomere-to-Telomere (T2T) consortium is an open, community-based effort to generate the first complete assembly of a human genome.
  • 10. Our target: CHM13hTERT. Cell line from Urvashi Surti, Pitt; SKY karyotype from Jennifer Gerton and Tamara Potapova, Stowers. N=46; XX
  • 13. CHM13 Sequencing (Intramural Sequencing Center): 94 MinION/GridION flow cells; 11.1M reads; 155 Gb (1.6 Gb / flow cell) (50x); 99 Gb in reads >50 kb (32x); 78 Gb in reads >70 kb (25x); max mapped read length 1.04 Mb; from May 1/18 – Jan 8/19. Data types: 50x Nanopore ultra-long (contig building); 60x PacBio (polishing); 50x 10x Genomics (polishing); BioNano (structural validation)
  • 14. Roadmap for completing the genome (Canu) • 2.94 Gbp assembly, NG50: 75 Mbp • Exceeds the continuity of the reference genome GRCh38 (56 Mbp NG50 contig size) • Subset of chromosome assemblies break only at centromere
  • 15. Canu
  • 19. 2.2 to 3.7 Mb; mean of 3010 kb (S.D. = 429; n = 49)
  • 21. Structural variant: assemble contigs using overlapping SV patterns
  • 23. Xp / Xq. Rel3 Assembly: ~3.1 Mb. The assembly is a hypothesis(!)
  • 24. Rel3 Assembly: ~3.1 Mb. Beth Sullivan, Jennifer Gerton, Edmund Howe
  • 25. @NanoporeConf | #NanoporeConf Marker-assisted mapping. Adam Phillippy, Arang Rhie, Sergey Koren
  • 26. Marker-assisted mapping: create a scaffold of unique, or single-copy, k-mers genome-wide. Adam Phillippy, Arang Rhie, Sergey Koren
  • 27. Marker-assisted mapping: anchor high-confidence long-read alignments to repeat assemblies. Adam Phillippy, Arang Rhie, Sergey Koren
  • 28. Confident mapping of long reads using a single-copy k-mer strategy. Identify and mark all sites of unique anchors across the chromosome (chrX): • 21-mers that appear ~c times in Illumina data • Also found in PacBio/Nanopore reads • Less frequent in the centromere, but still there • (Validated with Duplex-Seq)
  • 29. Confident mapping of long reads using a single-copy k-mer strategy. Filter long-read alignments, retaining those with unique k-mer anchoring (chrX)
  • 30. Spacing of single-copy k-mers can be irregular in repeat-dense regions (chrX centromere array). CENX: 3.1 Mbp. Number of k-mers: 2,034. Spacing N50: 6,879. Longest distance between k-mers: 53,798 bp
  • 31. Polishing (chrX): Nanopore reads, unique k-mer-based filtering: nanopolish (two rounds); PacBio (CLR) reads, unique k-mer-based filtering: arrow (two rounds); 10XG polishing: longranger + freebayes (two rounds)
  • 32. GAGE pre-polishing. ChrX GAGE array: 19 tandemly arrayed ~9.4 kb repeats. (Plot: coverage vs. base position, showing the most frequent base and the second most frequent base (error).)
  • 33. GAGE with marker-assisted polishing. ChrX GAGE array: 19 tandemly arrayed ~9.4 kb repeats. (Plot: coverage vs. base position, showing the most frequent base and the second most frequent base (error).)
  • 34. CCS/HiFi Evaluation (chrX): HiFi alignments to evaluate polishing. Centromere X: before polishing. DXZ1: 3.1 Mb
  • 35. CCS/HiFi Evaluation (chrX): HiFi alignments to evaluate polishing. Centromere X: after polishing. NOTE: Underlying satellite array structure remains the same. DXZ1: 3.1 Mb
  • 36. Opens the whole genome to analysis Ariel Gershman Winston Timp’s Laboratory
  • 42. Finished T2T X Chromosome: High Accuracy and High Continuity 1. Structurally validated assembly from telomere to telomere, including a 3.1 Mb tandem repeat at the X centromere and providing a complete assessment across tandemly repeated gene families. 2. Novel polishing strategy capable of improving the quality of large repeat-rich regions, demonstrating dramatic improvements in quality over the entirety of the X chromosome. 3. Statistics of CHM13 full-length BAC alignments to polished assembly: UL-asm: 275/341 BACs aligned (81%), median QV 37.4, mean QV 27.9; HiFi: 153/341 BACs aligned (45%), median QV 37.7, mean QV 27.4. Vollger M, Logsdon G, et al. bioRxiv doi.org/10.1101/635037
  • 43. It is time to finish the human genome
  • 44. NEW! Rel3 open data release • github.com/nanopore-wgs-consortium/chm13 • 120x Nanopore reads (NHGRI, UW, Nottingham, UC Davis (PromethION, Megan Dennis)) • 50x 10x Genomics linked reads (NHGRI) • 70x PacBio CLR reads (WashU) • 24x PacBio HiFi reads (UW) • 40x Hi-C (Arima Genomics) • BioNano optical map (WashU) • Unpolished Canu assemblies
  • 45. Additional ultra-long ONT data from Glennis Logsdon (UW): 13.9X coverage; N50 = 147.1, N1 = 649.6, Max = 1538.3. Read length, coverage, percent of data: >50 kbp, 12X, 86%; >100 kbp, 9.1X, 66%; >150 kbp, 6.8X, 49%; >200 kbp, 4.9X, 35%; >250 kbp, 3.4X, 24%. (Histogram: number of reads vs. read length in kbp.) • github.com/nanopore-wgs-consortium/chm13
  • 46. Triple the coverage, what changed? • Minimal change in continuity: 79.5 Mbp (rel2) vs. 71.8 Mbp (rel3) NG50; don’t judge assemblies based on continuity • Tricky regions are fixed: GAGE and more SegDups automatically resolved • Improved BAC validation: 288 (rel2) vs. 310 (rel3) of 341 BACs resolved • 1 chromosome down, 23 to go…
  • 47. Goal of a complete human genome in the next two years. Challenges in front of us: • Acrocentric p-arms • Large segmental duplications • Classical human satellites 2,3. Establishing new benchmarking standards (XChr). Pioneering new pipelines: polishing, repeat assembly, and array structural validation. Setting the bar higher for quality and completeness.
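
The marker-assisted mapping slides describe selecting 21-mers that appear about c times in the read data, keeping only long reads anchored by those markers, and reporting the spacing of markers along the assembly. The Python sketch below is a toy illustration of that idea, not the consortium's pipeline: the function names, the `tolerance` band around the depth c, and the spacing N50 definition (the gap length at which gaps of that size or larger cover half the marked span) are all assumptions, and it ignores reverse complements and sequencing errors.

```python
from collections import Counter


def count_kmers(seqs, k):
    """Count every k-mer across a collection of sequences (exact matches only)."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts


def single_copy_kmers(read_counts, depth, tolerance=0.5):
    """Keep k-mers whose read count is close to the sequencing depth c.

    `tolerance` is a hypothetical band: counts within depth * (1 +/- tolerance)
    are treated as single-copy in the genome.
    """
    lo, hi = depth * (1 - tolerance), depth * (1 + tolerance)
    return {kmer for kmer, n in read_counts.items() if lo <= n <= hi}


def marker_positions(assembly, markers, k):
    """Start positions in the assembly where a marker k-mer occurs."""
    return [i for i in range(len(assembly) - k + 1) if assembly[i:i + k] in markers]


def has_anchor(read, markers, k):
    """True if the read contains at least one marker k-mer."""
    return any(read[i:i + k] in markers for i in range(len(read) - k + 1))


def spacing_n50(positions):
    """N50 of the gaps between consecutive marker positions (0 if fewer than two)."""
    gaps = sorted((b - a for a, b in zip(positions, positions[1:])), reverse=True)
    half = sum(gaps) / 2
    running = 0
    for gap in gaps:
        running += gap
        if running >= half:
            return gap
    return 0


if __name__ == "__main__":
    k = 3  # toy value; the slides use 21-mers
    reads = ["ACGT", "ACGA"]
    markers = single_copy_kmers(count_kmers(reads, k), depth=1)
    kept = [r for r in ["TTCGTT", "AACGG"] if has_anchor(r, markers, k)]
    print(sorted(markers), kept)
    print(spacing_n50(marker_positions("ACGTACGA", markers, k)))
```

In a repeat array, most k-mers occur many times and fall outside the depth band, which is why marker spacing grows irregular there and why only reads carrying one of the rare markers can be placed with confidence.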

Editor's Notes

  1. KEY POINT HERE: spacing of unique variants… Some regions are easier than others….
  2. Number of k-mers: 2,034 Spacing N50: 6,879 Longest distance: 53,798 bp
  3. Median BAC QV 37.4 (mean QV 28.0) vs. median QV 37.6 (mean QV 27.4) for the best CHM13 HiFi asm. And resolve 85% of BACs at >99.8% identity vs. 54% for prior PacBio asm. Total BACs: 341. Compressed: 166 1; median: 99.9895, QV: 39.78811; mean: 99.8706, QV: 28.88052. Mitchell HiFi: 153 1; median: 99.9827, QV: 37.61954; mean: 99.81871, QV: 27.41627. UL + 10x: 275 1; median: 99.982, QV: 37.44727; mean: 99.84145, QV: 27.99832
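
The identity and QV pairs in these notes are consistent with the standard Phred scaling, QV = -10 · log10(1 - identity). The short Python sketch below reproduces a few of the listed values; the function name `identity_to_qv` is hypothetical, and the assumption that the notes were computed this way rests only on the numbers agreeing.

```python
import math


def identity_to_qv(percent_identity):
    """Convert percent identity to a Phred-scaled quality value (QV)."""
    error_rate = 1.0 - percent_identity / 100.0
    if error_rate <= 0:
        return math.inf  # a perfect alignment has no finite QV
    return -10.0 * math.log10(error_rate)


if __name__ == "__main__":
    # Identity values copied from the notes above.
    for label, identity in [
        ("UL + 10x median", 99.982),    # notes list QV 37.44727
        ("Compressed median", 99.9895),  # notes list QV 39.78811
        ("Compressed mean", 99.8706),    # notes list QV 28.88052
    ]:
        print(f"{label}: identity {identity}% -> QV {identity_to_qv(identity):.2f}")
```

Because the scale is logarithmic, the mean QV listed in the notes matches the QV of the mean identity, so a small number of poorly aligned BACs can pull it well below the median.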