AGBT2017 Reference Workshop: Fulton


  1. 1. Laboratory Aspects of Generating High Quality Assemblies MGI Reference Genomes Workshop Bob Fulton February 13th 2017
  2. 2. Primary Objectives • Develop Tools and Techniques to Provide High Quality, Haplo-resolved Genome Assemblies Sampling and Capturing as Much Human Diversity as Possible
  3. 3. Sequencing Strategy for Reference Genomes • PacBio Large Insert Library Construction • Linked Reads with 10X Genomics • Validation Using BioNano Physical Map
  4. 4. PacBio
  5. 5. PacBio WGS Library Construction • High Molecular Weight Genomic DNA • DNA must be of sufficient quality to allow shearing at 50 kb to produce PacBio Continuous Long Reads (CLR) • Consistent Shearing at 50 kb • Preferred method: Diagenode Megaruptor • Fragment size setting: 50 kb • Working on 3 Methods for Library Construction • PacBio SMRTbell: Current Standard, PacBio SMRTbell Template Prep Kit 1.0 and SMRTbell Damage Repair Kit • Hybrid Library: Swift Accel-NGS XL Library Prep Kit, but substituting the PacBio Damage Repair Kit • Swift Library: Swift Accel-NGS XL Library Prep Kit Including Swift DNA Repair Enzymes • New Data Recently Available with New Repair Process
  6. 6. HG02818 Library Preparation and Sequencing • Three library reactions (15ug) each of HG02818 were processed using the PacBio SMRTbell, Hybrid, and Swift library preps. • Library recoveries leading into BluePippin (BP) size selection for the Hybrid and Swift methods were roughly double that of the PacBio library prep. • All libraries were size selected on the BP at 20Kb-50Kb. • The PacBio SMRTbell library generated over a Gb of data for each of the first two SMRT cells. Additional SMRT cells produced less data as the library appeared to degrade.

     Library Method    Library Recovery Pre-BP   ROI Read Length
     PacBio SMRTbell   35.8% (5.3ug)             12178
     Hybrid            68.8% (10.3ug)            13511
     Swift             70.9% (10.6ug)            10232
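     The recovery figures on this slide are recovered mass divided by input mass. A minimal sketch of that arithmetic follows, assuming 15 ug of input per method (our reading of the slide) and using the rounded masses shown in the table; because those masses are rounded, the recomputed percentages can differ slightly from the ones printed on the slide. The function name `recovery_percent` is ours, for illustration only.

```python
# Assumption: each library method started from 15 ug of HG02818 DNA.
INPUT_UG = 15.0

# Recovered mass before BluePippin size selection, as shown on the slide (rounded).
recovered_ug = {"PacBio SMRTbell": 5.3, "Hybrid": 10.3, "Swift": 10.6}


def recovery_percent(recovered, input_mass=INPUT_UG):
    """Return library recovery as a percentage of the input DNA mass."""
    if input_mass <= 0:
        raise ValueError("input_mass must be positive")
    return 100.0 * recovered / input_mass


for method, ug in recovered_ug.items():
    print(f"{method}: {recovery_percent(ug):.1f}% of input recovered")

# Fold difference between the Hybrid and the PacBio SMRTbell recoveries.
hybrid_fold = recovered_ug["Hybrid"] / recovered_ug["PacBio SMRTbell"]
print(f"Hybrid vs PacBio SMRTbell: {hybrid_fold:.2f}x")
```

     The fold value is what the "double" statement on the slide refers to; the same ratio can be taken for the Swift prep.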
  7. 7. HG02818 Library Preparation and Sequencing [Figure: Read of Insert Mbases per SMRT cell (y-axis, 0 to 1800) by Date of PacBio RSII Sequencing Run (x-axis, 11/6/16 to 12/6/16) for the PacBio SMRTbell, Hybrid, and Swift libraries]
  8. 8. Subread Length Comparisons - HG02818 • SMRTbell Library: Mean Subread Length 11,391 bp; N50 Subread Length 17,007 bp • Hybrid Libraries: Mean Subread Length 13,406 bp; N50 Subread Length 18,649 bp
  9. 9. Subread Length Comparisons - HG02818 • Swift Library: Mean Subread Length 10,163 bp; N50 Subread Length 15,220 bp • E. coli, New Swift Only Kit: Mean Subread Length 16,387 bp; N50 Subread Length 22,625 bp
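     Slides 8 and 9 report both a mean and an N50 subread length. As a reminder of how the two differ, here is a small sketch using the usual definition of N50 (the length L such that reads of length L or longer hold at least half of all bases). The read lengths in `toy_lengths` are made up for illustration and are not data from the talk.

```python
def n50(lengths):
    """Return the N50 of a collection of read lengths.

    Reads are sorted longest first and accumulated until the running
    total reaches half of all bases; the length at that point is the N50.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


def mean_length(lengths):
    """Return the arithmetic mean read length."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    return sum(lengths) / len(lengths)


# Hypothetical subread lengths in bp, for illustration only.
toy_lengths = [2000, 5000, 8000, 12000, 18000, 25000]
print("mean:", round(mean_length(toy_lengths)))
print("N50: ", n50(toy_lengths))
```

     Because N50 is weighted by bases rather than by read count, it sits above the mean whenever a library contains many short subreads, which is the pattern in every library on these two slides.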
  10. 10. Agilent Tape Station Assessment of Library Size PacBio SMRTbell No BluePippin Size Selection
  11. 11. Agilent Tape Station Assessment of Library Size PacBio SMRTbell 6Kb-50Kb BluePippin Size Selection
  12. 12. Agilent Tape Station Assessment of Library Size Hybrid Prep Pre-BluePippin Size Selection
  13. 13. Agilent Tape Station Assessment of Library Size PacBio SMRTbell 8Kb-50Kb BluePippin Size Selection
  14. 14. Agilent Tape Station Assessment of Library Size Hybrid Prep 18Kb-50Kb BluePippin Size Selection
  15. 15. 10X Genomics
  16. 16. 10X Genomics • Chromium Instrument • Long Range Linking Information on a Genome Wide Scale • Phasing Information Across a Genome • Enhanced Variant Calling and Structural Variation Detection • De Novo Assembly of Diploid Genomes • Both WGS and Targeted Approaches
  17. 17. 10X Genomics Overview (Church 10X Genomics)
  18. 18. 10X Genomics Phasing – Important for Het vs. Repeat Copy Resolution (Church 10X Genomics)
  19. 19. (Church 10X Genomics)
  20. 20. BioNano
  21. 21. Bionano Stats from Human Cell Lines

      Genome    Coverage   Mol N50 (Kb)   # of Map Contigs   Contig N50 (Mb)   Total Map Size (Gb)
      NA19240   96X        174.9          3148               1.26              2.85
      NA19238   93X        216.9          2798               1.47              2.93
      NA19239   118X       201            2565               1.68              2.96
      HG00733   157X       202.9          2484               1.69              2.92
      HG00514   161X       211.7          3025               1.35              2.83
      NA12878   134X       202.7          2739               1.46              2.84
      HG01352   117X       184.5          3666               1.01              2.80
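      One quick way to read the Bionano table is to compare each sample's contig N50 with its mean map contig size (total map size divided by the number of map contigs). The sketch below does only that arithmetic on the values from the slide; the mean contig size is our derived quantity, not a statistic reported in the talk, and the helper name `mean_contig_mb` is hypothetical.

```python
# (genome, number of map contigs, contig N50 in Mb, total map size in Gb),
# copied from the Bionano table on the slide.
bionano_stats = [
    ("NA19240", 3148, 1.26, 2.85),
    ("NA19238", 2798, 1.47, 2.93),
    ("NA19239", 2565, 1.68, 2.96),
    ("HG00733", 2484, 1.69, 2.92),
    ("HG00514", 3025, 1.35, 2.83),
    ("NA12878", 2739, 1.46, 2.84),
    ("HG01352", 3666, 1.01, 2.80),
]


def mean_contig_mb(total_map_gb, n_contigs):
    """Return the mean map contig size in Mb (derived, not from the slide)."""
    if n_contigs <= 0:
        raise ValueError("n_contigs must be positive")
    return total_map_gb * 1000.0 / n_contigs


for genome, n_contigs, contig_n50_mb, total_gb in bionano_stats:
    mean_mb = mean_contig_mb(total_gb, n_contigs)
    print(f"{genome}: mean contig {mean_mb:.2f} Mb, contig N50 {contig_n50_mb:.2f} Mb")
```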
  22. 22. Large Inversion in HG00514
  23. 23. Printrepeats showing ~25kb Inverted Repeat
  24. 24. Read Mapping of Short Reads [Diagram: reference sequence with short reads aligned beneath it; several read placements are marked "?"]
  25. 25. Short Read Assembly [Diagram: the same short reads, several marked "?", with the resulting assembled sequence]
  26. 26. Long (PacBio) Reads [Diagram: reference sequence with long reads aligned beneath it]
  27. 27. 10X Linked Reads [Diagram: reference sequence with linked reads aligned beneath it]
  28. 28. 10X Linked Reads • We only achieve ~0.2X per Molecule [Diagram: linked reads aligned to the reference sequence, with positions marked "X"]
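      To put the roughly 0.2X per-molecule read coverage in perspective, the sketch below estimates what fraction of a molecule's bases is touched by at least one read. It assumes reads land uniformly at random along the molecule, so that per-base depth is approximately Poisson distributed; that model is our assumption (a standard Lander-Waterman style approximation), not something stated in the talk.

```python
import math


def covered_fraction(depth):
    """Expected fraction of bases covered at least once at a given mean depth.

    Assumes per-base depth is Poisson distributed, so the chance that a
    base is missed entirely is exp(-depth).
    """
    if depth < 0:
        raise ValueError("depth must be non-negative")
    return 1.0 - math.exp(-depth)


per_molecule_depth = 0.2  # approximate figure quoted on the slide
print(f"one molecule: {covered_fraction(per_molecule_depth):.1%} of bases covered")

# Pooling reads from several molecules that span the same locus raises the
# effective depth roughly in proportion to the number of molecules.
for n_molecules in (5, 10, 20):
    pooled = covered_fraction(n_molecules * per_molecule_depth)
    print(f"{n_molecules} molecules: {pooled:.1%} of bases covered")
```

      Under this model a single molecule leaves most of its bases unread, which is why linking information comes from many barcoded molecules overlapping a region rather than from any one of them.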
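  29. 29. 10X Linked Reads – Resolving Alleles vs Repeats [Diagram: linked reads aligned to a reference sequence containing a T/G site, with positions marked "X"]
  30. 30. BioNano Map [Diagram: reference sequence with Nick Sites marked]
  31. 31. BioNano Map [Diagram: reference sequence with Nick Sites marked] • Indicates Flipped Loop of Inverted Repeat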
  32. 32. Future Plans • Refine Existing Platforms • Longer Linking • Longer Sequences • Cost Reductions • Investigate New Platforms • PacBio Sequel • Oxford Nanopore • Investigate New Techniques • Hybridization of Long Linked Reads in Lieu of Large Insert Clones to Capture Allelic Diversity Across as Many Humans as Possible
  33. 33. Summary • Goal: Generate Robust Data Sets for Additional High-quality Reference Genomes Enhancing the Full Range of Genetic Diversity in Humans • These Long Read (Long Range) Sequencing/Mapping Applications Provide Orthogonal, Synergistic Data Sets to Help Accomplish Our Goal. • Each System Possesses Unique Challenges and Requires Optimization of Protocols and Running Conditions Specific to Our Needs. • Experience and Communication are Key. (Magrini)
  34. 34. Acknowledgements • The McDonnell Genome Institute at Washington University in St. Louis: Tina Graves, Amy Ly, Lisa Cook, Catrina Fronick, Karyn Meltz Steinberg, Wes Warren, Chad Tomlinson, Eddie Belter, Susan Dutcher • 10X Genomics: Deanna Church, Michael Chase • BioNano Genomics: Alex Hastie • Pacific Biosciences: Nick Sisneros, Laura Nolden • Nationwide Children’s Hospital: Rick Wilson, Vince Magrini, Sean McGrath • NCBI: Valerie Schneider