PREPARED BY
ARUNDHATI MEHTA
© Arundhati Mehta, 2017
DNA SEQUENCING
• DNA sequencing is the process of determining the order of the
nucleotides, or bases (A, T, G, C), that make up an
organism's DNA.
F. Sanger
Sanger sequencing
History of DNA sequencing
1953
Discovery of the structure of the
DNA double helix
1972
Development of recombinant
DNA technology.
1977
The first complete DNA genome
to be sequenced is that of
bacteriophage φX174, and
Frederick Sanger publishes
"DNA sequencing with chain-terminating
inhibitors".
1984
Medical Research Council
scientists decipher the complete
DNA sequence of the
Epstein-Barr virus, 170 kb.
1987
Applied Biosystems markets first
automated sequencing machine,
the model ABI 370.
1990
The U.S. National Institutes of
Health (NIH) begins large-scale
sequencing trials on M.
capricolum, E. coli,
Caenorhabditis elegans and S.
cerevisiae.
1995
Craig Venter, Hamilton Smith
and colleagues publish the first
complete genome of a bacterium,
H. influenzae (by whole-genome
shotgun sequencing).
1996
Pål Nyrén and his student
Mostafa Ronaghi at the Royal
Institute of Technology in
Stockholm publish their method
of Pyrosequencing
1998
Phil Green and Brent Ewing
of the University of
Washington publish "phred"
for sequencer data analysis.
2001
A draft sequence of the
human genome is published.
2004
454 Life Sciences markets a
parallelized version of
Pyrosequencing.
2006
Era of Next Generation
Sequencing: 454
Sequencing, Illumina, etc.
ERA OF SEQUENCING
1st Generation sequencing
• Sequence many identical molecules
• Sequencing in large gels or capillary tubing limits
scale
Sanger Chain Termination
(1977)
Maxam-Gilbert Sequencing
(1977)
ABI PRISM 377
1st Generation Sequencing
• Sequence many identical
molecules
• Sequencing in large gels or
capillary tubing limits scale
2nd Generation Sequencing
• Sequence millions of clonally
amplified molecules per run
• Using a reversible, stepwise
sequencing chemistry
• Immobilized on a surface
ERA OF SEQUENCING
QIAGEN GeneReader
Life Technologies/Applied Biosystems SOLiD 5500
Illumina MiSeq
Roche/454 Pyrosequencer
NEXT GENERATION SEQUENCING
High throughput DNA Sequencing Technique.
Employs Micro and Nanotechnologies
Reduce sample size.
Low Reagent cost
Less Time
Massively Parallel Sequencing
Sequences thousands to millions of fragments at once.
Produces an enormous amount of data.
NGS WORKFLOW
Sample extraction, DNA fragmentation and in vitro adapter ligation, followed by:
• Clonal amplification by bridge PCR → sequencing-by-synthesis (Solexa technology)
• Clonal amplification by emulsion PCR → pyrosequencing (454 sequencing)
• Clonal amplification by emulsion PCR → sequencing-by-ligation (SOLiD platform)
NGS WORKFLOW
1. Create DNA fragments
2. Add platform-specific adapter sequences to every fragment.
Adapter molecules: bind the library to a flow cell or bead, add sequencing-primer
binding sites, and add barcodes for multiplexing.
[Figure: adapter molecule, adapter ligation point, and adapter molecule bound to DNA]
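The adapter roles listed on this slide (primer binding site, barcode for multiplexing) can be illustrated with a minimal Python sketch. The primer-site sequence, the barcodes, the sample names and the adapter layout below are all invented for illustration; they do not correspond to any real platform's adapters.

```python
# Hypothetical adapter layout (an assumption): [barcode][primer site][insert]
PRIMER_SITE = "ACGTACGT"              # made-up sequencing-primer binding site
BARCODES = {"AAGG": "sample_1",       # made-up barcodes used for multiplexing
            "CCTT": "sample_2"}
BARCODE_LEN = 4


def ligate_adapter(fragment, barcode):
    """Attach a barcode and primer site to the start of a DNA fragment."""
    return barcode + PRIMER_SITE + fragment


def demultiplex(read):
    """Return (sample, insert) for a read, or (None, None) if unrecognised."""
    barcode = read[:BARCODE_LEN]
    rest = read[BARCODE_LEN:]
    sample = BARCODES.get(barcode)
    if sample is None or not rest.startswith(PRIMER_SITE):
        return None, None
    return sample, rest[len(PRIMER_SITE):]


# Build a tiny pooled library from two samples, then sort the reads back out.
library = [ligate_adapter("TTGACCA", "AAGG"),
           ligate_adapter("GGCATTA", "CCTT")]
results = [demultiplex(read) for read in library]
```

Because every fragment carries its sample's barcode, reads from several samples can be pooled in one run and separated afterwards.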
Cluster Amplification:
Bridge PCR
DNA fragments are flanked with adaptors (the library).
A solid surface is coated with primers complementary
to the two adaptor sequences.
Isothermal amplification proceeds with one end of each "bridge"
tethered to the surface.
Clusters of DNA molecules are generated on the chip.
Each cluster originates from a single DNA fragment
and is thus a clonal population.
Cluster Amplification :
Emulsion PCR
Fragments with adaptors (the library) are PCR-amplified within a water droplet in oil.
One PCR primer is attached to the surface of a bead.
DNA molecules are synthesized on the bead in the water droplet. Each bead bears
clonal DNA originating from a single DNA fragment.
Beads (with attached DNA) are then deposited into the wells of a sequencing chip:
one well, one bead.
Sequencing & Imaging Technologies:
Cyclic Reversible Termination
Sequencing by Cyclic Reversible Termination (CRT): CRT uses
reversible terminators in a cyclic method that comprises nucleotide
incorporation, fluorescence imaging and cleavage. Reads are typically
100-150 bp long.
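The three-step CRT cycle (incorporation, imaging, cleavage) can be sketched as a toy simulation. This is a simplified model under stated assumptions: the dye colours are arbitrary, strand orientation is ignored, and the template is read position by position with no errors.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Arbitrary dye colours, chosen only for this illustration.
DYE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}


def crt_read(template, cycles):
    """Simulate CRT: exactly one base is added and imaged per cycle."""
    read = []
    colours = []
    for position in range(min(cycles, len(template))):
        # 1. Incorporation: the terminator blocks extension after one base.
        base = COMPLEMENT[template[position]]
        # 2. Fluorescence imaging: the dye colour identifies the base.
        colours.append(DYE[base])
        read.append(base)
        # 3. Cleavage: dye and blocking group are removed, so the next
        #    cycle can extend by one more base.
    return "".join(read), colours


read, colours = crt_read("TTGCA", cycles=3)
```

Note that the two identical template bases at the start are read in two separate cycles, one base each, so the read length equals the number of cycles.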
Sequencing & Imaging Technologies:
Sequencing by Ligation
Sequencing by Ligation (SBL) uses the enzyme DNA ligase to identify the nucleotide present
at a given position in a DNA sequence.
Sequencing & Imaging Technologies:
Pyrosequencing
Pyrosequencing: a non-electrophoretic, bioluminescence method that measures the
release of inorganic pyrophosphate by proportionally converting it into visible light using a
series of enzymatic reactions.
Nucleotide incorporation generates light, seen as a peak
in the pyrogram trace.
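Because the light signal is proportional to the pyrophosphate released, the height of each pyrogram peak indicates how many nucleotides were incorporated in that flow. A minimal decoding sketch follows; the flow order "TACG", the signal values and the normalisation (1.0 = one nucleotide) are assumptions made for illustration.

```python
def decode_pyrogram(flow_order, signals):
    """Convert per-flow light signals into the synthesized base sequence.

    signals[i] is the non-negative peak height for flow i, normalised so
    that 1.0 corresponds to a single incorporated nucleotide.
    """
    sequence = []
    for i, signal in enumerate(signals):
        nucleotide = flow_order[i % len(flow_order)]
        count = int(signal + 0.5)        # round to the nearest whole count
        sequence.append(nucleotide * count)
    return "".join(sequence)


# Hypothetical peak heights for two rounds of the flow order T, A, C, G.
called = decode_pyrogram("TACG", [1.02, 0.05, 2.1, 0.9, 0.0, 0.97, 0.1, 3.04])
```

A peak near 2 or 3 is called as a run of two or three identical bases, which is why calling long homopolymers depends on how well peak heights can be told apart.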
Sequencing & Imaging Technologies:
Single Molecule-Real Time Sequencing
Single Molecule Real Time (SMRT) sequencing is a parallelized single-molecule DNA
sequencing method. A single DNA polymerase enzyme is affixed at the bottom of a zero-mode
waveguide (ZMW) with a single molecule of DNA as a template.
[Figure: DNA polymerase at the bottom of a ZMW]
NGS Technologies Overview
NGS platforms differ in template preparation, sequencing and imaging, and data analysis.
Commercially available technologies:
Illumina/Solexa
Roche/454
Helicos BioSciences
Life/APG – SOLiD system
Pacific Biosciences
Ion Torrent technology
Solid-phase amplification can produce 100-200 million spatially separated clusters, providing free
ends to which a universal sequencing primer can be hybridized to initiate the NGS reaction
ILLUMINA/SOLEXA SEQUENCING
 Run time: 1–10 days
 Produces: 2–1000 Gb of sequence
 Read length: 2 x 50 bp – 2 x 250 bp
(paired-end)
 Cost: $0.05–$0.40/Mb
Clonal amplification: bridge PCR
 Applications
 DNA sequencing
 Gene Regulation Analysis
 Sequencing-based Transcriptome Analysis
 SNPs and SVs discovery
 Cytogenetic Analysis
 ChIP-sequencing
 Small RNA discovery analysis
 A whole human genome sequence was
determined in 8 weeks to an average
depth of ~40X, discovering ~4 million
new SNPs and ~400,000 SVs (with an
error rate of <1% for both over-calls and
under-calls)
Over 1800
publications.
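The ~40X average depth quoted above follows from a simple relation: mean depth = (number of reads × read length) / genome size. The sketch below works through the arithmetic; the genome size of 3.1 Gb is an assumed round figure for the human genome, and the 2 x 100 bp read length is one value from the range given on the Illumina slide.

```python
import math

GENOME_SIZE_BP = 3_100_000_000   # assumed approximate human genome size


def mean_depth(n_reads, read_length_bp, genome_size_bp=GENOME_SIZE_BP):
    """Average number of times each base is covered."""
    return n_reads * read_length_bp / genome_size_bp


def reads_needed(target_depth, read_length_bp, genome_size_bp=GENOME_SIZE_BP):
    """Smallest number of reads that reaches the target mean depth."""
    return math.ceil(target_depth * genome_size_bp / read_length_bp)


# Each 2 x 100 bp read pair contributes 200 bp of sequence.
pairs_for_40x = reads_needed(40, 2 * 100)
depth_check = mean_depth(pairs_for_40x, 2 * 100)
```

Under these assumptions, 40X coverage needs about 620 million read pairs, or roughly 124 Gb of raw sequence.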
ROCHE/454 SEQUENCING
• Sequences many fragments in parallel, producing comparatively long reads, by reading
optical signals as bases are added.
• The DNA or RNA is fragmented into pieces of up to 1 kb.
• Uses emulsion PCR for clonal amplification.
• Uses pyrosequencing as the sequencing approach.
Nucleotide incorporation generates light, seen as a peak in the
pyrogram trace.
Sequence reads from 454 differ in length,
because a different number of bases is
added in each cycle.
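Why read lengths vary can be shown with a toy simulation: each nucleotide flow extends the strand by however many consecutive template positions it matches, so the same number of flows yields different read lengths on different templates. The templates and the flow order "TACG" below are invented for illustration, and the model ignores incomplete extension and other sources of error.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def flow_read(template, flow_order, n_flows):
    """Return the strand synthesized on a template after n_flows flows."""
    read = []
    position = 0
    for i in range(n_flows):
        nucleotide = flow_order[i % len(flow_order)]
        # One flow extends through every consecutive matching position,
        # so a homopolymer run is added in a single flow.
        while (position < len(template)
               and COMPLEMENT[template[position]] == nucleotide):
            read.append(nucleotide)
            position += 1
    return "".join(read)


templates = ["AAAAGGGGCCCC", "ATGCATGCATGC"]   # hypothetical templates
lengths = [len(flow_read(t, "TACG", 8)) for t in templates]
```

After the same eight flows, the homopolymer-rich template has been read to its end (12 bases), while the mixed template has yielded only 8 bases.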
 Applications
 Whole genome sequencing
 Targeted resequencing
 Sequencing-based Transcriptome Analysis
 Metagenomics
Over 1300
publications...
ION TORRENT SEQUENCING
• Ion Torrent and Ion Proton sequencing do not make use of optical signals. Instead, they
exploit the fact that the addition of a dNTP to a growing DNA strand releases an H+ ion.
• Run time: 3 h; no termination or deprotection steps
• Clonal amplification: emulsion PCR
• Read length: 100–300 bp
• Throughput (determined by chip size): 10 Mb – 5 Gb
• Cost: $1–$20/Mb
The pH change, if any, is used to determine how many
bases (if any) were added in each cycle.
LIFE/APG/ABI SOLiD SEQUENCING
• The AB SOLiD™ 3 System generates over 20 gigabases and 400 M tags per run.
Workflow:
1. Library preparation
2. Emulsion PCR / bead enrichment
3. Bead deposition (chemical crosslinking to an amino-coated glass surface)
4. Sequencing by ligation
SANGER vs. NGS
Feature              Sanger                               NGS
Sequencing samples   Clones, PCR products                 DNA libraries
Preparation steps    Few; sequencing-reaction clean-up    Many; complex procedures
Data collection      Samples in plates: 96, 384           Samples on slides: 1-16+
Data                 1 read per sample                    Thousands to millions of reads per sample
Comparison Of NGS Platforms
ADVANTAGES OF NGS
Advantages of NGS over Sanger sequencing:
• No in vivo cloning, transformation or colony picking
• Higher degree of parallelism than capillary sequencing
• Low reagent cost
• Reduced sample size
• Less time
APPLICATIONS OF NGS
• Mutation discovery
• Transcriptome analysis (RNA-Seq)
• Sequencing clinical isolates for strain-to-reference comparisons
• Enabling metagenomics
• Defining DNA-protein interactions (ChIP-Seq)
• Discovering non-coding RNAs
• Molecular diagnostics for oncology and inherited disease studies
• Gene regulation analysis
• Whole-genome sequencing
• Exploring chromatin packaging
FUTURE APPLICATIONS OF NGS
SUMMARY
• Next Generation Sequencing has changed the way we
carry out molecular biology and genomic studies.
• It has allowed us to sequence and annotate genomes at
a faster rate.
• It has allowed us to study variation, expression and
DNA binding at a genome-wide level.