2. Sanger sequencing has been the dominant DNA sequencing method for 30
years but…
…hunger for even greater sequencing throughput and more economical
sequencing technology…
NGS has the ability to process millions of sequence reads in parallel
rather than 96 at a time (1/6 of the cost)
Objections:
fidelity, read length, infrastructure cost, handling large volumes of data
Sanger vs NGS
3. NGS Sequence technologies
• Obsolete
  • 454
  • SOLiD
• Supported, not used much in genome assembly
  • Ion Torrent (Ion PGM)
  • Ion Proton
• Current workhorses
  • Illumina
  • Pacific Biosciences
• Up and coming
  • Oxford Nanopore
  • 10x Genomics - GemCode
7. 454 sequencing (Pyrosequencing)
454 Sequencing uses a large-scale parallel pyrosequencing system
capable of sequencing roughly 400-600 megabases of DNA per 10-hour
run
The system relies on fixing nebulized and adapter-ligated DNA
fragments to small DNA-capture beads in a water-in-oil emulsion. The
DNA fixed to these beads is then amplified by PCR. Each DNA-bound
bead is placed into a ~29 μm well on a PicoTiterPlate, a fiber optic chip.
A mix of enzymes such as DNA polymerase, ATP sulfurylase, and
luciferase is also packed into the well.
8. Pyrosequencing differs from Sanger sequencing in that it relies on
the detection of pyrophosphate release on nucleotide incorporation,
rather than chain termination with dideoxynucleotides.
The DNA sequence is determined from the light emitted when the next
complementary nucleotide is incorporated. Only one of the four
possible nucleotides (A, T, C, or G) is added and available at a
time, so a light signal identifies which base was incorporated on
the single-stranded template.
9. The single-strand DNA (ssDNA) template is hybridized to a sequencing
primer and incubated with the enzymes DNA polymerase, ATP sulfurylase,
luciferase and apyrase, and with the substrates adenosine
5´ phosphosulfate (APS) and luciferin.
1. The addition of one of the four deoxynucleoside triphosphates (dNTPs)
initiates the reaction. (dATPαS, which is not a substrate for
luciferase, is added instead of dATP to avoid noise.) DNA polymerase
incorporates the correct, complementary dNTPs onto the template.
This incorporation releases pyrophosphate (PPi).
10. 2. ATP sulfurylase converts PPi to ATP in the presence of adenosine
5´ phosphosulfate. This ATP acts as a substrate for the
luciferase-mediated conversion of luciferin to oxyluciferin that
generates visible light in amounts that are proportional to the amount
of ATP. The light produced in the luciferase-catalyzed reaction is
detected by a camera and analyzed in a pyrogram.
3. Unincorporated nucleotides and ATP are degraded by the apyrase, and
the reaction can restart with another nucleotide.
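The three steps above can be sketched as a small simulation. This is only an illustrative model, not the instrument's software: the function name `simulate_pyrogram`, the flow order "TACG", and the convention of one light unit per incorporated nucleotide are assumptions made for the example. The template is given in the order the polymerase reads it.

```python
def simulate_pyrogram(template, flow_order="TACG", n_flows=12):
    """Return a list of (nucleotide, light_units), one entry per flow."""
    complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
    # The new strand is complementary to the template.
    to_synthesize = [complement[base] for base in template]
    position = 0
    pyrogram = []
    for flow in range(n_flows):
        nucleotide = flow_order[flow % len(flow_order)]
        incorporated = 0
        # Step 1: polymerase extends while the dispensed dNTP matches.
        while position < len(to_synthesize) and to_synthesize[position] == nucleotide:
            incorporated += 1
            position += 1
        # Step 2: each incorporation releases one PPi, which becomes ATP
        # and then light, so light is proportional to the count.
        pyrogram.append((nucleotide, incorporated))
        # Step 3: apyrase clears leftover dNTP and ATP before the next flow
        # (nothing carries over in this model).
    return pyrogram


if __name__ == "__main__":
    for nucleotide, light in simulate_pyrogram("ATTGC", n_flows=4):
        print(nucleotide, light)
```

A flow with zero light units simply means the dispensed nucleotide did not match the next template position.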
13. Library prep
Genomic DNA is fractionated into smaller fragments (300-800 base pairs)
and polished (made blunt at each end). Short adaptors (A and B) are then
ligated onto the ends of the fragments.
These adaptors provide priming sequences for both amplification and
sequencing of the sample-library fragments. One adaptor (Adaptor B)
contains a 5'-biotin tag for immobilization of the DNA library onto
streptavidin-coated beads
After nick repair, the non-biotinylated strand is released and used as a
single-stranded template DNA (sstDNA) library.
14. The sstDNA library is immobilized onto beads. The beads containing a
library fragment carry a single sstDNA molecule. The bead-bound library
is emulsified with the amplification reagents in a water-in-oil mixture
15. Sequencing
Single-stranded template DNA library beads are added to the
DNA Bead Incubation Mix (containing DNA polymerase) and are
layered with Enzyme Beads (containing sulfurylase and
luciferase) onto a PicoTiterPlate device.
The device is centrifuged to deposit the beads into the wells. The
layer of Enzyme Beads ensures that the DNA beads remain
positioned in the wells during the sequencing reaction. The
bead-deposition process is designed to maximize the number of
wells that contain a single amplified library bead.
16. The fluidics sub-system delivers sequencing reagents (containing
buffers and nucleotides) across the wells of the plate. The four
DNA nucleotides are added sequentially in a fixed order across
the PicoTiterPlate device during a sequencing run. During the
nucleotide flow, millions of copies of DNA bound to each of the
beads are sequenced in parallel.
When a nucleotide complementary to the template strand is
added into a well, the polymerase extends the existing DNA
strand by adding nucleotide(s). Addition of one (or more)
nucleotide(s) generates a light signal that is recorded by the CCD
camera in the instrument. This technique is based on sequencing-
by-synthesis and is called pyrosequencing
17. The signal strength is proportional to the number of nucleotides
incorporated; for example, homopolymer stretches incorporated in a single
nucleotide flow generate a greater signal than single nucleotides.
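Because signal scales with the number of incorporated nucleotides, a flowgram can be turned back into bases by rounding each intensity. The sketch below assumes intensities already normalized so that 1.0 means a single incorporation; the function name `call_bases` and the flow order "TACG" are illustrative choices, not the vendor's algorithm.

```python
def call_bases(flow_signals, flow_order="TACG"):
    """Convert normalized per-flow light intensities into a base sequence."""
    sequence = []
    for i, signal in enumerate(flow_signals):
        nucleotide = flow_order[i % len(flow_order)]
        # Round to the nearest whole number of incorporations (never negative).
        run_length = max(0, int(signal + 0.5))
        sequence.append(nucleotide * run_length)
    return "".join(sequence)


if __name__ == "__main__":
    # Flows: T, A, C, G, T, A
    print(call_bases([1.04, 0.02, 2.1, 0.9, 0.0, 3.2]))
```

The relative gap between a run of n and a run of n+1 shrinks as n grows, which is one way to see why long homopolymers are the typical error source for this chemistry.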
18. Illumina sequencing
This sequencing method is based on reversible dye-terminators
that enable the identification of single bases as they are
introduced into DNA strands. It is often employed to sequence
difficult regions, such as homopolymers and repetitive
sequences. It can also be used for whole-genome and region
sequencing, transcriptome analysis, metagenomics, small RNA
discovery, methylation profiling, and genome-wide protein-
nucleic acid interaction analysis.
20. Tagmentation
The first step after DNA purification is tagmentation. Enzymes called
transposomes randomly cut the DNA into short segments (“tags”).
Adapters are added on either side of the cut points (ligation). Strands that
fail to have adapters ligated are washed away
21. Reduced Cycle Amplification
During this step, sequences for primer binding, indices, and terminal
sequences are added. Indices are usually six base pairs long and are used
during DNA sequence analysis to identify samples. Indices allow for up to
96 different samples to be run together
During analysis, the computer will group all reads with the same index
together
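The grouping step can be sketched in a few lines. This is a minimal illustration, assuming each read arrives as an (index, sequence) pair and that a sample sheet maps each index to a sample name; the names `demultiplex`, `index_to_sample` and the "undetermined" bucket are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, index_to_sample):
    """Group read sequences by sample, using the index attached to each read."""
    by_sample = defaultdict(list)
    for index, sequence in reads:
        # Reads whose index is not in the sample sheet are set aside.
        sample = index_to_sample.get(index, "undetermined")
        by_sample[sample].append(sequence)
    return dict(by_sample)


if __name__ == "__main__":
    sheet = {"ACGTAC": "sample_1", "GGATCC": "sample_2"}
    reads = [("ACGTAC", "TTGA"), ("GGATCC", "CCAT"),
             ("ACGTAC", "GGGA"), ("NNNNNN", "AAAA")]
    print(demultiplex(reads, sheet))
```

This version requires an exact index match; tolerating a mismatch in the index would need an extra comparison step that is not shown here.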
22. The terminal sequences are used for attaching the DNA strand to the
flow cell
This process takes place inside of an acrylamide-coated glass flow
cell. The flow cell has oligonucleotides (short nucleotide sequences)
coating the bottom of the cell, and they serve to hold the DNA
strands in place during sequencing. The oligos match the two kinds
of terminal sequences added to the DNA during reduced cycle
amplification. As the DNA enters the flow cell, one of the adapters
attaches to a complementary oligo.
23. Bridge amplification
Once attached, cluster generation can begin. The goal is to create
hundreds of identical strands of DNA. Some will be the forward strand;
the rest, the reverse. Clusters are generated through bridge amplification.
Polymerases move along a strand of DNA, creating its complementary
strand. The original strand is washed away, leaving only the reverse
strand. At the top of the reverse strand there is an adapter sequence. The
DNA strand bends and attaches to the oligo that is complementary to the
top adapter sequence. Polymerases attach to the reverse strand, and its
complementary strand (which is identical to the original) is made.
24. The now double stranded DNA is denatured so that each strand
can separately attach to an oligonucleotide sequence anchored
to the flow cell. One will be the reverse strand; the other, the
forward. This process is called bridge amplification, and it
happens for thousands of clusters all over the flow cell at once
25. Clonal Amplification
Polymerases will synthesize a new strand to create a double stranded
segment, and that will be denatured so that all of the DNA strands in
one area are from a single source (clonal amplification).
Clonal amplification is important for quality control purposes. If a strand
is found to have an odd sequence, then scientists can check the reverse
strand to make sure that it has the complement of the same oddity
The forward and reverse strands act as checks to guard against artifacts.
Because Illumina sequencing uses polymerases, base substitution errors
have been observed, especially at the 3’ end. Paired end reads
combined with cluster generation can confirm an error took place.
A minimum threshold of 97% similarity has been used in some labs’
analyses
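The forward/reverse check can be illustrated as follows. The slide does not say exactly what the 97% threshold is applied to, so this sketch assumes it is the percent identity between the forward read and the reverse-complemented reverse read; all function names are made up for the example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement every base, then reverse the strand."""
    return seq.translate(COMPLEMENT)[::-1]


def percent_identity(a, b):
    """Percentage of matching positions between two equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def strands_agree(forward_read, reverse_read, threshold=97.0):
    """True if the reverse read supports the forward read at the threshold."""
    return percent_identity(forward_read, reverse_complement(reverse_read)) >= threshold


if __name__ == "__main__":
    forward = "ACGT" * 25
    reverse = reverse_complement(forward)
    print(strands_agree(forward, reverse))
```

An oddity present in the forward read but absent from the reverse read lowers the identity, which is the signal that it may be an artifact.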
26. Sequence by Synthesis
At the end of bridge amplification, all of the reverse strands are washed off
the flow cell, leaving only forward strands. Primers attach to the forward
strands, and a polymerase adds fluorescently tagged nucleotides to the DNA strand. Only
one base is added per round. A reversible terminator is on every nucleotide
to prevent multiple additions in one round. Each of the four bases has a
unique emission, and after each round, the machine records which base
was added. This process is “sequence by synthesis.”
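Since one base is added per round and each base has its own emission, base calling reduces to picking the brightest channel in each round. The sketch below assumes one intensity value per base per round; real instruments also correct for cross-talk and phasing, which is not modeled here, and the function names are illustrative.

```python
def call_cycle(intensities):
    """Return the base whose emission channel is brightest in one round."""
    return max(intensities, key=intensities.get)


def call_read(cycles):
    """Call one base per round and join them into a read."""
    return "".join(call_cycle(cycle) for cycle in cycles)


if __name__ == "__main__":
    cycles = [
        {"A": 0.90, "C": 0.10, "G": 0.05, "T": 0.02},
        {"A": 0.10, "C": 0.05, "G": 0.80, "T": 0.10},
        {"A": 0.00, "C": 0.10, "G": 0.10, "T": 0.95},
    ]
    print(call_read(cycles))
```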
27. SOLiD sequencing
SOLiD (Sequencing by Oligonucleotide Ligation and Detection) is a
next-generation DNA sequencing technology developed by Life
Technologies and has been commercially available since 2006. This
next generation technology generates hundreds of millions to billions
of small sequence reads at one time.
This method should not be confused with "sequencing by synthesis,"
a principle used by Roche-454 pyrosequencing and Illumina
28. A library of DNA fragments is prepared from the sample to be
sequenced, and is used to prepare clonal bead populations. That is,
only one species of fragment will be present on the surface of each
magnetic bead. The fragments attached to the magnetic beads will
have a universal P1 adapter sequence attached so that the starting
sequence of every fragment is both known and identical. Emulsion PCR
takes place in microreactors containing all the necessary reagents for
PCR. The resulting PCR products attached to the beads are then
covalently bound to a glass slide.
29. Primers hybridize to the P1 adapter sequence within the library
template. A set of four fluorescently labelled di-base probes compete
for ligation to the sequencing primer. Specificity of the di-base probe is
achieved by interrogating every 1st and 2nd base in each ligation
reaction. Multiple cycles of ligation, detection and cleavage are
performed with the number of cycles determining the eventual read
length. Following a series of ligation cycles, the extension product is
removed and the template is reset with a primer complementary to
the n-1 position for a second round of ligation cycles.
Five rounds of primer reset are completed for each sequence tag.
Through the primer reset process, each base is interrogated in two
independent ligation reactions by two different primers.
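Di-base probing means each color reports on a pair of adjacent bases rather than a single base. The slide does not give the color table, so the sketch below assumes the commonly described two-base code in which four colors (0 to 3) cover the 16 dinucleotides, written here as an XOR of 2-bit base codes. The function names are illustrative.

```python
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_BASE = "ACGT"


def encode_colors(sequence):
    """Return the first base plus one color (0-3) per adjacent base pair."""
    colors = [BASE_CODE[a] ^ BASE_CODE[b] for a, b in zip(sequence, sequence[1:])]
    return sequence[0], colors


def decode_colors(first_base, colors):
    """Rebuild the bases from a known first base and the color calls."""
    bases = [first_base]
    for color in colors:
        # Each color, combined with the previous base, fixes the next base.
        bases.append(CODE_BASE[BASE_CODE[bases[-1]] ^ color])
    return "".join(bases)


if __name__ == "__main__":
    first, colors = encode_colors("ATGGC")
    print(first, colors)
    print(decode_colors(first, colors))
```

Decoding needs a known starting base, which is why the identical, known P1 adapter sequence at the start of every fragment matters.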
31. Ion Torrent sequencing
A method of DNA sequencing based on the detection of hydrogen ions
that are released during the polymerization of DNA.
A microwell containing a template DNA strand to be sequenced is flooded
with a single species of deoxyribonucleotide triphosphate (dNTP). If the
introduced dNTP is complementary to the leading template nucleotide, it
is incorporated into the growing complementary strand. This causes the
release of a hydrogen ion that triggers an ion sensor, which indicates that
a reaction has occurred. If homopolymer repeats are present in the
template sequence, multiple dNTP molecules will be incorporated in a
single cycle. This leads to a corresponding number of released hydrogen ions
and a proportionally higher electronic signal.
32. SMRT sequencing (Pacific Biosciences)
Single molecule real time sequencing (SMRT) is a parallelized single molecule
DNA sequencing method. Single molecule real time sequencing utilizes a zero-
mode waveguide (ZMW). A single DNA polymerase enzyme is affixed at the
bottom of a ZMW with a single molecule of DNA as a template.
The ZMW is a structure that creates an illuminated observation volume that is
small enough to observe only a single nucleotide of DNA being incorporated by
DNA polymerase. Each of the four DNA bases is attached to one of four different
fluorescent dyes. When a nucleotide is incorporated by the DNA polymerase, the
fluorescent tag is cleaved off and diffuses out of the observation area of the ZMW
where its fluorescence is no longer observable.
A detector detects the fluorescent signal of the nucleotide incorporation, and the
base call is made according to the corresponding fluorescence of the dye.
33. Nanopore sequencing
• It works by monitoring changes to an electrical current as nucleic acids
are passed through a protein nanopore. The resulting signal is decoded
to provide the specific DNA or RNA sequence
• Pros: Extremely long sequences, single molecule, portable
• Cons: Very high error rates (38%!)
36. Comparison of different sequencing platforms
Platform              | Chemistry             | Read Length (bp)    | Run Time            | Gb/Run           | Advantage                   | Disadvantage
----------------------|-----------------------|---------------------|---------------------|------------------|-----------------------------|-----------------------------------------
454 GS Junior (Roche) | Pyrosequencing        | 500                 | 8 hrs.              | 0.04             | Long Read Length            | High error rate in homopolymer
454 GS FLX+ (Roche)   | Pyrosequencing        | 700                 | 23 hrs.             | 0.7              | Long Read Length            | High error rate in homopolymer
HiSeq (Illumina)      | Reversible Terminator | 2*100               | 2 days (rapid mode) | 120 (rapid mode) | High-throughput / cost      | Short reads; Long run time (normal mode)
SOLiD (Life)          | Ligation              | 85                  | 8 days              | 150              | Low Error Rate              | Short reads; Long run time
Ion Proton (Life)     | Proton Detection      | 200                 | 2 hrs.              | 100              | Short Run times             | New*
PacBio RS             | Real-time Sequencing  | 3000 (up to 15,000) | 20 min              | 3                | No PCR; Longest Read Length | High Error Rate
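One way to read the comparison is to divide Gb per run by run time. The sketch below uses only the run times and yields listed in the comparison (rapid-mode figures for HiSeq); the dictionary and function names are illustrative, and the result ignores library preparation time, cost, read length and error rate.

```python
# (run time in hours, Gb per run), taken from the platform comparison
RUNS = {
    "454 GS Junior": (8, 0.04),
    "454 GS FLX+": (23, 0.7),
    "HiSeq (rapid mode)": (48, 120),
    "SOLiD": (8 * 24, 150),
    "Ion Proton": (2, 100),
    "PacBio RS": (20 / 60, 3),
}


def gb_per_hour(run_hours, gb_per_run):
    """Raw sequencing output per hour of instrument time."""
    return gb_per_run / run_hours


def rank_by_throughput(runs):
    """Return (platform, Gb/hour) pairs, highest throughput first."""
    rates = {name: gb_per_hour(hours, gb) for name, (hours, gb) in runs.items()}
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, rate in rank_by_throughput(RUNS):
        print(f"{name:20s} {rate:8.3f} Gb/hour")
```

The Ion Proton row is marked as new in the comparison, so its position in such a ranking should be treated with caution.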
38. Inherent Challenges
• Isolation of single cells
• Maintaining cells in vitro (cell viability)
• Mediating PCR bias
• Contamination-free environment
• High cost associated with high throughput volume
39. The Future
• More Reads
• Longer Reads
• Faster Sequencing
• Cheaper Sequencing
Editor's Notes
MinION is a portable device for molecular analyses that is driven by nanopore technology. It is adaptable for the analysis of DNA, RNA, proteins or small molecules with a straightforward workflow.
Gives a read length of 230-300, requires a sample of 10 pg to 1 µg, and the total data generated is 7 TB.
SOLiD has lowest error rate. Good for resequencing.
Illumina has the highest throughput / cost.
PacBio has the longest read length, up to 15,000 bp. Applications in de novo sequencing and structural variation studies.
Ion Torrent from Life Technologies has two instruments. The Personal Genome Machine has been out since 2010 and is mostly applicable to small genome sequencing, such as bacterial genomes. The Ion Proton promises much greater throughput and has recently been made commercially available. There is, however, limited information on its actual performance.