DNA sequencing: what's driving their improvements

These slides show how the improvements in DNA sequencers come mostly from "reductions in scale." As with integrated circuits, reducing the size of features on DNA sequencers has enabled improvements of many orders of magnitude. Unlike integrated circuits, the improvements are also due to changes in technology. For example, changes from pyrosequencing to semiconductor and nanopore sequencing have been needed to achieve the reductions in scale. In addition, pyrosequencing benefited from improvements in lasers and camera chips.

  • What is a genome? All of an organism’s hereditary information

Presentation Transcript

  • A/Prof Jeffrey Funk, Division of Engineering and Technology Management, National University of Singapore
    For information on other technologies, see http://www.slideshare.net/Funk98/presentations
  • What are the important dimensions of performance for DNA sequencers and higher-level systems?
  • What are the rates of improvement?
  • What drives these rapid rates of improvement?
  • Will these improvements continue?
  • What kinds of new higher-level systems will likely emerge from the improvements in DNA sequencers?
  • What does this tell us about the future?
  • Session: Technology
      ◦ 1: Objectives and overview of course
      ◦ 2: Two types of improvements: 1) creating materials that better exploit physical phenomena; 2) geometrical scaling
      ◦ 4: Semiconductors, ICs, electronic systems
      ◦ 5: MEMS and bio-electronic ICs
      ◦ 6: Nanotechnology and DNA sequencing
      ◦ 7: Superconductivity and solar cells
      ◦ 8: Lighting and displays
      ◦ 9: Human-computer interfaces (also roll-to-roll printing)
      ◦ 10: Telecommunications and Internet
      ◦ 11: 3D printing and energy storage
    This is part of the sixth session of MT5009
  • Creating materials (and their associated processes) that better exploit physical phenomena
  • Geometrical scaling
      ◦ Increases in scale
      ◦ Reductions in scale
  • Some technologies directly experience improvements while others indirectly experience them through improvements in “components”
    A summary of these ideas can be found in 1) What Drives Exponential Improvements? California Management Review, Spring 2013; 2) Technology Change and the Rise of New Industries, Stanford University Press, 2013
  • Creating materials (and their associated processes) that better exploit physical phenomena
      ◦ Created materials that enable new techniques of DNA sequencing
  • Geometrical scaling
      ◦ Reductions in scale: smaller feature sizes for each technique (but many new techniques)
      ◦ Increases in scale: larger wash plates and production equipment
  • Some technologies directly experience improvements while others indirectly experience them through improvements in “components”
      ◦ Better lasers and sensors were important for some of the techniques (e.g., pyrosequencing and single-molecule real-time sequencing)
  • Identify the order and identity of the 3 billion nucleotide base pairs in a DNA strand
  • Nucleotides encode the genetic instructions for organisms
  • Four types of nucleotides in a DNA strand
      ◦ Adenine
      ◦ Thymine
      ◦ Cytosine
      ◦ Guanine
  • http://www.genome.gov/sequencingcosts/
  • Read lengths
  • Accuracies
  • Speeds
  • Improvements in these variables also lead to reductions in the cost of sequencing
  • Capability to analyze and use gathered data
      ◦ need better computers
      ◦ need more storage
  • Improvements in DNA sequencers
    Source: Elaine Mardis, Nature 2011, 470: 198-203
  • Why do Costs Fall?
  • New Methods Continue to Emerge
      ◦ Pyrosequencing (454 Life Sciences/Roche and Illumina)
      ◦ Single-molecule real-time sequencing (Pacific Bio)
      ◦ Semiconductor arrays (Ion Torrent)
      ◦ Nanopores (Oxford Nanopore Technologies)
      ◦ Methods of data compression
  • Synthesizing DNA
  • Who Cares? What are the Implications?
  • Conclusions
  • New methods of sequencing
      ◦ Maxam-Gilbert sequencing: relies on cleaving of nucleotides by chemical methods
      ◦ Chain-termination methods (sometimes called the Sanger method): bases are illuminated with UV light, read with X-rays
      ◦ Dye-termination: reading sequences with fluorescent dyes, where each nucleotide emits light at a different wavelength (this technology caused the acceleration)
  • Improved lasers and cameras to read fluorescent dyes
  • More parallel processing
  • Smaller feature sizes, reductions in scale
    http://www.dnasequencing.org/history-of-d
  • But many different approaches are being investigated
    Source: Nature Biotechnology 30(11), 1023-1026, November 2012
  • This can be understood by reading highly cited papers such as
      ◦ “Genome sequencing in micro-fabricated high-density pico-liter reactors” (Margulies, 2005) and
      ◦ “Toward nano-scale genome sequencing” (Ryan et al, 2007).
  • Quote from Ryan et al: “The ability to construct nano-scale structures and perform measurements using novel nano-scale effects has provided new opportunities to identify nucleotides directly using physical, and not chemical, methods.”
  • In fact, just the titles of these papers are fairly suggestive. In all of these examples of decreasing scale, totally new forms of equipment, processes, and factories were required.
  • 1) Separate DNA into smaller strands
  • 2) Make copies of strands (i.e., amplification) with emulsion beads in plastic containers
      ◦ do this with small containers on a large wash plate so that many copies are made in parallel
      ◦ smaller containers and larger wash plates lead to more parallel and faster processing
  • 3) Identify DNA nucleotides utilizing lasers and cameras
      ◦ light is emitted when a nucleotide is incorporated, through an enzyme reaction that involves ATP (adenosine triphosphate)
      ◦ falling costs of lasers and cameras reduce costs
  • 4) Analyze data with computers
  • One source: http://www.454.com/downloads/news-events/how-genome-sequencing-is-done
  • Make copies to improve accuracy through redundancy
  • 454 PicoTiterPlate from 454 Life Sciences
      ◦ contains 1.6 million hexagonal wells
      ◦ each holds 75 picoliters (1 picoliter = 10^-12 liters; <100 micron diameter)
  • These wells can be made much smaller
      ◦ dimensions on integrated circuits (ICs) are on the order of 20 nanometers
      ◦ Is it possible to reduce feature sizes by 1,000 times, or volumes by 10^9?
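The closing question above rests on simple geometry: volume scales with the cube of linear size, so a 1,000-fold reduction in feature size corresponds to a 10^9-fold reduction in volume. A minimal sketch of that arithmetic, using the 75 picoliter well from the slide; the function name and the uniform shrink in all three dimensions are assumptions made for illustration:

```python
# Geometric scaling of a reaction well: volume goes with the cube of linear size.
# Assumption: the well shrinks by the same factor in all three dimensions.

WELL_VOLUME_L = 75e-12      # 75 picoliters per well (figure from the slide)
LINEAR_REDUCTION = 1000     # hypothetical 1,000x reduction in feature size


def scaled_volume(volume_l: float, linear_factor: float) -> float:
    """Return the volume after every linear dimension shrinks by linear_factor."""
    return volume_l / linear_factor ** 3


volume_reduction = LINEAR_REDUCTION ** 3
new_volume_l = scaled_volume(WELL_VOLUME_L, LINEAR_REDUCTION)

print(f"Volume reduction factor: {volume_reduction:.0e}")
print(f"New well volume: {new_volume_l / 1e-21:.0f} zeptoliters")
```

Under these assumptions the shrunken well would hold tens of zeptoliters (1 zeptoliter = 10^-21 liters).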
  • Fluorescent Dyes, Lasers, and Cameras (Step 3)
    As bases move across the wash plate during a sequencing run, a nucleotide (one of the molecules that make up DNA) generates a light signal, which is recorded by the camera. Signal strength is proportional to the number of nucleotides incorporated onto the DNA strands.
  • Eliminate amplification and wash steps with zero-mode waveguides (Pacific BioSciences)
    http://www.youtube.com/watch?v=v8p4ph2MAvI from 1:50 to 3:50
  • Uses zero-mode waveguides
  • They are
      ◦ very small: zeptoliters (1 zeptoliter = 10^-21 liters), 50 nanometers in diameter
      ◦ fabricated in a 100 nm metal film on a silicon dioxide substrate
      ◦ enough room for 600,000 molecules of liquid water at room temperature
      ◦ How much smaller can they be made?
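As a rough consistency check on the figures above, the volume occupied by 600,000 water molecules can be estimated from the density and molar mass of water. This is only a sketch: the density value (about 997 g/L, roughly 25 °C) and the helper names are assumptions, not figures from the slides.

```python
# Rough check: how many water molecules fit in a zeptoliter-scale volume?
# Assumed constants (not from the slides): water density near 25 C and
# the standard molar mass of water.

AVOGADRO = 6.02214076e23          # molecules per mole
WATER_MOLAR_MASS_G = 18.015       # grams per mole
WATER_DENSITY_G_PER_L = 997.0     # grams per liter, approximate
ZEPTOLITER_L = 1e-21


def water_molecules(volume_l: float) -> float:
    """Number of water molecules in volume_l liters of liquid water."""
    moles = volume_l * WATER_DENSITY_G_PER_L / WATER_MOLAR_MASS_G
    return moles * AVOGADRO


def volume_for_molecules(count: float) -> float:
    """Volume in liters occupied by `count` molecules of liquid water."""
    return count * WATER_MOLAR_MASS_G / (WATER_DENSITY_G_PER_L * AVOGADRO)


molecules_per_zl = water_molecules(ZEPTOLITER_L)
volume_600k_zl = volume_for_molecules(600_000) / ZEPTOLITER_L

print(f"Molecules per zeptoliter: {molecules_per_zl:,.0f}")
print(f"Volume holding 600,000 molecules: {volume_600k_zl:.1f} zeptoliters")
```

With these assumed constants, 600,000 molecules correspond to a volume on the order of tens of zeptoliters, which agrees with the "zeptoliters" scale quoted above.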
  • Uses semiconductor chips to sequence DNA by detecting the pH changes produced as A, G, C, and T are incorporated
      ◦ Thus, no lasers, cameras, or amplification are used
  • A microwell containing a template DNA strand is filled with a single species of deoxyribonucleotide triphosphate (dNTP)
      ◦ Beneath the layer of microwells is an ion-sensitive layer, below which is an ISFET ion sensor
      ◦ All layers are contained in a CMOS semiconductor chip
      ◦ If the introduced dNTP is complementary to the leading template nucleotide, it is incorporated into the growing strand
      ◦ This causes the release of a hydrogen ion that triggers the ISFET ion sensor, indicating that a reaction has occurred
    http://www.nature.com/news/2010/101214/full/news.2010.674.html
    http://en.wikipedia.org/wiki/Ion_semiconductor_sequencing
  • Done in massively parallel fashion; for each well:
      ◦ A match causes an ion to be released
      ◦ Multiple matches cause multiple ions to be released
      ◦ No match, no ions are released
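The per-well behavior described above (one nucleotide species per flow, one ion per incorporated base, several ions for a run of identical bases, none for a mismatch) can be sketched as a small simulation. This is an idealized, noise-free illustration and not Ion Torrent's software: the flow order "TACG", the function names, and the decision to ignore strand direction are all assumptions.

```python
# Idealized simulation of flow-based sequencing in a single well.
# Assumptions: noise-free signal, an arbitrary repeating flow order,
# and no modeling of 5'/3' strand direction.

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def simulate_flows(template: str, flow_order: str = "TACG", max_flows: int = 100):
    """Return a list of (flowed nucleotide, ions released) for one well."""
    signals = []
    position = 0
    flow = 0
    while position < len(template) and flow < max_flows:
        nucleotide = flow_order[flow % len(flow_order)]
        ions = 0
        # Every consecutive template base that pairs with the flowed
        # nucleotide is incorporated and releases one hydrogen ion.
        while position < len(template) and COMPLEMENT[template[position]] == nucleotide:
            ions += 1
            position += 1
        signals.append((nucleotide, ions))
        flow += 1
    return signals


def call_bases(signals) -> str:
    """Rebuild the synthesized strand from the per-flow ion counts."""
    return "".join(nucleotide * ions for nucleotide, ions in signals)


template = "AACGT"
signals = simulate_flows(template)
synthesized = call_bases(signals)
recovered_template = "".join(COMPLEMENT[base] for base in synthesized)

print(signals)
print(synthesized, recovered_template)
```

In this sketch the first flow of T meets two consecutive A bases and so releases two ions, while the following flow of A finds no match and releases none. A real chip runs this for every well at once, which is where the parallelism comes from.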
  • While the first sequencers used older semiconductor technology (i.e., large feature sizes), newer ones use smaller feature sizes and thus are faster
    http://www.youtube.com/watch?v=JHzkYDyMzOg&feature=relmfu (2:30-4:15)
  • For example, the first sequencer (314) had 1.2 million wells, while the most recent one (Proton II) has 660 million wells
      ◦ How much smaller can these wells be made?
      ◦ Since 256 GB memory chips (1 byte = 8 bits) exist, might Ion Torrent be able to provide 256 x 8 billion wells, or about 2 trillion wells, in the next few years?
      ◦ After that, improvements may slow, as Ion Torrent's improvements depend on reductions in the feature sizes of semiconductor technology
    http://www.nature.com/news/2010/101214/full/news.2010.674.html
    http://en.wikipedia.org/wiki/Ion_semiconductor_sequencing
    http://www.lifescientist.com.au/article/394936/feature_sequencing_3_0/?pp=2
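The well-count comparison above is simple arithmetic and can be checked directly. The sketch below takes the slide's premise at face value, namely that one sequencing well could eventually be as dense as one memory bit; that equivalence is the slide's speculation, and decimal gigabytes are an additional assumption here.

```python
# Arithmetic behind the "about 2 trillion wells" question.
# Assumptions: 1 GB = 10**9 bytes, and one well per memory bit
# (the slide's premise, not an established equivalence).

WELLS_314 = 1_200_000            # first Ion Torrent chip named on the slide
WELLS_PROTON_II = 660_000_000    # most recent chip named on the slide
MEMORY_CHIP_BYTES = 256 * 10**9  # 256 GB memory chip
BITS_PER_BYTE = 8

hypothetical_wells = MEMORY_CHIP_BYTES * BITS_PER_BYTE
growth_so_far = WELLS_PROTON_II / WELLS_314
remaining_headroom = hypothetical_wells / WELLS_PROTON_II

print(f"Hypothetical wells at memory-bit density: {hypothetical_wells:.3e}")
print(f"Increase from 314 to Proton II: {growth_so_far:.0f}x")
print(f"Further increase needed to reach that density: {remaining_headroom:.0f}x")
```

On these numbers, 256 x 8 billion is 2.048 trillion, and reaching it would require roughly a 3,000-fold increase over the 660 million wells of the Proton II.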
  • Source: Ion Torrent Video
  • Nanopores: squeeze DNA through a nanoscopic pore (about 1.4 nm) in a semiconductor and read the distinctive change that each letter in the sequence makes in the amount of current flowing through the pore
  • DNA moves through a nanopore at remarkably high velocities, and thus only a small number of ions (as few as ~100) are available in the nanopore to correctly identify nucleotides
      ◦ so the small changes in the ionic current due to the presence of different nucleotides are overwhelmed by thermodynamic fluctuations
  • The challenge is to reduce the translocation velocity so that the nucleotides can be correctly identified
    http://www.youtube.com/watch?v=wvclP3GySUY
  • Reductions in Translocation Velocity over Time (nt = nucleotides)
    http://www.nature.com/nnano/journal/v6/n10/fig_tab/nnano.2011.129_F1.html
  • A 2,000-nanopore system (900 USD) that can read DNA at a rate of hundreds of kilobases per second
  • An 8,000-nanopore system by next year (2013) that can read more than 1M bases per second
  • With about 3 billion bases per human genome and 20 sequencing machines, it takes about 15 minutes to sequence a human genome
    http://www.nature.com/news/nanopore-genome-sequencer-makes-its-debut-1.10051
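The 15-minute figure above can be related to the quoted read rates with a short calculation. The slide does not say which per-machine rate or how many passes over the genome (coverage) the estimate assumes, so the sketch below assumes a single pass and treats the function names and the coverage parameter as illustrative.

```python
# Relating read rate, machine count, and time to sequence a human genome.
# Assumptions: a single pass over the genome (coverage = 1) and perfectly
# parallel machines; the slide states neither.

GENOME_BASES = 3e9   # about 3 billion bases per human genome (from the slide)
MACHINES = 20        # number of sequencing machines (from the slide)


def sequencing_time_s(bases, rate_per_machine, machines, coverage=1.0):
    """Seconds needed to read `bases` `coverage` times across all machines."""
    return bases * coverage / (rate_per_machine * machines)


def implied_rate(bases, machines, time_s, coverage=1.0):
    """Per-machine rate (bases per second) implied by a total run time."""
    return bases * coverage / (machines * time_s)


time_at_1m_s = sequencing_time_s(GENOME_BASES, 1e6, MACHINES)
rate_for_15_min = implied_rate(GENOME_BASES, MACHINES, 15 * 60)

print(f"Time at 1M bases/s per machine: {time_at_1m_s:.0f} s")
print(f"Per-machine rate implied by 15 minutes: {rate_for_15_min:,.0f} bases/s")
```

Under the single-pass assumption, 15 minutes corresponds to a per-machine rate in the "hundreds of kilobases per second" range, while a rate of 1M bases per second would take a few minutes. Reading each base several times for accuracy would lengthen both estimates proportionally.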
  • The reduced velocities (and improved sensitivities) were achieved by
      ◦ for α-haemolysin nanopores: a combination of site-specific mutagenesis and one of the following: the incorporation of DNA processing enzymes into the nanopore, chemical labeling of the nucleotides, or the covalent attachment of an aminocyclodextrin adapter
      ◦ for solid-state nanopores: optimization of solution conditions (temperature, viscosity, pH), chemical functionalization, surface-charge engineering, varying the thickness and composition of the membranes, and the use of smaller-diameter nanopores (thereby enhancing polymer–pore interactions)
    http://www.nature.com/nnano/journal/v6/n10/fig_tab/nnano.2011.129_F1.html
  • Personal Sequencing, Garage Biology
      ◦ Sequencing can be done in your home, office, or in the field
      ◦ Sequence your own DNA multiple times in your life
      ◦ Sequence the DNA from a bucket of ocean water, sewage, or a handful of dirt
      ◦ Find proteins to manufacture other things
      ◦ Combined with 3D printers, PCs, and the Internet, there is no limit to what we can do as individuals
      ◦ $900 from Oxford Nanopore
  • Many believe that analyzing the data will be the bottleneck in genome sequencing
  • Partial solution: because there are redundancies in the data, better algorithms can speed up analysis of the sequence data
    Source: Nature 498 pp. 255-260, 13 June 2013
  • (a) File sizes of the uncompressed, compressed with links and edits, and unique sequence data sets with default parameters. (b) Run times of BLAST, compressive BLAST and the coarse search step of compressive BLAST on the unique data ('coarse only'). Error bars, s.d. of five runs. Reported runtimes were on a set of 10,000 simulated queries. For queries that generate very few hits, the coarse search time provides a lower bound on search time. (c) Run times of BLAT, compressive BLAT and the coarse search step on the unique data ('coarse only') for 10,000
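The idea of exploiting redundancy, as in the 'coarse only' search on unique data mentioned in the caption above, can be illustrated with a toy example: store each distinct sequence once with links back to every record that contains it, search only the distinct sequences, then expand the hits. This is a deliberately simplified sketch and not the algorithm from the cited paper: it handles only exact duplicates, uses plain substring matching in place of BLAST-style alignment, and all names and sample data are hypothetical.

```python
# Toy illustration of searching redundant sequence data via its unique part.
# Assumptions: exact duplicates only and substring matching instead of
# alignment; record names and sequences are made up for illustration.

from collections import defaultdict


def build_unique_index(sequences):
    """Map each distinct sequence to the IDs of all records that share it."""
    index = defaultdict(list)
    for record_id, sequence in sequences.items():
        index[sequence].append(record_id)
    return dict(index)


def compressive_search(index, query):
    """Scan only the distinct sequences, then expand hits to every record."""
    hits = []
    for sequence, record_ids in index.items():   # coarse step on unique data
        if query in sequence:
            hits.extend(record_ids)              # follow links to duplicates
    return sorted(hits)


database = {
    "sample1": "ACGTACGTGG",
    "sample2": "ACGTACGTGG",
    "sample3": "TTTTCCCCAA",
    "sample4": "ACGTACGTGG",
}

index = build_unique_index(database)
hits = compressive_search(index, "TACG")

print(f"{len(database)} records, {len(index)} distinct sequences scanned")
print(hits)
```

The more redundant the collection, the smaller the set that has to be scanned relative to the number of records, which is the sense in which redundancy can be turned into speed.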
  • For storage and processing
  • How to encourage sharing of data?
  • How to protect privacy?
  • Who will be the leading providers and users of these services?
  • How will this affect the overall health care industry?
      ◦ Might this globalize health care?
  • We can synthesize new forms of DNA
  • Make new drugs, crops, or materials
  • Test them
  • Then synthesize/design newer forms of DNA
  • Keep iterating and making better drugs, crops, and materials
  • The cost of synthesizing DNA is also dropping
    http://singularityhub.com/2012/09/17/new-software-makes-synthesizing-dna-as-easy-as-
  • http://www.synthesis.cc/cgi-bin/mt/mt-search.cgi?blog_id=1&tag=Carlson%20Curves&limit=20
  • About 5 years behind sequencing
  • Most drugs are naturally occurring substances
  • But improvements in our knowledge of humans and other organisms, and reductions in the cost of sequencing and synthesizing DNA, increase the possibility of synthesizing drugs
      ◦ Begins with a DNA “target”: a naturally existing cellular or molecular structure involved in the pathology of interest
      ◦ A common target is a protein whose function has now become clear as a result of basic scientific research
      ◦ Sequence the protein’s DNA and then synthesize a drug that acts on this protein (also based on scientific research)
    Gary Pisano, Science Business: The promise, the reality, and the future of biotech, Chapters 2 and 3
  • If we can reduce the cost of drug development, we can target smaller groups of people with drugs
  • How about synthesizing drugs for individuals?
  • How about understanding which diseases a human might be susceptible to by sequencing their DNA?
  • Even if we cannot synthesize drugs for individuals, maybe we can better assign drugs to individuals by better understanding which humans are susceptible to known side effects
      ◦ Most drugs have side effects
      ◦ DNA can tell us who might be susceptible to the side effects
    Gary Pisano, Science Business: The promise, the reality, and the future of biotech, Chapters 2 and 3
  • Gleevec treats myeloid leukemia
      ◦ Blocks the activity of the protein BCR-ABL, which comes from an abnormal gene created by a merger of chromosomes 9 and 22
  • Crizotinib treats lung cancer
      ◦ Targets a mutated version of a gene called ALK, which encodes a protein that instructs lung cells to divide uncontrollably
  • Vemurafenib treats melanoma
      ◦ Attacks a protein that is generated by a mutated version of a gene called BRAF
  • Problems
      ◦ Many cancers are driven by more than one mutation, and genes involved in repair are often involved with mutations
      ◦ $100,000 for 4 doses of one drug
  • Nevertheless, DNA sequencing is helping scientists identify common genes for cancer
    Source: Getting Close and Personal, Economist, January 4, 2014
  • Better sensors (cameras, infrared, fluorescence, lasers) and mechanical controls enable complete control and measurement over crop growth
  • DNA sequencing and DNA synthesizing enable characterization and replication of high-performing crops
  • Other biological materials?
      ◦ Cellulosic ethanol
      ◦ Algae
    http://www.aber.ac.uk/en/media/departmental/ibers/facilities/phenomicscentre/BBC-FOCUS-NPPC-Feature.pdf
  • Improvements in U.S. Corn Yields through New Seeds
    Source: https://www.soils.org/publications/cs/articles/46/2/528
  • Improvements in Yield for other Crops
    Source: U.S. Department of Agriculture and Michael Bomford, Crop Yield Projections
  • According to Science Magazine, scale-up alone will not make this economically feasible
  • It’s not just about
      ◦ making bio-fuels from the non-food part of the plant or
      ◦ scaling up the production in order to reduce cost
  • It’s also about developing better organisms
      ◦ Better cellulose that produces more ethanol per weight, while still enabling the plant to produce lots of food
      ◦ Better algae that consume more carbon dioxide and generate more energy per weight or area
    http://www.theguardian.com/science/2012/jan/14/synthetic-biology-spider-goat-genetics
  • Spider silk is very strong
  • But it is difficult to harvest, partly because it is hard to raise spiders (they eat each other)
  • Scientists introduced the gene for spider silk into goats so that spider silk would be produced in their milk
  • Now spider silk is produced in the goats’ milk, and scientists are trying to improve the results
  • It is expected that many other natural substances can be manufactured in this way
    http://www.theguardian.com/science/2012/jan/14/synthetic-biology-spider-goat-genetics
  • Enzymes, plastics, textiles, dyes
  • Many of these are now made from fossil fuels but were once made from natural substances
  • Can we return to biological feedstocks?
      ◦ Modify yeast so that sugar can be turned into useful compounds such as malaria drugs and biofuels
      ◦ Bring about a switch from fossil fuels to biological feedstocks such as sugar, starch, and cellulose
  • Registry of Standard Biological Parts
      ◦ More than 10,000 parts
  • Can build complex systems from these parts
  • Genetically Engineered Machine Competition
      ◦ Students compete to build complex systems
      ◦ One group built a biological light detector with a resolution of 100 million pixels per square inch
  • Will biological systems ever compete with electronic systems?
  • DNA synthesizing equipment can be used to make (and replicate) DNA
  • One challenge is how to insert DNA into a cell, so that the cell can then replicate itself
      ◦ Each cell contains the DNA needed for a specific organism
      ◦ Each cell may even contain the DNA for features that no longer exist, and the features can be turned back on
  • First done by Craig Venter’s team in May 2010
      ◦ His team synthesized an entire bacterial genome and “took over” a cell by inserting the DNA into the cell
  • Can this be done for more complex life forms?
    Source: Michio Kaku, Physics of the Future: How Science Will Shape Human Destiny and Our Daily Lives by the Year 2100 (2011)
  • More complex organisms require more base pairs, and thus more years before they can be synthesized
  • The cost of sequencing and synthesizing DNA continues to fall
  • A major reason for the cost reductions is the benefits from reductions in scale
      ◦ Similar to those in ICs, bio-electronic ICs, and MEMS
      ◦ A powerful way to reduce costs
  • Further reductions in scale, and thus further cost reductions, appear possible
  • Low-cost and small DNA sequencers and synthesizers will change drug discovery, health care, and science
      ◦ How will we do drug discovery in the future?
  • What kinds of analyses can help us understand how these trends will change drug discovery and health care?
  • What kinds of opportunities will emerge for firms as vast amounts of data become available for analysis?
  • Single-cell genomics
      ◦ select the embryos created by IVF (in vitro fertilization) that have the best chance of developing into a healthy baby
  • Metagenomic medicine
      ◦ Sequencing many different microbes en masse and then teasing out individual genomes to diagnose which ones are helping or harming human health
    Source: Nature 494, 21 February 2013, pp. 290-291