DNA sequencing: rapid improvements and their implications

These slides analyze the rapid improvements in DNA sequencers and the implications of these improvements for drug discovery, new crops, materials creation, and new bio-fuels. Many of the improvements come from "reductions in scale." As with integrated circuits, reducing the size of features on DNA sequencers has enabled improvements of many orders of magnitude. Unlike integrated circuits, the improvements are also due to changes in technology: changes from pyrosequencing to semiconductor and nanopore sequencing have been needed to achieve the reductions in scale. In addition, pyrosequencing benefited from improvements in lasers and camera chips.

Published in: Business

  1. 1. A/Prof Jeffrey Funk, Division of Engineering and Technology Management, National University of Singapore. For information on other technologies, see http://www.slideshare.net/Funk98/presentations
  2. 2. • Goal: identify the order and identity of the roughly 3 billion nucleotide base pairs in human DNA • Nucleotides encode the genetic instructions for organisms • Four types of nucleotides in a DNA strand ◦ Adenine ◦ Thymine ◦ Cytosine ◦ Guanine
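Because there are only four nucleotide types, each base can in principle be stored in 2 bits. The sketch below is an illustration only: the names `pack_bases` and `unpack_bases` are hypothetical, and the storage figure ignores quality scores, read redundancy, and file-format overhead, which dominate real sequencing data.

```python
# Illustrative 2-bit packing of nucleotides; all names here are hypothetical.
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def pack_bases(seq: str) -> int:
    """Pack a DNA string into one integer, using 2 bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | BASE_TO_BITS[base]
    return value


def unpack_bases(value: int, length: int) -> str:
    """Recover a DNA string of the given length from its packed integer."""
    bases = []
    for _ in range(length):
        bases.append(BITS_TO_BASE[value & 0b11])
        value >>= 2
    return "".join(reversed(bases))


GENOME_BASES = 3_000_000_000        # figure quoted on the slide
raw_bytes = GENOME_BASES * 2 // 8   # 2 bits per base, 8 bits per byte
print(f"{raw_bytes / 1e6:.0f} MB for one uncompressed pass over the genome")
```

Under these assumptions a single pass over the sequence fits in well under a gigabyte; the much larger files produced in practice come from redundant reads and per-base quality data.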
  3. 3. • Can the falling cost of sequencing and synthesizing DNA enable us to reduce the cost and time of developing better ◦ Drugs? Is personalized medicine possible? (Medicine has recognized about 6,000 diseases that can be traced to one or more genes.) ◦ Crops? Can we feed the world? ◦ Bio-fuels? Can we reduce carbon emissions? ◦ Bio-materials? Are better materials possible? ◦ Complex biological systems? Can we create computers from biological parts? • How can we use the data? http://www.economist.com/news/briefing/21661799-it-now-easy-edit-genomes-plants-animals-and-humans-age-red-pen
  4. 4. Course sessions (this is the sixth session of MT5009)
       Session | Technology
       1       | Objectives and overview of course
       2       | How/when do new technologies become economically feasible?
       3       | Two types of improvements: 1) Creating materials that better exploit physical phenomena; 2) Geometrical scaling
       4       | Semiconductors, ICs, electronic systems
       5       | Sensors, MEMS and the Internet of Things
       6       | Bio-electronics, Health Care, DNA Sequencers
       7       | Lighting, Lasers, and Displays
       8       | Roll-to-Roll Printing, Human-Computer Interfaces
       9       | Information Technology and Land Transportation
       10      | Nano-technology and Superconductivity
  5. 5. • Why do Costs Fall and Other Improvements Occur? • New Methods Continue to Emerge ◦ Pyrosequencing (454 Life Sciences/Roche and Illumina) ◦ Single-molecule real-time sequencing (Pacific Bio) ◦ Semiconductor arrays (Ion Torrent) ◦ Nanopores (Oxford Nanopore Technologies) ◦ Methods of data compression • Synthesizing DNA • Who Cares? What are the Implications? • Conclusions
  6. 6. • Successive methods of sequencing ◦ Maxam-Gilbert sequencing: relies on cleaving of nucleotides by chemical methods ◦ Chain-termination methods (often called the Sanger method): labeled fragments are separated by size on a gel and read from X-ray film or under UV light ◦ Dye-termination: reading sequences with fluorescent dyes, where each nucleotide emits light at a different wavelength • Improved lasers and cameras to read fluorescent dyes • More parallel processing • Smaller feature sizes, reductions in scale http://www.dnasequencing.org/history-of-d
  7. 7. Dye-Termination. Source: High Throughput Sequencing Technologies, Brian Krueger, http://www.slideshare.net/Kruegsybear/high-throughput-sequencing-technologies-on-the-path-to-the-0-genome
  8. 8. Source: Nature Biotechnology 30(11), 1023-1026, November 2012 Many new approaches are being investigated
  9. 9. • The role of reductions in scale can be understood by reading highly cited papers such as ◦ “Genome sequencing in micro-fabricated high-density pico-liter reactors” (Margulies, 2005) and ◦ “Toward nano-scale genome sequencing” (Ryan et al, 2007) • Quote from Ryan et al: “The ability to construct nano-scale structures and perform measurements using novel nano-scale effects has provided new opportunities to identify nucleotides directly using physical, and not chemical, methods.” • In fact, just the titles of these papers are fairly suggestive. In all of these examples of decreasing scale, totally new forms of equipment, processes and factories were required.
  10. 10. • Read lengths • Accuracies • Speeds • Improvements in these variables also lead to reductions in the cost of sequencing • Capability to analyze and use gathered data ◦ need better computers ◦ need more storage
  11. 11. Improvements in Output per Instrument Run. Source: Elaine Mardis, Nature 2011, 470: 198-203
  13. 13. • http://www.youtube.com/watch?v=6ldtdWjDwes&list=PL7DFC2F1730BD5F3A • http://www.youtube.com/watch?v=iz2xh8U9jYc&index=3&list=PL7DFC2F1730BD5F3A
  14. 14. • 1) Separate DNA into smaller strands • 2) Make copies of strands (i.e., amplification) with emulsion beads in plastic containers (redundancy is needed for higher accuracy) ◦ done with small containers on a large wash plate so that many copies are made in parallel ◦ smaller containers and larger wash plates lead to more parallel and faster processing • 3) Identify DNA nucleotides utilizing lasers and cameras ◦ Incorporation of a nucleotide (A, T, C, or G) emits light in the presence of the enzyme luciferase and ATP (adenosine triphosphate) ◦ falling costs of lasers and cameras reduce costs • 4) Analyze data with computers
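The remark that redundancy is needed for higher accuracy can be illustrated with a simple majority vote across redundant reads. This is a minimal sketch, assuming the reads are already aligned and of equal length; the function name `consensus` and the example reads are hypothetical, and real pipelines also weigh per-base quality scores.

```python
from collections import Counter


def consensus(reads: list[str]) -> str:
    """Majority vote per position across equal-length, pre-aligned reads."""
    if not reads:
        return ""
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("reads must be pre-aligned and of equal length")
    result = []
    for column in zip(*reads):
        # most_common(1) returns the most frequent base in this column
        base, _count = Counter(column).most_common(1)[0]
        result.append(base)
    return "".join(result)


# Five hypothetical reads of the same fragment; two contain one error each.
reads = ["ACGTTGCA", "ACGTTGCA", "ACCTTGCA", "ACGTTGAA", "ACGTTGCA"]
print(consensus(reads))
```

With five copies, an isolated error in any single read is outvoted, which is why more parallel copies translate into higher accuracy.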
  15. 15. One source: http://www.454.com/downloads/news-events/how-genome-sequencing-is-done
  16. 16. • Make copies to improve accuracy through redundancy • 454 PicoTiterPlate from 454 Life Sciences ◦ contains 1.6 million hexagonal wells ◦ each holds 75 pico-liters (1 pico-liter = 10^-12 liters; <100 micron diameter) • These wells can be made much smaller ◦ dimensions on integrated circuits (ICs) are on the order of 20 nano-meters ◦ Is it possible to reduce feature sizes by 1000 times, or volumes by 10^9?
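The link between a 1000-fold reduction in feature size and a 10^9-fold reduction in volume is simple geometric scaling. The short calculation below is a sketch of that arithmetic; the 1000x shrink factor is the hypothetical posed on the slide, not a demonstrated capability.

```python
import math

well_volume_liters = 75e-12   # 75 pico-liters per well, from the slide
linear_shrink = 1000          # hypothetical 1000x reduction in each dimension

# Volume scales with the cube of the linear dimension,
# while the number of wells per unit plate area scales with the square.
volume_shrink = linear_shrink ** 3
wells_per_area_gain = linear_shrink ** 2
shrunk_volume_liters = well_volume_liters / volume_shrink

print(f"volume shrinks by {volume_shrink:.0e}")
print(f"wells per unit area grow by {wells_per_area_gain:.0e}")
print(f"new well volume: {shrunk_volume_liters:.1e} liters")
```

Under this assumption the well volume falls from pico-liters to tens of zepto-liters, and a plate of the same area could hold about a million times more wells.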
  17. 17. Fluorescent Dyes, Lasers, and Cameras (Step 3): As bases move across the wash plate during a sequencing run, each nucleotide (the molecules that make up DNA) generates a light signal, which is recorded by the camera. Signal strength is proportional to the number of nucleotides incorporated onto the DNA strands.
  19. 19. Eliminate amplification and wash steps with zero-mode waveguides (Pacific Biosciences) http://www.youtube.com/watch?v=v8p4ph2MAvI from 1:50 to 3:50
  20. 20. • Uses zero-mode waveguides • They are ◦ very small containers: zepto-liters (1 zepto-liter = 10^-21 liters; 50 nanometers in diameter) ◦ fabricated in a 100 nm metal film on a silicon dioxide substrate ◦ large enough for only about 600,000 molecules of liquid water at room temperature ◦ How much smaller can they be made?
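The figure of 600,000 water molecules can be sanity-checked against the zepto-liter scale. The sketch below assumes standard physical constants and a room-temperature water density of about 997 g per liter; the function name `water_molecules` is hypothetical.

```python
AVOGADRO = 6.02214076e23        # molecules per mole
WATER_MOLAR_MASS_G = 18.015     # grams per mole
WATER_DENSITY_G_PER_L = 997.0   # approximate density at room temperature


def water_molecules(volume_liters: float) -> float:
    """Approximate number of water molecules in a given volume of liquid water."""
    grams = volume_liters * WATER_DENSITY_G_PER_L
    return grams / WATER_MOLAR_MASS_G * AVOGADRO


per_zeptoliter = water_molecules(1e-21)
volume_for_600k_zl = 600_000 / per_zeptoliter

print(f"about {per_zeptoliter:,.0f} molecules per zepto-liter")
print(f"600,000 molecules occupy about {volume_for_600k_zl:.0f} zepto-liters")
```

Under these assumptions, 600,000 molecules correspond to a volume of roughly 18 zepto-liters, which is consistent with the zepto-liter scale quoted for these containers.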
  22. 22. • Uses semiconductor chips to sequence DNA by detecting the pH change (released hydrogen ions) as A, G, C, or T is incorporated ◦ Thus, no lasers or cameras are used • A micro-well containing a template DNA strand is flooded with a single species of deoxyribonucleotide triphosphate (dNTP) ◦ Beneath the layer of micro-wells is an ion-sensitive layer, below which is an ISFET ion sensor ◦ All layers are contained in a CMOS semiconductor chip ◦ If the introduced dNTP is complementary to the leading template nucleotide, it is incorporated into the growing strand ◦ This causes release of a hydrogen ion that triggers the ISFET ion sensor, indicating a reaction has occurred http://www.nature.com/news/2010/101214/full/news.2010.674.html http://en.wikipedia.org/wiki/Ion_semiconductor_sequencing http://www.lifescientist.com.au/article/394936/feature_sequencing_3_0/?pp=2
  23. 23. Done in massively parallel fashion. For each well: • Matches cause an ion to be released • Multiple matches cause multiple ions to be released • No matches: no ions are released
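The per-well logic of slides 22 and 23 (one nucleotide species is introduced at a time, and the number of ions released equals the number of consecutive matches) can be sketched as a small simulation. This is a simplified illustration, not Ion Torrent's software: the function names, the flow order "TACG", and the idealized noise-free signal are all assumptions, and strand direction is ignored.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def simulate_flows(template: str, flow_order: str = "TACG", max_cycles: int = 100):
    """Return (nucleotide, ions_released) for each flow over one well.

    A flowed nucleotide is incorporated once for every consecutive template
    base it complements, so runs of identical bases release several ions.
    """
    flows = []
    pos = 0
    for _ in range(max_cycles):
        for nucleotide in flow_order:
            count = 0
            while pos < len(template) and COMPLEMENT[template[pos]] == nucleotide:
                count += 1
                pos += 1
            flows.append((nucleotide, count))
            if pos == len(template):
                return flows
    return flows


def decode_flows(flows) -> str:
    """Rebuild the synthesized (complementary) strand from the flow signals."""
    return "".join(nucleotide * count for nucleotide, count in flows)


flows = simulate_flows("AAGTC")
print(flows)
print(decode_flows(flows))
```

In this idealized model the first flow of T releases two ions because the template starts with two A's, which corresponds to the "multiple matches" case on the slide; in a real instrument, distinguishing two ions from three in long runs is a known source of error.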
  24. 24. • While the first sequencers used older semiconductor technology (i.e., large feature sizes), newer ones use smaller feature sizes and thus are faster • http://www.youtube.com/watch?v=JHzkYDyMzOg&feature=relmfu (2:30-4:15) • For example, the first sequencer chip (314) had 1.2 million wells while the most recent one (Proton II) has 660 million wells ◦ How much smaller can these wells be made? ◦ Since 256 GB memory chips exist (1 byte = 8 bits), could Ion Torrent provide 256 x 8 billion wells, or about 2 trillion wells, in the next few years? ◦ After that, improvements may slow, as Ion Torrent's improvements depend on further reductions in the feature sizes of semiconductor technology http://www.nature.com/news/2010/101214/full/news.2010.674.html http://en.wikipedia.org/wiki/Ion_semiconductor_sequencing http://www.lifescientist.com.au/article/394936/feature_sequencing_3_0/?pp=2
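The well counts and the memory-chip analogy on this slide reduce to a few lines of arithmetic. The sketch below follows the slide's own assumption that one well could be as dense as one memory bit, which is a speculative analogy rather than an engineering estimate; 1 GB is taken as 10^9 bytes.

```python
wells_first = 1.2e6     # first chip (314), per the slide
wells_recent = 660e6    # Proton II, per the slide
growth_so_far = wells_recent / wells_first

memory_bytes = 256e9                # 256 GB, taking 1 GB = 10^9 bytes
memory_bits = memory_bytes * 8      # one hypothetical well per bit
further_growth_needed = memory_bits / wells_recent

print(f"wells grew about {growth_so_far:.0f}x between the two chips")
print(f"bit-density parity would mean {memory_bits:.3e} wells")
print(f"that is about {further_growth_needed:.0f}x more than Proton II")
```

So the "about 2 trillion wells" figure implies a further increase of roughly three thousand times over the 660 million wells quoted for Proton II.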
  25. 25. Source: Ion Torrent Video
  27. 27. Nanopores: squeeze DNA through a nanoscopic pore (about 1.4 nm) in a semiconductor and read the distinctive change that each letter in the sequence makes in the amount of current flowing through the pore
  28. 28. • DNA moves through a nano-pore at remarkably high velocities, and thus only a small number of ions (as few as ~100) are available in the nano-pore to correctly identify each nucleotide ◦ so the small changes in the ionic current due to the presence of different nucleotides are overwhelmed by thermodynamic fluctuations • The challenge is to reduce the translocation velocity so that the nucleotides can be correctly identified http://www.youtube.com/watch?v=wvclP3GySUY
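One way to see why slower translocation helps is a simple counting-noise model. The sketch below assumes, as a simplification not stated on the slide, that the number of ions counted per nucleotide follows Poisson statistics, so relative fluctuations scale as 1/sqrt(N); the function name and the slowdown factors are hypothetical.

```python
import math


def relative_noise(ions_per_base: float) -> float:
    """Relative fluctuation of a Poisson count with the given mean."""
    return 1.0 / math.sqrt(ions_per_base)


baseline_ions = 100  # "as few as ~100" ions per nucleotide, per the slide
for slowdown in (1, 10, 100, 1000):
    ions = baseline_ions * slowdown  # slower passage lets more ions be counted
    noise_pct = relative_noise(ions) * 100
    print(f"slowdown {slowdown:>4}x: {ions:>7} ions, ~{noise_pct:.1f}% noise")
```

Under this assumed model, 100 ions give fluctuations of about 10%, which can easily mask the small current differences between nucleotides, while each 100-fold reduction in velocity cuts the relative noise by a factor of 10.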
  29. 29. Reductions in Translocation Velocity over Time (nt = nucleotides). Source: http://www.nature.com/nnano/journal/v6/n10/fig_tab/nnano.2011.129_F1.html
  30. 30. • Great for work in the field, for example studying the Ebola virus in Africa • https://www.youtube.com/watch?v=CE4dW64x3Ts (1:00 to 2:00) • MinION flow cell pricing (https://www.nanoporetech.com/community/minion-flow-cell-pricing):
       Package price | Number of flow cells | Price per flow cell
       $900          | 1                    | $900.00
       $9,480        | 12                   | $790.00
       $16,200       | 24                   | $675.00
       $24,000       | 48                   | $500.00
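The per-flow-cell prices follow directly from the package prices, and the volume discount can be computed the same way. This is a small arithmetic check using only the figures on the slide; the variable names are illustrative.

```python
# (package price in USD, number of flow cells), as listed on the slide
packages = [(900, 1), (9_480, 12), (16_200, 24), (24_000, 48)]

single_cell_price = packages[0][0] / packages[0][1]
per_cell_prices = []
for package_price, cells in packages:
    per_cell = package_price / cells
    discount = 1 - per_cell / single_cell_price
    per_cell_prices.append(per_cell)
    print(f"{cells:>2} cells: ${per_cell:,.2f} each ({discount:.0%} below single price)")
```

The largest package works out to a discount of roughly 44% per flow cell relative to buying one at a time.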
  31. 31. Personal Sequencing, Garage Biology • Sequencing can be done in your home, office, garage, or in the field • Sequence your own DNA multiple times in your life • Sequence the DNA from a bucket of ocean water, sewage, or a handful of dirt • Find proteins to manufacture other things • Combined with 3D printers, PCs, and the Internet, there is no limit to what we can do as individuals
  32. 32. • Why do Costs Fall and Other Improvements Occur? • New Methods Continue to Emerge ◦ Pyrosequencing (454 Life Sciences/Roche and Illumina) ◦ Single-molecule real-time sequencing (Pacific Bio) ◦ Semiconductor arrays (Ion Torrent) ◦ Nanopores (Oxford Nanopore Technologies) ◦ Data analysis, compression, and cloud computing • Synthesizing DNA • Who Cares? What are the Implications? • Conclusions
  33. 33. • Many believe data analysis will be the bottleneck in genome sequencing • Partial solution: because there are redundancies in the data, better algorithms can speed up the analysis Source: Nature 498 pp. 255-260, 13 June 2013
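The idea of exploiting redundancy can be illustrated with a toy index that stores each distinct sequence once and searches only the unique data. This is a heavily simplified sketch and not the algorithm from the cited paper: it handles exact duplicates only, whereas compressive approaches also link near-duplicates through edits; the function names and the example database are hypothetical.

```python
def build_index(sequences: list[str]) -> dict[str, list[int]]:
    """Store each distinct sequence once, with links to every record that holds it."""
    unique: dict[str, list[int]] = {}
    for record_id, seq in enumerate(sequences):
        unique.setdefault(seq, []).append(record_id)
    return unique


def search(index: dict[str, list[int]], query: str) -> list[int]:
    """Scan only the unique sequences, then expand hits through the links."""
    hits: list[int] = []
    for seq, record_ids in index.items():
        if query in seq:
            hits.extend(record_ids)
    return sorted(hits)


# Hypothetical database with heavy redundancy: 6 records, 3 distinct sequences.
database = ["ACGTACGT", "ACGTACGT", "TTGACCA", "ACGTACGT", "TTGACCA", "GGGCCC"]
index = build_index(database)

print(f"{len(database)} records, {len(index)} unique sequences to scan")
print(search(index, "GACC"))
```

The search cost here grows with the amount of unique sequence rather than with the total number of records, which is the property that makes redundancy-aware algorithms attractive as databases grow.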
  34. 34. a) File sizes of the uncompressed, compressed with links and edits, and unique sequence data sets with default parameters. (b) Run times of BLAST, compressive BLAST and the coarse search step of compressive BLAST on the unique data ('coarse only'). Error bars, s.d. of five runs. Reported runtimes were on a set of 10,000 simulated queries. For queries that generate very few hits, the coarse search time provides a lower bound on search time. (c) Run times of BLAT, compressive BLAT and the coarse search step on the unique data ('coarse only') for 10,000
  35. 35. • Cloud computing for storage and processing • How to encourage sharing of data? • How to protect privacy? • Who will be the leading providers and users of these services? • How will this affect the health care industry overall? ◦ Might this globalize health care?
  37. 37. • We can synthesize new forms of DNA • Make new drugs, crops, or materials • Test them • Then synthesize/design newer forms of DNA • Keep iterating and making better drugs, crops, and materials
  38. 38. The cost of synthesizing DNA is also dropping http://singularityhub.com/2012/09/17/new-software-makes-synthesizing-dna-as-easy-as-
  39. 39. http://www.synthesis.cc/cgi-bin/mt/mt-search.cgi?blog_id=1&tag=Carlson%20Curves&limit=20
  40. 40. Synthesis is about 5 years behind sequencing
  42. 42. • Most drugs are naturally occurring substances • But improvements in our knowledge of humans and other organisms, and reductions in the cost of sequencing and synthesizing DNA, increase the possibility of synthesizing drugs ◦ Begins with a "target": a naturally existing cellular or molecular structure involved in the pathology of interest ◦ many targets are proteins whose function has become clear from basic scientific research ◦ Sequence the protein's DNA and then synthesize a drug that acts on this protein • This has created a field called bio-informatics; many startups are pursuing this field: https://angel.co/bioinformatics Gary Pisano, Science Business: The promise, the reality, and the future of biotech, Chapters 2 and 3
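The step from a protein's DNA to the protein itself rests on the genetic code, in which each group of three bases (a codon) specifies one amino acid. The sketch below translates a coding sequence using the standard genetic code; the function name `translate` and the example sequence are hypothetical, and real genes involve further steps (such as splicing) that are ignored here.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence into one-letter amino acids, stopping at a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # "*" marks a stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCAAGTAA"))
```

Once the amino-acid sequence of a target protein is known in this way, researchers can begin to study its structure and look for compounds that act on it.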
  43. 43. • If we can reduce the cost of drug development, we can target smaller groups of people with drugs • How about synthesizing drugs for individuals? • How about understanding which diseases a person might be susceptible to by sequencing their DNA? • Even if we cannot synthesize drugs for individuals, can we better assign drugs to individuals by better understanding which people are susceptible to known side effects? • 23andMe is targeting this market Gary Pisano, Science Business: The promise, the reality, and the future of biotech, Chapters 2 and 3
  44. 44. • Scientists have recognized about 6,000 diseases that can be traced to one or more genes • Gleevec treats myeloid leukemia ◦ Blocks the activity of the protein BCR-ABL, which comes from an abnormal gene created by a fusion of parts of chromosomes 9 and 22 • Crizotinib treats lung cancer ◦ targets a mutated version of a gene called ALK, which encodes a protein that instructs lung cells to divide uncontrollably • Vemurafenib treats melanoma ◦ Attacks a protein that is generated by a mutated version of a gene called BRAF • Problems ◦ Many cancers are driven by more than one mutation ◦ Genes involved in repair are often themselves mutated ◦ Cost: $100,000 for 4 doses of one drug Source: Getting Close and Personal, Economist, January 4, 2014. The age of the red pen, Economist, August 22, 2015 http://www.economist.com/news/briefing/21661799-it-now-easy-edit-genomes-plants-animals-and-humans-age-red-pen
  45. 45. • Gene variants that: • 1. code for extra-strong bones (LRP5 G171V/+) • 2. code for lean muscles (MSTN) • 3. make people less sensitive to pain, which could be dangerous because pain can be a useful warning signal, but may be helpful in some contexts (SCN9A) • 4. are associated with low odor production (ABCC11) • 5. make people more resistant to viruses (CCR5, FUT2) • 6. are connected to a low risk of coronary disease (PCSK9) • 7. are associated with a low risk of Alzheimer’s disease (APP A63T/+) • 8. are associated with a low cancer risk (GHR, GH) • 9. are associated with a low risk of type 2 diabetes (SLC30A8) • 10. are associated with a low risk of type 1 diabetes (IFIH1 E627X/+) • Or is this playing God? http://www.businessinsider.sg/gene-edits-to-make-you-stronger-and-healthier-2015-4/#ixzz3jko9ehuQ
  46. 46. • Better sensors (cameras, infrared, fluorescence, lasers) and mechanical controls enable close control and measurement of crop growth • DNA sequencing and DNA synthesis enable characterization and replication of high-performing crops (sometimes called GMOs) • Other biological materials http://www.aber.ac.uk/en/media/departmental/ibers/facilities/phenomicscentre/BBC-FOCUS-NPPC-Feature.pdf • Cellulosic ethanol • Algae: http://www.slideshare.net/Funk98/presentations
  47. 47. Source: https://www.soils.org/publications/cs/articles/46/2/528 Improvements in U.S. Corn Yields through New Seeds
  48. 48. Improvements in Yield for other Crops U.S. Department of Agriculture and Michael Bomford, Crop Yield Projections
  49. 49. • The U.S. Food and Drug Administration approved genetically engineered salmon for consumption • Developed by AquaBounty Technologies, which first approached the FDA in the 1990s • Genetic modifications enable it to grow to market size faster, in as little as half the time • Will be in stores by 2017 http://www.nytimes.com/2015/11/20/business/genetically-engineered-salmon-approved-for-consumption.html
  50. 50. • Fish feed (usually smaller fish) can be expensive, particularly if farmed fish are to have the right oils • Geneticists have added the pertinent genes to oil-rich plants to make better feed • Tests in the greenhouse and outdoors were both successful and led to oil-rich plants • Added benefit of these plants ◦ Less build-up of mercury in fish, which is a big problem with fish raised on smaller fish (a common method of raising fish) • However, consumers concerned about mercury in fish also often dislike genetically modified food Something Fishy, Economist, July 11, 2015, p. 69
  51. 51. Bio-Fuels: Scale-up of cellulosic ethanol production might enable economic feasibility, but new plant sources for bio-fuels are also needed
  52. 52. • It’s not just about ◦ making bio-fuels from the non-food part of the plant or ◦ scaling up production in order to reduce cost • It’s also about developing better organisms ◦ Better cellulose that produces more ethanol per weight, while still enabling the plant to produce lots of food ◦ Better algae that consume more carbon dioxide and generate more energy per weight or area http://www.theguardian.com/science/2012/jan/14/synthetic-biology-spider-goat-genetics
  53. 53. • For example, spider silk is very strong • But it is difficult to harvest spider silk, partly because it is hard to raise spiders (they eat each other) • Scientists introduced the gene for spider silk into goats so that spider silk would be produced in their milk • Now spider silk is produced in the goats’ milk and scientists are trying to improve the results • It is expected that many other natural substances can be manufactured in this way http://www.theguardian.com/science/2012/jan/14/synthetic-biology-spider-goat-genetics
  54. 54. • Enzymes, plastics, textiles, dyes • Many of these are now made from fossil fuels but were once made from natural substances • Can we return to biological feedstocks? ◦ Modify yeast so that sugar can be turned into useful compounds such as malaria drugs and biofuels ◦ Bring about a switch from fossil fuels to biological feedstocks such as sugar, starch, and cellulose
  55. 55. • Registry of Standard Biological Parts ◦ More than 10,000 parts • Can build complex systems from these parts • Genetically Engineered Machine Competition ◦ Students compete to build complex systems ◦ One group built a biological light detector with a resolution of 100 million pixels per square inch • Will biological systems ever compete with electronic systems?
  56. 56. • Build complex systems from simple parts in small decentralized labs • Use PCR (polymerase chain reaction) machines, which are simpler than DNA sequencers, to amplify and identify a specific segment of DNA ◦ Costs have fallen to $500 • Other technologies support the use of PCR ◦ Autodesk develops design tools for DNA ◦ Fluid handling robots from Opus ◦ 3D printer for living things Bio-hackers of the world, unite; Economist September 6, 2014
  57. 57. • DNA synthesizing equipment can be used to make (and replicate) DNA • One challenge is how to insert DNA into a cell, so that the cell can then replicate itself ◦ Each cell contains the DNA needed for a specific organism ◦ Each cell may even contain the DNA for features that no longer exist, and the features can be turned back on • First done by Craig Venter’s team in May 2010 ◦ His team synthesized an entire bacterial genome and “took over” a cell by inserting the DNA into the cell • Can this be done for more complex life forms? Source: Michio Kaku, Physics of the Future: How Science Will Shape Human Destiny and Our Daily Lives by the Year 2100 (2011)
  58. 58. More complex organisms require more base pairs and thus more years to synthesize
  59. 59. • The cost of sequencing and synthesizing DNA continues to fall • A major reason for the cost reductions is reductions in scale ◦ Similar to those in ICs, bio-electronic ICs, and MEMS ◦ A powerful way to reduce costs • Further reductions in scale, and thus further cost reductions, appear possible
  60. 60. • If costs continue to fall, low-cost and small DNA sequencers and synthesizers will change drug discovery, health care, and science ◦ How will we do drug discovery in the future? • What kinds of analyses can help us understand how these trends will change drug discovery and health care? • What kinds of opportunities will emerge for firms as vast amounts of data become available for analysis?
