Reverse-engineering techniques in Data Integration

Published in: Health & Medicine

  1. Reverse-engineering techniques in Data Integration. David Gomez-Cabrero
  2. The Unit. COMP MODEL. WET-LAB BIOINFORMATICS. MEDICAL INFORMATICS. More than 30 researchers. EVERYTHING CONNECTED
  3. The Unit
  4. What is Reverse Engineering? Piece 1: The connections. Piece 2: The dynamics. Piece 3: The Data Integration. Open questions
  5. What is Reverse Engineering?
  6. REVERSE ENGINEERING. SYSTEMS BIOLOGY. HYPOTHESIS. DEFINE ELEMENTS. DEFINE INTERACTION (PIECE 1). MATH DESCRIPTION (PIECE 2)
  7. REVERSE ENGINEERING
  8. "…the process of analyzing a subject system to identify its components and their relationships and to create representations of the system in another form or at a higher level of abstraction… Generally involves extracting design artifacts and building or synthesizing abstractions that are less implementation dependent…" E. J. Chikofsky and J. H. Cross, "Reverse Engineering and Design Recovery: A Taxonomy", IEEE Software, vol. 7, no. 1, 1990, pp. 13-17.
  9. Requirements. Design. Source Code. Behaviour
  10. Behaviour
  11. Requirements. Behaviour. Design. Source Code. Design. Behaviour
  12. LETTERS. Decoding the ancient Greek astronomical calculator known as the Antikythera Mechanism. T. Freeth, Y. Bitsakis, X. Moussas, J. H. Seiradakis, A. Tselikas, H. Mangou, M. Zafeiropoulou, R. Hadland, D. Bate, A. Ramsey, M. Allen, A. Crawley, P. Hockley, T. Malzbender, D. Gelb, W. Ambrisco & M. G. Edmunds
      The Antikythera Mechanism is a unique Greek geared device, constructed around the end of the second century BC. It is known [1–9] that it calculated and displayed celestial information, particularly cycles such as the phases of the moon and a luni-solar calendar. Calendars were important to ancient societies [10] for timing agricultural activity and fixing religious festivals. Eclipses and planetary motions were often interpreted as omens, while the calm regularity of the astronomical cycles must have been philosophically attractive in an uncertain and violent world. Named after its place of discovery in 1901 in a Roman shipwreck, the Antikythera Mechanism is technically more complex than any known device for at least a millennium afterwards. Its specific functions have remained controversial [11–14] because its gears and the inscriptions upon its faces are only fragmentary. Here we report surface imaging and high-resolution X-ray tomography of the surviving fragments, enabling us to reconstruct the gear function and double the number of deciphered inscriptions. The mechanism predicted lunar and solar eclipses on the basis of Babylonian arithmetic-progression cycles. The inscriptions support suggestions of mechanical display of planetary positions [9, 14, 15], now lost. In the second century BC, Hipparchos developed a theory to explain the irregularities of the Moon's motion across the sky caused by its elliptic orbit. We find a mechanical realization of this theory in the gearing of the mechanism, revealing an unexpected degree of technical sophistication for the period.
      The bronze mechanism (Fig. 1), probably hand-driven, was originally housed in a wooden-framed case [1] of (uncertain) overall size 315 × 190 × 100 mm (Fig. 2). It had front and back doors, with astronomical inscriptions covering much of the exterior of the mechanism. Our new transcriptions and translations of the Greek texts are given in Supplementary Note 2 ('glyphs and inscriptions'). The detailed form of the lettering can be dated to the second half of the second century BC, implying that the mechanism was constructed during the period 150–100 BC, slightly earlier than previously sug…
      Figure 1 | The surviving fragments of the Antikythera Mechanism. The 82 fragments that survive in the National Archaeological Museum in Athens are shown to scale. A key and dimensions are provided in Supplementary Note 1 ('fragments'). The major fragments A, B, C, D are across the top, starting at top left, with E, F, G immediately below them. 27 hand-cut bronze gears are…
      …planetary cycles. We note that a major aim of this investigation is to set up a data archive to allow non-invasive future research, and access to this will start in 2007. Details will be available on www.antikythera-mechanism.gr. The back door inscription mixes mechanical terms about construction ('trunnions', 'gnomon', 'perforations') with astronomical periods. Of the periods, 223 is the Saros eclipse cycle (see Box 1 for a brief explanation of astronomical cycles and periods). We discover the inscription 'spiral divided into 235 sections', which is…
  13. Goals: as an approach to study the design, or as a prerequisite for re-design.
  14. REVERSE ENGINEERING. SYSTEMS BIOLOGY. ENGINEERING
  15. Forward engineering. ENGINEERING. Synthetic Biology.
      A synthetic oscillatory network of transcriptional regulators. Michael B. Elowitz & Stanislas Leibler. Departments of Molecular Biology and Physics, Princeton University, Princeton, New Jersey 08544, USA. NATURE | VOL 403 | 20 JANUARY 2000 | www.nature.com
      Networks of interacting biomolecules carry out many essential functions in living cells [1], but the 'design principles' underlying the functioning of such intracellular networks remain poorly understood, despite intensive efforts including quantitative analysis of relatively simple systems [2]. Here we present a complementary approach to this problem: the design and construction of a synthetic network to implement a particular function. We used three transcriptional repressor systems that are not part of any natural biological clock [3–5] to build an oscillating network, termed…
      In the network shown in Fig. 1a, the first repressor protein, LacI from E. coli, inhibits the transcription of the second repressor gene, tetR from the tetracycline-resistance transposon Tn10, whose protein product in turn inhibits the expression of a third gene, cI from λ phage. Finally, CI inhibits lacI expression, completing the cycle.
      That such a negative feedback loop can lead to temporal oscillations in the concentrations of each of its components can be seen from a simple model of transcriptional regulation, which we used to design the repressilator and study its possible behaviours (Box 1). In this model, the action of the network depends on several factors, including the dependence of transcription rate on repressor concentration, the translation rate, and the decay rates of the protein and messenger RNA. Depending on the values of these parameters, at least two types of solutions are possible: the system may converge toward a stable steady state, or the steady state may become unstable, leading to sustained limit-cycle oscillations (Fig. 1b, c). We found that oscillations are favoured by strong promoters coupled to efficient ribosome-binding sites, tight transcriptional repression (low 'leakiness'), cooperative repression characteristics, and comparable protein and mRNA decay rates (Box 1, Fig. 1b).
      A general obstacle to the design of biochemical networks is uncertainty about the values of parameters that characterize the interactions between different components. In our network, estimates of the order of magnitude of the relevant parameters seem to be compatible with the possibility of oscillations. Nevertheless, to increase the chances that the artificial network would function in the oscillatory regime, we made two alterations to natural components. First, to address transcriptional strength and tightness, we used strong, yet tightly repressible hybrid promoters, developed previously, which combine the λ PL promoter with lac and tet operator sequences [6]. Second, to bring the effective repressor protein lifetimes closer to that of mRNA…
  16. Forward engineering. ENGINEERING. Synthetic Biology.
      A synthetic oscillatory network of transcriptional regulators. Michael B. Elowitz & Stanislas Leibler. Departments of Molecular Biology and Physics, Princeton University, Princeton, New Jersey 08544, USA. NATURE | VOL 403 | 20 JANUARY 2000 | www.nature.com
      Box 1: Network design. Design of the repressilator started with a simple mathematical model of transcriptional regulation. We did not set out to describe precisely the behaviour of the system, as not enough is known about the molecular interactions inside the cell to make such a description realistic. Instead, we hoped to identify possible classes of dynamic behaviour and determine which experimental parameters should be adjusted to obtain sustained oscillations.
      Deterministic, continuous approximation. Three repressor-protein concentrations, p_i, and their corresponding mRNA concentrations, m_i (where i is lacI, tetR or cI) were treated as continuous dynamical variables. Each of these six molecular species participates in transcription, translation and degradation reactions. Here we consider only the symmetrical case in which all three repressors are identical except for their DNA-binding specificities. The kinetics of the system are determined by six coupled first-order differential equations:
      \[ \frac{dm_i}{dt} = -m_i + \frac{\alpha}{1 + p_j^{\,n}} + \alpha_0, \qquad \frac{dp_i}{dt} = -\beta\,(p_i - m_i), \qquad i = \mathit{lacI}, \mathit{tetR}, \mathit{cI}; \quad j = \mathit{cI}, \mathit{lacI}, \mathit{tetR} \]
      where the number of protein copies per cell produced from a given promoter type during continuous growth is α_0 in the presence of saturating amounts of repressor (owing to the 'leakiness' of the promoter), and α + α_0 in its absence; β denotes the ratio of the protein decay rate to the mRNA decay rate; and n is a Hill coefficient. Time is rescaled in units of the mRNA lifetime; protein concentrations are written in units of K_M, the number of repressors necessary to half-maximally repress a promoter; and mRNA concentrations are rescaled by their translation efficiency, the average number of proteins produced per mRNA molecule. The numerical solution of the model shown in Fig. 1c…
      [Figure 1: (a) repressilator and reporter plasmid maps (labels: PLlac01, ampR, tetR-lite, PLtet01, kanR, TetR, pSC101 origin, gfp-aav, λ PR, λ cI, LacI, GFP, lacI-lite, ColE1, λ cI-lite); (b) stability diagram, axes: maximum proteins per cell, α (× K_M), against protein lifetime/mRNA lifetime, β, with regions "steady state stable" and "steady state unstable"; (c) proteins per cell against time (min)]
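
The six equations in Box 1 can be integrated numerically to look at the behaviour the slide describes. The sketch below is not the authors' code. It is a minimal fixed-step fourth-order Runge-Kutta integration in plain Python, and the parameter values (alpha = 216, alpha0 = 0.216, beta = 0.2, n = 2), the initial state, the step size and all function names are illustrative assumptions.

```python
# Dimensionless repressilator model as written in Box 1 (slide 16).
# State layout: [m_lacI, m_tetR, m_cI, p_lacI, p_tetR, p_cI].
# Every numeric value below is an illustrative assumption, not a fitted value.

def repressilator_rhs(state, alpha, alpha0, beta, n):
    """Right-hand side of the six coupled first-order ODEs."""
    m, p = state[:3], state[3:]
    # Gene i is repressed by protein j:
    # (i, j) = (lacI, cI), (tetR, lacI), (cI, tetR), i.e. j = (i - 1) mod 3.
    dm = [-m[i] + alpha / (1.0 + p[(i - 1) % 3] ** n) + alpha0 for i in range(3)]
    dp = [-beta * (p[i] - m[i]) for i in range(3)]
    return dm + dp


def rk4_step(rhs, state, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    k1 = rhs(state)
    k2 = rhs([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = rhs([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = rhs([s + dt * k for s, k in zip(state, k3)])
    return [s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def simulate(alpha=216.0, alpha0=0.216, beta=0.2, n=2.0,
             t_end=500.0, dt=0.05,
             initial=(0.0, 0.0, 0.0, 1.0, 2.0, 3.0)):
    """Return a list of (time, state) pairs; time is in units of the mRNA lifetime."""
    def rhs(s):
        return repressilator_rhs(s, alpha, alpha0, beta, n)

    state = list(initial)
    steps = int(round(t_end / dt))
    trajectory = [(0.0, state)]
    for k in range(1, steps + 1):
        state = rk4_step(rhs, state, dt)
        trajectory.append((k * dt, state))
    return trajectory


if __name__ == "__main__":
    traj = simulate()
    # Look only at the second half of the run, after the initial transient.
    late = [s[3] for t, s in traj if t >= 250.0]
    print("late-time swing in p_lacI:", max(late) - min(late))
```

The initial state is deliberately asymmetric, since a perfectly symmetric start keeps all three proteins equal. Re-running `simulate` with a much weaker promoter (for example `alpha=2.0, alpha0=0.0`) is a quick way to look for the other regime named in the text, convergence toward a steady state.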
  17. ENGINEERING.
      REVIEW. Towards synthesis of a minimal cell. Anthony C Forster and George M Church. Molecular Systems Biology (2006), Article number: 45, doi:10.1038/msb4100090. © 2006 EMBO and Nature Publishing Group. www.molecularsystemsbiology.com. Received 7.5.06; accepted 26.7.06.
      Construction of a chemical system capable of replication and evolution, fed only by small molecule nutrients, is now conceivable. This could be achieved by stepwise integration of decades of work on the reconstitution of DNA, RNA and protein syntheses from pure components. Such a minimal cell project would initially define the components sufficient for each subsystem, allow detailed kinetic analyses and lead to improved in vitro methods for synthesis of biopolymers, therapeutics and biosensors. Completion would yield a functionally and structurally understood self-replicating biosystem. Safety concerns for synthetic life will be alleviated by extreme dependence on elaborate laboratory reagents and conditions for viability. Our proposed minimal genome is 113 kbp long and contains 151 genes. We detail building blocks already in place and major hurdles to overcome for completion.
      Synthetic biology through biomolecular design and engineering. Kevin Channon, Elizabeth HC Bromley and Derek N Woolfson. School of Chemistry, University of Bristol, Bristol BS8 1TS, UK; Department of Biochemistry, University of Bristol, Bristol BS8 1TD, UK. Current Opinion in Structural Biology 2008, 18:491–498. Available online 5th August 2008. DOI 10.1016/j.sbi.2008.06.006.
      Synthetic biology is a rapidly growing field that has emerged in a global, multidisciplinary effort among biologists, chemists, engineers, physicists, and mathematicians. Broadly, the field has two complementary goals: To improve understanding of biological systems through mimicry and to produce bio-orthogonal systems with new functions. Here we review the area specifically with reference to the concept of synthetic biology space, that is, a hierarchy of components for, and approaches to generating new synthetic and functional systems to test, advance, and apply our understanding of biological systems. In keeping with this issue of Current Opinion in Structural Biology, we focus largely on the design and engineering of biomolecule-based components and systems.
  18. 18. REVERSE ENGINEERING / SYSTEMS BIOLOGY: DEFINE ELEMENTS, DEFINE INTERACTION, MATH DESCRIPTION, HYPOTHESIS
  19. 19. REVERSE ENGINEERING: DEFINE ELEMENTS. Ideally: all the elements of a system (genes, isoforms, proteins with all modifications, metabolites), all together… FOCUS THE QUESTION, then SELECT BASED ON THAT: Genes being modified in T-cell activation? How do they interact?
  20. 20. What is Reverse Engineering? Piece 1: The connections
  21. 21. REVERSE   ENGINEERING   DEFINE  ELEMENTS  
  22. 22. REVERSE ENGINEERING / SYSTEMS BIOLOGY: DEFINE ELEMENTS, DEFINE INTERACTION (PIECE 1)
  23. 23. A   B   A   B   A   B  1   KNOWLEDGE   2   DATA  
  24. 24. INTERACTION IDENTIFICATION BY DATA. http://www.the-dream-project.org/
Wisdom of crowds for robust gene network inference. Daniel Marbach, James C Costello, Robert Küffner, Nicole M Vega, Robert J Prill, Diogo M Camacho, Kyle R Allison, The DREAM5 Consortium, Manolis Kellis, James J Collins & Gustavo Stolovitzky.
Abstract: Reconstructing gene regulatory networks from high-throughput data is a long-standing challenge. Through the Dialogue on Reverse Engineering Assessment and Methods (DREAM) project, we performed a comprehensive blind assessment of over 30 network inference methods on Escherichia coli, Staphylococcus aureus, Saccharomyces cerevisiae and in silico microarray data. We characterize the performance, data requirements and inherent biases of different inference approaches, and we provide guidelines for algorithm application and development. We observed that no single inference method performs optimally across all data sets. In contrast, integration of predictions from multiple inference methods shows robust and high performance across diverse data sets. We thereby constructed high-confidence networks for E. coli and S. aureus, each comprising ~1,700 transcriptional interactions at a precision of ~50%. We experimentally tested 53 previously unobserved regulatory interactions in E. coli, of which 23 (43%) were supported. Our results establish community-based methods as a powerful and robust tool for the inference of transcriptional gene regulatory networks.
‘The wisdom of crowds’ refers to the phenomenon in which the collective knowledge of a community is greater than the knowledge of any individual [1]. Based on this concept, we developed a community approach to address one of the long-standing challenges in molecular and computational biology, which is to uncover and model gene regulatory networks. Genome-scale inference of transcriptional gene regulation has become possible with the advent of high-throughput technologies such as microarrays and RNA sequencing, as they provide snapshots of the transcriptome under many tested experimental conditions. […] successfully used to address many biological problems [8–11], yet when applied to the same data, they can generate disparate sets of predicted interactions [2,3].
Understanding the advantages and limitations of different network inference methods is critical for their effective application in a given biological context. The DREAM project is a framework to enable such an assessment through standardized performance metrics and common benchmarks [12] (http://www.the-dream-project.org/). DREAM is organized around annual challenges, whereby the community of network inference experts is solicited to run their algorithms on benchmark data sets, participating teams submit their solutions to the challenge and the submissions are evaluated [12–14].
Here we present the results for the transcriptional network inference challenge from DREAM5, the fifth annual set of DREAM systems biology challenges. The community of network inference experts was invited to infer genome-scale transcriptional regulatory networks from gene-expression microarray data sets for a prokaryotic model organism (E. coli), a eukaryotic model organism (S. cerevisiae), a human pathogen (S. aureus) and an in silico benchmark (Fig. 1).
The predictions made from this challenge enabled the comprehensive characterization of network inference methods across different species and data sets, providing insights into method performance, data requirements and inherent biases. We found that the performance of inference methods varies, with a different method performing best in each setting. Taking advantage of variation, we integrated predictions across inference methods and demonstrated that the resulting community-based consensus networks are robust across species and data sets, achieving the best overall performance by far. Finally, we […]
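The integration of predictions across inference methods described in the excerpt above can be illustrated with a small sketch. It assumes that each method returns a list of edges ordered from most to least confident, and that the lists are combined by average rank; the function name, the method names and the toy edges are hypothetical and are not taken from the DREAM5 data or code. Giving unreported edges a rank one past the end of a method's list is also an assumption made here for illustration.

```python
from collections import defaultdict


def consensus_by_average_rank(predictions):
    """Combine ranked edge lists from several methods by average rank.

    predictions: dict mapping a method name to a list of
    (regulator, target) edges, ordered from most to least confident.
    An edge that a method did not report is assigned the rank
    len(list) + 1 for that method (an illustrative assumption).
    Returns a list of (edge, average_rank), best (lowest) rank first.
    """
    # Collect every edge predicted by at least one method.
    all_edges = set()
    for edges in predictions.values():
        all_edges.update(edges)

    # Sum each edge's rank over all methods.
    rank_sums = defaultdict(float)
    for edges in predictions.values():
        rank_of = {edge: i + 1 for i, edge in enumerate(edges)}
        missing_rank = len(edges) + 1
        for edge in all_edges:
            rank_sums[edge] += rank_of.get(edge, missing_rank)

    n_methods = len(predictions)
    averaged = {edge: total / n_methods for edge, total in rank_sums.items()}
    # Sort by average rank; break ties by edge name for a stable order.
    return sorted(averaged.items(), key=lambda item: (item[1], item[0]))


# Toy input: three hypothetical methods ranking edges among genes A, B, C.
toy_predictions = {
    "method_a": [("A", "B"), ("A", "C"), ("B", "C")],
    "method_b": [("A", "B"), ("B", "C")],
    "method_c": [("B", "C"), ("A", "B"), ("C", "A")],
}

for edge, avg_rank in consensus_by_average_rank(toy_predictions):
    print(edge, round(avg_rank, 2))
```

In this toy input no single method's list matches the consensus order; the edge that all three methods rank highly ends up first, which is the intuition behind combining methods rather than trusting one.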
