Published on

Published in: Technology
  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide


  1. 1. Molecular Systems Biology 7; Article number 528; doi:10.1038/msb.2011.58Citation: Molecular Systems Biology 7:528& 2011 EMBO and Macmillan Publishers Limited All rights reserved 1744-4292/11www.molecularsystemsbiology.comThe essential genome of a bacteriumBeat Christen1,5, Eduardo Abeliuk1,2,5, John M Collier3, Virginia S Kalogeraki1, Ben Passarelli3,4, John A Coller3, Michael J Fero1,Harley H McAdams1 and Lucy Shapiro1,*1 Department of Developmental Biology, Stanford University, Stanford, CA, USA, 2 Department of Electrical Engineering, Stanford University, Stanford, CA, USA,3 Functional Genomics Facility, Stanford University, Stanford, CA, USA and 4 Stem Cell Institute Genome Center, Stanford University, Stanford, CA, USA5 These authors contributed equally to this work* Corresponding author. Department of Developmental Biology, Stanford University, B300 Beckman Center, Stanford, CA 94305, USA. Tel.: þ 1 650 725 7678;Fax: þ 1 650 725 7739; E-mail: shapiro@stanford.eduReceived 1.6.11; accepted 13.7.11 Caulobacter crescentus is a model organism for the integrated circuitry that runs a bacterial cell cycle. Full discovery of its essential genome, including non-coding, regulatory and coding elements, is a prerequisite for understanding the complete regulatory network of a bacterial cell. Using hyper- saturated transposon mutagenesis coupled with high-throughput sequencing, we determined the essential Caulobacter genome at 8 bp resolution, including 1012 essential genome features: 480 ORFs, 402 regulatory sequences and 130 non-coding elements, including 90 intergenic segments of unknown function. The essential transcriptional circuitry for growth on rich media includes 10 transcription factors, 2 RNA polymerase sigma factors and 1 anti-sigma factor. We identified all essential promoter elements for the cell cycle-regulated genes. The essential elements are preferentially positioned near the origin and terminus of the chromosome. The high- resolution strategy used here is applicable to high-throughput, full genome essentiality studies and large-scale genetic perturbation experiments in a broad class of bacterial species. Molecular Systems Biology 7: 528; published online 30 August 2011; doi:10.1038/msb.2011.58 Subject Categories: functional genomics; microbiology & pathogens Keywords: functional genomics; next-generation sequencing; systems biology; transposon mutagenesisIntroduction accurate identification of the entire essential genome of any bacterial species at a resolution of a few base pairs.In addition to protein-coding sequences, the essential genomeof any organism contains essential structural elements, non-coding RNAs and regulatory sequences. We have identified theCaulobacter crescentus essential genome to 8 bp resolution Results and discussionby performing ultrahigh-resolution transposon mutagenesis We engineered a Tn5 derivative transposon (Tn5Pxyl) thatfollowed by high-throughput DNA sequencing to determine carries at one end an inducible outward pointing Pxylthe transposon insertion sites. A notable feature of C. promoter (Christen et al, 2010; Supplementary Figure 1A;crescentus is that the regulatory events that control polar Materials and methods). Thus, the Tn5Pxyl element candifferentiation and cell-cycle progression are highly integrated, activate or disrupt transcription at any site of integration,and they occur in a temporally restricted order (McAdams and depending on the insertion orientation. About 8 Â105 viableShapiro, 2011). Many components of the core regulatory circuit Tn5Pxyl transposon insertion mutants capable of colonyhave been identified and simulation of the circuitry has been formation on rich media (PYE) plates were pooled. Next,reported (Shen et al, 2008). The identification of all essential DNA from hundred of thousands of transposon insertion sitesDNA elements is essential for a complete understanding of the reading outwards into flanking genomic regions was parallelregulatory networks that run a bacterial cell. PCR amplified and sequenced by Illumina paired-end sequen- Essential protein-coding sequences have been reported for cing (Figure 1; Supplementary Figure 1B; Materials andseveral bacterial species using relatively low-throughput methods). A single sequencing run yielded 118 milliontransposon mutagenesis (Hutchison et al, 1999; Jacobs et al, raw sequencing reads. Of these, 490 million (480%) read2003; Glass et al, 2006) and in-frame deletion libraries outward from the transposon element into adjacent genomic(Kobayashi et al, 2003; Baba et al, 2006). Two recent studies DNA regions (Supplementary Figure 1C) and were subse-used high-throughput transposon mutagenesis for fitness and quently mapped to the 4-Mbp genome, allowing us togenetic interaction analysis (Langridge et al, 2009; van determine the location and orientation of 428 735 independentOpijnen et al, 2009). Here, we have reliably identified all transposon insertions with base-pair accuracy (Figure 2A;essential coding and non-coding chromosomal elements, Materials and methods).using a hyper-saturated transposon mutagenesis strategy that Eighty percent of the genome sequence showed an ultrahighis scalable and can be extended to obtain rapid and highly density of transposon hits; an average of one insertion event& 2011 EMBO and Macmillan Publishers Limited Molecular Systems Biology 2011 1
  2. 2. The essential genome of a bacteriumB Christen et al elements within the origin region of the chromosome. Although 90 additional non-disruptable small genomeHyper-saturated Tn5 mutant pool PCR Illumina flow-cell elements were identified (Supplementary Data-DT1), theytransposon mutagenesis cannot be explained within the context of the current genome annotation. Eighteen of these are conserved in at least one closely related species. Only two could encode a protein ISGrowth selection on of over 50 amino acids.PYE plates Illumina sequencing of IS Tn5 IS Tn5 junctions Essential protein-coding sequences IS For each of the 3876 annotated open reading frames (ORFs), Parallel PCR amplifyTn5 junctions we analyzed the distribution, orientation and genetic contextFigure 1 Genomic high-resolution transposon scanning strategy. Insertion of transposon insertions. We identified the boundaries of themutants are pooled to generate a hyper-saturated Tn5 mutant library. essential protein-coding sequences and calculated a statisti-Subsequent parallel amplification of individual transposon junctions by a nested cally robust metric for ORF essentiality (Materials andarbitrary PCR yields DNA fragments reading out of transposon elements into methods; Supplementary Data-DT2). There are 480 essentialadjacent genomic DNA sequences. DNA fragments carry terminal adapters ORFs and 3240 non-essential ORFs. In addition, there were(orange, blue) compatible to the Illumina flow-cell and are sequenced in parallelby standard paired-end sequencing. 156 ORFs that severely impacted fitness when mutated, as evidenced by a low number of disruptive transposon insertions (Supplementary methods). Figure 2E shows theevery 7.65 bp. The largest gap detectable between consecutive distribution of transposon hits for a subregion of the genomeinsertions was o50 bp (Supplementary Figure 2). Within the encoding essential and non-essential ORFs. Genome-wideremaining 20% of the genome, chromosomal regions of up to transposon insertion frequencies for the annotated Caulobac-6 kb in length tolerated no transposon insertions. ter ORFs are shown in Figure 2F. In all, 145/480 essential ORFs lacked transposon insertions across the entire coding region, suggesting that the full length of the encoded protein up to theEssential non-coding sequences last amino acid is essential. The 8-bp resolution allowed aWithin non-coding sequences of the Caulobacter genome, we dissection of the essential and non-essential regions of thedetected 130 small non-disruptable DNA segments between coding sequences. Sixty ORFs had transposon insertions90 and 393 bp long (Materials and methods; Supplementary within a significant portion of their 30 region but lackedData-DT1). (Tables in the Excel file of Supplementary Data are insertions in the essential 50 coding region, allowing thedesignated DT1, DT2 and so on.) Owing to the uniform identification of non-essential protein segments. For example,distribution of transposition across the genome (Materials and transposon insertions in the essential cell-cycle regulatorymethods), such non-disruptable DNA regions are highly gene divL, a tyrosine kinase, showed that the last 204unlikely (Supplementary Figure 2). Among 27 previously C-terminal amino acids did not impact viability (Figure 2G),identified and validated sRNAs (Landt et al, 2008), three confirming previous reports that the C-terminal ATPase(annotated as R0014, R0018 and R0074 in Landt et al, 2008) domain of DivL is dispensable for viability (Reisinger et al,were contained within non-disruptable DNA segments while 2007; Iniesta et al, 2010). Our results show that the entireanother three (R0005, R0019 and R0025) were partially C-terminal ATPase domain, as well as the majority of thedisruptable. Figure 2B shows one of the three (Supplementary adjacent kinase domain, is non-essential while the N-terminalData-DT1) non-disruptable sRNA elements, R0014, that is region including the first 25 amino acids of the kinase domainupregulated upon entry into stationary phase (Landt et al, contain essential DivL functions.2008). Two additional small RNAs found to be essential are the Conversely, we found 30 essential ORFs that toleratedtransfer-messenger RNA (tmRNA) and the ribozyme RNAseP disruptive transposon insertions within the 50 region while(Landt et al, 2008). In addition to the 8 non-disruptable sRNAs, no insertion events were tolerated further downstream29 out of the 130 essential non-coding sequences contained (Supplementary Table 1). One such example, the essentialnon-redundant tRNA genes (Figure 2C); duplicated tRNA histidine phosphotransferase gene chpT (Biondi et al, 2006),genes were found to be non-essential. We identified two had 12 transposon insertions near the beginning of thenon-disruptable DNA segments within the chromosomal annotated ORF (Figure 2H). These transposon insertionsorigin of replication (Figure 2D). A 173-bp long essential would prevent the production of a functional protein andregion contains three binding sites for the replication repressor should not be detectable within chpT or any essential ORFCtrA, as well as additional sequences that are essential for unless the translational start site is mis-annotated. Using LacZ-chromosome replication and initiation control (Marczynski reporter assays (Supplementary methods), we found that theet al, 1995). A second 125 bp long essential DNA segment promoter element as well as the translational start site of chpTcontains a binding motif for the replication initiator protein was located downstream of the annotated start codonDnaA. Surprisingly, between these non-disruptable origin (Figure 2H). Cumulatively, 46% of all essential ORFssegments there were multiple transposon hits suggesting that (30 out of 480) appear to be shorter than the annotatedthe Caulobacter origin is modular with possible DNA looping ORF (Supplementary Table 1), suggesting that these arecompensating for large insertion sequences. Thus, we resolved probably mis-annotated, as well. Thus, 145 ORFs showedessential non-coding RNAs, tRNAs and essential replication all regions were essential, 60 ORFs showed non-essential2 Molecular Systems Biology 2011 & 2011 EMBO and Macmillan Publishers Limited
  3. 3. The essential genome of a bacterium B Christen et al i q u e T n 5 in A 5 un B 74 bp small RNA C 75 bp tRNA-Thr Probe cross se -correlation 73 r ti 28 o 4 ns sRNA Caulobacter genome: 4 Mbp CCNA 0778 CCNA 0779 CCNA 2436 CCNA 2437 Essentiality Essentiality 10–8 (P -value) (P-value) 250 bp 10–8 250 bp 10–6 10–6 10–4 10–4 10–2 10–2 Cori 839 750 840 250 2 577 500 2 578 000 Genome coordinates (bp) Genome coordinates (bp) D Origin of replication E 173 bp 125 bp ftsZ ftsA ftsQ ddl murB murC murG ftsW 10–80 2500 bp Essentiality (P -value) hemE CCNA 0001 10–60 10–40 Essentiality 10–8 (P -value) 250 bp 10–6 10–20 10–4 10–2 Genome coordinates (bp) 0 250 500 Genome coordinates (bp) G H Mis-annotated start 250 bp Essential protein domains F 1000 bp 3240 Non-essential chpT Number of Tn5 insertions ORFs 400 divL Real start Promoter activity 300 800 200 156 High-fitness ALA_Rich PAS HK ATPase 400 ORFs 0 100 –200 –150 –100 –50 ATG 480 Essential ORFs Essential part of DivL (565 AA) Length of promoter probe (bp) 0 0 1000 2000 3000 4000 ORF length (bp)Figure 2 Identification of essential genome features. (A) 28 735 unique Tn5 insertions sites (red) were mapped onto the 4-Mbp Caulobacter genome that encodes3876 annotated ORFs shown in the inner (minus strand) and middle (plus strand) tracks by black lines. (B) An 192-bp essential genome segment (no Tn5 insertions)contains a stationary phase expressed non-coding sRNA (Landt et al, 2008). The rectangular heat map above shows the micro-array probe cross-correlation pattern ofthe sRNA (Landt et al, 2008). The locations of transposon insertions (red marks) are shown above the genome track. P-values for essentiality for the different gap sizesobserved are below. (C) A small non-disruptable segment containing an essential tRNA. (D) Two non-disruptable genome regions include two regulatory sequencesrequired for chromosome replication. (E) Locations of mapped transposon insertion sites (red marks) on a segment of the Caulobacter genome. Non-essential ORFs(blue) have dense Tn5 transposon insertions, while large non-disruptable genome regions contain essential ORFs (light red). For every non-disruptable genome region,a P-value for gene essentiality is calculated assuming uniform distributed Tn5 insertion frequency and neutral fitness costs. (F) For each of the 3876 Caulobacter ORFs,the number of Tn5 insertions is plotted against the corresponding ORF length. Non-essential ORFs (blue), fitness relevant ORFs (Supplementary information) (green)and essential ORFs (red) have different transposon insertion frequencies. (G) The essential cell-cycle gene divL had multiple transposon insertions within the 30 tail. Thisdispensable region encodes parts of the histidine kinase domain as well as an ATPase domain that is non-essential. (H) One of the ORFs with mis-annotated start site.The essential cell-cycle gene chpT tolerates disruptive Tn5 insertions in the 50 region of the mis-annotated ORF. The native promoter element and TSS are locateddownstream of the mis-annotated start codon as confirmed by lacZ promoter activity assays.C-termini and the start of 30 ORFs were mis-annotated. The proteins are of unknown function (Table I; Supplementaryremaining 245 ORFs tolerated occasional insertions within a Table 2). We attempted to delete 11 of the genes encodingfew amino acids of the ORF boundaries (Supplementary Figure essential hypothetical proteins and recovered no in-frame3; Materials and methods). deletions, confirming that these proteins are indeed essential The majority of the essential ORFs have annotated (Supplementary Table 3).functions. They participate in diverse core cellular processes Among the 480 essential ORFs, there were 10 essentialsuch as ribosome biogenesis, energy conversion, metabolism, transcriptional regulatory proteins (Supplementary Table 4),cell division and cell-cycle control. Forty-nine of the essential including the cell-cycle regulators ctrA, gcrA, ccrM, sciP and& 2011 EMBO and Macmillan Publishers Limited Molecular Systems Biology 2011 3
  4. 4. The essential genome of a bacteriumB Christen et alTable I The essential Caulobacter genome essential operon is the transcript encoding ATPase synthase components (Figure 3B). Altogether, the 480 essential protein- Quantity Size Fraction of (bp) genome coding and 37 essential RNA-coding Caulobacter genes are (%)a organized into operons such that 402 individual promoter regions are sufficient to regulate their expression (Table I).Essential non-coding elements 130 14 991 0.37 Of these 402 essential promoters, the transcription start sites Non-coding elements, unknown 91 10 893 0.27 function (TSSs) of 105 were previously identified (McGrath et al, 2007). tRNAs 29 2312 0.06 We found that 79/105 essential promoter regions extended Small non-coding RNAs 8 1488 0.04 on average 53 bp upstream beyond previously identified TSS Genome replication elements in 2 298 0.01 the Cori (Figure 3C; McGrath et al, 2007). These essential control elements accommodate binding sites for transcription factorsEssential ORFs 480 444 417 11.00 and RNA polymerase sigma factors (Supplementary Table 5). Metabolism 176 160 011 3.96 Of the 402 essential promoter regions, 26 mapped downstream Ribosome function 95 76 420 1.89 Cell wall & membrane biogenesis 54 63 393 1.57 of the predicted TSS. To determine if these contained an Proteins of unknown function 49 30 469 0.75 additional TSS, we fused the newly identified promoter regions Cell cycle, division and DNA 43 52 322 1.29 with lacZ and found that 24 contained an additional TSS replication Other cellular processes 39 36 017 0.89 (Supplementary Table 6). Therefore, 24 genes contain at least 2 Transcription 15 16 417 0.41 TSS and only the downstream site was found to be essential Signal transduction 9 9368 0.23 during growth on rich media. The upstream TSS may beEssential promoter regions 402 33 533 0.83 required under alternative growth conditions. Contained within intergenic 210 13150 0.33 sequences Extending into upstream ORFs 101 11 428 0.28 Driving operons 91 8955 0.22 Cell cycle-regulated essential genesEssential Caulobacter genome 492 941 12.19 Of the essential ORFs, 84 have a cell cycle-dependent transcription pattern (McGrath et al, 2007; SupplementaryThe identified essential non-coding, protein-coding and regulatory genome Data-DT5). The cell cycle-regulated essential genes hadfeatures are shown.a Fraction of the entire Caulobacter genome encoding essential elements. statistically significant longer promoter regions compared with non-cell cycle-regulated genes (median length 87 versus 41 bp, Mann–Whitney test, P-value 0.0018). The genes with longer promoter regions generally have more complexdnaA (McAdams and Shapiro, 2003; Holtzendorff et al, 2004; transcriptional control. Among these are key genes that areCollier and Shapiro, 2007; Gora et al, 2010; Tan et al, 2010), plus critical for the commitment to energy requirements and5 uncharacterized putative transcription factors. We surmise regulatory controls for cell-cycle progression. For example,that these five uncharacterized transcription factors either the cell-cycle master regulators ctrA, dnaA and gcrA (Colliercomprise transcriptional activators of essential genes or et al, 2006) ranked among the genes with the longest essentialrepressed genes that would move the cell out of its replicative promoter regions (Figure 3D and E; Supplementary Data-DT5).state. In addition, two RNA polymerase sigma factors RpoH Other essential cell cycle-regulated genes with exceptionallyand RpoD, as well as the anti-sigma factor ChrR, which long essential promoters included ribosomal genes, gyrBmitigates rpoE-dependent stress response under physiological encoding DNA gyrase and the ftsZ cell-division genegrowth conditions (Lourenco and Gomes, 2009), were also (Figure 3E). The essential promoter region of ctrA extendedfound to be essential. Thus, a set of 10 transcription factors, 171 bp upstream of the start codon (Figure 3F) and included2 RNA polymerase sigma factors and 1 anti-sigma factor two previously characterized promoters that control itscomprise the essential core transcriptional regulators for transcription by both positive and negative feedback regula-growth on rich media. tion (Domian et al, 1999; Tan et al, 2010). Only one of the two upstream SciP binding sites in the ctrA promoter (Tan et al, 2010) was contained within the essential promoter regionEssential promoter elements (Figure 3F), suggesting that the regulatory function of theTo characterize the core components of the Caulobacter cell- second SciP binding site upstream is non-essential for growthcycle control network, we identified essential regulatory on rich media.sequences and operon transcripts (Supplementary Data-DT3 Altogether, the essential Caulobacter genome contains atand DT4). Figure 3A illustrates the transposon scanning least 492 941 bp. Essential protein-coding sequences comprisestrategy used to locate essential promoter sequences. The 90% of the essential genome. The remaining 10% consistspromoter regions of 210 essential genes were fully contained of essential non-coding RNA sequences, gene regulatorywithin the upstream intergenic sequences, and promoter elements and essential genome replication features (Table I).regions of 101 essential genes extended upstream into flanking Essential genome features are non-uniformly distributed alongORFs (Table I). We also identified 206 essential genes that are the Caulobacter genome and enriched near the origin and theco-transcribed with the corresponding flanking gene(s) and terminus regions, indicating that there are constraints on theexperimentally mapped 91 essential operon transcripts chromosomal positioning of essential elements (Figure 4A).(Table I; Supplementary Data-DT4). One example of an The chromosomal positions of the published E. coli essential4 Molecular Systems Biology 2011 & 2011 EMBO and Macmillan Publishers Limited
  5. 5. The essential genome of a bacterium B Christen et alA B Schematic of essential promoter mapping Identification of an essential operon transcript Sense 500 bp insertions Essential ORF Sense insertion atpB E atpF atpF CCNA 0369 Anti-sense insertion Essential ORF ATP synthase operon Anti-sense insertions Essential promoter region E Essential promoter analysis of cell cycle-regulated genes lacking anti-sense insertions Essential promoter (bp)C D 300 200 100 0 Number of genesNumber of genes 80 20 15 60 gcrA dnaA 10 40 ctrA Cell cycle-expression profile 20 dnaA 5 gcrA 0 0 0 40 80 120 0 100 200 gyrB Essential promoter region Length of essential ftsZ Ribosomal proteins extending beyond TSS (bp) promoter region (bp)F Multiple TSS within essential promoter regions P1 P2 ctrA RNAP RNAP Ribosomal proteins ctrA Essential ctrA promoter (171 bp) 0 24 48 84 108 132 156 SciP binding site CtrA binding site Repressed Induced Cell-cycle progression (min)Figure 3 Genome-wide identification of essential gene regulatory sequences. (A) Transposon insertions within the promoter region of an essential gene are onlyviable if the transposon-specific promoter points toward the gene (sense insertion); insertions in the opposite orientation (anti-sense insertions) are lethal. The distancebetween the annotated start codon and the first detected occurrence of an anti-sense insertion within the upstream region of an essential gene defines its essentialpromoter region. (B) Anti-sense insertions (red marks below genome track) are absent throughout the entire ATP synthase operon while sense insertions (red marksabove genome track) are only tolerated within the non-essential lead gene and within non-coding regions of co-transcribed downstream genes. (C) Lengths of promoterregions that extend upstream of predicted transcriptional start sites. (D) Size distribution of essential promoter regions. The cell-cycle master regulators ctrA, dnaA andgcrA, which are subjected to complex cell cycle-dependent regulation, ranked among the longest essential promoter regions identified. (E) Essential cell cycle-regulatedgenes are clustered according to their temporal expression profile. Essential genes with long essential promoter regions indicate cell-cycle hub nodes subjected tocomplex transcriptional regulation. (F) Only insertions in the sense strand (red lines above the genome track) are tolerated within the 171 bp long essential promoterregion of the cell-cycle master regulator ctrA. Insertions in the anti-sense strand (red lines below the genome track) are absent from the essential ctrA promoter. Bothtranscription start sites (arrows P1, P2), two DNA binding sites for CtrA (gray boxes) and one of two reported binding site for the SciP transcription factor (yellow boxes)are contained within the identified essential promoter region.coding sequences are preferentially located at either side of the for Caulobacter, but not for E. coli, since Caulobacter cannotorigin (Figure 4A; Rocha, 2004). produce ATP through fermentation. Thus, the essentiality of a The question of what genes constitute the minimum set gene is also defined by non-local properties that not onlyrequired for prokaryotic life has been generally estimated by depend on its own function but also on the functions of allcomparative essentiality analysis (Carbone, 2006) and for a other essential elements in the genome. The strategy describedfew species experimentally via large-scale gene perturbation here provides a direct experimental approach that, because ofstudies (Akerley et al, 1998; Hutchison et al, 1999; Kobayashi its simplicity and general applicability, can be used to quicklyet al, 2003; Salama et al, 2004). Of the 480 essential determine the essential genome for a large class of bacterialCaulobacter ORFs, 38% are absent in most species outside species.the a-proteobacteria and 10% are unique to Caulobacter(Figure 4B). Interestingly, among 320 essential Caulobacter Materials and methodsproteins that are conserved in E. coli, more than one third are Supplementary information includes descriptions of (i) transposonnon-essential (Figure 4C). The variations in essential gene construction and mutagenesis, (ii) DNA library preparation andcomplements relate to differences in bacterial physiology and sequencing, (iii) sequence processing, (iv) essentiality analysis andlife style. For example, ATP synthase components are essential (v) statistical data analysis.& 2011 EMBO and Macmillan Publishers Limited Molecular Systems Biology 2011 5
  6. 6. The essential genome of a bacteriumB Christen et al –log(P -value) Essential transcriptsA Distribution of essential C. crescentus transcripts Figure 4 Chromosomal distribution of essential transcripts and phylogenetic 75 conservation of essential Caulobacter ORFs. (A) Chromosome distribution of 50 essential transcripts and ORFs found within a 500-kb window. Top panel: Essential Caulobacter transcripts are non-uniformly distributed and are enriched 25 ori ter ori near the replication origin (ori) and terminus (ter) regions of the chromosome. 0 The P-values for enrichment of essential Caulobacter ORFs (middle panel) and E. coli ORFs (bottom panel) are graphed as a function of chromosomal position. 30 Enrichment of essential C. crescentus ORFs ori Inserts show a schematic representation of the corresponding circular chromosomes indicating regions of enrichment in red. (B) A heat map showing 20 the sequence conservation of essential Caulobacter proteins across the a, b, g, 10 d and e classes of proteobacteria. Highly conserved essential proteins are in red ter while poorly conserved proteins are in black. (C) Venn diagram of overlap 0 between Caulobacter and E. coli ORFs (outer circles) as well as their subsets of essential ORFs (inner circles). Less than 38% of essential Caulobacter ORFs are Enrichment of essential E. coli ORFs ori –log(P -value) 30 conserved and essential in E. coli. Only essential Caulobacter ORFs present in the STING database were considered, leading to a small disparity in the total 20 number of essential Caulobacter ORFs. 10 ter 0 0.00 0.25 0.50 0.75 1.00 Supplementary information Rel. genome coordinates Supplementary information is available at the Molecular Systems Biology website ( Proteobacteria α β γ δ ε Acknowledgements This research was supported by DOE Office of Science grant DE-FG02- 05ER64136 to HM; NIH grants K25 GM070972-01A2 to MF, R01, GM51426k, R01 GM32506 and GM073011-04 to LS; Swiss National 480 Identified essential Caulobacter ORFs Foundation grant PA00P3-126243 to BC and the L&Th. La Roche Foundation Fellowship to BC. Author contributions: BC designed the research. BC, EA, VSK and MJF performed the experiments and analysis. BC, JMC, BP and JAC performed the sequencing and related analysis. BC, EA, HHM and LS wrote the manuscript. Conflict of interest The authors declare that they have no conflict of interest. References Akerley BJ, Rubin EJ, Camilli A, Lampe DJ, Robertson HM, Mekalanos JJ (1998) Systematic identification of essential genes by in vitro mariner mutagenesis. Proc Natl Acad Sci USA 95: 8927–8932 Bit-homology score Baba T, Ara T, Hasegawa M, Takai Y, Okumura Y, Baba M, Datsenko KA, Tomita M, Wanner BL, Mori H (2006) Construction 0.0 0.2 0.4 0.6 0.8 1.0 of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol Syst Biol 2: 2006 0008C Caulobacter E. coli Biondi EG, Reisinger SJ, Skerker JM, Arif M, Perchuk BS, Ryan KR, ORFs ORFs Laub MT (2006) Regulation of the bacterial cell cycle by an 779 2712 integrated genetic circuit. Nature 444: 899–904 2446 Carbone A (2006) Computational prediction of genomic functional 130 67 cores specific to different microbes. J Mol Evol 63: 733–746 Christen B, Fero MJ, Hillson NJ, Bowman G, Hong SH, Shapiro L, 210 McAdams HH (2010) High-throughput identification of protein 129 235 localization dependency networks. Proc Natl Acad Sci USA 107: 4681–4686 Collier J, Murray SR, Shapiro L (2006) DnaA couples DNA replication and the expression of two cell cycle master regulators. EMBO J 25: Essential for Essential for 346–356 Caulobacter E. coli Collier J, Shapiro L (2007) Spatial complexity and control of a bacterial cell cycle. Curr Opin Biotechnol 18: 333–340 Domian IJ, Reisenauer A, Shapiro L (1999) Feedback control of a master bacterial cell-cycle regulator. Proc Natl Acad Sci USA 96: 6648–66536 Molecular Systems Biology 2011 & 2011 EMBO and Macmillan Publishers Limited
  7. 7. The essential genome of a bacterium B Christen et alGlass JI, Assad-Garcia N, Alperovich N, Yooseph S, Lewis MR, mediated by the sigmaE-ChrR system in Caulobacter crescentus. Maruf M, Hutchison III CA, Smith HO, Venter JC (2006) Essential Mol Microbiol 72: 1159–1170 genes of a minimal bacterium. Proc Natl Acad Sc USA 103: 425–430 Marczynski GT, Lentine K, Shapiro L (1995) A developmentallyGora KG, Tsokos CG, Chen YE, Srinavasan BS, Perchuk BS, regulated chromosomal origin of replication uses essential Laub MT (2010) A cell-type-specific protein-protein interaction transcription elements. Genes Dev 9: 1543–1557 modulates transcriptional activity of a master regulator in McAdams HH, Shapiro L (2003) A bacterial cell-cycle regulatory Caulobacter crescentus. Mol Cell 39: 455–467 network operating in time and space. Science 301: 1874–1877Holtzendorff J, Hung D, Brende P, Reisenauer A, Viollier PH, McAdams HH, Shapiro L (2011) The architecture and conservation McAdams HH, Shapiro L (2004) Oscillating global regulators pattern of whole-cell control circuitry. J Mol Biol 409: 28–35 control the genetic circuit driving a bacterial cell cycle. Science McGrath PT, Lee H, Zhang L, Iniesta AA, Hottes AK, Tan MH, 304: 983–987 Hillson NJ, Hu P, Shapiro L, McAdams HH (2007) High-throughputHutchison CA, Peterson SN, Gill SR, Cline RT, White O, Fraser CM, identification of transcription start sites, conserved promoter Smith HO, Venter JC (1999) Global transposon mutagenesis motifs and predicted regulons. Nat Biotechnol 25: 584–592 and a minimal Mycoplasma genome. Science 286: 2165–2169 Reisinger SJ, Huntwork S, Viollier PH, Ryan KR (2007) DivL performsIniesta AA, Hillson NJ, Shapiro L (2010) Cell pole-specific activation critical cell cycle functions in Caulobacter crescentus independent of of a critical bacterial cell cycle kinase. Proc Natl Acad Sci USA 107: kinase activity. J Bacteriol 189: 8308–8320 7012–7017 Rocha EP (2004) The replication-related organization of bacterialJacobs MA, Alwood A, Thaipisuttikul I, Spencer D, Haugen E, Ernst S, genomes. Microbiology 150: 1609–1627 Will O, Kaul R, Raymond C, Levy R, Chun-Rong L, Guenthner D, Salama NR, Shepherd B, Falkow S (2004) Global transposon Bovee D, Olson MV, Manoil C (2003) Comprehensive transposon mutagenesis and essential gene analysis of Helicobacter pylori. mutant library of Pseudomonas aeruginosa. Proc Natl Acad Sci USA J Bacteriol 186: 7926–7935 100: 14339–14344 Shen X, Collier J, Dill D, Shapiro L, Horowitz M, McAdams HHKobayashi K, Ehrlich SD, Albertini A, Amati G, Andersen KK, (2008) Architecture and inherent robustness of a bacterial Arnaud M, Asai K, Ashikaga S, Aymerich S, Bessieres P, Boland F, cell-cycle control system. Proc Natl Acad Sci USA 105: 11340–11345 Brignell SC, Bron S, Bunai K, Chapuis J, Christiansen LC, Tan MH, Kozdon JB, Shen X, Shapiro L, McAdams HH (2010) An Danchin A, Debarbouille M, Dervyn E, Deuerling E et al (2003) essential transcription factor, SciP, enhances robustness of Essential Bacillus subtilis genes. Proc Natl Acad Sci USA 100: Caulobacter cell cycle regulation. Proc Natl Acad Sci USA 107: 4678–4683 18985–18990Landt SG, Abeliuk E, McGrath PT, Lesley JA, McAdams HH, Shapiro L van Opijnen T, Bodi KL, Camilli A (2009) Tn-seq: high-throughput (2008) Small non-coding RNAs in Caulobacter crescentus. parallel sequencing for fitness and genetic interaction studies in Mol Microbiol 68: 600–614 microorganisms. Nat Methods 6: 767–772Langridge GC, Phan MD, Turner DJ, Perkins TT, Parts L, Haase J, Charles I, Maskell DJ, Peters SE, Dougan G, Wain J, Parkhill J, Molecular Systems Biology is an open-access journal Turner AK (2009) Simultaneous assay of every Salmonella published by European Molecular Biology Organiza- typhi gene using one million transposon mutants. Genome Res 19: 2308–2316 tion and Nature Publishing Group. This work is licensed under aLourenco RF, Gomes SL (2009) The transcriptional response to Creative Commons Attribution-Noncommercial-Share Alike 3.0 cadmium, organic hydroperoxide, singlet oxygen and UV-A Unported License.& 2011 EMBO and Macmillan Publishers Limited Molecular Systems Biology 2011 7