SlideShare a Scribd company logo
1 of 85
Download to read offline
EVE 161:

Microbial Phylogenomics
Era IV:
Genomes from the Environment
UC Davis, Winter 2016
Instructor: Jonathan Eisen
!1
Quiz
Choose 2 of these 4 questions to answer.
• 1. Explain how phylogenetic analysis was used in these
papers
• 2. Give an example of how the authors tried to predict the
function of this new “proteorhodopsin” protein.
• 3. Give an example of how the authors experimentally
studied the function of this new “proteorhodopsin.”
• 4. Why did they call this new protein “proteorhodopsin”?
Metagenomics History
• Term first use in 1998 paper by Handelsman et al.
Chem Biol. 1998 Oct;5(10):R245-9. Molecular
biological access to the chemistry of unknown soil
microbes: a new frontier for natural products
• Collective genomes of microbes in the soil termed
the “soil metagenome”
• Good review is Metagenomics: Application of
Genomics to Uncultured Microorganisms by
Handelsman Microbiol Mol Biol Rev. 2004
December; 68(4): 669–685.
Papers for Today
DOI: 10.1126/science.289.5486.1902
, 1902 (2000);289Science
et al.Oded Béjà,
Phototrophy in the Sea
Bacterial Rhodopsin: Evidence for a New Type of
This copy is for your personal, non-commercial use only.
.clicking herecolleagues, clients, or customers by
, you can order high-quality copies forIf you wish to distribute this article to others
.herefollowing the guidelines
can be obtaPermission to republish or repurpose articles or portions of articles
(this information is current as of May 18, 2010 ):
The following resources related to this article are available online at www.scien
http://www.sciencemag.org/cgi/content/full/289/5486/1902
version of this article at:
including high-resolution figures, can be found inUpdated information and services,
found at:
crelated to this articleA list of selected additional articles on the Science W eb sites
17 GPa. J. Geophys. Res. 102, 12252±12263 (1997).
28. Duffy, T. S. & Vaughan, M. T. Elasticity of enstatite and its relationships to crystal structure. J. Geophys.
Res. 93, 383±391 (1988).
Acknowledgements
We thank C. Karger for participation in the measurements, and M. Ferrero and F. Boudier
for providing the samples from Baldissero and Papua New Guinea, respectively. The
Laboratoire de Tectonophysique's EBSD system was funded by the CNRS/INSU,
Universite of Montpellier II, and NSF project ``Anatomy of an archean craton''. This work
was supported by the CNRS/INSU programme ``Action TheÂmatique Innovante''.
Correspondence should be addressed to A.T.
(e-mail: deia@dstu.univ-montp2.fr).
.................................................................
Proteorhodopsin phototrophy
in the ocean
Oded BeÂjaÁ*², Elena N. Spudich²³, John L. Spudich³, Marion Leclerc*
& Edward F. DeLong*
* Monterey Bay Aquarium Research Institute, Moss Landing, California 95039,
USA
³ Department of Microbiology and Molecular Genetics, The University of
Texas Medical School, Houston, Texas 77030, USA
²
characteristic of
cycle seen with E
proteorhodopsin
Furthermore,
spectrum of the
and same isosbes
dopsin expressed
chemical reactio
generate a red-s
to that of recomb
¯ash-photolysis
existence of pro
retinal molecules
waters.
The photocyc
and, after hydro
adding all-trans
indeed derive fro
–1
0
1
2
orbance(×10–3)
4
http://www.sciencemag.org/content/289/5486/1902.full
http://www.nature.com/nature/journal/v411/n6839/abs/411786a0.html
Papers for Today
DOI: 10.1126/science.289.5486.1902
, 1902 (2000);289Science
et al.Oded Béjà,
Phototrophy in the Sea
Bacterial Rhodopsin: Evidence for a New Type of
This copy is for your personal, non-commercial use only.
.clicking herecolleagues, clients, or customers by
, you can order high-quality copies forIf you wish to distribute this article to others
.herefollowing the guidelines
can be obtaPermission to republish or repurpose articles or portions of articles
(this information is current as of May 18, 2010 ):
The following resources related to this article are available online at www.scien
http://www.sciencemag.org/cgi/content/full/289/5486/1902
version of this article at:
including high-resolution figures, can be found inUpdated information and services,
found at:
crelated to this articleA list of selected additional articles on the Science W eb sites
http://www.sciencemag.org/content/289/5486/1902.full
Extremely halophilic archaea contain retinal-binding integral membrane pro-
teins called bacteriorhodopsins that function as light-driven proton pumps. So
far, bacteriorhodopsins capable of generating a chemiosmotic membrane po-
tential in response to light have been demonstrated only in halophilic archaea.
We describe here a type of rhodopsin derived from bacteria that was discovered
through genomic analyses of naturally occuring marine bacterioplankton. The
bacterial rhodopsin was encoded in the genome of an uncultivated -
proteobacterium and shared highest amino acid sequence similarity with
archaeal rhodopsins. The protein was functionally expressed in Escherichia coli
and bound retinal to form an active, light-driven proton pump. The new
rhodopsin ex- hibited a photochemical reaction cycle with intermediates and
kinetics characteristic of archaeal proton-pumping rhodopsins. Our results
demonstrate that archaeal-like rhodopsins are broadly distributed among
different taxa, including members of the domain Bacteria. Our data also
indicate that a previously unsuspected mode of bacterially mediated light-driven
energy generation may commonly occur in oceanic surface waters worldwide.
SAR86
gene
le ge-
iden-
roteo-
from
opsins
erent.
hereas
philes
r than
rmine
l, we
a coli
pres-
rotein
3A).
nes of
popro-
m was
(Fig.
at 520
band-
erated
odop-
nce of
dth is
own transducer of light stimuli [for example,
Htr (22, 23)]. Although sequence analysis of
proteorhodopsin shows moderate statistical
support for a specific relationship with sen-
the kinetics of its photochemical reaction cy-
cle. The transport rhodopsins (bacteriorho-
dopsins and halorhodopsins) are character-
ized by cyclic photochemical reaction se-
!8
The 16S ribosomal RNA neighbor-joining tree was constructed on the
basis of 1256 homologous positions by the “neighbor” program of the
PHYLIP package [J. Felsenstein, Methods Enzymol. 266, 418 (1996)].
DNA distances were calculated with the Kimura’s 2-parameter method.
Cloning of proteorhodopsin. Sequence analysis of a
130-kb genomic fragment that encoded the ribosomal
RNA (rRNA) operon from an uncultivated member of
the marine g-Proteobacteria (that is, the “SAR86”
group) (8, 9) (Fig. 1A) also revealed an open reading
frame (ORF) encoding a putative rhodopsin (referred
to here as proteorhodopsin) (10).
Cloning of proteorhodopsin. Sequence analysis of a
130-kb genomic fragment that encoded the ribosomal
RNA (rRNA) operon from an uncultivated member of
the marine g-Proteobacteria (that is, the “SAR86”
group) (8, 9) (Fig. 1A) also revealed an open reading
frame (ORF) encoding a putative rhodopsin (referred
to here as proteorhodopsin) (10).
Marine Microbe Background
• rRNA PCR studies of marine microbes have been extensive
• Comparative analysis had revealed many lineages, some very
novel, some less so, that were dominant in many, if not all,
open ocean samples
• Lineages given names based on specific clones: e.g., SAR11,
SAR86, etc
!14
http://www.nature.com/nature/journal/v345/n6270/abs/345060a0.html
© 1990 Nature
Vol. 57, No. 6APPLIED AND ENVIRONMENTAL MICROBIOLOGY, June 1991, p. 1707-1713
0099-2240/91/061707-07$02.00/0
Copyright C) 1991, American Society for Microbiology
Phylogenetic Analysis of a Natural Marine Bacterioplankton
Population by rRNA Gene Cloning and Sequencing
THERESA B. BRITSCHGI AND STEPHEN J. GIOVANNONI*
Department of Microbiology, Oregon State University, Corvallis, Oregon 97331
Received 5 February 1991/Accepted 25 March 1991
The identification of the prokaryotic species which constitute marine bacterioplankton communities has been
a long-standing problem in marine microbiology. To address this question, we used the polymerase chain
reaction to construct and analyze a library of 51 small-subunit (16S) rRNA genes cloned from Sargasso Sea
bacterioplankton genomic DNA. Oligonucleotides complementary to conserved regions in the 16S rDNAs of
eubacteria were used to direct the synthesis of polymerase chain reaction products, which were then cloned by
blunt-end ligation into the phagemid vector pBluescript. Restriction fragment length polymorphisms and
hybridizations to oligonucleotide probes for the SARll and marine Synechococcus phylogenetic groups
indicated the presence of at least seven classes of genes. The sequences of five unique rDNAs were determined
completely. In addition to 16S rRNA genes from the marine Synechococcus cluster and the previously identified
but uncultivated microbial group, the SARll cluster [S. J. Giovannoni, T. B. Britschgi, C. L. Moyer, and
K. G. Field, Nature (London) 345:60-63], two new gene classes were observed. Phylogenetic comparisons
indicated that these belonged to unknown species of a- and -y-proteobacteria. The data confirm the earlier
conclusion that a majority of planktonic bacteria are new species previously unrecognized by bacteriologists.
Within the past decade, it has become evident that bacte-
rioplankton contribute significantly to biomass and bio-
geochemical activity in planktonic systems (4, 36). Until
recently, progress in identifying the microbial species which
constitute these communities was slow, because the major-
ity of the organisms present (as measured by direct counts)
cannot be recovered in cultures (5, 14, 16). The discovery a
decade ago that unicellular cyanobacteria are widely distrib-
uted in the open ocean (34) and the more recent observation
of abundant marine prochlorophytes (3) have underscored
the uncertainty of our knowledge about bacterioplankton
community structure.
Molecular approaches are now providing genetic markers
for the dominant bacterial species in natural microbial pop-
ulations (6, 21, 33). The immediate objectives of these
studies are twofold: (i) to identify species, both known and
novel, by reference to sequence data bases; and (ii) to
construct species-specific probes which can be used in
quantitative ecological studies (7, 29).
Previously, we reported rRNA genetic markers and quan-
titative hybridization data which demonstrated that a novel
lineage belonging to the oa-proteobacteria was abundant in
the Sargasso Sea; we also identified novel lineages which
were very closely related to cultivated marine Synechococ-
cus spp. (6). In a study of a hot-spring ecosystem, Ward and
coworkers examined cloned fragments of 16S rRNA cDNAs
and found numerous novel lineages but no matches to genes
from cultivated hot-spring bacteria (33). These results sug-
gested that natural ecosystems in general may include spe-
cies which are unknown to microbiologists.
Although any gene may be used as a genetic marker,
rRNA genes offer distinct advantages. The extensive use of
16S rRNAs for studies of microbial systematics and evolu-
tion has resulted in large computer data bases, such as the
RNA Data Base Project, which encompass the phylogenetic
diversity found within culture collections. rRNA genes are
highly conserved and therefore can be used to examine
distant phylogenetic relationships with accuracy. The pres-
ence of large numbers of ribosomes within cells and the
regulation of their biosyntheses in proportion to cellular
growth rates make these molecules ideally suited for ecolog-
ical studies with nucleic acid probes.
Here we describe the partial analysis of a 16S rDNA
library prepared from photic zone Sargasso Sea bacterio-
plankton using a general approach for cloning eubacterial
16S rDNAs. The Sargasso Sea, a central oceanic gyre,
typifies the oligotrophic conditions of the open ocean, pro-
viding an example of a microbial community adapted to
extremely low-nutrient conditions. The results extend our
previous observations of high genetic diversity among
closely related lineages within this microbial community and
provide rDNA genetic markers for two novel eubacterial
lineages.
MATERIALS AND METHODS
Collection and amplification of rDNA. A surface (2-m)
bacterioplankton population was collected from hydrosta-
tion S (32°4'N, 64°23'W) in the Sargasso Sea as previously
described (8). The polymerase chain reaction (24) was used
to produce double-stranded rDNA from the mixed-microbi-
al-population genomic DNA prepared from the plankton
sample. The amplification primers were complementary to
conserved regions in the 5' and 3' regions of the 16S rRNA
genes. The amplification primers were the universal 1406R
(ACGGGCGGTGTGTRC) and eubacterial 68F (TNANAC
ATGCAAGTCGAKCG) (17). The reaction conditions were
as follows: 1 ,ug of template DNA, 10 ,ul of 10x reaction
buffer (500 mM KCI, 100 mM Tris HCI [pH 9.0 at 250C], 15
mM MgCl2, 0.1% gelatin, 1% Triton X-100), 1 U of Taq DNA
polymerase (Promega Biological Research Products, Madi-
son, Wis.), 1 ,uM (each) primer, 200 ,uM (each) dATP, dCTP,
dGTP, and dTTP in a 100-,ul total volume; 1 min at 94°C, 1
atUNIVOFCALIFDAVISonMay18,2010aem.asm.orgDownloadedfrom
MARINE BACTERIOPLANKTON 1711
Escherichia coli
Vibrio harveyi
inmm Vibrio anguillarum
- Oceanospirillum linum
- SAR 92
Caulobacter crescentus
SAR83
- Erythrobacter OCh 114
- Hyphomicrobium vulgare
Rhodopseudomonas marina
Rochalimaea quintana
Agrobacterium tumefaciens
SARI
-Es SAR95
SAR1I
CZ
* C)0._
u
10C)
00
.0
0Q
C)
0
0
O)4..
eP.
- Liverwort CP
*Anacystis nidulans
- Prochlorothrix
- SAR6
* SAR7
r SARIOO
SAR139
0
p.0
0
0 0.05 0.10 0.15
FIG. 4. Phylogenetic tree showing relationships of the rDNA clones from the Sargasso Sea to representative, cultivated species (2, 6, 18,
30, 31, 32, 35, 40). Positions of uncertain homology in regions containing insertions and deletions were omitted from the analysis.
Evolutionary distances were calculated by the method of Jukes and Cantor (15), which corrects for the effects of superimposed mutations.
The phylogenetic tree was determined by a distance matrix method (20). The tree was rooted with the sequence of Bacillus subtilis (38).
Sequence data not referenced were provided by C. R. Woese and R. Rossen.
phototroph phyla were conserved in the corresponding 880 (G -
C to C G). We note with caution that some classes
VOL. 57, 1991
I I
atUNIVaem.asm.orgDownloadedfrom
JOURNAL OF BACTERIOLOGY, JUlY 1991, p. 4371-4378
0021-9193/91/144371-08$02.00/0
Copyright C) 1991, American Society for Microbiology
Vol. 173, No. 14
Analysis of a Marine Picoplankton Community by 16S rRNA
Gene Cloning and Sequencing
THOMAS M. SCHMIDT,t EDWARD F. DELONG,t AND NORMAN R. PACE*
Department ofBiology and Institute for Molecular and Cellular Biology, Indiana University,
Bloomington, Indiana 47405
Received 7 January 1991/Accepted 13 May 1991
The phylogenetic diversity of an oligotrophic marine picoplankton community was examined by analyzing
the sequences of cloned ribosomal genes. This strategy does not rely on cultivation of the resident
microorganisms. Bulk genomic DNA was isolated from picoplankton collected in the north central Pacific
Ocean by tangential flow filtration. The mixed-population DNA was fragmented, size fractionated, and cloned
into bacteriophage lambda. Thirty-eight clones containing 16S rRNA genes were identified in a screen of 3.2
x 104 recombinant phage, and portions of the rRNA gene were amplified by polymerase chain reaction and
sequenced. The resulting sequences were used to establish the identities of the picoplankton by comparison with
an established data base of rRNA sequences. Fifteen unique eubacterial sequences were obtained, including
four from cyanobacteria and eleven from proteobacteria. A single eucaryote related to dinoflagellates was
identified; no archaebacterial sequences were detected. The cyanobacterial sequences are all closely related to
sequences from cultivated marine Synechococcus strains and with cyanobacterial sequences obtained from the
Atlantic Ocean (Sargasso Sea). Several sequences were related to common marine isolates of the y subdivision
of proteobacteria. In addition to sequences closely related to those of described bacteria, sequences were
obtained from two phylogenetic groups of organisms that are not closely related to any known rRNA sequences
from cultivated organisms. Both of these novel phylogenetic clusters are proteobacteria, one group within the
jb.asmDownloadedfrom
INSIGHT REVIEW NATURE|Vol 437|15 September 2005
Crenarchaeota
Group I Archaea
Euryarchaeota
Group II Archaea
Group III Archaea
Group IV Archaea
α-Proteobacteria
* SAR11 - Pelagibacter ubique
* Roseobacter clade
OCS116
ß-Proteobacteria
* OM43
µ-Proteobacteria
SAR86
* OMG Clade
* Vibrionaeceae
* Pseudoalteromonas
* Marinomonas
* Halomonadacae
* Colwellia
* Oceanospirillum
δ-Proteobacteria
Cyanobacteria
* Marine Cluster A
(Synechococcus)
* Prochlorococcus sp.
Lentisphaerae
* Lentisphaera araneosa
Bacteroidetes
Marine Actinobacteria
Fibrobacter
SAR406
Planctobacteria
Chloroflexi
SAR202
Archaea
Bacteria
Figure 1| Schematic illustration of the phylogeny of
the major plankton clades.Black letters indicate
microbial groups that seem to be ubiquitous in
seawater. Gold indicates groups found in the photic
zone. Blue indicates groups confined to the
mesopelagic and surface waters during polar winters.
Green indicates microbial groups associated with
coastal ocean ecosystems.
%&'())%#*+,-###./*/!0##*1""#23##4(56#,!
343
Molecular diversity and ecology
of microbial plankton
Stephen J. Giovannoni1
& Ulrich Stingl1
The history of microbial evolution in the oceans is probably as old as the history of life itself. In contrast to
terrestrial ecosystems, microorganisms are the main form of biomass in the oceans, and form some of the
largest populations on the planet. Theory predicts that selection should act more efficiently in large
populations. But whether microbial plankton populations harbour organisms that are models of adaptive
sophistication remains to be seen. Genome sequence data are piling up, but most of the key microbial
plankton clades have no cultivated representatives, and information about their ecological activities is sparse.
cultivation of key organisms, metagenomics and ongoing biogeo-
chemical studies. It seems very likely that the biology of the dominant
microbial plankton groups will be unravelled in the years ahead.
Here we review current knowledge about marine bacterial and
archaeal diversity, as inferred from phylogenies of genes recovered
from the ocean water column, and consider the implications of micro-
bial diversity for understanding the ecology of the oceans. Although
we leave protists out of the discussion, many of the same issues apply
to them. Some of the studies we refer to extend to the abyssal ocean,
but we focus principally on the surface layer (0–300 m) — the region
of highest biological activity.
Phylogenetic diversity in the ocean
Small-subunit ribosomal (RNA) genes have become universal phylo-
genetic markers and are the main criteria by which microbial plank-
ton groups are identified and named9
. Most of the marine microbial
groups were first identified by sequencing rRNA genes cloned from
seawater10–14
, and remain uncultured today. Soon after the first reports
came in, it became apparent that less than 20 microbial clades
accounted for most of the genes recovered15
. Figure 1 is a schematic
illustration of the phylogeny of these major plankton clades. The taxon
names marked with asterisks represent groups for which cultured iso-
lates are available.
The recent large-scale shotgun sequencing of seawater DNA is pro-
viding much higher resolution 16S rRNA gene phylogenies and bio-
geographical distributions for marine microbial plankton. Although
the main purpose of Venter’sSorcerer IIexpedition is to gather whole-
genome shotgun sequence (WGS) data from planktonic microorgan-
isms16
, thousands of water-column rRNA genes are part of the
by-catch. The first set of collections, from the Sargasso Sea, have
yielded 1,184 16S rRNA gene fragments. These data are shown in
Fig. 2, organized by clade structure. Such data are a rich scientific
resource for two reasons. First, they are not tainted by polymerase
chain reaction (PCR) artefacts; PCR artefacts rarely interfere with the
correct placement of genes in phylogenetic categories, but they are a
major problem for reconstructing evolutionary patterns at the popu-
lation level17
. Second, the enormous number of genes provided by the
Sorcerer II expedition is revealing the distribution patterns and abun-
dance of microbial groups that compose only a small fraction of the
Certain characteristics of the ocean environment — the prevailing
low-nutrient state of the ocean surface, in particular — mean it is
sometimes regarded as an extreme ecosystem. Fixed forms of nitrogen,
phosphorus and iron are often at very low or undetectable levels in the
ocean’s circulatory gyres, which occur in about 70% of the oceans1
.
Photosynthesis is the main source of metabolic energy and the basis of
the food chain; ocean phytoplankton account for nearly 50% of global
carbon fixation, and half of the carbon fixed into organic matter is
rapidly respired by heterotrophic microorganisms. Most cells are freely
suspended in the mainly oxic water column, but some attach to aggre-
gates. In general, these cells survive either by photosynthesizing or by
oxidizing dissolved organic matter (DOM) or inorganic compounds,
using oxygen as an electron acceptor.
Microbial cell concentrations are typically about 105
cells mlǁ1
in
the ocean surface layer (0–300 m) — thymidine uptake into microbial
DNA indicates average growth rates of about 0.15 divisions per day
(ref. 2). Efficient nutrient recycling, in which there is intense competi-
tion for scarce resources, sustains this growth, with predation by
viruses and protozoa keeping populations in check and driving high
turnover rates3
. Despite this competition, steady-state dissolved
organic carbon (DOC) concentrations are many times higher than
carbon sequestered in living microbial biomass4
. However, the average
age of the DOC pool in the deep ocean, of about 5,000 years5
(deter-
mined by isotopic dating), suggests that much of the DOM is refrac-
tory to degradation. Although DOM is a huge resource, rivalling
atmospheric CO2 as a carbon pool6
, chemists have been thwarted by
the complexity of DOM and have characterized it only in broad terms7
.
The paragraphs above capture prominent features of the ocean
environment, but leave out the complex patterns of physical, chemical
and biological variation that drive the evolution and diversification of
microorganisms. For example, members of the genusVibrio — which
include some of the most common planktonic bacteria that can be iso-
lated on nutrient agar plates — readily grow anaerobically by fermen-
tation. The life cycles of some Vibrio species have been shown to
include anoxic stages in association with animal hosts, but the broad
picture of their ecology in the oceans has barely been characterized8
.
The story is similar for most of the microbial groups described below:
the phylogenetic map is detailed, but the ecological panorama is thinly
sketched. New information is rapidly flowing into the field from the
1
Department of Microbiology, Oregon State University, Corvallis, Oregon 97331, USA.
NaturePublishing Group©2005
!20
The ecoty
many micro
tions in the w
best exampl
entiated by
adapted (hig
Phylogeneti
gests that th
cally distinct
Cluster A S
of which ca
characterist
urobilin)37,3
ample suppo
teristics that
SS120 has a
ammonium
extreme, Syn
nitrate, cyan
interesting t
seem to pro
whereby nu
conditions.
seasonal spe
lular cyanob
The obse
libraries prepared using PCR methods that reduced sequence arte-
facts17
. They concluded that most sequence variation was clustered
SAR11 subgroups Ia
+
Ib
SAR86
(γ-Proteobacteria)
SAR11 subgroup
II
M
arine
Picophytoplankton
Uncultured
α-Proteobacteria
Clade
SAR406
(Fibrobacter)
Bacteroidetes
SAR324
(δ-Proteobacteria)
M
arine
Actinobacteria
Alterom
onas/Pseudoalterom
onas
%of16SrRNAsequences
0
5
10
15
20
25
30
35
SAR116
(α-Proteobacteria)
Rheinheim
era
Roseobacter clade
SAR202
(Chloroflexi)
Phylogenetic clade !21
DeLong Lab
• Studying Sar86 and other marine plankton
• Note - published one of first genomic studies of
uncultured microbes - in 1996
JOURNAL OF BACTERIOLOGY, Feb. 1996, p. 591–599 Vol. 178, No. 3
0021-9193/96/$04.00 0
Copyright 1996, American Society for Microbiology
Characterization of Uncultivated Prokaryotes: Isolation and
Analysis of a 40-Kilobase-Pair Genome Fragment
from a Planktonic Marine Archaeon
JEFFEREY L. STEIN,1
* TERENCE L. MARSH,2
KE YING WU,3
HIROAKI SHIZUYA,4
AND EDWARD F. DELONG3
*
Recombinant BioCatalysis, Inc., La Jolla, California 920371
; Microbiology Department, University
of Illinois, Urbana, Illinois 618012
; Department of Ecology, Evolution and Marine Biology,
University of California, Santa Barbara, California 931063
; and Division of Biology,
California Institute of Technology, Pasadena, California 911254
Received 14 July 1995/Accepted 14 November 1995
One potential approach for characterizing uncultivated prokaryotes from natural assemblages involves
genomic analysis of DNA fragments retrieved directly from naturally occurring microbial biomass. In this
study, we sought to isolate large genomic fragments from a widely distributed and relatively abundant but as
yet uncultivated group of prokaryotes, the planktonic marine Archaea. A fosmid DNA library was prepared
from a marine picoplankton assemblage collected at a depth of 200 m in the eastern North Pacific. We
identified a 38.5-kbp recombinant fosmid clone which contained an archaeal small subunit ribosomal DNA
gene. Phylogenetic analyses of the small subunit rRNA sequence demonstrated its close relationship to that of
previously described planktonic archaea, which form a coherent group rooted deeply within the Crenarchaeota
Downloadedfrom
Delong Lab
isolated from fosmid clones
arious restriction endonucle-
, the F-factor-based vector
osmid subfragments. Partial
striction enzyme to 1 ⇥g of
re. The reaction mixture was
oved at 10, 40, and 60 min.
1 ⇥l of 0.5 M EDTA to the
e partially digested DNA was
ribed above except using a 1-
es of the separated fragments
dards. The distances of the
promoter sites on the excised
of T7- or SP6-specific oligo-
d hybridizing with Southern
d and pBAC clones digested
ed with labeled T7 and SP6
ubclones and PCR fragments
sequencing described above.
mates from the partial diges-
fosmids and their subclones.
DeSoete distance (9) analyses
ng GDE 2.2 and Treetool 1.0,
P) (23). DeSoete least squares
rwise evolutionary distances,
count for empirical base fre-
d from the RDP, version 4.0
A sequences were performed
DP. For distance analyses of
ary distances were estimated
topology was inferred by the
tion and global branch swap-
n sequences, the Phylip pro-
ion and ordinary parsimony
sequences reported in Table
ollowing accession numbers:
3, U40244, and U40245. The
FIG. 1. Flowchart depicting the construction and screening of an environ-
mental library from a mixed picoplankton sample. MW, molecular weight;
PFGE, pulsed-field gel electrophoresis.
GENOMIC FRAGMENTS FROM PLANKTONIC MARINE ARCHAEA 593
atUNjb.asm.orgDownloadedfrom
!23
Delong Lab
FIG. 4. High-density filter replica of 2,304 fosmid clones containing approx-
imately 92 million bp of DNA cloned from the mixed picoplankton community.
The filter was probed with the labeled insert from clone 4B7 (dark spot). The
lack of other hybridizing clones suggests that contigs of 4B7 are absent from this
portion of the library. Similar experiments with the remainder of the library
yielded similar results.
J. BACTERIOL.
Downl
!24
SAR86 has Proteorhodopsin in Its Genome
hodop-
ence of
width is
orption
he red-
m in the
d Schiff
y to the
was de-
n a cell
d trans-
proteor-
only in
g. 4A).
um was
of a 10
arbonyl
Illumi-
l poten-
ht-side-
of reti-
ht onset
proteo-
pable of
ysiolog-
ctivities
ntaining
oteorho-
n to be
Fig. 1. (A) Phylogenetic tree of bacterial 16S rRNA gene sequences, including that encoded on the
130-kb bacterioplankton BAC clone (EBAC31A08) (16). (B) Phylogenetic analysis of proteorhodop-
sin with archaeal (BR, HR, and SR prefixes) and Neurospora crassa (NOP1 prefix) rhodopsins (16).
Nomenclature: Name_Species.abbreviation_Genbank.gi (HR, halorhodopsin; SR, sensory rhodopsin;
BR, bacteriorhodopsin). Halsod, Halorubrum sodomense; Halhal, Halobacterium salinarum (halo-
bium); Halval, Haloarcula vallismortis; Natpha, Natronomonas pharaonis; Halsp, Halobacterium sp;
Neucra, Neurospora crassa.
Downloadedfrom
!25
The proteorhodopsin tree was constructed on the basis of the archaeal
rhodopsin alignment used by Mukohata et al. [Y. Mukohata, K. Ihara, T.
Tamura, Y. Sugiyama, J. Bio- chem. ( Tokyo) 125, 649 (1999)] and the
neighbor- joining method as implemented in the neighbor pro- gram of
the PHYLIP package. The numbers at forks indicate the percent of
bootstrap replications (out of 1000) in which the given branching was
observed. The least-square method (the Kitsch program of PHYLIP) and
the maximum likelihood method (the Puzzle program) produced trees
with essentially identical topologies, whereas the protein parsimony
method (the Protpars program of PHYLIP) placed proteorhodopsin
within the sensory rhodopsin cluster. The Yro/Hsp30 subfamilies
sequences (6) were omitted from the phylogenetic analysis to avoid the
effect of long branch attraction.
1 ttgttatatc agtaatggct attgctccaa taacttaata ctaatatata attagtttat
61 gaataaattt tatatatttg ggttattgtt ttttacacta aatgcatttt cttgctcaga
121 tcttctagat acagacatga gagttcttga ttccgctgag tcaagaaacc tttgcgagtt
181 tgaaggaaaa gctttactag ttgtgaatgt tgcaagtaga tgtggttaca cttatcaata
241 tgctggcctt caaaagttat atgaaagtta taaagatgaa gattttctag taattgggat
301 cccatctaga gattttcttc aagaatactc tgatgaaagc gatgttgcag aattttgttc
361 tacagaatac ggtgttgaat ttcctatgtt ctcaactgct aaagtcaaag gaaaaaaagc
421 acacccattt tataaaaaac ttattgcaga atcaggtttt actccctcat ggaactttaa
481 taaatactta atctcaaaag agggcaaggt tgtatccaca tatggatcaa aggtaaagcc
541 tgattcaaaa gagcttatat cagctataga aggcttgctg taaaattatt acttagaaac
601 taatacagtt ttaggcttgt ttgctgcaaa tattccatta tctacaactc caggaatatt
661 attaatcaaa gcttccattt cagtggggtt tgaaatatcc atattagaga tatctaaaat
721 gtgattacct tggtctgtta taaatccagt tctatatgtg ggtattccac cgatcgagat
781 tatttttctt gcaacaaggc tcctactttc aggtatcacc tctataggca gtggaaaagc
841 tcccaaaaga ttaaccatct ttgactgatc aactatacat ataaactcgt tagaggcaga
901 agcaactatc ttttctctag tatgtgcgcc accaccacct ttaataagac aattttcagg
961 agacacctca tctgcaccat ctatgtaata agctatatca actacatcat taaggctaaa
1021 gacctctatc ccattttcat ttaataattt tgatgaagca tctgaactag aaacagctcc
1081 agcaaatttg tgcctatgct cctttagttc ttctataaaa aaattaactg ttgagccggt
1141 tccaatacct aaaatcatct caggatgaag attatttttg atatattcta tagcttgttt
1201 agcaacattt atctttgagc cactcataga gttataatac aagaaaatat aggtagttaa
1261 ttattttgag actaaaaatt aaaaaaacag gttcttttaa gaattcccag aagtacctaa
1321 agcttattcc taatgcatct gtttatgacg tagcattaaa atcacctata acatttgctc
1381 taaatatttc ttcaaagctg gggaataaag ttttcctaaa aagagaggat ctgcaaccta
1441 tattttcttt taaaaacaga ggagcgtata acaagattgt aaatttatcc gatgccgaaa
1501 agaagagggg ggttattgct gcatcagcag gaaatcatgc tcaaggggta gccagtgcat
1561 gtaagaaatt aaaaattaat tgcttgatag ttatgccaat aacaactcca gaaataaaaa
1621 taaaagatgt aaaaagattt ggagccaaaa tactccaaca tggggacaac gtagatgcag
1681 cattaaaaga ggcactgttt attgcaaaga aaaaaaaatt gtcttttgtt catccttttg
1741 acgaccctct aacaattgct ggccaaggga ctataggaca agaaattctt gaagataaaa
1801 ataattttga tgttgtcttt gttccggtgg gaggaggagg tattctagct ggtgtatctg
1861 cctggatagc acagaataat aagaaaataa aaattgttgg tgttgaggtt gaggattccg
1921 cttgtcttgc tgaggccgta aaagctaata aaagagttat tttaaaagaa gtgggcctct
1981 ttgctgatgg ggtggcagta tcaagggttg gaaaaaataa ttttgatgtt attaaagagt
2041 gcgtagatga agtcattaca gttagcgttg atgaggtctg caccgctgta aaagatatct
2101 ttgaagatac aagggttcta tcagaacctg ctggggcatt agcacttgca gggttaaaag
2161 cctacgcaag gaaagttaaa aataaaaaac ttattgctat aagttctggc gctaatgtaa
2221 atttccaaag acttaatttt attgttgagc gatcagagat tggtgaaaat agagaaaaaa
2281 tattaagtat caaaatccca gagatacctg gaagttttct taagctttca aggatgtttg
2341 gcagctctca agttacagag tttaactaca ggaaatctag cttaagcgat gcatatgttt
2401 tagttggtgt tagaactaaa actgaaaaat catttgaaat cttaaagtcc aaattaaaaa
2461 aagcaggctt cacctttagc gactttactc gaaatgaaat atccaatgat catctgaggc
2521 atatggttgg tggcagaaat agtgactcag gctctcataa caatgaaaga atatttaggg
2581 gagagtttcc tgagaagccg ggcgcgctgt taaattttct agagaaattt ggaaataaat
2641 ggmatatttc cttatttcat tacaggaacc taggttcagc ttttggaaag atattaattg
2701 gcatcgagag taaggataaa gacaagctaa taaatcattt aaataagtca ggnactattt
2761 ttacagaaga aacctctaac aaggcataca aagatttttt aaaatgaaag gttaatactt
2821 taatctaaat ttaattgaaa aaagctcatc gctagggttt tcccacggct ctttgaacaa
2881 ctcggattga gatctatcat cctcctcgtc gtaaattctc ccacctttag aatagaccaa
2941 aaatagatat gacaaaggag cgagctcata tttatatcta atttggaacg aagctacgcc
3001 agtattaaat tcattaacta tattatttcc tttataaagg tatccattag catctgaaaa
3061 aatactaata ggattttttg ctttcaaagc aacaaattga ctcttaagtc tgatctcatg
3121 tttattattt ttaaaccaat ttagatcaaa agataaggta tcttgtcttg aatcatatga
3181 ggcaagatta ttattatcct gccatatcag ccattcattt tcctttctta ttctatattg
3241 cgcattgatt cttaagttat catttggaaa tattgaacct gctatcttgt aaaattttct
3301 accataccca ttcgaatccc accgattatc tttctctccc ttaaagaagc taactctcca
3361 gtcatatgtc cagaatgaat agttctttgc ctcaaagtct gctgtaatac ctattcttct
3421 ttttgatttg ataaaggggt aggcttcatt ttttcttgtg atagttgtat ttttcccaga
3481 agatctaaag ttaaaatcta attgaaattt agagttgtcc ttaaaactaa aagaattttt
3541 ttgatcgatg cctattggat ttgaattacc agctgtgtca gcatcataat ttagatcaat
3601 tccataatct atttgtttta atatcgagct attatcaaat tcatttattt ttcgatttcc
3661 tccaatacca gcatgaatcc agtctcttct ttgcagataa ccaaagtcat ttaactcgaa
3721 gtcgtcttca aaataaagaa ggcttccact tatgttagat agtttatttg gaagatatgt
3781 aaactgagtc ctatacccaa gcccattttt accatctttt tctgaggcta acagatctga
3841 atatgtaatt aatttttttg aacgaatatt gatgtaatca ataacattga ctgttgatga
3901 ttcgcctgtc atttcattct caacattcgt taccatgaag ccaagcgttt tatttccaag
3961 ctttgttcga gatcgaagag cataatagtc tcttccaact gaaaaggctt catcagcctc
4021 acttgctaca aatactccaa attcattatt attacttttt tgagtaagtc ttaatgcaaa
4081 atcaatatca gaatagtttt ttttagctgc ctcgcaacct tcttcattac tttcttctga
4141 gcaattatag ctgggggcag ctccaatcct ccgcgtattt ataaccgagt atctatcata
4201 attactaata tcaaatagtg attggttttc attgaaaaat gctctttttt ctgagtaaaa
4261 agtttcttga gcagaaaagt taataaccac atcatcactc tcagcttgtc cgaaatctgg
4321 attaatagct aaatttattt gacgaccttt tccagtgcta taaaagattt cagccccaat
4381 atctgaccct tcttggtttg taactgaatt tttatttgaa gatatatatg gaaaaaaagt
4441 aagctttgat tttgtatagt tttgtatttc taagctatct aactcttgaa agtagtcatt
4501 tctactagcc attgttccgg cactgctaac ccatgactca ttttcggcac tataacgtaa
4561 tgcggtgtaa ttaatttttc ttatatcacc atcaggctgt ttcattaacg ttacatccca
4621 aggaataaaa aactcagaga cccaataccc atcaaatttt tgtgtttttg caatccaatc
4681 tccatcccag tctgtcttaa agtctcctgc ttgcgttttt atggcatcga aaagcgagtt
4741 cccaagattt atagcaagaa tgaaagcttt gttaccatca ccatcaaagt ctatatttat
4801 agagttttta tcgctaagtg aatttatttg atctctaagc gtcttcctgg agaacataga
4861 atcattactt tgaaaattct taaatccaac atatatacca tccttatttg agaaaattaa
4921 agccgttgta agaagttcat ttttcttaag agtaaaagga gatgtttcat aaaaatctgt
4981 aatttcaaat gcattattcc actcaggctc atcaagagag ccatcaataa caattgagtt
5041 tgaccaaatc agcacagagg taagtaaaag tgataatgag gctaaagtta ttttcataga
5101 tagatttaat tcagagtatt ttaatctatg aagagtagga aatcatccga tcattctcag
5161 aaaaaacata atcatattta aggtcatgtt tttctccaaa atgcatatta caaatctggt
5221 attgataaca cagtccaacc aataaaggtc ttgatacctc ctcgtttata gagccaataa
5281 ttctatcgaa atatccagag ccatagccaa gcctgtatcc atttaaatca actcctgtca
5341 taggaataaa cattaaatca atttcattta tgttgacata atcttcactt ttaacctctt
cat /Users/jeisen/Desktop/sequence.gb.txt | grep product
/product="predicted threonine dehydratase"
/product="predicted secreted protein"
/product="predicted 5-formyltetrahydrofolate cyclo-ligase"
/product="predicted Xaa-Pro aminopeptidase"
/product="predicted quinone biosynthesis monooxygenase"
/product="predicted 50S ribosomal protein L28"
/product="predicted deoxyuridine 5'-triphosphate
/product="predicted primosomal protein N' (replication
/product="predicted N-acetylmuramoyl-L-alanine amidase"
/product="predicted kinase of the phosphomethylpyrimidine
/product="predicted kinase of P-loop ATPase superfamily"
/product="predicted eukaryote-type paraoxonase"
/product="predicted alpha/beta hydrolase"
/product="predicted MutT superfamily hydrolase"
/product="predicted metallobeta lactamase fold protein"
/product="predicted acetyl-coenzyme A synthetase"
/product="predicted deoxypurine kinase"
/product="predicted poly(A) polymerase"
/product="predicted DskA family protein with conserved
/product="predicted ribokinase family sugar kinase"
/product="predicted glutamate-1-semialdehyde
/product="predicted enzyme with conserved cysteine"
/product="predicted metallopeptidase of the G-G peptidase
/product="predicted proline dehydrogenase"
/product="predicted tyrosyl-tRNA synthetase"
/product="predicted prolyl endopeptidase"
/product="tRNA-Ile"
/product="predicted unknown protein"
/product="predicted cation efflux pump"
/product="predicted TonB-dependent receptor protein"
/product="predicted conserved membrane protein"
/product="predicted LicA, choline kinase involved in LPS
/product="predicted NAD-dependent formate dehydrogenase"
/product="predicted 2-nitropropane dioxygenase"
/product="predicted oxidoreductase"
/product="predicted CsgA, Rossman fold oxidoreductase"
/product="predicted acid--CoA ligase fadD13"
/product="proteorhodopsin"
/product="predicted phosphopantetheine
/product="predicted formamidopyrimidine-DNA glycosylase"
/product="predicted outer membrane protein TolC"
/product="predicted topoisomerase IV subunit B"
/product="predicted topoisomerase IV subunit A"
/product="predicted succinylornithine aminotransferase"
/product="predicted 6-phosphofructokinase"
/product="predicted
/product="predicted ORF"
/product="predicted penicillin-binding protein 2"
/product="predicted membrane-bound lytic transglycosylase"
/product="predicted penicillin-binding protein 6
/product="predicted D-alanine aminotransferase"
/product="predicted DNA polymerase III, delta subunit"
/product="predicted leucyl-tRNA synthetase (leuS)"
/product="predicted acetyl-CoA carboxylase, biotin
/product="predicted esterase of alpha/beta hydrolase fold"
/product="predicted ABC transporter ATPase subunit"
/product="predicted dihydroxyacid dehydratase"
/product="predicted porphobilinogen synthase"
/product="predicted transcription termination factor Rho"
/product="predicted porphobilinogen deaminase"
/product="predicted transporter (ACRA/ACRE type)"
/product="predicted cation efflux system (AcrB/AcrD/AcrF
/product="predicted secreted lipoprotein"
/product="predicted ribosomal large subunit pseudouridine
/product="predicted acetolactate synthase III large chain"
/product="predicted ketol-acid reductoisomerase"
/product="16S ribosomal RNA"
/product="23S ribosomal RNA"
/product="predicted YacE family of P-loop kinases"
/product="predicted preprotein translocase secA subunit"
The inferred amino acid sequence of the proteorho-
dopsin showed statistically significant similarity to
archaeal rhodopsins (11).
The inferred amino acid sequence of the proteorho-
dopsin showed statistically significant similarity to
archaeal rhodopsins (11).
SAR86 has Proteorhodopsin in Its Genome
generated
eorhodop-
resence of
ndwidth is
absorption
. The red-
nm in the
ated Schiff
ably to the
on was de-
s in a cell
ward trans-
in proteor-
nd only in
(Fig. 4A).
edium was
ce of a 10
re carbonyl
19). Illumi-
ical poten-
right-side-
nce of reti-
light onset
hat proteo-
capable of
physiolog-
e activities
containing
proteorho-
main to be
Fig. 1. (A) Phylogenetic tree of bacterial 16S rRNA gene sequences, including that encoded on the
130-kb bacterioplankton BAC clone (EBAC31A08) (16). (B) Phylogenetic analysis of proteorhodop-
sin with archaeal (BR, HR, and SR prefixes) and Neurospora crassa (NOP1 prefix) rhodopsins (16).
Nomenclature: Name_Species.abbreviation_Genbank.gi (HR, halorhodopsin; SR, sensory rhodopsin;
BR, bacteriorhodopsin). Halsod, Halorubrum sodomense; Halhal, Halobacterium salinarum (halo-
bium); Halval, Haloarcula vallismortis; Natpha, Natronomonas pharaonis; Halsp, Halobacterium sp;
Neucra, Neurospora crassa.
wDownloadedfrom
!38
Halophiles …
!39
Bacteriorhodopsin and its relatives
Figure 3. Phylogenetic tree based on the amino acid sequences of 25 archaeal rhodopsins. (a) NJ-tree. The numbers at each node are clustering probabilities generated by
bootstrap resampling 1000 times. D1 and D2 represent gene duplication points. The four shaded rectangles indicate the speciation dates when halobacteria speciation occurred at
the genus level. (b) ML-tree. Log likelihood value for ML-tree was −6579.02 (best score) and that for topology of the NJ-tree was −6583.43. The stippled bars indicate the 95%
confidence limits. Both trees were tentatively rooted at the mid-point of the longest distance, although true root positions are unknown.
From Ihara et al. 1999
!40
What Does It Do?
Proteorhodopsin Predicted Secondary Structure
Fig. 2. Secondary
structure of proteo-
rhodopsin. Single-
letter amino acid
codes are used (33),
and the numbering
is as in bacteriorho-
dopsin. Predicted
retinal binding pock-
et residues are
marked in red.
R E S E A R C H A R T I C L E S
!42
The amino acid residues that form a retinal binding pocket in archaeal
rhodopsins are also highly conserved in proteorhodopsin (Fig. 2). In par-
ticular, the critical lysine residue in helix G, which forms the Schiff base
linkage with retinal in archaeal rhodopsins, is present in proteorhodopsin.
Analysis of a structural model of proteorhodopsin (14 ), in conjunction
with multiple sequence alignments, indicates that the majority of active
site residues are well conserved between proteorhodopsin and archaeal
bacteriorhodopsins (15).
The amino acid residues that form a retinal binding pocket in archaeal
rhodopsins are also highly conserved in proteorhodopsin (Fig. 2). In par-
ticular, the critical lysine residue in helix G, which forms the Schiff base
linkage with retinal in archaeal rhodopsins, is present in proteorhodopsin.
Analysis of a structural model of proteorhodopsin (14 ), in conjunction
with multiple sequence alignments, indicates that the majority of active
site residues are well conserved between proteorhodopsin and archaeal
bacteriorhodopsins (15).
Proteorhodopsin in E. coli
duce
in th
occu
pigm
and
ed at
tiona
sorp
in 0.
sorp
botto
nated
retin
ms d
deca
shift
appe
term
cay o
step
singl
upw
amp
gene
with
ms p
reco
phot
prod
Fig. 3. (A) Proteorhodopsin-expressing E. coli cell suspension (ϩ) compared to control cells (Ϫ),
both with all-trans retinal. (B) Absorption spectra of retinal-reconstituted proteorhodopsin in E. coli
membranes (17). A time series of spectra is shown for reconstituted proteorhodopsin membranes
(red) and a negative control (black). Time points for spectra after retinal addition, progressing from
low to high absorbance values, are 10, 20, 30, and 40 min.
!47
Fig. 3. (A) Proteorhodopsin-expressing E. coli cell suspension (+)
compared to control cells (-), both with all-trans retinal. (B) Absorption
spectra of retinal-reconstituted proteorhodopsin in E. coli membranes
(17). A time series of spectra is shown for reconstituted proteorhodopsin
membranes (red) and a negative control (black). Time points for spectra
after retinal addition, progressing from low to high absorbance values,
are 10, 20, 30, and 40 min.
Proteorhodopsin was amplified from the 130-kb bac- terioplankton BAC
clone 31A08 by polymerase chain reaction (PCR), using the primers 5 -
ACCATGGGTA- AAT TAT TACTGATAT TAGG-3 and 5 -AGCAT TA-
GAAGAT TCT T TAACAGC-3 . The amplified fragment was cloned
with the pBAD TOPO TA Cloning Kit (Invitrogen). The protein was
cloned with its native signal sequence and included an addition of the V5
epitope and a polyhistidine tail in the COOH-terminus. The same PCR
product in the opposite orientation was used as a negative control. The
protein was expressed in the E. coli outer membrane protease- deficient
strain UT5600 [M. E. Elish, J. R. Pierce, C. F. Earhart, J. Gen. Microbiol.
134, 1355 (1988)] and induced with 0.2% arabinose for 3 hours.
Membranes were prepared according to Shimono et al. [K. Shi- mono,
M. Iwamoto, M. Sumi, N. Kamo, FEBS Lett. 420, 54 (1997)] and
resuspended in 50 mM tris-Cl (pH 8.0) and 5 mM MgCl2. The
absorbance spectrum was measured according to Bieszke et al. (7) in the
presence of 10 M all-trans retinal.
Proteorhodopsin function
geneity in
with 87% o
ms photocy
recovery. A
photocycle
produces a b
this alternat
well as a tw
component,
the total am
with a 45-m
amplitude).
cycle rate, w
teristic of
strong evid
tions as a tr
rhodopsin.
Implicat
harbor the
tributed in
bacteria hav
ture-indepen
oceanic reg
Oceans, as
(8, 25–29).
tribution, pr
this ␥-prote
both with all-trans retinal. (B) Absorption spectra of retinal-reconstituted proteorhodopsin in E. coli
membranes (17). A time series of spectra is shown for reconstituted proteorhodopsin membranes
(red) and a negative control (black). Time points for spectra after retinal addition, progressing from
low to high absorbance values, are 10, 20, 30, and 40 min.
Fig. 4. (A) Light-driven transport of protons by a
proteorhodopsin-expressing E. coli cell suspension.
The beginning and cessation of illumination (with
yellow light Ͼ485 nm) is indicated by arrows labeled ON and OFF, respectively. The cells were
suspended in 10 mM NaCl, 10 mM MgSO4⅐7H2O, and 100 ␮M CaCl2. (B) Transport of 3
Hϩ
-labeled
tetraphenylphosphonium ([3
Hϩ
]TPP) in E. coli right-side-out vesicles containing expressed proteorho-
dopsin, reconstituted with (squares) or without (circles) 10 ␮M retinal in the presence of light (open
symbols) or in the dark (solid symbols) (20).
15 SEPTEMBER 2000 VOL 289 SCIENCE www.sciencemag.org1904
20. Right-side-out membrane vesicles were prepared according to
Kaback [H. R. Kaback, Methods Enzymol. 22, 99 (1971)]. Membrane
electrical potential was measured with the lipophilic cation TPP by
means of rapid filtration [E. Prossnitz, A. Gee, G. F. Ames, J. Biol. Chem.
264, 5006 (1989)] and was calculated according to Robertson et al. [D.
E. Robertson, G. J. Kaczorowski, M. L. Garcia, H. R. Kaback,
Biochemistry 19, 5692 (1980)]. No energy sources other than light were
added to the vesicles, and the experiments were conducted at room
temperature.
Figure 5
Laser flash-induced absorbance
changes in suspensions of E. coli
membranes containing
proteorhodopsin. A 532-nm pulse
(6 ns duration, 40 mJ) was delivered
at time 0, and absorption changes
were monitored at various
wavelengths in the visible range in
a lab-constructed flash photolysis
system as described (34). Sixty-
four transients were collected for
each wavelength. (A) Transients at
the three wavelengths exhibiting
maximal amplitudes. (B) Absorption
difference spectra calculated from
amplitudes at 0.5 ms (blue) and
between 0.5 ms and 5.0 ms (red).
http://www.sciencemag.org/content/289/5486/1902/F5.expansion.html
Papers for Today
Correspondence should be addressed to A.T.
(e-mail: deia@dstu.univ-montp2.fr).
.................................................................
Proteorhodopsin phototrophy
in the ocean
Oded BeÂjaÁ*², Elena N. Spudich²³, John L. Spudich³, Marion Leclerc*
& Edward F. DeLong*
* Monterey Bay Aquarium Research Institute, Moss Landing, California 95039,
USA
³ Department of Microbiology and Molecular Genetics, The University of
Texas Medical School, Houston, Texas 77030, USA
² These authors contributed equally to this work
..............................................................................................................................................
Proteorhodopsin1
, a retinal-containing integral membrane pro-
tein that functions as a light-driven proton pump, was discovered
in the genome of an uncultivated marine bacterium; however, the
prevalence, expression and genetic variability of this protein in
native marine microbial populations remain unknown. Here we
report that photoactive proteorhodopsin is present in oceanic
surface waters. We also provide evidence of an extensive family of
to that of recomb
¯ash-photolysis
existence of pro
retinal molecules
waters.
The photocyc
and, after hydro
adding all-trans
indeed derive fro
–4
–3
–2
–1
0
1
2
ΔAbsorbance(×10–3)0–3)
0
4
2
3
http://www.nature.com/nature/journal/v411/n6839/abs/411786a0.html
NATURE |VOL 411 |14 JUNE 2001 |www.nature.com
hn L. Spudich³, Marion Leclerc*
ute, Moss Landing, California 95039,
lar Genetics, The University of
30, USA
work
...................................................................
ining integral membrane pro-
n proton pump, was discovered
marine bacterium; however, the
c variability of this protein in
ons remain unknown. Here we
odopsin is present in oceanic
dence of an extensive family of
psin variants. The protein pig-
family seem to be spectrally
orbing light at different wave-
available in the environment.
oteorhodopsin-based phototro-
ic microbial process.
taining membrane protein that
n pump, was discovered nearly
Halobacterium salinarum2
. The
on contain a high concentration
cked in an ordered two-dimen-
tion of light, bacteriorhodopsin
al shifts (a photocycle), causing a
the membrane. The resulting
al drives ATP synthesis, through
indeed derive from retinylidene pigmentation. We conclude that
–4
–3
–2
–1
0
1
2
∆Absorbance(×10–3)∆Absorbance(×10–3)
806040200
Time (ms)
580 nm
500 nm
400 nm
Wavelength (nm)
400 450 500 550 600 650 700
–3
–2
–1
0
1
2
3
Monterey Bay environmental membranes
Proteorhodopsin in E. coli membranes
Figure 1 Laser ¯ash-induced absorbance changes in suspensions of membranes
prepared from the prokaryotic fraction of Monterey Bay surface waters. Top, membrane
absorption was monitored at the indicated wavelengths and the ¯ash was at time 0 at
532 nm. Bottom, absorption difference spectrum at 5 ms after the ¯ash for the
environmental sample (black) and for E. coli-expressed proteorhodopsin (red).
© 2001 Macmillan Magazines Ltd
NATURE |VOL 411 |14 JUNE 2001 |www.nature.com
strongly suggests that this protein has a signi®cant role in the
Laser
flash
Untreated membranes
Hydroxylamine-treated membranes
Retinal-reconstituted membranes
10–3
AU
10–1
s
Figure 2 Laser ¯ash-induced transients at 500 nm of a Monterey Bay bacterioplankton
membrane preparation. Top, before addition of hydroxylamine; middle, after 0.2 M
hydroxylamine treatment at pH 7.0, 18 8C, with 500-nm illumination for 30 min; bottom,
after centrifuging twice with resuspension in 100 mM phosphate buffer, pH 7.0, followed
by addition of 5 mM all-trans retinal and incubation for 1 h.
M
M
MB
MB
HOT
HOT
0.01
HO
PAL
PAL B
PAL B
PAL E
PAL
PAL
PAL
PAL
M
M
M
M
MB
BA
PAL E
B
HOT
Figure 3 Phylogenetic analysis of the inferred
proteorhodopsin genes. Distance analysis of 22
by neighbour-joining using the PaupSearch pr
10.0 (Genetics Computer Group; Madison, Wis
was used as an outgroup, and is not shown. Sc
per site. Bold names indicate the proteorhodop
this study.
© 2001 Macmillan Magazines Ltd
of the surface of a single cell3
. This
nt to produce substantial amounts of
efore, the high density of proteorho-
rane indicated by our calculations
otein has a signi®cant role in the
in amino-acid sequences were not restricted to the hydrophilic
ed membranes
e-treated membranes
stituted membranes
10–3
AU
10–1
s
at 500 nm of a Monterey Bay bacterioplankton
tion of hydroxylamine; middle, after 0.2 M
C, with 500-nm illumination for 30 min; bottom,
n in 100 mM phosphate buffer, pH 7.0, followed
incubation for 1 h.
MB 0m2
MB 0m1
MB 20m2
MB 20m5
MB 40m12
MB 100m9
HOT 75m3
HOT 75m8
0.01
MontereyBayandshallowHOT
AntarcticaanddeepHOT
HOT 75m1
PAL B1
PAL B5
PAL B6
PAL E7
PAL E1
PAL B7
PAL B2
PAL B8
MB 100m10
MB 100m5
MB 100m7
MB 20m12
MB 40m1
MB 40m5
BAC 40E8
BAC 31A8
PAL E6
BAC 64A5
HOT 0m1
HOT 75m4
Figure 3 Phylogenetic analysis of the inferred amino-acid sequence of cloned
proteorhodopsin genes. Distance analysis of 220 positions was used to calculate the tree
by neighbour-joining using the PaupSearch program of the Wisconsin Package version
10.0 (Genetics Computer Group; Madison, Wisconsin). H. salinarum bacteriorhodopsin
was used as an outgroup, and is not shown. Scale bar represents number of substitutions
per site. Bold names indicate the proteorhodopsins that were spectrally characterized in
this study. !57
mechanisms of wavelength regu-
hodopsins. Furthermore, palE6
larly to its Monterey Bay
diated transport of protons in
. Spudich et al., manuscript in
rom Antarctica (E. F. DeLong,
dopsin primers, and sequenced
minary sequence analyses based
indicate that, despite differences
orption spectra, the Antarctic
a bacterium highly related to
AR86 bacteria of Monterey Bay10
nes from the HOT station were
m samples. Most proteorhodop-
surface waters belonged to the
(on the amino-acid level) to the
40E8. In contrast, most of the
) fell within the Antarctic clade
es from the HOT station, HOT
d in E. coli and their absorption
d from their primary sequences,
ng with the Antarctic clade gave
um, whereas the proteorhodop-
clade gave a green (527-nm)
central North Paci®c gyre, most
ge, with maximal intensity near
ak is maintained over depth,
At the surface, the energy peak
th between 400 and 650 nm. In
narrows and the half bandwidth
00 nm (Fig. 5). Considering the
members of the two different
dopsin genetic variants in mixed populations with a selective
advantage at different points along the depth-dependent light
0.06
0.05
0.04
0.03
0.02
0.01
1.0
0.8
0.6
0.4
0.2
0
350 400
75 m
5 m
Monterey
Bay
HOT 0 m
HOT 75 m
Antarctica
450 500
Wavelength (nm)
RelativeirradianceAbsorbance
550 600 650
Figure 5 Absorption spectra of retinal-reconstituted proteorhodopsins in E. coli
membranes. All-trans retinal (2.5 mM) was added to membrane suspensions in 100 mM
phosphate buffer, pH 7.0, and absorption spectra were recorded. Top, four spectra for
palE6 (Antarctica), HOT 75m4, HOT 0m1, and BAC 31A8 (Monterey Bay) at 1 h after
retinal addition. Bottom, downwelling irradiance from HOT station measured at six
wavelengths (412, 443, 490, 510, 555 and 665 nm) and at two depths, for the same
depths and date that the HOT samples were collected (0 and 75 m). Irradiance is plotted
relative to irradiance at 490 nm.
!58
788 NATURE |VOL 411 |14 JUNE 2001 |www.nature.com
whereas the total energy decreases. At the surface, the energy peak
is very broad with a half bandwidth between 400 and 650 nm. In
deeper water below 50 m, the peak narrows and the half bandwidth
is restricted to between 450 and 500 nm (Fig. 5). Considering the
different wavelengths absorbed by members of the two different
Figure 4 Multiple alignment of proteorhodopsin amino-acid sequences. The secondary structure shown (boxes for transmembrane helices) is derived from hydropathy plots. Residues
predicted to form the retinal-binding pocket are marked in red.
palE6 (Antarctica), HOT 75m4, HOT 0m1, and BAC 31A8 (Monterey Bay) at 1 h after
retinal addition. Bottom, downwelling irradiance from HOT station measured at six
wavelengths (412, 443, 490, 510, 555 and 665 nm) and at two depths, for the same
depths and date that the HOT samples were collected (0 and 75 m). Irradiance is plotted
relative to irradiance at 490 nm.
© 2001 Macmillan Magazines Ltd
!59
More on Large Insert Metagenomics
(e-mail: denning@atmos.colostate.edu).
.................................................................
Unsuspected diversity among marine
aerobic anoxygenic phototrophs
Oded BeÂjaÁ*², Marcelino T. Suzuki*, John F. Heidelberg³,
William C. Nelson³, Christina M. Preston*, Tohru Hamada§²,
Jonathan A. Eisen³, Claire M. Fraser³ & Edward F. DeLong*
* Monterey Bay Aquarium Research Institute, Moss Landing,
California 95039-0628, USA
³ The Institute for Genomic Research, Rockville, Maryland 20850, USA
§ Marine Biotechnology Institute, Kamaishi Laboratories, Kamaishi City,
Iwate 026-0001, Japan
..............................................................................................................................................
Aerobic, anoxygenic, phototrophic bacteria containing bacterio-
chlorophyll a (Bchla) require oxygen for both growth and Bchla
synthesis1±6
. Recent reports suggest that these bacteria are widely
distributed in marine plankton, and that they may account for up
to 5% of surface ocean photosynthetic electron transport7
and
8
b
98
6
http://www.nature.com/nature/journal/v415/n6872/full/415630a.html
Erythrobacter litoralis*
MBIC3019*
Erythrobacter longus*
Erythromicrobium ramosum
cDNA 0m1
cDNA 20m21
Roseobacter denitrificans*
Roseobacter litoralis*
Rhodovulum sulfidophilum*
100
BAC 39B11
BAC 29C02
cDNA 20m10
BAC 52B02
BAC 65D09
BAC 24D02
99
100
100
54
Rhodoferax fermentans
Allochromatium vinosum
Roseateles depolymerans
Rubrivivax gelatinosus
cDNA 0m20
Rhodospirillum rubrum
56
96
52
67
Thiocystisge latinosa
Rhodomicrobium vannielii
Rhodocyclus tenuis
Rhodopseudomonas palustris
Chloroflexus aurantiacus
Sphingomonas natatoria
Rhodobacter sphaeroides
Rhodobacter capsulatus
BAC 56B12a
b
BAC 60D04
BAC 30G07
R2A84*
R2A163*
cDNA 20m22
cDNA 0m13
cDNA 20m8
R2A62*
cDNA 20m11
0.1
98
72
99
100
76
61
73
100
82
96
β
γ
α-3
α-4
α–1
α-2
env20m1
env20m5
env0m2
MBIC3951*
env0m1
envHOT1
envHOT2
envHOT3
100
99
100
100
100
Roseobacter denitrificans*
Roseobacter litoralis*
R2A84*
0.1
100
52
91
esearch, Rockville, Maryland 20850, USA
ute, Kamaishi Laboratories, Kamaishi City,
........................................................................................
ototrophic bacteria containing bacterio-
quire oxygen for both growth and Bchla
rts suggest that these bacteria are widely
ankton, and that they may account for up
photosynthetic electron transport7
and
obial community8
. Known planktonic
belong to only a few restricted groups
ia a-subclass. Here we report genomic
thetic gene content and operon organiza-
ng marine bacteria. These photosynthetic
ome that most closely resembled those of
b-subclass, which have never before been
ronments. Furthermore, these photosyn-
ly distributed in marine plankton, and
itic bacterioplankton assemblages, indi-
nti®ed phototrophs were photosyntheti-
ta demonstrate that planktonic bacterial
ply composed of one uniform, widespread
otrophs, as previously proposed8
; rather,
ain multiple, distantly related, photo-
erial groups, including some unrelated
types.
uired for the formation of bacteriochloro-
ystems in aerobic, anoxygenic, photo-
re clustered in a contiguous, 45-kilobase
n (superoperon)6
. These include bch and crt
nzymes of the bacteriochlorophyll and
pathways, and the puf genes coding for
harvesting complex (pufB and pufA) and
ex (pufL and pufM). To better describe the
planktonic, anoxygenic, photosynthetic
surface-water bacterial arti®cial chromo-
BAC 39B11
BAC 29C02
cDNA 20m10
BAC 52B02
BAC 65D09
BAC 24D02
99
100
100
54
67
Chloroflexus aurantiacus
b
envHOT2
envHOT3
100
Roseobacter denitrificans*
Roseobacter litoralis*
MBIC3951*
R2A84*
R2A62*
Rhodovulum sulfidophilum*
Rhodobacter sphaeroides
Rhodobacter capsulatus
Erythromicrobium ramosum
Erythrobacter litoralis*
MBIC3019*
Sphingomonas natatoria
Erythrobacter longus*
Rhodomicrobium vannielii
Rhodopseudomonas palustris
Rhodospirillum rubrum
Roseateles depolymerans
Rubrivivax gelatinosus
Rhodoferax fermentans
Rhodocyclus tenuis
Allochromatium vinosum
Thiocystis gelatinosa
Chloroflexus aurantiacus
0.1
α-3
α-4
α-2
α-1
β
γ
100
100
100
100
100
100
69
98
61
65
100
82
95
69
100
52
91
100
95
Figure 1 Phylogenetic relationships of pufM gene (a) and rRNA (b) sequences of AAP
bacteria. a, b, Evolutionary distances for the pufM genes (a) were determined from an
alignment of 600 nucleotide positions, and for rRNA genes (b) from an alignment of 860
nucleotide sequence positions. Evolutionary relationships were determined by neighbour-
joining analysis (see Methods). The green non-sulphur bacterium Chloro¯exus
aurantiacus was used as an outgroup. pufM genes that were ampli®ed by PCR in this
study are indicated by the env pre®x, with `m' indicating Monterey, and HOT indicating
Hawaii ocean time series. Cultivated aerobes are marked in light blue, bacteria cultured
from sea water are marked with an asterisk, and environmental cDNAs are marked in red.
Photosynthetic a-, b- and g-proteobacterial groups are indicated by the vertical bars to
rRNA PUF
a, b, Evolutionary distances for the
pufM genes (a) were determined from
an alignment of 600 nucleotide
positions, and for rRNA genes (b) from
an alignment of 860 nucleotide
sequence positions. Evolutionary
relationships were determined by
neighbour-joining analysis (see
Methods). The green non-sulphur
bacterium Chloroflexus aurantiacus
was used as an outgroup. pufM genes
that were amplified by PCR in this
study are indicated by the env prefix,
with 'm' indicating Monterey, and HOT
indicating Hawaii ocean time series.
Cultivated aerobes are marked in light
blue, bacteria cultured from sea water
are marked with an asterisk, and
environmental cDNAs are marked in
red. Photosynthetic -, - and -
proteobacterial groups are indicated
by the vertical bars to the right of the
tree. Bootstrap values greater than
50% are indicated above the
branches. The scale bar represents
number of substitutions per site.
NATURE |VOL 415 |7 FEBRUARY 2002 |www.nature.com 631
complementary DNA. However, a different primer targeting a from which the BACs originated are functional.
BAC 65D09
crt bch puf
hemN
orf276
orf154
orf227
puhA
lhaA
bch
ppsR
bch
Rubrivivax
gelatinosus
Rhodobacter
sphaeroides
AB L M C G PIH B N FM LBCDAE F C X Y Z
C X Y Z L M C ABB C EI F H B N FM L G P
HBNF MLGP J EBE IF D A I DCXYZLM BAX
orf358
orf440
H B N FM LGPB C EI FDAID C X Y Z L MB AQO
BAC 60D04
cycA
Q
orf641
OC
tspO
ppsR
ppa
bchpufbchcrt bchbch
puhA
lhaA
orf213
orf128
orf277
orf292
idi
idi
Figure 2 Schematic comparison of photosynthetic operons from R. gelatinosus (b-
proteobacteria), R. sphaeroides (a-proteobacteria) and uncultured environmental BACs.
ORF abbreviations use the nomenclature de®ned in refs 13, 14 and 24. Predicted ORFs
are coloured according to biological category: green, bacteriochloropyll biosynthesis
genes; orange, carotenoid biosynthesis genes; red, light-harvesting and reaction centre
genes; and blue, cytochrome c2. White boxes indicate non-photosynthetic and
hypothetical proteins with no known function. Homologous regions and genes are
connected by shaded vertical areas and lines, respectively.
© 2002 Macmillan Magazines Ltd
letters to nature
We compared several photosynthetic genes found on the oceanic
bacteriochlorophyll superoperons with characterized homologues
from cultured photosynthetic bacteria. The relationships among
Bchla biosynthetic proteins BchB (a subunit of light-independent
protochlorophyllide reductase) and BchH (magnesium chelatase)
were determined18
(Fig. 3). Similar to pufM relationships, BchB and
operon) to cultivated Erythrobacter species.
Recently, it has been suggested that cultivated Erythrobacter
species (a-4 subclass of proteobacteria) may represent the pre-
dominant AAPs in the upper ocean8,20
. Surprisingly, we were not
able to retrieve photosynthetic operon genes belonging to this group
in any of the samples analysed. Furthermore, very few sequences
BAC 60D04
BAC 65D09
BAC 29C02
ba BchH
100/96
100/100
100/99
100/100
100/100
100/100
100/100
BAC 60D04
BAC 65D09
BAC 29C02
100/66
100/100
100/94
100/100
100/100
100/100
BchB
0.1
0.1
Acidiphilium rubrum
Rhodobacter capsulatus
Rhodobacter sphaeroides
Rhodopseudomonas palustris
Rubrivivax gelatinosus
Chloroflexus aurantiacus
Chlorobium tepidum
Heliobacillus mobilis
Rhodopseudomonas palustris
Rhodobacter capsulatus
Rhodobacter sphaeroides
Acidiphilium rubrum
Heliobacillus mobilis
Chlorobium tepidum
Chloroflexus aurantiacus
Rubrivivax gelatinosus
Figure 3 Phylogenetic analyses of BchB and BchH proteins. a, Phylogenetic tree for the
BchB protein. b, Phylogenetic tree for the BchH protein. The BchH sequences from
Chlorobium vibrioforme25
and BchH2 and BchH3 from C. tepidum18
were omitted from the
tree because these genes potentially encode an enzyme for bacteriochlorophyll c
biosynthesis and are probably of distinct origin (J. Xiong, personal communication).
Bootstrap values (neighbour-joining/parsimony method) greater than 50% are indicated
next to the branches. The scale bar represents number of substitutions per site. The
position of Acidiphilium rubrum (bold branch) was not well resolved by both methods.
opsins. Furthermore, palE6
to its Monterey Bay
d transport of protons in
udich et al., manuscript in
Antarctica (E. F. DeLong,
in primers, and sequenced
ary sequence analyses based
ate that, despite differences
on spectra, the Antarctic
cterium highly related to
bacteria of Monterey Bay10
om the HOT station were
mples. Most proteorhodop-
ce waters belonged to the
he amino-acid level) to the
. In contrast, most of the
within the Antarctic clade
om the HOT station, HOT
E. coli and their absorption
m their primary sequences,
th the Antarctic clade gave
whereas the proteorhodop-
e gave a green (527-nm)
al North Paci®c gyre, most
with maximal intensity near
s maintained over depth,
he surface, the energy peak
tween 400 and 650 nm. In
ows and the half bandwidth
m (Fig. 5). Considering the
mbers of the two different
advantage at different points along the depth-dependent light
0.06
0.05
0.04
0.03
0.02
0.01
1.0
0.8
0.6
0.4
0.2
0
350 400
75 m
5 m
Monterey
Bay
HOT 0 m
HOT 75 m
Antarctica
450 500
Wavelength (nm)
RelativeirradianceAbsorbance
550 600 650
Figure 5 Absorption spectra of retinal-reconstituted proteorhodopsins in E. coli
membranes. All-trans retinal (2.5 mM) was added to membrane suspensions in 100 mM
phosphate buffer, pH 7.0, and absorption spectra were recorded. Top, four spectra for
palE6 (Antarctica), HOT 75m4, HOT 0m1, and BAC 31A8 (Monterey Bay) at 1 h after
retinal addition. Bottom, downwelling irradiance from HOT station measured at six
wavelengths (412, 443, 490, 510, 555 and 665 nm) and at two depths, for the same
depths and date that the HOT samples were collected (0 and 75 m). Irradiance is plotted
relative to irradiance at 490 nm.
Community structure and metabolism
through reconstruction of microbial
genomes from the environment
Gene W. Tyson1
, Jarrod Chapman3,4
, Philip Hugenholtz1
, Eric E. Allen1
, Rachna J. Ram1
, Paul M. Richardson4
, Victor V. Solovyev4
,
Edward M. Rubin4
, Daniel S. Rokhsar3,4
& Jillian F. Banfield1,2
1
Department of Environmental Science, Policy and Management, 2
Department of Earth and Planetary Sciences, and 3
Department of Physics, University of California,
Berkeley, California 94720, USA
4
Joint Genome Institute, Walnut Creek, California 94598, USA
...........................................................................................................................................................................................................................
Microbial communities are vital in the functioning of all ecosystems; however, most microorganisms are uncultivated, and their
roles in natural systems are unclear. Here, using random shotgun sequencing of DNA from a natural acidophilic biofilm, we report
reconstruction of near-complete genomes of Leptospirillum group II and Ferroplasma type II, and partial recovery of three other
genomes. This was possible because the biofilm was dominated by a small number of species populations and the frequency of
genomic rearrangements and gene insertions or deletions was relatively low. Because each sequence read came from a different
individual, we could determine that single-nucleotide polymorphisms are the predominant form of heterogeneity at the strain level.
The Leptospirillum group II genome had remarkably few nucleotide polymorphisms, despite the existence of low-abundance
variants. The Ferroplasma type II genome seems to be a composite from three ancestral strains that have undergone homologous
recombination to form a large population of mosaic genomes. Analysis of the gene complement for each organism revealed the
pathways for carbon and nitrogen fixation and energy generation, and provided insights into survival strategies in an extreme
environment.
The study of microbial evolution and ecology has been revolutio-
nized by DNA sequencing and analysis1–3
. However, isolates have
been the main source of sequence data, and only a small fraction of
microorganisms have been cultivated4–6
. Consequently, focus has
shifted towards the analysis of uncultivated microorganisms via
cloning of conserved genes5
and genome fragments directly from
the environment7–9
. To date, only a small fraction of genes have been
recovered from individual environments, limiting the analysis of
fluorescence in situ hybridization (FISH) revealed that all biofilms
contained mixtures of bacteria (Leptospirillum, Sulfobacillus and, in
a few cases, Acidimicrobium) and archaea (Ferroplasma and other
members of the Thermoplasmatales). The genome of one of these
archaea, Ferroplasma acidarmanus fer1, isolated from the Richmond
mine, has been sequenced previously (http://www.jgi.doe.gov/JGI_
microbial/html/ferroplasma/ferro_homepage.html).
A pink biofilm (Fig. 1a) typical of AMD communities was
articles
!65
Environmental Genome Shotgun
Sequencing of the Sargasso Sea
J. Craig Venter,1
* Karin Remington,1
John F. Heidelberg,3
Aaron L. Halpern,2
Doug Rusch,2
Jonathan A. Eisen,3
Dongying Wu,3
Ian Paulsen,3
Karen E. Nelson,3
William Nelson,3
Derrick E. Fouts,3
Samuel Levy,2
Anthony H. Knap,6
Michael W. Lomas,6
Ken Nealson,5
Owen White,3
Jeremy Peterson,3
Jeff Hoffman,1
Rachel Parsons,6
Holly Baden-Tillson,1
Cynthia Pfannkoch,1
Yu-Hui Rogers,4
Hamilton O. Smith1
chlorococcus, tha
photosynthetic bio
Surface water
were collected ab
from three sites o
February 2003. A
lected aboard the S
station S” in May
are indicated on F
S1; sampling prot
one expedition to
was extracted from
genomic libraries w
2 to 6 kb were m
prepared plasmid
RESEARCH ARTICLE
Shotgun metagenomics
shotgun
sequence
!66
Acid Mine Drainage 2004
clones).
The first step in assignment of scaffolds to organism types was to
represent a nearly complete genome of
uncultured Ferroplasma species distinct
this as Ferroplasma type II. The dominan
was unexpected before the genomic analy
We assigned the roughly 3£ coverage
Leptospirillum group III on the basis of rRN
up to 31 kb, totalling 2.66 Mb). Comparis
those assigned to Leptospirillum group
sequence divergence and only locally co
firming that the scaffolds belong to a rel
Leptospirillum group II. A partial 16S rR
Sulfobacillus thermosulfidooxidans was
assembled reads, suggesting very low cov
any Sulfobacillus scaffolds .2 kb were a
grouped with the Leptospirillum group II
We compared the 3£ coverage, low Gþ
4.12 Mb) to the fer1 genome in order to
types (Supplementary Fig. S6). Scaffold
identity to fer1 were assigned to an enviro
I genome (170 scaffolds up to 47 kb in
1.48 Mb of sequence). The remaining
scaffolds are tentatively assigned to G-pla
in this bin (62 kb) contains the G-plasma
scaffolds assigned to G-plasma comprise
partial 16S rRNA gene sequence from A-pl
unassembled reads, suggesting low covera
scaffolds from A-plasma .2 kb would be
bin. Although eukaryotes are present in th
in low abundance in the biofilm studied.
eukaryotes have been detected.
As independent evidence that the Lep
Ferroplasma type II genomes are nearly co
complement of transfer RNA synthetases
An almost complete set of these genes
Leptospirillum group III. The G-plasma bin
set of tRNA synthetases, consistent with in
scaffolds. In addition, we established
group II, Leptospirillum group III, Ferrop
type II and G-plasma bins contained onl
Figure 1 The pink biofilm. a, Photograph of the biofilm in the Richmond mine (hand
included for scale). b, FISH image of a. Probes targeting bacteria (EUBmix; fluorescein
isothiocyanate (green)) and archaea (ARC915; Cy5 (blue)) were used in combination with a
probe targeting the Leptospirillum genus (LF655; Cy3 (red)). Overlap of red and green
(yellow) indicates Leptospirillum cells and shows the dominance of Leptospirillum.
c, Relative microbial abundances determined using quantitative FISH counts.
NATURE | doi:10.1038/na2 ©2004 NaturePublishing Group !67
Bassham cycle (using type II ribulose 1,5-bisphosphate carboxy-
lase–oxygenase). All genomes recovered from the AMD system
by some, or all, organisms. Given the large number of ABC-type
sugar and amino acid transporters encoded in the Ferroplasma type
Figure 4 Cell metabolic cartoons constructed from the annotation of 2,180 ORFs
identified in the Leptospirillum group II genome (63% with putative assigned function) and
1,931 ORFs in the Ferroplasma type II genome (58% with assigned function). The cell
drainage stream (viewed in cross-section). Tight coupling between ferrous iron oxidation,
pyrite dissolution and acid generation is indicated. Rubisco, ribulose 1,5-bisphosphate
carboxylase–oxygenase. THF, tetrahydrofolate.
!68
Environmental Genome Shotgun
Sequencing of the Sargasso Sea
J. Craig Venter,1
* Karin Remington,1
John F. Heidelberg,3
Aaron L. Halpern,2
Doug Rusch,2
Jonathan A. Eisen,3
Dongying Wu,3
Ian Paulsen,3
Karen E. Nelson,3
William Nelson,3
Derrick E. Fouts,3
Samuel Levy,2
Anthony H. Knap,6
Michael W. Lomas,6
Ken Nealson,5
Owen White,3
Jeremy Peterson,3
Jeff Hoffman,1
Rachel Parsons,6
Holly Baden-Tillson,1
Cynthia Pfannkoch,1
Yu-Hui Rogers,4
Hamilton O. Smith1
We have applied “whole-genome shotgun sequencing” to microbial populations
collected en masse on tangential flow and impact filters from seawater samples
collected from the Sargasso Sea near Bermuda. A total of 1.045 billion base pairs
of nonredundant sequence was generated, annotated, and analyzed to elucidate
the gene content, diversity, and relative abundance of the organisms within
these environmental samples. These data are estimated to derive from at least
1800 genomic species based on sequence relatedness, including 148 previously
unknown bacterial phylotypes. We have identified over 1.2 million previously
unknown genes represented in these samples, including more than 782 new
chlorococcus, th
photosynthetic bi
Surface wate
were collected a
from three sites
February 2003. A
lected aboard the
station S” in Ma
are indicated on
S1; sampling pro
one expedition to
was extracted fro
genomic libraries
2 to 6 kb were
prepared plasmid
both ends to prov
Craig Venter Sc
nology Center on
ers (Applied Bi
Whole-genome ra
the Weatherbird II
4) produced 1.66
in length, for a tota
microbial DNA se
sequences were g
RESEARCH ARTICLE
!69
http://www.sciencemag.org/content/304/5667/66
Sargasso Sea
two groups of scaffolds representing two dis-
tinct strains closely related to the published
at depths ranging from 4ϫ to 36ϫ (indicated
with shading in table S3 with nine depicted in
Fig. 1. MODIS-Aqua satellite image of
ocean chlorophyll in the Sargasso Sea grid
about the BATS site from 22 February
2003. The station locations are overlain
with their respective identifications. Note
the elevated levels of chlorophyll (green
color shades) around station 3, which are
not present around stations 11 and 13.
Fig. 2. Gene conser-
vation among closely !70
http://www.sciencemag.org/content/304/5667/66
frames (5). A total of 69,901 novel genes be-
longing to 15,601 single link clusters were iden-
tified. The predicted genes were categorized
Fig. 5. Prochlorococcus-related scaffold 2223290 illustrates the assembly of a broad commu-
nity of closely related organisms, distinctly nonpunctate in nature. The image represents (A)
global structure of Scaffold 2223290 with respect to assembly and (B) a sample of the multiple
sequence alignment. Blue segments, contigs; green segments, fragments; and yellow segments,
stages of the assembly of fragments into the resulting contigs. The yellow bars indicate that
fragments were initially assembled in several different pieces, which in places collapsed to
form the final contig structure. The multiple sequence alignment for this region shows a
homogenous blend of haplotypes, none with sufficient depth of coverage to provide a
separate assembly.
Table 1. Gene count breakdown by TIGR role
category. Gene set includes those found on as-
semblies from samples 1 to 4 and fragment reads
from samples 5 to 7. A more detailed table, sep-
arating Weatherbird II samples from the Sorcerer II
samples is presented in the SOM (table S4). Note
that there are 28,023 genes which were classified
in more than one role category.
TIGR role category
Total
genes
Amino acid biosynthesis 37,118
Biosynthesis of cofactors,
prosthetic groups, and carriers
25,905
Cell envelope 27,883
Cellular processes 17,260
Central intermediary metabolism 13,639
DNA metabolism 25,346
Energy metabolism 69,718
Fatty acid and phospholipid
metabolism
18,558
Mobile and extrachromosomal
element functions
1,061
Protein fate 28,768
Protein synthesis 48,012
Purines, pyrimidines, nucleosides,
and nucleotides
19,912
Regulatory functions 8,392
Signal transduction 4,817
Transcription 12,756
Transport and binding proteins 49,185
Unknown function 38,067
Miscellaneous 1,864
Conserved hypothetical 794,061
Total number of roles assigned 1,242,230
Total number of genes 1,214,207
www.sciencemag.org SCIENCE VOL 304 2 APRIL 2004 69
!71
We closely examined the multiple sequence
alignments of the contigs with high SNP rates
and were able to classify these into two fairly
distinct classes: regions where several closely
related haplotypes have been collapsed, in-
creasing the depth of coverage accordingly
(10), and regions that appear to be a relatively
homogenous blend of discrepancies from the
consensus without any apparent separation into
haplotypes, such as the Prochlorococcus scaf-
fold region (Fig. 5). Indeed, the Prochlorococ-
cus scaffolds display considerable heterogene-
ity not only at the nucleotide sequence level
(Fig. 5) but also at the genomic level, where
multiple scaffolds align with the same region of
the MED4 (11) genome but differ due to gene
or genomic island insertion, deletion, rearrange-
ment events. This observation is consistent with
previous findings (12). For instance, scaffolds
2221918 and 2223700 share gene synteny with
each other and MED4 but differ by the insertion
of 15 genes of probable phage origin, likely
representing an integrated bacteriophage. These
genomic differences are displayed graphically
in Fig. 2, where it is evident that up to four
conflicting scaffolds can align with the same
region of the MED4 genome. More than 85%
of the Prochlorococcus MED4 genome can be
aligned with Sargasso Sea scaffolds greater
than 10 kb; however, there appear to be a
couple of regions of MED4 that are not repre-
sented in the 10-kb scaffolds (Fig. 2). The
larger of these two regions (PMM1187 to
PMM1277) consists primarily of a gene cluster
coding for surface polysaccharide biosynthesis,
which may represent a MED4-specific polysac-
charide absent or highly diverged in our Sar-
gasso Sea Prochlorococcus bacteria. The heter-
ogeneity of the Prochlorococcusscaffolds suggest
that the scaffolds are not derived from a single
discrete strain, but instead probably represent a
conglomerate assembled from a population of
closely related Prochlorococcus biotypes.
The gene complement of the Sargasso.
The heterogeneity of the Sargasso sequences
complicates the identification of microbial
genes. The typical approach for microbial an-
notation, model-based gene finding, relies en-
tirely on training with a subset of manually
eate scaffold borders.
Fig. 4. Circular diagrams of nine complete megaplasmids. Genes encoded in the forward direction
are shown in the outer concentric circle; reverse coding genes are shown in the inner concentric
circle. The genes have been given role category assignment and colored accordingly: amino acid
biosynthesis, violet; biosynthesis of cofactors, prosthetic groups, and carriers, light blue; cell
envelope, light green; cellular processes, red; central intermediary metabolism, brown; DNA
metabolism, gold; energy metabolism, light gray; fatty acid and phospholipid metabolism, magenta;
protein fate and protein synthesis, pink; purines, pyrimidines, nucleosides, and nucleotides, orange;
regulatory functions and signal transduction, olive; transcription, dark green; transport and binding
proteins, blue-green; genes with no known homology to other proteins and genes with homology
to genes with no known function, white; genes of unknown function, gray; Tick marks are placed
on 10-kb intervals.
2 APRIL 2004 VOL 304 SCIENCE www.sciencemag.org68
assembly to identify a set of large, deeply as-
sembling nonrepetitive contigs. This was used to
set the expected coverage in unique regions (to
23ϫ) for a final run of the assembler. This al-
lowed the deep contigs to be treated as unique
sequence when they would otherwise be labeled
as repetitive. We evaluated our final assembly
results in a tiered fashion, looking at well-sampled
genomic regions separately from those barely
sampled at our current level of sequencing.
The 1.66 million sequences from the
Weatherbird II samples (table S1; samples 1 to
4; stations 3, 11, and 13), were pooled and
assembled to provide a single master assembly
for comparative purposes. The assembly gener-
ated 64,398 scaffolds ranging in size from 826
bp to 2.1 Mbp, containing 256 Mbp of unique
sequence and spanning 400 Mbp. After assem-
bly, there remained 217,015 paired-end reads,
or “mini-scaffolds,” spanning 820.7 Mbp as
well as an additional 215,038 unassembled sin-
gleton reads covering 169.9 Mbp (table S2,
column 1). The Sorcerer II samples provided
almost no assembly, so we consider for these
samples only the 153,458 mini-scaffolds, span-
ning 518.4 Mbp, and the remaining 18,692
singleton reads (table S2, column 2). In total,
1.045 Gbp of nonredundant sequence was gen-
erated. The lack of overlapping reads within the
unassembled set indicates that lack of addition-
al assembly was not due to algorithmic limita-
tions but to the relatively limited depth of se-
quencing coverage given the level of diversity
within the sample.
The whole-genome shotgun (WGS) assembly
has been deposited at DDBJ/EMBL/GenBank
under the project accession AACY00000000,
and all traces have been deposited in a corre-
sponding TraceDB trace archive. The version
described in this paper is the first version,
AACY01000000. Unlike a conventional WGS
entry, we have deposited not just contigs and
scaffolds but the unassembled paired singletons
and individual singletons in order to accurate-
ly reflect the diversity in the sample and
allow searches across the entire sample with-
in a single database.
Genomes and large assemblies. Our
analysis first focused on the well-sampled ge-
nomes by characterizing scaffolds with at least
3ϫ coverage depth. There were 333 scaffolds
comprising 2226 contigs and spanning 30.9
Mbp that met this criterion (table S3), account-
ing for roughly 410,000 reads, or 25% of the
pooled assembly data set. From this set of well-
sampled material, we were able to cluster and
classify assemblies by organism; from the rare
species in our sample, we used sequence similar-
ity based methods together with computational
gene finding to obtain both qualitative and quan-
titative estimates of genomic and functional diver-
sity within this particular marine environment.
We employed several criteria to sort the
major assembly pieces into tentative organism
“bins”; these include depth of coverage, oligo-
nucleotide frequencies (7), and similarity to
previously sequenced genomes (5). With these
techniques, the majority of sequence assigned
to the most abundant species (16.5 Mbp of the
30.9 Mb in the main scaffolds) could be sepa-
rated based on several corroborating indicators.
In particular, we identified a distinct group of
scaffolds representing an abundant population
clearly related to Burkholderia (fig. S2) and
two groups of scaffolds representing two dis-
tinct strains closely related to the published
Shewanella oneidensis genome (8) (fig. S3).
There is a group of scaffolds assembling at over
6ϫ coverage that appears to represent the ge-
nome of a SAR86 (table S3). Scaffold sets
representing a conglomerate of Prochlorococ-
cus strains (Fig. 2), as well as an uncultured
marine archaeon, were also identified (table S3;
Fig. 3). Additionally, 10 putative mega plasmids
were found in the main scaffold set, covered
at depths ranging from 4ϫ to 36ϫ (indicated
with shading in table S3 with nine depicted in
Fig. 1. MODIS-Aqua satellite image of
ocean chlorophyll in the Sargasso Sea grid
about the BATS site from 22 February
2003. The station locations are overlain
with their respective identifications. Note
the elevated levels of chlorophyll (green
color shades) around station 3, which are
not present around stations 11 and 13.
Fig. 2. Gene conser-
vation among closely
related Prochlorococ-
cus. The outermost
concentric circle of
the diagram depicts
the competed genom-
ic sequence of Pro-
chlorococcus marinus
MED4 (11). Fragments
from environmental
sequencing were com-
pared to this complet-
ed Prochlorococcus ge-
nome and are shown in
the inner concentric
circles and were given
boxed outlines. Genes
for the outermost cir-
cle have been as-
signed psuedospec-
trum colors based on
the position of those
genes along the chro-
mosome, where genes
nearer to the start of
the genome are col-
ored in red, and genes
nearer to the end of the genome are colored in blue. Fragments from environmental sequencing
were subjected to an analysis that identifies conserved gene order between those fragments and
the completed Prochlorococcus MED4 genome. Genes on the environmental genome segments
that exhibited conserved gene order are colored with the same color assignments as the
Prochlorococcus MED4 chromosome. Colored regions on the environmental segments exhibiting
color differences from the adjacent outermost concentric circle are the result of conserved gene
order with other MED4 regions and probably represent chromosomal rearrangements. Genes that
did not exhibit conserved gene order are colored in black.
www.sciencemag.org SCIENCE VOL 304 2 APRIL 2004 67http://www.sciencemag.org/content/304/5667/66
rRNA phylotyping from metagenomics
!72
http://www.sciencemag.org/
content/304/5667/66
Shotgun Sequencing Allows Alternative Anchors (e.g., RecA)
!73
http://www.sciencemag.org/
content/304/5667/66
nomic group using the phylogenetic analysis
described for rRNA. For example, our data set
marker genes, is roughly comparable to the
97% cutoff traditionally used for rRNA. Thus
Fig. 6. Phylogenetic diversity of Sargasso Sea sequences using multiple phylogenetic markers. The
relative contribution of organisms from different major phylogenetic groups (phylotypes) was
measured using multiple phylogenetic markers that have been used previously in phylogenetic
studies of prokaryotes: 16S rRNA, RecA, EF-Tu, EF-G, HSP70, and RNA polymerase B (RpoB). The
relative proportion of different phylotypes for each sequence (weighted by the depth of coverage
of the contigs from which those sequences came) is shown. The phylotype distribution was
determined as follows: (i) Sequences in the Sargasso data set corresponding to each of these genes
were identified using HMM and BLAST searches. (ii) Phylogenetic analysis was performed for each
phylogenetic marker identified in the Sargasso data separately compared with all members of that
gene family in all complete genome sequences (only complete genomes were used to control for
the differential sampling of these markers in GenBank). (iii) The phylogenetic affinity of each
sequence was assigned based on the classification of the nearest neighbor in the phylogenetic tree.
RIL 2004 VOL 304 SCIENCE www.sciencemag.org !74
http://www.sciencemag.org/
content/304/5667/66
Functional Inference from Metagenomics
• Can work well for individual genes
!75
Functional Diversity of Proteorhodopsins?
!76
http://www.sciencemag.org/
content/304/5667/66
ARTICLE
Received 3 Nov 2010 | Accepted 11 Jan 2011 | Published 8 Feb 2011 DOI: 10.1038/ncomms1188
Proteorhodopsins are light-driven proton pumps involved in widespread phototrophy.
Discovered in marine proteobacteria just 10 years ago, proteorhodopsins are now known
to have been spread by lateral gene transfer across diverse prokaryotes, but are curiously
absent from eukaryotes. In this study, we show that proteorhodopsins have been acquired
by horizontal gene transfer from bacteria at least twice independently in dinoflagellate
protists. We find that in the marine predator Oxyrrhis marina, proteorhodopsin is indeed the
most abundantly expressed nuclear gene and its product localizes to discrete cytoplasmic
structures suggestive of the endomembrane system. To date, photosystems I and II have
been the only known mechanism for transducing solar energy in eukaryotes; however, it now
appears that some abundant zooplankton use this alternative pathway to harness light to
A bacterial proteorhodopsin proton pump
in marine eukaryotes
Claudio H. Slamovits1,†
, Noriko Okamoto1
, Lena Burri1,†
, Erick R. James1
& Patrick J. Keeling1
http://www.nature.com/ncomms/journal/v2/n2/abs/ncomms1188.html
ARTICLENATURE COMMUNICATIONS | DOI: 10.1038/ncomms1188
NATURE COMMUNICATIONS | 2:183 | DOI: 10.1038/ncomms1188 | www.nature.com/naturecommunications
© 2011 Macmillan Publishers Limited. All rights reserved.
50
100
Neurospora crassa
51 Aspergillus fumigatus
Gibberella zeae
90
‘Oryza_sativa’
95 Leptosphaeria maculans
97 Pyrenophora tritici
Bipolaris oryzae
100 Oxyrrhis marina OM2197
Dinoflagellate 3
Dinoflagellate 1
Dinoflagellate 2
O. marina OM1497
Cyanophora paradoxa
80
Guillardia theta 1490
99
100 Rhodomonas salina 987
G. theta t309
62
Cryptomonas sp.
G. tetha 2135
100 G. theta 2
G. theta 1712
99 Haloterrigena sp.
68 Haloarcula argentinensis
Haloquadratum walsbyi
100
Nostoc sp.
64
Natronomonas pharaonis
Halorubrum sodomense
62 Haloterrigena sp.
Halobacterium salinarum
100
89
58
Salinibacter ruber (Xanthorhodopsin)
uncult. marine picoplankton (Hawaii)
Thermus aquaticus
Gloeobacter violaceus
Roseiflexus sp.
52
100 Env. SS AACY01492695
Env. SS AACY01598379
99
GS012-3
73 GS020-43
98 GS020-39
GS020-70
99
GS033-3
56
100 GS012-21
53
MWH-Dar4
MWH-Dar1
87
GS012-37
98
GS020-82
GS012-39
EgelM2-3.D
GS012-14
86
100 uncult. marine bacterium EB0_41B09 (Monterey Bay)
79 Methylophilales bacterium
Beta proteobacterium KB1
59 Marinobacter sp.
Alpha proteobacterium BAL199
Octadecabacter antarcticus
92
100 uncult. GOS 7929496
uncult. GOS ECY53283
100
Alexandrium catenella
53
Pyrocystis lunula
‘Amphioxus’
83
O. marina 27
91 O. marina 344
O. marina 5001
91
O. marina 173
O. marina 72
55
uncult. marine picoplankton 2 (Hawaii)
65
75 Exiguobacterium sibiricum
100 GS013_1
GS033_122
91
100 Env. SS AACY01002357
98 Tenacibaculum sp.
Psychroflexus torquis
95
Env. SS AACY01054069
76
uncult. marine group II euryarchaeote HF70_39H11
69
52
uncult. gamma proteobacterium eBACHOT4E07
Env. SS AACY01043960
86 uncult. marine bacterium HF130_81H07
Env. SS AACY01120795
100
Env. SS AACY01118955
79
uncult. marine proteobacterium ANT32C12
77
uncult. bacterium MedeBAC35C06
99 uncult. bacterium
uncult. marine gamma proteobacterium EBAC31A08
51 uncult. marine alpha proteobacterium HOT2C01
100 uncult. marine bacterium EB0_41B09
Photobacterium sp.
50
Env. SS AACY01499050
70
uncult. marine gamma proteobacterium HTCC2207
uncult. bacterium eBACred22E04
100 Karlodinium micrum 02
K. micrum 01
100 uncult. bacterium eBACmed86H08
Pelagibacter ubique
50
Env. SS AACY01015111
uncult. marine bacterium 66A03
80
100 GS033_85
Vibrio harveyi
76
Rhodobacterales bacterium
uncult. bacterium MedeBAC82F10
55
uncult. marine bacterium 66A03 (2)
uncult. bacterium MEDPR46A6
Fungal ( H+ / ?)
Halorhodopsins
(H+
/ Cl
–
)
Algal opsins
(sensory)
Proteorhodopsins
(H+)
Figure 1 | Phylogenetic distribution of dinoflagellate rhodopsins. Protein sequences of 96 rhodopsins encompassing the known diversity of microbial
(type I) rhodopsins from the three domains of life10
were used to generate a maximum likelihood phylogenetic tree (See Supplementary Table S1 for accession
numbers). Grey boxes distinguish the recognized groupings of type 1 rhodopsin, and the prevalent function in each group is shown: proton pumps (H+
), sensory,
chlorine pumps (Cl−
) and unknown (?). Numbers indicate bootstrap support when 50% (over 300 replicates). Black boxes highlight dinoflagellate genes:
Dinoflagellate 1 is a large group of proteorhodopsins present in diverse dinoflagellates; dinoflagellate 2 includes two K. micrum proteorhodopsin genes
of independent origin; dinoflagellate 3 includes two O. marina genes related to algal sensory rhodopsins, probably of endosymbiotic origin.
Microbial Phylogenomics (EVE161) Class 14: Metagenomics
Microbial Phylogenomics (EVE161) Class 14: Metagenomics
Microbial Phylogenomics (EVE161) Class 14: Metagenomics
Microbial Phylogenomics (EVE161) Class 14: Metagenomics
Microbial Phylogenomics (EVE161) Class 14: Metagenomics
Microbial Phylogenomics (EVE161) Class 14: Metagenomics
Microbial Phylogenomics (EVE161) Class 14: Metagenomics

More Related Content

What's hot

UC Davis EVE161 Lecture 10 by @phylogenomics
UC Davis EVE161 Lecture 10 by @phylogenomicsUC Davis EVE161 Lecture 10 by @phylogenomics
UC Davis EVE161 Lecture 10 by @phylogenomics
Jonathan Eisen
 
BioMinds Poster!!!!!!!!
BioMinds Poster!!!!!!!!BioMinds Poster!!!!!!!!
BioMinds Poster!!!!!!!!
Zuleika86
 

What's hot (20)

UC Davis EVE161 Lecture 15 by @phylogenomics
UC Davis EVE161 Lecture 15 by @phylogenomicsUC Davis EVE161 Lecture 15 by @phylogenomics
UC Davis EVE161 Lecture 15 by @phylogenomics
 
EVE 161 Winter 2018 Class 10
EVE 161 Winter 2018 Class 10EVE 161 Winter 2018 Class 10
EVE 161 Winter 2018 Class 10
 
Microbial Phylogenomics (EVE161) Class 7: rRNA PCR and Major Groups
Microbial Phylogenomics (EVE161) Class 7: rRNA PCR and Major Groups Microbial Phylogenomics (EVE161) Class 7: rRNA PCR and Major Groups
Microbial Phylogenomics (EVE161) Class 7: rRNA PCR and Major Groups
 
EVE 161 Winter 2018 Class 17
EVE 161 Winter 2018 Class 17EVE 161 Winter 2018 Class 17
EVE 161 Winter 2018 Class 17
 
Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
Microbial Phylogenomics (EVE161) Class 17: Genomes from UnculturedMicrobial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
 
UC Davis EVE161 Lecture 17 by @phylogenomics
 UC Davis EVE161 Lecture 17 by @phylogenomics UC Davis EVE161 Lecture 17 by @phylogenomics
UC Davis EVE161 Lecture 17 by @phylogenomics
 
Microbial Phylogenomics (EVE161) Class 3: Woese and the Tree of Life
Microbial Phylogenomics (EVE161) Class 3: Woese and the Tree of LifeMicrobial Phylogenomics (EVE161) Class 3: Woese and the Tree of Life
Microbial Phylogenomics (EVE161) Class 3: Woese and the Tree of Life
 
Microbial Phylogenomics (EVE161) Class 16: Shotgun Metagenomics
Microbial Phylogenomics (EVE161) Class 16: Shotgun MetagenomicsMicrobial Phylogenomics (EVE161) Class 16: Shotgun Metagenomics
Microbial Phylogenomics (EVE161) Class 16: Shotgun Metagenomics
 
Microbial Phylogenomics (EVE161) Class 10-11: Genome Sequencing
Microbial Phylogenomics (EVE161) Class 10-11: Genome SequencingMicrobial Phylogenomics (EVE161) Class 10-11: Genome Sequencing
Microbial Phylogenomics (EVE161) Class 10-11: Genome Sequencing
 
EVE 161 Winter 2018 Class 15
EVE 161 Winter 2018 Class 15EVE 161 Winter 2018 Class 15
EVE 161 Winter 2018 Class 15
 
EVE 161 Winter 2018 Class 18
EVE 161 Winter 2018 Class 18EVE 161 Winter 2018 Class 18
EVE 161 Winter 2018 Class 18
 
UC Davis EVE161 Lecture 8 - rRNA ecology - by Jonathan Eisen @phylogenomics
UC Davis EVE161 Lecture 8 - rRNA ecology - by Jonathan Eisen @phylogenomicsUC Davis EVE161 Lecture 8 - rRNA ecology - by Jonathan Eisen @phylogenomics
UC Davis EVE161 Lecture 8 - rRNA ecology - by Jonathan Eisen @phylogenomics
 
UC Davis EVE161 Lecture 9 by @phylogenomics
UC Davis EVE161 Lecture 9 by @phylogenomicsUC Davis EVE161 Lecture 9 by @phylogenomics
UC Davis EVE161 Lecture 9 by @phylogenomics
 
EVE 161 Winter 2018 Class 13
EVE 161 Winter 2018 Class 13EVE 161 Winter 2018 Class 13
EVE 161 Winter 2018 Class 13
 
UC Davis EVE161 Lecture 10 by @phylogenomics
UC Davis EVE161 Lecture 10 by @phylogenomicsUC Davis EVE161 Lecture 10 by @phylogenomics
UC Davis EVE161 Lecture 10 by @phylogenomics
 
Metagenomics analysis
Metagenomics  analysisMetagenomics  analysis
Metagenomics analysis
 
American Gut Project presentation at Masaryk University
American Gut Project presentation at Masaryk UniversityAmerican Gut Project presentation at Masaryk University
American Gut Project presentation at Masaryk University
 
Evolution of DNA repair genes, proteins and processes
Evolution of DNA repair genes, proteins and processesEvolution of DNA repair genes, proteins and processes
Evolution of DNA repair genes, proteins and processes
 
Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
Microbial Phylogenomics (EVE161) Class 17: Genomes from UnculturedMicrobial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
 
BioMinds Poster!!!!!!!!
BioMinds Poster!!!!!!!!BioMinds Poster!!!!!!!!
BioMinds Poster!!!!!!!!
 

Similar to Microbial Phylogenomics (EVE161) Class 14: Metagenomics

REVIEW THE STATUS OF GENOME ANALYSIS OF CULTURED ARCHAEA
REVIEW THE STATUS OF GENOME ANALYSIS OF CULTURED ARCHAEA REVIEW THE STATUS OF GENOME ANALYSIS OF CULTURED ARCHAEA
REVIEW THE STATUS OF GENOME ANALYSIS OF CULTURED ARCHAEA
Shruti Gupta
 
Nikola_Ivica_Thesis
Nikola_Ivica_ThesisNikola_Ivica_Thesis
Nikola_Ivica_Thesis
Nikola Ivica
 
BS831_project_PantelianaIoannou
BS831_project_PantelianaIoannouBS831_project_PantelianaIoannou
BS831_project_PantelianaIoannou
Panteliana Ioannou
 
CHARACTERIZATION OF SOIL AND LEAF ISOPRENE-DEGRADING COMMUNITIES AND THEIR CO...
CHARACTERIZATION OF SOIL AND LEAF ISOPRENE-DEGRADING COMMUNITIES AND THEIR CO...CHARACTERIZATION OF SOIL AND LEAF ISOPRENE-DEGRADING COMMUNITIES AND THEIR CO...
CHARACTERIZATION OF SOIL AND LEAF ISOPRENE-DEGRADING COMMUNITIES AND THEIR CO...
Panteliana Ioannou
 

Similar to Microbial Phylogenomics (EVE161) Class 14: Metagenomics (20)

Metagenomics
MetagenomicsMetagenomics
Metagenomics
 
Stalking the Fourth Domain in Metagenomic Data: Searching for, Discovering, a...
Stalking the Fourth Domain in Metagenomic Data: Searching for, Discovering, a...Stalking the Fourth Domain in Metagenomic Data: Searching for, Discovering, a...
Stalking the Fourth Domain in Metagenomic Data: Searching for, Discovering, a...
 
Metagenomics
MetagenomicsMetagenomics
Metagenomics
 
Metagenomics .pptx
Metagenomics .pptxMetagenomics .pptx
Metagenomics .pptx
 
Metagenomics , Applications, Techniques And Limitations .pptx
Metagenomics , Applications, Techniques And Limitations .pptxMetagenomics , Applications, Techniques And Limitations .pptx
Metagenomics , Applications, Techniques And Limitations .pptx
 
Dn abarcode
Dn abarcodeDn abarcode
Dn abarcode
 
Diversity Diversity Diversity Diversity ....
Diversity Diversity Diversity Diversity ....Diversity Diversity Diversity Diversity ....
Diversity Diversity Diversity Diversity ....
 
replicación ADN.pdf
replicación ADN.pdfreplicación ADN.pdf
replicación ADN.pdf
 
Brief history and development of metagenomics
Brief history and development of metagenomicsBrief history and development of metagenomics
Brief history and development of metagenomics
 
REVIEW THE STATUS OF GENOME ANALYSIS OF CULTURED ARCHAEA
REVIEW THE STATUS OF GENOME ANALYSIS OF CULTURED ARCHAEA REVIEW THE STATUS OF GENOME ANALYSIS OF CULTURED ARCHAEA
REVIEW THE STATUS OF GENOME ANALYSIS OF CULTURED ARCHAEA
 
New microsoft office power point presentation
New microsoft office power point presentationNew microsoft office power point presentation
New microsoft office power point presentation
 
Metagenomics by microbiology dept. panjab university2018copy
Metagenomics by microbiology dept. panjab university2018copyMetagenomics by microbiology dept. panjab university2018copy
Metagenomics by microbiology dept. panjab university2018copy
 
Nikola_Ivica_Thesis
Nikola_Ivica_ThesisNikola_Ivica_Thesis
Nikola_Ivica_Thesis
 
BS831_project_PantelianaIoannou
BS831_project_PantelianaIoannouBS831_project_PantelianaIoannou
BS831_project_PantelianaIoannou
 
CHARACTERIZATION OF SOIL AND LEAF ISOPRENE-DEGRADING COMMUNITIES AND THEIR CO...
CHARACTERIZATION OF SOIL AND LEAF ISOPRENE-DEGRADING COMMUNITIES AND THEIR CO...CHARACTERIZATION OF SOIL AND LEAF ISOPRENE-DEGRADING COMMUNITIES AND THEIR CO...
CHARACTERIZATION OF SOIL AND LEAF ISOPRENE-DEGRADING COMMUNITIES AND THEIR CO...
 
Metagenomics
MetagenomicsMetagenomics
Metagenomics
 
2019 - Profiling of filamentous bacteria in activated sludge by 16s RNA ampli...
2019 - Profiling of filamentous bacteria in activated sludge by 16s RNA ampli...2019 - Profiling of filamentous bacteria in activated sludge by 16s RNA ampli...
2019 - Profiling of filamentous bacteria in activated sludge by 16s RNA ampli...
 
Paper to Upload, MOLECULAR PHYLOGENY OF CATFISHES.pdf
Paper to Upload, MOLECULAR PHYLOGENY OF CATFISHES.pdfPaper to Upload, MOLECULAR PHYLOGENY OF CATFISHES.pdf
Paper to Upload, MOLECULAR PHYLOGENY OF CATFISHES.pdf
 
Molecular basis of inheritance
Molecular basis of inheritanceMolecular basis of inheritance
Molecular basis of inheritance
 
Metagenomics
MetagenomicsMetagenomics
Metagenomics
 

More from Jonathan Eisen

EVE198 Winter2020 Class 5 - COVID Vaccines
EVE198 Winter2020 Class 5 - COVID VaccinesEVE198 Winter2020 Class 5 - COVID Vaccines
EVE198 Winter2020 Class 5 - COVID Vaccines
Jonathan Eisen
 

More from Jonathan Eisen (20)

Eisen.CentralValley2024.pdf
Eisen.CentralValley2024.pdfEisen.CentralValley2024.pdf
Eisen.CentralValley2024.pdf
 
Phylogenomics and the Diversity and Diversification of Microbes
Phylogenomics and the Diversity and Diversification of MicrobesPhylogenomics and the Diversity and Diversification of Microbes
Phylogenomics and the Diversity and Diversification of Microbes
 
Talk by Jonathan Eisen for LAMG2022 meeting
Talk by Jonathan Eisen for LAMG2022 meetingTalk by Jonathan Eisen for LAMG2022 meeting
Talk by Jonathan Eisen for LAMG2022 meeting
 
Thoughts on UC Davis' COVID Current Actions
Thoughts on UC Davis' COVID Current ActionsThoughts on UC Davis' COVID Current Actions
Thoughts on UC Davis' COVID Current Actions
 
Phylogenetic and Phylogenomic Approaches to the Study of Microbes and Microbi...
Phylogenetic and Phylogenomic Approaches to the Study of Microbes and Microbi...Phylogenetic and Phylogenomic Approaches to the Study of Microbes and Microbi...
Phylogenetic and Phylogenomic Approaches to the Study of Microbes and Microbi...
 
A Field Guide to Sars-CoV-2
A Field Guide to Sars-CoV-2A Field Guide to Sars-CoV-2
A Field Guide to Sars-CoV-2
 
EVE198 Summer Session Class 4
EVE198 Summer Session Class 4EVE198 Summer Session Class 4
EVE198 Summer Session Class 4
 
EVE198 Summer Session 2 Class 1
EVE198 Summer Session 2 Class 1 EVE198 Summer Session 2 Class 1
EVE198 Summer Session 2 Class 1
 
EVE198 Summer Session 2 Class 2 Vaccines
EVE198 Summer Session 2 Class 2 Vaccines EVE198 Summer Session 2 Class 2 Vaccines
EVE198 Summer Session 2 Class 2 Vaccines
 
EVE198 Spring2021 Class1 Introduction
EVE198 Spring2021 Class1 IntroductionEVE198 Spring2021 Class1 Introduction
EVE198 Spring2021 Class1 Introduction
 
EVE198 Spring2021 Class2
EVE198 Spring2021 Class2EVE198 Spring2021 Class2
EVE198 Spring2021 Class2
 
EVE198 Spring2021 Class5 Vaccines
EVE198 Spring2021 Class5 VaccinesEVE198 Spring2021 Class5 Vaccines
EVE198 Spring2021 Class5 Vaccines
 
EVE198 Winter2020 Class 8 - COVID RNA Detection
EVE198 Winter2020 Class 8 - COVID RNA DetectionEVE198 Winter2020 Class 8 - COVID RNA Detection
EVE198 Winter2020 Class 8 - COVID RNA Detection
 
EVE198 Winter2020 Class 1 Introduction
EVE198 Winter2020 Class 1 IntroductionEVE198 Winter2020 Class 1 Introduction
EVE198 Winter2020 Class 1 Introduction
 
EVE198 Winter2020 Class 3 - COVID Testing
EVE198 Winter2020 Class 3 - COVID TestingEVE198 Winter2020 Class 3 - COVID Testing
EVE198 Winter2020 Class 3 - COVID Testing
 
EVE198 Winter2020 Class 5 - COVID Vaccines
EVE198 Winter2020 Class 5 - COVID VaccinesEVE198 Winter2020 Class 5 - COVID Vaccines
EVE198 Winter2020 Class 5 - COVID Vaccines
 
EVE198 Winter2020 Class 9 - COVID Transmission
EVE198 Winter2020 Class 9 - COVID TransmissionEVE198 Winter2020 Class 9 - COVID Transmission
EVE198 Winter2020 Class 9 - COVID Transmission
 
EVE198 Fall2020 "Covid Mass Testing" Class 8 Vaccines
EVE198 Fall2020 "Covid Mass Testing" Class 8 VaccinesEVE198 Fall2020 "Covid Mass Testing" Class 8 Vaccines
EVE198 Fall2020 "Covid Mass Testing" Class 8 Vaccines
 
EVE198 Fall2020 "Covid Mass Testing" Class 2: Viruses, COIVD and Testing
EVE198 Fall2020 "Covid Mass Testing" Class 2: Viruses, COIVD and TestingEVE198 Fall2020 "Covid Mass Testing" Class 2: Viruses, COIVD and Testing
EVE198 Fall2020 "Covid Mass Testing" Class 2: Viruses, COIVD and Testing
 
EVE198 Fall2020 "Covid Mass Testing" Class 1 Introduction
EVE198 Fall2020 "Covid Mass Testing" Class 1 IntroductionEVE198 Fall2020 "Covid Mass Testing" Class 1 Introduction
EVE198 Fall2020 "Covid Mass Testing" Class 1 Introduction
 

Recently uploaded

Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Sérgio Sacani
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Sérgio Sacani
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
levieagacer
 
Digital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxDigital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptx
MohamedFarag457087
 

Recently uploaded (20)

An introduction on sequence tagged site mapping
An introduction on sequence tagged site mappingAn introduction on sequence tagged site mapping
An introduction on sequence tagged site mapping
 
pumpkin fruit fly, water melon fruit fly, cucumber fruit fly
pumpkin fruit fly, water melon fruit fly, cucumber fruit flypumpkin fruit fly, water melon fruit fly, cucumber fruit fly
pumpkin fruit fly, water melon fruit fly, cucumber fruit fly
 
Factory Acceptance Test( FAT).pptx .
Factory Acceptance Test( FAT).pptx       .Factory Acceptance Test( FAT).pptx       .
Factory Acceptance Test( FAT).pptx .
 
Site Acceptance Test .
Site Acceptance Test                    .Site Acceptance Test                    .
Site Acceptance Test .
 
Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptxClimate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
Selaginella: features, morphology ,anatomy and reproduction.
Selaginella: features, morphology ,anatomy and reproduction.Selaginella: features, morphology ,anatomy and reproduction.
Selaginella: features, morphology ,anatomy and reproduction.
 
300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx
 
Velocity and Acceleration PowerPoint.ppt
Velocity and Acceleration PowerPoint.pptVelocity and Acceleration PowerPoint.ppt
Velocity and Acceleration PowerPoint.ppt
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort ServiceCall Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
 
Exploring Criminology and Criminal Behaviour.pdf
Exploring Criminology and Criminal Behaviour.pdfExploring Criminology and Criminal Behaviour.pdf
Exploring Criminology and Criminal Behaviour.pdf
 
Introduction of DNA analysis in Forensic's .pptx
Introduction of DNA analysis in Forensic's .pptxIntroduction of DNA analysis in Forensic's .pptx
Introduction of DNA analysis in Forensic's .pptx
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
 
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
 
Digital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxDigital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptx
 
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
 

Microbial Phylogenomics (EVE161) Class 14: Metagenomics

  • 1. EVE 161:
 Microbial Phylogenomics Era IV: Genomes from the Environment UC Davis, Winter 2016 Instructor: Jonathan Eisen !1
  • 2. Quiz Choose 2 of these 4 questions to answer. • 1. Explain how phylogenetic analysis was used in these papers • 2. Give an example of how the authors tried to predict the function of this new “proteorhodopsin” protein. • 3. Give an example of how the authors experimentally studied the function of this new “proteorhodopsin.” • 4. Why did they call this new protein “proteorhodopsin”?
  • 3. Metagenomics History • Term first use in 1998 paper by Handelsman et al. Chem Biol. 1998 Oct;5(10):R245-9. Molecular biological access to the chemistry of unknown soil microbes: a new frontier for natural products • Collective genomes of microbes in the soil termed the “soil metagenome” • Good review is Metagenomics: Application of Genomics to Uncultured Microorganisms by Handelsman Microbiol Mol Biol Rev. 2004 December; 68(4): 669–685.
  • 4. Papers for Today DOI: 10.1126/science.289.5486.1902 , 1902 (2000);289Science et al.Oded Béjà, Phototrophy in the Sea Bacterial Rhodopsin: Evidence for a New Type of This copy is for your personal, non-commercial use only. .clicking herecolleagues, clients, or customers by , you can order high-quality copies forIf you wish to distribute this article to others .herefollowing the guidelines can be obtaPermission to republish or repurpose articles or portions of articles (this information is current as of May 18, 2010 ): The following resources related to this article are available online at www.scien http://www.sciencemag.org/cgi/content/full/289/5486/1902 version of this article at: including high-resolution figures, can be found inUpdated information and services, found at: crelated to this articleA list of selected additional articles on the Science W eb sites 17 GPa. J. Geophys. Res. 102, 12252±12263 (1997). 28. Duffy, T. S. & Vaughan, M. T. Elasticity of enstatite and its relationships to crystal structure. J. Geophys. Res. 93, 383±391 (1988). Acknowledgements We thank C. Karger for participation in the measurements, and M. Ferrero and F. Boudier for providing the samples from Baldissero and Papua New Guinea, respectively. The Laboratoire de Tectonophysique's EBSD system was funded by the CNRS/INSU, Universite of Montpellier II, and NSF project ``Anatomy of an archean craton''. This work was supported by the CNRS/INSU programme ``Action TheÂmatique Innovante''. Correspondence should be addressed to A.T. (e-mail: deia@dstu.univ-montp2.fr). ................................................................. Proteorhodopsin phototrophy in the ocean Oded BeÂjaÁ*², Elena N. Spudich²³, John L. Spudich³, Marion Leclerc* & Edward F. DeLong* * Monterey Bay Aquarium Research Institute, Moss Landing, California 95039, USA ³ Department of Microbiology and Molecular Genetics, The University of Texas Medical School, Houston, Texas 77030, USA ² characteristic of cycle seen with E proteorhodopsin Furthermore, spectrum of the and same isosbes dopsin expressed chemical reactio generate a red-s to that of recomb ¯ash-photolysis existence of pro retinal molecules waters. The photocyc and, after hydro adding all-trans indeed derive fro –1 0 1 2 orbance(×10–3) 4 http://www.sciencemag.org/content/289/5486/1902.full http://www.nature.com/nature/journal/v411/n6839/abs/411786a0.html
  • 5. Papers for Today DOI: 10.1126/science.289.5486.1902 , 1902 (2000);289Science et al.Oded Béjà, Phototrophy in the Sea Bacterial Rhodopsin: Evidence for a New Type of This copy is for your personal, non-commercial use only. .clicking herecolleagues, clients, or customers by , you can order high-quality copies forIf you wish to distribute this article to others .herefollowing the guidelines can be obtaPermission to republish or repurpose articles or portions of articles (this information is current as of May 18, 2010 ): The following resources related to this article are available online at www.scien http://www.sciencemag.org/cgi/content/full/289/5486/1902 version of this article at: including high-resolution figures, can be found inUpdated information and services, found at: crelated to this articleA list of selected additional articles on the Science W eb sites http://www.sciencemag.org/content/289/5486/1902.full
  • 6. Extremely halophilic archaea contain retinal-binding integral membrane pro- teins called bacteriorhodopsins that function as light-driven proton pumps. So far, bacteriorhodopsins capable of generating a chemiosmotic membrane po- tential in response to light have been demonstrated only in halophilic archaea. We describe here a type of rhodopsin derived from bacteria that was discovered through genomic analyses of naturally occuring marine bacterioplankton. The bacterial rhodopsin was encoded in the genome of an uncultivated - proteobacterium and shared highest amino acid sequence similarity with archaeal rhodopsins. The protein was functionally expressed in Escherichia coli and bound retinal to form an active, light-driven proton pump. The new rhodopsin ex- hibited a photochemical reaction cycle with intermediates and kinetics characteristic of archaeal proton-pumping rhodopsins. Our results demonstrate that archaeal-like rhodopsins are broadly distributed among different taxa, including members of the domain Bacteria. Our data also indicate that a previously unsuspected mode of bacterially mediated light-driven energy generation may commonly occur in oceanic surface waters worldwide.
  • 7.
  • 8. SAR86 gene le ge- iden- roteo- from opsins erent. hereas philes r than rmine l, we a coli pres- rotein 3A). nes of popro- m was (Fig. at 520 band- erated odop- nce of dth is own transducer of light stimuli [for example, Htr (22, 23)]. Although sequence analysis of proteorhodopsin shows moderate statistical support for a specific relationship with sen- the kinetics of its photochemical reaction cy- cle. The transport rhodopsins (bacteriorho- dopsins and halorhodopsins) are character- ized by cyclic photochemical reaction se- !8
  • 9. The 16S ribosomal RNA neighbor-joining tree was constructed on the basis of 1256 homologous positions by the “neighbor” program of the PHYLIP package [J. Felsenstein, Methods Enzymol. 266, 418 (1996)]. DNA distances were calculated with the Kimura’s 2-parameter method.
  • 10. Cloning of proteorhodopsin. Sequence analysis of a 130-kb genomic fragment that encoded the ribosomal RNA (rRNA) operon from an uncultivated member of the marine g-Proteobacteria (that is, the “SAR86” group) (8, 9) (Fig. 1A) also revealed an open reading frame (ORF) encoding a putative rhodopsin (referred to here as proteorhodopsin) (10).
  • 11. Cloning of proteorhodopsin. Sequence analysis of a 130-kb genomic fragment that encoded the ribosomal RNA (rRNA) operon from an uncultivated member of the marine g-Proteobacteria (that is, the “SAR86” group) (8, 9) (Fig. 1A) also revealed an open reading frame (ORF) encoding a putative rhodopsin (referred to here as proteorhodopsin) (10).
  • 12.
  • 13.
  • 14. Marine Microbe Background • rRNA PCR studies of marine microbes have been extensive • Comparative analysis had revealed many lineages, some very novel, some less so, that were dominant in many, if not all, open ocean samples • Lineages given names based on specific clones: e.g., SAR11, SAR86, etc !14
  • 17.
  • 18. Vol. 57, No. 6APPLIED AND ENVIRONMENTAL MICROBIOLOGY, June 1991, p. 1707-1713 0099-2240/91/061707-07$02.00/0 Copyright C) 1991, American Society for Microbiology Phylogenetic Analysis of a Natural Marine Bacterioplankton Population by rRNA Gene Cloning and Sequencing THERESA B. BRITSCHGI AND STEPHEN J. GIOVANNONI* Department of Microbiology, Oregon State University, Corvallis, Oregon 97331 Received 5 February 1991/Accepted 25 March 1991 The identification of the prokaryotic species which constitute marine bacterioplankton communities has been a long-standing problem in marine microbiology. To address this question, we used the polymerase chain reaction to construct and analyze a library of 51 small-subunit (16S) rRNA genes cloned from Sargasso Sea bacterioplankton genomic DNA. Oligonucleotides complementary to conserved regions in the 16S rDNAs of eubacteria were used to direct the synthesis of polymerase chain reaction products, which were then cloned by blunt-end ligation into the phagemid vector pBluescript. Restriction fragment length polymorphisms and hybridizations to oligonucleotide probes for the SARll and marine Synechococcus phylogenetic groups indicated the presence of at least seven classes of genes. The sequences of five unique rDNAs were determined completely. In addition to 16S rRNA genes from the marine Synechococcus cluster and the previously identified but uncultivated microbial group, the SARll cluster [S. J. Giovannoni, T. B. Britschgi, C. L. Moyer, and K. G. Field, Nature (London) 345:60-63], two new gene classes were observed. Phylogenetic comparisons indicated that these belonged to unknown species of a- and -y-proteobacteria. The data confirm the earlier conclusion that a majority of planktonic bacteria are new species previously unrecognized by bacteriologists. Within the past decade, it has become evident that bacte- rioplankton contribute significantly to biomass and bio- geochemical activity in planktonic systems (4, 36). Until recently, progress in identifying the microbial species which constitute these communities was slow, because the major- ity of the organisms present (as measured by direct counts) cannot be recovered in cultures (5, 14, 16). The discovery a decade ago that unicellular cyanobacteria are widely distrib- uted in the open ocean (34) and the more recent observation of abundant marine prochlorophytes (3) have underscored the uncertainty of our knowledge about bacterioplankton community structure. Molecular approaches are now providing genetic markers for the dominant bacterial species in natural microbial pop- ulations (6, 21, 33). The immediate objectives of these studies are twofold: (i) to identify species, both known and novel, by reference to sequence data bases; and (ii) to construct species-specific probes which can be used in quantitative ecological studies (7, 29). Previously, we reported rRNA genetic markers and quan- titative hybridization data which demonstrated that a novel lineage belonging to the oa-proteobacteria was abundant in the Sargasso Sea; we also identified novel lineages which were very closely related to cultivated marine Synechococ- cus spp. (6). In a study of a hot-spring ecosystem, Ward and coworkers examined cloned fragments of 16S rRNA cDNAs and found numerous novel lineages but no matches to genes from cultivated hot-spring bacteria (33). These results sug- gested that natural ecosystems in general may include spe- cies which are unknown to microbiologists. Although any gene may be used as a genetic marker, rRNA genes offer distinct advantages. The extensive use of 16S rRNAs for studies of microbial systematics and evolu- tion has resulted in large computer data bases, such as the RNA Data Base Project, which encompass the phylogenetic diversity found within culture collections. rRNA genes are highly conserved and therefore can be used to examine distant phylogenetic relationships with accuracy. The pres- ence of large numbers of ribosomes within cells and the regulation of their biosyntheses in proportion to cellular growth rates make these molecules ideally suited for ecolog- ical studies with nucleic acid probes. Here we describe the partial analysis of a 16S rDNA library prepared from photic zone Sargasso Sea bacterio- plankton using a general approach for cloning eubacterial 16S rDNAs. The Sargasso Sea, a central oceanic gyre, typifies the oligotrophic conditions of the open ocean, pro- viding an example of a microbial community adapted to extremely low-nutrient conditions. The results extend our previous observations of high genetic diversity among closely related lineages within this microbial community and provide rDNA genetic markers for two novel eubacterial lineages. MATERIALS AND METHODS Collection and amplification of rDNA. A surface (2-m) bacterioplankton population was collected from hydrosta- tion S (32°4'N, 64°23'W) in the Sargasso Sea as previously described (8). The polymerase chain reaction (24) was used to produce double-stranded rDNA from the mixed-microbi- al-population genomic DNA prepared from the plankton sample. The amplification primers were complementary to conserved regions in the 5' and 3' regions of the 16S rRNA genes. The amplification primers were the universal 1406R (ACGGGCGGTGTGTRC) and eubacterial 68F (TNANAC ATGCAAGTCGAKCG) (17). The reaction conditions were as follows: 1 ,ug of template DNA, 10 ,ul of 10x reaction buffer (500 mM KCI, 100 mM Tris HCI [pH 9.0 at 250C], 15 mM MgCl2, 0.1% gelatin, 1% Triton X-100), 1 U of Taq DNA polymerase (Promega Biological Research Products, Madi- son, Wis.), 1 ,uM (each) primer, 200 ,uM (each) dATP, dCTP, dGTP, and dTTP in a 100-,ul total volume; 1 min at 94°C, 1 atUNIVOFCALIFDAVISonMay18,2010aem.asm.orgDownloadedfrom MARINE BACTERIOPLANKTON 1711 Escherichia coli Vibrio harveyi inmm Vibrio anguillarum - Oceanospirillum linum - SAR 92 Caulobacter crescentus SAR83 - Erythrobacter OCh 114 - Hyphomicrobium vulgare Rhodopseudomonas marina Rochalimaea quintana Agrobacterium tumefaciens SARI -Es SAR95 SAR1I CZ * C)0._ u 10C) 00 .0 0Q C) 0 0 O)4.. eP. - Liverwort CP *Anacystis nidulans - Prochlorothrix - SAR6 * SAR7 r SARIOO SAR139 0 p.0 0 0 0.05 0.10 0.15 FIG. 4. Phylogenetic tree showing relationships of the rDNA clones from the Sargasso Sea to representative, cultivated species (2, 6, 18, 30, 31, 32, 35, 40). Positions of uncertain homology in regions containing insertions and deletions were omitted from the analysis. Evolutionary distances were calculated by the method of Jukes and Cantor (15), which corrects for the effects of superimposed mutations. The phylogenetic tree was determined by a distance matrix method (20). The tree was rooted with the sequence of Bacillus subtilis (38). Sequence data not referenced were provided by C. R. Woese and R. Rossen. phototroph phyla were conserved in the corresponding 880 (G - C to C G). We note with caution that some classes VOL. 57, 1991 I I atUNIVaem.asm.orgDownloadedfrom
  • 19. JOURNAL OF BACTERIOLOGY, JUlY 1991, p. 4371-4378 0021-9193/91/144371-08$02.00/0 Copyright C) 1991, American Society for Microbiology Vol. 173, No. 14 Analysis of a Marine Picoplankton Community by 16S rRNA Gene Cloning and Sequencing THOMAS M. SCHMIDT,t EDWARD F. DELONG,t AND NORMAN R. PACE* Department ofBiology and Institute for Molecular and Cellular Biology, Indiana University, Bloomington, Indiana 47405 Received 7 January 1991/Accepted 13 May 1991 The phylogenetic diversity of an oligotrophic marine picoplankton community was examined by analyzing the sequences of cloned ribosomal genes. This strategy does not rely on cultivation of the resident microorganisms. Bulk genomic DNA was isolated from picoplankton collected in the north central Pacific Ocean by tangential flow filtration. The mixed-population DNA was fragmented, size fractionated, and cloned into bacteriophage lambda. Thirty-eight clones containing 16S rRNA genes were identified in a screen of 3.2 x 104 recombinant phage, and portions of the rRNA gene were amplified by polymerase chain reaction and sequenced. The resulting sequences were used to establish the identities of the picoplankton by comparison with an established data base of rRNA sequences. Fifteen unique eubacterial sequences were obtained, including four from cyanobacteria and eleven from proteobacteria. A single eucaryote related to dinoflagellates was identified; no archaebacterial sequences were detected. The cyanobacterial sequences are all closely related to sequences from cultivated marine Synechococcus strains and with cyanobacterial sequences obtained from the Atlantic Ocean (Sargasso Sea). Several sequences were related to common marine isolates of the y subdivision of proteobacteria. In addition to sequences closely related to those of described bacteria, sequences were obtained from two phylogenetic groups of organisms that are not closely related to any known rRNA sequences from cultivated organisms. Both of these novel phylogenetic clusters are proteobacteria, one group within the jb.asmDownloadedfrom
  • 20. INSIGHT REVIEW NATURE|Vol 437|15 September 2005 Crenarchaeota Group I Archaea Euryarchaeota Group II Archaea Group III Archaea Group IV Archaea α-Proteobacteria * SAR11 - Pelagibacter ubique * Roseobacter clade OCS116 ß-Proteobacteria * OM43 µ-Proteobacteria SAR86 * OMG Clade * Vibrionaeceae * Pseudoalteromonas * Marinomonas * Halomonadacae * Colwellia * Oceanospirillum δ-Proteobacteria Cyanobacteria * Marine Cluster A (Synechococcus) * Prochlorococcus sp. Lentisphaerae * Lentisphaera araneosa Bacteroidetes Marine Actinobacteria Fibrobacter SAR406 Planctobacteria Chloroflexi SAR202 Archaea Bacteria Figure 1| Schematic illustration of the phylogeny of the major plankton clades.Black letters indicate microbial groups that seem to be ubiquitous in seawater. Gold indicates groups found in the photic zone. Blue indicates groups confined to the mesopelagic and surface waters during polar winters. Green indicates microbial groups associated with coastal ocean ecosystems. %&'())%#*+,-###./*/!0##*1""#23##4(56#,! 343 Molecular diversity and ecology of microbial plankton Stephen J. Giovannoni1 & Ulrich Stingl1 The history of microbial evolution in the oceans is probably as old as the history of life itself. In contrast to terrestrial ecosystems, microorganisms are the main form of biomass in the oceans, and form some of the largest populations on the planet. Theory predicts that selection should act more efficiently in large populations. But whether microbial plankton populations harbour organisms that are models of adaptive sophistication remains to be seen. Genome sequence data are piling up, but most of the key microbial plankton clades have no cultivated representatives, and information about their ecological activities is sparse. cultivation of key organisms, metagenomics and ongoing biogeo- chemical studies. It seems very likely that the biology of the dominant microbial plankton groups will be unravelled in the years ahead. Here we review current knowledge about marine bacterial and archaeal diversity, as inferred from phylogenies of genes recovered from the ocean water column, and consider the implications of micro- bial diversity for understanding the ecology of the oceans. Although we leave protists out of the discussion, many of the same issues apply to them. Some of the studies we refer to extend to the abyssal ocean, but we focus principally on the surface layer (0–300 m) — the region of highest biological activity. Phylogenetic diversity in the ocean Small-subunit ribosomal (RNA) genes have become universal phylo- genetic markers and are the main criteria by which microbial plank- ton groups are identified and named9 . Most of the marine microbial groups were first identified by sequencing rRNA genes cloned from seawater10–14 , and remain uncultured today. Soon after the first reports came in, it became apparent that less than 20 microbial clades accounted for most of the genes recovered15 . Figure 1 is a schematic illustration of the phylogeny of these major plankton clades. The taxon names marked with asterisks represent groups for which cultured iso- lates are available. The recent large-scale shotgun sequencing of seawater DNA is pro- viding much higher resolution 16S rRNA gene phylogenies and bio- geographical distributions for marine microbial plankton. Although the main purpose of Venter’sSorcerer IIexpedition is to gather whole- genome shotgun sequence (WGS) data from planktonic microorgan- isms16 , thousands of water-column rRNA genes are part of the by-catch. The first set of collections, from the Sargasso Sea, have yielded 1,184 16S rRNA gene fragments. These data are shown in Fig. 2, organized by clade structure. Such data are a rich scientific resource for two reasons. First, they are not tainted by polymerase chain reaction (PCR) artefacts; PCR artefacts rarely interfere with the correct placement of genes in phylogenetic categories, but they are a major problem for reconstructing evolutionary patterns at the popu- lation level17 . Second, the enormous number of genes provided by the Sorcerer II expedition is revealing the distribution patterns and abun- dance of microbial groups that compose only a small fraction of the Certain characteristics of the ocean environment — the prevailing low-nutrient state of the ocean surface, in particular — mean it is sometimes regarded as an extreme ecosystem. Fixed forms of nitrogen, phosphorus and iron are often at very low or undetectable levels in the ocean’s circulatory gyres, which occur in about 70% of the oceans1 . Photosynthesis is the main source of metabolic energy and the basis of the food chain; ocean phytoplankton account for nearly 50% of global carbon fixation, and half of the carbon fixed into organic matter is rapidly respired by heterotrophic microorganisms. Most cells are freely suspended in the mainly oxic water column, but some attach to aggre- gates. In general, these cells survive either by photosynthesizing or by oxidizing dissolved organic matter (DOM) or inorganic compounds, using oxygen as an electron acceptor. Microbial cell concentrations are typically about 105 cells mlǁ1 in the ocean surface layer (0–300 m) — thymidine uptake into microbial DNA indicates average growth rates of about 0.15 divisions per day (ref. 2). Efficient nutrient recycling, in which there is intense competi- tion for scarce resources, sustains this growth, with predation by viruses and protozoa keeping populations in check and driving high turnover rates3 . Despite this competition, steady-state dissolved organic carbon (DOC) concentrations are many times higher than carbon sequestered in living microbial biomass4 . However, the average age of the DOC pool in the deep ocean, of about 5,000 years5 (deter- mined by isotopic dating), suggests that much of the DOM is refrac- tory to degradation. Although DOM is a huge resource, rivalling atmospheric CO2 as a carbon pool6 , chemists have been thwarted by the complexity of DOM and have characterized it only in broad terms7 . The paragraphs above capture prominent features of the ocean environment, but leave out the complex patterns of physical, chemical and biological variation that drive the evolution and diversification of microorganisms. For example, members of the genusVibrio — which include some of the most common planktonic bacteria that can be iso- lated on nutrient agar plates — readily grow anaerobically by fermen- tation. The life cycles of some Vibrio species have been shown to include anoxic stages in association with animal hosts, but the broad picture of their ecology in the oceans has barely been characterized8 . The story is similar for most of the microbial groups described below: the phylogenetic map is detailed, but the ecological panorama is thinly sketched. New information is rapidly flowing into the field from the 1 Department of Microbiology, Oregon State University, Corvallis, Oregon 97331, USA. NaturePublishing Group©2005 !20
  • 21. The ecoty many micro tions in the w best exampl entiated by adapted (hig Phylogeneti gests that th cally distinct Cluster A S of which ca characterist urobilin)37,3 ample suppo teristics that SS120 has a ammonium extreme, Syn nitrate, cyan interesting t seem to pro whereby nu conditions. seasonal spe lular cyanob The obse libraries prepared using PCR methods that reduced sequence arte- facts17 . They concluded that most sequence variation was clustered SAR11 subgroups Ia + Ib SAR86 (γ-Proteobacteria) SAR11 subgroup II M arine Picophytoplankton Uncultured α-Proteobacteria Clade SAR406 (Fibrobacter) Bacteroidetes SAR324 (δ-Proteobacteria) M arine Actinobacteria Alterom onas/Pseudoalterom onas %of16SrRNAsequences 0 5 10 15 20 25 30 35 SAR116 (α-Proteobacteria) Rheinheim era Roseobacter clade SAR202 (Chloroflexi) Phylogenetic clade !21
  • 22. DeLong Lab • Studying Sar86 and other marine plankton • Note - published one of first genomic studies of uncultured microbes - in 1996 JOURNAL OF BACTERIOLOGY, Feb. 1996, p. 591–599 Vol. 178, No. 3 0021-9193/96/$04.00 0 Copyright 1996, American Society for Microbiology Characterization of Uncultivated Prokaryotes: Isolation and Analysis of a 40-Kilobase-Pair Genome Fragment from a Planktonic Marine Archaeon JEFFEREY L. STEIN,1 * TERENCE L. MARSH,2 KE YING WU,3 HIROAKI SHIZUYA,4 AND EDWARD F. DELONG3 * Recombinant BioCatalysis, Inc., La Jolla, California 920371 ; Microbiology Department, University of Illinois, Urbana, Illinois 618012 ; Department of Ecology, Evolution and Marine Biology, University of California, Santa Barbara, California 931063 ; and Division of Biology, California Institute of Technology, Pasadena, California 911254 Received 14 July 1995/Accepted 14 November 1995 One potential approach for characterizing uncultivated prokaryotes from natural assemblages involves genomic analysis of DNA fragments retrieved directly from naturally occurring microbial biomass. In this study, we sought to isolate large genomic fragments from a widely distributed and relatively abundant but as yet uncultivated group of prokaryotes, the planktonic marine Archaea. A fosmid DNA library was prepared from a marine picoplankton assemblage collected at a depth of 200 m in the eastern North Pacific. We identified a 38.5-kbp recombinant fosmid clone which contained an archaeal small subunit ribosomal DNA gene. Phylogenetic analyses of the small subunit rRNA sequence demonstrated its close relationship to that of previously described planktonic archaea, which form a coherent group rooted deeply within the Crenarchaeota Downloadedfrom
  • 23. Delong Lab isolated from fosmid clones arious restriction endonucle- , the F-factor-based vector osmid subfragments. Partial striction enzyme to 1 ⇥g of re. The reaction mixture was oved at 10, 40, and 60 min. 1 ⇥l of 0.5 M EDTA to the e partially digested DNA was ribed above except using a 1- es of the separated fragments dards. The distances of the promoter sites on the excised of T7- or SP6-specific oligo- d hybridizing with Southern d and pBAC clones digested ed with labeled T7 and SP6 ubclones and PCR fragments sequencing described above. mates from the partial diges- fosmids and their subclones. DeSoete distance (9) analyses ng GDE 2.2 and Treetool 1.0, P) (23). DeSoete least squares rwise evolutionary distances, count for empirical base fre- d from the RDP, version 4.0 A sequences were performed DP. For distance analyses of ary distances were estimated topology was inferred by the tion and global branch swap- n sequences, the Phylip pro- ion and ordinary parsimony sequences reported in Table ollowing accession numbers: 3, U40244, and U40245. The FIG. 1. Flowchart depicting the construction and screening of an environ- mental library from a mixed picoplankton sample. MW, molecular weight; PFGE, pulsed-field gel electrophoresis. GENOMIC FRAGMENTS FROM PLANKTONIC MARINE ARCHAEA 593 atUNjb.asm.orgDownloadedfrom !23
  • 24. Delong Lab FIG. 4. High-density filter replica of 2,304 fosmid clones containing approx- imately 92 million bp of DNA cloned from the mixed picoplankton community. The filter was probed with the labeled insert from clone 4B7 (dark spot). The lack of other hybridizing clones suggests that contigs of 4B7 are absent from this portion of the library. Similar experiments with the remainder of the library yielded similar results. J. BACTERIOL. Downl !24
  • 25. SAR86 has Proteorhodopsin in Its Genome hodop- ence of width is orption he red- m in the d Schiff y to the was de- n a cell d trans- proteor- only in g. 4A). um was of a 10 arbonyl Illumi- l poten- ht-side- of reti- ht onset proteo- pable of ysiolog- ctivities ntaining oteorho- n to be Fig. 1. (A) Phylogenetic tree of bacterial 16S rRNA gene sequences, including that encoded on the 130-kb bacterioplankton BAC clone (EBAC31A08) (16). (B) Phylogenetic analysis of proteorhodop- sin with archaeal (BR, HR, and SR prefixes) and Neurospora crassa (NOP1 prefix) rhodopsins (16). Nomenclature: Name_Species.abbreviation_Genbank.gi (HR, halorhodopsin; SR, sensory rhodopsin; BR, bacteriorhodopsin). Halsod, Halorubrum sodomense; Halhal, Halobacterium salinarum (halo- bium); Halval, Haloarcula vallismortis; Natpha, Natronomonas pharaonis; Halsp, Halobacterium sp; Neucra, Neurospora crassa. Downloadedfrom !25
  • 26. The proteorhodopsin tree was constructed on the basis of the archaeal rhodopsin alignment used by Mukohata et al. [Y. Mukohata, K. Ihara, T. Tamura, Y. Sugiyama, J. Bio- chem. ( Tokyo) 125, 649 (1999)] and the neighbor- joining method as implemented in the neighbor pro- gram of the PHYLIP package. The numbers at forks indicate the percent of bootstrap replications (out of 1000) in which the given branching was observed. The least-square method (the Kitsch program of PHYLIP) and the maximum likelihood method (the Puzzle program) produced trees with essentially identical topologies, whereas the protein parsimony method (the Protpars program of PHYLIP) placed proteorhodopsin within the sensory rhodopsin cluster. The Yro/Hsp30 subfamilies sequences (6) were omitted from the phylogenetic analysis to avoid the effect of long branch attraction.
  • 27.
  • 28.
  • 29.
  • 30.
  • 31.
  • 32.
  • 33. 1 ttgttatatc agtaatggct attgctccaa taacttaata ctaatatata attagtttat 61 gaataaattt tatatatttg ggttattgtt ttttacacta aatgcatttt cttgctcaga 121 tcttctagat acagacatga gagttcttga ttccgctgag tcaagaaacc tttgcgagtt 181 tgaaggaaaa gctttactag ttgtgaatgt tgcaagtaga tgtggttaca cttatcaata 241 tgctggcctt caaaagttat atgaaagtta taaagatgaa gattttctag taattgggat 301 cccatctaga gattttcttc aagaatactc tgatgaaagc gatgttgcag aattttgttc 361 tacagaatac ggtgttgaat ttcctatgtt ctcaactgct aaagtcaaag gaaaaaaagc 421 acacccattt tataaaaaac ttattgcaga atcaggtttt actccctcat ggaactttaa 481 taaatactta atctcaaaag agggcaaggt tgtatccaca tatggatcaa aggtaaagcc 541 tgattcaaaa gagcttatat cagctataga aggcttgctg taaaattatt acttagaaac 601 taatacagtt ttaggcttgt ttgctgcaaa tattccatta tctacaactc caggaatatt 661 attaatcaaa gcttccattt cagtggggtt tgaaatatcc atattagaga tatctaaaat 721 gtgattacct tggtctgtta taaatccagt tctatatgtg ggtattccac cgatcgagat 781 tatttttctt gcaacaaggc tcctactttc aggtatcacc tctataggca gtggaaaagc 841 tcccaaaaga ttaaccatct ttgactgatc aactatacat ataaactcgt tagaggcaga 901 agcaactatc ttttctctag tatgtgcgcc accaccacct ttaataagac aattttcagg 961 agacacctca tctgcaccat ctatgtaata agctatatca actacatcat taaggctaaa 1021 gacctctatc ccattttcat ttaataattt tgatgaagca tctgaactag aaacagctcc 1081 agcaaatttg tgcctatgct cctttagttc ttctataaaa aaattaactg ttgagccggt 1141 tccaatacct aaaatcatct caggatgaag attatttttg atatattcta tagcttgttt 1201 agcaacattt atctttgagc cactcataga gttataatac aagaaaatat aggtagttaa 1261 ttattttgag actaaaaatt aaaaaaacag gttcttttaa gaattcccag aagtacctaa 1321 agcttattcc taatgcatct gtttatgacg tagcattaaa atcacctata acatttgctc 1381 taaatatttc ttcaaagctg gggaataaag ttttcctaaa aagagaggat ctgcaaccta 1441 tattttcttt taaaaacaga ggagcgtata acaagattgt aaatttatcc gatgccgaaa 1501 agaagagggg ggttattgct gcatcagcag gaaatcatgc tcaaggggta gccagtgcat 1561 gtaagaaatt aaaaattaat tgcttgatag ttatgccaat aacaactcca gaaataaaaa 1621 taaaagatgt aaaaagattt ggagccaaaa tactccaaca tggggacaac gtagatgcag 1681 cattaaaaga ggcactgttt attgcaaaga aaaaaaaatt gtcttttgtt catccttttg 1741 acgaccctct aacaattgct ggccaaggga ctataggaca agaaattctt gaagataaaa 1801 ataattttga tgttgtcttt gttccggtgg gaggaggagg tattctagct ggtgtatctg 1861 cctggatagc acagaataat aagaaaataa aaattgttgg tgttgaggtt gaggattccg 1921 cttgtcttgc tgaggccgta aaagctaata aaagagttat tttaaaagaa gtgggcctct 1981 ttgctgatgg ggtggcagta tcaagggttg gaaaaaataa ttttgatgtt attaaagagt 2041 gcgtagatga agtcattaca gttagcgttg atgaggtctg caccgctgta aaagatatct 2101 ttgaagatac aagggttcta tcagaacctg ctggggcatt agcacttgca gggttaaaag 2161 cctacgcaag gaaagttaaa aataaaaaac ttattgctat aagttctggc gctaatgtaa 2221 atttccaaag acttaatttt attgttgagc gatcagagat tggtgaaaat agagaaaaaa 2281 tattaagtat caaaatccca gagatacctg gaagttttct taagctttca aggatgtttg 2341 gcagctctca agttacagag tttaactaca ggaaatctag cttaagcgat gcatatgttt 2401 tagttggtgt tagaactaaa actgaaaaat catttgaaat cttaaagtcc aaattaaaaa 2461 aagcaggctt cacctttagc gactttactc gaaatgaaat atccaatgat catctgaggc 2521 atatggttgg tggcagaaat agtgactcag gctctcataa caatgaaaga atatttaggg 2581 gagagtttcc tgagaagccg ggcgcgctgt taaattttct agagaaattt ggaaataaat 2641 ggmatatttc cttatttcat tacaggaacc taggttcagc ttttggaaag atattaattg 2701 gcatcgagag taaggataaa gacaagctaa taaatcattt aaataagtca ggnactattt 2761 ttacagaaga aacctctaac aaggcataca aagatttttt aaaatgaaag gttaatactt 2821 taatctaaat ttaattgaaa aaagctcatc gctagggttt tcccacggct ctttgaacaa 2881 ctcggattga gatctatcat cctcctcgtc gtaaattctc ccacctttag aatagaccaa 2941 aaatagatat gacaaaggag cgagctcata tttatatcta atttggaacg aagctacgcc 3001 agtattaaat tcattaacta tattatttcc tttataaagg tatccattag catctgaaaa 3061 aatactaata ggattttttg ctttcaaagc aacaaattga ctcttaagtc tgatctcatg 3121 tttattattt ttaaaccaat ttagatcaaa agataaggta tcttgtcttg aatcatatga 3181 ggcaagatta ttattatcct gccatatcag ccattcattt tcctttctta ttctatattg 3241 cgcattgatt cttaagttat catttggaaa tattgaacct gctatcttgt aaaattttct 3301 accataccca ttcgaatccc accgattatc tttctctccc ttaaagaagc taactctcca 3361 gtcatatgtc cagaatgaat agttctttgc ctcaaagtct gctgtaatac ctattcttct 3421 ttttgatttg ataaaggggt aggcttcatt ttttcttgtg atagttgtat ttttcccaga 3481 agatctaaag ttaaaatcta attgaaattt agagttgtcc ttaaaactaa aagaattttt 3541 ttgatcgatg cctattggat ttgaattacc agctgtgtca gcatcataat ttagatcaat 3601 tccataatct atttgtttta atatcgagct attatcaaat tcatttattt ttcgatttcc 3661 tccaatacca gcatgaatcc agtctcttct ttgcagataa ccaaagtcat ttaactcgaa 3721 gtcgtcttca aaataaagaa ggcttccact tatgttagat agtttatttg gaagatatgt 3781 aaactgagtc ctatacccaa gcccattttt accatctttt tctgaggcta acagatctga 3841 atatgtaatt aatttttttg aacgaatatt gatgtaatca ataacattga ctgttgatga 3901 ttcgcctgtc atttcattct caacattcgt taccatgaag ccaagcgttt tatttccaag 3961 ctttgttcga gatcgaagag cataatagtc tcttccaact gaaaaggctt catcagcctc 4021 acttgctaca aatactccaa attcattatt attacttttt tgagtaagtc ttaatgcaaa 4081 atcaatatca gaatagtttt ttttagctgc ctcgcaacct tcttcattac tttcttctga 4141 gcaattatag ctgggggcag ctccaatcct ccgcgtattt ataaccgagt atctatcata 4201 attactaata tcaaatagtg attggttttc attgaaaaat gctctttttt ctgagtaaaa 4261 agtttcttga gcagaaaagt taataaccac atcatcactc tcagcttgtc cgaaatctgg 4321 attaatagct aaatttattt gacgaccttt tccagtgcta taaaagattt cagccccaat 4381 atctgaccct tcttggtttg taactgaatt tttatttgaa gatatatatg gaaaaaaagt 4441 aagctttgat tttgtatagt tttgtatttc taagctatct aactcttgaa agtagtcatt 4501 tctactagcc attgttccgg cactgctaac ccatgactca ttttcggcac tataacgtaa 4561 tgcggtgtaa ttaatttttc ttatatcacc atcaggctgt ttcattaacg ttacatccca 4621 aggaataaaa aactcagaga cccaataccc atcaaatttt tgtgtttttg caatccaatc 4681 tccatcccag tctgtcttaa agtctcctgc ttgcgttttt atggcatcga aaagcgagtt 4741 cccaagattt atagcaagaa tgaaagcttt gttaccatca ccatcaaagt ctatatttat 4801 agagttttta tcgctaagtg aatttatttg atctctaagc gtcttcctgg agaacataga 4861 atcattactt tgaaaattct taaatccaac atatatacca tccttatttg agaaaattaa 4921 agccgttgta agaagttcat ttttcttaag agtaaaagga gatgtttcat aaaaatctgt 4981 aatttcaaat gcattattcc actcaggctc atcaagagag ccatcaataa caattgagtt 5041 tgaccaaatc agcacagagg taagtaaaag tgataatgag gctaaagtta ttttcataga 5101 tagatttaat tcagagtatt ttaatctatg aagagtagga aatcatccga tcattctcag 5161 aaaaaacata atcatattta aggtcatgtt tttctccaaa atgcatatta caaatctggt 5221 attgataaca cagtccaacc aataaaggtc ttgatacctc ctcgtttata gagccaataa 5281 ttctatcgaa atatccagag ccatagccaa gcctgtatcc atttaaatca actcctgtca 5341 taggaataaa cattaaatca atttcattta tgttgacata atcttcactt ttaacctctt
  • 34. cat /Users/jeisen/Desktop/sequence.gb.txt | grep product /product="predicted threonine dehydratase" /product="predicted secreted protein" /product="predicted 5-formyltetrahydrofolate cyclo-ligase" /product="predicted Xaa-Pro aminopeptidase" /product="predicted quinone biosynthesis monooxygenase" /product="predicted 50S ribosomal protein L28" /product="predicted deoxyuridine 5'-triphosphate /product="predicted primosomal protein N' (replication /product="predicted N-acetylmuramoyl-L-alanine amidase" /product="predicted kinase of the phosphomethylpyrimidine /product="predicted kinase of P-loop ATPase superfamily" /product="predicted eukaryote-type paraoxonase" /product="predicted alpha/beta hydrolase" /product="predicted MutT superfamily hydrolase" /product="predicted metallobeta lactamase fold protein" /product="predicted acetyl-coenzyme A synthetase" /product="predicted deoxypurine kinase" /product="predicted poly(A) polymerase" /product="predicted DskA family protein with conserved /product="predicted ribokinase family sugar kinase" /product="predicted glutamate-1-semialdehyde /product="predicted enzyme with conserved cysteine" /product="predicted metallopeptidase of the G-G peptidase /product="predicted proline dehydrogenase" /product="predicted tyrosyl-tRNA synthetase" /product="predicted prolyl endopeptidase" /product="tRNA-Ile" /product="predicted unknown protein" /product="predicted cation efflux pump" /product="predicted TonB-dependent receptor protein" /product="predicted conserved membrane protein" /product="predicted LicA, choline kinase involved in LPS /product="predicted NAD-dependent formate dehydrogenase" /product="predicted 2-nitropropane dioxygenase" /product="predicted oxidoreductase" /product="predicted CsgA, Rossman fold oxidoreductase" /product="predicted acid--CoA ligase fadD13" /product="proteorhodopsin" /product="predicted phosphopantetheine /product="predicted formamidopyrimidine-DNA glycosylase" /product="predicted outer membrane protein TolC" /product="predicted topoisomerase IV subunit B" /product="predicted topoisomerase IV subunit A" /product="predicted succinylornithine aminotransferase" /product="predicted 6-phosphofructokinase" /product="predicted /product="predicted ORF" /product="predicted penicillin-binding protein 2" /product="predicted membrane-bound lytic transglycosylase" /product="predicted penicillin-binding protein 6 /product="predicted D-alanine aminotransferase" /product="predicted DNA polymerase III, delta subunit" /product="predicted leucyl-tRNA synthetase (leuS)" /product="predicted acetyl-CoA carboxylase, biotin /product="predicted esterase of alpha/beta hydrolase fold" /product="predicted ABC transporter ATPase subunit" /product="predicted dihydroxyacid dehydratase" /product="predicted porphobilinogen synthase" /product="predicted transcription termination factor Rho" /product="predicted porphobilinogen deaminase" /product="predicted transporter (ACRA/ACRE type)" /product="predicted cation efflux system (AcrB/AcrD/AcrF /product="predicted secreted lipoprotein" /product="predicted ribosomal large subunit pseudouridine /product="predicted acetolactate synthase III large chain" /product="predicted ketol-acid reductoisomerase" /product="16S ribosomal RNA" /product="23S ribosomal RNA" /product="predicted YacE family of P-loop kinases" /product="predicted preprotein translocase secA subunit"
  • 35.
  • 36. The inferred amino acid sequence of the proteorho- dopsin showed statistically significant similarity to archaeal rhodopsins (11).
  • 37. The inferred amino acid sequence of the proteorho- dopsin showed statistically significant similarity to archaeal rhodopsins (11).
  • 38. SAR86 has Proteorhodopsin in Its Genome generated eorhodop- resence of ndwidth is absorption . The red- nm in the ated Schiff ably to the on was de- s in a cell ward trans- in proteor- nd only in (Fig. 4A). edium was ce of a 10 re carbonyl 19). Illumi- ical poten- right-side- nce of reti- light onset hat proteo- capable of physiolog- e activities containing proteorho- main to be Fig. 1. (A) Phylogenetic tree of bacterial 16S rRNA gene sequences, including that encoded on the 130-kb bacterioplankton BAC clone (EBAC31A08) (16). (B) Phylogenetic analysis of proteorhodop- sin with archaeal (BR, HR, and SR prefixes) and Neurospora crassa (NOP1 prefix) rhodopsins (16). Nomenclature: Name_Species.abbreviation_Genbank.gi (HR, halorhodopsin; SR, sensory rhodopsin; BR, bacteriorhodopsin). Halsod, Halorubrum sodomense; Halhal, Halobacterium salinarum (halo- bium); Halval, Haloarcula vallismortis; Natpha, Natronomonas pharaonis; Halsp, Halobacterium sp; Neucra, Neurospora crassa. wDownloadedfrom !38
  • 40. Bacteriorhodopsin and its relatives Figure 3. Phylogenetic tree based on the amino acid sequences of 25 archaeal rhodopsins. (a) NJ-tree. The numbers at each node are clustering probabilities generated by bootstrap resampling 1000 times. D1 and D2 represent gene duplication points. The four shaded rectangles indicate the speciation dates when halobacteria speciation occurred at the genus level. (b) ML-tree. Log likelihood value for ML-tree was −6579.02 (best score) and that for topology of the NJ-tree was −6583.43. The stippled bars indicate the 95% confidence limits. Both trees were tentatively rooted at the mid-point of the longest distance, although true root positions are unknown. From Ihara et al. 1999 !40
  • 42. Proteorhodopsin Predicted Secondary Structure Fig. 2. Secondary structure of proteo- rhodopsin. Single- letter amino acid codes are used (33), and the numbering is as in bacteriorho- dopsin. Predicted retinal binding pock- et residues are marked in red. R E S E A R C H A R T I C L E S !42
  • 43. The amino acid residues that form a retinal binding pocket in archaeal rhodopsins are also highly conserved in proteorhodopsin (Fig. 2). In par- ticular, the critical lysine residue in helix G, which forms the Schiff base linkage with retinal in archaeal rhodopsins, is present in proteorhodopsin. Analysis of a structural model of proteorhodopsin (14 ), in conjunction with multiple sequence alignments, indicates that the majority of active site residues are well conserved between proteorhodopsin and archaeal bacteriorhodopsins (15).
  • 44.
  • 45. The amino acid residues that form a retinal binding pocket in archaeal rhodopsins are also highly conserved in proteorhodopsin (Fig. 2). In par- ticular, the critical lysine residue in helix G, which forms the Schiff base linkage with retinal in archaeal rhodopsins, is present in proteorhodopsin. Analysis of a structural model of proteorhodopsin (14 ), in conjunction with multiple sequence alignments, indicates that the majority of active site residues are well conserved between proteorhodopsin and archaeal bacteriorhodopsins (15).
  • 46.
  • 47. Proteorhodopsin in E. coli duce in th occu pigm and ed at tiona sorp in 0. sorp botto nated retin ms d deca shift appe term cay o step singl upw amp gene with ms p reco phot prod Fig. 3. (A) Proteorhodopsin-expressing E. coli cell suspension (ϩ) compared to control cells (Ϫ), both with all-trans retinal. (B) Absorption spectra of retinal-reconstituted proteorhodopsin in E. coli membranes (17). A time series of spectra is shown for reconstituted proteorhodopsin membranes (red) and a negative control (black). Time points for spectra after retinal addition, progressing from low to high absorbance values, are 10, 20, 30, and 40 min. !47
  • 48. Fig. 3. (A) Proteorhodopsin-expressing E. coli cell suspension (+) compared to control cells (-), both with all-trans retinal. (B) Absorption spectra of retinal-reconstituted proteorhodopsin in E. coli membranes (17). A time series of spectra is shown for reconstituted proteorhodopsin membranes (red) and a negative control (black). Time points for spectra after retinal addition, progressing from low to high absorbance values, are 10, 20, 30, and 40 min.
  • 49. Proteorhodopsin was amplified from the 130-kb bac- terioplankton BAC clone 31A08 by polymerase chain reaction (PCR), using the primers 5 - ACCATGGGTA- AAT TAT TACTGATAT TAGG-3 and 5 -AGCAT TA- GAAGAT TCT T TAACAGC-3 . The amplified fragment was cloned with the pBAD TOPO TA Cloning Kit (Invitrogen). The protein was cloned with its native signal sequence and included an addition of the V5 epitope and a polyhistidine tail in the COOH-terminus. The same PCR product in the opposite orientation was used as a negative control. The protein was expressed in the E. coli outer membrane protease- deficient strain UT5600 [M. E. Elish, J. R. Pierce, C. F. Earhart, J. Gen. Microbiol. 134, 1355 (1988)] and induced with 0.2% arabinose for 3 hours. Membranes were prepared according to Shimono et al. [K. Shi- mono, M. Iwamoto, M. Sumi, N. Kamo, FEBS Lett. 420, 54 (1997)] and resuspended in 50 mM tris-Cl (pH 8.0) and 5 mM MgCl2. The absorbance spectrum was measured according to Bieszke et al. (7) in the presence of 10 M all-trans retinal.
  • 50. Proteorhodopsin function geneity in with 87% o ms photocy recovery. A photocycle produces a b this alternat well as a tw component, the total am with a 45-m amplitude). cycle rate, w teristic of strong evid tions as a tr rhodopsin. Implicat harbor the tributed in bacteria hav ture-indepen oceanic reg Oceans, as (8, 25–29). tribution, pr this ␥-prote both with all-trans retinal. (B) Absorption spectra of retinal-reconstituted proteorhodopsin in E. coli membranes (17). A time series of spectra is shown for reconstituted proteorhodopsin membranes (red) and a negative control (black). Time points for spectra after retinal addition, progressing from low to high absorbance values, are 10, 20, 30, and 40 min. Fig. 4. (A) Light-driven transport of protons by a proteorhodopsin-expressing E. coli cell suspension. The beginning and cessation of illumination (with yellow light Ͼ485 nm) is indicated by arrows labeled ON and OFF, respectively. The cells were suspended in 10 mM NaCl, 10 mM MgSO4⅐7H2O, and 100 ␮M CaCl2. (B) Transport of 3 Hϩ -labeled tetraphenylphosphonium ([3 Hϩ ]TPP) in E. coli right-side-out vesicles containing expressed proteorho- dopsin, reconstituted with (squares) or without (circles) 10 ␮M retinal in the presence of light (open symbols) or in the dark (solid symbols) (20). 15 SEPTEMBER 2000 VOL 289 SCIENCE www.sciencemag.org1904
  • 51. 20. Right-side-out membrane vesicles were prepared according to Kaback [H. R. Kaback, Methods Enzymol. 22, 99 (1971)]. Membrane electrical potential was measured with the lipophilic cation TPP by means of rapid filtration [E. Prossnitz, A. Gee, G. F. Ames, J. Biol. Chem. 264, 5006 (1989)] and was calculated according to Robertson et al. [D. E. Robertson, G. J. Kaczorowski, M. L. Garcia, H. R. Kaback, Biochemistry 19, 5692 (1980)]. No energy sources other than light were added to the vesicles, and the experiments were conducted at room temperature.
  • 52. Figure 5 Laser flash-induced absorbance changes in suspensions of E. coli membranes containing proteorhodopsin. A 532-nm pulse (6 ns duration, 40 mJ) was delivered at time 0, and absorption changes were monitored at various wavelengths in the visible range in a lab-constructed flash photolysis system as described (34). Sixty- four transients were collected for each wavelength. (A) Transients at the three wavelengths exhibiting maximal amplitudes. (B) Absorption difference spectra calculated from amplitudes at 0.5 ms (blue) and between 0.5 ms and 5.0 ms (red). http://www.sciencemag.org/content/289/5486/1902/F5.expansion.html
  • 53.
  • 54. Papers for Today Correspondence should be addressed to A.T. (e-mail: deia@dstu.univ-montp2.fr). ................................................................. Proteorhodopsin phototrophy in the ocean Oded BeÂjaÁ*², Elena N. Spudich²³, John L. Spudich³, Marion Leclerc* & Edward F. DeLong* * Monterey Bay Aquarium Research Institute, Moss Landing, California 95039, USA ³ Department of Microbiology and Molecular Genetics, The University of Texas Medical School, Houston, Texas 77030, USA ² These authors contributed equally to this work .............................................................................................................................................. Proteorhodopsin1 , a retinal-containing integral membrane pro- tein that functions as a light-driven proton pump, was discovered in the genome of an uncultivated marine bacterium; however, the prevalence, expression and genetic variability of this protein in native marine microbial populations remain unknown. Here we report that photoactive proteorhodopsin is present in oceanic surface waters. We also provide evidence of an extensive family of to that of recomb ¯ash-photolysis existence of pro retinal molecules waters. The photocyc and, after hydro adding all-trans indeed derive fro –4 –3 –2 –1 0 1 2 ΔAbsorbance(×10–3)0–3) 0 4 2 3 http://www.nature.com/nature/journal/v411/n6839/abs/411786a0.html
  • 55. NATURE |VOL 411 |14 JUNE 2001 |www.nature.com hn L. Spudich³, Marion Leclerc* ute, Moss Landing, California 95039, lar Genetics, The University of 30, USA work ................................................................... ining integral membrane pro- n proton pump, was discovered marine bacterium; however, the c variability of this protein in ons remain unknown. Here we odopsin is present in oceanic dence of an extensive family of psin variants. The protein pig- family seem to be spectrally orbing light at different wave- available in the environment. oteorhodopsin-based phototro- ic microbial process. taining membrane protein that n pump, was discovered nearly Halobacterium salinarum2 . The on contain a high concentration cked in an ordered two-dimen- tion of light, bacteriorhodopsin al shifts (a photocycle), causing a the membrane. The resulting al drives ATP synthesis, through indeed derive from retinylidene pigmentation. We conclude that –4 –3 –2 –1 0 1 2 ∆Absorbance(×10–3)∆Absorbance(×10–3) 806040200 Time (ms) 580 nm 500 nm 400 nm Wavelength (nm) 400 450 500 550 600 650 700 –3 –2 –1 0 1 2 3 Monterey Bay environmental membranes Proteorhodopsin in E. coli membranes Figure 1 Laser ¯ash-induced absorbance changes in suspensions of membranes prepared from the prokaryotic fraction of Monterey Bay surface waters. Top, membrane absorption was monitored at the indicated wavelengths and the ¯ash was at time 0 at 532 nm. Bottom, absorption difference spectrum at 5 ms after the ¯ash for the environmental sample (black) and for E. coli-expressed proteorhodopsin (red). © 2001 Macmillan Magazines Ltd
  • 56. NATURE |VOL 411 |14 JUNE 2001 |www.nature.com strongly suggests that this protein has a signi®cant role in the Laser flash Untreated membranes Hydroxylamine-treated membranes Retinal-reconstituted membranes 10–3 AU 10–1 s Figure 2 Laser ¯ash-induced transients at 500 nm of a Monterey Bay bacterioplankton membrane preparation. Top, before addition of hydroxylamine; middle, after 0.2 M hydroxylamine treatment at pH 7.0, 18 8C, with 500-nm illumination for 30 min; bottom, after centrifuging twice with resuspension in 100 mM phosphate buffer, pH 7.0, followed by addition of 5 mM all-trans retinal and incubation for 1 h. M M MB MB HOT HOT 0.01 HO PAL PAL B PAL B PAL E PAL PAL PAL PAL M M M M MB BA PAL E B HOT Figure 3 Phylogenetic analysis of the inferred proteorhodopsin genes. Distance analysis of 22 by neighbour-joining using the PaupSearch pr 10.0 (Genetics Computer Group; Madison, Wis was used as an outgroup, and is not shown. Sc per site. Bold names indicate the proteorhodop this study. © 2001 Macmillan Magazines Ltd
  • 57. of the surface of a single cell3 . This nt to produce substantial amounts of efore, the high density of proteorho- rane indicated by our calculations otein has a signi®cant role in the in amino-acid sequences were not restricted to the hydrophilic ed membranes e-treated membranes stituted membranes 10–3 AU 10–1 s at 500 nm of a Monterey Bay bacterioplankton tion of hydroxylamine; middle, after 0.2 M C, with 500-nm illumination for 30 min; bottom, n in 100 mM phosphate buffer, pH 7.0, followed incubation for 1 h. MB 0m2 MB 0m1 MB 20m2 MB 20m5 MB 40m12 MB 100m9 HOT 75m3 HOT 75m8 0.01 MontereyBayandshallowHOT AntarcticaanddeepHOT HOT 75m1 PAL B1 PAL B5 PAL B6 PAL E7 PAL E1 PAL B7 PAL B2 PAL B8 MB 100m10 MB 100m5 MB 100m7 MB 20m12 MB 40m1 MB 40m5 BAC 40E8 BAC 31A8 PAL E6 BAC 64A5 HOT 0m1 HOT 75m4 Figure 3 Phylogenetic analysis of the inferred amino-acid sequence of cloned proteorhodopsin genes. Distance analysis of 220 positions was used to calculate the tree by neighbour-joining using the PaupSearch program of the Wisconsin Package version 10.0 (Genetics Computer Group; Madison, Wisconsin). H. salinarum bacteriorhodopsin was used as an outgroup, and is not shown. Scale bar represents number of substitutions per site. Bold names indicate the proteorhodopsins that were spectrally characterized in this study. !57
  • 58. mechanisms of wavelength regu- hodopsins. Furthermore, palE6 larly to its Monterey Bay diated transport of protons in . Spudich et al., manuscript in rom Antarctica (E. F. DeLong, dopsin primers, and sequenced minary sequence analyses based indicate that, despite differences orption spectra, the Antarctic a bacterium highly related to AR86 bacteria of Monterey Bay10 nes from the HOT station were m samples. Most proteorhodop- surface waters belonged to the (on the amino-acid level) to the 40E8. In contrast, most of the ) fell within the Antarctic clade es from the HOT station, HOT d in E. coli and their absorption d from their primary sequences, ng with the Antarctic clade gave um, whereas the proteorhodop- clade gave a green (527-nm) central North Paci®c gyre, most ge, with maximal intensity near ak is maintained over depth, At the surface, the energy peak th between 400 and 650 nm. In narrows and the half bandwidth 00 nm (Fig. 5). Considering the members of the two different dopsin genetic variants in mixed populations with a selective advantage at different points along the depth-dependent light 0.06 0.05 0.04 0.03 0.02 0.01 1.0 0.8 0.6 0.4 0.2 0 350 400 75 m 5 m Monterey Bay HOT 0 m HOT 75 m Antarctica 450 500 Wavelength (nm) RelativeirradianceAbsorbance 550 600 650 Figure 5 Absorption spectra of retinal-reconstituted proteorhodopsins in E. coli membranes. All-trans retinal (2.5 mM) was added to membrane suspensions in 100 mM phosphate buffer, pH 7.0, and absorption spectra were recorded. Top, four spectra for palE6 (Antarctica), HOT 75m4, HOT 0m1, and BAC 31A8 (Monterey Bay) at 1 h after retinal addition. Bottom, downwelling irradiance from HOT station measured at six wavelengths (412, 443, 490, 510, 555 and 665 nm) and at two depths, for the same depths and date that the HOT samples were collected (0 and 75 m). Irradiance is plotted relative to irradiance at 490 nm. !58
  • 59. 788 NATURE |VOL 411 |14 JUNE 2001 |www.nature.com whereas the total energy decreases. At the surface, the energy peak is very broad with a half bandwidth between 400 and 650 nm. In deeper water below 50 m, the peak narrows and the half bandwidth is restricted to between 450 and 500 nm (Fig. 5). Considering the different wavelengths absorbed by members of the two different Figure 4 Multiple alignment of proteorhodopsin amino-acid sequences. The secondary structure shown (boxes for transmembrane helices) is derived from hydropathy plots. Residues predicted to form the retinal-binding pocket are marked in red. palE6 (Antarctica), HOT 75m4, HOT 0m1, and BAC 31A8 (Monterey Bay) at 1 h after retinal addition. Bottom, downwelling irradiance from HOT station measured at six wavelengths (412, 443, 490, 510, 555 and 665 nm) and at two depths, for the same depths and date that the HOT samples were collected (0 and 75 m). Irradiance is plotted relative to irradiance at 490 nm. © 2001 Macmillan Magazines Ltd !59
  • 60. More on Large Insert Metagenomics (e-mail: denning@atmos.colostate.edu). ................................................................. Unsuspected diversity among marine aerobic anoxygenic phototrophs Oded BeÂjaÁ*², Marcelino T. Suzuki*, John F. Heidelberg³, William C. Nelson³, Christina M. Preston*, Tohru Hamada§², Jonathan A. Eisen³, Claire M. Fraser³ & Edward F. DeLong* * Monterey Bay Aquarium Research Institute, Moss Landing, California 95039-0628, USA ³ The Institute for Genomic Research, Rockville, Maryland 20850, USA § Marine Biotechnology Institute, Kamaishi Laboratories, Kamaishi City, Iwate 026-0001, Japan .............................................................................................................................................. Aerobic, anoxygenic, phototrophic bacteria containing bacterio- chlorophyll a (Bchla) require oxygen for both growth and Bchla synthesis1±6 . Recent reports suggest that these bacteria are widely distributed in marine plankton, and that they may account for up to 5% of surface ocean photosynthetic electron transport7 and 8 b 98 6 http://www.nature.com/nature/journal/v415/n6872/full/415630a.html
  • 61. Erythrobacter litoralis* MBIC3019* Erythrobacter longus* Erythromicrobium ramosum cDNA 0m1 cDNA 20m21 Roseobacter denitrificans* Roseobacter litoralis* Rhodovulum sulfidophilum* 100 BAC 39B11 BAC 29C02 cDNA 20m10 BAC 52B02 BAC 65D09 BAC 24D02 99 100 100 54 Rhodoferax fermentans Allochromatium vinosum Roseateles depolymerans Rubrivivax gelatinosus cDNA 0m20 Rhodospirillum rubrum 56 96 52 67 Thiocystisge latinosa Rhodomicrobium vannielii Rhodocyclus tenuis Rhodopseudomonas palustris Chloroflexus aurantiacus Sphingomonas natatoria Rhodobacter sphaeroides Rhodobacter capsulatus BAC 56B12a b BAC 60D04 BAC 30G07 R2A84* R2A163* cDNA 20m22 cDNA 0m13 cDNA 20m8 R2A62* cDNA 20m11 0.1 98 72 99 100 76 61 73 100 82 96 β γ α-3 α-4 α–1 α-2 env20m1 env20m5 env0m2 MBIC3951* env0m1 envHOT1 envHOT2 envHOT3 100 99 100 100 100 Roseobacter denitrificans* Roseobacter litoralis* R2A84* 0.1 100 52 91 esearch, Rockville, Maryland 20850, USA ute, Kamaishi Laboratories, Kamaishi City, ........................................................................................ ototrophic bacteria containing bacterio- quire oxygen for both growth and Bchla rts suggest that these bacteria are widely ankton, and that they may account for up photosynthetic electron transport7 and obial community8 . Known planktonic belong to only a few restricted groups ia a-subclass. Here we report genomic thetic gene content and operon organiza- ng marine bacteria. These photosynthetic ome that most closely resembled those of b-subclass, which have never before been ronments. Furthermore, these photosyn- ly distributed in marine plankton, and itic bacterioplankton assemblages, indi- nti®ed phototrophs were photosyntheti- ta demonstrate that planktonic bacterial ply composed of one uniform, widespread otrophs, as previously proposed8 ; rather, ain multiple, distantly related, photo- erial groups, including some unrelated types. uired for the formation of bacteriochloro- ystems in aerobic, anoxygenic, photo- re clustered in a contiguous, 45-kilobase n (superoperon)6 . These include bch and crt nzymes of the bacteriochlorophyll and pathways, and the puf genes coding for harvesting complex (pufB and pufA) and ex (pufL and pufM). To better describe the planktonic, anoxygenic, photosynthetic surface-water bacterial arti®cial chromo- BAC 39B11 BAC 29C02 cDNA 20m10 BAC 52B02 BAC 65D09 BAC 24D02 99 100 100 54 67 Chloroflexus aurantiacus b envHOT2 envHOT3 100 Roseobacter denitrificans* Roseobacter litoralis* MBIC3951* R2A84* R2A62* Rhodovulum sulfidophilum* Rhodobacter sphaeroides Rhodobacter capsulatus Erythromicrobium ramosum Erythrobacter litoralis* MBIC3019* Sphingomonas natatoria Erythrobacter longus* Rhodomicrobium vannielii Rhodopseudomonas palustris Rhodospirillum rubrum Roseateles depolymerans Rubrivivax gelatinosus Rhodoferax fermentans Rhodocyclus tenuis Allochromatium vinosum Thiocystis gelatinosa Chloroflexus aurantiacus 0.1 α-3 α-4 α-2 α-1 β γ 100 100 100 100 100 100 69 98 61 65 100 82 95 69 100 52 91 100 95 Figure 1 Phylogenetic relationships of pufM gene (a) and rRNA (b) sequences of AAP bacteria. a, b, Evolutionary distances for the pufM genes (a) were determined from an alignment of 600 nucleotide positions, and for rRNA genes (b) from an alignment of 860 nucleotide sequence positions. Evolutionary relationships were determined by neighbour- joining analysis (see Methods). The green non-sulphur bacterium Chloro¯exus aurantiacus was used as an outgroup. pufM genes that were ampli®ed by PCR in this study are indicated by the env pre®x, with `m' indicating Monterey, and HOT indicating Hawaii ocean time series. Cultivated aerobes are marked in light blue, bacteria cultured from sea water are marked with an asterisk, and environmental cDNAs are marked in red. Photosynthetic a-, b- and g-proteobacterial groups are indicated by the vertical bars to rRNA PUF a, b, Evolutionary distances for the pufM genes (a) were determined from an alignment of 600 nucleotide positions, and for rRNA genes (b) from an alignment of 860 nucleotide sequence positions. Evolutionary relationships were determined by neighbour-joining analysis (see Methods). The green non-sulphur bacterium Chloroflexus aurantiacus was used as an outgroup. pufM genes that were amplified by PCR in this study are indicated by the env prefix, with 'm' indicating Monterey, and HOT indicating Hawaii ocean time series. Cultivated aerobes are marked in light blue, bacteria cultured from sea water are marked with an asterisk, and environmental cDNAs are marked in red. Photosynthetic -, - and - proteobacterial groups are indicated by the vertical bars to the right of the tree. Bootstrap values greater than 50% are indicated above the branches. The scale bar represents number of substitutions per site.
  • 62. NATURE |VOL 415 |7 FEBRUARY 2002 |www.nature.com 631 complementary DNA. However, a different primer targeting a from which the BACs originated are functional. BAC 65D09 crt bch puf hemN orf276 orf154 orf227 puhA lhaA bch ppsR bch Rubrivivax gelatinosus Rhodobacter sphaeroides AB L M C G PIH B N FM LBCDAE F C X Y Z C X Y Z L M C ABB C EI F H B N FM L G P HBNF MLGP J EBE IF D A I DCXYZLM BAX orf358 orf440 H B N FM LGPB C EI FDAID C X Y Z L MB AQO BAC 60D04 cycA Q orf641 OC tspO ppsR ppa bchpufbchcrt bchbch puhA lhaA orf213 orf128 orf277 orf292 idi idi Figure 2 Schematic comparison of photosynthetic operons from R. gelatinosus (b- proteobacteria), R. sphaeroides (a-proteobacteria) and uncultured environmental BACs. ORF abbreviations use the nomenclature de®ned in refs 13, 14 and 24. Predicted ORFs are coloured according to biological category: green, bacteriochloropyll biosynthesis genes; orange, carotenoid biosynthesis genes; red, light-harvesting and reaction centre genes; and blue, cytochrome c2. White boxes indicate non-photosynthetic and hypothetical proteins with no known function. Homologous regions and genes are connected by shaded vertical areas and lines, respectively. © 2002 Macmillan Magazines Ltd
  • 63. letters to nature We compared several photosynthetic genes found on the oceanic bacteriochlorophyll superoperons with characterized homologues from cultured photosynthetic bacteria. The relationships among Bchla biosynthetic proteins BchB (a subunit of light-independent protochlorophyllide reductase) and BchH (magnesium chelatase) were determined18 (Fig. 3). Similar to pufM relationships, BchB and operon) to cultivated Erythrobacter species. Recently, it has been suggested that cultivated Erythrobacter species (a-4 subclass of proteobacteria) may represent the pre- dominant AAPs in the upper ocean8,20 . Surprisingly, we were not able to retrieve photosynthetic operon genes belonging to this group in any of the samples analysed. Furthermore, very few sequences BAC 60D04 BAC 65D09 BAC 29C02 ba BchH 100/96 100/100 100/99 100/100 100/100 100/100 100/100 BAC 60D04 BAC 65D09 BAC 29C02 100/66 100/100 100/94 100/100 100/100 100/100 BchB 0.1 0.1 Acidiphilium rubrum Rhodobacter capsulatus Rhodobacter sphaeroides Rhodopseudomonas palustris Rubrivivax gelatinosus Chloroflexus aurantiacus Chlorobium tepidum Heliobacillus mobilis Rhodopseudomonas palustris Rhodobacter capsulatus Rhodobacter sphaeroides Acidiphilium rubrum Heliobacillus mobilis Chlorobium tepidum Chloroflexus aurantiacus Rubrivivax gelatinosus Figure 3 Phylogenetic analyses of BchB and BchH proteins. a, Phylogenetic tree for the BchB protein. b, Phylogenetic tree for the BchH protein. The BchH sequences from Chlorobium vibrioforme25 and BchH2 and BchH3 from C. tepidum18 were omitted from the tree because these genes potentially encode an enzyme for bacteriochlorophyll c biosynthesis and are probably of distinct origin (J. Xiong, personal communication). Bootstrap values (neighbour-joining/parsimony method) greater than 50% are indicated next to the branches. The scale bar represents number of substitutions per site. The position of Acidiphilium rubrum (bold branch) was not well resolved by both methods.
  • 64. opsins. Furthermore, palE6 to its Monterey Bay d transport of protons in udich et al., manuscript in Antarctica (E. F. DeLong, in primers, and sequenced ary sequence analyses based ate that, despite differences on spectra, the Antarctic cterium highly related to bacteria of Monterey Bay10 om the HOT station were mples. Most proteorhodop- ce waters belonged to the he amino-acid level) to the . In contrast, most of the within the Antarctic clade om the HOT station, HOT E. coli and their absorption m their primary sequences, th the Antarctic clade gave whereas the proteorhodop- e gave a green (527-nm) al North Paci®c gyre, most with maximal intensity near s maintained over depth, he surface, the energy peak tween 400 and 650 nm. In ows and the half bandwidth m (Fig. 5). Considering the mbers of the two different advantage at different points along the depth-dependent light 0.06 0.05 0.04 0.03 0.02 0.01 1.0 0.8 0.6 0.4 0.2 0 350 400 75 m 5 m Monterey Bay HOT 0 m HOT 75 m Antarctica 450 500 Wavelength (nm) RelativeirradianceAbsorbance 550 600 650 Figure 5 Absorption spectra of retinal-reconstituted proteorhodopsins in E. coli membranes. All-trans retinal (2.5 mM) was added to membrane suspensions in 100 mM phosphate buffer, pH 7.0, and absorption spectra were recorded. Top, four spectra for palE6 (Antarctica), HOT 75m4, HOT 0m1, and BAC 31A8 (Monterey Bay) at 1 h after retinal addition. Bottom, downwelling irradiance from HOT station measured at six wavelengths (412, 443, 490, 510, 555 and 665 nm) and at two depths, for the same depths and date that the HOT samples were collected (0 and 75 m). Irradiance is plotted relative to irradiance at 490 nm.
  • 65. Community structure and metabolism through reconstruction of microbial genomes from the environment Gene W. Tyson1 , Jarrod Chapman3,4 , Philip Hugenholtz1 , Eric E. Allen1 , Rachna J. Ram1 , Paul M. Richardson4 , Victor V. Solovyev4 , Edward M. Rubin4 , Daniel S. Rokhsar3,4 & Jillian F. Banfield1,2 1 Department of Environmental Science, Policy and Management, 2 Department of Earth and Planetary Sciences, and 3 Department of Physics, University of California, Berkeley, California 94720, USA 4 Joint Genome Institute, Walnut Creek, California 94598, USA ........................................................................................................................................................................................................................... Microbial communities are vital in the functioning of all ecosystems; however, most microorganisms are uncultivated, and their roles in natural systems are unclear. Here, using random shotgun sequencing of DNA from a natural acidophilic biofilm, we report reconstruction of near-complete genomes of Leptospirillum group II and Ferroplasma type II, and partial recovery of three other genomes. This was possible because the biofilm was dominated by a small number of species populations and the frequency of genomic rearrangements and gene insertions or deletions was relatively low. Because each sequence read came from a different individual, we could determine that single-nucleotide polymorphisms are the predominant form of heterogeneity at the strain level. The Leptospirillum group II genome had remarkably few nucleotide polymorphisms, despite the existence of low-abundance variants. The Ferroplasma type II genome seems to be a composite from three ancestral strains that have undergone homologous recombination to form a large population of mosaic genomes. Analysis of the gene complement for each organism revealed the pathways for carbon and nitrogen fixation and energy generation, and provided insights into survival strategies in an extreme environment. The study of microbial evolution and ecology has been revolutio- nized by DNA sequencing and analysis1–3 . However, isolates have been the main source of sequence data, and only a small fraction of microorganisms have been cultivated4–6 . Consequently, focus has shifted towards the analysis of uncultivated microorganisms via cloning of conserved genes5 and genome fragments directly from the environment7–9 . To date, only a small fraction of genes have been recovered from individual environments, limiting the analysis of fluorescence in situ hybridization (FISH) revealed that all biofilms contained mixtures of bacteria (Leptospirillum, Sulfobacillus and, in a few cases, Acidimicrobium) and archaea (Ferroplasma and other members of the Thermoplasmatales). The genome of one of these archaea, Ferroplasma acidarmanus fer1, isolated from the Richmond mine, has been sequenced previously (http://www.jgi.doe.gov/JGI_ microbial/html/ferroplasma/ferro_homepage.html). A pink biofilm (Fig. 1a) typical of AMD communities was articles !65 Environmental Genome Shotgun Sequencing of the Sargasso Sea J. Craig Venter,1 * Karin Remington,1 John F. Heidelberg,3 Aaron L. Halpern,2 Doug Rusch,2 Jonathan A. Eisen,3 Dongying Wu,3 Ian Paulsen,3 Karen E. Nelson,3 William Nelson,3 Derrick E. Fouts,3 Samuel Levy,2 Anthony H. Knap,6 Michael W. Lomas,6 Ken Nealson,5 Owen White,3 Jeremy Peterson,3 Jeff Hoffman,1 Rachel Parsons,6 Holly Baden-Tillson,1 Cynthia Pfannkoch,1 Yu-Hui Rogers,4 Hamilton O. Smith1 chlorococcus, tha photosynthetic bio Surface water were collected ab from three sites o February 2003. A lected aboard the S station S” in May are indicated on F S1; sampling prot one expedition to was extracted from genomic libraries w 2 to 6 kb were m prepared plasmid RESEARCH ARTICLE
  • 67. Acid Mine Drainage 2004 clones). The first step in assignment of scaffolds to organism types was to represent a nearly complete genome of uncultured Ferroplasma species distinct this as Ferroplasma type II. The dominan was unexpected before the genomic analy We assigned the roughly 3£ coverage Leptospirillum group III on the basis of rRN up to 31 kb, totalling 2.66 Mb). Comparis those assigned to Leptospirillum group sequence divergence and only locally co firming that the scaffolds belong to a rel Leptospirillum group II. A partial 16S rR Sulfobacillus thermosulfidooxidans was assembled reads, suggesting very low cov any Sulfobacillus scaffolds .2 kb were a grouped with the Leptospirillum group II We compared the 3£ coverage, low Gþ 4.12 Mb) to the fer1 genome in order to types (Supplementary Fig. S6). Scaffold identity to fer1 were assigned to an enviro I genome (170 scaffolds up to 47 kb in 1.48 Mb of sequence). The remaining scaffolds are tentatively assigned to G-pla in this bin (62 kb) contains the G-plasma scaffolds assigned to G-plasma comprise partial 16S rRNA gene sequence from A-pl unassembled reads, suggesting low covera scaffolds from A-plasma .2 kb would be bin. Although eukaryotes are present in th in low abundance in the biofilm studied. eukaryotes have been detected. As independent evidence that the Lep Ferroplasma type II genomes are nearly co complement of transfer RNA synthetases An almost complete set of these genes Leptospirillum group III. The G-plasma bin set of tRNA synthetases, consistent with in scaffolds. In addition, we established group II, Leptospirillum group III, Ferrop type II and G-plasma bins contained onl Figure 1 The pink biofilm. a, Photograph of the biofilm in the Richmond mine (hand included for scale). b, FISH image of a. Probes targeting bacteria (EUBmix; fluorescein isothiocyanate (green)) and archaea (ARC915; Cy5 (blue)) were used in combination with a probe targeting the Leptospirillum genus (LF655; Cy3 (red)). Overlap of red and green (yellow) indicates Leptospirillum cells and shows the dominance of Leptospirillum. c, Relative microbial abundances determined using quantitative FISH counts. NATURE | doi:10.1038/na2 ©2004 NaturePublishing Group !67
  • 68. Bassham cycle (using type II ribulose 1,5-bisphosphate carboxy- lase–oxygenase). All genomes recovered from the AMD system by some, or all, organisms. Given the large number of ABC-type sugar and amino acid transporters encoded in the Ferroplasma type Figure 4 Cell metabolic cartoons constructed from the annotation of 2,180 ORFs identified in the Leptospirillum group II genome (63% with putative assigned function) and 1,931 ORFs in the Ferroplasma type II genome (58% with assigned function). The cell drainage stream (viewed in cross-section). Tight coupling between ferrous iron oxidation, pyrite dissolution and acid generation is indicated. Rubisco, ribulose 1,5-bisphosphate carboxylase–oxygenase. THF, tetrahydrofolate. !68
  • 69. Environmental Genome Shotgun Sequencing of the Sargasso Sea J. Craig Venter,1 * Karin Remington,1 John F. Heidelberg,3 Aaron L. Halpern,2 Doug Rusch,2 Jonathan A. Eisen,3 Dongying Wu,3 Ian Paulsen,3 Karen E. Nelson,3 William Nelson,3 Derrick E. Fouts,3 Samuel Levy,2 Anthony H. Knap,6 Michael W. Lomas,6 Ken Nealson,5 Owen White,3 Jeremy Peterson,3 Jeff Hoffman,1 Rachel Parsons,6 Holly Baden-Tillson,1 Cynthia Pfannkoch,1 Yu-Hui Rogers,4 Hamilton O. Smith1 We have applied “whole-genome shotgun sequencing” to microbial populations collected en masse on tangential flow and impact filters from seawater samples collected from the Sargasso Sea near Bermuda. A total of 1.045 billion base pairs of nonredundant sequence was generated, annotated, and analyzed to elucidate the gene content, diversity, and relative abundance of the organisms within these environmental samples. These data are estimated to derive from at least 1800 genomic species based on sequence relatedness, including 148 previously unknown bacterial phylotypes. We have identified over 1.2 million previously unknown genes represented in these samples, including more than 782 new chlorococcus, th photosynthetic bi Surface wate were collected a from three sites February 2003. A lected aboard the station S” in Ma are indicated on S1; sampling pro one expedition to was extracted fro genomic libraries 2 to 6 kb were prepared plasmid both ends to prov Craig Venter Sc nology Center on ers (Applied Bi Whole-genome ra the Weatherbird II 4) produced 1.66 in length, for a tota microbial DNA se sequences were g RESEARCH ARTICLE !69 http://www.sciencemag.org/content/304/5667/66
  • 70. Sargasso Sea two groups of scaffolds representing two dis- tinct strains closely related to the published at depths ranging from 4ϫ to 36ϫ (indicated with shading in table S3 with nine depicted in Fig. 1. MODIS-Aqua satellite image of ocean chlorophyll in the Sargasso Sea grid about the BATS site from 22 February 2003. The station locations are overlain with their respective identifications. Note the elevated levels of chlorophyll (green color shades) around station 3, which are not present around stations 11 and 13. Fig. 2. Gene conser- vation among closely !70 http://www.sciencemag.org/content/304/5667/66
  • 71. frames (5). A total of 69,901 novel genes be- longing to 15,601 single link clusters were iden- tified. The predicted genes were categorized Fig. 5. Prochlorococcus-related scaffold 2223290 illustrates the assembly of a broad commu- nity of closely related organisms, distinctly nonpunctate in nature. The image represents (A) global structure of Scaffold 2223290 with respect to assembly and (B) a sample of the multiple sequence alignment. Blue segments, contigs; green segments, fragments; and yellow segments, stages of the assembly of fragments into the resulting contigs. The yellow bars indicate that fragments were initially assembled in several different pieces, which in places collapsed to form the final contig structure. The multiple sequence alignment for this region shows a homogenous blend of haplotypes, none with sufficient depth of coverage to provide a separate assembly. Table 1. Gene count breakdown by TIGR role category. Gene set includes those found on as- semblies from samples 1 to 4 and fragment reads from samples 5 to 7. A more detailed table, sep- arating Weatherbird II samples from the Sorcerer II samples is presented in the SOM (table S4). Note that there are 28,023 genes which were classified in more than one role category. TIGR role category Total genes Amino acid biosynthesis 37,118 Biosynthesis of cofactors, prosthetic groups, and carriers 25,905 Cell envelope 27,883 Cellular processes 17,260 Central intermediary metabolism 13,639 DNA metabolism 25,346 Energy metabolism 69,718 Fatty acid and phospholipid metabolism 18,558 Mobile and extrachromosomal element functions 1,061 Protein fate 28,768 Protein synthesis 48,012 Purines, pyrimidines, nucleosides, and nucleotides 19,912 Regulatory functions 8,392 Signal transduction 4,817 Transcription 12,756 Transport and binding proteins 49,185 Unknown function 38,067 Miscellaneous 1,864 Conserved hypothetical 794,061 Total number of roles assigned 1,242,230 Total number of genes 1,214,207 www.sciencemag.org SCIENCE VOL 304 2 APRIL 2004 69 !71 We closely examined the multiple sequence alignments of the contigs with high SNP rates and were able to classify these into two fairly distinct classes: regions where several closely related haplotypes have been collapsed, in- creasing the depth of coverage accordingly (10), and regions that appear to be a relatively homogenous blend of discrepancies from the consensus without any apparent separation into haplotypes, such as the Prochlorococcus scaf- fold region (Fig. 5). Indeed, the Prochlorococ- cus scaffolds display considerable heterogene- ity not only at the nucleotide sequence level (Fig. 5) but also at the genomic level, where multiple scaffolds align with the same region of the MED4 (11) genome but differ due to gene or genomic island insertion, deletion, rearrange- ment events. This observation is consistent with previous findings (12). For instance, scaffolds 2221918 and 2223700 share gene synteny with each other and MED4 but differ by the insertion of 15 genes of probable phage origin, likely representing an integrated bacteriophage. These genomic differences are displayed graphically in Fig. 2, where it is evident that up to four conflicting scaffolds can align with the same region of the MED4 genome. More than 85% of the Prochlorococcus MED4 genome can be aligned with Sargasso Sea scaffolds greater than 10 kb; however, there appear to be a couple of regions of MED4 that are not repre- sented in the 10-kb scaffolds (Fig. 2). The larger of these two regions (PMM1187 to PMM1277) consists primarily of a gene cluster coding for surface polysaccharide biosynthesis, which may represent a MED4-specific polysac- charide absent or highly diverged in our Sar- gasso Sea Prochlorococcus bacteria. The heter- ogeneity of the Prochlorococcusscaffolds suggest that the scaffolds are not derived from a single discrete strain, but instead probably represent a conglomerate assembled from a population of closely related Prochlorococcus biotypes. The gene complement of the Sargasso. The heterogeneity of the Sargasso sequences complicates the identification of microbial genes. The typical approach for microbial an- notation, model-based gene finding, relies en- tirely on training with a subset of manually eate scaffold borders. Fig. 4. Circular diagrams of nine complete megaplasmids. Genes encoded in the forward direction are shown in the outer concentric circle; reverse coding genes are shown in the inner concentric circle. The genes have been given role category assignment and colored accordingly: amino acid biosynthesis, violet; biosynthesis of cofactors, prosthetic groups, and carriers, light blue; cell envelope, light green; cellular processes, red; central intermediary metabolism, brown; DNA metabolism, gold; energy metabolism, light gray; fatty acid and phospholipid metabolism, magenta; protein fate and protein synthesis, pink; purines, pyrimidines, nucleosides, and nucleotides, orange; regulatory functions and signal transduction, olive; transcription, dark green; transport and binding proteins, blue-green; genes with no known homology to other proteins and genes with homology to genes with no known function, white; genes of unknown function, gray; Tick marks are placed on 10-kb intervals. 2 APRIL 2004 VOL 304 SCIENCE www.sciencemag.org68 assembly to identify a set of large, deeply as- sembling nonrepetitive contigs. This was used to set the expected coverage in unique regions (to 23ϫ) for a final run of the assembler. This al- lowed the deep contigs to be treated as unique sequence when they would otherwise be labeled as repetitive. We evaluated our final assembly results in a tiered fashion, looking at well-sampled genomic regions separately from those barely sampled at our current level of sequencing. The 1.66 million sequences from the Weatherbird II samples (table S1; samples 1 to 4; stations 3, 11, and 13), were pooled and assembled to provide a single master assembly for comparative purposes. The assembly gener- ated 64,398 scaffolds ranging in size from 826 bp to 2.1 Mbp, containing 256 Mbp of unique sequence and spanning 400 Mbp. After assem- bly, there remained 217,015 paired-end reads, or “mini-scaffolds,” spanning 820.7 Mbp as well as an additional 215,038 unassembled sin- gleton reads covering 169.9 Mbp (table S2, column 1). The Sorcerer II samples provided almost no assembly, so we consider for these samples only the 153,458 mini-scaffolds, span- ning 518.4 Mbp, and the remaining 18,692 singleton reads (table S2, column 2). In total, 1.045 Gbp of nonredundant sequence was gen- erated. The lack of overlapping reads within the unassembled set indicates that lack of addition- al assembly was not due to algorithmic limita- tions but to the relatively limited depth of se- quencing coverage given the level of diversity within the sample. The whole-genome shotgun (WGS) assembly has been deposited at DDBJ/EMBL/GenBank under the project accession AACY00000000, and all traces have been deposited in a corre- sponding TraceDB trace archive. The version described in this paper is the first version, AACY01000000. Unlike a conventional WGS entry, we have deposited not just contigs and scaffolds but the unassembled paired singletons and individual singletons in order to accurate- ly reflect the diversity in the sample and allow searches across the entire sample with- in a single database. Genomes and large assemblies. Our analysis first focused on the well-sampled ge- nomes by characterizing scaffolds with at least 3ϫ coverage depth. There were 333 scaffolds comprising 2226 contigs and spanning 30.9 Mbp that met this criterion (table S3), account- ing for roughly 410,000 reads, or 25% of the pooled assembly data set. From this set of well- sampled material, we were able to cluster and classify assemblies by organism; from the rare species in our sample, we used sequence similar- ity based methods together with computational gene finding to obtain both qualitative and quan- titative estimates of genomic and functional diver- sity within this particular marine environment. We employed several criteria to sort the major assembly pieces into tentative organism “bins”; these include depth of coverage, oligo- nucleotide frequencies (7), and similarity to previously sequenced genomes (5). With these techniques, the majority of sequence assigned to the most abundant species (16.5 Mbp of the 30.9 Mb in the main scaffolds) could be sepa- rated based on several corroborating indicators. In particular, we identified a distinct group of scaffolds representing an abundant population clearly related to Burkholderia (fig. S2) and two groups of scaffolds representing two dis- tinct strains closely related to the published Shewanella oneidensis genome (8) (fig. S3). There is a group of scaffolds assembling at over 6ϫ coverage that appears to represent the ge- nome of a SAR86 (table S3). Scaffold sets representing a conglomerate of Prochlorococ- cus strains (Fig. 2), as well as an uncultured marine archaeon, were also identified (table S3; Fig. 3). Additionally, 10 putative mega plasmids were found in the main scaffold set, covered at depths ranging from 4ϫ to 36ϫ (indicated with shading in table S3 with nine depicted in Fig. 1. MODIS-Aqua satellite image of ocean chlorophyll in the Sargasso Sea grid about the BATS site from 22 February 2003. The station locations are overlain with their respective identifications. Note the elevated levels of chlorophyll (green color shades) around station 3, which are not present around stations 11 and 13. Fig. 2. Gene conser- vation among closely related Prochlorococ- cus. The outermost concentric circle of the diagram depicts the competed genom- ic sequence of Pro- chlorococcus marinus MED4 (11). Fragments from environmental sequencing were com- pared to this complet- ed Prochlorococcus ge- nome and are shown in the inner concentric circles and were given boxed outlines. Genes for the outermost cir- cle have been as- signed psuedospec- trum colors based on the position of those genes along the chro- mosome, where genes nearer to the start of the genome are col- ored in red, and genes nearer to the end of the genome are colored in blue. Fragments from environmental sequencing were subjected to an analysis that identifies conserved gene order between those fragments and the completed Prochlorococcus MED4 genome. Genes on the environmental genome segments that exhibited conserved gene order are colored with the same color assignments as the Prochlorococcus MED4 chromosome. Colored regions on the environmental segments exhibiting color differences from the adjacent outermost concentric circle are the result of conserved gene order with other MED4 regions and probably represent chromosomal rearrangements. Genes that did not exhibit conserved gene order are colored in black. www.sciencemag.org SCIENCE VOL 304 2 APRIL 2004 67http://www.sciencemag.org/content/304/5667/66
  • 72. rRNA phylotyping from metagenomics !72 http://www.sciencemag.org/ content/304/5667/66
  • 73. Shotgun Sequencing Allows Alternative Anchors (e.g., RecA) !73 http://www.sciencemag.org/ content/304/5667/66
  • 74. nomic group using the phylogenetic analysis described for rRNA. For example, our data set marker genes, is roughly comparable to the 97% cutoff traditionally used for rRNA. Thus Fig. 6. Phylogenetic diversity of Sargasso Sea sequences using multiple phylogenetic markers. The relative contribution of organisms from different major phylogenetic groups (phylotypes) was measured using multiple phylogenetic markers that have been used previously in phylogenetic studies of prokaryotes: 16S rRNA, RecA, EF-Tu, EF-G, HSP70, and RNA polymerase B (RpoB). The relative proportion of different phylotypes for each sequence (weighted by the depth of coverage of the contigs from which those sequences came) is shown. The phylotype distribution was determined as follows: (i) Sequences in the Sargasso data set corresponding to each of these genes were identified using HMM and BLAST searches. (ii) Phylogenetic analysis was performed for each phylogenetic marker identified in the Sargasso data separately compared with all members of that gene family in all complete genome sequences (only complete genomes were used to control for the differential sampling of these markers in GenBank). (iii) The phylogenetic affinity of each sequence was assigned based on the classification of the nearest neighbor in the phylogenetic tree. RIL 2004 VOL 304 SCIENCE www.sciencemag.org !74 http://www.sciencemag.org/ content/304/5667/66
  • 75. Functional Inference from Metagenomics • Can work well for individual genes !75
  • 76. Functional Diversity of Proteorhodopsins? !76 http://www.sciencemag.org/ content/304/5667/66
  • 77. ARTICLE Received 3 Nov 2010 | Accepted 11 Jan 2011 | Published 8 Feb 2011 DOI: 10.1038/ncomms1188 Proteorhodopsins are light-driven proton pumps involved in widespread phototrophy. Discovered in marine proteobacteria just 10 years ago, proteorhodopsins are now known to have been spread by lateral gene transfer across diverse prokaryotes, but are curiously absent from eukaryotes. In this study, we show that proteorhodopsins have been acquired by horizontal gene transfer from bacteria at least twice independently in dinoflagellate protists. We find that in the marine predator Oxyrrhis marina, proteorhodopsin is indeed the most abundantly expressed nuclear gene and its product localizes to discrete cytoplasmic structures suggestive of the endomembrane system. To date, photosystems I and II have been the only known mechanism for transducing solar energy in eukaryotes; however, it now appears that some abundant zooplankton use this alternative pathway to harness light to A bacterial proteorhodopsin proton pump in marine eukaryotes Claudio H. Slamovits1,† , Noriko Okamoto1 , Lena Burri1,† , Erick R. James1 & Patrick J. Keeling1 http://www.nature.com/ncomms/journal/v2/n2/abs/ncomms1188.html
  • 78. ARTICLENATURE COMMUNICATIONS | DOI: 10.1038/ncomms1188 NATURE COMMUNICATIONS | 2:183 | DOI: 10.1038/ncomms1188 | www.nature.com/naturecommunications © 2011 Macmillan Publishers Limited. All rights reserved. 50 100 Neurospora crassa 51 Aspergillus fumigatus Gibberella zeae 90 ‘Oryza_sativa’ 95 Leptosphaeria maculans 97 Pyrenophora tritici Bipolaris oryzae 100 Oxyrrhis marina OM2197 Dinoflagellate 3 Dinoflagellate 1 Dinoflagellate 2 O. marina OM1497 Cyanophora paradoxa 80 Guillardia theta 1490 99 100 Rhodomonas salina 987 G. theta t309 62 Cryptomonas sp. G. tetha 2135 100 G. theta 2 G. theta 1712 99 Haloterrigena sp. 68 Haloarcula argentinensis Haloquadratum walsbyi 100 Nostoc sp. 64 Natronomonas pharaonis Halorubrum sodomense 62 Haloterrigena sp. Halobacterium salinarum 100 89 58 Salinibacter ruber (Xanthorhodopsin) uncult. marine picoplankton (Hawaii) Thermus aquaticus Gloeobacter violaceus Roseiflexus sp. 52 100 Env. SS AACY01492695 Env. SS AACY01598379 99 GS012-3 73 GS020-43 98 GS020-39 GS020-70 99 GS033-3 56 100 GS012-21 53 MWH-Dar4 MWH-Dar1 87 GS012-37 98 GS020-82 GS012-39 EgelM2-3.D GS012-14 86 100 uncult. marine bacterium EB0_41B09 (Monterey Bay) 79 Methylophilales bacterium Beta proteobacterium KB1 59 Marinobacter sp. Alpha proteobacterium BAL199 Octadecabacter antarcticus 92 100 uncult. GOS 7929496 uncult. GOS ECY53283 100 Alexandrium catenella 53 Pyrocystis lunula ‘Amphioxus’ 83 O. marina 27 91 O. marina 344 O. marina 5001 91 O. marina 173 O. marina 72 55 uncult. marine picoplankton 2 (Hawaii) 65 75 Exiguobacterium sibiricum 100 GS013_1 GS033_122 91 100 Env. SS AACY01002357 98 Tenacibaculum sp. Psychroflexus torquis 95 Env. SS AACY01054069 76 uncult. marine group II euryarchaeote HF70_39H11 69 52 uncult. gamma proteobacterium eBACHOT4E07 Env. SS AACY01043960 86 uncult. marine bacterium HF130_81H07 Env. SS AACY01120795 100 Env. SS AACY01118955 79 uncult. marine proteobacterium ANT32C12 77 uncult. bacterium MedeBAC35C06 99 uncult. bacterium uncult. marine gamma proteobacterium EBAC31A08 51 uncult. marine alpha proteobacterium HOT2C01 100 uncult. marine bacterium EB0_41B09 Photobacterium sp. 50 Env. SS AACY01499050 70 uncult. marine gamma proteobacterium HTCC2207 uncult. bacterium eBACred22E04 100 Karlodinium micrum 02 K. micrum 01 100 uncult. bacterium eBACmed86H08 Pelagibacter ubique 50 Env. SS AACY01015111 uncult. marine bacterium 66A03 80 100 GS033_85 Vibrio harveyi 76 Rhodobacterales bacterium uncult. bacterium MedeBAC82F10 55 uncult. marine bacterium 66A03 (2) uncult. bacterium MEDPR46A6 Fungal ( H+ / ?) Halorhodopsins (H+ / Cl – ) Algal opsins (sensory) Proteorhodopsins (H+) Figure 1 | Phylogenetic distribution of dinoflagellate rhodopsins. Protein sequences of 96 rhodopsins encompassing the known diversity of microbial (type I) rhodopsins from the three domains of life10 were used to generate a maximum likelihood phylogenetic tree (See Supplementary Table S1 for accession numbers). Grey boxes distinguish the recognized groupings of type 1 rhodopsin, and the prevalent function in each group is shown: proton pumps (H+ ), sensory, chlorine pumps (Cl− ) and unknown (?). Numbers indicate bootstrap support when 50% (over 300 replicates). Black boxes highlight dinoflagellate genes: Dinoflagellate 1 is a large group of proteorhodopsins present in diverse dinoflagellates; dinoflagellate 2 includes two K. micrum proteorhodopsin genes of independent origin; dinoflagellate 3 includes two O. marina genes related to algal sensory rhodopsins, probably of endosymbiotic origin.