4. Social Networking in Science
Scientist Reveals Secret of the Ocean: It's Him
By NICHOLAS WADE
Published: April 1, 2007
Maverick scientist J. Craig Venter has done it again. It was just a few years ago that Dr. Venter announced that the human genome sequenced by Celera Genomics was, in fact, mostly his own. And now, Venter has revealed a second twist in his genomic self-examination. Venter was discussing his Global Ocean Voyage, in which he used his personal yacht to collect ocean water samples from around the world. He then used large filtration units to collect microbes from the water samples, which were then brought back to his high-tech lab in Rockville, MD, where he used the same methods that were used to sequence the human genome to study the genomes of the 1000s of ocean-dwelling microbes found in each sample. In discussing the sampling methods, Venter let slip his latest attack on the standards of science – some of the samples were in fact not from the ocean, but were from microbial habitats in and on his body.
“The human microbiome is the next frontier,” Dr. Venter said. “The ocean voyage was just a cover. My main goal has always been to work on the microbes that live in and on people. And now that my genome is nearly complete, why not use myself as the model for human microbiome studies as well.”
It is certainly true that in the last few years, the microbes that live in and on people have become a hot research topic. So hot that the same people who were involved in the race to sequence the human genome have been involved in this race too. Francis Collins, Venter's main competitor and still the director of the National Human Genome Research Institute (NHGRI), recently testified before
Friday, January 28, 2011
9. Research Areas
• Mechanisms of Origin of New Functions
• Variation in Mechanisms: Patterns, Causes and Effects
• Species Evolution
11. Research Areas
Mechanisms of Origin of New Functions
• Study the evolution of function
• Make extensive use of genome sequence data
• Requires integration of experimental information and genome analysis
• Categorize and classify ways that novelty originates (examples)
– Duplication and divergence
– Recombination
– Simple substitutions
– Gene transfer
13. Research Areas
Variation in Mechanisms: Patterns, Causes and Effects
• Patterns of variation
– Within species
– Between species
– Comparative genomics plays important role
• Causes
– Variation in RRR
– Regulatory complexity
• Effects
– Differences in evolvability
– Ecological niche
– Short and long term genome evolution
15. Research Areas
Species Evolution
• Information needed to distinguish convergence from homology
• Allows inference of rates and patterns of change
• Allows one to determine if something is a “one time” event or a common theme in many lineages
16. Phylogenomics of Novelty
• Mechanisms of Origin of New Functions
• Variation in Mechanisms: Patterns, Causes and Effects
• Species Evolution
17. Why do this?
• Discover causes and effects of differences in evolvability
• Improve predictions from genome analysis
• Guide interpretation of biological data
18. My microbial evolution obsessions
• Introduction
• Phylogenomic Stories
– Within genome invention of novelty
– Stealing novelty
– Community service
19. Introduction
Genome Sequencing
29. Origin of New Functions
Mechanisms of Origin of New Functions
• Many different processes contribute to the origin of novelty
– De novo invention of new genes
– Simple substitutions within existing genes
– Duplication and divergence
– Domain swapping
– Genome rearrangements
– Regulatory changes
30. Phylogenomics of Novelty
• Mechanisms of Origin of New Functions
• Variation in Mechanisms: Patterns, Causes and Effects
• Species Evolution
31. Example I: Mutation Rates and Functional Predictions
32. From Eisen et al. 1997 Nature Medicine 3: 1076-1078.
33. Blast Search of H. pylori “MutS”
• Blast search pulls up Syn. sp MutS#2 with a much more significant p value than other MutS homologs
• Based on this, TIGR predicted that this species had mismatch repair
• Assumes functional constancy
Based on Eisen et al. 1997 Nature Medicine 3: 1076-1078.
34. MutL??
Based on Eisen et al. 1997 Nature Medicine 3: 1076-1078.
35. Phylogenetic Tree of MutS Family
[Figure: phylogenetic tree of MutS family proteins; tips are labeled with species abbreviations such as Ecoli, Bacsu, Helpy, Yeast, and Human.]
Based on Eisen, 1998 Nucl Acids Res 26: 4291-4300.
36. MutS Subfamilies
[Figure: the MutS family tree with clades labeled by subfamily: MutS1, MutS2, MSH1, MSH2, MSH3, MSH4, MSH5, MSH6.]
Based on Eisen, 1998 Nucl Acids Res 26: 4291-4300.
37. MutS Subfamilies
• MutS1: Bacterial MMR
• MSH1: Euk - mitochondrial MMR
• MSH2: Euk - all MMR in nucleus
• MSH3: Euk - loop MMR in nucleus
• MSH6: Euk - base:base MMR in nucleus
• MutS2: Bacterial - function unknown
• MSH4: Euk - meiotic crossing-over
• MSH5: Euk - meiotic crossing-over
38. Overlaying Functions onto Tree
[Figure: the MutS family tree with subfamily labels MutS1, MutS2, MSH1, MSH2, MSH3, MSH4, MSH5, MSH6.]
Based on Eisen, 1998 Nucl Acids Res 26: 4291-4300.
39. Functional Prediction Using Tree
[Figure: the MutS family tree with each subfamily clade labeled by function.]
• MutS1 - Bacterial Mismatch and Loop Repair
• MutS2 - Unknown Functions
• MSH1 - Mitochondrial Repair
• MSH2 - Eukaryotic Nuclear Mismatch and Loop Repair
• MSH3 - Nuclear Repair Of Loops
• MSH4 - Meiotic Crossing Over
• MSH5 - Meiotic Crossing Over
• MSH6 - Nuclear Repair Of Mismatches
Based on Eisen, 1998 Nucl Acids Res 26: 4291-4300.
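The idea of predicting function from tree position, rather than from the top Blast hit, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy tree topology, the leaf names (such as `Helpy_MutS`), and the helper names `leaves` and `predict_function` are hypothetical and are not taken from the cited papers. The sketch looks for the smallest clade around a query gene that contains annotated relatives and reports their function only if they all agree.

```python
# Toy rooted tree as nested tuples. Leaf names and topology are hypothetical,
# loosely echoing the MutS1 / MutS2 split described on the slides.
TREE = (
    ("Ecoli_MutS1", ("Bacsu_MutS1", "Synsp_MutS1")),
    ("Bacsu_MutS2", ("Synsp_MutS2", "Helpy_MutS")),
)

# Functions assumed known for some leaves (illustrative labels only).
KNOWN = {
    "Ecoli_MutS1": "mismatch repair",
    "Bacsu_MutS1": "mismatch repair",
    "Synsp_MutS1": "mismatch repair",
    "Bacsu_MutS2": "function unknown",
    "Synsp_MutS2": "function unknown",
}


def leaves(node):
    """Return the leaf names under a node (a leaf is a plain string)."""
    if isinstance(node, str):
        return [node]
    return [name for child in node for name in leaves(child)]


def predict_function(tree, query, known):
    """Predict a function for `query` from the smallest enclosing clade
    that contains annotated leaves; return None if they disagree."""
    if query not in leaves(tree):
        raise KeyError(query)
    node, supported = tree, set()
    while not isinstance(node, str):
        # Functions of annotated leaves in the current clade, excluding the query.
        functions = {known[name] for name in leaves(node)
                     if name in known and name != query}
        if functions:
            supported = functions  # smaller clades overwrite larger ones
        # Descend into the child clade that contains the query.
        node = next(child for child in node if query in leaves(child))
    return supported.pop() if len(supported) == 1 else None


print(predict_function(TREE, "Helpy_MutS", KNOWN))  # function unknown
```

In this toy setup the query groups with the MutS2-like leaves, so the sketch reports "function unknown" instead of transferring a mismatch repair annotation from a more distant clade.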
46. Tetrahymena Genome Processing
• Probably exists as a defense mechanism
• Analogous to RIPPING and heterochromatin silencing
• Presence of repetitive DNA in MAC but not TEs suggests the mechanism involves targeting foreign DNA
• Thus, unlike RIPPING, ciliate processing does not limit diversification by duplication
Eisen et al. 2006. PLoS Biology.
47. Phylogenomics of Novelty II
Sometimes it is easier to steal, borrow, or co-opt functions than to evolve them anew
49. rRNA Tree of Life
[Figure: rRNA tree of life showing Bacteria, Archaea, and Eukaryotes.]
Figure from Barton, Eisen et al. “Evolution”, CSHL Press.
Based on tree from Pace NR, 2003.
51. Network of Life
[Figure: network of life showing Bacteria, Archaea, and Eukaryotes.]
Figure from Barton, Eisen et al. “Evolution”, CSHL Press.
Based on tree from Pace NR, 2003.
52. Non-homology functional prediction methods
• Many genes have homologs in other species but no homologs have ever been studied experimentally
• Non-homology methods can make functional predictions for these
• Example: phylogenetic profiling
53. Phylogenetic profiling basis
• Microbial genes are lost rapidly when not maintained by selection
• Genes can be acquired by lateral transfer
• Gain and loss frequently occur for entire pathways/processes
• Thus, correlated presence/absence information might be usable to identify genes with similar functions
54. Non-Homology Predictions: Phylogenetic Profiling
• Step 1: Search all genes in organisms of interest against all other genomes
• Ask: Yes or No, is each gene found in each other species
• Cluster genes by distribution patterns (profiles)
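The three profiling steps above can be sketched in Python as follows. This is a minimal illustration, not the pipeline used in the talk: the gene and genome names are made up, the search results are assumed to be already available as a mapping from each gene to the set of genomes with a detectable homolog, and genes are grouped only when their profiles match exactly (real analyses typically cluster on a distance between profiles instead).

```python
from collections import defaultdict


def build_profiles(hits, genomes):
    """Turn search results into presence/absence profiles.

    hits: dict mapping gene name -> set of genomes containing a homolog.
    genomes: ordered list of genome names defining the profile columns.
    Returns a dict mapping gene name -> tuple of 0/1 values.
    """
    return {gene: tuple(int(g in present) for g in genomes)
            for gene, present in hits.items()}


def cluster_by_profile(profiles):
    """Group genes whose presence/absence profiles are identical."""
    clusters = defaultdict(list)
    for gene, profile in profiles.items():
        clusters[profile].append(gene)
    return dict(clusters)


# Hypothetical example data (names are placeholders, not real results).
genomes = ["genomeA", "genomeB", "genomeC", "genomeD"]
hits = {
    "geneX": {"genomeA", "genomeC"},
    "geneY": {"genomeA", "genomeC"},
    "geneZ": {"genomeB"},
}

profiles = build_profiles(hits, genomes)
clusters = cluster_by_profile(profiles)
for profile, genes in clusters.items():
    print(profile, genes)
```

Genes that land in the same cluster (here `geneX` and `geneY`) share a distribution pattern across genomes, which is the signal the slides propose using as a hint of related function.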
55. Carboxydothermus hydrogenoformans
• Isolated from a Russian hot spring
• Thermophile (grows at 80°C)
• Anaerobic
• Grows very efficiently on CO (Carbon Monoxide)
• Produces hydrogen gas
• Low GC Gram positive (Firmicute)
• Genome Determined (Wu et al. 2005 PLoS Genetics 1: e65.)
83. rRNA phylotyping issues
• Massive amounts of data
– 1 x 10^6 new partial sequences with new 454
– 2 x 10^6 full length sequences in DB
• Alignments of new sequences not always straightforward
• Solutions:
– Reliance on similarity scores (bad)
– High throughput automated phylogenetic tools
• STAP
• WATERs
84. rRNA: A Phylogenetic Anchor to Determine Who’s Out There
Eisen et al. 1992
87. rRNA: A Phylogenetic Anchor to Determine Who’s Out There
• Biology not similar enough
Eisen et al. 1992
91. How can we best use metagenomic data?
• Many possible uses including:
– Improvements on rRNA based phylotyping and species diversity measurements
– Adding functional information on top of phylogenetic/species diversity information
• Most/all possible uses either require or are improved with phylogenetic analysis
93. Shotgun Sequencing Allows Use of Other Markers
[Figure: “Sargasso Phylotypes” bar chart. Y-axis: Weighted % of Clones (0 to 0.5). X-axis: Major Phylogenetic Group (Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Epsilonproteobacteria, Deltaproteobacteria, Cyanobacteria, Firmicutes, Actinobacteria, Chlorobi, CFB, Chloroflexi, Spirochaetes, Fusobacteria, Deinococcus-Thermus, Euryarchaeota, Crenarchaeota). Markers: EFG, EFTu, rRNA, RecA, RpoB, HSP70.]
Venter et al., Science 304: 66-74. 2004
106. Commonly Used Binning Methods Did not Work Well
• Assembly
– Only Baumannia generated good contigs
• Depth of coverage
– Everything else 0-1X coverage
• Nucleotide composition
– No detectable peaks in any vector we looked at
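For readers unfamiliar with the signals named above, here is a rough Python sketch of two of them: nucleotide composition (GC content and a k-mer frequency vector) and depth of coverage. The function names and the choice of k = 4 are assumptions made for illustration; the slides do not say which composition vectors were actually examined.

```python
from collections import Counter
from itertools import product


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases of a sequence."""
    seq = seq.upper()
    acgt = sum(1 for base in seq if base in "ACGT")
    gc = sum(1 for base in seq if base in "GC")
    return gc / acgt if acgt else 0.0


def kmer_frequencies(seq, k=4):
    """Composition vector: relative frequency of every ACGT k-mer,
    in lexicographic order (4**k entries)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[kmer] for kmer in kmers)
    return [counts[kmer] / total if total else 0.0 for kmer in kmers]


def depth_of_coverage(read_lengths, contig_length):
    """Average coverage: total sequenced bases divided by contig length."""
    return sum(read_lengths) / contig_length


print(gc_content("GGCCAATT"))
print(len(kmer_frequencies("ACGTACGT")))
print(depth_of_coverage([100, 100, 50], 500))
```

Composition-based binning would then look for clusters (or peaks) among such vectors across contigs; the slide reports that no such peaks were detectable in this data set.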
109. Binning by Phylogeny
• Four main “phylotypes”
– Gamma proteobacteria (Baumannia)
– Arthropoda (sharpshooter)
– Bacteroidetes (Sulcia) - only a.a. genes here
– Alpha-proteobacteria (Wolbachia)
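Once each sequence has been placed phylogenetically, the binning step itself is simple bookkeeping, as the following Python sketch suggests. It assumes that the phylotype of each sequence has already been determined by some external phylogenetic analysis; the sequence identifiers, the `bin_by_phylogeny` helper, and the "unbinned" fallback are hypothetical, while the phylotype-to-organism mapping follows the slide.

```python
from collections import defaultdict

# Mapping from phylotype to organism bin, as listed on the slide.
PHYLOTYPE_TO_BIN = {
    "Gammaproteobacteria": "Baumannia",
    "Arthropoda": "sharpshooter",
    "Bacteroidetes": "Sulcia",
    "Alphaproteobacteria": "Wolbachia",
}


def bin_by_phylogeny(placements):
    """Group sequence IDs into organism bins by their assigned phylotype.

    placements: dict mapping sequence ID -> phylotype name, assumed to come
    from a separate phylogenetic analysis that is not shown here.
    """
    bins = defaultdict(list)
    for seq_id, phylotype in placements.items():
        bins[PHYLOTYPE_TO_BIN.get(phylotype, "unbinned")].append(seq_id)
    return dict(bins)


# Hypothetical placements for illustration only.
placements = {
    "read_001": "Gammaproteobacteria",
    "read_002": "Bacteroidetes",
    "read_003": "Arthropoda",
    "read_004": "Bacteroidetes",
    "read_005": "Cyanobacteria",
}

bins = bin_by_phylogeny(placements)
for organism, seq_ids in bins.items():
    print(organism, seq_ids)
```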
110. Wu et al. 2006 PLoS Biology 4: e188.
111. Essential Amino Acid Synthesis
Wu et al. 2006 PLoS Biology 4: e188.
112. Sulcia makes amino acids
Baumannia makes vitamins and cofactors
Wu et al. 2006 PLoS Biology 4: e188.