Robert Beiko
Faculty of Computer Science*
Dalhousie University
Halifax, 2 feet of snow last week, Canada
April 5, 2014
The dream of a Tree of Life
Can a ToL be
[correctly]
[reliably]
[accurately]
inferred?
Woese
“All happy phylogenies are alike;
each unhappy phylogeny is
unhappy in its own way.”
- Evolution Leo
Tolstoy
Creevey et al. Proc. R. Soc. Lond. B (2004)
Early ancestral signal
is probably gone
It gets
worse
W. Ford Doolittle, Sci Am (1999)
OMFG it gets even worse
Kunin et al. (2005) Genome Res
make it stop make it stop
Dagan et al. (2008) PNAS
Do not adjust your model
What is the meaning of this??
• Signal saturation + tiny branches that happened a
long, long, long time ago
• Other unpleasant biases (G+C, rates, etc)
• Lateral gene transfer
Finding LGT
K-mers or codon usage
Wang et al. (2001) MBE
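A minimal sketch of the compositional idea, assuming a plain k-mer frequency profile and Euclidean distance (both choices are illustrative and are not taken from Wang et al.). The sequences and the function names `kmer_profile` and `profile_distance` are invented for this example: a gene whose k-mer profile sits far from the host genome's profile is a candidate for a closer look, nothing more.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_profile(seq, k=4):
    """Return relative frequencies of all 4**k DNA k-mers found in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def profile_distance(p, q):
    """Euclidean distance between two k-mer frequency vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Toy sequences, invented for illustration only.
genome = "ATGCGCGCGGCCGCGCATGCGCGGCGCCGCGC" * 4
native_gene = "GCGCGGCCGCGCATGCGCGG"   # GC-rich, like the genome
odd_gene = "ATATTTAAATATTAATTTAA"      # AT-rich, unlike the genome

genome_profile = kmer_profile(genome, k=2)
d_native = profile_distance(kmer_profile(native_gene, k=2), genome_profile)
d_odd = profile_distance(kmer_profile(odd_gene, k=2), genome_profile)

# The compositionally unusual gene should sit farther from the genome profile.
print(d_native < d_odd)
```

Real analyses would need longer sequences and a proper null model; short genes give noisy profiles.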
Phylogenetic discordance
Concordance weighted Discordance weighted
Euchlamydispirokaryotes
Extremarchalsobacteriae
Phylogenetics!
MAFs, SPRs, LGTs
Chris Whidden
+ Norbert Zeh
Building a MAF by
edge cutting
Example case: a & c are sisters in the species tree, but not in the gene tree.
What can we do to the gene tree?
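The example case can be checked mechanically. This is a small sketch, assuming rooted binary trees written as nested tuples of leaf names; the two toy trees and the helper names `clades` and `are_sisters` are made up for illustration and are not from the talk.

```python
def clades(tree):
    """Return every clade of a rooted tree as a frozenset of leaf names.

    A tree is either a leaf name (str) or a tuple of subtrees.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    leaves(tree)
    return found


def are_sisters(tree, x, y):
    """True if x and y form a two-leaf clade in a rooted binary tree."""
    return frozenset([x, y]) in clades(tree)


# Hypothetical trees: a and c are sisters in the species tree only.
species_tree = ((("a", "c"), "b"), "d")
gene_tree = ((("a", "b"), "c"), "d")

print(are_sisters(species_tree, "a", "c"))
print(are_sisters(gene_tree, "a", "c"))

# Clades of the species tree that the gene tree fails to recover.
discordant = clades(species_tree) - clades(gene_tree)
print(discordant)
```

The set difference lists where the two trees disagree; it does not say which edges to cut to reconcile them.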
• Naïve case: O(3^k n)
• Fancy refinements: O(2.42^k n)
• Even fancier refinements: O(2^k n) (conjectured)
FIXED PARAMETER TRACTABLE –
Exponential in the distance between trees, not the
number of leaves
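A rough arithmetic sketch of what these bounds imply, reading them as base^k times n, with k the distance between the trees and n the number of leaves (that reading of k and n is an assumption based on the wording above). Constant factors hidden by the O() notation are ignored, and n = 244 is just an illustrative leaf count.

```python
def bound(base, k, n):
    """Value of base**k * n, ignoring the constant factor hidden by O()."""
    return base ** k * n


n = 244  # illustrative number of leaves
for k in (5, 10, 20):
    naive = bound(3, k, n)
    refined = bound(2.42, k, n)
    print(f"k={k:2d}  3^k n = {naive:.3e}  "
          f"2.42^k n = {refined:.3e}  ratio = {naive / refined:.1f}")

# Doubling the number of leaves only doubles the bound...
print(bound(3, 10, 2 * n) / bound(3, 10, n))
# ...while one extra unit of distance multiplies it by the base.
print(bound(3, 11, n) / bound(3, 10, n))
```

The gap between the two bases widens quickly with k, which is why shaving the base matters more than anything done to the linear factor.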
Hypotheses about LGT
The Complexity Hypothesis
(Jain et al., 1999)
• “Informational” proteins have more interactions
with other proteins in the cell, and are therefore
less likely to be successfully transferred than, say,
metabolic stuff
• Cohen et al. (2011): forget about function, it’s all
about the connections with other proteins in the
cell
The Selfish Operon Hypothesis:
Lawrence and Roth (1996)
• Genes associate in operons because it facilitates
transfer of all constituents of a pathway at once
• If the genes were dispersed throughout the
genome, then the selective advantage of a pathway
could not be propagated via transfer
The Public Goods Hypothesis:
McInerney et al. (2011)
• Genes are public goods that can be freely shared
and cannot be excluded from being available
• These genes are constantly acquired and integrated
into genomes, invalidating the idea of a unifying
Tree of Life
Highways of gene sharing:
Beiko et al. (2005)
• Gene sharing occurs preferentially between particular
lineages, and successful gene acquisitions often
reflect shared ecology
LGT stories
P. aeruginosa
P. fluorescens
P. lePewtida
P. syringae
P. entomophila
P. stutzeri
P. mendocina
(Catherine) Holloway and Beiko, 2010
“Plume”
Proteobacteria
Planar is plainer, could be pain-er
Beiko, 2011
244 taxa
40,631 trees
= Bacterial SPR supertree
LGT patterns for Clostridium
Whidden et al., 2014
Cold case – Aquifex aeolicus & friends
(Rob) Eveleigh et al., 2013
LGT in the Wild
Hehemann et al. (2010) Nature
Smillie et al. (2011) Science
Lachnospiraceae – Gut / mouth enthusiasts
(Conor) Meehan and Beiko (2014) GBE
“Good” strains ..?
“Not so good” strains ..?
Butyrate production – a crucial
function, subject to LGT
Finding LGT in the
microbiome?
• Illumina sequencing - aaaaargh!
• Mixed samples! [imagine what happens
when you try to assemble!]
• Strain-level differentiation!
• etc
What does it all mean?
LGT seriously undermines the recovery (and validity?) of
the Tree of Life
Even so, aggregation methods (supertrees, etc.) can
provide a useful scaffold for inferring LGT events
LGT serves as a useful starting point for hypotheses of
habitat adaptation / invasion
Metagenomic data offer new context to LGT events (and
genomic data show we should be looking at
communities), but present huge challenges to inference
Beiko Deep Genomics presentation - "Grand theft operon - lateral city"
