Paul Lewis SSB Past-President's address at Evol2016
1. © Copyright 2016 by Paul O. Lewis
Entropy and information
in phylogenetics
Past-President
Society of Systematic Biologists
Paul O. Lewis
Evolution2016
Joint Annual Meeting of SSE, ASN, and SSB
Austin, Texas ~ 19 June 2016
2.
What is information?
details, particulars, facts, figures, statistics, data;
knowledge, intelligence; instruction, advice,
guidance, direction, counsel, enlightenment; news,
word; hot tip; informal: info, lowdown, dope, dirt,
inside story, scoop, poop.
— Synonyms of information in the Oxford American
Writer’s Thesaurus
3.
Does information=data?
Taxon 1   AAAAAAAAAAAAAAAAAAAAAAAAA
Taxon 2   AAAAAAAAAAAAAAAAAAAAAAAAA
Taxon 3   AAAAAAAAAAAAAAAAAAAAAAAAA
Taxon 4   AAAAAAAAAAAAAAAAAAAAAAAAA
Taxon 5   AAAAAAAAAAAAAAAAAAAAAAAAA
Taxon 6   AAAAAAAAAAAAAAAAAAAAAAAAA
Taxon 7   AAAAAAAAAAAAAAAAAAAAAAAAA
Taxon 8   AAAAAAAAAAAAAAAAAAAAAAAAA
Taxon 9   AAAAAAAAAAAAAAAAAAAAAAAAA
Taxon 10  AAAAAAAAAAAAAAAAAAAAAAAAA
Taxon 11  AAAAAAAAAAAAAAAAAAAAAAAAA
Taxon 12  AAAAAAAAAAAAAAAAAAAAAAAAA
5.
Information=Data?
Taxon 1   GGGTTGAATGGGGTGCGACTTATTC
Taxon 2   GCGGCGATAGACTGCTACTACGTGC
Taxon 3   CCCGTGGATAGCGACGTCTACAAGA
Taxon 4   GGCTGTCGTAGCTTCCGTGTAATAC
Taxon 5   CCGGAGGCAAACACCCTGTTCCCCC
Taxon 6   GGGCAATATATATCCGCACCGCTCG
Taxon 7   AAGAGCCGACAAGTAGAATCGGGAT
Taxon 8   AGTAGCACAAGCGACACGGCAATAA
Taxon 9   GTCGTGTTTTACCAGAGGTTGCATA
Taxon 10  GCGTTGTAACACCCTTACCCTCTTT
Taxon 11  AGTACATGTATGTTTCCTTCGTTCG
Taxon 12  TGGGTTCCGCCCCGAGACGAGGCTC
7.
The correct exposure for
phylogenetic inference
[Figure: tree with a scale bar of 0.02 subst./site]
Data simulated on the tree
above are nearly optimal for
phylogeny estimation
Taxon 1   ACGGTCGAGGCGTAGACTCGATCAA
Taxon 2   ACGGTCGAGGCGTAGACTCGATCAA
Taxon 3   ACGGTCGAGGCGTAGACTCGATCAA
Taxon 4   ACGGTCGATGCGTAGACTCGATCAA
Taxon 5   ACGGTCGACGCGTATACTCGATCAA
Taxon 6   ACGGTCGACGCGTATACTCGATCAA
Taxon 7   ACGGTCGACGCGGATACTCGATCAA
Taxon 8   ACGGTCGACGCGTATACTCGATCAA
Taxon 9   ACGGTTGACGCATATACTCGATCAA
Taxon 10  ACGGTTGACGCATATACTCGATCAA
Taxon 11  ACCGTTGACGCATATACTCGATCAA
Taxon 12  ACCGTTGACGCATATACTCGATCAA
8.
Negatively skewed parsimony tree length
distributions indicate information content
[Figure: tree length distributions for "Noisy" and "Informative" data, with the most parsimonious tree marked]
Fitch 1984
9.
The g1 statistic quantifies skewness, and hence
information content
g1 = 0.05 (slightly positive); g1 = -0.96 (quite negative)
Hillis 1991; Huelsenbeck 1991
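As a rough illustration of the statistic, the sketch below computes g1 as the third central moment divided by the 1.5 power of the second central moment. The tree lengths are invented for illustration and are not taken from the talk.

```python
def g1(values):
    """Sample skewness g1 = m3 / m2**1.5, from central moments m2 and m3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical tree lengths (steps); assumptions made only for this example.
symmetric = [10, 11, 12, 13, 14]
left_tailed = [6, 12, 13, 14, 14, 15, 15, 15]

print(g1(symmetric))    # zero skew for a symmetric distribution
print(g1(left_tailed))  # negative: a long tail toward short trees
```

A strongly negative value means a few trees are much shorter than the bulk, which is the pattern the slide associates with informative data.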
10.
Taxon 1   A A C T G T
Taxon 2   A A C T G T
Taxon 3   C A G A T T
Taxon 4   C G G A C T
Taxon 5   C G G A C T
Shuffling taxon assignments within characters (sites)
removes hierarchical structure due to history
Archie 1989; Faith & Cranston 1991
11.
[Figure: the 5-taxon matrix (Taxon 1 to Taxon 5) with states shuffled among taxa within each character]
Shuffling taxon assignments within characters (sites)
removes hierarchical structure due to history
Archie 1989; Faith & Cranston 1991
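A minimal sketch of the shuffling step, assuming the alignment is held as a dictionary of equal-length strings. The small matrix is a reading of the 5-taxon example on the slide and should be treated as illustrative; the function name is hypothetical.

```python
import random

def shuffle_within_sites(alignment, rng):
    """Permute states among taxa independently at each site (column)."""
    taxa = list(alignment)
    n_sites = len(alignment[taxa[0]])
    columns = []
    for i in range(n_sites):
        column = [alignment[t][i] for t in taxa]
        rng.shuffle(column)  # breaks the association between taxa at this site
        columns.append(column)
    return {t: "".join(col[k] for col in columns) for k, t in enumerate(taxa)}

# Illustrative matrix (assumed reconstruction of the slide's example).
alignment = {
    "Taxon 1": "AACTGT",
    "Taxon 2": "AACTGT",
    "Taxon 3": "CAGATT",
    "Taxon 4": "CGGACT",
    "Taxon 5": "CGGACT",
}
shuffled = shuffle_within_sites(alignment, random.Random(1))
for taxon, seq in shuffled.items():
    print(taxon, seq)
```

Each column keeps its state frequencies, so only the covariation among sites (the hierarchical signal) is destroyed.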
12.
Shuffling tests easily
differentiate random versus
properly exposed data
[Figure labels: "unshuffled original"; "now that's significant!"]
Archie 1989; Faith & Cranston 1991
13.
Properly exposed data has lower nucleotide
compositional entropy than saturated data
SLOWLY EVOLVING
TGCGTGGCGTTGGGGTAGCCCTCAC
TGCGTGGCGTTTGGGTAGCCCTCAC
TGCGTGGCGTTGGGGTAGCCCTCAC
TGCGTGGCGTTGGGGTAGCCCTCAC
TGCGTGGCGTTGGGGTAGCCCTCAC
TGCGTGGCGTTGGGGTAGCCCTTAC
TGCGTGGCGTTGGGGTAGCCCTCAC
TGCGTGGCGTTGGGGTAGCCCTCAC
TGCGTGGCGTTGGGGTAGCCCTCAC
TGCGTGGCGTTGGGGTAGCCCTCAC
TGCGTGGCGTTGGGGTAGCCCTCAC
TGCGTGGCGTGGGGGTAGCCCTCAC
SATURATED
GGGTTGAATGGGGTGCGACTTATTC
GCGGCGATAGACTGCTACTACGTGC
CCCGTGGATAGCGACGTCTACAAGA
GGCTGTCGTAGCTTCCGTGTAATAC
CCGGAGGCAAACACCCTGTTCCCCC
GGGCAATATATATCCGCACCGCTCG
AAGAGCCGACAAGTAGAATCGGGAT
AGTAGCACAAGCGACACGGCAATAA
GTCGTGTTTTACCAGAGGTTGCATA
GCGTTGTAACACCCTTACCCTCTTT
AGTACATGTATGTTTCCTTCGTTCG
TGGGTTCCGCCCCGAGACGAGGCTC
[Figure: nucleotide (A, C, G, T) composition graphics for the two data sets]
Xia et al. 2003
14.
Plotting pairwise p-distance against model-corrected
distance reveals overexposure graphically
[Figure: proportion different (y-axis, 0 to 0.5) plotted against estimated distance (x-axis, 0 to 5) for 2nd codon positions and 3rd codon positions; annotated "??"]
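To make the relationship between the two axes concrete, here is a sketch using the Jukes-Cantor (JC69) correction. The choice of JC69 is an assumption made for illustration; the slide does not say which model produced the estimated distances.

```python
import math

def jc69_distance(p):
    """JC69-corrected distance for an observed proportion of differing sites p."""
    if not 0 <= p < 0.75:
        raise ValueError("JC69 correction is defined only for 0 <= p < 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)

# As p approaches its ceiling, small changes in p map to large changes in distance.
for p in (0.05, 0.30, 0.60, 0.70):
    print(f"p = {p:.2f}  corrected distance = {jc69_distance(p):.3f}")
```

The flattening of p relative to the corrected distance is what a saturation plot makes visible.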
15.
The Bayesian framework provides a
natural way to quantify information
[Figure: uniform probability density over θ from 0.0 to 1.0, with the 2-tailed coin at θ = 0, the fair coin at θ = 0.5, and the 2-headed coin at θ = 1]
θ = Pr(coin lands heads on any given flip)
16.
The information in just 3 flips is
enough to make trick coins impossible
[Figure: probability densities over θ from 0.0 to 1.0, with the 2-tailed coin (θ = 0) and 2-headed coin (θ = 1) marked]
θ = Pr(coin lands heads on any given flip)
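A small sketch of why a few flips can rule out trick coins, assuming a uniform prior on θ and, hypothetically, 2 heads and 1 tail in the 3 flips (the slide does not list the outcomes). Under these assumptions the posterior is a Beta(heads + 1, tails + 1) density.

```python
import math

def posterior_density(theta, heads, tails):
    """Beta(heads + 1, tails + 1) density: the posterior under a uniform prior."""
    a, b = heads + 1, tails + 1
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * theta ** (a - 1) * (1 - theta) ** (b - 1)

heads, tails = 2, 1  # assumed outcome of the 3 flips
for theta in (0.0, 0.5, 1.0):
    print(theta, posterior_density(theta, heads, tails))
```

Once at least one head and one tail have been seen, the density is zero at both θ = 0 and θ = 1.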
17.
The difference between prior and
posterior measures information content
[Figure: prior and posterior densities over θ from 0.0 to 1.0]
θ = Pr(coin lands heads on any given flip)
Brown 2014
18.
Information Theory
Dissonance
Additivity
Scaling Storm
Polytomy Rainbow
Why?
19.
“Information is the resolution of uncertainty”
— Claude Shannon, 1948
20.
The uncertainty Claude Shannon was
interested in resolving was “Which symbol was
last transmitted over a telegraph system?”
Sender chooses 1 of 8
possible symbols
Receiver must resolve which
symbol was sent
Information = number of
questions receiver needs to ask to
determine which symbol was sent
[Figure: the 8 possible symbols, with the one sent marked "?"]
21.
Any 1 of the 8 symbols can be identified
by answering 3 yes/no questions
[Figure: binary decision tree of yes/no questions (for example "circle?" and "blue?") that assigns each of the 8 symbols a 3-bit code: 111, 110, 101, 100, 011, 010, 001, 000]
22.
Dichotomous keys embody
Shannon's basic units of information
vascular?
  no: bryophyte
  yes: seeds?
    no: fern
    yes: flowers?
      no: gymnosperm
      yes: angiosperm
23.
If each symbol has an equal chance of being
chosen by the sender, then 3 bits are needed to
identify each symbol on average
Probability of each of the 8 symbols: 1/8
Codes: 111 110 101 100 011 010 001 000 (3 bits each)
entropy = 3
entropy equals
average number of
questions needed
24.
If 1 symbol is sent half the time, and the 4
remaining symbols are equally probable, then
only 2 bits are needed on average
Codes: 1 011 010 001 000
Probabilities: 1/2 1/8 1/8 1/8 1/8
Only 1 bit needed
half the time
3 bits needed
the other half of
the time
entropy = 2
25.
If only 1 symbol is ever sent, then no
questions need be asked by the receiver,
and thus no information is required
Probability of the symbol sent: 1
0 questions need be asked
entropy = 0
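The three entropy values quoted in these examples (3, 2, and 0 bits) can be checked with Shannon's formula, H = Σ p log2(1/p). The sketch below is a direct transcription of that formula; the function name is just a label chosen here.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits; outcomes with zero probability contribute nothing."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

uniform = [1 / 8] * 8          # 8 equally probable symbols
skewed = [1 / 2] + [1 / 8] * 4  # 1 symbol half the time, 4 others equally probable
certain = [1.0]                 # only 1 symbol is ever sent

print(entropy_bits(uniform))  # 3.0
print(entropy_bits(skewed))   # 2.0
print(entropy_bits(certain))  # 0.0
```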
26.
In the previous examples, there is
no uncertainty at the receiving end
100% correct
27.
Noise means that the data received do not
contain enough information to unambiguously
identify the symbol transmitted
Probability of each candidate symbol: 101 (73%); 100, 001, and 111 (8.1% each)
1.6 bits received of 3 bits sent
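The percentages on this slide are consistent with a channel in which each of the 3 bits flips independently with probability 0.1. That error rate is an inference made here, not something the slide states. Under that assumption, and with all 8 symbols equally probable beforehand, the remaining uncertainty and the information received can be sketched as:

```python
import math
from itertools import product

def entropy_bits(probs):
    """Shannon entropy in bits."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

flip = 0.1            # assumed per-bit error probability
received = (1, 0, 1)  # code observed at the destination

# Probability of each candidate symbol given the received code (uniform prior).
posterior = {}
for symbol in product((0, 1), repeat=3):
    n_diff = sum(a != b for a, b in zip(symbol, received))
    posterior[symbol] = flip ** n_diff * (1 - flip) ** (3 - n_diff)

remaining = entropy_bits(posterior.values())
information = 3 - remaining
print(round(posterior[(1, 0, 1)], 3), round(posterior[(1, 0, 0)], 3))
print(round(information, 1))
```

The most probable symbol gets about 73%, each single-flip neighbor about 8.1%, and roughly 1.6 of the 3 bits survive the noise.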
28.
If not all bits are transmitted, there will
also be uncertainty at the destination
Bits received: 10
Candidate symbols: 100 (50%), 101 (50%)
29.
Estimating the phylogeny for 4 taxa involves identifying
1 symbol (a tree) from a total of 15 symbols
[Figure: the 15 possible trees for taxa A, B, C, and D, coded 0000 through 1110; the tree coded 0111 is transmitted]
3.9 bits sent
3.9 bits received
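The 15 symbols and 3.9 bits can be reproduced by counting rooted binary trees for 4 taxa, (2n - 3)!!, and taking log2 of the count. The function below is a sketch written for this check; its name is an assumption.

```python
import math

def n_rooted_trees(n_taxa):
    """Number of rooted binary labeled topologies, (2n - 3)!!, for n >= 2."""
    count = 1
    for k in range(3, 2 * n_taxa - 2, 2):
        count *= k
    return count

trees = n_rooted_trees(4)
print(trees)                       # 15
print(round(math.log2(trees), 1))  # 3.9
```

With all 15 trees equally probable, identifying one of them resolves log2(15) bits of uncertainty.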
30.
Simulating sequence data on a tree captures
information about that tree's topology
[Figure: the 15 possible trees for taxa A, B, C, and D; one is labeled "model tree"]
31.
0 sites
32.
1 site
33.
10 sites
34.
100 sites
35.
1000 sites
1000 sites capture
enough information to
identify the tree topology
chosen as the model tree
36.
In 4-taxon simulations, information
estimation works as you might expect
Relative rate   %I
0.01            18
0.1             99
1               100
10              64
100             1.5
info highest at optimal rate
Percent missing   %I
0                 100
50                98
100               0
info decreases with missing data
Rate variance   %I
1               100
10              97
100             13
1000            0
info decreases with rate heterogeneity
37.
Information can be
false information!
38.
Dissonance
Additivity
Scaling Storm
Polytomy Rainbow
Why?
Information Theory
39.
Horizontal transfer results in conflicting
information about the placement of bloodroot
(Sanguinaria)
[Figure: two rps11 mtDNA trees, one from the 5' end and one from the 3' end (horizontally transferred from monocots), each with tips Bocconia and Eschscholzia (Papaveraceae), Oryza and Disporum (monocots), and Sanguinaria]
Bergthorsson et al. 2003
40.
The 5' data contains 2.9 of 3.9 bits of
information
[Figure: posterior distribution over the 15 possible trees for taxa S, B, E, D, and O]
74.5%
information
41.
Likewise, the 3' data captures 2.6 of 3.9
bits of information
[Figure: posterior distribution over the 15 possible trees for taxa S, B, E, D, and O]
66.8%
information
42.
What do you expect will happen if we
concatenate the two data sets?
Data 1
A ACGTACGTA
B ACGTACGTA
C CCATGCGCA
D GTACGCACA
E GTACGCACA
Data 2
A ATATGTGTG
B GCGCACACA
C GCGCACACA
D ATATGTGTG
E ATATGGTTG
Concatenated
A ACGTACGTA ATATGTGTG
B ACGTACGTA GCGCACACA
C CCATGCGCA GCGCACACA
D GTACGCACA ATATGTGTG
E GTACGCACA ATATGGTTG
[Figure: tree with tips A, B, C, D, and E, labeled "Tree file"]
43.
Concatenating the 3' and 5' data, we might
expect the conflict to be expressed as noise
[Figure: hypothetical posterior distribution over the 15 possible trees, with the 5' tree and 3' tree marked]
44.
Instead, we get all 3.9 bits of information needed, but
identify a tree that is neither the 3' nor the 5' tree!
[Figure: posterior distribution over the 15 possible trees, with the 5' tree and 3' tree marked]
concatenated
data contains
100% of info!
45.
Each data set strongly rejects the other's
favorite tree, so a mediocre tree wins everything
[Figure: the 5' and 3' trees for Bocconia, Eschscholzia, Oryza, Disporum, and Sanguinaria]
Topology          5'      3'      Concatenated
(((S,D),O),E,B)   ---     0.64    ---
(((S,O),D),E,B)   ---     0.18    ---
(((O,D),S),E,B)   0.11    0.18    1
(O,D,(B,(S,E)))   0.12    ---     ---
(O,D,(E,(S,B)))   0.77    ---     ---
Info              74.5%   66.8%   100%
5' data rejects these 2 trees: (((S,D),O),E,B) and (((S,O),D),E,B)
3' data rejects these 2 trees: (O,D,(B,(S,E))) and (O,D,(E,(S,B)))
This loser wins everything: (((O,D),S),E,B)
46.
Merging tree files provides a means of
measuring information dissonance
[Figure: Data 1 yields Trees 1 and Data 2 yields Trees 2; the two tree files are combined into a Merged tree file, shown alongside the tree file from the Concatenated data]
47.
Merged tree file says the same thing as
individual tree files if there is no dissonance
Topology          5'      5'      Merged
(((O,D),S),E,B)   0.11    0.11    0.11
(O,D,(B,(S,E)))   0.12    0.12    0.12
(O,D,(E,(S,B)))   0.77    0.77    0.77
Info              74.5%   74.5%   74.5%
same, no dissonance
48.
Dissonance is the difference between
merged info and average info
Topology          5'      3'      Merged
(((S,D),O),E,B)   ---     0.64    0.32
(((S,O),D),E,B)   ---     0.18    0.09
(((O,D),S),E,B)   0.11    0.18    0.14
(O,D,(B,(S,E)))   0.12    ---     0.06
(O,D,(E,(S,B)))   0.77    ---     0.39
Info              74.5%   66.8%   48.6%
average info = 70.7%
dissonance = 70.7% - 48.6% = 22.1%
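The table's arithmetic can be approximated from the rounded posterior probabilities it shows. The sketch below assumes information is measured as (maximum entropy - posterior entropy) / maximum entropy over the 15 possible trees, and that merging gives the two tree files equal weight. Because the inputs are rounded, the results match the slide only to within a few tenths of a percentage point.

```python
import math

N_TREES = 15  # possible trees for the 5 taxa

def entropy_bits(probs):
    """Shannon entropy in bits."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

def information(posterior):
    """Fraction of the maximum entropy removed by the posterior."""
    h_max = math.log2(N_TREES)
    return (h_max - entropy_bits(posterior.values())) / h_max

five_prime = {"(((O,D),S),E,B)": 0.11, "(O,D,(B,(S,E)))": 0.12, "(O,D,(E,(S,B)))": 0.77}
three_prime = {"(((S,D),O),E,B)": 0.64, "(((S,O),D),E,B)": 0.18, "(((O,D),S),E,B)": 0.18}

# Merging two equally sized tree files averages the tree frequencies.
topologies = set(five_prime) | set(three_prime)
merged = {t: (five_prime.get(t, 0) + three_prime.get(t, 0)) / 2 for t in topologies}

average_info = (information(five_prime) + information(three_prime)) / 2
dissonance = average_info - information(merged)
print(f"5' info {information(five_prime):.1%}, 3' info {information(three_prime):.1%}")
print(f"merged info {information(merged):.1%}, dissonance {dissonance:.1%}")
```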
49.
Additivity
Scaling Storm
Polytomy Rainbow
Why?
Information Theory
Dissonance
50.
A sample of trees can be used to
build a conditional clade graph
[Figure: two sampled trees for taxa A to F and the resulting conditional clade graph: ABCDEF splits into ABC|DEF; ABC resolves as AB|C or A|BC; DEF resolves as D|EF or DE|F; edge probabilities of 1 and 0.5]
Larget 2013
51.
Clade mixing-and-matching allows us to
greatly extend the reach of our sample
[Figure: the conditional clade graph (ABCDEF; ABC|DEF; AB|C, A|BC, D|EF, DE|F; edge probabilities of 1 and 0.5) and the four trees for taxa A to F that it implies]
Larget 2013
52.
Entropy, information, and dissonance
can all be partitioned by clade
[Figure: conditional clade graph for taxa A to F (ABCDEF; ABC|DEF; AB|C, A|BC, D|EF, DE|F) with four trees at probability 0.25 each; clade contributions of 6.7, 0.6, and 0.6 bits]
this clade provides the largest
contribution because here the 945
possible trees are cut down to just 9
Information = 7.9 bits
= 6.7 + 0.6 + 0.6
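One way to arrive at the slide's numbers, offered as a sketch rather than as the method used in the talk: treat the 945 rooted trees for 6 taxa as equally probable beforehand, note that fixing the split ABC|DEF leaves 3 × 3 = 9 trees, and then compare a 1/3, 1/3, 1/3 prior with a 0.5, 0.5 posterior inside each 3-taxon clade.

```python
import math

def n_rooted_trees(n_taxa):
    """Number of rooted binary labeled topologies, (2n - 3)!!, for n >= 2."""
    count = 1
    for k in range(3, 2 * n_taxa - 2, 2):
        count *= k
    return count

total = n_rooted_trees(6)                            # 945 possible trees
compatible = n_rooted_trees(3) * n_rooted_trees(3)   # 9 trees contain ABC|DEF

root_info = math.log2(total / compatible)  # contribution of the ABC|DEF split
clade_info = math.log2(3) - 1.0            # 3 resolutions a priori, 2 equally likely after
total_info = root_info + 2 * clade_info

print(round(root_info, 1), round(clade_info, 1), round(total_info, 1))
# Cross-check: 4 equally probable trees out of 945.
print(round(math.log2(total) - math.log2(4), 1))
```

The clade-by-clade sum and the direct calculation agree, which is the additivity the slide describes.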
53.
Two data sets simulated on trees differing only in
the swapping of two tips illustrate that
dissonance can pinpoint disagreement
54.
All dissonance
attributed to
clade
containing
swapped taxa
55.
Scaling Storm
Polytomy Rainbow
Why?
Information Theory
Dissonance
Additivity
56.
There are 5.6×10^26 distinct labeled
unrooted binary tree topologies for 24 taxa
57.
A computer examining 1 billion trees/second would
have to start before the Big Bang in order to finish
looking through all these trees!
58.
An MCMC sample of 1 trillion trees is still 564 trillion
times too small to sample each tree once
59.
Bottom line: it is impossible to accurately estimate the
entropy of a posterior representing zero information
for any reasonable number of taxa
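These magnitudes follow from the count of unrooted binary trees, (2n - 5)!!. The sketch below redoes the arithmetic; the seconds-per-year constant and the roughly 13.8 billion year age of the universe are outside figures brought in for the comparison.

```python
def n_unrooted_trees(n_taxa):
    """Number of unrooted binary labeled topologies, (2n - 5)!!, for n >= 3."""
    count = 1
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count

trees_24 = n_unrooted_trees(24)
print(f"{trees_24:.1e}")  # about 5.6e+26

seconds = trees_24 / 1e9                 # at 1 billion trees per second
years = seconds / (365.25 * 24 * 3600)
print(f"{years:.2e} years")              # longer than about 1.38e10 years

shortfall = trees_24 / 1e12              # versus a sample of 1 trillion trees
print(f"{shortfall / 1e12:.0f} trillion times too small")
```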
60.
If data contains zero information, inadequate
sampling results in high estimated information
content
10,000 trees sampled
Taxa   Unrooted trees   Estimated information (%)
4      3                0
5      15               0
6      105              0
7      945              1
8      10,395           6
9      135,135          22
10     2,027,025        37
11     34,459,425       47
12     654,729,075      55
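The pattern in this table can be mimicked with a simplified stand-in for MCMC: draw 10,000 tree indices uniformly at random (a posterior with zero information) and estimate entropy from the observed frequencies. This is an illustration of the sampling artifact under those assumptions, not a reconstruction of how the table was produced.

```python
import math
import random
from collections import Counter

def estimated_information(n_trees, n_samples, rng):
    """Plug-in information estimate from a uniform sample of tree indices."""
    counts = Counter(rng.randrange(n_trees) for _ in range(n_samples))
    h = sum((c / n_samples) * math.log2(n_samples / c) for c in counts.values())
    return 1 - h / math.log2(n_trees)

rng = random.Random(2016)
for taxa, n_trees in ((4, 3), (8, 10_395), (12, 654_729_075)):
    info = estimated_information(n_trees, 10_000, rng)
    print(f"{taxa} taxa: estimated information {info:.1%}")
```

When the sample is far smaller than tree space, nearly every sampled tree is unique, so the estimated entropy cannot exceed log2 of the sample size and the information estimate is inflated.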
61.
[Figure: tree space for 12 taxa compared with the 10,000 trees sampled]
This little dot is how
much tree space
we've covered
65,473 times
larger than
sample size
62.
Polytomy Rainbow
Why?
Information Theory
Dissonance
Additivity
Scaling Storm
63.
Polytomy priors make it possible to
estimate low information content accurately
Trees per resolution class: 1 tree, 25 trees, 105 trees, 105 trees
Lewis, Holder & Holsinger 2005
64.
the star tree (resolution class 1)
65.
fully resolved (resolution class 4)
66.
more than doubles
size of tree space
67.
Make each of the 4
resolution classes
equally probable
under the prior
(0.25 each)
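The counts 1, 25, 105, and 105 match the number of unrooted labeled trees for 6 taxa with 1 to 4 internal nodes. The sketch below reproduces them with a counting recurrence that adds one tip at a time (either onto an existing edge, which creates a new internal node, or onto an existing internal node). It is offered as an illustration, not as the procedure used in the cited paper, and the 6-taxon reading is an inference from the counts.

```python
def trees_by_resolution_class(n_taxa):
    """Counts of unrooted labeled trees with m = 1 .. n - 2 internal nodes."""
    counts = {1: 1}  # 3 taxa: only the star tree
    for n in range(4, n_taxa + 1):
        new = {}
        for m in range(1, n - 1):
            # A tree with n - 1 tips and m - 1 internal nodes has n + m - 3 edges.
            on_edge = (n + m - 3) * counts.get(m - 1, 0)
            on_node = m * counts.get(m, 0)
            new[m] = on_edge + on_node
        counts = new
    return counts

counts = trees_by_resolution_class(6)
print(counts)                # {1: 1, 2: 25, 3: 105, 4: 105}
print(sum(counts.values()))  # 236 trees, versus 105 fully resolved

# Flat resolution class prior: 0.25 per class, split evenly within a class.
prior_per_tree = {m: 0.25 / c for m, c in counts.items()}
print(prior_per_tree[1], prior_per_tree[4])
```

Under this prior the star tree alone carries a quarter of the prior mass, which is why low-information posteriors become easy to sample.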
68.
Flat resolution class prior easy to
sample even for a 24-taxon problem
[Figure: histogram of sampled trees by resolution class (1 to 22); x-axis 0 to 500]
Info = 0.026%
10,000 trees sampled
69.
Highly informative data sets place all
posterior in the fully-resolved resolution class
[Figure: histogram of sampled trees by resolution class (1 to 22); x-axis 0 to 10000]
Info = 100%
10,000 trees sampled
All posterior on just 1
of the 5.6×10^26
possible trees!
70.
The Bayesian approach is better at assessing the
information content of 2nd vs. 3rd position sites
[Figure: proportion different (0 to 0.5) plotted against estimated distance (0 to 5) for 2nd codon positions and 3rd codon positions; annotated "Saturated?"]
71.
The Bayesian approach is better at assessing the
information content of 2nd vs. 3rd position sites
[Figure: plot of proportion different (0 to 0.5) against estimated distance (0 to 5), with labels 0.005 and 0.700]
3rd position sites:
info = 86.4%
2nd position sites:
info = 75.6%
3rd positions have
more information than
2nd positions!
72.
Using the resolution class prior does not change the
conclusion that 3rd position sites have more
information than 2nd position sites
[Figure: plot of proportion different (0 to 0.5) against estimated distance (0 to 5), with two histograms of sampled trees by resolution class (axis ticks 1 to 31)]
2nd position sites:
info = 30.2%
3rd position sites:
info = 54.9%
73.
Why?
Information Theory
Dissonance
Additivity
Scaling Storm
Polytomy Rainbow
74.
Why measure information
content?
• Morphology vs. molecules
75.
• Informed site-stripping
76.
• Impact of missing data (missing taxa, missing genes, random)
77.
• Partition gene tree conflict (dissonance between Trees 1 from Data 1 and Trees 2 from Data 2)
78.
• Profiling information content
79.
• Divergence time analyses
80.
Thanks!
~ UConn Collaborators ~
Ming-Hui Chen, Lynn Kuo, Louise Lewis, Karolina Fučíková,
Suman Neupane, Yu-Bo Wang, Daoyuan Shi
Supported by the National
Science Foundation
Department of Ecology and
Evolutionary Biology
http://dx.doi.org/10.1093/sysbio/syw042
Systematic Biology Advance Access
81.
Literature Cited
Archie, J. W. 1989. A randomization test for phylogenetic information in systematic data. Systematic Zoology 38(3):239–252.
Bergthorsson, U., Adams, K. L., Thomason, B., & Palmer, J. D. 2003. Widespread horizontal transfer of mitochondrial genes in flowering plants. Nature 424:197–201.
Brown, J. M. 2014. Detection of implausible phylogenetic inferences using posterior predictive assessment of model fit. Systematic Biology 63(3):334–348.
Faith, D. P., & Cranston, P. S. 1991. Could a cladogram this short have arisen by chance alone?: on permutation tests for cladistic structure. Cladistics 7(1):1–28.
Fitch, W. M. 1984. Cladistic and other methods: problems, pitfalls, and potentials. Chapter 12 in: Duncan, T., & Stuessy, T. F. (eds.), Cladistics: perspectives on the reconstruction of evolutionary history. Papers presented at a workshop on the theory and application of cladistic methodology, March 22-28, 1981, University of California, Berkeley. Columbia University Press, New York.
Hillis, D. M. 1991. Discriminating between phylogenetic signal and random noise in DNA sequences. Pp. 278–284 in: Miyamoto, M. M., & Cracraft, J. (eds.), Phylogenetic analysis of DNA sequences. Oxford University Press, New York.
Huelsenbeck, J. P. 1991. Tree-length distribution skewness: an indicator of phylogenetic information. Systematic Zoology 40(3):257–270.
Larget, B. 2013. The estimation of tree posterior probabilities using conditional clade probability distributions. Systematic Biology 62(4):501–511.
Lewis, P. O., Holder, M. T., & Holsinger, K. E. 2005. Polytomies and Bayesian phylogenetic inference. Systematic Biology 54(2):241–253.
Xia, X., Xie, Z., Salemi, M., Chen, L., & Wang, Y. 2003. An index of substitution saturation and its application. Molecular Phylogenetics and Evolution 26(1):1–7.
Claude Shannon photograph: http://www.itsoc.org/about/shannon
All other photographs by Paul O. Lewis