1. RidgeRace: ridge regression for continuous ancestral
character estimation on phylogenetic trees
Presentation by Rosemary McCloskey
Christina Kratsch¹, Alice C. McHardy¹
¹Department for Algorithmic Bioinformatics, Heinrich Heine University
November 6, 2014
Kratsch & McHardy RidgeRace November 6, 2014 1 / 13
9. Ancestral reconstruction
phylogeny: binary tree representing evolutionary relationships between organisms
- leaves: observed/sampled taxa
- internal nodes: common ancestors
ancestral reconstruction: estimation of characteristics of unseen ancestral taxa
- discrete (e.g. DNA sequence)
- continuous (e.g. body weight)
http://topicpages.ploscompbiol.org/wiki/Ancestral reconstruction
19. RidgeRace
Existing ancestral reconstruction algorithms:
- assume traits evolve along the tree according to a particular model (e.g. Brownian motion)
- assume fixed rates of evolution across some or all branches
- use ancestral reconstruction only as a stepping stone to examine correlated traits
RidgeRace:
- uses phylogenetic information only (no evolutionary model)
- allows any rate on any branch
- has ancestral reconstruction as its goal
22. Methods
Observed phenotypes are sums of contributions of each ancestral branch, plus the root.
$y_4 = g_0 + g_a + g_b + g_c$
Branch contributions are proportional to branch lengths.
$g_a = l_a \beta_a$
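To make the sum concrete, here is a minimal Python sketch. It assumes each branch contribution is the branch length times a per-branch coefficient (written `beta` here). The toy tree, branch lengths, coefficients, and root value are made-up numbers for illustration, not values from the slides.

```python
# Hypothetical toy tree: branches are named after their child node.
# Lengths and coefficients are invented for illustration only.
branches = {
    "a": {"parent": "root", "length": 2.0, "beta": 0.5},
    "b": {"parent": "a", "length": 1.0, "beta": -1.0},
    "c": {"parent": "b", "length": 4.0, "beta": 0.25},
}
g0 = 10.0  # value at the root (assumed)


def contribution(name):
    """Contribution of one branch: g_j = l_j * beta_j."""
    branch = branches[name]
    return branch["length"] * branch["beta"]


def phenotype(node):
    """Root value plus the contributions of every branch between the root and `node`."""
    total = g0
    while node != "root":
        total += contribution(node)
        node = branches[node]["parent"]
    return total


y4 = phenotype("c")        # leaf below branches a, b, c
state_b = phenotype("b")   # the same sum, stopped at an internal node
print(y4, state_b)
```

The same function evaluated at an internal node gives that node's value, which is why fitting the per-branch coefficients also yields ancestral states.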
$\beta$,
where
$L'_{ij} = \begin{cases} l_j & \text{if } j \to i \\ 0 & \text{otherwise.} \end{cases}$
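The matrix definition can be sketched in plain Python as follows. This is an illustrative sketch under several assumptions, not the authors' implementation: the condition "j → i" is read as "branch j lies on the path from the root to leaf i", the root value is simply taken as the mean of the leaf values, and the ridge penalty `lam` and the toy tree are hypothetical.

```python
def path_to_root(node, parent):
    """Branches (named after their child node) from `node` up to the root."""
    path = []
    while node in parent:
        path.append(node)
        node = parent[node]
    return path


def design_matrix(leaves, branch_names, parent, length):
    """Entry [i][j] is l_j if branch j is on the root-to-leaf-i path, else 0."""
    rows = []
    for leaf in leaves:
        on_path = set(path_to_root(leaf, parent))
        rows.append([length[b] if b in on_path else 0.0 for b in branch_names])
    return rows


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        known = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - known) / M[r][r]
    return x


def ridge(L, y, lam):
    """Ridge estimate: solve (L^T L + lam * I) beta = L^T y."""
    n, p = len(L), len(L[0])
    A = [[sum(L[i][j] * L[i][k] for i in range(n)) + (lam if j == k else 0.0)
          for k in range(p)] for j in range(p)]
    rhs = [sum(L[i][j] * y[i] for i in range(n)) for j in range(p)]
    return solve(A, rhs)


# Hypothetical toy tree ((t1, t2)u, t3) with made-up branch lengths and leaf values.
parent = {"u": "root", "t1": "u", "t2": "u", "t3": "root"}
length = {"u": 1.0, "t1": 1.0, "t2": 1.0, "t3": 2.0}
leaves = ["t1", "t2", "t3"]
branch_names = ["u", "t1", "t2", "t3"]

L = design_matrix(leaves, branch_names, parent, length)
y = [5.0, 7.0, 3.0]
g0 = sum(y) / len(y)  # assumption: root value taken as the leaf mean
beta = ridge(L, [yi - g0 for yi in y], lam=1.0)
u_state = g0 + length["u"] * beta[0]  # reconstructed value at internal node u
```

With the penalty in place the system is solvable even though there are more branches than leaves, which is the usual situation on a tree.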
78. Simulations
- random trees of size 30, 100, 200, 300, 400, 500
- phenotypic evolution by Brownian motion with $\sigma^2 \in \{0.5, 1, \ldots, 5\}$
- ancestral reconstruction with generalized least squares (GLS), maximum likelihood (ML), and RidgeRace
RidgeRace performed comparably to the other methods.
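A simulation of this kind could look roughly like the sketch below, which draws trait values down a tree under Brownian motion so that each child differs from its parent by a normal deviate with variance σ² times the branch length. The four-branch tree, the root value, and the seed are hypothetical, and only the σ² values written out on the slide are used because the intermediate values are elided there.

```python
import random


def simulate_bm(parent, length, order, root_value, sigma2, rng):
    """Trait value for every node: child = parent + N(0, sigma2 * branch length)."""
    value = {"root": root_value}
    for node in order:  # parents must be listed before their children
        sd = (sigma2 * length[node]) ** 0.5
        value[node] = value[parent[node]] + rng.gauss(0.0, sd)
    return value


# Hypothetical toy tree ((t1, t2)u, t3); the simulations in the talk use random trees.
parent = {"u": "root", "t1": "u", "t2": "u", "t3": "root"}
length = {"u": 1.0, "t1": 1.0, "t2": 1.0, "t3": 2.0}
order = ["u", "t1", "t2", "t3"]

rng = random.Random(0)           # fixed seed so the run is repeatable
sigma2_values = [0.5, 1.0, 5.0]  # only the values shown explicitly on the slide
runs = {s2: simulate_bm(parent, length, order, 0.0, s2, rng) for s2 in sigma2_values}
```

Because the simulation records the values at internal nodes as well as at the leaves, a reconstruction method can be scored against the known ancestral values.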
81. Ovarian cancer data
- Hierarchical clustering of 325 ovarian cancer samples.
- Reconstructed survival time; mapped mutations to ancestral nodes by parsimony.
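The slide does not say which parsimony procedure was used. As one possible illustration, the sketch below runs the bottom-up pass of Fitch's small-parsimony algorithm for a single binary character (mutation present or absent) on a hypothetical three-sample tree. The sample names and states are invented.

```python
def fitch(children, leaf_state, root):
    """Bottom-up Fitch pass: candidate state sets per node and the minimum number of changes."""
    sets = {}
    changes = 0

    def visit(node):
        nonlocal changes
        if node not in children:  # leaf: its observed state is the only candidate
            sets[node] = {leaf_state[node]}
            return
        left, right = children[node]
        visit(left)
        visit(right)
        common = sets[left] & sets[right]
        if common:
            sets[node] = common
        else:  # children disagree: keep both options and count one change
            sets[node] = sets[left] | sets[right]
            changes += 1

    visit(root)
    return sets, changes


# Hypothetical clustering tree ((s1, s2)u, s3); 1 means the sample carries the mutation.
children = {"root": ("u", "s3"), "u": ("s1", "s2")}
mutated = {"s1": 1, "s2": 1, "s3": 0}
sets, changes = fitch(children, mutated, "root")
print(sets["u"], changes)
```

Here the internal node `u` is assigned the mutation because both of its descendants carry it, and a single change explains the data.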
84. Good points
The good:
- simple approach comparable in performance to more complex methods
- ancestral reconstruction without assuming a particular model of evolution
88. Room for improvement
- choice of real data was a bit odd (not ancestral reconstruction)
- the restriction to balanced trees is very limiting:
  "The estimation of $\beta$ might thus be biased if the depth of single leaf nodes is large compared with the rest of the tree. We therefore recommend RidgeRace for approximately balanced trees."
Bush, Robin M., et al. "Effects of passage history and sampling bias on phylogenetic reconstruction of human influenza A evolution." PNAS 97.13 (2000): 6974-6980.
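91. Brownian motion
At each time step $\Delta t$, the movement is drawn from a normal distribution with mean 0 and variance proportional to the step length ($\sigma^2 \Delta t$), then let $\Delta t \to 0$.
[Figure: average body mass against time (time axis 10 to 50), with example values 15 kg and 48 kg.]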
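The limiting construction can be approximated numerically. The sketch below, with hypothetical numbers (a start value of 15, σ² = 2, 50 time units, 100 steps per path), draws each step from a normal distribution with variance σ²Δt and checks that the spread of the end points matches σ² times the elapsed time.

```python
import random


def brownian_path(x0, sigma2, total_time, n_steps, rng):
    """Discrete approximation: each step adds N(0, sigma2 * dt), dt = total_time / n_steps."""
    dt = total_time / n_steps
    sd = (sigma2 * dt) ** 0.5
    path = [x0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, sd))
    return path


rng = random.Random(1)  # fixed seed so the run is repeatable
example = brownian_path(15.0, 2.0, 50.0, 100, rng)

# End points of many independent paths (all numbers here are illustrative).
ends = [brownian_path(15.0, 2.0, 50.0, 100, rng)[-1] for _ in range(2000)]
mean_end = sum(ends) / len(ends)
var_end = sum((e - mean_end) ** 2 for e in ends) / (len(ends) - 1)
# var_end should be near sigma2 * total_time = 100, and mean_end near the start value.
```

Scaling the step variance with Δt is what keeps the end-point variance the same however finely the time axis is divided.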