An Introduction to RevBayes and Graphical Models: an introduction to the Bayesian phylogenetics software RevBayes, presented by Tracy A. Heath and Michael J. Landis. Tutorials for RevBayes can be found at http://revbayes.github.io/
Edited July 20, 2016
1. Introduction to Phylogenetics in RevBayes
Tracy A. Heath
Iowa State University
@trayc7
Michael J. Landis
Yale University
@landismj
2016 Workshop on Molecular Evolution
Woods Hole, MA
2. Overview
Overview – Heath
Introduction to RevBayes
• Motivation
• Probabilistic graphical models
• The Rev language and demo
short break
Demo & Tutorial – Landis
Phylogenetic reconstruction in RevBayes
• Demo: tree reconstruction using MCMC under JC
• Tutorial (on your own): specify the HKY model, sample using MCMC, summarize the tree
beer(s)
5. Modular Bayesian Phylogenetics Software
Several software packages in phylogenetics are moving toward a more modular framework:
• reuse code
• easier to extend existing models and implement new models through a rich, language-based interface
• provides a unified framework for analyses under complex models
RevBayes
Bali-Phy
BEAST2
6. RevBayes
Höhna et al. 2016. RevBayes: Bayesian phylogenetic inference using graphical models and an interactive model-specification language. Systematic Biology. (doi: 10.1093/sysbio/syw021)
http://revbayes.com
Development team: Höhna, Lartillot, Huelsenbeck, Ronquist, Landis, Heath, Boussau, & others...
(Höhna et al. 2016. Systematic Biology, 65:726-736.)
7. Graphical Models in RevBayes
Graphical models provide tools for visually & computationally representing complex, parameter-rich probabilistic models.
We can depict the conditional dependence structure of various parameters and other random variables.
Höhna, Heath, Boussau, Landis, Ronquist, Huelsenbeck. 2014. Probabilistic Graphical Model Representation in Phylogenetics. Systematic Biology. (doi: 10.1093/sysbio/syu039)
8. Modeling a Continuous Variable
What is the distribution of heights in a population of penguins?
(sampled heights: 4.098, 2.867, 3.756, 1.693, 3.251, 2.516, 3.998, 2.606, 2.744, 3.463, 4.20, 3.058, 4.559, 3.55, 2.852)
(silhouette from http://phylopic.org/)
9. Modeling a Continuous Variable
To estimate the distribution of a variable (like the heights of all individuals within a population of penguins), we need a prior model of that parameter.
For a parameter like height, we want a distribution with properties that match our biological knowledge.
The Lognormal distribution is an asymmetric probability distribution on positive values.
(silhouette from http://phylopic.org/)
10. The Lognormal Distribution
A variable that is lognormally distributed matches a normal distribution when viewed on a log scale:
x ~ LN(µ, σ)
log(x) ~ Norm(µ, σ)
x must be a positive real number, and it arises as the product of a large number of independent, identically distributed variables.
(silhouette from http://phylopic.org/)
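This defining property, that the log of a lognormal variable is normally distributed, can be checked by simulation. A minimal Python sketch (Python is used here purely for illustration, since the deck's own language is Rev):

```python
import math
import random
import statistics

random.seed(1)
mu, sigma = 0.0, 0.5

# If log(x) ~ Normal(mu, sigma), then x = exp(z) for z ~ Normal(mu, sigma).
xs = [math.exp(random.gauss(mu, sigma)) for _ in range(100_000)]

# Every draw is positive, and the log-transformed draws recover mu and sigma.
assert all(x > 0 for x in xs)
logs = [math.log(x) for x in xs]
assert abs(statistics.mean(logs) - mu) < 0.01
assert abs(statistics.stdev(logs) - sigma) < 0.01
```

The choice of µ = 0 and σ = 0.5 is arbitrary; any values give the same correspondence.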
11. The Lognormal Distribution
(figure: lognormal density curves for µ = 0 and σ² = 0.2, 0.5, 1.0; mean = exp(µ + σ²/2), giving means of 1.11, 1.29, and 1.65)
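The mean of a lognormal distribution is exp(µ + σ²/2), so increasing the variance shifts the mean upward even with µ held at 0. A quick numerical check in Python (illustrative only; the deck's language is Rev):

```python
import math
import random
import statistics

random.seed(42)

def lognormal_mean(mu, sigma_sq):
    # Analytic mean of a lognormal distribution with log-scale variance sigma_sq.
    return math.exp(mu + sigma_sq / 2.0)

# Monte Carlo estimate vs. the analytic value, for mu = 0 and several variances.
for sigma_sq in (0.2, 0.5, 1.0):
    sd = math.sqrt(sigma_sq)
    draws = [math.exp(random.gauss(0.0, sd)) for _ in range(200_000)]
    assert abs(statistics.mean(draws) - lognormal_mean(0.0, sigma_sq)) < 0.05
```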
21. Graphical Models in RevBayes
Defining the model in the Rev language
(graphical model nodes: x_i, µ)
observations = [<your data go here>]

22. Graphical Models in RevBayes
Defining the model in the Rev language
(graphical model nodes: x_i, µ, M, α, β)
observations = [<your data go here>]
alpha <- 3.0
beta <- 1.0

23. Graphical Models in RevBayes
Defining the model in the Rev language
(graphical model nodes: x_i, µ, M, α, β)
observations = [<your data go here>]
alpha <- 3.0
beta <- 1.0
M ~ dnGamma(alpha, beta)

24. Graphical Models in RevBayes
Defining the model in the Rev language
(graphical model nodes: x_i, µ, σ, M, λ, α, β)
observations = [<your data go here>]
alpha <- 3.0
beta <- 1.0
M ~ dnGamma(alpha, beta)
lambda <- 1.0

25. Graphical Models in RevBayes
Defining the model in the Rev language
(graphical model nodes: x_i, µ, σ, M, λ, α, β)
observations = [<your data go here>]
alpha <- 3.0
beta <- 1.0
M ~ dnGamma(alpha, beta)
lambda <- 1.0
sigma ~ dnExponential(lambda)

26. Graphical Models in RevBayes
Defining the model in the Rev language
(graphical model nodes: x_i, µ, σ, M, λ, α, β)
observations = [<your data go here>]
alpha <- 3.0
beta <- 1.0
M ~ dnGamma(alpha, beta)
lambda <- 1.0
sigma ~ dnExponential(lambda)
mu := ln(M) - (power(sigma, 2.0) / 2.0)

27. Graphical Models in RevBayes
Defining the model in the Rev language
(graphical model nodes: x_i, µ, σ, M, λ, α, β; plate over i ∈ N)
observations = [<your data go here>]
alpha <- 3.0
beta <- 1.0
M ~ dnGamma(alpha, beta)
lambda <- 1.0
sigma ~ dnExponential(lambda)
mu := ln(M) - (power(sigma, 2.0) / 2.0)
N <- observations.size()
for( i in 1:N ){
    x[i] ~ dnLnorm(mu, sigma)
}

28. Graphical Models in RevBayes
Defining the model in the Rev language
(graphical model nodes: x_i, µ, σ, M, λ, α, β; plate over i ∈ N)
observations = [<your data go here>]
alpha <- 3.0
beta <- 1.0
M ~ dnGamma(alpha, beta)
lambda <- 1.0
sigma ~ dnExponential(lambda)
mu := ln(M) - (power(sigma, 2.0) / 2.0)
N <- observations.size()
for( i in 1:N ){
    x[i] ~ dnLnorm(mu, sigma)
    x[i].clamp(observations[i])
}
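The point of the deterministic node mu := ln(M) - (power(sigma, 2.0) / 2.0) is that it makes M the expected value of each lognormal observation: E[x] = exp(µ + σ²/2) = exp(ln M - σ²/2 + σ²/2) = M. A Python sketch of the same logic, using fixed values standing in for draws of M and σ (illustrative; the model itself is specified in Rev):

```python
import math
import random
import statistics

random.seed(7)

# Fixed values standing in for draws from dnGamma(3, 1) and dnExponential(1).
M = 2.5
sigma = 0.8

# The deterministic node: mu := ln(M) - sigma^2 / 2.
mu = math.log(M) - sigma ** 2 / 2.0

# Simulate x[i] ~ dnLnorm(mu, sigma) and check that E[x] = M.
xs = [math.exp(random.gauss(mu, sigma)) for _ in range(200_000)]
assert abs(statistics.mean(xs) - M) < 0.05
```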
29. RevBayes Demo: A Simple Model
Use MCMC to approximate the posterior distributions of stochastic and deterministic variables.
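As a bare-bones picture of what RevBayes automates here, the following Python sketch runs a random-walk Metropolis sampler for µ, conditioning on a few of the height values from the earlier slides. The fixed log-scale σ = 0.3, the vague Normal(0, 10) prior, and the proposal width are assumptions of this sketch, not part of the deck; the symmetric proposal plays the role of a Rev move such as mvSlide:

```python
import math
import random

random.seed(3)

# A few observed heights from the slides; we model log(height) ~ Normal(mu, sigma).
observations = [3.756, 2.867, 3.251, 2.744]
sigma = 0.3                      # assumed fixed, for illustration
y = [math.log(v) for v in observations]

def log_posterior(mu):
    # Normal log-likelihood (up to a constant) plus a vague Normal(0, 10) prior on mu.
    loglike = sum(-0.5 * ((yi - mu) / sigma) ** 2 for yi in y)
    logprior = -0.5 * (mu / 10.0) ** 2
    return loglike + logprior

mu_cur = 1.0
samples = []
for _ in range(20_000):
    mu_prop = mu_cur + random.gauss(0.0, 0.2)   # symmetric random-walk proposal
    if math.log(random.random()) < log_posterior(mu_prop) - log_posterior(mu_cur):
        mu_cur = mu_prop                         # Metropolis accept
    samples.append(mu_cur)

burnin = 5_000
post_mean = sum(samples[burnin:]) / len(samples[burnin:])
# With a vague prior, the posterior mean of mu is close to the mean of the log-heights.
assert abs(post_mean - sum(y) / len(y)) < 0.05
```

RevBayes wraps exactly this accept/reject loop behind its mcmc machinery, with many proposal types and joint updates over all model nodes.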