5. The Likelihood Principle
● All of the information in the data x, with respect to
inference about the value of θ, is contained in the
likelihood function:
L(θ | x) ∝ P(X = x | θ)
7. The Likelihood Function
L(θ | x) ∝ P(X = x | θ)
L(θ | x) = f(x | θ)
Where θ is(are) our parameter(s) of interest
ex:
Attack rate
Fitness
Mean body mass
Mortality
etc...
9. The Ecologist's Quarter
Lands tails (caribou up) 60% of the time
● 1) What is the probability that I will flip tails, given that
I am flipping an ecologist's quarter (p(tails) = 0.6)?
P(x | θ = 0.6)
● 2) What is the likelihood that I am flipping an
ecologist's quarter, given the flip(s) that I have
observed?
L(θ = 0.6 | x)
10. The Ecologist's Quarter
L(θ | x) = ∏_{t=1}^{T} θ × ∏_{h=1}^{H} (1 − θ)

L(θ = 0.6 | x = H T T H T) = ∏_{t=1}^{3} 0.6 × ∏_{h=1}^{2} 0.4
                           = 0.03456
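A quick check of this calculation in R (a minimal sketch; the flips and θ = 0.6 are taken from the example above):

# Likelihood of theta = 0.6 for the observed flips H T T H T
flips <- c("H", "T", "T", "H", "T")
theta <- 0.6                                        # P(tails) for the ecologist's quarter
lik <- prod(ifelse(flips == "T", theta, 1 - theta)) # product over the flips
lik                                                 # 0.03456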
11. The Ecologist's Quarter
L(θ | x) = ∏_{t=1}^{T} θ × ∏_{h=1}^{H} (1 − θ)

L(θ = 0.6 | x = H T T H T) = ∏_{t=1}^{3} 0.6 × ∏_{h=1}^{2} 0.4
                           = 0.03456

But what does this mean? 0.03456 ≠ P(θ | x) !!!!
12. How do we ask Statistical Questions?
A Frequentist asks: What is the probability of
having observed data at least as extreme as my
data if the null hypothesis is true?
P(data | H0)? ← note: P(data | H0) = 1 does not mean P(H0) = 1
A Bayesian asks: What is the probability of
hypotheses given that I have observed my data?
P(H | data) ? ← note: here H denotes the space of all
possible hypotheses
13. P(data | H0) vs. P(H | data)
But we both want to make
inferences about our hypotheses,
not the data.
14. Bayes Theorem
● The posterior probability of θ, given
our observation (x) is proportional to the
likelihood times the prior probability of θ.
P(θ | x) = P(x | θ) P(θ) / P(x)
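As an illustration (my own, not from the deck), Bayes' theorem applied to a discrete pair of hypotheses about the quarter, assuming a 50/50 prior and the five flips H T T H T from earlier:

# Discrete Bayes update: is the coin fair (theta = 0.5)
# or an ecologist's quarter (theta = 0.6)?
theta <- c(fair = 0.5, ecologist = 0.6)
prior <- c(fair = 0.5, ecologist = 0.5)        # 50/50 prior belief
lik   <- theta^3 * (1 - theta)^2               # likelihood of 3 tails, 2 heads
posterior <- lik * prior / sum(lik * prior)    # Bayes theorem
posterior                                      # ~0.475 fair, ~0.525 ecologist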
16. The Ecologist's Quarter
L(θ | x) = ∏_{t=1}^{T} θ × ∏_{h=1}^{H} (1 − θ)

L(θ = 0.6 | x = H T T H T) = ∏_{t=1}^{3} 0.6 × ∏_{h=1}^{2} 0.4
                           = 0.03456
17. Likelihood of data
given hypothesis
P( x | θ)
But we want to know
P(θ | x )
18. ● How can we make inferences about our
ecologist's quarter using Bayes?
P(θ | x) = P(x | θ) P(θ) / P(x)
19. ● How can we make inferences about our
ecologist's quarter using Bayes?
Likelihood: P(x | θ)
P(θ | x) = P(x | θ) P(θ) / P(x)
20. ● How can we make inferences about our
ecologist's quarter using Bayes?
Likelihood: P(x | θ)    Prior: P(θ)
P(θ | x) = P(x | θ) P(θ) / P(x)
21. ● How can we make inferences about our
ecologist's quarter using Bayes?
Likelihood: P(x | θ)    Prior: P(θ)    Posterior: P(θ | x)
P(θ | x) = P(x | θ) P(θ) / P(x)
22. ● How can we make inferences about our
ecologist's quarter using Bayes?
Likelihood: P(x | θ)    Prior: P(θ)    Posterior: P(θ | x)
P(θ | x) = P(x | θ) P(θ) / P(x)
P(x) = ∫ P(x | θ) P(θ) dθ
Not always a closed form solution possible!!
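When no closed form exists, the normalizing constant can at least be approximated numerically in simple cases. A minimal sketch (not from the deck), assuming a flat prior on θ and the flips H T T H T:

# Approximating P(x) on a grid for the quarter example
theta <- seq(0.001, 0.999, by = 0.001)
prior <- rep(1, length(theta))            # flat prior density on (0, 1)
lik   <- theta^3 * (1 - theta)^2          # likelihood of 3 tails, 2 heads
p_x   <- sum(lik * prior) * 0.001         # P(x) = integral of likelihood * prior
posterior <- lik * prior / p_x            # normalized posterior density
plot(theta, posterior, type = "l", xlab = "theta", ylab = "posterior density")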
24. Randomization to Solve Difficult
Problems
Feynman, Ulam & Von Neumann
∫ f(θ) dθ
25. Monte Carlo
Throw darts at random (Feynman, Ulam & Von Neumann)
[Figure: darts thrown uniformly over the unit square; the blue region covers half of it, and 7 of the 15 darts land in blue]
P(blue) = ?
P(blue) = 1/2 (by area)
P(blue) ≈ 7/15 ≈ 1/2 (by Monte Carlo)
26. Your turn...
Let's use Monte Carlo to estimate π
- Generate random x and y values using the number sheet
- Plot those points on your graph
How many of the points fall
within the circle?
[Figure: grid for plotting points from the number sheet, e.g. x = 4, y = 17]
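A minimal R version of this exercise (assuming points drawn uniformly on the unit square and a quarter circle of radius 1):

# Monte Carlo estimate of pi
set.seed(42)
n <- 10000
x <- runif(n); y <- runif(n)       # random darts in the unit square
inside <- x^2 + y^2 <= 1           # does the dart fall within the quarter circle?
4 * mean(inside)                   # fraction inside * 4 approximates pi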
29. Posterior Integration via Markov
Chain Monte Carlo
A Markov Chain is a mathematical construct
where given the present, the past and the
future are independent.
“Where I decide to go next depends not
on where I have been, or where I may
go in the future – but only on where I
am right now.”
-Andrey Markov (maybe)
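A tiny illustration of the Markov property (my own, not from the deck): a random walk whose next position depends only on the current one:

# Simulate a simple Markov chain: a Gaussian random walk
set.seed(1)
n <- 1000
x <- numeric(n)
for (i in 2:n) x[i] <- x[i - 1] + rnorm(1)   # next = current + random step
plot(x, type = "l", xlab = "step", ylab = "position")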
32. Metropolis-Hastings Algorithm
The Markovian Explorer!
1. Pick a starting location at random.
2. Choose a new location in your vicinity.
3. Go to the new location with probability:
   p = min(1, P(x_proposal) / P(x_current))
4. Otherwise stay where you are.
5. Repeat.
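A minimal R sketch of this algorithm applied to the ecologist's quarter posterior (my own illustration: it assumes the five flips H T T H T, a flat prior on θ, and a normal random-walk proposal):

# Metropolis-Hastings for P(theta | x) with a flat prior
set.seed(1)
n_tails <- 3; n_heads <- 2

log_post <- function(theta) {
  if (theta <= 0 || theta >= 1) return(-Inf)        # outside the parameter space
  n_tails * log(theta) + n_heads * log(1 - theta)   # log-likelihood (flat prior adds a constant)
}

n_iter <- 10000
chain  <- numeric(n_iter)
theta  <- runif(1)                                  # 1. random starting location
for (i in 1:n_iter) {
  proposal <- theta + rnorm(1, 0, 0.1)              # 2. new location in the vicinity
  if (log(runif(1)) < log_post(proposal) - log_post(theta))
    theta <- proposal                               # 3. accept with probability min(1, ratio)
  chain[i] <- theta                                 # 4. otherwise stay where we are
}
hist(chain[-(1:1000)], main = "Posterior of theta", xlab = "theta")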
34. ● We've solved our integration problem!
P(θ | x) = P(x | θ) P(θ) / P(x)

P(θ | x) ∝ P(x | θ) P(θ)
35. Ex: Bayesian Regression
● Regression coefficients are traditionally
estimated via maximum likelihood.
● To obtain full posterior distributions, we can
view the regression problem from a Bayesian
perspective.
37. Example: Salmon Regression
Model:
Y = a + b X + ε
ε ~ Normal(0, σ)

Priors:
a ~ Normal(0, 100)
b ~ Normal(0, 100)
σ ~ Gamma(1, 1/100)

P(a, b, σ | X, Y) ∝ P(X, Y | a, b, σ) P(a) P(b) P(σ)
38. Example: Salmon Regression
Likelihood of the data (x, y), given the parameters (a, b, σ):

P(X, Y | a, b, σ) = ∏_{i=1}^{n} N(y_i; μ = a + b x_i, sd = σ)
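In R, this likelihood and the (vague-prior) posterior could be sketched as follows; the vectors x (body length) and y (egg mass) and the rate parameterization of the Gamma prior are assumptions on my part:

# Log-likelihood of the regression data given (a, b, sigma)
log_lik <- function(pars, x, y) {
  a <- pars[1]; b <- pars[2]; sigma <- pars[3]
  if (sigma <= 0) return(-Inf)
  sum(dnorm(y, mean = a + b * x, sd = sigma, log = TRUE))
}

# Log-posterior = log-likelihood + log-priors from the slide above
log_post <- function(pars, x, y) {
  log_lik(pars, x, y) +
    dnorm(pars[1], mean = 0, sd = 100, log = TRUE) +       # a ~ Normal(0, 100)
    dnorm(pars[2], mean = 0, sd = 100, log = TRUE) +       # b ~ Normal(0, 100)
    dgamma(pars[3], shape = 1, rate = 1/100, log = TRUE)   # sigma ~ Gamma(1, 1/100)
}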
42. ##@ 2.5 @##
> ## Print the Bayesian Credible Intervals
> BCI(mcmc_salmon)
           0.025      0.975  post_mean
a      -13.16485   14.84092  0.9762583
b        0.127730   0.455046 0.2911597
Sigma    1.736082   3.186122 2.3303188

Model: EM = a + b·BL
Inference: Does body length have an effect on egg mass?
43. The Prior revisited
● What if we do have prior information?
● You have done a literature search and find that a
previous study on the same salmon population
found a slope of 0.6mg/cm (SE=0.1), and an
intercept of -3.1mg (SE=1.2).
How does this prior information change your
analysis?
45. Example: Salmon Regression
Model:
EM = a + b·BL + ε
ε ~ Normal(0, σ)

Informative priors:
a ~ Normal(−3.1, 1.2)
b ~ Normal(0.6, 0.1)
σ ~ Gamma(1, 1/100)
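Only the prior terms change; in the hypothetical log_post sketch shown earlier they would become:

# Informative priors from the earlier study
log_prior <- function(pars) {
  dnorm(pars[1], mean = -3.1, sd = 1.2, log = TRUE) +     # a ~ Normal(-3.1, 1.2)
  dnorm(pars[2], mean =  0.6, sd = 0.1, log = TRUE) +     # b ~ Normal(0.6, 0.1)
  dgamma(pars[3], shape = 1, rate = 1/100, log = TRUE)    # sigma ~ Gamma(1, 1/100)
}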
46. If you can formulate the likelihood function, you
can estimate the posterior, and we have a
coherent way to incorporate prior information.
Most experiments do not happen in a vacuum.
47. Making predictions using point estimates can
be a dangerous endeavor. Using the posterior
predictive distribution allows us to take
full account of uncertainty.
How sure are we about our predictions?
49. ##@ 3.1 @##
● Suppose you have a 90cm long individual
salmon, what do you predict to be the egg
mass produced by this individual?
● What is the posterior probability that the egg
mass produced will be greater than 35mg?
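A minimal sketch of how this could be answered from MCMC output; the data frame post, with columns a, b and sigma (one row per posterior draw), is an assumed, hypothetical structure:

# Posterior predictive distribution of egg mass for a 90 cm salmon
new_BL  <- 90
pred_EM <- post$a + post$b * new_BL +
           rnorm(nrow(post), mean = 0, sd = post$sigma)   # add residual noise per draw

mean(pred_EM)        # point prediction (posterior predictive mean)
mean(pred_EM > 35)   # posterior probability that egg mass exceeds 35 mg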
53. Extensions:
● By quantifying our uncertainty through
integration of the posterior distribution, we can
make better informed decisions.
● Bayesian analysis provides the basis for
decision theory.
● Bayesian analysis allows us to construct
hierarchical models of arbitrary complexity.
54. Summary
● The output of a Bayesian analysis is not a single estimate of
θ, but rather the entire posterior distribution, which
represents our degree of belief about the value of θ.
● To get a posterior distribution, we need to specify our prior
belief about θ.
● Complex Bayesian models can be estimated using MCMC.
● The posterior can be used to make both inference about θ,
and quantitative predictions with proper accounting of
uncertainty.
55. Questions for Corey
● You can email me!
Corey.chivers@mail.mcgill.ca
● I blog about statistics:
bayesianbiologist.com
● I tweet about statistics:
@cjbayesian
56. Resources
● Bayesian Updating using Gibbs Sampling
http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/
● Just Another Gibbs Sampler
http://www-ice.iarc.fr/~martyn/software/jags/
● Chi-squared example, done Bayesian:
http://madere.biol.mcgill.ca/cchivers/biol373/chi-
squared_done_bayesian.pdf