The Misspecified Bayesian
Stephen Walker
Stephen Walker The Misspecified Bayesian
The talk is about a way a Bayesian can proceed without having to model big data with a big model.
This is only possible if the aim is to extract something specific from the data; for example, the value of
$I = \int g(x)\, f_0(x)\, dx$.
Choose a (misspecified) model $f(x \mid \theta)$ and target the $\theta_0$ for which
$\int g(x)\, f(x \mid \theta_0)\, dx = I$.
Prior beliefs represented by $\pi(\theta)$.
The Bayes update may be seen as problematic.
No guarantee learning is about $\theta_0$.
Consider the (misspecified) Bayesian model $\{f(x \mid \theta), \pi(\theta)\}$.
No connection between $x$ and any $\theta$ via the probability density.
Instead connect through the loss function $-\log f(x \mid \theta)$; the target is the $\theta^*$ which minimizes
$-\int \log f(x \mid \theta)\, dF_0(x)$.
Update $\pi$ via a decision problem.
Select $\nu$ to represent updated beliefs; i.e. minimize
$L(\nu; x, \pi) = l_1(\nu, x) + l_2(\nu, \pi)$.
The obvious loss functions for $\nu$ to represent revised beliefs about $\theta^*$ are given by
$l_1(\nu, x) = -\int \log f(x \mid \theta)\, \nu(d\theta)$ and $l_2(\nu, \pi) = D(\nu, \pi)$.
The solution to this is the Bayes update, $\nu(\theta) \propto f(x \mid \theta)\, \pi(\theta)$.
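
One way to see why the minimizer is the Bayes update, sketched under the assumption (not stated on the slide) that $D$ is the Kullback-Leibler divergence, $D(\nu, \pi) = \int \log\{\nu(\theta)/\pi(\theta)\}\, \nu(d\theta)$. The symbols $\nu^*$ and $m(x)$ are introduced here for illustration only.

```latex
\begin{align*}
L(\nu; x, \pi)
  &= -\int \log f(x \mid \theta)\, \nu(d\theta)
     + \int \log \frac{\nu(\theta)}{\pi(\theta)}\, \nu(d\theta) \\
  &= \int \log \frac{\nu(\theta)}{f(x \mid \theta)\, \pi(\theta)}\, \nu(d\theta) \\
  &= \int \log \frac{\nu(\theta)}{\nu^*(\theta)}\, \nu(d\theta) - \log m(x),
\end{align*}
where
\[
\nu^*(\theta) = \frac{f(x \mid \theta)\, \pi(\theta)}{m(x)},
\qquad
m(x) = \int f(x \mid \theta)\, \pi(\theta)\, d\theta .
\]
```

The first term is $D(\nu, \nu^*) \ge 0$, with equality only when $\nu = \nu^*$, and $\log m(x)$ does not depend on $\nu$. Under this assumption the minimum is therefore attained at $\nu^*(\theta) \propto f(x \mid \theta)\, \pi(\theta)$.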
The model is to do with learning about $\theta^*$, and the argument is essentially asymptotic.
That it is all about $\theta^*$ follows from the fact that the posterior $\pi_n(\theta)$ accumulates at $\theta^*$.
Hence, learning is about $\theta^*$.
What is being learnt about does not change with the sample size.
Hence, the prior is also targeting $\theta^*$.
Need a model $f(x \mid \theta)$ for which the targeted value $\theta_0$ and $\theta^*$ coincide.
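
A small illustration of when the two targets agree and when they do not, assuming a unit-variance normal location model chosen here for convenience (it is not taken from the talk):

```latex
\[
f(x \mid \theta) = \mathrm{N}(x \mid \theta, 1)
\quad\Longrightarrow\quad
\theta^* = \arg\min_\theta \tfrac{1}{2} \int (x - \theta)^2\, dF_0(x) = E_0(x).
\]
% Case g(x) = x: the model mean is theta_0, so the targets coincide.
\[
\int x\, f(x \mid \theta_0)\, dx = \theta_0 = E_0(x) = \theta^* .
\]
% Case g(x) = x^2: the model second moment is theta_0^2 + 1.
\[
\int x^2 f(x \mid \theta_0)\, dx = \theta_0^2 + 1 = E_0(x^2)
\quad\Longrightarrow\quad
\theta_0^2 = E_0(x)^2 + \mathrm{Var}_0(x) - 1 .
\]
```

In this sketch, $g(x) = x$ gives $\theta_0 = \theta^*$ whatever $f_0$ is, whereas $g(x) = x^2$ gives $\theta_0^2 = (\theta^*)^2$ only when $\mathrm{Var}_0(x) = 1$, so the Bayes update would generally be learning about the wrong value.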
Look at an illustration involving time series data; $(x_i)_{i=1}^n$.
Suppose interest is in learning about
$E_0(x_i x_{i+1})$,
assumed to be constant for all $i$.
Want a model $f(x, y \mid \theta)$ for which $\theta^*$, minimizing
$-\int\!\!\int \log f(x, y \mid \theta)\, f_0(x, y)\, dx\, dy$,
and the $\theta_0$ for which
$\int\!\!\int x y\, f(x, y \mid \theta_0)\, dx\, dy = E_0(x y)$,
coincide.
Such a model is provided by
$f(x, y \mid \theta) = c(x, y) \exp\{\theta x y - b(\theta)\}$.
Then
$b'(\theta_0) = b'(\theta^*) = E_0(x y)$.
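
A sketch of why both targets satisfy the same equation, using only the exponential-family form of $f(x, y \mid \theta)$ and assuming that $b$ is differentiable and that differentiation under the integral sign is permitted:

```latex
% theta^*: minimise the expected negative log density under f_0.
\begin{align*}
-\int\!\!\int \log f(x, y \mid \theta)\, f_0(x, y)\, dx\, dy
  &= -\theta\, E_0(x y) + b(\theta) - E_0\{\log c(x, y)\}, \\
\frac{d}{d\theta}\Big[-\theta\, E_0(x y) + b(\theta)\Big] = 0
  &\iff b'(\theta^*) = E_0(x y).
\end{align*}
% theta_0: differentiate the log normalising constant,
% b(\theta) = \log \int\!\!\int c(x, y)\, e^{\theta x y}\, dx\, dy.
\[
b'(\theta) = \int\!\!\int x y\, f(x, y \mid \theta)\, dx\, dy
\quad\Longrightarrow\quad
b'(\theta_0) = E_0(x y).
\]
```

Since $b''(\theta)$ is the model variance of $x y$ and hence positive, $b'$ is strictly increasing, so the two equations share a single solution and $\theta_0 = \theta^*$.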
If we take
$c(x, y) = \exp\{-\tfrac{1}{2}(x^2 + y^2)\}$
then
$b'(\theta) = \dfrac{\theta}{1 - \theta^2}$.
Interest is in posterior distribution of $r(\theta) = b'(\theta)$.
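
A minimal sketch of how the posterior of $r(\theta)$ might be computed on a grid. Several things here are assumptions rather than part of the talk: the flat prior on $(-1, 1)$, the treatment of consecutive pairs $(x_i, x_{i+1})$ as if each contributed one factor $f(x_i, x_{i+1} \mid \theta)$, the simulated AR(1) data, and all function names. The expression $b(\theta) = \log 2\pi - \tfrac{1}{2}\log(1 - \theta^2)$ follows from the Gaussian integral for the stated $c(x, y)$ and requires $|\theta| < 1$.

```python
import math
import random


def b(theta):
    """Log normalising constant for c(x, y) = exp{-(x^2 + y^2)/2}, |theta| < 1."""
    return math.log(2.0 * math.pi) - 0.5 * math.log(1.0 - theta * theta)


def b_prime(theta):
    """Model expectation of x*y, i.e. r(theta) = theta / (1 - theta^2)."""
    return theta / (1.0 - theta * theta)


def grid_posterior(xs, grid):
    """Normalised posterior weights on `grid` under a flat prior (an assumption)."""
    m = len(xs) - 1  # number of consecutive pairs
    s = sum(xs[i] * xs[i + 1] for i in range(m))
    # Sum of log f over pairs; the c(x, y) term is dropped as it is free of theta.
    log_post = [theta * s - m * b(theta) for theta in grid]
    top = max(log_post)  # subtract the maximum before exponentiating
    w = [math.exp(lp - top) for lp in log_post]
    total = sum(w)
    return [wi / total for wi in w]


def simulate_ar1(n, phi, seed):
    """Illustrative stationary AR(1) series with unit-variance innovations."""
    rng = random.Random(seed)
    x = rng.gauss(0.0, 1.0 / math.sqrt(1.0 - phi * phi))
    xs = [x]
    for _ in range(n - 1):
        x = phi * x + rng.gauss(0.0, 1.0)
        xs.append(x)
    return xs


xs = simulate_ar1(n=5000, phi=0.5, seed=1)
grid = [-0.999 + 1.998 * k / 1999 for k in range(2000)]
weights = grid_posterior(xs, grid)

# Posterior mean of r(theta) and the empirical lag-one product moment.
post_mean_r = sum(w * b_prime(t) for w, t in zip(weights, grid))
sample_moment = sum(xs[i] * xs[i + 1] for i in range(len(xs) - 1)) / (len(xs) - 1)
print(post_mean_r, sample_moment)
```

The two printed values should be close for a long series, because the posterior mode in $\theta$ solves $b'(\theta) = $ the sample average of $x_i x_{i+1}$. The data-generating process is not of the form $f(x, y \mid \theta)$ applied to independent pairs, which is the sense in which the model is misspecified while still targeting $E_0(x_i x_{i+1})$.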

RSS talk for Bayes 250 by Stephen Walker