1. DISCUSSION
of
Bayesian Computation via empirical likelihood
Stefano Cabras, stefano.cabras@uc3m.es
Universidad Carlos III de Madrid (Spain)
Università di Cagliari (Italy)
Padova, 21-Mar-2013
8. Summary
◮ Problem:
◮ a statistical model f(y | θ);
◮ a prior π(θ) on θ;
◮ we want to obtain the posterior
π_N(θ | y) ∝ L_N(θ) π(θ).
◮ BUT
◮ IF L_N(θ) is not available:
◮ THEN resort to ABC;
◮ IF it is not even possible to simulate from f(y | θ):
◮ THEN replace L_N(θ) with L_EL(θ)
(the proposed BC_el procedure):
π(θ | y) ∝ L_EL(θ) × π(θ).
15. ... what remains of f(y | θ)?
◮ Recall that the Empirical Likelihood is defined, for an iid sample,
by means of a set of constraints:
E_{f(y|θ)}[h(Y, θ)] = 0.
◮ The relation between θ and the observations Y is model-conditioned and
expressed by h(Y, θ);
◮ Constraints are model-driven, so there is still a faint trace
of f(y | θ) in BC_el.
◮ Examples:
◮ The coalescent model example is illuminating in suggesting the
score of the pairwise likelihood;
◮ The residuals in GARCH models.
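For reference, the constraint above enters the usual definition of the empirical likelihood as follows. This is the standard textbook form, written here with the slide's symbols; the Lagrange multiplier λ is notation introduced only for this sketch.

```latex
L_{EL}(\theta) \;=\; \max_{p_1,\dots,p_n} \prod_{i=1}^{n} p_i
\quad \text{subject to} \quad
p_i \ge 0, \qquad \sum_{i=1}^{n} p_i = 1, \qquad \sum_{i=1}^{n} p_i \, h(y_i,\theta) = 0 .

% When 0 lies inside the convex hull of h(y_1,\theta),\dots,h(y_n,\theta),
% the maximising weights are
p_i(\theta) \;=\; \frac{1}{n}\,\frac{1}{1 + \lambda^{\top} h(y_i,\theta)},
\qquad \text{with } \lambda \text{ solving } \quad
\sum_{i=1}^{n} \frac{h(y_i,\theta)}{1 + \lambda^{\top} h(y_i,\theta)} = 0 .
```

The model f(y | θ) appears only through the choice of h, which is the "trace" referred to above.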
20. ... how to elicit h(·) automatically
◮ Set h(Y, θ) = Y − g(θ), where
g(θ) = E_{f(y|θ)}(Y | θ)
is the regression function of Y | θ;
◮ g(θ) should be replaced by an estimator ĝ(θ).
25. How to estimate g(θ)?
◮ Use a one-off pilot-run simulation study [1]:
1. Consider a grid (or regular lattice) of M points for θ:
θ_1, . . . , θ_M;
2. Simulate the corresponding y_1, . . . , y_M;
3. Regress y_1, . . . , y_M on θ_1, . . . , θ_M, obtaining ĝ(θ).
[1] ... similar to Fearnhead, P. and D. Prangle (JRSS-B, 2012) or Cabras,
Castellanos, Ruli (Ercim-2012, Oviedo).
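The three pilot-run steps can be sketched in a few lines. This is only an illustration under stated assumptions: the model y ∼ N(|θ|, 1), the grid range [−10, 10], the Gaussian-kernel (Nadaraya-Watson) smoother and its bandwidth, and the name `g_hat` are all choices made for this sketch, not taken from the slides, which do not say which regression method is used.

```python
import numpy as np

rng = np.random.default_rng(1)

# Step 1: a regular grid of M values of theta (range is an assumption).
M = 1000
theta_grid = np.linspace(-10.0, 10.0, M)

# Step 2: simulate one y per grid point from the assumed model N(|theta|, 1).
y_pilot = rng.normal(np.abs(theta_grid), 1.0)

# Step 3: regress y on theta. A kernel smoother is used here as one
# possible nonparametric regression; any other regressor could be swapped in.
def g_hat(theta, bandwidth=0.5):
    """Nadaraya-Watson estimate of E[Y | theta] from the pilot run."""
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    # z[i, j]: scaled distance between query theta[i] and pilot point j
    z = (theta[:, None] - theta_grid[None, :]) / bandwidth
    weights = np.exp(-0.5 * z**2)
    return (weights @ y_pilot) / weights.sum(axis=1)

print(g_hat([-5.0, 2.0]))  # both values should be close to |theta|
```

Because the pilot run depends only on the model and not on the observed data, it is done once and ĝ can be reused. A kernel smoother is biased near the kink at θ = 0, so values there should be read with care.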
26. ... example: y ∼ N(|θ|, 1)
For a pilot run of M = 1000 we have ĝ(θ) = |θ|.
[Figure "Pilot-Run s.s.": simulated y against θ, for θ from −10 to 10, with the curve g(θ).]
27. ... example: y ∼ N(|θ|, 1)
Suppose we draw a sample of size n = 100 with θ = 2:
[Figure "Histogram of y": Frequency against y.]
28. ... example: y ∼ N(|θ|, 1)
The resulting Empirical Likelihood:
[Figure: Emp. Lik. against θ, for θ from −4 to 4.]
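A curve of this kind could be computed along the following lines. This is a hedged sketch, not the discussant's code: it assumes the single constraint h(Y, θ) = Y − |θ|, returns the log empirical likelihood ratio (the log of ∏ n p_i, which differs from log L_EL only by the constant n log n), and finds the Lagrange multiplier by bisection. The function name `log_el_ratio`, the seed, and the grid are hypothetical.

```python
import numpy as np

def log_el_ratio(y, mu, tol=1e-10, max_iter=200):
    """Log empirical likelihood ratio under the constraint E[Y - mu] = 0."""
    h = np.asarray(y, dtype=float) - mu
    if h.min() >= 0.0 or h.max() <= 0.0:
        return -np.inf  # 0 is outside the convex hull of h: EL is zero
    # All weights are positive iff 1 + lam * h_i > 0, i.e. lam in (lo, hi).
    lo, hi = -1.0 / h.max(), -1.0 / h.min()
    for _ in range(max_iter):
        lam = 0.5 * (lo + hi)
        # sum(h / (1 + lam * h)) is strictly decreasing in lam on (lo, hi)
        if np.sum(h / (1.0 + lam * h)) > 0.0:
            lo = lam
        else:
            hi = lam
        if hi - lo < tol:
            break
    lam = 0.5 * (lo + hi)
    return -np.sum(np.log1p(lam * h))

rng = np.random.default_rng(0)
y = rng.normal(abs(2.0), 1.0, size=100)   # n = 100 draws with theta = 2

theta_grid = np.linspace(-4.0, 4.0, 161)
log_el = np.array([log_el_ratio(y, abs(t)) for t in theta_grid])
```

Since θ enters only through |θ|, the curve is symmetric around zero, with one mode near each of ± the sample mean.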
35. 1st Point: Do we necessarily have to use f(y | θ)?
◮ The above data may have been drawn from, e.g., a Half Normal;
◮ How is this reflected in BC_el?
◮ For given data y
◮ and h(Y, θ) fixed,
◮ L_EL(θ) is the same regardless of f(y | θ).
Can we ignore f(y | θ)?
41. 2nd Point: Sample free vs Simulation free
◮ The Empirical Likelihood is "simulation free" but not "sample
free", i.e.
◮ L_EL(θ) → L_N(θ) for n → ∞,
◮ implying π(θ | y) → π_N(θ | y) asymptotically in n.
◮ ABC is "sample free" but not "simulation free", i.e.
◮ π(θ | ρ(s(y), s_obs) < ε) → π_N(θ | y) as ε → 0,
◮ implying convergence in the number of simulations if s(y) were
sufficient.
A quick answer recommends using BC_el,
BUT
would a small sample recommend ABC?
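For comparison, the ABC side of this contrast can be sketched with a plain rejection sampler on the same toy model y ∼ N(|θ|, 1). Everything specific here is an assumption made for illustration: the Uniform(−10, 10) prior, the sample mean as summary s(y) (sufficient for |θ| in this model), the absolute difference as distance ρ, the tolerance ε = 0.05, and the name `abc_rejection`.

```python
import numpy as np

rng = np.random.default_rng(2)

n = 100
y_obs = rng.normal(abs(2.0), 1.0, size=n)   # observed sample, theta = 2
s_obs = y_obs.mean()                        # observed summary statistic

def abc_rejection(s_obs, n, n_sims=20000, eps=0.05):
    """Keep prior draws whose simulated summary is within eps of s_obs."""
    theta = rng.uniform(-10.0, 10.0, size=n_sims)          # prior draws
    # one simulated data set of size n per prior draw
    y_sim = rng.normal(np.abs(theta)[:, None], 1.0, size=(n_sims, n))
    s_sim = y_sim.mean(axis=1)
    keep = np.abs(s_sim - s_obs) < eps                     # rho(s, s_obs) < eps
    return theta[keep]

accepted = abc_rejection(s_obs, n)
print(accepted.size, np.abs(accepted).mean())
```

The cost is in the `n_sims` simulated data sets, whatever the value of n, whereas the empirical likelihood needs no simulation but relies on n being large.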
54. 3rd Point: How to validate a pseudo-posterior
π(θ | y) ∝ L_EL(θ) × π(θ)?
◮ The use of pseudo-likelihoods is not new in the Bayesian
setting:
◮ Empirical Likelihoods:
◮ Lazar (Biometrika, 2003): examples and coverage of C.I.;
◮ Mengersen et al. (PNAS, 2012): examples and coverage of C.I.;
◮ ...
◮ Modified-Likelihoods:
◮ Ventura et al. (JASA, 2009): second-order matching properties;
◮ Chang and Mukerjee (Stat. & Prob. Letters, 2006): examples;
◮ ...
◮ Quasi-Likelihoods:
◮ Lin (Statist. Methodol., 2006): examples;
◮ Greco et al. (JSPI, 2008): robustness properties;
◮ Ventura et al. (JSPI, 2010): examples and coverage of C.I.;
◮ ...
58. 3rd Point: How to validate a pseudo-posterior
π(θ | y) ∝ L_EL(θ) × π(θ)?
◮ Monahan & Boos (Biometrika, 1992) proposed a notion of
validity:
π(θ | y) should obey the laws of probability in a fashion that is
consistent with statements derived from Bayes' rule.
◮ Very difficult!
How can we validate the pseudo-posterior π(θ | y) when this is not
possible?
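One numerical reading of such a validity notion, offered here as an assumption rather than as the discussant's proposal, is a calibration check: draw θ from the prior, draw y given θ, and evaluate the posterior CDF at the true θ. For an exact posterior these values are Uniform(0, 1). The sketch below runs the check on a conjugate normal model (prior N(0, 1), data N(θ, 1)), where the posterior is known in closed form. A pseudo-posterior would be plugged in where the exact one is used, and in the setting of these slides the step "draw y given θ" may itself be unavailable.

```python
import math
import numpy as np

rng = np.random.default_rng(3)

def posterior_cdf_at_truth(n=20):
    """Draw (theta, y) from the joint model; return the posterior CDF at theta."""
    theta = rng.normal(0.0, 1.0)                 # theta ~ prior N(0, 1)
    y = rng.normal(theta, 1.0, size=n)           # y_i | theta ~ N(theta, 1)
    post_mean = n * y.mean() / (n + 1.0)         # conjugate normal posterior
    post_sd = math.sqrt(1.0 / (n + 1.0))
    z = (theta - post_mean) / post_sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))   # standard normal CDF

H = np.array([posterior_cdf_at_truth() for _ in range(2000)])

# Kolmogorov-Smirnov distance between the empirical CDF of H and Uniform(0, 1)
H_sorted = np.sort(H)
upper = np.arange(1, H.size + 1) / H.size
ks_stat = np.max(np.maximum(upper - H_sorted, H_sorted - (upper - 1.0 / H.size)))
print(H.mean(), ks_stat)
```

A large Kolmogorov-Smirnov distance would indicate that the candidate posterior is miscalibrated for the chosen prior.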
65. ... Last point: ABC is still a terrific tool
◮ ... a lot of references:
◮ Statistical journals;
◮ Twitter;
◮ Xi'an's blog (xianblog.wordpress.com);
◮ ... it is tailored to Approximate L_N(θ).
Where is the A in BC_el?