Discussion of ABC talk by Francesco Pauli, Padova, March 21, 2013

This is the discussion, by Francesco Pauli, of my ABC talk at the Padova workshop on recent advances in statistical inference.

  1. Discussion of “Approximate Bayesian Computation (ABC) as the new empirical Bayes approach” by Christian Robert
     The validation of ABC
     Francesco Pauli, DEAMS - University of Trieste
     Padova, March 21st, 2013
     F. Pauli (DEAMS - Univ. of Trieste) ABC: validation Padova, March, 21st 2013 1 / 19
  5. ABC: picture
     We actually have $\pi_{ABC}(\theta \mid y) = \pi(\theta \mid s(Y) \in U_\varepsilon(s(y_{obs})))$, or a nonparametric approximation of it. This is the easy part, and there are various answers: “$\varepsilon$ matters little if ‘small enough’”, or $\varepsilon$ can be included in the estimation.
     We can at best aim at $\pi(\theta \mid s(y))$. For this we need a good statistic $S$; good with respect to what? Consistency, coverage.
     We want $\pi(\theta \mid y)$.
     What legitimates using $\pi_{ABC}(\theta \mid y)$? The type of justification is also connected to whether ABC is a computational tool or a ‘new inference machine’.
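
To make the first object on this slide concrete, here is a minimal sketch of rejection ABC, the sampler whose output is distributed as $\pi(\theta \mid s(Y) \in U_\varepsilon(s(y_{obs})))$. Everything specific in it is an assumption made for illustration and does not come from the slides: the toy model $y_i \sim N(\theta, 1)$, the $N(0, 10^2)$ prior, the sample mean as summary statistic, and the names `abc_rejection`, `n_sims` and `eps`.

```python
import numpy as np

def abc_rejection(y_obs, n_sims=100_000, eps=0.05, seed=0):
    """Rejection ABC for the mean of a N(theta, 1) sample (toy model)."""
    rng = np.random.default_rng(seed)
    n = len(y_obs)
    s_obs = np.mean(y_obs)                      # summary statistic s(y_obs)
    theta = rng.normal(0.0, 10.0, size=n_sims)  # draws from the assumed prior
    # Simulate one data set per prior draw and reduce it to its summary.
    y_sim = rng.normal(theta[:, None], 1.0, size=(n_sims, n))
    s_sim = y_sim.mean(axis=1)
    # Keep theta whenever s(Y) falls in the eps-neighbourhood of s(y_obs).
    return theta[np.abs(s_sim - s_obs) <= eps]

# Hypothetical observed data, generated here only to have something to condition on.
rng = np.random.default_rng(1)
y_obs = rng.normal(2.0, 1.0, size=30)
sample = abc_rejection(y_obs)
print(len(sample), sample.mean(), y_obs.mean())
```

In this toy model the sample mean is sufficient, so the only approximation left is the tolerance `eps`; with an insufficient summary the same code would target $\pi(\theta \mid s(y))$ at best.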
  6. Legitimacy: consistency results I
     $\pi_{ABC}(\theta \mid s)$ is consistent for $\pi(\theta \mid s)$: it is easily seen that, as $\varepsilon \to 0$, $\pi_{ABC}(\theta \mid \text{Data})$ tends to $\pi(\theta \mid s(y))$. Biau et al. (2012) define the approximation as a k-nearest-neighbour estimator and prove consistency.
     What about $\pi(\theta \mid s)$ and $\pi(\theta \mid y)$? They are equal if $S$ is sufficient for $\theta$, and consistent if $S$ ‘tends to sufficiency’. The approach taken is to find conditions under which an insufficient $S$ still guarantees consistency.
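
The k-nearest-neighbour view mentioned above can be illustrated with a short sketch: instead of fixing $\varepsilon$, keep the $k$ prior draws whose simulated summaries are closest to the observed one. This is only meant to convey the idea, not to reproduce the estimator analysed by Biau et al.; the toy normal model, the prior, the observed value `s_obs` and the function name `abc_knn` are all assumptions.

```python
import numpy as np

def abc_knn(s_obs, theta_sim, s_sim, k):
    """Return the k prior draws whose simulated summaries are closest to s_obs."""
    dist = np.abs(s_sim - s_obs)
    nearest = np.argsort(dist)[:k]   # indices of the k smallest distances
    return theta_sim[nearest]

rng = np.random.default_rng(0)
n, n_sims = 30, 50_000
theta_sim = rng.normal(0.0, 10.0, size=n_sims)   # assumed prior N(0, 10^2)
# Summary = mean of n observations from N(theta, 1), one data set per draw.
s_sim = rng.normal(theta_sim[:, None], 1.0, size=(n_sims, n)).mean(axis=1)
s_obs = 2.0                                      # hypothetical observed summary
sample = abc_knn(s_obs, theta_sim, s_sim, k=200)
print(sample.mean(), sample.std())
```

Here the tolerance is implicit: it is the distance to the k-th nearest simulation, so it shrinks automatically as the number of simulations grows with `k` held fixed.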
  7. Legitimacy: consistency results II
     Within the framework of noisy ABC, Dean, Singh, Jasra & Peters (2011) show consistency for $\pi(\theta \mid y)$. The proof is written assuming that the observations themselves, and not a summary statistic, are used. However, they also say that “If the mapping S() preserves the identifiability of the system, that is to say if assumption (A1) also holds for the HMMs with observations S(Y1); S(Y2), then it is trivial to see that assumptions (A2)-(A7) will also be preserved for all reasonable choices of S() and thus that Theorems 1, 2 and 3 will also hold for ABC MLE performed using the summary statistic”.
  8. Legitimacy: consistency results III
     The conclusion is: “Theorems 1 and 2 provide a theoretical justification for the ABC MLE procedure analogous to that provided for the standard MLE procedure by the classical notion of asymptotic consistency. In particular they show that an arbitrary degree of accuracy in the parameter estimate can be achieved given sufficient data and a sufficiently small ε.”
  9. Legitimacy: consistency results IV
     Fearnhead and Prangle (2012) also put forward a consistency result within the noisy ABC framework. In particular, assuming that a coverage property holds, namely “[. . .] under repeated sampling from the prior, data and summary statistics, events assigned probability q by the ABC posterior will occur with probability q”, they show that “[. . .] under the standard regularity conditions (Bernardo and Smith, 1994), the noisy ABC posterior will converge onto a point mass on the true parameter value as m → ∞.”
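
A rough sketch of the noisy ABC idea may help here. It assumes the usual description of the construction, in which the observed summary is first perturbed by noise drawn from the same kernel that defines acceptance (a uniform kernel on $[-\varepsilon, \varepsilon]$ in this sketch) and ordinary rejection ABC is then run on the perturbed value. The toy model, the prior, the value of `s_obs` and the name `noisy_abc` are illustrative assumptions, not taken from the cited papers.

```python
import numpy as np

def noisy_abc(s_obs, theta_sim, s_sim, eps, rng):
    """Rejection ABC run on a noise-perturbed observed summary."""
    # Perturb the observed summary with noise from the acceptance kernel.
    s_noisy = s_obs + rng.uniform(-eps, eps)
    # Standard acceptance step, but centred on the perturbed summary.
    keep = np.abs(s_sim - s_noisy) <= eps
    return s_noisy, theta_sim[keep]

rng = np.random.default_rng(0)
n, n_sims, eps = 30, 50_000, 0.1
theta_sim = rng.normal(0.0, 10.0, size=n_sims)   # assumed prior N(0, 10^2)
s_sim = rng.normal(theta_sim[:, None], 1.0, size=(n_sims, n)).mean(axis=1)
s_obs = 2.0                                      # hypothetical observed summary
s_noisy, sample = noisy_abc(s_obs, theta_sim, s_sim, eps, rng)
print(s_noisy, len(sample), sample.mean())
```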
  10. Legitimacy: consistency results V
      Marin et al. (2012) focus on consistency in model selection: they state conditions on the summary statistics under which model selection through Bayes factors is consistent. They also point out that “(a) different statistics should be used for estimation and for testing and (b) that they should not be mixed in a single summary statistic.”
  11. Legitimacy: consistency results VI
      What is the connection between the different results? Are they equivalent, or do they cover different situations?
      Undoubtedly, they offer legitimacy to ABC procedures. It seems to me that they go in the direction of justifying the procedure per se, not as an approximation of the standard ABC (this might be relevant to interpreting ABC as a mere computational tool or as a new type of inference).
  12. Is consistency enough?
      Consistency is a nice property, but it does not say how far from the target $\pi(\theta \mid y)$ we get in a specific instance.
      The strategy is to find a class of statistics $S$ for which ABC is consistent. This does not tell us which of the different (insufficient) statistics, or which strategy for selecting the statistics, is better.
      It is true that some of the aforementioned works state the optimality of particular strategies; for instance, Fearnhead & Prangle state that “[. . .] choosing summary statistics as the posterior means produces ABC estimators that are optimal in terms of minimizing quadratic loss”. It is also true, however, that when different procedures are compared the picture is not totally clear.
  13. Comparison of procedures
      Blum et al. (2013) compare methods of dimension reduction in ABC, that is, methods that differ in the choice of the summary statistics. The comparison is based on simulations from three different models. They put forward that “the most suitable set of summary statistics for an analysis may be dataset dependent”; in the end, no uniformly best method is found.
      This would call for “application specific” validation to complement the theory.
  14. Diagnostics based on coverage properties
      Prangle et al. (2013) propose diagnostics based on the coverage property: “For inference on a continuous scalar parameter, θ, an informal definition is that a given credible interval based on (θ|y0 ), where y0 ∼ π(y |θ0 , m0 ) for fixed m0 , should contain the true parameter, θ0 , the appropriate proportion of times.”
      The diagnostics are obtained by repeatedly constructing ABC approximations for known values of the parameters (and for known models) and checking that the coverage property holds. Technically, this becomes a problem of checking that p-values are uniformly distributed.
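
The procedure just described can be sketched as a small simulation: draw a known $\theta_0$, generate pseudo-observed data $y_0$, run ABC on $y_0$, record $p_0$ as the fraction of the ABC sample below $\theta_0$, and check whether the collected $p_0$ values look uniform. This is a hedged illustration rather than the diagnostic of Prangle et al.: the toy model $N(\theta, 1)$ with a $N(0, 1)$ prior, the k-nearest-neighbour acceptance rule, and the helper names `abc_posterior_sample`, `coverage_p_values` and `ks_uniform` are all assumptions.

```python
import numpy as np

def abc_posterior_sample(y_obs, rng, n_sims=5_000, k=100):
    """k-nearest-neighbour ABC for the mean of N(theta, 1) data, prior N(0, 1)."""
    theta = rng.normal(0.0, 1.0, size=n_sims)
    s_sim = rng.normal(theta[:, None], 1.0, size=(n_sims, len(y_obs))).mean(axis=1)
    nearest = np.argsort(np.abs(s_sim - y_obs.mean()))[:k]
    return theta[nearest]

def coverage_p_values(n_reps=200, n_obs=10, seed=0):
    """Collect p0 = G_{y0}(theta0) over repeated draws of (theta0, y0)."""
    rng = np.random.default_rng(seed)
    p = np.empty(n_reps)
    for r in range(n_reps):
        theta0 = rng.normal(0.0, 1.0)             # known parameter value
        y0 = rng.normal(theta0, 1.0, size=n_obs)  # pseudo-observed data
        sample = abc_posterior_sample(y0, rng)
        p[r] = np.mean(sample <= theta0)          # empirical cdf of the ABC sample at theta0
    return p

def ks_uniform(p):
    """Kolmogorov-Smirnov distance between the empirical cdf of p and U(0, 1)."""
    p = np.sort(p)
    n = len(p)
    upper = np.arange(1, n + 1) / n - p
    lower = p - np.arange(0, n) / n
    return max(upper.max(), lower.max())

p_values = coverage_p_values()
print(p_values.mean(), ks_uniform(p_values))
```

A small Kolmogorov-Smirnov distance is compatible with coverage; a histogram of `p_values` that piles up near 0 and 1, or near 0.5, would instead suggest an ABC posterior that is too narrow or too wide.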
  15. What kind of justification is most appropriate?
      Is using validation the most appropriate thing to do? Can we say something about how far we get from $\pi(\theta \mid y)$? Does using validation qualify the method as an approximation or as a new form of inference?
      Prangle et al. (2013) say: “Note that the above results do not prove that the posterior π(θ|y ) is the only distribution to satisfy coverage with respect to our choice of H. However, we are unaware of any other such distributions that are likely to arise in the ABC context.” This may be more coherent with seeing ABC as a new inference machine.
      Connections with Monahan and Boos (1992)?
  16. ABCel
      In ABCel the likelihood is replaced by the empirical likelihood (EL), and no simulation of the sample is involved.
      As a side note, it seems to me that this is a different framework, even if we look at the empirical likelihood as a summary statistic: is ABCel A?
      Anyway, since we replace the likelihood with a surrogate, the issue of validating the results we get is relevant.
  17. Legitimacy of EL in (A)BC I
      Lazar (2003) proposed using EL in the Bayesian paradigm. The procedure seems to lack a general justification; in particular, a simulation study is performed. The conclusion is that “Based on both the Monahan & Boos (1992) heuristic and an examination of the frequentist properties of Bayesian intervals, it appears reasonable to use empirical likelihood within the Bayesian paradigm.”
  18. Legitimacy of EL in (A)BC II
      However, “These results need to be interpreted with some care. While they indicate that it is feasible to consider a Bayesian inferential procedure based on replacing the data likelihood with empirical likelihood, the validity of the posterior inference needs to be established for each case individually. For example, as demonstrated in an unpublished Carnegie Mellon University technical report by L. A. Wasserman, empirical likelihood for the median and Jeffreys’ likelihood are related, and hence the two can be expected to exhibit similar poor behaviour.”
      This may suggest that the diagnostics proposed above for ABC can be exploited here.
  19. Legitimacy of EL in (A)BC III
      Adimari & Pauli (2010) also employed EL as a surrogate for the proper likelihood, in the context of pairwise likelihood inference. They argue that “based on general results for empirical likelihood, [. . .] such a surrogate has good asymptotic properties.” In particular, asymptotic normality, with covariance matrix given by the Godambe information matrix, is put forward as a justification. They also explored its efficacy “by comparing it with the ordinary posterior distribution on simulated datasets.”
  20. Diagnostics based on coverage properties, details I
      $g(\theta \mid y)$ and $G_y(\theta)$: respectively the density and the distribution function approximating $\pi(\theta \mid y)$;
      $B(\alpha): [0, 1] \to \mathcal{B}([0, 1])$, such that $B(\alpha)$ has Lebesgue measure $\alpha$;
      $C(y, \alpha) = G_y^{-1}[B(\alpha)]$: a credible interval according to $g$;
      $H(\theta, y)$: a distribution function for $(\theta, y)$.
      $g$ satisfies the coverage property with respect to $H(\theta_0, y_0)$ if, for all $B$ and all $\alpha \in [0, 1]$, $P(\theta_0 \in C(y_0, \alpha)) = \alpha$.
      That is, if $p_0 = G_{y_0}(\theta_0) \sim U(0, 1)$.
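
The step from the coverage statement to the uniformity of $p_0$ can be spelled out as follows. This is a sketch under the added assumption that $G_{y_0}$ is continuous and strictly increasing, so that $\theta_0 \in G_{y_0}^{-1}[B(\alpha)]$ is the same event as $G_{y_0}(\theta_0) \in B(\alpha)$; the probability is taken with respect to $H(\theta_0, y_0)$.

```latex
\begin{align*}
P\{\theta_0 \in C(y_0, \alpha)\}
  &= P\{\theta_0 \in G_{y_0}^{-1}[B(\alpha)]\} \\
  &= P\{G_{y_0}(\theta_0) \in B(\alpha)\} \\
  &= P\{p_0 \in B(\alpha)\}.
\end{align*}
```

If $p_0 \sim U(0, 1)$, the last probability equals the Lebesgue measure of $B(\alpha)$, which is $\alpha$, so coverage holds for every choice of $B$. Conversely, taking $B(\alpha) = [0, \alpha]$ gives $P(p_0 \le \alpha) = \alpha$ for all $\alpha \in [0, 1]$, which is the distribution function of $U(0, 1)$.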
  21. Diagnostics based on coverage properties, details II
      [Figure; labels: $\pi(\theta \mid y_0)$, $g(\theta \mid y_0)$, $\alpha$, $y$, $C(y_0, \alpha)$, $\theta$]
  22. Diagnostics based on coverage properties, details III
      [Figure; labels: $\alpha$, $y$, $C(y_0, \alpha)$, $\theta$]
