Bayesian Inference with Approximate Bayesian Computation

The likelihood is sometimes difficult to compute because of the complexity of the model. Approximate Bayesian computation (ABC) makes it easy to sample parameters that generate data approximating the observed data.


  1. 1. kos59125 [2011-09-24] Tokyo.R#17
  4. 4. • Twitter ID: kos59125 • R: 7
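  7. 7. Bayes' theorem:
      $$f(\theta \mid D) = \frac{P(D \mid \theta)\,\pi(\theta)}{P(D)} \propto P(D \mid \theta)\,\pi(\theta)$$
      • $\pi(\theta)$: prior distribution of $\theta$, before observing $D$
      • $P(D \mid \theta)$: likelihood of $D$ given $\theta$
      • $f(\theta \mid D)$: posterior distribution of $\theta$ given $D$
      Here $\propto$ denotes proportionality: if $f(x) = c\,g(x)$ for a constant $c$ that does not depend on $x$, we write $f(x) \propto g(x)$.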
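  8. 8. Example: three observations, 4.7, 11.9, and 13.4, are drawn from $N(\mu, 5^2)$, and the prior $\pi(\mu)$ is $N(0, 10^2)$:
      $$\pi(\mu) = \frac{1}{\sqrt{2\pi} \cdot 10} \exp\left(-\frac{\mu^2}{2 \cdot 10^2}\right)$$
      The goal is to estimate $\mu$.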
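  9. 9. Likelihood:
      $$\begin{aligned}
      P(D \mid \mu) &= \frac{1}{\sqrt{2\pi} \cdot 5} \exp\left(-\frac{(4.7 - \mu)^2}{2 \cdot 5^2}\right) \times \frac{1}{\sqrt{2\pi} \cdot 5} \exp\left(-\frac{(11.9 - \mu)^2}{2 \cdot 5^2}\right) \times \frac{1}{\sqrt{2\pi} \cdot 5} \exp\left(-\frac{(13.4 - \mu)^2}{2 \cdot 5^2}\right) \\
      &\propto \exp\left(-\frac{(4.7 - \mu)^2 + (11.9 - \mu)^2 + (13.4 - \mu)^2}{2 \cdot 5^2}\right) \\
      &= \exp\left(-\frac{3\mu^2 - 60\mu + (4.7^2 + 11.9^2 + 13.4^2)}{2 \cdot 5^2}\right) \\
      &\propto \exp\left(-\frac{3\mu^2 - 60\mu}{2 \cdot 5^2}\right)
      \end{aligned}$$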
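  10. 10. Posterior:
      $$\begin{aligned}
      f(\mu \mid D) &\propto P(D \mid \mu)\,\pi(\mu) \\
      &\propto \exp\left(-\frac{3\mu^2 - 60\mu}{2 \cdot 5^2}\right) \exp\left(-\frac{\mu^2}{2 \cdot 10^2}\right) \\
      &= \exp\left(-\frac{(\mu - 120/13)^2 - (120/13)^2}{2 \cdot 100/13}\right) \\
      &\propto \frac{1}{\sqrt{2\pi} \sqrt{100/13}} \exp\left(-\frac{(\mu - 120/13)^2}{2 \cdot 100/13}\right)
      \end{aligned}$$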
  11. 11. Example (continued): with three observations 4.7, 11.9, 13.4 from $N(\mu, 5^2)$ and the prior $N(0, 10^2)$ on $\mu$, the posterior distribution of $\mu$ is
      $$N\left(\frac{120}{13}, \frac{100}{13}\right)$$
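The closed-form result above can be checked numerically with the standard conjugate update for a normal mean with known variance. This is a minimal sketch; the variable names are illustrative and do not come from the slides.

```r
# Data and settings taken from the worked example
observed <- c(4.7, 11.9, 13.4)
sigma2 <- 5^2    # known data variance
tau2 <- 10^2     # prior variance (the prior mean is 0)
n <- length(observed)

# Conjugate normal update: precisions add, the mean is precision-weighted
post_var <- 1 / (n / sigma2 + 1 / tau2)
post_mean <- post_var * (sum(observed) / sigma2 + 0 / tau2)

# These should agree with 120/13 and 100/13 respectively
print(c(mean = post_mean, var = post_var))
```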
  12. 12. [Figure: plot with axis label µ]
  15. 15. Approximate Bayesian Computation (ABC)
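  17. 17. Density estimation with a rectangular kernel: given $n$ samples $x_1, \dots, x_n$ from a distribution with density $f(x)$, estimate $f(x)$ with bandwidth $h$ by
      $$\hat{f}(x) = \frac{1}{2nh} \sum_{j=1}^{n} I(|x - x_j| \le h)$$
      where $I(X)$ is 1 if $X$ is true and 0 otherwise. With $F$ the cumulative distribution function, this estimates
      $$f_h(x) \equiv \frac{F(x + h) - F(x - h)}{2h} = \frac{1}{2h} P(x - h < X \le x + h)$$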
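The rectangular-kernel estimator is short enough to write directly in R. The sketch below is illustrative only: the function name `rect_kde`, the standard normal test data, and the bandwidth are assumptions, not part of the slides.

```r
# Rectangular (uniform) kernel density estimate at the points in x.
# samples: draws x_1, ..., x_n; h: half-width of the window.
rect_kde <- function(x, samples, h) {
  sapply(x, function(x0) {
    sum(abs(x0 - samples) <= h) / (2 * length(samples) * h)
  })
}

set.seed(1)
samples <- rnorm(1000)   # assumed test data: standard normal draws
grid <- c(-1, 0, 1)

# Compare the estimate with the true standard normal density
print(rbind(estimate = rect_kde(grid, samples, h = 0.3),
            truth = dnorm(grid)))
```

A smaller `h` reduces the bias of the estimate but makes it noisier, which is the same trade-off the tolerance plays in ABC.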
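  18. 18. Approximating the likelihood (1): for a given $\theta$, simulate $n$ data sets $D'_1, \dots, D'_n$. With a distance $\rho$ and a tolerance $\epsilon$, $P(D \mid \theta)$ is approximated by
      $$P(D \mid \theta) \propto \frac{1}{n} \sum_{j=1}^{n} I(\rho(D, D'_j) \le \epsilon)$$
      where $I(X)$ is 1 if $X$ is true and 0 otherwise. Note the limiting case $\epsilon \to \infty$, in which every simulated data set is accepted.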
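  19. 19. Approximating the likelihood (2): for a given $\theta$, simulate $n$ data sets $D'_1, \dots, D'_n$. With a summary statistic $S$ of the data, a distance $\rho$, and a tolerance $\epsilon$, $P(D \mid \theta)$ is approximated by
      $$P(D \mid \theta) \propto \frac{1}{n} \sum_{j=1}^{n} I(\rho(S(D), S(D'_j)) \le \epsilon)$$
      where $I(X)$ is 1 if $X$ is true and 0 otherwise.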
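The simulation-based approximation can be sketched in R for the normal example from the earlier slides. The choices here are assumptions made for illustration: the summary statistic $S$ is the sample mean, $\rho$ is the absolute difference, and `approx_likelihood`, `n`, and `eps` are hypothetical names and defaults.

```r
observed <- c(4.7, 11.9, 13.4)   # data from the worked example

# Fraction of n simulated data sets whose sample mean lies within eps of
# the observed sample mean. Proportional to an approximate likelihood.
approx_likelihood <- function(theta, n = 2000, eps = 0.5) {
  simulated_means <- replicate(n, mean(rnorm(length(observed), theta, 5)))
  mean(abs(simulated_means - mean(observed)) <= eps)
}

set.seed(1)
# Values of theta near the observed mean should score highest
print(sapply(c(0, 5, 10, 15), approx_likelihood))
```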
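  21. 21. Rejection sampling (without ABC):
      1. Sample $\theta$ from the prior $\pi(\cdot)$.
      2. Accept $\theta$ with probability $P(D \mid \theta)/c$ (a).
      3. Return to step 1.
      (a) $c$ is a constant satisfying $\max_\theta P(D \mid \theta) \le c$, so that
      $$\frac{1}{c / P(D)} \cdot \frac{f(\theta \mid D)}{\pi(\theta)} = \frac{P(D \mid \theta)}{c} \le 1$$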
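  22. 22. Rejection sampling without ABC in R (observed, N, and rprior are not defined on this slide):
      # Likelihood of the data as a vectorized function of mu
      likelihood <- (function(data) {
        L <- function(m) prod(dnorm(data, m, 5))
        function(mu) sapply(mu, L)
      })(observed)
      # Maximum of the likelihood, attained at the sample mean
      ML <- likelihood(mean(observed))
      posterior <- numeric()
      # Keep drawing from the prior until N values have been accepted
      while ((n <- N - length(posterior)) > 0) {
        theta <- rprior(n)
        posterior <- c(posterior, theta[runif(n) <= likelihood(theta) / ML])
      }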
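The slide's code depends on objects defined elsewhere in the deck. The following self-contained sketch fills them in under stated assumptions: `observed` and the prior come from the worked example on slide 8, while `N = 2000` and the seed are arbitrary choices made here.

```r
set.seed(1)
observed <- c(4.7, 11.9, 13.4)          # data from the worked example
N <- 2000                               # assumed number of posterior draws
rprior <- function(n) rnorm(n, 0, 10)   # prior N(0, 10^2) from the example

likelihood <- function(mu) {
  sapply(mu, function(m) prod(dnorm(observed, m, 5)))
}
ML <- likelihood(mean(observed))        # likelihood at its maximum

posterior <- numeric()
while ((n <- N - length(posterior)) > 0) {
  theta <- rprior(n)
  posterior <- c(posterior, theta[runif(n) <= likelihood(theta) / ML])
}

# Compare with the analytic posterior N(120/13, 100/13)
print(c(mean = mean(posterior), var = var(posterior)))
```

The sample mean and variance should land close to 120/13 and 100/13, up to Monte Carlo error.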
  23. 23. [Figure: plot with axis label µ]
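  24. 24. Rejection sampling (with ABC):
      1. Sample $\theta$ from the prior $\pi(\cdot)$.
      2. Simulate data $D'$ from $\theta$.
      3. Accept $\theta$ if $\rho(S(D), S(D')) \le \epsilon$.
      4. Return to step 1.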
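  25. 25. Rejection sampling with ABC in R (observed, N, rprior, and TOLERANCE are not defined on this slide):
      # Relative difference between the observed mean and the mean of data simulated from mu
      distance <- (function(data) function(mu) {
        S <- function(m) mean(rnorm(length(data), m, 5))
        abs((mean(data) - sapply(mu, S)) / mean(data))
      })(observed)
      posterior <- numeric()
      # Keep drawing from the prior until N values fall within the tolerance
      while ((n <- N - length(posterior)) > 0) {
        theta <- rprior(n)
        posterior <- c(posterior, theta[distance(theta) <= TOLERANCE])
      }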
  26. 26. [Figure: plot with axis label µ]
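  27. 27. MCMC (M-H algorithm without ABC):
      1. Propose $\theta'$ from the transition distribution $q(\theta \to \theta')$.
      2. Accept $\theta'$ with probability
         $$\min\left\{1, \frac{P(D \mid \theta')\,\pi(\theta')\,q(\theta' \to \theta)}{P(D \mid \theta)\,\pi(\theta)\,q(\theta \to \theta')}\right\}$$
      3. Return to step 1.
      The marginal likelihood $P(D)$ cancels in the posterior ratio:
      $$\frac{f(\theta' \mid D)}{f(\theta \mid D)} = \frac{P(D \mid \theta')\,\pi(\theta') / P(D)}{P(D \mid \theta)\,\pi(\theta) / P(D)} = \frac{P(D \mid \theta')\,\pi(\theta')}{P(D \mid \theta)\,\pi(\theta)}$$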
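  28. 28. M-H algorithm without ABC in R (observed, N, rprior, dprior, rtransition, and dtransition are not defined on this slide; in practice the ratio is computed on the log scale):
      likelihood <- (function(data) {
        L <- function(m) prod(dnorm(data, m, 5))
        function(mu) sapply(mu, L)
      })(observed)
      # Acceptance ratio for a move mu1 -> mu2; dtransition(from, to) is the density q(from -> to)
      ratio <- function(mu1, mu2)
        (likelihood(mu2) / likelihood(mu1)) *
        (dprior(mu2) / dprior(mu1)) *
        (dtransition(mu2, mu1) / dtransition(mu1, mu2))
      chain <- numeric(N)
      chain[1] <- rprior(1)
      for (t in seq_len(N - 1)) {
        proposal <- rtransition(chain[t])
        probability <- min(1, ratio(chain[t], proposal))
        # A rejected proposal repeats the current value in the chain
        chain[t + 1] <- if (runif(1) <= probability) proposal else chain[t]
      }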
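The slide's closing note points at computing the ratio on the log scale, which avoids underflow when the likelihood is a product of many small densities. A minimal sketch for the normal example follows; the random-walk proposal with standard deviation 1 and all function names are assumptions made here.

```r
observed <- c(4.7, 11.9, 13.4)   # data from the worked example

# Log-likelihood and log-prior for the normal example
log_likelihood <- function(mu) sum(dnorm(observed, mu, 5, log = TRUE))
log_prior <- function(mu) dnorm(mu, 0, 10, log = TRUE)
# Assumed random-walk proposal: log density of moving from `from` to `to`
log_transition <- function(from, to) dnorm(to, from, 1, log = TRUE)

# Log of the Metropolis-Hastings ratio for a move mu1 -> mu2
log_ratio <- function(mu1, mu2) {
  (log_likelihood(mu2) - log_likelihood(mu1)) +
    (log_prior(mu2) - log_prior(mu1)) +
    (log_transition(mu2, mu1) - log_transition(mu1, mu2))
}

# Accept when log(u) <= min(0, log ratio), with u uniform on (0, 1)
accept <- function(mu1, mu2) log(runif(1)) <= min(0, log_ratio(mu1, mu2))
```

Sums of log densities stay in a comfortable numeric range even when the raw product would round to zero.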
  29. 29. [Figure: plot with axis label µ]
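  30. 30. MCMC (M-H algorithm with ABC):
      1. Propose $\theta'$ from the transition distribution $q(\theta \to \theta')$.
      2. Simulate data $D'$ from $\theta'$.
      3. If $\rho(S(D), S(D')) > \epsilon$, reject $\theta'$ and return to step 1.
      4. Accept $\theta'$ with probability
         $$\alpha(\theta \to \theta') = \min\left\{1, \frac{\pi(\theta')\,q(\theta' \to \theta)}{\pi(\theta)\,q(\theta \to \theta')}\right\}$$
      5. Return to step 1.
      Detailed balance, for the case $\alpha(\theta \to \theta') = \frac{\pi(\theta')\,q(\theta' \to \theta)}{\pi(\theta)\,q(\theta \to \theta')} \ (\le 1)$:
      $$\begin{aligned}
      & f(\theta \mid \rho \le \epsilon)\, q(\theta \to \theta')\, P(\rho \le \epsilon \mid \theta')\, \alpha(\theta \to \theta') \\
      &= \frac{P(\rho \le \epsilon \mid \theta)\,\pi(\theta)}{P(\rho \le \epsilon)}\, q(\theta \to \theta')\, P(\rho \le \epsilon \mid \theta')\, \frac{\pi(\theta')\,q(\theta' \to \theta)}{\pi(\theta)\,q(\theta \to \theta')} \\
      &= \frac{P(\rho \le \epsilon \mid \theta')\,\pi(\theta')}{P(\rho \le \epsilon)}\, q(\theta' \to \theta)\, P(\rho \le \epsilon \mid \theta) \\
      &= f(\theta' \mid \rho \le \epsilon)\, q(\theta' \to \theta)\, P(\rho \le \epsilon \mid \theta)\, \alpha(\theta' \to \theta) \quad (\because \alpha(\theta' \to \theta) = 1)
      \end{aligned}$$
      The case $\alpha(\theta \to \theta') = 1$ follows by symmetry.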
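  31. 31. M-H algorithm with ABC in R (observed, N, TOLERANCE, rprior, dprior, rtransition, and dtransition are not defined on this slide; in practice the ratio is computed on the log scale):
      distance <- (function(data) function(mu) {
        S <- function(m) mean(rnorm(length(data), m, 5))
        abs((mean(data) - sapply(mu, S)) / mean(data))
      })(observed)
      # Acceptance ratio for a move mu1 -> mu2; the likelihood no longer appears
      ratio <- function(mu1, mu2)
        (dprior(mu2) / dprior(mu1)) *
        (dtransition(mu2, mu1) / dtransition(mu1, mu2))
      chain <- numeric(N)
      # Start from a prior draw that already satisfies the tolerance
      while (distance(chain[1] <- rprior(1)) > TOLERANCE) {}
      for (t in seq_len(N - 1)) {
        proposal <- rtransition(chain[t])
        accepted <- distance(proposal) <= TOLERANCE &&
          runif(1) <= min(1, ratio(chain[t], proposal))
        # A rejected proposal repeats the current value in the chain
        chain[t + 1] <- if (accepted) proposal else chain[t]
      }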
  32. 32. [Figure: plot with axis label µ]
  33. 33. ε
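  34. 34. Rejection sampling (with ABC, tolerance set by rank):
      1. Sample $\theta_1, \dots, \theta_{kN}$ ($k > 1$) from the prior $\pi(\cdot)$.
      2. For each $\theta_i$, simulate data $D'_i$.
      3. Compute $\rho(S(D), S(D'_i))$.
      4. Sort the distances in ascending order, $(1), \dots, (kN)$, and keep $\{\theta_{(1)}, \dots, \theta_{(N)}\}$.
      This corresponds to choosing $\epsilon = \rho(S(D), S(D'_{(N)}))$.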
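  35. 35. Rank-based ABC rejection sampling in R (observed, N, k, and rprior are not defined on this slide):
      distance <- (function(data) function(mu) {
        S <- function(m) mean(rnorm(length(data), m, 5))
        abs((mean(data) - sapply(mu, S)) / mean(data))
      })(observed)
      # Draw k * N candidates and keep the N with the smallest distances
      prior <- rprior(k * N)
      sortedDistance <- sort(distance(prior), index.return = TRUE)
      posterior <- prior[sortedDistance$ix[1:N]]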
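With the rank-based scheme the tolerance is never set directly, but it can be read off afterwards as the largest retained distance. The sketch below makes that explicit; `N = 1000`, `k = 20`, the seed, and the prior from the worked example are assumptions made here.

```r
set.seed(1)
observed <- c(4.7, 11.9, 13.4)          # data from the worked example
N <- 1000                               # assumed number of retained draws
k <- 20                                 # assumed oversampling factor
rprior <- function(n) rnorm(n, 0, 10)   # prior N(0, 10^2) from the example

# Relative difference between observed and simulated sample means
distance <- function(mu) {
  simulated_means <- sapply(mu, function(m) mean(rnorm(length(observed), m, 5)))
  abs((mean(observed) - simulated_means) / mean(observed))
}

prior <- rprior(k * N)
d <- distance(prior)
keep <- order(d)[1:N]      # indices of the N smallest distances
posterior <- prior[keep]
epsilon <- max(d[keep])    # tolerance implied by keeping N of k * N draws

print(c(epsilon = epsilon, mean = mean(posterior), var = var(posterior)))
```

Increasing `k` at fixed `N` shrinks the implied tolerance at the cost of more simulations.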
  36. 36. [Figure: plot with axis label µ]
  38. 38. CRAN package: abc
  41. 41. [Figure labels: mutation, present, past]
  42. 42. With $k$ lineages in a population of size $N$, the waiting time to coalescence follows $\mathrm{EXP}\left(\frac{k(k-1)}{2N}\right)$.
  43. 43. The number of mutations follows $\mathrm{POIS}(L\mu)$, where $L$ is the length and $\mu$ is the mutation rate.
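The two distributions on slides 42 and 43 can be combined into a small simulation. This is a rough sketch under several assumptions that are not confirmed by the surviving slide text: the exponential rate is read as $k(k-1)/(2N)$, $L$ is taken to be the total branch length of the genealogy, and the values of `N`, `k`, and `mu` are arbitrary.

```r
set.seed(1)
N <- 10000    # assumed population size
k <- 5        # assumed number of sampled lineages
mu <- 1e-5    # assumed mutation rate per unit of branch length

# Waiting times while k, k - 1, ..., 2 lineages remain,
# each exponential with rate j * (j - 1) / (2 * N) for j lineages
lineages <- k:2
waiting <- rexp(length(lineages), rate = lineages * (lineages - 1) / (2 * N))

# Total branch length (an assumption about what L denotes):
# each interval contributes (number of lineages) x (duration)
L <- sum(lineages * waiting)

# Number of mutations on the genealogy, Poisson with mean L * mu
mutations <- rpois(1, L * mu)
print(c(L = L, mutations = mutations))
```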
  44. 44. [Figure labels: present, divergence, past]
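  45. 45. Setting: two populations of Hana mogera.
      • Population 1 has size $N$, at most 400,000
      • Population 2 has size $rN$, at most 2 times population 1
      • The ancestral population has size $aN$, at most 5 times population 1
      • The divergence time $T$ (in units of $4N$) is at most 2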
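  46. 46. [Figure: population 1 (2N) and population 2 (2rN) at time 0, ancestral population (2aN) at time T]
      Priors: $N \sim U(0, 400000)$, $r \sim U(0, 2)$, $a \sim U(0, 5)$, $T \sim U(0, 2)$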
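Drawing candidate parameters from these uniform priors is the first step of any of the ABC samplers shown earlier. A minimal sketch in R; the data frame layout and `n_draws` are assumptions for illustration, and the simulation model for the data is not included.

```r
set.seed(1)
n_draws <- 5   # assumed number of candidate parameter sets

# One row per candidate (N, r, a, T), drawn from the priors on the slide
prior_draws <- data.frame(
  N = runif(n_draws, 0, 400000),
  r = runif(n_draws, 0, 2),
  a = runif(n_draws, 0, 5),
  T = runif(n_draws, 0, 2)
)
print(prior_draws)
```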
  47. 47. • 30 1 • $10^{-5}$ • 20 • $S$, $k$: $(S_1, k_1, S_2, k_2) = (15.4, 2.9, 8.9, 0.3)$
      $(N, r, a, T) = (80000, 0.1, 3.0, 0.1)$
  49. 49. [1] Marjoram P, Molitor J, Plagnol V, and Tavaré S (2003) Markov chain Monte Carlo without likelihoods. PNAS, 100: 15324–15328.
      [2] (2001) . , 31: 305–344.
      [3] (2005) . , 12: II ( ) pp.153–211.
      [4] Robert CP (2010) MCMC and Likelihood-free Methods. SlideShare.
  50. 50. https://bitbucket.org/kos59125/tokyo.r-17/
