
# Fundamentals of Information Geometry and an Interpretation of the EM Algorithm

Published in: Science

### Fundamentals of Information Geometry and an Interpretation of the EM Algorithm

1. EM 4 2016/07/15 (Fri.)
2. Agenda: intro, EM, em
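7. Data $D = (X, y)$, model $\hat{y} = X\beta$, squared error $E(\beta) = \|\hat{y} - y\|^2 = \|y - X\beta\|^2$.
8. Setting the gradient to zero: $\frac{\partial E(\beta)}{\partial \beta} = 2(-X^T)(y - X\beta) = 0$, so $X^T X \beta = X^T y$, $\beta = (X^T X)^{-1} X^T y$, and $\hat{y} = X\beta = X(X^T X)^{-1} X^T y$.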
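
The closed-form solution on slide 8 can be checked numerically. The following is a minimal sketch, assuming synthetic data; the names `beta_true`, `beta_hat`, and `grad` are illustrative and do not come from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 50, 3
X = rng.normal(size=(N, d))                 # synthetic design matrix (assumption)
beta_true = np.array([1.0, -2.0, 0.5])      # hypothetical ground-truth coefficients
y = X @ beta_true + 0.1 * rng.normal(size=N)

# Solve the normal equations X^T X beta = X^T y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Gradient of E(beta) = ||y - X beta||^2, evaluated at the solution.
grad = -2.0 * X.T @ (y - X @ beta_hat)

# Cross-check against NumPy's least-squares solver.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
```

The gradient vanishes at `beta_hat` up to floating-point error, and the result agrees with `np.linalg.lstsq`, which is usually the numerically safer choice in practice.
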
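9. With $X = \begin{pmatrix} x_1^T \\ \vdots \\ x_N^T \end{pmatrix}$, the prediction $\hat{y} = X(X^T X)^{-1} X^T y$ is the orthogonal projection of $y$ onto the subspace $S$ spanned by the columns of $X$.
10. [Figure: the same projection of $y$ onto $S$ as on slide 9, with the labels "Data" and "Model" added]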
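
The geometric reading of slides 9 and 10 is that $X(X^T X)^{-1} X^T$ acts as an orthogonal projector onto the column space of $X$. A minimal sketch of that property, assuming random data; the name `P` for the projector is introduced here for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 2))     # columns of X span the subspace S (synthetic)
y = rng.normal(size=6)          # a data vector outside S

P = X @ np.linalg.inv(X.T @ X) @ X.T    # X (X^T X)^{-1} X^T
y_hat = P @ y
residual = y - y_hat

idempotent = np.allclose(P @ P, P)              # projecting twice changes nothing
symmetric = np.allclose(P, P.T)                 # orthogonal projectors are symmetric
orthogonal = np.allclose(X.T @ residual, 0.0)   # residual is orthogonal to S

# No other point X b in S is closer to y than y_hat.
others = rng.normal(size=(200, 2))
closest = all(
    np.linalg.norm(y - X @ b) >= np.linalg.norm(residual) for b in others
)
```
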
11. https://staff.aist.go.jp/s.akaho/papers/infogeo-sice.pdf ⇒ Euclid
12. [Figure: two pairs of distributions, $p_1$, $q_1$ and $p_2$, $q_2$]
13. $KL[p_1\|q_1] = \int p_1(x) \log\frac{p_1(x)}{q_1(x)}\,dx = 8$ and $KL[p_2\|q_2] = \int p_2(x) \log\frac{p_2(x)}{q_2(x)}\,dx = 2$.
14. Euclid (Kullback-Leibler divergence) ⇒ Euclid
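
The values 8 and 2 on slide 13 show that two pairs of distributions can differ very differently in KL terms. The distributions actually drawn on the slide are not recoverable from the transcript; the sketch below assumes univariate Gaussians whose means are 2 apart, with standard deviation 0.5 in one pair and 1.0 in the other, which happens to reproduce those two numbers.

```python
import math

def kl_gauss(mu_p, sigma_p, mu_q, sigma_q):
    """KL[p||q] for p = N(mu_p, sigma_p^2) and q = N(mu_q, sigma_q^2)."""
    return (
        math.log(sigma_q / sigma_p)
        + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2.0 * sigma_q ** 2)
        - 0.5
    )

# Same Euclidean distance between the means, different spread (assumed values).
kl_narrow = kl_gauss(0.0, 0.5, 2.0, 0.5)
kl_wide = kl_gauss(0.0, 1.0, 2.0, 1.0)
```

Under these assumptions the narrow pair is four times further apart in KL divergence than the wide pair, although the Euclidean distance between the parameters $(\mu, \sigma)$ is the same in both cases.
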
15. ξ-coordinates: a distribution $f(x; \xi)$ corresponds one-to-one to a parameter $\xi = (\xi^1, \cdots, \xi^n)$; for example, $\xi = (\mu, \sigma^2)$.
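16. i) $M$ is a Hausdorff space. ii) There is a family $\{(U_\lambda, \varphi_\lambda)\}_{\lambda \in \Lambda}$ on $M$, indexed by $\Lambda$, that satisfies 3 conditions: 1) $M = \bigcup_{\lambda \in \Lambda} U_\lambda$; 2) for each $\lambda$, $\varphi_\lambda : U_\lambda \to \mathbb{R}^n$ is a homeomorphism onto the open set $\varphi_\lambda(U_\lambda)$; 3) whenever $U_\alpha \cap U_\beta \neq \emptyset$, the map $\varphi_\beta \circ \varphi_\alpha^{-1} : \varphi_\alpha(U_\alpha \cap U_\beta) \to \varphi_\beta(U_\alpha \cap U_\beta)$ is $C^r$.
17. When i) and ii) hold, $M$ is a $C^r$ manifold, written $(M, \{(U_\lambda, \varphi_\lambda)\}_{\lambda \in \Lambda})$.
18. The family $\{(U_\lambda, \varphi_\lambda)\}_{\lambda \in \Lambda}$ is an atlas. [Figure: $M$, $U$, $\mathbb{R}^n$]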
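19. For a curve $c(t)$ passing through the point $p = c(t)$, the tangent vector at $p$ is $v = \frac{dc(t)}{dt}$.
20. [Figure: points $p$, $p'$, and $p + d\varepsilon$, with displacements $d\varepsilon$ and $d\varepsilon'$]
21. Let $\tilde{p} = p + d\varepsilon$, with basis $\{e_i\}$ at $p$ and basis $\{\tilde{e}_i\}$ at $\tilde{p}$. Transporting $e_j$ along $d\varepsilon$ gives $\Pi_{d\varepsilon}[e_j] = \tilde{e}_j - \sum_{i,k} d\varepsilon^i \, \Gamma^k_{ij} \, \tilde{e}_k$, with coefficients $\Gamma^k_{ij}$. [Figure: $p$, $e_j$, $\xi^j$, $\tilde{p}$, $\tilde{\xi}^j$, $\tilde{e}_j$, $\Pi_{d\varepsilon}[e_j]$, $\sum_{i,k} d\varepsilon^i \Gamma^k_{ij} \tilde{e}_k$, $d\varepsilon$]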
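22. α-connection: α, Markov [1] 5, α- [3]
23. α-, 0; α-; Euclid ≒. [Figure: $p$, $e_j$, $\xi^j$, $\tilde{p}$, $\tilde{\xi}^j$, $\tilde{e}_j$, $\Pi_{d\varepsilon}[e_j]$, $\sum_{i,k} d\varepsilon^i \Gamma^k_{ij} \tilde{e}_k$, $d\varepsilon$]
24. α-connection for α = +1, α = 0, α = -1: the 1-connection is the e-connection (exponential), and the -1-connection is the m-connection (mixture).
25. α-divergence: for α = ±1 it is the Kullback-Leibler divergence ⇒ 1- (e-), -1- (m-). For $p$ and $q$ in $M$ ⇒ α-: $D^{(\alpha)}(p\|q) = \psi(\theta(p)) + \varphi(\eta(q)) - \sum_i \theta^i(p)\,\eta_i(q)$.
26. α-divergence with α = ±1 (Kullback-Leibler divergence)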
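
The divergence on slide 25 can be evaluated for a concrete family. The sketch below assumes the Bernoulli family, with natural parameter $\theta = \log\frac{\mu}{1-\mu}$, expectation parameter $\eta = \mu$, log-partition function $\psi$, and its Legendre dual $\varphi$; none of these choices appear in the transcript. With these definitions the expression evaluates to $KL[q\|p]$, so the argument order is reversed relative to the notation $D(p\|q)$. How this maps to α = +1 or α = -1 depends on a sign convention the transcript does not preserve.

```python
import math

def theta(mu):
    """Natural parameter of Bernoulli(mu)."""
    return math.log(mu / (1.0 - mu))

def psi(th):
    """Log-partition function of the Bernoulli family."""
    return math.log1p(math.exp(th))

def phi(eta):
    """Legendre dual of psi (negative entropy)."""
    return eta * math.log(eta) + (1.0 - eta) * math.log(1.0 - eta)

def canonical_divergence(mu_p, mu_q):
    """psi(theta(p)) + phi(eta(q)) - theta(p) * eta(q)."""
    return psi(theta(mu_p)) + phi(mu_q) - theta(mu_p) * mu_q

def kl_bernoulli(mu_a, mu_b):
    """KL[a||b] between Bernoulli(mu_a) and Bernoulli(mu_b)."""
    return (
        mu_a * math.log(mu_a / mu_b)
        + (1.0 - mu_a) * math.log((1.0 - mu_a) / (1.0 - mu_b))
    )

d = canonical_divergence(0.3, 0.7)   # p = Bernoulli(0.3), q = Bernoulli(0.7)
```
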
27. Agenda: intro, EM, em
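28. EM algorithm. $X$: observed variables, $Z$: latent variables, $\theta$: parameters. The joint distribution is $p(X, Z|\theta)$, and $p(X|\theta) = \int p(X, Z|\theta)\,dZ$.
29. EM algorithm ([2] p.166), using the Kullback-Leibler (KL) divergence: $\ln p(X|\theta) = \mathcal{L}[q, \theta] + KL[q(Z)\|p(Z|X, \theta)]$ with $\mathcal{L}[q, \theta] = \int q(Z) \ln\frac{p(X, Z|\theta)}{q(Z)}\,dZ$, that is, $\ln p(X|\theta) = \int q(Z) \ln\frac{p(X, Z|\theta)}{q(Z)}\,dZ + KL[q(Z)\|p(Z|X, \theta)]$.
30. In $\ln p(X|\theta) = \mathcal{L}[q, \theta] + KL[q(Z)\|p(Z|X, \theta)]$, KL ≥ 0, so $\ln p(X|\theta) \geq \mathcal{L}[q, \theta]$. The EM algorithm alternates 2 steps: [E step] maximize $\mathcal{L}$ over $q(Z)$; [M step] maximize $\mathcal{L}$ over $\theta$.
31. [E step] Maximizing $\mathcal{L}$ over $q(Z)$ ⇔ KL = 0, attained at $q(Z) = p(Z|X, \theta)$: $\max_q \mathcal{L}[q, \theta] = \mathcal{L}[p(Z|X, \theta), \theta]$. [M step] Maximize $\mathcal{L}$ over $\theta$ with the $q$ from the E step held fixed; KL ≥ 0 holds again afterwards, so $\ln p(X|\theta) = \mathcal{L}[q, \theta] + KL[q(Z)\|p(Z|X, \theta)]$ increases by at least as much as $\mathcal{L}$.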
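
The decomposition on slides 29 to 31 can be verified directly for a discrete latent variable. This is a minimal sketch, assuming a single fixed observation and three latent states; the numbers in `joint` are made up for illustration.

```python
import numpy as np

# Hypothetical values of p(X = x_obs, Z = k | theta) for k = 0, 1, 2.
joint = np.array([0.10, 0.25, 0.05])
evidence = joint.sum()            # p(X | theta)
posterior = joint / evidence      # p(Z | X, theta)

def lower_bound(q):
    """L[q, theta] = sum_Z q(Z) ln( p(X, Z | theta) / q(Z) )."""
    return float(np.sum(q * np.log(joint / q)))

def kl(q, p):
    """KL[q || p] for discrete distributions with full support."""
    return float(np.sum(q * np.log(q / p)))

q = np.array([0.5, 0.3, 0.2])     # an arbitrary q(Z)
log_evidence = float(np.log(evidence))
decomposed = lower_bound(q) + kl(q, posterior)

# E step: choosing q(Z) = p(Z | X, theta) closes the gap.
bound_after_e_step = lower_bound(posterior)
```

For any valid `q`, `decomposed` matches `log_evidence`, and the bound becomes tight once `q` is set to the posterior.
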
32. [Figure sequence, slides 32 to 39: EM algorithm, $\ln p(X|\theta)$ plotted against $\theta$ together with the lower bound $\mathcal{L}[q; \theta]$; steps labeled E and M alternate, and the M step moves to $\theta'$]
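
The alternation pictured on slides 32 to 39 can be reproduced on a small example. The sketch below assumes a two-component univariate Gaussian mixture and synthetic data, neither of which appears in the slides; the function name `em_gmm_1d` and the initialization are illustrative choices.

```python
import numpy as np

def log_gauss(x, mu, var):
    """Log density of N(mu, var) evaluated at x (broadcasts over components)."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mu) ** 2 / var)

def em_gmm_1d(x, n_iter=100):
    # Crude initialization, an arbitrary choice for this sketch.
    pi = np.array([0.5, 0.5])
    mu = np.array([x.min(), x.max()])
    var = np.array([x.var(), x.var()])
    history = []
    for _ in range(n_iter):
        # E step: q(Z) = p(Z | X, theta), the responsibilities.
        log_joint = np.log(pi) + log_gauss(x[:, None], mu, var)   # shape (N, 2)
        log_evidence = np.logaddexp(log_joint[:, 0], log_joint[:, 1])
        history.append(log_evidence.sum())    # ln p(X | theta) at the current theta
        resp = np.exp(log_joint - log_evidence[:, None])
        # M step: maximize the lower bound over theta with q held fixed.
        nk = resp.sum(axis=0)
        pi = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return pi, mu, var, np.array(history)

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(3.0, 0.5, 100)])
pi, mu, var, history = em_gmm_1d(x)
```

The recorded `history` never decreases, which is the behavior the figure sequence illustrates: each E step makes the bound tight, and each M step raises it.
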
40. Agenda: intro, EM, em
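41. em algorithm ⇒ KL. e-projection: KL; m-projection: KL. [Figure: two panels labeled e- and m-, each showing $M$, $D$, $p$, $q$]
42. em algorithm. [Figure: $M$ and $D$, with the e-projection and the m-projection between them]
43. em algorithm, e-projection between $M$ and $D$: given $p$, $q = \arg\min_{q \in D} KL[q\|p]$.
44. em algorithm, m-projection between $M$ and $D$: given $q$, $p = \arg\min_{p \in M} KL[q\|p]$.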
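45. e-projection and the E step: minimize the KL divergence over $q$. $KL[q\|p] = \int \sum_Z q(X, Z) \ln\frac{q(X, Z)}{p(X, Z|\theta)}\,dX$, with $q(X, Z) = \prod_{n=1}^{N} \delta(X - X_n)\,q(Z)$ and $\sum_Z q(Z) = 1$, for the data $\{X_n\}_{n=1}^{N}$.
46. [Slide 45 repeated, with an added annotation: "X ... 1"]
47. e-projection and the E step: $KL[q\|p] = \int \sum_Z q(X, Z) \ln\frac{q(X, Z)}{p(X, Z|\theta)}\,dX = \int \sum_Z \prod_{n=1}^{N} \delta(X - X_n)\,q(Z) \ln\frac{\prod_{n=1}^{N} \delta(X - X_n)\,q(Z)}{p(X, Z|\theta)}\,dX = \sum_Z q(Z)\{\ln q(Z) - \ln p(\{X_n\}, Z|\theta)\}$. Using a Lagrange multiplier $\lambda$ for the constraint on $q(Z)$: $q(Z) = e^{-(\lambda+1)}\,p(\{X_n\}, Z|\theta)$.
48. e-projection and the E step: $1 = \sum_Z q(Z) = e^{-(\lambda+1)} \sum_Z p(Z|\{X_n\}, \theta)\,p(\{X_n\}|\theta) = e^{-(\lambda+1)}\,p(\{X_n\}|\theta)$, hence $q(Z) = \frac{p(\{X_n\}, Z|\theta)}{p(\{X_n\}|\theta)} = p(Z|\{X_n\}, \theta)$, which is the E step of EM.
49. m-projection and the M step: $KL[q\|p] = \sum_Z q(Z)\{\ln q(Z) - \ln p(\{X_n\}, Z|\theta)\} = \sum_Z q(Z) \ln\frac{q(Z)}{p(\{X_n\}, Z|\theta)}$ is minimized over $p$. The EM lower bound is $\mathcal{L}[q, \theta] = \int q(Z) \ln\frac{p(X, Z|\theta)}{q(Z)}\,dZ$, so $KL[q\|p] = -\mathcal{L}[q, \theta]$: maximizing $\mathcal{L}$ (M step) ⇔ minimizing KL (m-projection).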
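
The Lagrange-multiplier result on slides 47 and 48, and the identity $KL[q\|p] = -\mathcal{L}[q, \theta]$ on slide 49, can both be checked on a discrete example. This is a minimal sketch, assuming four latent states and made-up values for $p(\{X_n\}, Z|\theta)$; the random search over the simplex only illustrates the minimum and does not prove it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical values of p({X_n}, Z = k | theta) for k = 0..3.
joint = np.array([0.02, 0.10, 0.05, 0.03])

def kl_to_joint(q):
    """sum_Z q(Z) { ln q(Z) - ln p({X_n}, Z | theta) }."""
    return float(np.sum(q * (np.log(q) - np.log(joint))))

def lower_bound(q):
    """L[q, theta] = sum_Z q(Z) ln( p({X_n}, Z | theta) / q(Z) )."""
    return float(np.sum(q * np.log(joint / q)))

# Lagrange solution: q(Z) proportional to the joint, i.e. the posterior.
q_star = joint / joint.sum()

# Compare against random points on the probability simplex.
candidates = rng.dirichlet(np.ones(4), size=1000)
best_random = min(kl_to_joint(q) for q in candidates)
```

At `q_star` the objective equals $-\ln p(\{X_n\}|\theta)$, no sampled candidate does better, and for every `q` the objective is the negative of the lower bound.
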
50. Euclid → EM, em: E step ↔ e-projection / M step ↔ m-projection
51. Reference: [1] [2] C. M. Bishop (2006). Pattern Recognition and Machine Learning. Springer. [3] [4] [5] EM <http://enakai00.hatenablog.com/entry/2015/05/09/145257> [6] EM <http://www.slideshare.net/ShinagawaSeitaro/em-58323841>