rinko2010


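Slides from a talk given at the mathematical informatics seminar (数理輪講) of the Graduate School of Information Science and Technology, the University of Tokyo. They explain CRF, Structured Perceptron, DPLVM (LD-CRF), and Latent Variable Perceptron, and are also intended for readers who do not specialize in machine learning.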

Published in: Technology, Education


  1. CRF [remainder of the title slide lost in extraction], 2010-12-10
  3. References
     ‣ [Lafferty+, 01] Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. John Lafferty, Andrew McCallum, Fernando Pereira. Proceedings of ICML'01, 2001.
     ‣ [Collins, 02] Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms. Michael Collins. Proceedings of EMNLP'02, 2002.
     ‣ [Morency+, 07] Latent-Dynamic Discriminative Models for Continuous Gesture Recognition. Louis-Philippe Morency, Ariadna Quattoni, Trevor Darrell. Proceedings of CVPR'07, 2007.
     ‣ [Sun+, 09] Latent Variable Perceptron Algorithm for Structured Classification. Xu Sun, Takuya Matsuzaki, Daisuke Okanohara, Jun'ichi Tsujii. Proceedings of IJCAI'09, 2009.
  4. (1) CRF (Conditional Random Field), (2) Structured Perceptron, (3) DPLVM (Discriminative Probabilistic Latent Variable Model), (4) Latent Variable Perceptron
  5. Input $x = x_1 x_2 \dots x_m$; output $y = y_1 y_2 \dots y_m$ with $y_1, \dots, y_m \in Y$
  6. Example: NP-chunking, with $Y = \{B, I, O\}$
         j     1    2    3    4        5
         x_j   He   is   her  brother  .
         y_j   B    O    B    I        O
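To make the B/I/O encoding concrete, the following is a minimal Python sketch that recovers the NP chunks from a label sequence. The function name `extract_chunks` is hypothetical, and the sketch assumes the usual reading of the tags (B begins a chunk, I continues it, O is outside any chunk), which the slide itself does not spell out.

```python
def extract_chunks(tokens, labels):
    """Group tokens into chunks using B (begin), I (inside), O (outside) labels."""
    chunks, current = [], []
    for token, label in zip(tokens, labels):
        if label == "B":                  # a new chunk starts; close any open one
            if current:
                chunks.append(current)
            current = [token]
        elif label == "I" and current:    # continue the open chunk
            current.append(token)
        else:                             # "O" (or a stray "I"): close any open chunk
            if current:
                chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks


tokens = ["He", "is", "her", "brother", "."]
labels = ["B", "O", "B", "I", "O"]
chunks = extract_chunks(tokens, labels)
print(chunks)  # [['He'], ['her', 'brother']]
```

Under this reading, the labelling on the slide marks "He" and "her brother" as the two noun phrases.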
  7. (1) CRF (Conditional Random Field), (2) Structured Perceptron, (3) DPLVM (Discriminative Probabilistic Latent Variable Model), (4) Latent Variable Perceptron
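  8. Training: the model $P(y \mid x, \Theta)$ has parameters $\Theta$; given training data $\{(x_i, y_i^*)\}_{i=1}^{d}$ of $d$ examples, $\Theta$ is chosen so that $P(y_i^* \mid x_i, \Theta)$ is large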
  9. Prediction: given $\Theta$ and an input $x$, output $\hat{y} = \operatorname{argmax}_y P(y \mid x, \Theta)$
  10. $(x, y) \to f(y, x) \cdot \Theta = F(y \mid x, \Theta)$, where $f(y, x) = (f_1(y, x), f_2(y, x), \dots, f_n(y, x))^\top$ and $\Theta = (\Theta_1, \Theta_2, \dots, \Theta_n)^\top$ are $n$-dimensional vectors
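  11. $P(y \mid x, \Theta) = \frac{1}{Z} \exp F(y \mid x, \Theta)$, with $Z = \sum_{y'} \exp F(y' \mid x, \Theta)$ and $F(y \mid x, \Theta) = f(y, x) \cdot \Theta$. Since $1/Z$ does not depend on $y$, $\operatorname{argmax}_y P(y \mid x, \Theta) = \operatorname{argmax}_y F(y \mid x, \Theta)$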
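  12. $P(y \mid x, \Theta) = \frac{1}{Z} \exp F(y \mid x, \Theta)$, with $Z = \sum_{y'} \exp F(y' \mid x, \Theta)$ and $F(y \mid x, \Theta) = f(y, x) \cdot \Theta$; $\operatorname{argmax}_y P(y \mid x, \Theta) = \operatorname{argmax}_y F(y \mid x, \Theta)$. Annotation: $O(|Y|^m)$, the number of candidate label sequences $y$ for an input of length $m$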
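The formulas on slides 11 and 12 can be checked numerically on a tiny input. The sketch below enumerates all $|Y|^m$ label sequences, computes $Z$ and $P(y \mid x, \Theta)$ directly, and confirms that the most probable sequence is also the highest-scoring one. The two features and the weights are made up for illustration and are not taken from the slides; for simplicity the toy features ignore $x$.

```python
import itertools
import math

LABELS = ["B", "I", "O"]  # Y from the NP-chunking example


def features(y, x):
    """Hypothetical feature vector f(y, x) with n = 2 hand-picked counts."""
    starts_with_b = 1.0 if y[0] == "B" else 0.0
    i_after_o = float(sum(1 for a, b in zip(y, y[1:]) if a == "O" and b == "I"))
    return [starts_with_b, i_after_o]


def score(y, x, theta):
    """F(y|x, Theta) = f(y, x) . Theta"""
    return sum(f * t for f, t in zip(features(y, x), theta))


def distribution(x, theta):
    """P(y|x, Theta) for every y, by enumerating all |Y|^m sequences."""
    candidates = list(itertools.product(LABELS, repeat=len(x)))
    scores = [score(y, x, theta) for y in candidates]
    z = sum(math.exp(s) for s in scores)  # normalizer Z
    return {y: math.exp(s) / z for y, s in zip(candidates, scores)}


x = ["He", "is", "her"]
theta = [1.5, -2.0]  # made-up weights
probs = distribution(x, theta)
best_by_p = max(probs, key=probs.get)
best_by_f = max(probs, key=lambda y: score(y, x, theta))
print(len(probs), best_by_p == best_by_f)
```

Enumeration is only feasible here because $m = 3$ gives $3^3 = 27$ sequences; the count grows exponentially with the input length.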
  13. CRF: Conditional Random Field (sequential). Features are defined on adjacent labels $y_{j-1}, y_j$: state features $s(j, x, y_j)$ and transition features $t(j, x, y_{j-1}, y_j)$
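When the score decomposes into $s(j, x, y_j)$ and $t(j, x, y_{j-1}, y_j)$ terms, the argmax can be found by dynamic programming (the Viterbi algorithm) instead of enumerating every sequence. The slide's own conclusion after "⇒" did not survive extraction, so treating Viterbi as the intended point is an assumption. The scoring functions below are toy stand-ins, not features from the talk.

```python
def viterbi(x, labels, s, t):
    """Return (best score, best label sequence) for the sum of s and t scores."""
    # best[y] = (score, path) of the best labelling of x[0..j] that ends in y
    best = {y: (s(0, x, y), [y]) for y in labels}
    for j in range(1, len(x)):
        new_best = {}
        for y in labels:
            prev = max(labels, key=lambda p: best[p][0] + t(j, x, p, y))
            total = best[prev][0] + t(j, x, prev, y) + s(j, x, y)
            new_best[y] = (total, best[prev][1] + [y])
        best = new_best
    return max(best.values(), key=lambda entry: entry[0])


LABELS = ["B", "I", "O"]
# Toy lookup used only to build made-up state scores.
GOLD = {"He": "B", "is": "O", "her": "B", "brother": "I", ".": "O"}


def s(j, x, y):
    return 1.0 if GOLD[x[j]] == y else 0.0  # made-up state score


def t(j, x, y_prev, y):
    return -5.0 if (y_prev, y) == ("O", "I") else 0.0  # made-up: penalize I after O


x = ["He", "is", "her", "brother", "."]
best_score, best_path = viterbi(x, LABELS, s, t)
print(best_score, best_path)  # 5.0 ['B', 'O', 'B', 'I', 'O']
```

This version does a number of score evaluations proportional to $m |Y|^2$, compared with the $|Y|^m$ candidates of brute-force search.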
  15. CRF training: maximize $\sum_{i=1}^{d} \log P(y_i^* \mid x_i, \Theta) - R(\Theta)$, where $R(\Theta)$ is a regularization term on $\Theta$
  16. (1) CRF (Conditional Random Field), (2) Structured Perceptron, (3) DPLVM (Discriminative Probabilistic Latent Variable Model), (4) Latent Variable Perceptron
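  17. Structured Perceptron
     ‣ Score of the correct output of a training example $(x_i, y_i^*)$: $F(y_i^* \mid x_i, \Theta) = \Theta \cdot f(y_i^*, x_i)$
     ‣ For each $(x_i, y_i^*)$, compute $y_i = \operatorname{argmax}_y F(y \mid x_i, \Theta^i)$
       if $y_i \neq y_i^*$: $\Theta^{i+1} = \Theta^i + f(y_i^*, x_i) - f(y_i, x_i)$
       if $y_i = y_i^*$: $\Theta^{i+1} = \Theta^i$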
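  18. Structured Perceptron. Start from the update $\Theta^{i+1} = \Theta^i + f(y_i^*, x_i) - f(y_i, x_i)$ and take the inner product of both sides with $f(y_i^*, x_i) - f(y_i, x_i)$:
     $\Theta^{i+1} \cdot (f(y_i^*, x_i) - f(y_i, x_i)) = \Theta^i \cdot (f(y_i^*, x_i) - f(y_i, x_i)) + \|f(y_i^*, x_i) - f(y_i, x_i)\|_2^2$
     $\Leftrightarrow F(y_i^* \mid x_i, \Theta^{i+1}) - F(y_i \mid x_i, \Theta^{i+1}) = F(y_i^* \mid x_i, \Theta^i) - F(y_i \mid x_i, \Theta^i) + \|f(y_i^*, x_i) - f(y_i, x_i)\|_2^2$,
     where $\|f(y_i^*, x_i) - f(y_i, x_i)\|_2^2 \ge 0$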
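  19. Structured Perceptron. After the update $\Theta^{i+1} = \Theta^i + f(y_i^*, x_i) - f(y_i, x_i)$, the score gap between $y_i^*$ and $y_i$ does not decrease:
     $F(y_i^* \mid x_i, \Theta^{i+1}) - F(y_i \mid x_i, \Theta^{i+1}) = F(y_i^* \mid x_i, \Theta^i) - F(y_i \mid x_i, \Theta^i) + \|f(y_i^*, x_i) - f(y_i, x_i)\|_2^2$,
     where $\|f(y_i^*, x_i) - f(y_i, x_i)\|_2^2 \ge 0$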
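The update rule on slides 17 to 19 can be sketched in a few lines of Python. Everything below is an illustrative assumption rather than the setup used in the talk: the feature map (counts of word/label and label/label pairs), the brute-force argmax, the stopping rule, and the two training sentences, which reuse examples that appear elsewhere in the slides.

```python
import itertools
from collections import Counter

LABELS = ["B", "I", "O"]


def features(y, x):
    """Hypothetical f(y, x): counts of (word, label) and (previous label, label) pairs."""
    f = Counter()
    for j, (word, label) in enumerate(zip(x, y)):
        f[("state", word, label)] += 1
        if j > 0:
            f[("trans", y[j - 1], label)] += 1
    return f


def score(y, x, theta):
    """F(y|x, Theta) = Theta . f(y, x)"""
    return sum(theta.get(k, 0) * v for k, v in features(y, x).items())


def predict(x, theta):
    # Brute-force argmax over all |Y|^m sequences; only viable for tiny inputs.
    return max(itertools.product(LABELS, repeat=len(x)), key=lambda y: score(y, x, theta))


def train(data, epochs=10):
    theta = Counter()
    for _ in range(epochs):
        mistakes = 0
        for x, y_star in data:
            y_hat = predict(x, theta)
            if y_hat != y_star:                     # update only on a mistake
                theta.update(features(y_star, x))   # + f(y*, x)
                theta.subtract(features(y_hat, x))  # - f(y_hat, x)
                mistakes += 1
        if mistakes == 0:                           # a full pass without mistakes
            break
    return theta


data = [
    (("He", "is", "her", "brother", "."), ("B", "O", "B", "I", "O")),
    (("They", "are", "her", "flowers", "."), ("B", "O", "B", "I", "O")),
]
theta = train(data)
print([predict(x, theta) == y_star for x, y_star in data])
```

In practice the brute-force `predict` would be replaced by a dynamic-programming decoder; it is kept here so that the update itself stays visible.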
  20. Structured Perceptron [bullet text lost in extraction; surviving symbols: $d$, $M$]
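  21. Separability. Let $G(x_i)$ = {all possible label sequences for an example $x_i$} and $\bar{G}(x_i) = G(x_i) - \{y_i^*\}$. The training data $\{(x_i, y_i^*)\}_{i=1}^{d}$ is separable with margin $\delta > 0$ if there exists $U$ with $\|U\|_2 = 1$ such that $\forall i, \forall z \in \bar{G}(x_i)$: $F(y_i^* \mid x_i, U) - F(z \mid x_i, U) \ge \delta$.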
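  22. Mistake bound. If $\{(x_i, y_i^*)\}_{i=1}^{d}$ is separable with margin $\delta > 0$, the number of mistakes $M$ satisfies $M \le \frac{R^2}{\delta^2}$, where $R$ is such that $\forall i, \forall z \in \bar{G}(x_i)$: $\|f(y_i^*, x_i) - f(z, x_i)\|_2 \le R$.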
  23. (1) CRF (Conditional Random Field), (2) Structured Perceptron, (3) DPLVM (Discriminative Probabilistic Latent Variable Model), (4) Latent Variable Perceptron
  24. They  are   her  flowers  .
      B     O     B    I        O
      They  gave  her  flowers  .
      B     O     B    B        O
  25. They  are   her  flowers  .
      B     O     B    I        O     [annotation: B1]
      They  gave  her  flowers  .
      B     O     B    B        O     [annotation: B2]
  26. DPLVM - Discriminative Probabilistic Latent Variable Model. $Y = \{B, I, O\}$; each label has its own set of latent variables, e.g. $H_B = \{B_1, \dots, B_{|H_B|}\}$, where $|H_B|$ is the number of latent variables for $B$
  27. DPLVM - Discriminative Probabilistic Latent Variable Model. For $y = y_1 y_2 \dots y_m$ and $h = h_1 h_2 \dots h_m$: $\forall j,\ h_j \in H_{y_j} \overset{\text{def.}}{\iff} \mathrm{Proj}(h) = y$
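The relation between $h$ and $y$ on slides 26 and 27 can be written out directly. In the sketch below a latent variable is represented as a `(label, index)` pair, and the sizes $|H_y|$ are hypothetical choices, since the slides do not fix them.

```python
import itertools

# Hypothetical sizes |H_y| of each latent set; the slides do not specify them.
LATENT_SIZES = {"B": 2, "I": 1, "O": 1}


def latent_set(label):
    """H_label = {(label, 1), ..., (label, |H_label|)}"""
    return [(label, k) for k in range(1, LATENT_SIZES[label] + 1)]


def proj(h):
    """Proj(h): drop the latent index of every h_j to recover the label sequence."""
    return tuple(label for label, _ in h)


def latent_sequences(y):
    """All h with h_j in H_{y_j} for every j, i.e. all h with Proj(h) = y."""
    return list(itertools.product(*(latent_set(label) for label in y)))


y = ("B", "O", "B", "B", "O")
hs = latent_sequences(y)
print(len(hs))  # 2 * 1 * 2 * 2 * 1 = 8
```

Each label sequence therefore corresponds to $\prod_j |H_{y_j}|$ latent sequences, and distinct label sequences have disjoint sets of latent sequences.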
  28. DPLVM. $(x, h) \to f(h, x) \cdot \Theta = F(h \mid x, \Theta)$, where $f(h, x) = (f_1(h, x), f_2(h, x), \dots, f_n(h, x))^\top$ and $\Theta = (\Theta_1, \Theta_2, \dots, \Theta_n)^\top$
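  29. DPLVM. The model is defined over $h$: $P(h \mid x, \Theta) = \frac{1}{Z} \exp F(h \mid x, \Theta)$, with $Z = \sum_{h'} \exp F(h' \mid x, \Theta)$ and $F(h \mid x, \Theta) = f(h, x) \cdot \Theta$; $\operatorname{argmax}_h P(h \mid x, \Theta) = \operatorname{argmax}_h F(h \mid x, \Theta)$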
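  30. DPLVM. A training example $(x_i, y_i^*)$ gives $y_i^*$ but not $h$, so $h$ is summed out of $P(h \mid x, \Theta)$:
     $P(y \mid x, \Theta) = \sum_{h} P(y \mid h, x, \Theta) P(h \mid x, \Theta) = \sum_{h : \mathrm{Proj}(h) = y} P(h \mid x, \Theta)$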
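The marginalization over $h$ can be illustrated by brute force on a short input. The sketch below borrows the label and latent sets of the experiment slides ($Y = \{A, B\}$, two latent variables per label); the feature templates and weights are invented for the example.

```python
import itertools
import math
from collections import defaultdict

LATENT = ["A1", "A2", "B1", "B2"]  # H_A and H_B


def proj(h):
    return tuple(hj[0] for hj in h)  # "A1" -> "A"


def score(h, x, theta):
    """F(h|x, Theta) with hypothetical state and transition features."""
    total = sum(theta.get(("state", xj, hj), 0.0) for xj, hj in zip(x, h))
    total += sum(theta.get(("trans", a, b), 0.0) for a, b in zip(h, h[1:]))
    return total


def label_distribution(x, theta):
    """P(y|x, Theta) = sum over h with Proj(h) = y of P(h|x, Theta)."""
    candidates = list(itertools.product(LATENT, repeat=len(x)))
    weights = [math.exp(score(h, x, theta)) for h in candidates]
    z = sum(weights)  # normalizer over all latent sequences
    p_y = defaultdict(float)
    for h, w in zip(candidates, weights):
        p_y[proj(h)] += w / z
    return dict(p_y)


theta = {("state", "a", "A2"): 1.0, ("trans", "A1", "B2"): 0.5}  # made-up weights
p_y = label_distribution(("a", "b", "a"), theta)
print(len(p_y), sum(p_y.values()))
```

Because every latent sequence projects to exactly one label sequence, the resulting probabilities over $y$ sum to 1 (up to floating-point error).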
  31. DPLVM training: maximize $\sum_{i=1}^{d} \log P(y_i^* \mid x_i, \Theta) - R(\Theta)$, where $R(\Theta)$ is a regularization term on $\Theta$
  32. (1) CRF (Conditional Random Field), (2) Structured Perceptron, (3) DPLVM (Discriminative Probabilistic Latent Variable Model), (4) Latent Variable Perceptron
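  33. Latent Variable Perceptron. For each training example $(x_i, y_i^*)$:
     ‣ $h_i = \operatorname{argmax}_h F(h \mid x_i, \Theta^i)$, $y_i = \mathrm{Proj}(h_i)$
     ‣ if $y_i \neq y_i^*$: $\Theta^{i+1} = \Theta^i + f(h_i^*, x_i) - f(h_i, x_i)$
     ‣ if $y_i = y_i^*$: $\Theta^{i+1} = \Theta^i$
     ‣ where $h_i^* = \operatorname{argmax}_{h : \mathrm{Proj}(h) = y_i^*} F(h \mid x_i, \Theta^i)$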
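A minimal sketch of this update in Python follows. The latent sets (two latent variables per label), the feature templates, and the brute-force search are assumptions made for illustration, and the two training sentences are the "her flowers" pair from the earlier slides. The loop only prints the number of mistakes per pass; whether this toy run reaches zero mistakes is not claimed here.

```python
import itertools
from collections import Counter

# Hypothetical latent sets: two latent variables per label.
LATENT = ["B1", "B2", "I1", "I2", "O1", "O2"]


def proj(h):
    return tuple(hj[0] for hj in h)  # "B2" -> "B"


def features(h, x):
    """Hypothetical f(h, x): (word, latent) and (previous latent, latent) counts."""
    f = Counter()
    for j, (word, hj) in enumerate(zip(x, h)):
        f[("state", word, hj)] += 1
        if j > 0:
            f[("trans", h[j - 1], hj)] += 1
    return f


def score(h, x, theta):
    return sum(theta.get(k, 0) * v for k, v in features(h, x).items())


def argmax_latent(x, theta, y=None):
    """argmax_h F(h|x, Theta); if y is given, restrict to h with Proj(h) = y."""
    candidates = itertools.product(LATENT, repeat=len(x))
    if y is not None:
        candidates = (h for h in candidates if proj(h) == y)
    return max(candidates, key=lambda h: score(h, x, theta))


def update(theta, x, y_star):
    """One latent perceptron step; returns True if a mistake triggered an update."""
    h_hat = argmax_latent(x, theta)
    if proj(h_hat) == y_star:
        return False
    h_star = argmax_latent(x, theta, y_star)  # best latent path for the gold labels
    theta.update(features(h_star, x))         # + f(h*, x)
    theta.subtract(features(h_hat, x))        # - f(h_hat, x)
    return True


data = [
    (("They", "are", "her", "flowers", "."), ("B", "O", "B", "I", "O")),
    (("They", "gave", "her", "flowers", "."), ("B", "O", "B", "B", "O")),
]
theta = Counter()
for epoch in range(5):
    mistakes = sum(update(theta, x, y_star) for x, y_star in data)
    print(epoch, mistakes)
```

The only difference from the structured perceptron is that the "correct" feature vector is not given by the data: it comes from the best latent sequence consistent with $y_i^*$ under the current weights.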
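  34. Mistake bound. If $\{(x_i, y_i^*)\}_{i=1}^{d}$ is separable with margin $\delta > 0$, the number of mistakes is at most $\frac{2T^2M^2}{\delta^2}$, where $M = \max_{i,y} \|f(y, x_i)\|_2$ [the slide's definition of $T$ was lost in extraction]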
  35. (1) CRF (Conditional Random Field), (2) Structured Perceptron, (3) DPLVM (Discriminative Probabilistic Latent Variable Model), (4) Latent Variable Perceptron
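  36. Experiment (synthetic data)
     ‣ $X = \{a, b\}$
     ‣ $Y = \{A, B\}$
     ‣ $H_A = \{A_1, A_2\}$, $H_B = \{B_1, B_2\}$
     ‣ $h$ and $x$ are generated from $P(h_j \mid h_{j-1})$ and $P(x_j \mid h_j)$
     ‣ $y = \mathrm{Proj}(h)$
     ‣ Training data: $\{(x_i, y_i^*)\}_{i=1}^{d}$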
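  37. Experiment (synthetic data)
     ‣ Transition probabilities $P(h_j \mid h_{j-1})$, controlled by a parameter $p$:
         from \ to   A1         A2         B1         B2
         A1          (1 - p)/3  (1 - p)/3  (1 - p)/3  p
         A2          p          (1 - p)/3  (1 - p)/3  (1 - p)/3
         B1          (1 - p)/3  p          (1 - p)/3  (1 - p)/3
         B2          (1 - p)/3  (1 - p)/3  p          (1 - p)/3
     ‣ Emission probabilities $P(x_j = a \mid h_j)$:
         h_j = A1   h_j = A2   h_j = B1   h_j = B2
         0.1        0.7        0.7        0.6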
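The generative process described on slides 36 and 37 can be sketched as a small sampler. The transition and emission numbers are taken from the tables on the slide; the uniform choice of the first state, the sequence length, and the seed are assumptions, since the slides do not state them.

```python
import random

STATES = ["A1", "A2", "B1", "B2"]
# P(x_j = a | h_j), from the slide.
P_A = {"A1": 0.1, "A2": 0.7, "B1": 0.7, "B2": 0.6}
# For each "from" state, the "to" state that receives probability p in the table.
FAVOURED_NEXT = {"A1": "B2", "A2": "A1", "B1": "A2", "B2": "B1"}


def transition_row(state, p):
    """Row of the transition table: p for the favoured state, (1 - p)/3 otherwise."""
    return [p if nxt == FAVOURED_NEXT[state] else (1 - p) / 3 for nxt in STATES]


def sample_sequence(length, p, rng):
    h = [rng.choice(STATES)]  # assumption: uniform initial state
    for _ in range(length - 1):
        h.append(rng.choices(STATES, weights=transition_row(h[-1], p))[0])
    x = ["a" if rng.random() < P_A[hj] else "b" for hj in h]
    y = [hj[0] for hj in h]   # y = Proj(h)
    return x, y, h


rng = random.Random(0)        # arbitrary seed
x, y, h = sample_sequence(8, p=0.9, rng=rng)
print(x, y, h)
```

Only `x` and `y` would be handed to a learner; `h` is returned here purely for inspection, matching the setting in which the latent sequence is never observed.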
  38. [Figure: accuracy [%] (vertical axis, 40 to 100) against $p$ (horizontal axis, 0.5 to 0.95) for Latent Variable Perceptron and Structured Perceptron]
