[Kim+ ICML2012] Dirichlet Process with Mixed Random Measures : A Nonparametric Topic Model for Labeled Data

Supervised Nonparametric Topic Model

  1. [Kim+ ICML2012] Dirichlet Process with Mixed Random Measures: A Nonparametric Topic Model for Labeled Data
     2012/07/28, Nakatani Shuyo @ Cybozu Labs, Inc. (twitter: @shuyo)
  2. LDA (Latent Dirichlet Allocation) [Blei+ 03]
     • Unsupervised topic model
       – Each word has an unobserved topic
     • Parametric
       – The number of topics K is given in advance
     (via Wikipedia)
  3. Labeled LDA [Ramage+ 09]
     • Supervised topic model
       – Each document has observed labels
     • Parametric
     (via [Ramage+ 09])
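  4. Generative Process for L-LDA
     • $\boldsymbol{\beta}_k \sim \mathrm{Dir}(\boldsymbol{\eta})$
     • $\Lambda_k^{(d)} \sim \mathrm{Bernoulli}(\Phi_k)$ (topics corresponding to observed labels)
     • $\boldsymbol{\theta}^{(d)} \sim \mathrm{Dir}(\boldsymbol{\alpha}^{(d)})$
       – where $\boldsymbol{\alpha}^{(d)} = (\alpha_k)_{k:\,\Lambda_k^{(d)} = 1}$ (restricted to labeled parameters)
     • $z_i^{(d)} \sim \mathrm{Multi}(\boldsymbol{\theta}^{(d)})$
     • $w_i^{(d)} \sim \mathrm{Multi}(\boldsymbol{\beta}_{z_i^{(d)}})$
     (via [Ramage+ 09])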
  5. Pros/Cons of L-LDA
     • Pros
       – Easy to implement
     • Cons
       – The label-topic correspondence must be specified manually
       – Its performance depends on that correspondence
     (via [Ramage+ 09])
     ※ My implementation is here: https://github.com/shuyo/iir/blob/master/lda/llda.py
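  6. DP-MRM [Kim+ 12]: Dirichlet Process with Mixed Random Measures
     • Supervised topic model
     • Nonparametric
       – K is not the number of topics but the number of labels
     [Figure: graphical model of DP-MRM with nodes $\alpha$, $H$, $G_0^k$, $G_j$, $\theta_{ji}$, $x_{ji}$, $\lambda_j$, $r_j$, $\beta$, $\gamma_k$, $\eta$ and plates $N_j$, $D$, $K$]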
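  7. Generative Process for DP-MRM
     • $H = \mathrm{Dir}(\beta)$
     • $G_0^k \sim \mathrm{DP}(\gamma_k, H)$ (each label has a random measure as its topic space)
     • $\lambda_j \sim \mathrm{Dir}(\boldsymbol{r}_j \eta)$, where $r_{jk} = I[k \in \mathrm{label}(j)]$
     • $G_j \sim \mathrm{DP}\big(\alpha, \sum_{k \in \mathrm{label}(j)} \lambda_{jk} G_0^k\big)$ (mixed random measures)
     • $\theta_{ji} \sim G_j$, $x_{ji} \sim F(\theta_{ji}) = \mathrm{Multi}(\theta_{ji})$
     [Figure: graphical model of DP-MRM, as on the previous slide]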
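  8. Stick Breaking Process
     • $v_l^k \sim \mathrm{Beta}(1, \gamma_k)$, $\pi_l^k = v_l^k \prod_{d=0}^{l-1} (1 - v_d^k)$
     • $\phi_l^k \sim H$, $G_0^k = \sum_{l=0}^{\infty} \pi_l^k \delta_{\phi_l^k}$
     • $\lambda_j \sim \mathrm{Dir}(\boldsymbol{r}_j \eta)$, $w_{jt} \sim \mathrm{Beta}(1, \alpha)$, $\pi_{jt} = w_{jt} \prod_{d=0}^{t-1} (1 - w_{jd})$
     • $k_{jt} \sim \mathrm{Multi}(\lambda_j)$, $\psi_{jt} \sim G_0^{k_{jt}}$, $G_j = \sum_{t=0}^{\infty} \pi_{jt} \delta_{\psi_{jt}}$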
  9. Chinese Restaurant Franchise
     • $t_{ji}$: table index of the $i$-th term in the $j$-th document
     • $k_{jt}$, $l_{jt}$: dish indices on the $t$-th table of the $j$-th document
     • In the ordinary HDP, this layer consists of only a single DP $G_0$
  10. Inference (1)
      • Sampling $t$
  11. Inference (2)
      • Sampling $k$ and $l$
  12. Experiments
      • DP-MRM automatically learns a probabilistic correspondence between labels and topics.
      (via [Kim+ 12])
  13. (via [Kim+ 12])
      • L-LDA can also make predictions for single-labeled documents by assigning a common second label to all documents.
  14. References
      • [Kim+ ICML2012] Dirichlet Process with Mixed Random Measures: A Nonparametric Topic Model for Labeled Data
      • [Ramage+ EMNLP2009] Labeled LDA: A Supervised Topic Model for Credit Attribution in Multi-labeled Corpora
      • [Blei+ 2003] Latent Dirichlet Allocation
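
The generative process and stick-breaking construction of DP-MRM (slides 7 and 8) can be sketched with NumPy as below. This is only an illustrative sketch, not the implementation of [Kim+ 12]: the infinite sums are truncated at a fixed level, a single `gamma` is shared by all labels instead of a per-label $\gamma_k$, and all function names, parameter names, and toy sizes are hypothetical.

```python
import numpy as np


def stick_breaking(concentration, truncation, rng):
    """Truncated stick-breaking: pi_l = v_l * prod_{d<l} (1 - v_d)."""
    v = rng.beta(1.0, concentration, size=truncation)
    v[-1] = 1.0  # assumption: close the stick so the truncated weights sum to 1
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining


def draw_label_measures(n_labels, vocab_size, gamma, beta, truncation, rng):
    """One measure per label: G_0^k = sum_l pi_l^k * delta(phi_l^k), phi_l^k ~ Dir(beta)."""
    measures = []
    for _ in range(n_labels):
        weights = stick_breaking(gamma, truncation, rng)  # pi_l^k
        topics = rng.dirichlet(np.full(vocab_size, beta), size=truncation)  # phi_l^k
        measures.append((weights, topics))
    return measures


def generate_document(labels, n_words, measures, eta, alpha, truncation, rng):
    """Draw one document whose observed label set is `labels`."""
    labels = np.asarray(labels)
    # lambda_j ~ Dir(r_j * eta): a Dirichlet restricted to the document's labels
    lam = rng.dirichlet(np.full(len(labels), eta))
    table_weights = stick_breaking(alpha, truncation, rng)  # pi_jt
    # each table t picks a label k_jt ~ Multi(lambda_j) ...
    k_jt = labels[rng.choice(len(labels), size=truncation, p=lam)]
    # ... and then a topic index l_jt from that label's measure (psi_jt ~ G_0^{k_jt})
    l_jt = np.array([rng.choice(len(measures[k][0]), p=measures[k][0]) for k in k_jt])
    # theta_ji ~ G_j: each word picks a table, which fixes its topic
    t_ji = rng.choice(truncation, size=n_words, p=table_weights)
    # x_ji ~ Multi(theta_ji)
    words = np.array([
        rng.choice(measures[k_jt[t]][1].shape[1], p=measures[k_jt[t]][1][l_jt[t]])
        for t in t_ji
    ])
    return words, t_ji, k_jt, l_jt


rng = np.random.default_rng(0)
measures = draw_label_measures(n_labels=3, vocab_size=20, gamma=1.0, beta=0.5,
                               truncation=10, rng=rng)
words, t_ji, k_jt, l_jt = generate_document(labels=[0, 2], n_words=50, measures=measures,
                                            eta=1.0, alpha=1.0, truncation=15, rng=rng)
```

The returned `t_ji`, `k_jt`, and `l_jt` correspond to the table and dish indices of the Chinese Restaurant Franchise on slide 9; every table of this toy document draws its topic from the measure of label 0 or label 2 only.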

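
For comparison, the L-LDA generative process on slide 4 can be sketched in the same style. The sketch assumes the label indicators $\Lambda^{(d)}$ are given (observed) rather than drawn from the Bernoulli prior, and uses a symmetric scalar `alpha`; the function name, argument names, and toy sizes are hypothetical and are not taken from the linked llda.py.

```python
import numpy as np


def generate_llda_document(label_mask, n_words, topic_word, alpha, rng):
    """Draw one L-LDA document.

    label_mask: boolean vector Lambda^(d) of length K (True = label observed).
    topic_word: K x V matrix whose rows are beta_k ~ Dir(eta).
    """
    label_mask = np.asarray(label_mask, dtype=bool)
    active = np.flatnonzero(label_mask)  # topics with Lambda_k^(d) = 1
    theta = np.zeros(len(label_mask))
    # theta^(d) ~ Dir(alpha^(d)): the Dirichlet is restricted to the labeled topics
    theta[active] = rng.dirichlet(np.full(len(active), alpha))
    # z_i ~ Multi(theta^(d)), w_i ~ Multi(beta_{z_i})
    z = rng.choice(len(label_mask), size=n_words, p=theta)
    w = np.array([rng.choice(topic_word.shape[1], p=topic_word[k]) for k in z])
    return z, w


rng = np.random.default_rng(0)
n_topics, vocab_size = 4, 30
topic_word = rng.dirichlet(np.full(vocab_size, 0.5), size=n_topics)  # beta_k ~ Dir(eta)
z, w = generate_llda_document([True, False, True, False], n_words=40,
                              topic_word=topic_word, alpha=0.5, rng=rng)
```

Because `theta` is zero outside the labeled topics, every `z` value falls in the document's label set. This is the fixed one-to-one label-topic correspondence listed as a drawback on slide 5.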