Efficient Methods for Incorporating
Knowledge into Topic Models
[Yang, Downey and Boyd-Graber 2015]
2015/10/24
EMNLP 2015 Reading
@shuyo
Large-scale Topic Model
• In academic papers
– Up to 10^3 topics
• Industrial applications
– 10^5~10^6 topics!
– Search engines, online ads, and so on
– To capture infrequent topics
• This paper handles up to 500 topics...
really?
(Standard) LDA
[Blei+ 2003, Griffiths+ 2004]
• "Conventional" Gibbs sampling
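$$P(z = t \mid \mathbf{z}_{-}, w) \propto q_t := (n_{d,t} + \alpha)\,\frac{n_{w,t} + \beta}{n_t + V\beta}$$
– $T$: number of topics
– For $U \sim \mathcal{U}\left(0, \sum_{z=1}^{T} q_z\right)$, find $t$ s.t. $\sum_{z=1}^{t-1} q_z < U < \sum_{z=1}^{t} q_z$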
• For large $T$, this is computationally intensive
– $n_{w,t}$ is sparse
– When $T$ is very large, $n_{d,t}$ is sparse too, e.g. $T = 10^6 > n_d$
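The conventional sampling step above can be sketched in Python as follows. This is only an illustrative sketch, not code from the paper: the function name `sample_topic_dense` and the list-based count layout are assumptions, and the counts are assumed to already exclude the token being resampled.

```python
import random

def sample_topic_dense(n_dt, n_wt, n_t, alpha, beta, V, rng):
    """Draw a topic for one token by scanning all T topics (O(T) work).

    n_dt[t]: tokens in the current document assigned to topic t
    n_wt[t]: tokens of the current word type assigned to topic t
    n_t[t]:  all tokens assigned to topic t
    (hypothetical layout; counts exclude the token being resampled)
    """
    T = len(n_t)
    # q_t = (n_dt + alpha) * (n_wt + beta) / (n_t + V * beta)
    q = [(n_dt[t] + alpha) * (n_wt[t] + beta) / (n_t[t] + V * beta)
         for t in range(T)]
    u = rng.uniform(0.0, sum(q))
    acc = 0.0
    for t in range(T):
        acc += q[t]
        if u < acc:
            return t
    return T - 1  # guard against floating-point round-off
```

Every draw touches all $T$ topics, which is the cost that becomes a problem for very large $T$.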
SparseLDA [Yao+ 2009]
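$$P(z = t \mid \mathbf{z}_{-}, w) \propto \underbrace{\frac{\alpha\beta}{n_t + V\beta}}_{s_t} + \underbrace{\frac{n_{d,t}\,\beta}{n_t + V\beta}}_{r_t} + \underbrace{\frac{(n_{d,t} + \alpha)\, n_{w,t}}{n_t + V\beta}}_{q_t}$$
• $s = \sum_t s_t$, $r = \sum_t r_t$, $q = \sum_t q_t$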
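• For $U \sim \mathcal{U}(0, s + r + q)$,
– If $0 < U < s$, find $t$ s.t. $\sum_{z=1}^{t-1} s_z < U < \sum_{z=1}^{t} s_z$
– If $s < U < s + r$, find $t$ s.t. $n_{d,t} > 0$ and $\sum_{z=1}^{t-1} r_z < U - s < \sum_{z=1}^{t} r_z$
– If $s + r < U < s + r + q$, find $t$ s.t. $n_{w,t} > 0$ and $\sum_{z=1}^{t-1} q_z < U - s - r < \sum_{z=1}^{t} q_z$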
• Faster because 𝑛 𝑤,𝑡 and 𝑛 𝑑,𝑡 are sparse
• $s_t$: independent of $w$ and $d$; $r_t$: dependent on $d$ only
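The three-bucket sampling on this slide can be sketched as below. This is a simplified illustration, not the SparseLDA implementation: the name `sample_topic_sparse` and the dict-based sparse counts are assumptions, and the $s$ bucket is recomputed on every call here, whereas a real sampler would cache it because it does not depend on $w$ or $d$.

```python
import random

def sample_topic_sparse(n_dt, n_wt, n_t, alpha, beta, V, rng):
    """n_dt, n_wt: dicts {topic: count} with non-zero counts only; n_t: list."""
    T = len(n_t)
    denom = [n_t[t] + V * beta for t in range(T)]
    # s bucket: smoothing-only mass (recomputed here for clarity)
    s = sum(alpha * beta / denom[t] for t in range(T))
    # r bucket: only topics with n_dt > 0 contribute
    r_terms = {t: c * beta / denom[t] for t, c in n_dt.items()}
    # q bucket: only topics with n_wt > 0 contribute
    q_terms = {t: (n_dt.get(t, 0) + alpha) * c / denom[t]
               for t, c in n_wt.items()}
    r = sum(r_terms.values())
    q = sum(q_terms.values())

    u = rng.uniform(0.0, s + r + q)
    if u < s:
        acc = 0.0
        for t in range(T):
            acc += alpha * beta / denom[t]
            if u < acc:
                return t
        return T - 1
    u -= s
    if u < r:
        bucket = r_terms
    else:
        u -= r
        bucket = q_terms
    acc = 0.0
    last = None
    for t, mass in bucket.items():
        acc += mass
        last = t
        if u < acc:
            return t
    # guard against floating-point round-off or an empty bucket
    return last if last is not None else T - 1
```

Since $(n_{d,t} + \alpha)(n_{w,t} + \beta) = \alpha\beta + n_{d,t}\beta + (n_{d,t} + \alpha)\,n_{w,t}$, the total mass $s + r + q$ equals $\sum_t q_t$ of the conventional sampler, so both samplers target the same distribution.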
Leveraging Prior Knowledge
• The objective function of topic models
does not correlate with human
judgements
Word correlation prior
knowledge
• Must-link
– “quarterback” and “fumble” are both
related to American football
• Cannot-link
– “fumble” and “bank” imply two different
topics
SC-LDA [Yang+ 2015]
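• $m \in M$: prior knowledge
• $f_m(z, w, d)$: potential function of prior knowledge $m$ about word $w$ with topic $z$ in document $d$
• $\psi(\mathbf{z}, M) = \prod_{z \in \mathbf{z}} \exp f_m(z, w, d)$
– maybe the product is over $m \in M$ and all $w$ with $z$ in all $d$
• $P(\mathbf{w}, \mathbf{z} \mid \alpha, \beta, M) = P(\mathbf{w} \mid \mathbf{z}, \beta)\, P(\mathbf{z} \mid \alpha)\, \psi(\mathbf{z}, M)$
– maybe $\propto$ rather than $=$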
Sparse Constrained
Inference for SC-LDA
Word correlation prior
knowledge for SC-LDA
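• $f_m(z, w, d) = \sum_{u \in M_w^m} \log \max(\lambda, n_{u,z}) + \sum_{v \in M_w^c} \log \frac{1}{\max(\lambda, n_{v,z})}$
– where $M_w^m$: must-links of $w$, $M_w^c$: cannot-links of $w$
• $P(z = t \mid \mathbf{z}_{-}, w, M) \propto \left( \frac{\alpha\beta}{n_t + V\beta} + \frac{n_{d,t}\,\beta}{n_t + V\beta} + \frac{(n_{d,t} + \alpha)\, n_{w,t}}{n_t + V\beta} \right) \prod_{u \in M_w^m} \max(\lambda, n_{u,t}) \prod_{v \in M_w^c} \frac{1}{\max(\lambda, n_{v,t})}$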
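As a rough illustration of how the word-correlation potential rescales the LDA weights, the sketch below computes $\exp f_m(t, w, d)$ and multiplies it into the unnormalized topic weights. The function names, the `(word, topic)`-keyed count dict, and the example words are hypothetical, and the sparse bucket trick is left out for brevity.

```python
def correlation_factor(t, must_links, cannot_links, n_wt, lam):
    """exp(f_m(t, w, d)) for the word-correlation potential.

    must_links / cannot_links: words linked to the current word w
    n_wt: dict {(word, topic): count}; missing entries count as 0
    lam:  the floor lambda inside max(lambda, n)
    """
    factor = 1.0
    for u in must_links:
        factor *= max(lam, n_wt.get((u, t), 0))
    for v in cannot_links:
        factor /= max(lam, n_wt.get((v, t), 0))
    return factor

def sc_lda_weights(word, n_dt, n_wt, n_t, alpha, beta, V, must, cannot, lam):
    """Unnormalized P(z = t | ...) for every topic t (dense, for clarity)."""
    weights = []
    for t in range(len(n_t)):
        # (n_dt + alpha)(n_wt + beta) / (n_t + V beta) equals s_t + r_t + q_t
        base = ((n_dt[t] + alpha) * (n_wt.get((word, t), 0) + beta)
                / (n_t[t] + V * beta))
        weights.append(base * correlation_factor(t, must, cannot, n_wt, lam))
    return weights

# Toy counts: "fumble" is frequent in topic 0, "bank" less so.
counts = {("fumble", 0): 4, ("bank", 0): 2}
boost = correlation_factor(0, ["fumble"], ["bank"], counts, 1.0)  # 4 / 2
```

A topic where must-linked words are frequent gets a larger weight, and a topic where cannot-linked words are frequent gets a smaller one; $\lambda$ keeps the factor finite when a count is zero.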
Factor Graph
• The authors say that prior knowledge is incorporated “by adding a factor graph to encode prior knowledge,” but the graph is not drawn in the paper.
• The potential function $f_m(z, w, d)$ contains $n_{w,z}$, and $\varphi_{w,z} \propto n_{w,z} + \beta$.
• So the above model seems to look like Fig.b:
[Fig.a and Fig.b: figures not reproduced]
[Ramage+ 2009] Labeled LDA
• Supervised LDA for labeled documents
– It is equivalent to SC-LDA with the following potential function
$$f_m(z, w, d) = \begin{cases} 1, & \text{if } z \in m_d \\ -\infty, & \text{otherwise} \end{cases}$$
where $m_d$ specifies the label set of $d$
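The effect of this potential can be sketched in a few lines: topics outside the document's label set get weight $\exp(-\infty) = 0$, and the allowed topics share a common factor $e$ that cancels on normalization. The helper names below are hypothetical, and `base_weights` stands for any unnormalized LDA topic weights.

```python
import math

def label_potential(t, doc_labels):
    """f_m(z = t, w, d): 1 if topic t is in the document's label set, else -inf."""
    return 1.0 if t in doc_labels else -math.inf

def constrained_weights(base_weights, doc_labels):
    """Multiply each unnormalized topic weight by exp(f_m)."""
    return [w * math.exp(label_potential(t, doc_labels))
            for t, w in enumerate(base_weights)]

# Document labeled with topics 0 and 2: topic 1 can never be sampled.
weights = constrained_weights([1.0, 2.0, 3.0], {0, 2})
```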
Experiments
• Baselines
– Dirichlet Forest-LDA [Andrzejewski+ 2009]
– Logic-LDA [Andrzejewski+ 2011]
– MRF-LDA [Xie+ 2015]
• Encodes word correlations in LDA as MRF
– SparseLDA
DATASET    DOCS       TYPE     TOKEN (APPROX)  EXPERIMENTS
NIPS       1,500      12,419   1,900,000       Word correlation
NYT-NEWS   3,000,000  102,660  100,000,000     Word correlation
20NG       18,828     21,514   1,946,000       Labeled docs
Generate Word Correlation
• Must-link
– Obtain synsets from WordNet 3.0
– Keep a pair only when the similarity between the word and its synset, computed on word2vec embeddings, is higher than the threshold 0.2
• Cannot-link
– Nothing?
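The must-link filtering step could look roughly like the sketch below. Several things here are assumptions, not details from the paper or the slides: cosine similarity is used as the similarity measure, the candidate pairs and toy 2-dimensional vectors stand in for WordNet synsets and real word2vec embeddings, and all vectors are assumed to be non-zero.

```python
import math

def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def must_links(candidates, embeddings, threshold=0.2):
    """Keep candidate (word, synonym) pairs whose embedding similarity
    exceeds the threshold; pairs with a missing embedding are skipped."""
    links = []
    for w, u in candidates:
        if w in embeddings and u in embeddings:
            if cosine(embeddings[w], embeddings[u]) > threshold:
                links.append((w, u))
    return links

# Toy stand-ins for synset candidates and word2vec vectors.
toy_embeddings = {"car": [1.0, 0.0], "automobile": [0.9, 0.1], "bank": [0.0, 1.0]}
toy_candidates = [("car", "automobile"), ("car", "bank")]
kept = must_links(toy_candidates, toy_embeddings)
```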
Convergence Speed
The average running time per iteration over 100 iterations, averaged over 5 seeds, on the 20NG dataset.
Coherence [Mimno+ 2011]
• $C(t : V^{(t)}) = \sum_{m=2}^{M} \sum_{l=1}^{m-1} \log \frac{F(v_m^{(t)}, v_l^{(t)}) + \epsilon}{F(v_l^{(t)})}$
– $F(v)$: document frequency of word type $v$
– $F(v, v')$: co-document frequency of word types $v$, $v'$
It means “include”?
$\epsilon$ is very small, like $10^{-12}$ [Röder+ 2015]
-39.1 -36.6
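The coherence score above can be computed directly from document and co-document frequencies. The sketch below is a minimal illustration under stated assumptions: documents are given as sets of word types, `top_words` lists a topic's top $M$ words with the most probable first, and every top word occurs in at least one document (otherwise the denominator would be zero). The function name and toy corpus are hypothetical.

```python
import math

def coherence(top_words, docs, eps=1e-12):
    """Sum over m > l of log((F(v_m, v_l) + eps) / F(v_l))."""
    def doc_freq(*words):
        # number of documents containing all the given word types
        return sum(1 for d in docs if all(w in d for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            co = doc_freq(top_words[m], top_words[l])
            score += math.log((co + eps) / doc_freq(top_words[l]))
    return score

# Toy corpus: "a" occurs in 2 documents, "a" and "b" co-occur in 1.
toy_docs = [{"a", "b"}, {"a"}, {"b", "c"}]
toy_score = coherence(["a", "b"], toy_docs)
```

With a very small $\epsilon$, a word pair that never co-occurs contributes a large negative term, so such pairs are penalized heavily.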
References
• [Yang+ 2015] Efficient Methods for Incorporating Knowledge into Topic Models
• [Blei+ 2003] Latent Dirichlet allocation.
• [Griffiths+ 2004] Finding scientific topics.
• [Yao+ 2009] Efficient methods for topic model inference on streaming document
collections.
• [Ramage+ 2009] Labeled LDA: A supervised topic model for credit attribution in
multilabeled corpora.
• [Andrzejewski+ 2009] Incorporating domain knowledge into topic modeling via
Dirichlet forest priors.
• [Andrzejewski+ 2011] A framework for incorporating general domain knowledge
into latent Dirichlet allocation using first-order logic.
• [Xie+ 2015] Incorporating word correlation knowledge into topic modeling.
• [Mimno+ 2011] Optimizing semantic coherence in topic models.
• [Röder+ 2015] Exploring the space of topic coherence measures.

 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 

[Yang, Downey and Boyd-Graber 2015] Efficient Methods for Incorporating Knowledge into Topic Models

  • 1. Efficient Methods for Incorporating Knowledge into Topic Models [Yang, Downey and Boyd-Graber 2015] 2015/10/24 EMNLP 2015 Reading @shuyo
  • 2. Large-scale Topic Model • In academic papers – Up to 10^3 topics • Industrial applications – 10^5~10^6 topics! – Search engines, online ads. and so on – To capture infrequent topics • This paper handles up to 500 topics... really?
  • 3. (Standard) LDA [Blei+ 2003, Griffiths+ 2004]
      • "Conventional" Gibbs sampling: $P(z = t \mid \mathbf{z}_{-}, w) \propto q_t := (n_{d,t} + \alpha) \frac{n_{w,t} + \beta}{n_t + V\beta}$
        – $T$: number of topics
        – For $U \sim \mathcal{U}\left(0, \sum_{z=1}^{T} q_z\right)$, find $t$ s.t. $\sum_{z=1}^{t-1} q_z < U < \sum_{z=1}^{t} q_z$
      • For large $T$, this is computationally intensive
        – $n_{w,t}$ is sparse
        – When $T$ is very large, $n_{d,t}$ is sparse too, e.g. $T = 10^6 > n_d$
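The conventional draw on slide 3 can be sketched as follows. This is a minimal illustration, assuming dense per-topic count lists; the function name `sample_topic_dense`, its argument names, and the optional `u` argument (a fixed number in [0, 1) used in place of a random draw) are hypothetical and do not come from the slides. The loop touches all T topics, which is the cost the slide calls computationally intensive.

```python
import random

def sample_topic_dense(n_dt, n_wt, n_t, alpha, beta, V, u=None):
    """Draw a topic from q_t = (n_dt + alpha) * (n_wt + beta) / (n_t + V * beta).

    n_dt, n_wt, n_t are lists of length T with the counts for the current
    document, the current word, and each topic (current token excluded).
    """
    q = [(n_dt[t] + alpha) * (n_wt[t] + beta) / (n_t[t] + V * beta)
         for t in range(len(n_t))]
    # U ~ Uniform(0, sum of all q_t)
    U = (random.random() if u is None else u) * sum(q)
    acc = 0.0
    for t, q_t in enumerate(q):
        acc += q_t          # running cumulative sum over topics
        if U < acc:
            return t
    return len(q) - 1       # guard against floating-point round-off
```

Passing `u` makes the draw deterministic, which is convenient for checking the cumulative-sum logic by hand.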
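  • 4. SparseLDA [Yao+ 2009]
      • $\sum_t P(z = t \mid \mathbf{z}_{-}, w) \propto \sum_t \underbrace{\frac{\alpha\beta}{n_t + V\beta}}_{s_t} + \sum_t \underbrace{\frac{n_{d,t}\,\beta}{n_t + V\beta}}_{r_t} + \sum_t \underbrace{\frac{(n_{d,t} + \alpha)\, n_{w,t}}{n_t + V\beta}}_{q_t}$
        – $s_t$ is independent of $w$ and $d$; $r_t$ depends on $d$ only
        – Here $q_t$ denotes only the third term, unlike the $q_t$ of the standard sampler
      • $s = \sum_t s_t$, $r = \sum_t r_t$, $q = \sum_t q_t$
      • For $U \sim \mathcal{U}(0, s + r + q)$,
        – If $0 < U < s$, find $t$ s.t. $\sum_{z=1}^{t-1} s_z < U < \sum_{z=1}^{t} s_z$
        – If $s < U < s + r$, find $t$ s.t. $n_{d,t} > 0$, $\sum_{z=1}^{t-1} r_z < U - s < \sum_{z=1}^{t} r_z$
        – If $s + r < U < s + r + q$, find $t$ s.t. $n_{w,t} > 0$, $\sum_{z=1}^{t-1} q_z < U - s - r < \sum_{z=1}^{t} q_z$
      • Faster because $n_{w,t}$ and $n_{d,t}$ are sparse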
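The three-bucket draw on slide 4 can be sketched as below. This is an illustration under stated assumptions, not the implementation of [Yao+ 2009]: the name `sample_topic_sparse`, the dict-based sparse counts, and the optional `u` argument are hypothetical, and the smoothing bucket `s` is recomputed on every call for clarity, whereas an efficient implementation would cache it and update it incrementally.

```python
import random

def sample_topic_sparse(doc_counts, word_counts, n_t, alpha, beta, V, u=None):
    """doc_counts and word_counts are dicts {topic: count} with non-zero counts only."""
    denom = [n + V * beta for n in n_t]
    s_terms = [alpha * beta / d for d in denom]                  # every topic
    r_terms = {t: c * beta / denom[t] for t, c in doc_counts.items()}
    q_terms = {t: (doc_counts.get(t, 0) + alpha) * c / denom[t]
               for t, c in word_counts.items()}
    s, r, q = sum(s_terms), sum(r_terms.values()), sum(q_terms.values())
    U = (random.random() if u is None else u) * (s + r + q)
    if U < s:                         # smoothing-only bucket
        bucket = list(enumerate(s_terms))
    elif U < s + r:                   # topics with n_dt > 0
        U -= s
        bucket = list(r_terms.items())
    else:                             # topics with n_wt > 0
        U -= s + r
        bucket = list(q_terms.items())
    acc = 0.0
    for t, mass in bucket:
        acc += mass
        if U < acc:
            return t
    return bucket[-1][0]              # guard against floating-point round-off
```

Because (n_dt + α)(n_wt + β) = αβ + n_dt β + (n_dt + α) n_wt, the total mass s + r + q equals the sum of the dense weights, so the sketch samples from the same distribution as the conventional sampler while iterating mostly over non-zero counts.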
  • 5. Leveraging Prior Knowledge • The objective function of topic models does not correlate with human judgements
  • 6. Word correlation prior knowledge • Must-link – “quarterback” and “fumble” are both related to American football • Cannot-link – “fumble” and “bank” imply two different topics
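  • 7. SC-LDA (Sparse Constrained LDA) [Yang+ 2015]
      • $m \in M$: prior knowledge
      • $f_m(z, w, d)$: potential function of prior knowledge $m$ about word $w$ with topic $z$ in document $d$
      • $\psi(\mathbf{z}, M) = \prod_{z \in \mathbf{z}} \exp f_m(z, w, d)$
        – The product is maybe over $m \in M$ and all $w$ with $z$ in all $d$
      • $P(\mathbf{w}, \mathbf{z} \mid \alpha, \beta, M) = P(\mathbf{w} \mid \mathbf{z}, \beta)\, P(\mathbf{z} \mid \alpha)\, \psi(\mathbf{z}, M)$
        – Maybe $\propto$ rather than $=$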
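  • 9. Word correlation prior knowledge for SC-LDA
      • $f_m(z, w, d) = \sum_{u \in M_w^m} \log \max(\lambda, n_{u,z}) + \sum_{v \in M_w^c} \log \frac{1}{\max(\lambda, n_{v,z})}$
        – where $M_w^m$: must-links of $w$, $M_w^c$: cannot-links of $w$
      • $P(z = t \mid \mathbf{z}_{-}, w, M) \propto \left( \frac{\alpha\beta}{n_t + V\beta} + \frac{n_{d,t}\,\beta}{n_t + V\beta} + \frac{(n_{d,t} + \alpha)\, n_{w,t}}{n_t + V\beta} \right) \prod_{u \in M_w^m} \max(\lambda, n_{u,z}) \prod_{v \in M_w^c} \frac{1}{\max(\lambda, n_{v,z})}$ (with $z = t$)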
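The constrained conditional on slide 9 can be sketched as follows. This is a dense, unoptimised illustration: the function name `sc_lda_weights` and the data layout (`n_ut` maps each linked word to its list of per-topic counts) are assumptions made for the example, and the sparse bucket decomposition used for speed is deliberately left out.

```python
def sc_lda_weights(n_dt, n_wt, n_t, n_ut, must, cannot, alpha, beta, V, lam):
    """Unnormalised P(z = t | ...) for every topic t.

    must / cannot are the must-link and cannot-link word lists of the
    current word w; lam is the floor (lambda) applied to the counts.
    """
    weights = []
    for t in range(len(n_t)):
        # Equals s_t + r_t + q_t of the SparseLDA decomposition.
        base = (n_dt[t] + alpha) * (n_wt[t] + beta) / (n_t[t] + V * beta)
        factor = 1.0
        for u in must:        # must-links raise topics where u is frequent
            factor *= max(lam, n_ut[u][t])
        for v in cannot:      # cannot-links lower topics where v is frequent
            factor /= max(lam, n_ut[v][t])
        weights.append(base * factor)
    return weights
```

With an empty `must` and `cannot` list the weights reduce to the standard LDA weights, which matches the formula when both products are empty.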
  • 10. Factor Graph • The authors state that prior knowledge is incorporated “by adding a factor graph to encode prior knowledge,” but the graph itself is not drawn. • The potential function $f_m(z, w, d)$ contains $n_{w,z}$, and $\varphi_{w,z} \propto n_{w,z} + \beta$. • So the above model seems to look like Fig. b. [Figures: Fig. a, Fig. b]
  • 11. [Ramage+ 2009] Labeled LDA
      • Supervised LDA for labeled documents
        – It is equivalent to SC-LDA with the following potential function: $f_m(z, w, d) = \begin{cases} 1, & \text{if } z \in m_d \\ -\infty, & \text{otherwise} \end{cases}$, where $m_d$ specifies the label set of $d$
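The effect of the Labeled LDA potential on slide 11 can be illustrated with a short sketch. The names `labeled_lda_potential` and `constrain` are hypothetical; the point is only that exp(-∞) = 0 removes every topic outside the document's label set, while allowed topics are all scaled by the same constant e.

```python
import math

def labeled_lda_potential(z, labels_of_doc):
    """f_m(z, w, d): 1 if topic z is among the document's labels, else -inf."""
    return 1.0 if z in labels_of_doc else -math.inf

def constrain(weights, labels_of_doc):
    """Multiply unnormalised topic weights by exp(f_m); disallowed topics get 0."""
    return [w * math.exp(labeled_lda_potential(t, labels_of_doc))
            for t, w in enumerate(weights)]
```

Since the common factor e cancels on normalisation, sampling is effectively restricted to the labeled topics with their relative weights unchanged.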
  • 12. Experiments
      • Baselines
        – Dirichlet Forest-LDA [Andrzejewski+ 2009]
        – Logic-LDA [Andrzejewski+ 2011]
        – MRF-LDA [Xie+ 2015]
          • Encodes word correlations in LDA as an MRF
        – SparseLDA

        DATASET    DOCS        TYPE      TOKEN (APPROX)   EXPERIMENTS
        NIPS       1,500       12,419    1,900,000        Word correlation
        NYT-NEWS   3,000,000   102,660   100,000,000      Word correlation
        20NG       18,828      21,514    1,946,000        Labeled docs
  • 13. Generate Word Correlation • Must-link – Obtain synsets from WordNet 3.0 – Keep a link only if the similarity between the word and its synset, measured on word2vec word embeddings, is higher than the threshold 0.2 • Cannot-link – Nothing?
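The must-link filtering on slide 13 can be sketched with toy inputs. Everything here is an assumption made for illustration: the slide does not say which similarity is used, so cosine similarity is assumed; `synonyms` stands in for synset members looked up in WordNet and `embedding` for word2vec vectors, both passed in as plain dicts rather than loaded from the real resources.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def must_links(synonyms, embedding, threshold=0.2):
    """Keep candidate synonyms whose embedding similarity exceeds the threshold."""
    links = {}
    for word, candidates in synonyms.items():
        kept = [c for c in candidates
                if word in embedding and c in embedding
                and cosine(embedding[word], embedding[c]) > threshold]
        if kept:
            links[word] = kept
    return links
```

For example, with `synonyms = {"bank": ["depository", "slope"]}` and embeddings in which "slope" is orthogonal to "bank", only "depository" survives as a must-link of "bank".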
  • 14. Convergence Speed • Average running time per iteration over 100 iterations, averaged over 5 seeds, on the 20NG dataset.
  • 15. Coherence [Mimno+ 2011]
      • $C(t; V^{(t)}) = \sum_{m=2}^{M} \sum_{l=1}^{m-1} \log \frac{F(v_m^{(t)}, v_l^{(t)}) + \epsilon}{F(v_l^{(t)})}$
        – $F(v)$: document frequency of word type $v$
        – $F(v, v')$: co-document frequency of word types $v$ and $v'$
        – It means “include”?
        – $\epsilon$ is very small, like $10^{-12}$ [Röder+ 2015]
      • Values shown on the slide: -39.1, -36.6
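The coherence score on slide 15 can be computed with a short sketch like the one below. The function name `coherence` and the representation of documents as lists of tokens are assumptions for the example; it also assumes every top word occurs in at least one document, since otherwise the denominator F(v_l) would be zero.

```python
import math

def coherence(top_words, documents, eps=1e-12):
    """Sum over word pairs of log((F(v_m, v_l) + eps) / F(v_l)).

    top_words: the topic's top words, most probable first.
    documents: iterable of token lists.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents containing all of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for m in range(1, len(top_words)):      # m = 2..M in the slide's indexing
        for l in range(m):                  # l = 1..m-1
            co = doc_freq(top_words[m], top_words[l])
            score += math.log((co + eps) / doc_freq(top_words[l]))
    return score
```

Scores are non-positive because a pair can co-occur in at most as many documents as contain the single word, and scores closer to zero indicate that the top words co-occur more often.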
  • 16. References • [Yang+ 2015] Efficient Methods for Incorporating Knowledge into Topic Models • [Blei+ 2003] Latent Dirichlet allocation. • [Griffiths+ 2004] Finding scientific topics. • [Yao+ 2009] Efficient methods for topic model inference on streaming document collections. • [Ramage+ 2009] Labeled LDA: A supervised topic model for credit attribution in multilabeled corpora. • [Andrzejewski+ 2009] Incorporating domain knowledge into topic modeling via Dirichlet forest priors. • [Andrzejewski+ 2011] A framework for incorporating general domain knowledge into latent Dirichlet allocation using first-order logic. • [Xie+ 2015] Incorporating word correlation knowledge into topic modeling. • [Mimno+ 2011] Optimizing semantic coherence in topic models. • [Röder+ 2015] Exploring the space of topic coherence measures.