Latent Semantic Analysis
Kiarash Kiani
“Latent semantic analysis (LSA) is a technique in natural language processing, in particular distributional semantics, of analyzing relationships between a set of documents and the terms they contain by producing a set of concepts related to the documents and terms.”

“LSA assumes that words that are close in meaning will occur in similar pieces of text.”
Topics
• LSA: Latent Semantic Analysis
• PLSA: Probabilistic Latent Semantic Analysis
• Unsupervised PLSA
LSA
Documents: D = {d1, . . . , dn}
Vocabulary: W = {w1, . . . , wm}
Term-document matrix: Nn×m = (n(di, wj))ij, where n(di, wj) is the number of occurrences of word wj in document di.
Nn×m =
⎡ n(d1, w1)  n(d1, w2)  …  n(d1, wm) ⎤
⎢ n(d2, w1)  n(d2, w2)  …  n(d2, wm) ⎥
⎢     ⋮          ⋮       ⋱      ⋮    ⎥
⎣ n(dn, w1)  n(dn, w2)  …  n(dn, wm) ⎦
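For concreteness, a minimal sketch in plain Python that builds this count matrix (the two toy documents and the whitespace tokenization are invented for illustration):

```python
import numpy as np

# Toy corpus (invented): build N with n(d_i, w_j) = count of word w_j in d_i.
docs = ["the cat sat on the mat", "the dog chased the cat"]
vocab = sorted({w for d in docs for w in d.split()})
col = {w: j for j, w in enumerate(vocab)}

N = np.zeros((len(docs), len(vocab)), dtype=int)
for i, d in enumerate(docs):
    for w in d.split():
        N[i, col[w]] += 1

print(vocab)  # column labels
print(N)      # rows: documents, columns: vocabulary terms
```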
• Sparseness: a typical term-document matrix derived from short articles, text summaries, or abstracts may have only a small fraction of non-zero entries (typically well below 1%).
• Underestimating: one has to account for synonyms in order not to underestimate the true similarity between documents.
• Overestimating: one has to deal with polysemes to avoid overestimating the true similarity between documents by counting common terms that are used with different meanings.
LSA by SVD

Ñ = U Σ̃ Vᵗ ≈ U Σ Vᵗ = N
N Ñᵗ = U Σ̃² Uᵗ

Keeping only the k largest singular values in Σ̃ yields the rank-k approximation Ñ of the term-document matrix N.
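A minimal NumPy sketch of this truncation (the helper name lsa_truncate is my own): zero out all but the k largest singular values and reconstruct Ñ.

```python
import numpy as np

def lsa_truncate(N, k):
    """Rank-k LSA approximation: Ñ = U Σ̃ Vᵗ keeping the top-k singular values."""
    U, s, Vt = np.linalg.svd(N, full_matrices=False)
    s_trunc = np.zeros_like(s)
    s_trunc[:k] = s[:k]                      # Σ̃: keep the k largest values
    N_tilde = U @ np.diag(s_trunc) @ Vt      # low-rank reconstruction
    return N_tilde, U[:, :k], s[:k], Vt[:k, :]

# Usage with the count matrix N built above:
# N_tilde, Uk, sk, Vtk = lsa_truncate(N.astype(float), k=2)
```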
PLSA (aspect model)
Documents: D = {d1, . . . , dn}
Vocabulary: W = {w1, . . . , wm}
Classes (topics): Z = {z1, . . . , zk}
PLSA (aspect model)

[Figure: plate diagram d → z → w with parameters P(d), P(z|d), and P(w|z), repeated over the N observed document-word pairs.]

The generative process:
1. Select a document di with probability P(di).
2. Pick a latent class zk with probability P(zk|di).
3. Generate a word wj with probability P(wj|zk).
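The three steps translate directly into sampling code; a minimal sketch with invented toy parameters (none of these numbers come from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

P_d   = np.array([0.5, 0.5])                  # P(d_i), n = 2 documents
P_z_d = np.array([[0.9, 0.1],                 # P(z_k | d_i), rows: documents
                  [0.2, 0.8]])
P_w_z = np.array([[0.7, 0.2, 0.1],            # P(w_j | z_k), rows: topics
                  [0.1, 0.3, 0.6]])

def sample_pair():
    d = rng.choice(len(P_d), p=P_d)              # 1. select a document
    z = rng.choice(P_z_d.shape[1], p=P_z_d[d])   # 2. pick a latent class
    w = rng.choice(P_w_z.shape[1], p=P_w_z[z])   # 3. generate a word
    return d, w                                  # z stays hidden (latent)

pairs = [sample_pair() for _ in range(5)]
```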
Symmetric Parameterization

Asymmetric: D → Z → W with P(di), P(zk|di), P(wj|zk).
Symmetric: D ← Z → W with P(zk), P(di|zk), P(wj|zk).

P(di, wj) = Σ_{k=1}^{K} P(zk) P(di|zk) P(wj|zk)
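As arrays, this triple sum is a single einsum; a minimal sketch with invented toy parameters (the columns of P(d|z) and P(w|z) are normalized over documents and words respectively):

```python
import numpy as np

P_z   = np.array([0.6, 0.4])                # P(z_k), K = 2 topics
P_d_z = np.array([[0.7, 0.3],               # P(d_i | z_k), columns sum to 1
                  [0.3, 0.7]])
P_w_z = np.array([[0.5, 0.1],               # P(w_j | z_k), columns sum to 1
                  [0.3, 0.3],
                  [0.2, 0.6]])

# P(d_i, w_j) = Σ_k P(z_k) P(d_i|z_k) P(w_j|z_k)
P_dw = np.einsum('k,ik,jk->ij', P_z, P_d_z, P_w_z)
assert np.isclose(P_dw.sum(), 1.0)          # a proper joint distribution
```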
PLSA (aspect model)

[Figure: two mixture views. In the first, the word distribution of document di is a convex combination of the topic distributions P(w|z1), . . . , P(w|zk) with weights P(zk|di); in the second, each document simply carries its own distribution P(w|di) and prior P(di).]

P(di, wj) = P(di) P(wj|di)
P(wj|di) = Σ_{k=1}^{K} P(wj|zk) P(zk|di)
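In matrix form the mixture is a single product; a minimal sketch with invented toy parameters:

```python
import numpy as np

P_z_d = np.array([[0.9, 0.1],               # P(z_k | d_i): n×K, rows sum to 1
                  [0.2, 0.8]])
P_w_z = np.array([[0.7, 0.2, 0.1],          # P(w_j | z_k): K×m, rows sum to 1
                  [0.1, 0.3, 0.6]])

P_w_d = P_z_d @ P_w_z                       # n×m matrix of P(w_j | d_i)
assert np.allclose(P_w_d.sum(axis=1), 1.0)  # each row is a distribution
```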
Expectation-Maximization (EM) Algorithm

1. Expectation (E) step: posterior probabilities of the latent variables are computed, based on the current estimates of the parameters.
2. Maximization (M) step: the parameters are updated based on the expected complete-data log-likelihood, which depends on the posterior probabilities computed in the E-step.

E-step:
P(zk|di, wj) = P(wj|zk) P(zk|di) / Σ_{l=1}^{K} P(wj|zl) P(zl|di)

M-step:
P(wj|zk) = Σ_{i=1}^{N} n(di, wj) P(zk|di, wj) / Σ_{m=1}^{M} Σ_{i=1}^{N} n(di, wm) P(zk|di, wm)
P(zk|di) = Σ_{j=1}^{M} n(di, wj) P(zk|di, wj) / n(di), where n(di) = Σ_{j} n(di, wj).
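A minimal NumPy sketch of these updates (function and variable names are my own; the dense n×m×K posterior array favors clarity over memory efficiency):

```python
import numpy as np

def plsa_em(N, K, n_iter=100, seed=0):
    """EM for the PLSA aspect model.

    N : (n, m) term-document count matrix with entries n(di, wj).
    Returns P(w|z) as a (K, m) array and P(z|d) as an (n, K) array.
    """
    rng = np.random.default_rng(seed)
    n, m = N.shape
    # Random normalized initializations of the parameters.
    P_w_z = rng.random((K, m))
    P_w_z /= P_w_z.sum(axis=1, keepdims=True)
    P_z_d = rng.random((n, K))
    P_z_d /= P_z_d.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # E-step: P(zk|di,wj) ∝ P(wj|zk) P(zk|di); shape (n, m, K).
        post = P_z_d[:, None, :] * P_w_z.T[None, :, :]
        post /= post.sum(axis=2, keepdims=True) + 1e-12

        # M-step: reweight the posteriors by the observed counts n(di, wj).
        weighted = N[:, :, None] * post              # n(di,wj) P(zk|di,wj)
        P_w_z = weighted.sum(axis=0).T               # (K, m) numerators
        P_w_z /= P_w_z.sum(axis=1, keepdims=True) + 1e-12
        P_z_d = weighted.sum(axis=1)                 # (n, K) numerators
        P_z_d /= N.sum(axis=1, keepdims=True) + 1e-12  # divide by n(di)

    return P_w_z, P_z_d
```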
SVD

U = (P(di|zk))_{i,k}
V = (P(wj|zk))_{j,k}
Σ = diag(P(zk))_{k}

so that the joint probability matrix factorizes as P = UΣVᵗ, matching
P(di, wj) = Σ_{k=1}^{K} P(zk) P(di|zk) P(wj|zk).
Latent Probability Space

[Figure: the topic word distributions P(wj|z1), P(wj|z2), P(wj|z3) span a sub-simplex inside the full word simplex; each document's distribution P(wj|di) lies in their convex hull.]
Aspect versus Cluster

In the aspect model each document mixes several topics; in a clustering (mixture) model each document is assigned to a single cluster c(di):

P(wj|di) = Σ_{k=1}^{K} P{c(di) = ck} P(wj|ck)

with posterior cluster probabilities

P{c(di) = ck} = P(ck) Π_{j=1}^{M} P(wj|ck)^{n(di,wj)} / Σ_{l=1}^{K} P(cl) Π_{j=1}^{M} P(wj|cl)^{n(di,wj)}
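A minimal sketch of this posterior (the helper name is my own): the products over words underflow quickly, so it is computed in log space.

```python
import numpy as np

def cluster_posterior(N, P_c, P_w_c):
    """Posterior cluster assignment P{c(di) = ck} for each document.

    N: (n, m) counts; P_c: (K,) cluster priors; P_w_c: (K, m) word
    distributions (assumed strictly positive so the log is defined).
    """
    # Σ_j n(di, wj) log P(wj|ck) for every document/cluster pair: (n, K).
    log_lik = N @ np.log(P_w_c).T
    log_post = np.log(P_c)[None, :] + log_lik
    log_post -= log_post.max(axis=1, keepdims=True)  # stabilize before exp
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)
```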
Aspect versus Cluster

Model quality is compared by perplexity on hold-out data, where n′(di, wj) are the hold-out counts:

P = exp( − Σ_{i,j} n′(di, wj) log P(wj|di) / Σ_{i,j} n′(di, wj) )

Tempered EM (TEM) replaces the E-step posteriors with a tempered version controlled by an inverse-temperature parameter β:

P̃(zk; di, wj) = [P(zk|di) P(wj|zk)]^β / Σ_{l} [P(zl|di) P(wj|zl)]^β

The TEM fitting procedure:
1. Set β ← 1 and perform EM with early stopping.
2. Decrease β ← ηβ (with η < 1) and perform one TEM iteration.
3. As long as the performance on hold-out data improves (non-negligibly), continue TEM iterations at this value of β; otherwise go to step 2.
4. Perform stopping on β, i.e., stop when decreasing β does not yield further improvements.
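A minimal sketch of the tempered E-step (helper name my own); it changes only the posterior computation from the EM sketch above, and β = 1 recovers standard EM:

```python
import numpy as np

def tempered_posterior(P_z_d, P_w_z, beta):
    """Tempered E-step: raise each term to the power beta before normalizing.

    beta = 1 recovers the standard E-step; beta < 1 flattens the posteriors,
    which acts as a regularizer against overfitting.
    """
    post = (P_z_d[:, None, :] * P_w_z.T[None, :, :]) ** beta
    return post / (post.sum(axis=2, keepdims=True) + 1e-12)
```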
References
• http://www.statisticshowto.com/likelihood-function/
• https://pdfs.semanticscholar.org/presentation/6a21/da166526b0f6a2cdb5eb451bb46327d0f7b2.pdf
• https://www.youtube.com/watch?v=vtadpVDr1hM
• https://arxiv.org/pdf/1301.6705.pdf
• Radford M. Neal & Geoffrey E. Hinton, “A View of the EM Algorithm that Justifies Incremental, Sparse, and Other Variants”