2. “Latent semantic analysis (LSA) is a technique in natural language processing, in particular distributional semantics, of analyzing relationships between a set of documents and the terms they contain by producing a set of concepts related to the documents and terms.”
“LSA assumes that words that are close in meaning will occur in similar pieces of text.”
4. LSA
Documents: D = {d_1, ..., d_n}
Vocabulary: W = {w_1, ..., w_m}
Term-document matrix: N_{n×m} = (n(d_i, w_j))_{ij}
5. N_{n\times m} =
\begin{pmatrix}
n(d_1, w_1) & n(d_1, w_2) & \cdots & n(d_1, w_m) \\
n(d_2, w_1) & n(d_2, w_2) & \cdots & n(d_2, w_m) \\
\vdots & \vdots & \ddots & \vdots \\
n(d_n, w_1) & n(d_n, w_2) & \cdots & n(d_n, w_m)
\end{pmatrix}
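As a rough illustration, here is a plain-Python sketch that builds such a count matrix from a few made-up documents (the toy corpus and variable names are assumptions, not from the deck) and reports the fraction of non-zero entries; a tiny toy corpus will not show the extreme sparsity of real abstract collections, but the computation is the same and ties into the sparseness point below.

```python
from collections import Counter

# Toy corpus (hypothetical example documents, just to illustrate n(d_i, w_j)).
docs = [
    "latent semantic analysis maps terms and documents to concepts",
    "the aspect model associates a latent class with each observation",
    "term document matrices from abstracts are extremely sparse",
]

tokenized = [d.split() for d in docs]
vocab = sorted({w for doc in tokenized for w in doc})   # W = {w_1, ..., w_m}
counts = [Counter(doc) for doc in tokenized]

# N[i][j] = n(d_i, w_j): how often word w_j occurs in document d_i.
N = [[c[w] for w in vocab] for c in counts]

nonzero = sum(1 for row in N for x in row if x > 0)
print(f"{len(N)} x {len(vocab)} matrix, "
      f"{nonzero / (len(N) * len(vocab)):.1%} non-zero entries")
```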
Sparseness: a typical term-document matrix derived from short articles, text summaries, or abstracts may have only a small fraction of non-zero entries (typically well below 1%).
Underestimation: one has to account for synonyms in order not to underestimate the true similarity of documents.
Overestimation: one has to deal with polysemes to avoid overestimating the true similarity between documents by counting common terms that are used in different meanings.
11. Expectation Maximization (EM) algorithm
Expectation (E) step: posterior probabilities are computed for the latent variables, based on the current estimates of the parameters.
Maximization (M) step: parameters are updated based on the so-called expected complete-data log-likelihood, which depends on the posterior probabilities computed in the E-step.
E-step:
P(z_k \mid d_i, w_j) = \frac{P(w_j \mid z_k)\, P(z_k \mid d_i)}{\sum_{l=1}^{K} P(w_j \mid z_l)\, P(z_l \mid d_i)}

M-step:
P(w_j \mid z_k) = \frac{\sum_{i=1}^{N} n(d_i, w_j)\, P(z_k \mid d_i, w_j)}{\sum_{m=1}^{M} \sum_{i=1}^{N} n(d_i, w_m)\, P(z_k \mid d_i, w_m)}

P(z_k \mid d_i) = \frac{\sum_{j=1}^{M} n(d_i, w_j)\, P(z_k \mid d_i, w_j)}{n(d_i)}
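A compact numpy sketch of one way these E- and M-step updates can be implemented; the random toy counts, array names, and sizes below are assumptions for illustration, not taken from the deck.

```python
import numpy as np

# Toy pLSA EM sketch (hypothetical setup, purely illustrative).
rng = np.random.default_rng(0)
N_docs, M_words, K = 4, 6, 2
n = rng.integers(0, 5, size=(N_docs, M_words)).astype(float)   # n(d_i, w_j)

p_w_z = rng.random((M_words, K)); p_w_z /= p_w_z.sum(axis=0)   # P(w_j | z_k)
p_z_d = rng.random((K, N_docs)); p_z_d /= p_z_d.sum(axis=0)    # P(z_k | d_i)

for _ in range(50):
    # E-step: P(z_k | d_i, w_j) ∝ P(w_j | z_k) P(z_k | d_i), normalized over k.
    post = p_w_z[None, :, :] * p_z_d.T[:, None, :]              # shape (N, M, K)
    post /= post.sum(axis=2, keepdims=True) + 1e-12

    # M-step: re-estimate P(w_j | z_k) and P(z_k | d_i).
    nw = n[:, :, None] * post                                   # n(d_i, w_j) P(z_k | d_i, w_j)
    p_w_z = nw.sum(axis=0)
    p_w_z /= p_w_z.sum(axis=0, keepdims=True) + 1e-12           # normalize over words j
    p_z_d = nw.sum(axis=1).T
    p_z_d /= n.sum(axis=1)[None, :] + 1e-12                     # divide by n(d_i)
```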
12. SVD
U = (P(d_i \mid z_k))_{i,k}
V = (P(w_j \mid z_k))_{j,k}
\Sigma = \mathrm{diag}(P(z_k))_k
P = U \Sigma V^{\top}
P(d_i, w_j) = \sum_{k=1}^{K} P(z_k)\, P(d_i \mid z_k)\, P(w_j \mid z_k)
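A small numpy check of this correspondence: with toy probabilities (my own, purely illustrative), the mixture sum and the matrix product U Σ V^T give the same joint matrix.

```python
import numpy as np

# Sketch checking the SVD-style factorization of the pLSA joint (toy values).
rng = np.random.default_rng(1)
N_docs, M_words, K = 3, 5, 2

p_z = rng.random(K); p_z /= p_z.sum()                 # Σ = diag(P(z_k))
U = rng.random((N_docs, K)); U /= U.sum(axis=0)       # U[i, k] = P(d_i | z_k)
V = rng.random((M_words, K)); V /= V.sum(axis=0)      # V[j, k] = P(w_j | z_k)

P_svd = U @ np.diag(p_z) @ V.T                        # P = U Σ V^T
P_mix = np.einsum("k,ik,jk->ij", p_z, U, V)           # Σ_k P(z_k) P(d_i|z_k) P(w_j|z_k)
assert np.allclose(P_svd, P_mix)
print(P_svd.sum())                                    # joint P(d, w) sums to 1
```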
14. Aspect versus Cluster
P(w_j \mid d_i) = \sum_{k=1}^{K} P\{c(d_i) = c_k\}\, P(w_j \mid c_k)

P\{c(d_i) = c_k\} = \frac{P(c_k) \prod_{j=1}^{M} P(w_j \mid c_k)^{n(d_i, w_j)}}{\sum_{l=1}^{K} P(c_l) \prod_{j=1}^{M} P(w_j \mid c_l)^{n(d_i, w_j)}}
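A brief sketch of how the cluster posterior above can be evaluated for a single document; the toy values are assumptions, and the log-space computation is just a standard way to keep the product over words numerically stable.

```python
import numpy as np

# Cluster posterior P{c(d_i) = c_k} for one document (toy values, my own setup).
rng = np.random.default_rng(2)
K, M = 2, 6
p_c = rng.random(K); p_c /= p_c.sum()                   # P(c_k)
p_w_c = rng.random((M, K)); p_w_c /= p_w_c.sum(axis=0)  # P(w_j | c_k)
n_d = rng.integers(0, 4, size=M)                        # n(d_i, w_j) for document d_i

# log P(c_k) + Σ_j n(d_i, w_j) log P(w_j | c_k), then normalize over k.
log_post = np.log(p_c) + (n_d[:, None] * np.log(p_w_c)).sum(axis=0)
post = np.exp(log_post - log_post.max())
post /= post.sum()
print(post)                                             # P{c(d_i) = c_k}, sums to 1
```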
15. Aspect versus Cluster
Perplexity on hold-out counts n′(d_i, w_j):
P = \exp\left[ - \frac{\sum_{i,j} n'(d_i, w_j) \log P(w_j \mid d_i)}{\sum_{i,j} n'(d_i, w_j)} \right]

Tempered E-step (with inverse temperature β):
\tilde{P}(z_k; d_i, w_j) = \frac{\left[ P(z_k \mid d_i)\, P(w_j \mid z_k) \right]^{\beta}}{\sum_{l} \left[ P(z_l \mid d_i)\, P(w_j \mid z_l) \right]^{\beta}}
1. Set β ← 1 and perform EM with early stopping.
2. Decrease β ← ηβ (with η < 1) and perform one TEM iteration.
3. As long as the performance on hold-out data improves (non-negligibly), continue TEM iterations at this value of β; otherwise go to step 2.
4. Perform stopping on β, i.e., stop when decreasing β does not yield further improvements.
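A minimal sketch of the two quantities this schedule relies on, the tempered posterior and the perplexity on hold-out counts n′; the toy arrays, the single fixed β value, and the variable names are assumptions for illustration, not the deck's code.

```python
import numpy as np

# Toy tempered E-step and hold-out perplexity (hypothetical values).
rng = np.random.default_rng(3)
N_docs, M_words, K = 4, 6, 2
n_prime = rng.integers(0, 3, size=(N_docs, M_words)).astype(float)  # hold-out counts n'(d_i, w_j)

p_w_z = rng.random((M_words, K)); p_w_z /= p_w_z.sum(axis=0)        # P(w_j | z_k)
p_z_d = rng.random((K, N_docs)); p_z_d /= p_z_d.sum(axis=0)         # P(z_k | d_i)
beta = 0.9

# Tempered E-step: raise the joint factors to the power beta, normalize over k.
post = (p_w_z[None, :, :] * p_z_d.T[:, None, :]) ** beta            # shape (N, M, K)
post /= post.sum(axis=2, keepdims=True)
assert np.allclose(post.sum(axis=2), 1.0)

# Hold-out perplexity: P = exp(-Σ n' log P(w|d) / Σ n').
p_w_d = p_z_d.T @ p_w_z.T                                           # P(w_j | d_i) = Σ_k P(z_k|d_i) P(w_j|z_k)
perplexity = np.exp(-(n_prime * np.log(p_w_d)).sum() / n_prime.sum())
print(f"hold-out perplexity: {perplexity:.2f}")
```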