Lifelong Topic Modelling
Paper Review Presentation
Daniele Di Mitri
Department of Knowledge Engineering
University of Maastricht
22nd May 2015
Chosen paper
Chen, Zhiyuan, and Bing Liu.
Topic Modeling using Topics from Many Domains, Lifelong Learning and Big Data.
Proceedings of the 31st International Conference on Machine Learning (ICML), 2014.
Outline
1 Topic modelling
LDA description
LDA limitations
2 Topic modelling using knowledge
Knowledge-Based Topic Modelling
3 Lifelong Topic modelling
Lifelong learning approach
The proposed algorithm
Incorporation of knowledge
4 Evaluation
5 Summary
Latent Dirichlet Allocation
Some useful background

[Figure omitted: four example topics shown as word distributions (gene, dna, genetic, ...; life, evolve, organism, ...; brain, neuron, nerve, ...; data, number, computer, ...), alongside a document with its topic proportions and per-word topic assignments.]
• Each topic is a distribution over words
• Each document is a mixture of corpus-wide topics
• Each word is drawn from one of those topics
Figure: David Blei, Probabilistic Topic Models, 2012
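As a minimal, hedged illustration of these three assumptions, here is a sketch using scikit-learn's LatentDirichletAllocation; the toy corpus and K = 2 are invented for this example, not taken from the paper.

```python
# Minimal LDA sketch with scikit-learn; the toy corpus and K = 2
# are illustrative assumptions, not the paper's setup.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "gene dna genetic sequence",
    "brain neuron nerve signal",
    "data number computer algorithm",
    "dna gene organism evolve",
]

X = CountVectorizer().fit_transform(docs)  # bag-of-words counts
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

# Each topic is a distribution over words...
topic_word = lda.components_ / lda.components_.sum(axis=1, keepdims=True)
# ...and each document is a mixture of those corpus-wide topics.
doc_topic = lda.transform(X)
print(topic_word.round(2))
print(doc_topic.round(2))
```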
LDA limitations
As a fully unsupervised model, LDA can produce incoherent topics
Example
LDA sample topics
D1 = {price, color, cost, life}
D2 = {cost, picture, price, expensive}
D3 = {price, money, customer, expensive}
These topics contain semantically incoherent words: color, life, picture, customer
Can we use knowledge?
Some related work
SUPERVISED
Topic models in supervised settings
E.g. Blei & McAuliffe (2007)
Assume all prior knowledge is correct
Use "regions" and "labels"
UNSUPERVISED
Knowledge Based Topic Modelling
E.g. GK-LDA (Chen et al. 2013) and DF-LDA (Andrzejewski et al. 2009)
Typically assume the given knowledge is correct
Do not automatically extract or target prior knowledge
Can we do better?
A fully automatic system to mine prior knowledge and deal with inconsistencies
INTUITION
If we find a set of words common to two domains, these can serve as prior knowledge
Example
D1 ∩ D2 = {price, cost}
D2 ∩ D3 = {price, expensive}
These are prior knowledge sets (pk-sets)
Example (D1 improved)
D1 = {price, cost, expensive, color}
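A small sketch of this intuition in Python, using the toy topics above; the helper name pk_sets and the minimum-size threshold are my own.

```python
from itertools import combinations

# Toy topics from the slide, one per domain
D1 = {"price", "color", "cost", "life"}
D2 = {"cost", "picture", "price", "expensive"}
D3 = {"price", "money", "customer", "expensive"}

def pk_sets(topics, min_size=2):
    """Pairwise intersections with at least min_size shared words
    become prior-knowledge sets (pk-sets)."""
    return [a & b for a, b in combinations(topics, 2)
            if len(a & b) >= min_size]

print(pk_sets([D1, D2, D3]))
# e.g. [{'price', 'cost'}, {'price', 'expensive'}]
# (D1 ∩ D3 = {price} is too small to qualify)
```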
Lifelong Learning approach
In 4 "simple" steps
1 Given a set of domains D = {D1, ..., Dn}, run plain LDA(Di) on each domain to generate the prior topics (p-topics), whose union forms S
2 Given a test domain Dt, run LTM(Dt) to generate the current topics (c-topics) At
3 For each aj ∈ At, find its set of matching topics M^t_j ⊆ S (the high-level knowledge for aj)
4 Mine M^t_j to generate pk-sets of length 2
Why lifelong learning? The knowledge learnt with LTM is retained and added to (or replaces parts of) the initial prior topics S; a sketch of this flow follows.
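A hedged sketch of the four-step flow; run_lda, run_ltm and match are hypothetical stand-ins for the samplers and the topic matcher, injected as parameters because the slide does not specify their internals.

```python
from itertools import combinations

def mine_pk_sets(topics, min_size=2):
    """Step 4: length-2 pk-sets mined from a group of matched topics
    (here approximated as pairwise word intersections)."""
    return {frozenset(a & b) for a, b in combinations(topics, 2)
            if len(a & b) >= min_size}

def lifelong_topic_model(prior_domains, D_t, run_lda, run_ltm, match):
    """Sketch of steps 1-4; run_lda, run_ltm and match are stand-ins."""
    # Step 1: plain LDA on every prior domain; union the p-topics into S.
    S = [t for D_i in prior_domains for t in run_lda(D_i)]

    # Step 2: LTM on the test domain, guided by the knowledge K_t.
    K_t = set()
    A_t = run_ltm(D_t, K_t)

    # Steps 3-4: match each c-topic a_j against S, then mine pk-sets.
    for a_j in A_t:
        M_tj = match(a_j, S)       # matching p-topics for a_j
        K_t |= mine_pk_sets(M_tj)  # pk-sets of length 2

    # Lifelong retention: the learnt topics are added back into S.
    return A_t, S + A_t, K_t
```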
LTM algorithm
1 Run GibbsSampling(Dt, ∅) (equivalent to plain LDA) for N iterations
2 Run GibbsSampling(Dt, Kt) for N iterations, now incorporating the knowledge Kt
3 Kt is updated at each iteration: the p-topics sk ∈ S matching each c-topic aj ∈ At are found by minimum symmetrised KL-divergence, and Frequent Itemset Mining generates the frequent itemsets of length 2 (the pk-sets); the matching step is sketched below
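A sketch of the matching step, treating each topic as a probability distribution over a shared vocabulary; the smoothing constant eps and the function names are assumptions of mine.

```python
import numpy as np

def symmetrised_kl(p, q, eps=1e-12):
    """Symmetrised KL divergence, KL(p||q) + KL(q||p), between two
    topic-word distributions; eps avoids log(0) on unseen words."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

def best_match(a_j, S):
    """Return the p-topic s_k in S closest to the c-topic a_j."""
    return min(S, key=lambda s_k: symmetrised_kl(a_j, s_k))
```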
How does LTM incorporate knowledge?
NB: in the generalised Pólya urn model used by LTM, a word's count is incremented not by 1 but by a proportion, which is stored in a promotion matrix and determined using Pointwise Mutual Information:
PMI(w1, w2) = log( P(w1, w2) / (P(w1) P(w2)) )
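The PMI above can be estimated from document co-occurrence counts; a minimal sketch, assuming documents are represented as sets of words.

```python
import math

def pmi(w1, w2, docs):
    """PMI(w1, w2) = log( P(w1, w2) / (P(w1) P(w2)) ), with the
    probabilities estimated as document frequencies."""
    n = len(docs)
    p1 = sum(w1 in d for d in docs) / n
    p2 = sum(w2 in d for d in docs) / n
    p12 = sum(w1 in d and w2 in d for d in docs) / n
    return math.log(p12 / (p1 * p2)) if p12 > 0 else float("-inf")

docs = [{"price", "cost"}, {"price", "expensive"}, {"battery", "life"}]
print(pmi("price", "cost", docs))  # > 0: the pair tends to co-occur
```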
Evaluation
Tested against four baseline algorithms: LDA, DF-LDA, GK-LDA and AKL
Average Topic Coherence is used as the quality measure
Figure: Results of tests in settings 1 & 2
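To my understanding the Average Topic Coherence here follows the document co-occurrence statistic of Mimno et al. (2011); a hedged sketch, assuming documents are represented as sets of words and each top word occurs in at least one document.

```python
import math

def topic_coherence(top_words, docs):
    """Topic coherence in the style of Mimno et al. (2011):
    sum over word pairs of log((D(w_m, w_l) + 1) / D(w_l)),
    where D(.) counts the documents containing the given words.
    Assumes every word in top_words appears in at least one doc."""
    def D(*words):
        return sum(all(w in d for w in words) for d in docs)
    return sum(math.log((D(top_words[m], top_words[l]) + 1)
                        / D(top_words[l]))
               for m in range(1, len(top_words))
               for l in range(m))
```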
In summary
Lifelong Topic Modelling
Learns prior knowledge automatically
Fault-tolerant to incorrect knowledge
First lifelong-learning topic model
Big Data ready
However...
some points for improvement
Text corpora should be diversified (only Amazon reviews are used)
The paper focuses mainly on the flow of the algorithm
The 2nd test setting and the Big Data experiments are not fully reported
Thank you!
Q&A
