A Simple Stochastic Gradient Variational Bayes for Latent Dirichlet Allocation
Tomonari MASADA (正田备也)
Nagasaki University (长崎大学)
masada@nagasaki-u.ac.jp
Aim
• Obtain an informative summary of a large set of documents
• by extracting word lists, each relating to a specific topic
→ Topic modeling
Contribution
• We propose a new posterior estimation method for latent Dirichlet allocation (LDA) [Blei+ 03]
• by applying stochastic gradient variational Bayes (SGVB) [Kingma+ 14] to LDA.
LDA [Blei+ 03]
• LDA clusters word tokens by assigning each word token to one of the 𝐾 topics.
• 𝑧_𝑑𝑖 (discrete variable): the topic to which the 𝑖-th word token in document 𝑑 is assigned.
• 𝜃_𝑑𝑘 (continuous variable): how often topic 𝑘 is talked about in document 𝑑.
  • Topic probability distribution in each document
• 𝜙_𝑘𝑣 (continuous variable): how often word 𝑣 is used to talk about topic 𝑘.
  • Word probability distribution for each topic
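The roles of the three variables can be illustrated with a minimal sketch of how one document's tokens are generated once 𝜃 and 𝜙 are given. The function names and toy values below are hypothetical, and the Dirichlet priors that LDA places on 𝜃 and 𝜙 are left out for brevity.

```python
import random

def sample_categorical(probs, rng):
    """Draw an index k with probability probs[k]."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate_document(theta_d, phi, n_tokens, rng):
    """Generate (topic, word) pairs for one document.

    theta_d[k]: probability of topic k in this document.
    phi[k][v]:  probability of word v under topic k.
    """
    tokens = []
    for _ in range(n_tokens):
        z = sample_categorical(theta_d, rng)   # discrete topic assignment z_di
        w = sample_categorical(phi[z], rng)    # word drawn from the chosen topic
        tokens.append((z, w))
    return tokens

# Toy values (hypothetical): K = 2 topics, V = 3 words.
theta_d = [0.7, 0.3]
phi = [[0.8, 0.1, 0.1],
       [0.1, 0.1, 0.8]]
doc = generate_document(theta_d, phi, n_tokens=10, rng=random.Random(0))
print(doc)
```

Posterior inference runs in the opposite direction: only the words are observed, and 𝑧, 𝜃 and 𝜙 have to be estimated.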
Variational Bayesian (VB) inference = maximization of the evidence lower bound (ELBO)
• VB tries to approximate the true posterior.
• An approximate posterior 𝑞(𝒛, 𝚯) is introduced when the ELBO, a lower bound on the log evidence, is obtained by applying Jensen's inequality.
  • 𝒛: discrete hidden variables (topic assignments)
  • 𝚯: continuous hidden variables (multinomial parameters)
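The equation on this slide did not survive extraction. For reference, the standard form of the bound that Jensen's inequality gives is sketched below. The symbol 𝒙 for the observed words is an assumption here and may differ from the slide's notation.

```latex
\log p(\boldsymbol{x})
  = \log \sum_{\boldsymbol{z}} \int p(\boldsymbol{x}, \boldsymbol{z}, \boldsymbol{\Theta}) \, d\boldsymbol{\Theta}
  \geq \sum_{\boldsymbol{z}} \int q(\boldsymbol{z}, \boldsymbol{\Theta})
       \log \frac{p(\boldsymbol{x}, \boldsymbol{z}, \boldsymbol{\Theta})}
                 {q(\boldsymbol{z}, \boldsymbol{\Theta})} \, d\boldsymbol{\Theta}
```

The left-hand side is the log evidence, and the right-hand side is the ELBO as a function of the approximate posterior.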
Factorization assumption
• We assume the approximate posterior 𝑞(𝒛, 𝚯) factorizes as 𝑞(𝒛)𝑞(𝚯) to make the inference tractable.
• The ELBO can then be written in terms of 𝑞(𝒛) and 𝑞(𝚯) separately.
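The slide's equation is missing from the extracted text. Under the factorization assumption, the ELBO takes the following standard form, written here with 𝒙 for the observed words and a calligraphic L for the bound (both symbols are assumptions, not necessarily the slide's notation).

```latex
\mathcal{L}
  = \mathbb{E}_{q(\boldsymbol{z})\, q(\boldsymbol{\Theta})}
      \left[ \log p(\boldsymbol{x}, \boldsymbol{z}, \boldsymbol{\Theta}) \right]
  - \mathbb{E}_{q(\boldsymbol{z})} \left[ \log q(\boldsymbol{z}) \right]
  - \mathbb{E}_{q(\boldsymbol{\Theta})} \left[ \log q(\boldsymbol{\Theta}) \right]
```

The last two terms are the entropies of the discrete and continuous factors, which is why the two parts can be handled by different techniques.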
Stochastic gradient variational Bayes (SGVB) [Kingma+ 14]
• A general framework for estimating the evidence lower bound (ELBO) in variational Bayes (VB)
• Only applicable to continuous distributions 𝑞(𝚯)
(SGVB) Monte Carlo integration
• By using Monte Carlo integration, the expectations over 𝑞(𝚯) in the ELBO can be estimated with 𝐿 random samples.
• The discrete part 𝑞(𝒛) is estimated in a similar manner to the original VB for LDA [Blei+ 03].
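As a minimal illustration of Monte Carlo integration with 𝐿 samples, the sketch below estimates an expectation whose exact value is known. The Gaussian distribution and the integrand are toy choices made for this example, not the terms of the LDA ELBO.

```python
import random

def monte_carlo_expectation(f, sampler, num_samples):
    """Estimate E[f(X)] by averaging f over num_samples draws from sampler()."""
    total = 0.0
    for _ in range(num_samples):
        total += f(sampler())
    return total / num_samples

rng = random.Random(0)
mu, sigma = 1.0, 0.5                 # toy parameters (hypothetical)
exact = mu ** 2 + sigma ** 2         # E[X^2] for X ~ N(mu, sigma^2)

for L in (1, 10, 10000):
    estimate = monte_carlo_expectation(lambda x: x * x,
                                       lambda: rng.gauss(mu, sigma), L)
    print(L, estimate, exact)
```

The estimate is unbiased for any 𝐿, and its variance shrinks in proportion to 1/𝐿, so even a small 𝐿 can be used inside an iterative optimizer.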
(SGVB) Reparameterization
• SGVB can be applied "under certain mild conditions."
• We use logistic normal distributions to approximate the true posteriors of
  • 𝜃_𝑑𝑘: per-document topic probability distributions, and
  • 𝜙_𝑘𝑣: per-topic word probability distributions.
• We can sample efficiently from the logistic normal with reparameterization.
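A minimal sketch of reparameterized sampling from a logistic normal follows. It assumes a diagonal covariance and a softmax over all 𝐾 components; the slides do not state the exact parameterization, and the names and values are hypothetical.

```python
import math
import random

def softmax(eta):
    """Map real-valued scores to a probability vector."""
    m = max(eta)
    exps = [math.exp(e - m) for e in eta]
    total = sum(exps)
    return [e / total for e in exps]

def sample_logistic_normal(mu, sigma, rng):
    """Draw one probability vector from a diagonal logistic normal.

    The noise eps is drawn independently of (mu, sigma), so given eps the
    sample is a deterministic, differentiable function of the parameters.
    """
    eps = [rng.gauss(0.0, 1.0) for _ in mu]
    eta = [m + s * e for m, s, e in zip(mu, sigma, eps)]
    return softmax(eta)

rng = random.Random(0)
mu = [0.5, -1.0, 0.0]       # hypothetical variational means for K = 3 topics
sigma = [0.3, 0.3, 0.3]     # hypothetical standard deviations
theta_sample = sample_logistic_normal(mu, sigma, rng)
theta_no_noise = sample_logistic_normal(mu, [0.0, 0.0, 0.0], rng)
print(theta_sample, theta_no_noise)
```

Because the randomness enters only through `eps`, gradients with respect to `mu` and `sigma` can pass through the sample. With every standard deviation set to zero the draw reduces to `softmax(mu)`.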
Maximize ELBO using gradient ascent
"Stochastic" gradient VB
• The expectations (integrals) in the ELBO are estimated by the Monte Carlo method.
• The derivatives of the ELBO therefore depend on random samples.
• Randomness is incorporated into the maximization.
  • SGVB = VB where gradients are stochastic.
  • (Observation) It seems easier to avoid poor local optima.
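The idea of ascending a stochastic gradient can be sketched on a toy one-dimensional problem. This is not the LDA update from the talk: the target density, the single-sample estimator, the learning rate and the function name are all assumptions chosen so that the optimum (mean 3, standard deviation 1) is known in advance.

```python
import math
import random

def sgvb_toy(num_steps=5000, learning_rate=0.01, seed=0):
    """Fit q = N(mu, sigma^2) to a target with log density -(theta - 3)^2 / 2 + const.

    Objective: E_q[-(theta - 3)^2 / 2] + log(sigma), estimated with one
    reparameterized sample per step.
    """
    rng = random.Random(seed)
    mu, rho = 0.0, 0.0                          # rho = log(sigma) keeps sigma positive
    for _ in range(num_steps):
        sigma = math.exp(rho)
        eps = rng.gauss(0.0, 1.0)               # noise independent of the parameters
        theta = mu + sigma * eps                # reparameterized sample
        dlogp = -(theta - 3.0)                  # derivative of the log density at theta
        grad_mu = dlogp                         # d theta / d mu = 1
        grad_rho = dlogp * eps * sigma + 1.0    # chain rule, plus d log(sigma) / d rho
        mu += learning_rate * grad_mu           # gradient ascent step
        rho += learning_rate * grad_rho
    return mu, math.exp(rho)

mu_hat, sigma_hat = sgvb_toy()
print(mu_hat, sigma_hat)
```

Each step uses a noisy gradient, so the parameters fluctuate around the optimum instead of settling exactly on it; a smaller learning rate or more samples per step would reduce the fluctuation.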
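Without randomness = with zero standard deviation
• A special case of the proposed method, obtained by setting the standard deviations to zero, is quite similar to CVB0 [Asuncion+ 09].
• The proposed method is therefore related to an existing inference method for LDA.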
Data sets for evaluation
Data set   # docs    # vocabulary words
NYT         99,932   46,263
MOVIE       27,859   62,408
NSF        128,818   21,471
MED        125,490   42,830
Not that efficient in time…
• 500 iterations on the NYT data set with 𝐾 = 200
  • LNV: 43 hours
  • CGS: 14 hours
  • VB: 23 hours
• However, parallelization with GPU works.
  • (preparing an implementation with TensorFlow)
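To put the quoted running times on a per-iteration scale, the short calculation below divides each total by the 500 iterations and compares it with CGS. It uses only the figures on the slide; the variable names are illustrative.

```python
ITERATIONS = 500
hours = {"LNV": 43, "CGS": 14, "VB": 23}    # totals quoted on the slide

# Convert each total to minutes per iteration and to a multiple of the CGS time.
minutes_per_iter = {name: h * 60 / ITERATIONS for name, h in hours.items()}
ratio_to_cgs = {name: h / hours["CGS"] for name, h in hours.items()}

for name in hours:
    print(f"{name}: {minutes_per_iter[name]:.2f} min/iter, "
          f"{ratio_to_cgs[name]:.2f}x CGS")
# LNV: 5.16 min/iter, 3.07x CGS
# CGS: 1.68 min/iter, 1.00x CGS
# VB: 2.76 min/iter, 1.64x CGS
```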
Conclusion
• We incorporate randomness into variational inference for LDA by applying SGVB to LDA.
• The proposed method gives perplexities comparable to those of existing inference methods for LDA.
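For readers unfamiliar with the metric, perplexity can be computed from point estimates of 𝜃 and 𝜙 as sketched below. This is the common per-token definition; the slides do not describe the exact evaluation protocol (for example, how held-out words are chosen), so the function and the toy values are assumptions.

```python
import math

def perplexity(docs, theta, phi):
    """Per-token perplexity of documents under point estimates theta and phi.

    docs[d]:     list of word ids in document d.
    theta[d][k]: probability of topic k in document d.
    phi[k][v]:   probability of word v under topic k.
    """
    log_likelihood = 0.0
    num_tokens = 0
    for d, words in enumerate(docs):
        for v in words:
            # Marginalize the topic assignment: p(v | d) = sum_k theta_dk * phi_kv
            p = sum(theta[d][k] * phi[k][v] for k in range(len(phi)))
            log_likelihood += math.log(p)
            num_tokens += 1
    return math.exp(-log_likelihood / num_tokens)

# Toy values (hypothetical): 2 documents, K = 2 topics, V = 3 words.
docs = [[0, 0, 1], [2, 2, 1]]
theta = [[0.9, 0.1], [0.2, 0.8]]
phi = [[0.7, 0.2, 0.1],
       [0.1, 0.2, 0.7]]
print(perplexity(docs, theta, phi))
```

Lower is better: a model that spreads probability uniformly over a vocabulary of 𝑉 words has perplexity 𝑉, so any useful topic model should score below that.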
Future work
• SGVB is a general framework for devising posterior inference for probabilistic models.
• We've already applied SGVB to CTM [Blei+ 05].
  • This will be presented as a poster at APWeb'16.
• SGVB is also applicable to other document models.
  • NVDM [Miao+ 16]: document modeling with MLP
