Context-dependent Token-wise Variational Autoencoder for Topic Modeling (Tomonari Masada)
This document proposes a new variational autoencoder (VAE) approach for topic modeling that addresses the issue of latent variable collapse. The proposed VAE models each word token separately using a context-dependent sampling approach and minimizes a KL divergence term not considered in previous VAEs for topic modeling. Experiments on four large datasets found that the proposed VAE improved over existing VAEs on about half of the datasets in terms of perplexity or normalized pointwise mutual information.
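The abstract does not give the form of the additional KL term; purely as a reference point, here is how a per-token KL against a standard normal prior is typically computed when each token receives its own diagonal Gaussian posterior (a generic sketch with illustrative names and shapes, not the paper's novel term):

```python
import numpy as np

def tokenwise_gaussian_kl(mu, logvar):
    """Analytic KL( N(mu, diag(exp(logvar))) || N(0, I) ), one value
    per word token.

    mu, logvar: arrays of shape (num_tokens, latent_dim), one row per
    token, as a context-dependent encoder might produce."""
    kl_per_dim = 0.5 * (np.exp(logvar) + mu**2 - 1.0 - logvar)
    return kl_per_dim.sum(axis=1)  # one KL value per token
```

Summing token-wise KL terms of this kind into the ELBO penalizes each token's posterior individually for drifting from the prior, which is one way such models counter latent variable collapse.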
More Related Content
Similar to "An Explanation of Learning Latent Space Energy Based Prior Model"
This note explicates some details of the discussion given in Appendix B of E. Jang, S. Gu, and B. Poole, "Categorical Reparameterization with Gumbel-Softmax," ICLR 2017.
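For reference, a minimal NumPy sketch of the Gumbel-softmax sampling trick that the Jang et al. paper introduces (the function name and example temperature are illustrative):

```python
import numpy as np

def gumbel_softmax_sample(logits, temperature, rng=None):
    """One Gumbel-softmax (Concrete) sample from a categorical
    distribution given unnormalized log-probabilities `logits`."""
    rng = rng or np.random.default_rng()
    u = rng.uniform(size=np.shape(logits))   # u_i ~ Uniform(0, 1)
    g = -np.log(-np.log(u))                  # g_i ~ Gumbel(0, 1)
    y = (np.asarray(logits) + g) / temperature
    y -= y.max()                             # numerical stability
    e = np.exp(y)
    return e / e.sum()                       # relaxed one-hot vector

# temperature -> 0 recovers (nearly) one-hot samples;
# a large temperature flattens the sample toward uniform.
sample = gumbel_softmax_sample(np.log([0.7, 0.2, 0.1]), temperature=0.5)
```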
Mini-batch Variational Inference for Time-Aware Topic Modeling (Tomonari Masada)
This paper proposes a time-aware topic model that uses two types of vector embeddings: one for latent topics and one for document timestamps. By combining these embeddings, the model extracts time-dependent word probability distributions for each topic. The paper also proposes a mini-batch variational inference algorithm for the model that does not require knowing the total number of documents, allowing efficient processing of large datasets. Evaluation on a dataset of paper titles showed that the model could improve perplexity by using timestamps and was comparable to collapsed Gibbs sampling while being more memory efficient.
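As a rough sketch of the combination step described above, assuming the topic and timestamp embeddings interact additively through inner products with word embeddings (the paper's actual combination rule may differ):

```python
import numpy as np

def time_dependent_word_dist(topic_vec, time_vec, word_embs):
    """One way to turn a topic embedding plus a timestamp embedding
    into a word distribution: score each vocabulary word by its inner
    product with the sum of the two vectors, then softmax.

    word_embs: array of shape (vocab_size, dim)."""
    scores = word_embs @ (topic_vec + time_vec)  # (vocab_size,)
    scores -= scores.max()                       # numerical stability
    p = np.exp(scores)
    return p / p.sum()
```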
A note on variational inference for the univariate Gaussian (Tomonari Masada)
1. The document summarizes variational inference for the univariate Gaussian model.
2. It derives an approximate posterior distribution q(μ, τ) that factorizes into q(μ) and q(τ).
3. q(μ) is Gaussian and q(τ) is Gamma, where their parameters are updated through an iterative procedure that maximizes a lower bound on the log evidence.
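A minimal sketch of the iterative procedure in item 3, following the standard conjugate Normal-Gamma setup (the hyperparameter defaults are illustrative):

```python
import numpy as np

def cavi_univariate_gaussian(x, mu0=0.0, lam0=1.0, a0=1.0, b0=1.0, iters=50):
    """Coordinate-ascent updates for q(mu) = N(mu_n, 1/lam_n) and
    q(tau) = Gamma(a_n, b_n) under a conjugate Normal-Gamma prior."""
    x = np.asarray(x, dtype=float)
    n, xbar = len(x), x.mean()
    a_n = a0 + (n + 1) / 2.0        # this update never changes
    e_tau = a0 / b0                  # initial guess for E[tau]
    for _ in range(iters):
        # update q(mu): Gaussian with mean mu_n and precision lam_n
        mu_n = (lam0 * mu0 + n * xbar) / (lam0 + n)
        lam_n = (lam0 + n) * e_tau
        # update q(tau): Gamma, using E[mu] and E[mu^2] under q(mu)
        e_mu, e_mu2 = mu_n, mu_n**2 + 1.0 / lam_n
        b_n = b0 + 0.5 * ((x**2).sum() - 2 * e_mu * x.sum() + n * e_mu2
                          + lam0 * (e_mu2 - 2 * mu0 * e_mu + mu0**2))
        e_tau = a_n / b_n
    return mu_n, lam_n, a_n, b_n
```

Each pass updates q(μ) using the current E[τ] and then q(τ) using the current moments of μ, so the lower bound on the log evidence increases monotonically.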
Document Modeling with Implicit Approximate Posterior Distributions (Tomonari Masada)
This document proposes a Bayesian probabilistic model for document modeling that uses an implicit approximate posterior distribution for variational inference. The model generates documents by drawing a noise vector from a standard normal distribution, passing it through a neural network to obtain the parameters of a multinomial distribution, and drawing word counts. Unlike previous models, it uses an implicit distribution, trained within an adversarial framework, to flexibly model the posterior. Evaluation shows the model is comparable to LDA in terms of perplexity.
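A minimal sketch of the generative story just described, with a single linear layer standing in for the paper's neural network (all names here are assumptions):

```python
import numpy as np

def generate_document(weights, doc_len, rng=None):
    """Draw noise from N(0, I), push it through a (here, one-layer)
    network, softmax into word probabilities, and draw word counts.

    weights: array of shape (vocab_size, noise_dim)."""
    rng = rng or np.random.default_rng()
    z = rng.standard_normal(weights.shape[1])  # noise vector
    logits = weights @ z                        # (vocab_size,)
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return rng.multinomial(doc_len, p)          # word counts
```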
LDA-Based Scoring of Sequences Generated by RNN for Automatic Tanka Composition (Tomonari Masada)
This document proposes a method for scoring sequences generated by an RNN for automatic poetry composition, using LDA-based topic modeling. It trains an RNN (GRU or LSTM) on a corpus of Japanese tanka poems represented as sequences of character bigrams. It then uses LDA to infer topics in the training corpus and to assign words in generated sequences to topics. Sequences containing many words assigned to the same topic receive a higher score, promoting diversity in the top-ranked sequences compared to scoring based solely on RNN output probabilities. An experiment on tanka generation found that the proposed method selected more varied poems.
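A minimal sketch of a scoring rule in the spirit of the one described, counting how many words fall into the sequence's dominant topic (the paper's exact score may differ):

```python
from collections import Counter

def topic_concentration_score(topic_assignments):
    """Score a generated sequence by how many of its words are
    assigned to a single LDA topic.

    topic_assignments: list of topic ids, one per (non-stop) word."""
    if not topic_assignments:
        return 0
    return Counter(topic_assignments).most_common(1)[0][1]
```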
This document presents the equations and process for deriving the evidence lower bound (ELBO) for a zero-inflated negative binomial variational autoencoder (ZINB-VAE) model. It defines the probability distributions for the ZINB-VAE and breaks the log likelihood function into individual terms. It then uses a variational approximation to the intractable posterior distribution to obtain the ELBO that can be optimized. Monte Carlo sampling is used to approximate the expectation with respect to the variational distribution.
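For reference, a sketch of the zero-inflated negative binomial log-likelihood that such a derivation starts from, written with SciPy's nbinom parameterization (the parameter names are illustrative):

```python
import numpy as np
from scipy.stats import nbinom

def zinb_logpmf(x, pi, r, p):
    """Log-pmf of a zero-inflated negative binomial for a scalar
    count x: with probability pi emit a structural zero, otherwise
    draw from NB(r, p)."""
    if x == 0:
        return np.log(pi + (1.0 - pi) * nbinom.pmf(0, r, p))
    return np.log(1.0 - pi) + nbinom.logpmf(x, r, p)
```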
This document summarizes the derivation of an evidence lower bound (ELBO) for latent LSTM allocation, a model that uses an LSTM to determine topic assignments in a topic modeling framework. It expresses the ELBO as terms related to the variational posterior distributions over topics and topic proportions, the generative process of words given topics, and the LSTM's prediction of topic assignments. It also describes how to optimize the ELBO with respect to the variational and LSTM parameters through gradient ascent.
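Schematically, the decomposition described above can be written as (generic notation, not the paper's exact formulation):

$$\mathcal{L} = \mathbb{E}_{q(z,\theta)}\big[\log p(w \mid z)\big] + \mathbb{E}_{q(z,\theta)}\big[\log p_{\mathrm{LSTM}}(z)\big] - \mathbb{E}_{q(z,\theta)}\big[\log q(z,\theta)\big]$$

where the first term covers the generative process of words given topics, the second the LSTM's prediction of topic assignments, and the third the contribution of the variational posterior.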
TopicRNN is a generative model for documents that:
1. Draws a topic vector from a standard normal distribution and uses it to generate words in a document.
2. Computes a lower bound on the log marginal likelihood of words and stop word indicators.
3. Approximates the expected values in the lower bound using samples from an inference network that models the approximate posterior distribution over topics.
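A minimal sketch of how such a model can mix RNN logits with a topic contribution that is switched off for stop words (the matrix names V and B are assumptions):

```python
import numpy as np

def topicrnn_word_probs(h_t, theta, V, B, is_stop_word):
    """Word distribution at one time step, in the spirit of TopicRNN.

    h_t: RNN hidden state; theta: topic vector;
    V, B: (vocab_size, dim) projection matrices."""
    logits = V @ h_t
    if not is_stop_word:
        logits = logits + B @ theta  # topics only affect content words
    logits -= logits.max()           # numerical stability
    p = np.exp(logits)
    return p / p.sum()
```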
Topic modeling with Poisson factorization (2) (Tomonari Masada)
A modified version of the manuscript published on Feb 3, 2017, with two changes:
1. Use a gamma prior for $r_k$.
2. Use the same shape parameter $s$ for all gamma distributions.
Topic modeling with Poisson factorization is introduced. The generative model assumes words in documents are generated from topics modeled with Poisson distributions. Variational Bayesian inference is used to approximate the posterior. Update equations are derived for the variational parameters: ω, representing topic assignments; α, the Dirichlet prior; and γ, the gamma prior over topic distributions. ω is updated proportionally to functions of α and γ; α is updated based on sums of ω; γ is updated based on sums of ω and the prior shape parameter.
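For orientation, a generic Poisson factorization generative story using the shared shape $s$ and per-topic rate $r_k$ mentioned above (the manuscript's exact prior placement may differ):

$$\theta_{dk} \sim \mathrm{Gamma}(s, r_k), \qquad \phi_{kw} \sim \mathrm{Gamma}(s, s), \qquad x_{dw} \sim \mathrm{Poisson}\Big(\sum_k \theta_{dk}\,\phi_{kw}\Big)$$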
A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model (Tomonari Masada)
This document presents a new method for estimating the posterior distribution of the Correlated Topic Model (CTM) using Stochastic Gradient Variational Bayes (SGVB). The CTM is an extension of LDA that models correlations between topics. The proposed method approximates the true posterior of the CTM with a factorial variational distribution and uses SGVB to maximize the evidence lower bound. This allows incorporating randomness into posterior inference for the CTM without requiring explicit inversion of the covariance matrix. Perplexity results on several datasets were comparable to LDA. Future work could explore online learning for topic models using neural networks.
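A minimal sketch of the sampling step that avoids explicit covariance inversion: reparameterize with a Cholesky factor of the topic covariance and push the result through a softmax (the names here are illustrative):

```python
import numpy as np

def sample_topic_proportions(mu, L, rng=None):
    """Reparameterized draw from a logistic normal:
    eta = mu + L @ eps with eps ~ N(0, I), then softmax.
    L is a Cholesky factor of the topic covariance, so sampling
    needs no matrix inversion."""
    rng = rng or np.random.default_rng()
    eta = mu + L @ rng.standard_normal(mu.shape)
    e = np.exp(eta - eta.max())
    return e / e.sum()
```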
A Simple Stochastic Gradient Variational Bayes for Latent Dirichlet Allocation (Tomonari Masada)
This document proposes applying stochastic gradient variational Bayes (SGVB) to latent Dirichlet allocation (LDA) to obtain efficient posterior estimation. SGVB introduces randomness into variational inference for LDA by estimating expectations with Monte Carlo integration and using the reparameterization trick to sample from approximate posterior distributions. Evaluation on several text corpora shows perplexities comparable to existing LDA inference methods, with the potential for faster training through parallelization on GPUs. Future work will explore applying SGVB to other probabilistic document models such as correlated topic models.
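A minimal sketch of the SGVB estimator described above, for a diagonal Gaussian variational distribution (names and defaults are illustrative):

```python
import numpy as np

def sgvb_expectation(f, mu, sigma, num_samples=1, rng=None):
    """Monte Carlo estimate of E_q[f(z)] for q = N(mu, diag(sigma^2)):
    draw eps ~ N(0, I) and average f(mu + sigma * eps). Because z is a
    deterministic transform of eps, the estimate is differentiable
    with respect to mu and sigma."""
    rng = rng or np.random.default_rng()
    samples = [f(mu + sigma * rng.standard_normal(np.shape(mu)))
               for _ in range(num_samples)]
    return np.mean(samples, axis=0)
```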
A Simple Stochastic Gradient Variational Bayes for Latent Dirichlet Allocation (Tomonari Masada)
This document proposes a new inference method for latent Dirichlet allocation (LDA) based on stochastic gradient variational Bayes (SGVB). The proposed method approximates the true posterior with a logistic normal distribution rather than the Dirichlet distribution used in standard variational Bayes for LDA. In experiments, the proposed method achieved better predictive performance than standard variational Bayes and collapsed Gibbs sampling on many datasets, demonstrating the effectiveness of SGVB for devising new variational inference methods.
This document summarizes a hierarchical topic modeling approach for analyzing traffic speed data. The model treats each day's data as a mixture of different speed distributions. It extends latent Dirichlet allocation to model speeds as continuous gamma distributions rather than discrete words. The model further incorporates metadata on time of day and sensor location to make topic probabilities dependent on context. Model parameters are estimated using variational Bayesian inference, and the model achieves better performance than alternatives by capturing similarity between observations based on timing and location.
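Schematically, the change from standard LDA is in the emission model (generic notation, not the paper's):

$$x \mid z = k \;\sim\; \mathrm{Gamma}(a_k, b_k), \qquad p(z = k \mid d) = \theta_{dk}(\text{time of day},\ \text{sensor location})$$

so each topic $k$ emits continuous speeds from its own gamma distribution, while the topic probabilities are conditioned on the timing and location metadata.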
A derivation of the sampling formulas for "An Entity-Topic Model for Entity Linking" and "A Context-Aware Topic Model for Statistical Machine Translation" (Tomonari Masada)
A derivation of the sampling formulas for "An Entity-Topic Model for Entity Linking" [Han+, EMNLP-CoNLL 2012] and "A Context-Aware Topic Model for Statistical Machine Translation" [Su+, ACL 2015].
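For orientation, the standard collapsed Gibbs sampling formula for vanilla LDA, the pattern that derivations like these extend with additional entity and context variables (this is not the papers' exact formula):

$$P(z_i = k \mid \mathbf{z}^{\setminus i}, \mathbf{w}) \;\propto\; \big(n_{dk}^{\setminus i} + \alpha\big)\,\frac{n_{kw_i}^{\setminus i} + \beta}{n_{k\cdot}^{\setminus i} + V\beta}$$

where $n_{dk}$, $n_{kw}$, and $n_{k\cdot}$ count topic assignments in document $d$, to word $w$, and overall, each excluding token $i$, and $V$ is the vocabulary size.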