Modeling science

Slides on topic models. Published in: Technology, Education.

  1. Modeling Science. David M. Blei, Department of Computer Science, Princeton University. April 17, 2008. Joint work with John Lafferty (CMU). D. Blei, Modeling Science, 1 / 53.
  2. Modeling Science. [Figure: top words from two Science articles, August 13, 1886 and June 24, 1994.] • On-line archives of document collections require better organization; manual organization is not practical. • Our goal: to discover the hidden thematic structure with hierarchical probabilistic models called topic models. • Use this structure for browsing, search, and similarity.
  3. Modeling Science. [Figure: top words from two Science articles, August 13, 1886 and June 24, 1994.] • Our data are the pages of Science from 1880-2002 (from JSTOR). • No reliable punctuation, meta-data, or references. • Note: this is just a subset of JSTOR's archive.
  4. Discover topics from a corpus. Top words from four discovered topics. "Genetics": human, genome, dna, genetic, genes, sequence, gene, molecular, sequencing, map, information, genetics, mapping, project, sequences. "Evolution": evolution, evolutionary, species, organisms, life, origin, biology, groups, phylogenetic, living, diversity, group, new, two, common. "Disease": disease, host, bacteria, diseases, resistance, bacterial, new, strains, control, infectious, malaria, parasite, parasites, united, tuberculosis. "Computers": computer, models, information, data, computers, system, network, systems, model, parallel, methods, networks, software, new, simulations.
  5. Model the evolution of topics over time. [Figure: word-probability trajectories from 1880 to 2000 in two topics: "Theoretical Physics" (force, relativity, laser) and "Neuroscience" (oxygen, nerve, neuron).]
  6. Model connections between topics. [Figure: a graph of correlated topics discovered from Science, spanning neuroscience, molecular biology, genetics, immunology, physics, astronomy, materials science, earth science, ecology, and climate change, with edges between related topics.]
  7. Outline. 1 Introduction. 2 Latent Dirichlet allocation. 3 Dynamic topic models. 4 Correlated topic models.
  8. Outline. 1 Introduction. 2 Latent Dirichlet allocation. 3 Dynamic topic models. 4 Correlated topic models.
  9. Probabilistic modeling. 1 Treat data as observations that arise from a generative probabilistic process that includes hidden variables. • For documents, the hidden variables reflect the thematic structure of the collection. 2 Infer the hidden structure using posterior inference. • What are the topics that describe this collection? 3 Situate new data into the estimated model. • How does this query or new document fit into the estimated topic structure?
  10. Intuition behind LDA. Simple intuition: documents exhibit multiple topics.
  11. Generative process. • Cast these intuitions into a generative probabilistic process. • Each document is a random mixture of corpus-wide topics. • Each word is drawn from one of those topics.
  12. Generative process. • In reality, we only observe the documents. • Our goal is to infer the underlying topic structure. • What are the topics? • How are the documents divided according to those topics?
  13. Graphical models (aside). [Figure: plate notation. A node Y with a plate over Xn, n = 1, ..., N, is equivalent to Y with children X1, X2, ..., XN.] • Nodes are random variables. • Edges denote possible dependence. • Observed variables are shaded. • Plates denote replicated structure.
  14. Graphical models (aside). [Figure: the same plate diagram.] • The structure of the graph defines the pattern of conditional dependence between the ensemble of random variables. • E.g., this graph corresponds to p(y, x1, ..., xN) = p(y) ∏_{n=1}^N p(xn | y).
  15. Latent Dirichlet allocation. [Figure: the LDA graphical model. Dirichlet parameter α; per-document topic proportions θd; per-word topic assignment Zd,n; observed word Wd,n; topics βk; topic hyperparameter η; plates of size N (words), D (documents), and K (topics).] Each piece of the structure is a random variable.
  16. Latent Dirichlet allocation. 1 Draw each topic βi ∼ Dir(η), for i ∈ {1, ..., K}. 2 For each document: (a) draw topic proportions θd ∼ Dir(α); (b) for each word, draw the topic assignment Zd,n ∼ Mult(θd) and then the word Wd,n ∼ Mult(βZd,n).
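This generative process can be sketched directly in a few lines. The toy sizes and hyperparameters below (K, V, D, N, α, η) are made up for illustration; the Dirichlet draw is built from independent Gamma draws.

```python
import random

random.seed(0)

def dirichlet(alpha):
    """Draw from a Dirichlet by normalizing independent Gamma draws."""
    g = [random.gammavariate(a, 1.0) for a in alpha]
    s = sum(g)
    return [x / s for x in g]

K, V, D, N = 3, 8, 2, 5          # topics, vocabulary size, documents, words per doc
eta, alpha = 0.1, 0.5            # hypothetical hyperparameters

# 1. Draw each topic beta_i ~ Dir(eta), a distribution over the vocabulary.
betas = [dirichlet([eta] * V) for _ in range(K)]

docs = []
for d in range(D):
    # 2a. Draw topic proportions theta_d ~ Dir(alpha).
    theta = dirichlet([alpha] * K)
    words = []
    for n in range(N):
        # 2b. Draw topic assignment z ~ Mult(theta), then word w ~ Mult(beta_z).
        z = random.choices(range(K), weights=theta)[0]
        w = random.choices(range(V), weights=betas[z])[0]
        words.append(w)
    docs.append(words)
```

Each document mixes the corpus-wide topics through its own θd, which is exactly the "mixed membership" structure the slides describe.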
  17. Latent Dirichlet allocation. • From a collection of documents, infer: the per-word topic assignments zd,n; the per-document topic proportions θd; the per-corpus topic distributions βk. • Use posterior expectations to perform the task at hand, e.g., information retrieval, document similarity, etc.
  18. Latent Dirichlet allocation. • Computing the posterior is intractable: p(θ, z | w, α, β1:K) = p(θ | α) ∏_{n=1}^N p(zn | θ) p(wn | zn, β1:K) / ∫θ p(θ | α) ∏_{n=1}^N Σ_{zn=1}^K p(zn | θ) p(wn | zn, β1:K) dθ. The normalizer, the marginal probability of the document, cannot be computed efficiently. • Several approximation techniques have been developed.
  19. Latent Dirichlet allocation. • Mean field variational methods (Blei et al., 2001, 2003). • Expectation propagation (Minka and Lafferty, 2002). • Collapsed Gibbs sampling (Griffiths and Steyvers, 2002). • Collapsed variational inference (Teh et al., 2006).
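To make one of these concrete, here is a minimal sampler in the spirit of collapsed Gibbs sampling (Griffiths and Steyvers, 2002): θ and β are integrated out, and each assignment zd,n is resampled from count statistics. The toy corpus, sizes, and hyperparameters are hypothetical.

```python
import random

random.seed(3)

# Toy corpus as lists of word ids; K topics over a vocabulary of V terms.
docs = [[0, 0, 1, 2], [2, 3, 3, 4], [0, 1, 4, 4]]
K, V = 2, 5
alpha, eta = 0.5, 0.1

# Initialize assignments at random and build the count tables.
z = [[random.randrange(K) for _ in doc] for doc in docs]
n_dk = [[0] * K for _ in docs]        # doc-topic counts
n_kw = [[0] * V for _ in range(K)]    # topic-word counts
n_k = [0] * K                         # topic totals
for d, doc in enumerate(docs):
    for i, w in enumerate(doc):
        k = z[d][i]
        n_dk[d][k] += 1; n_kw[k][w] += 1; n_k[k] += 1

for it in range(50):
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            # Remove the word's current assignment from the counts.
            n_dk[d][k] -= 1; n_kw[k][w] -= 1; n_k[k] -= 1
            # Resample z from the collapsed conditional p(z = j | z_-, w).
            weights = [(n_dk[d][j] + alpha) * (n_kw[j][w] + eta) / (n_k[j] + V * eta)
                       for j in range(K)]
            k = random.choices(range(K), weights=weights)[0]
            z[d][i] = k
            n_dk[d][k] += 1; n_kw[k][w] += 1; n_k[k] += 1
```

After burn-in, the counts give posterior estimates of the topics (n_kw) and topic proportions (n_dk).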
  20. Example inference. • Data: the OCR'ed collection of Science from 1990-2000: 17K documents, 11M words, 20K unique terms (stop words and rare words removed). • Model: a 100-topic LDA model fit using variational inference.
  21. Example inference. [Figure: a bar chart of probability (0.0-0.4) over the 100 topics; the inferred topic proportions concentrate on a small number of topics.]
  22. Example topics. "Genetics": human, genome, dna, genetic, genes, sequence, gene, molecular, sequencing, map, information, genetics, mapping, project, sequences. "Evolution": evolution, evolutionary, species, organisms, life, origin, biology, groups, phylogenetic, living, diversity, group, new, two, common. "Disease": disease, host, bacteria, diseases, resistance, bacterial, new, strains, control, infectious, malaria, parasite, parasites, united, tuberculosis. "Computers": computer, models, information, data, computers, system, network, systems, model, parallel, methods, networks, software, new, simulations.
  23. LDA summary. • LDA is a powerful model for visualizing the hidden thematic structure in large corpora and for generalizing new data to fit into that structure. • LDA is a mixed-membership model (Erosheva, 2004) that builds on the work of Deerwester et al. (1990) and Hofmann (1999). • For document collections and other grouped data, this may be more appropriate than a simple finite mixture.
  24. LDA summary. • Modular: it can be embedded in more complicated models, e.g., syntax and semantics; authorship; word sense. • General: the data-generating distribution can be changed, e.g., to images, social networks, or population-genetics data. • Variational inference is fast; it lets us analyze large data sets. • See Blei et al. (2003) for details and a quantitative comparison. • Code to play with LDA is freely available on my web site, http://www.cs.princeton.edu/∼blei.
  25. LDA summary. • But LDA makes certain assumptions about the data. • When are they appropriate?
  26. Outline. 1 Introduction. 2 Latent Dirichlet allocation. 3 Dynamic topic models. 4 Correlated topic models.
  27. LDA and exchangeability. • LDA assumes that documents are exchangeable, i.e., that their joint probability is invariant to permutation. • This is too restrictive.
  28. Documents are not exchangeable. [Figure: two Science articles, "Instantaneous Photography" (1890) and "Infrared Reflectance in Leaf-Sitting Neotropical Frogs" (1977).] • Documents about the same topic are not exchangeable. • Topics evolve over time.
  29. Dynamic topic model. • Divide the corpus into sequential slices (e.g., by year). • Assume each slice's documents are exchangeable, drawn from an LDA model. • Allow the topic distributions to evolve from slice to slice.
  30. Dynamic topic models. [Figure: the DTM graphical model: one LDA model per time slice, with each topic chained across slices, βk,1 → βk,2 → ... → βk,T.]
  31. Modeling evolving topics. • Use a logistic normal distribution to model evolving topics (Aitchison, 1980). • A state-space model on the natural parameters of the topic multinomial (West and Harrison, 1997): βt,k | βt−1,k ∼ N(βt−1,k, σ²I), with p(w | βt,k) = exp{βt,k,w − log(1 + Σ_{v=1}^{V−1} exp{βt,k,v})}.
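A small simulation of this state-space model (hypothetical V, T, and σ): run a Gaussian random walk on the natural parameters and map each slice to word probabilities. For simplicity this uses the full V-dimensional softmax rather than the (V−1)-dimensional parameterization on the slide.

```python
import math
import random

random.seed(1)

V, T, sigma = 5, 4, 0.1   # vocabulary size, time slices, drift scale (made up)

def softmax(beta):
    """Map natural parameters to a distribution over the vocabulary."""
    m = max(beta)
    e = [math.exp(b - m) for b in beta]
    s = sum(e)
    return [x / s for x in e]

# beta_{t,k} | beta_{t-1,k} ~ N(beta_{t-1,k}, sigma^2 I): a Gaussian random walk.
beta = [0.0] * V
chain = [beta]
for t in range(1, T):
    beta = [b + random.gauss(0.0, sigma) for b in beta]
    chain.append(beta)

# Word probabilities for each time slice drift smoothly across slices.
probs = [softmax(b) for b in chain]
```

Because σ is small, adjacent slices share almost the same word distribution, which is what lets a single topic drift from "apparatus, steam" toward "devices, silicon" over a century.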
  32. Posterior inference. • Our goal is to compute the posterior distribution p(β1:T,1:K, θ1:T,1:D, z1:T,1:D | w1:T,1:D). • Exact inference is impossible: it is a per-document mixed-membership model, with non-conjugacy between p(w | βt,k) and p(βt,k). • MCMC is not practical for this amount of data. • Solution: variational inference.
  33. Science data. [Excerpt: "Techview: DNA Sequencing. Sequencing the Genome, Fast," by James C. Mullikin and Amanda A. McMurray. "Genome sequencing projects reveal the genetic makeup of an organism by reading off the sequence of the DNA bases, which encodes all of the information necessary for the life of the organism... Over the last two decades, automated DNA sequencers have made the process of obtaining the base-by-base sequence of DNA..."] • Analyze JSTOR's entire collection from Science (1880-2002). • Restrict to the 30K terms that occur more than ten times. • The data are 76M words in 130K documents.
  34. Analyzing a document. [Figure: an original article alongside its inferred topic proportions.]
  35. Analyzing a document. [Figure: the original article with the most likely words from its top topics. Topic 1: sequence, genome, genes, sequences, human, gene, dna, sequencing, chromosome, regions, analysis, data, genomic, number. Topic 2: devices, device, materials, current, high, gate, light, silicon, material, technology, electrical, fiber, power, based. Topic 3: data, information, network, web, computer, language, networks, time, software, system, words, algorithm, number, internet.]
  36. Analyzing a topic. Top words per decade in one topic. 1880: electric, machine, power, engine, steam, two, machines, iron, battery, wire. 1890: electric, power, company, steam, electrical, machine, two, system, motor, engine. 1900: apparatus, steam, power, engine, engineering, water, construction, engineer, room, feet. 1910: air, water, engineering, apparatus, room, laboratory, engineer, made, gas, tube. 1920: apparatus, tube, air, pressure, water, glass, gas, made, laboratory, mercury. 1930: tube, apparatus, glass, air, mercury, laboratory, pressure, made, gas, small. 1940: air, tube, apparatus, glass, laboratory, rubber, pressure, small, mercury, gas. 1950: tube, apparatus, glass, air, chamber, instrument, small, laboratory, pressure, rubber. 1960: tube, system, temperature, air, heat, chamber, power, high, instrument, control. 1970: air, heat, power, system, temperature, chamber, high, flow, tube, design. 1980: high, power, design, heat, system, systems, devices, instruments, control, large. 1990: materials, high, power, current, applications, technology, devices, design, device, heat. 2000: devices, device, materials, current, gate, high, light, silicon, material, technology.
  37. Visualizing trends within a topic. [Figure: word-probability trajectories from 1880 to 2000 in two topics: "Theoretical Physics" (force, relativity, laser) and "Neuroscience" (oxygen, nerve, neuron).]
  38. Time-corrected document similarity. • Consider the expected Hellinger distance between the topic proportions of two documents, dij = E[ Σ_{k=1}^K (√θi,k − √θj,k)² | wi, wj ]. • This uses the latent structure to define similarity. • Time has been factored out, because the topics associated with the components are different from year to year. • Similarity is based only on topic proportions.
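The distance itself is simple to compute; this sketch drops the posterior expectation and takes point estimates of the topic proportions as given (the example vectors are made up).

```python
import math

def hellinger_sq(theta_i, theta_j):
    """Squared Hellinger distance between two topic-proportion vectors
    (the per-word form used on the slide, without the 1/2 factor)."""
    return sum((math.sqrt(p) - math.sqrt(q)) ** 2
               for p, q in zip(theta_i, theta_j))

# Two hypothetical documents with K = 3 topic proportions.
theta_i = [0.7, 0.2, 0.1]
theta_j = [0.1, 0.2, 0.7]
d = hellinger_sq(theta_i, theta_j)
```

Because d depends only on the θ vectors, two documents a century apart can still be found similar when they allocate probability to the same (evolving) topics.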
  39. Time-corrected document similarity. The Brain of the Orang (1880).
  40. Time-corrected document similarity. Representation of the Visual Field on the Medial Wall of Occipital-Parietal Cortex in the Owl Monkey (1976).
  41. Browser of Science.
  42. Quantitative comparison. • Compute the probability of each year's documents conditional on all the previous years' documents, p(wt | w1, ..., wt−1). • Compare exchangeable and dynamic topic models.
  43. Quantitative comparison. [Figure: per-word negative log likelihood by year, 1920-2000, for LDA and the DTM.]
  44. Outline. 1 Introduction. 2 Latent Dirichlet allocation. 3 Dynamic topic models. 4 Correlated topic models.
  45. The hidden assumptions of the Dirichlet distribution. • The Dirichlet is an exponential-family distribution on the simplex, positive vectors that sum to one. • However, the near independence of its components makes it a poor choice for modeling topic proportions. • An article about fossil fuels is more likely to also be about geology than about genetics.
  46. The logistic normal distribution. • The logistic normal is a distribution on the simplex that can model dependence between components. • The natural parameters of the multinomial are drawn from a multivariate Gaussian distribution: X ∼ N_{K−1}(µ, Σ), and θi = exp{xi − log(1 + Σ_{j=1}^{K−1} exp{xj})}.
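A sketch of drawing θ this way for K = 3, so there are K − 1 = 2 natural parameters. The mean and the correlation ρ are hypothetical; the correlated Gaussian draw uses a hand-rolled 2×2 Cholesky factor.

```python
import math
import random

random.seed(2)

# Draw x ~ N(mu, Sigma) for the K - 1 = 2 free natural parameters,
# with a correlation rho between the two components.
mu = [0.0, 0.0]
rho, s = 0.8, 1.0
u1, u2 = random.gauss(0, 1), random.gauss(0, 1)
x = [mu[0] + s * u1,
     mu[1] + s * (rho * u1 + math.sqrt(1 - rho ** 2) * u2)]

# Map to the simplex: theta_i = exp{x_i - log(1 + sum_j exp(x_j))},
# with the K-th component's natural parameter fixed at zero.
denom = 1.0 + sum(math.exp(xi) for xi in x)
theta = [math.exp(xi) / denom for xi in x] + [1.0 / denom]
```

A positive ρ means the first two topics tend to rise and fall together, which is exactly the dependence the Dirichlet cannot express.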
  47. Correlated topic model (CTM). [Figure: the CTM graphical model, with logistic-normal parameters µ and Σ replacing the Dirichlet parameter α.] • Draw topic proportions from a logistic normal, where topic occurrences can exhibit correlation. • Use for: providing a "map" of topics and how they are related; better prediction via correlated topics.
  48. [Figure: the map of correlated topics discovered from Science, spanning neuroscience, molecular biology, genetics, immunology, physics, astronomy, materials science, earth science, ecology, and climate change, with edges between related topics.]
  49. Summary. • Topic models provide useful descriptive statistics for analyzing and understanding the latent structure of large text collections. • Probabilistic graphical models are a useful way to express assumptions about the hidden structure of complicated data. • Variational methods allow us to perform posterior inference to automatically infer that structure from large data sets. • Current research: choosing the number of topics; continuous-time dynamic topic models; topic models for prediction; inferring the impact of a document.
  50. "We should seek out unfamiliar summaries of observational material, and establish their useful properties... And still more novelty can come from finding, and evading, still deeper lying constraints." (John Tukey, The Future of Data Analysis, 1962)
  51. Supervised topic models (with Jon McAuliffe). • Most topic models are unsupervised; they are fit by maximizing the likelihood of a collection of documents. • Consider documents paired with response variables, for example: movie reviews paired with a number of stars; web pages paired with a number of "diggs." • We develop supervised topic models: models of documents and responses that are fit to find topics predictive of the response.
  52. Supervised LDA. [Figure: the sLDA graphical model, which adds a response Yd to each document.] 1 Draw topic proportions θ | α ∼ Dir(α). 2 For each word: (a) draw the topic assignment zn | θ ∼ Mult(θ); (b) draw the word wn | zn, β1:K ∼ Mult(βzn). 3 Draw the response variable y | z1:N, η, σ² ∼ N(η⊤z̄, σ²), where z̄ = (1/N) Σ_{n=1}^N zn.
  53. Comments. • SLDA is used as follows: fit the coefficients and topics from a collection of document-response pairs; use the fitted model to predict the responses of previously unseen documents, E[Y | w1:N, α, β1:K, η, σ²] = η⊤ E[Z̄ | w1:N, α, β1:K]. • The process enforces that the document is generated first, followed by the response. The response is generated from the particular topics that were realized in generating the document.
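A toy illustration of the prediction step. In practice z̄ would be replaced by its posterior expectation from approximate inference; here the per-word topic assignments and the coefficient vector η are made up, and we plug in the resulting empirical z̄ directly.

```python
# Hypothetical per-word topic assignments for one document, K = 3 topics.
z_assignments = [0, 1, 1, 2, 1]
K = 3
N = len(z_assignments)

# z-bar: the empirical frequency of each topic among the document's words.
z_bar = [z_assignments.count(k) / N for k in range(K)]   # [0.2, 0.6, 0.2]

# Hypothetical regression coefficients, one per topic.
eta = [-1.0, 0.5, 2.0]

# Predicted response: the inner product eta . z-bar.
y_hat = sum(e * z for e, z in zip(eta, z_bar))
```

Topics with large positive coefficients pull the predicted response up, which is how sLDA's fitted topics become interpretable as "good review" or "bad review" word lists.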
  54. Example: movie reviews. [Figure: the ten topics of a fitted sLDA model, ordered by their regression coefficients from roughly −30 to +20; topics with negative coefficients contain words like "least," "bad," "awful," "unfortunately," and "dull," while topics with positive coefficients contain words like "perfect," "fascinating," "complex," and "performances."] • We fit a 10-topic sLDA model to movie-review data (Pang and Lee, 2005). • The documents are the words of the reviews. • The responses are the number of stars associated with each review (modeled as continuous). • Each component of the coefficient vector η is associated with a topic.
  55. Simulations. [Figure: predictive R² and per-word held-out log likelihood versus the number of topics, on the movie corpus and the Digg corpus, for sLDA and LDA.]
  56. Diversion: variational inference. • Let x1:N be observations and z1:M be latent variables. • Our goal is to compute the posterior distribution p(z1:M | x1:N) = p(z1:M, x1:N) / ∫ p(z1:M, x1:N) dz1:M. • For many interesting distributions, the marginal likelihood of the observations is difficult to compute efficiently.
  57. Variational inference. • Use Jensen's inequality to bound the log probability of the observations: log p(x1:N) ≥ E_qν[log p(z1:M, x1:N)] − E_qν[log qν(z1:M)]. • We have introduced a distribution over the latent variables with free variational parameters ν. • We optimize those parameters to tighten this bound. • This is the same as finding the member of the family qν that is closest in KL divergence to p(z1:M | x1:N).
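The bound can be checked numerically on a tiny model with one binary latent variable (the probabilities are made up): the ELBO sits below log p(x) for an arbitrary q and is tight when q equals the exact posterior.

```python
import math

# Tiny model: z in {0, 1} with prior p(z); one observation with likelihood p(x | z).
p_z = [0.6, 0.4]
p_x_given_z = [0.2, 0.9]

# Exact log marginal likelihood: log sum_z p(z) p(x | z).
log_px = math.log(sum(p_z[z] * p_x_given_z[z] for z in range(2)))

def elbo(q):
    """E_q[log p(z, x)] - E_q[log q(z)] for a variational distribution q."""
    total = 0.0
    for z in range(2):
        if q[z] > 0:
            total += q[z] * (math.log(p_z[z] * p_x_given_z[z]) - math.log(q[z]))
    return total

# The exact posterior p(z | x), computed by normalizing the joint.
post = [p_z[z] * p_x_given_z[z] for z in range(2)]
s = sum(post)
post = [p / s for p in post]
```

Here `elbo([0.5, 0.5])` is strictly below `log_px`, and `elbo(post)` matches it, which is the KL-divergence statement on the slide in miniature.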
  58. Mean-field variational inference. • The complexity of the optimization is determined by the factorization of qν. • In mean-field variational inference, qν is fully factored: qν(z1:M) = ∏_{m=1}^M qνm(zm). • The latent variables are independent, each governed by its own variational parameter νm. • In the true posterior they can exhibit dependence (often, this is what makes exact inference difficult).
  59. MFVI and conditional exponential families. • Suppose the distribution of each latent variable, conditional on the observations and the other latent variables, is in the exponential family: p(zm | z−m, x) = hm(zm) exp{gm(z−m, x)⊤ zm − am(gm(z−m, x))}. • Assume qν is fully factored and each factor is in the same exponential family: qνm(zm) = hm(zm) exp{νm⊤ zm − am(νm)}.
  60. MFVI and conditional exponential families. • Variational inference is the following coordinate-ascent algorithm: νm = E_qν[gm(Z−m, x)]. • Notice the relationship to Gibbs sampling.
  61. Variational family for the DTM. [Figure: the topic chain βk,1 → βk,2 → ... → βk,T with variational observations β̂k,1, β̂k,2, ..., β̂k,T attached.] • The distribution over θ and z is fully factored (Blei et al., 2003). • The distribution over {β1,k, ..., βT,k} is a variational Kalman filter: a Gaussian state-space model with free observations β̂k,t. • Fit the observations such that the corresponding posterior over the chain is close to the true posterior.
  62. Variational family for the DTM. [Figure: the same chain.] • Given a document collection, use coordinate ascent on all the variational parameters until the KL divergence converges. • This yields a distribution close to the true posterior of interest. • Take expectations with respect to the simpler variational distribution.
