Modeling Topic Hierarchies with the Recursive Chinese Restaurant Process
Presentation Transcript

  • Modeling Topic Hierarchies with the Recursive Chinese Restaurant Process. In the 21st ACM International Conference on Information and Knowledge Management (CIKM 2012). Joon Hee Kim, Dongwoo Kim, Suin Kim, Alice Oh. Presented by Minsu Ko, ryan0802@owl-nest.com, http://owl-nest.com/lab/
  • Contents
    1. Introduction
    2. Topic Hierarchies
    3. Recursive Chinese Restaurant Process
       3.1. Table Assignment with CRP
       3.2. Dish Assignment with recursive CRP
       3.3. Generative Process
    4. Posterior Inference
    5. Experiments
       5.1. Datasets
       5.2. Topic Tree Visualization
       5.3. Heldout likelihood
    6. Hierarchy Analysis
       6.1. Topic Specialization
       6.2. Hierarchical Affinity
    7. Discussion
  • My limitation. Limitations of my previous methods. Dataset: Korean news articles, 5 topics, average chunk count 201,854.
    1. Morphological analysis for a lexeme-based approach
    2. Stop-lexeme removal
    3. Model parameter (α, β) optimization (Wallach, 2008)
    4. Finding the optimal number of iterations using the sum of log-scaled perplexities
    A clean result still does not yield a clear correlation for humans!
  • Motivation. Why I chose this paper:
    I prefer unsupervised approaches. → A supervised approach should be the last choice: it requires human resources.
    I want a semantically well-organized topic model. → I was unsatisfied with the results of nCRP. → Can I find the correlation at every level of the hierarchy?
    Minimize data preprocessing. → Data preprocessing should not be an essential part of any model. → Can I just cut off the redundant leaves of a hierarchical topic tree?
  • 1. Introduction. The major limitation of basic topic models: they only discover topics in a flat structure, yet topics can be naturally organized into hierarchies.
  • 1. Introduction. Three characteristics of an intuitive and flexible structure:
    1. The number of topics should be unbounded.
    2. Topics should be structured in a hierarchy of unbounded depth, from general to specific.
    3. A document should be composed of multiple topics from anywhere in the hierarchy.
    The recursive Chinese Restaurant Process: flexible hierarchical topic modeling, consistent with the general intuition that topics within an immediate family are much more similar than topics outside the family.
  • Preliminary step 1: Dirichlet Process (DP). A Dirichlet Process is a random process whose draws are themselves probability distributions. A DP has two parameters:
    ▶ H: base distribution, which acts like the mean of the DP.
    ▶ α: strength parameter, which acts like an inverse variance of the DP.
    G ∼ DP(α, H) is a random probability measure with the same support as H if, for any finite partition (A_1, ..., A_n) of X:
    (G(A_1), ..., G(A_n)) ∼ Dirichlet(αH(A_1), ..., αH(A_n))
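To make the two parameters concrete, here is a minimal numpy sketch (not from the slides) that draws a truncated sample from DP(α, H) via the stick-breaking construction, which yields the same distribution as the partition definition above; the truncation level n_atoms is an implementation choice.

```python
import numpy as np

def dp_stick_breaking(alpha, base_sampler, n_atoms=1000, rng=None):
    """Draw a truncated sample G ~ DP(alpha, H) via stick-breaking.

    Returns atom locations (draws from the base distribution H) and
    their weights; the weights sum to ~1 for a large enough truncation.
    """
    rng = np.random.default_rng() if rng is None else rng
    betas = rng.beta(1.0, alpha, size=n_atoms)        # stick proportions
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    weights = betas * remaining                       # pi_k = beta_k * prod_{l<k}(1 - beta_l)
    atoms = np.array([base_sampler(rng) for _ in range(n_atoms)])
    return atoms, weights

# Example with base distribution H = N(0, 1). Small alpha concentrates the
# mass on a few atoms; large alpha makes G resemble H more closely.
atoms, weights = dp_stick_breaking(alpha=1.0, base_sampler=lambda r: r.normal())
print(atoms[weights.argmax()], weights.max())
```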
  • Preliminary step 2: Hierarchical Dirichlet Process (HDP). The HDP is built on multiple DPs. It enables grouped data to share a countably infinite set of cluster identities while exhibiting group-specific cluster proportions. A DP mixture concerns clustering data within one group/document; an HDP mixture concerns sharing clusters among multiple groups/documents. A hierarchical Dirichlet process:
    G_0 ∼ DP(α_0, H)
    G_1, G_2 | G_0 ∼ DP(α, G_0)
    Put a DP prior on the common base distribution; extension to deeper hierarchies is straightforward.
  • Preliminary step 2: Hierarchical Dirichlet Process (HDP). Dirichlet Process Mixtures: a DP mixture for each group; make the base distribution H discrete.
  • Preliminary step 3: Chinese Restaurant Process (CRP). Table assignment with the CRP: the first customer always sits at the first table; the ith customer sits at an occupied table k with probability n_k / (i − 1 + α), where n_k is the number of customers already there, or at a new table with probability α / (i − 1 + α). The CRP exhibits the clustering property of the DP.
  • Preliminary step 3: Chinese Restaurant Process (CRP). The clustering effect of the CRP: a customer is more likely to sit at a table where many people are already seated; new customers join tables in proportion to how popular those tables already are (∝ n_k). With probability proportional to α, a customer starts a new table instead.
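The seating rule above is easy to simulate. A short sketch (assuming numpy; not part of the original deck) that makes the rich-get-richer clustering effect visible:

```python
import numpy as np

def crp_seating(n_customers, alpha, rng=None):
    """Simulate CRP seating: customer i joins occupied table k with
    probability n_k / (i + alpha), or opens a new table with
    probability alpha / (i + alpha)."""
    rng = np.random.default_rng() if rng is None else rng
    counts = [1]                       # first customer sits at the first table
    for i in range(1, n_customers):
        probs = np.array(counts + [alpha], dtype=float)
        probs /= i + alpha             # i customers are already seated
        table = rng.choice(len(probs), p=probs)
        if table == len(counts):
            counts.append(1)           # new table
        else:
            counts[table] += 1         # rich-get-richer: join a popular table
    return counts

print(crp_seating(100, alpha=2.0))     # e.g. a few large tables, many small ones
```

Running it with a larger α produces more, smaller tables, matching α's role as the strength parameter of the underlying DP.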
  • Preliminary step 4: Chinese Restaurant Franchise (CRF). At the franchise level, table ψ_jt chooses a dish θ_k according to the following distribution (γ is the model parameter governing new dishes):
    ψ_jt | ψ_11, ψ_12, ..., ψ_21, ..., ψ_{j,t−1}, γ, H ∼ Σ_{k=1}^{K} m_k / (Σ_k m_k + γ) · δ_{θ_k} + γ / (Σ_k m_k + γ) · H
    The restaurants share a public menu with an unbounded number of dishes. At each table of each restaurant, one dish is ordered by the first customer who sits there, and it is shared among all customers who sit at that table.
  • Preliminary step 4: Chinese Restaurant Franchise (CRF). [figure: illustration of the CRF metaphor]
  • 2. Topic Hierarchies. Four documents highlighting the different assumptions. rCRP enables a document to have a distribution over the entire topic tree. A document is modeled as:
    PAM: a distribution over the topics at the leaves of the topic hierarchy.
    nCRP: a distribution over a single path from the root to a leaf node.
    TS-SB: a single node of the tree.
    rCRP: a distribution over all of the nodes of the hierarchy.
  • 3. Recursive Chinese Restaurant Process. rCRP combines two levels of CRP (in the CRF metaphor): the first level associates each group with a mixture component; the second level partitions data points into homogeneous groups.
  • 3.2 Dish Assignment with recursive CRP. A recursive search beginning from the root dish. Notation:
    k: dish index, t: table index, j: restaurant (document) index, i: customer (word) index
    n: number of customers, m: number of tables, M: cumulative counts of m
    φ_k: current dish, φ_k′: direct descendant of φ_k, φ_new: new child dish of φ_k
  • 3.2 Dish Assignment with recursive CRP (continued). A recursive search beginning from the root dish; a sketch of one possible reading follows.
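The deck describes the dish search only at a high level, so the following is a hypothetical sketch of one plausible reading: at each node, a table either keeps the current dish, descends into a child subtree, or opens a new child, with weights built from the table counts m and the cumulative subtree counts M from the notation slide. The exact weights in the paper may differ; treat this as illustration, not the paper's algorithm.

```python
import random

class Dish:
    def __init__(self, parent=None):
        self.parent = parent
        self.children = []
        self.m = 0        # tables serving exactly this dish
        self.M = 0        # cumulative tables in this subtree (the slide's M)

def assign_dish(node, alpha):
    """Hypothetical recursive dish search from the root (a sketch only)."""
    weights = [node.m] + [c.M for c in node.children] + [alpha]
    r = random.random() * sum(weights)
    if r < node.m:
        return node                          # stay: serve the current dish
    r -= node.m
    for child in node.children:
        if r < child.M:
            return assign_dish(child, alpha) # recurse into the chosen subtree
        r -= child.M
    new_child = Dish(parent=node)            # open a brand-new child dish
    node.children.append(new_child)
    return new_child

def seat_table(dish):
    """Update counts after a table is assigned to `dish`."""
    dish.m += 1
    while dish is not None:                  # M accumulates up to the root
        dish.M += 1
        dish = dish.parent
```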
  • 3.3 Generative Process. Topic tree generation: the measure G_tree of the global topic tree is drawn from the rCRP. Document generation: G_j, the topic distribution of the jth document, is distributed according to G_tree; θ_ji denotes the topic of the ith word in the jth document, and x_ji the word generated from that topic.
    G_tree ∼ rCRP(α)
    G_j ∼ DP(G_tree)
    θ_ji ∼ G_j
    x_ji ∼ F(θ_ji)
  • 4. Posterior Inference. Computing an exact posterior of a DP is intractable; use a Pólya urn scheme based on marginalizing out the unknown infinite-dimensional measures. Variables of interest:
    x_ji: the ith observed word of the jth document.
    θ_ji: the topic of x_ji (associated with one ψ_jt).
    φ_k: an atom of G_tree.
    ψ_jt: the topic of the tth table in the jth document (an atom of G_j, associated with one φ_k).
    For posterior inference we marginalize out φ_k, ψ_jt, and θ_ji. Two index variables assign the relationships between these variables:
    t_ji: the table index such that ψ_{j,t_ji} = θ_ji.
    k_jt: the topic index such that φ_{k_jt} = ψ_jt.
  • 4. Posterior Inference. By marginalizing out φ_k, we can compute the conditional density directly: the conditional density of x_ji depends only on the other words already assigned to that dish and its descendants.
  • 4. Posterior Inference. Sampling t (the table index): with the CRF metaphor it is natural to sample the table t_ji before sampling the dish k_jt. The conditional distribution of t_ji given x_ji uses m_··, the number of all tables in all documents.
  • 4. Posterior Inference. Sampling k (the dish index): sampling k_jt is important, as it potentially changes the membership of all data sitting at table t.
  • 5.1 Datasets. Synthetic data: 1,000 documents, each with 1,000 word tokens over 9 unique terms. ① Make a synthetic topic tree. ② Follow the document generation process.
  • 5.1 Datasets. Real data: New York Times, MovieLens, Wikipedia, Contemporary Art.
  • 5.2 Topic Tree Visualization. [figures: inferred topic trees for the datasets]
  • 5. Experiments, 5.3 Heldout likelihood. Heldout likelihood evaluates how well the trained model explains unseen data: 90% of the data is used for inference and 10% is held out. The per-word log-likelihood assigns each word its own log-likelihood:
    L = log p(W_heldout | M_trained)
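A minimal sketch of the per-word heldout log-likelihood under the stated split, assuming the trained model is summarized by a topic-word matrix phi and per-document topic mixtures theta (these names are mine, not the paper's):

```python
import numpy as np

def heldout_per_word_loglik(heldout_docs, phi, theta):
    """L/N = (1/N) * sum_d sum_i log p(w_di | M_trained), with
    p(w | d) = sum_k theta[d, k] * phi[k, w]."""
    total, n_words = 0.0, 0
    for d, doc in enumerate(heldout_docs):    # doc: list of word ids
        word_probs = theta[d] @ phi           # (vocab,) mixture for document d
        total += np.log(word_probs[doc]).sum()
        n_words += len(doc)
    return total / n_words
```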
  • 6. Hierarchy Analysis. We need evaluation metrics! Perplexity:
    perplexity(W) = exp( −(1/N) Σ_d Σ_i log Σ_k θ_{dk} φ_{k, w_di} )
    (φ: topic-word distribution, θ: document-topic distribution)
    Likelihood of held-out data. There is no commonly used evaluation metric for comparing these models, so we use fundamental characteristics: topic specialization and hierarchical affinity.
  • 6.1 Topic Specialization. The general-to-specific characteristic: the root is the most general semantic category, and the leaves are the most specific. Let φ_Norm be the word distribution of the entire corpus, i.e., the most general topic; freq(x_i) the frequency of word x_i in the corpus; V the vocabulary; and β a smoothing factor:
    p(x_i | φ_Norm) = (freq(x_i) + β) / (Σ_{j∈V} freq(x_j) + β|V|)
    The topic specialization of topic φ_k is its cosine distance from φ_Norm:
    Δ(φ_k) = 1 − (φ_k · φ_Norm) / (‖φ_k‖ ‖φ_Norm‖)
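The two formulas translate directly into code; a small sketch assuming phi_k and the corpus word frequencies are dense numpy vectors over the vocabulary:

```python
import numpy as np

def phi_norm_from_counts(freqs, beta):
    """Smoothed corpus word distribution: (freq + beta) / (sum freq + beta*|V|)."""
    freqs = np.asarray(freqs, dtype=float)
    return (freqs + beta) / (freqs.sum() + beta * len(freqs))

def topic_specialization(phi_k, phi_norm):
    """Cosine distance between a topic's word distribution and phi_Norm
    (the most general 'topic'); larger = more specialized."""
    cos = phi_k @ phi_norm / (np.linalg.norm(phi_k) * np.linalg.norm(phi_norm))
    return 1.0 - cos
```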
  • 6.1 Topic Specialization. Topic specialization scores of rCRP and nCRP: nCRP assumes that a document is generated only by the topics in a single path of the hierarchy, whereas rCRP finds general topics at the root and increasingly more specialized topics toward the leaves.
  • 6.2 Hierarchical Affinity. The parent-child relationship: the average cosine similarity between second-level topics and their direct children is compared against their similarity to non-children topics.
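A sketch of how this comparison could be computed, assuming topic-word vectors are rows of phi and the tree structure is given as child lists (all names here are assumptions, not the paper's code):

```python
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def hierarchical_affinity(parents, children_of, non_children_of, phi):
    """Average cosine similarity of second-level topics to their direct
    children vs. to non-children; a larger gap means stronger affinity."""
    child_sims, other_sims = [], []
    for p in parents:
        child_sims += [cosine(phi[p], phi[c]) for c in children_of[p]]
        other_sims += [cosine(phi[p], phi[c]) for c in non_children_of[p]]
    return np.mean(child_sims), np.mean(other_sims)
```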
  • 7. Discussion. The hierarchical nature of mixture components: the rCRP is a nonparametric topic model that infers the hierarchical structure of topics from discrete data. Evaluation metrics: topic specialization and hierarchical affinity. Applications: the latent structure of general-to-specific themes; recommendations based on collaborative filtering (the movie-ratings data); social network search.
  • LDA for Dummies
  • Latent Dirichlet allocation
  • Why is the model called Latent Dirichlet allocation? θ is a per-document Dirichlet random variable, and it is latent.
  • Milestone 1: Why use the Dirichlet distribution? The model assigns a topic to each word from the distribution of topics that a particular document has. The Dirichlet distribution is a probability distribution used to represent multinomial distributions: the posterior is a distribution over independent random variables (a multinomial) → the Dirichlet distribution is its conjugate prior. Reasons: intuitive interpretation and computational convenience.
  • Milestone 2: Difficulties in Bayesian computation: marginalization can be intractable, and there is the unknown-variable problem. What is a conjugate prior? A prior distribution designed in advance so that the posterior can be obtained easily: with a conjugate prior, the resulting posterior is a known probability distribution. Effect: the posterior follows the same family as the prior, which makes computing the posterior convenient.
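A one-screen illustration of why conjugacy makes the computation convenient, using the Dirichlet-multinomial pair from the previous slide: the posterior is again a Dirichlet, obtained simply by adding counts to the prior.

```python
import numpy as np

# Dirichlet-multinomial conjugacy in one line: with prior Dirichlet(alpha)
# over topic proportions and observed topic counts n, the posterior is
# Dirichlet(alpha + n) -- a known distribution, hence "computation is convenient".
alpha = np.array([0.1, 0.1, 0.1])          # symmetric prior over 3 topics
counts = np.array([42, 7, 1])              # observed topic assignments
posterior = alpha + counts                 # Dirichlet(42.1, 7.1, 1.1)
posterior_mean = posterior / posterior.sum()
print(posterior_mean)                      # ~[0.84, 0.14, 0.02]
```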
  • Milestone 3: Gibbs sampling. In LDA, the goal of Gibbs sampling is to estimate P(Z | W; α, β).
  • LDA implementation guide: LDA(K, alpha, beta, iter, docs, V)
    K: number of topics, fixed in advance by a human. → The most appropriate K can be found by measuring perplexity (explained later).
    α: smoothing parameter for the topic distribution; β: smoothing parameter for the unigram distribution. → Lower α increases topic specificity; lower β increases word specificity. → If the input dataset covers diverse subjects, set α above 1.0! → Use one fixed value (0.1 throughout this example).
    iter: number of Gibbs sampling iterations. → Sample repeatedly until convergence so that each word is assigned to an appropriate topic.
    docs: all documents, each encoded as a list of word indices, e.g. doc1 = [5, 29, 24, 159, 524, ...], doc2 = [15, 68, 35, 817, 91, ...]
    V: the total number of unique chunks (words) obtained by splitting all documents on whitespace (fixed at 4,500 in this example).
  • LDA implementation guide: φ_k, the word distribution for topic k. Let us compute the distribution of every chunk over each topic. V: total number of chunks, K: number of topics; w_{v;V} is the vth word out of V, and t_{k;K} is the kth topic out of K. If we finally print the top 20 per topic, we output, for each topic column, the 20 words with the highest values.
    Initialization 1: initialize the {φ_k: topic-chunk} matrix to zeros, then add β (= 0.1 here) to every cell:
            t_1   t_2   t_3   ...   t_K
    w_1     0.1   0.1   0.1   ...   0.1
    w_2     0.1   0.1   0.1   ...   0.1
    ...     ...   ...   ...   ...   ...
    w_V     0.1   0.1   0.1   ...   0.1
  • LDA implementation guide: θ_j, the topic distribution for document j. θ is used to compute perplexity, which in turn is the convergence criterion for the i-iteration loop. If there are j documents, build j matrices like the one below.
    Initialization 2: for each document, initialize the {θ_j: topic × word-chunk} matrix and add α (= 0.1 here) to every cell:
            w_1   w_2   w_3   ...   w_N
    t_1     0.1   0.1   0.1   ...   0.1
    t_2     0.1   0.1   0.1   ...   0.1
    ...     ...   ...   ...   ...   ...
    t_K     0.1   0.1   0.1   ...   0.1
  • LDA implementation guide: collapsed Gibbs sampling. To update the topic distribution per word, repeat the sampling for the given number of iterations. Posterior probability in collapsed Gibbs sampling (pseudocode; a runnable version follows):
    for i in range(iter):                          # Gibbs sampling iterations
        for j in range(J):                         # J: number of documents
            for n in range(N_j):                   # N_j: number of words in document j
                # remove word n's current topic from the φ_k and θ_j counts (−1 each)
                # posterior_z ∝ ({φ_k: topic-chunk} × {θ_j: topic-chunk per document}) / (n_k + Vβ)
                # e.g. [0.017 0.015 0.004 0.021 0.001 0.001 0.001 0.006 0.003 0.004]
                #      → the sampled topic index here is 3
                # add 1 back to the φ_k, θ_j, and per-topic total entries at the new index
    # after the final sampling pass, print each topic's words ranked by their values in φ_k
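A runnable numpy version of the sampler sketched above, including the count decrement the pseudocode glosses over. The interface follows the guide's LDA(K, alpha, beta, iter, docs, V); everything else is an assumption, not the original implementation.

```python
import numpy as np

def lda_gibbs(docs, K, V, alpha=0.1, beta=0.1, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA. `docs` is a list of lists of
    word ids in [0, V). Returns phi (K x V) and theta (D x K) estimates."""
    rng = np.random.default_rng(seed)
    n_kw = np.zeros((K, V))                 # topic-chunk counts (the phi_k matrix)
    n_dk = np.zeros((len(docs), K))         # document-topic counts (theta_j)
    n_k = np.zeros(K)                       # words per topic
    z = [rng.integers(K, size=len(doc)) for doc in docs]
    for d, doc in enumerate(docs):          # initialize counts from random z
        for i, w in enumerate(doc):
            k = z[d][i]
            n_kw[k, w] += 1; n_dk[d, k] += 1; n_k[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]                 # remove the current assignment
                n_kw[k, w] -= 1; n_dk[d, k] -= 1; n_k[k] -= 1
                # posterior over topics: (n_kw + beta)/(n_k + V*beta) * (n_dk + alpha)
                p = (n_kw[:, w] + beta) / (n_k + V * beta) * (n_dk[d] + alpha)
                k = rng.choice(K, p=p / p.sum())
                z[d][i] = k                 # add the sampled assignment back
                n_kw[k, w] += 1; n_dk[d, k] += 1; n_k[k] += 1
    phi = (n_kw + beta) / (n_k[:, None] + V * beta)
    theta = (n_dk + alpha) / (n_dk.sum(axis=1, keepdims=True) + K * alpha)
    return phi, theta
```

Printing the 20 highest-weight words per row of phi reproduces the top-20 output described earlier.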
  • LDA implementation guide: perplexity.
    φ_j^(w) = (n_w^(j) + β) / (n_·^(j) + Vβ)
    where n_w^(j) is the {φ_k: topic-chunk} count matrix and n_·^(j) is the number of words assigned to each topic.
    θ_j^(d) = (n_j^(d) + α) / (n_·^(d) + Kα)
    where n_j^(d) is the vector of how many words of document d are assigned to each of the K topics, e.g. for 10 topics:
    ( 8.1 10.1 14.1 8.1 4.1 7.1 26.1 5.1 3.1 1.1 )
    ( 14.1 10.1 9.1 1.1 19.1 0.1 11.1 7.1 6.1 12.1 )
    ( 17.1 16.1 8.1 36.1 17.1 10.1 19.1 11.1 11.1 15.1 )
    and n_·^(d) is the total number of words in document d.
  • LDA implementation guide: perplexity (pseudocode; a runnable version follows).
    log_perp = 0
    compute φ from the topic-chunk counts
    for j in range(J):                          # over all documents
        compute θ_j for document j
        for n in range(N_j):                    # over the words of document j
            log_perp += −log(φ · θ_j)           # accumulate the negative log-likelihood
    return exp(log_perp / N)                    # N: total number of words
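The same computation as a runnable function, using the smoothed phi and theta estimates defined on the previous slide (a sketch under the assumption that docs are lists of word ids):

```python
import numpy as np

def perplexity(docs, phi, theta):
    """perplexity = exp(-(1/N) * sum_d sum_i log sum_k theta[d,k]*phi[k,w_di]);
    lower is better, and convergence of this value can stop the Gibbs loop."""
    log_lik, n_words = 0.0, 0
    for d, doc in enumerate(docs):
        word_probs = theta[d] @ phi          # mixture distribution for document d
        log_lik += np.log(word_probs[doc]).sum()
        n_words += len(doc)
    return np.exp(-log_lik / n_words)
```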