Machine Learning
How to put things together?
A case-study of model design, inference, learning, and evaluation in text analysis
Eric Xing
Lecture 20, August 16, 2010
Reading:
Need computers to help us…
(from images.google.cn)
 Humans cannot afford to deal with (e.g., search, browse, or measure similarity) a huge number of text documents
 We need computers to help out …
NLP and Data Mining
We want:
 Semantic-based search
 Infer topics and categorize documents
 Multimedia inference
 Automatic translation
 Predict how topics evolve
 …
(Figure: research topics evolving over time, 1900–2000)
How to get started?
 Here are some important elements to consider before you start:
 Task:
 Embedding? Classification? Clustering? Topic extraction? …
 Data representation:
 Input and output (e.g., continuous, binary, counts, …)
 Model:
 BN? MRF? Regression? SVM?
 Inference:
 Exact inference? MCMC? Variational?
 Learning:
 MLE? MCLE? Max margin?
 Evaluation:
 Visualization? Human interpretability? Perplexity? Predictive accuracy?
 It is better to consider one element at a time!
Tasks:
 Say, we want to have a mapping …, so that we can:
 Compare similarity
 Classify contents
 Cluster/group/categorize
 Distill semantics and perspectives
 …
Modeling document collections
 A document collection is a dataset where each data point is itself a collection of simpler data.
 Text documents are collections of words.
 Segmented images are collections of regions.
 User histories are collections of purchased items.
 Many modern problems ask questions of such data.
 Is this text document relevant to my query?
 Which category is this image in?
 What movies would I probably like?
 Create a caption for this image.
Representation:
 Data: Bag of Words representation
Example document: "As for the Arabian and Palestinean voices that are against the current negotiations and the so-called peace process, they are not against peace per se, but rather for their well-founded predictions that Israel would NOT give an inch of the West Bank (and most probably the same for Golan Heights) back to the Arabs. An 18 months of "negotiations" in Madrid, and Washington proved these predictions. Now many will jump on me saying why are you blaming israelis for no-result negotiations. I would say why would the Arabs stall the negotiations, what do they have to loose?"
(Figure: word-count histogram over terms such as Arabian, negotiations, against, peace, Israel, Arabs, blaming)
 Each document is a vector in the word space
 Ignore the order of words in a document. Only count matters!
 A high-dimensional and sparse representation
 – Not efficient for text processing tasks, e.g., search, document classification, or similarity measure
 – Not effective for browsing
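To make the bag-of-words idea concrete, here is a minimal Python sketch (a toy two-document corpus; the documents and tokenization are illustrative only, and a real pipeline would also lowercase, remove stop words, and prune the vocabulary):

```python
# A minimal bag-of-words sketch: each document becomes a count vector
# over a shared vocabulary; word order is discarded, only counts matter.
from collections import Counter

docs = [
    "israel arabs stall the negotiations",            # toy documents
    "peace negotiations in madrid and washington",
]

# Build the vocabulary (word -> index) from the whole collection.
vocab = {w: i for i, w in enumerate(sorted({w for d in docs for w in d.split()}))}

def bow_vector(doc):
    """Map a document to a count vector in the word space."""
    counts = Counter(doc.split())
    return [counts.get(w, 0) for w in vocab]

for d in docs:
    print(bow_vector(d))   # high-dimensional and sparse for real corpora
```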
How to Model Semantics?
 Q: What is it about?
 A: Mainly MT, with syntax, some learning
Example abstract: "A Hierarchical Phrase-Based Model for Statistical Machine Translation. We present a statistical phrase-based translation model that uses hierarchical phrases—phrases that contain sub-phrases. The model is formally a synchronous context-free grammar but is learned from a bitext without any syntactic information. Thus it can be seen as a shift to the formal machinery of syntax-based translation systems without any linguistic commitment. In our experiments using BLEU as a metric, the hierarchical phrase-based model achieves a relative improvement of 7.5% over Pharaoh, a state-of-the-art phrase-based system."
(Figure: the abstract as a mixture of three topics, MT, Syntax, and Learning, with mixing proportions 0.6, 0.3, 0.1; each topic is a unigram distribution over the vocabulary, e.g., MT: source, target, SMT, alignment, score, BLEU; Syntax: parse, tree, noun, phrase, grammar, CFG; Learning: likelihood, EM, hidden, parameters, estimation, argMax)
Topic Models
Why is this Useful?
 Q: What is it about?
 A: Mainly MT, with syntax, some learning (the same abstract, summarized by its topic mixing proportions 0.6, 0.3, 0.1)
 Q: Give me similar documents?
 Structured way of browsing the collection
 Other tasks:
 Dimensionality reduction
 TF-IDF vs. topic mixing proportion
 Classification, clustering, and more …
Topic Models: The Big Picture
Unstructured Collection → Structured Topic Network
 Topic Discovery
 Dimensionality Reduction
(Figure: documents as points in the word simplex, with coordinates w1, w2, …, wn, mapped to points in the topic simplex with corners T1, T2, …, Tk)
Topic Models
Generating a document:
 Draw $\theta$ from the prior
 For each word $n$:
 - Draw $z_n \sim \mathrm{Multinomial}(\theta)$
 - Draw $w_n \mid z_n, \beta_{1:K} \sim \mathrm{Multinomial}(\beta_{z_n})$
(Plate diagram: prior → θ → z → w, with β_{1:K} shared across documents; N words per document, N_d documents, K topics)
Which prior to use?
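As a concrete illustration of this generative process, here is a minimal Python sketch that samples one document, assuming a Dirichlet prior over θ (all sizes and hyperparameters are toy values):

```python
# A minimal sketch of the topic-model generative process above,
# with a Dirichlet prior on the topic proportions theta.
import numpy as np

rng = np.random.default_rng(0)
K, V, N = 3, 8, 20                        # topics, vocabulary size, words/doc
alpha = np.full(K, 0.5)                   # Dirichlet hyperparameter (toy)
beta = rng.dirichlet(np.ones(V), size=K)  # K topic-word distributions

theta = rng.dirichlet(alpha)              # draw theta from the prior
doc = []
for _ in range(N):
    z = rng.choice(K, p=theta)            # draw z_n ~ Multinomial(theta)
    w = rng.choice(V, p=beta[z])          # draw w_n ~ Multinomial(beta_{z_n})
    doc.append(w)
print(theta, doc)
```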
Choices of Priors
 Dirichlet (LDA) (Blei et al., 2003)
 Conjugate prior means efficient inference
 Can only capture variations in each topic's intensity independently
 Logistic Normal (CTM = LoNTAM) (Blei & Lafferty, 2005; Ahmed & Xing, 2006)
 Captures the intuition that some topics are highly correlated and can rise in intensity together
 Not a conjugate prior, which implies hard inference
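A small sketch of how draws from the two priors differ (toy mean and covariance; for simplicity this uses a full K-dimensional softmax rather than the lecture's parameterization with γ_K = 0):

```python
# Contrasting the two priors over the topic simplex: a Dirichlet draw
# vs. a logistic-normal draw theta_i = exp(gamma_i) / sum_j exp(gamma_j).
import numpy as np

rng = np.random.default_rng(1)
K = 3

# Dirichlet: topic intensities vary independently.
theta_dir = rng.dirichlet(np.ones(K))

# Logistic normal: the covariance lets correlated topics rise together.
mu = np.zeros(K)
Sigma = np.array([[1.0, 0.8, 0.0],
                  [0.8, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])   # topics 1 and 2 positively correlated
gamma = rng.multivariate_normal(mu, Sigma)
theta_ln = np.exp(gamma) / np.exp(gamma).sum()

print(theta_dir, theta_ln)
```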
Generative Semantics of LoNTAM
Generating a document:
 Draw $\gamma \sim \mathcal{N}_{K-1}(\mu, \Sigma)$, with $\gamma_K = 0$; equivalently $\theta \sim \mathcal{LN}(\mu, \Sigma)$
 Map $\gamma$ to the topic simplex: $\theta_i = \dfrac{e^{\gamma_i}}{1 + \sum_{j=1}^{K-1} e^{\gamma_j}} = \exp\{\gamma_i - C(\gamma)\}$
 For each word $n$:
 - Draw $z_n \sim \mathrm{Multinomial}(\theta)$
 - Draw $w_n \mid z_n, \beta_{1:K} \sim \mathrm{Multinomial}(\beta_{z_n})$
where $C(\gamma) = \log\big(1 + \sum_{i=1}^{K-1} e^{\gamma_i}\big)$ is the log partition function (normalization constant).
(Plate diagram: μ, Σ → γ → z → w, with β_{1:K}; N words per document, N_d documents, K topics)
Using the Model
 Inference
 Given a document D
 Posterior: $P(\Theta \mid \mu, \Sigma, \beta, D)$
 Evaluation: $P(D \mid \mu, \Sigma, \beta)$
 Learning
 Given a collection of documents $\{D_i\}$
 Parameter estimation:
$(\mu^*, \Sigma^*, \beta^*) = \arg\max_{(\mu, \Sigma, \beta)} \sum_i \log P(D_i \mid \mu, \Sigma, \beta)$
Inference
$P(D \mid \mu, \Sigma, \beta) = \prod_d P(w_d \mid \mu, \Sigma, \beta)$
where, for each document,
$P(w \mid \mu, \Sigma, \beta) = \int P(\gamma \mid \mu, \Sigma) \prod_{n=1}^{N_d} \sum_{z_n=1}^{K} P(z_n \mid \gamma)\, P(w_n \mid z_n, \beta_{1:K})\; d\gamma$
The integral over the logistic-normal $\gamma$ couples with the sums over the $z_n$:
Intractable → approximate inference
(Plate diagram: μ, Σ → γ → z → w with β_{1:K}; N_d words per document, N documents, K topics)
Variational Inference
 Approximate the integral / the posterior $P(\gamma, z_{1:n} \mid D)$ with a factored variational distribution:
$q(\gamma, z_{1:n}) = q(\gamma \mid \mu^*, \Sigma^*)\, \prod_n q(z_n \mid \phi_n^*)$
 Solve the optimization problem:
$(\mu^*, \Sigma^*, \phi^*_{1:n}) = \arg\min\; \mathrm{KL}(q \,\|\, p)$
Variational Inference With no Tears
 Iterate until convergence:
 Pretend you know $E[z_{1:n}]$
 $P(\gamma \mid E[z_{1:n}], \mu, \Sigma)$
 Now you know $E[\gamma]$
 $P(z_{1:n} \mid E[\gamma], w_{1:n}, \beta_{1:K})$
 More formally, this is a generalized mean field (GMF) message passing scheme:
$q^*(X_C) \propto P\big(X_C \mid \{\langle S_y \rangle_{q_y}\}_{y \in \mathrm{MB}(C)}\big)$
Equivalent to the previous method (Xing et al., 2003)
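The iterate-until-convergence recipe is easiest to see on a toy model. The sketch below runs mean-field updates on a bivariate Gaussian (an assumed example, not the LoNTAM model): each factor is updated given the other factor's current mean, exactly in the "pretend you know the other expectation" spirit.

```python
# Mean-field coordinate updates for a correlated bivariate Gaussian:
# q(x1)q(x2) fully factored, each factor updated given the other's mean.
import numpy as np

m = np.array([1.0, -1.0])                 # true mean
S = np.array([[1.0, 0.7],
              [0.7, 1.0]])                # true covariance (correlated)
P = np.linalg.inv(S)                      # precision matrix

E1, E2 = 0.0, 0.0                         # initial guesses for E[x1], E[x2]
for _ in range(50):
    E1 = m[0] - (P[0, 1] / P[0, 0]) * (E2 - m[1])  # update q(x1) given E[x2]
    E2 = m[1] - (P[1, 0] / P[1, 1]) * (E1 - m[0])  # update q(x2) given E[x1]
print(E1, E2)                             # converges to the true means
```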
LoNTAM Variational Inference
 Fully factored distribution: $q(\gamma, z_{1:n}) = q(\gamma)\, \prod_n q(z_n)$
 Two clusters: $\gamma$ and $z_{1:n}$
 GMF fixed-point equations:
$q^*(\gamma) \propto P\big(\gamma \mid \langle S_z \rangle_{q_z}, \mu, \Sigma\big)$
$q^*(z) \propto P\big(z \mid \langle S_\gamma \rangle_{q_\gamma}, \beta_{1:K}\big)$
Variational γ
$q^*(\gamma) \propto P\big(\gamma \mid \langle S_z \rangle_{q_z}, \mu, \Sigma\big) \propto P(\gamma \mid \mu, \Sigma)\, P\big(\langle S_z \rangle_{q_z} \mid \gamma\big)$
Now what is $\langle S_z \rangle_{q_z}$? The expected topic counts, $m = \big\langle \sum_n I(z_n = 1), \ldots, \sum_n I(z_n = K) \big\rangle_{q_z}$, giving
$q^*(\gamma) \propto \exp\big\{ m'\gamma - N\,C(\gamma) \big\}\, \exp\big\{ -\tfrac{1}{2} (\gamma - \mu)' \Sigma^{-1} (\gamma - \mu) \big\}$
Approximate $C(\gamma)$ by a second-order Taylor expansion around $\hat\gamma$:
$C(\gamma) \approx C(\hat\gamma) + g'(\gamma - \hat\gamma) + \tfrac{1}{2}(\gamma - \hat\gamma)' H (\gamma - \hat\gamma)$
so that $q^*(\gamma)$ becomes a multivariate normal, $q^*(\gamma) = \mathcal{N}(\mu^*, \Sigma^*)$, with
$\Sigma^* = \big(\Sigma^{-1} + N H\big)^{-1}, \qquad \mu^* = \Sigma^*\big(\Sigma^{-1}\mu + m + N(H\hat\gamma - g)\big)$
Approximation Quality
Variational z
$q^*(z_n) \propto P\big(z_n \mid \langle S_\gamma \rangle_{q_\gamma}, w_n, \beta_{1:K}\big) \propto P\big(z_n \mid \langle \gamma \rangle_q\big)\, P(w_n \mid z_n, \beta_{1:K})$
so for word $w_n = j$:
$q(z_n = k) \propto \exp\{\langle \gamma_k \rangle_q\}\, \beta_{k,j}$
Variational Inference: Recap
 Run one document at a time
 Message passing scheme (GMF):
 the $\{z\}$ cluster sends the expected topic counts $\langle m \rangle$; the $\gamma$ cluster sends $\langle \gamma \rangle$
 Iterate until convergence
 Posterior over $\gamma$ is an MVN with full covariance
Now you've got an algorithm; don't forget to compare with related work
Ahmed & Xing vs. Blei & Lafferty:
 Both use the factored variational posterior $q(\gamma, z_{1:n}) = q(\gamma \mid \mu^*, \Sigma^*)\, \prod_n q(z_n \mid \phi_n^*)$
 Ahmed & Xing: Σ* is a full matrix; Blei & Lafferty: Σ* is assumed to be diagonal
 Log partition function $\log\big(1 + \sum_{i=1}^{K-1} e^{\gamma_i}\big)$:
 Ahmed & Xing: multivariate quadratic approximation, closed-form solution for μ*, Σ*
 Blei & Lafferty: tangent approximation, numerical optimization to fit μ*, Diag(Σ*)
Tangent Approximation
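A small numeric sketch of the two approximations, in one dimension for readability (the expansion point γ̂ and the variational parameter ζ are arbitrary toy values): the quadratic approximation is a local second-order Taylor expansion of C(γ) = log(1 + e^γ), while the tangent construction upper-bounds log(x) by x/ζ − 1 + log ζ.

```python
# Quadratic (Taylor) vs. tangent (bound) approximations to the 1-D
# log partition function C(gamma) = log(1 + exp(gamma)).
import numpy as np

def C(g):
    return np.log1p(np.exp(g))                # exact log partition

def quadratic_approx(g, g_hat):
    p = 1.0 / (1.0 + np.exp(-g_hat))          # C'(g_hat) = sigmoid(g_hat)
    h = p * (1.0 - p)                         # C''(g_hat)
    return C(g_hat) + p * (g - g_hat) + 0.5 * h * (g - g_hat) ** 2

def tangent_bound(g, zeta):
    x = 1.0 + np.exp(g)
    return x / zeta - 1.0 + np.log(zeta)      # log(x) <= x/zeta - 1 + log(zeta)

g = np.linspace(-3.0, 3.0, 7)
print(C(g))
print(quadratic_approx(g, g_hat=0.0))         # tight near the expansion point
print(tangent_bound(g, zeta=2.0))             # exact where 1 + e^g == zeta
```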
Evaluation
 A common (but not so right) practice:
 Try models on real data for some empirical task, say classification or topic extraction; two typical reactions:
 Hmm! The results "make sense to me", so the model is good!
 Objective?
 Gee! The results are terrible! The model is bad!
 Where does the error come from?
Evaluation: testing inference
 Simulated data
 We know the ground truth for Θ
 This is a crucial step because it can discern performance loss due to modeling insufficiency from inference inaccuracy
 Vary model dimensions:
 K = number of topics
 M = vocabulary size
 Nd = number of words per document
 Test
 Inference
 Accuracy of the recovered Θ
 Number of iterations to converge (1e-6, default setting)
 Parameter estimation
 Goal: $(\mu^*, \Sigma^*, \beta^*) = \arg\max_{(\mu, \Sigma, \beta)} \sum_i \log P(D_i \mid \mu, \Sigma, \beta)$
 Standard VEM + deterministic annealing
Test on Synthetic Text
(Figure: synthetic documents generated from the model; plate diagram μ, Σ → γ → z → w with β)
Comparison: accuracy and speed
L2 error in the estimated topic vector and number of iterations to converge, while:
 Varying the number of topics
 Varying the vocabulary size
 Varying the number of words per document
(Figures: error and iteration counts for both methods; the error metric is sketched below)
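The accuracy metric itself is a one-liner; a tiny sketch with toy vectors (in the experiment, theta_true comes from the simulator and theta_est from variational inference):

```python
# L2 error between the ground-truth topic vector and its estimate.
import numpy as np

theta_true = np.array([0.6, 0.3, 0.1])
theta_est = np.array([0.55, 0.33, 0.12])   # hypothetical recovered value
print(np.linalg.norm(theta_true - theta_est))
```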
Parameter Estimation
 Goal: $(\mu^*, \Sigma^*, \beta^*) = \arg\max_{(\mu, \Sigma, \beta)} \sum_i \log P(D_i \mid \mu, \Sigma, \beta)$
 Standard VEM:
 Fit μ*, Σ* for each document
 Get model parameters using their expected sufficient statistics
 Problems:
 Having full covariance for the posterior traps our model in local maxima
 Solution:
 Deterministic annealing
Deterministic Annealing: Big Picture
$(\mu, \Sigma, \beta)^* = \arg\max_{(\mu, \Sigma, \beta)} \big\langle \log P(Y, X) \big\rangle_{p^{\tau}(Y \mid X)}$
where $p^{\tau}(Y \mid X) \propto p(Y \mid X)^{\tau}$ is an annealed posterior: flat at small temperature parameter τ, approaching the true posterior as τ → 1.
Deterministic Annealing
 EM: $(\mu, \Sigma, \beta)^* = \arg\max_{(\mu, \Sigma, \beta)} \big\langle \log P(Y, X) \big\rangle_{p(Y \mid X)}$
 DA-EM: $(\mu, \Sigma, \beta)^* = \arg\max_{(\mu, \Sigma, \beta)} \big\langle \log P(Y, X) \big\rangle_{p^{\tau}(Y \mid X)}$
(Figure: annealing progressively smooths the objective surface, guiding EM past shallow local maxima. Life is not always that good: the smoothed optimum need not track the global optimum.)
Deterministic Annealing for VEM
 EM: $\arg\max \langle \log P(Y, X) \rangle_{p(Y \mid X)}$; DA-EM: $\arg\max \langle \log P(Y, X) \rangle_{p^{\tau}(Y \mid X)}$
 VEM: $\arg\max \langle \log P(Y, X) \rangle_{q(Y \mid X)}$; DA-VEM: $\arg\max \langle \log P(Y, X) \rangle_{q^{\tau}(Y \mid X)}$
 For exponential families, this requires a two-line change to standard (V)EM (see the sketch below)
 Read more on that in Noah Smith & Jason Eisner (ACL 2004, COLING-ACL 2006)
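A hedged sketch of that two-line change, on a toy one-dimensional two-component Gaussian mixture (an assumed example, not the LoNTAM model): only the E-step differs from standard EM, by raising the posterior responsibilities to the temperature τ and renormalizing.

```python
# DA-EM on a toy 1-D mixture of two unit-variance Gaussians with equal
# weights; the annealed E-step is the only change from standard EM.
import numpy as np

rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(-2, 1, 100), rng.normal(2, 1, 100)])
mu = np.array([-0.1, 0.1])               # deliberately poor initialization

for it in range(100):
    tau = min(1.0, 0.1 + 0.02 * it)      # annealing schedule: tau -> 1
    # E-step with annealed posterior (the "two-line change"):
    logp = -0.5 * (x[:, None] - mu[None, :]) ** 2
    r = np.exp(tau * (logp - logp.max(axis=1, keepdims=True)))
    r /= r.sum(axis=1, keepdims=True)    # responsibilities ~ p(y|x)^tau
    # M-step unchanged:
    mu = (r * x[:, None]).sum(axis=0) / r.sum(axis=0)
print(mu)                                # recovers means near -2 and 2
```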
Result on NIPS collection
 NIPS proceedings from 1988-2003
 14,036 words
 2,484 docs
 80% for training and 20% for testing
 Fit both models with 10, 20, 30, 40 topics
 Compare perplexity on held-out data
 The perplexity of a language model with respect to text x is the reciprocal of the geometric average of the probabilities of the predictions in text x. So, if text x has k words, then the perplexity of the language model with respect to that text is Pr(x)^{-1/k} (see the sketch below)
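A minimal sketch of this computation from per-word probabilities (toy values; in the experiment they come from the fitted topic model):

```python
# Perplexity = Pr(x)^(-1/k), computed from per-word log-probabilities.
import numpy as np

log_probs = np.log([0.01, 0.02, 0.005, 0.015])  # per-word Pr under the model
k = len(log_probs)
perplexity = np.exp(-log_probs.sum() / k)       # reciprocal geometric average
print(perplexity)
```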
Comparison: perplexity
(Figure: held-out perplexity of both models fitted with 10, 20, 30, and 40 topics)
Topics and topic graphs
(Figure: discovered topics and the learned topic graph)
Classification Result on PNAS collection
 PNAS abstracts from 1997-2002
 2500 documents
 Average of 170 words per document
 Fitted 40-topic models using both approaches
 Use the low-dimensional representation to predict the abstract category
 Use an SVM classifier
 85% for training and 15% for testing
(Figure: classification accuracy; a notable difference between the two models. Examine the low-dimensional representations below.)
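A hedged sketch of this protocol, assuming scikit-learn is available (random toy features stand in for the per-document topic proportions, so the printed accuracy is near chance; in the experiment the features come from the fitted 40-topic model):

```python
# Classify documents from their low-dimensional topic proportions
# with an SVM, using an 85/15 train/test split as on the slide.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(3)
theta = rng.dirichlet(np.ones(40), size=2500)   # 40-d topic proportions (toy)
labels = rng.integers(0, 5, size=2500)          # hypothetical categories

X_tr, X_te, y_tr, y_te = train_test_split(theta, labels, test_size=0.15)
clf = SVC().fit(X_tr, y_tr)
print("accuracy:", clf.score(X_te, y_te))
```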
Are we done?
 What was our task?
 Embedding (lower-dimensional representation): yes ✓
 Distillation of semantics: kind of, we've learned "topics" ✓
 Classification: is it good?
 Clustering: is it reasonable?
 Other predictive tasks?
Some shocking results on LDA
(Figure: classification, retrieval, and annotation results)
 LDA is actually doing very poorly on several "objectively" evaluatable predictive tasks
Why?
 LDA is not designed, nor trained, for such tasks (e.g., classification); there is no guarantee that the estimated topic vector θ is good at discriminating documents
Supervised Topic Model (sLDA)
 LDA ignores documents' side information (e.g., categories or rating scores), which leads to a suboptimal topic representation for supervised tasks
 Supervised topic models handle such problems, e.g., sLDA (Blei & McAuliffe, 2007) and DiscLDA (Lacoste-Julien et al., 2008)
 Generative procedure (sLDA) (Blei & McAuliffe, 2007):
 For each document d:
 Sample a topic proportion
 For each word:
 – Sample a topic
 – Sample a word
 Sample the response: continuous (regression) or discrete (classification); see the sketch after this slide for the regression case
 Joint distribution:
 Variational inference:
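For the regression case the slide cites, a minimal sketch of the response step, $y \sim \mathcal{N}(\eta^\top \bar{z}, \sigma^2)$ with $\bar{z}$ the document's empirical topic frequencies (Blei & McAuliffe, 2007); all parameter values below are toy assumptions:

```python
# The sLDA response step (regression): y is Gaussian with mean equal
# to a linear function of the empirical topic frequencies z_bar.
import numpy as np

rng = np.random.default_rng(4)
K, N = 3, 50
z = rng.choice(K, size=N, p=[0.6, 0.3, 0.1])    # topic assignments
z_bar = np.bincount(z, minlength=K) / N         # empirical topic frequencies
eta, sigma2 = np.array([2.0, -1.0, 0.5]), 0.25  # hypothetical parameters
y = rng.normal(eta @ z_bar, np.sqrt(sigma2))    # y ~ N(eta' z_bar, sigma^2)
print(z_bar, y)
```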
MedLDA: a max-margin approach
 Big picture of supervised topic models:
 – sLDA: optimizes the joint likelihood for regression and classification
 – DiscLDA: optimizes the conditional likelihood for classification ONLY
 – MedLDA: based on max-margin learning for both regression and classification
MedLDA Regression Model
 Generative procedure (Bayesian sLDA):
 Sample a parameter
 For each document d:
 Sample a topic proportion
 For each word:
 – Sample a topic
 – Sample a word
 Sample the response
 Def: the MedLDA objective trades off model fitting (a variational likelihood bound) against a max-margin loss for predictive accuracy
Experiments
 Goals:
 To qualitatively and quantitatively evaluate how the max-margin estimates of MedLDA affect its topic discovery procedure
 Data sets:
 20 Newsgroups
 Documents from 20 categories
 ~20,000 documents, roughly evenly distributed across the groups
 Remove stop words as listed in …
 Movie Review
 5006 documents and 1.6M words
 Dictionary: 5000 terms selected by tf-idf
 Preprocessing to make the response approximately normal (Blei & McAuliffe, 2007)
Document Modeling
 Data set: 20 Newsgroups
 110 topics + 2D embedding with t-SNE (van der Maaten & Hinton, 2008)
(Figure: 2D embeddings, MedLDA vs. LDA)
Document Modeling (cont'd)
(Figure: example topics for the comp.graphics and politics.mideast categories)
Classification
 Data set: 20 Newsgroups
 – Binary classification: "alt.atheism" vs. "talk.religion.misc" (Lacoste-Julien et al., 2008)
 – Multiclass classification: all 20 categories
 Models: DiscLDA, sLDA (binary only; classification sLDA (Wang et al., 2009)), LDA+SVM (baseline), MedLDA, MedLDA+SVM
 Measure: relative improvement ratio
Regression
 Data set: Movie Review (Blei & McAuliffe, 2007)
 Models: MedLDA (partial), MedLDA (full), sLDA, LDA+SVR
 Measure: predictive R² and per-word log-likelihood
(Figure: results; note the sharp decrease in support vectors)
Time Efficiency
 Binary classification
 Multiclass:
 — MedLDA is comparable with LDA+SVM
 Regression:
 — MedLDA is comparable with sLDA
Finally, think about a general framework
 MedLDA can be generalized to arbitrary topic models:
 – Unsupervised or supervised
 – Generative or undirected random fields (e.g., Harmoniums)
 MED Topic Model (MedTM):
 Hidden r.v.s in the underlying topic model (e.g., {θ, z} in LDA)
 Parameters in the predictive model (e.g., η in sLDA)
 Parameters of the topic model (e.g., β in LDA)
 A variational upper bound of the log-likelihood
 A convex function over slack variables
Summary
 A 6-dimensional space of working with graphical models:
 Task:
 Embedding? Classification? Clustering? Topic extraction? …
 Data representation:
 Input and output (e.g., continuous, binary, counts, …)
 Model:
 BN? MRF? Regression? SVM?
 Inference:
 Exact inference? MCMC? Variational?
 Learning:
 MLE? MCLE? Max margin?
 Evaluation:
 Visualization? Human interpretability? Perplexity? Predictive accuracy?
 It is better to consider one element at a time!

More Related Content

What's hot

Basic review on topic modeling
Basic review on  topic modelingBasic review on  topic modeling
Basic review on topic modeling
Hiroyuki Kuromiya
 
Survey of Generative Clustering Models 2008
Survey of Generative Clustering Models 2008Survey of Generative Clustering Models 2008
Survey of Generative Clustering Models 2008
Roman Stanchak
 
The Duet model
The Duet modelThe Duet model
The Duet model
Bhaskar Mitra
 
Using Text Embeddings for Information Retrieval
Using Text Embeddings for Information RetrievalUsing Text Embeddings for Information Retrieval
Using Text Embeddings for Information Retrieval
Bhaskar Mitra
 
Type Vector Representations from Text. DL4KGS@ESWC 2018
Type Vector Representations from Text. DL4KGS@ESWC 2018Type Vector Representations from Text. DL4KGS@ESWC 2018
Type Vector Representations from Text. DL4KGS@ESWC 2018
Federico Bianchi
 
Language Models for Information Retrieval
Language Models for Information RetrievalLanguage Models for Information Retrieval
Language Models for Information RetrievalDustin Smith
 
Benchmarking for Neural Information Retrieval: MS MARCO, TREC, and Beyond
Benchmarking for Neural Information Retrieval: MS MARCO, TREC, and BeyondBenchmarking for Neural Information Retrieval: MS MARCO, TREC, and Beyond
Benchmarking for Neural Information Retrieval: MS MARCO, TREC, and Beyond
Bhaskar Mitra
 
Neural Models for Document Ranking
Neural Models for Document RankingNeural Models for Document Ranking
Neural Models for Document Ranking
Bhaskar Mitra
 
E43022023
E43022023E43022023
E43022023
IJERA Editor
 
Lecture02 - Fundamental Programming with Python Language
Lecture02 - Fundamental Programming with Python LanguageLecture02 - Fundamental Programming with Python Language
Lecture02 - Fundamental Programming with Python Language
National College of Business Administration & Economics ( NCBA&E)
 
Information extraction for Free Text
Information extraction for Free TextInformation extraction for Free Text
Information extraction for Free Textbutest
 
A survey of xml tree patterns
A survey of xml tree patternsA survey of xml tree patterns
A survey of xml tree patterns
IEEEFINALYEARPROJECTS
 
Exploring Session Context using Distributed Representations of Queries and Re...
Exploring Session Context using Distributed Representations of Queries and Re...Exploring Session Context using Distributed Representations of Queries and Re...
Exploring Session Context using Distributed Representations of Queries and Re...
Bhaskar Mitra
 
Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)
Bhaskar Mitra
 
Text Mining Analytics 101
Text Mining Analytics 101Text Mining Analytics 101
Text Mining Analytics 101
Manohar Swamynathan
 
Topic Modeling for Information Retrieval and Word Sense Disambiguation tasks
Topic Modeling for Information Retrieval and Word Sense Disambiguation tasksTopic Modeling for Information Retrieval and Word Sense Disambiguation tasks
Topic Modeling for Information Retrieval and Word Sense Disambiguation tasks
Leonardo Di Donato
 
PPT slides
PPT slidesPPT slides
PPT slidesbutest
 
Ju3517011704
Ju3517011704Ju3517011704
Ju3517011704
IJERA Editor
 

What's hot (19)

Basic review on topic modeling
Basic review on  topic modelingBasic review on  topic modeling
Basic review on topic modeling
 
Survey of Generative Clustering Models 2008
Survey of Generative Clustering Models 2008Survey of Generative Clustering Models 2008
Survey of Generative Clustering Models 2008
 
The Duet model
The Duet modelThe Duet model
The Duet model
 
Using Text Embeddings for Information Retrieval
Using Text Embeddings for Information RetrievalUsing Text Embeddings for Information Retrieval
Using Text Embeddings for Information Retrieval
 
Type Vector Representations from Text. DL4KGS@ESWC 2018
Type Vector Representations from Text. DL4KGS@ESWC 2018Type Vector Representations from Text. DL4KGS@ESWC 2018
Type Vector Representations from Text. DL4KGS@ESWC 2018
 
Language Models for Information Retrieval
Language Models for Information RetrievalLanguage Models for Information Retrieval
Language Models for Information Retrieval
 
Benchmarking for Neural Information Retrieval: MS MARCO, TREC, and Beyond
Benchmarking for Neural Information Retrieval: MS MARCO, TREC, and BeyondBenchmarking for Neural Information Retrieval: MS MARCO, TREC, and Beyond
Benchmarking for Neural Information Retrieval: MS MARCO, TREC, and Beyond
 
Neural Models for Document Ranking
Neural Models for Document RankingNeural Models for Document Ranking
Neural Models for Document Ranking
 
E43022023
E43022023E43022023
E43022023
 
Lecture02 - Fundamental Programming with Python Language
Lecture02 - Fundamental Programming with Python LanguageLecture02 - Fundamental Programming with Python Language
Lecture02 - Fundamental Programming with Python Language
 
Information extraction for Free Text
Information extraction for Free TextInformation extraction for Free Text
Information extraction for Free Text
 
A survey of xml tree patterns
A survey of xml tree patternsA survey of xml tree patterns
A survey of xml tree patterns
 
Exploring Session Context using Distributed Representations of Queries and Re...
Exploring Session Context using Distributed Representations of Queries and Re...Exploring Session Context using Distributed Representations of Queries and Re...
Exploring Session Context using Distributed Representations of Queries and Re...
 
Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)
 
Text Mining Analytics 101
Text Mining Analytics 101Text Mining Analytics 101
Text Mining Analytics 101
 
Topic Modeling for Information Retrieval and Word Sense Disambiguation tasks
Topic Modeling for Information Retrieval and Word Sense Disambiguation tasksTopic Modeling for Information Retrieval and Word Sense Disambiguation tasks
Topic Modeling for Information Retrieval and Word Sense Disambiguation tasks
 
Ir
IrIr
Ir
 
PPT slides
PPT slidesPPT slides
PPT slides
 
Ju3517011704
Ju3517011704Ju3517011704
Ju3517011704
 

Viewers also liked

Lecture7 xing fei-fei
Lecture7 xing fei-feiLecture7 xing fei-fei
Lecture7 xing fei-fei
Tianlu Wang
 
Lecture15 xing
Lecture15 xingLecture15 xing
Lecture15 xing
Tianlu Wang
 
Lecture17 xing fei-fei
Lecture17 xing fei-feiLecture17 xing fei-fei
Lecture17 xing fei-fei
Tianlu Wang
 
Lecture14 xing fei-fei
Lecture14 xing fei-feiLecture14 xing fei-fei
Lecture14 xing fei-fei
Tianlu Wang
 
Lecture2 xing
Lecture2 xingLecture2 xing
Lecture2 xing
Tianlu Wang
 
Lecture16 xing
Lecture16 xingLecture16 xing
Lecture16 xing
Tianlu Wang
 
Lecture5 xing
Lecture5 xingLecture5 xing
Lecture5 xing
Tianlu Wang
 
Lecture4 xing
Lecture4 xingLecture4 xing
Lecture4 xing
Tianlu Wang
 
Lecture6 xing
Lecture6 xingLecture6 xing
Lecture6 xing
Tianlu Wang
 
Lecture18 xing
Lecture18 xingLecture18 xing
Lecture18 xing
Tianlu Wang
 
Lecture11 xing
Lecture11 xingLecture11 xing
Lecture11 xing
Tianlu Wang
 
Lecture19 xing
Lecture19 xingLecture19 xing
Lecture19 xing
Tianlu Wang
 
Lecture8 xing
Lecture8 xingLecture8 xing
Lecture8 xing
Tianlu Wang
 
Lecture1 xing fei-fei
Lecture1 xing fei-feiLecture1 xing fei-fei
Lecture1 xing fei-fei
Tianlu Wang
 
Lecture12 xing
Lecture12 xingLecture12 xing
Lecture12 xing
Tianlu Wang
 
Lecture3 xing fei-fei
Lecture3 xing fei-feiLecture3 xing fei-fei
Lecture3 xing fei-fei
Tianlu Wang
 
Lecture9 xing
Lecture9 xingLecture9 xing
Lecture9 xing
Tianlu Wang
 
Lecture13 xing fei-fei
Lecture13 xing fei-feiLecture13 xing fei-fei
Lecture13 xing fei-fei
Tianlu Wang
 

Viewers also liked (18)

Lecture7 xing fei-fei
Lecture7 xing fei-feiLecture7 xing fei-fei
Lecture7 xing fei-fei
 
Lecture15 xing
Lecture15 xingLecture15 xing
Lecture15 xing
 
Lecture17 xing fei-fei
Lecture17 xing fei-feiLecture17 xing fei-fei
Lecture17 xing fei-fei
 
Lecture14 xing fei-fei
Lecture14 xing fei-feiLecture14 xing fei-fei
Lecture14 xing fei-fei
 
Lecture2 xing
Lecture2 xingLecture2 xing
Lecture2 xing
 
Lecture16 xing
Lecture16 xingLecture16 xing
Lecture16 xing
 
Lecture5 xing
Lecture5 xingLecture5 xing
Lecture5 xing
 
Lecture4 xing
Lecture4 xingLecture4 xing
Lecture4 xing
 
Lecture6 xing
Lecture6 xingLecture6 xing
Lecture6 xing
 
Lecture18 xing
Lecture18 xingLecture18 xing
Lecture18 xing
 
Lecture11 xing
Lecture11 xingLecture11 xing
Lecture11 xing
 
Lecture19 xing
Lecture19 xingLecture19 xing
Lecture19 xing
 
Lecture8 xing
Lecture8 xingLecture8 xing
Lecture8 xing
 
Lecture1 xing fei-fei
Lecture1 xing fei-feiLecture1 xing fei-fei
Lecture1 xing fei-fei
 
Lecture12 xing
Lecture12 xingLecture12 xing
Lecture12 xing
 
Lecture3 xing fei-fei
Lecture3 xing fei-feiLecture3 xing fei-fei
Lecture3 xing fei-fei
 
Lecture9 xing
Lecture9 xingLecture9 xing
Lecture9 xing
 
Lecture13 xing fei-fei
Lecture13 xing fei-feiLecture13 xing fei-fei
Lecture13 xing fei-fei
 

Similar to Lecture20 xing

Striving to Demystify Bayesian Computational Modelling
Striving to Demystify Bayesian Computational ModellingStriving to Demystify Bayesian Computational Modelling
Striving to Demystify Bayesian Computational Modelling
Marco Wirthlin
 
Moore_slides.ppt
Moore_slides.pptMoore_slides.ppt
Moore_slides.pptbutest
 
IE: Named Entity Recognition (NER)
IE: Named Entity Recognition (NER)IE: Named Entity Recognition (NER)
IE: Named Entity Recognition (NER)
Marina Santini
 
An Abstract Framework for Agent-Based Explanations in AI
An Abstract Framework for Agent-Based Explanations in AIAn Abstract Framework for Agent-Based Explanations in AI
An Abstract Framework for Agent-Based Explanations in AI
Giovanni Ciatto
 
XAI LANGUAGE TUTOR - A XAI-BASED LANGUAGE LEARNING CHATBOT USING ONTOLOGY AND...
XAI LANGUAGE TUTOR - A XAI-BASED LANGUAGE LEARNING CHATBOT USING ONTOLOGY AND...XAI LANGUAGE TUTOR - A XAI-BASED LANGUAGE LEARNING CHATBOT USING ONTOLOGY AND...
XAI LANGUAGE TUTOR - A XAI-BASED LANGUAGE LEARNING CHATBOT USING ONTOLOGY AND...
ijnlc
 
XAI LANGUAGE TUTOR - A XAI-BASED LANGUAGE LEARNING CHATBOT USING ONTOLOGY AND...
XAI LANGUAGE TUTOR - A XAI-BASED LANGUAGE LEARNING CHATBOT USING ONTOLOGY AND...XAI LANGUAGE TUTOR - A XAI-BASED LANGUAGE LEARNING CHATBOT USING ONTOLOGY AND...
XAI LANGUAGE TUTOR - A XAI-BASED LANGUAGE LEARNING CHATBOT USING ONTOLOGY AND...
kevig
 
Some Information Retrieval Models and Our Experiments for TREC KBA
Some Information Retrieval Models and Our Experiments for TREC KBASome Information Retrieval Models and Our Experiments for TREC KBA
Some Information Retrieval Models and Our Experiments for TREC KBA
Patrice Bellot - Aix-Marseille Université / CNRS (LIS, INS2I)
 
Evaluation Initiatives for Entity-oriented Search
Evaluation Initiatives for Entity-oriented SearchEvaluation Initiatives for Entity-oriented Search
Evaluation Initiatives for Entity-oriented Search
krisztianbalog
 
Natural language processing and transformer models
Natural language processing and transformer modelsNatural language processing and transformer models
Natural language processing and transformer models
Ding Li
 
MACHINE-DRIVEN TEXT ANALYSIS
MACHINE-DRIVEN TEXT ANALYSISMACHINE-DRIVEN TEXT ANALYSIS
MACHINE-DRIVEN TEXT ANALYSIS
Massimo Schenone
 
2007bai7604.doc.doc
2007bai7604.doc.doc2007bai7604.doc.doc
2007bai7604.doc.docbutest
 
USING TF-ISF WITH LOCAL CONTEXT TO GENERATE AN OWL DOCUMENT REPRESENTATION FO...
USING TF-ISF WITH LOCAL CONTEXT TO GENERATE AN OWL DOCUMENT REPRESENTATION FO...USING TF-ISF WITH LOCAL CONTEXT TO GENERATE AN OWL DOCUMENT REPRESENTATION FO...
USING TF-ISF WITH LOCAL CONTEXT TO GENERATE AN OWL DOCUMENT REPRESENTATION FO...
cseij
 
Deep Learning for Machine Translation: a paradigm shift - Alberto Massidda - ...
Deep Learning for Machine Translation: a paradigm shift - Alberto Massidda - ...Deep Learning for Machine Translation: a paradigm shift - Alberto Massidda - ...
Deep Learning for Machine Translation: a paradigm shift - Alberto Massidda - ...
Codemotion
 
Data Science Using Python.pptx
Data Science Using Python.pptxData Science Using Python.pptx
Data Science Using Python.pptx
Sarkunavathi Aribal
 
The Nature of Information
The Nature of InformationThe Nature of Information
The Nature of Information
Adrian Paschke
 
Information Flow based Ontology Mapping - 2002
Information Flow based Ontology Mapping - 2002Information Flow based Ontology Mapping - 2002
Information Flow based Ontology Mapping - 2002
Yannis Kalfoglou
 
Data Science - Part XI - Text Analytics
Data Science - Part XI - Text AnalyticsData Science - Part XI - Text Analytics
Data Science - Part XI - Text Analytics
Derek Kane
 
Instance-Based Ontological Knowledge Acquisition
Instance-Based Ontological Knowledge AcquisitionInstance-Based Ontological Knowledge Acquisition
Instance-Based Ontological Knowledge Acquisition
Lihua Zhao
 
Coates p: the use of genetic programing in exploring 3 d design worlds
Coates p: the use of genetic programing in exploring 3 d design worldsCoates p: the use of genetic programing in exploring 3 d design worlds
Coates p: the use of genetic programing in exploring 3 d design worldsArchiLab 7
 

Similar to Lecture20 xing (20)

Striving to Demystify Bayesian Computational Modelling
Striving to Demystify Bayesian Computational ModellingStriving to Demystify Bayesian Computational Modelling
Striving to Demystify Bayesian Computational Modelling
 
Moore_slides.ppt
Moore_slides.pptMoore_slides.ppt
Moore_slides.ppt
 
IE: Named Entity Recognition (NER)
IE: Named Entity Recognition (NER)IE: Named Entity Recognition (NER)
IE: Named Entity Recognition (NER)
 
An Abstract Framework for Agent-Based Explanations in AI
An Abstract Framework for Agent-Based Explanations in AIAn Abstract Framework for Agent-Based Explanations in AI
An Abstract Framework for Agent-Based Explanations in AI
 
XAI LANGUAGE TUTOR - A XAI-BASED LANGUAGE LEARNING CHATBOT USING ONTOLOGY AND...
XAI LANGUAGE TUTOR - A XAI-BASED LANGUAGE LEARNING CHATBOT USING ONTOLOGY AND...XAI LANGUAGE TUTOR - A XAI-BASED LANGUAGE LEARNING CHATBOT USING ONTOLOGY AND...
XAI LANGUAGE TUTOR - A XAI-BASED LANGUAGE LEARNING CHATBOT USING ONTOLOGY AND...
 
XAI LANGUAGE TUTOR - A XAI-BASED LANGUAGE LEARNING CHATBOT USING ONTOLOGY AND...
XAI LANGUAGE TUTOR - A XAI-BASED LANGUAGE LEARNING CHATBOT USING ONTOLOGY AND...XAI LANGUAGE TUTOR - A XAI-BASED LANGUAGE LEARNING CHATBOT USING ONTOLOGY AND...
XAI LANGUAGE TUTOR - A XAI-BASED LANGUAGE LEARNING CHATBOT USING ONTOLOGY AND...
 
Some Information Retrieval Models and Our Experiments for TREC KBA
Some Information Retrieval Models and Our Experiments for TREC KBASome Information Retrieval Models and Our Experiments for TREC KBA
Some Information Retrieval Models and Our Experiments for TREC KBA
 
Evaluation Initiatives for Entity-oriented Search
Evaluation Initiatives for Entity-oriented SearchEvaluation Initiatives for Entity-oriented Search
Evaluation Initiatives for Entity-oriented Search
 
Natural language processing and transformer models
Natural language processing and transformer modelsNatural language processing and transformer models
Natural language processing and transformer models
 
MACHINE-DRIVEN TEXT ANALYSIS
MACHINE-DRIVEN TEXT ANALYSISMACHINE-DRIVEN TEXT ANALYSIS
MACHINE-DRIVEN TEXT ANALYSIS
 
2007bai7604.doc.doc
2007bai7604.doc.doc2007bai7604.doc.doc
2007bai7604.doc.doc
 
USING TF-ISF WITH LOCAL CONTEXT TO GENERATE AN OWL DOCUMENT REPRESENTATION FO...
USING TF-ISF WITH LOCAL CONTEXT TO GENERATE AN OWL DOCUMENT REPRESENTATION FO...USING TF-ISF WITH LOCAL CONTEXT TO GENERATE AN OWL DOCUMENT REPRESENTATION FO...
USING TF-ISF WITH LOCAL CONTEXT TO GENERATE AN OWL DOCUMENT REPRESENTATION FO...
 
Deep Learning for Machine Translation: a paradigm shift - Alberto Massidda - ...
Deep Learning for Machine Translation: a paradigm shift - Alberto Massidda - ...Deep Learning for Machine Translation: a paradigm shift - Alberto Massidda - ...
Deep Learning for Machine Translation: a paradigm shift - Alberto Massidda - ...
 
The impact of standardized terminologies and domain-ontologies in multilingua...
The impact of standardized terminologies and domain-ontologies in multilingua...The impact of standardized terminologies and domain-ontologies in multilingua...
The impact of standardized terminologies and domain-ontologies in multilingua...
 
Data Science Using Python.pptx
Data Science Using Python.pptxData Science Using Python.pptx
Data Science Using Python.pptx
 
The Nature of Information
The Nature of InformationThe Nature of Information
The Nature of Information
 
Information Flow based Ontology Mapping - 2002
Information Flow based Ontology Mapping - 2002Information Flow based Ontology Mapping - 2002
Information Flow based Ontology Mapping - 2002
 
Data Science - Part XI - Text Analytics
Data Science - Part XI - Text AnalyticsData Science - Part XI - Text Analytics
Data Science - Part XI - Text Analytics
 
Instance-Based Ontological Knowledge Acquisition
Instance-Based Ontological Knowledge AcquisitionInstance-Based Ontological Knowledge Acquisition
Instance-Based Ontological Knowledge Acquisition
 
Coates p: the use of genetic programing in exploring 3 d design worlds
Coates p: the use of genetic programing in exploring 3 d design worldsCoates p: the use of genetic programing in exploring 3 d design worlds
Coates p: the use of genetic programing in exploring 3 d design worlds
 

More from Tianlu Wang

14 pro resolution
14 pro resolution14 pro resolution
14 pro resolution
Tianlu Wang
 
13 propositional calculus
13 propositional calculus13 propositional calculus
13 propositional calculus
Tianlu Wang
 
12 adversal search
12 adversal search12 adversal search
12 adversal search
Tianlu Wang
 
11 alternative search
11 alternative search11 alternative search
11 alternative search
Tianlu Wang
 
21 situation calculus
21 situation calculus21 situation calculus
21 situation calculus
Tianlu Wang
 
20 bayes learning
20 bayes learning20 bayes learning
20 bayes learning
Tianlu Wang
 
19 uncertain evidence
19 uncertain evidence19 uncertain evidence
19 uncertain evidence
Tianlu Wang
 
18 common knowledge
18 common knowledge18 common knowledge
18 common knowledge
Tianlu Wang
 
17 2 expert systems
17 2 expert systems17 2 expert systems
17 2 expert systems
Tianlu Wang
 
17 1 knowledge-based system
17 1 knowledge-based system17 1 knowledge-based system
17 1 knowledge-based system
Tianlu Wang
 
16 2 predicate resolution
16 2 predicate resolution16 2 predicate resolution
16 2 predicate resolution
Tianlu Wang
 
16 1 predicate resolution
16 1 predicate resolution16 1 predicate resolution
16 1 predicate resolution
Tianlu Wang
 
09 heuristic search
09 heuristic search09 heuristic search
09 heuristic search
Tianlu Wang
 
08 uninformed search
08 uninformed search08 uninformed search
08 uninformed search
Tianlu Wang
 

More from Tianlu Wang (20)

L7 er2
L7 er2L7 er2
L7 er2
 
L8 design1
L8 design1L8 design1
L8 design1
 
L9 design2
L9 design2L9 design2
L9 design2
 
14 pro resolution
14 pro resolution14 pro resolution
14 pro resolution
 
13 propositional calculus
13 propositional calculus13 propositional calculus
13 propositional calculus
 
12 adversal search
12 adversal search12 adversal search
12 adversal search
 
11 alternative search
11 alternative search11 alternative search
11 alternative search
 
10 2 sum
10 2 sum10 2 sum
10 2 sum
 
22 planning
22 planning22 planning
22 planning
 
21 situation calculus
21 situation calculus21 situation calculus
21 situation calculus
 
20 bayes learning
20 bayes learning20 bayes learning
20 bayes learning
 
19 uncertain evidence
19 uncertain evidence19 uncertain evidence
19 uncertain evidence
 
18 common knowledge
18 common knowledge18 common knowledge
18 common knowledge
 
17 2 expert systems
17 2 expert systems17 2 expert systems
17 2 expert systems
 
17 1 knowledge-based system
17 1 knowledge-based system17 1 knowledge-based system
17 1 knowledge-based system
 
16 2 predicate resolution
16 2 predicate resolution16 2 predicate resolution
16 2 predicate resolution
 
16 1 predicate resolution
16 1 predicate resolution16 1 predicate resolution
16 1 predicate resolution
 
15 predicate
15 predicate15 predicate
15 predicate
 
09 heuristic search
09 heuristic search09 heuristic search
09 heuristic search
 
08 uninformed search
08 uninformed search08 uninformed search
08 uninformed search
 

Recently uploaded

一比一原版(qut毕业证)昆士兰科技大学毕业证如何办理
一比一原版(qut毕业证)昆士兰科技大学毕业证如何办理一比一原版(qut毕业证)昆士兰科技大学毕业证如何办理
一比一原版(qut毕业证)昆士兰科技大学毕业证如何办理
taqyed
 
一比一原版UPenn毕业证宾夕法尼亚大学毕业证成绩单如何办理
一比一原版UPenn毕业证宾夕法尼亚大学毕业证成绩单如何办理一比一原版UPenn毕业证宾夕法尼亚大学毕业证成绩单如何办理
一比一原版UPenn毕业证宾夕法尼亚大学毕业证成绩单如何办理
beduwt
 
Memory Rental Store - The Chase (Storyboard)
Memory Rental Store - The Chase (Storyboard)Memory Rental Store - The Chase (Storyboard)
Memory Rental Store - The Chase (Storyboard)
SuryaKalyan3
 
2137ad - Characters that live in Merindol and are at the center of main stories
2137ad - Characters that live in Merindol and are at the center of main stories2137ad - Characters that live in Merindol and are at the center of main stories
2137ad - Characters that live in Merindol and are at the center of main stories
luforfor
 
Memory Rental Store - The Ending(Storyboard)
Memory Rental Store - The Ending(Storyboard)Memory Rental Store - The Ending(Storyboard)
Memory Rental Store - The Ending(Storyboard)
SuryaKalyan3
 
Sundabet | Slot gacor dan terpercaya mudah menang
Sundabet | Slot gacor dan terpercaya mudah menangSundabet | Slot gacor dan terpercaya mudah menang
Sundabet | Slot gacor dan terpercaya mudah menang
Sundabet | Situs Slot gacor dan terpercaya
 
Inter-Dimensional Girl Boards Segment (Act 3)
Inter-Dimensional Girl Boards Segment (Act 3)Inter-Dimensional Girl Boards Segment (Act 3)
Inter-Dimensional Girl Boards Segment (Act 3)
CristianMestre
 
acting board rough title here lolaaaaaaa
acting board rough title here lolaaaaaaaacting board rough title here lolaaaaaaa
acting board rough title here lolaaaaaaa
angelicafronda7
 
IrishWritersCtrsPersonalEssaysMay29.pptx
IrishWritersCtrsPersonalEssaysMay29.pptxIrishWritersCtrsPersonalEssaysMay29.pptx
IrishWritersCtrsPersonalEssaysMay29.pptx
Aine Greaney Ellrott
 
ART FORMS OF KERALA: TRADITIONAL AND OTHERS
ART FORMS OF KERALA: TRADITIONAL AND OTHERSART FORMS OF KERALA: TRADITIONAL AND OTHERS
ART FORMS OF KERALA: TRADITIONAL AND OTHERS
Sandhya J.Nair
 
ashokathegreat project class 12 presentation
ashokathegreat project class 12 presentationashokathegreat project class 12 presentation
ashokathegreat project class 12 presentation
aditiyad2020
 
The Last Polymath: Muntadher Saleh‎‎‎‎‎‎‎‎‎‎‎‎
The Last Polymath: Muntadher Saleh‎‎‎‎‎‎‎‎‎‎‎‎The Last Polymath: Muntadher Saleh‎‎‎‎‎‎‎‎‎‎‎‎
The Last Polymath: Muntadher Saleh‎‎‎‎‎‎‎‎‎‎‎‎
iraqartsandculture
 
CLASS XII- HISTORY-THEME 4-Thinkers, Bes
CLASS XII- HISTORY-THEME 4-Thinkers, BesCLASS XII- HISTORY-THEME 4-Thinkers, Bes
CLASS XII- HISTORY-THEME 4-Thinkers, Bes
aditiyad2020
 
The Legacy of Breton In A New Age by Master Terrance Lindall
The Legacy of Breton In A New Age by Master Terrance LindallThe Legacy of Breton In A New Age by Master Terrance Lindall
The Legacy of Breton In A New Age by Master Terrance Lindall
BBaez1
 
一比一原版(DU毕业证)迪肯大学毕业证成绩单
一比一原版(DU毕业证)迪肯大学毕业证成绩单一比一原版(DU毕业证)迪肯大学毕业证成绩单
一比一原版(DU毕业证)迪肯大学毕业证成绩单
zvaywau
 
Codes n Conventionss copy (2).pptx new new
Codes n Conventionss copy (2).pptx new newCodes n Conventionss copy (2).pptx new new
Codes n Conventionss copy (2).pptx new new
ZackSpencer3
 
Caffeinated Pitch Bible- developed by Claire Wilson
Caffeinated Pitch Bible- developed by Claire WilsonCaffeinated Pitch Bible- developed by Claire Wilson
Caffeinated Pitch Bible- developed by Claire Wilson
ClaireWilson398082
 
一比一原版(GU毕业证)格里菲斯大学毕业证成绩单
一比一原版(GU毕业证)格里菲斯大学毕业证成绩单一比一原版(GU毕业证)格里菲斯大学毕业证成绩单
一比一原版(GU毕业证)格里菲斯大学毕业证成绩单
zvaywau
 
thGAP - BAbyss in Moderno!! Transgenic Human Germline Alternatives Project
thGAP - BAbyss in Moderno!!  Transgenic Human Germline Alternatives ProjectthGAP - BAbyss in Moderno!!  Transgenic Human Germline Alternatives Project
thGAP - BAbyss in Moderno!! Transgenic Human Germline Alternatives Project
Marc Dusseiller Dusjagr
 

Recently uploaded (20)

一比一原版(qut毕业证)昆士兰科技大学毕业证如何办理
一比一原版(qut毕业证)昆士兰科技大学毕业证如何办理一比一原版(qut毕业证)昆士兰科技大学毕业证如何办理
一比一原版(qut毕业证)昆士兰科技大学毕业证如何办理
 
一比一原版UPenn毕业证宾夕法尼亚大学毕业证成绩单如何办理
一比一原版UPenn毕业证宾夕法尼亚大学毕业证成绩单如何办理一比一原版UPenn毕业证宾夕法尼亚大学毕业证成绩单如何办理
一比一原版UPenn毕业证宾夕法尼亚大学毕业证成绩单如何办理
 
Memory Rental Store - The Chase (Storyboard)
Memory Rental Store - The Chase (Storyboard)Memory Rental Store - The Chase (Storyboard)
Memory Rental Store - The Chase (Storyboard)
 
2137ad - Characters that live in Merindol and are at the center of main stories
2137ad - Characters that live in Merindol and are at the center of main stories2137ad - Characters that live in Merindol and are at the center of main stories
2137ad - Characters that live in Merindol and are at the center of main stories
 
Memory Rental Store - The Ending(Storyboard)
Memory Rental Store - The Ending(Storyboard)Memory Rental Store - The Ending(Storyboard)
Memory Rental Store - The Ending(Storyboard)
 
European Cybersecurity Skills Framework Role Profiles.pdf
European Cybersecurity Skills Framework Role Profiles.pdfEuropean Cybersecurity Skills Framework Role Profiles.pdf
European Cybersecurity Skills Framework Role Profiles.pdf
 
Sundabet | Slot gacor dan terpercaya mudah menang
Sundabet | Slot gacor dan terpercaya mudah menangSundabet | Slot gacor dan terpercaya mudah menang
Sundabet | Slot gacor dan terpercaya mudah menang
 
Inter-Dimensional Girl Boards Segment (Act 3)
Inter-Dimensional Girl Boards Segment (Act 3)Inter-Dimensional Girl Boards Segment (Act 3)
Inter-Dimensional Girl Boards Segment (Act 3)
 
acting board rough title here lolaaaaaaa
acting board rough title here lolaaaaaaaacting board rough title here lolaaaaaaa
acting board rough title here lolaaaaaaa
 
IrishWritersCtrsPersonalEssaysMay29.pptx
IrishWritersCtrsPersonalEssaysMay29.pptxIrishWritersCtrsPersonalEssaysMay29.pptx
IrishWritersCtrsPersonalEssaysMay29.pptx
 
ART FORMS OF KERALA: TRADITIONAL AND OTHERS
ART FORMS OF KERALA: TRADITIONAL AND OTHERSART FORMS OF KERALA: TRADITIONAL AND OTHERS
ART FORMS OF KERALA: TRADITIONAL AND OTHERS
 
ashokathegreat project class 12 presentation
ashokathegreat project class 12 presentationashokathegreat project class 12 presentation
ashokathegreat project class 12 presentation
 
The Last Polymath: Muntadher Saleh‎‎‎‎‎‎‎‎‎‎‎‎
The Last Polymath: Muntadher Saleh‎‎‎‎‎‎‎‎‎‎‎‎The Last Polymath: Muntadher Saleh‎‎‎‎‎‎‎‎‎‎‎‎
The Last Polymath: Muntadher Saleh‎‎‎‎‎‎‎‎‎‎‎‎
 
CLASS XII- HISTORY-THEME 4-Thinkers, Bes
CLASS XII- HISTORY-THEME 4-Thinkers, BesCLASS XII- HISTORY-THEME 4-Thinkers, Bes
CLASS XII- HISTORY-THEME 4-Thinkers, Bes
 
The Legacy of Breton In A New Age by Master Terrance Lindall
The Legacy of Breton In A New Age by Master Terrance LindallThe Legacy of Breton In A New Age by Master Terrance Lindall
The Legacy of Breton In A New Age by Master Terrance Lindall
 
一比一原版(DU毕业证)迪肯大学毕业证成绩单
一比一原版(DU毕业证)迪肯大学毕业证成绩单一比一原版(DU毕业证)迪肯大学毕业证成绩单
一比一原版(DU毕业证)迪肯大学毕业证成绩单
 
Codes n Conventionss copy (2).pptx new new
Codes n Conventionss copy (2).pptx new newCodes n Conventionss copy (2).pptx new new
Codes n Conventionss copy (2).pptx new new
 
Caffeinated Pitch Bible- developed by Claire Wilson
Caffeinated Pitch Bible- developed by Claire WilsonCaffeinated Pitch Bible- developed by Claire Wilson
Caffeinated Pitch Bible- developed by Claire Wilson
 
一比一原版(GU毕业证)格里菲斯大学毕业证成绩单
一比一原版(GU毕业证)格里菲斯大学毕业证成绩单一比一原版(GU毕业证)格里菲斯大学毕业证成绩单
一比一原版(GU毕业证)格里菲斯大学毕业证成绩单
 
thGAP - BAbyss in Moderno!! Transgenic Human Germline Alternatives Project
thGAP - BAbyss in Moderno!!  Transgenic Human Germline Alternatives ProjectthGAP - BAbyss in Moderno!!  Transgenic Human Germline Alternatives Project
thGAP - BAbyss in Moderno!! Transgenic Human Germline Alternatives Project
 

Lecture20 xing

  • 1. Machine LearningMachine Learninggg How to put things together ?How to put things together ? A caseA case--study of model design, inference,study of model design, inference, learning, evaluation in text analysislearning, evaluation in text analysis Eric XingEric Xing Eric Xing © Eric Xing @ CMU, 2006-2010 1 Lecture 20, August 16, 2010 Reading:
  • 2. Need computers to help usNeed computers to help us… (from images.google.cn)  Humans cannot afford to deal with (e.g., search, browse, or measure similarity) a huge number of text documents  We need computers to help out … Eric Xing © Eric Xing @ CMU, 2006-2010 2 p p
  • 3. NLP and Data MiningNLP and Data Mining We want:  Semantic-based search  infer topics and categorize documents  Multimedia inference  Automatic translation  Predict how topics evolveevolve  … Eric Xing © Eric Xing @ CMU, 2006-2010 3 Research topics 1900 2000 Research topics 1900 2000
  • 4. How to get started?How to get started?  Here are some important elements to consider before you start:  Task:  Embedding? Classification? Clustering? Topic extraction? …  Data representation:  Input and output (e.g., continuous, binary, counts, …)  Model:  BN? MRF? Regression? SVM?  Inference: Inference:  Exact inference? MCMC? Variational?  Learning:  MLE? MCLE? Max margin?g  Evaluation:  Visualization? Human interpretability? Perperlexity? Predictive accuracy?  It is better to consider one element at a time! Eric Xing © Eric Xing @ CMU, 2006-2010 4
  • 5. Tasks:Tasks:  Say, we want to have a mapping …, so that   Compare similarity  Classify contents  Cluster/group/categorizing Eric Xing © Eric Xing @ CMU, 2006-2010 5  Cluster/group/categorizing  Distill semantics and perspectives  ..
  • 6. Modeling document collectionsModeling document collections  A document collection is a dataset where each data point isp itself a collection of simpler data.  Text documents are collections of words. Segmented images are collections of regions Segmented images are collections of regions.  User histories are collections of purchased items.  Many modern problems ask questions of such data.  Is this text document relevant to my query? Whi h t i thi i i ? Which category is this image in?  What movies would I probably like?  Create a caption for this image.  Modeling document collections Eric Xing © Eric Xing @ CMU, 2006-2010 6  Modeling document collections
  • 7. Representation:Representation:  Data: Bag of Words Representation As for the Arabian and Palestinean voices that are against the current negotiations and the so-called peace process, they are not against peace per se, but rather for their well- founded predictions that Israel would NOT give an inch of Arabian p g the West bank (and most probably the same for Golan Heights) back to the Arabs. An 18 months of "negotiations" in Madrid, and Washington proved these predictions. Now many will jump on me saying why are you blaming israelis for no result negotiations I would say why would the negotiations against peace Israel  Each document is a vector in the word space for no-result negotiations. I would say why would the Arabs stall the negotiations, what do they have to loose ? Israel Arabs blaming  Ignore the order of words in a document. Only count matters!  A high-dimensional and sparse representation – Not efficient text processing tasks, e.g., search, document Eric Xing © Eric Xing @ CMU, 2006-2010 7 classification, or similarity measure – Not effective for browsing
  • 8. How to Model Semantic?How to Model Semantic?  Q: What is it about?  A: Mainly MT, with syntax, some learning A Hierarchical Phrase-Based Model f St ti ti l M hi T l ti0 6 0 3 0 1 Mixing for Statistical Machine Translation We present a statistical phrase-based Translation model that uses hierarchical phrases—phrases that contain sub-phrases. The model is formally a synchronous context-free grammar but is learned f bit t ith t t tiSource MT Syntax Learning 0.6 0.3 0.1 g Proportion from a bitext without any syntactic information. Thus it can be seen as a shift to the formal machinery of syntax based translation systems without any linguistic commitment. In our experiments using BLEU as a metric, the hierarchical Phrase based model achieves a relative Improvement of 7 5% over Pharaoh Source Target SMT Alignment S Parse Tree Noun Phrase likelihood EM Hidden Parameters pics Improvement of 7.5% over Pharaoh, a state-of-the-art phrase-based system.Score BLEU Phrase Grammar CFG Parameters Estimation argMax Top Eric Xing © Eric Xing @ CMU, 2006-2010 8 Unigram over vocabulary Topic Models
  • 9. Why this is Useful?Why this is Useful?  Q: What is it about?  A: Mainly MT, with syntax, some learning A Hierarchical Phrase-Based Model f St ti ti l M hi T l ti Mixing0 6 0 3 0 1 for Statistical Machine Translation We present a statistical phrase-based Translation model that uses hierarchical phrases—phrases that contain sub-phrases. The model is formally a synchronous context-free grammar but is learned f bit t ith t t ti MT Syntax Learning g Proportion 0.6 0.3 0.1 from a bitext without any syntactic information. Thus it can be seen as a shift to the formal machinery of syntax based translation systems without any linguistic commitment. In our experiments using BLEU as a metric, the hierarchical Phrase based model achieves a relative Improvement of 7 5% over Pharaoh  Q: give me similar document?  Structured way of browsing the collection  Other tasks Improvement of 7.5% over Pharaoh, a state-of-the-art phrase-based system.  Dimensionality reduction  TF-IDF vs. topic mixing proportion  Classification, clustering, and more … Eric Xing © Eric Xing @ CMU, 2006-2010 9
  • 10. Topic Models: The Big PictureTopic Models: The Big Picture Unstructured Collection Structured Topic NetworkUnstructured Collection Structured Topic Network Topic Discovery w1 T1 Dimensionality Reduction w2 wn x x x x Tk T2 x x x x Eric Xing © Eric Xing @ CMU, 2006-2010 10 Reduction Word Simplex Topic Simplex
  • 11. Topic ModelsTopic Models Generating a documentg Prior   each wordFor priorthefrom n Draw  θ      from,|Draw- fromDraw- :1 nzknn n lmultinomiazw lmultinomiaz   z wβ NK Nd N K Which prior to use? Eric Xing © Eric Xing @ CMU, 2006-2010 11
  • 12. Choices of PriorsChoices of Priors  Dirichlet (LDA) (Blei et al. 2003)( ) ( )  Conjugate prior means efficient inference  Can only capture variations in each topic’s intensity independently  Logistic Normal (CTM=LoNTAM) Logistic Normal (CTM=LoNTAM) (Blei & Lafferty 2005, Ahmed & Xing 2006)  Capture the intuition that some topics are highly Capture the intuition that some topics are highly correlated and can rise up in intensity together  Not a conjugate prior implies hard inference Eric Xing © Eric Xing @ CMU, 2006-2010 12
  • 13. Generative Semantic of LoNTAMGenerative Semantic of LoNTAM Generating a document μ Σ g  fromDraw- each wordFor priorthefrom n lmultinomiaz n Draw    μ      from,|Draw- :1 nzknn n lmultinomiazw  zβ K    ,~ ,~   1 0KK K N LN   w Nd N logexp                  1 1 1 1 K K i ii i e  N Eric Xing © Eric Xing @ CMU, 2006-2010 13 - Log Partition Function - Normalization Constant   log           1 1 1 K i i eC  
  • 14. Using the ModelUsing the Model  Inference Inference  Given a Document D  Posterior: P(Θ | μ,Σ, β ,D) E l ti P(D| Σ β ) Evaluation: P(D| μ,Σ, β ) L i Learning  Given a collection of documents {Di}  Parameter estimation        ,,logmaxarg )( iDP Eric Xing © Eric Xing @ CMU, 2006-2010 14   ),,(
  • 15. InferenceInference     dN μ Σ         dzwP wPDP dN K n n      ,,,,, ,,,, 1  μ            dPzPzwP dzwP d n N K nnn n z nn                ,, ,,,,, 1 1 zβ K  nn z       1 1 Intractable  approximate inference w Nd N K Intractable  approximate inference N Eric Xing © Eric Xing @ CMU, 2006-2010 15
  • 16. Variational InferenceVariational Inference μ Σ β  z μ Σ Approximate the Integral  z μ* Σ* Φ* )|( 1 DzP  z w Approximate the Posterior       zqqzq  ** z w Φ β )|,( :1 DzP n the Posterior       nnn zqqzq  ,, :1  i Optimization μ*,Σ*,φ1:n* Eric Xing © Eric Xing @ CMU, 2006-2010 16  pqKL n minarg **,*, :1  Optimization Problem Solve
• 17. Variational Inference With No Tears
  Iterate until convergence:
  - Pretend you know E[z_{1:n}] → update P(γ | E[z_{1:n}], μ, Σ).
  - Now you know E[γ] → update P(z_{1:n} | E[γ], w_{1:n}, β_{1:K}).
  More formally, this is a generalized mean field (GMF) message passing scheme:
    q*_Y(X_C) ∝ P(X_C | S_Y(⟨X_{MB}⟩_q)),
  i.e., each cluster's marginal is updated given the expected sufficient statistics S of its Markov blanket. Equivalent to the earlier method (Xing et al., 2003).
• 18. LoNTAM Variational Inference
  - Fully factored distribution: q(γ, z_{1:n}) = q(γ) Π_n q(z_n).
  - Two clusters: γ and z_{1:n}.
  - Fixed-point equations:
    q*(γ) ∝ P(γ | ⟨S_z⟩_{q_z}, μ, Σ)
    q*(z) ∝ P(z | ⟨S_γ⟩_{q_γ}, w_{1:n}, β_{1:K})
• 19. Variational γ
  q*(γ) ∝ P(γ | ⟨S_z⟩_{q_z}, μ, Σ) ∝ P(γ | μ, Σ) P(⟨S_z⟩ | γ)
        ∝ exp{ −½ (γ − μ)' Σ^{-1} (γ − μ) + m'γ − N·C(γ) },
  where the expected sufficient statistics are ⟨S_z⟩_{q_z} = (m, N), with m_k = Σ_{n=1}^{N} ⟨I(z_n = k)⟩.
  Approximate C(γ) by a second-order Taylor expansion around γ̂:
    C(γ) ≈ C(γ̂) + g'(γ − γ̂) + ½ (γ − γ̂)' H (γ − γ̂),
  so that q*(γ) = N(μ*, Σ*) with Σ* = (Σ^{-1} + N·H)^{-1} and μ* = Σ*(Σ^{-1}μ + m − N·g + N·H·γ̂).
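A sketch of this quadratic approximation: the gradient g and Hessian H of C(γ) have the standard softmax-style closed forms g = p and H = diag(p) − pp', where p holds the topic probabilities implied by the expansion point γ̂ (the numbers below are illustrative).

```python
import numpy as np

def C(gamma):
    """Log partition function C(gamma) = log(1 + sum_i e^{gamma_i})."""
    return np.log1p(np.exp(gamma).sum())

def C_quadratic(gamma, gamma_hat):
    """Second-order Taylor expansion of C around gamma_hat."""
    p = np.exp(gamma_hat) / (1.0 + np.exp(gamma_hat).sum())  # gradient g = p
    H = np.diag(p) - np.outer(p, p)                          # Hessian
    d = gamma - gamma_hat
    return C(gamma_hat) + p @ d + 0.5 * d @ H @ d

gamma_hat = np.array([0.2, -0.1, 0.3])
gamma = gamma_hat + 0.1
print(C(gamma), C_quadratic(gamma, gamma_hat))  # close for small steps
```

Because H is a full matrix, the resulting Σ* is full as well, which is the point of this closed-form update.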
• 20. Approximation Quality
  (Figure only.)
• 21. Variational z
  q*(z_n) ∝ P(z_n | ⟨S_γ⟩_{q_γ}, w_n, β) ∝ P(w_n | z_n, β) exp(⟨γ_{z_n}⟩_q),
  i.e., φ_{n,k} ≡ q*(z_n = k) ∝ β_{k,w_n} exp(⟨γ_k⟩_{q_γ}).
• 22. Variational Inference: Recap
  - Run one document at a time.
  - Message passing scheme (GMF): alternate between ⟨z⟩ → γ and ⟨γ⟩ → z updates.
  - Iterate until convergence.
  - The posterior over γ is a multivariate normal with full covariance. (A runnable sketch of the loop follows.)
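A runnable sketch (not the lecture's exact code) of the per-document loop, combining the z-update from slide 21 with the Gaussian γ-update from slide 19 under the quadratic approximation; all dimensions and priors in the usage lines are illustrative.

```python
import numpy as np

def grad_hess_C(gamma):
    """Gradient and Hessian of C(gamma) = log(1 + sum_i e^{gamma_i})."""
    p = np.exp(gamma) / (1.0 + np.exp(gamma).sum())
    return p, np.diag(p) - np.outer(p, p)

def infer_document(words, beta, mu, Sigma, iters=50):
    """GMF-style loop: alternate q(gamma) (Gaussian) and q(z) (multinomials).

    words: word ids of one document; beta: K x V topic-word matrix;
    mu, Sigma: logistic-normal prior over the first K-1 topic dimensions.
    """
    K, V = beta.shape
    N = len(words)
    Sigma_inv = np.linalg.inv(Sigma)
    m_gamma = mu.copy()                                   # E[gamma], expansion point
    for _ in range(iters):
        # q(z): phi_{n,k} proportional to beta[k, w_n] * exp(E[gamma_k]); gamma_K = 0
        e_gamma = np.append(m_gamma, 0.0)
        phi = beta[:, words].T * np.exp(e_gamma)          # N x K
        phi /= phi.sum(axis=1, keepdims=True)
        # q(gamma): Gaussian under the quadratic approximation of C at m_gamma
        m = phi[:, :-1].sum(axis=0)                       # counts for topics 1..K-1
        g, H = grad_hess_C(m_gamma)
        Sigma_star = np.linalg.inv(Sigma_inv + N * H)     # full covariance
        m_gamma = Sigma_star @ (Sigma_inv @ mu + m - N * g + N * (H @ m_gamma))
    return m_gamma, Sigma_star, phi

# Usage with illustrative dimensions:
rng = np.random.default_rng(0)
K, V = 4, 30
beta = rng.dirichlet(np.ones(V), size=K)
mu, Sigma = np.zeros(K - 1), np.eye(K - 1)
words = rng.integers(0, V, size=25)
m_star, S_star, phi = infer_document(words, beta, mu, Sigma)
print(m_star)
```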
• 23. Now you've got an algorithm; don't forget to compare with related work
  Ahmed & Xing vs. Blei & Lafferty:
  - Σ* is a full matrix vs. Σ* is assumed to be diagonal.
  - Log partition function log(1 + Σ_{i=1}^{K−1} e^{γ_i}): multivariate quadratic approximation vs. tangent approximation.
  - Closed-form solution for μ*, Σ* vs. numerical optimization to fit μ*, diag(Σ*).
• 24. Tangent Approximation
  (Figure only.)
• 25. Evaluation
  A common (but not quite right) practice: try models on real data for some empirical task, say classification or topic extraction, and react in one of two ways:
  - "Hmm! The results make sense to me, so the model is good!" But is that objective?
  - "Gee! The results are terrible! The model is bad!" But where does the error come from?
• 26. Evaluation: testing inference
  - Simulated data: we know the ground truth for θ and β. This step is crucial because it can discern performance loss due to modeling insufficiency from inference inaccuracy.
  - Vary the model dimensions: K = number of topics; M = vocabulary size; Nd = number of words per document.
  - Test inference: accuracy of the recovered θ; number of iterations to converge (threshold 1e-6, default setting).
  - Parameter estimation. Goal: (μ, Σ, β)* = argmax_{μ,Σ,β} Σ_i log P(D_i | μ, Σ, β). Method: standard VEM + deterministic annealing.
  A sketch of this protocol follows.
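A minimal version of the simulated-data protocol: generate a document with a known θ, run an inference routine, and report the L2 error of the recovered θ. The single-document EM below is a simple stand-in for the variational routine being tested, and all dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def generate_doc(theta, beta, N):
    """Sample N words from the topic model with known theta and beta."""
    z = rng.choice(len(theta), size=N, p=theta)
    return np.array([rng.choice(beta.shape[1], p=beta[k]) for k in z])

def estimate_theta(words, beta, iters=200):
    """Single-document EM for theta with beta known (a stand-in inference routine)."""
    K = beta.shape[0]
    theta = np.full(K, 1.0 / K)
    for _ in range(iters):
        phi = beta[:, words].T * theta          # E-step: responsibilities
        phi /= phi.sum(axis=1, keepdims=True)
        theta = phi.mean(axis=0)                # M-step
    return theta

# Vary model dimensions as on the slide: K topics, M words, Nd words per document.
K, M, Nd = 5, 50, 2000
beta = rng.dirichlet(np.ones(M), size=K)
theta_true = rng.dirichlet(np.ones(K))
words = generate_doc(theta_true, beta, Nd)
theta_hat = estimate_theta(words, beta)
print("L2 error:", np.linalg.norm(theta_hat - theta_true))
```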
• 27. Test on Synthetic Text
  (Figure: the generative model (μ, Σ) → γ → z → w with topics β, used to simulate the text.)
• 28. Comparison: accuracy and speed
  L2 error in the topic vector estimate and number of iterations, while:
  - varying the number of topics;
  - varying the vocabulary size;
  - varying the number of words per document.
  (Figures.)
• 29. Parameter Estimation
  - Goal: (μ, Σ, β)* = argmax_{μ,Σ,β} Σ_i log P(D_i | μ, Σ, β).
  - Standard VEM: fit μ*, Σ* for each document, then update the model parameters using the expected sufficient statistics.
  - Problem: having a full covariance for the posterior traps our model in local maxima.
  - Solution: deterministic annealing.
• 30. Deterministic Annealing: Big Picture
  EM maximizes the expected complete-data log likelihood,
    (μ, Σ, β)* = argmax_{μ,Σ,β} E_{p(Y|X)}[ log P(Y, X) ],
  where Y are the hidden variables; annealing tempers the posterior to smooth this objective.
• 31. Deterministic Annealing
  - EM: (μ, Σ, β)* = argmax_{μ,Σ,β} E_{p(Y|X)}[ log P(Y, X) ].
  - DA-EM: (μ, Σ, β)* = argmax_{μ,Σ,β} E_{p_T(Y|X)}[ log P(Y, X) ], where p_T(Y|X) ∝ p(Y|X)^{1/T} and the temperature T is annealed toward 1.
• 33. Deterministic Annealing
  - Life is not always that good: plain EM can get trapped in poor local maxima, which is what the tempered posterior is designed to avoid.
• 34. Deterministic Annealing
  - DA-EM: temper the E-step posterior, maximizing E_{p_T(Y|X)}[ log P(Y, X) ].
  - DA-VEM: the same trick applies to variational EM, tempering the variational posterior q_T(Y).
  - For exponential families, this requires only a two-line change to standard (V)EM; see the sketch below.
  - Read more: Noah Smith & Jason Eisner (ACL 2004, COLING-ACL 2006).
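A sketch of the two-line change on a toy 1-D Gaussian mixture (not the topic model itself): temper the E-step responsibilities by 1/T and anneal T toward 1. The annealing schedule and all numbers are illustrative.

```python
import numpy as np

def da_em_mixture(x, K=2, T0=10.0, cooling=0.9, iters=100, seed=0):
    """Deterministic-annealing EM for a 1-D Gaussian mixture (fixed variance).
    Only the two marked lines differ from standard EM."""
    rng = np.random.default_rng(seed)
    mu = rng.choice(x, size=K)
    sigma, pi, T = x.std() + 1e-6, np.full(K, 1.0 / K), T0
    for _ in range(iters):
        ll = -0.5 * ((x[:, None] - mu[None, :]) / sigma) ** 2 + np.log(pi)
        ll = ll / T                               # DA change 1: temper the posterior
        r = np.exp(ll - ll.max(axis=1, keepdims=True))
        r /= r.sum(axis=1, keepdims=True)         # E-step responsibilities
        nk = r.sum(axis=0)
        mu, pi = (r * x[:, None]).sum(axis=0) / nk, nk / len(x)  # M-step
        T = max(1.0, cooling * T)                 # DA change 2: anneal T -> 1
    return mu, pi

x = np.concatenate([np.random.default_rng(1).normal(-2, 1, 200),
                    np.random.default_rng(2).normal(3, 1, 200)])
print(da_em_mixture(x))
```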
• 35. Results on the NIPS collection
  - NIPS proceedings from 1988-2003: 14,036 words, 2,484 docs; 80% for training and 20% for testing.
  - Fit both models with 10, 20, 30, 40 topics and compare perplexity on held-out data.
  - The perplexity of a language model with respect to a text x is the reciprocal of the geometric average of the probabilities of its predictions on x: if x has k words, the perplexity is Pr(x)^{-1/k} (computed below).
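The held-out perplexity exactly as defined on the slide, computed in log space for numerical stability:

```python
import numpy as np

def perplexity(log_probs):
    """Perplexity of a text given per-word log probabilities:
    Pr(x)^(-1/k) = exp(-(1/k) * sum_n log p(w_n))."""
    log_probs = np.asarray(log_probs)
    return float(np.exp(-log_probs.mean()))

# Sanity check: a model assigning every word probability 1/100 has perplexity 100.
print(perplexity(np.log(np.full(50, 0.01))))
```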
• 36. Comparison: perplexity
  (Figure only.)
• 37. Topics and topic graphs
  (Figure only.)
• 38. Classification Results on the PNAS collection
  - PNAS abstracts from 1997-2002: 2,500 documents, an average of 170 words per document.
  - Fitted a 40-topic model using both approaches; use the low-dimensional representation to predict the abstract category with an SVM classifier (85% for training, 15% for testing).
  - Classification accuracy shows a notable difference; examine the low-dimensional representations below.
• 39. Are we done?
  What was our task?
  - Embedding (lower-dimensional representation): yes.
  - Distillation of semantics: kind of; we've learned "topics".
  - Classification: is it good?
  - Clustering: is it reasonable?
  - Other predictive tasks?
• 40. Some shocking results on LDA
  (Figures: Retrieval, Classification, Annotation.)
  - LDA actually does very poorly on several "objectively" evaluatable predictive tasks.
• 41. Why?
  - LDA is neither designed nor trained for such tasks as classification; there is no guarantee that the estimated topic vector θ is good at discriminating documents.
• 42. Supervised Topic Model (sLDA)
  - LDA ignores documents' side information (e.g., categories or rating scores), which leads to a suboptimal topic representation for supervised tasks.
  - Supervised topic models handle such problems, e.g., sLDA (Blei & McAuliffe, 2007) and DiscLDA (Simon et al., 2008).
  - Generative procedure (sLDA), for each document d:
    - Sample a topic proportion.
    - For each word: sample a topic, then sample a word.
    - Sample the response y: continuous for regression, discrete for classification (Blei & McAuliffe, 2007).
  - The joint distribution and variational inference follow as before; a sketch of the generative procedure is below.
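A sketch of the sLDA generative procedure with a continuous (regression) response, where y is drawn from a normal centered at η'z̄ with z̄ the document's empirical topic frequencies; η and σ² below are illustrative values. For classification, y would instead be drawn from a softmax of η'z̄.

```python
import numpy as np

rng = np.random.default_rng(0)

K, V, N = 3, 20, 50
alpha = np.ones(K)
beta = rng.dirichlet(np.ones(V), size=K)        # topic-word distributions
eta, sigma2 = np.array([2.0, -1.0, 0.5]), 0.1   # illustrative response parameters

theta = rng.dirichlet(alpha)                    # topic proportion for document d
z = rng.choice(K, size=N, p=theta)              # per-word topics
w = np.array([rng.choice(V, p=beta[k]) for k in z])  # words
zbar = np.bincount(z, minlength=K) / N          # empirical topic frequencies
y = rng.normal(eta @ zbar, np.sqrt(sigma2))     # response: y ~ N(eta' zbar, sigma^2)
print(w[:10], y)
```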
• 43. MedLDA: a max-margin approach
  The big picture of supervised topic models:
  - sLDA optimizes the joint likelihood for regression and classification.
  - DiscLDA optimizes the conditional likelihood for classification ONLY.
  - MedLDA is based on max-margin learning for both regression and classification.
• 44. MedLDA Regression Model
  Generative procedure (Bayesian sLDA):
  - Sample a parameter η.
  - For each document d: sample a topic proportion; for each word, sample a topic and then a word; sample the response y.
  - Objective: balance predictive accuracy (a margin-based loss) against model fitting (the likelihood bound); see the loss sketch below.
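For regression, the "predictive accuracy" term is an ε-insensitive margin loss of the kind used in support vector regression; the sketch below shows only the shape of that loss term, not the full MedLDA optimization, and ε and C are hypothetical values.

```python
import numpy as np

def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    """max(0, |y - y_hat| - eps): zero inside the margin, linear outside it."""
    return np.maximum(0.0, np.abs(y_true - y_pred) - eps)

# Objective sketch: model_fitting_bound + C * sum_d loss_d
C = 1.0
y, y_hat = np.array([0.8, 0.2, 0.5]), np.array([0.7, 0.6, 0.5])
print(C * eps_insensitive_loss(y, y_hat).sum())
```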
• 45. Experiments
  - Goal: qualitatively and quantitatively evaluate how the max-margin estimates of MedLDA affect its topic discovery procedure.
  - Data sets:
    - 20 Newsgroups: documents from 20 categories, ~20,000 documents in total; stop words removed as listed in …
    - Movie Review: 5,006 documents and 1.6M words; dictionary of 5,000 terms selected by tf-idf; preprocessing makes the response approximately normal (Blei & McAuliffe, 2007).
• 46. Document Modeling
  - Data set: 20 Newsgroups.
  - 110 topics + 2D embedding with t-SNE (van der Maaten & Hinton, 2008).
  (Figures: MedLDA vs. LDA embeddings.)
• 47. Document Modeling (cont'd)
  (Figures: topics for comp.graphics and politics.mideast.)
• 48. Classification
  - Data set: 20 Newsgroups.
    - Binary classification: "alt.atheism" vs. "talk.religion.misc" (Simon et al., 2008).
    - Multiclass classification: all 20 categories.
  - Models: DiscLDA, sLDA (binary only; classification sLDA of Wang et al., 2009), LDA+SVM (baseline; sketched below), MedLDA, MedLDA+SVM.
  - Measure: relative improvement ratio.
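A minimal, hypothetical version of the LDA+SVM baseline using scikit-learn: fit LDA topic features, then train a linear SVM on the resulting topic proportions. Hyperparameters here are illustrative, not the settings used in the experiments.

```python
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

cats = ["alt.atheism", "talk.religion.misc"]      # the binary task on the slide
train = fetch_20newsgroups(subset="train", categories=cats)
test = fetch_20newsgroups(subset="test", categories=cats)

clf = make_pipeline(
    CountVectorizer(stop_words="english", max_features=5000),  # bag of words
    LatentDirichletAllocation(n_components=40, random_state=0),  # topic features
    LinearSVC(),                                                 # SVM on theta
)
clf.fit(train.data, train.target)
print("accuracy:", clf.score(test.data, test.target))
```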
• 49. Regression
  - Data set: Movie Review (Blei & McAuliffe, 2007).
  - Models: MedLDA (partial), MedLDA (full), sLDA, LDA+SVR.
  - Measures: predictive R² (defined below) and per-word log-likelihood.
  - Note the sharp decrease in the number of support vectors.
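Predictive R² as a held-out measure, using the standard definition (one minus residual over total sum of squares; the slide itself does not spell this out):

```python
import numpy as np

def predictive_r2(y_true, y_pred):
    """pR2 = 1 - SS_res / SS_tot, evaluated on held-out responses."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    ss_res = ((y_true - y_pred) ** 2).sum()
    ss_tot = ((y_true - y_true.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot

print(predictive_r2([0.1, 0.5, 0.9], [0.2, 0.5, 0.8]))
```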
• 50. Time Efficiency
  - Binary classification. (Figure.)
  - Multiclass: MedLDA is comparable with LDA+SVM.
  - Regression: MedLDA is comparable with sLDA.
• 51. Finally, think about a general framework
  - MedLDA can be generalized to arbitrary topic models, giving a general framework: unsupervised or supervised; generative or undirected random fields (e.g., Harmoniums).
  - MED Topic Model (MedTM), schematically: minimize L + U subject to max-margin constraints, where:
    - H: hidden r.v.s in the underlying topic model, e.g., (θ, z) in LDA;
    - η: parameters of the predictive model, e.g., η in sLDA;
    - Ψ: parameters of the topic model, e.g., β in LDA;
    - L: a variational upper bound of the negative log-likelihood;
    - U: a convex function over the slack variables.
• 52. Summary
  A 6-dimensional space of working with graphical models:
  - Task: embedding? classification? clustering? topic extraction? …
  - Data representation: input and output (e.g., continuous, binary, counts, …).
  - Model: BN? MRF? Regression? SVM?
  - Inference: exact inference? MCMC? Variational?
  - Learning: MLE? MCLE? Max margin?
  - Evaluation: visualization? human interpretability? perplexity? predictive accuracy?
  It is better to consider one element at a time!