Spatial Latent Dirichlet Allocation
Authors: Xiaogang Wang and Eric Grimson
Review by: George Mathew (george2)
Applications
• Text Mining
  • Identifying similar chapters in a book
• Computer Vision
  • Face Recognition
• Co-location Mining
  • Identifying forest fires
• Music Search
  • Identifying the genre of a song from a segment of it
LDA - Overview
• A generative probabilistic model
• Represented in terms of words, documents, a corpus and labels:
  • word - the primary unit of discrete data
  • document - a sequence of words
  • corpus - the collection of all documents
  • label (output) - the class of a document
Wait a minute … so how are we going to perform computer vision applications using words and documents?
• Here, words represent visual words, which could consist of:
  • image patches
  • spatial and temporal interest points
  • moving pixels, etc.
• The paper uses image classification in computer vision as its running example.
Data Preprocessing
• Each image is convolved with a bank of filters: 3 Gaussians, 4 Laplacians of Gaussians and 4 first-order derivatives of Gaussians.
• A grid divides the image into local patches, and each patch is densely sampled to obtain a local descriptor.
• The local descriptors of all patches in the entire image set are clustered using k-means and stored in an auxiliary data structure (let's call it a "Workbook"); a sketch of this pipeline follows below.
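The snippet below is a minimal, hypothetical sketch of this preprocessing step in Python, assuming SciPy and scikit-learn; the filter scales, grid step and codebook size are illustrative choices, not the paper's exact settings.

```python
import numpy as np
from scipy import ndimage
from sklearn.cluster import MiniBatchKMeans

def filter_responses(gray):
    """Stack per-pixel responses of an 11-filter bank: 3 Gaussians,
    4 Laplacians of Gaussians, 4 first-order Gaussian derivatives."""
    responses = []
    for s in (1, 2, 4):                                   # 3 Gaussians
        responses.append(ndimage.gaussian_filter(gray, s))
    for s in (1, 2, 4, 8):                                # 4 LoG filters
        responses.append(ndimage.gaussian_laplace(gray, s))
    for s in (2, 4):                                      # 4 first-order derivatives (x and y)
        responses.append(ndimage.gaussian_filter(gray, s, order=(0, 1)))
        responses.append(ndimage.gaussian_filter(gray, s, order=(1, 0)))
    return np.stack(responses, axis=-1)                   # shape (H, W, 11)

def patch_descriptors(gray, step=8):
    """Densely sample grid patches; descriptor = filter responses at patch centers."""
    resp = filter_responses(gray)
    ys, xs = np.meshgrid(np.arange(0, gray.shape[0], step),
                         np.arange(0, gray.shape[1], step), indexing="ij")
    return resp[ys.ravel(), xs.ravel()], np.column_stack([xs.ravel(), ys.ravel()])

def build_codebook(images, n_words=200):
    """Cluster descriptors from the whole image set into the 'Workbook'."""
    descs = np.vstack([patch_descriptors(im)[0] for im in images])
    return MiniBatchKMeans(n_clusters=n_words, random_state=0).fit(descs)

# Each patch is then represented by the index of its nearest codeword:
# words = codebook.predict(patch_descriptors(image)[0])
```

The visual-word indices produced this way, together with the patch locations, are the inputs to the LDA and SLDA models described next.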
Clustering using LDA
• Framework:
  • M documents (images)
  • Each document j has Nj words
  • wji is the observed value of word i in document j
  • All words are clustered into K topics
  • Each topic k is modeled as a multinomial distribution over the Workbook
  • 𝛼 and β are Dirichlet prior hyperparameters
  • ɸk, ∏j and zji are the hidden variables
Clustering using LDA (contd)
• Generative algorithm:
  • For each topic k, a multinomial parameter ɸk is sampled from the Dirichlet prior: ɸk ~ Dir(β)
  • For each document j, a multinomial parameter ∏j over the K topics is sampled from the Dirichlet prior: ∏j ~ Dir(𝛼)
  • For each word i in document j, a topic label zji is sampled from the discrete distribution zji ~ Discrete(∏j)
  • The value wji of word i in document j is sampled from the discrete distribution of topic zji: wji ~ Discrete(ɸzji)
• zji is sampled through a collapsed Gibbs sampling procedure:
  p(zji = k | rest) ∝ (n(k)-ji,wji + β) / (Σw n(k)-ji,w + Wβ) · (n(j)-ji,k + 𝛼), where W is the size of the Workbook
• n(k)-ji,w is the number of words in the corpus with value w assigned to topic k, excluding word i in document j
• n(j)-ji,k is the number of words in document j assigned to topic k, excluding word i in document j
• A runnable sketch of this sampler is given below.
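As an illustration only, here is a minimal collapsed Gibbs sampler for plain LDA in Python (NumPy assumed); the variable names `n_kw` and `n_jk` mirror the count notation above and are not from the paper.

```python
import numpy as np

def gibbs_lda(docs, W, K, alpha=0.5, beta=0.1, n_iter=200, seed=0):
    """Minimal collapsed Gibbs sampler for LDA.
    docs: list of word-index lists; W: Workbook (vocabulary) size; K: number of topics."""
    rng = np.random.default_rng(seed)
    n_kw = np.zeros((K, W))           # topic-word counts  n(k)_w
    n_jk = np.zeros((len(docs), K))   # document-topic counts  n(j)_k
    n_k = np.zeros(K)                 # total words assigned to each topic
    z = [rng.integers(K, size=len(d)) for d in docs]   # random initial topic labels

    for j, d in enumerate(docs):      # accumulate initial counts
        for i, w in enumerate(d):
            k = z[j][i]
            n_kw[k, w] += 1; n_jk[j, k] += 1; n_k[k] += 1

    for _ in range(n_iter):
        for j, d in enumerate(docs):
            for i, w in enumerate(d):
                k = z[j][i]           # remove word i from the counts (the "-ji" terms)
                n_kw[k, w] -= 1; n_jk[j, k] -= 1; n_k[k] -= 1
                # p(zji = k | rest) ∝ (n_kw + β)/(n_k + Wβ) · (n_jk + α)
                p = (n_kw[:, w] + beta) / (n_k + W * beta) * (n_jk[j] + alpha)
                k = rng.choice(K, p=p / p.sum())
                z[j][i] = k           # add word i back with its new topic
                n_kw[k, w] += 1; n_jk[j, k] += 1; n_k[k] += 1
    return z, n_kw, n_jk
```

In practice the packaged implementations listed at the end of this review are far faster than this pure-Python loop; the sketch only makes the count bookkeeping explicit.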
What’s the issue with LDA?
• The spatial and temporal components of the visual words are not considered, so co-occurrence information is not utilized.
• Consider a set of images of animals with grass as the background. Since the whole image is treated as a single document and the animal occupies only a small part of it, the animal patches would most likely be classified as grass.
How can we resolve it?
• Use a grid layout on each image, and treat each region of the grid as a document.
  • But how would you handle a patch that overlaps two regions?
• We could use overlapping regions as documents.
  • But when a patch falls inside several overlapping documents, how would you decide which one it belongs to?
• So we could represent each document (region) by a point, and assign each patch to the document whose point it is closest to.
Clustering using Spatial LDA
• Framework:
  • Besides the parameters used in LDA, spatial information is also captured.
  • A hidden variable di indicates the document that word i is assigned to.
  • Additionally, for each document j, (g^d_j, x^d_j, y^d_j) denote the image index, x coordinate and y coordinate of the document, respectively.
  • Likewise, for each word i, (gi, xi, yi) denote the image index, x coordinate and y coordinate of the word, respectively.
• Generative algorithm:
  • For each topic k, a multinomial parameter ɸk is sampled from the Dirichlet prior: ɸk ~ Dir(β)
  • For each document j, a multinomial parameter ∏j over the K topics is sampled from the Dirichlet prior: ∏j ~ Dir(𝛼)
  • For each word i, a document assignment di is sampled from the prior p(di | η). (A generative sketch follows below.)
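Purely as an illustration of this generative story, the short sketch below lays document points on a regular grid over one image and draws one word's document, location and topic. The grid spacing, σ and the uniform document prior are assumptions for the example, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layout: document (region) points on a regular grid over one image.
H, W_img, step, sigma = 240, 320, 16, 20.0
doc_xy = np.array([(x, y) for y in range(0, H, step) for x in range(0, W_img, step)])
D, K = len(doc_xy), 10

eta = np.ones(D) / D                           # uniform prior over documents (assumption)
pi = rng.dirichlet(np.full(K, 0.5), size=D)    # one topic mixture per document, ∏_d ~ Dir(alpha)

# Generate one word: pick its document, then its location around that document's point.
d_i = rng.choice(D, p=eta)                     # d_i ~ p(d_i | η)
x_i, y_i = rng.normal(doc_xy[d_i], sigma)      # (x_i, y_i) drawn around the document's point
z_i = rng.choice(K, p=pi[d_i])                 # z_i ~ Discrete(∏_{d_i})
# w_i would then be drawn from Discrete(ɸ_{z_i}) given the topic-word distributions.
```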
Clustering using Spatial LDA (contd)
• Generative algorithm (contd):
  • The image index and location of word i, ci = (gi, xi, yi), are chosen from the distribution p(ci | c^d_di, 𝝈), for which a Gaussian kernel is used: the word must lie in the same image as its document (gi = g^d_di), and p(ci | c^d_di, 𝝈) ∝ exp(−((xi − x^d_di)² + (yi − y^d_di)²) / 2𝝈²)
  • For word i in document di, a topic label zi is sampled from the discrete distribution zi ~ Discrete(∏di)
  • The value wi of word i is sampled from the discrete distribution of topic zi: wi ~ Discrete(ɸzi)
• zi is sampled through a Gibbs sampling procedure analogous to LDA:
  p(zi = k | rest) ∝ (n(k)-i,wi + β) / (Σw n(k)-i,w + Wβ) · (n(di)-i,k + 𝛼)
• n(k)-i,w is the number of words in the corpus with value w assigned to topic k, excluding word i, and n(j)-i,k is the number of words in document j assigned to topic k, excluding word i
• The conditional distribution of di weights each candidate document by the Gaussian kernel and that document’s topic counts:
  p(di = j | rest) ∝ p(ci | c^d_j, 𝝈) · (n(j)-i,zi + 𝛼) / (Σk n(j)-i,k + K𝛼)
• A code sketch of this di update is given below.
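The following minimal Python sketch implements the di update written above, under the stated assumptions (isotropic Gaussian kernel, counts already excluding word i); the function and variable names are hypothetical, not from the paper or any library.

```python
import numpy as np

def sample_di(x_i, y_i, g_i, z_i, doc_xy, doc_img, n_dk, alpha, sigma, rng):
    """One Gibbs update for d_i, the document assignment of word i.
    doc_xy:  (D, 2) array of document (region) point coordinates.
    doc_img: (D,) array of the image index each document belongs to.
    n_dk:    (D, K) document-topic counts, already excluding word i."""
    K = n_dk.shape[1]
    # Gaussian kernel p(c_i | c^d_d, sigma); documents in other images get zero weight.
    sq_dist = ((doc_xy - np.array([x_i, y_i])) ** 2).sum(axis=1)
    kernel = np.exp(-sq_dist / (2.0 * sigma ** 2)) * (doc_img == g_i)
    # Topic term (n(d)_{-i,z_i} + alpha) / (n(d)_{-i,.} + K*alpha)
    topic_term = (n_dk[:, z_i] + alpha) / (n_dk.sum(axis=1) + K * alpha)
    p = kernel * topic_term
    return rng.choice(len(p), p=p / p.sum())

# Example call with placeholder arguments:
# d_i = sample_di(x_i=37.0, y_i=52.0, g_i=0, z_i=3,
#                 doc_xy=doc_xy, doc_img=np.zeros(len(doc_xy), dtype=int),
#                 n_dk=np.zeros((len(doc_xy), 10)), alpha=0.5, sigma=20.0,
#                 rng=np.random.default_rng(0))
```

Alternating this update with the zi update from the previous slide gives the full SLDA Gibbs sweep.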
Results

            cows    cars    faces   bicycles
LDA (D)     0.376   0.555   0.717   0.556
SLDA (D)    0.566   0.684   0.697   0.566
LDA (FA)    0.558   0.396   0.586   0.529
SLDA (FA)   0.033   0.244   0.371   0.422
What the paper missed
• Comparisons with other standard clustering methods could have been included to highlight the efficiency of the algorithm.
• For the given experimental data, an intuition for the selection of the input parameters 𝛼, β and η could have been provided.
• For moving images (video), the temporal aspect of the data is ignored. In future work, this could be incorporated as an additional parameter and the algorithm updated accordingly.
• A few later advancements have built on the paper:
  • James Philbin, Josef Sivic and Andrew Zisserman. Geometric Latent Dirichlet Allocation on a Matching Graph for Large-scale Image Datasets. International Journal of Computer Vision, 95(2):138–153, Nov 2011.
Libraries for LDA
• R - “lda” - http://cran.r-project.org/web/packages/lda/lda.pdf
• Python - lda v1.0.2 - https://pypi.python.org/pypi/lda (a short usage example follows below)
• Java - GibbsLDA - http://gibbslda.sourceforge.net/
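As a quick illustration of the Python package listed above (assuming `pip install lda`), a minimal fit on a bag-of-visual-words count matrix might look like the sketch below; the matrix here is random placeholder data standing in for the Workbook counts.

```python
import numpy as np
import lda  # the PyPI "lda" package (collapsed Gibbs sampling)

# Placeholder document-term matrix: rows = images (documents),
# columns = visual words from the Workbook, entries = integer counts.
X = np.random.randint(0, 5, size=(100, 200))

model = lda.LDA(n_topics=10, n_iter=500, random_state=1)
model.fit(X)                      # fits by collapsed Gibbs sampling
topic_word = model.topic_word_    # (n_topics, n_vocab): estimates of ɸ
doc_topic = model.doc_topic_      # (n_docs, n_topics): estimates of ∏
```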
References
• Xiaogang Wang and Eric Grimson. Spatial Latent Dirichlet Allocation. Advances in Neural Information Processing Systems 20 (NIPS 2007).
• D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet Allocation. Journal of Machine Learning Research, 3:993–1022, 2003.
• Diane J. Hu. Latent Dirichlet Allocation for Text, Images, and Music.