15. +
What is LDA?
Latent Dirichlet Allocation (LDA) is a topic model that generates topics based on word
frequency from a set of documents.
LDA is particularly useful for finding reasonably accurate mixtures of topics within a
given document set.
16. +
What are topics?
LDA is not given topics!
LDA infers topics from raw text; each topic is a distribution over words.
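To make "a topic is a distribution over words" concrete, here is a minimal sketch of LDA fitted by collapsed Gibbs sampling in plain Python. The function name `fit_lda`, the hyperparameter values, and the toy documents are all assumptions chosen for illustration; they are not taken from these slides, and a real project would normally use an established library instead.

```python
import random

def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Illustrative collapsed Gibbs sampler for LDA on tokenized documents."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]        # tokens in doc d assigned to topic k
    topic_word = [[0] * V for _ in range(num_topics)]   # times word v is assigned to topic k
    topic_total = [0] * num_topics                      # total tokens assigned to topic k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    # Resample each token's topic given all other assignments.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta) / (topic_total[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # phi[k] is topic k as a distribution over the vocabulary;
    # theta[d] is document d as a mixture over topics.
    phi = [[(topic_word[k][v] + beta) / (topic_total[k] + V * beta)
            for v in range(V)] for k in range(num_topics)]
    theta = [[(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
              for k in range(num_topics)] for d, doc in enumerate(docs)]
    return vocab, phi, theta

# Toy corpus (made up for this sketch); no topic labels are supplied.
docs = [
    "game team score team win".split(),
    "team player game season".split(),
    "orbit rocket launch orbit".split(),
    "rocket launch shuttle orbit".split(),
]
vocab, phi, theta = fit_lda(docs, num_topics=2)
for k, row in enumerate(phi):
    top = sorted(zip(row, vocab), reverse=True)[:3]
    print(k, [w for _, w in top])
```

Each row of `phi` sums to 1 over the vocabulary, and each row of `theta` sums to 1 over topics. The sampler only ever sees word counts, so which topic index ends up holding which words depends on the random seed.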
17. +
Example
This data set consists of 20,000 messages taken from 20 newsgroups.
The articles are typical postings and thus have headers including subject lines,
signature files, and quoted portions of other articles.
Example topics: sports, space exploration, computers
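Because the postings carry headers, signature files, and quoted text, it is common to strip these before counting words so that they do not dominate the inferred topics. The sketch below is one possible cleanup step, not the procedure used for these slides. It assumes that headers end at the first blank line, that a signature starts at a line containing only "-- ", and that quoted lines begin with ">". The function name `strip_posting` and the sample message are hypothetical.

```python
def strip_posting(raw):
    """Return the body of a newsgroup-style posting without headers,
    quoted lines, or a trailing signature (under the assumptions above)."""
    # Headers are assumed to end at the first blank line.
    _, _, body = raw.partition("\n\n")
    kept = []
    for line in body.splitlines():
        # Assumed signature delimiter: a line holding only "-- ".
        if line.rstrip() == "--":
            break
        # Assumed quoting style: lines that start with ">".
        if line.lstrip().startswith(">"):
            continue
        kept.append(line)
    return "\n".join(kept).strip()

# Made-up sample posting for illustration.
raw = (
    "From: alice@example.com\n"
    "Subject: Shuttle launch\n"
    "\n"
    "> When is the launch?\n"
    "The launch is on Friday.\n"
    "-- \n"
    "Alice\n"
)
cleaned = strip_posting(raw)
print(cleaned)
```

Real postings vary in how they mark quotes and signatures, so these heuristics would need checking against the actual data.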