Comparing topic models for a movie recommendation system (WEBIST 2014)
1. Sonia Bergamaschi, Laura Po and Serena Sorrentino
Department of Engineering “Enzo Ferrari”, University of Modena and Reggio Emilia, Italy
11. Data collections and record counts (grouped on the slide under Movie, Cast&Crew and Person):
IMDB Movie Collection: 1,861,736
IMDB Personality Collection: 3,165,235
IMDB Cast Collection: 24,662,392
TMDB Film Collection: 20,861
TMDB Person Collection: 234,986
TMDB Production Collection: 225,494
English Dbpedia Movie Collection: 164,508
English Dbpedia Crew Collection: 6,102
English Dbpedia Actor Collection: 6,151
German Dbpedia Movie Collection: 164,508
German Dbpedia Crew Collection: 866
German Dbpedia Actor Collection: 1,039
12. 1. Plot Vectorization
2. Weights Computation
3. Matrix Reduction by using Topic Models
4. Movie Similarity Computation
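The four steps above can be sketched on a toy example. This is only an illustration under stated assumptions, not the authors' implementation: the three plots are invented, tf-idf is assumed as the weighting scheme, cosine similarity is assumed as the similarity measure, and step 3 (matrix reduction with a topic model) is skipped so that the sketch needs nothing beyond the Python standard library. The names `vectorize`, `tf_idf`, `cosine` and `recommend` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy plots; the slides report about 204,000 real IMDb plots.
PLOTS = {
    "Space Saga I": "a young pilot joins the rebellion against the empire",
    "Space Saga II": "the pilot trains while the empire hunts the rebellion",
    "Garden Romance": "two gardeners fall in love over a rose garden",
}

def vectorize(plot):
    """Step 1: turn a plot into a bag of keywords (raw term counts)."""
    return Counter(plot.lower().split())

def tf_idf(counts_by_movie):
    """Step 2: weight keywords with tf-idf (an assumed weighting scheme)."""
    n_docs = len(counts_by_movie)
    doc_freq = Counter()
    for counts in counts_by_movie.values():
        doc_freq.update(counts.keys())
    weighted = {}
    for title, counts in counts_by_movie.items():
        total = sum(counts.values())
        weighted[title] = {
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        }
    return weighted

# Step 3 (reduction with LSA or LDA) is omitted in this sketch.

def cosine(u, v):
    """Step 4: cosine similarity between two sparse keyword vectors."""
    dot = sum(w * v.get(word, 0.0) for word, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def recommend(target, vectors, top_n=6):
    """Return the top_n titles most similar to the target movie."""
    scores = [
        (cosine(vectors[target], vec), title)
        for title, vec in vectors.items()
        if title != target
    ]
    scores.sort(reverse=True)
    return [title for _, title in scores[:top_n]]

VECTORS = tf_idf({title: vectorize(plot) for title, plot in PLOTS.items()})
```

Calling `recommend("Space Saga I", VECTORS, top_n=1)` ranks the other saga entry first, because the two plots share several weighted keywords. In a full pipeline the same similarity would be computed on the reduced document-by-topic vectors rather than on raw keyword vectors.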
16. LSA: T = DT x S x K
T: Document by Keyword Matrix (d x k)
DT: Document by Topic Matrix (d x z)
S: Topic by Topic Matrix (z x z)
K: Topic by Keyword Matrix (z x k)
Reduction: 204,000 plots x 220,000 keywords to 204,000 plots x 500 topics
LDA: P(k|d) = P(z|d) x P(k|z)
P(k|d): Document distribution over Keywords (d x k)
P(z|d): Document distribution over Topics (d x z)
P(k|z): Topic distribution over Keywords (z x k)
Reduction: 204,000 plots x 220,000 keywords to 204,000 plots x 50 topics
Tested on the IMDb database: of about 1.8 million multimedia items, only 204,000 have a plot available.
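The two factorizations on this slide can be written out as equations. The notation below reuses the slide's symbols (T, DT, S, K, d, k, z); the topic index j and the approximation sign are additions, on the assumption that LSA keeps only the z largest singular values of a truncated SVD.

```latex
% LSA: truncated SVD of the document-by-keyword matrix, keeping z topics
T_{d \times k} \;\approx\; DT_{d \times z} \, S_{z \times z} \, K_{z \times k}

% LDA: each document's keyword distribution is a mixture over z topics
P(k \mid d) \;=\; \sum_{j=1}^{z} P(z_j \mid d) \, P(k \mid z_j)
```

In both cases each plot is then represented by its row in the d x z matrix (DT for LSA, P(z|d) for LDA) instead of its row in the much wider d x k matrix.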
17. LSA selects plots that are more closely related to the themes of the target movie's plot.
19. • 20 users
• 18 movies
• the top 6 recommendations from both LSA and LDA
• 594 evaluations collected
20. LDA does not perform well on movie recommendations: it is unable to suggest movies of the same saga, and it suggests erroneous entries for movies that have a short plot.
LSA performs well on movie recommendations: it is able to suggest movies of the same saga as well as unknown movies related to the target one.
21. • 30 users
• 18 movies
• the top 6 recommendations from both LDA and IMDb
• 146 evaluations collected