We have built an online movie recommender system based on an analysis of users' rating histories for movies and their demographic information. We used data from the MovieLens website. Collaborative filtering and matrix factorization techniques were used for the implementation. The end result is a web application that recommends the top 20 movies to each user.
Demo Video: http://goo.gl/VgZ2uI
An Online Social Network based
● We have built a Movie Recommender system using
● We are provided with users' ratings for some of the
available movies, movie information, and demographic
information about the users.
● Using the above information and applying collaborative
filtering and matrix factorization techniques, the top 20
movies are recommended to each user.
We first used the cosine similarity method to find similar users and then
recommended movies by applying several heuristics (using age, movie genre,
gender, occupation, etc.).
Although this method gave accurate results, its performance deteriorated
as the user-movie ratings matrix became sparse.
Hence we came up with a better approach, which uses the matrix factorization
technique to predict ratings. This technique can be used in real-world
scenarios, where the utility matrix is often sparse.
The two approaches used will be mentioned in detail in
We have also developed a module which suggests movies
to Facebook users based on the movies they liked and
the movies liked by people in their friends list.
Suggestions are based on movie genres.
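One way to picture the genre-based suggestion described above is the minimal sketch below. It is not the module's actual code: the function name `suggest_by_genre`, the scoring rule (count how often each genre occurs among the liked movies, then score unseen movies by those counts), and the toy data are all assumptions made for illustration.

```python
from collections import Counter

def suggest_by_genre(user_likes, friends_likes, movie_genres, top_n=20):
    """Rank movies the user has not liked yet by genre overlap with liked movies.

    user_likes: list of movie titles liked by the user
    friends_likes: list of lists, one list of liked titles per friend
    movie_genres: dict mapping title -> list of genres (hypothetical catalogue)
    """
    # Count how often each genre occurs among the user's and friends' likes.
    genre_counts = Counter()
    liked = list(user_likes) + [m for likes in friends_likes for m in likes]
    for movie in liked:
        genre_counts.update(movie_genres.get(movie, []))

    # Score every movie the user has not liked yet by its genres' counts.
    seen = set(user_likes)
    scores = {
        movie: sum(genre_counts[g] for g in genres)
        for movie, genres in movie_genres.items()
        if movie not in seen
    }
    # Highest score first; ties broken alphabetically; drop zero-score movies.
    ranked = sorted((m for m in scores if scores[m] > 0),
                    key=lambda m: (-scores[m], m))
    return ranked[:top_n]

# Hypothetical toy catalogue and likes.
movie_genres = {
    "Alien": ["Sci-Fi", "Horror"],
    "Blade Runner": ["Sci-Fi"],
    "Heat": ["Crime"],
    "Up": ["Animation"],
    "The Thing": ["Horror", "Sci-Fi"],
}
suggestions = suggest_by_genre(["Alien"], [["Heat"], ["Blade Runner"]], movie_genres)
print(suggestions)  # ['The Thing', 'Blade Runner', 'Heat']
```

A real module would read the likes from the social network and would probably weight the user's own likes more heavily than those of friends; both choices are left out here.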
Approach A: Using Collaborative filtering and
● In this approach, top movies are recommended to users
by finding similar users using cosine similarity
and the users' demographic information, and
then applying several heuristics.
● Such an approach makes the results explainable,
but its performance decreases when the matrix data gets
sparse.
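The similar-user step of Approach A can be sketched as follows. This is a minimal illustration, not the project's code. It assumes ratings are stored as per-user dictionaries, treats unrated movies as 0, and ranks unseen movies by a similarity-weighted sum of the neighbours' ratings. The demographic heuristics (age, gender, occupation, genre) are left out, and the names `cosine_similarity`, `recommend`, and the toy data are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two users' rating dicts (movie_id -> rating).

    Unrated movies count as 0, so only co-rated movies add to the dot product.
    """
    common = set(a) & set(b)
    dot = sum(a[m] * b[m] for m in common)
    norm_a = math.sqrt(sum(r * r for r in a.values()))
    norm_b = math.sqrt(sum(r * r for r in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def recommend(target, ratings, k=2, top_n=20):
    """Recommend movies the target has not rated, using the k most similar users."""
    neighbours = sorted(
        ((cosine_similarity(ratings[target], r), user)
         for user, r in ratings.items() if user != target),
        reverse=True,
    )[:k]
    scores = {}
    for sim, user in neighbours:
        if sim <= 0:
            continue  # ignore users with no overlap
        for movie, rating in ratings[user].items():
            if movie not in ratings[target]:
                # Similarity-weighted sum of the neighbours' ratings.
                scores[movie] = scores.get(movie, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical toy data: user -> {movie_id: rating on a 1-5 scale}.
ratings = {
    "alice": {"m1": 5, "m2": 4},
    "bob":   {"m1": 5, "m2": 5, "m3": 4},
    "carol": {"m1": 1, "m4": 5},
}
print(recommend("alice", ratings))  # ['m3', 'm4']
```

The sketch also shows where sparsity hurts: when two users share few or no rated movies, the dot product runs over a tiny set and the similarity carries little information.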
Approach B: Using Matrix Factorization for
● Our second approach uses the matrix factorization method of collaborative
filtering for rating prediction and ranking.
● SVDFeature is used to implement it. SVDFeature is a
machine learning toolkit for feature-based collaborative filtering.
● The feature-based setting allows us to build factorization models that
incorporate side information.
● SVDFeature learns a feature-based matrix factorization model from the
given training data and makes predictions on supplied test feature files.
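To make the idea behind Approach B concrete, here is a minimal sketch of plain matrix factorization trained with stochastic gradient descent. It is a stand-in for illustration and does not use SVDFeature or its feature file format. The function names, the hyperparameters (`n_factors`, `lr`, `reg`, `rounds`), and the toy ratings are all assumptions.

```python
import random

def predict(model, u, i):
    """Predicted rating: global mean plus the dot product of the factor vectors."""
    mean, P, Q = model
    return mean + sum(pu * qi for pu, qi in zip(P[u], Q[i]))

def train_mf(ratings, n_users, n_items, n_factors=2, lr=0.02, reg=0.02,
             rounds=300, seed=0):
    """Fit user factors P and item factors Q on observed (user, item, rating) triples."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    mean = sum(r for _, _, r in ratings) / len(ratings)
    model = (mean, P, Q)  # P and Q are updated in place below
    for _ in range(rounds):
        for u, i, r in ratings:
            err = r - predict(model, u, i)
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on the squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return model

def top_n(model, u, rated_items, n_items, n=20):
    """Rank the items user u has not rated by predicted rating, best first."""
    candidates = [i for i in range(n_items) if i not in rated_items]
    return sorted(candidates, key=lambda i: predict(model, u, i), reverse=True)[:n]

# Hypothetical toy data: (user, item, rating); the pair (1, 2) is unobserved.
train = [(0, 0, 5), (0, 1, 1), (0, 2, 5), (1, 0, 5), (1, 1, 1),
         (2, 0, 1), (2, 1, 5), (2, 2, 1)]
model = train_mf(train, n_users=3, n_items=3)
print(predict(model, 1, 2))                           # estimate for the unobserved pair
print(top_n(model, 1, rated_items={0, 1}, n_items=3))
```

Only observed ratings enter the training loop, which is why this family of methods copes with a sparse utility matrix: missing entries are predicted from the learned factors and never filled in with zeros.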
Evaluation (Using Matrix Factorization)
Evaluation is based on the RMSE (root mean squared error) value computed after each model is generated.
Ideally, the RMSE value should be zero (0); lower values mean more accurate predictions.
Two disjoint chunks of data were taken from the dataset: one for training
and the other for testing. Training was run for 40 rounds. In the first
round the RMSE value was 1.265039. It improved over the following rounds, and
by the 40th round it was 0.932842.
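The RMSE used in this evaluation is the square root of the mean squared difference between predicted and actual ratings on the test set. A minimal sketch follows; the function name and the sample numbers are made up for illustration and are not taken from the experiment.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists of ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical predictions and true ratings for three test entries.
example = rmse([3.5, 4.0, 2.0], [4, 4, 1])
print(round(example, 4))  # 0.6455
```

An RMSE of 0.93 on a 1-5 rating scale therefore means the predictions are off by a little under one star in the root-mean-square sense.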
Approach A has two advantages: its results are explainable, which is an
important aspect of recommendation systems, and new data can be added easily.
Its disadvantage is that performance decreases when the data gets sparse,
which is frequent with web-related items. This limits scalability and causes
problems with large datasets. The matrix factorization method overcomes this:
it handles sparsity better than Approach A, scales better to large datasets,
and improves prediction performance, all at the cost of expensive model
building.