Collaborative Filtering using KNN


Rating Prediction System Using Collaborative Filtering and K-Nearest Neighbour Algorithm

Published in: Technology, Education

  2. Recommender Systems
     • Software tools and techniques providing suggestions for items to be of use to a user
     • Recommender systems analyze patterns of user interest in items or products to provide personalized recommendations of items that will suit a user’s taste
     • Item: what the system recommends to the user (CDs, news, books, movies...)
     • User preferences: ratings for products
     • User actions: user browsing history
  3. RS Techniques
     • Collaborative-filtering system: recommends to the active user the items that other users with similar tastes liked in the past
     • Content-based system: recommends items that are similar to the ones that the user liked in the past
     • Hybrid collaborative filtering
     • Tagging: recommends items using tags assigned by different users
  4. Collaborative Filtering
     • Tries to predict the opinion a user will have of the different items, and to recommend the “best” items to each user, based on the user’s previous likings and the opinions of other like-minded users.
  5. Collaborative Filtering
     • The task of a CF algorithm is to find item likeliness, which takes two forms:
       Prediction: a numerical value expressing the predicted likeliness of an item for the active user
       Recommendation: a list of N items that the active user will like the most
  6. K Nearest Neighbour Algorithm
     • A distance measure is needed to determine the “closeness” of instances
     • Classify an instance by finding its nearest neighbors and picking the most popular class among the neighbors
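
The two steps on the K Nearest Neighbour slide (measure closeness, then take the most popular class among the neighbors) can be sketched in a few lines of Python. This is only an illustration: the slides do not say which distance measure or value of k was used, so Euclidean distance, k = 3, the function names, and the toy training points are all assumptions.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(query, examples, k=3):
    """Return the most common label among the k examples closest to `query`.

    `examples` is a list of (feature_vector, label) pairs.
    """
    # Sort all training examples by their distance to the query and keep the k closest.
    neighbours = sorted(examples, key=lambda ex: euclidean(query, ex[0]))[:k]
    # Majority vote over the labels of those neighbours.
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Made-up toy data: two clusters labelled "A" and "B".
training = [
    ((1.0, 1.0), "A"),
    ((1.2, 0.8), "A"),
    ((1.1, 1.3), "A"),
    ((4.0, 4.2), "B"),
    ((3.8, 4.0), "B"),
]

print(knn_classify((1.0, 1.2), training, k=3))
print(knn_classify((4.0, 4.0), training, k=3))
```

Note that every query scans the full training list, which is the time and storage cost of KNN.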
  7. Rating Prediction

     User    MegaMind  Toy Story  Despicable Me  Lion King  Kung Fu Panda
     Zeynep  4         5          3              2          4
     Funda   3         3          2              3          5
     Pınar   3         3          4              2          3
     Gülten  4         4          5              4          5
     Yağız   4         5          ?              4          5
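
As a rough illustration of how the missing rating on the Rating Prediction slide could be filled in, the sketch below finds the users whose ratings are closest to Yağız's on the movies they have both rated and averages their ratings for Despicable Me. The slides do not state the similarity function or k that the application used, so the mean absolute difference, the value of k, and the function names here are assumptions, not the author's implementation.

```python
# Ratings copied from the Rating Prediction slide; Yağız has not rated Despicable Me.
ratings = {
    "Zeynep": {"MegaMind": 4, "Toy Story": 5, "Despicable Me": 3, "Lion King": 2, "Kung Fu Panda": 4},
    "Funda": {"MegaMind": 3, "Toy Story": 3, "Despicable Me": 2, "Lion King": 3, "Kung Fu Panda": 5},
    "Pınar": {"MegaMind": 3, "Toy Story": 3, "Despicable Me": 4, "Lion King": 2, "Kung Fu Panda": 3},
    "Gülten": {"MegaMind": 4, "Toy Story": 4, "Despicable Me": 5, "Lion King": 4, "Kung Fu Panda": 5},
    "Yağız": {"MegaMind": 4, "Toy Story": 5, "Lion King": 4, "Kung Fu Panda": 5},
}


def mean_abs_difference(a, b):
    """Average absolute rating gap over the movies both users rated (assumed measure)."""
    shared = set(a) & set(b)
    if not shared:
        return float("inf")  # no overlap: treat as maximally distant
    return sum(abs(a[m] - b[m]) for m in shared) / len(shared)


def predict_rating(ratings, user, movie, k=2):
    """Average the `movie` ratings of the k users most similar to `user`."""
    candidates = [
        (mean_abs_difference(ratings[user], other), name)
        for name, other in ratings.items()
        if name != user and movie in other  # only users who rated the target movie
    ]
    nearest = sorted(candidates)[:k]
    if not nearest:
        return None
    return sum(ratings[name][movie] for _, name in nearest) / len(nearest)


print(predict_rating(ratings, "Yağız", "Despicable Me", k=1))
print(predict_rating(ratings, "Yağız", "Despicable Me", k=2))
```

Under these assumptions the closest user is Gülten (average gap 0.25), followed by Zeynep (0.75), so k = 1 predicts 5.0 and k = 2 predicts 4.0. The choice of k and of the distance measure visibly changes the prediction.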
  8. Application
     • MovieLens Database (1M): 3,883 movies, 6,040 users, 1,000,209 ratings
     • Technologies: ASP.NET 4.0, MS SQL Server 2008
  9. Rating Prediction Database Diagram
     • Movies: MovieID, Title, Genre
     • Ratings: ID, UserID, MovieID, Rating, Timestamp
     • Users: UserID, Gender, Age, Occupation, ZipCode
     • Age: Id, Description
     • Occupation: Id, Description
     • Predictions: ID, UserID, MostSimilarUserID, Difference, TimeElapsed, MovieID, PredictedRating, ActualRating
  10. Error Measurement
      • Mean Square Error (MSE) = 0.975
      • Mean Absolute Error (MAE) = 0.679
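
For reference, both error measures compare each predicted rating with the actual rating: MSE averages the squared differences, while MAE averages the absolute differences. The sketch below shows the arithmetic on a handful of made-up ratings. The sample values and function names are illustrative assumptions and are unrelated to the 0.975 and 0.679 figures reported on the slide.

```python
def mean_squared_error(actual, predicted):
    """Average of squared differences between actual and predicted ratings."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_error(actual, predicted):
    """Average of absolute differences between actual and predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up sample: four actual ratings and the corresponding predictions.
actual = [4, 3, 5, 2]
predicted = [5, 3, 4, 4]

print("MSE:", mean_squared_error(actual, predicted))
print("MAE:", mean_absolute_error(actual, predicted))
```

Because MSE squares each error, the single two-star miss in this sample weighs more heavily in the MSE (1.5) than in the MAE (1.0).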
  11. Demo
  12. KNN Algorithm: Pros and Cons
      Pros
      • Simple to implement and use
      • Comprehensible: predictions are easy to explain
      • Robust to noisy data, because it averages over the k nearest neighbors
      Cons
      • Cold-start problem
      • Storage: all training examples are saved in memory
      • Time: to classify x, you need to loop over all training examples (x’, y’) to compute the distance between x and x’
  13. Conclusion
      • Recommendation and personalization are important approaches to combating information overload.
      • Machine learning is an important part of systems for these tasks.
      • Collaborative filtering has its own problems.
      • Better results could be achieved by using content, tags, and better-optimized similarity functions.
  14. Thank you