Recommender Approaches
• Collaborative Filtering – Item-Item similarity (You like Godfather so you will like Scarface - Netflix)
• Collaborative Filtering – User-User similarity (People like you who bought beer also bought diapers - Target)
• Attribute-based recommendations (You like action movies starring Clint Eastwood, you might like “Good, Bad and the Ugly” - Netflix)
• Social + Interest Graph Based (Your friends like Lady Gaga so you will like Lady Gaga; PYMK - Facebook, LinkedIn)
• Item Hierarchy (You bought Printer, you will also need ink - BestBuy)
• Model Based – Training SVM, LDA, SVD for implicit features
Other/Model-based Approaches
• Slope One recommender
• Latent factor models for Web data
  – Matrix factorization using SVD, ALS, with regularization
  – LDA, SVM, Bayesian clustering
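To make the Slope One bullet concrete, here is a minimal sketch of the weighted Slope One scheme: for each item the user has already rated, it takes the average rating difference between the target item and that item over all users who rated both, then averages the resulting estimates weighted by how many users contributed. The class name, method names, user names and ratings are all made up for illustration; none of them come from the slides or from any library.

```java
import java.util.HashMap;
import java.util.Map;

class SlopeOneSketch {
    /** Weighted Slope One estimate of `user`'s rating for `target`; NaN if no item pair overlaps. */
    static double predict(Map<String, Map<String, Double>> ratings, String user, String target) {
        double numerator = 0.0;
        int totalCount = 0;
        for (Map.Entry<String, Double> rated : ratings.get(user).entrySet()) {
            String other = rated.getKey();
            if (other.equals(target)) continue;
            // Average deviation (target - other) over users who rated both items.
            double diffSum = 0.0;
            int count = 0;
            for (Map<String, Double> row : ratings.values()) {
                if (row.containsKey(target) && row.containsKey(other)) {
                    diffSum += row.get(target) - row.get(other);
                    count++;
                }
            }
            if (count == 0) continue;
            // Shift the user's own rating by the deviation, weighted by the support count.
            numerator += (rated.getValue() + diffSum / count) * count;
            totalCount += count;
        }
        return totalCount == 0 ? Double.NaN : numerator / totalCount;
    }

    /** Hypothetical toy data: user -> (item -> rating). */
    static Map<String, Map<String, Double>> exampleRatings() {
        Map<String, Double> john = new HashMap<>();
        john.put("A", 5.0); john.put("B", 3.0); john.put("C", 2.0);
        Map<String, Double> mark = new HashMap<>();
        mark.put("A", 3.0); mark.put("B", 4.0);
        Map<String, Double> lucy = new HashMap<>();
        lucy.put("B", 2.0); lucy.put("C", 5.0);
        Map<String, Map<String, Double>> ratings = new HashMap<>();
        ratings.put("John", john);
        ratings.put("Mark", mark);
        ratings.put("Lucy", lucy);
        return ratings;
    }

    public static void main(String[] args) {
        System.out.println(predict(exampleRatings(), "Lucy", "A"));
    }
}
```

On the toy data, Lucy's estimate for item A combines two deviations: A vs. B (average +0.5 from two users) and A vs. C (+3 from one user), giving (2.5 × 2 + 8 × 1) / 3 ≈ 4.33. A production version would precompute the deviation table once instead of rescanning all users per prediction.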
General Steps
• Problem definition (user-based, item-based, ratings/binary…)
• Data Prep
  – Map-Reduce, cleansing, massaging data (input matrix)
  – Training Set, Validation Set
• Normalize
  – Bias removal: Z-score, mean-centering, log
• Similarity weights/Neighbors
  – Pearson Correlation Coefficient
  – Cosine Similarity
  – K-nearest neighbor
• Train
  – Training model (only in model-based approaches)
• Predict
  – Predict missing ratings
  – Top-N predictions for every user
• Denormalize
  – Reverse of normalization
• Evaluate Accuracy
  – Accuracy, Precision, Recall, F1, ROC
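The normalize, similarity, predict and denormalize steps can be sketched for a user-based neighborhood model as follows. This is only one way to wire the steps together, under these assumptions: ratings sit in a dense `double[][]` with `NaN` for missing entries, normalization is mean-centering, similarity is Pearson correlation over co-rated items, and only positively correlated neighbors are used. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class NeighborhoodSketch {
    /** Mean of the observed (non-NaN) ratings in one user's row. */
    static double mean(double[] row) {
        double sum = 0.0;
        int n = 0;
        for (double r : row) {
            if (!Double.isNaN(r)) { sum += r; n++; }
        }
        return n == 0 ? 0.0 : sum / n;
    }

    /** Pearson correlation over the items both users rated; 0 if undefined. */
    static double pearson(double[] a, double[] b) {
        double sa = 0.0, sb = 0.0;
        int n = 0;
        for (int i = 0; i < a.length; i++) {
            if (!Double.isNaN(a[i]) && !Double.isNaN(b[i])) { sa += a[i]; sb += b[i]; n++; }
        }
        if (n < 2) return 0.0;
        double ma = sa / n, mb = sb / n, num = 0.0, da = 0.0, db = 0.0;
        for (int i = 0; i < a.length; i++) {
            if (Double.isNaN(a[i]) || Double.isNaN(b[i])) continue;
            num += (a[i] - ma) * (b[i] - mb);
            da += (a[i] - ma) * (a[i] - ma);
            db += (b[i] - mb) * (b[i] - mb);
        }
        double denom = Math.sqrt(da) * Math.sqrt(db);
        return denom == 0.0 ? 0.0 : num / denom;
    }

    /** Predict ratings[user][item] from the k most similar users who rated the item. */
    static double predict(double[][] ratings, int user, int item, int k) {
        List<double[]> candidates = new ArrayList<>(); // each entry: {similarity, mean-centered rating}
        for (int v = 0; v < ratings.length; v++) {
            if (v == user || Double.isNaN(ratings[v][item])) continue;
            double sim = pearson(ratings[user], ratings[v]);
            if (sim > 0) {
                // Normalize: remove the neighbor's rating bias by mean-centering.
                candidates.add(new double[] {sim, ratings[v][item] - mean(ratings[v])});
            }
        }
        candidates.sort((x, y) -> Double.compare(y[0], x[0])); // most similar first
        double num = 0.0, den = 0.0;
        for (int n = 0; n < Math.min(k, candidates.size()); n++) {
            num += candidates.get(n)[0] * candidates.get(n)[1];
            den += candidates.get(n)[0];
        }
        double base = mean(ratings[user]);
        // Denormalize: add the target user's own mean back.
        return den == 0.0 ? base : base + num / den;
    }
}
```

Swapping `pearson` for a cosine similarity, or mean-centering for a Z-score, changes only the two helper methods; the predict/denormalize structure stays the same.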
Challenges
• Dimensionality reduction (e.g. use PCA)
• Input data sparsity (and the related cold start problem)
• Overfitting to training data set (use regularization)
• Data wrangling, in general…
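As an illustration of the regularization point, the sketch below trains a small latent factor model with stochastic gradient descent and an L2 penalty `lambda` that shrinks the factors on every update, which is one common way to limit overfitting on sparse data. It is a toy under stated assumptions (dense array with `NaN` for missing ratings, hypothetical names, hand-picked learning rate), not the method of any library listed in this deck.

```java
import java.util.Random;

class RegularizedMfSketch {
    static double dot(double[] a, double[] b) {
        double s = 0.0;
        for (int f = 0; f < a.length; f++) s += a[f] * b[f];
        return s;
    }

    /** Root mean squared error over the observed (non-NaN) entries of r. */
    static double rmse(double[][] r, double[][] p, double[][] q) {
        double sum = 0.0;
        int n = 0;
        for (int u = 0; u < r.length; u++) {
            for (int i = 0; i < r[u].length; i++) {
                if (Double.isNaN(r[u][i])) continue;
                double e = r[u][i] - dot(p[u], q[i]);
                sum += e * e;
                n++;
            }
        }
        return Math.sqrt(sum / n);
    }

    /** Updates user factors p and item factors q in place with L2-regularized SGD. */
    static void train(double[][] r, double[][] p, double[][] q,
                      int epochs, double learningRate, double lambda) {
        for (int epoch = 0; epoch < epochs; epoch++) {
            for (int u = 0; u < r.length; u++) {
                for (int i = 0; i < r[u].length; i++) {
                    if (Double.isNaN(r[u][i])) continue;
                    double err = r[u][i] - dot(p[u], q[i]);
                    for (int f = 0; f < p[u].length; f++) {
                        double pu = p[u][f], qi = q[i][f];
                        // Gradient step on squared error plus the lambda * factor penalty.
                        p[u][f] += learningRate * (err * qi - lambda * pu);
                        q[i][f] += learningRate * (err * pu - lambda * qi);
                    }
                }
            }
        }
    }

    /** Small positive random initial factors; the seed keeps runs repeatable. */
    static double[][] randomFactors(int rows, int factors, long seed) {
        Random rnd = new Random(seed);
        double[][] m = new double[rows][factors];
        for (int i = 0; i < rows; i++) {
            for (int f = 0; f < factors; f++) m[i][f] = 0.1 * rnd.nextDouble();
        }
        return m;
    }
}
```

In practice `lambda` and the number of factors are chosen on the validation set: a larger `lambda` trades training error for smaller factors and usually better generalization.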
Just How Good is your Recommender?
• Evaluation of predicted ratings (Mean Absolute Error, Root Mean Squared Error)
• Evaluation of top-N recommendations
  – Mean Absolute Error
  – Accuracy
  – Precision & Recall (F1 score)
  – ROC curve
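The metrics above are short enough to write out directly. The sketch below computes MAE and RMSE for predicted ratings, and precision, recall and F1 for a top-N list against a set of items the user actually liked. Class and method names are hypothetical, and "relevant" is assumed to be given as a plain set of item IDs.

```java
import java.util.List;
import java.util.Set;

class EvaluationSketch {
    /** Mean Absolute Error between predicted and actual ratings. */
    static double mae(double[] predicted, double[] actual) {
        double sum = 0.0;
        for (int i = 0; i < predicted.length; i++) sum += Math.abs(predicted[i] - actual[i]);
        return sum / predicted.length;
    }

    /** Root Mean Squared Error; penalizes large misses more heavily than MAE. */
    static double rmse(double[] predicted, double[] actual) {
        double sum = 0.0;
        for (int i = 0; i < predicted.length; i++) {
            double e = predicted[i] - actual[i];
            sum += e * e;
        }
        return Math.sqrt(sum / predicted.length);
    }

    /** Number of recommended items that are in the relevant set. */
    static int hits(List<String> recommended, Set<String> relevant) {
        int h = 0;
        for (String item : recommended) if (relevant.contains(item)) h++;
        return h;
    }

    /** Fraction of the recommended items that are relevant. */
    static double precision(List<String> recommended, Set<String> relevant) {
        return recommended.isEmpty() ? 0.0 : (double) hits(recommended, relevant) / recommended.size();
    }

    /** Fraction of the relevant items that were recommended. */
    static double recall(List<String> recommended, Set<String> relevant) {
        return relevant.isEmpty() ? 0.0 : (double) hits(recommended, relevant) / relevant.size();
    }

    /** Harmonic mean of precision and recall. */
    static double f1(double precision, double recall) {
        return precision + recall == 0.0 ? 0.0 : 2 * precision * recall / (precision + recall);
    }
}
```

For example, recommending four items of which two are among a user's three relevant items gives precision 0.5, recall 2/3 and F1 = 4/7.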
Open Source Tools
(Software – Description – Language – URL)
• Apache Mahout – Hadoop ML library that includes Collaborative Filtering – Java – http://mahout.apache.org/
• Cofi – Collaborative Filtering Library – Java – http://www.nongnu.org/cofi/
• Crab – Components to create recommender systems – Python – https://github.com/muricoca/crab
• easyrec – Recommender for web pages – Java – http://easyrec.org/
• LensKit – Collaborative Filtering algorithms from GroupLens Research – Java – http://lenskit.grouplens.org/
• MyMediaLite – Recommender system algorithms – C#/Mono – http://mloss.org/software/view/282/
• SVDFeature – Toolkit for Feature based Matrix Factorization – C++ – http://mloss.org/software/view/333/
• Vogoo PHP LIB – Collaborative Filtering for personalized web sites – PHP – http://sourceforge.net/projects/vogoo/
• recommenderlab – R library for developing and testing collaborative filtering systems – R – http://cran.r-project.org/web/packages/recommenderlab/index.html
• Scikit-learn – Python module integrating classic ML algorithms in scientific Python packages (numpy, scipy, matplotlib) – Python – http://scikit-learn.org/stable/
Mahout
    DataModel model = new FileDataModel(new File("data.txt"));
    // Construct the list of pre-computed correlations
    Collection<GenericItemSimilarity.ItemItemSimilarity> correlations = ...;
    ItemSimilarity itemSimilarity = new GenericItemSimilarity(correlations);
    Recommender recommender = new GenericItemBasedRecommender(model, itemSimilarity);
    // Wrap the recommender so repeated requests are served from a cache
    Recommender cachingRecommender = new CachingRecommender(recommender);
    ...
    // Ask for the top 10 recommendations for user ID 1234
    List<RecommendedItem> recommendations = cachingRecommender.recommend(1234, 10);
References & Reading
• High Level Reading
  – Programming Collective Intelligence by Toby Segaran. The 2nd chapter gives a good introduction to collaborative filtering with Python examples (non-SVD).
  – Matrix Factorization Techniques for Recommender Systems, Yehuda Koren; Robert Bell; Chris Volinsky, IEEE Computer, 2009, 8
• Singular Value Decomposition (SVD) Reading
  – The Singular Value Decomposition, by Jody Hourigan and Lynn McIndoo, Linear Algebra – Math 45, with Matlab & image examples. http://online.redwoods.edu/INSTRUCT/darnold/LAPROJ/Fall98/JodLynn/report2.pdf
  – Numerical Recipes, 3rd Edition, Press et al., 2007, pp. 65-75.
References & Reading (continued)
• Collaborative Filtering Reading
  – See papers on research.yahoo.com/Yehuda_Koren
  – Collaborative Filtering for Implicit Feedback Datasets, Yifan Hu; Yehuda Koren; Chris Volinsky, IEEE International Conference on Data Mining (ICDM 2008), IEEE, 2008
  – Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering Model, Yehuda Koren, ACM Int. Conference on Knowledge Discovery and Data Mining (KDD’08), 2008
  – Collaborative Filtering with Temporal Dynamics, Yehuda Koren, KDD 2009, ACM, 2009
  – James Thornton’s CF Blog http://original.jamesthornton.com/cf/
  – Apache Mahout Recommender https://cwiki.apache.org/MAHOUT/recommender-documentation.html
  – Flexible Collaborative Filtering In Java With Mahout Taste - Philippe Adjiman
  – Books, Articles and Tutorials on Mahout/Cofi