ACM Data Mining Hackathon
8/18/2012

Recommender Systems
Navisro Analytics
@navisro
info@navisro.com
http://www.navisro.com
Capturing the Long Tail…
Recommender Approaches
• Attribute-based recommendations (You like action movies starring Clint Eastwood, so you might like “The Good, the Bad and the Ugly” - Netflix)
• Item Hierarchy (You bought a printer, so you will also need ink - BestBuy)
• Collaborative Filtering – Item-Item similarity (You like Godfather so you will like Scarface - Netflix)
• Collaborative Filtering – User-User similarity (People like you who bought beer also bought diapers - Target)
• Social+Interest Graph Based (Your friends like Lady Gaga so you will like Lady Gaga; PYMK – Facebook, LinkedIn)
• Model Based (Training SVM, LDA, SVD for implicit features)
Other/Model-based Approaches
• Slope one recommender
• Latent factor Models for Web Data
  – Matrix factorization using SVD, ALS,
    with Regularization
  – LDA, SVM, Bayesian Clustering
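As an illustration of the first bullet, here is a minimal sketch of a weighted Slope One recommender in plain Python. The function names (`slope_one_fit`, `slope_one_predict`) and the toy ratings are assumptions made up for this example; they do not come from the slides or from any library.

```python
from collections import defaultdict


def slope_one_fit(ratings):
    """Compute average rating differences between every pair of co-rated items.

    ratings: dict mapping user -> {item: rating}.
    Returns (dev, freq): dev[i][j] is the mean of (r_i - r_j) over users who
    rated both items, and freq[i][j] is the number of such users.
    """
    diff_sum = defaultdict(lambda: defaultdict(float))
    freq = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for i, r_i in user_ratings.items():
            for j, r_j in user_ratings.items():
                if i != j:
                    diff_sum[i][j] += r_i - r_j
                    freq[i][j] += 1
    dev = {i: {j: diff_sum[i][j] / freq[i][j] for j in diff_sum[i]}
           for i in diff_sum}
    return dev, freq


def slope_one_predict(user_ratings, item, dev, freq):
    """Weighted Slope One prediction of `item` for one user, or None."""
    num, den = 0.0, 0
    for j, r_j in user_ratings.items():
        if j != item and j in dev.get(item, {}):
            n = freq[item][j]  # weight each deviation by its support
            num += (dev[item][j] + r_j) * n
            den += n
    return num / den if den else None


# Toy ratings, made up for illustration only.
toy = {
    "u1": {"A": 5, "B": 3, "C": 2},
    "u2": {"A": 3, "B": 4},
    "u3": {"B": 2, "C": 5},
}
dev, freq = slope_one_fit(toy)
pred = slope_one_predict(toy["u3"], "A", dev, freq)
print(round(pred, 2))  # 4.33
```

The prediction for u3 on item A combines two deviations: A is rated 0.5 higher than B on average (two users) and 3 higher than C (one user), giving ((0.5 + 2) * 2 + (3 + 5) * 1) / 3.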
General Steps
• Data Prep
  – Problem definition (user-based, item-based, ratings/binary…)
  – Map-Reduce, cleansing, massaging data (input matrix)
  – Training Set, Validation Set
• Normalize
  – Bias removal: Z-score, Mean-centering, Log
• Similarity weights/Neighbors
  – Pearson Correlation Coefficient
  – Cosine Similarity
  – K-nearest neighbor
• Train
  – Training model (only in model-based approaches)
• Predict
  – Predict missing ratings
  – top-N predictions for every user
• Denormalize
  – Reverse of normalization
• Evaluate Accuracy
  – Accuracy, Precision, Recall, F1, ROC
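The normalize, similarity, and denormalize steps can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: ratings are held in a dict of dicts, mean-centering is the only normalization shown, and the helper names (`mean_center`, `denormalize`, `cosine`, `pearson`) are made up for this example.

```python
import math


def mean_center(ratings):
    """Subtract each user's mean rating (bias removal)."""
    means = {u: sum(r.values()) / len(r) for u, r in ratings.items()}
    centered = {u: {i: v - means[u] for i, v in r.items()}
                for u, r in ratings.items()}
    return centered, means


def denormalize(centered, means):
    """Reverse of mean_center: add each user's mean back."""
    return {u: {i: v + means[u] for i, v in r.items()}
            for u, r in centered.items()}


def cosine(a, b):
    """Cosine similarity of two rating dicts, treating unrated items as 0."""
    dot = sum(a[i] * b[i] for i in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pearson(a, b):
    """Pearson correlation computed over co-rated items only."""
    common = set(a) & set(b)
    n = len(common)
    if n < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / n
    mean_b = sum(b[i] for i in common) / n
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


# Toy ratings, made up for illustration only.
ratings = {
    "u1": {"m1": 5, "m2": 3, "m3": 1},
    "u2": {"m1": 4, "m2": 3, "m3": 2},
}
centered, means = mean_center(ratings)
raw_cos = cosine(ratings["u1"], ratings["u2"])
centered_cos = cosine(centered["u1"], centered["u2"])
corr = pearson(ratings["u1"], ratings["u2"])
restored = denormalize(centered, means)
```

On this toy data the two users agree perfectly on ordering, so Pearson is 1 while raw cosine is a little lower. When both users have rated exactly the same items, cosine on the mean-centered ratings coincides with Pearson, which is one reason normalization comes before the similarity step.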
User-based CF




Reference: Recommenderlab vignette, http://cran.r-project.org/web/packages/recommenderlab/vignettes/recommenderlab.pdf
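A minimal sketch of user-based CF prediction follows, assuming a dict-of-dicts rating store and a similarity-weighted average of neighbors' mean-centered ratings. It is not the recommenderlab implementation; the names `similarity` and `predict`, the choice of `k`, and the toy data are assumptions for illustration.

```python
import math


def user_mean(r):
    return sum(r.values()) / len(r)


def similarity(a, b):
    """Cosine similarity of mean-centered ratings over co-rated items."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    mean_a, mean_b = user_mean(a), user_mean(b)
    dot = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    norm_a = math.sqrt(sum((a[i] - mean_a) ** 2 for i in common))
    norm_b = math.sqrt(sum((b[i] - mean_b) ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict(ratings, user, item, k=2):
    """Predict a missing rating from the k most similar users who rated item."""
    target = ratings[user]
    candidates = [(similarity(target, r), v)
                  for v, r in ratings.items()
                  if v != user and item in r]
    # Keep the k nearest neighbors, then drop any with non-positive similarity.
    neighbors = sorted(candidates, key=lambda t: t[0], reverse=True)[:k]
    neighbors = [(s, v) for s, v in neighbors if s > 0]
    mu = user_mean(target)
    den = sum(s for s, _ in neighbors)
    if den == 0:
        return mu  # no usable neighbors: fall back to the user's own mean
    num = sum(s * (ratings[v][item] - user_mean(ratings[v]))
              for s, v in neighbors)
    return mu + num / den  # denormalize by adding the user's mean back


# Toy ratings, made up for illustration only.
ratings = {
    "u1": {"m1": 5, "m2": 3, "m3": 4},
    "u2": {"m1": 4, "m2": 2, "m3": 3, "m4": 4},
    "u3": {"m1": 1, "m2": 5, "m3": 2, "m4": 1},
}
predicted = predict(ratings, "u1", "m4", k=2)
```

Here u2 has tastes similar to u1 and u3 has opposite tastes, so only u2 contributes: u2 rated m4 0.75 above their own mean, and the prediction for u1 is u1's mean (4) plus 0.75.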
Challenges
• High dimensionality (reduce it, e.g. with PCA)
• Input data sparsity, including the cold start problem for new users and items
• Overfitting to the training data set (use regularization)
• Data wrangling, in general…
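To make the regularization point concrete, here is a small sketch of matrix factorization trained with stochastic gradient descent and an L2 penalty on the latent factors. Note the swap: the deck mentions SVD and ALS, while this sketch uses plain SGD because it fits in a few lines of standard-library Python. The function names, hyperparameters, and toy ratings are all assumptions for illustration.

```python
import math
import random


def fit_mf(ratings, n_factors=2, lr=0.01, reg=0.02, epochs=300, seed=0):
    """Fit user factors P and item factors Q on (user, item, rating) triples.

    Minimizes squared error plus reg * (|p_u|^2 + |q_i|^2) by SGD.
    """
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    P = {u: [rng.uniform(0.1, 0.5) for _ in range(n_factors)] for u in users}
    Q = {i: [rng.uniform(0.1, 0.5) for _ in range(n_factors)] for i in items}
    data = list(ratings)
    for _ in range(epochs):
        rng.shuffle(data)
        for u, i, r in data:
            err = r - sum(pf * qf for pf, qf in zip(P[u], Q[i]))
            for f in range(n_factors):
                pf, qf = P[u][f], Q[i][f]
                # Gradient step; the reg term shrinks factors toward zero.
                P[u][f] += lr * (err * qf - reg * pf)
                Q[i][f] += lr * (err * pf - reg * qf)
    return P, Q


def rmse(ratings, P, Q):
    """Root mean squared error of the factor model on the given triples."""
    se = sum((r - sum(a * b for a, b in zip(P[u], Q[i]))) ** 2
             for u, i, r in ratings)
    return math.sqrt(se / len(ratings))


def sq_norm(factors):
    """Total squared magnitude of a factor table."""
    return sum(x * x for vec in factors.values() for x in vec)


# Toy ratings, made up for illustration only.
train = [
    ("u1", "m1", 5), ("u1", "m2", 3), ("u1", "m3", 4),
    ("u2", "m1", 4), ("u2", "m2", 2), ("u2", "m4", 4),
    ("u3", "m2", 5), ("u3", "m3", 2), ("u3", "m4", 1),
]
P0, Q0 = fit_mf(train, epochs=0)         # untrained starting point
P, Q = fit_mf(train, reg=0.02)           # light regularization
P_heavy, Q_heavy = fit_mf(train, reg=2.0)  # heavy regularization
```

Comparing `sq_norm(P) + sq_norm(Q)` with the heavily regularized run shows the effect of the penalty: larger `reg` keeps the factors smaller, trading some training fit for less overfitting. In practice `reg` would be chosen on the validation set.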
Just How Good is your Recommender?
• Evaluation of predicted ratings (Mean Absolute Error, Root Mean Squared Error)

• Evaluation of top-N recommendations
  – Mean Absolute Error
  – Accuracy
  – Precision & Recall (F1 score)
  – ROC curve
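The rating-error and top-N metrics listed above can be computed as in the following sketch. The function names and the sample numbers are made up for illustration; a top-N list is treated as a plain list of item IDs and the relevant items as a set.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error between two equal-length rating lists."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error between two equal-length rating lists."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def precision_recall_f1(recommended, relevant):
    """Precision, recall and F1 of a top-N list against the relevant items."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up held-out ratings and predictions.
actual = [4, 3, 5, 2]
predicted = [3.5, 3, 4, 3]
print(mae(actual, predicted))   # 0.625
print(rmse(actual, predicted))  # 0.75

# Made-up top-4 list and the items the user actually liked.
p, r, f1 = precision_recall_f1(["a", "b", "c", "d"], {"a", "c", "e"})
```

RMSE penalizes large errors more heavily than MAE, which is why it comes out higher here. For the top-N list, two of four recommendations are relevant (precision 0.5) and two of three relevant items are found (recall 2/3).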
Tools
Open Source Tools
Software | Description | Language | URL
Apache Mahout | Hadoop ML library that includes Collaborative Filtering | Java | http://mahout.apache.org/
Cofi | Collaborative Filtering Library | Java | http://www.nongnu.org/cofi/
Crab | Components to create recommender systems | Python | https://github.com/muricoca/crab
easyrec | Recommender for web pages | Java | http://easyrec.org/
LensKit | Collaborative Filtering algorithms from GroupLens Research | Java | http://lenskit.grouplens.org/
MyMediaLite | Recommender system algorithms | C#/Mono | http://mloss.org/software/view/282/
SVDFeature | Toolkit for Feature based Matrix Factorization | C++ | http://mloss.org/software/view/333/
Vogoo PHP LIB | Collaborative Filtering for personalized web sites | PHP | http://sourceforge.net/projects/vogoo/
recommenderlab | R library for developing and testing collaborative filtering systems | R | http://cran.r-project.org/web/packages/recommenderlab/index.html
Scikit-learn | Python module integrating classic ML algorithms in scientific Python packages (numpy, scipy, matplotlib) | Python | http://scikit-learn.org/stable/
recommenderlab




Reference: Recommenderlab vignette, http://cran.r-project.org/web/packages/recommenderlab/vignettes/recommenderlab.pdf
Mahout
// Load preference data (one userID,itemID,preference record per line)
DataModel model = new FileDataModel(new File("data.txt"));

// Construct the list of pre-computed correlations
Collection<GenericItemSimilarity.ItemItemSimilarity> correlations =
    ...;
ItemSimilarity itemSimilarity =
    new GenericItemSimilarity(correlations);

// Item-based recommender, wrapped in a cache so repeated requests are cheap
Recommender recommender =
    new GenericItemBasedRecommender(model, itemSimilarity);
Recommender cachingRecommender = new CachingRecommender(recommender);
...
// Top 10 recommendations for user ID 1234
List<RecommendedItem> recommendations = cachingRecommender.recommend(1234, 10);
Peter Harrington’s Sample Python Code
2. References & Reading
• High Level Reading
  – Programming Collective Intelligence by Toby Segaran. The 2nd
    chapter gives a good introduction to collaborative filtering with Python
    examples (non-SVD).
  – Matrix Factorization Techniques for Recommender Systems,
    Yehuda Koren; Robert Bell; Chris Volinsky, IEEE Computer,
    42(8), 2009
• Singular Value Decomposition (SVD) Reading
  – The Singular Value Decomposition, by Jody Hourigan and Lynn
    McIndoo, Linear Algebra – Math 45, w/ Matlab & image examples.
    http://online.redwoods.edu/INSTRUCT/darnold/LAPROJ/Fall98/JodLynn/report2.pdf
  – Numerical Recipes, 3rd Edition, Press et al., 2007, pp. 65-75.
References & Reading (continued)
• Collaborative Filtering Reading
   – See papers on research.yahoo.com/Yehuda_Koren
   – Collaborative Filtering for Implicit Feedback Datasets, Yifan Hu;
     Yehuda Koren; Chris Volinsky, IEEE International Conference on
     Data Mining (ICDM 2008), IEEE, 2008
   – Factorization Meets the Neighborhood: a Multifaceted Collaborative
     Filtering Model, Yehuda Koren, ACM Int. Conference on
     Knowledge Discovery and Data Mining (KDD’08), 2008
   – Collaborative Filtering with Temporal Dynamics, Yehuda Koren,
     KDD 2009, ACM, 2009
   – James Thornton’s CF Blog http://original.jamesthornton.com/cf/
   – Apache Mahout Recommender
     https://cwiki.apache.org/MAHOUT/recommender-documentation.html
   – Flexible Collaborative Filtering In Java With Mahout Taste - Philippe
     Adjiman
   – Books, Articles and Tutorials on Mahout/Cofi
Questions?

Collaborative Filtering and Recommender Systems By Navisro Analytics
