SlideShare a Scribd company logo
1 of 67
Download to read offline
Recommender Systems
Class Algorithmic Methods of Data Mining
Program M. Sc. Data Science
University Sapienza University of Rome
Semester Fall 2015
Lecturer Carlos Castillo http://chato.cl/
Sources:
● Ricci, Rokach and Shapira: Introduction to Recommender
Systems Handbook [link]
● Bobadilla et al. Survey 2013 [link]
● Xavier Amatriain 2014 tutorial on rec systems [link]
● Ido Guy 2011 tutorial on social rec systems [link]
● Alex Smola's tutorial on recommender systems [link]
2
Why recommender systems?
3
Definition
Recommender systems are software tools
and techniques providing suggestions for
items to be of use to a user.
4
User-based recommendations
5
6
Assumptions
● Users rely on recommendations
● Users lack sufficient personal
expertise
● Number of items is very large
– e.g. around 1010 books in Amazon
● Recommendations need to be
personalized
Amazon as of
December 2015
7
Who uses recommender systems?
● Retailers and e-commerce in general
– Amazon, Netflix, etc.
● Service sites, e.g. travel sites
● Media organizations
● Dating apps
● ...
8
Why?
● Increase number of items sold
– 2/3 of Netflix watched are recommendations
– 1/3 of Amazon sales are from recommendations
– ...
9
Why? (cont.)
● Sell more diverse items
● Increase user satisfaction
– Users enjoy the recommendations
● Increase user fidelity
– Users feel recognized (but not creeped out)
● Understand users (see next slides)
10
By-products
● Recommendations generate by-products
● Recommending requires understanding users
and items, which is valuable by itself
● Some recommender systems are very good at
this (e.g. factorization methods)
● Automatically identify marketing profiles
● Describe users to better understand them
11
The recommender system problem
Estimate the utility for a user of an item for
which the user has not expressed utility
What information can be used?
12
Types of problem
● Find some good items (most common)
● Find all good items
● Annotate in context (why I would like this)
● Recommend a sequence (e.g. tour of a city)
● Recommend a bundle (camera+lens+bag)
● Support browsing (seek longer session)
● ...
13
Data sources
● Items, Users
– Structured attributes, semi-structured or
unstructured descriptions
● Transactions
– Appraisals
● Numerical ratings (e.g. 1-5)
● Binary ratings (like/dislike)
● Unary ratings (like/don't know)
– Sales
– Tags/descriptions/reviews
14
Recommender system process
Why is part of the processing done offline?
15
Aspects of this process
● Data preparation
– Normalization, removal of outliers, feature selection,
dimensionality reduction, ...
● Data mining
– Clustering, classification, rule generation, ...
● Post-processing
– Visualization, interpretation, meta-mining, ...
16
Desiderata for recommender system
● Must inspire trust
● Must convince users to
try the items
● Must offer a good
combination of novelty,
coverage, and precision
● Must have a somewhat
transparent logic
● Must be user-tunable
17
Human factors
● Advanced systems are conversational
● Transparency and scrutability
– Explain users how the system works
– Allow users to tell the system it is wrong
● Help users make a good decision
● Convince users in a persuasive manner
● Increase enjoyment to users
● Provide serendipity
18
Serendipity
● “An aptitude for making desirable discoveries
by accident”
● Don't recommend items the user already knows
● Delight users by expanding their taste
– But still recommend them something somewhat
familiar
● It can be controlled by specific parameters
Peregrinaggio di tre giovani figliuoli del re di Serendippo; Michele Tramezzino, Venice, 1557. Tramezzino claimed to have heard the story from
one Christophero Armeno who had translated the Persian fairy tale into Italian adapting Book One of Amir Khusrau's Hasht-Bihisht of 1302 [link]
19
High-level approaches
● Memory-based
– Use data from the past in a somewhat “raw” form
● Model-based
– Use models built from data from the past
20
Approaches
● Collaborative filtering
● Content-based (item features)
● Knowledge-based (expert system)
● Personalized learning to rank
– Estimate ranking function
● Demographic
● Social/community based
– Based on connections
● Hybrid (combination of some of the above)
21
Collaborative filtering
22
Collaborative Filtering approach
● User has seen/liked certain items
● Community has seen/liked certain items
● Recommend to users items similar to the ones
they have seen/liked
– Based on finding similar users
– Based on finding similar items
23
Algorithmic elements
● M users and N items
● Transaction matrix RM x N
● Active user
● Method to compute similarity of users
● Method to sample high-similarity users
● Method to aggregate their ratings on an item
24
k nearest users algorithm
● Compute common elements with other users
● Compute distance between rating vectors
● Pick top 3 most similar users
● For every unrated item
– Average rating of 3 most similar users
● Recommend highest score unrated items
25
Ratings data
1 2 3 4 5 6 7 8 9 1
0
1
1
1
2
1 1 5 2 3
2 5 1 5 4 5 3 2
3 5 5 1 3 2 4
4 3 5 4 5 1
5 2 5 4 4 5
6 4 3 1 4 2
7 4 1 4 5 4 1
ITEMS
USERS
26
Try it! Generate recommendations
1 2 3 4 5 6 7 8 9 1
0
1
1
1
2
1 1 5 2 3
2 5 1 5 4 5 3 2
3 5 5 1 3 2 4
4 3 5 4 5 1
5 2 5 4 4 5
6 4 3 1 4 2
7 4 1 4 5 4 1
ITEMS
USERS
Given the red user
Determine 3 nearest users
Average their ratings on unrated items
Pick top 3 unrated elements
27
Compute user intersection size
1 2 3 4 5 6 7 8 9 1
0
1
1
1
2
1 1 5 2 3
2 5 1 5 4 5 3 2
3 5 5 1 3 2 4
4 3 5 4 5 1
5 2 5 4 4 5
6 4 3 1 4 2
7 4 1 4 5 4 1
ITEMS
USERS
0
3
3
3
1
3
28
Compute user similarity
1 2 3 4 5 6 7 8 9 1
0
1
1
1
2
1 1 5 2 3
2 5 1 5 4 5 3 2
3 5 5 1 3 2 4
4 3 5 4 5 1
5 2 5 4 4 5
6 4 3 1 4 2
7 4 1 4 5 4 1
ITEMS
USERS
0
3 0.33
3 2.67
3 0.00
1 3.00
3 0.67
29
Pick top-3 most similar
1 2 3 4 5 6 7 8 9 1
0
1
1
1
2
2 5 1 5 4 5 3 2
4 3 5 4 5 1
5 2 5 4 4 5
7 4 1 4 5 4 1
ITEMS
USERS
3 0.33
3 0.00
3 0.67
30
Estimate unrated items
1 2 3 4 5 6 7 8 9 1
0
1
1
1
2
2 5 1 5 4 5 3 2
4 5.0 3 1.0 2.0 5 4 4.5 - 4.0 5 3.5 1
5 2 5 4 4 5
7 4 1 4 5 4 1
ITEMS
USERS
3 0.33
3 0.00
3 0.67
31
Recommend top-3 estimated
1 2 3 4 5 6 7 8 9 1
0
1
1
1
2
2 5 1 5 4 5 3 2
4 5.0 3 1.0 2.0 5 4 4.5 - 4.0 5 3.5 1
5 2 5 4 4 5
7 4 1 4 5 4 1
ITEMS
USERS
3 0.33
3 0.00
3 0.67
32
Improvements?
● How would you improve the algorithm?
● How would you provide explanations?
33
Item-based collaborative filtering
1 2 3 4 5 6 7 8 9 1
0
1
1
1
2
1 1 5 2 3
2 5 1 5 4 5 3 2
3 5 5 1 3 2 4
4 3 5 4 5 ? 1
5 2 5 4 4 5
6 4 3 1 4 2
7 4 1 4 5 4 1
ITEMS
USERS
● Would user 4 like item 11?
34
Item-based collaborative filtering
1 2 3 4 5 6 7 8 9 1
0
1
1
1
2
1 1 5 2 3
2 5 1 5 4 5 3 2
3 5 5 1 3 2 4
4 3 5 4 5 1
5 2 5 4 4 5
6 4 3 1 4 2
7 4 1 4 5 4 1
ITEMS
USERS
● Compute pair-wise similarities to target item
2.0 1.5 2.3 2.0 1.0 1.0 1.0 1.0 1.5 2.0 - 2.0
35
Item-based collaborative filtering
1 2 3 4 5 6 7 8 9 1
0
1
1
1
2
1 1 5 2 3
2 5 1 5 4 5 3 2
3 5 5 1 3 2 4
4 3 5 4 5 1
5 2 5 4 4 5
6 4 3 1 4 2
7 4 1 4 5 4 1
ITEMS
USERS
● Pick k most similar items
1.0 1.0 1.0 1.0 -
36
Item-based collaborative filtering
1 2 3 4 5 6 7 8 9 1
0
1
1
1
2
1 1 5 2 3
2 5 1 5 4 5 3 2
3 5 5 1 3 2 4
4 3 5 4 5 4.5 1
5 2 5 4 4 5
6 4 3 1 4 2
7 4 1 4 5 4 1
ITEMS
USERS
● Average ratings of target user on item
1.0 1.0 1.0 1.0 -
37
Performance implications
● Similarity between users is uncovered slowly
● Similarity between items is supposedly static
– Can be precomputed!
● Item-based clusters can also be precomputed
[source]
38
Weaknesses
● Assumes standardized products
– E.g. a touristic destination at any time of the year
and under any circumstance is the same item
● Does not take into account context
● Requires a relatively large number of
transactions to yield reasonable results
39
Cold-start problem
● What to do with a new item?
● What to do with a new user?
40
Assumptions
● Collaborative filtering assumes the following:
– We take recommendations from friends
– Friends have similar tastes
– A person who have similar tastes to you, could be
your friend
– Discover people with similar tastes, use them as
friends
● BUT, people's tastes are complex!
41
Ordinary people and
extraordinary tastes
[Goel et al. 2010]
Distribution of user eccentricity: the median rank of consumed items.
In the null model, users select items proportional to item popularity
42
Matrix factorization approaches
43
2D projection of interests
[Koren et al. 2009]
44
SVD approach
● R is the matrix of ratings
– n users, m items
● U is a user-factor matrix
● S is a diagonal matrix, strength of each factor
● V is a factor-item matrix
● Matrices USV can be computed used an
approximate SVD method
45
General factorization approach
● R is the matrix of ratings
– n users, m items
● P is a user-factor matrix
● Q is a factor-item matrix
(Sometimes we force P, Q to be non-negative:
factors are easier to interpret!)
46
What is this plot?
[Koren et al. 2009]
47
Computing expected ratings
● Given:
– user vector
– item vector
● Expected rating is
48
Model directly observed ratings
● Ro are the observed ratings
● We want to minimize a reconstruction error
● Second term avoids over-fitting
– Parameter λ found by cross-validation
● Two basic optimization methods
49
1. Stochastic gradient descend
● Compute reconstruction error
● Update in opposite direction to gradient
http://sifter.org/~simon/journal/20061211.html
learning speed
50
Illustration: batch gradient descent
vs stochastic gradient descent
Batch: gradient Stochastic: single-example gradient
[source]
51
A simpler example of gradient
descent
Fit a set of n two-dimensional data points (xi,yi)
with a line L(x)=w1+w2x, means minimizing:
The update rule is to take a random point and do:
https://en.wikipedia.org/wiki/Stochastic_gradient_descent
52
2. Alternating least squares
● With vectors p fixed:
– Find vectors q that minimize above function
● With vectors q fixed
– Find vectors p that minimize above function
● Iterate until convergence
● Slower in general, but parallelizes better
53
https://xkcd.com/1098/
54
Ratings are not normally-distributed
[Marlin et al. 2007, Hu et al. 2009]
Sometimes referred to as the “J” distribution of ratings
Amazon
(DVDs, Videos, Books)
55
How you label ratings matters
56
In general, there are many biases
● Some movies always get better (or worse)
ratings than others
● Some people always give better (or worse)
ratings than others
● Some systems make people give better (or
worse) ratings than others, e.g. labels
● Time-sensitive user preferences
57
Other approaches
58
Other approaches
● Association rules (sequence-mining based)
● Regression (e.g. using a neural network)
– e.g. based on user characteristics, number of
ratings in different tags/categories
● Clustering
● Learning-to-rank
59
Hybrid methods (some types)
● Weighted
– E.g. average recommended scores of two methods
● Switching
– E.g. use one method when little info is available, a
different method when more info is available
● Cascade
– E.g. use a clustering-based approach, then refine
using a collaborative filtering approach
60
Context-sensitive methods
● Context: where, when, how, ...
● Pre-filter
● Post-filter
● Context-aware methods
– E.g. tensor factorization
61
Evaluation
62
Evaluation methodologies
● User experiments
● Precision @ Cut-off
● Ranking-based metrics
– E.g. Kendall's Tau
● Score-based metrics
– E.g. RMSE
http://www-users.cs.umn.edu/~cosley/research/gl/evaluating-herlocker-tois2004.pdf
63
Example user testing
● [Liu et al. IUI 2010] News recommender
64
Score-based metric: RMSE
● “Root of mean square error”
● Problem with all score-based metrics: niche items
with good ratings (by those who consumed them)
65
Evaluation by RMSE
[Slide from Smola 2012]
66
Evaluation by RMSE
BIAS
TEMPORAL
FACTORS
[Slide from Smola 2012]
67
Netflix challenge results
● It is easy to provide
reasonable results
● It is hard to improve
them

More Related Content

What's hot

Recommender systems using collaborative filtering
Recommender systems using collaborative filteringRecommender systems using collaborative filtering
Recommender systems using collaborative filteringD Yogendra Rao
 
Collaborative Filtering using KNN
Collaborative Filtering using KNNCollaborative Filtering using KNN
Collaborative Filtering using KNNŞeyda Hatipoğlu
 
Recent advances in deep recommender systems
Recent advances in deep recommender systemsRecent advances in deep recommender systems
Recent advances in deep recommender systemsNAVER Engineering
 
Recommender system algorithm and architecture
Recommender system algorithm and architectureRecommender system algorithm and architecture
Recommender system algorithm and architectureLiang Xiang
 
Recommender system introduction
Recommender system   introductionRecommender system   introduction
Recommender system introductionLiang Xiang
 
Recommendation System
Recommendation SystemRecommendation System
Recommendation SystemAnamta Sayyed
 
Recommendation system
Recommendation system Recommendation system
Recommendation system Vikrant Arya
 
Boston ML - Architecting Recommender Systems
Boston ML - Architecting Recommender SystemsBoston ML - Architecting Recommender Systems
Boston ML - Architecting Recommender SystemsJames Kirk
 
Overview of recommender system
Overview of recommender systemOverview of recommender system
Overview of recommender systemStanley Wang
 
Recommender Systems! @ASAI 2011
Recommender Systems! @ASAI 2011Recommender Systems! @ASAI 2011
Recommender Systems! @ASAI 2011Ernesto Mislej
 
Recommendation engines
Recommendation enginesRecommendation engines
Recommendation enginesGeorgian Micsa
 
Recommender systems for E-commerce
Recommender systems for E-commerceRecommender systems for E-commerce
Recommender systems for E-commerceAlexander Konduforov
 
Collaborative Filtering Recommendation System
Collaborative Filtering Recommendation SystemCollaborative Filtering Recommendation System
Collaborative Filtering Recommendation SystemMilind Gokhale
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender SystemsT212
 
How to build a recommender system?
How to build a recommender system?How to build a recommender system?
How to build a recommender system?blueace
 
[Final]collaborative filtering and recommender systems
[Final]collaborative filtering and recommender systems[Final]collaborative filtering and recommender systems
[Final]collaborative filtering and recommender systemsFalitokiniaina Rabearison
 
Recommendation system
Recommendation systemRecommendation system
Recommendation systemRishabh Mehta
 
A Hybrid Recommendation system
A Hybrid Recommendation systemA Hybrid Recommendation system
A Hybrid Recommendation systemPranav Prakash
 

What's hot (20)

Recommender systems using collaborative filtering
Recommender systems using collaborative filteringRecommender systems using collaborative filtering
Recommender systems using collaborative filtering
 
Collaborative Filtering using KNN
Collaborative Filtering using KNNCollaborative Filtering using KNN
Collaborative Filtering using KNN
 
Recent advances in deep recommender systems
Recent advances in deep recommender systemsRecent advances in deep recommender systems
Recent advances in deep recommender systems
 
Recommender system algorithm and architecture
Recommender system algorithm and architectureRecommender system algorithm and architecture
Recommender system algorithm and architecture
 
Recommender system introduction
Recommender system   introductionRecommender system   introduction
Recommender system introduction
 
Recommendation System
Recommendation SystemRecommendation System
Recommendation System
 
Recommender system
Recommender systemRecommender system
Recommender system
 
Recommendation system
Recommendation system Recommendation system
Recommendation system
 
Boston ML - Architecting Recommender Systems
Boston ML - Architecting Recommender SystemsBoston ML - Architecting Recommender Systems
Boston ML - Architecting Recommender Systems
 
Overview of recommender system
Overview of recommender systemOverview of recommender system
Overview of recommender system
 
Collaborative filtering
Collaborative filteringCollaborative filtering
Collaborative filtering
 
Recommender Systems! @ASAI 2011
Recommender Systems! @ASAI 2011Recommender Systems! @ASAI 2011
Recommender Systems! @ASAI 2011
 
Recommendation engines
Recommendation enginesRecommendation engines
Recommendation engines
 
Recommender systems for E-commerce
Recommender systems for E-commerceRecommender systems for E-commerce
Recommender systems for E-commerce
 
Collaborative Filtering Recommendation System
Collaborative Filtering Recommendation SystemCollaborative Filtering Recommendation System
Collaborative Filtering Recommendation System
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
How to build a recommender system?
How to build a recommender system?How to build a recommender system?
How to build a recommender system?
 
[Final]collaborative filtering and recommender systems
[Final]collaborative filtering and recommender systems[Final]collaborative filtering and recommender systems
[Final]collaborative filtering and recommender systems
 
Recommendation system
Recommendation systemRecommendation system
Recommendation system
 
A Hybrid Recommendation system
A Hybrid Recommendation systemA Hybrid Recommendation system
A Hybrid Recommendation system
 

Viewers also liked

Fast Distributed Online Classification
Fast Distributed Online ClassificationFast Distributed Online Classification
Fast Distributed Online ClassificationPrasad Chalasani
 
Stochastic gradient descent and its tuning
Stochastic gradient descent and its tuningStochastic gradient descent and its tuning
Stochastic gradient descent and its tuningArsalan Qadri
 
Cross domain security reference architecture
Cross domain security reference architectureCross domain security reference architecture
Cross domain security reference architectureWen Zhu
 
Linked Open Data to support content based Recommender Systems
Linked Open Data to support content based Recommender SystemsLinked Open Data to support content based Recommender Systems
Linked Open Data to support content based Recommender SystemsVito Ostuni
 
Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14
Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14
Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14Daniel Lewis
 
Trust and Recommender Systems
Trust and  Recommender SystemsTrust and  Recommender Systems
Trust and Recommender Systemszhayefei
 
segmentation, positioning and targeting
segmentation, positioning and targetingsegmentation, positioning and targeting
segmentation, positioning and targetingMonika Maciuliene
 
The digital transformation of society and business: 2020. Futurist keynote sp...
The digital transformation of society and business: 2020. Futurist keynote sp...The digital transformation of society and business: 2020. Futurist keynote sp...
The digital transformation of society and business: 2020. Futurist keynote sp...Gerd Leonhard
 

Viewers also liked (8)

Fast Distributed Online Classification
Fast Distributed Online ClassificationFast Distributed Online Classification
Fast Distributed Online Classification
 
Stochastic gradient descent and its tuning
Stochastic gradient descent and its tuningStochastic gradient descent and its tuning
Stochastic gradient descent and its tuning
 
Cross domain security reference architecture
Cross domain security reference architectureCross domain security reference architecture
Cross domain security reference architecture
 
Linked Open Data to support content based Recommender Systems
Linked Open Data to support content based Recommender SystemsLinked Open Data to support content based Recommender Systems
Linked Open Data to support content based Recommender Systems
 
Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14
Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14
Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14
 
Trust and Recommender Systems
Trust and  Recommender SystemsTrust and  Recommender Systems
Trust and Recommender Systems
 
segmentation, positioning and targeting
segmentation, positioning and targetingsegmentation, positioning and targeting
segmentation, positioning and targeting
 
The digital transformation of society and business: 2020. Futurist keynote sp...
The digital transformation of society and business: 2020. Futurist keynote sp...The digital transformation of society and business: 2020. Futurist keynote sp...
The digital transformation of society and business: 2020. Futurist keynote sp...
 

Similar to Recommender Systems

Recommender.system.presentation.pjug.01.21.2014
Recommender.system.presentation.pjug.01.21.2014Recommender.system.presentation.pjug.01.21.2014
Recommender.system.presentation.pjug.01.21.2014rpbrehm
 
PyCon Balkans 2018 // Recommender systems - collaborative filtering and dimen...
PyCon Balkans 2018 // Recommender systems - collaborative filtering and dimen...PyCon Balkans 2018 // Recommender systems - collaborative filtering and dimen...
PyCon Balkans 2018 // Recommender systems - collaborative filtering and dimen...Mladen Jovanovic
 
Building a Recommender systems by Vivek Murugesan - Technical Architect at Cr...
Building a Recommender systems by Vivek Murugesan - Technical Architect at Cr...Building a Recommender systems by Vivek Murugesan - Technical Architect at Cr...
Building a Recommender systems by Vivek Murugesan - Technical Architect at Cr...Rajasekar Nonburaj
 
IntroductionRecommenderSystems_Petroni.pdf
IntroductionRecommenderSystems_Petroni.pdfIntroductionRecommenderSystems_Petroni.pdf
IntroductionRecommenderSystems_Petroni.pdfAlphaIssaghaDiallo
 
Continuous Evaluation of Collaborative Recommender Systems in Data Stream Man...
Continuous Evaluation of Collaborative Recommender Systems in Data Stream Man...Continuous Evaluation of Collaborative Recommender Systems in Data Stream Man...
Continuous Evaluation of Collaborative Recommender Systems in Data Stream Man...Dr. Cornelius Ludmann
 
Product Recommendations Enhanced with Reviews
Product Recommendations Enhanced with ReviewsProduct Recommendations Enhanced with Reviews
Product Recommendations Enhanced with Reviewsmaranlar
 
Introduction to Recommendation Systems
Introduction to Recommendation SystemsIntroduction to Recommendation Systems
Introduction to Recommendation SystemsZia Babar
 
Recommender Systems: Advances in Collaborative Filtering
Recommender Systems: Advances in Collaborative FilteringRecommender Systems: Advances in Collaborative Filtering
Recommender Systems: Advances in Collaborative FilteringChangsung Moon
 
Past, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspectivePast, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspectiveXavier Amatriain
 
Collaborative Filtering 1: User-based CF
Collaborative Filtering 1: User-based CFCollaborative Filtering 1: User-based CF
Collaborative Filtering 1: User-based CFYusuke Yamamoto
 
Lecture Notes on Recommender System Introduction
Lecture Notes on Recommender System IntroductionLecture Notes on Recommender System Introduction
Lecture Notes on Recommender System IntroductionPerumalPitchandi
 
Recommender Systems In Industry
Recommender Systems In IndustryRecommender Systems In Industry
Recommender Systems In IndustryXavier Amatriain
 
recommendationsystem-140410131156-phpapp01 (1).pdf
recommendationsystem-140410131156-phpapp01 (1).pdfrecommendationsystem-140410131156-phpapp01 (1).pdf
recommendationsystem-140410131156-phpapp01 (1).pdfssuserff0096
 
Recommender systems
Recommender systemsRecommender systems
Recommender systemsTamer Rezk
 
Random Walk by User Trust and Temporal Issues toward Sparsity Problem in Soci...
Random Walk by User Trust and Temporal Issues toward Sparsity Problem in Soci...Random Walk by User Trust and Temporal Issues toward Sparsity Problem in Soci...
Random Walk by User Trust and Temporal Issues toward Sparsity Problem in Soci...Sc Huang
 
Offline evaluation of recommender systems: all pain and no gain?
Offline evaluation of recommender systems: all pain and no gain?Offline evaluation of recommender systems: all pain and no gain?
Offline evaluation of recommender systems: all pain and no gain?Mark Levy
 

Similar to Recommender Systems (20)

Recommender.system.presentation.pjug.01.21.2014
Recommender.system.presentation.pjug.01.21.2014Recommender.system.presentation.pjug.01.21.2014
Recommender.system.presentation.pjug.01.21.2014
 
PyCon Balkans 2018 // Recommender systems - collaborative filtering and dimen...
PyCon Balkans 2018 // Recommender systems - collaborative filtering and dimen...PyCon Balkans 2018 // Recommender systems - collaborative filtering and dimen...
PyCon Balkans 2018 // Recommender systems - collaborative filtering and dimen...
 
Recommender systems
Recommender systemsRecommender systems
Recommender systems
 
Building a Recommender systems by Vivek Murugesan - Technical Architect at Cr...
Building a Recommender systems by Vivek Murugesan - Technical Architect at Cr...Building a Recommender systems by Vivek Murugesan - Technical Architect at Cr...
Building a Recommender systems by Vivek Murugesan - Technical Architect at Cr...
 
IntroductionRecommenderSystems_Petroni.pdf
IntroductionRecommenderSystems_Petroni.pdfIntroductionRecommenderSystems_Petroni.pdf
IntroductionRecommenderSystems_Petroni.pdf
 
Continuous Evaluation of Collaborative Recommender Systems in Data Stream Man...
Continuous Evaluation of Collaborative Recommender Systems in Data Stream Man...Continuous Evaluation of Collaborative Recommender Systems in Data Stream Man...
Continuous Evaluation of Collaborative Recommender Systems in Data Stream Man...
 
Product Recommendations Enhanced with Reviews
Product Recommendations Enhanced with ReviewsProduct Recommendations Enhanced with Reviews
Product Recommendations Enhanced with Reviews
 
Role of Data Science in eCommerce
Role of Data Science in eCommerceRole of Data Science in eCommerce
Role of Data Science in eCommerce
 
Introduction to Recommendation Systems
Introduction to Recommendation SystemsIntroduction to Recommendation Systems
Introduction to Recommendation Systems
 
User Personality and the New User Problem in a Context-­‐Aware POI Recommende...
User Personality and the New User Problem in a Context-­‐Aware POI Recommende...User Personality and the New User Problem in a Context-­‐Aware POI Recommende...
User Personality and the New User Problem in a Context-­‐Aware POI Recommende...
 
Recommender systems
Recommender systems Recommender systems
Recommender systems
 
Recommender Systems: Advances in Collaborative Filtering
Recommender Systems: Advances in Collaborative FilteringRecommender Systems: Advances in Collaborative Filtering
Recommender Systems: Advances in Collaborative Filtering
 
Past, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspectivePast, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspective
 
Collaborative Filtering 1: User-based CF
Collaborative Filtering 1: User-based CFCollaborative Filtering 1: User-based CF
Collaborative Filtering 1: User-based CF
 
Lecture Notes on Recommender System Introduction
Lecture Notes on Recommender System IntroductionLecture Notes on Recommender System Introduction
Lecture Notes on Recommender System Introduction
 
Recommender Systems In Industry
Recommender Systems In IndustryRecommender Systems In Industry
Recommender Systems In Industry
 
recommendationsystem-140410131156-phpapp01 (1).pdf
recommendationsystem-140410131156-phpapp01 (1).pdfrecommendationsystem-140410131156-phpapp01 (1).pdf
recommendationsystem-140410131156-phpapp01 (1).pdf
 
Recommender systems
Recommender systemsRecommender systems
Recommender systems
 
Random Walk by User Trust and Temporal Issues toward Sparsity Problem in Soci...
Random Walk by User Trust and Temporal Issues toward Sparsity Problem in Soci...Random Walk by User Trust and Temporal Issues toward Sparsity Problem in Soci...
Random Walk by User Trust and Temporal Issues toward Sparsity Problem in Soci...
 
Offline evaluation of recommender systems: all pain and no gain?
Offline evaluation of recommender systems: all pain and no gain?Offline evaluation of recommender systems: all pain and no gain?
Offline evaluation of recommender systems: all pain and no gain?
 

More from Carlos Castillo (ChaTo)

Finding High Quality Content in Social Media
Finding High Quality Content in Social MediaFinding High Quality Content in Social Media
Finding High Quality Content in Social MediaCarlos Castillo (ChaTo)
 
Socia Media and Digital Volunteering in Disaster Management @ DSEM 2017
Socia Media and Digital Volunteering in Disaster Management @ DSEM 2017Socia Media and Digital Volunteering in Disaster Management @ DSEM 2017
Socia Media and Digital Volunteering in Disaster Management @ DSEM 2017Carlos Castillo (ChaTo)
 
Detecting Algorithmic Bias (keynote at DIR 2016)
Detecting Algorithmic Bias (keynote at DIR 2016)Detecting Algorithmic Bias (keynote at DIR 2016)
Detecting Algorithmic Bias (keynote at DIR 2016)Carlos Castillo (ChaTo)
 

More from Carlos Castillo (ChaTo) (20)

Finding High Quality Content in Social Media
Finding High Quality Content in Social MediaFinding High Quality Content in Social Media
Finding High Quality Content in Social Media
 
When no clicks are good news
When no clicks are good newsWhen no clicks are good news
When no clicks are good news
 
Socia Media and Digital Volunteering in Disaster Management @ DSEM 2017
Socia Media and Digital Volunteering in Disaster Management @ DSEM 2017Socia Media and Digital Volunteering in Disaster Management @ DSEM 2017
Socia Media and Digital Volunteering in Disaster Management @ DSEM 2017
 
Detecting Algorithmic Bias (keynote at DIR 2016)
Detecting Algorithmic Bias (keynote at DIR 2016)Detecting Algorithmic Bias (keynote at DIR 2016)
Detecting Algorithmic Bias (keynote at DIR 2016)
 
Discrimination Discovery
Discrimination DiscoveryDiscrimination Discovery
Discrimination Discovery
 
Fairness-Aware Data Mining
Fairness-Aware Data MiningFairness-Aware Data Mining
Fairness-Aware Data Mining
 
Big Crisis Data for ISPC
Big Crisis Data for ISPCBig Crisis Data for ISPC
Big Crisis Data for ISPC
 
Databeers: Big Crisis Data
Databeers: Big Crisis DataDatabeers: Big Crisis Data
Databeers: Big Crisis Data
 
Observational studies in social media
Observational studies in social mediaObservational studies in social media
Observational studies in social media
 
Natural experiments
Natural experimentsNatural experiments
Natural experiments
 
Content-based link prediction
Content-based link predictionContent-based link prediction
Content-based link prediction
 
Link prediction
Link predictionLink prediction
Link prediction
 
Graph Partitioning and Spectral Methods
Graph Partitioning and Spectral MethodsGraph Partitioning and Spectral Methods
Graph Partitioning and Spectral Methods
 
Finding Dense Subgraphs
Finding Dense SubgraphsFinding Dense Subgraphs
Finding Dense Subgraphs
 
Graph Evolution Models
Graph Evolution ModelsGraph Evolution Models
Graph Evolution Models
 
Link-Based Ranking
Link-Based RankingLink-Based Ranking
Link-Based Ranking
 
Text Indexing / Inverted Indices
Text Indexing / Inverted IndicesText Indexing / Inverted Indices
Text Indexing / Inverted Indices
 
Indexing
IndexingIndexing
Indexing
 
Text Summarization
Text SummarizationText Summarization
Text Summarization
 
Hierarchical Clustering
Hierarchical ClusteringHierarchical Clustering
Hierarchical Clustering
 

Recently uploaded

[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKJago de Vreede
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 

Recently uploaded (20)

[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 

Recommender Systems

  • 1. Recommender Systems Class Algorithmic Methods of Data Mining Program M. Sc. Data Science University Sapienza University of Rome Semester Fall 2015 Lecturer Carlos Castillo http://chato.cl/ Sources: ● Ricci, Rokach and Shapira: Introduction to Recommender Systems Handbook [link] ● Bobadilla et al. Survey 2013 [link] ● Xavier Amatriain 2014 tutorial on rec systems [link] ● Ido Guy 2011 tutorial on social rec systems [link] ● Alex Smola's tutorial on recommender systems [link]
  • 3. 3 Definition Recommender systems are software tools and techniques providing suggestions for items to be of use to a user.
  • 5. 5
  • 6. 6 Assumptions ● Users rely on recommendations ● Users lack sufficient personal expertise ● Number of items is very large – e.g. around 1010 books in Amazon ● Recommendations need to be personalized Amazon as of December 2015
  • 7. 7 Who uses recommender systems? ● Retailers and e-commerce in general – Amazon, Netflix, etc. ● Service sites, e.g. travel sites ● Media organizations ● Dating apps ● ...
  • 8. 8 Why? ● Increase number of items sold – 2/3 of Netflix watched are recommendations – 1/3 of Amazon sales are from recommendations – ...
  • 9. 9 Why? (cont.) ● Sell more diverse items ● Increase user satisfaction – Users enjoy the recommendations ● Increase user fidelity – Users feel recognized (but not creeped out) ● Understand users (see next slides)
  • 10. 10 By-products ● Recommendations generate by-products ● Recommending requires understanding users and items, which is valuable by itself ● Some recommender systems are very good at this (e.g. factorization methods) ● Automatically identify marketing profiles ● Describe users to better understand them
  • 11. 11 The recommender system problem Estimate the utility for a user of an item for which the user has not expressed utility What information can be used?
  • 12. 12 Types of problem ● Find some good items (most common) ● Find all good items ● Annotate in context (why I would like this) ● Recommend a sequence (e.g. tour of a city) ● Recommend a bundle (camera+lens+bag) ● Support browsing (seek longer session) ● ...
  • 13. 13 Data sources ● Items, Users – Structured attributes, semi-structured or unstructured descriptions ● Transactions – Appraisals ● Numerical ratings (e.g. 1-5) ● Binary ratings (like/dislike) ● Unary ratings (like/don't know) – Sales – Tags/descriptions/reviews
  • 14. 14 Recommender system process Why is part of the processing done offline?
  • 15. 15 Aspects of this process ● Data preparation – Normalization, removal of outliers, feature selection, dimensionality reduction, ... ● Data mining – Clustering, classification, rule generation, ... ● Post-processing – Visualization, interpretation, meta-mining, ...
  • 16. 16 Desiderata for recommender system ● Must inspire trust ● Must convince users to try the items ● Must offer a good combination of novelty, coverage, and precision ● Must have a somewhat transparent logic ● Must be user-tunable
  • 17. 17 Human factors ● Advanced systems are conversational ● Transparency and scrutability – Explain users how the system works – Allow users to tell the system it is wrong ● Help users make a good decision ● Convince users in a persuasive manner ● Increase enjoyment to users ● Provide serendipity
  • 18. 18 Serendipity ● “An aptitude for making desirable discoveries by accident” ● Don't recommend items the user already knows ● Delight users by expanding their taste – But still recommend them something somewhat familiar ● It can be controlled by specific parameters Peregrinaggio di tre giovani figliuoli del re di Serendippo; Michele Tramezzino, Venice, 1557. Tramezzino claimed to have heard the story from one Christophero Armeno who had translated the Persian fairy tale into Italian adapting Book One of Amir Khusrau's Hasht-Bihisht of 1302 [link]
  • 19. 19 High-level approaches ● Memory-based – Use data from the past in a somewhat “raw” form ● Model-based – Use models built from data from the past
  • 20. 20 Approaches ● Collaborative filtering ● Content-based (item features) ● Knowledge-based (expert system) ● Personalized learning to rank – Estimate ranking function ● Demographic ● Social/community based – Based on connections ● Hybrid (combination of some of the above)
  • 22. 22 Collaborative Filtering approach ● User has seen/liked certain items ● Community has seen/liked certain items ● Recommend to users items similar to the ones they have seen/liked – Based on finding similar users – Based on finding similar items
  • 23. 23 Algorithmic elements ● M users and N items ● Transaction matrix RM x N ● Active user ● Method to compute similarity of users ● Method to sample high-similarity users ● Method to aggregate their ratings on an item
  • 24. 24 k nearest users algorithm ● Compute common elements with other users ● Compute distance between rating vectors ● Pick top 3 most similar users ● For every unrated item – Average rating of 3 most similar users ● Recommend highest score unrated items
  • 25. 25 Ratings data 1 2 3 4 5 6 7 8 9 1 0 1 1 1 2 1 1 5 2 3 2 5 1 5 4 5 3 2 3 5 5 1 3 2 4 4 3 5 4 5 1 5 2 5 4 4 5 6 4 3 1 4 2 7 4 1 4 5 4 1 ITEMS USERS
  • 26. 26 Try it! Generate recommendations 1 2 3 4 5 6 7 8 9 1 0 1 1 1 2 1 1 5 2 3 2 5 1 5 4 5 3 2 3 5 5 1 3 2 4 4 3 5 4 5 1 5 2 5 4 4 5 6 4 3 1 4 2 7 4 1 4 5 4 1 ITEMS USERS Given the red user Determine 3 nearest users Average their ratings on unrated items Pick top 3 unrated elements
  • 27. 27 Compute user intersection size 1 2 3 4 5 6 7 8 9 1 0 1 1 1 2 1 1 5 2 3 2 5 1 5 4 5 3 2 3 5 5 1 3 2 4 4 3 5 4 5 1 5 2 5 4 4 5 6 4 3 1 4 2 7 4 1 4 5 4 1 ITEMS USERS 0 3 3 3 1 3
  • 28. 28 Compute user similarity 1 2 3 4 5 6 7 8 9 1 0 1 1 1 2 1 1 5 2 3 2 5 1 5 4 5 3 2 3 5 5 1 3 2 4 4 3 5 4 5 1 5 2 5 4 4 5 6 4 3 1 4 2 7 4 1 4 5 4 1 ITEMS USERS 0 3 0.33 3 2.67 3 0.00 1 3.00 3 0.67
  • 29. 29 Pick top-3 most similar 1 2 3 4 5 6 7 8 9 1 0 1 1 1 2 2 5 1 5 4 5 3 2 4 3 5 4 5 1 5 2 5 4 4 5 7 4 1 4 5 4 1 ITEMS USERS 3 0.33 3 0.00 3 0.67
  • 30. 30 Estimate unrated items 1 2 3 4 5 6 7 8 9 1 0 1 1 1 2 2 5 1 5 4 5 3 2 4 5.0 3 1.0 2.0 5 4 4.5 - 4.0 5 3.5 1 5 2 5 4 4 5 7 4 1 4 5 4 1 ITEMS USERS 3 0.33 3 0.00 3 0.67
  • 31. 31 Recommend top-3 estimated 1 2 3 4 5 6 7 8 9 1 0 1 1 1 2 2 5 1 5 4 5 3 2 4 5.0 3 1.0 2.0 5 4 4.5 - 4.0 5 3.5 1 5 2 5 4 4 5 7 4 1 4 5 4 1 ITEMS USERS 3 0.33 3 0.00 3 0.67
  • 32. 32 Improvements? ● How would you improve the algorithm? ● How would you provide explanations?
  • 33. 33 Item-based collaborative filtering 1 2 3 4 5 6 7 8 9 1 0 1 1 1 2 1 1 5 2 3 2 5 1 5 4 5 3 2 3 5 5 1 3 2 4 4 3 5 4 5 ? 1 5 2 5 4 4 5 6 4 3 1 4 2 7 4 1 4 5 4 1 ITEMS USERS ● Would user 4 like item 11?
  • 34. 34 Item-based collaborative filtering 1 2 3 4 5 6 7 8 9 1 0 1 1 1 2 1 1 5 2 3 2 5 1 5 4 5 3 2 3 5 5 1 3 2 4 4 3 5 4 5 1 5 2 5 4 4 5 6 4 3 1 4 2 7 4 1 4 5 4 1 ITEMS USERS ● Compute pair-wise similarities to target item 2.0 1.5 2.3 2.0 1.0 1.0 1.0 1.0 1.5 2.0 - 2.0
  • 35. 35 Item-based collaborative filtering 1 2 3 4 5 6 7 8 9 1 0 1 1 1 2 1 1 5 2 3 2 5 1 5 4 5 3 2 3 5 5 1 3 2 4 4 3 5 4 5 1 5 2 5 4 4 5 6 4 3 1 4 2 7 4 1 4 5 4 1 ITEMS USERS ● Pick k most similar items 1.0 1.0 1.0 1.0 -
  • 36. 36 Item-based collaborative filtering 1 2 3 4 5 6 7 8 9 1 0 1 1 1 2 1 1 5 2 3 2 5 1 5 4 5 3 2 3 5 5 1 3 2 4 4 3 5 4 5 4.5 1 5 2 5 4 4 5 6 4 3 1 4 2 7 4 1 4 5 4 1 ITEMS USERS ● Average ratings of target user on item 1.0 1.0 1.0 1.0 -
  • 37. 37 Performance implications ● Similarity between users is uncovered slowly ● Similarity between items is supposedly static – Can be precomputed! ● Item-based clusters can also be precomputed [source]
  • 38. 38 Weaknesses ● Assumes standardized products – E.g. a touristic destination at any time of the year and under any circumstance is the same item ● Does not take into account context ● Requires a relatively large number of transactions to yield reasonable results
  • 39. 39 Cold-start problem ● What to do with a new item? ● What to do with a new user?
  • 40. 40 Assumptions ● Collaborative filtering assumes the following: – We take recommendations from friends – Friends have similar tastes – A person who have similar tastes to you, could be your friend – Discover people with similar tastes, use them as friends ● BUT, people's tastes are complex!
  • 41. 41 Ordinary people and extraordinary tastes [Goel et al. 2010] Distribution of user eccentricity: the median rank of consumed items. In the null model, users select items proportional to item popularity
  • 43. 43 2D projection of interests [Koren et al. 2009]
  • 44. 44 SVD approach ● R is the matrix of ratings – n users, m items ● U is a user-factor matrix ● S is a diagonal matrix, strength of each factor ● V is a factor-item matrix ● Matrices USV can be computed used an approximate SVD method
  • 45. 45 General factorization approach ● R is the matrix of ratings – n users, m items ● P is a user-factor matrix ● Q is a factor-item matrix (Sometimes we force P, Q to be non-negative: factors are easier to interpret!)
  • 46. 46 What is this plot? [Koren et al. 2009]
  • 47. 47 Computing expected ratings ● Given: – user vector – item vector ● Expected rating is
  • 48. 48 Model directly observed ratings ● Ro are the observed ratings ● We want to minimize a reconstruction error ● Second term avoids over-fitting – Parameter λ found by cross-validation ● Two basic optimization methods
  • 49. 49 1. Stochastic gradient descend ● Compute reconstruction error ● Update in opposite direction to gradient http://sifter.org/~simon/journal/20061211.html learning speed
  • 50. 50 Illustration: batch gradient descent vs stochastic gradient descent Batch: gradient Stochastic: single-example gradient [source]
  • 51. 51 A simpler example of gradient descent Fit a set of n two-dimensional data points (xi,yi) with a line L(x)=w1+w2x, means minimizing: The update rule is to take a random point and do: https://en.wikipedia.org/wiki/Stochastic_gradient_descent
  • 52. 52 2. Alternating least squares ● With vectors p fixed: – Find vectors q that minimize above function ● With vectors q fixed – Find vectors p that minimize above function ● Iterate until convergence ● Slower in general, but parallelizes better
  • 54. 54 Ratings are not normally-distributed [Marlin et al. 2007, Hu et al. 2009] Sometimes referred to as the “J” distribution of ratings Amazon (DVDs, Videos, Books)
  • 55. 55 How you label ratings matters
  • 56. 56 In general, there are many biases ● Some movies always get better (or worse) ratings than others ● Some people always give better (or worse) ratings than others ● Some systems make people give better (or worse) ratings than others, e.g. labels ● Time-sensitive user preferences
  • 58. 58 Other approaches ● Association rules (sequence-mining based) ● Regression (e.g. using a neural network) – e.g. based on user characteristics, number of ratings in different tags/categories ● Clustering ● Learning-to-rank
  • 59. 59 Hybrid methods (some types) ● Weighted – E.g. average recommended scores of two methods ● Switching – E.g. use one method when little info is available, a different method when more info is available ● Cascade – E.g. use a clustering-based approach, then refine using a collaborative filtering approach
  • 60. 60 Context-sensitive methods ● Context: where, when, how, ... ● Pre-filter ● Post-filter ● Context-aware methods – E.g. tensor factorization
  • 62. 62 Evaluation methodologies ● User experiments ● Precision @ Cut-off ● Ranking-based metrics – E.g. Kendall's Tau ● Score-based metrics – E.g. RMSE http://www-users.cs.umn.edu/~cosley/research/gl/evaluating-herlocker-tois2004.pdf
  • 63. 63 Example user testing ● [Liu et al. IUI 2010] News recommender
  • 64. 64 Score-based metric: RMSE ● “Root of mean square error” ● Problem with all score-based metrics: niche items with good ratings (by those who consumed them)
  • 65. 65 Evaluation by RMSE [Slide from Smola 2012]
  • 67. 67 Netflix challenge results ● It is easy to provide reasonable results ● It is hard to improve them