Recommender Systems
Based on Rajaraman and Ullman, Mining of Massive Datasets, and
Francesco Ricci et al., Recommender Systems Handbook.
Recommender System
o Predict ratings for unrated items
o Recommend top-k items
RS – Major Approaches
• Basic question: given 𝑅 of size |𝑈| × |𝐼| (highly incomplete/sparse) and a pair
(𝑢, 𝑖), predict 𝑟𝑢,𝑖. (A storage sketch follows the example below.)
Example ratings matrix over items 𝒊𝟏–𝒊𝟔 (each user has rated only three items;
all other entries are unknown):
𝑢1: 1, 3, 5
𝑢2: 1, 4, 4
𝑢3: 4, 2, 3
𝑢4: 3, 5, 4
𝑢5: 4, 4, 3
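A minimal sketch, assuming a dict-of-dicts layout, of how such a sparse ratings matrix can be held and queried. Only the rating values per user come from the table above; the item placements are illustrative.

```python
# A minimal sketch (assumed in-memory layout, not from the slides): the sparse
# ratings matrix R stored as a dict of dicts keyed by user, then item.
# Item placements below are illustrative; only each user's values are from the example.
ratings = {
    "u1": {"i1": 1, "i3": 3, "i6": 5},
    "u2": {"i2": 1, "i4": 4, "i5": 4},
    "u3": {"i1": 4, "i3": 2, "i6": 3},
    "u4": {"i2": 3, "i4": 5, "i5": 4},
    "u5": {"i1": 4, "i3": 4, "i6": 3},
}

def get_rating(user, item):
    """Return r_{u,i} if observed, else None -- the value the recommender must predict."""
    return ratings.get(user, {}).get(item)

print(get_rating("u1", "i3"))  # 3    (observed)
print(get_rating("u1", "i2"))  # None (missing: to be predicted)
```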
RS – Approaches
• Content-based: how similar is 𝑖 to the items 𝑢 has rated/liked in the past?
– Use metadata to measure similarity (a sketch follows below).
+ Works even when no ratings are available for the items in question.
- Requires metadata!
• Collaborative Filtering: identify items (users) with their rating vectors; no
metadata needed, but cold start is a problem.
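For contrast with CF, a minimal content-based sketch (all item names and metadata tags below are hypothetical): items are compared through their metadata, so a candidate can be scored even before anyone has rated it.

```python
# Minimal content-based scoring sketch; item_tags and the tag sets are hypothetical.
def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

item_tags = {
    "i1": {"sci-fi", "action"},
    "i2": {"romance", "drama"},
    "i3": {"sci-fi", "drama"},   # brand-new item: no ratings needed to score it
}
liked_by_u = ["i1"]              # items the user rated/liked in the past

def content_score(candidate):
    """Average metadata similarity between the candidate and the user's liked items."""
    return sum(jaccard(item_tags[candidate], item_tags[j]) for j in liked_by_u) / len(liked_by_u)

print(round(content_score("i3"), 2))  # 0.33 -- shares "sci-fi" with a liked item
print(round(content_score("i2"), 2))  # 0.0  -- no shared metadata
```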
RS – Approaches
• CF can be memory-based (as sketched on the previous slide): item 𝑖's
characteristics are captured by the ratings it has received (its rating vector).
• Or it can be model-based: model user/item behavior via latent factors (to be
learned from the data).
– Dimensionality reduction.
– The underlying ratings matrix is assumed to be (approximately) low rank.
⇒ Matrix completion:
• using singular value decomposition (SVD);
• using matrix factorization (MF) [and variants] – a sketch follows below.
• MovieLens – an example of an RS using CF.
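A minimal model-based sketch of matrix factorization (an illustration, not the MovieLens implementation): latent factors are fit by SGD on the observed entries only, and a missing rating is predicted as the dot product of the learned user and item factors. The data and hyperparameters below are hypothetical.

```python
import numpy as np

# Learn k latent factors per user and item by SGD on the observed entries only,
# then predict a missing rating as p_u . q_i.
def factorize(observed, n_users, n_items, k=2, lr=0.01, reg=0.05, epochs=300, seed=0):
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_users, k))   # user factors
    Q = rng.normal(scale=0.1, size=(n_items, k))   # item factors
    for _ in range(epochs):
        for u, i, r in observed:                   # known ratings only
            pu = P[u].copy()
            err = r - pu @ Q[i]                    # prediction error on this entry
            P[u] += lr * (err * Q[i] - reg * pu)
            Q[i] += lr * (err * pu - reg * Q[i])
    return P, Q

# (user, item, rating) triples from a tiny hypothetical ratings matrix
observed = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]
P, Q = factorize(observed, n_users=3, n_items=3)
print(round(float(P[0] @ Q[2]), 2))   # predicted rating of user 0 on unrated item 2
```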
Collaborative Filtering
Key concepts/questions
• How is user feedback expressed: explicit ratings or implicit signals?
• How do we measure similarity?
• How many nearest neighbors do we pick (if memory-/neighborhood-based)?
• How do we predict unknown ratings?
• The distinguished (also called active) user and the (target) item.
A Naïve Algorithm (memory-based)
• Find the top-ℓ most similar neighbors of the distinguished user 𝑢 (using the
chosen similarity or proximity measure).
• For every item 𝑖 rated by sufficiently many of these neighbors, compute 𝑟𝑢𝑖 by
aggregating those neighbors' ratings.
• Sort the items by predicted rating and recommend the top-𝑘 items to 𝑢. (A
sketch follows below.)
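A minimal sketch of this naïve algorithm; the dict-of-dicts `ratings` layout and the pluggable `sim(u, v)` similarity function are assumptions, not part of the slides.

```python
# Naive memory-based recommendation: top-ell neighbors, aggregate, recommend top-k.
def recommend(u, ratings, sim, ell=3, k=2, min_support=1):
    # 1. top-ell most similar neighbors of the distinguished user u
    neighbors = sorted((v for v in ratings if v != u),
                       key=lambda v: sim(u, v), reverse=True)[:ell]
    # 2. for every item rated by sufficiently many neighbors (and not yet by u),
    #    aggregate the neighbors' ratings -- here, a simple average
    preds = {}
    candidates = {i for v in neighbors for i in ratings[v]} - set(ratings[u])
    for i in candidates:
        votes = [ratings[v][i] for v in neighbors if i in ratings[v]]
        if len(votes) >= min_support:
            preds[i] = sum(votes) / len(votes)
    # 3. sort by predicted rating and return the top-k items
    return sorted(preds, key=preds.get, reverse=True)[:k]

# toy usage with a count-of-common-items similarity (hypothetical data)
R = {"u1": {"i1": 5, "i2": 3}, "u2": {"i1": 4, "i3": 5}, "u3": {"i2": 2, "i3": 4}}
overlap = lambda u, v: len(R[u].keys() & R[v].keys())
print(recommend("u1", R, overlap))   # ['i3']
```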
An Example
Ratings (writing A, B, C, D for 𝑢1–𝑢4; all other entries of the 4 × 7 matrix over
𝑖1–𝑖7 are unknown):
𝑢1 (A): 𝑖1 = 4, 𝑖4 = 5, 𝑖5 = 1
𝑢2 (B): 𝑖1 = 5, 𝑖2 = 5, 𝑖3 = 4
𝑢3 (C): 𝑖4 = 2, 𝑖5 = 4, 𝑖6 = 5
𝑢4 (D): 𝑖2 = 3, 𝑖7 = 3
• Jaccard(A, B) = 1/5 < 2/4 = Jaccard(A, C)!
• cos(A, B) = (4 × 5)/(‖A‖ · ‖B‖) ≈ 0.380 > 0.322 ≈ cos(A, C). – OK, but this
ignores each user's internal "rating scale" (easy vs. hard graders).
• See the Rajaraman et al. book for "rounded" Jaccard/cosine.
• A more principled approach: subtract each user's mean rating from that user's
ratings, then apply Jaccard/cosine.
An Example
Mean-centered ratings (each user's own mean subtracted):
𝑢1 (A): 𝑖1 = 2/3, 𝑖4 = 5/3, 𝑖5 = −7/3
𝑢2 (B): 𝑖1 = 1/3, 𝑖2 = 1/3, 𝑖3 = −2/3
𝑢3 (C): 𝑖4 = −5/3, 𝑖5 = 1/3, 𝑖6 = 4/3
𝑢4 (D): 𝑖2 = 0, 𝑖7 = 0
• Note what just happened to the ratings!
• Users' behaviors (and the items) are now better separated.
• The cosine can now be positive or negative: check (A, B) and (A, C). (Both are
recomputed in the sketch below.)
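A small sketch that recomputes the numbers above, assuming the item placement of the reconstructed tables (A, B, C stand for 𝑢1, 𝑢2, 𝑢3).

```python
import math

A = {"i1": 4, "i4": 5, "i5": 1}
B = {"i1": 5, "i2": 5, "i3": 4}
C = {"i4": 2, "i5": 4, "i6": 5}

def jaccard(x, y):
    return len(x.keys() & y.keys()) / len(x.keys() | y.keys())

def cosine(x, y):
    dot = sum(x[i] * y[i] for i in x.keys() & y.keys())
    nx = math.sqrt(sum(v * v for v in x.values()))
    ny = math.sqrt(sum(v * v for v in y.values()))
    return dot / (nx * ny)

def center(x):
    m = sum(x.values()) / len(x)            # subtract the user's mean rating
    return {i: v - m for i, v in x.items()}

print(jaccard(A, B), jaccard(A, C))                    # 0.2  0.5
print(round(cosine(A, B), 3), round(cosine(A, C), 3))  # 0.38  0.322
print(round(cosine(center(A), center(B)), 3),          # 0.092  (now positive)
      round(cosine(center(A), center(C)), 3))          # -0.559 (now negative)
```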
Prediction using Memory/Neighborhood-based Approaches
• A popular approach uses the Pearson correlation coefficient (a sketch follows
below):

$$\hat{r}_{ui} = \bar{r}_u + K \sum_{v \in N(u) \cap U(i)} w_{uv}\,(r_{vi} - \bar{r}_v),$$

where

$$w_{uv} = \frac{\sum_{j \in I(u) \cap I(v)} (r_{uj} - \bar{r}_u)(r_{vj} - \bar{r}_v)}{\sqrt{\sum_{j \in I(u) \cap I(v)} (r_{uj} - \bar{r}_u)^2}\;\sqrt{\sum_{j \in I(u) \cap I(v)} (r_{vj} - \bar{r}_v)^2}},$$

𝑁(𝑢) is 𝑢's neighborhood, 𝑈(𝑖) is the set of users who rated 𝑖, 𝐼(𝑢) is the set
of items rated by 𝑢, and 𝐾 is the usual normalization factor.
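A minimal sketch of this Pearson-weighted prediction; the `{user: {item: rating}}` layout, the neighbor list, and the choice K = 1/Σ|w_uv| are assumptions.

```python
import math

def mean(u, ratings):
    return sum(ratings[u].values()) / len(ratings[u])

def pearson(u, v, ratings):
    """w_uv computed over the items both u and v have rated."""
    common = ratings[u].keys() & ratings[v].keys()
    if not common:
        return 0.0
    mu, mv = mean(u, ratings), mean(v, ratings)
    num = sum((ratings[u][j] - mu) * (ratings[v][j] - mv) for j in common)
    du = math.sqrt(sum((ratings[u][j] - mu) ** 2 for j in common))
    dv = math.sqrt(sum((ratings[v][j] - mv) ** 2 for j in common))
    return num / (du * dv) if du and dv else 0.0

def predict(u, i, ratings, neighbors):
    """r_hat_ui = mean(u) + K * sum over neighbors v who rated i of w_uv * (r_vi - mean(v))."""
    raters = [v for v in neighbors if i in ratings[v]]
    weights = {v: pearson(u, v, ratings) for v in raters}
    denom = sum(abs(w) for w in weights.values())
    if denom == 0:
        return mean(u, ratings)                 # no usable neighbors: fall back to u's mean
    K = 1.0 / denom
    return mean(u, ratings) + K * sum(w * (ratings[v][i] - mean(v, ratings))
                                      for v, w in weights.items())
```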
User-User vs Item-Item.
• User-User CF: what we just discussed!
• Item-Item – dual in principle: find items most
similar to distinguished item 𝑖; for every user
𝑢 who did not rate the distinguished item but
rated sufficiently many from the similarity
group, compute 𝑟𝑢𝑖.
• In practice, item-item has been found to be
better than user-user.
Simpler Alternatives for Rating Estimation
• Simple average of ratings by most similar neighbors.
• Weighted average.
• User's mean plus an offset: the weighted average of the most similar
neighbors' offsets from their own means (the Pearson approach above!).
• Or take the popular vote of the most similar neighbors: e.g., 𝑢 has 5 most
similar neighbors who have rated 𝑖.
– 𝑣1, 𝑣2 rated it 1; 𝑣3 rated it 3; 𝑣4 rated it 4; 𝑣5 rated it 5.
– Simple majority: 𝑟𝑢𝑖 = 1.
– Suppose 𝑤𝑢𝑣1 = 𝑤𝑢𝑣2 = 0.2, 𝑤𝑢𝑣3 = 0.3, 𝑤𝑢𝑣4 = 0.8, 𝑤𝑢𝑣5 = 1.0. Then the
weighted vote gives 𝑟𝑢𝑖 = 5. Ties are broken arbitrarily. (See the sketch below.)
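A small sketch reproducing the voting example above, contrasting the simple majority with the similarity-weighted vote.

```python
from collections import defaultdict

# Neighbors v1..v5, their ratings of i, and their similarity weights w_{u,v}.
votes = {"v1": 1, "v2": 1, "v3": 3, "v4": 4, "v5": 5}
weights = {"v1": 0.2, "v2": 0.2, "v3": 0.3, "v4": 0.8, "v5": 1.0}

# simple majority: the most frequent rating wins (ties broken arbitrarily)
counts = defaultdict(int)
for v, r in votes.items():
    counts[r] += 1
print(max(counts, key=counts.get))   # 1

# weighted vote: each rating is scored by the total similarity weight behind it
scores = defaultdict(float)
for v, r in votes.items():
    scores[r] += weights[v]
print(max(scores, key=scores.get))   # 5
```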
Item-based CF
• Dual to user-based CF, in principle.
• “People who bought 𝑆 also bought 𝑇”.
• Natural connection to association rules (each user = a
transaction).
• Predict unknown rating of user 𝑢 on item 𝑖 as the aggregate
of ratings by 𝑢 on items similar to 𝑖.
• E.g., using mean-centering and Pearson correlation for item–item similarity (a
sketch follows below):

$$\hat{r}_{ui} = \bar{r}_i + K \sum_{j \in I(u) \cap N(i)} w_{ij}\,(r_{uj} - \bar{r}_j),$$

where the first term is the mean rating of item 𝑖 across users, 𝑁(𝑖) is the set
of items most similar to 𝑖, 𝑤𝑖𝑗 is the similarity between 𝑖 and 𝑗, and 𝐾 is the
usual normalization factor.
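A minimal sketch of this item-based prediction; the `{item: {user: rating}}` layout, the pluggable `sim(i, j)` function, and K = 1/Σ w_ij are assumptions.

```python
def item_mean(i, item_ratings):
    return sum(item_ratings[i].values()) / len(item_ratings[i])

def predict_item_based(u, i, item_ratings, sim):
    # N(i) here: items u has rated whose similarity to i is positive
    weights = {j: sim(i, j) for j in item_ratings if j != i and u in item_ratings[j]}
    weights = {j: w for j, w in weights.items() if w > 0}
    if not weights:
        return item_mean(i, item_ratings)          # fall back to i's mean rating
    K = 1.0 / sum(weights.values())
    offset = sum(w * (item_ratings[j][u] - item_mean(j, item_ratings))
                 for j, w in weights.items())
    return item_mean(i, item_ratings) + K * offset

# toy usage with a hypothetical similarity (number of shared raters) and ratings
R = {"i1": {"u1": 5, "u2": 4}, "i2": {"u1": 4, "u2": 5}, "i3": {"u2": 2}}
shared_raters = lambda i, j: len(R[i].keys() & R[j].keys())
print(predict_item_based("u1", "i3", R, shared_raters))   # 2.0
```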
Item-based CF Computation Illustrated
• Similarities: computing the similarity between all pairs of items is prohibitive!
• But do we need to?
• How efficiently can we compute the similarity of all pairs of items for which
the similarity is positive? (See the sketch below.)
[Figure: a sparse ratings matrix; X marks observed ratings, with one user 𝑢 and
one item 𝑖 highlighted.]
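One way to do it (an assumption, not the slides' prescription): sweep over users and accumulate statistics only for item pairs co-rated by at least one user; pairs with empty overlap, whose similarity would be zero anyway, are never generated. The same sweep can accumulate dot products for cosine or Pearson instead of plain counts.

```python
from collections import defaultdict
from itertools import combinations

def corated_counts(ratings):
    """ratings: {user: {item: rating}} -> {(i, j): number of users who rated both}."""
    counts = defaultdict(int)
    for items in ratings.values():
        for i, j in combinations(sorted(items), 2):   # only pairs this user co-rated
            counts[(i, j)] += 1
    return counts

R = {"u1": {"i1": 5, "i2": 3}, "u2": {"i1": 4, "i2": 2, "i3": 5}, "u3": {"i3": 4}}
print(dict(corated_counts(R)))
# {('i1', 'i2'): 2, ('i1', 'i3'): 1, ('i2', 'i3'): 1} -- never co-rated pairs don't appear
```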
Item-based CF – Recommendation Generation
[Figure: user 𝑢's row, with X marking the items 𝑢 has rated; for each such item
we ask which items are similar to it, and whether the target item 𝑖 is among them.]
How efficiently can we generate recommendations for a given user?
Some empirical facts re. user-based vs.
item-based CF
• User profiles are typically thinner than item
profiles; depends on application domain.
– Certainly holds for movies (Netflix).
• ⇒ As users provide more ratings, user–user similarities can change more
dynamically than item–item similarities.
• Can we precompute item-item sim. and speed up
prediction computation?
• What about refreshing sim. against updates? Can
we do it incrementally? How often should we do
this?
• Why not do this for user-user?
User & Item-based CF are both
personalized
• A non-personalized approach would estimate an unknown rating as a global
average.
• Under it, every user gets the same recommendation list, modulo items he or she
may have already rated.
• Personalization clearly leads to better predictions.
Similar to Recommender Systems.pptx
• Introduction to Recommendation System – Minha Hwang
• Data mining approaches and methods – sonangrai
• Weka bike rental – Pratik Doshi
• Machine Learning – Girish Khanzode
• Buidling large scale recommendation engine – Keeyong Han
• Apache Mahout Tutorial - Recommendation - 2013/2014 – Cataldo Musto
• Item basedcollaborativefilteringrecommendationalgorithms – Aravindharamanan S
• Recommender systems for E-commerce – Alexander Konduforov
• How to Win Machine Learning Competitions ? – HackerEarth
• Fashiondatasc
• Item Based Collaborative Filtering Recommendation Algorithms – nextlib
• Recommenders.ppt – Aravind Reddy
• Recommenders.ppt – NagendraBabu27244
• Nbvtalkonfeatureselection – Nagasuri Bala Venkateswarlu
• Predictive analytics – Dinakar nk
• Recommender systems in practice – BigData Republic
• Unit 3 – AIML.pptx – hiblooms
• few common Feature of Size Datum Features are bores, cylinders, slots, or tab... – DrPArivalaganASSTPRO
• Recommendation Systems – Robin Reni
• Sparking Science up with Research Recommendations by Maya Hristakeva – Spark Summit