12th Seminar on ICT
18 July 2013, 09.00 – 11.00, BSc0502
Basic of Recommender System
Worapot Jakkhupan, Ph.D.
Information and Communication Technology, Faculty of Science, Prince of Songkla University
Presented on 18 July 2013 at the 12th Seminar on Information and Communication Technology, Prince of Songkla University

  • A simple solution to these decision problems is to index the information so that a user who already knows what to buy can locate the item more easily. In some cases, however, the reader does not know what to search for, or knows it only up to a point. This problem is similar to movie selection but has its own features: how does one decide within a category? The context of the search helps narrow down the set of candidates, but only so far.
  • News is another interesting “product” selection problem. Here the product is a piece of information: a news item. News is produced continuously, so there is no stable repository of items. When new content appears, it must be routed only to the readers who may be interested in it. Consider a mobile scenario: when traveling with a mobile phone or a PDA, the small screen should list only the news that is likely to interest the user. Classification can be extremely useful here, as can simple summaries (see RSS feeds). How the information is consumed also matters; in this case the user can opt in to a particular channel.
  • Insert picture
  • Basic of Recommender System

    1. 12th Seminar on ICT, 18 July 2013, 09.00 – 11.00, BSc0502. Basic of Recommender System. Worapot Jakkhupan, Ph.D., Information and Communication Technology, Faculty of Science, Prince of Songkla University
    2. About me: Worapot Jakkhupan, Ph.D.
        Affiliation
        • BSc0406/2
        • 8662
        • worapot.j(at)psu.ac.th
        • http://ict.sci.psu.ac.th/new/worapot_j.php
        Fields of Interest
        • RFID, NFC
        • Information Integration
        • Recommender System
        • Data Warehouse
        • Web Technologies
        • Augmented Reality
        • Etc.
        Basic of Recommender System, Seminar on ICT, 2 of 41
    3. Outline
        • Why Recommender System (RS)?
        • Types of RS
        • Basic approaches
            – Collaborative Filtering (CF) RS
            – Content-based (CB) RS
        • Trends of RS
    4. Information Overload: news items, books, journals, research papers; TV programs, music CDs, movie titles; consumer products, e-commerce items, Web pages, Usenet articles, e-mails
    5. Which books should I buy?
    6. Which movies should I see/watch?
    7. Which news should I read?
    8. Recommender System (RS)
        • A Recommender System (RS) is a system that predicts users’ personal tastes and offers them products or services they are likely to enjoy.
        • An RS serves two important functions:
            – It helps users deal with information overload by giving them recommendations of products, etc.
            – It helps businesses make more profit, i.e., by selling more products.
    9. Recommender System
        • An RS helps people who lack sufficient personal experience or competence to evaluate the number of alternatives offered by a Web site.
        • It provides consumers with information to help them decide which items to purchase.
        • It recommends items to its users in the form of personalized, ranked lists of items.
    10. Recommender System
    11. Two Basic Approaches
        • Collaborative filtering RS
            – The user is recommended items that people with similar tastes and preferences liked in the past.
        • Content-based RS
            – The user is recommended items similar to the ones the user preferred in the past.
        • Hybrids
            – Combine collaborative and content-based methods.
    12. Collaborative Filtering (CF)
        Legend: positive rating, negative rating, unknown (?).
        Reasonably predict the opinion the user will have on the different items, and recommend the “best” items to each user, based on the user’s previous likings and the opinions of other like-minded users.
    13. Positive/Negative Rating
        [Table: positive/negative rating marks of Users A–D on Items 1–5; User D’s rating of Item 4 is unknown (???)]
        Should we recommend Item 4 to User D?
    14. Collaborative Filtering. Follow these steps:
        • Find users who are similar to the target user (nearest, best neighbors) in terms of tastes, preferences, and past behavior.
        • Aggregate the weighted preferences of these neighbors.
        • Make recommendations based on these aggregated, weighted preferences.
    15. Calculate similarity between users
        [Table: positive/negative rating marks of Users A–D on Items 1–5, with a similarity column Sim(u, D)]
        User A: Sim(A,D) = 4/4 = 1.0
        User B: Sim(B,D) = 1/4 = 0.25
        User C: Sim(C,D) = 1/4 = 0.25
        User D: -
        • User A has the highest similarity score with User D.
        • Select A as the recommender for D.
        • If A likes Item 4, D is likely to like Item 4 as well.
    16. Score Rating
                  Item 1  Item 2  Item 3  Item 4  Item 5
        User A      4       4       1       4       3
        User B      2       1       4       2       5
        User C      3       1       3       2       1
        User D      5       4       2      ???      3
        • How to calculate similarity between users?
        • How to predict the rating score of User D on Item 4?
    17. Similarity Weighting: Pearson’s correlation coefficient (PCC)
        $$PCC_{a,u} = \frac{\operatorname{covar}(r_a, r_u)}{\sigma_{r_a}\,\sigma_{r_u}}$$
        $r_a$ and $r_u$ are the rating vectors for the $m$ items rated by both $a$ and $u$; $r_{i,j}$ is user $i$’s rating for item $j$.
    18. Similarity Weighting: covariance and standard deviation
        $$\operatorname{covar}(r_a, r_u) = \frac{\sum_{i=1}^{m} (r_{a,i} - \bar{r}_a)(r_{u,i} - \bar{r}_u)}{m}$$
        $$\bar{r}_x = \frac{\sum_{i=1}^{m} r_{x,i}}{m}$$
        $$\sigma_{r_x} = \sqrt{\frac{\sum_{i=1}^{m} (r_{x,i} - \bar{r}_x)^2}{m}}$$
    19. Average score and STDDEV (both computed over Items 1, 2, 3, 5)
                  Item 1  Item 2  Item 3  Item 4  Item 5  Average  STDDEV
        User A      4       4       1       4       3       3       1.2
        User B      2       1       4       2       5       3       1.6
        User C      3       1       3       2       1       2       1
        User D      5       4       2      ???      3       3.5     1.1
        $$\bar{r}_A = \frac{4+4+1+3}{4} = 3, \quad \bar{r}_B = \frac{2+1+4+5}{4} = 3, \quad \bar{r}_C = \frac{3+1+3+1}{4} = 2, \quad \bar{r}_D = \frac{5+4+2+3}{4} = 3.5$$
        $$\sigma_{r_A} = \sqrt{\frac{(4-3)^2 + (4-3)^2 + (1-3)^2 + (3-3)^2}{4}} = 1.224745$$
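The averages and standard deviations on this slide can be checked with a few lines of Python. This is a minimal sketch, not the author's code: the `ratings` dictionary and the helper name `co_rated` are illustrative assumptions, and `None` stands in for User D's unknown rating of Item 4. It uses the population standard deviation (division by m), matching the formula on the previous slide.

```python
from statistics import fmean, pstdev

# Score ratings from the slides; None marks User D's unknown rating of Item 4.
ratings = {
    "A": [4, 4, 1, 4, 3],
    "B": [2, 1, 4, 2, 5],
    "C": [3, 1, 3, 2, 1],
    "D": [5, 4, 2, None, 3],
}

def co_rated(user, target="D"):
    """Ratings of `user` restricted to the items that `target` has also rated."""
    return [r for r, t in zip(ratings[user], ratings[target])
            if r is not None and t is not None]

# (average, population standard deviation) per user over Items 1, 2, 3, 5
stats = {u: (fmean(co_rated(u)), pstdev(co_rated(u))) for u in ratings}

for user, (mean, std) in stats.items():
    print(f"User {user}: average={mean:.2f}, stddev={std:.2f}")
```

Rounded to one decimal place, the standard deviations agree with the table above.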
    20. Covariance
                  Item 1  Item 2  Item 3  Item 4  Item 5  Average  Covariance
        User A      4       4       1       4       3       3      cov(A,D) = 1.25
        User B      2       1       4       2       5       3      cov(B,D) = -1.25
        User C      3       1       3       2       1       2      cov(C,D) = 0
        User D      5       4       2      ???      3       3.5    -
        $$\operatorname{cov}(r_a, r_u) = \frac{\sum_{i=1}^{m} (r_{a,i} - \bar{r}_a)(r_{u,i} - \bar{r}_u)}{m}$$
        $$\operatorname{cov}(r_A, r_D) = \frac{(4-3)(5-3.5) + (4-3)(4-3.5) + (1-3)(2-3.5) + (3-3)(3-3.5)}{4} = 1.25$$
    21. Pearson’s correlation coefficient
                  Covariance  STDDEV  PCC
        User A      1.25       1.2     0.9
        User B     -1.25       1.6    -0.7
        User C      0          1       0
        User D      -          1.1     -
        Best neighbor: User A (highest PCC)
        $$PCC_{A,D} = \frac{\operatorname{covar}(r_A, r_D)}{\sigma_{r_A}\,\sigma_{r_D}} = \frac{1.25}{1.2 \times 1.1} \approx 0.9$$
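The PCC column can be reproduced directly from the score ratings. The sketch below is only an illustration under stated assumptions: the function name `pcc` and the `ratings` layout are hypothetical, `None` marks the unknown rating, and the covariance and standard deviations are the population versions (division by m) over the co-rated items, as on the preceding slides.

```python
from math import sqrt

# Score ratings from the slides; None marks User D's unknown rating of Item 4.
ratings = {
    "A": [4, 4, 1, 4, 3],
    "B": [2, 1, 4, 2, 5],
    "C": [3, 1, 3, 2, 1],
    "D": [5, 4, 2, None, 3],
}

def pcc(a, u):
    """Pearson correlation between users a and u over the items both have rated."""
    pairs = [(x, y) for x, y in zip(ratings[a], ratings[u])
             if x is not None and y is not None]
    m = len(pairs)
    mean_a = sum(x for x, _ in pairs) / m
    mean_u = sum(y for _, y in pairs) / m
    cov = sum((x - mean_a) * (y - mean_u) for x, y in pairs) / m
    std_a = sqrt(sum((x - mean_a) ** 2 for x, _ in pairs) / m)
    std_u = sqrt(sum((y - mean_u) ** 2 for _, y in pairs) / m)
    return cov / (std_a * std_u)

similarities = {u: pcc(u, "D") for u in ("A", "B", "C")}
best_neighbor = max(similarities, key=similarities.get)
print(similarities, best_neighbor)
```

Without intermediate rounding, the PCC for User A comes out near 0.91 rather than the 0.95 obtained from the rounded standard deviations 1.2 and 1.1; both round to the 0.9 shown in the table, and User A remains the best neighbor.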
    22. Rating Prediction
        Predict a rating $p_{a,i}$ for each item $i$ for the active user $a$, using the $n$ selected neighbor users $u \in \{1, 2, \dots, n\}$. To account for users’ different rating levels, base predictions on differences from each user’s average rating. Weight each user’s rating contribution by that user’s similarity to the active user.
        $$p_{a,i} = \bar{r}_a + \frac{\sum_{u=1}^{n} w_{a,u}\,(r_{u,i} - \bar{r}_u)}{\sum_{u=1}^{n} w_{a,u}}$$
    23. Rating Prediction
                  Item 1  Item 2  Item 3  Item 4  Item 5  Average  PCC
        User A      4       4       1       4       3       3      0.9
        User D      5       4       2      4.5      3       3.5    -
        $$p_{D,4} = \bar{r}_D + \frac{\sum_{u=1}^{n} w_{D,u}\,(r_{u,4} - \bar{r}_u)}{\sum_{u=1}^{n} w_{D,u}} = 3.5 + \frac{0.9 \times (4 - 3)}{0.9} = 4.5$$
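The prediction step is short enough to express as one function. This is a sketch under the assumption that each neighbor is passed as a `(weight, rating_on_item, neighbor_average)` tuple; the name `predict` and that tuple layout are illustrative, not part of the slides. The denominator sums the raw weights, exactly as in the formula above.

```python
def predict(active_mean, neighbors):
    """Mean-centered, similarity-weighted rating prediction.

    neighbors: list of (weight, rating_on_item, neighbor_average) tuples.
    """
    numerator = sum(w * (r - mean) for w, r, mean in neighbors)
    denominator = sum(w for w, _, _ in neighbors)
    return active_mean + numerator / denominator

# User D (average 3.5) with the single best neighbor User A:
# PCC 0.9, A's rating of Item 4 is 4, A's average is 3.
p_D4 = predict(3.5, [(0.9, 4, 3.0)])
print(p_D4)
```

With only one neighbor the weight cancels out, so the prediction is User D's average plus User A's deviation from A's own average.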
    24. Pros of CF RS
        • Not too complicated, but powerful and effective
        • Very relevant recommendations
        • The bigger the database and the more past behavior it records, the better the recommendations
        • Basis of social recommendation
    25. Drawbacks of CF RS
        • Cold Start: a critical mass of users is needed in order to create a database of preferences
        • First Rater: new items cannot be recommended until someone has rated them
        • Sparsity: ratings are scarce (user profiles are sparse vectors of ratings), which also presents a problem
    26. Content-Based (CB) RS
        The key motivation behind these schemes is that a customer is more likely to purchase items that are similar or related to items he/she has purchased in the past.
        Item Profile
        • In a movie recommendation application, a movie may be represented by features such as specific actors, director, genre, subject matter, etc.
    27. Content-based RS
    28. Content-based RS: recommend items similar to those the user preferred in the past. Steps:
        • Create the item profiles
        • Calculate the similarity of the active item to the other items
        • Select the item(s) with high similarity
        • Predict the active user’s rating of the active item
    29. Item profile
        Information Retrieval (IR)
        • Keyword-based feature extraction
        • TF-IDF
        $$TF_{ij} = \frac{f_{ij}}{\max_k f_{kj}}, \qquad IDF_i = \log\frac{N}{n_i}, \qquad TFIDF_{ij} = TF_{ij} \times IDF_i$$
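A small sketch can make the three TF-IDF formulas concrete. The slide does not define its symbols, so the code assumes the usual reading: f_ij is the count of term i in document j, n_i is the number of documents containing term i, and N is the number of documents. The natural logarithm, the function name `tf_idf`, and the three hand-tokenized titles are further assumptions for illustration; no stemming or vector normalization is performed.

```python
from collections import Counter
from math import log

# Three hand-tokenized titles (an illustrative toy corpus).
docs = {
    "D4": ["data", "mining", "your", "website"],
    "D5": ["introduction", "to", "marketing"],
    "D7": ["marketing", "research", "a", "handbook"],
}

def tf_idf(docs):
    """Return {doc: {term: TF * IDF}} using TF = f / max f and IDF = log(N / n)."""
    n_docs = len(docs)                                   # N
    counts = {d: Counter(tokens) for d, tokens in docs.items()}
    doc_freq = Counter(t for c in counts.values() for t in c)   # n_i
    weights = {}
    for d, c in counts.items():
        max_f = max(c.values())                          # max_k f_kj
        weights[d] = {t: (f / max_f) * log(n_docs / doc_freq[t])
                      for t, f in c.items()}
    return weights

weights = tf_idf(docs)
print(weights["D5"])
```

In this toy corpus "marketing" appears in two of the three documents, so it receives a lower weight than terms that occur in only one.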
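    30. COUNT (term counts per document)
        D1: building data mining applications for crm
        D2: Accelerating Customer Relationships: Using CRM and Relationship Technologies
        D3: Mastering Data Mining: The Art and Science of Customer Relationship Management
        D4: Data Mining Your Website
        D5: Introduction to marketing
        D6: consumer behavior
        D7: marketing research, a handbook
        D8: customer knowledge management
        Term           D1  D2  D3  D4  D5  D6  D7  D8
        a               0   0   0   0   0   0   1   0
        accelerating    0   1   0   0   0   0   0   0
        and             0   1   1   0   0   0   0   0
        application     1   0   0   0   0   0   0   0
        art             0   0   1   0   0   0   0   0
        behavior        0   0   0   0   0   1   0   0
        building        1   0   0   0   0   0   0   0
        consumer        0   0   0   0   0   1   0   0
        crm             1   1   0   0   0   0   0   0
        customer        0   1   1   0   0   0   0   1
        data            1   0   1   1   0   0   0   0
        for             1   0   0   0   0   0   0   0
        handbook        0   0   0   0   0   0   1   0
        introduction    0   0   0   0   1   0   0   0
        knowledge       0   0   0   0   0   0   0   1
        management      0   0   1   0   0   0   0   1
        marketing       0   0   0   0   1   0   1   0
        mastering       0   0   1   0   0   0   0   0
        mining          1   0   1   1   0   0   0   0
        of              0   0   1   0   0   0   0   0
        relationship    0   2   1   0   0   0   0   0
        research        0   0   0   0   0   0   1   0
        science         0   0   1   0   0   0   0   0
        technology      0   1   0   0   0   0   0   0
        the             0   0   1   0   0   0   0   0
        to              0   0   0   0   1   0   0   0
        using           0   1   0   0   0   0   0   0
        website         0   0   0   1   0   0   0   0
        your            0   0   0   1   0   0   0   0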
    31. TFIDF Normed Vectors (highlighted on the slide: 0.187 and 0.316)
        D1: building data mining applications for crm
        D2: Accelerating Customer Relationships: Using CRM and Relationship Technologies
        D3: Mastering Data Mining: The Art and Science of Customer Relationship Management
        D4: Data Mining Your Website
        D5: Introduction to marketing
        D6: consumer behavior
        D7: marketing research, a handbook
        D8: customer knowledge management
        Term           D1     D2     D3     D4     D5     D6     D7     D8
        a              0      0      0      0      0      0      0.537  0
        accelerating   0      0.432  0      0      0      0      0      0
        and            0      0.296  0.256  0      0      0      0      0
        application    0.502  0      0      0      0      0      0      0
        art            0      0      0.374  0      0      0      0      0
        behavior       0      0      0      0      0      0.707  0      0
        building       0.502  0      0      0      0      0      0      0
        consumer       0      0      0      0      0      0.707  0      0
        crm            0.344  0.296  0      0      0      0      0      0
        customer       0      0.216  0.187  0      0      0      0      0.381
        data           0.251  0      0.187  0.316  0      0      0      0
        for            0.502  0      0      0      0      0      0      0
        handbook       0      0      0      0      0      0      0.537  0
        introduction   0      0      0      0      0.636  0      0      0
        knowledge      0      0      0      0      0      0      0      0.763
        management     0      0      0.256  0      0      0      0      0.522
        marketing      0      0      0      0      0.436  0      0.368  0
        mastering      0      0      0.374  0      0      0      0      0
        mining         0.251  0      0.187  0.316  0      0      0      0
        of             0      0      0.374  0      0      0      0      0
        relationship   0      0.468  0.256  0      0      0      0      0
        research       0      0      0      0      0      0      0.537  0
        science        0      0      0.374  0      0      0      0      0
        technology     0      0.432  0      0      0      0      0      0
        the            0      0      0.374  0      0      0      0      0
        to             0      0      0      0      0.636  0      0      0
        using          0      0.432  0      0      0      0      0      0
        website        0      0      0      0.632  0      0      0      0
        your           0      0      0      0.632  0      0      0      0
        $$TFIDF_{ij} = TF_{ij} \times IDF_i$$
    32. Content-based RS
                  User A’s rating   Item profile
                                    Romance  Action
        Movie 1        5               1        0
        Movie 2        4               1        0
        Movie 3       ???              1        0
        Movie 4        0               0        1
        Movie 5        0               0        1
        • Active Item = Movie 3
        • Active User = User A
        • User profiling: User A prefers romance movies to action movies
        • Item content: Movie 3 appears to have romantic content
        • Should we recommend Movie 3 to User A?
    33. Similarity between items
                  User A’s rating   Romance  Action   sim(m,3)
        Movie 1        5               1        0      Sim(1,3) = 1
        Movie 2        4               1        0      Sim(2,3) = 1
        Movie 3       ???              1        0      -
        Movie 4        0               0        1      Sim(4,3) = 0
        Movie 5        0               0        1      Sim(5,3) = 0
        $$sim(a, b) = \frac{\sum_{i=1}^{N} w_{i,a}\,w_{i,b}}{\sqrt{\sum_{i=1}^{N} w_{i,a}^2}\;\sqrt{\sum_{i=1}^{N} w_{i,b}^2}}$$
        $$sim(M_1, M_3) = \frac{(1 \times 1) + (0 \times 0)}{\sqrt{1^2 + 0^2}\;\sqrt{1^2 + 0^2}} = 1$$
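The similarity column follows from the cosine formula applied to the two-feature item profiles. The sketch below is illustrative only: the `profiles` dictionary encodes the (Romance, Action) columns of the table, and the function name `cosine` and the zero-norm guard are assumptions added for this example.

```python
from math import sqrt

# Item profiles from the slide: (Romance, Action) for Movies 1 to 5.
profiles = {1: [1, 0], 2: [1, 0], 3: [1, 0], 4: [0, 1], 5: [0, 1]}

def cosine(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0  # guard against all-zero profiles

# Similarity of every other movie to the active item, Movie 3.
sims = {m: cosine(vec, profiles[3]) for m, vec in profiles.items() if m != 3}
print(sims)
```

Movies that share Movie 3's genre get similarity 1, and the action movies get 0, as in the table.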
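    34. Rating Prediction
                  User A’s rating   Sim(m,3)
        Movie 1        5               1
        Movie 2        4               1
        Movie 3       4.5              -
        The predicted rating of user $a$ for item $i$ is the similarity-weighted average of the user’s ratings, where $u$ ranges over the $N$ items that user $a$ has already rated:
        $$P_{a,i} = \frac{\sum_{u=1}^{N} sim_{i,u}\, r_{a,u}}{\sum_{u=1}^{N} sim_{i,u}}$$
        $$P_{userA,\,movie3} = \frac{(1 \times 5) + (1 \times 4)}{1 + 1} = 4.5$$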
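The content-based prediction is a similarity-weighted average of the ratings the user has already given. In this sketch the function name `predict_rating` and the `(rating, similarity)` tuple layout are assumptions chosen for illustration; the numbers are the ones from the table above.

```python
def predict_rating(rated_items):
    """Similarity-weighted average of a user's existing ratings.

    rated_items: list of (rating, similarity_to_active_item) tuples.
    Returns None when no rated item is similar to the active item.
    """
    total_sim = sum(sim for _, sim in rated_items)
    if total_sim == 0:
        return None
    return sum(sim * rating for rating, sim in rated_items) / total_sim

# User A rated Movie 1 as 5 and Movie 2 as 4; both have similarity 1 to Movie 3.
p_A3 = predict_rating([(5, 1.0), (4, 1.0)])
print(p_A3)
```

Items with similarity 0 contribute nothing to either sum, so including Movies 4 and 5 would not change the result.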
    35. Pros of CB RS
        • User independence
            – No cold-start or sparsity problems
        • Able to recommend to users with unique tastes
        • Able to recommend new and unpopular items
            – No first-rater problem
        • Can provide explanations of recommended items by listing the content features that caused an item to be recommended
    36. Drawbacks of CB RS
        • Keyword-based item profiling may cause ambiguity
            – Apple = fruit or computer?
            – Requires sophisticated techniques and high computational performance
        • Homophily
            – Love of the same
            – Offers only items that are similar to the items the user has already rated
    37. Hybrid Collaborative Filtering
        [Diagram components: EachMovie, IMDb, Web Crawler, Movie Content Database, User Ratings Matrix (Sparse), Content-based Predictor, Full User Ratings Matrix, Active User Ratings, Collaborative Filtering, Recommendations]
    38. Trends of RS
        • Content recommendation
            – Video, news, blog
        • Social recommendation
            – Friends (closed, just friends), advisor, family
            – Communities, groups
        • Text comments, text rating
        • Time factor, seasonal products
    39. Basic of Recommender System, Seminar on ICT
    40. Read More…
        • J. Bobadilla, F. Ortega, A. Hernando, A. Gutiérrez, Recommender systems survey, Knowledge-Based Systems, Volume 46, July 2013, Pages 109-132.
        • http://en.wikipedia.org/wiki/Recommender_system
        • http://www.slideshare.net/idoguy/social-recommender-sy
