Recommendation and Information Retrieval: Two Sides of the Same Coin?
Status update on our current understanding of how collaborative filtering relates far more closely to information retrieval than usually thought. Includes work by Jun Wang and Alejandro Bellogín. This presentation has been given at the Siks PhD student course on computational intelligence, May 24th, 2013
  • Recommendations can be obtained by looking at the top-N similar peers' user preferences. The more similar a peer, the more chance its liked content will get recommended.
  • Recently they introduced a general framework for the term weighting problem
  • If we take … as …, we obtain equivalent scoring functions
    1. Recommendation and Information Retrieval: Two Sides of the Same Coin?
       Prof. dr. ir. Arjen P. de Vries (arjen@acm.org), CWI, TU Delft, Spinque
    2. Outline
       • Recommendation Systems
         – Collaborative Filtering (CF)
       • Probabilistic approaches
         – Language modelling for Information Retrieval
         – Language modelling for log-based CF
         – Brief: adaptations for rating-based CF
       • Vector Space Model ("Back to the Future")
         – User and item spaces, orthonormal bases and "the spectral theorem"
    3. Recommendation
       • Informally: search for information "without a query"
       • Three types:
         – Content-based recommendation
         – Collaborative filtering (CF)
           • Memory-based
           • Model-based
         – Hybrid approaches
    4. Recommendation
       • Informally: search for information "without a query"
       • Three types:
         – Content-based recommendation
         – Collaborative filtering   ← Today's focus!
           • Memory-based
           • Model-based
         – Hybrid approaches
    5. Collaborative Filtering
       • Collaborative filtering (originally introduced by Patti Maes as "social information filtering")
         1. Compare user judgments
         2. Recommend differences between similar users
       • Leading principle: people's tastes are not randomly distributed
         – A.k.a. "You are what you buy"
    6. User-based CF: Similarity Measure
    7. Rating Matrix
    8. Users
    9. Items
    10. Rating
    11. User Profile
    12. Item Profile
    13. Unknown rating
    14. Objective: if user Boris watched Love Actually, how would he rate it?
    15. Prediction
    16. Collaborative Filtering
        • Standard item-based formulation (Adomavicius & Tuzhilin 2005):

          \mathrm{rat}(u, i) = \frac{\sum_{j \in I_u} \mathrm{sim}(i, j)\, \mathrm{rat}(u, j)}{\sum_{j \in I_u} \mathrm{sim}(i, j)}
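The weighted average above can be sketched in a few lines of Python (function and variable names are mine, purely illustrative):

```python
def predict_rating(user_ratings, item_sim, target_item):
    """Item-based CF: average the target user's ratings rat(u, j) over the
    rated items j in I_u, weighted by item-item similarity sim(i, j)."""
    num = sum(item_sim(target_item, j) * r for j, r in user_ratings.items())
    den = sum(item_sim(target_item, j) for j in user_ratings)
    return num / den if den else 0.0
```

For example, with ratings {a: 4.0, b: 2.0} and similarities sim(x, a) = 0.9, sim(x, b) = 0.1, the predicted rating for item x is (0.9·4 + 0.1·2) / 1.0 = 3.8.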
    17. Collaborative Filtering
        • Benefits over the content-based approach
          – Overcomes problems with finding suitable features to represent e.g. art, music
          – Serendipity
          – Implicit mechanism for qualitative aspects like style
        • Problems: large groups, broad domains
    18. Prediction vs. Ranking
        • Original formulations focused on modelling the users' item ratings: rating prediction
          – Evaluation of algorithms (e.g., Netflix prize) by Mean Absolute Error (MAE) or
            Root Mean Square Error (RMSE) between predicted and actual ratings
        • However, for the end user, the ranking of recommended items is the essential
          problem: relevance ranking
          – Evaluation by precision at fixed rank (P@N)
    19. Relevance Ranking
        • Core problem of Information Retrieval!
    20. Generative Model
        • A statistical model for generating data
          – Probability distribution over samples in a given 'language'
        • By the chain rule (the word symbols appear as images in the original slide):
          P(t_1 t_2 t_3 t_4 | M) = P(t_1 | M) \, P(t_2 | M, t_1) \, P(t_3 | M, t_1 t_2) \, P(t_4 | M, t_1 t_2 t_3)
        © Victor Lavrenko, Aug. 2002
    21. Unigram models etc.
        • Unigram Models:
          P(t_1 t_2 t_3 t_4) = P(t_1) \, P(t_2) \, P(t_3) \, P(t_4)
        • N-gram Models (here, N = 2):
          P(t_1 t_2 t_3 t_4) = P(t_1) \, P(t_2 | t_1) \, P(t_3 | t_2) \, P(t_4 | t_3)
        © Victor Lavrenko, Aug. 2002
    22. Fundamental Problem
        • Usually we don't know the model M
          – But we have a sample representative of that model
        • First estimate a model from the sample, then compute the observation
          probability P(observation | M(sample))
        © Victor Lavrenko, Aug. 2002
    23. Language Models ...
        • Unigram Language Models (LM)
          – Urn metaphor (the colored balls appear as images in the original slide)
        • P(query) ~ P(w_1) P(w_2) P(w_3) P(w_4) = 4/9 * 2/9 * 4/9 * 3/9
        © Victor Lavrenko, Aug. 2002
    24. ... for Information Retrieval
        • Rank models (documents) by the probability of generating the query Q:
          P(Q | D_1) = 4/9 * 2/9 * 4/9 * 3/9 = 96/9^4
          P(Q | D_2) = 3/9 * 3/9 * 3/9 * 3/9 = 81/9^4
          P(Q | D_3) = 2/9 * 3/9 * 2/9 * 4/9 = 48/9^4
          P(Q | D_4) = 2/9 * 5/9 * 2/9 * 2/9 = 40/9^4
    25. Zero-frequency Problem
        • Suppose some event is not in our example
          – The model will assign zero probability to that event
          – And to any set of events involving the unseen event
        • Happens frequently in natural language text, and it is incorrect to infer
          zero probabilities
          – Especially when dealing with incomplete samples
    26. Smoothing
        • Idea: shift part of the probability mass to unseen events
        • Interpolate the document-based model with a background model (of "general English"):
          P(t | D') = \lambda P(t | D) + (1 - \lambda) P(t | C)
          – Reflects expected frequency of events
          – Plays the role of IDF
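The smoothed query-likelihood ranking of the last few slides can be sketched as follows (maximum-likelihood estimates with Jelinek-Mercer interpolation; all names here are illustrative):

```python
import math
from collections import Counter

def lm_score(query, doc, collection, lam=0.5):
    """Query log-likelihood under a unigram LM with linear interpolation:
    log P(Q|D) = sum over query terms t of
                 log( lam * P_ml(t|D) + (1 - lam) * P(t|collection) )."""
    d, c = Counter(doc), Counter(collection)
    dlen, clen = len(doc), len(collection)
    return sum(math.log(lam * d[t] / dlen + (1 - lam) * c[t] / clen)
               for t in query)
```

With the background model mixed in, a query term missing from the document no longer zeroes out the score; the probability is zero only if the term is absent from the whole collection.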
    27. Relevance Ranking
        • Core problem of Information Retrieval!
          – Question arising naturally: are CF and IR, from a modelling perspective,
            really two different problems then?
        Jun Wang, Arjen P. de Vries, Marcel J.T. Reinders. A User-Item Relevance Model
        for Log-Based Collaborative Filtering. ECIR 2006
    28. User-Item Relevance Models
        • Idea: apply the probabilistic retrieval model to CF
    29. User-Item Relevance Models
        • Idea: apply the probabilistic retrieval model to CF
        • Treat the user profile as the query and answer the following question:
          – "What is the probability that this item is relevant to this user, given
            his or her profile?"
        • Hereto, apply the language modelling approach to IR as a formal model to
          compute the user-item relevance
    30. Implicit or explicit relevance?
        • Rating-based CF:
          – Users explicitly rate "items"
            (we use "items" to represent contents: movies, music, etc.)
        • Log-based CF:
          – User profiles are gathered by logging the interactions: music play-lists,
            web surf logs, etc.
    31. User-Item Relevance Models
        • Existing user-based/item-based approaches
          – Heuristic implementations of "word-of-mouth"
          – Unclear how to best deal with the sparse data!
        • User-Item Relevance Models
          – Give a probabilistic justification
          – Integrate smoothing to tackle the problem of sparsity
    32. User-Item Relevance Models
        [Diagram: relevance between a target user and a target item, approached from
        two sides: an item representation (the other items that the target user liked)
        and a user representation (the other users who liked the target item)]
    33. User-Item Relevance Models
        • We introduce the following random variables:
          Users: U \in \{u_1, ..., u_K\}    Items: I \in \{i_1, ..., i_M\}
          Relevance: R \in \{r, \bar{r}\}, with r = "relevant", \bar{r} = "not relevant"
        • Using the probabilistic model to IR, we rank items by their log-odds of relevance:
          \mathrm{RSV}_U(I) = \log \frac{P(R = r | U, I)}{P(R = \bar{r} | U, I)}
    34. User-Item Relevance Model [Lafferty, Zhai 02]
        • D: document; Q: query; relevance R \in \{r, \bar{r}\}
          \mathrm{RSV}_Q(D) = \log \frac{P(r | Q, D)}{P(\bar{r} | Q, D)}
                   = \log \frac{P(D | r, Q)}{P(D | \bar{r}, Q)} + \log \frac{P(r | Q)}{P(\bar{r} | Q)}
            (the Robertson-Sparck Jones models, built on P(D | r, Q))
          Or:
                   = \log \frac{P(Q | r, D)}{P(Q | \bar{r}, D)} + \log \frac{P(r | D)}{P(\bar{r} | D)}
            (the language models)
    35. Item Representation
        [Diagram: the target user u_k is represented by the query items {i_b}, i.e. the
        other items that the target user liked; relevance of the target item i_m = ?]
    36. User-Item Relevance Models
        • Item representation
          – Use the items that the user liked to represent the target user
          – Assume the item "ratings" are independent
          – Linear interpolation smoothing to address sparsity:
            P(i_b | i_m, r) = (1 - \lambda) P_{ml}(i_b | i_m, r) + \lambda P(i_b | r),
            where \lambda \in [0, 1] is a parameter to adjust the strength of smoothing
          \mathrm{RSV}_{u_k}(i_m) = \log \frac{P(r | i_m, u_k)}{P(\bar{r} | i_m, u_k)} = ...
            = \sum_{\forall i_b \in L_{u_k} : c(i_b, i_m) > 0} \log \left( 1 + \frac{(1 - \lambda) P_{ml}(i_b | i_m, r)}{\lambda P(i_b | r)} \right) + \log P(i_m | r)
    37. User-Item Relevance Models
        • Probabilistic justification of item-based CF
          – The RSV of a target item is the combination of its popularity and its
            co-occurrence with the items (query items) that the target user liked:
          \mathrm{RSV}_{u_k}(i_m) = \sum_{\forall i_b \in L_{u_k} : c(i_b, i_m) > 0} \log \left( 1 + \frac{(1 - \lambda) P_{ml}(i_b | i_m, r)}{\lambda P(i_b | r)} \right)   [co-occurrence]
                         + \log P(i_m | r)   [popularity]
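A toy implementation of this ranking formula, with deliberately naive count-based probability estimates (the names and estimators are illustrative assumptions, not the paper's exact ones):

```python
import math

def rsv_item_based(target_item, user_profile, cooc, likes, n_users, lam=0.5):
    """RSV_{u_k}(i_m): popularity prior plus smoothed co-occurrence terms.
    cooc[(i_b, i_m)]: number of users who liked both items;
    likes[i]: number of users who liked item i; n_users: total users."""
    score = math.log(likes[target_item] / n_users)          # log P(i_m | r)
    for b in user_profile:                                  # i_b in L_{u_k}
        c = cooc.get((b, target_item), 0)
        if c > 0:                                           # c(i_b, i_m) > 0
            p_ml = c / likes[target_item]                   # P_ml(i_b | i_m, r)
            p_bg = likes[b] / n_users                       # P(i_b | r)
            score += math.log(1 + (1 - lam) * p_ml / (lam * p_bg))
    return score
```

An item that co-occurs with the user's liked items scores above an equally popular item that does not, exactly as the decomposition on the slide suggests.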
    38. User-Item Relevance Models
        • Probabilistic justification of item-based CF
          – The RSV of a target item is the combination of its popularity and its
            co-occurrence with the items (query items) that the target user liked
          • Item co-occurrence should be emphasized if more users express interest in
            both target & query item [co-occurrence between target item and query item]
          • Item co-occurrence should be suppressed when the popularity of the query
            item is high [popularity of query item]
          \mathrm{RSV}_{u_k}(i_m) = \sum_{\forall i_b \in L_{u_k} : c(i_b, i_m) > 0} \log \left( 1 + \frac{(1 - \lambda) P_{ml}(i_b | i_m, r)}{\lambda P(i_b | r)} \right) + \log P(i_m | r)
    39. User Representation
        [Diagram: the target item i_m is represented by the users {u_b} who liked it;
        relevance towards the target user u_k = ?]
    40. User-Item Relevance Models
        • User representation
          – Represent the target item by the users who like it
          – Assume the user profiles are independent
          – Linear interpolation smoothing to address sparsity:
            P(u_b | u_k, r) = (1 - \lambda) P_{ml}(u_b | u_k, r) + \lambda P(u_b | r),
            where \lambda \in [0, 1] is a parameter to adjust the strength of smoothing
          \mathrm{RSV}_{u_k}(i_m) = \log \frac{P(r | i_m, u_k)}{P(\bar{r} | i_m, u_k)} = ...
            = \sum_{\forall u_b \in U_{i_m}} \log \left( 1 + \frac{(1 - \lambda) P_{ml}(u_b | u_k, r)}{\lambda P(u_b | r)} \right)
    41. User-Item Relevance Models
        • Probabilistic justification of user-based CF
          – The RSV of a target item towards a target user is calculated from the target
            user's co-occurrence with the other users who liked the target item
          • User co-occurrence is emphasized if more items liked by the target user are
            also liked by the other user
          • User co-occurrence should be suppressed when this user liked many items
          \mathrm{RSV}_{u_k}(i_m) = \sum_{\forall u_b \in U_{i_m}} \log \left( 1 + \frac{(1 - \lambda) P_{ml}(u_b | u_k, r)}{\lambda P(u_b | r)} \right)
    42. Empirical Results
        • Data set:
          – Music play-lists from audioscrobbler.com
          – 428 users and 516 items
          – 80% of users as training set and 20% of users as test set
          – Half of the items in the test set as ground truth, the others as user profiles
        • Measurement:
          – Recommendation precision: (num. of correct items) / (num. of recommended)
          – Averaged over 5 runs
          – Compared with the suggestion lib developed in GroupLens
    43. P@N vs. lambda
    44. Effectiveness (P@N)
    45. So far ...
        • User-Item relevance models
          – Give a probabilistic justification for CF
          – Deal with the problem of sparsity
          – Provide state-of-the-art performance
    46. Rating Prediction?
        • The previous log-based CF method neither predicts nor uses rating information
          – Ranks items solely by usage frequency
          – Appropriate for, e.g., music recommendation in a service like Spotify, or
            personalised TV
    47. [Diagram: user-item rating matrix, users u_1 ... u_A by items i_1 ... i_B, with
        ratings x_{a,1} ... x_{a,B}; the unknown rating x_{a,b} = ? is to be predicted]
    48. [Diagram: SIR — predict the unknown rating x_{a,b} from the target user u_a's
        own ratings, sorted by item similarity to i_b]
    49. [Diagram: SUR — predict the unknown rating x_{a,b} from the other users'
        ratings of item i_b, sorted by user similarity to u_a]
    50. Sparseness
        • Whether you choose SIR or SUR, in many cases the neighborhood extends to
          include "not-so-similar" users and/or items
        • Idea: take into consideration the similar item ratings made by similar users
          as an extra source for prediction
        Jun Wang, Arjen P. de Vries, Marcel J.T. Reinders. Unifying user-based and
        item-based collaborative filtering approaches by similarity fusion. SIGIR 2006
    51. [Diagram: the unknown rating x_{a,b} predicted from three sources: SIR (sorted
        item similarity), SUR (sorted user similarity), and SUIR (similar users' ratings
        of similar items)]
    52. Similarity Fusion
        [Diagram: indicator variables select the source of a neighbour rating x_{k,m}:
        I_1 = 1 gives x_{a,b} \in SIR, I_1 = 0 gives x_{a,b} \in SUR; I_2 = 1 gives
        x_{a,b} \in SUIR, I_2 = 0 falls back to the I_1 mixture]
    53. Sketch of Derivation
        P(x_{k,m} | SUR, SIR, SUIR)
          = \sum_{I_2} P(x_{k,m}, I_2 | SUR, SIR, SUIR) \, P(I_2)
          = P(x_{k,m}, I_2 = 1 | SUR, SIR, SUIR) \, P(I_2 = 1)
            + P(x_{k,m}, I_2 = 0 | SUR, SIR, SUIR) \, (1 - P(I_2 = 1))
          = P(x_{k,m} | SUIR) \, \delta + P(x_{k,m} | SUR, SIR) \, (1 - \delta)
          = P(x_{k,m} | SUIR) \, \delta + \left( P(x_{k,m} | SUR) \, \lambda + P(x_{k,m} | SIR) \, (1 - \lambda) \right) (1 - \delta)
        See the SIGIR 2006 paper for more details
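The final line of the derivation is just a two-level mixture, which can be sketched directly (the placement of λ between the SUR and SIR components is my reading; see the SIGIR 2006 paper for the exact estimators):

```python
def fuse_prediction(p_sir, p_sur, p_suir, lam=0.5, delta=0.3):
    """Similarity fusion: first mix the user-based (SUR) and item-based (SIR)
    predictions with weight lam, then blend in the similar-user/similar-item
    (SUIR) estimate with weight delta."""
    return delta * p_suir + (1 - delta) * (lam * p_sur + (1 - lam) * p_sir)
```

Setting delta = 0 recovers a plain user/item-based mixture; delta = 1 trusts the SUIR estimate alone.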
    54. [Diagram: User-Item Relevance Models at the theoretical level, bridging the
        Information Retrieval field (relevance feedback, query expansion, etc.) and the
        Machine Learning field (latent predictor space, latent semantic analysis,
        manifold learning, etc.); components: user representation, item representation,
        combination rules, similarity fusion, individual predictor]
    55. Remarks
        • The SIGIR 2006 paper estimates probabilities directly from the given
          similarity distance between users and items
        • The TOIS 2008 paper below applies Parzen-window kernel density estimation to
          the rating data itself, to give a full probabilistic derivation
          – Shows how the "kernel trick" lets us generalize the distance measure, such
            that a cosine (projection) kernel (length-normalized dot product) can be
            chosen, while keeping Gaussian-kernel Parzen windows
        Jun Wang, Arjen P. de Vries, and Marcel J.T. Reinders. Unified relevance models
        for rating prediction in collaborative filtering. ACM TOIS 26 (3), June 2008
    56. Relevance Feedback
        • Relevance Models for query expansion in IR
          – Language model estimated from known relevant or from top-k documents
            (pseudo-relevance feedback)
          – Expand the query with terms generated by the LM
        • Application to recommendation
          – User profile used to identify the neighbourhood; a Relevance Model estimated
            from that neighbourhood is used to expand the profile
          – Deploy the probabilistic clustering method PPC to construct the neighbourhood
          – Very good empirical results on P@N
        Javier Parapar, Alejandro Bellogín, Pablo Castells, Álvaro Barreiro.
        Relevance-Based Language Modelling for Recommender Systems. Information
        Processing & Management 49 (4), pp. 966-980
    57. CF =~ IR?
        • Follow-up question: can we go beyond the "model level" equivalences observed
          so far, and actually cast the CF problem such that we can use the full IR
          machinery?
        Alejandro Bellogín, Jun Wang, and Pablo Castells. Text Retrieval Methods for
        Item Ranking in Collaborative Filtering. ECIR 2011
    58. IR System
        [Diagram: a query is processed by a text retrieval engine against an inverted
        index of term occurrences (term-document matrix), producing the output]
    59. CF RecSys?!
        [Diagram: a user profile (as query) is processed by a text retrieval engine
        against an inverted index of user profiles (user-item matrix) plus item
        similarity, producing the output]
    60. Collaborative Filtering
        • Standard item-based formulation:
          \mathrm{rat}(u, i) = \frac{\sum_{j \in I_u} \mathrm{sim}(i, j)\, \mathrm{rat}(u, j)}{\sum_{j \in I_u} \mathrm{sim}(i, j)}
        • More general:
          \mathrm{rat}(u, i) = \sum_{j \in g(u)} f(u, i, j) = \sum_{j \in g(u)} f_1(u, j) \, f_2(i, j)
    61. Text Retrieval
        • In (Metzler & Zaragoza, 2009):
          s(q, d) = \sum_{t \in g(q)} s(q, d, t)
        • In particular: the factored form
          s(q, d, t) = w_1(q, t) \, w_2(d, t)
    62. Text Retrieval
        • Examples
          – TF:
            w_1(q, t) = \mathrm{qf}(t), \quad w_2(d, t) = \mathrm{tf}(t, d)
          – TF-IDF:
            w_1(q, t) = \mathrm{qf}(t), \quad w_2(d, t) = \mathrm{tf}(t, d) \log \frac{N}{\mathrm{df}(t)}
          – BM25:
            w_1(q, t) = \frac{(k_3 + 1) \, \mathrm{qf}(t)}{k_3 + \mathrm{qf}(t)}
            w_2(d, t) = \log \frac{N - \mathrm{df}(t) + 0.5}{\mathrm{df}(t) + 0.5} \cdot \frac{(k_1 + 1) \, \mathrm{tf}(t, d)}{k_1 \left( (1 - b) + b \, \mathrm{dl}(d) / \overline{\mathrm{dl}} \right) + \mathrm{tf}(t, d)}
    63. IR =~ CF?
        • In item-based Collaborative Filtering:
          \mathrm{tf}(t, d) = \mathrm{sim}(i, j), \quad \mathrm{qf}(t) = \mathrm{rat}(u, j)
          (with t ≈ j, d ≈ i, q ≈ u)
        • Apply different models, with different normalizations and norms (s_{qd}, L1 and L2):

                                 | Document: no norm | Document: norm (/|D|)
          Query: no norm         | s00               | s01
          Query: norm (/|Q|)     | s10               | s11
    64. IR =~ CF!
        • TF L1 s01 is equivalent to item-based CF:
          s(q, d) = \sum_{t \in g(q)} w_1(q, t) \, w_2(d, t) = \frac{\sum_{t \in g(q)} \mathrm{qf}(t) \, \mathrm{tf}(t, d)}{\sum_{t \in g(q)} \mathrm{tf}(t, d)}
          which, with \mathrm{tf}(t, d) = \mathrm{sim}(i, j) and \mathrm{qf}(t) = \mathrm{rat}(u, j), is exactly
          \mathrm{rat}(u, i) = \frac{\sum_{j \in I_u} \mathrm{sim}(i, j)\, \mathrm{rat}(u, j)}{\sum_{j \in I_u} \mathrm{sim}(i, j)}
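The equivalence is easy to verify numerically; a minimal check with toy numbers (names are illustrative):

```python
def cf_item_based(user_ratings, sim, i):
    """Item-based CF: rat(u,i) = sum_j sim(i,j) rat(u,j) / sum_j sim(i,j)."""
    num = sum(sim[(i, j)] * r for j, r in user_ratings.items())
    return num / sum(sim[(i, j)] for j in user_ratings)

def tf_l1_s01(qf, tf, d):
    """TF weighting with document-side L1 normalization over the query terms:
    s(q,d) = sum_t qf(t) tf(t,d) / sum_t tf(t,d)."""
    return (sum(q * tf[(d, t)] for t, q in qf.items())
            / sum(tf[(d, t)] for t in qf))

# Identify t = j, d = i, q = u: tf(t,d) = sim(i,j), qf(t) = rat(u,j).
ratings = {"a": 4.0, "b": 2.0}               # rat(u, j)
sim = {("x", "a"): 0.9, ("x", "b"): 0.3}     # sim(i, j)
assert abs(cf_item_based(ratings, sim, "x") - tf_l1_s01(ratings, sim, "x")) < 1e-12
```

Both functions compute the same weighted average, so a text retrieval engine with this weighting scheme scores items exactly as item-based CF does.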
    65. Empirical Results
        • Movielens 1M (Movielens 100k: comparable results)
        [Bar chart: nDCG (y-axis 0.00–0.40) per weighting/normalization combination,
        in slide order: TF L1 s01, TF-IDF L1 s01, TF-IDF L2 s11, BM25 L2 s11, TF L1 s10,
        BM25 L1 s01, TF-IDF L1 s10, BM25 L1 s00, TF-IDF L2 s10, TF-IDF L1 s00, TF L2 s10,
        BM25 L2 s10, TF L1 s00, BM25 L1 s11, BM25 L1 s10, BM25 L2 s01, TF L2 s11,
        TF-IDF L2 s01, TF L2 s01]
    66. Vector Space Model
        • Challenge: get users and items in a common space
          – Intuition: the shared "words" are missing, which in IR directly relate
            documents to queries
          – Two extreme settings:
            • Project items into a space with the dimensionality of the number of items
            • Project users into a space with the dimensionality of the number of users
        A. Bellogín, J. Wang, P. Castells. Bridging Memory-Based Collaborative Filtering
        and Text Retrieval. Information Retrieval (to appear)
    67. Item Space
        • User • Item • Rank • Predict rate
        (the accompanying formulas appear as images in the original slide)
    68. User Space
        • User • Item • Rank • Predict rate
        (the accompanying formulas appear as images in the original slide)
    69. Linear Algebra
        • Users and items in a shared orthonormal space
        • Consider the covariance matrix
        • The spectral theorem now states that an orthonormal basis of eigenvectors exists
    70. Linear Algebra
        • Use this basis to represent items and users
        • The dot product then has a remarkable form (of the IR models discussed)
    71. Subspaces ...
        • Number of items (n) vs. number of users (m):
          – If n < m, a linear dependency must exist between users in terms of the item
            space components
          – In this case, it has been known empirically that item-based algorithms tend
            to perform better
            • Dimension of the sub-space key for the performance of the algorithm?
            • ~ better estimation (more data per item) in the probabilistic versions
    72. Subspaces ...
        • Matrix Factorization methods are captured by assuming a lower-dimensionality
          space to project items and users into (usually considered "model-based" rather
          than "memory-based")
          ~ Latent Semantic Indexing (a VSM method replicated as pLSA and variants)
    73. Ratings into Inverted File
        • Note: the distribution of item occurrences is not Zipfian like text, so
          existing implementations (including the choice of compression etc.) may be
          sub-optimal for CF runtime performance
    74. Weighting schemes
    75. Empirical results 1M
    76. Empirical results 10M
    77. Rating prediction
    78. Concluding Remarks
        • The probabilistic models are elegant (often deploying impressive maths), but
          what do they really add in understanding IR & CF, i.e., beyond the (often
          claimed to be "ad-hoc") approaches of the VSM?
    79. Concluding Remarks
        • Clearly, the models in CF & IR are closely related
        • Should these then really be studied in two different (albeit overlapping)
          communities, RecSys vs. SIGIR?
    80. References
        • Adomavicius, G., Tuzhilin, A. Toward the next generation of recommender
          systems: a survey of the state-of-the-art and possible extensions. IEEE TKDE
          17(6), 734-749 (2005)
        • Bellogín, A., Wang, J., Castells, P. Text retrieval methods for item ranking
          in collaborative filtering. ECIR 2011
        • Bellogín, A., Wang, J., Castells, P. Bridging memory-based collaborative
          filtering and text retrieval. Information Retrieval (to appear)
        • Metzler, D., Zaragoza, H. Semi-parametric and non-parametric term weighting
          for information retrieval. ECIR 2009
        • Parapar, J., Bellogín, A., Castells, P., Barreiro, Á. Relevance-based language
          modelling for recommender systems. Information Processing & Management 49(4),
          966-980
        • Wang, J., de Vries, A.P., Reinders, M.J.T. Unifying user-based and item-based
          collaborative filtering approaches by similarity fusion. SIGIR 2006
        • Wang, J., de Vries, A.P., Reinders, M.J.T. A user-item relevance model for
          log-based collaborative filtering. ECIR 2006
        • Wang, J., de Vries, A.P., Reinders, M.J.T. Unified relevance models for rating
          prediction in collaborative filtering. ACM TOIS 26(3), June 2008
        • Wang, J., Robertson, S., de Vries, A.P., Reinders, M.J.T. Probabilistic
          relevance ranking for collaborative filtering. Information Retrieval 11(6),
          477-497 (2008)
    81. Thanks
        • Alejandro Bellogín
        • Jun Wang
        • Thijs Westerveld
        • Victor Lavrenko