Towards Diverse Recommendation
Keynote talk at the DiveRS workshop at the ACM Recommender Systems conference, 2011.
Presentation Transcript

  • Towards Diverse Recommendation. Neil Hurley, Complex Adaptive Systems Laboratory, Computer Science and Informatics, University College Dublin; Clique Strategic Research Cluster (clique.ucd.ie). October 2011. DiveRS: International Workshop on Novelty and Diversity in Recommender Systems.
  • Outline: 1. Setting the Context. 2. Novelty and Diversity in Information Retrieval (IR Measures of Diversity; IR Measures of Novelty). 3. Diversity Research in Recommender Systems (Concentration Measures of Diversity; Serendipity).
  • Part 1: Setting the Context
  • Recommendation Performance I: Much effort has been spent on improving the performance of recommenders from the point of view of rating prediction. It is a well-defined statistical problem; we have agreed objective measures of prediction quality; and efficient algorithms have been developed that are good at maximising predictive accuracy. It is not a completely solved problem – e.g. dealing with dynamic data – but there are well-accepted evaluation methodologies and quality measures.
  • Recommendation Performance II: But good recommendation is not just about the ability to predict past ratings. Recommendation quality is subjective; people's tastes fluctuate; people can be influenced and persuaded; recommendation can be as much about psychology as statistics. A number of 'qualities' are being talked about more and more with regard to other dimensions of recommendation: novelty, interestingness, diversity, serendipity and user satisfaction.
  • Recommendation Performance III: Clearly, user surveys may be the only way to determine subjective satisfaction with a system; Castagnos et al. (2010) present useful survey results on the importance of diversity. In order to make progress on recommendation algorithms that seek improvements along these dimensions, we need agreed (objective?) measures of these qualities and agreed evaluation methodologies.
  • Agenda: The focus in this talk is on measures of novelty and diversity, rather than algorithms for diversification. We first look at how these concepts are defined in IR research, then examine ideas that have emerged from the RS community.
  • Part 2: Novelty and Diversity in Information Retrieval
  • The Probability Ranking Principle: "If a reference retrieval system's response to each request is a ranking of the documents in the collection in order of decreasing probability of relevance ... the overall effectiveness of the system to its user will be the best that is obtainable" (W.S. Cooper). Nevertheless, relevance measured for each single document has been challenged since as long ago as 1964. Goffman (1964): "... one must define relevance in relation to the entire set of documents rather than to only one document". Boyce (1982): "... A retrieval system which aspires to the retrieval of relevant documents should have a second stage which will order the topical set in a manner so as to provide maximum informativeness".
  • The Maximal Marginal Relevance (MMR) criterion: "reduce redundancy while maintaining query relevance in re-ranking retrieved documents" (Carbonell and Goldstein 1998). Given a set of retrieved documents $R$ for a query $Q$, incrementally rank the documents according to $\arg\max_{D_i \in R \setminus S} \left[ \lambda\, \mathrm{sim}_1(D_i, Q) - (1-\lambda) \max_{D_j \in S} \mathrm{sim}_2(D_i, D_j) \right]$, where $S$ is the set of documents already ranked from $R$. An iterative greedy approach to increasing the diversity of a ranking.
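As a concrete illustration of the greedy MMR loop described above, here is a minimal Python sketch. The inputs `sim_to_query` (playing the role of sim1) and `sim_matrix` (sim2) are hypothetical; the slide does not prescribe any particular similarity functions.

```python
def mmr_rerank(sim_to_query, sim_matrix, k, lam=0.5):
    """Greedily select k documents by the MMR criterion."""
    candidates = set(range(len(sim_to_query)))
    ranking = []  # S: documents already ranked
    while candidates and len(ranking) < k:
        def mmr(i):
            # Redundancy: max similarity to any already-ranked document (0 if none).
            redundancy = max((sim_matrix[i][j] for j in ranking), default=0.0)
            return lam * sim_to_query[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr)
        ranking.append(best)
        candidates.remove(best)
    return ranking
```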
  • The Expected Metric Principle: "in a probabilistic context, one should directly optimize for the expected value of the metric of interest" (Chen and Karger 2006). Chen and Karger (2006) introduce a greedy optimisation framework in which the next document is selected to greedily optimise the chosen objective. An objective such as mean k-call at n – where k-call is 1 if the top-n results contain at least k relevant documents, and 0 otherwise – naturally increases result-set diversity. For 1-call, this yields an approach of selecting the next document under the assumption that all documents selected so far are not relevant.
  • PMP – rank each document according to its individual probability of relevance, $\Pr(r_i \mid d_i)$. k-call at n – rank according to $\Pr(\text{at least } k \text{ of } r_0, \ldots, r_{n-1} \mid d_0, d_1, \ldots, d_{n-1})$. Consider a query such as Trojan Horse, whose meaning is ambiguous. The PMP criterion would determine the most likely meaning and present a ranked list reflecting that meaning. A 1-call at n criterion would present a result pertaining to each possible meaning, with the aim of getting at least one right.
  • [Figure: Results from Chen and Karger (2006) on the TREC 2004 Robust Track.] MSL = Mean Search Length (mean rank of the first relevant document, minus one); MRR = Mean Reciprocal Rank (mean of the reciprocal rank of the first relevant document).
  • Agrawal et al. (2009) propose a similar approach: an objective function that maximises the probability of finding at least one relevant result. They dub their approach the result diversification problem and state it as $S^* = \arg\max_{S \subseteq D, |S| = k} \Pr(S|q)$, where $\Pr(S|q) = \sum_{c} \Pr(c|q) \left( 1 - \prod_{d \in S} (1 - V(d|q,c)) \right)$. Here $S$ is the retrieved result set of $k$ documents, $c \in C$ ranges over a set of categories, and $V(d|q,c)$ is the likelihood of the document satisfying the user intent $c$, given the query $q$.
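To make the objective concrete, here is a hedged Python sketch of computing $\Pr(S|q)$ and maximising it greedily (Agrawal et al. show the objective is submodular, so greedy selection carries a $(1 - 1/e)$ approximation guarantee). The inputs `p_c_given_q` and `quality` (standing in for $V(d|q,c)$) are hypothetical, and `k` is assumed not to exceed the number of documents.

```python
def objective(S, p_c_given_q, quality):
    """Pr(S|q) = sum_c Pr(c|q) * (1 - prod_{d in S}(1 - V(d|q,c)))."""
    total = 0.0
    for c, p_c in p_c_given_q.items():
        miss = 1.0  # probability that no document in S satisfies intent c
        for d in S:
            miss *= 1.0 - quality[d].get(c, 0.0)
        total += p_c * (1.0 - miss)
    return total

def greedy_select(docs, k, p_c_given_q, quality):
    """Greedy maximisation of the submodular objective."""
    S = []
    for _ in range(k):
        best = max((d for d in docs if d not in S),
                   key=lambda d: objective(S + [d], p_c_given_q, quality))
        S.append(best)
    return S
```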
  • Zhai and Lafferty (2006) – risk minimisation of a loss function over possible returned document rankings, measuring how unhappy the user is with that set.
  • Axioms of Diversification (Gollapudi and Sharma 2009): $r(\cdot) : D \times Q \to \mathbb{R}^+$ is a measure of relevance and $d(\cdot,\cdot) : D \times D \to \mathbb{R}^+$ a distance (dissimilarity) function. The diversification objective is $R_k^* = \arg\max_{R_k \subseteq D, |R_k| = k} f(R_k, q, r(\cdot), d(\cdot,\cdot))$. What properties should $f(\cdot)$ satisfy?
  • Axioms of Diversification (Gollapudi and Sharma 2009) I: 1. Scale Invariance – insensitive to scaling distance and relevance by a constant. 2. Consistency – making the output documents more relevant and more diverse, and the other documents less relevant and less diverse, should not change the output of the ranking. 3. Richness – it should be possible to obtain any set as output by an appropriate choice of $r(\cdot)$ and $d(\cdot,\cdot)$. 4. Stability – the output should not change arbitrarily with size: $R_k^* \subseteq R_{k+1}^*$. 5. Independence of Irrelevant Attributes – $f(R)$ is independent of $r(u)$ and $d(u,v)$ for $u, v \notin R$.
  • Axioms of Diversification (Gollapudi and Sharma 2009) II: 6. Monotonicity – adding a document to $R$ should not decrease the score: $f(R \cup \{d\}) \geq f(R)$. 7. Strength of Relevance – no $f(\cdot)$ ignores the relevance scores. 8. Strength of Similarity – no $f(\cdot)$ ignores the similarity scores. No function satisfies all 8 axioms. MaxSum diversification – a weighted sum of the sums of relevance and dissimilarity of items in the selected set – satisfies all axioms except stability. MaxMin diversification – a weighted sum of the minimum relevance and minimum dissimilarity of items in the selected set – satisfies all axioms except consistency and stability.
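For illustration, a sketch of the two objective families named above, under the assumption (not fixed by the slide) that the weighted sum is a convex combination with weight `lam`; `rel` and `dist` are hypothetical relevance and distance tables, and the selected set `S` is assumed to be a list with at least two items.

```python
def max_sum(S, rel, dist, lam=0.5):
    """MaxSum: weighted sum of total relevance and total pairwise distance."""
    relevance = sum(rel[u] for u in S)
    dissimilarity = sum(dist[u][v] for i, u in enumerate(S) for v in S[i + 1:])
    return lam * relevance + (1 - lam) * dissimilarity

def max_min(S, rel, dist, lam=0.5):
    """MaxMin: weighted sum of minimum relevance and minimum pairwise distance."""
    min_rel = min(rel[u] for u in S)
    min_dist = min(dist[u][v] for i, u in enumerate(S) for v in S[i + 1:])
    return lam * min_rel + (1 - lam) * min_dist
```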
  • Part 2, continued: IR Measures of Diversity
  • IR Measures of Diversity – S-recall (Zhai and Lafferty 2006): S-recall at rank n is defined as the number of subtopics retrieved up to rank n divided by the total number of subtopics. Let $S_i \subseteq S$ be the set of subtopics in the $i$th document $d_i$; then $S\text{-recall@}n = |\bigcup_{i=1}^{n} S_i| \,/\, |S|$. Let $\mathrm{minrank}(S, k)$ be the size of the smallest subset of documents that covers at least $k$ subtopics. It is usually most useful to consider $S\text{-recall@}n$ where $n = \mathrm{minrank}(S, |S|)$.
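The definition translates directly into code; in this sketch, `subtopic_sets[i]` is an assumed input giving the set of subtopics covered by the document at rank i+1, and `all_subtopics` plays the role of $S$.

```python
def s_recall_at_n(subtopic_sets, all_subtopics, n):
    """Fraction of all subtopics covered by the top-n documents."""
    covered = set().union(*subtopic_sets[:n])
    return len(covered) / len(all_subtopics)
```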
  • IR Measures of Diversity – S-precision (Zhai and Lafferty 2006): S-precision at rank n is the ratio of the minimum rank at which a given recall value can optimally be achieved to the first rank at which the same recall value actually has been achieved. Let $k = |\bigcup_{i=1}^{n} S_i|$. Then $S\text{-precision@}n = \mathrm{minrank}(S, k) \,/\, m^*$, where $m^* = \min \{\, j : |\bigcup_{i=1}^{j} S_i| \geq k \,\}$.
  • IR Measures of Diversity – α-NDCG (Clarke et al. 2008): Standard NDCG (Normalised Discounted Cumulative Gain) calculates a gain for each document based on its relevance, with a logarithmic discount for the rank at which it appears. Extended for diversity evaluation, the gain is incremented by 1 for each new subtopic, and by $\alpha^k$ $(0 \leq \alpha \leq 1)$ for a subtopic that has been seen $k$ times in previously-ranked documents.
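A sketch of the diversity-aware cumulative gain underlying α-NDCG, following the slide's convention that a subtopic already seen k times contributes $\alpha^k$ (so a fresh subtopic contributes $\alpha^0 = 1$). Normalising by the ideal ranking's value would give α-NDCG. `ranking` is an assumed list of per-document subtopic sets.

```python
import math
from collections import Counter

def alpha_dcg(ranking, alpha=0.5):
    """Cumulative gain with a log2 rank discount and per-subtopic decay."""
    seen = Counter()  # how often each subtopic has appeared so far
    dcg = 0.0
    for rank, subtopics in enumerate(ranking, start=1):
        gain = sum(alpha ** seen[s] for s in subtopics)
        dcg += gain / math.log2(rank + 1)
        seen.update(subtopics)
    return dcg
```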
  • IR Measures of Diversity – Intent-aware Precision (Agrawal et al. 2009): Intent-aware precision $\mathrm{prec}_{IA}$ is calculated by first calculating precision for each distinct subtopic separately, then averaging these precisions according to a distribution of the proportion of users interested in each subtopic: $\mathrm{prec}_{IA}@n = \sum_{s \in S} \Pr(s|q)\, \frac{1}{n} \sum_{i=1}^{n} \mathbb{I}(s \in d_i)$.
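Equivalently in code, assuming a hypothetical `p_s_given_q` giving the interest distribution $\Pr(s|q)$ over subtopics, and `ranking` listing each document's subtopics:

```python
def intent_aware_precision(ranking, p_s_given_q, n):
    """Average per-subtopic precision@n, weighted by Pr(s|q)."""
    top = ranking[:n]
    return sum(p * sum(1 for doc in top if s in doc) / n
               for s, p in p_s_given_q.items())
```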
  • Part 2, continued: IR Measures of Novelty
  • IR Measures of Novelty – Novelty Measures (Agrawal et al. 2009): The KL-divergence $D(d_i \,\|\, d_j)$ is used to measure the novelty of $d_i$ with respect to $d_j$. Alternatively, $d_i$ can be modelled as a mixture of $d_j$ and a background model: the higher the weight of $d_j$ in the mixture, the less novel $d_i$ is with respect to $d_j$. Pairwise measures are combined to give an overall measure of novelty with respect to all documents in the result set.
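A minimal sketch of the KL-divergence component, assuming `p` and `q` are smoothed unigram language models over a shared vocabulary (so `q[w] > 0` everywhere); the smoothing itself is outside the scope of the slide.

```python
import math

def kl_divergence(p, q):
    """D(p || q); larger values mean d_i is more novel w.r.t. d_j."""
    return sum(pw * math.log(pw / q[w]) for w, pw in p.items() if pw > 0)
```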
  • Summary of IR Research: It has long been recognised that the probability ranking principle does not adequately measure result-list quality – the usefulness of a document depends on what other documents are on the list. Each document can be considered to consist of a set of subtopics, information nuggets or facets. The novelty of a document is a measure of how little redundancy it contains, where it is redundant w.r.t. a facet if that facet is already covered by another document. The diversity of a result list is a measure of the number of relevant facets it contains. There is no complete consensus here – e.g. Gollapudi and Sharma (2009) define "novelty" as the fraction of topics covered. Consider selecting the document with least redundancy vs selecting the document that most improves overall diversity.
  • Summary of IR Research (continued): In general, IR lines of research on diversity and novelty consider the following. Relevance scores for documents are not independent – relevance must be considered with respect to the entire result set, rather than for each document in turn. Diversity is related to query ambiguity – there is a difference between selecting documents according to the most probable meaning and selecting documents to cover all meanings, so that at least one is relevant. Diversity is a measure of a set; novelty is a measure of each document with respect to a particular set in which it is contained.
  • Part 3: Diversity Research in Recommender Systems
  • Diversity – The Long Tail Problem. [Figure: Sales demand for 1,000 products.]
  • [Figure: The top 2% most popular products account for 13% of sales.]
  • [Figure: The least popular items account for 30% of sales.]
  • Diversity – The Long Tail Problem: "Less is More" – Chris Anderson [The Long Tail: Why the Future of Business is Selling Less of More].
  • Recommenders and the Long Tail Problem: To support an increase in sales, we need to increase the diversity of the set of recommendations made to end-users; that is, recommend items in the long tail that are highly likely to be liked by the current user. This implies finding items that are liked by the current user and relatively few other users.
  • Diversity – The End-user Perspective. Definition: the diversity of a set $L$ of size $p$ is the average dissimilarity of the items in the set, $f_D(L) = \frac{2}{p(p-1)} \sum_{i \in L} \sum_{j \in L,\, j < i} (1 - s(i,j))$. We have found it useful to define novelty (or relative diversity) as follows. Definition: the novelty of an item $i$ in a set $L$ is $n_L(i) = \frac{1}{p-1} \sum_{j \in L,\, j \neq i} (1 - s(i,j))$.
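These two definitions translate directly into code; in this sketch, `s` is an assumed similarity function with values in [0, 1], and `L` is a list of items.

```python
def set_diversity(L, s):
    """f_D(L): average pairwise dissimilarity of the items in L."""
    p = len(L)
    total = sum(1 - s(L[a], L[b]) for a in range(p) for b in range(a))
    return 2 * total / (p * (p - 1))

def item_novelty(i, L, s):
    """n_L(i): average dissimilarity of i to the other items in L."""
    others = [j for j in L if j != i]
    return sum(1 - s(i, j) for j in others) / len(others)
```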
  • Diversity – The End-user Perspective. [Figure: User profile from the MovieLens dataset, $|P_u| = 764$, $N = 20$, $|T_u| = 0.1 \times |P_u|$.] 40% of the most novel items accrue no hits at all.
  • Other Definitions of Novelty/Diversity in RS: Castells et al. (2011) outline some of the ways that novelty impacts recommender system design, distinguishing item popularity from item similarity, and user-relative measures from global measures. Popularity-based novelty: $\mathrm{novelty}(i) = -\log p(i)$ (global measure) or $1 - p(K|i)$; from the user perspective, $\mathrm{novelty}(i) = -\log p(i|u)$ or $1 - p(K|i,u)$, where $K$ denotes the event that the item is known. Similarity perspective: $\mathrm{novelty}(i|S) = \sum_{j \in S} p(j|S)\, d(i,j)$.
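As a concrete instance of the popularity-based, global measure, a sketch that estimates $p(i)$ from interaction frequencies; the `interactions` input (a list of user–item pairs) is hypothetical.

```python
import math
from collections import Counter

def popularity_novelty(interactions):
    """novelty(i) = -log p(i), with p(i) estimated from interaction counts."""
    counts = Counter(item for _, item in interactions)
    total = sum(counts.values())
    return {item: -math.log(c / total) for item, c in counts.items()}
```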
  • Novelty for Recommender Systems: Pablo Castells and colleagues introduce rank-sensitive and relevance-aware measures of recommendation-set diversity and novelty. Recommendation novelty metric: $m(R|u) = \sum_n \mathrm{disc}(n)\, p(\mathrm{rel}|i_n, u)\, \mathrm{novelty}(i_n|u)$. Novelty-based diversity metrics: $\mathrm{novelty}(R|u) = \sum_{n} \sum_{j \in u} \mathrm{disc}(n)\, p(\mathrm{rel}|i_n, u)\, p(j|u)\, d(i_n, j)$ and $\mathrm{diversity}(R|u) = \sum_{n} \sum_{k < n} \mathrm{disc}(n)\, \mathrm{disc}(k)\, p(\mathrm{rel}|i_n, u)\, d(i_n, i_k)$.
  • Part 3, continued: Concentration Measures of Diversity
  • Evaluating Diversity: In our 2009 RecSys paper, we evaluated our diversification method on test sets $T(\mu)$ consisting of items chosen from the top $100 \times (1-\mu)\%$ most novel items in the user profiles.
  • Toy Example: We motivate our diversity methodology using a toy example in which a user base of four users, $u_1, u_2, u_3, u_4$, is recommended items from a catalogue of 4 items, $i_1, i_2, i_3, i_4$. The system recommends $N = 2$ items to each user. Any particular scenario can be represented in a table that indicates whether a user actually likes an item or not (1 or 0), with the probability that the recommender system will recommend the corresponding item to that user in parentheses. Assume that $G_1 = \{i_1, i_2\}$ is a single genre (e.g. horror movies) and $G_2 = \{i_3, i_4\}$ is another, with the simple similarity measure $s(i_1, i_2) = s(i_3, i_4) = 1$ and cross-genre similarities zero.
  • Toy Example – Biased but Full Recommended-Set Diversity:
          i1       i2       i3       i4
    u1    1 (1)    1 (0)    1 (1/2)  0 (1/2)
    u2    1 (0)    1 (1)    1 (1/2)  0 (1/2)
    u3    0 (1/2)  1 (1/2)  1 (1)    1 (0)
    u4    1 (1)    0 (0)    1 (1/2)  1 (1/2)
    This system always recommends an item from $G_1$ and an item from $G_2$. The probability of $i_1$ being recommended to a randomly selected user – $\frac{1}{4}(1 + 0 + \frac{1}{2} + 1) = \frac{5}{8}$ – is higher than that of $i_2$ ($\frac{3}{8}$), for instance: recommendations do not spread evenly across the product catalogue. It is also biased towards consistently recommending $i_1$ to $u_1$ but never recommending $i_2$ to $u_1$.
  • Toy Example – No System-Level Biases:
          i1       i2       i3       i4
    u1    1 (1)    1 (1/3)  1 (1/3)  0 (1/3)
    u2    0 (1/3)  1 (1)    1 (1/3)  1 (1/3)
    u3    1 (1/3)  0 (1/3)  1 (1)    1 (1/3)
    u4    1 (1/3)  1 (1/3)  0 (1/3)  1 (1)
    The probability of recommending $i_1$ to a randomly chosen relevant user (i.e. $u_1$, $u_3$ or $u_4$) is $\frac{1}{3}(1 + \frac{1}{3} + \frac{1}{3}) = \frac{5}{9}$, and similarly for $i_2$, $i_3$ and $i_4$. However, focusing on the set of items that are relevant to $u_1$ (i.e. $i_1$, $i_2$ and $i_3$), the algorithm is three times as likely to recommend $i_1$ as either of the other relevant items.
  • Toy Example – No System- or User-Level Biases:
          i1       i2       i3       i4
    u1    1 (1/3)  1 (1/3)  1 (1/3)  0 (1)
    u2    0 (1)    1 (1/3)  1 (1/3)  1 (1/3)
    u3    1 (1/3)  0 (1)    1 (1/3)  1 (1/3)
    u4    1 (1/3)  1 (1/3)  0 (1)    1 (1/3)
    There is the same probability of recommending any relevant item to a user, and the same probability that an item is recommended when it is relevant.
  • Algorithm Diversity. Definition: an algorithm is fully diverse from the user perspective if it recommends any of the user's relevant items with equal probability. Definition: an algorithm is fully diverse from the system perspective if the probability of recommending an item, when it is relevant, is equal across all items.
  • Lorenz Curve and the Gini Index: a plot of the cumulative proportion of the product catalogue against the cumulative proportion of sales. [Figure: example Lorenz curve – 69% of the sales are of the top 10% best-selling products.] $G = 0$ implies equal sales across all products; $G = 1$ when a single product gets all sales.
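A standard way to compute the Gini index from per-item sales counts is the following sketch, where `sales` is an assumed list of counts (this is the usual discrete formula derived from the Lorenz curve, not something prescribed by the slides).

```python
def gini_index(sales):
    """G = sum_i (2i - n - 1) * x_i / (n * sum(x)), items sorted ascending."""
    xs = sorted(sales)
    n, total = len(xs), sum(xs)
    weighted = sum((2 * (i + 1) - n - 1) * x for i, x in enumerate(xs))
    return weighted / (n * total)
```

For a uniform distribution this returns 0, and for a single product taking all sales it returns (n-1)/n, approaching 1 for large catalogues, matching the slide's two extremes.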
  • Measuring Recommendation Success: the measurement unit of success in recommender systems is the hit, interpreted as the recommendation of a product known to be liked by the user.
  • Hits Inequality – Concentration Curves of Hits: The Lorenz curve and Gini index measure inequality within the hits distribution over all items in the product catalogue. The concentration curve and concentration index of hits vs popularity measure the bias of the hits distribution towards popular items; the concentration curve and concentration index of hits vs novelty measure its bias towards novel items.
  • Concentration Curves: $n$ products accrue hits $\{h_1, \ldots, h_n\}$ – the concentration curve depends on the correlation between hits and popularity. [Figures: concentration curves under varying correlation between hits and popularity.]
  • Temporal Diversity: Lathia et al. (2010) investigate diversity over time – do recommendations change over time? Here, diversity is measured between two top-$n$ recommendation sets formed at different points in time, $\mathrm{diversity}(R_{i+1}, R_i) = \frac{1}{n} |R_{i+1} \setminus R_i|$, and novelty is measured as the proportion of items never recommended before, $\mathrm{novelty}(R_{i+1}) = \frac{1}{n} |R_{i+1} \setminus \bigcup_{j=1}^{i} R_j|$. kNN algorithms exhibit more temporal diversity than SVD matrix factorisation. Switching between multiple algorithms is offered as one means to improve temporal diversity.
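Both temporal measures are straightforward set operations; a sketch, where `history` is an assumed time-ordered, non-empty list of the earlier top-n sets.

```python
def temporal_diversity(r_next, r_prev, n):
    """Fraction of the new top-n absent from the previous top-n."""
    return len(set(r_next) - set(r_prev)) / n

def temporal_novelty(r_next, history, n):
    """Fraction of the new top-n never recommended in any earlier set."""
    seen = set().union(*map(set, history))
    return len(set(r_next) - seen) / n
```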
  • Part 3, continued: Serendipity
  • Measuring the Unexpected: Serendipity is the extent to which recommendations may positively surprise users. Murakami et al. (2008) propose to measure unexpectedness as the "distance between results produced by the method to be evaluated and those produced by a primitive prediction method": $\frac{1}{n} \sum_{i=1}^{n} \max(\Pr(s_i) - \mathrm{Prim}(s_i), 0) \times \mathrm{rel}(s_i) \times \frac{1}{i} \sum_{j=1}^{i} \mathrm{rel}(s_j)$. Ge et al. (2010) follow a similar approach: if $R_1$ is the recommended set returned by the RS and $R_2$ is the set returned by Prim, then $\mathrm{serendipity} = \frac{1}{|R_1 \setminus R_2|} \sum_{s_j \in R_1 \setminus R_2} \mathrm{rel}(s_j)$.
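A sketch of the Ge et al. (2010) variant: the average relevance of the recommendations that the primitive baseline would not have produced. The `rel` mapping (item relevance in {0, 1}) and the two input sets are assumed, hypothetical inputs.

```python
def serendipity(recommended, primitive, rel):
    """Mean relevance over the unexpected items R1 \\ R2."""
    unexpected = set(recommended) - set(primitive)
    if not unexpected:
        return 0.0
    return sum(rel[i] for i in unexpected) / len(unexpected)
```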
  • Novelty vs Serendipity: Novelty with regard to a given set is a measure of how different an item is from the other items in the set; it does not involve any notion of relevance. Is a serendipitous recommendation equivalent to a relevant novel recommendation? To me, serendipity encapsulates a higher degree of risk – a novel item with a low chance of relevance, according to our model, which nonetheless turns out to be relevant.
  • Conclusions: IR research gives some directions on how to define and evaluate diversity and novelty. We can ask: Are these adequate for RS research? Can we map them to the needs of RS evaluation? How are they deficient? Recent research is beginning to clarify these issues for RS, and I believe that objective measures are possible. I look forward to some interesting discussions on these issues!
  • Thank You. My research is sponsored by Science Foundation Ireland under grant 08/SRC/I1407: Clique: Graph and Network Analysis Cluster.
  • References
    Agrawal, R., Gollapudi, S., Halverson, A. and Ieong, S.: 2009, Diversifying search results, Proceedings of the Second ACM International Conference on Web Search and Data Mining (WSDM '09), ACM, New York, NY, USA, pp. 5–14. http://doi.acm.org/10.1145/1498759.1498766
    Boyce, B. R.: 1982, Beyond topicality: A two-stage view of relevance and the retrieval process, Inf. Process. Manage. 18(3), 105–109.
    Carbonell, J. and Goldstein, J.: 1998, The use of MMR, diversity-based reranking for reordering documents and producing summaries, Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '98), ACM, New York, NY, USA, pp. 335–336. http://doi.acm.org/10.1145/290941.291025
    Castells, P., Vargas, S. and Wang, J.: 2011, Novelty and diversity metrics for recommender systems: Choice, discovery and relevance, International Workshop on Diversity in Document Retrieval (DDR 2011) at the 33rd European Conference on Information Retrieval (ECIR 2011).
    Chen, H. and Karger, D. R.: 2006, Less is more: Probabilistic models for retrieving fewer relevant documents, in E. N. Efthimiadis, S. T. Dumais, D. Hawking and K. Järvelin (eds), SIGIR, ACM, pp. 429–436.
    Clarke, C. L., Kolla, M., Cormack, G. V., Vechtomova, O., Ashkan, A., Büttcher, S. and MacKinnon, I.: 2008, Novelty and diversity in information retrieval evaluation, Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '08), ACM, New York, NY, USA, pp. 659–666. http://doi.acm.org/10.1145/1390334.1390446
    Ge, M., Delgado-Battenfeld, C. and Jannach, D.: 2010, Beyond accuracy: Evaluating recommender systems by coverage and serendipity, Proceedings of the Fourth ACM Conference on Recommender Systems (RecSys '10), ACM, New York, NY, USA, pp. 257–260. http://doi.acm.org/10.1145/1864708.1864761
    Goffman, W.: 1964, On relevance as a measure, Information Storage and Retrieval 2(3), 201–203.
    Gollapudi, S. and Sharma, A.: 2009, An axiomatic approach for result diversification, Proceedings of the 18th International Conference on World Wide Web (WWW '09), ACM, New York, NY, USA, pp. 381–390. http://doi.acm.org/10.1145/1526709.1526761
    Lathia, N., Hailes, S., Capra, L. and Amatriain, X.: 2010, Temporal diversity in recommender systems, in F. Crestani, S. Marchand-Maillet, H.-H. Chen, E. N. Efthimiadis and J. Savoy (eds), SIGIR, ACM, pp. 210–217.
    Murakami, T., Mori, K. and Orihara, R.: 2008, Metrics for evaluating the serendipity of recommendation lists, in K. Satoh, A. Inokuchi, K. Nagao and T. Kawamura (eds), New Frontiers in Artificial Intelligence, Vol. 4914 of Lecture Notes in Computer Science, Springer Berlin / Heidelberg, pp. 40–46.
    Zhai, C. and Lafferty, J.: 2006, A risk minimization framework for information retrieval, Inf. Process. Manage. 42, 31–55. http://dx.doi.org/10.1016/j.ipm.2004.11.003