SIGIR 2018 - Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems

Full paper presented at ACM SIGIR 2018.

Published in: Data & Analytics

  1. Should I Follow the Crowd? A Probabilistic Analysis of the Effectiveness of Popularity in Recommender Systems. Rocío Cañamares and Pablo Castells, IRG (IR Group @UAM), Universidad Autónoma de Madrid, http://ir.ii.uam.es. 41st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2018), Ann Arbor, MI, USA, 10 July 2018.
  2. Popularity in recommendation: Would you find it useful to recommend these items?
  3. Outline: 1. The popularity bias in recommender systems; 2. Formal analysis; 3. Experiments; 4. Conclusions.
  4. Outline: 1. The popularity bias in recommender systems; 2. Formal analysis; 3. Experiments; 4. Conclusions.
  5. Recommender systems: “Users who bought this also bought…”; music discovery; related videos; “People you may know…”
  6. The recommender system’s task. The “rating” matrix is an abstraction of user-item interaction. • Rating matrix with some available cell values, most cells empty. [Figure: sparse users × items rating matrix with sample values between 1 and 5]
  7. The recommender system’s task. The “rating” matrix is an abstraction of user-item interaction. • Rating matrix with some available cell values, most cells empty. • Rank items by predicting missing ratings. • Offline evaluation: split the data into training and test. • Evaluate with IR metrics: test ratings = relevance judgments. [Figure: sparse users × items rating matrix with sample values between 1 and 5 and “?” in the cells to predict]
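
The offline protocol on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes ratings are `(user, item, value)` triples, and the 20% test ratio, the relevance threshold of 4, and all function names are hypothetical choices made for the example.

```python
import random

def split_ratings(ratings, test_ratio=0.2, seed=0):
    """Randomly split (user, item, value) triples into training and test lists."""
    rng = random.Random(seed)
    shuffled = list(ratings)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]

def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k positions filled with items judged relevant."""
    return sum(1 for item in ranking[:k] if item in relevant) / k

def observed_precision(recommend, train, test, k=1, threshold=4):
    """Mean P@k over users with test ratings; test ratings act as relevance judgments.

    `recommend(user, exclude=...)` is a hypothetical callable that returns a ranked
    item list, skipping the items the user already rated in training.
    """
    rated_in_train = {}
    for user, item, _ in train:
        rated_in_train.setdefault(user, set()).add(item)
    relevant = {}
    for user, item, value in test:
        relevant.setdefault(user, set())
        if value >= threshold:  # assumed relevance threshold
            relevant[user].add(item)
    scores = [
        precision_at_k(recommend(user, exclude=rated_in_train.get(user, set())), rel, k)
        for user, rel in relevant.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0
```

Anything that is unrated in the test set counts as non-relevant here, which is exactly the property of "observed" metrics that the rest of the talk questions.
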
  8. The popularity bias in recommender systems. In the data: [Figure: users × items rating matrix and number of ratings per item, showing a few popular items and the rest of items in a long tail]. In algorithms: [Figure: number of times in the top 10 vs. number of positive ratings per item, for matrix factorization, popularity, and user-based kNN]
  9. The popularity bias in recommender systems. In the data: [Figure: users × items rating matrix, popular items vs. rest of items (long tail)]. In algorithms: [Figure: number of times in the top 10 vs. number of positive ratings per item, for matrix factorization, popularity, and user-based kNN]. In offline evaluation? [Figure: nDCG@10 on MovieLens 1M for random, number of positive ratings, user-based kNN, and matrix factorization]
  10. Avoiding popularity. Popularity is not personalized, trivial, suspicious… Metrics and algorithms have been proposed that remove or cope with popularity biases. But… is popularity “good” or “bad”?
  11. Should I follow the crowd? • Lack of novelty is not a sufficient answer: there are degrees of popularity. • The majority provides useful default choices. • Technically the simplest and cheapest solution – most apps have majority listings. • The majority is not always right – randomness factors, conformity, manipulation, etc.
  12. Outline: 1. The popularity bias in recommender systems; 2. Formal analysis; 3. Experiments; 4. Conclusions.
  13. Our formal analysis. Research questions: • Is popularity really effective (accurate) in recommendation? – Which popularity: positive rating count or ratio? • Are we properly measuring its effectiveness in offline experiments? Roadmap of the analysis: expected precision; observed vs. true metric values; random variables; rankings; dependencies; theoretical findings.
  14. Formalization: observed vs. true accuracy. Observed metric value: computed on available user taste observations. True metric value: computed with full knowledge of user tastes. Observed ≈ true? [Figure: two users × items matrices, one with relevant, non-relevant, and missing ratings, one fully judged]
  15. Formalization: expectation. • Analysis of the expected accuracy of popularity-based recommendation – How good is popularity compared to “something else”? – Do observed and true accuracy agree (in expectation)? • A most simple metric: $\mathbb{E}[P@1]$. • Expectation → random variables, probabilities.
  16. Formalization: key random variables. $rel$ and $rated$, defined over Users × Items. [Figure: users × items matrix illustrating $rel$ and $rated$]
  17. Formalization: key random variables. $rel$ and $rated$, defined over Users × Items. [Figure: users × items matrix illustrating $rel$ and $rated$]
  18. Optimal and popularity rankings. • Popularity variants as probability ranking functions: number of relevant ratings ⟶ $pop(i) \propto p(rel, train \mid i)$; average rating value ⟶ $avg(i) = p(rel \mid train, i)$. • Lemma: the optimal non-personalized rankings are: for true $\mathbb{E}[P@1]$ ⟶ $opt(i) = p(rel \mid \neg train, i)$; for observed $\mathbb{E}[P@1]$ ⟶ $opt(i) = p(rel, test \mid \neg train, i)$. • We reach several findings by reasoning on rank equivalence.
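
The two popularity variants on this slide can be estimated from training data with simple counts. The sketch below is an illustration under assumptions: ratings are `(user, item, value)` triples, relevance is binarized with a hypothetical threshold of 4, and the "average rating" is taken as the ratio of positive ratings, which is how $p(rel \mid train, i)$ reads under binary relevance. Function names are made up for the example.

```python
from collections import defaultdict

def popularity_scores(train, threshold=4):
    """Return (pop, avg) score dicts estimated from training ratings.

    pop[i]: number of positive training ratings of item i, proportional to p(rel, train | i).
    avg[i]: ratio of positive ratings among the training ratings of i, i.e. p(rel | train, i).
    """
    n_rated = defaultdict(int)
    n_positive = defaultdict(int)
    for _, item, value in train:
        n_rated[item] += 1
        if value >= threshold:  # assumed relevance threshold
            n_positive[item] += 1
    pop = {item: n_positive[item] for item in n_rated}
    avg = {item: n_positive[item] / n_rated[item] for item in n_rated}
    return pop, avg

def rank(scores):
    """Items sorted by decreasing score; ties broken by item name for determinism."""
    return sorted(scores, key=lambda item: (-scores[item], str(item)))
```

With an item that has many ratings but mixed opinions and an item with one enthusiastic rating, `rank(pop)` and `rank(avg)` disagree, which is the tension the later findings analyze.
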
  19. Conditional (in)dependences between variables. User ⟶ item: discover, rate, like. Rating distribution $p(rated \mid i)$: [Figure: number of ratings per item]. $p(rated \mid i) = p(rated \mid rel, i)\,p(rel \mid i) + p(rated \mid \neg rel, i)\,p(\neg rel \mid i)$, with $p(rated \mid rel, i) = p(rated \mid seen, rel, i)\,p(seen \mid rel, i)$. [Diagram: dependencies among $seen$, $i$, $rel$, and $rated$]
  20. Conditional (in)dependences between variables. User ⟶ item: discover, rate, like. Rating distribution $p(rated \mid i)$: [Figure: number of ratings per item]. $p(rated \mid rel, i) = p(rated \mid seen, rel, i)\,p(seen \mid rel, i)$. Three scenarios [Diagrams: dependencies among $seen$, $i$, $rel$, and $rated$ in each scenario]: 1. Rating depends just on relevance. 2. Rating independent from relevance. 3. Rating depends both on items and relevance.
  21. Theoretical findings. 1. Rating depends only on relevance: observed and true optima agree, $pop \propto avg \propto$ optimal. 2. Rating independent from relevance: observed precision: random < $avg$ < $pop$ ∝ optimal; true precision: random < $pop$ < $avg$ ∝ optimal. 3. General case, no independence assumption (Monte Carlo, $\mathbb{E}[P@1 \mid \theta] = \int_{\Omega^n} \mathbb{E}[P@1 \mid \theta, \omega]\,d\omega$): observed precision: random ∼ $avg$ < $pop$ ∼ optimal; true precision: random ≼ $pop$ ≼ $avg$ ≺ optimal. [Figure: observed vs. true P@1, on a 0 to 1 scale, for random, optimal, popularity, and avg rating, annotated “Observed and true precision agree” and “Observed and true precision disagree”]
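
Finding 2 can be illustrated with a two-item toy computation. This is a sketch under explicit assumptions, not the paper's proof: rating is taken to be independent of relevance given the item, each rated user-item pair goes to training with a fixed probability `train_ratio`, and the probability values for items A and B are made up. Under those assumptions $p(rel \mid \neg train, i) = p(rel \mid i)$ and $p(rel, test \mid \neg train, i) = p(rel \mid i)\,p(test \mid i) / (1 - p(train \mid i))$.

```python
def expected_p_at_1(p_rel, p_rated, train_ratio):
    """Expected observed and true P@1 when one item is recommended to every user.

    Assumes rating is independent of relevance given the item, and that each
    rated pair lands in training with probability `train_ratio`.
    """
    p_train = p_rated * train_ratio
    p_test = p_rated * (1 - train_ratio)
    observed_value = p_rel * p_test / (1 - p_train)  # p(rel, test | not train, i)
    true_value = p_rel                               # p(rel | not train, i)
    return observed_value, true_value

# Hypothetical items: item -> (p(rel | i), p(rated | i)).
items = {"A": (0.3, 0.5), "B": (0.6, 0.1)}
train_ratio = 0.8

pop = {i: rel * rated * train_ratio for i, (rel, rated) in items.items()}  # p(rel, train | i)
avg = {i: rel for i, (rel, rated) in items.items()}  # p(rel | train, i) = p(rel | i) here

top_pop = max(pop, key=pop.get)  # item recommended by popularity
top_avg = max(avg, key=avg.get)  # item recommended by average rating
obs_pop, true_pop = expected_p_at_1(*items[top_pop], train_ratio)
obs_avg, true_avg = expected_p_at_1(*items[top_avg], train_ratio)
print(f"popularity picks {top_pop}: observed {obs_pop:.3f}, true {true_pop:.3f}")
print(f"avg rating picks {top_avg}: observed {obs_avg:.3f}, true {true_avg:.3f}")
```

In this toy setting popularity looks better under the observed metric while the average rating is better under the true one, matching the direction of the orderings stated in finding 2.
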
  22. Outline: 1. The popularity bias in recommender systems; 2. Formal analysis; 3. Experiments; 4. Conclusions.
  23. Experiments. We wish to: 1. Run recommendations by rating count, average rating, random, and optimal. 2. Compute observed and true precision. 3. See what comes out.
  24. Data: crowdsourced dataset. • We build a dataset free of observational (popularity) bias: 1. We sample 1,000 music tracks from deezer.com uniformly at random. 2. We ask anonymous workers on CrowdFlower to rate 100 tracks each, sampled uniformly at random → ~100 judgments per user and ~100 judgments per track, ~100,000 judgments in total. • CM100k dataset available at http://ir.ii.uam.es/cm100k
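
The sampling design on this slide can be sketched as below. It is an illustration only: the catalog is a hypothetical list of track identifiers, the function name is made up, and the default of 1,000 workers is inferred from ~100,000 judgments at 100 tracks per worker rather than stated on the slide.

```python
import random

def assign_tracks(catalog, n_tracks=1000, n_workers=1000, per_worker=100, seed=0):
    """Sample tracks uniformly from the catalog, then give each worker a uniform
    random subset to judge, so no track is favored by prior popularity."""
    rng = random.Random(seed)
    tracks = rng.sample(catalog, n_tracks)  # uniform sample without replacement
    return {worker: rng.sample(tracks, per_worker) for worker in range(n_workers)}
```

Because both sampling steps are uniform, the expected number of judgments per track is `n_workers * per_worker / n_tracks`, which gives about 100 with the defaults.
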
  25. Computing observed vs. true precision: judgments and ratings. [Figure: the 100k judgments over items, divided into “seen before” and “never seen before”]
  26. Computing observed vs. true precision: judgments and ratings. [Figure: the 100k judgments over items; the “seen before” judgments are labeled “ratings”, apart from the “never seen before” judgments]
  27. Computing observed vs. true precision: judgments and ratings. [Figure: observed precision: within the 100k judgments, the “ratings” are randomly split into training (input for the algorithm) and relevance judgments]
  28. Computing observed vs. true precision: judgments and ratings. [Figure: true precision estimate: the training ratings are the biased input for the algorithm, and the full relevance judgment sample is used for testing; shown next to the observed precision setup with its training and test split]
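
The two evaluation setups pictured on slides 25 to 28 can be sketched as a single split routine. This is a minimal reading of the figures, not the authors' code: judgments are assumed to be `(user, item, relevant, seen_before)` tuples with unique user-item pairs, the 80% training ratio is a hypothetical value, and the function name is made up.

```python
import random

def build_splits(judgments, train_ratio=0.8, seed=0):
    """Split crowdsourced judgments for observed and true precision.

    judgments: list of (user, item, relevant, seen_before) tuples.
    Returns (train, observed_test, true_test).
    """
    rng = random.Random(seed)
    # Only "seen before" judgments play the role of ratings a real system would collect.
    ratings = [j for j in judgments if j[3]]
    rng.shuffle(ratings)
    n_train = int(len(ratings) * train_ratio)
    train = ratings[:n_train]            # biased input for the algorithm
    observed_test = ratings[n_train:]    # relevance judgments for observed precision
    train_pairs = {(user, item) for user, item, _, _ in train}
    # True precision is estimated against every judgment outside training.
    true_test = [j for j in judgments if (j[0], j[1]) not in train_pairs]
    return train, observed_test, true_test
```

The algorithm sees the same training ratings in both setups; only the relevance judgments used for scoring differ.
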
  29. Results. [Figure: nDCG@10 of random, optimal, average rating, and popularity recommendation; observed values on MovieLens 1M, observed and true values on CM100k] Similar qualitative outcome.
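
The results are reported in nDCG@10. A common binary-relevance formulation of the metric is sketched below; the slides do not state which nDCG variant was used, so the gain and discount choices here are assumptions and the function name is illustrative.

```python
import math

def ndcg_at_k(ranking, relevant, k=10):
    """Binary-gain nDCG@k: each relevant item at 0-based position p contributes 1 / log2(p + 2)."""
    dcg = sum(
        1.0 / math.log2(pos + 2)
        for pos, item in enumerate(ranking[:k])
        if item in relevant
    )
    # The ideal ranking places all relevant items (up to k) at the top.
    ideal = sum(1.0 / math.log2(pos + 2) for pos in range(min(k, len(relevant))))
    return dcg / ideal if ideal > 0 else 0.0
```

Passing the observed test ratings or the full judgment sample as `relevant` gives the observed or the true value of the metric, respectively.
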
  30. Results. [Figure: nDCG@10 of random, optimal, average rating, and popularity recommendation; observed values on MovieLens 1M, observed and true values on CM100k] Annotations: • Popularity almost optimal • Popularity ∼ random
  31. Results. [Figure: nDCG@10 of random, optimal, average rating, and popularity recommendation; observed values on MovieLens 1M, observed and true values on CM100k] Annotations: • Popularity almost optimal • Avg rating < popularity • Popularity ∼ random • Avg rating > popularity
  32. Test relevance dependence extremes. Shuffling the discovery distribution in different ways: 1. Full relevance dependence. 2. Relevance independence. [Figure: observed and true nDCG@10 of random, optimal, avg rating, and popularity in each of the two scenarios]
  33. Test relevance dependence extremes. Shuffling the discovery distribution in different ways: 1. Full relevance dependence. 2. Relevance independence. [Figure: observed and true nDCG@10 of random, optimal, avg rating, and popularity in each of the two scenarios] Annotations: popularity and avg rating are just OK; random < avg rating < popularity < optimal; observed and true values roughly agree.
  34. Test relevance dependence extremes. Shuffling the discovery distribution in different ways: 1. Full relevance dependence. 2. Relevance independence. [Figure: observed and true nDCG@10 of random, optimal, avg rating, and popularity in each of the two scenarios] Annotations: • Popularity near optimal • Avg rating near random; random < popularity < avg rating < optimal.
  35. Implications on personalized algorithms: user-based kNN. [Figure: nDCG@10 of non-normalized kNN (biased to popularity) and normalized kNN (biased to avg rating); observed values on MovieLens 1M, observed and true values on CM100k (full dependencies)] Annotations: non-normalized > normalized; non-normalized < normalized.
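
The difference between the two kNN variants can be sketched with a textbook user-based scoring function. This is a generic formulation, not necessarily the exact one used in the experiments; the function name and data layout are hypothetical. The non-normalized score sums similarity-weighted ratings, so it grows with the number of neighbors who rated the item; the normalized score divides by the summed similarity, so it behaves like a weighted average rating.

```python
def knn_score(neighbors, ratings, item, normalized):
    """Score `item` for a target user from the ratings of its nearest neighbors.

    neighbors: list of (neighbor_id, similarity) pairs for the target user.
    ratings: dict neighbor_id -> {item: rating value}.
    """
    weighted_sum = 0.0
    similarity_sum = 0.0
    for neighbor, similarity in neighbors:
        neighbor_ratings = ratings.get(neighbor, {})
        if item in neighbor_ratings:
            weighted_sum += similarity * neighbor_ratings[item]
            similarity_sum += similarity
    if not normalized:
        return weighted_sum  # grows with the number of raters: popularity-like
    return weighted_sum / similarity_sum if similarity_sum > 0 else 0.0
```

For example, an item rated 4 by three neighbors beats an item rated 5 by a single neighbor under the non-normalized score, and loses to it under the normalized one.
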
  36. Outline: 1. The popularity bias in recommender systems; 2. Formal analysis; 3. Experiments; 4. Conclusions.
  37. Conclusions. • So… is it good or bad to recommend popular items? – It depends on the relation between rating, discovery, and relevance. – It tends to be good; worst case: complex simultaneous dependencies. – Weak relevance dependence can twist offline measurements. • The average rating can work better than the rating count. – Observed accuracy tends to be unfair to the average rating. • Implications on collaborative filtering algorithms. – Understanding popularity can help improve state-of-the-art algorithms. • Evaluation with random samples can uncover new findings.
  38. Future work. • Further questions can be addressed with a similar formal approach. • Undiscovered accuracy: stop by our poster! • Other data split procedures, e.g. temporal. • Dynamic taste development. • Recommender system in the discovery loop. • Further research on personalized algorithms. • Denser (full?) unbiased judgment matrix.
