Modeling Difficulty in Recommender Systems

Presentation given at the Workshop on Recommendation Utility Evaluation: Beyond RMSE, held in conjunction with the ACM Conference on Recommender Systems (RecSys) on September 9, 2012.

  1. Modeling Difficulty in Recommender Systems
     Benjamin Kille (@bennykille)
     Competence Center Information Retrieval & Machine Learning
  2. Outline
     ► Recommender System Evaluation
     ► Problem Formulation
     ► Difficulty in Recommender Systems
     ► Future Work
     ► Conclusions
  3. Recommender Systems Evaluation
     ► Definition of an evaluation measure (standard formulas below):
       – RMSE (rating prediction scenario)
       – nDCG (ranking scenario)
       – Precision@N (top-N scenario)
     ► Splitting data into training and test partitions
     ► Reporting results as an average over the full set of users
     ► Is recommending to all users equally difficult?
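For reference, the standard definitions of the three measures named on this slide (the slides themselves do not spell them out; one common formulation of DCG is used here):

```latex
% \hat{r}_{ui}: predicted rating, r_{ui}: observed rating,
% T: test set of (user, item) pairs, rel_k: relevance at rank k.
\begin{align*}
  \mathrm{RMSE} &= \sqrt{\frac{1}{|T|}\sum_{(u,i)\in T}\bigl(\hat{r}_{ui}-r_{ui}\bigr)^{2}} \\
  \mathrm{nDCG@}N &= \frac{\mathrm{DCG@}N}{\mathrm{IDCG@}N},
    \qquad \mathrm{DCG@}N = \sum_{k=1}^{N}\frac{2^{\mathit{rel}_{k}}-1}{\log_{2}(k+1)} \\
  \mathrm{Precision@}N &= \frac{\lvert\{\text{relevant items}\}\cap\{\text{top-}N\text{ items}\}\rvert}{N}
\end{align*}
```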
  4. Observed Differences
     ► Users differ with respect to:
       – Demographics (e.g., age, gender, and location)
       – Taste
       – Needs
       – Expectations
       – Consumption patterns
       – …
     ► Recommendation algorithms do not perform equally well for every single user
     → Users should not all be evaluated in the same way!
  5. Risks of Disregarding Users' Differences
     ► A subset of users receives worse recommendations than achievable
     ► Recommendation algorithm optimization targets all users equally:
       – "easy" users → costs could be saved
       – "difficult" users → insufficient optimization
     → Focus optimization on those users who really require it!
     How can difficulty be determined?
  6. Problem Formulation
     ► Measuring how difficult it will be to recommend items to a given user
     ► Ideally: deriving difficulty directly from user attributes
     ► Problem: the correlation between (combinations of) attributes and difficulty is unknown
     ► We need a method to estimate the correlation between user attributes and recommendation difficulty (see the sketch below)
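Once per-user difficulty scores are available (e.g., via the disagreement measure on slide 8), this last step reduces to ordinary correlation estimation. A minimal sketch, assuming a single numeric user attribute; all names and data here are hypothetical:

```python
import numpy as np

# Hypothetical per-user data: a numeric attribute (profile size)
# and an observed difficulty score for each of six users.
profile_size = np.array([12, 450, 33, 8, 210, 95])
difficulty = np.array([0.9, 0.2, 0.6, 0.8, 0.3, 0.4])

# Pearson correlation between the attribute and observed difficulty.
r = np.corrcoef(profile_size, difficulty)[0, 1]
print(f"correlation(profile_size, difficulty) = {r:.2f}")
```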
  7. Difficulty in Information Retrieval
     ► Target object: query
     ► Method: submit the same query to several IR systems and compare their result lists
     [Diagram: one query fed into five IR systems, each returning a ranked list of documents (Doc 1, Doc 2, Doc 3, …); the lists differ in content and ordering]
     ► Difficulty = diversity of the returned lists of documents (see the sketch below)
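The slide does not fix a particular diversity measure; one simple choice is the average pairwise Jaccard distance between the systems' top-k result sets. A minimal sketch under that assumption:

```python
from itertools import combinations

def list_diversity(ranked_lists, k=10):
    """Average pairwise Jaccard distance between the top-k result
    sets of several systems: 0 = identical sets, 1 = disjoint sets."""
    top_sets = [set(lst[:k]) for lst in ranked_lists]
    distances = [1 - len(a & b) / len(a | b)
                 for a, b in combinations(top_sets, 2)]
    return sum(distances) / len(distances)

# Hypothetical output of three IR systems for the same query.
results = [
    ["d1", "d2", "d3", "d4"],
    ["d1", "d3", "d5", "d2"],
    ["d7", "d8", "d1", "d9"],
]
print(f"difficulty proxy: {list_diversity(results, k=4):.2f}")
```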
  8. Difficulty in Recommender Systems
     ► Select several state-of-the-art recommendation methods
     ► Measure the diversity of their output for a specific user
     ► Based on the methods' agreement with respect to predicted ratings / rankings / top-N items, we conclude:
       – high agreement → low difficulty
       – low agreement → high difficulty
     ► The target correlation (user attributes ~ difficulty) can be estimated from the difficulties observed for a sufficiently large set of users (see the sketch below)
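For the rating-prediction scenario, one way to instantiate this agreement measure (a sketch of mine, not a formulation from the slides) is the per-item standard deviation of the methods' predictions, averaged over a user's test items:

```python
import numpy as np

def user_difficulty(predictions):
    """predictions: shape (n_methods, n_items), each method's predicted
    ratings for one user's test items. Returns the mean per-item
    standard deviation across methods: low agreement -> high score."""
    predictions = np.asarray(predictions, dtype=float)
    return predictions.std(axis=0).mean()

# Hypothetical: three recommenders predicting ratings for four items.
preds = [
    [4.1, 3.0, 2.5, 5.0],  # method A
    [4.0, 3.2, 2.4, 4.8],  # method B
    [2.0, 4.5, 4.0, 3.1],  # method C
]
print(f"difficulty score: {user_difficulty(preds):.2f}")
```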
  9. Future Work
     ► Experimentally verify the feasibility of difficulty estimation
     ► Evaluate the observed correlation (user attributes ~ difficulty) on real data sets
     ► Investigate the business rationale (reduced costs through controlled optimization effort)
     ► Work out how to deal with sparsity / cold-start issues
  10. Conclusions
      ► Users should not be treated equally when evaluating recommender systems
      ► The difficulty of the recommendation task varies between users
      ► Difficulty scores make it possible to direct optimization towards those users who require it
      ► Diversity metrics could be used to estimate difficulty scores (analogously to information retrieval)
      ► The proposed method still needs to be evaluated
  11. Thank you for your attention!
      Questions?
  12. References
      [He2008] J. He, M. Larson, and M. de Rijke. Using coherence-based measures to predict query difficulty. ECIR 2008.
      [Herlocker2004] J. Herlocker, J. Konstan, L. Terveen, and J. Riedl. Evaluating collaborative filtering recommender systems. ACM TOIS 22(1), 2004.
      [Kuncheva2003] L. Kuncheva and C. Whitaker. Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy. Machine Learning 51, 2003.
      [Vargas2011] S. Vargas and P. Castells. Rank and relevance in novelty and diversity metrics for recommender systems. RecSys 2011.
