
Tips on How to Evaluate a Recommender System Dr Carol Hargreaves

Evaluating a recommender system is essential to understanding whether it will be effective for you and whether it takes the right metrics into account for decision making. Measuring the effectiveness of a recommender system, and whether the customer is satisfied with the recommendation, is key to understanding how good your recommender system is. This presentation outlines a few concepts to consider so that your recommender system takes the right metrics into consideration for its recommendations.

Published in: Data & Analytics
  1. Tips on How to Evaluate a Recommender System
     Dr Carol Hargreaves
     Chief of Business Analytics
     Institute of Systems Science
     National University of Singapore
     carol.hargreaves@nus.edu.sg
  2. Tips on How to Evaluate a Recommender System
     • What does a rating mean?
     • Are ratings reliable and accurate (e.g. self-selection bias)?
     • Do users use the ratings in the same way (ratings are subjective)?
     • Do user preferences change over time?
     • Average ratings lack context!
     • People who bought A also bought products B and C
     • Accuracy metrics: Mean Absolute Error (MAE); similar metrics: MSE, RMSE, normalised MSE
     • PROS:
        - Is simple and easy to understand
        - Has statistical properties that allow testing the significance of a difference between the MAE of two different systems
        - On a 1-7 scale, measuring the MAE of the extremes can be valuable
     • CONS:
        - Less appropriate for tasks such as "FIND GOOD ITEMS"
        - Less appropriate when the granularity of true preference is small
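The accuracy metrics named on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's code: the function names `mae` and `rmse` and the sample ratings are assumptions made up for the example.

```python
import math


def mae(actual, predicted):
    """Mean absolute error between true and predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error; penalises large errors more than MAE does."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical ratings, invented for illustration only.
actual = [5, 3, 4, 1]
predicted = [4, 3, 5, 3]

mae_value = mae(actual, predicted)    # absolute errors 1, 0, 1, 2 -> mean 1.0
rmse_value = rmse(actual, predicted)  # squared errors 1, 0, 1, 4 -> sqrt(1.5)
```

Because RMSE squares each error before averaging, the single error of 2 pulls RMSE above MAE here, which is the usual reason to report both.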
  3. Tips on How to Evaluate a Recommender System
     Classification Accuracy Metrics
        - Measure the frequency with which a recommender system makes correct or incorrect decisions about whether an item is good.
        - Appropriate for tasks such as "FIND GOOD ITEMS", where users have true binary preferences
        - Recall, Precision, F1, Overall Accuracy
     Recommender systems must provide not just accuracy but also usefulness.
        - For example, a recommender system may achieve high accuracy by only computing predictions for easy-to-predict items. But those are the very items for which the user is least likely to need predictions.
        - Further, a recommender system that always recommends popular items can promise that users will like most of the items recommended, but a simple popularity metric could do the same thing.
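A minimal sketch of the classification accuracy metrics listed on this slide, assuming binary relevance (an item is either good for the user or not). The function name `precision_recall_f1` and the item identifiers are hypothetical.

```python
def precision_recall_f1(recommended, relevant):
    """Return (precision, recall, F1) for one user's recommendation list.

    recommended: items the system suggested
    relevant:    items the user actually considers good
    """
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Hypothetical data: 4 items recommended, 3 items truly relevant, 2 overlap.
precision, recall, f1 = precision_recall_f1(
    recommended=["a", "b", "c", "d"],
    relevant=["a", "c", "e"],
)
# precision = 2/4, recall = 2/3, F1 = 4/7
```

Precision answers "how many recommended items were good", recall answers "how many good items were recommended", and F1 is their harmonic mean.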
  4. Recommender Systems Beyond Accuracy
     Coverage Metric
        Recommender system suitability includes coverage, which measures the percentage of a dataset for which the recommender system is able to provide predictions.
     Confidence Metrics
        Confidence metrics can help users make more effective decisions.
     Learning Rate
        Learning rate measures how quickly an algorithm can produce good recommendations.
     Novelty
        Novelty measures whether a recommendation is a novel possibility for a user. Managers do not need a recommender system to tell them which products are popular overall. We need new dimensions for analyzing recommender systems.
     Serendipity
        A serendipitous recommendation helps the user find a surprisingly interesting item they might not have otherwise discovered.
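The coverage metric described on this slide can be sketched as follows. This assumes a hypothetical `predict(user, item)` callable that returns `None` when the system cannot produce a prediction; that convention, the function name `prediction_coverage`, and the toy data are illustrative assumptions rather than anything stated in the slides.

```python
def prediction_coverage(pairs, predict):
    """Percentage of (user, item) pairs for which a prediction is available."""
    if not pairs:
        return 0.0
    covered = sum(1 for user, item in pairs if predict(user, item) is not None)
    return 100.0 * covered / len(pairs)


# Hypothetical predictor backed by a lookup table; missing pairs give None.
known = {("u1", "i1"): 4.0, ("u1", "i2"): 3.5, ("u2", "i1"): 2.0}


def predict(user, item):
    return known.get((user, item))


pairs = [("u1", "i1"), ("u1", "i2"), ("u2", "i1"), ("u2", "i3")]
coverage = prediction_coverage(pairs, predict)  # 3 of 4 pairs covered
```

A system that only predicts for easy items can score well on accuracy while leaving coverage low, so the two numbers are best read together.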
  5. Some Problem Scenarios…
     • Suppose you have users that rate items on your website. You want to put the highest-rated items at the top and the lowest-rated at the bottom. You need some sort of 'score' to sort by.
     • Score = Positive Ratings - Negative Ratings
        - Suppose item one has 600 positive ratings and 400 negative ratings (score 200, 60% positive).
        - Suppose item two has 5,500 positive ratings and 4,500 negative ratings (score 1,000, 55% positive).
        - This algorithm puts item two (score 1,000, but only 55% positive) above item one (score 200, 60% positive).
     • Score = Positive Ratings / Total Ratings
        - The average rating works fine if you always have a large number of ratings, but suppose item 1 has 2 positive ratings and 0 negative ratings.
        - Suppose item 2 has 100 positive ratings and 1 negative rating.
        - This algorithm puts item 2 (many positive ratings) below item 1 (very few positive ratings).
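The arithmetic in both scenarios can be checked with a short sketch. The rating counts come from the slide; the function names `net_score` and `positive_fraction` are assumptions introduced for illustration.

```python
def net_score(positive, negative):
    """Score = positive ratings - negative ratings."""
    return positive - negative


def positive_fraction(positive, negative):
    """Score = positive ratings / total ratings."""
    total = positive + negative
    return positive / total if total else 0.0


# Scenario 1: the difference score favours the item with more ratings.
item_one = (600, 400)
item_two = (5500, 4500)
net_one, net_two = net_score(*item_one), net_score(*item_two)            # 200, 1000
frac_one, frac_two = positive_fraction(*item_one), positive_fraction(*item_two)  # 0.60, 0.55

# Scenario 2: the fraction score favours the item with very few ratings.
small_item = (2, 0)
large_item = (100, 1)
frac_small = positive_fraction(*small_item)  # 2/2 = 1.0
frac_large = positive_fraction(*large_item)  # 100/101, about 0.99
```

In the first scenario the net score ranks item two higher even though a smaller share of its ratings is positive; in the second, the fraction ranks the item with only two ratings above the one with 101.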
  6. To Conclude…
     • Many techniques have been proposed…
     • You need to test your recommender system
     • Who is the subject of the test, i.e. the focus for testing the recommender system? (Online customers, students, historical online sessions)
     • What research method or design is used for testing? (Experiments, non-experiments)
     • In what setting is the testing performed? (Laboratory, real-world scenarios)
     • What comparative analysis will you use to compare your recommender systems, and based on which optimality criterion?
        - Accuracy
        - User satisfaction
        - Response time
        - Serendipity
        - Online conversion
  7. To Conclude…
     Dr Carol Anne Hargreaves
     Institute of Systems Science
     National University of Singapore
     carol.hargreaves@nus.edu.sg
     • Which technique is the best in a given application domain?
     • What are the success factors of different techniques?
     • Do customers like/buy recommended items?
     • Do customers buy items they otherwise would not have bought?
     • Are they satisfied with the recommendation after the purchase?
