This study estimated the "magic barrier" of recommender systems by collecting additional ratings ("opinions") from users on items they had previously rated. The magic barrier represents the lowest expected error rate achievable by any recommendation algorithm, given natural inconsistencies in human ratings. The researchers collected over 6,000 new opinions from 300+ users on movies they had rated previously. They calculated the standard deviation of errors between original ratings and new opinions, finding a magic barrier of approximately 1.2 on the site's 0-10 rating scale. This suggests recommender systems cannot achieve perfect predictions and that errors within 1.2 points are attributable to natural human inconsistencies rather than algorithm quality.
Estimating the Magic Barrier of Recommender Systems: A User Study
Alan Said, Brijnesh J. Jain, Sascha Narr,
Till Plumbaum, Sahin Albayrak, Christian Scheel
SIGIR 2012 – Portland, OR, USA
Evaluating Recommender Systems
Recommender systems evaluation generally measures the quality of the algorithm based on some accuracy metric, e.g. precision, or error measure, e.g. root-mean-square error. However, these measures neglect the inherent inconsistencies that users, being people, are subject to.
These are the first results from a noise measurement user study for estimating the magic barrier of recommender systems, conducted on a commercial movie recommendation community.
The magic barrier is the expected squared error of the optimal recommendation algorithm, or, the lowest error we can expect from any recommendation algorithm. Our results show that the barrier can be estimated by collecting the opinions of users on already rated items.

The User Study
We asked users of www.moviepilot.de to provide new ratings (so-called opinions) for movies they had rated in the past. We specifically asked for opinions and not re-ratings so as not to suggest a change of heart.
The user interface for collecting opinions was created so that it resembled the regular rating page of moviepilot, in order to create a feeling of familiarity for the users and lower rating inconsistencies related to unfamiliarity with the system.
Data
The study ran in April and May 2011 and resulted in a dataset containing 6,299 opinions on 2,329 movies by 306 users, i.e. 6,299 rating-opinion pairs. All participating users had to have rated at least 50 movies on moviepilot.de and given at least 20 new opinions.
[Figure: The "rate new movies" page on moviepilot.de]
[Figure: Our interface for collecting new opinions]
The Magic Barrier
Root-mean-square error (RMSE) is commonly used for accuracy evaluation of a rating function $f$ on a set of ratings $R$:

$$E(f \mid R) = \sqrt{\frac{1}{|R|} \sum_{(u,i) \in R} \bigl(f(u,i) - r_{ui}\bigr)^2}$$

Having new opinions, we can express the error between an original rating $r_{ui}$ and a new opinion $o_{ui}$ on item $i$ by user $u$ as

$$\varepsilon_{ui} = r_{ui} - o_{ui}$$

We can suppose there is an unknown true rating function $f^*$ that knows the true opinions of each user on each item. We can derive an estimate of the RMSE of $f^*$ as

$$\hat{E}(f^* \mid R) = \sqrt{\frac{1}{|R|} \sum_{(u,i) \in R} \varepsilon_{ui}^2}$$

which is equal to the standard deviation of $\varepsilon$ when the errors have zero mean.
It is possible that there are rating functions with a lower RMSE than $\hat{E}(f^* \mid R)$; these functions tend to overfit, and their lower RMSE does not mean they perform better: they perform within the boundaries of the magic barrier.

Calculated Magic Barrier
[Figure: bar chart of the standard deviation of the error for the groups all, r ≥ avg, and r < avg. Bar values shown: 1.417, 1.201, 1.043.]
Standard deviation of the error, where all refers to the deviation over all opinions; r ≥ avg and r < avg refer to the deviation over all ratings above and below average.
Moviepilot's rating scale is 0-10 stars. A magic barrier of ±1.2 means that rating prediction errors within that boundary are part of users' rating inconsistencies.
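The estimate described in this section can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record layout `(user, rating, opinion)`, the function names `rms_error` and `barrier_by_group`, and the toy data are all hypothetical. It treats the difference between an original rating and the new opinion as the error and takes its root mean square, which coincides with the standard deviation only if the differences average to zero. The split into "r >= avg" and "r < avg" assumes that "avg" means each user's own mean original rating, which is one possible reading of the figure caption.

```python
import math
from collections import defaultdict


def rms_error(pairs):
    """Root mean square of (rating - opinion) over (rating, opinion) pairs."""
    if not pairs:
        raise ValueError("need at least one rating-opinion pair")
    return math.sqrt(sum((r - o) ** 2 for r, o in pairs) / len(pairs))


def barrier_by_group(records):
    """Estimate the barrier overall and for above/below-average ratings.

    records: iterable of (user, rating, opinion) tuples (hypothetical layout).
    Assumption: "avg" is the user's mean original rating within the records.
    """
    by_user = defaultdict(list)
    for user, rating, opinion in records:
        by_user[user].append((rating, opinion))

    all_pairs, high, low = [], [], []
    for pairs in by_user.values():
        avg = sum(r for r, _ in pairs) / len(pairs)
        for r, o in pairs:
            all_pairs.append((r, o))
            (high if r >= avg else low).append((r, o))

    return {
        "all": rms_error(all_pairs),
        "r >= avg": rms_error(high) if high else None,
        "r < avg": rms_error(low) if low else None,
    }


# Made-up toy data on a 0-10 scale, not taken from the study.
records = [
    ("u1", 8.0, 7.0),
    ("u1", 4.0, 6.0),
    ("u2", 9.0, 9.0),
    ("u2", 5.0, 4.0),
]
result = barrier_by_group(records)
print(result)
```

A recommender whose RMSE on such data falls below the "all" value would, under the reasoning above, be fitting rating noise rather than predicting better.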
Results & Conclusion
We presented a study on the inherent noise found in rating values given by users in a commercial recommendation system.
Our assumption, that the magic barrier of recommender systems can be better assessed by noise estimation, seems to hold.
We presented an early model for the magic barrier and the level of accuracy a recommender system can achieve without over-fitting on the noise in the data. Performing an estimate of the magic barrier of a system makes it possible to assess whether a system can be further improved or not.
We suggest that in order to estimate a system's prediction quality, opinion collection for magic barrier estimation should be conducted regularly.

Further Reading
Detailed explanation of the magic barrier: Users and Noise: The Magic Barrier of Recommender Systems [UMAP2012, Said et al.]
Paper version of the poster
Technische Universität Berlin {alan, jain, narr, till, sahin, scheel}@dai‐lab.de www.dai‐lab.de