Choice Overload in Recommenders
Recommenders reduce information overload…
But large personalized sets cause choice overload!
From Iyengar and Lepper (2000):
  Smaller set: less attractive, 30% sales, higher purchase satisfaction
  Larger set: more attractive, 3% sales
Satisfaction decreases with larger sets, as increased attractiveness is counteracted by choice difficulty

Satisfaction and item set length
More options provide more benefits in terms of finding the right option
But result in higher opportunity costs
  More comparisons required
  Increased potential regret
  Larger expectations for larger sets
Paradox of choice (Barry Schwartz)

Choice Overload in Recommenders (Bollen, Knijnenburg, Willemsen & Graus, RecSys 2010)
[Figure: path model relating the Top-20 and Lin-20 conditions (each compared with Top-5 recommendations) to perceived recommendation variety, perceived recommendation quality, choice difficulty and choice satisfaction, with movie expertise as a personal characteristic. Legend: Objective System Aspects (OSA), Subjective System Aspects (SSA), Experience (EXP), Personal Characteristics (PC), Interaction (INT). Inset chart: choice satisfaction for Top-5, Top-20 and Lin-20.]

Research on Choice Overload
Choice overload is not omnipresent
  Meta-analysis (Scheibehenne et al., JCR 2010) suggests an overall effect size of zero
Choice overload is stronger when:
  There are no strong prior preferences
  There is little difference in attractiveness between items
In consumer literature, most item sets are not personalized…
Prior studies did not control for the diversity of the item set

Choice Difficulty and Diversity
Larger sets are often more difficult because of the increased uniformity of these sets (Fasolo et al., 2009; Reutskaja et al., 2009)
Larger item sets have many similar options
  Small inter-product distances and small tradeoffs
  High density!
Choice difficulty related to lack of justification
[Figure: "High Density, small tradeoffs". Options plotted by minutes (x-axis) and # text messages (y-axis).]

Choice difficulty and trade-offs
As item sets become more diverse (less dense), tradeoff size increases
[Figure: two panels, "High Density, small tradeoffs" and "Low Density, large tradeoffs". Options plotted by minutes (x-axis) and # text messages (y-axis).]
Tradeoffs are effortful… give up one aspect for another
But can be justified very easily!

Double Mediation Model for difficulty (Scholten and Sherman, JEP:G 2006)
U-shaped relation between diversity and difficulty:
  Choosing from a uniform set is hard to justify but has no difficult tradeoffs
  Choosing from a diverse set encompasses difficult tradeoffs but is easy to justify
[Figure: U-shaped curve of Difficulty (y-axis) against Diversity / tradeoff size (x-axis, from uniform to diverse).]
Does this also apply to personalized item sets?
Can we use diversity to reduce choice overload?
And to what level of diversity?

User Study
Manipulate diversity in a personalized item set while keeping attractiveness constant!
  Recommenders provide the perfect tools for this…
Latent features as a means of diversification
  Factorization algorithms describe movies and users as vectors on a set of latent features
  Preference dimensions related to real-world concepts (e.g. escapist/serious) (Koren, Bell and Volinsky, 2009)
  Parallel to how choice sets are described in MAUT (multi-attribute utility theory)

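Matrix Factorization
[Figure: user-by-movie rating matrix for the users Jack, Dylan, Olivia and Mark and the movies Usual Suspects, The Godfather, Die Hard and Titanic; unknown ratings are marked "?".]
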
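The idea of filling in the "?" cells can be sketched in a few lines. This is a minimal illustration, not the recommender used in the study: the ratings are made up (the slide's actual values are not recoverable), and the function name `factorize`, the two latent features, and the SGD hyperparameters are all assumptions chosen for the toy example.

```python
import numpy as np

users = ["Jack", "Dylan", "Olivia", "Mark"]
movies = ["Usual Suspects", "The Godfather", "Die Hard", "Titanic"]

# Made-up ratings for illustration only; np.nan marks an unknown ("?") cell.
ratings = np.array([
    [5.0, 4.0, np.nan, 1.0],
    [np.nan, 5.0, 4.0, np.nan],
    [1.0, 2.0, np.nan, 5.0],
    [np.nan, np.nan, 5.0, np.nan],
])


def factorize(ratings, n_features=2, n_epochs=2000, lr=0.01, reg=0.02, seed=0):
    """Learn user vectors P and movie vectors Q so that P @ Q.T fits the known ratings."""
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    P = rng.normal(scale=0.1, size=(n_users, n_features))
    Q = rng.normal(scale=0.1, size=(n_items, n_features))
    known = [(u, i) for u in range(n_users) for i in range(n_items)
             if not np.isnan(ratings[u, i])]
    for _ in range(n_epochs):
        for u, i in known:
            err = ratings[u, i] - P[u] @ Q[i]
            p_old = P[u].copy()
            # Stochastic gradient step with L2 regularization on both vectors.
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_old - reg * Q[i])
    return P, Q


P, Q = factorize(ratings)
predicted = P @ Q.T  # every cell, including the unknown ones, now has a score
```

Each row of `Q` places a movie in the latent feature space; these item vectors are what a diversification step would operate on.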
Diversifying attractive items
[Figure showing the users Olivia, Dylan, Jack and Mark.]

Diversity manipulation in detail
10-dimensional MF model
Personalized top 200 (close in attractiveness)
Greedy algorithm
  Select N movies with highest inter-item distance
Low: closest to centroid
Medium: 100 items closest to centroid
High: from top 200

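One way to read the three levels is sketched below. Several details are assumptions, not taken from the slides: Euclidean distance, seeding the greedy step with the item closest to the centroid, scoring candidates by their summed distance to the items already picked, and the function names themselves. `top_items` stands for the latent feature vectors of a user's top 200 movies.

```python
import numpy as np


def closest_to_centroid(features, k):
    """Indices of the k rows nearest to the centroid of `features`."""
    centroid = features.mean(axis=0)
    order = np.argsort(np.linalg.norm(features - centroid, axis=1))
    return order[:k]


def greedy_diverse(features, n):
    """Greedily pick n row indices that lie far apart from each other."""
    if n > len(features):
        raise ValueError("n exceeds the number of candidate items")
    # Assumption: start from the item closest to the centroid.
    selected = [int(closest_to_centroid(features, 1)[0])]
    while len(selected) < n:
        # Summed Euclidean distance from every candidate to the picked items.
        diffs = features[:, None, :] - features[selected][None, :, :]
        score = np.linalg.norm(diffs, axis=2).sum(axis=1)
        score[selected] = -np.inf  # never pick the same item twice
        selected.append(int(np.argmax(score)))
    return selected


def diversify(top_items, n, level):
    """Return n item indices for the 'low', 'medium' or 'high' diversity level."""
    if level == "low":
        return closest_to_centroid(top_items, n).tolist()
    if level == "medium":
        pool = closest_to_centroid(top_items, 100)
        return pool[greedy_diverse(top_items[pool], n)].tolist()
    if level == "high":
        return greedy_diverse(top_items, n)
    raise ValueError(f"unknown diversity level: {level}")
```

Because all candidates come from the same personalized top 200, the three lists differ in spread over the latent features while staying close in predicted attractiveness.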
System characteristics
MF recommender based on MyMedia project
10M MovieLens dataset: movies from 1994
  5.6M ratings for 70k users and 5.4k movies
  RMSE of 0.854, MAE of 0.656
Movies shown with title and predicted rating
  Hovering the mouse over the title reveals additional information: short synopsis, cast, director and image

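For reference, the two accuracy measures reported above follow their standard definitions. The sketch below is not the MyMedia evaluation code; the function names and example values are illustrative.

```python
import numpy as np


def rmse(actual, predicted):
    """Root mean squared error between actual and predicted ratings."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


def mae(actual, predicted):
    """Mean absolute error between actual and predicted ratings."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(actual - predicted)))
```

RMSE penalizes large errors more heavily than MAE, which is why it is never smaller than MAE on the same predictions.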
Experimental design/procedure
Pre-questionnaire
  Personal characteristics
Rating task to train the system (10 ratings)
Assess three lists of recommendations
  Within subjects: low / mid / high diversification
  Between subjects: number of items (5, 10, 15, 20, 25)
After each list we measured:
  Perceived diversity & attractiveness
  Expected trade-off difficulty & choice difficulty

Pre-questionnaire
Strength of preference
  3 items, e.g. “I know what kind of movies I like”
Movie expertise
  4 items, e.g. “Compared to others I watch a lot of movies”
Maximizing tendency
  2 items, e.g. “I try to reach the highest standards in everything I do”

After each list
Perceived recommendation diversity
  5 items, e.g. “The list of movies was varied”
Perceived recommendation attractiveness
  5 items, e.g. “The list of recommendations was attractive”
Choice difficulty (indicator)
  “I would find it difficult to choose a movie from this list”
Tradeoff difficulty (indicator)
  “I had to put a lot of effort into comparing the different aspects of the movies”

Participants and Manipulation checks
97 participants from an online database
  Paid for participation
  Mean age: 29 years, 52 females and 45 males
Low, medium and high diversification differed in the feature score range
  Average predicted ratings of the sets were not different!

Diversity   Average feature score range (AFSR)   Predicted rating   SD predicted rating
            mean (SE)                            mean (SE)          mean (SE)
Low         0.959 (0.015)                        4.486 (0.042)      0.163 (0.010)
Medium      1.273 (0.016)                        4.486 (0.041)      0.184 (0.011)
High        1.744 (0.024)                        4.527 (0.039)      0.206 (0.013)

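The slides do not define AFSR. A plausible reading, used here purely as an assumption, is the range (maximum minus minimum) of each latent feature's scores across the items in a list, averaged over the features. The function name is hypothetical.

```python
import numpy as np


def average_feature_score_range(item_features):
    """Mean over latent features of (max - min) across the items in one list.

    Assumed definition: `item_features` has one row per recommended item
    and one column per latent feature.
    """
    item_features = np.asarray(item_features, dtype=float)
    ranges = item_features.max(axis=0) - item_features.min(axis=0)
    return float(ranges.mean())
```

Under this reading, a list whose items sit close together in the feature space scores low, and a list that spans the space scores high, which matches the direction of the manipulation check.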
Conclusions & Discussion
Diversifying on latent features
  Increases attractiveness/diversity
  Reduces trade-off difficulty (high)
  Reduces choice difficulty (linearly)
[Figure: scale difference in choice difficulty and tradeoff difficulty for low, mid and high diversification.]
No evidence for U-shaped difficulty model
  High diversity does not result in trade-off conflicts (perhaps due to the nature of the domain/MF?)
No effect of number of items
  Small sets benefit as much from diversification
Diversification on MF features seems promising to increase attractiveness!

Future Work
In this study, no actual choice was made
  Explains limited effect of number of items
  We could not measure choice satisfaction or justification-based processes
We will use diversification and list length as two factors in a new choice overload experiment
  Item set size: 5, 10 and 20
  Low and high diversity
We expect choice overload to be more prominent for low diversity sets