Right now, researchers seem to focus on algorithmic performance. They believe that better algorithms lead to a better experience. Is that really true?
It can only be true under two assumptions: 1. users want to receive personalized recommendations, and 2. they will provide enough feedback to make this possible. To answer these questions, we need to evaluate the user experience, not the algorithm!
What existing evidence do we have? Increased recommendation accuracy is noticeable, but doesn't always lead to a better UX
McNee et al.: algorithm with best predictions was rated least helpful
Torres et al.: algorithm with lowest accuracy resulted in highest satisfaction
Ziegler et al.: diversifying the recommendation set resulted in lower accuracy but a more positive evaluation
Let's say we have two systems, one with personalized recommendations and one without:
Perception tests whether we are able to notice the difference
Evaluation tests whether this increases our satisfaction with the system and, ultimately, our choices
These are measured by questionnaires, but we can also look at process data:
Effective systems may show decreased browsing and overall viewing time
In better systems, users will watch more clips from beginning to end
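The process-data metrics above can be sketched as follows. This is a hypothetical illustration, not the study's actual logging code: the log format (watched seconds, clip length) is an assumption.

```python
# Hypothetical process-data sketch: from a list of viewing-log entries,
# compute total viewing time and the share of clips watched to the end.
# The (watched_seconds, clip_length_seconds) tuple format is assumed.

def process_metrics(log):
    """log: list of (watched_seconds, clip_length_seconds) tuples."""
    total_viewing_time = sum(watched for watched, _ in log)
    completed = sum(1 for watched, length in log if watched >= length)
    completion_rate = completed / len(log) if log else 0.0
    return total_viewing_time, completion_rate

# Example: three clips, two of them watched from beginning to end
log = [(120, 120), (45, 120), (90, 90)]
total, rate = process_metrics(log)
```

A better system would then show a higher completion rate, and possibly a lower total viewing time (less aimless browsing).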
The more beneficial it seems to be, the more feedback users will provide (Spiekermann et al.; Brodie, Karat & Karat; Kobsa & Teltzrow)
Minority = between 40 and 50% in an overview of privacy surveys
Privacy concerns reduce users' willingness to disclose personal information (Metzger et al.; Teltzrow & Kobsa)
Most people = 80% of the respondents of a detailed survey
Users' actual feedback behavior may differ from their intentions (Spiekermann et al.)
So now we look at why users provide preference information. We already know choice satisfaction and perceived system effectiveness, and we hypothesize that a better experience increases the intention to provide feedback
However, privacy concerns may reduce feedback intention, and privacy concerns may be higher for those who don't trust technology in general
Process data: due to the intention-behavior gap, actual feedback may be only moderately correlated with feedback intentions
So let's review the hypotheses (laser-point):
Personalized recommendations should have a perceivably higher quality
This should in turn increase the user experience of the system and the outcome (choices)
A better experience in turn increases the intention to provide feedback
However...
Tip: use two conditions to control the causal relations and to single out the effect
Also: log behavioral data and triangulate this with the constructs
Content and system are in German
A pop-up explained the rating feature and its effect on the recommendations. Opening the recommendations before rating any items showed a similar explanation
Participants were allowed to close this pop-up without rating. After rating, participants were taken to the recommendations
(the length of each vector depends on the impact the tags have; similarity is computed in terms of cosine similarity)
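The similarity computation can be illustrated with a small sketch. This is not the system's actual code; the tag weights and clips are made up, but the cosine-similarity formula is standard.

```python
import math

# Illustrative sketch: items are represented as tag-weight vectors
# (each weight reflects the impact of a tag), and similarity between
# two items is the cosine of the angle between their vectors.

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an untagged item is similar to nothing
    return dot / (norm_u * norm_v)

# Two hypothetical clips described by weights over the same three tags
clip_a = [0.9, 0.1, 0.0]
clip_b = [0.8, 0.0, 0.2]
sim = cosine_similarity(clip_a, clip_b)  # close to 1.0: very similar clips
```

Because cosine similarity normalizes by vector length, heavily tagged items don't automatically dominate; only the direction of the tag profile matters.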
This allowed ample opportunity for their feedback behavior to be influenced by their user experience, unless they ignored the rating probe. The median number of ratings per user was 15
Tip for UX researchers: you cannot measure UX concepts with a single question. Measurement is far more robust if you construct a scale based on several questions
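One standard way to check that a multi-question scale holds together is Cronbach's alpha, which estimates the internal consistency of a set of questionnaire items. A minimal pure-Python sketch (the sample responses are made up):

```python
# Cronbach's alpha: internal consistency of a multi-item scale.
# items: one list of responses per questionnaire item, respondents
# in the same order across items. Uses population variance throughout.

def cronbach_alpha(items):
    k = len(items)           # number of items in the scale
    n = len(items[0])        # number of respondents

    def variance(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    sum_item_vars = sum(variance(item) for item in items)
    totals = [sum(item[i] for item in items) for i in range(n)]
    return (k / (k - 1)) * (1 - sum_item_vars / variance(totals))
```

Values near 1 indicate that the items measure the same underlying concept; in practice, alpha above roughly 0.7 is commonly taken as acceptable.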
Exploratory Factor Analysis validates the intended conceptual structure
Finally, test the model with path analysis (mediation on steroids)
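At its core, path analysis boils down to a set of simultaneous regressions. A minimal mediation sketch with NumPy, on simulated data (the variable names mirror the hypothesized chain quality → satisfaction → feedback intention; the coefficients and data are invented for illustration):

```python
import numpy as np

# Simulate the hypothesized causal chain on fake data:
# quality -> satisfaction -> intention (with noise).
rng = np.random.default_rng(0)
n = 500
quality = rng.normal(size=n)
satisfaction = 0.7 * quality + rng.normal(scale=0.5, size=n)
intention = 0.6 * satisfaction + rng.normal(scale=0.5, size=n)

def ols(y, *xs):
    """Ordinary least squares; returns coefficients for xs plus intercept."""
    X = np.column_stack(xs + (np.ones_like(y),))
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs

a = ols(satisfaction, quality)[0]             # path: quality -> satisfaction
b = ols(intention, satisfaction, quality)[0]  # path: satisfaction -> intention,
                                              # controlling for quality
indirect_effect = a * b                       # mediated effect of quality
```

A full path analysis adds significance tests and fit indices (typically via dedicated software), but the estimated paths are exactly these regression coefficients, and the mediated effect is their product.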