The User Experience of Recommender Systems
Introduction
What I want to do today

Introduction
Bart Knijnenburg
- Current
  - PhD candidate, Informatics, UC Irvine
  - Research intern, Samsung R&D
- Past
  - Researcher & teacher, TU Eindhoven
  - MSc Human Technology Interaction, TU Eindhoven
  - Master of Human-Computer Interaction, Carnegie Mellon
                                        INFORMATION AND COMPUTER SCIENCES
Introduction
Bart Knijnenburg
- UMUAI paper
  - “Explaining the User Experience of Recommender Systems”
    (first UX framework in RecSys)
- Founder of UCERSTI
  - workshop on user-centric evaluation of recommender systems and their interfaces
  - this year: tutorial on user experiments
Introduction
My goal:
- Introduce my framework for user-centric evaluation of recommender systems
- Explain how the framework can help in conducting user experiments

My approach:
- Start with “beyond the five stars”
- Explain how to improve upon this
- Introduce a 21st-century approach to user experiments
Introduction

        “What is a user experiment?”



  “A user experiment is a scientific method to
    investigate how and why system aspects
influence the users’ experience and behavior.”


Introduction
What I want to do today

Beyond the 5 stars
The Netflix approach

Towards user experience
How to improve upon the Netflix approach

Evaluation framework
A framework for user-centric evaluation

Measurement
Measuring subjective valuations

Analysis
Statistical evaluation of results

Some examples
Findings based on the framework
Beyond the 5 stars
    The Netflix approach
Beyond the 5 stars

Offline evaluations may not give the same outcome as
online evaluations
   Cosley et al., 2002; McNee et al., 2002

Solution: Test with real users




Beyond the 5 stars

[Diagram: System (algorithm) -> Interaction (rating)]
Beyond the 5 stars

Higher accuracy does not always mean higher satisfaction
   McNee et al., 2006

Solution: Consider other behaviors
   “Beyond the five stars”




Beyond the 5 stars

[Diagram: System (algorithm) -> Interaction (rating, consumption, retention)]
Beyond the 5 stars

The algorithm accounts for only 5% of the relevance of a
recommender system
   Francisco Martin - RecSys 2009 keynote

Solution: test those other aspects
   “Beyond the five stars”
Beyond the 5 stars

[Diagram: System (algorithm, interaction, presentation) -> Interaction (rating, consumption, retention)]
Beyond the 5 stars
Start with a hypothesis
   Algorithm/feature/design X will increase member
   engagement and ultimately member retention

Design a test
   Develop a solution or prototype
   Think about dependent & independent variables, control,
   significance…

Execute the test

Beyond the 5 stars

Let the data speak for itself
   Many different metrics
   We ultimately trust member engagement (e.g. hours of
   play) and retention

Is this good enough?
   I think it is not
Towards user experience
   How to improve upon the Netflix approach
User experience


   “Testing a recommender against a static
videoclip system, the number of clicked clips
     and total viewing time went down!”




User experience

[Path model: personalized recommendations (OSA) decrease total viewing time and number of clips clicked (−), but increase perceived recommendation quality (SSA, +); perceived quality increases perceived system effectiveness and choice satisfaction (EXP, +); perceived effectiveness increases the number of clips watched from beginning to end (+)]

Knijnenburg et al.: “Receiving Recommendations and Providing Feedback”, EC-Web 2010
User experience
Behavior is hard to interpret
   Relationship between behavior and satisfaction is not
   always trivial

User experience is a better predictor of long-term retention
   With behavior only, you will need to run for a long time

Questionnaire data is more robust
   Fewer participants needed


User experience
Measure subjective valuations with questionnaires
   Perception and experience

Triangulate these data with behavior
   Find out how they mediate the effect on behavior

Measure every step in your theory
   Create a chain of mediating variables


User experience
Perception: do users notice the manipulation?
 - Understandability, perceived control
 - Perceived recommendation diversity and quality
Experience: self-relevant evaluations of the system
 - Process-related (e.g. choice difficulty)
 - System-related (e.g. system effectiveness)
 - Outcome-related (e.g. choice satisfaction)
User experience

[Diagram: System (algorithm, interaction, presentation) -> Perception (usability, quality, appeal) -> Experience (system, process, outcome) -> Interaction (rating, consumption, retention)]
User experience

Personal and situational characteristics may have an
important impact
   Adomavicius et al., 2005; Knijnenburg et al., 2012

Solution: measure those as well




User experience
[Diagram: Situational Characteristics (routine, system trust, choice goal) and Personal Characteristics (gender, privacy, expertise) surround the chain: System (algorithm, interaction, presentation) -> Perception (usability, quality, appeal) -> Experience (system, process, outcome) -> Interaction (rating, consumption, retention)]
Evaluation framework
  A framework for user-centric evaluation
Framework
Framework for user-centric evaluation of recommenders
[Framework diagram: Situational Characteristics (routine, system trust, choice goal); System (algorithm, interaction, presentation) -> Perception (usability, quality, appeal) -> Experience (system, process, outcome) -> Interaction (rating, consumption, retention); Personal Characteristics (gender, privacy, expertise)]

Knijnenburg et al.: “Explaining the user experience of recommender systems”, UMUAI 2012
Framework
Objective System Aspects (OSA)
- visual / interaction design
- recommender algorithm
- presentation of recommendations
- additional features
Framework
User Experience (EXP)
Different aspects may influence different things:
- interface -> system evaluations
- preference elicitation method -> choice process
- algorithm -> quality of the final choice
Framework
Perception, or Subjective System Aspects (SSA)
- Link OSA to EXP (mediation)
- Increase the robustness of the effects of OSAs on EXP
- Explain how and why OSAs affect EXP
Framework
Personal Characteristics (PC)
- Users’ general disposition
- Beyond the influence of the system
- Typically stable across tasks
Framework
Situational Characteristics (SC)
- How the task influences the experience
- Also beyond the influence of the system
- Here used for evaluation, not for augmenting algorithms!
Framework
Interaction (INT)
- Observable behavior (browsing, viewing, log-ins)
- Final step of the evaluation, but not its goal
- Triangulate with EXP
Framework
[Framework diagram shown again in full]
Framework
Has suggestions for measurement scales
   e.g. How can we measure something like “satisfaction”?

Provides a good starting point for causal relations
   e.g. How and why do certain system aspects influence the
   user experience?

Useful for integrating existing work
   e.g. How do recommendation list length, diversity and
   presentation each influence the user experience?
Framework


“Statisticians, like artists, have the bad habit
    of falling in love with their models.”

                 George Box




Measurement
Measuring subjective valuations
Measurement


“To measure satisfaction, we asked users
     whether they liked the system
      (on a 5-point rating scale).”




Measurement

Does the question mean the same to everyone?
 - John likes the system because it is convenient
 - Mary likes the system because it is easy to use
 - Dave likes it because the recommendations are good
A single question is not enough to establish content validity
   We need a multi-item measurement scale



Measurement
Measure at least 5 items per concept
   At least 2-3 should remain after removing bad items

It is most convenient to use statements to which the participant
can agree/disagree on a 5- or 7-point scale:

            “I am satisfied with this system.”
 Completely disagree, somewhat disagree, neutral, somewhat agree, completely agree
Measurement
Use both positively and negatively phrased items
 - They make the questionnaire less “leading”
 - They help filtering out bad participants
 - They explore the “flip-side” of the scale
The word “not” is easily overlooked
   Bad: “The recommendations were not very novel”
   Best: “The recommendations felt outdated”
Measurement

Choose simple over specialized words
   Participants may have no idea they are using a
   “recommender system”

Use as few words as possible
   But always use complete sentences



Measurement
Use design techniques that help to improve recall and
provide appropriate time references
   Bad: “I liked the signup process” (one week ago)
   Good: Provide a screenshot
   Best: Ask the question immediately afterwards

Avoid double-barreled items
   Bad: “The recommendations were relevant and fun”


Measurement


“We asked users ten 5-point scale questions
       and summed the answers.”




Measurement
Is the scale really measuring a single thing?
 - 5 items measure satisfaction, the other 5 convenience
 - The items are not related enough to make a reliable scale
Are two scales really measuring different things?
 - They are so closely related that they actually measure the
   same thing

We need to establish convergent and discriminant validity
   This makes sure the scales are unidimensional

Measurement
Solution: factor analysis
 - Define latent factors, specify how items “load” on them
 - Factor analysis will determine how well the items “fit”
 - It will give you suggestions for improvement
Benefits of factor analysis:
 - Establishes convergent and discriminant validity
 - Outcome is a normally distributed measurement scale
 - The scale captures the “shared essence” of the items
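Alongside a full factor analysis, a quick reliability check helps spot items that do not hang together. A minimal Python sketch, on simulated responses (the 5-item scale and all numbers are made up for illustration), computing Cronbach's alpha:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a (participants x items) response matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)        # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)    # variance of the sum score
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Simulated 5-item satisfaction scale: each item = latent score + noise
rng = np.random.default_rng(42)
latent = rng.normal(size=(200, 1))
responses = latent + 0.5 * rng.normal(size=(200, 5))

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha: {alpha:.2f}")
```

Values above roughly 0.7 are conventionally taken as acceptable reliability; a factor analysis then checks convergent and discriminant validity properly.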
Measurement

[Factor model: items collct22–27 load on Collection concerns; gipc16–19 and gipc21 on General privacy concerns; ctrl28–32 on Control concerns]
Measurement


[Trimmed factor model after removing poorly fitting items: collct22 and collct24–27 on Collection concerns; gipc17, gipc18 and gipc21 on General privacy concerns; ctrl28–29 on Control concerns]
Analysis
Statistical evaluation of results
Analysis

Manipulation -> perception:
   Do these two algorithms lead to a different level of perceived quality?

T-test

[Bar chart: perceived quality (−1 to 1) for conditions A and B]
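With two conditions, this comparison can be sketched as an independent-samples t-test on the perceived-quality scores. A scipy sketch on simulated data (the group means are illustrative, not results from any study):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated perceived-quality factor scores for two algorithm conditions
quality_a = rng.normal(loc=0.4, scale=1.0, size=60)   # algorithm A
quality_b = rng.normal(loc=-0.4, scale=1.0, size=60)  # algorithm B

t_stat, p_value = stats.ttest_ind(quality_a, quality_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```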
Analysis
Perception -> experience:
   Does perceived quality influence system effectiveness?

Linear regression

[Scatter plot: system effectiveness (−2 to 2) against recommendation quality (−3 to 3)]
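This step can be sketched as a simple linear regression of the experience measure on the perception measure (again on simulated factor scores; the slope of 0.6 is made up):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Simulated factor scores: effectiveness depends linearly on perceived quality
quality = rng.normal(size=100)
effectiveness = 0.6 * quality + rng.normal(scale=0.5, size=100)

fit = stats.linregress(quality, effectiveness)
print(f"slope = {fit.slope:.2f}, p = {fit.pvalue:.4g}")
```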
Analysis

Manipulation -> perception -> experience:
   Does the algorithm influence effectiveness via perceived quality?

Path model

[Diagram: personalized recommendations (OSA) -> perceived recommendation quality (SSA, +) -> perceived system effectiveness (EXP, +)]
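The mediation logic behind the path model can be sketched with two regressions: manipulation -> mediator (path a), and mediator -> outcome controlling for the manipulation (path b); the indirect effect is a·b. A numpy sketch on simulated data (a real analysis would use a path model or SEM, with proper standard errors for the indirect effect):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
personalized = rng.integers(0, 2, size=n).astype(float)  # 0 = control, 1 = personalized
quality = 0.8 * personalized + rng.normal(size=n)        # mediator (SSA)
effectiveness = 0.7 * quality + rng.normal(size=n)       # outcome (EXP)

def ols(y, *predictors):
    """Least-squares coefficients (intercept first)."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    return np.linalg.lstsq(X, y, rcond=None)[0]

a = ols(quality, personalized)[1]                 # manipulation -> mediator
b = ols(effectiveness, quality, personalized)[1]  # mediator -> outcome, controlling for manipulation
print(f"a = {a:.2f}, b = {b:.2f}, indirect effect = {a * b:.2f}")
```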
Analysis
Two manipulations -> perception:
   What is the combined effect of list diversity and list length on perceived recommendation quality?

Factorial ANOVA

[Bar chart: perceived quality for 5, 10 and 20 items, at low and high diversification]
Willemsen et al.: “Not just more of the same”, submitted to TiiS
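A balanced two-way (factorial) ANOVA can be computed by hand from cell means. The numpy sketch below uses simulated quality scores with a diversification effect and no length effect (all numbers are made up; in practice one would use a statistics package):

```python
import numpy as np

rng = np.random.default_rng(3)
a_levels, b_levels, n = 2, 3, 20      # diversification x list length, 20 per cell
div_effect = np.array([-0.3, 0.3])    # low vs high diversification shifts quality
# y[i, j, k]: perceived quality for diversification i, length j, participant k
y = div_effect[:, None, None] + rng.normal(scale=0.5, size=(a_levels, b_levels, n))

grand = y.mean()
mean_a = y.mean(axis=(1, 2))          # diversification means
mean_b = y.mean(axis=(0, 2))          # list-length means
cell = y.mean(axis=2)                 # cell means

ss_a = n * b_levels * ((mean_a - grand) ** 2).sum()
ss_b = n * a_levels * ((mean_b - grand) ** 2).sum()
ss_ab = n * ((cell - mean_a[:, None] - mean_b[None, :] + grand) ** 2).sum()
ss_err = ((y - cell[:, :, None]) ** 2).sum()

df_a, df_b = a_levels - 1, b_levels - 1
df_ab, df_err = df_a * df_b, a_levels * b_levels * (n - 1)
ms_err = ss_err / df_err
f_a = (ss_a / df_a) / ms_err
f_b = (ss_b / df_b) / ms_err
f_ab = (ss_ab / df_ab) / ms_err
print(f"F(diversification) = {f_a:.1f}, F(length) = {f_b:.1f}, F(interaction) = {f_ab:.1f}")
```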
Analysis
Manipulation x personal characteristic -> outcome:
   Do experts and novices rate these two interaction methods differently in terms of usefulness?

ANCOVA

[Interaction plot: system usefulness (−2 to 2) against domain knowledge (−3 to 3) for case-based and attribute-based PE]
Knijnenburg & Willemsen: “Understanding the Effect of Adaptive Preference Elicitation Methods”, RecSys 2009
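The crossover in such a plot is an interaction between the manipulation and the covariate: the slope of domain knowledge differs per preference-elicitation method. A least-squares sketch fitting usefulness ~ method + knowledge + method x knowledge on simulated data (illustrative coefficients, not the study's):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 300
method = rng.integers(0, 2, size=n).astype(float)  # 0 = case-based, 1 = attribute-based PE
knowledge = rng.normal(size=n)                     # domain knowledge (covariate)
# Simulated crossover: experts prefer attribute-based PE, novices case-based
usefulness = 0.5 * method * knowledge - 0.25 * knowledge + rng.normal(scale=0.5, size=n)

X = np.column_stack([np.ones(n), method, knowledge, method * knowledge])
coef = np.linalg.lstsq(X, usefulness, rcond=None)[0]
print(f"interaction coefficient = {coef[3]:.2f}")  # slope difference between methods
```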
Analysis

Use only one method: structural equation models
 - A combination of Factor Analysis and Path Models
 - The statistical method of the 21st century
 - Statistical evaluation of causal effects
 - Report as causal paths

Analysis

We compared three
recommender systems
  Three different algorithms

MF-I and MF-E algorithms
are more effective
     Why?



Analysis



Knijnenburg et al.: “Explaining the user experience of recommender systems”, UMUAI 2012



The mediating variables show the entire story


Analysis
[Path model: MF-E (vs GMP) and MF-I (vs GMP) recommenders (OSA) increase perceived recommendation variety (SSA, +); variety increases perceived recommendation quality (SSA, +); quality increases perceived system effectiveness (EXP, +); variety increases the number of items clicked in the player (+); quality increases the rating of clicked items and the number of items rated higher than the predicted rating (+); perceived effectiveness and participant’s age (PC) increase the number of items rated (+). Coefficients and standard errors in the original figure]

Knijnenburg et al.: “Explaining the user experience of recommender systems”, UMUAI 2012
Analysis


“All models are wrong, but some are useful.”

                George Box




Some examples
Findings based on the framework
Some examples
Giving users many options:
 - increases the chance they find something nice
 - also increases the difficulty of making a decision
Result: choice overload

Solution:
 - Provide short, diversified lists of recommendations

Some examples
[Path model: Top-20 (vs Top-5) and Lin-20 (vs Top-5) recommendation lists (OSA) increase perceived recommendation variety (SSA, +); variety increases perceived recommendation quality (SSA, +); quality increases choice satisfaction (EXP, +) and reduces choice difficulty (−); choice difficulty reduces choice satisfaction (−); movie expertise (PC) increases variety and quality (+). Coefficients and standard errors in the original figure]
Some examples
[Bar charts: perceived quality, choice difficulty and choice satisfaction for 5, 10 and 20 items, at low and high diversification]
Some examples
Experts and novices differ in their decision-making process
 - Experts want to leverage their domain knowledge
 - Novices make decisions by choosing from examples
Result: they want different preference elicitation methods

Solution:
   Tailor the preference elicitation method to the user
   A needs-based approach seems to work well for everyone


Some examples
[Interaction plot: system usefulness (−2 to 2) against domain knowledge (−3 to 3) for case-based and attribute-based PE]
Some examples
[Figure: overview of five interaction plots — control, understandability, user interface satisfaction, perceived system effectiveness, and choice satisfaction — each plotted against domain knowledge for the TopN, Sort, Explicit, Implicit, and Hybrid interfaces.]
Some examples
[Figure: control (−2 to 2) plotted against domain knowledge (−2 to 2) for the TopN, Sort, Explicit, Implicit, and Hybrid interfaces; significance markers as in the original figure.]
Some examples
[Figure: understandability (−2 to 2) plotted against domain knowledge (−2 to 2) for the TopN, Sort, Explicit, Implicit, and Hybrid interfaces; significance markers as in the original figure.]
Some examples
[Figure: user interface satisfaction (5 to 45) plotted against domain knowledge (−2 to 2) for the TopN, Sort, Explicit, Implicit, and Hybrid interfaces; significance markers as in the original figure.]
Some examples
[Figure: perceived system effectiveness (−2 to 2) plotted against domain knowledge (−2 to 2) for the TopN, Sort, Explicit, Implicit, and Hybrid interfaces; significance markers as in the original figure.]
Some examples
[Figure: choice satisfaction (−2 to 2) plotted against domain knowledge (−2 to 2) for the TopN, Sort, Explicit, Implicit, and Hybrid interfaces; significance markers as in the original figure.]
Some examples
Social recommenders limit the neighbors to the user’s friends

Result:
 - More control: friends can be weighed as well as items
 - Better inspectability: recommendations can be traced back to friends
We show that each of these effects exists
   Both increase overall satisfaction
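A minimal sketch of the friends-as-neighbors idea (illustrative only, not the studied system’s implementation): score candidate items by friends’ ratings, weighted by user-assigned friend weights, and keep the provenance that makes each recommendation inspectable.

```python
# Hypothetical friend-based social recommender sketch. Restricting
# neighbors to friends makes every recommendation traceable
# (inspectability), and the user-set friend weights give control.

def recommend(friend_weights, friend_ratings, already_liked, top_n=3):
    """friend_weights: {friend: weight}; friend_ratings:
    {friend: {item: rating}}; already_liked: set of item ids."""
    scores = {}
    for friend, weight in friend_weights.items():
        for item, rating in friend_ratings.get(friend, {}).items():
            if item in already_liked:
                continue
            scores[item] = scores.get(item, 0.0) + weight * rating
    ranked = sorted(scores.items(), key=lambda kv: -kv[1])
    # Provenance: which friends contributed to each recommendation.
    provenance = {item: [f for f in friend_weights
                         if item in friend_ratings.get(f, {})]
                  for item, _ in ranked[:top_n]}
    return [item for item, _ in ranked[:top_n]], provenance
```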


Some examples
[Screenshots: the “Rate likes” and “Weigh friends” interfaces.]
Some examples
[Figure: path model from the inspectability-and-control study. Objective system aspects (item/friend control; full-graph vs. list-only inspectability) affect subjective system aspects (understandability, perceived control, perceived recommendation quality), which in turn drive the user experience (satisfaction with the system, R² = .70). Interaction measures (inspection time, number of known recommendations, average rating) and personal characteristics (familiarity with recommenders, music expertise, trusting propensity) enter as additional paths; path coefficients, standard errors, and significance levels as in the original figure.]
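The path model above chains manipulations to perceptions to experience. Its core logic — a mediation test using standardized regression coefficients — can be sketched in plain Python. The real analyses used full structural equation models; the data below are made up for illustration.

```python
# Minimal path-analysis sketch (illustrative; the studies fitted full
# structural equation models): does a subjective system aspect (SSA)
# mediate the effect of a manipulated objective system aspect (OSA)
# on the experience (EXP)?

from statistics import mean, pstdev

def standardize(xs):
    m, s = mean(xs), pstdev(xs)
    return [(x - m) / s for x in xs]

def beta(x, y):
    """Standardized simple-regression slope (= Pearson correlation)."""
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(zx)

def beta2(y, x1, x2):
    """Standardized slopes of y on x1 and x2 (two-predictor OLS)."""
    r12, r1y, r2y = beta(x1, x2), beta(x1, y), beta(x2, y)
    d = 1 - r12 ** 2
    return (r1y - r12 * r2y) / d, (r2y - r12 * r1y) / d

# Fabricated example data: osa = manipulation (0/1), ssa =
# questionnaire factor score, exp_ = experience outcome.
osa  = [0, 0, 0, 0, 1, 1, 1, 1]
ssa  = [1.0, 1.2, 0.8, 1.1, 2.0, 2.2, 1.9, 2.1]
exp_ = [2.0, 2.3, 1.9, 2.2, 3.1, 3.3, 3.0, 3.2]

a = beta(osa, ssa)                        # path OSA -> SSA
b_direct, b_med = beta2(exp_, osa, ssa)   # EXP on OSA (direct) and SSA
```

In this fabricated data the direct OSA → EXP path shrinks toward zero once SSA is controlled for, i.e. the manipulation’s effect on experience flows through the perception — exactly the kind of mediation the framework is built to test.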
Introduction
I discussed how to do user experiments with my framework

Beyond the 5 stars
The Netflix approach is a step in the right direction

Towards user experience
We need to measure subjective valuations as well

Evaluation framework
With the framework you can put this all together

Measurement
Measure subjective valuations with questionnaires and factor analysis

Analysis
Use structural equation models to test comprehensive causal models

Some examples
I demonstrated the findings of some studies that used this approach
See you in Dublin!

bart.k@uci.edu
www.usabart.nl
@usabart
Resources
User-centric evaluation
   Knijnenburg, B.P., Willemsen, M.C., Gantner, Z., Soncu, H., Newell, C.:
   Explaining the User Experience of Recommender Systems. UMUAI 2012.
   Knijnenburg, B.P., Willemsen, M.C., Kobsa, A.: A Pragmatic Procedure to
   Support the User-Centric Evaluation of Recommender Systems. RecSys 2011.

Choice overload
   Willemsen, M.C., Graus, M.P., Knijnenburg, B.P., Bollen, D.: Not just more of the
   same: Preventing Choice Overload in Recommender Systems by Offering
   Small Diversified Sets. Submitted to TiiS.
   Bollen, D.G.F.M., Knijnenburg, B.P., Willemsen, M.C., Graus, M.P.:
   Understanding Choice Overload in Recommender Systems. RecSys 2010.



Resources
Preference elicitation methods
   Knijnenburg, B.P., Reijmer, N.J.M., Willemsen, M.C.: Each to His Own: How
   Different Users Call for Different Interaction Methods in Recommender
   Systems. RecSys 2011.
   Knijnenburg, B.P., Willemsen, M.C.: The Effect of Preference Elicitation
   Methods on the User Experience of a Recommender System. CHI 2010.
   Knijnenburg, B.P., Willemsen, M.C.: Understanding the effect of adaptive
   preference elicitation methods on user satisfaction of a recommender system.
   RecSys 2009.

Social recommenders
   Knijnenburg, B.P., Bostandjiev, S., O'Donovan, J., Kobsa, A.: Inspectability and
   Control in Social Recommender Systems. RecSys 2012.

Resources
User feedback and privacy
   Knijnenburg, B.P., Kobsa, A: Making Decisions about Privacy: Information
   Disclosure in Context-Aware Recommender Systems. Submitted to TiiS.
   Knijnenburg, B.P., Willemsen, M.C., Hirtbach, S.: Getting Recommendations and
   Providing Feedback: The User-Experience of a Recommender System.
   EC-Web 2010.





Recommendations and Feedback - The user-experience of a recommender systemRecommendations and Feedback - The user-experience of a recommender system
Recommendations and Feedback - The user-experience of a recommender system
 

Recently uploaded

"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 

Recently uploaded (20)

"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 

Explaining the User Experience of Recommender Systems with User Experiments

  • 1. The User Experience of Recommender Systems
  • 3. Introduction: Bart Knijnenburg - Current: PhD candidate, Informatics, UC Irvine; Research intern, Samsung R&D - Past: Researcher & teacher, TU Eindhoven; MSc Human Technology Interaction, TU Eindhoven; M Human-Computer Interaction, Carnegie Mellon INFORMATION AND COMPUTER SCIENCES
  • 4. Introduction: Bart Knijnenburg - UMUAI paper “Explaining the User Experience of Recommender Systems” (first UX framework in RecSys) - Founder of UCERSTI: workshop on user-centric evaluation of recommender systems and their interfaces - this year: tutorial on user experiments
  • 5. Introduction My goal: Introduce my framework for user-centric evaluation of recommender systems, and explain how the framework can help in conducting user experiments. My approach: - Start with “beyond the five stars” - Explain how to improve upon this - Introduce a 21st-century approach to user experiments
  • 6. Introduction “What is a user experiment?” “A user experiment is a scientific method to investigate how and why system aspects influence the users’ experience and behavior.”
  • 7. Introduction What I want to do today Beyond the 5 stars The Netflix approach Towards user experience How to improve upon the Netflix approach Evaluation framework A framework for user-centric evaluation Measurement Measuring subjective valuations Analysis Statistical evaluation of results Some examples Findings based on the framework
  • 8. Beyond the 5 stars The Netflix approach
  • 9. Beyond the 5 stars Offline evaluations may not give the same outcome as online evaluations (Cosley et al., 2002; McNee et al., 2002). Solution: Test with real users
  • 10. Beyond the 5 stars [Diagram: System (algorithm) → Interaction (rating)]
  • 11. Beyond the 5 stars Higher accuracy does not always mean higher satisfaction (McNee et al., 2006). Solution: Consider other behaviors: “Beyond the five stars”
  • 12. Beyond the 5 stars [Diagram: System (algorithm) → Interaction (rating, consumption, retention)]
  • 13. Beyond the 5 stars The algorithm accounts for only 5% of the relevance of a recommender system (Francisco Martin, RecSys 2009 keynote). Solution: Test those other aspects: “Beyond the five stars”
  • 14. Beyond the 5 stars [Diagram: System (algorithm, interaction, presentation) → Interaction (rating, consumption, retention)]
  • 15. Beyond the 5 stars Start with a hypothesis: Algorithm/feature/design X will increase member engagement and ultimately member retention. Design a test: Develop a solution or prototype; think about dependent & independent variables, control, significance… Execute the test
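The significance check in such an A/B test can be sketched for a binary retention metric. A minimal sketch in Python, assuming a simple two-condition comparison with hypothetical retention counts (the function name and numbers are illustrative, not an actual industrial procedure):

```python
import math

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test, e.g. retained members in conditions A and B."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled retention rate
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# hypothetical counts: control (A) vs. new feature (B)
z, p = two_proportion_ztest(conv_a=300, n_a=1000, conv_b=345, n_b=1000)
print(round(z, 2), p < 0.05)  # 2.15 True
```

With 1000 members per cell, a 30% vs. 34.5% retention difference already reaches significance at the 5% level; smaller effects need far larger samples, which is one reason behavior-only evaluation takes so long.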
  • 16. Beyond the 5 stars Let the data speak for itself: Many different metrics; we ultimately trust member engagement (e.g. hours of play) and retention. Is this good enough? I think it is not
  • 17. Towards user experience How to improve upon the Netflix approach
  • 18. User experience “Testing a recommender against a static videoclip system, the number of clicked clips and total viewing time went down!”
  • 19. User experience [Path model relating personalized recommendations (OSA), perceived recommendation quality (SSA), perceived system effectiveness and choice satisfaction (EXP), and behaviors: total number of clips clicked, number of clips watched from beginning to end, total viewing time] Knijnenburg et al.: “Receiving Recommendations and Providing Feedback”, EC-Web 2010
  • 20. User experience Behavior is hard to interpret: the relationship between behavior and satisfaction is not always trivial. User experience is a better predictor of long-term retention: with behavior only, you will need to run for a long time. Questionnaire data is more robust: fewer participants needed
  • 21. User experience Measure subjective valuations with questionnaires (perception and experience). Triangulate these data with behavior: find out how they mediate the effect on behavior. Measure every step in your theory: create a chain of mediating variables
  • 22. User experience Perception: do users notice the manipulation? - Understandability, perceived control - Perceived recommendation diversity and quality Experience: self-relevant evaluations of the system - Process-related (e.g. choice difficulty) - System-related (e.g. system effectiveness) - Outcome-related (e.g. choice satisfaction)
  • 23. User experience [Diagram: System (algorithm, interaction, presentation) → Perception (usability, quality, appeal) → Experience (system, process, outcome) → Interaction (rating, consumption, retention)]
  • 24. User experience Personal and situational characteristics may have an important impact (Adomavicius et al., 2005; Knijnenburg et al., 2012). Solution: measure those as well
  • 25. User experience [Diagram: the framework extended with Situational Characteristics (routine, system trust, choice goal) and Personal Characteristics (gender, privacy, expertise)]
  • 26. Evaluation framework A framework for user-centric evaluation
  • 27. Framework A framework for user-centric evaluation of recommenders [Diagram: System → Perception → Experience → Interaction, with Situational Characteristics (routine, system trust, choice goal) and Personal Characteristics (gender, privacy, expertise)] Knijnenburg et al.: “Explaining the user experience of recommender systems”, UMUAI 2012
  • 28. Framework Objective System Aspects (OSA): - visual / interaction design - recommender algorithm - presentation of recommendations - additional features
  • 29. Framework User Experience (EXP): Different aspects may influence different things - interface -> system evaluations - preference elicitation method -> choice process - algorithm -> quality of the final choice
  • 30. Framework Subjective System Aspects (SSA): Perception. They link OSA to EXP (mediation), increase the robustness of the effects of OSAs on EXP, and explain how and why OSAs affect EXP
  • 31. Framework Personal Characteristics (PC): Users’ general disposition; beyond the influence of the system; typically stable across tasks
  • 32. Framework Situational Characteristics (SC): How the task influences the experience; also beyond the influence of the system. Here used for evaluation, not for augmenting algorithms!
  • 33. Framework Interaction (INT): Observable behavior - browsing, viewing, log-ins. The final step of the evaluation, but not its goal: triangulate with EXP
  • 34. Framework [Diagram: the full evaluation framework]
  • 35. Framework Has suggestions for measurement scales (e.g. how can we measure something like “satisfaction”?). Provides a good starting point for causal relations (e.g. how and why do certain system aspects influence the user experience?). Useful for integrating existing work (e.g. how do recommendation list length, diversity and presentation each influence the user experience?)
  • 36. Framework “Statisticians, like artists, have the bad habit of falling in love with their models.” George Box
  • 38. Measurement “To measure satisfaction, we asked users whether they liked the system (on a 5-point rating scale).”
  • 39. Measurement Does the question mean the same to everyone? - John likes the system because it is convenient - Mary likes the system because it is easy to use - Dave likes it because the recommendations are good A single question is not enough to establish content validity; we need a multi-item measurement scale
  • 40. Measurement Measure at least 5 items per concept (at least 2-3 should remain after removing bad items). It is most convenient to use statements to which the participant can agree/disagree on a 5- or 7-point scale: “I am satisfied with this system.” Completely disagree, somewhat disagree, neutral, somewhat agree, completely agree
  • 41. Measurement Use both positively and negatively phrased items - They make the questionnaire less “leading” - They help filtering out bad participants - They explore the “flip-side” of the scale But the word “not” is easily overlooked: Bad: “The recommendations were not very novel” Best: “The recommendations felt outdated”
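Negatively phrased items such as “The recommendations felt outdated” must be reverse-coded before the scale is scored, so that agreement counts against the construct rather than for it. A minimal sketch, assuming 1-5 Likert coding (the helper name and data are illustrative):

```python
def reverse_code(response, scale_max=5):
    """Reverse-code a negatively phrased item on a 1..scale_max scale."""
    return scale_max + 1 - response

# "outdated" is a negatively phrased novelty item: strong agreement (5)
# should count as LOW perceived novelty, i.e. a 1 after reverse-coding
responses = {"novel": 4, "outdated": 5}
scored = {"novel": responses["novel"],
          "outdated": reverse_code(responses["outdated"])}
print(scored)  # {'novel': 4, 'outdated': 1}
```

For a 7-point scale, pass `scale_max=7`; the midpoint maps onto itself either way.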
  • 42. Measurement Choose simple over specialized words: participants may have no idea they are using a “recommender system”. Use as few words as possible, but always use complete sentences
  • 43. Measurement Use design techniques that help to improve recall and provide appropriate time references. Bad: “I liked the signup process” (one week ago). Good: Provide a screenshot. Best: Ask the question immediately afterwards. Avoid double-barreled items. Bad: “The recommendations were relevant and fun”
  • 44. Measurement “We asked users ten 5-point scale questions and summed the answers.”
  • 45. Measurement Is the scale really measuring a single thing? - 5 items measure satisfaction, the other 5 convenience - The items are not related enough to make a reliable scale Are two scales really measuring different things? - They may be so closely related that they actually measure the same thing We need to establish convergent and discriminant validity; this makes sure the scales are unidimensional
  • 46. Measurement Solution: factor analysis - Define latent factors, specify how items “load” on them - Factor analysis will determine how well the items “fit” - It will give you suggestions for improvement Benefits of factor analysis: - Establishes convergent and discriminant validity - Outcome is a normally distributed measurement scale - The scale captures the “shared essence” of the items
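In practice one would run a confirmatory factor analysis in dedicated SEM software. As a rough stand-in, the sketch below checks scale reliability with Cronbach's alpha and approximates one-factor loadings by the first principal component of the item correlation matrix; the 5-item "satisfaction" data are hypothetical:

```python
import numpy as np

def cronbach_alpha(items):
    """Internal-consistency reliability of a multi-item scale.
    items: (n_respondents, n_items) array of Likert responses."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

def one_factor_loadings(items):
    """Loadings of each item on a single factor, approximated by the
    first principal component of the item correlation matrix."""
    corr = np.corrcoef(np.asarray(items, dtype=float), rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)    # ascending order
    v = eigvecs[:, -1] * np.sqrt(eigvals[-1])  # scale by sqrt(eigenvalue)
    return v if v.sum() >= 0 else -v           # fix sign for readability

# hypothetical 5-item satisfaction scale, 6 respondents
X = [[5, 4, 5, 4, 5],
     [4, 4, 4, 3, 4],
     [2, 1, 2, 2, 1],
     [5, 5, 4, 5, 5],
     [3, 3, 3, 2, 3],
     [1, 2, 1, 1, 2]]
print(round(cronbach_alpha(X), 2), one_factor_loadings(X).round(2))
```

High alpha and uniformly high loadings suggest a unidimensional scale; an item with a low loading is a candidate for removal, which is exactly the kind of suggestion factor analysis provides.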
  • 47. Measurement [Factor model: items gipc16-gipc21 load on General privacy concerns; ctrl28-ctrl32 on Control concerns; collct22-collct27 on Collection concerns]
  • 48. Measurement [Trimmed factor model after removing poorly fitting items: gipc17, gipc18, gipc21 on General privacy concerns; ctrl28, ctrl29 on Control concerns; collct22, collct24-collct27 on Collection concerns]
  • 50. Analysis Manipulation -> perception: Do these two algorithms lead to a different level of perceived quality? Method: t-test [Bar chart: perceived quality for conditions A and B]
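This t-test step can be sketched directly; below is a minimal Welch's t statistic on hypothetical perceived-quality scores per condition (data and the rule-of-thumb threshold are illustrative):

```python
import math

def welch_t(a, b):
    """Welch's two-sample t statistic, e.g. for comparing perceived
    quality under algorithm A vs. algorithm B (unequal variances OK)."""
    n_a, n_b = len(a), len(b)
    m_a, m_b = sum(a) / n_a, sum(b) / n_b
    v_a = sum((x - m_a) ** 2 for x in a) / (n_a - 1)  # sample variances
    v_b = sum((x - m_b) ** 2 for x in b) / (n_b - 1)
    return (m_b - m_a) / math.sqrt(v_a / n_a + v_b / n_b)

# hypothetical perceived-quality factor scores per condition
quality_A = [-0.6, -0.2, -0.4, 0.1, -0.5, -0.3]
quality_B = [0.4, 0.7, 0.2, 0.6, 0.3, 0.5]
t = welch_t(quality_A, quality_B)
print(round(t, 2))  # |t| well above ~2 suggests a reliable difference
```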
  • 51. Analysis Perception -> experience: Does perceived quality influence system effectiveness? Method: linear regression [Scatter plot: system effectiveness vs. recommendation quality]
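The regression step amounts to fitting a slope and intercept; a minimal ordinary-least-squares sketch on hypothetical factor scores (all names and numbers are illustrative):

```python
def simple_ols(x, y):
    """Intercept and slope of y = b0 + b1*x by ordinary least squares,
    e.g. system effectiveness regressed on perceived quality."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
          / sum((xi - mx) ** 2 for xi in x))
    return my - b1 * mx, b1

# hypothetical factor scores: perceived quality -> system effectiveness
quality = [-2.0, -1.0, 0.0, 1.0, 2.0]
effectiveness = [-1.6, -0.9, 0.1, 0.8, 1.7]
b0, b1 = simple_ols(quality, effectiveness)
print(round(b0, 2), round(b1, 2))  # 0.02 0.83
```

A clearly positive slope is what the slide's plot depicts: users who perceive higher recommendation quality also rate the system as more effective.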
  • 52. Analysis Manipulation -> perception -> experience: Does the algorithm influence effectiveness via perceived quality? Method: path model [Diagram: personalized recommendations (OSA) → perceived recommendation quality (SSA) → perceived system effectiveness (EXP)]
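The mediation logic of such a path model can be sketched with two regressions: the a-path (manipulation to mediator) and the b-path (mediator to outcome, controlling for the manipulation); the indirect effect is their product. A minimal sketch with hypothetical data (a full analysis would also bootstrap a confidence interval for a*b):

```python
import numpy as np

def mediation(x, m, y):
    """Simple mediation sketch: returns (indirect effect a*b, direct effect),
    where a: condition -> mediator, b: mediator -> outcome given condition."""
    x, m, y = map(np.asarray, (x, m, y))
    ones = np.ones_like(x, dtype=float)
    # a-path: m = i1 + a*x
    a = np.linalg.lstsq(np.column_stack([ones, x]), m, rcond=None)[0][1]
    # b-path and direct effect: y = i2 + c'*x + b*m
    coefs = np.linalg.lstsq(np.column_stack([ones, x, m]), y, rcond=None)[0]
    direct, b = coefs[1], coefs[2]
    return a * b, direct

# hypothetical data: x = personalized recs (0/1), m = perceived quality,
# y = perceived system effectiveness
x = [0, 0, 0, 0, 1, 1, 1, 1]
m = [-0.5, -0.3, -0.6, -0.2, 0.4, 0.6, 0.3, 0.5]
y = [-0.4, -0.2, -0.5, -0.1, 0.3, 0.5, 0.2, 0.4]
indirect, direct = mediation(x, m, y)
print(round(indirect, 2), round(direct, 2))
```

Here the manipulation's effect on effectiveness runs almost entirely through perceived quality: a sizable indirect effect with a near-zero direct effect, which is exactly the "how and why" story the framework is after.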
  • 53. Analysis Two manipulations -> perception: What is the combined effect of list diversity and list length on perceived recommendation quality? Method: factorial ANOVA [Chart: perceived quality for 5/10/20 items at low and high diversification] Willemsen et al.: “Not just more of the same”, submitted to TiiS
  • 54. Analysis Manipulation x personal characteristic -> outcome: Do experts and novices rate these two interaction methods differently in terms of usefulness? Method: ANCOVA [Chart: system usefulness vs. domain knowledge for case-based and attribute-based PE] Knijnenburg & Willemsen: “Understanding the Effect of Adaptive Preference Elicitation Methods”, RecSys 2009
  • 55. Analysis Use only one method: structural equation models - A combination of factor analysis and path models - The statistical method of the 21st century - Statistical evaluation of causal effects - Report as causal paths
  • 56. Analysis We compared three recommender systems with three different algorithms. The MF-I and MF-E algorithms are more effective. Why?
  • 57. Analysis The mediating variables show the entire story. Knijnenburg et al.: “Explaining the user experience of recommender systems”, UMUAI 2012
  • 58. Analysis [Path model with coefficients: Matrix Factorization recommenders with explicit (MF-E) and implicit (MF-I) feedback (OSA, vs. generally most popular, GMP) → + perceived recommendation variety and + perceived recommendation quality (SSA) → + perceived system effectiveness (EXP) → + number of items rated, + number of items clicked in the player, + rating of clicked items; participant’s age (PC) → + number of items rated] Knijnenburg et al.: “Explaining the user experience of recommender systems”, UMUAI 2012
  • 59. Analysis “All models are wrong, but some are useful.” George Box
  • 60. Some examples Findings based on the framework
  • 61. Some examples Giving users many options: - increases the chance they find something nice - also increases the difficulty of making a decision Result: choice overload Solution: Provide short, diversified lists of recommendations
  • 62. Some examples [Path model: Top-20 and Lin-20 vs. Top-5 recommendations (OSA) → perceived recommendation variety and quality (SSA) → choice difficulty and choice satisfaction (EXP), with movie expertise (PC)]
  • 63. Some examples [Screenshot]
  • 64. Some examples [Charts: perceived quality, choice difficulty, and choice satisfaction for 5/10/20 items at low and high diversification]
  • 65. Some examples Experts and novices differ in their decision-making process - Experts want to leverage their domain knowledge - Novices make decisions by choosing from examples Result: they want different preference elicitation methods Solution: Tailor the preference elicitation method to the user; a needs-based approach seems to work well for everyone
  • 66. Some examples [Screenshot]
  • 67. Some examples [Chart: system usefulness vs. domain knowledge for case-based and attribute-based PE]
  • 68. Some examples [Charts: perceived control, understandability, user interface satisfaction, perceived system effectiveness, and choice satisfaction vs. domain knowledge for the TopN, Sort, Explicit, Implicit, and Hybrid interaction methods]
  • 69. Some examples [Chart: perceived control vs. domain knowledge per interaction method]
  • 70. Some examples [Chart: understandability vs. domain knowledge per interaction method]
  • 71. Some examples [Chart: user interface satisfaction vs. domain knowledge per interaction method]
  • 72. Some examples [Chart: perceived system effectiveness vs. domain knowledge per interaction method]
  • 73. Some examples [Chart: choice satisfaction vs. domain knowledge per interaction method]
  • 74. Some examples Social recommenders limit the neighbors to the users’ friends. Result: - More control: friends can be rated as well - Better inspectability: recommendations can be traced We show that each of these effects exists; they both increase overall satisfaction
  • 75. Some examples [Screenshots: rate likes; weigh friends]
  • 76. Some examples [Screenshot]
  • 77. Some examples [Path model with coefficients: inspectability (full graph vs. list only) and control (item/friend vs. no control) manipulations (OSA) → perceived control and perceived recommendation quality (SSA) → understandability and satisfaction with the system (EXP), with personal characteristics (music expertise, trusting propensity, familiarity with recommenders) and interaction measures (inspection time, number of known recommendations, average rating)]
• 78. Introduction
I discussed how to do user experiments with my framework
Beyond the 5 stars
The Netflix approach is a step in the right direction
Towards user experience
We need to measure subjective valuations as well
Evaluation framework
With the framework you can put this all together
Measurement
Measure subjective valuations with questionnaires and factor analysis
Analysis
Use structural equation models to test comprehensive causal models
Some examples
I demonstrated the findings of some studies that used this approach
  • 79. See you in Dublin! bart.k@uci.edu www.usabart.nl @usabart
• 80. Resources
User-centric evaluation
- Knijnenburg, B.P., Willemsen, M.C., Gantner, Z., Soncu, H., Newell, C.: Explaining the User Experience of Recommender Systems. UMUAI 2012.
- Knijnenburg, B.P., Willemsen, M.C., Kobsa, A.: A Pragmatic Procedure to Support the User-Centric Evaluation of Recommender Systems. RecSys 2011.
Choice overload
- Willemsen, M.C., Graus, M.P., Knijnenburg, B.P., Bollen, D.: Not Just More of the Same: Preventing Choice Overload in Recommender Systems by Offering Small Diversified Sets. Submitted to TiiS.
- Bollen, D.G.F.M., Knijnenburg, B.P., Willemsen, M.C., Graus, M.P.: Understanding Choice Overload in Recommender Systems. RecSys 2010.
• 81. Resources
Preference elicitation methods
- Knijnenburg, B.P., Reijmer, N.J.M., Willemsen, M.C.: Each to His Own: How Different Users Call for Different Interaction Methods in Recommender Systems. RecSys 2011.
- Knijnenburg, B.P., Willemsen, M.C.: The Effect of Preference Elicitation Methods on the User Experience of a Recommender System. CHI 2010.
- Knijnenburg, B.P., Willemsen, M.C.: Understanding the Effect of Adaptive Preference Elicitation Methods on User Satisfaction of a Recommender System. RecSys 2009.
Social recommenders
- Knijnenburg, B.P., Bostandjiev, S., O'Donovan, J., Kobsa, A.: Inspectability and Control in Social Recommender Systems. RecSys 2012.
• 82. Resources
User feedback and privacy
- Knijnenburg, B.P., Kobsa, A.: Making Decisions about Privacy: Information Disclosure in Context-Aware Recommender Systems. Submitted to TiiS.
- Knijnenburg, B.P., Willemsen, M.C., Hirtbach, S.: Getting Recommendations and Providing Feedback: The User Experience of a Recommender System. EC-Web 2010.