Starke2017 - Effective User Interface Designs to Increase Energy-efficient Behavior in a Rasch-based Energy Recommender System

Presentation of our long paper at the #RecSys2017 conference on Recommender Systems in Como, presented by Alain Starke. It shows how the psychometric Rasch model can enhance user recommendations in the energy domain.
In collaboration with Martijn Willemsen & Chris Snijders - Eindhoven University of Technology.

Published in: Science

  1. 1. Effective User Interface Designs to Increase Energy-efficient Behavior in a Rasch-based Energy Recommender System Alain Starke, Martijn Willemsen, Chris Snijders Human-Technology Interaction Group, Eindhoven University of Technology
  2. 2. Central question: Can we design a recommender interface that effectively supports a user’s energy-saving goals?
  3. 3. Most consumers perform simple behavioral changes and think these are effective (Attari et al., 2010)
  4. 4. But governments mainly recommend large(r) ‘efficiency’ investments (e.g. Gardner & Stern, 2008)
  5. 5. How can we recommend from such a diverse set of energy-saving measures?
  6. 6. Regular RecSys approaches, e.g. collaborative filtering, are prone to reinforcing current behavior • If we want consumers and users to achieve (energy-saving) goals, we should not only focus on past behavior but ‘move forward’ (cf. Ekstrand & Willemsen, 2016) • We need a model which considers future goals
  7. 7. ‘Saving energy’ can be considered as an ordinal item-response model (in our case: a Rasch model)
  8. 8. Energy-saving measures can be ordered as increasingly difficult behavioral steps towards attaining the goal of saving energy (Kaiser et al., 2010; Urban & Scasny, 2014)
  9. 9. These steps reflect willingness & capacity to save energy: a person’s energy-saving ability (Kaiser et al., 2010; Urban & Scasny, 2014)
  10. 10. We infer behavioral difficulties based on engagement frequencies. Input: persons indicate which measures they perform. [Figure labels: Difficult / Obscure; Easy / Popular]
  11. 11. In a similar vein, we infer energy-saving abilities. Input: persons indicate which measures they perform. [Figure labels: Low ability, performs few; High ability, performs many]
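
The idea on the two slides above can be sketched in a few lines of Python. This is only a crude log-odds approximation and an assumption on our part, not the estimation procedure used in the paper (proper Rasch estimation relies on maximum-likelihood methods). The function names and the toy response matrix are hypothetical.

```python
import math

def _smoothed_logit(successes, trials):
    # Add-half smoothing keeps the log-odds finite for 0% or 100% rates.
    p = (successes + 0.5) / (trials + 1.0)
    return math.log(p / (1.0 - p))

def rough_rasch_estimates(responses):
    """responses[i][j] is 1 if person i performs measure j, else 0."""
    n_persons = len(responses)
    n_measures = len(responses[0])
    # A measure performed by many persons is easy: low difficulty.
    difficulties = [
        -_smoothed_logit(sum(row[j] for row in responses), n_persons)
        for j in range(n_measures)
    ]
    # Center difficulties at zero, a common convention for Rasch scales.
    mean_difficulty = sum(difficulties) / n_measures
    difficulties = [d - mean_difficulty for d in difficulties]
    # A person performing many measures has a high ability.
    abilities = [_smoothed_logit(sum(row), n_measures) for row in responses]
    return difficulties, abilities

# Hypothetical toy data: 4 persons, 4 measures ordered from popular to obscure.
toy_responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]
difficulties, abilities = rough_rasch_estimates(toy_responses)
print(difficulties)
print(abilities)
```

In this toy example the most popular measure receives the lowest difficulty and the person who performs every measure receives the highest ability, which mirrors the ordering the slides describe.
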
  12. 12. Ordinal, one-dimensional Rasch scale of measures and persons → Starke et al. (2015) fitted a scale of 79 measures
  13. 13. One’s energy-saving ability is a good starting point to look for appropriate measures: a person has a 50% probability of performing a measure with a difficulty equal to his/her ability
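
The 50% statement follows from the standard dichotomous Rasch model, in which the probability of performing a measure is a logistic function of ability minus difficulty (both in logits). A minimal sketch, with a hypothetical function name and made-up example values:

```python
import math

def rasch_probability(ability, difficulty):
    """Probability that a person performs a measure under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# When ability equals difficulty the exponent is zero, so the probability is 0.5.
print(rasch_probability(0.7, 0.7))
# Easier measures (difficulty below ability) are more likely to be performed,
# harder measures (difficulty above ability) less likely.
print(rasch_probability(0.7, -0.3), rasch_probability(0.7, 1.7))
```
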
  15. 15. This paper: We developed two recommender interfaces for a Rasch scale of 79 measures, to examine whether such a scale can effectively support the selection and adoption of energy-saving measures
  16. 16. Two recommender user studies. Study 1: • Using a Rasch scale, are ability-tailored recommendations more satisfactory and effective than non-personalized suggestions? Study 2: • How should advice be tailored around a user’s ability to support energy-efficient behavior? • Can persuasive interface aspects support this?
  17. 17. Study 1: Effectiveness of Rasch-based, tailored advice
  18. 18. We presented the Rasch scale and the advice in a ‘web shop’
  19. 19. Procedure – steps of our user study
  20. 20. Research design • Abilities were estimated using 13 behavioral self-report items (cf. Bond & Fox, 2006; Starke et al., 2015) • We compared four different types of advice in 4 between-subject conditions: – Non-personalized, ascending difficulty order (‘Most popular’) – Non-personalized, descending difficulty order (‘Most difficult’) – Ability-tailored, ascending difficulty order – Ability-tailored, descending difficulty order
  21. 21. Dependent measures. Users interacting with the website: • Behavioral difficulty of chosen measures • Number of chosen measures • Clicking behavior. Evaluative survey (7-point Likert scale): • Perceived System Support • Choice Satisfaction • Perceived Effort. Survey sent to users after 4 weeks: • Extent of implementation of chosen measures (4-point scale)
  22. 22. We evaluated the recommender using the user experience framework (Knijnenburg & Willemsen, 2015 – Evaluating Recommender Systems with User Experiments)
  23. 23. Participants & analysis • 209 research panel participants used our interface & survey • 78 participants completed the follow-up survey four weeks later • Analysis: Structural Equation Modelling, using confirmatory factor analysis for the user experience aspects
  24. 24. Results study 1
  25. 25. Ability-tailored advice was perceived as less effortful and – in turn – more supportive & satisfactory. [Path diagram: Tailored Rec’s → Perceived Effort (-.440*); Perceived Effort → Perceived Support (-.767***); Perceived Support → Choice Satisfaction (.746***). *** p < 0.001, ** p < 0.01, * p < 0.05.]
  27. 27. User experience aspects reflected interface behavior • Users perceiving system support selected more (easy) items • Behavioral follow-up was higher for easy items. [Path diagram with nodes: Tailored Rec’s, Perceived Effort, Perceived Support, Choice Satisfaction, % Executed items, Chosen per click, Difficulty chosen items, No. of chosen items; path coefficients as extracted: .746***, -.767***, .239***, -.113**, .196***, .139**, -.440*, -.312**, -.068**. *** p < 0.001, ** p < 0.01, * p < 0.05.]
  28. 28. Lessons learned • Ability-tailored advice was a more effective approach than simply using the Rasch scale • Ambiguous results for behavioral follow-up – Easy (feasible) measures might lead to more energy-efficient behavior in the long run – Difficult (novel) measures had a positive effect on choice satisfaction
  29. 29. Study 2: How should advice be tailored to support energy-efficient choices? (And can fit scores help to persuade users to pick more challenging measures?)
  30. 30. Supporting energy-efficient behavior using match/fit scores?
  31. 31. Web shop interface with three lists (tabs): ‘Base’, ‘Recommended’ and ‘Challenging’ • ‘Recommended’ contains the 15 best-matching measures, with fit scores ranging from 100% to 60% • ‘Base’ contains easier measures, ‘Challenging’ more difficult ones
  32. 32. 3x2 Between-subject research design • 3 levels of difficulty, determining contents of the ‘recommended’ list: – Easy / below ability (~75% probability) – Ability-tailored (~50% probability) – Difficult / above ability (~25% probability) • 2 levels of fit score: they were either shown or not – The 100% score was consistent with the difficulty condition – E.g. in the easy condition, measures below a user’s ability (75%) had a 100% match score
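
Under the Rasch model, each probability level in the design above corresponds to a fixed offset between a user's ability and the targeted difficulty: solving the logistic function for difficulty gives difficulty = ability − ln(p / (1 − p)). The sketch below works this out and attaches fit scores to the 15 closest measures. The slides do not state how fit scores were computed, so the linear scaling from 100% down to 60% is purely an assumption for illustration, as are the function names and the evenly spaced 79-measure scale.

```python
import math

def target_difficulty(ability, probability):
    """Difficulty at which a person of this ability succeeds with `probability`."""
    return ability - math.log(probability / (1.0 - probability))

def recommend_with_fit_scores(ability, difficulties, probability,
                              list_size=15, top=100.0, bottom=60.0):
    """Pick the measures closest to the target difficulty and score them.

    ASSUMPTION: fit scores fall linearly with distance from the target,
    from `top` for the closest measure to `bottom` for the last one listed.
    """
    target = target_difficulty(ability, probability)
    ranked = sorted(difficulties.items(), key=lambda item: abs(item[1] - target))
    ranked = ranked[:list_size]
    max_dist = abs(ranked[-1][1] - target)
    results = []
    for name, difficulty in ranked:
        dist = abs(difficulty - target)
        share = dist / max_dist if max_dist > 0 else 0.0
        results.append((name, top - (top - bottom) * share))
    return results

# Hypothetical scale: 79 measures spread evenly between -3 and +3 logits.
scale = {f"measure_{i}": -3.0 + 6.0 * i / 78 for i in range(79)}
for label, p in [("easy", 0.75), ("tailored", 0.50), ("difficult", 0.25)]:
    recs = recommend_with_fit_scores(0.0, scale, p)
    print(label, round(target_difficulty(0.0, p), 3), recs[0], recs[-1])
```

For a user with ability 0, the easy condition targets a difficulty of about −1.1 logits (ln 3 below ability), the tailored condition targets 0, and the difficult condition about +1.1 logits.
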
  33. 33. Participants, procedure, analysis… • 288 participants used our interface and completed the survey • 46 participants reported behavioral follow-up • Procedure & analysis: Similar to study 1 – We now measure perceived feasibility instead of effort
  34. 34. Results study 2
  35. 35. Easy recommendations were perceived as feasible and, in turn, supportive & satisfactory. SEM statistics: χ²(140) = 198.693, p < 0.001, CFI = 0.992, TLI = 0.990. [Path diagram: Rec difficulty → Perceived Feasibility (−.469***); further paths among Perceived Feasibility, Perceived Support and Choice Satisfaction, with coefficients as extracted: .234***, .506***, .221***. *** p < 0.001, ** p < 0.01, * p < 0.05.]
  37. 37. • Users who felt supported selected more measures • Satisfied users showed a higher % of follow-up. [Path diagram with nodes: Rec difficulty, Perceived Feasibility, Perceived Support, Choice Satisfaction, No. of chosen items, % Executed items, Difficulty chosen items; path coefficients as extracted: −.469***, .234***, .506***, .221***, .385***, −.113**. *** p < 0.001, ** p < 0.01, * p < 0.05.]
  38. 38. Users chose slightly more measures when presented with easier ones (showing fit scores did not really matter)
  39. 39. Fit scores boosted satisfaction levels for easy measures, but backfired for difficult ones
  40. 40. Lessons learned • A satisfactory user interface can lead to the adoption of more energy-saving measures (within system + after 4 weeks) • Easy tailored measures seem to be attractive, as they were perceived as feasible and chosen more often • Fit scores were merely self-reinforcing, not persuasive to attain ‘more difficult goals’
  41. 41. Wrap-up & suggestions • We presented a novel approach to recommendations, using an ordinal Rasch scale in a user study, which also measured behavioral follow-up • A ‘light personalization’ algorithm had an effect on behavioral change → there’s room for more • Rasch-based interfaces can be effective in supporting actual behavior → perhaps in other domains too?
  42. 42. Thank you! Alain Starke, a.d.starke@tue.nl. This research was supported by the Netherlands Organization for Scientific Research (NWO), 406-14-088
  43. 43. Supplementary material
  44. 44. Full SEM Study 1: χ²(108) = 191.000, p < 0.001, CFI = 0.957, TLI = 0.949, RMSEA = 0.061, 90%-CI: [0.046,0.084].
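
As a sanity check on how these fit indices relate, the common point-estimate formula RMSEA = sqrt(max(χ² − df, 0) / (df · (N − 1))) can be applied to the Study 1 figures. This is a sketch under assumptions: the slides do not say which estimator or software produced the reported values, and N = 209 is taken from the participant count given for Study 1. Other estimators can yield different RMSEA values from the same χ².

```python
import math

def rmsea(chi_square, df, n):
    """Textbook RMSEA point estimate from chi-square, degrees of freedom and sample size."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))

# Study 1 values from the slide: chi-square(108) = 191.000, with N = 209 assumed.
study1_rmsea = rmsea(191.0, 108, 209)
print(round(study1_rmsea, 3))  # 0.061
```
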
  47. 47. Surprising result study 1: Users tended to navigate to the right before choosing a measure
  48. 48. Study 1: Choice satisfaction was positively influenced by both support (tailored advice) & advice difficulty. [Path diagram with nodes: Tailored Rec’s, Perceived Effort, Perceived Support, Choice Satisfaction, % Executed items, Chosen per click, Difficulty chosen items, No. of chosen items; path coefficients as extracted: .746***, -.767***, .239***, -.113**, .196***, .139**, -.440*, -.312**, -.068**. *** p < 0.001, ** p < 0.01, * p < 0.05.]
  49. 49. Full SEM study 2: χ²(140) = 198.693, p < 0.001, CFI = 0.992, TLI = 0.990, RMSEA = 0.046, 90%-CI: [0.034,0.057].