
Measuring user engagement: the do, the do not do, and the we do not know


In the online world, user engagement refers to the quality of the user experience that emphasises the phenomena associated with wanting to use an application longer and more frequently. User engagement is a multifaceted, complex phenomenon, which gives rise to a number of measurement approaches. Common ways to evaluate user engagement include self-report measures, e.g. questionnaires; physiological methods, e.g. cursor and eye tracking; and web analytics, e.g. number of site visits and click depth. These methods represent various trade-offs in terms of setting (laboratory versus in the wild), object of measurement (user behaviour, affect or cognition) and scale of data collected. This talk presents various efforts aimed at combining approaches to measure engagement, with a particular focus on what these measures, individually and combined, can and cannot tell us about user engagement. The talk uses examples from studies on news sites, social media, and native advertising.


  1. Measuring user engagement: the do, the do not do, and the we do not know. Mounia Lalmas (mounia@acm.org). World Usability Day, Berlin – November 2014
  2. About me
  § Since October 2013: Principal Research Scientist at Yahoo Labs London › user engagement, native advertising, social media, search
  § 2011 – 2013: Visiting Principal Scientist at Yahoo Labs Barcelona › user engagement, social media, search
  § 2008 – 2010: Microsoft Research/RAEng Research Professor at the University of Glasgow › quantum theory to model information retrieval
  § 1999 – 2008: Lecturer (assistant professor) to Professor at Queen Mary, University of London › XML retrieval and evaluation (INEX)
  Blog: labtomarket.wordpress.com
  3. This talk
  § What is user engagement › definitions › characteristics › approaches
  § Attributes of user engagement measurement › scalability › setting › objectivity versus subjectivity › temporality
  4. What is user engagement?
  5. What is user engagement? “User engagement is a quality of the user experience that emphasizes the phenomena associated with wanting to use a technological resource longer and frequently” (Attfield et al., 2011)
  § Self-report: happy, sad, enjoyment, …
  § Analytics: click, upload, read, comment, share, …
  § Physiology: gaze, body heat, mouse movement, …
  The emotional, cognitive and behavioural connection that exists, at any point in time and over time, between a user and a technological resource.
  6. Why is it important to engage users?
  § In today’s wired world, users have enhanced expectations about their interactions with technology, resulting in increased competition amongst the purveyors and designers of interactive systems.
  § In addition to utilitarian factors, such as usability, we must consider the hedonic and experiential factors of interacting with technology, such as fun, fulfillment, play, and user engagement.
  (O’Brien, Lalmas & Yom-Tov, 2014)
  7. Patterns of user engagement: online sites differ in their engagement patterns
  § Games: users spend much time per visit.
  § Search: users come frequently and do not stay long.
  § Social media: users come frequently and stay long.
  § Niche: users come on average once a week, e.g. for a weekly post.
  § News: users come periodically, e.g. morning and evening.
  § Service: users visit the site when needed, e.g. to renew a subscription.
  (Lehmann et al., 2012)
  8. Why is it important to measure and interpret user engagement well? [Figure: CTR]
  9. Characteristics of user engagement
  § Focused attention (Webster & Ho, 1997; O’Brien, 2008): users must be focused to be engaged; distortions in the subjective perception of time are used to measure it.
  § Positive affect (O’Brien & Toms, 2008): emotions experienced by the user are intrinsically motivating; an initial affective “hook” can induce a desire for exploration, active discovery or participation.
  § Aesthetics (Jacques et al., 1995; O’Brien, 2008): the sensory, visual appeal of an interface stimulates the user and promotes focused attention; linked to design principles (e.g. symmetry, balance, saliency).
  § Endurability (Read, MacFarlane & Casey, 2002; O’Brien, 2008): people remember enjoyable, useful, engaging experiences and want to repeat them; reflected, e.g., in the propensity of users to recommend an experience, a site or a product.
  10. Characteristics of user engagement (continued)
  § Novelty (Webster & Ho, 1997; O’Brien, 2008): novelty, surprise, unfamiliarity and the unexpected appeal to users’ curiosity, encourage inquisitive behaviour and promote repeated engagement.
  § Richness and control (Jacques et al., 1995; Webster & Ho, 1997): richness captures the growth potential of an activity; control captures the extent to which a person is able to achieve this growth potential.
  § Reputation, trust and expectation (Attfield et al., 2011): trust is a necessary condition for user engagement; an implicit contract among people and entities that is more than technological.
  § Motivation, interests, incentives and benefits (Jacques et al., 1995; O’Brien & Toms, 2008): why should users engage? Also a source of difficulties in setting up “laboratory”-style experiments.
  11. Attributes of user engagement § Scale (large versus small) § Setting (laboratory versus field) § Objective versus subjective § Temporality (short- versus long-term). One attribute is not better than another: it depends on aims and constraints.
  12. Measuring user engagement
  § Self-report (questionnaire, interview, think-aloud and think-after protocols): subjective; short- and long-term; lab and field; small scale
  § Physiology (EEG, SCL, fMRI, eye tracking, mouse tracking): objective; short-term; lab and field; small and large scale
  § Analytics (intra- and inter-session metrics, data science): objective; short- and long-term; field; large scale
  13. Towards reliable and valid measurement
  14. Scalability: a dozen (qualitative & physiology); hundreds to thousands (online surveys); millions (analytics). From rich but limited in generalisation to powerful but hard to explain.
  15. Large scale measurement – analytics (intra-session measurement)
  Metrics: dwell time; session duration; bounce rate; play time (video); mouse movement; click-through rate (CTR); number of pages viewed (click depth); conversion rate; number of UGC items (comments); …
  Dwell time as a proxy of user interest, of relevance, and of conversion.
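To make these metrics concrete, here is a minimal sketch (not from the talk) of how dwell time, bounce rate and CTR could be computed from a pageview log; the log schema and field names are assumptions:

```python
# Illustrative sketch: intra-session metrics from a hypothetical pageview log.
from datetime import datetime

pageviews = [  # (user, url, timestamp, clicked) -- assumed schema
    ("u1", "/news/a", datetime(2014, 11, 1, 9, 0, 0), True),
    ("u1", "/news/b", datetime(2014, 11, 1, 9, 2, 30), False),
    ("u2", "/news/a", datetime(2014, 11, 1, 9, 5, 0), False),
]

def dwell_times(views):
    """Dwell time: seconds between a user's consecutive pageviews."""
    by_user = {}
    for user, url, ts, _ in sorted(views, key=lambda v: (v[0], v[2])):
        by_user.setdefault(user, []).append(ts)
    return {u: [(b - a).total_seconds() for a, b in zip(ts, ts[1:])]
            for u, ts in by_user.items()}

def bounce_rate(views):
    """Fraction of users (standing in for sessions here) with one pageview."""
    counts = {}
    for user, *_ in views:
        counts[user] = counts.get(user, 0) + 1
    return sum(1 for c in counts.values() if c == 1) / len(counts)

def ctr(views):
    """Click-through rate: clicks divided by impressions."""
    return sum(1 for *_, clicked in views if clicked) / len(views)

print(dwell_times(pageviews), bounce_rate(pageviews), ctr(pageviews))
```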
  16. Small scale measurement – eye tracking
  18 users, 16 tasks each (choose one story & rate it); eye movements recorded.
  § Attention (gaze): interest plays no role; position > saliency
  § Selection: mainly driven by interest; position > attention
  (Lin et al., 2007; Navalpakkam et al., 2012)
  17. Small scale measurement – focused attention questionnaire, 5-point scale (strongly disagree to strongly agree)
  1. I lost myself in this news tasks experience
  2. I was so involved in my news tasks that I lost track of time
  3. I blocked things out around me when I was completing the news tasks
  4. When I was performing these news tasks, I lost track of the world around me
  5. The time I spent performing these news tasks just slipped away
  6. I was absorbed in my news tasks
  7. During the news tasks experience I let myself go
  (O'Brien & Toms, 2010)
  18. Small scale measurement – PANAS questionnaire (10 positive and 10 negative items)
  § “You feel this way right now, that is, at the present moment” [1 = very slightly or not at all; 2 = a little; 3 = moderately; 4 = quite a bit; 5 = extremely] [randomize items]
  Negative items: distressed, upset, guilty, scared, hostile, irritable, ashamed, nervous, jittery, afraid
  Positive items: interested, excited, strong, enthusiastic, proud, alert, inspired, determined, attentive, active
  (Watson, Clark & Tellegen, 1988)
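For concreteness, a sketch of how these two instruments are conventionally scored; the mean/sum aggregation below is the usual convention for such scales, not necessarily the exact scoring used in the studies cited here:

```python
# Hedged sketch: scoring the 7-item focused attention scale and the 20-item PANAS.
POSITIVE = ["interested", "excited", "strong", "enthusiastic", "proud",
            "alert", "inspired", "determined", "attentive", "active"]
NEGATIVE = ["distressed", "upset", "guilty", "scared", "hostile",
            "irritable", "ashamed", "nervous", "jittery", "afraid"]

def focused_attention_score(item_responses):
    """Mean of the seven 5-point items (1 = strongly disagree ... 5 = strongly agree)."""
    assert len(item_responses) == 7
    return sum(item_responses) / 7.0

def panas_scores(responses):
    """responses: dict mapping each PANAS item to a 1-5 rating.
    Returns (positive affect, negative affect), each the sum of ten items."""
    pa = sum(responses[item] for item in POSITIVE)
    na = sum(responses[item] for item in NEGATIVE)
    return pa, na
```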
  19. Small scale measurement – gaze and self-reporting
  § News; interest; 57 users; reading task (114)
  § Three metrics: gaze, focused attention and positive affect
  § Questionnaire (qualitative data); eye tracking recordings (quantitative data)
  All three metrics align: interesting content promotes all engagement metrics. (Arapakis et al., 2014)
  20. From small- to large-scale measurement – mouse tracking
  § Navigation and interaction with a digital environment usually involve the use of a mouse (selecting, positioning, clicking).
  § Several works show the mouse cursor to be a weak proxy of gaze (attention).
  § A low-cost, scalable alternative.
  § Can be performed in a non-invasive manner, without removing users from their natural setting.
  21. Relevance, dwell time & cursor (Guo & Agichtein, 2012): “reading” a relevant long document vs “scanning” a long non-relevant document.
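To illustrate the distinction (a toy heuristic of my own, not Guo & Agichtein's actual model), long dwell with slow cursor movement might be labelled reading, short dwell with fast movement scanning; the thresholds are made-up illustration values:

```python
# Toy heuristic, not the model from Guo & Agichtein (2012).
def reading_or_scanning(dwell_seconds, mean_cursor_speed_px_s):
    if dwell_seconds > 60 and mean_cursor_speed_px_s < 100:
        return "reading"   # long dwell, slow steady cursor
    if dwell_seconds < 20 and mean_cursor_speed_px_s > 300:
        return "scanning"  # short dwell, fast cursor sweeps
    return "unclear"
```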
  22. Mouse gestures → features
  [Figure: a cursor trajectory sampled as timed (x, y) points, with resting-cursor periods (500 ms, 1000 ms, 1500 ms) and clicks marked.]
  22 users reading two articles; 176,550 cursor positions; 2,913 mouse gestures. (Arapakis, Lalmas & Valkanas, 2014)
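One plausible way to turn a cursor trace into gestures, keyed on the resting-cursor thresholds named on the slide; this is a sketch under assumptions, not necessarily the paper's exact segmentation rule:

```python
# Sketch: split a cursor trace into gestures at resting points.
def segment_gestures(trace, rest_ms=500):
    """trace: list of (x, y, t_ms) cursor samples, sorted by time.
    A gap of >= rest_ms between samples ends the current gesture."""
    gestures, current = [], []
    for x, y, t in trace:
        if current and t - current[-1][2] >= rest_ms:
            gestures.append(current)
            current = []
        current.append((x, y, t))
    if current:
        gestures.append(current)
    return gestures

trace = [(0, 0, 0), (5, 2, 40), (9, 3, 80), (9, 3, 900), (20, 8, 940)]
print(len(segment_gestures(trace)))  # 2: the 820 ms pause splits the trace
```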
  23. Towards a taxonomy of mouse gestures for user engagement measurement
  § The top-ranked clustering configuration is spectral clustering on the original dataset, with a hyperbolic tangent kernel, for k = 38.
  § Certain types of mouse gestures occur more or less often depending on user interest in the article.
  § There are significant correlations between certain types of mouse gestures and self-report measures.
  § Cursor behaviour goes beyond measuring frustration: it informs about positive and negative interaction.
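A hedged sketch of that clustering configuration using scikit-learn, whose sigmoid kernel is the hyperbolic tangent kernel; the gesture-feature matrix, kernel coefficients and the non-negativity shift are my assumptions, and the paper's exact pipeline may differ:

```python
# Sketch: spectral clustering with a hyperbolic tangent (sigmoid) kernel, k = 38.
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics.pairwise import sigmoid_kernel

X = np.random.rand(200, 10)           # placeholder gesture-feature matrix

affinity = sigmoid_kernel(X)          # tanh(gamma * <x, y> + coef0)
affinity = affinity - affinity.min()  # shift to non-negative similarities

labels = SpectralClustering(
    n_clusters=38, affinity="precomputed", random_state=0
).fit_predict(affinity)
print(np.bincount(labels))            # gesture counts per cluster
```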
  24. Setting: from the laboratory (high level of consistency and control) to “in the wild” (greater external validity, more “true to life”).
  25. Crowdsourcing and self-report
  § How the visual catchiness (saliency) of “relevant” information impacts › focused attention › affect
  § Saliency model of visual attention developed by Itti & Koch (2000)
  26. Manipulating saliency: web page screenshot and saliency maps, salient condition versus non-salient condition. (McCay-Peet, Lalmas & Navalpakkam, 2012)
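The study used the Itti & Koch (2000) saliency model; as an accessible stand-in (my substitution, not the study's code), OpenCV's spectral-residual saliency from opencv-contrib-python can produce a rough saliency map for a page screenshot; the file names are hypothetical:

```python
# Illustrative only: saliency map of a page screenshot via OpenCV's
# spectral-residual method (requires opencv-contrib-python); this loosely
# approximates, but is not, the Itti & Koch model used in the study.
import cv2

image = cv2.imread("page_screenshot.png")  # hypothetical file name
saliency = cv2.saliency.StaticSaliencySpectralResidual_create()
ok, saliency_map = saliency.computeSaliency(image)
if ok:
    # Map is float32 in [0, 1]; scale to 8-bit for viewing.
    cv2.imwrite("saliency_map.png", (saliency_map * 255).astype("uint8"))
```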
  27. Study design
  § 8 tasks = finding the latest news or headline on a celebrity or entertainment topic
  § Affect measured pre- and post-task using the Positive and Negative Affect Schedule (PANAS), e.g. “determined”, “attentive”
  § Focused attention measured with the 7-item focused attention scale, e.g. “I was so involved in my news tasks that I lost track of time”, “I blocked things out around me when I was completing the news tasks”, and with perceived time
  § Interest level in topics (pre-task) and questionnaire (post-task), e.g. “I was interested in the content of the web pages”, “I wanted to find out more about the topics that I encountered on the web pages”
  § 189 (90+99) participants from Amazon Mechanical Turk
  28. Using crowdsourcing works
  § When headlines are non-salient: users are slow at finding them, report more distraction due to web page features, and show a drop in affect.
  § When headlines are salient: users find them faster, report that it is easy to focus, and maintain positive affect.
  § Users reported it was “easier to focus in the salient condition”, BUT there was no significant improvement on the focused attention scale and no difference in perceived time spent on tasks.
  User interest in web page content is a good predictor of focused attention, itself a good predictor of positive affect.
  29. Objectivity vs subjectivity: objective (analytics and physiology) versus subjective (self-report). Towards reliability and validity: mapping objective and subjective measurement.
  30. “Ugly” vs “normal” interface: BBC News and Wikipedia.
  31. Mouse tracking and self-reporting
  § 324 users from Amazon Mechanical Turk (between-subject design)
  § Two domains (BBC News and Wikipedia)
  § Two tasks (reading and search)
  § “Normal” vs “ugly” interface
  § Questionnaires (qualitative data) › focused attention, positive affect › interest, aesthetics › + demographics, hardware
  § Mouse tracking (quantitative data) › movement speed, movement rate, click rate, pause length, percentage of time still
  (Warnock & Lalmas, 2013)
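The five cursor features in the last bullet could be computed along these lines; this is a sketch under assumed definitions (the paper's exact definitions and units may differ), from (x, y, t) cursor samples plus click timestamps:

```python
# Sketch: cursor features over one task, with assumed definitions.
import math

def cursor_features(trace, clicks, still_px=2):
    """trace: list of (x, y, t_ms) samples sorted by time; clicks: click times."""
    duration_s = (trace[-1][2] - trace[0][2]) / 1000.0
    dists = [math.hypot(x2 - x1, y2 - y1)
             for (x1, y1, _), (x2, y2, _) in zip(trace, trace[1:])]
    gaps_s = [(t2 - t1) / 1000.0
              for (*_, t1), (*_, t2) in zip(trace, trace[1:])]
    # "Still" intervals: consecutive samples that barely moved.
    still = [g for d, g in zip(dists, gaps_s) if d < still_px]
    return {
        "movement_speed_px_s": sum(dists) / duration_s,
        "movement_rate": len([d for d in dists if d >= still_px]) / duration_s,
        "click_rate": len(clicks) / duration_s,
        "mean_pause_s": sum(still) / len(still) if still else 0.0,
        "pct_time_still": sum(still) / duration_s,
    }
```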
  32. Mouse tracking could not tell much about
  § focused attention and positive affect
  § user interest in the task/topic
  § aesthetics
  BUT: the “ugly” variant did not result in lower user aesthetics scores (although BBC > Wikipedia), yet the comments left told another story:
  › Wikipedia: “The website was simply awful. Ads flashing everywhere, poor text colors on a dark blue background.”; “The webpage was entirely blue. I don't know if it was supposed to be like that, but it definitely detracted from the browsing experience.”
  › BBC News: “The website's layout and color scheme were a bitch to navigate and read.”; “Comic sans is a horrible font.”
  33. Flawed methodology? Non-existing signal? Wrong metric? Wrong measure? Hawthorne effect?
  § Design › usability versus engagement › within- versus between-subject
  § Mouse movement analysis was not sophisticated enough, as shown by recent work (Arapakis et al., 2014)
  34. Temporality: short-term versus long-term; from intra-session to inter-session.
  35. Large scale measurements – analytics
  § Intra-session engagement measures success in attracting the user to remain on the site for as long as possible.
  § Inter-session engagement is measured by observing lifetime user value.
  Intra-session measures: dwell time; session duration; bounce rate; play time (video); mouse movement; click-through rate (CTR); number of pages viewed (click depth); conversion rate; number of UGC items (comments); …
  Inter-session measures (reflecting loyalty, popularity and activity): fraction of return visits; time between visits (inter-session time, absence time); total view time per month (video); lifetime value (number of actions); number of sessions per unit of time; total usage time per unit of time; number of friends on site (social networks); number of UGC items (comments); …
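A minimal sketch (schema assumed, not from the talk) of two of the inter-session loyalty measures, computed from each user's list of session start times:

```python
# Sketch: inter-session loyalty metrics from per-user session start times.
from datetime import datetime

sessions_by_user = {  # hypothetical log
    "u1": [datetime(2014, 11, 1), datetime(2014, 11, 3), datetime(2014, 11, 10)],
    "u2": [datetime(2014, 11, 2)],
}

def fraction_return_visits(sessions):
    """Fraction of users who came back for at least one more session."""
    return sum(1 for s in sessions.values() if len(s) > 1) / len(sessions)

def sessions_per_week(starts):
    """Average sessions per week over one user's observed time span."""
    span_days = (max(starts) - min(starts)).days or 1
    return len(starts) / (span_days / 7.0)

print(fraction_return_visits(sessions_by_user))   # 0.5
print(sessions_per_week(sessions_by_user["u1"]))  # 3 sessions over 9 days
```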
  36. Inter-session metric – absence time: a short absence is a sign of loyalty and an important indication of user engagement. (Dupret & Lalmas, 2013)
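A sketch of the absence-time computation itself; Dupret & Lalmas (2013) go on to analyse these gaps with survival analysis, which is omitted here, and the schema is assumed:

```python
# Sketch: absence time as the gap between a user's consecutive sessions.
from datetime import datetime

def absence_times(session_starts):
    """session_starts: one user's session start times, sorted ascending.
    Returns absence times (hours) between consecutive sessions."""
    return [(b - a).total_seconds() / 3600.0
            for a, b in zip(session_starts, session_starts[1:])]

visits = [datetime(2014, 11, 1, 9), datetime(2014, 11, 1, 21), datetime(2014, 11, 4, 9)]
print(absence_times(visits))  # [12.0, 60.0] -- the shorter the gap, the more loyal
```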
  37. Absence time – search experience: from search session metrics to absence time
  1. Clicks below the 5th result reflect a poorer user experience; users cannot find what they are looking for.
  2. No click means a bad user experience.
  3. Clicking at the bottom of the page is a sign of low overall ranking quality.
  4. Users who find their answers quickly (click sooner) return sooner to the search application.
  5. Returning to the same search result page is a worse user experience than reformulating the query.
  38. Conclusions
  39. Measuring user engagement: What is a good signal? What is a good metric? What is a correct interpretation?
  1. No one measurement is perfect or complete.
  2. Studies have different constraints.
  3. Measurement should be applied consistently, with attention to reliability.
  4. Mostly “normal” interaction.
  5. “It is a capital mistake to theorize before one has data.” (Arthur Conan Doyle)
  40. Danke schön (thank you). This talk is based on the tutorial and book “Measuring User Engagement” (with Heather O’Brien and Elad Yom-Tov).
