Attention Approximation:
From the web to multi-screen television
Caroline Jay
caroline.jay@manchester.ac.uk
Web Ergonomics Lab, University of Manchester
Research funded by EPSRC Knowledge Transfer and Impact Acceleration Accounts
‘Attention Approximation’
• What is it?
• Why is it useful?
• Where did it come from?
• How are we using it now?
Attention Approximation 2
Attention Approximation
• Determining the ‘focus’ of attention, where ‘focus’
may vary along a number of dimensions:
– Granularity
• Which device?
• Which part of the screen?
– Population
• Individual
• Particular group
• Everyone
– Time period
• Seconds
• Time of day
Driving technology development with
empirical models
• Conceptual representations of interaction
built entirely on data can help us
– Predict technology usage
– Inform interaction design
• In applied research, ecological validity is
important.
Ecologically valid interaction models
• Task may not be predetermined.
• We want to understand what the user is
doing, and why.
– We need to know the current focus of attention.
• When there are multiple parallel information
streams, determining which is in focus is hard.
Translating Web content to audio
• Screen readers handled dynamic updates
badly.
• If we understood how sighted users view
updates, could we translate them to audio
more effectively?
SASWAT project, funded by EPSRC (EP/E062954/1)
Controlled study
• Real Web pages
• View for 30 seconds
• Conditions:
– Ticker active
– Ticker stationary
• Are people more likely to
look at the moving ticker?
Results
[Figures: gaze data for the stationary vs. moving ticker conditions]
Exploratory study
• Participants completed tasks on sites that
contained dynamic content.
– No constraints on how task was completed.
– No constraints on where task was completed.
• Nine minutes of browsing.
Data-driven analysis
• Can we predict whether people view dynamic
updates as a function of their characteristics?
• Chi-squared Automatic Interaction Detection (CHAID) analysis
– Action: click, hover, keystroke, enter, none
– Area: cm²
– Duration: seconds
– (participant)
– (addition or replacement)
• Validation data from later study
Results
• CHAID model predicts viewing behaviour with an
accuracy of ~80%
• Best predictor: action
– Click: 77% viewed
– Keystroke/Enter/Hover: 41% viewed
– None: 20% viewed
Click-activated updates (77% viewed overall): viewing probability by area (cm²)
– <1.1: 39%
– 1.1-7.8: 71%
– 7.8-32.9: 90%
– >32.9: 99%
Keystroke/Enter/Hover updates (41% viewed overall): viewing probability by duration (s)
– <0.6: 16%
– 0.6-1.2: 41%
– 1.2-2.8: 59%
– >2.8: 81%
All other updates (20% viewed overall): viewing probability by duration (s)
– <2.8: 6%
– 2.8-6.2: 20%
– >6.2: 30%
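The decision rules above can be collected into a small lookup function. This is an illustrative reconstruction from the published percentages, not the original CHAID model; the function name is mine.

```python
# Illustrative encoding of the CHAID tree's leaves, using the thresholds
# and viewing percentages reported on the slides.

def p_viewed(action, area_cm2, duration_s):
    """Predicted probability that a sighted user views a dynamic update."""
    if action == "click":
        # Click-activated updates: area is the next best predictor.
        if area_cm2 < 1.1:
            return 0.39
        if area_cm2 < 7.8:
            return 0.71
        if area_cm2 < 32.9:
            return 0.90
        return 0.99
    if action in ("keystroke", "enter", "hover"):
        # On-screen duration drives viewing probability.
        if duration_s < 0.6:
            return 0.16
        if duration_s < 1.2:
            return 0.41
        if duration_s < 2.8:
            return 0.59
        return 0.81
    # Updates with no associated action (automatic updates).
    if duration_s < 2.8:
        return 0.06
    if duration_s < 6.2:
        return 0.20
    return 0.30

print(p_viewed("click", 10.0, 5.0))     # 0.9  (large click-activated update)
print(p_viewed("keystroke", 2.0, 0.9))  # 0.41 (brief suggestion list)
```

The structure mirrors the tree: action first, then area for click-activated updates and duration for everything else.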
Why does the model take this form?
• Area (and action) are properties of the update.
– As an update increases in size it becomes more
salient.
• Duration is sometimes a property of the update,
and sometimes a property of user behaviour.
– The longer a suggestion list appears on the screen, the more likely it is to be viewed.
– People pause to view the content.
Translating dynamic updates to audio
• Firefox plugin
– Prioritize click-activated updates.
– Deliver keystroke-activated updates whenever
there is a pause in typing.
– Opt-in to receiving automatic updates.
• Preferred by all participants in blind and
double-blind evaluation when compared with
FireVox baseline.
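The delivery rules above might be sketched as a small announcement queue. The class and method names here are hypothetical, not the actual plugin's API.

```python
# Hypothetical sketch of the plugin's delivery policy: click-activated
# updates first, keystroke-activated updates held until a typing pause,
# automatic updates only on opt-in.

class AudioUpdateQueue:
    """Order spoken announcements of dynamic updates by predicted importance."""

    def __init__(self, announce_automatic=False):
        self.announce_automatic = announce_automatic  # automatic updates are opt-in
        self.ready = []    # updates that may be spoken now
        self.pending = []  # keystroke-activated updates held until typing pauses

    def add(self, text, trigger):
        if trigger == "click":
            # Click-activated updates were viewed most often (77%): speak first.
            self.ready.insert(0, text)
        elif trigger in ("keystroke", "enter", "hover"):
            # Hold these until there is a pause in typing.
            self.pending.append(text)
        elif self.announce_automatic:
            # Automatic updates are announced only if the user opted in.
            self.ready.append(text)

    def on_typing_pause(self):
        # A pause in typing releases any held keystroke-activated updates.
        self.ready.extend(self.pending)
        self.pending.clear()

    def next_announcement(self):
        return self.ready.pop(0) if self.ready else None
```

For example, a click-triggered result is spoken before a held suggestion list, and ticker updates stay silent unless the user opts in.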
A conversation with BBC R&D
• Can we predict behaviour with other types of
media?
• Can we use this to drive future media
development?
[Images: can the model tell us anything about how people view television content? Examples: final score, red button, dual-screen viewing]
Media interaction models
• Desktop, Web and social media
– Lean forward
• Newspaper, film and television
– Lean back
• Two or more screens
– Lean back and lean forward
– Lean back and lean back
– Lean forward and lean forward
Eye tracking TV viewing
C. Jay, A. Brown, M. Glancy, M. Armstrong, S. Harper (2013). Attention approximation: from the Web to multi-screen television. TVUX-2013@CHI. http://goo.gl/dvAp3V
Brown, M. Evans, C. Jay, M. Glancy, R. Jones, S. Harper (2014). HCI over multiple screens. CHI EA: alt.chi 2014. http://goo.gl/UJhPC5
Attention on a single screen
[Images: television and second screen]
Attention across two screens
• Observation of existing second
screen app use
• Unconstrained interaction
• Eye tracking
Technical issues
• Can we track eye movement over two screens?
• Is the set-up ecologically valid?
Data validity
• Good calibration.
• Good match between eye tracking data and
video analysis.
• Good match between data collected with and
without eye tracking.
Results
• 5:1 split of visual attention to the TV
• Dwell times longer for the TV
Length of viewing period:
          >30 s    <2.5 s
TV         27%      30%
Tablet     <1%      51%
Split of attention across two screens
[Figure: proportion of participants viewing the TV vs. the tablet over time]
Updates and action
• TV: ‘There, there, there..!’
• Tablet: ‘Where to see a dolphin’
Attention approximation in action
Approximating attention in the wild
• Improve the ecological validity of predictive
models.
• Detect focus to drive interaction on the fly.
Touch as a proxy for visual attention
Web proxy logging tool:
A. Apaolaza, S. Harper & C. Jay (2013). Understanding users in the wild. W4A 2013.
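The touch-as-proxy analysis can be sketched as correlating two binary time series: touch events and gaze-on-tablet events binned into 2-second windows (the bin size mentioned in the editor's notes). The timestamps below are illustrative, not study data.

```python
# Sketch: bin touch and gaze events into 2-second windows and correlate
# the resulting 0/1 series to see how well touch tracks visual attention.

def to_bins(timestamps, session_s, bin_s=2.0):
    """Return a 0/1 series: 1 if any event falls in the bin."""
    series = [0] * int(session_s / bin_s)
    for t in timestamps:
        i = int(t / bin_s)
        if 0 <= i < len(series):
            series[i] = 1
    return series

def pearson(x, y):
    """Pearson correlation of two equal-length numeric series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

touches = [1.0, 5.2, 5.9, 12.4]   # illustrative timestamps (seconds)
views = [1.3, 5.5, 13.0, 17.8]
r = pearson(to_bins(touches, 20), to_bins(views, 20))
```

A high positive `r` would support using logged touches, which are cheap to collect in the wild, as a stand-in for eye tracking.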
Using attention approximation in
technology development
• It’s complicated – particularly in the wild
– Influence
– Inference
• Model according to application
– Production design
– Content delivery
• Ultimate contribution
– To advance craft-based engineering with science
Find out more
Publications, reports and data:
http://goo.gl/1h4z4K
caroline.jay@manchester.ac.uk
The Web Ergonomics Lab
The University of Manchester, UK
http://wel.cs.manchester.ac.uk/
Challenge
• Model must predict future observations.
– Internal validity: reliably predicts observations in
the same setting.
– External validity: reliably predicts observations in
other settings.
What is the appropriate paradigm for building this type of model?
Challenges
• Eye tracking is accurate, but only suitable for the
lab
– Currently investigating logging data and interaction on
the device
• Many factors to consider:
– Interaction
– Content
– Environment
• If we can effectively monitor these in the wild…
– Privacy


Editor's Notes

  • #3 Watching TV now involves more than one screen. People have been using second screens, mobile devices such as tablets or phones, for a while, but much of this has been viewer-led – e.g. looking up additional info or social media. Broadcasters are really keen to exploit this, so they are starting to develop companion content for what’s happening on the main screen. Can see Secret Fortune – play at home. Broadcasters want to go way beyond this, but at the moment this type of interaction is not well understood. There has been second-screen research, but mostly looking at social aspects of second-screen use. What we don’t have are models describing cognitive and perceptual aspects of multiple device interaction: how do people split their attention between devices? What are the factors that influence attention orientation? This is what we’re trying to investigate with this work.
  • #4 Could apply to different modalities. May be modelled, or detected on the fly.
  • #7 Some previous work in this area: some controlled studies found that movement had an effect, others that it didn’t.
  • #8 Some previous work in this area: some controlled studies found that movement had an effect, others that it didn’t.
  • #9 Some controlled studies found that movement had an effect, others that it didn’t.
  • #10 Some controlled studies found that movement had an effect, others that it didn’t.
  • #11 We realised pretty early on that designing a controlled study at the outset was not going to be possible. There are a plethora of dynamic updates – we had no idea what a typical update looked like – and we wanted to be able to deal with any of them.
  • #12 Virtually all modelling is based on predicting or understanding performance as a function of task. The study just talked about was only loosely based on task (trying to find errors in a spreadsheet). In real world, never going to know person’s task. Not to say we ignore task – may be able to infer what’s happening, and use that to help us understand behaviour – but that it’s not possible to know before somebody has done something, what they are going to do. Especially true of complex web apps. Still useful to be able to predict behaviour though – not least because knowing how someone will respond to the perceptual characteristics of UI components could help with design.
  • #13 1486 updates 585 validators
  • #16 Not surprising -
  • #19 Can the model tell us anything about how people view television content? (picture of final score, red button, dual screen)
  • #20 Can the model tell us anything about how people view television content? (picture of final score, red button, dual screen)
  • #21 Can the model tell us anything about how people view television content? (picture of final score, red button, dual screen)
  • #22 Watching TV now involves more than one screen. People have been using second screens, mobile devices such as tablets or phones, for a while, but much of this has been viewer-led – e.g. looking up additional info or social media. Broadcasters are really keen to exploit this, so they are starting to develop companion content for what’s happening on the main screen. Can see Secret Fortune – play at home. Broadcasters want to go way beyond this, but at the moment this type of interaction is not well understood. There has been second-screen research, but mostly looking at social aspects of second-screen use. What we don’t have are models describing cognitive and perceptual aspects of multiple device interaction: how do people split their attention between devices? What are the factors that influence attention orientation? This is what we’re trying to investigate with this work.
  • #23 When we think about media consumption, there are two main interaction models.
  • #24 So far we’ve run eye tracking studies examining two scenarios: additional content on the TV, and additional content on a companion device. The methods for this work have been described in CHI presentations last year and this year, so I won’t go into detail about how they were run, but I will share some of the more interesting results with you.
  • #25 On the web you have event data to help. Previous work showed that it was the most important factor. Lean back consumption we need to consider different factors.
  • #26 So within a TV viewing scenario, there are potentially lots of different types of interaction, and lots of different types of activities. How should we start to investigate this situation? We were working on this research with the BBC, who are the primary TV network in the UK, but also produce a lot of programs that go out worldwide. They are really interested in this space, and had already produced a prototype companion app for the show ‘Autumn Watch’, which is a popular nature show that goes out between September and November in the UK. Our approach was to get people to watch the program with the app, and observe what happened. Because we were interested in understanding attention orientation, we decided to track their eye movements during the study, so we could work out which device they were looking at.
  • #27 So the first obvious technical issue here is: can we track eye movements in this scenario? We wanted to use free-standing eye trackers as they are less intrusive than head-mounted ones, but they are essentially designed to be used one at a time with a desktop display. We used two Tobii eye trackers, one mounted below the tablet, which was fixed in a clamp, and one in front of the TV. The second issue is: is the set-up ecologically valid? We’ve set up the lab to look like a living room, but there are two obvious problems. One is that the tablet is clamped, which may restrict the participant’s willingness to interact with it; the second is that we are still in a lab, we’re not in someone’s home.
  • #28 False positive rate 6% of segments. The internal validity of the eye tracking data was pretty good. Participants primarily fixated faces on the TV, and text on the tablet, as we might expect, which shows that the calibration was reasonably accurate. The tablet had a camera mounted above it, and we performed a painstaking analysis where we checked, for every half second period, whether the video and eye tracking data agreed on whether the participant was fixating the tablet. This showed that the gaze tracking was pretty accurate, apart from a few cases where the eye tracker, mounted beneath the tablet, was occluded by the participants’ hand. All in all, the data were pretty good though, so we can see that eye tracking provides a quick and accurate means of monitoring attention. So what about the external validity of the data? We were concerned that mounting the tablet in a clamp so that it could be used with the eye tracker would restrict the extent that people interacted with it. To check this, we ran a second experiment, without any eye tracking. The video analysis results from these two data sets were highly correlated, so we’re reasonably confident that interaction wasn’t restricted by the set up that much.
  • #29 Half second time slices. Web cam used to record face.
  • #30 So what did the split of attention look like? This graph shows the percentage of participants who were looking at the TV, along the top, or the tablet, along the bottom, in 5 second intervals. One of the things we can see is that updates to the content on the tablet, shown by thick black lines, drew participants’ attention. Correlation 0.6, 2 sec bins
  • #31 There were of course, other factors that drew attention too. Here’s an example of one of them. We can see that
  • #33 Whether or not people view tablet.
  • #36 % people viewing (y/n) vs. % people touching (y/n) in 2 s bins. Correlation 0.44, 2 sec bins
  • #38 Explored the methodological issues – haven’t explored what the data actually means. Will do this tomorrow.