GAZE INTERACTION FROM BED
John Paulin Hansen
IT University of Copenhagen
Rued Langgaards vej 7
Javier San Agustin
IT University of Copenhagen
Rued Langgaards vej 7
IT University of Copenhagen
Rued Langgaards vej 7
ABSTRACT
This paper presents a low-cost gaze tracking solution for
bedbound people composed of freeware tracking software and
commodity hardware. Gaze interaction is done on a large wall-
projected image, visible to all people present in the room. The
hardware equipment leaves physical space free to assist the
person. Accuracy and precision of the tracking system were tested
in an experiment with 12 subjects. We obtained a tracking quality
that is sufficiently good to control applications designed for gaze
interaction. The best tracking conditions were achieved when
people were sitting up compared to lying down. Also, gaze
tracking in the bottom part of the image was found to be more
precise than in the top part.
Categories and Subject Descriptors
H.5.2 [User Interfaces]: Input devices and strategies.

General Terms
Design, Reliability, Experimentation, Human Factors.

Keywords
Gaze tracking, gaze interaction, universal access, target selection,
alternative input, interface design, disability, assistive technology,
healthcare technology, augmented and alternative communication.
1. INTRODUCTION
People with severe motor disabilities have been pioneering gaze
interaction since the early 1990s. They can communicate with
friends and family by gaze typing, browse the Internet and play
computer games. Several commercial gaze-tracking systems
support these activities well. Most of the systems are fixed into a
single hardware unit consisting of a monitor, one or more cameras
and infrared (IR) light sources.
Systems with all hardware components built-in usually offer high
accuracy and tolerance to head movements. However, in some
situations this might reduce the flexibility of the setup, making it
difficult to use the system in a non-desktop scenario.
Figure 1 shows a commercial gaze communication system
mounted above a reclining person who has ALS/MND. First, the
space requirements for this setup may seriously obstruct
caretaking routines. Second, the limitation of the viewing-angle of
the monitor makes it difficult for people standing around the bed
to follow what this person is doing with his eyes. Third, if a single
part of the unit breaks down, all of it will have to be sent off for
replacement or repair, leaving the user without communication
means for days. Finally, the relatively high cost of commercial
gaze communication systems may prevent some people with
severe disabilities from having access to one.
Figure 1: A person with ALS/MND using a gaze
communication system from his bed.
People at hospitals who are paralyzed due to a severe medical
condition may also be considered for bedside gaze
communication. For instance, patients with frontal burns and a
lung injury commonly have a tracheostomy tube in the front of the
neck and are therefore unable to speak. Obviously, it is of utmost
importance for patient safety that they are able to communicate
with the medical staff. Furthermore, being able to talk with their
relatives may help them get through a difficult time.
Consequently, we see a need for a gaze tracking system that does
not occupy the physical space in front of the user. Preferably, the
system could apply a large display that can be seen by a group of
people and it should be composed of inexpensive hardware
components (display, camera, IR lights and PC) that can be
substituted immediately if they fail.
In this paper we examine a system that meets these requirements
with off-the-shelf hardware that tracks the user’s gaze from a
distance. We use a standard video camera placed at the end of the
bed, connected to a PC running an open-source gaze tracking
system. A display image is projected onto a wall in front of the
bed, providing visibility for everyone standing in the room and,
most importantly, freeing the physical space around the user. The
accuracy and precision of the system are evaluated in an
experiment and the effect of lying down vs. sitting up is also
analyzed.
2. PREVIOUS WORK
Accuracy refers to the degree to which the sensor readings
represent the true value of what is measured. In gaze interaction,
accuracy is measured as the distance between the point-of-gaze
estimated by the system and the point where the user is actually
looking. When the accuracy is low, the user will see an offset
between the cursor location and the point on the screen where he
is looking. Most gaze tracking systems introduce an offset. In
some cases, big offsets can make it difficult to hit small targets.
Furthermore, the offset may vary across the monitor, usually
being larger in the corners.

Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that
copies bear this notice and the full citation on the first page. To copy
otherwise, or republish, to post on servers or to redistribute to lists,
requires prior specific permission and/or a fee.
NGCA '11, May 26-27, 2011, Karlskrona, Sweden.
Copyright 2011 ACM 978-1-4503-0680-5/11/05…$10.00.
Precision refers to the extent to which successive readings of the
eye tracking system agree in value, and measures the spread of the
gaze point recordings over time. When the precision is low, the
cursor might become jittery if it’s not smoothed. A fixation
detection algorithm will use the spread of the gaze samples to
detect fixations and smooth the signal to reduce jitter [2, 3].
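As an illustration of this idea, the following sketch implements a simple dispersion-based fixation check with moving-average smoothing. It is a minimal example of the technique, not the algorithm from the cited work, and the dispersion threshold is an assumption:

```python
# Illustrative dispersion-based fixation detection with simple
# moving-average smoothing. Threshold values are assumptions.

def dispersion(points):
    # spread of a window of gaze samples: (max x - min x) + (max y - min y)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def is_fixation(window, max_dispersion=30.0):
    # a window of samples counts as a fixation if its spread stays small
    return dispersion(window) <= max_dispersion

def smooth(points):
    # average the window to obtain a stable, jitter-free cursor position
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)
```

Jittery samples clustered around a point are classified as a fixation and averaged into one stable cursor position, while a large jump between samples (a saccade) fails the dispersion test.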
Manufacturers of gaze tracking systems commonly state the
spatial resolution of their systems to be between 0.5 and 1.0
degrees of visual angle (see e.g. ). The accuracy is measured by
calculating the error in the gaze position over a set of targets
displayed on the screen. If a gaze tracking system provides an
accuracy of 0.5 degrees, this will be sufficient to hit targets larger
than approximately 20 x 20 pixels at a distance of 50 cm.
However, in a standard user interface like Windows, several of the
interactive elements are smaller than that - down to 6 x 6 pixels.
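The 20 x 20 pixel figure can be checked with a little trigonometry: an angular error of θ at viewing distance d covers roughly d·tan(θ) on the screen. A quick sketch (the pixel pitch of ~0.22 mm is an assumption about a typical monitor, not a value from the text):

```python
import math

def min_target_px(accuracy_deg, distance_cm, pixel_pitch_mm=0.22):
    # on-screen span covered by an angular error of accuracy_deg
    # at the given viewing distance, converted to pixels
    span_mm = 10.0 * distance_cm * math.tan(math.radians(accuracy_deg))
    return span_mm / pixel_pitch_mm
```

With 0.5 degrees at 50 cm this yields roughly 20 pixels, matching the estimate above.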
Compensating zooming tools may provide gaze access to the
small targets, for instance by first enlarging a part of the screen
and then offering a second, final selection within the
enlargement. Several dedicated applications have been designed to support
inaccurate gaze pointing. Some of them employ large screen
buttons, e.g. GazeTalk, and some employ a continuous zoom
built into the selection process, e.g. StarGazer .
Recent studies have looked into long-range gaze interaction with
large displays. Kessels et al.  compared touch and gaze
interaction with transparent displays on shop windows and found
touch to be faster than gaze, while the participants in their study
appreciated the novel experience of interacting with gaze. Sippl et
al.  were able to distinguish which of the four quarters
of the screen people were looking at while standing at different
distances and viewing angles in front of a large display. This
resolution may be enough for e.g. marketing research (with
objects of interest placed in each corner) but it would not be
efficient for gaze interaction with a standard computer interface.
San Agustin et al.  demonstrated how a low-cost gaze tracking
system could effectively track people 1.5 to 2 meters in front of a
55” monitor after a short calibration procedure. They reported
problems tracking people with glasses and some disturbances
from external light sources.
3. METHOD
Twelve participants, six women and six men, ranging in age from
24 to 52 years (M = 33.6 years, SD = 9.7 years), volunteered to
participate in the study. All of them were daily users of computers
and all but one participant had prior experience with gaze
tracking. None of the participants were using glasses, but 3 wore
contact lenses.
A Mac mini computer with an Intel Core2Duo processor running
Windows 7 Professional executed the ITU Gaze Tracker open-
source software (see  for more details and download of this
system). An Optima HD67 projector was placed 3 meters away
from a white wall, creating an image of 140 cm (w) x 110 cm (h)
with a resolution of 1280 pixels (w) x 1024 pixels (h) (Figure 2).
A Modux 4 Lojer nursing bed (210 cm x 90 cm) was standing
between the projector and the wall. The bed was used in two
positions. In the seated position the back was lifted to 45 degrees.
From this position, the distance from the subject’s head to the top
right and left image corners was 270 cm, and 240 cm to the
bottom corners. In the lying position (flat) the distances were 300
cm to the top corners and 260 cm to the bottom corners. A pillow
was offered for comfort.
A Sony HDR-HC5 video camera was mounted on a stand behind
the bed just below the projected image. The camera was then
zoomed in to capture one of the subject’s eyes with night vision
mode and telemacro enabled. The images were sent from the
camera to the computer via a FireWire connection. One Sony IVL
IR lamp was mounted on the end of the bed. Total cost of the
apparatus (excluding the nursing bed) is approximately 2000 €.
Half of the subjects started in a seated position and half of them
started in a lying position. First they conducted the standard
calibration procedure on the ITU Gaze Tracker by looking at 9
points appearing in random order on the screen. The calibration
was redone until the accuracy value reported by the software was
better than 3 degrees. One female participant had to be excluded
from the experiment at this point, since she could not achieve a
satisfactory initial calibration.
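The nine calibration points give the tracker enough samples to fit a mapping from eye-feature coordinates to screen coordinates. A common approach for this kind of tracker (an illustrative assumption here, not the ITU Gaze Tracker's actual code) is a least-squares fit of a second-order polynomial:

```python
import numpy as np

def fit_calibration(eye_xy, screen_xy):
    # least-squares fit of a quadratic mapping from eye-feature
    # coordinates (e.g. pupil-glint vectors) to screen coordinates
    x, y = eye_xy[:, 0], eye_xy[:, 1]
    A = np.column_stack([np.ones_like(x), x, y, x * y, x ** 2, y ** 2])
    coef, *_ = np.linalg.lstsq(A, screen_xy, rcond=None)
    return coef

def map_gaze(coef, ex, ey):
    # apply the fitted mapping to one eye sample
    feats = np.array([1.0, ex, ey, ex * ey, ex ** 2, ey ** 2])
    return feats @ coef
```

A 3 x 3 grid of calibration targets, as used here, provides nine point pairs, which is sufficient to determine the six coefficients per screen axis.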
Immediately after the calibration, participants were told to gaze at
16 points randomly appearing one-by-one in a 4 x 4 grid. Targets
disappeared after a total of 50 samples had been collected at 30
Hz. In the second part of the experiment the positioning of the
subject was changed and the full procedure was repeated (i.e.,
calibration plus 16 measures). The full experiment lasted less than
10 minutes for each participant.
Figure 2: The experimental setup with a subject in a seated
position. The gaze-active display is projected onto the wall. A
video camera standing behind the bed records the eye
movements and an IR light, mounted on the end of the bed,
provides the corneal reflection for the gaze tracking system.
In the evaluation of the system we have based the performance
measures on the recommendations given in the working copy of
the COGAIN report on a standard for measuring eye tracker
accuracy: terms and definitions .
Accuracy, A_deg, is defined as the average angular distance θ_i
(measured in degrees of visual angle) between n fixation locations
and the corresponding fixation targets (Equation 1):

\[ A_{deg} = \frac{1}{n}\sum_{i=1}^{n} \theta_i \qquad (1) \]
The spatial precision is calculated as the Root Mean Square
(RMS) of the angular distances θ_i (measured in degrees of visual
angle) between successive samples (x_i, y_i) and (x_{i+1}, y_{i+1})
(Equation 2):

\[ \theta_{RMS} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \theta_i^2} \qquad (2) \]
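In code, the two measures amount to a mean and an RMS over angular distances. The sketch below is an illustration of the definitions, not the COGAIN reference implementation:

```python
import math

def accuracy_deg(angular_errors):
    # Equation 1: mean angular distance (degrees) between
    # fixation locations and their corresponding targets
    return sum(angular_errors) / len(angular_errors)

def precision_rms(sample_distances):
    # Equation 2: RMS of angular distances between successive samples
    return math.sqrt(sum(t * t for t in sample_distances) / len(sample_distances))
```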
A within-participant factorial design was employed. Position
(lying and seated) was used as the first independent variable.
Based on our experience, accuracy and precision tend to differ
across the screen area, especially between the middle area and the
top and bottom areas. We treated target locations as the second
independent variable, distinguishing between measures from the
eight targets in the two middle rows of the grid, the four top
targets and the four bottom target locations. Dependent variables
were accuracy and precision. In total, 11 participants performed 2
trials, each consisting of 16 targets giving a total of 352 measures.
In summary the design was:
Position (seated, lying)
Target location (top, middle or bottom row)
We removed a total of 27 outliers found to be 3 standard
deviations above the mean of either accuracy or precision. We
then conducted an ANOVA on the dependent variables.
4. RESULTS
The grand mean of accuracy was 1.17 degrees (SD = 0.84
degrees). There was a main effect from Position F(1, 10) = 10.98,
p < 0.001. The seated position (M = 0.96 degrees) was
significantly different from the lying position (M = 1.31 degrees).
There was no effect from top, middle and bottom target locations
and no interaction effects. Accuracy was correlated with the initial
accuracy value reported by the gaze tracker right after calibration,
r = 0.46 (Pearson product-moment).
The grand mean of precision was 0.73 degrees (SD = 1.27).
Again, there was a main effect from Position F(1, 10) = 6.95, p <
0.01. The seated position (M = 0.47) was significantly different
from the lying position (M = 0.89). There was a main effect from
target locations F(2, 10) = 3.10, p < 0.05. The Scheffe post-hoc
test showed the precision for the top row targets (M = 0.99) to be
significantly lower than the precision for the bottom (M = 0.51), p
< 0.05, while the middle rows (M = 0.71) were not significantly
different from either.
Figure 3 shows the effect of position on accuracy and precision
from two of the subjects. Gaze samples in the lying position are
noticeably more spread and less accurate than in the seated
position.
Figure 3: Data from two subjects (right and left
column) in the seated (top) and lying (bottom)
positions.
5. DISCUSSION
The average accuracy of 1.17 degrees is poorer than what most
commercial systems claim to offer. However, in the present setup
the eye has to be recorded from a rather long distance (approx. 2
meters) and not the usual 50 cm. Also, we conducted the
experiment with a standard video camera and not a high-
performance machine vision camera like the ones normally used
in commercial systems. Under these circumstances we were able
to obtain a spatial resolution sufficient for the gaze tracking
system to support interaction with a range of applications that
have high tolerance to noise, or even with a Windows environment
when additional zoom-selector tools are used.
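To put the 1.17-degree figure into perspective, the angular error can be converted into an on-wall offset using the geometry of the setup (the 260 cm viewing distance and the 140 cm / 1280 px image width from the apparatus description):

```python
import math

def offset_on_wall_px(error_deg, distance_cm, image_w_cm=140.0, image_w_px=1280):
    # convert an angular gaze error into centimeters on the wall,
    # then into projected pixels
    offset_cm = distance_cm * math.tan(math.radians(error_deg))
    return offset_cm * image_w_px / image_w_cm
```

At 260 cm, an error of 1.17 degrees corresponds to roughly 5.3 cm on the wall, i.e. close to 50 projected pixels, so interactive elements should be on the order of 100 pixels wide to be comfortably selectable.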
Both accuracy and precision were influenced by the user’s
position. Raising the back 45 degrees would improve system
performance considerably. While some people can easily be lifted
to this position, the clinical conditions of others may require them
to remain lying flat. In that case, we may consider projecting onto
a white canvas hanging slightly tilted from the ceiling above the
bed, with IR lights mounted on the frame of the canvas (Figure
4A), and/or using a smaller camera placed close to the user's eye
(Figure 4B).
Figure 4A: A canvas hanging
from the ceiling with a gaze
interactive image projection.
Figure 4B: A web camera
mounted on a flexible arm
close to the user’s eye.
Precision turned out to be lower for the upper part of the display
compared to the lower part. No previous studies that we are aware
of have looked into the impact that viewing angle may have on
spatial resolution. It is likely that viewing angle may be of
importance for systems that determine the point-of-regard by
tracking the position of a glint relative to the centre of the pupil
because the pupil will appear more elliptic when seen by a camera
from a low angle. The camera is also more likely to capture
disturbing IR reflections from the eyelids when viewing from a
low angle.
6. FUTURE WORK
Gaze control of smart home technology (e.g. lights, bed
adjustment, television and music player) can make a paralyzed
person more self-sufficient by offering control of appliances
connected to a PC . Video projections onto walls may further
extend the environmental control to several audio-visual media
sources running simultaneously. For instance, a person in a
hospital bed might like to have both a digital photo slide show
and, at the same time, take part in a videoconference running with
his family at home. Furthermore, he needs advanced control of the
lights in the room for the projections and the outgoing video
signals to work well. In his seminal paper, Bolt  envisioned
gaze-orchestrated control of dynamic windows:
“Some of the windows come and go, reflecting their
nature as direct TV linkages into real-time, real-world events.
Others are non-real-time, some dynamic, others static but capable
of jumping into motion. Such an ensemble of information inputs
reflects the managerial world of the top-level executive of the not
too distant electronic future.” (p. 109)
We believe that Bolt’s vision of the future can now be deployed
with affordable technology supporting communication and
entertainment needs for people bound to bed. To explore this, we
are building a full-scale mock-up of a hospital room equipped
with various media and smart home technology that are to be
controlled by gaze only.
7. ACKNOWLEDGMENTS
Our thanks to Mr. and Ms. Fujisawa for their hospitality and advice
regarding the needs of people with ALS/MND. This work was
supported by The Danish Research Council (grant numbers 09-
075700 and 2106-080046).
8. REFERENCES
 Päivi Majaranta and Kari-Jouko Räihä. 2002. Twenty years
of eye typing: systems and design issues. In Proceedings of
the 2002 symposium on Eye Tracking Research &
Applications (ETRA '02). ACM, New York, NY, USA, 15-
 Duchowski, A. T. 2002. A breadth-first survey of eye-
tracking applications. Behavior Research Methods,
Instruments, & Computers 34, 455–470.
 Salvucci, D. D. and Goldberg, J. H. 2000. Identifying
fixations and saccades in eye-tracking protocols. In
Proceedings of the 2000 symposium on Eye Tracking
Research & Applications. ACM, Palm Beach Gardens,
Florida, United States, 71–78.
 Henrik Skovsgaard, Julio C. Mateo, John Paulin Hansen.
2011. Evaluating gaze-based interface tools to facilitate
point-and-select tasks with small targets. Behaviour &
Information Technology, 2011 (accepted).
 A.; Hansen, J. P.; Itoh, K. 2008. Learning to
interact with a computer by gaze. Behaviour &
Information Technology, Volume 27, Number 4, July 2008,
 Dan Witzner Hansen, Henrik H. T. Skovsgaard, John Paulin
Hansen, and Emilie Møllenbach. 2008. Noise tolerant
selection by gaze-controlled pan and zoom in 3D. In
Proceedings of the 2008 symposium on Eye Tracking
Research & Applications (ETRA '08). ACM, New York, NY,
USA, 205-212.
 Angelique Kessels, Evert Loenen, and Tatiana Lashina.
2009. Evaluating Gaze and Touch Interaction and Two
Feedback Techniques on a Large Display in a Shopping
Environment. In Proceedings of the 12th IFIP TC 13
International Conference on Human-Computer Interaction:
Part I (INTERACT '09), Springer-Verlag, Berlin,
 Andreas Sippl, Clemens Holzmann, Doris Zachhuber, and
Alois Ferscha. 2010. Real-time gaze tracking for public
displays. In Proceedings of the First international joint
conference on Ambient intelligence (AmI'10), Springer-
Verlag, Berlin, Heidelberg, 167-176.
 Javier San Agustin, John Paulin Hansen, and Martin Tall.
2010. Gaze-based interaction with public displays using off-
the-shelf components. In Proceedings of the 12th ACM
international conference adjunct papers on Ubiquitous
computing (Ubicomp '10). ACM, New York, NY, USA, 377-
 Fulvio Corno, Alastair Gale, Päivi Majaranta, and Kari-Jouko
Räihä. 2010. Eye-based Direct Interaction for
Environmental Control in Heterogeneous Smart
Environments. In Handbook of Ambient Intelligence and
Smart Environments, Part IX, 1117-1138.
Springer Science+Business Media.
 Richard A. Bolt. 1981. Gaze-orchestrated dynamic windows.
In Proceedings of the 8th annual conference on Computer
graphics and interactive techniques (SIGGRAPH '81).
ACM, New York, NY, USA, 109-119.