The repeated viewing of a suspect’s face by an eyewitness during the commission of a crime and subsequently when presented with
suspects in a photo lineup provides a real-world scenario where Noton and Stark's 1971 "scanpath theory" of visual perception and memory can be tested. Noton and Stark defined "scanpaths" as repetitive sequences of fixations and saccades that occur during exposure, and subsequently upon re-exposure, to a visual stimulus, facilitating recognition. Ten subjects watched a video of a staged theft in a parking lot. Scanpaths were recorded for the initial viewing of the suspect's face and a later close-up viewing of the suspect's face in the video, and then on the suspect's face when his picture appeared 24 hours later in a photo lineup constructed by law enforcement officers. These scanpaths were compared using the string-edit methodology to measure resemblance between sequences. Preliminary analysis showed support for repeated scanpath sub-sequences. In the analysis of four clusters of scanpaths, there was little within-subject resemblance between full scanpath sequences, but seven of 10 subjects had repeated scanpath sub-sequences. When a subject's multiple scanpaths across the suspect's photo in the lineup were compared, instances of within-subjects repetition of short scanpaths occurred more often than expected by chance.
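As a concrete illustration of the string-edit methodology described above, the comparison can be sketched in Python: an equal-cost Levenshtein alignment normalized by the length of the longer sequence, as detailed later in the paper. The AOI letters and example sequences here are invented for illustration; they are not the study's data.

```python
# Sketch of the string-edit (Levenshtein) comparison of scanpath sequences.
# AOI letters and example sequences are illustrative, not the study's data.

def levenshtein(a: str, b: str) -> int:
    """Minimum number of substitutions, insertions and deletions
    (all at equal cost 1) needed to turn sequence a into sequence b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def scanpath_distance(a: str, b: str) -> float:
    """Alignment cost divided by the longer sequence's length, giving
    0 for identical sequences and 1 for maximally dissimilar ones."""
    if not a and not b:
        return 0.0
    return levenshtein(a, b) / max(len(a), len(b))

# Example with AOI letters as in the paper ("E" right eye, "H" nose, "M" mouth)
print(scanpath_distance("EHM", "EHM"))  # identical sequences -> 0.0
print(scanpath_distance("EHM", "EHH"))  # one substitution over length 3, about 0.33
```

Equal substitution costs for all AOI pairs mirror the cost scheme the study adopts from Brandt and Stark [1997].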
3 Contribution of This Study

While several researchers such as Brandt and Stark [1997], Salvucci and Anderson [2001], and Josephson and Holmes [2002] have used string-edit methods to study eye-path sequences, this appears to be the first study to test scanpath theory using string-edit methods in an eyewitness identification situation.

4 Method

4.1 Participants

The participants were 10 college students recruited from a medium-sized university in the western United States. The participants included seven Caucasian men and three Caucasian women. Their average age was 26.6 years (SD = 6.65).

4.2 Stimulus Materials

Participants were first asked to watch a 45-second video of a property theft in a parking lot while their eyes were tracked. The video showed a young Caucasian man sitting in a car witnessing a crime being committed by another young Caucasian man. The men were actors who were paid for their participation. They were selected because they did not possess any unique facial features.

A photo lineup was constructed by Caucasian law enforcement officials responsible for producing photo arrays on a daily basis in a medium-sized city in the western United States. A mug shot of the actor was produced, copying the style and quality of actual mug shots taken in the police department in this city. It was then placed among five other photos in a photo array with three photographs on the top row and three photographs on the bottom. See Figure 1.

Figure 1: Example of photo lineup with suspect in E position.

4.3 Data Collection

Participants were instructed to "carefully" watch the video as if they were the person shown sitting in a car witnessing what was happening. The video was shown on a 15-inch laptop screen at a resolution of 1280 by 800 pixels and 32-bit color. It was viewed at a normal distance of about 18 to 24 inches under ideal indoor lighting conditions.

The next day, approximately 24 hours later, participants returned to the eye-tracking laboratory. They listened to these instructions: "Yesterday you witnessed a theft in a parking lot. Now you are going to be shown a photo lineup constructed by law enforcement officers. The thief may or may not be present. Look carefully at the six men in the photo lineup. Take your time. When you are as certain as you can be that you have identified the thief or that the thief is not present in the photo lineup, say 'OK.'"

4.4 Eye-Tracking Apparatus

Eye-movement data were collected using an ISCAN RK-426PC Pupil/Corneal Reflection Tracking System [ISCAN, Inc. Operating Manual 1998], which uses a corneal reflection system to measure the precise location of a person's eye fixations when looking at a visual display – in this case a six-person photo lineup – on a computer monitor. The eye-tracking system does not require participants to wear head gear. It uses a real-time digital image processor to automatically track the center of the pupil and a low-level infrared reflection from the corneal surface [ISCAN, Inc. Operating Manual 1998]. The system collects data at 60 Hz, or about every 16.7 milliseconds. For the purposes of this study, we did not analyze fixations but looked at the eye-path trace across the visual stimuli.

4.5 Sequence Comparison

The next step was to define the eye-path sequence for each participant's two main viewings of the suspect's face on the stimulus video (see Figure 2) and then the often multiple eye-path sequences across the suspect's face in the photo lineup (see Figure 1). For example, a viewing beginning with a single trace over the right eye designated as area "E" followed by traces over the nose "H" and the mouth "M" would generate a sequence beginning "EHM". For the purposes of this study, sequences were characterized by the presence of the eye-path trace within a defined area of interest (AOI). AOIs included facial features such as eyes, eyebrows, nose, mouth, cheeks, ears, forehead and hair. Each pass over an AOI, regardless of duration, was represented by a single element in the sequence.

Figure 2: The eye-path sequence for Subject 4 as he scanned the face of the suspect in the close-up in the video.

Optimal matching analysis (OMA) was used to compare two coded sequences in the video (the first appearance of the suspect and a later close-up) and the multiple coded sequences across the
thief's photo in the lineup to generate a matrix of distance indexes for these eye-path sequences. OMA is a string-edit tool for finding the Levenshtein distance between any sequences composed of elements from the same sequence alphabet. Sequences similar in composition and order will, when compared, have smaller distances; the more different the sequences, the greater the distance. To adjust for the role of sequence length in the total cost of alignment, the inter-sequence distance is determined by dividing the total alignment cost by the length of the longer sequence in the pair. Given this normalization of the distance index by the length of the longer of the two sequences, the distance index can range from 0 for identical sequences to 1 for maximally dissimilar sequences. Examples, illustrations and applications of OMA can be found in Kruskal [1983].

Alignments may use a combination of substitutions, insertions and deletions to produce the Levenshtein distance. Brandt and Stark [1997] set equal substitution costs for all pairs of sequence elements. Similarly, in this study equal substitution costs were used because there was no compelling reason to differentiate those costs given the similarity of the images and their close arrangement to each other.

5 Sequence Comparison Results

SPSS software, version 15 [SPSS 2007], was used to perform hierarchical cluster analysis with Ward's method (CLUSTER) and non-metric multidimensional scaling (PROXSCAL) on the scanpath distance matrices generated by comparing the initial viewing of the suspect's face in the video, the close-up of the suspect's face in the video, and the first scan of the thief's face in the photo lineup. It was also used to compare the multiple scanpaths across the suspect's photo in the lineup. Scaling arranges sequences in n-dimensional space such that the spatial arrangement approximates the distances between sequences; cluster analysis defines "neighborhoods" of similar cases within that space.

5.1 Cluster Analysis

Dendrograms and agglomeration schedules for the hierarchical clustering of the scanpath sequence distance matrix were examined to identify clustering stages which yielded a sharp increase in distance coefficients. Review of the results yielded four clusters at stage 25 of the 39-stage clustering. The member sequences of the four clusters are listed and described in Table 1. Sequences are identified by case number and sequence type. "A" sequences are composed of the scanpath AOI sequence over the short, initial glimpse of the thief's face in the video. "B" sequences are composed of the scanpath AOI sequence over the longer first close-up of the thief's face in the video, and "C" sequences are composed of the first scanpath AOI sequence over the thief's face in the photo lineup. Hence "2A" is the first video scanpath for Subject 2.

The mean sequence lengths in Table 1 suggest sequence length is associated with cluster membership. This is unsurprising given the role of length in contributing to the Levenshtein distance. However, visual examination of the sequences reveals content differences as well as length differences, with each cluster showing at least one content pattern of interest.

Cluster 1: (length range 1 to 4, mean 2.7)
Members: 4A, 6A, 6B
Short sequences beginning with gazes to the mouth or chin.

Cluster 2: (length range 2 to 6, mean 3.9)
Members: 2B, 4B, 7A, 7C, 9B, 11B, 11C, 13A, 13B, 13C
Sequences tend to include glances at the nose and left cheek.

Cluster 3: (length range 1 to 12, mean 4.1)
Members: 1A, 3A, 3C, 4C, 5A, 5B, 9A, 12B
Sequences tend to include glances at hair and the right eye.

Cluster 4: (length range 2 to 7, mean 4.8)
Members: 2C, 3B, 5C, 6C, 7B, 9C, 11A, 12A, 12C
Sequences tend to include mid-sequence gazes at the bridge of the nose and middle or late glances at the right eye.

Table 1: Clusters for two viewings of the suspect in the video (A and B), as well as the first scan across the suspect's face in the photo lineup (C).

"A" sequences are naturally constrained by the duration of the thief's face's appearance in the video (as are "B" sequences, though somewhat less so). "C" sequences are less constrained due to the unchanging nature of still images. However, while the longest sequences tended to be generated by the photo lineup, all three sequence types have instances as short as one or two scanpath regions.

Sequence length appears to be associated with cluster membership, with the shortest average sequence length in Cluster 1 and the longest average sequence length in Cluster 4. However, sequence content also drives the clustering. Key content differentiators are noted in Table 1.

Finally, seven of 10 subjects had repeated scanpath sub-sequences within the clusters. Cluster 1 contains two scanpaths from Subject 6; Cluster 2 has two scanpaths from Subjects 7 and 11, and three from Subject 13; Cluster 3 has two scanpaths from Subjects 3 and 5; and Cluster 4 contains two scanpaths from Subject 12. Thus this preliminary analysis showed support for repeated scanpath sub-sequences.

5.2 MDS Results

Scree plots of stress for one- to five-dimension MDS solutions revealed an elbow in the stress curve at two dimensions, yielding a solution with an acceptable level of stress (stress = .10). The two-dimensional MDS solution is displayed in Figure 3 with the cluster analysis results superimposed.

While neither dimension of the MDS solution produced a significant correlation with sequence length (Dim 1, r = 0.26, p = .89; Dim 2, r = -.14, p = .45), the association of length to sequence groupings and placement in the MDS solution is apparent when looking from Cluster 1, in the upper left of the plot, to Cluster 4, in the lower right of the plot. Sequence length
varies within clusters but shows a general trend from shorter to longer from the upper left to lower right. While small sequence lengths and group sizes preclude statistical testing for relationships between placement in the MDS solution and sequence content, an assumption can be made that the dispersion on the axes results from a combination of sequence length and sequence composition differences.

Figure 3: Sequence comparison MDS solution in two dimensions with cluster results superimposed.

5.3 Transition Pair Frequency

Adjacent pairs of AOI glances constitute the smallest possible sub-sequences of the full scanpaths. We tabulated all transition pairs without regard to AOI order in the pair (i.e., a hair-to-forehead transition is treated as equal to a forehead-to-hair transition). Table 2 lists all such pairs occurring at least four times in the full data set. Each listed pair appears at a greater frequency than predicted by the simple joint probability of its component glances. Much of the repetition is repeated occurrences of a pair within a single scanpath rather than across multiple scanpaths within or between subjects.

Transition pairs of            Expected      Observed
facial features                probability   frequency
right eye to bridge of nose    1.34%         12.1%
left cheek to nose             1.58%          7.7%
right eye to right cheek       1.22%          6.6%
hair to forehead               0.40%          5.5%
nose to upper lip              0.67%          5.5%
nose to right eye              2.14%          4.4%
left eye to bridge of nose     0.50%          4.4%
left cheek to left eye         0.70%          4.4%
mouth to upper lip             0.30%          4.4%

Table 2: Transitions between AOI pairs occurring at least four times.

6 Conclusion

Preliminary analysis showed support for repeated scanpath sub-sequences. Six of 10 subjects had two repeated scanpath sub-sequences that clustered. These two scanpaths consisted of the two frontal viewings of the suspect in the video or one of these scanpaths with the initial scanpath across his picture in the photo lineup. One subject's scanpaths for all three viewings were grouped within one cluster. When a subject's multiple scanpaths across the suspect's photo in the lineup were compared, nine types of within-subjects repetition of short scanpaths occurred more often than expected by chance, also adding support for repeated scanpath sub-sequences. While this research is preliminary in nature, it shows limited support for Noton and Stark's [1971a, 1971b] "scanpath theory" of visual perception and memory in an eyewitness identification situation.

References

BRANDT, S. A. AND STARK, L. W. 1997. Spontaneous Eye Movements During Visual Imagery Reflect the Content of the Visual Scene. Journal of Cognitive Neuroscience 9, 1, 27-38.

INNOCENCE PROJECT. 2009. Innocence Project Home. Retrieved October 10, 2009, from http://www.innocenceproject.org.

ISCAN, INC. 1998. RK-726PCI Pupil/Corneal Reflection Tracking System PCI Card Version Operating Instructions. Rikki Razdan.

JOSEPHSON, S. AND HOLMES, M. E. 2002. Attention to Repeated Images on the World-Wide Web: Another Look at Scanpath Theory. Behavior Research Methods, Instruments, & Computers 34, 4, 539-548.

KRUSKAL, J. B. 1983. An overview of sequence comparison. In D. Sankoff and J. B. Kruskal (Eds.), Time warps, string edits, and macromolecules: the theory and practice of sequence comparison, 1-44. Addison-Wesley, Reading, MA.

NOTON, D. AND STARK, L. W. 1971a. Scanpaths in Saccadic Eye Movements While Viewing and Recognizing Patterns. Vision Research 11, 929-942.

NOTON, D. AND STARK, L. W. 1971b. Scanpaths in Eye Movements During Pattern Perception. Science 171, 308-311.

SALVUCCI, D. D. AND ANDERSON, J. R. 2001. Automated Eye-Movement Protocol Analysis. Human-Computer Interaction 16, 1.

SPSS INC. 2007. SPSS Base 15.0 for Windows User's Guide. SPSS Inc.
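The transition-pair tabulation of Section 5.3 can be sketched in Python as follows. The scanpath strings are invented for illustration, and the "simple joint probability" computation shown is one plausible reading of the paper's description (expected probability of an unordered pair from the overall frequency of each AOI), not the authors' exact formula.

```python
# Sketch of Section 5.3's transition-pair tabulation; scanpath strings
# are invented for illustration, not the study's data.
from collections import Counter

def transition_pairs(scanpaths):
    """Count adjacent AOI pairs, ignoring order within the pair
    (a hair-to-forehead transition equals a forehead-to-hair one)."""
    counts = Counter()
    for path in scanpaths:
        for a, b in zip(path, path[1:]):
            counts[frozenset((a, b))] += 1
    return counts

def expected_pair_probability(scanpaths, a, b):
    """One plausible 'simple joint probability' of an unordered pair,
    based on each AOI's overall glance frequency (an assumption)."""
    glances = Counter(ch for path in scanpaths for ch in path)
    total = sum(glances.values())
    pa, pb = glances[a] / total, glances[b] / total
    return 2 * pa * pb if a != b else pa * pb

# Example with AOI letters as in Section 4.5 ("E" eye, "H" nose, "M" mouth)
paths = ["EHM", "EHEH", "HME"]
counts = transition_pairs(paths)
print(counts[frozenset("EH")])  # observed E<->H transition count across paths
```

Comparing the observed pair counts against the expected probabilities, as in Table 2, flags pairs that recur more often than their component glance frequencies alone would predict.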