A new procedure to increase face matching accuracy in forensic face examination. By Harriet M. J. Smith, Sally Andrews, David White, Josh P. Davis, Melissa F. Colloff, Thom S. Baguley, & Heather D. Flowe
Today I’m going to talk about a novel procedure we’ve developed to help with face matching.
You could be forgiven for believing that humans are extremely good at recognising and perceiving faces.
But although we might be able to recognise friends and family from a fleeting glimpse or under poor viewing conditions, unfamiliar face recognition and perception is surprisingly error-prone.
The task of trying to decide if different facial images feature the same unfamiliar person is more difficult than you might think.
This task is referred to as face matching. In an experiment, participants would be asked to compare one image to another image or series of images and decide whether there is an identity match.
Error rates commonly vary between 10 and 30%, which is pretty worrying when you consider that security decisions rely on unfamiliar face matching on a daily basis. It’s more worrying still that even passport officers, who earn a living from face matching, commit relatively frequent errors. And length of time in service makes no difference.
We were interested in whether there might be ways in which we could try to improve performance. In order to answer this question it’s important to think about why unfamiliar face matching is difficult.
One of the reasons is that the same person can look very different on different occasions. If you don’t know someone, it can be tricky to decide whether two images look different because they feature different people (between-person variability) or because the images capture within-person variability.
Face matching is easy for familiar faces because you have knowledge about how that face would vary across different instances (pose, lighting, orientations etc). This means that you can separate transient from stable within-person differences. But if it’s an unfamiliar face, this knowledge is lacking.
Menon et al. (2015) looked at what happens when information about variability is provided. They asked participants to match either a high-variability or a low-variability pair of images to a probe. Accuracy was higher for the high-variability pairs, which really underlines the importance of variability information.
But are there other ways in which we could try and optimise performance on this task?
One of the ways in which facial images vary is in terms of orientation. A single image of the face provides no information about 3D structure.
We know that differences in viewpoint from study to test can undermine recognition accuracy.
But what about when orientation information can be used to help inform a match?
Perhaps surprisingly, Kramer and Reynolds (2018) found no benefit when both frontal and profile views were provided. But they suggest that this may be because there was no mental integration of these images. So even though both the frontal and profile views could be compared to another image, this still didn’t allow viewers to build a 3D, view-independent representation of the face.
One way in which we could help participants to build a 3D, view-independent representation is by showing the face moving from side to side. This is one of the things we looked at in this study.
Another way in which we might be able to help boost performance is by allowing participants to interact with the face. By this I mean manoeuvring it into any viewpoint they want in order to make a matching decision.
Such a system should facilitate comparisons because the face can be viewed in the same position as the to-be-compared image, which reduces within-person variability.
It should also confer some of the benefits of familiar face processing by facilitating the building of a 3D, view-independent representation.
Furthermore, the participant will be less passive, which might support performance.
There’s a wide range of individual differences in unfamiliar face recognition and matching.
Super-recognisers are people with natural face recognition and face perception skills. The two abilities are related, but not always.
How are super-recognisers different? They differ especially in terms of discrimination, and they are also more confident.
An equally important question is what they are doing differently. There is limited evidence that SRs might rely more on holistic information. However, Bobak et al. (2015) have suggested that SRs may be better at the structural encoding of faces than typical perceivers. This might help them to build a view-independent representation.
However, to date there hasn’t been much work on the basis of differences between normal and super-recognisers. So it seemed sensible, in putting forward a new matching procedure based on interactivity, that we should investigate patterns of performance in normal and super recognisers separately.
Our aims in this study were as follows:
To test how accurately people perform one-to-one face matching tasks when provided with different types of orientation information. And furthermore, if orientation information affects performance, is that to do with fluid movement or interactivity?
We were interested in the relationship between confidence and accuracy.
We were also interested in individual differences so we tested both people with normal face recognition skills in Experiment 1, and those with superior face recognition skills in Experiment 2.
In this experiment we manipulated identity: the correct answer was same in half of the trials, and different in the remaining trials.
We also manipulated the to-be-compared image type. Whereas face 1 was always a static frontal image, face 2 was either a static frontal image, a series of static images of the face at different orientations, a video of the face moving from side to side, or it was interactive. In the interactive condition participants could move the face into different orientations using the computer mouse.
Participants were recruited online with the help of Josh Davis at the University of Greenwich. Thousands of participants across the whole range of face recognition ability have taken part in experiments for Josh in the past. Many have agreed to take part in subsequent experiments. We invited people who had previously completed the CFMT+.
The CFMT+ is an extension of the Cambridge Face Memory Test (CFMT). It features particularly difficult trials, and is used to confirm super-recognition.
We invited people who had scored 92 or below on this test.
At the beginning of Experiment 2 I will address why we used 92 as a cut-off. For now what is important is that this sample does not include people who could be argued to have attained SR scores.
The stimuli were taken from the University of New South Wales Unfamiliar Face and Voice database. This database has a series of Facebook images for each individual, as well as a high-quality video of the face moving from side to side.
From this video we were able to construct the stimuli for the different orientation conditions.
Each participant completed 94 trials.
In this plot I’m going to show accuracy in each of the conditions. Same identity performance is shown by black dots, and different identity performance is shown by white dots.
This is the frontal condition. Error bars are 95% CIs.
We analysed the results using multilevel modelling. To do that we used the lme4 package in R.
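As a rough sketch of what a model of this kind looks like: assuming accuracy on each trial is a binary outcome, and assuming random intercepts for both participants and stimuli (the exact random effects structure is not stated in this talk, so the terms below are illustrative), the model for participant p responding to stimulus s could be written as:

```latex
\operatorname{logit} \Pr(\text{correct}_{ps} = 1)
  = \beta_0
  + \alpha_{\text{identity}}
  + \gamma_{\text{image}}
  + (\alpha\gamma)_{\text{identity},\,\text{image}}
  + u_p + w_s,
\qquad
u_p \sim \mathcal{N}(0, \sigma_u^2), \quad
w_s \sim \mathcal{N}(0, \sigma_w^2)
```

Here the identity and image type terms are the fixed effects, the product term is their interaction, and u_p and w_s capture variability between participants and between stimuli respectively.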
There was a main effect of identity. Higher accuracy on same identity trials sits well with the previous literature (White et al., 2014).
There was a main effect of image type, which suggests that to some extent, orientation information supports matching performance.
Worth saying that descriptively speaking the most accurate performance is observed in the interactive condition.
However, there was an interaction between image type and identity. It’s clear that the basis of this is that in the interactive condition, accuracy on different identity trials is higher. And accuracy on same identity trials is lower.
So perhaps interactivity makes between-person variability more salient, and therefore supports performance on different identity trials. It looks like there might be a shift in bias – in the interactive condition, participants are more conservative.
We haven’t done signal detection analyses yet because this data requires a multilevel approach. Traditionally, signal detection would involve aggregating over stimuli. We want to avoid this because there is variability at the stimulus level, which needs to be taken into account. We think we’ve found a way of doing multilevel signal detection though so we’re working on that at the moment.
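To make the idea of a shift in bias concrete, here is a minimal sketch of the traditional, aggregated signal detection calculation. It is not the multilevel approach described above, it is written in Python purely for illustration, and the counts are hypothetical rather than data from the study. It assumes that "same" trials are treated as signal trials, and the function name `sdt_measures` is made up for this example.

```python
from statistics import NormalDist


def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) for a same/different matching task.

    Assumption: 'same' trials are the signal trials, so a hit is a correct
    'same' response and a false alarm is a 'same' response to a
    different-identity pair.
    """
    # Log-linear correction: add 0.5 to each cell so that rates of exactly
    # 0 or 1 do not produce infinite z-scores.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)

    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)             # sensitivity
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # response bias
    return d_prime, criterion


# Hypothetical counts, for illustration only (not data from the study).
static = sdt_measures(hits=45, misses=5, false_alarms=15, correct_rejections=35)
interactive = sdt_measures(hits=40, misses=10, false_alarms=5, correct_rejections=45)

print("static:      d' = %.2f, c = %.2f" % static)
print("interactive: d' = %.2f, c = %.2f" % interactive)
```

With this coding, a positive criterion means a conservative bias, that is, a tendency to respond "different". Aggregating counts like this ignores variability at the stimulus level, which is exactly the limitation that motivates a multilevel version.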
We used the ordinal package in R to analyse confidence data, but we had problems with the model converging, so currently we’re re-running all the analyses using Bayesian regression models with the brms package in R. But for now I’ll show you the descriptive statistics.
Here we have accuracy in each of the conditions. Again, same identity trials are shown by black dots and different identity trials are shown by white dots.
We tested whether, in each of the conditions, confidence predicted accuracy, and it did.
Another way of looking at the relationship between confidence and accuracy is to look at the probability of incorrect and correct matches for each level of self-rated confidence. That’s what we did here for each of the image conditions.
Here is the frontal condition. On the left we have incorrect responses, and on the right we have correct responses. A strong relationship between confidence and accuracy would be depicted by higher probability of an incorrect match at lower levels of confidence (left plots), and higher probability of a correct match at higher levels of confidence (right plots). You can see that while the correct responses conform to this pattern, the incorrect responses don’t.
As is clear, this relationship is driven by the correct responses. Remember that most of the data will be captured in the right hand plots (accuracy was pretty high).
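The descriptive summary behind plots like these can be sketched as follows. This is one reading of the analysis, not necessarily the exact computation used: for correct and incorrect responses separately, it takes the proportion of responses given at each confidence level. The 1 to 5 confidence scale, the function name `confidence_distribution` and the trial data are all assumptions made for illustration.

```python
from collections import Counter


def confidence_distribution(trials, levels):
    """Proportion of responses at each confidence level, computed
    separately for correct and incorrect responses.

    `trials` is an iterable of (confidence, correct) pairs.
    """
    counts = {True: Counter(), False: Counter()}
    for confidence, correct in trials:
        counts[correct][confidence] += 1

    result = {}
    for correct, label in ((True, "correct"), (False, "incorrect")):
        total = sum(counts[correct].values())
        result[label] = {
            level: (counts[correct][level] / total if total else 0.0)
            for level in levels
        }
    return result


# Hypothetical (confidence, correct) pairs on an assumed 1-5 scale.
trials = [
    (5, True), (5, True), (4, True), (3, True),
    (2, False), (4, False), (5, True), (1, False),
]
dist = confidence_distribution(trials, levels=range(1, 6))

for label, proportions in dist.items():
    print(label, proportions)
```

A strong confidence-accuracy relationship would show the "correct" proportions piling up at the high end of the scale and the "incorrect" proportions piling up at the low end.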
Again, all of the participants were recruited online with the help of Josh Davis. This time though, we wanted people who had done particularly well on the CFMT+. Worth saying that our sample mean is similar to the means reported in other studies (Russell et al., 2012 = 95.0; Bobak et al., 2015 = 95.71).
However, we are aware of the debate about the cut-off for super-recognition. We have not adhered to the cut-off of 95 set by Bobak, so we refer to this sample as ‘superior’ recognisers rather than super-recognisers. In any case the term SR would normally be reserved for those who had taken a series of neuropsychological tests.
(We also acknowledge that super-recognition is heterogeneous – just because someone is good at face memory doesn’t necessarily mean that they will be good at face matching. However, taking the scores from Exp 1 and 2 together, there was a moderate positive correlation. The correlation coefficient was .46.)
We used multilevel modelling to analyse the results.
There was a main effect of identity. Accuracy was higher on different identity trials, which is the opposite pattern to normals. Bobak et al. (2016) also found that, in comparison to normals, SRs tended to be quite conservative.
There was no main effect of image type, but there was an interaction between identity and image type. The moving and interactive conditions support SR performance. There doesn’t seem to be a bias to respond different in these conditions; performance on same identity trials is more accurate, but performance on different identity trials does not suffer.
Again, we tested whether, in each of the conditions, confidence predicted accuracy, and it did.
The patterns observed here are similar to those observed in Experiment 1. The relationship between confidence and accuracy is driven by correct responses.
To help us draw conclusions, here we have the matching performance of normal and superior face recognisers presented side by side.
So by default, perhaps normals are looking for similarities, whereas superiors are looking for differences, which explains higher accuracy on same ID trials for normals, and higher accuracy on different identity trials for superiors.
However, interactivity seems to shift the focus for normals, and highlights differences, whereas for superiors the introduction of fluid movement (present in both the video and interactive conditions) helps to highlight similarities.
Next step – ask normal and superior face recognisers to focus on either similarities or differences in static and interactive conditions.
Should we recommend interactivity in an applied setting? It depends what is important – is it more dangerous to wrongly conclude that two images of faces belong to the same person, or that they belong to different people? What will have the most serious repercussions? If it’s the former, it would be difficult to argue that interactive isn’t the best.
These are preliminary findings…
A novel interactive face matching procedure: Performance of normal and super face recognizers
11 October 2018 2
Are we really that good at recognising/perceiving faces?
Unfamiliar face matching is error prone (Bruce et
Passport officers – 10% errors (White et al., 2014)
Why is face matching difficult?
Between-person and within-person variability
Unfamiliar face matching accuracy improves when information about variability is available (Menon et al., 2015)
Faces look different from different viewpoints
Recognition memory – differences in viewpoint affect accuracy
Less of an effect for face matching? (Estudillo & Bindemann, 2014 but see Bruce et al., 1999; Hill & Bruce, 1996)
No benefit when both frontal and profile views were provided (Kramer & Reynolds, 2018). 3D view-independent representation?
Images from UNSW Unfamiliar Face and Voice Database (White, Burton &
What if the participant could interact with the face, and manoeuvre it into any viewpoint?
- Facilitate building of 3D view independent representation?
- Encourage engagement deep encoding?
Normal vs super recognisers
Individual differences in unfamiliar face recognition and matching (e.g. Davis et al., 2016; Bobak et al., 2015; Russell et al., 2009)
Super-recognisers – natural face recognition/face perception skills
What are SRs doing differently? (see Bobak et al., 2015)
- Holistic information?
- Structural encoding view-independent representation?
Effect of orientation information/movement/interactivity?
Experiment 1: ‘Normal’ face recognisers
Experiment 2: ‘Superior’ face recognisers
Results: Confidence and accuracy
Normal and superior face recognisers: different patterns of
Interactive makes ‘normal’ face recognisers behave like ‘superior’ face recognisers when viewing static images
Should we recommend interactivity in an applied setting?
Thank you for listening
Harriet M. J. Smith