Changing the face of criminal identification

Charlie Frowd, Vicki Bruce and Peter J.B. Hancock describe the latest techniques to construct the face of a criminal
Imagine you are a witness to a crime: you saw a young man running from a bank; it all happened very quickly, but you were able to have a good look at his face as he removed his balaclava. Would you be able to describe his face? There might not be useful CCTV footage from the bank, because of the mask, so would you also be able to make a recognisable image of his face? Until recently the answer to both questions would probably have been ‘no’. But now a 10-year programme of research at the Universities of Stirling, Edinburgh and Central Lancashire is changing things. This article describes this ongoing work, which designs and evaluates improvements to each stage in the process: to the interview, to the system and to the presentation. The police in the UK are now using these techniques.

The difficulty of constructing faces from our memory has been known for over 30 years (e.g. Davies, 1978). We are not good at the tasks required – describing and selecting individual facial features – instead we process faces ‘holistically’, more as a complete entity (e.g. Young et al., 1987). For example, the perception of facial features changes in the presence of other features (e.g. Tanaka & Farah, 1993), and so the features and their position on the face are both important. Modern facial composite systems, where witnesses choose individual features in the context of a complete face, apply this idea to some extent.

There are currently two established systems that the police use to construct ‘facial composites’ of criminal suspects. (Later, we will mention a third kind of system that is currently being introduced.) Firstly, there are forensic artists. These people have skills in portraiture and use pencils or crayons to draw the face by hand. Secondly, there are software packages, of which the UK has two, namely E-FIT and PRO-fit, which offer a selection of ready-made parts to build a face. In both the artist and computerised systems, witnesses select individual facial features – hairstyle, face shape, eyes, brows, nose, etc. The result is a composite of facial parts (see Figure 1 for examples on PDF).

Considerable attention to detail is needed to evaluate the efficacy of a composite system. This is described in our ‘gold standard’ procedure (Frowd, Carson et al., 2005a). In brief, about 10 participants would first be shown an unfamiliar target face. Some time later, these ‘witnesses’ would describe the face to an artist or computer technician, who would use cognitive interviewing techniques to help them recall the face and construct the best possible composite. Technicians must have expertise with the composite system plus good artistic skills to enhance the face (e.g. addition of stubble and wrinkles). Other people who know the targets would then attempt to name the composites.

Using this procedure, when the delay to construction is brief (no more than a few hours), E-FIT and PRO-fit composites are named about 20 per cent of the time on average (e.g. Davies et al., 2000; Frowd, Bruce, McIntyre et al., 2007; Frowd, Carson et al., 2005a). After two days – which is typically the minimum for real witnesses – correct naming is normally just a few per cent (e.g. Frowd, Bruce, Ness et al., 2007; Frowd, Carson et al., 2005b). For artist-composites, it is about 10 per cent and independent of delay (Frowd, Carson et al., 2005a, 2005b). These data suggest that identification is unlikely from composites made using current procedures and systems.

As part of ongoing research, we have designed and evaluated improvements to each stage in the process: to the interview, to the system and to the presentation. As discussed below, one research thread has successfully enhanced the interview; another has provided an alternative system; a third has caricatured the face at presentation. In time, some of these techniques may also help witnesses elsewhere, for example when they try to identify a criminal from a line-up.

Improving the interview
Berman and Cutler (1989) found that participants recognise a face much better after they have made a few personality judgements about it – such as intelligence and attractiveness. If the principle extends to witnesses, then their ability to select facial features may also improve. In two of our experiments (Frowd, Bruce, Ness et al., 2007; Frowd, McQuiston-Surrett et al., 2005), instead of describing a face, participants made seven trait judgements of a target prior to composite construction. Example traits included intelligence, friendliness and arrogance. We found, as predicted, that trait attribution improved the quality of the composites.

However, witnesses normally receive a cognitive interview (CI) to help them recall details of an event and a suspect’s face. The ‘revised’ cognitive interview, as used currently by practitioners, is quite an involved procedure (see Wells et al., 2007 for a review), including: rapport building, to help a witness relax; context reinstatement, to assist recall; free recall, whereby a witness freely describes people and events; and cued recall, to allow clarification and elaboration of specific details. The revised interview also allows for repeated recall from different perspectives and in different temporal orders. This interview is important for locating a subset of facial features within a composite system, or via catalogues of features for sketch artists.

We thought to combine this CI with a recognition-enhancing ‘holistic’ interview. The latter involves witnesses thinking about the personality of the face, which is somewhat like the free recall stage of the CI. They then make seven personality judgements by rating each on a three-point Likert scale (low/medium/high). In an evaluation by Frowd, Bruce, Smith et al. (in press), composites constructed after the combined interview were much better than those after the CI alone. Police composite operators in the UK are now being trained on the combined interview.

Enhancing an existing system
Modern computerised ‘feature’ systems can produce very good-quality composites. In 2000, for example, Graham Davies and his colleagues found an average naming level of 49 per cent from participants constructing E-FIT composites. Unfortunately, in order to achieve this level of performance, participants were both familiar with the target and had a photograph available to refer to. While the result does not mirror real life, it does suggest that feature selection might be improved if witnesses processed the face as if familiar.

Considerable research has shown that we process familiar and unfamiliar faces differently (e.g. Ellis et al., 1979; Young et al., 1985). For familiar faces, the internal features are the most important: the region containing the eyes, brows, nose and mouth. For faces seen a few times, or just once, the external features are more salient, especially the hair and face shape, and these tend to dominate our perception. Consequently, we may misidentify an unfamiliar person when their hairstyle changes. Frowd, Bruce, McIntyre et al. (2007) have also found that the external features of facial composites are better likenesses of a target than the internal features.

It is possible, however, to lessen the perceptual impact of the external features by processing them with a Gaussian filter. This ‘blurring’ technique (see Figure 2) allows the inner face to appear more prominent whilst maintaining a complete face context (important for holistic face processing). The procedure does appear to help participants select facial features: in two small-scale studies, we found that the quality of the internal features was significantly better in composites made with blurring than without (e.g. Frowd, Park et al., in press). Ongoing research is exploring the potential of the technique in a larger, more realistic composite study, as well as a general aid to unfamiliar face recognition.
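The blurring idea can be illustrated with a toy example. What follows is only a minimal sketch, not the filter used in the composite systems: it assumes a face stored as a small grid of grey levels, a hypothetical mask marking the internal-feature region, and an arbitrary blur width (`sigma`). Pixels inside the mask are left sharp; everything outside is replaced by its Gaussian-blurred value, so the complete face context remains.

```python
import math

def gaussian_kernel(sigma, radius):
    """Return a normalised 1-D Gaussian kernel of length 2*radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(image, kernel):
    """Convolve each row with the kernel, repeating edge pixels at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    blurred = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, weight in enumerate(kernel):
                xi = min(max(x + k - radius, 0), width - 1)
                acc += weight * row[xi]
            new_row.append(acc)
        blurred.append(new_row)
    return blurred

def transpose(image):
    return [list(column) for column in zip(*image)]

def gaussian_blur(image, sigma):
    """Separable Gaussian blur: filter the rows, then the columns."""
    radius = max(1, int(3 * sigma))
    kernel = gaussian_kernel(sigma, radius)
    rows_done = blur_rows(image, kernel)
    return transpose(blur_rows(transpose(rows_done), kernel))

def blur_external(image, internal_mask, sigma=1.0):
    """Keep masked (internal) pixels sharp; blur everything else."""
    blurred = gaussian_blur(image, sigma)
    return [[pixel if keep else soft
             for pixel, keep, soft in zip(img_row, mask_row, blur_row)]
            for img_row, mask_row, blur_row in zip(image, internal_mask, blurred)]

# Toy 5x5 'face': one bright external pixel (corner) and one internal pixel (centre).
face = [[0.0] * 5 for _ in range(5)]
face[0][0] = 1.0
face[2][2] = 1.0
# Hypothetical mask: the central 3x3 block stands in for eyes, brows, nose and mouth.
mask = [[1 <= x <= 3 and 1 <= y <= 3 for x in range(5)] for y in range(5)]
result = blur_external(face, mask, sigma=1.0)
```

In this sketch the centre pixel keeps its original value while the bright corner is softened, which is the intended effect: the external features lose contrast and the inner face stands out.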


An alternative system
As noted above, a detailed description of the face is a normal prerequisite to composite construction. Even with a good view of a criminal’s face and cognitive interviewing techniques, many witnesses are unable to provide a satisfactory description. Sadly, these witnesses may be denied the opportunity of constructing a composite, in spite of feeling confident that they could recognise the person in future.

Several research labs are designing composite systems that are based more on recognition than recall (Gibson et al., 2003; Tredoux et al., 2006). Ours is called EvoFIT and presents sets of whole faces, 18 per screen (Frowd, Hancock et al., 2004). Witnesses select faces that look something like the criminal’s and EvoFIT ‘breeds’ them together to produce another set. While the initial faces have random characteristics, repeating the selection and breeding procedure a few times normally allows a good likeness to be ‘evolved’ – see Figure 3. In practice, the process is improved by first choosing facial shapes, then facial textures.

At the heart of EvoFIT is a ‘face model’ that can generate a very large number of synthetic, but realistic-looking faces; the model is built from the statistics of about 70 complete faces. A genetic algorithm is used to search the space of possible faces, but converging on a good likeness is sometimes difficult. We now ask users to select the best match of shape and texture at the end of each generation, since this combined face can be given a greater emphasis during breeding to accelerate the search.
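The select-and-breed cycle can be sketched in a few lines. This is an illustrative toy rather than EvoFIT’s actual algorithm: it assumes each face is a single list of model coefficients (the real system handles shape and texture separately), uses uniform crossover with a small random mutation, and replaces the witness with a hypothetical stand-in that ranks faces by distance to a hidden target. The screen size of 18 comes from the description above; the number of faces selected, the number of coefficients and all rates are assumptions.

```python
import random

def random_face(n_coeffs, rng):
    """A face is a list of face-model coefficients with random starting values."""
    return [rng.gauss(0.0, 1.0) for _ in range(n_coeffs)]

def breed(parent_a, parent_b, rng, mutation_rate=0.1, mutation_sd=0.3):
    """Uniform crossover: each coefficient comes from one parent, then may mutate."""
    child = []
    for a, b in zip(parent_a, parent_b):
        gene = a if rng.random() < 0.5 else b
        if rng.random() < mutation_rate:
            gene += rng.gauss(0.0, mutation_sd)
        child.append(gene)
    return child

def next_generation(selected, best, size, rng, best_weight=3):
    """Breed a new screen of faces from the witness's selections.

    `best` is assumed to be one of `selected`; adding extra copies to the
    breeding pool gives it greater emphasis as a parent.
    """
    pool = list(selected) + [best] * (best_weight - 1)
    return [breed(rng.choice(pool), rng.choice(pool), rng) for _ in range(size)]

def simulated_witness(population, target, n_select):
    """Stand-in for a witness: prefer faces closest to a hidden target."""
    ranked = sorted(population,
                    key=lambda f: sum((x - t) ** 2 for x, t in zip(f, target)))
    return ranked[:n_select]

rng = random.Random(0)
target = random_face(10, rng)
population = [random_face(10, rng) for _ in range(18)]  # 18 faces per screen
for _ in range(4):                                      # a few generations
    selected = simulated_witness(population, target, 6)
    population = next_generation(selected, selected[0], 18, rng)
```

A real witness does not, of course, compute distances; the stand-in simply lets the loop run without a person in front of the screen.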

At this stage in development EvoFIT was used in a criminal investigation, the ‘Beast of Bozeat’ case (Frowd, Bruce, Storås et al., 2006). As the image on the contents page of this issue illustrates, a good likeness was produced of the criminal.

As part of a recent project, four new white male face models were designed, each covering a different age range and spaced apart by about 10 years. Judgements of age are fairly accurate (e.g. Bruce & Young, 1986), so this enables faces to be evolved with roughly the correct age. Thus, witnesses now have the fairly easy task of estimating the age of the criminal, to load the appropriate model.

A general problem for evolving systems is what to do when the search does not converge on the correct region of face space. One way is simply to evolve again using a fresh set of initial faces, a procedure that appears to be quite effective (Frowd, Bruce, Plenderleith et al., 2006).

There is, however, the associated risk that increasing the number of faces may interfere with a witness’s memory; there is some evidence of this from the use of a popular US system called FACES (Wells et al., 2005). Our evidence is that the construction of a single EvoFIT does not appear to interfere with a user’s memory any differently (if at all) from the construction of an E-FIT or PRO-fit, although limiting the number of faces presented is no doubt sensible.

More recent work has built models that match the target on age plus a few distinctive features recalled by a witness. This ‘tailoring’ approach avoids generating inappropriate characteristics – e.g. wide faces when narrow is required – and the final faces are much more identifiable (e.g. Frowd, Bruce, Gannon et al., 2007).

In spite of such improvements, the apparent age and other ‘holistic’ aspects of an evolved face can sometimes be inaccurate. A set of ‘holistic’ scales (tools) was therefore designed to allow witnesses to improve the likeness of their evolved face. There are 10 scales in total – including age, facial weight and masculinity – and each changes the face along the relevant dimension. The tools can be quite effective, as Figure 5 shows (Frowd, Bruce, McIntyre et al., 2006).

Developments to EvoFIT have largely been successful, with composite naming increasing from only a few per cent correct (Frowd, Carson et al., 2005b) to 10–15 per cent (Frowd, Bruce, Ness et al., 2007) after a two-day delay. The most recent version of EvoFIT includes blur, holistic tools and an improved control of the facial aspect ratio (facial width to height). This has just been evaluated using the gold standard and a two-day delay, and composites were correctly named at 25 per cent, compared to 5 per cent for a traditional ‘feature’ system.

EvoFIT was recently involved in a six-month pilot scheme by Lancashire police. About 30 composites were constructed and the system was reported to be of value in 20 per cent of cases. (An example is shown in Figure 4 on the PDF version.)

The development of EvoFIT is clearly ongoing. The system can now evolve white female faces, and models of other races are currently being built and evaluated.

Enhancing an existing composite
The above discussion suggests that improvements can be made to both interview and system, but can anything be done once a composite has been constructed? Research suggests two things: composites can be morphed together, or caricatured.

It is sometimes the case that there is more than one witness to a particular crime. When this happens, the police normally give different tasks to different witnesses, and one person constructs a composite. Asking several witnesses to do this produces very different-looking faces. Currently, there is no reliable test to predict who will produce the best composite; so, all other factors being equal, which witness should the police choose? Our answer is all of them!

In 2002 Bruce and colleagues asked groups of four laboratory-witnesses to each construct a single composite of a target. They combined the images by averaging to produce a morphed composite. The data suggested that the morph was better at conveying identity than a typical composite produced by an individual witness, and always at least as good as the best individual witness composite. Hayley Ness has shown that a morphed composite can still be effective when composed of faces from different systems (Ness, 2003); a similar result was also found for composites made in the Beast of Bozeat case (Frowd, Bruce, Storås et al., 2006). A morphed image is effective because the consistent parts of the individual composites tend to be reinforced, and errors averaged out. The work has prompted a change in police policy to permit construction of multiple composites (of the same face) for the production of a morph.
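Why averaging helps can be shown with a little arithmetic. The sketch below is a simplification, not the morphing procedure used in the studies: it assumes each composite has already been reduced to a short list of feature measurements, whereas real morphing must first align the images. The target values and the four ‘witness’ composites are invented for illustration.

```python
def morph(composites):
    """Average several composites, measurement by measurement."""
    n = len(composites)
    return [sum(values) / n for values in zip(*composites)]

def squared_error(face, target):
    """Total squared difference between a face and the target."""
    return sum((f - t) ** 2 for f, t in zip(face, target))

# Invented example: a target and four composites, each wrong in a different way.
target = [1.0, 2.0, 3.0]
composites = [
    [1.5, 2.0, 2.5],
    [0.5, 2.5, 3.0],
    [1.0, 1.5, 3.5],
    [1.5, 2.5, 3.5],
]

morphed = morph(composites)
individual_errors = [squared_error(c, target) for c in composites]
mean_individual_error = sum(individual_errors) / len(individual_errors)
morph_error = squared_error(morphed, target)
```

Because the witnesses’ errors point in different directions, they partly cancel in the average. Arithmetic guarantees only that the morph’s squared error is no larger than the mean of the individual errors; in this invented example it also happens to be smaller than that of every individual composite.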

An alternative is to ask a single witness to construct more than one face, as mentioned above for EvoFIT. While the norm is to construct a composite in front view, there is evidence that unfamiliar faces are processed better in half profile (e.g. Bruce et al., 1987). Thus, Ness et al. (in press) asked participants to construct composites in both front and three-quarter view using an enhanced version of PRO-fit. The three-quarter view images were sometimes better, but best performance was found when participants tried to guess identity with both views visible.

Another approach can be used when only a single composite is available. This is based on the observation that composites often appear fairly bland, but making them more different from each other may increase identification. We tried this by exaggerating the feature shapes of a composite away from an average face to produce a caricature. Results indicated that while one level of caricature helped one person to identify a face, for another, a different level was required. The solution was to present a range of exaggerations!

In Frowd, Bruce, Ross et al. (2007), we presented participants with sequences of images, ranging from a strong positive caricature to a strong negative one, where the feature shapes were de-emphasised to look more like the average (see Figure 6). Whether using the E-FIT, PRO-fit, sketch or EvoFIT system, an animated composite led to significantly better naming than an unaltered (veridical) one. We also found that the best gain occurred for the worst-quality images, which suggests that the animation will be maximally beneficial for composites produced in the field. Presenting the caricature sequence in the form of an animated GIF would appear appropriate for TV crime programmes. This animated format is currently being used by several UK constabularies (examples from Lancashire constabulary can be found at tinyurl.com/5tpc74).
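The caricature sequence amounts to sliding each face along the line that joins it to the average. As a hedged sketch (the representation of a face as a list of shape measurements, the range of levels and the number of frames are all assumptions, not values from the study), a level of 0 leaves the composite unaltered, positive levels exaggerate its differences from the average, and negative levels pull it back towards the average.

```python
def caricature(composite, average, level):
    """Scale the composite's deviation from the average by (1 + level)."""
    return [a + (1.0 + level) * (c - a) for c, a in zip(composite, average)]

def caricature_sequence(composite, average, max_level=0.5, steps=5):
    """Frames from strong positive to strong negative caricature (steps >= 2)."""
    levels = [max_level * (1 - 2 * i / (steps - 1)) for i in range(steps)]
    return [caricature(composite, average, level) for level in levels]

# Invented two-measurement example.
average_face = [1.0, 1.0]
composite = [2.0, -1.0]
frames = caricature_sequence(composite, average_face, max_level=0.5, steps=5)
```

Played in order, and then back again, frames like these would form the kind of animation described above; the middle frame is the unaltered (veridical) composite.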

Putting it all together, there is good evidence that composites are not very identifiable when produced using standard police procedures and systems. We have found that significant improvements can be made by changing the interview, system or presentation format. Current work is looking at which combination of the above techniques works best – which is likely to be blurring, holistic tools, EvoFIT and caricature – and whether any of these could also help witnesses carry out other identification tasks.

Answers
Figure 1: The composites are (from left to right, top to bottom) of Brad Pitt, Graham Norton, Nicholas Cage, Michael Owen, Robbie Williams, Anthony (Ant) McPartlin, David Beckham and Noel Gallagher.
Figure 5: Simon Cowell.
Figure 6: The former British Prime Minister, Tony Blair.
