New voices: An ecological approach to audio description
At the moment the space is dimly lit and shadowy. Slowly the lights fade to total darkness. There’s a shaft of light from an open door as Gloucester walks in, rubbing his face wearily. Kent, glancing nervously over his shoulder, follows.
This is how King Lear begins in a recent production at London’s Almeida Theatre. At least, this is how King Lear begins if you are a blind theatre-goer, listening to the audio description (AD).
AD is a verbal commentary providing visual information for those unable to perceive it themselves. AD helps blind and partially sighted people access audiovisual media and is also used in live settings such as theatres, galleries and museums (e.g. Diaz-Cintas et al., 2007). The practice was developed in the US and came to the UK in the late 1980s. I was among a handful of people trained by the National Theatre when it first began an audio description service in 1992, and I still regularly describe productions. I was also a pioneer of TV description, working for the BBC as part of a European pilot project called Audetel.
Although AD is now a legal requirement (under the Communications Act 2003 and Equality Act 2010), its methods are largely untested (Gerber, 2007). As I train new describers, I wanted data to back up what I teach. As a discipline, AD is found in departments of audio visual translation (AVT). While there are interesting overlaps between translation and description, for me translation concentrates on the text, missing out the performance element and another crucial factor: the audience. When going to the theatre or watching a film, you immerse yourself in another world, so a mediated experience appears unmediated (Lombard & Ditton, 1997). AD adds yet another level of mediation. Can an AD user experience ‘presence’, that ‘feeling of being there’ (Biocca, 1997), in the same way as a sighted person? How do the perceptual experiences of blind or partially sighted people affect how they engage with AD? My quest for answers led me not to a translation department but to the Psychology Department at Goldsmiths, University of London to research for a PhD.
On one level, AD seems easy enough. American guidelines provide an acronym: W.Y.S.I.W.Y.S. – ‘what you see is what you say’ (see www.acb.org/adp/ad.html). Yet if everything in the visual array were described, AD users would be overwhelmed by detail. In their comparison of AD guidelines in different countries, Rai et al. (2010) suggest that the greatest challenge is how to choose ‘what not to describe’. So how do sighted people avoid being bombarded by visual information; how do we select from what we see?
My first degree was in anthropology, so my supervisors, Dr Jonathan Freeman and Professor Linda Pring, had to introduce me to key concepts in psychology. I found the ideas of J.J. Gibson particularly illuminating. In the 1970s Gibson took issue with the prevailing model of vision as a series of objects projected on the retina, like pictures projected onto a screen. Instead, he developed what he called ‘the ecological approach to visual perception’. We are not, he argues, passive recipients of a static, snapshot view of the world. We look around and move around while we are looking. We compare what we see with what we have seen before, i.e. with what we know. And this varies: the AD audience is diverse. Some, especially those who have recently lost their sight, remain very visual. As they listen to introductions to theatre performances describing characters, costumes and set designs, or to descriptions of objects or paintings in galleries and museums, they build up a picture in their mind’s eye. For people who have been blind from birth, however, it is not possible to create a mental representation that matches what a sighted person sees. Even people who have lost their sight later in life may gradually lose their interest in the visual world. As part of my research, I have been conducting interviews with blind and partially sighted people regarding their experiences of AD. One participant, a 70-year-old man who went blind at the age of 60, put it like this:
I don’t consciously build pictures in my mind… I’m imagining action... And what I think I imagine is, is what I do in the rest of my life. That is to say I know you’re sitting there on the other end of the settee – because I’m intimately acquainted with the settee, I know you’re a woman and you’re sitting at that end of it and there’s just over a metre between us. I know what women are like, therefore I can imagine you. I don’t actually have a picture of you… One or two very old friends I can maybe sort of have a sense of he’s got craggy features or he’s tall and skinny or whatever, but that’s gone really, because I don’t experience people in that visual way... my reality is a reality without visual images and maybe one of the mistakes that [audio describers] make… is maybe they think we read the notes and listen to the things furiously creating visual pictures that a sighted person would see.
Gibson provides a helpful distinction between qualities of objects and their affordances, i.e. what an object allows us to do with it. A twig, for example, affords us the opportunity to pick it up, to use as a tool or as fuel for a fire. A tree branch may afford grasping but not carrying. The tree itself cannot be carried, nor, if the trunk is wide, can it be grasped. However, it may afford climbing. If it is a fruit tree, the fruit may afford eating. As for qualities, Gibson’s list is extensive, including colour, texture, composition, size, shape, mass, elasticity, rigidity and mobility. The list goes beyond what we might consider to be purely visual, and all can be ascertained at a glance. Yet much of that visual information fails to register at a conscious level. If I asked you to describe the chair you are sitting on, you would probably instantly look at it. You saw the chair before you sat on it, yet you almost certainly did not register its colour, its precise form or the materials of its construction. The fact of it being a chair was sufficient for you to enjoy what it affords: the opportunity to sit down. Gibson suggests ‘we can discriminate the dimensions of difference if required to do so… but the special combination of qualities by which an object can be analysed is ordinarily not noticed’ (1986, p.134).
It is the role of the audio describer to notice what is ordinarily not noticed. But how much detail should they provide? Gibson claims ‘however skilled an explicator one may become one will always, I believe, see more than one can say’ (1986, p.261). He gives the illustration of a cat on a mat. What we see is a cat obscuring part of the mat; the mat extending on each side of the cat, the mat supporting the cat, the floor supporting the mat and the cat as it extends horizontally in each direction, the rigidity of the floor that affords support. We also see where we are in relation to the cat. We see parts of ourselves, the shadow of our nose, and perhaps a strand of hair, our toes and hands and parts of our forearms. We see whether the cat is sleeping or awake, friendly or twitching its tail. We see what colour the cat is, and perhaps an indication of its age and whether it has been in a fight and is missing part of its left ear... I could go on.
AD during a play or film is limited by time constraints. It must be fitted in between bursts of dialogue. If there is only time for a brief description, it may be better to ask not ‘what do we see?’ but ‘what does the visual information afford?’ If the cat is twitching its tail, it is less likely to afford us the opportunity of stroking it. So too if its coat is scabby or crawling with fleas. A cat may afford information about its owner – in the film From Russia with Love the Persian cat stroked by the Bond villain Blofeld indicates his interest in prestige and appearance. That the cat is long-haired, white and wears a jewel-encrusted collar tells us much about Blofeld’s circumstances, both financial and environmental. By contrast, in Austin Powers: International Man of Mystery, the hairless sphynx cat adopted by the spoof villain, Dr Evil, affords comparison with Blofeld in terms of power and overweening ambition, and humour in terms of abundant hair versus no hair. Although neither cat affects the plot, the visual information arguably enhances our understanding of a key character and enriches our enjoyment of the film.
Gibson’s distinction between quality and affordance provides a useful rule of thumb: qualities are nice to know, affordances are what AD users need to know. Zahorik and Jenison (1998) use the example of a basketball, arguing that it is not best represented by the qualities ‘round’, ‘orange’ and ‘rubber’ but by the fact it can be thrown, rolled or bounced. In AD, the colour of a ball may be less important than the type of ball, e.g. golf ball, cricket ball, tennis ball. From this information we can infer details of size, weight and how bouncy it is so we can anticipate how it might be used.
Affordances draw our attention, even if we are not consciously aware of them. When shown a target with a graspable handle on either the left- or right-hand side, observers respond more quickly and more accurately if the response hand is on the same side as the handle than if it is on the opposite side (Tucker & Ellis cited in Murphy et al., 2012). This holds true even if participants observe a photograph, rather than an object that could be physically grasped. It suggests that in a play or a film, a sharp knife lying casually on a work surface, for example, calls out to us to expect cookery at one end of the spectrum and violence at the other. Other visually apprehended details may also be relevant: the length of the blade, the keenness of the edge, whether the knife is gleaming or bloody. If time is short, however, simply describing the presence of the knife and by implication what it affords (stabbing) may be sufficient. The converse applies too. In mentioning something, the describer sets up an expectation of its importance. Describing every object on a cluttered mantelpiece may convey the impression that each features in the action: the candle in the candlestick will be lit; flowers will be placed in the vase; a photograph will be examined; the clock will be wound or time is crucial to the plot. An AD user may be waiting for the moment these props are brought into the limelight and waste cognitive effort in doing so.
Gibson’s approach is not without its limitations. In particular, he ascribes perception of affordances primarily to vision, taking little account of non-visual modalities. Woods and Newell (2004) point out that while vision can successfully identify a glass of water affording the opportunity to drink, it may need touch to confirm whether the water is too cold to drink comfortably. Similarly we can tell much about an object through sound: for example its length (Carello et al., 1998, 2005), its size (Grassi, 2005) and whether or not it is within reach (Rosenblum et al., 1996). We can identify natural sounds such as footsteps, applause, or a can being opened. We recognise voices, the spoken word and emotion in the voice of the speaker (Davis & Johnsrude, 2007). From this we can deduce much about the speaker including whether or not he or she is safe to approach. The affordances of sound have been utilised to improve robotic navigation (Chu et al., 2009). Perceiving affordances through non-visual modalities is important for the describer too, if they are to avoid supplying redundant information (Fryer, 2010).
Gibson’s most helpful insight is that the observer is not passive. That brings us back to the focus of my research. How does impairment impact on perception and experience? And how does that affect the user’s engagement with AD? So far my studies suggest that the spoken word is as evocative for blind people as non-verbal auditory stimuli (Fryer, Pring et al., 2013a); that navigation involves explicit rather than implicit processing (Fryer, Freeman et al., 2013); and that word–shape associations that are robust in the sighted are significantly less strong for those who are blind (Fryer, Pring et al., 2013b). Most importantly, my research has brought me into contact with many blind and partially sighted people who have reminded me of the pure aesthetic enjoyment that visual qualities can provide, regardless of whether the listener can ‘picture’ them. As one congenitally blind participant expressed it: ‘Because I’ve never experienced light in all its wonderful forms… I’m entirely fascinated… and I love to hear talk of what light looks like and colours of fireworks and I love it all… it doesn’t bother me because I can’t see it, I just love hearing about it.’
Louise Fryer is an audio describer and also a PhD student at Goldsmiths College, University of London
Biocca, F. (1997). The cyborg’s dilemma: Embodiment in virtual environments. Journal of Computer-Mediated Communication, 3(2). doi:10.1111/j.1083-6101.1997.tb00070.x
Carello, C., Anderson, K.L. & Kunkler-Peck, A.J. (1998). Perception of object length by sound. Psychological Science, 9(3), 211–214.
Carello, C., Wagman, J.B. & Turvey, M.T. (2005). Acoustic specification of object properties. In J.D. Anderson & B. Fisher Anderson (Eds.) Moving image theory (pp.79–104). Carbondale, IL: Southern Illinois University Press.
Chu, S., Narayanan, S. & Kuo, C.-C.J. (2009). Environmental sound recognition with time-frequency audio features. IEEE Transactions on Audio, Speech and Language Processing, 17(6), 1142–1158.
Davis, M.H. & Johnsrude, I.S. (2007). Hearing speech sounds: Top-down influences on the interface between audition and speech perception. Hearing Research, 229, 132–147.
Diaz-Cintas, J., Orero, P. & Remael, A. (2007). Media for all: Approaches to translation studies. Rodopi.
Fryer, L. (2010). Audio description as audio drama. Perspectives: Studies in Translatology, 18(3), 205–213.
Fryer, L., Freeman, J. & Pring, L. (2013). What verbal orientation information do blind and partially sighted people need to find their way around? British Journal of Visual Impairment, 31(2), 123–138.
Fryer, L., Pring, L. & Freeman, J. (2013a). Audio drama and the imagination: The influence of sound effects on presence in people with and without sight. Journal of Media Psychology, 25(2), 65–71.
Fryer, L., Pring, L. & Freeman, J. (2013b). Touching words is not enough: How vision mediates haptic-auditory associations in the ‘Bouba-Kiki’ effect. Manuscript submitted for publication.
Gerber, E. (2007). Seeing isn’t believing: Blindness, race and cultural literacy. Senses & Society, 2(1), 27–40.
Gibson, J.J. (1986). The ecological approach to visual perception. Hillsdale, NJ: Lawrence Erlbaum.
Grassi, M. (2005). Do we hear size or sound? Perception and Psychophysics, 67(2), 274–284.
Lombard, M. & Ditton, T. (1997). At the heart of it all: The concept of presence. Journal of Computer-Mediated Communication, 3(2).
Murphy, S., van Velsen, J. & de Fockert, J.W. (2012). The role of perceptual load in action affordance by ignored objects. Psychonomic Bulletin & Review, 19(6), 1122–1127.
Rai, S., Greening, J. & Petre, L. (2010). A comparative study of audio description guidelines prevalent in different countries. London: Media and Culture Department/Royal National Institute of Blind People.
Rosenblum, L.D., Paige Wuestefeld, A. & Anderson, K.L. (1996). Auditory reachability: An affordance approach to the perception of sound source distance. Ecological Psychology, 8(1), 1–24.
Woods, A. & Newell, T. (2004). Visual, haptic and cross-modal recognition of objects and scenes. Journal of Physiology-Paris, 98(1–3), 147–159.
Zahorik, P. & Jenison, R.L. (1998). Presence as being-in-the-world. Presence, 7(1), 78–89.