
Listening with Your Eyes

To perceive the world as a whole, our five senses have to team up in the brain—and in some cases, they actually seem to fuse with one another

It is Saturday evening at the state fair. To your left, "Rock Around the Clock" wafts out of a tent. Behind you, a group of teenagers is carrying on, laughing loudly. Somewhere, an infant is crying. A profusion of neon signs and blinking lights competes for your attention. A roller coaster plummets and makes a hairpin curve. Your senses are already overloaded. But the experience wouldn't be complete without an ice-cream cone in hand and the aroma of cotton candy and honey-roasted almonds in the air.

A scene like this busy fair illustrates just how many signals converge on us simultaneously from the environment. Yet our brain is able to integrate all the stimuli and make sense of the cacophony of movement and sound. Exactly how this integration happens is not yet understood, which naturally piques the curiosity of neuroscientists.

The abundance of stimuli typical of a state fair, however, does not lend itself to studying the mind's fusion of the five senses: a process called sensory integration. Researchers tend to be interested in situations in which the brain tricks itself, so to speak, and creates a false picture of its surroundings. In ventriloquism, for example, even though the voice is not coming from the slack-jawed wooden puppet on the ventriloquist's lap, the audience suspends disbelief. By the same token, characters on the silver screen are not actually speaking; their words emanate from loudspeakers distributed around the theater. But when the brain observes lips moving in rhythm with words, it believes the illusion that those lips are the actual source of what is heard. In other words, our auditory and visual impressions work in tandem to create a perception of our surroundings.


But not only do we sometimes misinterpret the source of a sensory impression, we also occasionally perceive it as something entirely different. For example, psychologists Harry McGurk and John MacDonald of the University of Surrey in England discovered an interesting phenomenon in the mid-1970s. They showed volunteers a film in which a speaker articulated the syllable "ga" but over which they had dubbed the sound "ba." The test subjects reported perceiving neither of these sounds; rather they heard the syllable "da." Visual and auditory information combined to create a third, completely new sound, a process now known as the McGurk effect. Our auditory and tactile senses can create illusory alliances as well. When we rub the palms of our hands together, we judge how wet they are not only by what our skin feels but also by the sound it makes. If we hear a strong rustling noise, our skin feels dry; the fainter or higher-pitched this sound becomes, the wetter the palms of our hands will feel.

Such illusions demonstrate that our brain is constantly combining information from various sensory organs to "draft" a more or less correct image of the environment around us. The question for perception researchers is: Where and how do our various senses get fused in the brain?

Two basic mechanisms are conceivable. Either the senses function separately and our brain combines their inputs into a coherent whole during the final stages of processing, or else the senses work together from the start, complementing and influencing one another at a very early stage.

Consider the scene of a barking dog in a neighbor's yard. In the first model, each sensory system of the brain first analyzes its particular stimuli by itself and generates its own complete "image" of the environment. For example, our visual apparatus creates the image of a golden retriever barking behind a white picket fence, while our auditory system simultaneously registers both a barking noise and the sound of a passing car. The brain then integrates the sensory impressions to complete the scene: a barking dog in a yard near a street.

In the second model the visual system might first detect a golden brown surface of a given size within a field of green. At the same time, the auditory system picks up a rhythmically repetitive sound from the direction of this surface. The visual system then registers that the surface changes when the auditory system perceives the sound. The various senses complement one another within a few fractions of a second until the overall impression of a barking golden retriever emerges. In this mechanism, sensory integration occurs at a very early phase of processing.

These two scenarios are the extreme ends of a spectrum of possible mechanisms for sensory integration. An infinite number of intermediate stages between these two variants is conceivable. Presumably the path that the brain actually takes is somewhere in the middle. The question is, Where?
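
For readers who like to see ideas spelled out in code, the contrast between the two models can be caricatured in a few lines of Python. The sketch below is purely illustrative: the function names, the numbers and the simple averaging rule are all invented, and the brain computes nothing this tidy. The point is only that the two schemes differ in when the merging happens, not necessarily in the answer they reach.

```python
# Illustrative toy sketch (not a neural model): two ways to combine a blurry
# visual cue and a noisy auditory cue about where a barking dog is standing.

def late_fusion(visual_frames, audio_samples):
    """Model 1: each sense builds its own complete estimate first,
    and the two finished estimates are merged only at the very end."""
    visual_estimate = sum(visual_frames) / len(visual_frames)
    audio_estimate = sum(audio_samples) / len(audio_samples)
    return (visual_estimate + audio_estimate) / 2

def early_fusion(visual_frames, audio_samples):
    """Model 2: the raw cues influence one another from the first moment,
    long before either sense has a finished picture."""
    combined = [(v + a) / 2 for v, a in zip(visual_frames, audio_samples)]
    return sum(combined) / len(combined)

# Position of the dog (degrees to the left of straight ahead), as reported
# by slightly disagreeing senses over five moments in time.
vision = [9.0, 11.0, 10.5, 9.5, 10.0]
hearing = [12.0, 8.0, 11.0, 9.0, 10.0]

print(late_fusion(vision, hearing))   # 10.0: with arithmetic this simple, both schemes agree;
print(early_fusion(vision, hearing))  # 10.0: they differ only in when the cues get merged
```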

Images of Integration
Psychologists first began investigating interactions among the senses in the 1950s by examining how different sensory combinations affect our perception of the world around us. They quantified illusions such as the McGurk effect, mentioned above, and the ventriloquist effect, first described in 1966 by Ian P. Howard and W. B. Templeton, who were researchers at York University in Toronto. Even today psychological studies continue to explore perceptual illusions to find out how our brain combines different aspects of sensory information and how this improves our performance in tasks that rely on multisensory information.

Around the 1970s, as psychologists were investigating sensory integration from a perception standpoint, scientists coming from more classical biological fields such as neurophysiology started to investigate the neuronal basis of how the brain combines sensory information. But whereas many of these researchers investigated neurons related to specific senses, such as those in the visual or auditory pathways, only a small minority studied multisensory properties. Only recently, helped in part by advances in brain-imaging techniques, have people begun to realize that our different senses do not function as discretely as was previously thought.

Technology such as functional magnetic resonance imaging (fMRI) makes use of the fact that when an area of the brain works particularly hard, it needs more oxygen than adjacent regions and is therefore more heavily perfused with blood. Oxygen-rich hemoglobin molecules behave differently in a strong magnetic field from those that contain no oxygen, so fMRI scanners can detect blood flow and therefore produce images of the working brain.
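
As a rough illustration of the logic, not of any real analysis pipeline, the toy Python below compares a voxel's average signal during stimulation blocks with its average during rest and flags the voxel as "active" if the difference is large enough. The signal values and the threshold are invented for the example.

```python
# Bare-bones caricature of how an fMRI analysis flags an "active" brain area:
# compare the average signal while a stimulus is on with the average while it is off.

from statistics import mean

# One number per scan from a single voxel, alternating four scans rest, four scans task.
signal = [100, 101, 99, 100,  104, 105, 103, 104,
          100, 100, 101, 99,  105, 104, 104, 103]
stimulus_on = [False] * 4 + [True] * 4 + [False] * 4 + [True] * 4

task_mean = mean(s for s, on in zip(signal, stimulus_on) if on)
rest_mean = mean(s for s, on in zip(signal, stimulus_on) if not on)

# More oxygen-rich blood during the task shows up as a slightly stronger signal.
if task_mean - rest_mean > 2.0:   # arbitrary threshold for this toy example
    print(f"voxel looks active: {task_mean - rest_mean:.1f} units above rest")
```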

Now consider again the neighbor's barking dog: fMRI scanning should be able to detect the difference between the two models of sensory integration. If the first model is correct and sensory information is analyzed separately by the various systems and then combined at the end, many different regions of the brain should be engaged, and each should exclusively process a single sense. On the other hand, if the information is combined early, only a few highly specialized regions should suffice.

Over the past several years, a series of imaging studies has disclosed a complex network of brain regions that are activated most strongly when various sensory data fuse. It has long been known that so-called associational regions in the parietal and frontal lobes of the cerebral cortex process information streaming in through various sensory channels. Yet regions that until now were thought to be responsible for only one sense have recently been shown to have a broader spectrum of talents. As Jon Driver of University College London described in 2000, activity in the visual cortex of test subjects who have just seen a short flash of light near their right or left hand increases when the fingers of that hand also receive tactile stimuli. This increase occurs, however, only when the visual and tactile stimuli arrive simultaneously and on the same side of the body.

Psychologists have known about this "multimodal reinforcement" for quite some time. For example, people have more trouble seeing a flickering point of light as its intensity decreases. Yet if we hear a short burst of sound at the same time as the flickering, we will perceive even the weakest glimmer of light. But this effect works only when the light and the sound are precisely synchronized.
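
That contingency can be captured in a small toy model. In the hypothetical Python sketch below, a faint flash crosses the detection threshold only when a sound falls inside a narrow synchrony window around it; the threshold, the boost and the window width are made-up numbers, not measured ones.

```python
# Toy illustration of "multimodal reinforcement": a faint flash that would be
# missed on its own is detected when a sound arrives at (almost) the same moment.

def flash_detected(flash_strength, flash_time, sound_time=None,
                   threshold=1.0, sync_window=0.1, sound_boost=0.5):
    evidence = flash_strength
    # The sound only helps if it is nearly simultaneous with the flash.
    if sound_time is not None and abs(sound_time - flash_time) <= sync_window:
        evidence += sound_boost
    return evidence >= threshold

print(flash_detected(0.6, flash_time=0.0))                    # False: too dim on its own
print(flash_detected(0.6, flash_time=0.0, sound_time=0.05))   # True: a synchronized beep helps
print(flash_detected(0.6, flash_time=0.0, sound_time=0.50))   # False: the beep comes too late
```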

The perception of language is particularly interesting. As the McGurk effect demonstrates, the spoken word is not only conveyed acoustically. Lip movements communicate important information as well. In 2001 psychologist Gemma Calvert, now at the University of Bath in England, observed that speech perception increases the activity of both the auditory and the visual system when acoustic and visual stimuli are perceived simultaneously. In other words, the image of moving lips affects the processing of acoustic signals early on. This synergy between hearing and seeing occurs in regions of the brain that had previously been viewed as separate sensory regions.

Even the soundless image of a person speaking is sufficient to stimulate the auditory cortex measurably, including when the speaker is talking gibberish. On the other hand, making faces leaves the auditory cortex cold. This phenomenon makes it clear that the auditory cortex reacts specifically to the visual image of speech, and the sensory integration of acoustic and visual stimuli facilitates speech processing.

Fusion in the Brain
Accordingly, the second model, which presumes early sensory fusion, appears to be much more accurate. My team's research at the Max Planck Institute for Biological Cybernetics in Tübingen also points in this direction. In 2005 we performed high-resolution magnetic resonance measurements on various regions of the auditory cortex of rhesus monkeys (Macaca mulatta). The auditory cortex comprises various subunits. The primary auditory cortex receives, via a relay in the thalamus, the electrical impulses that sound waves produce in the inner ear. Those impulses then travel to the higher auditory regions, which surround the primary auditory cortex like a belt only a few millimeters thick.

We measured the increased activity in the auditory cortex while we played rustling noises to the animals through a headset and stimulated their palms or the soles of their feet with a brush. When we did both simultaneously, the posterior end of the secondary auditory cortex in particular was stimulated. Earlier this year we saw similar results in a new study in which we used visual instead of tactile stimulation. Again we found that only the posterior half of the auditory cortex was stimulated. This is where sensory integration appears to occur.

We do not yet know why sensory information fuses in these particular brain regions. But it appears that the posterior part of the auditory cortex is specialized for registering spatial information, that is, recognizing the direction a sound comes from. Perhaps the sensory fusion that occurs here helps tie various sensory impressions to a particular source in space.

In January a groundbreaking study by neuroscientist Charles Schroeder and his colleagues at the Nathan S. Kline Institute for Psychiatric Research in Orangeburg, N.Y., revealed a mechanism by which nonauditory stimulation enhances activity in the auditory cortex. The researchers found that although a tactile stimulus alone will not cause auditory neurons to fire, it shifts the neurons' ongoing oscillations of excitability so that they are poised to fire at their maximum. If the auditory cortex then receives auditory and tactile stimuli together, its neurons fire more strongly than they would in response to the auditory stimulus alone. This insight helps to explain how receiving information from two different sensory organs causes both processing centers to activate more strongly, and it might point to the neuronal basis of sensory integration.
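
A deliberately cartoonish Python sketch of this idea appears below. The sine-wave "excitability," the 100-millisecond cycle and every constant in it are stand-ins chosen for illustration, not values from the study.

```python
# Toy sketch of the phase-reset idea: a touch does not make auditory neurons fire
# by itself, but it lines up their ongoing excitability cycle so that a sound
# arriving shortly afterward finds them maximally receptive.

import math

PERIOD = 100.0  # ms, one cycle of the slow ongoing oscillation

def excitability(t, phase_offset):
    """Ongoing rhythmic excitability of an auditory neuron, between 0 and 1."""
    return 0.5 * (1 + math.sin(2 * math.pi * t / PERIOD + phase_offset))

def response_to_sound(sound_time, touch_time=None):
    """Firing evoked by a sound, scaled by where it lands in the excitability cycle."""
    if touch_time is None:
        phase = 1.7  # arbitrary phase when nothing resets the rhythm
    else:
        # The touch resets the cycle so the excitability peak lands 25 ms later.
        phase = math.pi / 2 - 2 * math.pi * (touch_time + 25.0) / PERIOD
    return excitability(sound_time, phase)

print(round(response_to_sound(sound_time=25.0), 2))                  # 0.44: sound alone
print(round(response_to_sound(sound_time=25.0, touch_time=0.0), 2))  # 1.0: touch 25 ms earlier
```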

Although we are still working toward a complete understanding of how the brain processes sensory information, one thing seems certain: sensory integration occurs in high-level regions, and it occurs early in the process, though not as early as one might assume on theoretical grounds. The first model, which assumes separate processing of sensory impressions, is simply false. The second model, which assumes that the senses are fused at the earliest possible moment, is overstated but fits reality better. Clearly, many regions of the brain are engaged in combining information from different senses, and a much smaller part of the brain than previously thought is dedicated exclusively to each individual sense.

(Further Reading)

  • The Handbook of Multisensory Processes. Edited by Gemma A. Calvert, Charles Spence and Barry E. Stein. MIT Press, 2004.

  • Integration of Touch and Sound in Auditory Cortex. C. Kayser et al. in Neuron, Vol. 48, pages 373–384; October 20, 2005.

  • Multisensory Spatial Interactions: A Window onto Functional Integration in the Human Brain. Emiliano Macaluso and Jon Driver in Trends in Neurosciences, Vol. 28, No. 5, pages 264–271; 2005.

This article was originally published with the title “Listening with Your Eyes” in SA Mind Vol. 18, No. 2, p. 24.
doi:10.1038/scientificamericanmind0407-24