A new study of the spatial localization of audio-visual stimuli has shown that the 'ventriloquist effect' results from near-optimal integration of visual and auditory inputs.

Ventriloquism is a phenomenon whereby we inaccurately perceive a voice as emanating from a spatially displaced source; for example, from the lips of a ventriloquist's dummy rather than from those of the ventriloquist. The effect was originally thought to be the product of voice projection techniques perfected by the performer, but recent hypotheses have suggested that it results from the domination of hearing by vision.

Alais and Burr tested this hypothesis by having observers localize visual and auditory stimuli (brief 'blobs' of light or 'clicks' of sound) in space. Unimodal thresholds were established by presenting these stimuli separately and asking observers to indicate which of the two stimuli appeared more to the left. Subsequently, blobs and clicks were presented simultaneously in one of two modes. In 'conflict' mode, blobs and clicks were spatially displaced from each other; in 'non-conflict' mode, the stimuli were equally displaced to the left or right of centre.

The ability of subjects to localize the stimuli depended on the size and clarity of the blobs. When visual localization was good, vision 'captured' sound, as in the classic ventriloquist effect. However, the reverse was true when visual stimuli were blurred and therefore poorly localized: sound captured vision, and the subjects perceived the blob as lying closer to the true location of the click. In all six subjects, bimodal localization was more precise than either form of unimodal localization. Based on these data, the authors propose a model in which visual and auditory inputs are optimally combined to minimize variance and improve spatial localization.
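
To make the idea of variance-minimizing combination concrete, here is a minimal Python sketch of the textbook rule for merging two independent, unbiased estimates: each cue is weighted by the inverse of its variance. This summary does not give the authors' equations, so the sketch should be read as an illustration of the general principle rather than as their model. The function name, the assumption of independent noise and all numerical values are hypothetical.

```python
import math


def combine_estimates(pos_v, sigma_v, pos_a, sigma_a):
    """Inverse-variance weighted combination of a visual and an auditory
    position estimate (illustrative; assumes independent, unbiased cues).

    Returns (combined position, combined standard deviation, visual weight).
    """
    var_v = sigma_v ** 2
    var_a = sigma_a ** 2
    # The less variable cue receives the larger weight.
    w_v = var_a / (var_v + var_a)
    w_a = 1.0 - w_v
    combined_pos = w_v * pos_v + w_a * pos_a
    # The combined variance is smaller than either unimodal variance.
    combined_var = (var_v * var_a) / (var_v + var_a)
    return combined_pos, math.sqrt(combined_var), w_v


# Hypothetical conflict trial: blob at +5, click at -5 (arbitrary units).
sharp = combine_estimates(pos_v=5.0, sigma_v=1.0, pos_a=-5.0, sigma_a=4.0)
blurred = combine_estimates(pos_v=5.0, sigma_v=16.0, pos_a=-5.0, sigma_a=4.0)

print("sharp blob:   position, sd, visual weight =", sharp)
print("blurred blob: position, sd, visual weight =", blurred)
```

With these made-up numbers, the sharp blob carries most of the weight and the combined estimate lies near the visual position, whereas the blurred blob carries little weight and the estimate shifts toward the click. In both cases the combined standard deviation is smaller than that of either cue alone, which mirrors the reported finding that bimodal localization was more precise than unimodal localization.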