Why We See What We Do: An Empirical Theory of Vision

Dale Purves & Beau Lotto
Sinauer Associates, Sunderland, Massachusetts, 2003. $42.95 paperback, pp 260. ISBN 0-878-93752-8

To the surprise of most people, vision has not yet been explained scientifically. There is no agreement on how we see the size of an object (at various distances), its color, and whether it is moving or not, simply by looking at it. How does the rich, three-dimensional world of visual experience arise from the ambiguous, seemingly impoverished two-dimensional image projected onto the retina? Imagine that a retinal image contains a trapezoidal region of a given intensity. Its shape could come from a rectangle lying down or a trapezoid standing up. Its intensity could come from a white surface in dim light or a black surface in bright light. How does the visual system compute an answer (that is, generate a percept)?
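The intensity half of this ambiguity can be made concrete with a little arithmetic. The sketch below assumes the simplest textbook model, in which the light a matte surface sends to the eye is the product of its reflectance and the illumination falling on it. The function name and all numbers are illustrative assumptions, not values from the book.

```python
# Minimal sketch: two very different scenes that deliver the same intensity.
# Assumed model: luminance = reflectance * illumination (arbitrary units).

def luminance(reflectance, illumination):
    """Light reaching the eye from a matte surface (illustrative model)."""
    return reflectance * illumination

# A white surface (high reflectance) in dim light ...
white_in_dim_light = luminance(reflectance=0.9, illumination=10.0)

# ... and a black surface (low reflectance) in bright light.
black_in_bright_light = luminance(reflectance=0.05, illumination=180.0)

print(white_in_dim_light, black_in_bright_light)
```

Both calls yield essentially the same value, so the intensity alone cannot tell the visual system which scene produced it.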

Purves and Lotto must be applauded for defining this “pervasive ambiguity of retinal stimuli” as the central problem. In the finest tradition of giving science away, they bring this problem to life using a series of computer-generated illustrations that delight the eye and edify the mind. The coverage is reasonable, with chapters on lightness, color, three-dimensional space and motion. Sensory physiology is thoroughly addressed, which is not surprising given the status of the senior author as a leading neuroscientist. More surprising is the authors' bold critique of sensory physiology. Dismissing current research trends (such as channels) as fads, they argue that neuroscience has failed to address the ambiguity problem. They assign a vital role to phenomenology and suggest that rapid progress in neuroscience requires an understanding of the “overarching strategy of vision.”

The authors argue at great length that the ambiguity of the retinal image is solved by the human visual system in a “wholly empirical” fashion. Whereas other theories invoke inferential processes, contextual patterns or maximum simplicity, Purves and Lotto speak of probabilities extracted from past visual experience. Consulting my stored memories of similar trapezoidal images, I discover that in most cases, the object turned out to be a rectangle. Thus I see a rectangle.
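The trapezoid example can be written out as a toy lookup rule. This is only a sketch of the reasoning as summarized here, not the authors' model: the `past_experience` table, its counts and the `percept` function are all hypothetical.

```python
from collections import Counter

# Hypothetical memory: for each kind of retinal image, how often each
# real-world source "turned out" to be responsible. Counts are invented.
past_experience = {
    "trapezoidal image": Counter({
        "slanted rectangle": 95,
        "upright trapezoid": 5,
    }),
}

def percept(retinal_image, memory):
    """Return the source most often associated with this image in the past."""
    sources = memory[retinal_image]
    return sources.most_common(1)[0][0]

print(percept("trapezoidal image", past_experience))
```

With these invented counts the rule reports a slanted rectangle. A pure most-frequent rule of this kind never returns the rarer source, and the sketch says nothing about where the counts come from.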

Past-experience theories have historically moved in and out of fashion. Although not mentioned by Purves and Lotto, Adelbert Ames and his colleagues promoted the same theory, called transactionalism, in the mid-20th century, illustrating it with such engaging demonstrations as the Ames distorted room and the rotating trapezoidal window. Although appealing at a glance, such accounts are not easily made concrete, and they have difficulty standing up to a series of logical and empirical challenges.

For example, if seeing requires past experience, how does a newborn see? In the 18th century, George Berkeley argued that touch educates vision. However, this merely displaces the problem. Tactile stimulation is even more ambiguous than retinal stimulation, and the weight of the evidence shows that vision educates touch, not vice versa. Purves and Lotto speak of what the ambiguous stimulus “turned out to signify in past experience.” But exactly how did it turn out thus? What is the source of feedback that resolves the ambiguity? Here the reader wishes that their solution could have been described with some of the same concreteness as their description of the problem.

Neither do the authors explain how we perceive novel or unlikely objects. If we are more likely to encounter rectangles than trapezoids, how do we ever perceive a trapezoid?

Purves and Lotto repeatedly say that vision is based on the past experience of both the species and the individual. They offer evidence of the former, but scant evidence of the latter. The term empiricism has always referred to the latter, and indeed the former is actually its opposite, because whatever the species has acquired is innate in each individual.

Several of their demonstrations illustrate the well-known visual assumption of light from above. But all of evolution has taken place under the sun. Is it realistic, then, to think that every organism must learn this principle from scratch? Wayne Hershberger has shown that chicks raised entirely with light from below still interpret ambiguous images consistent with light from above.

Infant habituation studies show that size and shape are perceived correctly on the first day of life. The baby regards a small nearby object and a distant larger object as different even when they make the same retinal image. Conversely, newborns can recognize an object placed at two different distances as the same object, despite the different retinal size, and likewise recognize the same rectangle placed at different slants. How can the newborn learn something so sophisticated in a matter of hours? None of this evidence is considered in the book.

No one denies that perceptual learning occurs under specific conditions. But Purves and Lotto have not made a case for the heavy lifting they attribute to it.

Although the title of their book echoes Koffka's famous question, “Why do things look as they do?” Purves and Lotto generally neglect the work of Koffka and the other Gestalt theorists, work that demolished the past-experience theories of an earlier day. Purves and Lotto dismiss the Gestalt emphasis on contextual information, writing, “the context is simply a collection of other patches whose respective ambiguities are just as profound as the ambiguities of the designated targets.” But as Gibson has brilliantly shown, the visual system responds to patterns of patches, and these are not ambiguous in the way that individual patches are.

The non-specialist will appreciate the wonderful illustrations that fill this book and the clear introduction to the fundamental challenge of vision. But in explaining how vision succeeds, Purves and Lotto ignore crucial pieces of evidence. And they add little that is new to the debate.