Published online 7 January 1999 | Nature | doi:10.1038/news990107-7


Deafening silence

To our brains, the meaning of a visual scene could be determined as much by what isn’t there as by what is. In a report in Nature Neuroscience, Rajesh P. N. Rao of the Salk Institute, La Jolla, California, and Dana H. Ballard of the University of Rochester, New York, describe a theoretical model called ‘predictive coding’, in which the stream of visual information entering the brain is progressively winnowed, so that the visual brain concentrates only on things that are different or unusual.

Although predictive coding explains much experimental evidence, it is the exact opposite of models of brain function in which neurons fire when they recognize certain preferred features in a scene. Which is right? As usual, only time and more research will tell.

When we turn our eyes on a scene, patterns of light are converted into neuronal impulses that travel to a part of the brain called the visual cortex. This is arranged hierarchically. Raw input comes into neurons in the lowest level. These neurons may be sensitive to particular features in a scene, such as lines, the orientation of edges, or colour contrasts, but do not apprehend the scene as a whole. They process their inputs and send outputs to the next level up. Several neurons from one level feed single neurons further up, until - several layers on - there are neurons able to integrate all the inputs and ‘comprehend’ the ‘big picture’.
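The feedforward hierarchy described above can be sketched in a few lines of Python. This is an illustrative toy, not a biological model: the detector thresholds and the patch representation are assumptions chosen for clarity. Several low-level ‘neurons’ each see only a small patch, and a single higher-level ‘neuron’ integrates their outputs.

```python
# Toy sketch of the feedforward hierarchy (illustrative assumptions only):
# low-level detectors respond to a local feature in their own small patch;
# a higher-level neuron fires only when enough detectors below it agree.

def edge_detector(patch):
    """Low-level neuron: fires when its patch contains a strong contrast."""
    return 1 if max(patch) - min(patch) > 0.5 else 0

def higher_neuron(responses):
    """Higher-level neuron: integrates several low-level outputs."""
    return 1 if sum(responses) >= 2 else 0

# Three receptive-field patches from a scene: two contain an edge, one is flat.
scene = [[0.0, 1.0], [0.1, 0.9], [0.5, 0.5]]
low_level = [edge_detector(p) for p in scene]   # [1, 1, 0]
print(higher_neuron(low_level))                  # 1: the 'big picture' neuron fires
```

Each low-level unit sees only its own patch - its ‘receptive field’ - while the higher unit never sees raw input at all, only the pattern of activity below it.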

Crucially, much evidence indicates that neurons at any level fire impulses actively when they recognize (or are trained to recognize) certain features in a visual scene, whether it is a simple oblique line, or something more complicated, such as a moonlit scene.

‘Predictive coding’ is different from this ‘feedforward’ model, in that objects are represented by the absence of firing activity - a deafening silence.

The idea of predictive coding is supported by a phenomenon called ‘end-stopping’. Take a neuron that’s sensitive to a segment of a straight line. Extend the line until each end goes beyond the ‘receptive field’ of the neuron (for each neuron has its own point of view) - and something odd happens. Rather than the neuron firing more intensely, it lessens its rate of firing until it stops. Why?

It could be, reason Rao and Ballard, that longer lines are more common in everyday visual scenes than shorter ones, so they are tuned out. A low-level neuron reporting lots of long lines will be ‘told’ by neurons in levels above that long lines are to be expected, so the low-level neuron ceases activity when it ‘sees’ long, straight lines. This process saves energy and processing time, reserving them in expectation of the unexpected (in this case, a scene with a lot of short line segments in it).
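The core of the idea can be put in code. What follows is a minimal sketch, not the Rao-Ballard implementation: a higher level sends a prediction down, and the lower-level neurons fire only in proportion to the residual - the part of the stimulus the prediction fails to explain.

```python
# Minimal sketch of predictive coding (an illustrative toy, not the
# published model): lower-level neurons transmit only prediction errors.

def residual(stimulus, prediction):
    """Lower-level response: fire only where the top-down prediction fails."""
    return [s - p for s, p in zip(stimulus, prediction)]

prediction = [1.0] * 8                    # top-down expectation: a long line

long_line = [1.0] * 8                     # an expected long line fills the field
print(residual(long_line, prediction))    # all zeros: deafening silence

short_line = [0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0]  # an unexpected short segment
print(residual(short_line, prediction))   # error neurons fire at the ends
```

When the prediction matches the world, the error signal - and with it the neuron’s firing - vanishes; the ‘end-stopped’ silence is the signature of a successful prediction.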

Predictive coding is analogous to methods of data compression that save a lot of unnecessary work and memory. For example, in most natural scenes, the light intensity and colour value of neighbouring ‘pixels’ are usually closely correlated. It is simpler to encode something samey - a large red blob, say - by a kind of shorthand, in which the redness of most of the blob can be safely predicted from a few pixels. The constant firing of a battery of neurons sensitive to the colour red is unnecessary - but if the red blob starts sprouting green antennae, then that’s the time for neurons to perk up.
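The compression analogy can be made concrete with delta encoding, a simple scheme (chosen here for illustration; the article does not name a particular method) in which each pixel is stored as its difference from the previous one. A uniform red blob then costs almost nothing to transmit; only the surprise - the green antenna - produces a large value.

```python
# Toy delta-encoding sketch of the compression analogy (hypothetical example):
# predictable pixels encode as zeros; only the unexpected costs anything.

def delta_encode(pixels):
    """Store the first pixel, then only the change from its neighbour."""
    out = [pixels[0]]
    for prev, cur in zip(pixels, pixels[1:]):
        out.append(cur - prev)
    return out

def delta_decode(deltas):
    """Rebuild the pixels by accumulating the differences."""
    pixels = [deltas[0]]
    for d in deltas[1:]:
        pixels.append(pixels[-1] + d)
    return pixels

blob = [200] * 10 + [90]          # ten identical 'red' pixels, one 'green' antenna
deltas = delta_encode(blob)
print(deltas)                     # [200, 0, 0, ..., 0, -110]: silence, then surprise
assert delta_decode(deltas) == blob
```

The run of zeros is the shorthand: nothing needs saying while the scene matches the prediction, and the single large delta marks exactly where the neurons should ‘perk up’.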

Predictive coding has a body of experimental support in its favour - but so does the traditional ‘feedforward’ model. It could be that the ‘feedforward’ model describes some aspects of visual processing, such as feature selectivity (neurons that fire at the sight of flying reindeer), and predictive coding describes others (the reliable redness of Santa’s cloak). Undoubtedly, the real, live working brain is more complicated than the sum of its parts, and of the models used to describe how it works.