Information processing with population codes


Information is encoded in the brain by populations or clusters of cells, rather than by single cells. This encoding strategy is known as population coding. Here we review the standard use of population codes for encoding and decoding information, and consider how population codes can be used to support neural computations such as noise removal and nonlinear mapping. More radical ideas about how population codes may directly represent information about stimulus uncertainty are also discussed.

Key Points

  • Population codes:

    Information about quantities in the world is represented by patterns of neural activity in a characteristic way. Each cell responds to a specific range of values of a quantity, so any particular value leads to coordinated firing across a whole population of cells.

  • Encoding in the standard model:

    Under the standard model, a single value of a quantity is encoded by the population. Each cell has a tuning curve for the quantity, which shows how its average response (in spikes per second) varies with the quantity. The actual population activity on any trial is noisy about these means and the noise has a variance that can be characterized.

  • Decoding by maximum likelihood and Bayes rule:

    Under the standard model, a simple statistical technique can be used to find out what the activity of the population on any trial implies about the value of the quantity encoded. Under some further assumptions, the most likely value of the quantity can be extracted, essentially by a process of curve fitting the average responses predicted by the tuning curves of the cells to the actual responses recorded.

  • Decoding by recurrent interactions:

    Decoding in the standard model seems to require complex mathematical operations. However, nonlinear recurrent networks of neurons can be constructed that have stable points corresponding to each value of the encoded variable, and these networks can be shown to perform nearly optimal decoding using simple interactions.

  • Basis function mappings:

    The tuning curves of neurons show that they collectively form a particular representation, called a basis function representation, of the quantity they encode. This means that any function of this quantity, even a nonlinear one, can be extracted as a linear sum over the activities of the population of neurons. This underlies a successful and predictive model of parietal cortex.

  • Probabilistic population codes:

    Although the standard model is a powerful way of characterizing population codes, it has some shortcomings. In particular, it cannot correctly represent the uncertainty the animal might have about the quantity encoded. An extension to the standard model can be defined, in which the population of neurons is treated as encoding uncertainty (and also multiplicity, in the case that multiple values of the quantity are present).
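As a rough illustration of encoding and maximum likelihood decoding under the standard model, the Python sketch below assumes Gaussian tuning curves and independent Poisson noise on spike counts. The function names, the 37-cell population, the peak count of 30 spikes, the tuning width and the grid search over candidate values are all illustrative assumptions rather than details taken from the article.

```python
import math
import random

def tuning_curve(s, preferred, peak=30.0, width=15.0, baseline=1.0):
    """Mean spike count of one cell for stimulus s (Gaussian shape is an assumption)."""
    return baseline + peak * math.exp(-0.5 * ((s - preferred) / width) ** 2)

def sample_poisson(mean, rng):
    """Draw a Poisson count with Knuth's multiplication method (adequate for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def encode(s, preferred_values, rng):
    """One noisy trial: each cell fires around the mean given by its tuning curve."""
    return [sample_poisson(tuning_curve(s, p), rng) for p in preferred_values]

def log_likelihood(s, counts, preferred_values):
    """Poisson log-likelihood of the counts, dropping the log(r!) term that does not depend on s."""
    total = 0.0
    for r, p in zip(counts, preferred_values):
        mean = tuning_curve(s, p)
        total += r * math.log(mean) - mean
    return total

def decode_ml(counts, preferred_values, candidates):
    """Return the candidate stimulus value with the highest likelihood (grid search)."""
    return max(candidates, key=lambda s: log_likelihood(s, counts, preferred_values))

rng = random.Random(0)
preferred_values = [float(p) for p in range(-90, 91, 5)]  # 37 hypothetical cells
candidates = [c / 2.0 for c in range(-120, 121)]          # -60 to 60 in steps of 0.5
true_s = 12.0
counts = encode(true_s, preferred_values, rng)
estimate = decode_ml(counts, preferred_values, candidates)
print(f"true value {true_s}, maximum likelihood estimate {estimate}")
```

The grid search stands in for the curve-fitting step: it looks for the stimulus value whose predicted mean responses best match the recorded counts. With these assumed parameters the estimate should typically land within a few units of the true value, although the exact result depends on the random draw.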


Figure 1: The standard population coding model.
Figure 2: A neural implementation of a maximum likelihood estimator.
Figure 3: Function approximation with basis functions.
Figure 4: Multiplicity and uncertainty in population codes.



Related links


Computational neuroscience

Gatsby computational neuroscience unit

Alex Pouget's web page

Peter Dayan's web page

Richard Zemel's web page



Neuronal noise is the part of a neuronal response that cannot apparently be accounted for by the stimulus. Part of this noise may arise from truly random effects (such as stochastic fluctuations in neuronal channels), and part from uncontrolled, but non-random, effects.


A linear function of a one-dimensional variable (such as direction of motion) is any function whose graph is a straight line, that is, any function that can be written as y = ax + b, where a and b are constants. In two dimensions and above, linear functions correspond to planes and hyperplanes. All other functions are nonlinear.


A Gaussian function is a bell-shaped curve. Gaussian tuning curves are used extensively because their analytical expression is easy to manipulate in mathematical derivations.


A tuning curve to a feature is the curve describing the average response of a neuron as a function of the feature values.


Lateral connections are formed between neurons at the same hierarchical level. For instance, the connections between cortical neurons in the same area and same layer are said to be lateral. Lateral connections are symmetric if any connection from neuron a to neuron b is matched by an identical connection from neuron b to neuron a.


In the visual cortex, an orientation hypercolumn refers to a patch of cortex containing neurons with similar spatial receptive fields but covering all possible preferred orientations. This concept can be generalized to other visual features and to other sensory and motor areas.


Ideal observer analysis refers to the statistical computation of extracting all the information about the stimulus that is implied by the (noisy) activities of the population. Ideal observers make optimal inferences.


A mapping is a transformation from a variable x to a variable y, such as y = x². The identity mapping is the simplest form of such a mapping, in which y is simply equal to x.


In linear algebra, a set of vectors such that any other vector can be expressed in terms of a weighted sum of these vectors is known as a basis. By analogy, sine and cosine functions of all possible frequencies are said to form a basis set.
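By analogy with a basis set, a population of bell-shaped tuning curves can act as basis functions, so that a nonlinear function of the encoded variable is read out as a weighted sum of the cells' activities. The Python sketch below approximates sin(x) in this way. The tuning width, the spacing of preferred values and the simple weight rule are assumptions chosen for illustration; a least-squares fit of the weights would generally be more accurate.

```python
import math

SIGMA = 0.2    # tuning width (assumed)
SPACING = 0.1  # spacing between preferred values (assumed)
# Preferred values extend about five widths beyond [-pi, pi] to limit edge effects.
CENTERS = [SPACING * n for n in range(-42, 43)]

def responses(x):
    """Noise-free population activity: one Gaussian tuning curve per cell."""
    return [math.exp(-0.5 * ((x - c) / SIGMA) ** 2) for c in CENTERS]

def make_weights(target):
    """Set each weight from the target's value at the cell's preferred value.

    The scale factor makes the weighted sum approximate a Gaussian-smoothed
    copy of the target function.
    """
    scale = SPACING / (SIGMA * math.sqrt(2.0 * math.pi))
    return [scale * target(c) for c in CENTERS]

def readout(x, weights):
    """Linear readout: weighted sum over the activities of the population."""
    return sum(w * r for w, r in zip(weights, responses(x)))

weights = make_weights(math.sin)
test_points = [-math.pi + k * math.pi / 50 for k in range(101)]
max_error = max(abs(readout(x, weights) - math.sin(x)) for x in test_points)
print(f"largest readout error on [-pi, pi]: {max_error:.4f}")
```

Only the weights depend on the target function, so the same population activity could feed several readouts, each computing a different function of x.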


A Fourier transform is a transformation that expresses any function in terms of a weighted sum of sine and cosine functions of all possible frequencies. The weights assigned to each frequency are specific to the function being considered and are known as the Fourier coefficients for this function.


Backpropagation is a learning algorithm based on the chain rule in calculus, in which error signals computed in the output layer are propagated back through any intervening layers to the input layer of the network.


A Hebbian learning rule is a learning rule in which the synaptic strength of a connection is changed according to the correlation in the activities of its presynaptic and postsynaptic sides.
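A correlation-based rule of this kind, commonly called a Hebbian rule, can be sketched in a few lines of Python. The patterns, the learning rate of 0.25 and the function names are hypothetical, and the presynaptic patterns are chosen to be orthogonal so that recall is exact; with overlapping patterns the recalled outputs would interfere.

```python
def hebbian_update(weights, pre, post, rate):
    """Add rate * post[i] * pre[j] to each weight (plain Hebbian rule, no decay)."""
    return [[w + rate * post_i * pre_j for w, pre_j in zip(row, pre)]
            for row, post_i in zip(weights, post)]

def recall(weights, pre):
    """Postsynaptic activity produced by a presynaptic pattern (linear units)."""
    return [sum(w * p for w, p in zip(row, pre)) for row in weights]

# Two orthogonal presynaptic patterns and the outputs to associate with them.
pre_patterns = [[1, 1, -1, -1], [1, -1, 1, -1]]
post_patterns = [[1, -1], [-1, -1]]

weights = [[0.0] * 4 for _ in range(2)]
for pre, post in zip(pre_patterns, post_patterns):
    weights = hebbian_update(weights, pre, post, rate=0.25)

recalled = [recall(weights, pre) for pre in pre_patterns]
print(recalled)
```

The rate of 0.25 is the inverse of each presynaptic pattern's squared length (4), which is why the recalled outputs match the stored ones without any further scaling.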


The delta rule is a learning rule that adjusts synaptic weights according to the product of the presynaptic activity and a postsynaptic error signal obtained by computing the difference between the actual output activity and a desired or required output activity.
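The error-driven rule described above, often called the delta rule, can be sketched for a single linear output unit as follows. The target mapping, the learning rate of 0.1, the number of trials and the function name are assumptions made for the example.

```python
import random

def delta_rule_step(weights, inputs, desired, rate):
    """One update: weight change = rate * (desired - actual) * presynaptic activity."""
    actual = sum(w * x for w, x in zip(weights, inputs))
    error = desired - actual
    return [w + rate * error * x for w, x in zip(weights, inputs)]

rng = random.Random(1)
target_weights = [2.0, -1.0]  # hypothetical mapping the unit should learn
weights = [0.0, 0.0]

for _ in range(2000):
    inputs = [rng.uniform(-1.0, 1.0) for _ in target_weights]
    desired = sum(t * x for t, x in zip(target_weights, inputs))
    weights = delta_rule_step(weights, inputs, desired, rate=0.1)

print([round(w, 4) for w in weights])
```

Because the desired outputs here are noise-free and generated by a linear mapping, the learned weights should approach the target weights; with noisy targets they would instead fluctuate around them.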


Unsupervised learning is a form of adaptation in which a network is trained to uncover and represent the statistical structure within a set of inputs, without reference to a set of explicitly desired outputs. This contrasts with supervised learning, in which a network is trained to produce particular desired outputs in response to given inputs.


Motion transparency is a situation in which several directions of motion are perceived simultaneously at the same location. This occurs, for example, when looking through the windscreen of a moving car: at each location, the windscreen is perceived as being still while the background moves in a direction opposite to the motion of the car.


A grating is a visual stimulus consisting of alternating light and dark bars, like the stripes on the United States flag. A full-field grating is a very wide grating that occupies the entire visual field.

About this article

Cite this article

Pouget, A., Dayan, P. & Zemel, R. Information processing with population codes. Nat Rev Neurosci 1, 125–132 (2000).
