How do neurons in the brain represent movie stars, famous buildings and other familiar objects? Rare recordings from single neurons in the human brain provide a fresh perspective on the question.
‘Grandmother cell’ is a term coined by J. Y. Lettvin to parody the simplistic notion that the brain has a separate neuron to detect and represent every object (including one's grandmother) [1]. The phrase has become a shorthand for invoking all of the overwhelming practical arguments against a one-to-one object coding scheme [2]. No one wants to be accused of believing in grandmother cells. But on page 1102 of this issue, Quiroga et al. [3] describe a neuron in the human brain that looks for all the world like a ‘Jennifer Aniston’ cell. Ms Aniston could well become a grandmother herself someday. Are vision scientists now forced to drop their dismissive tone when discussing the neural representation of matriarchs?
A more technical term for the grandmother issue is ‘sparseness’ (Fig. 1). At earlier stages in the brain's object-representation pathway, the neural code for an object is a broad activity pattern distributed across a population of neurons, each responsive to some discrete visual feature [4]. At later processing stages, neurons become increasingly selective for combinations of features [5], and the code becomes increasingly sparse: fewer neurons are activated by a given stimulus, although the code is still population-based [6]. Sparseness has its advantages, especially for memory, because compact coding maximizes total storage capacity, and some evidence suggests that ‘sparsification’ is a defining goal of visual information processing [7,8]. Grandmother cells are the theoretical limit of sparseness, where the representation of an object is reduced to a single neuron.
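To make the idea of a sparseness scale concrete, here is a minimal sketch in Python. It assumes a normalized sparseness index of the kind used in the sparse-coding literature; the article itself does not state a formula. The function name `sparseness_index` and all firing rates are hypothetical illustrations, not data from the study. The index is 0 when every neuron in the population responds equally to a stimulus and 1 when a single neuron carries the entire response, which is the grandmother-cell limit.

```python
def sparseness_index(responses):
    """Normalized sparseness of non-negative responses, from 0 to 1.

    Assumed form (not taken from the article):
        S = (1 - mean(r)^2 / mean(r^2)) / (1 - 1/n)
    S = 0: all n neurons respond equally (fully distributed code).
    S = 1: exactly one neuron responds (grandmother-cell limit).
    """
    n = len(responses)
    if n < 2:
        raise ValueError("need at least two responses")
    if any(r < 0 for r in responses):
        raise ValueError("responses must be non-negative")
    mean = sum(responses) / n
    mean_sq = sum(r * r for r in responses) / n
    if mean_sq == 0:
        raise ValueError("all responses are zero; index is undefined")
    activity_ratio = mean * mean / mean_sq
    return (1 - activity_ratio) / (1 - 1 / n)


# Hypothetical firing rates (spikes per second) of ten neurons to one stimulus.
distributed = [5.0] * 10            # every neuron responds equally
intermediate = [10.0] * 2 + [0.0] * 8  # two of ten neurons respond
single_cell = [0.0] * 9 + [20.0]    # one neuron carries the whole response

for label, rates in [("distributed", distributed),
                     ("intermediate", intermediate),
                     ("single cell", single_cell)]:
    print(f"{label}: {sparseness_index(rates):.3f}")
```

Applied across neurons for one stimulus, as here, the index describes how sparse the population code is; the same arithmetic applied to one neuron across many stimuli would describe how selective that neuron is.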
Quiroga and colleagues [3] report what seems to be the closest approach yet to that limit. They recorded neural activity from structures in the human medial temporal lobe that are associated with late-stage visual processing and long-term memory. The structures concerned were the entorhinal cortex, the parahippocampal gyrus, the amygdala and the hippocampus, and the recordings were made in the course of clinical procedures to treat epilepsy.
The first example cell responded significantly to seven different images of Jennifer Aniston but not to 80 other stimuli, including pictures of Julia Roberts and even pictures of Jennifer Aniston with Brad Pitt. The second example cell preferred Halle Berry in the same way. Altogether, 44 units (out of 137 with significant visual responses) were selective in this way for a single object out of those tested.
The striking aspect of these results is the consistency of responses across different images of the same person or object. This relates to another major issue in visual coding, ‘invariance’ (Fig. 1). One of the most difficult aspects of vision is that any given object must be recognizable from the front or side, in light or shadow, and so on. Somehow, given those very different retinal images, the brain consistently invokes the same set of memory associations that give the object meaning. According to ‘view-invariant’ theories, this is achieved in the visual cortex by some kind of neural calculation that transforms the visual structure in different images into a common format [9,10,11]. According to ‘view-dependent’ theories, it is achieved by learning temporal associations between different views and storing those associations in the memory [12,13,14].
Quiroga and colleagues' results [3] set a new benchmark for both sparseness and invariance, at least from a visual perspective. Most of the invariant structural characteristics in images of Jennifer Aniston (such as relative positions of eyes, nose and mouth) would be present in images of Julia Roberts as well. Thus, any distributed visual coding scheme would predict substantial overlap in the neural groups representing Aniston and Roberts; cells responding to one and not the other would be rare. The clean, visually invariant selectivity of the neurons described by Quiroga et al. implies a sparseness bordering on grandmotherliness.
However, as the authors discuss, these results may be best understood in a somewhat non-visual context. The brain structures that they studied stand at the far end of the object-representation pathway or beyond, and their responses may be more memory-related than strictly visual. In fact, several example cells responded not only to pictures but also to the printed name of a particular person or object. Clearly, this is a kind of invariance based on learned associations, not geometric transformation of visual structure, and these cells encode memory-based concepts rather than visual appearance.
How do you measure sparseness in conceptual space? It's a difficult proposition, requiring knowledge of how the subject associates different concepts in memory. The authors did their best (within the constraints of limited recording time) to test images that might be conceptually related. In one tantalizing example, a neuron responded to both Jennifer Aniston and Lisa Kudrow, her co-star on the television show Friends. What seems to be a sparse representation in visual space may be a distributed representation in sitcom space! In another example, a neuron responded to two unrelated stimuli commonly used by Quiroga et al., namely pictures of Jennifer Aniston with Brad Pitt and pictures of the Sydney Opera House. This could reflect a new memory association produced by the close temporal proximity of these stimuli during the recording sessions, consistent with similar phenomena observed in monkey temporal cortex [15].
Thus, Quiroga and colleagues' findings may say less about visual representation as such than they do about memory representation and how it relates to visual inputs. Quiroga et al. have shown that, at or near the end of the transformation from visual information about object structure to memory-related conceptual information about object identity, the neural representation seems extremely sparse and invariant in the visual domain. As the authors note, these are predictable characteristics of an abstract, memory-based representation. But I doubt that anyone would have predicted such striking confirmation at the level of individual neurons.
1. Rose, D. Perception 25, 881–886 (1996).
2. Barlow, H. B. Perception 1, 371–394 (1972).
3. Quiroga, R. Q., Reddy, L., Kreiman, G., Koch, C. & Fried, I. Nature 435, 1102–1107 (2005).
4. Pasupathy, A. & Connor, C. E. Nature Neurosci. 5, 1332–1338 (2002).
5. Brincat, S. L. & Connor, C. E. Nature Neurosci. 7, 880–886 (2004).
6. Young, M. P. & Yamane, S. Science 256, 1327–1331 (1992).
7. Olshausen, B. A. & Field, D. J. Nature 381, 607–609 (1996).
8. Vinje, W. E. & Gallant, J. L. Science 287, 1273–1276 (2000).
9. Biederman, I. Psychol. Rev. 94, 115–147 (1987).
10. Marr, D. & Nishihara, H. K. Proc. R. Soc. Lond. B 200, 269–294 (1978).
11. Booth, M. C. & Rolls, E. T. Cereb. Cortex 8, 510–523 (1998).
12. Bülthoff, H. H., Edelman, S. Y. & Tarr, M. J. Cereb. Cortex 5, 247–260 (1995).
13. Vetter, T., Hurlbert, A. & Poggio, T. Cereb. Cortex 5, 261–269 (1995).
14. Logothetis, N. K. & Pauls, J. Cereb. Cortex 5, 270–288 (1995).
15. Sakai, K. & Miyashita, Y. Nature 354, 152–155 (1991).