Try this quick do-it-yourself experiment: look at an illuminated light bulb for a few seconds, and then view the afterimage first on your hand and finally on a nearby wall. The afterimage seems bigger as the surface on which it is viewed becomes farther away. This illusion1, reported by Emmert over one hundred years ago, demonstrates one of the most intriguing aspects of vision: even when objects cast exactly the same size pattern of light on the retina, they appear markedly different in size when viewed at different distances. In going from retinal image to conscious perception, the visual system is therefore able to factor in perceived distance to change how big something looks.
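Emmert's observation can be stated compactly: for a fixed retinal image, perceived linear size grows in proportion to perceived distance. The symbols below are introduced here for illustration and do not come from Emmert's original report:

```latex
% Emmert's law (schematic): an afterimage subtending a constant retinal
% angle \theta appears to have linear size S proportional to the perceived
% distance D of the surface on which it is seen (k is a scaling constant).
S_{\mathrm{perceived}} = k \,\theta\, D_{\mathrm{perceived}}
```

On this account, the same afterimage viewed on a surface perceived as twice as distant should appear roughly twice as large, which is just what the light-bulb demonstration shows.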

Exactly how the visual system achieves this feat remains unclear. It was traditionally assumed that early visual processing areas primarily reflect the physical input from the retina, whereas activity in higher-order areas more closely resembles conscious perception. On such an account, the perceived size of an object should be reflected in activity in higher visual areas. However, in this issue, Murray and colleagues2 report a very different pattern of results. They used functional magnetic resonance imaging (fMRI) to measure the spatial pattern of activity in human primary visual cortex (V1) while volunteers viewed objects that were physically the same size (and therefore produced identical patterns of retinal input) but were perceived as different in size. Surprisingly, the spatial extent of activity in this very first cortical visual area reflected not the size of the retinal input, but instead the perceived size of the object. This remarkable finding challenges the notion that V1 contains a precise one-to-one map of the retinal input, and for the first time links the spatial extent of what we perceive to the exact spatial distribution of activity in human V1.

The authors measured brain activity while subjects viewed pictures of identically sized spheres placed in a picture of a three-dimensional (3D) hallway. A compelling size illusion is immediately apparent (Fig. 1): the sphere at the end of the hallway looks markedly bigger than the one at the start, even though the two spheres are actually exactly the same size. Indeed, when subjects were asked to compare the size of these objects with two-dimensional (2D) flat disks (presented on a background without 3D cues), they judged the front sphere to be slightly smaller than the equally sized 2D disk and the back sphere to be larger. The contextual depth cues in the 3D scene (texture gradients and linear perspective) thus affect the perceived size of the objects.
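To make the logic concrete, the Emmert's-law relation above can be applied with entirely hypothetical perceived distances for the two spheres (the study does not report such values):

```python
# Toy Emmert's-law calculation; all numbers are hypothetical stand-ins.
theta = 0.05                  # retinal angle (radians), identical for both spheres
d_front, d_back = 2.0, 6.0    # hypothetical perceived distances (metres)

s_front = theta * d_front     # perceived linear size ~ retinal angle * distance
s_back = theta * d_back
print(s_front, s_back)        # ~0.1 vs ~0.3: the back sphere appears larger
```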

Figure 1: The stimuli used in the experiment. The two spheres are actually the same size. (Image: Scott Murray)

Using retinotopic mapping to delineate primary visual cortex, Murray and colleagues examined whether the size of activation patterns in V1 differed when subjects looked at either the front or the back sphere. Remarkably, when the sphere that subjects were looking at was perceived to be bigger (owing to the contextual cues), activity in V1 spread over a larger area than when it was perceived to be smaller, even though the retinal images produced by the two spheres were identical. Activity at the earliest stage of cortical processing therefore does not simply reflect the pattern of light falling on the retina. Somehow, the complex three-dimensional cues present in the scene (Fig. 1) are integrated so that the representation in V1 takes perceived depth into account.
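As a rough sketch of how such a difference in spatial extent might be quantified, assuming a voxelwise statistical map and a retinotopically defined V1 mask (this is not the authors' analysis pipeline; the function name, threshold and data are placeholders):

```python
import numpy as np

def activation_extent(stat_map, v1_mask, threshold=2.3):
    """Count suprathreshold voxels within a retinotopically defined V1 mask."""
    return int(((stat_map > threshold) & v1_mask).sum())

# Hypothetical usage: compare the spatial extent of V1 activation evoked by
# the perceptually 'bigger' (back) and 'smaller' (front) spheres, which
# produce identical retinal images.
rng = np.random.default_rng(0)
v1 = rng.random((16, 16, 16)) > 0.7               # stand-in V1 mask
back = rng.normal(0.5, 1.0, size=(16, 16, 16))    # stand-in statistical maps
front = rng.normal(0.0, 1.0, size=(16, 16, 16))
print(activation_extent(back, v1), activation_extent(front, v1))
```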

Because V1 is a large cortical area relative to the spatial resolution of functional MRI, Murray and colleagues were able to compare in detail the activation produced by purely perceptual (illusory) variations in object size with that produced by physical changes in object size. They compared the difference in the distribution of V1 activity between the perceptually 'bigger' and 'smaller' spheres (Fig. 1) with the difference between two-dimensional discs whose physical sizes matched the perceived size difference. The differences in V1 activity patterns were strikingly similar in the two cases, suggesting that the illusory size difference is represented in V1 much as a physical size difference would be. This further strengthens the claim that the V1 representation of an object closely reflects its perceived size. In further careful control experiments, Murray and colleagues ruled out alternative explanations, such as the local contextual cues altering perceived brightness (which can itself affect V1 activation3) rather than perceived size.
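In the same spirit, here is a minimal sketch of how one might ask whether an illusory difference map resembles a physically matched one within V1; the function names and data are hypothetical, and the authors' actual analysis may well have differed:

```python
import numpy as np

def difference_map(map_a, map_b, v1_mask):
    """Voxelwise activity difference between two conditions, restricted to V1."""
    return np.where(v1_mask, map_a - map_b, np.nan)

def pattern_similarity(diff_1, diff_2):
    """Pearson correlation between two difference maps over valid V1 voxels."""
    valid = ~np.isnan(diff_1) & ~np.isnan(diff_2)
    return np.corrcoef(diff_1[valid], diff_2[valid])[0, 1]

# Hypothetical usage: does the illusory difference (back vs. front sphere)
# resemble the physical difference (large vs. small 2-D disc)?
rng = np.random.default_rng(1)
shape, v1 = (16, 16, 16), rng.random((16, 16, 16)) > 0.7
illusory = difference_map(rng.normal(size=shape), rng.normal(size=shape), v1)
physical = difference_map(rng.normal(size=shape), rng.normal(size=shape), v1)
print(pattern_similarity(illusory, physical))
```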

Murray and colleagues could not determine precisely where this effect of perceived size on V1 activity arises, because for technical reasons they could not examine activity beyond V1. The visual system must combine information about the perceived depth of an object (provided by the environmental context in Fig. 1) with the projection of that object on the retina. Computing perceived depth from two-dimensional pictorial cues such as linear perspective and texture gradients is associated with activity in parietal cortex4,5. Presumably, such depth signals influence V1 through feedback that alters the size of the object representation. However, whether the object representation in V1 causes the conscious perception of size remains an open question. Intriguingly, the perceived size of afterimages generated by stimulating the blind hemifield of an individual whose primary visual cortex had been surgically removed nevertheless obeys Emmert's law6. This suggests that activity in areas other than V1 may be sufficient to support the scaling of perceived size with perceived distance, at least for some types of image. A closer characterization of the functional role of V1 in the conscious perception of size therefore remains an intriguing topic for future research.

This work is not the first to show that V1 activity can be strongly linked to conscious perception rather than to physical (retinal) stimulation7. It is also clear that neural processing in V1 reflects not just feed-forward signals but also feedback influences from higher areas8. However, this work not only provides a particularly clear and compelling example of these properties but also, for the first time, links the spatial extent of what we perceive (rather than, for example, contrast or direction of motion) to the spatial extent of activity in V1. More fundamentally, these findings force us to re-evaluate the notion of a 'hard-wired' retinotopy in V1. The finding that V1 contains a topographic map of the retinal projection of the visual field has been central to visual neuroscience9,10. It now seems, instead, that the topographic map in V1 can be modified dynamically according to the perceived size of an object. This has important implications not only for understanding the role of V1 in visual processing but also for experimental practice. For instance, it has become common practice in functional MRI studies of early visual areas to localize spatially delimited regions of interest using retinotopic mapping. The general usefulness of this approach notwithstanding, future studies will have to take into account the possibility that visual context can dynamically modify this retinotopy, even in early visual areas.

Dynamic shifts in how retinal outputs map onto cortical targets (such as the retinotopic maps in V1) are a key component of an influential computational model11 that seeks to resolve computational problems in the domains of stereopsis (depth perception from binocular cues), spatial attention and motion perception. Flexible mappings between arrays of neurons at different levels of the visual pathway may thus reflect a common computational strategy for optimal vision. The limits of this 'flexible retinotopy' (ref. 12) will need to be probed, and the fine-grained neural mechanisms uncovered, through complementary studies in nonhuman primates. At the single-neuron level, primate V1 responses are modulated by the distance of an object13,14, forming a potential neural substrate for the dynamic changes observed at a much coarser spatial scale by Murray and colleagues.
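As a toy illustration of the general idea of a dynamically shifted retinotopy (not an implementation of the model in ref. 11), one can resample a fixed 'retinal' activity profile onto a 'cortical' array with a context-dependent magnification factor; all numbers below are illustrative:

```python
import numpy as np

def remap(retinal_profile, scale):
    """Resample a 1-D 'retinal' activity profile onto a 'cortical' array,
    magnified (about coordinate 0, for simplicity) by a context-dependent
    scale factor: a toy stand-in for a dynamically shifted retinotopy."""
    n = retinal_profile.size
    source = np.clip(np.arange(n) / scale, 0, n - 1)
    return np.interp(source, np.arange(n), retinal_profile)

profile = np.zeros(64)
profile[28:36] = 1.0                 # identical retinal extent in both cases
near = remap(profile, scale=0.9)     # perceived nearer: slightly smaller spread
far = remap(profile, scale=1.3)      # perceived farther: larger cortical spread
print((near > 0.5).sum(), (far > 0.5).sum())
```

The same retinal profile ends up occupying more 'cortical' units when the context signals a greater perceived distance, qualitatively matching the fMRI observation.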

Indeed, at a fine-grained level, these findings also raise intriguing questions about whether a V1 representation of the environment that reflects perceived depth and size can be internally coherent. For example, when two objects at different perceived depths partially occlude each other, correct border assignment may be particularly complex, as the portions of the objects adjacent to the border may be displaced relative to one another according to their perceived depth. The observation that the near and far objects were judged to appear, respectively, smaller and larger than an equivalently sized two-dimensional object may point to a 'push-pull' mechanism for maintaining coherence in a spatially distributed V1 representation of the subjects' perceptions.

Taken together, these compelling findings force us once again to consider a revised model of visual processing in which V1, far from being a passive feed-forward recipient of retinal signals, instead flexibly combines retinal and extraretinal signals to potentially build an integrated representation of the perceived visual environment. Future study of how V1 activity relates to human consciousness will doubtless continue to be both interesting and informative.