Nature Neuroscience
9, 1356–1357 (2006)
doi:10.1038/nn1106-1356
The virtue of simplicity
Tao Zhang & Kenneth H Britten
The authors are at the Center for Neuroscience, and Kenneth H. Britten is also in the Section of Neurobiology, Physiology, and Behavior, University of California at Davis, 1544 Newton Court, Davis, California 95616, USA. khbritten@ucdavis.edu
Multiple local motions must be combined to determine the direction of object motion, which is harder than it seems. A new paper proposes an elegant and simple solution to this problem, eminently realizable in feed-forward circuits.
Physicists have long regarded simpler models as more valuable, no matter how complex the problem. Neuroscientists have not always embraced this notion, perhaps because of the confusing plethora of detail that the biology of the brain offers. The visual neuroscience of pattern motion processing is typical of tractable vision problems: there is a welter of experimental detail, along with a variety of models. However, models that work well to explain perceptual phenomena are often difficult to instantiate in 'wetware.' For these reasons, a simple model that explains such a complex perceptual problem in neuronally realistic terms provides considerable cause for rejoicing.
The model of Rust et al. in this issue (ref. 1) combines two simple mechanisms to produce a very good account of how neurons in the middle temporal extrastriate visual cortex (MT, or equivalently V5) might acquire their selectivity for pattern direction. This well-studied, mid-level area is highly specialized for the analysis of motion (refs. 2, 3). It receives direct inputs from primary visual cortex (V1) and in turn projects to premotor structures. One of the most provocative findings from MT is that neurons integrate multiple directions of motion to unambiguously signal the motion of patterns: they can solve the so-called 'aperture problem' (ref. 4).
The aperture problem refers to the observation that an object's direction of motion can be hard to determine when information is available from only a limited area of space. Neurons in V1 have very small receptive fields, and thus can only sample a small patch of the moving object (Fig. 1a). This means that each V1 neuron can only capture the component of motion that is perpendicular to the moving edge (black arrow), which may not be the same as the object's actual direction of motion. A whole family of possible object motions will cause the same response in this V1 neuron. In other words, the motion of the object is ambiguous to any single V1 neuron. To solve this problem, the visual system needs to integrate multiple V1 responses representing the motion of non-parallel moving contours. We can demonstrate this by using two V1 neurons (Fig. 1b). Only one object motion (red arrow) is consistent with the component motions observed by both V1 cells.
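The geometry of this argument can be made concrete with a small calculation. The sketch below is only an illustration of the constraint logic, not a model of any neuron: it assumes each V1 cell reports the projection of the true velocity onto the normal of its contour, and then solves the two resulting linear equations. The function names, the ±45° contour normals and the rightward test velocity are all assumptions chosen for the example.

```python
import math

def unit_normal(direction_deg):
    """Unit vector along the contour normal, given as a direction in degrees."""
    rad = math.radians(direction_deg)
    return math.cos(rad), math.sin(rad)

def component_speed(velocity, normal_deg):
    """What one aperture can measure: the velocity projected onto the contour normal."""
    nx, ny = unit_normal(normal_deg)
    return velocity[0] * nx + velocity[1] * ny

def pattern_velocity(normal1_deg, speed1, normal2_deg, speed2):
    """Solve v . n1 = speed1 and v . n2 = speed2 for the 2-D velocity v (Cramer's rule)."""
    a, b = unit_normal(normal1_deg)
    c, d = unit_normal(normal2_deg)
    det = a * d - b * c
    if abs(det) < 1e-12:
        # Parallel contours give the same constraint twice, so nothing is resolved.
        raise ValueError("parallel contours leave the object motion ambiguous")
    vx = (speed1 * d - b * speed2) / det
    vy = (a * speed2 - c * speed1) / det
    return vx, vy

# Hypothetical object moving rightward at unit speed.
true_velocity = (1.0, 0.0)

# Two apertures over contours whose normals point at +45 and -45 degrees.
s1 = component_speed(true_velocity, 45.0)
s2 = component_speed(true_velocity, -45.0)

# A different object motion (straight up) has the same projection onto the
# first normal, up to rounding: one aperture alone cannot tell them apart.
s1_other = component_speed((0.0, 1.0), 45.0)

# Combining both constraints recovers the true velocity.
recovered = pattern_velocity(45.0, s1, -45.0, s2)
```

A single measurement fixes only one line of candidate velocities; two non-parallel contours give two such lines, and their intersection is the object's motion.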
MT in primate visual cortex has abundant opportunity to integrate inputs from directionally selective cells in V1 (ref. 5). MT cells' receptive fields are about 10 times bigger than those in V1, often encompassing multiple object contours (ref. 6). The original observation, which has long begged for mechanistic explanation, is that some neurons in MT clearly combine multiple motions to represent the unique pattern direction. Other neurons ('component selective') in MT behave much like expanded V1 cells, and respond to single contours but not to the pattern as a whole. Many others respond in intermediate ways (ref. 4). The paper by Rust et al. not only explains the emergence of pattern responses, but also compactly captures circuit features that contribute to this diversity of response types in MT.
The new model of Rust and colleagues (ref. 1; Fig. 2) contains two stages, a V1 stage and an MT stage. The V1 stage is designed to resemble actual V1 data and contains a feature that turns out to be critical for the model's success: two kinds of divisive inhibition. Such inhibition has been well documented in V1, though the details are still being worked out (refs. 7, 8). The two types can be thought of as those within single columns in V1 and those between columns. The V1 population then projects to an MT neuron, which weights and adds the inputs, then passes the resulting signal through a nonlinearity that represents spike generation. This kind of linear-nonlinear ('LN') model has been very successful at capturing integration at many stages of the visual system (refs. 9, 10). The model was individually adjusted for each cell, which is necessary for explaining neuronal variability, using a clever reverse-correlation method. Although the model seems complex, the actual number of free parameters adjusted was relatively modest. Once developed, the model was tested against actual patterned stimuli of the sort commonly used in such experiments. The model passed this test convincingly, capturing the responses of the neurons with high fidelity.
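To give a feel for the two-stage architecture described above, here is a toy sketch of a cascade of this general shape. It is not the implementation of Rust et al., and none of its numbers come from their paper: the 12 direction channels, the Gaussian tuning widths, the normalization constants, the weight profile and the squaring output nonlinearity are all assumptions picked for illustration. The two divisive terms stand in loosely for the two kinds of inhibition, one driven by each unit's own activity and one by the whole population.

```python
import math

N_DIRECTIONS = 12  # assumed number of V1 direction channels
DIRECTIONS = [i * 360.0 / N_DIRECTIONS for i in range(N_DIRECTIONS)]

def angle_diff(a, b):
    """Signed difference between two directions, wrapped to [-180, 180) degrees."""
    return (a - b + 180.0) % 360.0 - 180.0

def v1_drive(stimulus_dirs, bandwidth=30.0):
    """Linear drive of each V1 channel: summed Gaussian tuning to each grating."""
    return [
        sum(math.exp(-0.5 * (angle_diff(d, s) / bandwidth) ** 2) for s in stimulus_dirs)
        for d in DIRECTIONS
    ]

def v1_normalize(drive, self_weight=0.5, pool_weight=0.5, sigma=0.1):
    """Divide each unit by its own drive (tuned term) and the population mean (untuned term)."""
    pool = sum(drive) / len(drive)
    return [x / (sigma + self_weight * x + pool_weight * pool) for x in drive]

def mt_response(v1, preferred=0.0, width=60.0, inhibition=0.3, exponent=2.0):
    """Weighted sum of V1 outputs followed by a rectifying power nonlinearity."""
    weights = [
        math.exp(-0.5 * (angle_diff(d, preferred) / width) ** 2) - inhibition
        for d in DIRECTIONS
    ]
    summed = sum(w * r for w, r in zip(weights, v1))
    return max(summed, 0.0) ** exponent

def model(stimulus_dirs):
    """Full cascade: V1 drive, divisive normalization, then the MT linear-nonlinear stage."""
    return mt_response(v1_normalize(v1_drive(stimulus_dirs)))

# Single gratings drifting in the preferred and the opposite direction.
preferred_grating = model([0.0])
opposite_grating = model([180.0])

# A plaid whose two components straddle the preferred direction.
plaid = model([-60.0, 60.0])
```

In this toy version the broad excitatory weights, together with negative weights for distant directions, let the unit respond to a plaid whose components flank its preferred direction. Narrowing the weight profile (a smaller `width`) pushes the same unit toward component-like behavior, which is the flavor of the diversity the commentary describes.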
One might reasonably ask at this point where the added value of another clever model lies, especially given that this particular phenomenon has already been successfully explained by other models, some also simple and elegant (refs. 11–13). Models have two chief goals: the consolidation of our ideas (descriptive value) and the generation of useful new experiments (predictive value). Simple, realistic models such as this one win by both metrics. The LN architecture at the heart of the Rust et al. model (ref. 1) is an idea of considerable power. Using it successfully against a long-standing computational problem is further evidence of its generality; this is a significant consolidation. The success of this model in this instance will no doubt drive others to test this architecture against their favorite problems, and many more complex problems await.
The model also makes a number of specific predictions about the circuitry between cortical areas. In particular, it proposes a particular afferent architecture to bestow either simple (component) or more complex (pattern) responses on MT cells. These predictions are, unfortunately, rather difficult to test at present, as the tests will require paired recordings between MT neurons and their afferents. Such experiments, though possible with current technology, are arduous, difficult and slow. Some of the simplifying predictions will no doubt prove wrong, as the authors freely admit. MT receives afferents from many sources other than V1, and indeed these are likely to convey different kinds of information. Another simple prediction of the model is almost certainly wrong on the basis of current evidence; we are pretty sure that divisive inhibition operates not only in V1, but also in MT itself (ref. 14). So, bringing a circuit reality to the nonlinearities at the MT level is a target for the next generation of experiments.
The lesson from physics is that models turning out to be wrong is all part of the game and should be viewed with approval. We get to better understanding by climbing a ladder built of the bones of dead models. If there is a core truth, some useful generality achieved by any generation of model, this is major progress. There has long been a desire to find general mechanisms of information processing that will apply across cortical areas (ref. 15), and this paper marks a notable step in that direction. It is starting to look as if the LN family of models might be such a unifying framework.
REFERENCES
1. Rust, N.C., Mante, V., Simoncelli, E.P. & Movshon, J.A. Nat. Neurosci. 9, 1421–1431 (2006).
2. Born, R.T. & Bradley, D.C. Annu. Rev. Neurosci. 28, 157–189 (2005).
3. Britten, K.H. in The Visual Neurosciences (eds. Chalupa, L.M. & Werner, J.S.) 1203–1216 (MIT Press, Cambridge, Massachusetts, 2004).
4. Movshon, J.A., Adelson, E.H., Gizzi, M.S. & Newsome, W.T. in Study Group on Pattern Recognition Mechanisms (eds. Chagas, C., Gattass, R. & Gross, C.) 117–151 (Pontificia Academia Scientiarum, Vatican City, 1985).
5. Movshon, J.A. & Newsome, W.T. J. Neurosci. 16, 7733–7741 (1996).
6. Maunsell, J.H.R. & Van Essen, D.C. J. Neurophysiol. 49, 1127–1147 (1983).
7. Heeger, D.J. Vis. Neurosci. 9, 181–197 (1992).
8. Ferster, D. & Miller, K.D. Annu. Rev. Neurosci. 23, 441–471 (2000).
9. Chichilnisky, E.J. Network 12, 199–213 (2001).
10. David, S.V., Hayden, B.Y. & Gallant, J. J. Neurophysiol. advance online publication, 20 September 2006 (doi: 10.1152/jn.00575.2006).
11. Simoncelli, E.P. & Heeger, D.J. Vision Res. 38, 743–761 (1998).
12. Wilson, H.R. & Ferrera, V.P. Vis. Neurosci. 9, 79–97 (1992).
13. Grossberg, S. & Mingolla, E. Percept. Psychophys. 53, 243–278 (1993).
14. Britten, K.H. & Heuer, H.W. J. Neurosci. 19, 5074–5084 (1999).
15. Douglas, R.J. & Martin, K.A. J. Physiol. (Lond.) 440, 735–769 (1991).