Signals in inferotemporal and perirhinal cortex suggest an untangling of visual target information

Journal: Nature Neuroscience
Volume: 16
Pages: 1132–1139
DOI: 10.1038/nn.3433

Abstract

Finding sought visual targets requires our brains to flexibly combine working memory information about what we are looking for with visual information about what we are looking at. To investigate the neural computations involved in finding visual targets, we recorded neural responses in inferotemporal cortex (IT) and perirhinal cortex (PRH) as macaque monkeys performed a task that required them to find targets in sequences of distractors. We found similar amounts of total task-specific information in both areas; however, information about whether a target was in view was more accessible using a linear read-out or, equivalently, was more untangled in PRH. Consistent with the flow of information from IT to PRH, we also found that task-relevant information arrived earlier in IT. PRH responses were well-described by a functional model in which computations in PRH untangle input from IT by combining neurons with asymmetric tuning correlations for target matches and distractors.

Figures

  1. Figure 1: Theoretical proposals of the neural mechanisms involved in finding visual targets.

    Theoretical models propose that visual signals and working memory signals are nonlinearly combined in a distributed fashion across a population of neurons, followed by a reformatting process to produce neurons that explicitly report whether a target is present in a currently viewed scene. The delayed match-to-sample task is logically equivalent to the inverse of an 'exclusive or' (xor) operation in that the solution requires a signal that identifies target matches as the conjunction of looking at and looking for the same object. Shown (top) is a theoretical example of such a 'target present?' neuron, which fires when ('at', 'for') is (1,1) or (2,2), but not (1,2) or (2,1). Producing such a signal requires at least two stages of processing in a feedforward network (ref. 40). As a simple example, a 'target present?' neuron could be constructed by first combining visual and working memory inputs in a multiplicative fashion to produce hybrid detectors that fire when individual objects are present as targets, followed by pooling. Note that this is not a unique solution.
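The two-stage scheme described above can be sketched in a few lines. This is a toy illustration, not the paper's implementation; the one-hot encoding of the visual and working-memory inputs is an assumption made for clarity:

```python
import numpy as np

# Minimal sketch of the two-stage scheme in Figure 1: multiplicative
# hybrid detectors followed by pooling. 'looking_at' and 'looking_for'
# index the four possible objects.
def target_present(looking_at, looking_for, n_objects=4):
    visual = np.eye(n_objects)[looking_at]   # one-hot visual input
    memory = np.eye(n_objects)[looking_for]  # one-hot working-memory input
    # Stage 1: hybrid detectors; detector i fires only when object i
    # is simultaneously seen and sought.
    hybrid = visual * memory
    # Stage 2: pool across detectors to report 'target present?'.
    return bool(hybrid.sum() > 0)
```

As in the xor-like structure of the task, the output is high for ('at', 'for') = (1,1) or (2,2) but not (1,2) or (2,1); no single linear stage on the raw inputs could produce this.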

  2. Figure 2: The delayed match-to-sample task and example neural responses.

    (a) We trained monkeys to perform a delayed match-to-sample task that required them to treat the same four images (shown here) as target matches and as distractors in different blocks of trials. Monkeys initiated a trial by fixating a small dot. After a delay, an image indicating the target was presented, followed by a random number (0–3, uniformly distributed) of distractors, and then the target match. Monkeys were required to maintain fixation throughout the distractors and make a downward saccade when the target appeared to receive a reward. Approximately 25% of trials included the repeated presentation of the same distractor with zero or one intervening distractors of a different identity. (b) Each of the four images was presented in all possible combinations as a visual stimulus (looking at), and as a target (looking for), resulting in a four-by-four response matrix. Shown are the response matrices for example neurons with different types of structure (labeled). All matrices depict a neuron's response with pixel intensity proportional to firing rate, normalized to range from black (the minimum) to white (the maximum) response. We recorded these example neurons in the following brain areas (left to right): PRH, PRH, PRH, IT, PRH, IT and IT. Single-neuron linearly separable information (IL; Fig. 4c) values (left to right) were 0.01, 0.02, 3.33, 0.39, 0.44, 0.01 and 0.06.
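The display convention for the response matrices can be illustrated with a short sketch; the firing rates below are invented example values, not data from the paper:

```python
import numpy as np

# Example 4x4 response matrix: rows index the visual stimulus
# ('looking at'), columns index the target ('looking for').
# Values are made-up firing rates for illustration only.
rates = np.array([[12.,  3.,  5.,  4.],
                  [ 2., 15.,  6.,  3.],
                  [ 4.,  5., 18.,  2.],
                  [ 3.,  4.,  6., 20.]])

# Normalize so the minimum response maps to black (0) and the
# maximum to white (1), as in Figure 2b.
normalized = (rates - rates.min()) / (rates.max() - rates.min())
```

In this invented example the diagonal (target-match) conditions carry the largest responses, like the 'target match' example neuron in the figure.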

  3. Figure 3: Population performance.

    (a) Each point depicts a hypothetical population response, consisting of a vector of the spike count responses to a single condition on a single trial. The four different shapes depict the hypothetical responses to the four different images and the two colors (red, gray) depict the hypothetical responses to target matches and distractors, respectively. For simplicity, only 4 of the 12 possible distractors are depicted. Clouds of points depict the predicted dispersion across repeated presentations of the same condition as a result of trial-by-trial variability. The target-switching task (Fig. 2) required discriminating the same objects presented as target matches and as distractors. (b) Performance of the IT (gray) and PRH (white) populations, plotted as a function of the number of neurons included in each population, via cross-validated analyses designed to probe linear separability (left) and total separability (linear and/or nonlinear, right). The dashed line indicates chance performance. We measured linear separability with a cross-validated analysis that determined how well a linear decision boundary could separate target matches and distractors. We measured total separability with a cross-validated, ideal observer analysis. Error bars correspond to the standard error that can be attributed to the random assignment of training and testing trials in the cross-validation procedure and, for populations smaller than the full data set, to the random selection of neurons.
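A cross-validated linear read-out of this kind can be sketched as follows. The synthetic Poisson responses and the simple difference-of-means decision boundary are assumptions for illustration, not the paper's exact decoder:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic population responses: n_trials x n_neurons spike counts,
# with slightly higher mean rates for target matches than distractors.
n_neurons, n_trials = 20, 100
matches = rng.poisson(6.0, size=(n_trials, n_neurons)).astype(float)
distractors = rng.poisson(4.0, size=(n_trials, n_neurons)).astype(float)

# Cross-validation: train on half the trials, test on the held-out half.
train_m, test_m = matches[:50], matches[50:]
train_d, test_d = distractors[:50], distractors[50:]

# Linear decision boundary: project onto the difference of class means.
w = train_m.mean(0) - train_d.mean(0)
b = -0.5 * w @ (train_m.mean(0) + train_d.mean(0))

correct = np.sum(test_m @ w + b > 0) + np.sum(test_d @ w + b <= 0)
accuracy = correct / (len(test_m) + len(test_d))
```

Repeating the train/test split many times and averaging gives the cross-validated performance, and the spread across splits gives the error bars described in the caption.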

  4. Figure 4: Additional population performance measures.

    (a) Evolution of linear classification performance over time. Thick lines indicate performance of the entire IT (gray) and PRH (black) populations for counting windows of 30 ms with 15-ms shifts between neighboring windows. Thin lines indicate standard error. The dashed line indicates the minimum reaction time on these trials (270 ms). (b) Linear classification performance on error (dotted) as compared with correct (solid) trials (data are presented as in Fig. 3b, left; see Online Methods). Each error trial was matched with a randomly selected correct trial that had the same target and visual stimulus as the condition that resulted in the error, and both sets of trials were used to measure cross-validated performance when the population read-out was trained on separately measured correct trials, as described above. Error trials included both misses (of target matches) and false alarms (responding to a distractor). We performed the analysis separately for each multi-channel recording session and then averaged across sessions. Error bars, s.e.m. (c) Left, histograms of linearly separable target match information (IL; equation (3), Online Methods), computed for IT (gray) and PRH (white). Arrows indicate means. The last bin includes PRH neurons with IL of 1.1, 1.4, 3.3 and 5.3. The first (broken) bin includes IT and PRH neurons with negligible IL (defined as IL < 0.05; proportions = 0.75 in IT and 0.56 in PRH). Right, response matrices of the top-ranked PRH and IT neurons by IL (data are presented as in Fig. 2b), with their rankings labeled.
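The sliding-window scheme in panel a (30-ms windows stepped by 15 ms) can be sketched like this; the window parameters come from the caption, and everything else is illustrative:

```python
import numpy as np

def sliding_counts(spike_times_ms, t_start=0, t_stop=300,
                   width=30, step=15):
    """Spike counts in overlapping windows of `width` ms,
    stepped by `step` ms, spanning [t_start, t_stop)."""
    starts = np.arange(t_start, t_stop - width + 1, step)
    return np.array([np.sum((spike_times_ms >= s) &
                            (spike_times_ms < s + width))
                     for s in starts])

# Toy spike train (times in ms relative to stimulus onset).
spikes = np.array([12.0, 40.5, 41.0, 100.0, 250.0])
counts = sliding_counts(spikes)
```

Because the 30-ms windows overlap by 15 ms, most spikes contribute to two neighboring windows, which smooths the resulting performance-versus-time curve.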

  5. Figure 5: Discriminating between classes of models that predict more untangled target match information in PRH than IT.

    (a–c) Black lines indicate visual input and cyan lines indicate cognitive input, which can take the form of working memory or target match information. (d) Average magnitudes of visual (dashed) and cognitive (solid) normalized modulation plotted as a function of time relative to stimulus onset for IT (gray) and PRH (black). Normalized modulation was quantified as the bias-corrected ratio between signal variance and noise variance (equation (4), Online Methods), and provided a noise-corrected measure of the amount of neural response variability that could be attributed to changes in the identity of the visual stimulus ('visual') or to changes in the identity of the sought target and/or nonlinear interactions between the visual stimulus and the sought target ('cognitive'). (e) Enlarged view of the cognitive signals plotted in d. In d and e, response matrices were calculated from spikes in 60-ms bins with 1-ms shifts between bins.
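The uncorrected form of such a signal-to-noise variance ratio can be sketched as follows; the bias correction of equation (4) is omitted here for simplicity, and the responses are synthetic:

```python
import numpy as np

def modulation_ratio(responses):
    """Ratio of signal variance (variance of the condition means) to
    noise variance (mean within-condition variance across trials).
    responses: n_conditions x n_trials array of spike counts.
    Note: no bias correction, unlike equation (4) of the paper."""
    signal_var = responses.mean(axis=1).var()
    noise_var = responses.var(axis=1).mean()
    return signal_var / noise_var

rng = np.random.default_rng(1)
# A weakly tuned neuron (same mean rate across all 16 conditions) versus
# a strongly tuned one (mean rates 1..16): the latter yields a larger ratio.
flat = rng.poisson(5.0, size=(16, 50)).astype(float)
tuned = rng.poisson(np.arange(1., 17.)[:, None], size=(16, 50)).astype(float)
```

Without the bias correction, the trial-to-trial noise inflates the variance of the condition means, so this naive ratio overestimates modulation for finite trial counts, which is what the correction in the paper addresses.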

  6. Figure 6: Modeling the transformation from IT to PRH.

    (a) Shown are linear classification (left) and ideal observer (right) performance of the following populations: IT (gray), PRH (black), the nonlinear (N) model (gray dot-dashed) and the linear-nonlinear (LN) model (black dashed). Data are presented as in Figure 3b. To compare performance of the actual and model populations, we regenerated Poisson trial-by-trial variability for the actual IT and PRH populations from the mean firing rate responses across trials (the response matrix) for each IT and PRH neuron. (b) The pairwise linear-nonlinear model we fit to describe the transformation from IT to PRH, shown for two idealized IT neurons. To create the linear-nonlinear model, we combined pairs of IT neurons via two sets of orthogonal linear weights, followed by a nonlinearity to create two model PRH neurons.
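Regenerating Poisson trial-by-trial variability from a mean response matrix, as described for the comparison in panel a, can be sketched like this (the mean rates below are invented):

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented 4x4 mean response matrix (mean spike counts per condition).
mean_rates = rng.uniform(2.0, 10.0, size=(4, 4))

# Simulate trials by drawing Poisson counts about each condition's mean,
# mirroring the regeneration procedure described for Figure 6a.
n_trials = 1000
simulated = rng.poisson(mean_rates, size=(n_trials, 4, 4))

# Across many trials, the simulated means recover the original matrix.
recovered = simulated.mean(axis=0)
```

Regenerating variability this way equates the noise model across the actual and model populations, so any remaining performance differences reflect the mean response structure rather than differences in trial-to-trial noise.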

  7. Figure 7: The neural mechanisms underlying untangling.

    (a) Shown is an idealized neuron that has the same average response to matches (red solid) and distractors (gray), and thus no linearly separable information (IL = 0). However, because the lowest responses in the matrix are matches (red open circles), a threshold nonlinearity can set these to a higher value (red solid circles), thereby producing an increase in the overall mean match response (red dashed) such that it is now higher than the average distractor response (gray). Because linearly separable information depends on the difference between these means, this translates directly into an increase in linearly separable information in the output neuron (IL > 0). (b) Two idealized neurons presented as in Figure 2b. The two neurons produce a nonlinearly separable representation in which a linear decision boundary is largely incapable of separating matches from distractors. However, these two idealized neurons have perfect tuning correlations for matches and perfect tuning anti-correlations for distractors. (c) Pairing the two neurons via two sets of orthogonal linear weights produces a rotation in the two-dimensional space and a difference in the response variance for matches and distractors for both neurons. (d) Applying a nonlinearity to the linearly paired responses results in a representation in which a linear decision boundary is partially capable of distinguishing matches and distractors. The effectiveness of pairing can be attributed to an asymmetry (that is, a difference) in the neurons' tuning correlations for matches and distractors (equation (24), Online Methods).
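The pairing mechanism in panels b–d can be illustrated numerically. This toy example, with idealized tuning curves, a 45° rotation via orthogonal weights, and a rectifying threshold, is a sketch of the principle, not the fitted model:

```python
import numpy as np

# Two idealized neurons: tuning across four conditions is correlated for
# target matches but anti-correlated for distractors (cf. Fig. 7b).
r1_match = np.array([1., 2., 3., 4.])
r2_match = np.array([1., 2., 3., 4.])
r1_dist  = np.array([1., 2., 3., 4.])
r2_dist  = np.array([4., 3., 2., 1.])

# Orthogonal linear weights: a 45-degree rotation of the 2D response
# space (cf. Fig. 7c).
w_sum  = np.array([1., 1.]) / np.sqrt(2)
w_diff = np.array([1., -1.]) / np.sqrt(2)

def pair_and_rectify(a, b, threshold=3.0):
    # Rectifying threshold nonlinearity on each paired output (cf. Fig. 7d).
    out1 = np.maximum(w_sum[0] * a + w_sum[1] * b - threshold, 0.0)
    out2 = np.maximum(w_diff[0] * a + w_diff[1] * b - threshold, 0.0)
    return out1, out2

m1, m2 = pair_and_rectify(r1_match, r2_match)
d1, d2 = pair_and_rectify(r1_dist, r2_dist)
```

Before pairing, each neuron's mean match and mean distractor responses are identical (IL = 0). Because matches lie along the correlated (sum) axis and distractors along the anti-correlated (difference) axis, the rotation plus rectification leaves the mean match response above the mean distractor response, so a linear boundary gains purchase (IL > 0).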

References

  1. Salinas, E. Fast remapping of sensory stimuli onto motor actions on the basis of contextual modulation. J. Neurosci. 24, 1113–1118 (2004).
  2. Salinas, E. & Bentley, N.M. Gain modulation as a mechanism for switching reference frames, tasks and targets. in Coherent Behavior in Neuronal Networks (eds. Josic, K., Rubin, J., Matias, M. & Romo, R.) 121–142 (Springer, New York, 2009).
  3. Engel, T.A. & Wang, X.J. Same or different? A neural circuit mechanism of similarity-based pattern match decision making. J. Neurosci. 31, 6982–6996 (2011).
  4. Sugase-Miyamoto, Y., Liu, Z., Wiener, M.C., Optican, L.M. & Richmond, B.J. Short-term memory trace in rapidly adapting synapses of inferior temporal cortex. PLoS Comput. Biol. 4, e1000073 (2008).
  5. Miller, E.K., Erickson, C.A. & Desimone, R. Neural mechanisms of visual working memory in prefrontal cortex of the macaque. J. Neurosci. 16, 5154–5167 (1996).
  6. Tomita, H., Ohbayashi, M., Nakahara, K., Hasegawa, I. & Miyashita, Y. Top-down signal from prefrontal cortex in executive control of memory retrieval. Nature 401, 699–703 (1999).
  7. Haenny, P.E., Maunsell, J.H.R. & Schiller, P.H. State-dependent activity in monkey visual cortex. II. Retinal and extraretinal factors in V4. Exp. Brain Res. 69, 245–259 (1988).
  8. Maunsell, J.H.R., Sclar, G., Nealey, T.A. & Depriest, D.D. Extraretinal representations in area V4 in the macaque monkey. Vis. Neurosci. 7, 561–573 (1991).
  9. Bichot, N.P., Rossi, A.F. & Desimone, R. Parallel and serial neural mechanisms for visual search in macaque area V4. Science 308, 529–534 (2005).
  10. Chelazzi, L., Miller, E.K., Duncan, J. & Desimone, R. Responses of neurons in macaque area V4 during memory-guided visual search. Cereb. Cortex 11, 761–772 (2001).
  11. Eskandar, E.N., Richmond, B.J. & Optican, L.M. Role of inferior temporal neurons in visual memory. II. Temporal encoding of information about visual images, recalled images and behavioral context. J. Neurophysiol. 68, 1277–1295 (1992).
  12. Liu, Z. & Richmond, B.J. Response differences in monkey TE and perirhinal cortex: stimulus association related to reward schedules. J. Neurophysiol. 83, 1677–1692 (2000).
  13. Gibson, J.R. & Maunsell, J.H.R. Sensory modality specificity of neural activity related to memory in visual cortex. J. Neurophysiol. 78, 1263–1275 (1997).
  14. Lueschow, A., Miller, E.K. & Desimone, R. Inferior temporal mechanisms for invariant object recognition. Cereb. Cortex 4, 523–531 (1994).
  15. Miller, E.K. & Desimone, R. Parallel neuronal mechanisms for short-term memory. Science 263, 520–522 (1994).
  16. DiCarlo, J.J. & Cox, D.D. Untangling invariant object recognition. Trends Cogn. Sci. 11, 333–341 (2007).
  17. Suzuki, W.A. & Amaral, D.G. Perirhinal and parahippocampal cortices of the macaque monkey: cortical afferents. J. Comp. Neurol. 350, 497–533 (1994).
  18. Meunier, M., Bachevalier, J., Mishkin, M. & Murray, E.A. Effects on visual recognition of combined and separate ablations of the entorhinal and perirhinal cortex in rhesus monkeys. J. Neurosci. 13, 5418–5432 (1993).
  19. Buffalo, E.A., Ramus, S.J., Squire, L.R. & Zola, S.M. Perception and recognition memory in monkeys following lesions of area TE and perirhinal cortex. Learn. Mem. 7, 375–382 (2000).
  20. Cohen, M.R. & Maunsell, J.H. Attention improves performance primarily by reducing interneuronal correlations. Nat. Neurosci. 12, 1594–1600 (2009).
  21. Graf, A.B., Kohn, A., Jazayeri, M. & Movshon, J.A. Decoding the activity of neuronal populations in macaque primary visual cortex. Nat. Neurosci. 14, 239–245 (2011).
  22. Fuster, J.M. & Jervey, J.P. Inferotemporal neurons distinguish and retain behaviorally relevant features of visual stimuli. Science 212, 952–955 (1981).
  23. Reynolds, J.H. & Heeger, D.J. The normalization model of attention. Neuron 61, 168–185 (2009).
  24. Rust, N.C., Mante, V., Simoncelli, E.P. & Movshon, J.A. How MT cells analyze the motion of visual patterns. Nat. Neurosci. 9, 1421–1431 (2006).
  25. Gold, J.I. & Shadlen, M.N. Banburismus and the brain: decoding the relationship between sensory stimuli, decisions and reward. Neuron 36, 299–308 (2002).
  26. Simoncelli, E.P. & Heeger, D.J. A model of neuronal responses in visual area MT. Vision Res. 38, 743–761 (1998).
  27. Heeger, D.J. Normalization of cell responses in cat striate cortex. Vis. Neurosci. 9, 181–197 (1992).
  28. Adelson, E.H. & Bergen, J.R. Spatiotemporal energy models for the perception of motion. J. Opt. Soc. Am. A 2, 284–299 (1985).
  29. Marr, D. Vision (MIT Press, Cambridge, Massachusetts, 1982).
  30. Rust, N.C. & DiCarlo, J.J. Selectivity and tolerance (“invariance”) both increase as visual information propagates from cortical area V4 to IT. J. Neurosci. 30, 12978–12995 (2010).
  31. Hung, C.P., Kreiman, G., Poggio, T. & DiCarlo, J.J. Fast readout of object identity from macaque inferior temporal cortex. Science 310, 863–866 (2005).
  32. DiCarlo, J.J., Zoccolan, D. & Rust, N.C. How does the brain solve visual object recognition? Neuron 73, 415–434 (2012).
  33. Chelazzi, L., Miller, E.K., Duncan, J. & Desimone, R. A neural basis for visual search in inferior temporal cortex. Nature 363, 345–347 (1993).
  34. Maunsell, J.H.R. & Treue, S. Feature-based attention in visual cortex. Trends Neurosci. 29, 317–322 (2006).
  35. Rigotti, M., Ben Dayan Rubin, D., Wang, X.J. & Fusi, S. Internal representation of task rules by recurrent dynamics: the importance of the diversity of neural responses. Front. Comput. Neurosci. 4, 24 (2010).
  36. Najemnik, J. & Geisler, W.S. Optimal eye movement strategies in visual search. Nature 434, 387–391 (2005).
  37. Shadlen, M.N. & Newsome, W.T. Neural basis of a perceptual decision in the parietal cortex (area LIP) of the rhesus monkey. J. Neurophysiol. 86, 1916–1936 (2001).
  38. Law, C.T. & Gold, J.I. Reinforcement learning can account for associative and perceptual learning on a visual-decision task. Nat. Neurosci. 12, 655–663 (2009).
  39. Lavenex, P., Suzuki, W.A. & Amaral, D.G. Perirhinal and parahippocampal cortices of the macaque monkey: projections to the neocortex. J. Comp. Neurol. 447, 394–420 (2002).
  40. Minsky, M. & Papert, S. Perceptrons: an Introduction to Computational Geometry (MIT Press, Cambridge, Massachusetts, 1969).
  41. Wang, P. & Nikolic, D. An LCD monitor with sufficiently precise timing for research in vision. Front. Hum. Neurosci. 5, 85 (2011).
  42. Kelly, R.C. et al. Comparison of recordings from microelectrode arrays and single electrodes in the visual cortex. J. Neurosci. 27, 261–264 (2007).
  43. Averbeck, B.B. & Lee, D. Effects of noise correlations on information encoding and decoding. J. Neurophysiol. 95, 3633–3644 (2006).
  44. Edmonds, J. & Johnson, E.L. Matching: a well-solved class of integer linear programs. in Combinatorial Structures and Their Applications: Proceedings (ed. Guy, R.K.) (Gordon and Breach, Calgary, 1970).
  45. Efron, B. & Tibshirani, R.J. An Introduction to the Bootstrap (CRC Press, 1994).


Author information

Affiliations

  1. Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania, USA.

    • Marino Pagan,
    • Luke S Urban,
    • Margot P Wohl &
    • Nicole C Rust

Contributions

N.C.R., M.P.W. and M.P. conducted the experiments. M.P. and L.S.U. developed the data alignment software. M.P.W. and N.C.R. sorted the spike waveforms. M.P. and N.C.R. developed and executed the analyses. M.P. and N.C.R. wrote the manuscript. N.C.R. supervised the project.

Competing financial interests

The authors declare no competing financial interests.


Supplementary information

PDF files

  1. Supplementary Text and Figures (1,388 KB)

    Supplementary Figures 1–5
