

Shared mechanisms underlie the control of working memory and attention

Abstract

Cognitive control guides behaviour by controlling what, when, and how information is represented in the brain [1]. For example, attention controls sensory processing; top-down signals from prefrontal and parietal cortex strengthen the representation of task-relevant stimuli [2,3,4]. A similar ‘selection’ mechanism is thought to control the representations held ‘in mind’—in working memory [5,6,7,8,9,10]. Here we show that shared neural mechanisms underlie the selection of items from working memory and attention to sensory stimuli. We trained rhesus monkeys to switch between two tasks, either selecting one item from a set of items held in working memory or attending to one stimulus from a set of visual stimuli. Neural recordings showed that similar representations in prefrontal cortex encoded the control of both selection and attention, suggesting that prefrontal cortex acts as a domain-general controller. By contrast, both attention and selection were represented independently in parietal and visual cortex. Both selection and attention facilitated behaviour by enhancing and transforming the representation of the selected memory or attended stimulus. Specifically, during the selection task, memory items were initially represented in independent subspaces of neural activity in prefrontal cortex. Selecting an item caused its representation to transform from its own subspace to a new subspace used to guide behaviour. A similar transformation occurred for attention. Our results suggest that prefrontal cortex controls cognition by dynamically transforming representations to control what and when cognitive computations are engaged.


Fig. 1: Monkeys use selection and attention to control the contents of working memory.
Fig. 2: Selection is observed first in prefrontal cortex and shares a population code with attention.
Fig. 3: Selection increases colour information in working memory.
Fig. 4: Selection transforms memory information in a task-dependent manner.

Data availability

Data supporting all figures are included with the manuscript. Raw electrophysiological and behavioural data are available from the corresponding author upon reasonable request. Source data are provided with this paper.

Code availability

Behavioural code and custom Matlab analysis functions are publicly available at https://github.com/buschman-lab/. All other code is available from the authors upon reasonable request.

References

1. Miller, E. K. & Cohen, J. D. An integrative theory of prefrontal cortex function. Annu. Rev. Neurosci. 24, 167–202 (2001).
2. Buschman, T. J. & Kastner, S. From behavior to neural dynamics: an integrated theory of attention. Neuron 88, 127–144 (2015).
3. Buschman, T. J. & Miller, E. K. Top-down versus bottom-up control of attention in the prefrontal and posterior parietal cortices. Science 315, 1860–1862 (2007).
4. Moore, T. & Armstrong, K. M. Selective gating of visual signals by microstimulation of frontal cortex. Nature 421, 370–373 (2003).
5. Gazzaley, A. & Nobre, A. C. Top-down modulation: bridging selective attention and working memory. Trends Cogn. Sci. 16, 129–135 (2012).
6. Sprague, T. C., Ester, E. F. & Serences, J. T. Restoring latent visual working memory representations in human cortex. Neuron 91, 694–707 (2016).
7. Myers, N. E., Stokes, M. G. & Nobre, A. C. Prioritizing information during working memory: beyond sustained internal attention. Trends Cogn. Sci. 21, 449–461 (2017).
8. Ester, E. F., Nouri, A. & Rodriguez, L. Retrospective cues mitigate information loss in human cortex during working memory storage. J. Neurosci. 38, 8538–8548 (2018).
9. Nobre, A. C. et al. Orienting attention to locations in perceptual versus mental representations. J. Cogn. Neurosci. 16, 363–373 (2004).
10. Murray, A. M., Nobre, A. C., Clark, I. A., Cravo, A. M. & Stokes, M. G. Attention restores discrete items to visual short-term memory. Psychol. Sci. 24, 550–556 (2013).
11. Wilken, P. & Ma, W. J. A detection theory account of change detection. J. Vis. 4, 1120–1135 (2004).
12. Zhang, W. & Luck, S. J. Discrete fixed-resolution representations in visual working memory. Nature 453, 233–235 (2008).
13. Bays, P. M., Catalao, R. F. G. & Husain, M. The precision of visual working memory is set by allocation of a shared resource. J. Vis. 9, 7 (2009).
14. Buschman, T. J., Siegel, M., Roy, J. E. & Miller, E. K. Neural substrates of cognitive capacity limitations. Proc. Natl Acad. Sci. USA 108, 11252–11255 (2011).
15. Sprague, T. C., Ester, E. F. & Serences, J. T. Reconstructions of information in visual spatial working memory degrade with memory load. Curr. Biol. 24, 2174–2180 (2014).
16. Bays, P. M. Spikes not slots: noise in neural populations limits working memory. Trends Cogn. Sci. 19, 431–438 (2015).
17. Bouchacourt, F. & Buschman, T. J. A flexible model of working memory. Neuron 103, 147–160.e8 (2019).
18. Pertzov, Y., Bays, P. M., Joseph, S. & Husain, M. Rapid forgetting prevented by retrospective attention cues. J. Exp. Psychol. Hum. Percept. Perform. 39, 1224–1231 (2013).
19. Bays, P. M. & Taylor, R. A neural model of retrospective attention in visual working memory. Cognit. Psychol. 100, 43–52 (2018).
20. Desimone, R. & Duncan, J. Neural mechanisms of selective visual attention. Annu. Rev. Neurosci. 18, 193–222 (1995).
21. Treue, S. & Maunsell, J. H. Attentional modulation of visual motion processing in cortical areas MT and MST. Nature 382, 539–541 (1996).
22. Everling, S., Tinsley, C. J., Gaffan, D. & Duncan, J. Filtering of neural signals by focused attention in the monkey prefrontal cortex. Nat. Neurosci. 5, 671–676 (2002).
23. Schneegans, S. & Bays, P. M. Restoration of fMRI decodability does not imply latent working memory states. J. Cogn. Neurosci. 29, 1977–1994 (2017).
24. Nee, D. E. & Jonides, J. Common and distinct neural correlates of perceptual and memorial selection. Neuroimage 45, 963–975 (2009).
25. Quentin, R. et al. Differential brain mechanisms of selection and maintenance of information during working memory. J. Neurosci. 39, 3728–3740 (2019).
26. Bernardi, S. et al. The geometry of abstraction in the hippocampus and prefrontal cortex. Cell 183, 954–967.e21 (2020).
27. Rigotti, M. et al. The importance of mixed selectivity in complex cognitive tasks. Nature 497, 585–590 (2013).
28. Reynolds, J. H., Chelazzi, L. & Desimone, R. Competitive mechanisms subserve attention in macaque areas V2 and V4. J. Neurosci. 19, 1736–1753 (1999).
29. Reynolds, J. H. & Heeger, D. J. The normalization model of attention. Neuron 61, 168–185 (2009).
30. Panichello, M. F., DePasquale, B., Pillow, J. W. & Buschman, T. J. Error-correcting dynamics in visual working memory. Nat. Commun. 10, 3366 (2019).
31. Bruce, C. J. & Goldberg, M. E. Primate frontal eye fields. I. Single neurons discharging before saccades. J. Neurophysiol. 53, 603–635 (1985).
32. Rolston, J. D., Gross, R. E. & Potter, S. M. Common median referencing for improved action potential detection with multielectrode arrays. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. 2009, 1604–1607 (2009).
33. Wessberg, J. et al. Real-time prediction of hand trajectory by ensembles of cortical neurons in primates. Nature 408, 361–365 (2000).
34. Tort, A. B. L., Komorowski, R., Eichenbaum, H. & Kopell, N. Measuring phase-amplitude coupling between neuronal oscillations of different frequencies. J. Neurophysiol. 104, 1195–1210 (2010).
35. Maris, E. & Oostenveld, R. Nonparametric statistical testing of EEG- and MEG-data. J. Neurosci. Methods 164, 177–190 (2007).
36. Murray, J. D. et al. Stable population coding for working memory coexists with heterogeneous neural dynamics in prefrontal cortex. Proc. Natl Acad. Sci. USA 114, 394–399 (2017).


Acknowledgements

We thank B. Morea and H. Weinberg-Wolf for assistance with monkeys; S. Tafazoli for assistance with microstimulation; F. Bouchacourt, C. Jahn, A. Libby, C. MacDowell, S. Tafazoli, M. Uchimura, and S. Henrickson for feedback; and the Princeton Laboratory Animal Resources staff for support. This work was supported by NIMH R01MH115042 (T.J.B.) and an NDSEG Fellowship (M.F.P.).

Author information


Contributions

T.J.B. conceived the project. M.F.P. trained the monkeys, collected the data, and analysed the data with supervision from T.J.B. T.J.B. and M.F.P. wrote the paper.

Corresponding author

Correspondence to Timothy J. Buschman.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature thanks Tirin Moore and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Behaviour was consistent across monkeys and selection mitigated the decay of memories over time.

a, b, Mean absolute angular error (a) and mean mixture model parameter fits (b) in the main experiment (experiment 1) (Fig. 1a) for each monkey (Methods). Violin plots depict bootstrapped distribution across sessions (n = 10 for monkey 1 and n = 13 for monkey 2). Lines indicate pairwise comparisons. Although monkey 1 performed slightly better than monkey 2, they displayed similar patterns of performance across conditions. c, Distribution of reported colours and absolute angular error as a function of target colour in experiment 1 for each monkey for pro and retro trials. The distributions of reported colours for each condition and monkey were significantly non-uniform (entropy of report distribution significantly lower than entropy of the target distribution, all P < 0.001, bootstrap across n = 3,873 (pro) and 3,943 (retro) trials for monkey 1 and n = 4,440 and 4,769 trials for monkey 2). Details of this behaviour have previously been published [30]. d, Mixture model parameter fits of behaviour pooled across monkeys for experiment 1 (bootstrap across n = 23 sessions). e, Top, in a separate behavioural experiment (experiment 2), we fixed the total memory delay of the retro condition and systematically varied the length of the delay between stimulus offset and cue onset. Bottom, increasing the time before selection (x-axis) increased mean absolute angular error (53.1°, 54.4°, and 57.8° for 0.5 s, 1 s, and 1.5 s post-stimulus, respectively; distributions are 1,000 bootstrap resamples across n = 3,306, 3,287, and 3,322 trials, respectively). f, Mixture model parameter fits, pooled across monkeys (1,000 bootstrap resamples across n = 24 sessions), for experiment 2. Linear regression showed that earlier cues improved the precision of memory reports in experiment 2 (β = 3.95 ± 1.88 (s.e.m.), P = 0.012, bootstrap) but did not significantly change the probability of forgetting (that is, random responses; β = 0.03 ± 0.03 (s.e.m.), P = 0.126, bootstrap).
Bars and asterisks in all panels reflect two-sided uncorrected randomization tests: ·P < 0.1, *P < 0.05, **P < 0.01, ***P < 0.001.
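For illustration, the circular error metric and the trial-level bootstrap used throughout these panels can be sketched in a few lines of Python. This is a minimal sketch, not the authors' custom Matlab code (see Code availability); the function names are our own.

```python
import random

def angular_error_deg(report, target):
    """Absolute angular distance between report and target on a 360-degree colour wheel."""
    d = abs(report - target) % 360.0
    return min(d, 360.0 - d)

def bootstrap_mean_error(errors, n_boot=1000, seed=0):
    """Resample trials with replacement and return the bootstrap
    distribution of the mean absolute angular error."""
    rng = random.Random(seed)
    n = len(errors)
    return [sum(rng.choice(errors) for _ in range(n)) / n for _ in range(n_boot)]
```

The 95% bootstrap interval would then simply be the 2.5th and 97.5th percentiles of the returned distribution.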

Extended Data Fig. 2 Population size and neural responsiveness do not explain differences in classification performance across regions.

a, Firing rate of an example LPFC neuron around cue onset when the upper (grey) or lower (green) stimulus was cued in the retro (top) and pro (bottom) conditions. Shaded regions are s.e.m. across trials (n = 161 retro upper, 124 retro lower, 150 pro upper, and 121 pro lower trials). Insets, cues used for retro and pro trials. b, Percentage of neurons in each region of interest with firing rates that were significantly modulated by the selected location after cue onset on retro trials (trials pooled across cue sets 1 and 2). For each neuron, we quantified location selectivity using d′ (Methods) and compared this value to a null distribution by permuting location labels across trials. All four regions showed strong selectivity: LPFC had 159 out of 590 neurons selective; FEF, 37 of 169; parietal, 49 of 301; V4, 62 of 318; all P < 0.001 against chance of 5% (two-sided uncorrected binomial test). c, Mean classification accuracy (top, taken at 300 ms post-cue) and mean time to 55% classification accuracy (bottom) for the selection (left) and generalized (right) classifiers as a function of the number of neurons used for classification. This analysis controls for the total number of neurons recorded in each region. For each subpopulation of a specific size (x-axis), circles reflect average across 1,000 iterations using different randomly selected subpopulations of that size. Lines reflect best-fitting two-parameter power function (Methods). Error bars are 95% prediction intervals. For classifier accuracy (top row): n = 35, 10, 19, and 22 unique population sizes for LPFC, FEF, parietal and V4, respectively. For classifier timing (bottom left and right): n = 35/32, 10/8, 19/4, and 21/20 for selection/generalization in LPFC, FEF, parietal and V4, respectively. The reduction in the number of data points in the bottom plots reflects the fact that, for some neuron counts, classifiers never reached 55% classification accuracy on any iteration. 
Asterisks indicate significance of projected classification for a given region compared to the measured classification in LPFC at the maximum number of neurons (two-sided z-test, not corrected for multiple comparisons). Selection classification accuracy: FEF P = 2.18 × 10−10; parietal P = 1 × 10−16; V4 P < 1 × 10−16. Generalization classification accuracy: FEF P < 1 × 10−16; parietal P < 1 × 10−16; V4 P < 1 × 10−16. Selection classification timing: FEF P = 0.054; parietal P = 1.02 × 10−4; V4 P < 6.94 × 10−8. Generalization classification timing: FEF P = 0.203; parietal P = 1.11 × 10−13; V4 P < 1 × 10−16. d, Neuron dropping curves as in c, except analysis was restricted to neurons with a significant evoked response to cue onset to control for potential differences in responsiveness across regions (Methods). For classifier accuracy (top row): n = 23, 5, 8, and 8 unique population sizes for LPFC, FEF, parietal and V4, respectively. For classifier timing (bottom left and right): n = 23/22, 5/4, 8/0, and 8/8 for selection/generalization in LPFC, FEF, parietal and V4, respectively. Selection classification accuracy: FEF P < 1 × 10−16; parietal P < 1 × 10−16; V4 P < 1 × 10−16. Selection classification timing: FEF P < 1 × 10−16; parietal P < 1 × 10−16; V4 P < 1 × 10−16. Generalization classification accuracy: FEF P = 0.001; parietal P = 0.021; V4 P = 0.002. Generalization classification timing: FEF P < 1 × 10−16; V4 P < 1 × 10−16. e, To determine whether there were sub-populations of selective neurons in a region with greater selectivity than the overall population, we ranked neurons in each region by their ability to support the selection (left) or generalized (right) classifier (Methods). Neurons with firing rates that yielded large magnitude (and sign consistent) d′ values for the cued location (upper or lower) across both retro cue sets will support selection classifier performance (left). 
We quantified this by projecting these two d′ values onto the identity (red lines) and taking the absolute value of the resulting vector. Neuron 1 is ranked higher than neuron 2 because of its larger magnitude projection onto the identity. A similar procedure can be used to rank neurons for generalization from pro to retro trials (right) by repeating the procedure on the basis of selectivity for ‘pro cueset 1’ and ‘retro cueset 2’. f, Neuron dropping curves (as in c), except that neurons are added to the analysis on the basis of their selectivity or generalization, as described in d. Shaded region is 95% confidence intervals of best linear fit (which fit better than power functions) (Methods). Even when selecting ideal subpopulations from each region, no region significantly exceeded LPFC performance. Performance now decreases as n increases because, owing to our ranking procedure, later cells are by design less able to support performance on withheld cues (whether within selection or across selection or attention). These later neurons may still be weighted heavily by the classifier (owing to good performance on the training set) and so negatively affect performance at test. This is exemplified by the projections onto one axis, as indicated by the vertical dashed lines in d, showing a greater weighting for neuron 2, despite it not facilitating generalization. For classifier accuracy (top row): n = 35, 10, 19, and 22 unique population sizes for LPFC, FEF, parietal and V4, respectively. For classifier timing (bottom left and right): n = 35/35, 10/10, 19/18, and 22/22 for selection/generalization in LPFC, FEF, parietal and V4, respectively. g, To examine ‘bottom-up’ information flow about low-level sensory aspects of the cue, we trained classifiers to discriminate the variants of each cue, using cross-validation across subsets of trials (Methods). h, Neuron dropping curves (as in c) for these ‘cue appearance’ classifiers. 
Cue appearance classifiers yielded a qualitatively different pattern of performance, with V4 showing superior classification performance at cue offset (left) and faster classification onset (right). Asterisks indicate significance of projected classification for a given region compared to the measured classification in LPFC at the maximum number of neurons (two-sided z-test, not corrected for multiple comparisons). n = 35, 10, 19, and 22 unique population sizes for LPFC, FEF, parietal and V4, respectively. Classification accuracy: FEF P = 0.282, parietal P < 1 × 10−16, V4 P < 1 × 10−16. Classification timing: FEF P = 0.005, parietal P = 4.24 × 10−16, V4 P = 2.27 × 10−8. ·P < 0.1, *P < 0.05, **P < 0.01, ***P < 0.001.
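The per-neuron d′ selectivity measure and its label-permutation null (panel b) can be sketched as below. This is an illustrative Python version under simple assumptions (d′ as difference of means over the average s.d.), not the authors' Matlab implementation.

```python
import random
import statistics

def dprime(rates_a, rates_b):
    """d' between firing rates on, e.g., 'upper-cued' vs 'lower-cued' trials:
    difference of means divided by the average of the two standard deviations."""
    mu_a, mu_b = statistics.mean(rates_a), statistics.mean(rates_b)
    sd = (statistics.stdev(rates_a) + statistics.stdev(rates_b)) / 2.0
    return (mu_a - mu_b) / sd

def permutation_p(rates_a, rates_b, n_perm=1000, seed=0):
    """Two-sided p-value for |d'| against a null built by permuting
    location labels across trials."""
    rng = random.Random(seed)
    observed = abs(dprime(rates_a, rates_b))
    pooled = list(rates_a) + list(rates_b)
    n_a = len(rates_a)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(dprime(pooled[:n_a], pooled[n_a:])) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)
```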

Extended Data Fig. 3 Successful classification was driven by increases in signal.

a, Example histogram of classifier confidence across ‘upper cued’ and ‘lower cued’ trials for the LPFC selection classifier in the 500 ms after cue onset. Classifier confidence measures the distance of neural activity from the hyperplane identified by the classifier. Signal is the difference between the means of the two trial distributions; noise is their average s.d. b, For both the ‘selection’ and ‘generalization’ classifiers, signal (top row) tracks classification performance (Fig. 2) much better than noise (bottom row), suggesting that classifier performance was due to an increase in signal and not a decrease in noise. Shading shows s.e.m. Distribution estimated from 1,000 iterations of classifiers trained and tested on random samples of n = 60 trials (Methods). c, Mean noise correlation among neurons entering the ‘selection’ and ‘generalization’ analyses described in Fig. 2. Noise correlations were based on mean firing rates over the interval from 0 to 500 ms after the cue. There were no significant differences between regions. d, Fano factor (σ²/μ) of single-neuron firing rates across trials (averaged from 0 to 500 ms after the cue). The ratio was significantly larger in LPFC than V4 but no other comparisons were significant (horizontal bar; two-sided uncorrected t-test). c, d, Violin plots show distribution of values based on 1,000 bootstrapped resamples of n = 60 trials (Methods). Red crosses indicate mean.
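The two statistics defined in this legend are straightforward to express in code. A minimal Python sketch (the actual analyses used custom Matlab functions), following the definitions above: signal is the difference of the two confidence means, noise their average s.d., and the Fano factor is σ²/μ of spike counts across trials.

```python
import statistics

def fano_factor(spike_counts):
    """Fano factor: across-trial variance of spike counts divided by their mean."""
    return statistics.pvariance(spike_counts) / statistics.mean(spike_counts)

def signal_and_noise(conf_a, conf_b):
    """Signal: difference between mean classifier confidences of the two trial
    types. Noise: the average of their standard deviations."""
    signal = statistics.mean(conf_a) - statistics.mean(conf_b)
    noise = (statistics.pstdev(conf_a) + statistics.pstdev(conf_b)) / 2.0
    return signal, noise
```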

Extended Data Fig. 4 Neural responses in prefrontal cortex were similar across cue sets and tasks.

a, Distribution of selectivity across neurons for the selected location (top row) and for the selected and attended location (bottom row). Selectivity was taken as the normalized difference in firing rate (d′) between ‘upper’ and ‘lower’ trials evoked by the two retro cue sets (top) and by pro cue set 1 and retro cue set 2 (bottom) (Methods). Firing rate was computed at the end of the cue period (300 ms after cue onset). Positive d′ values indicated that the neuron was more active when the upper sample was cued. Rose plots in the background show the histogram of neurons binned by angle (grey circle indicates scale; density = 0.1). Bar plots along axes show histogram of marginal distributions (grey ticks on axes indicate scale; density = 0.2). Statistical tests are Pearson’s r. b, Selection correlation values (as in a) computed over time around cue onset. Bars along top indicate correlations greater than zero: P < 0.05, 0.01, and 0.001 for thin, medium, and thick lines, respectively (one-sided uncorrected bootstrap; n = 1,000 resamples of trials). c, Generalization correlation values computed over time around cue onset, as in b. d, Schematic of classifier trained to discriminate the neural response to two cue conditions on pro trials. Performance was calculated as the cross-validated classification accuracy (tenfold cross-validation on each of 1,000 random resamples of trials) (Methods). e, Mean ± s.e.m. classification accuracy of the pro cues, relative to cue onset, for all four brain regions. Distribution was defined across 1,000 random resamples of trials. This analysis captures a mixture of information about the control of attention (up or down) and information about the visual appearance of the cue itself. These results show that these two conditions are separable in all brain regions, and so any failure in cross-classification performance (Fig. 2d, purple traces) is not due to poor separability of the attention conditions.
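The correlation reported in a and b is a Pearson correlation between per-neuron d′ values computed from two different cue sets. For illustration only, it can be written out directly (the published analysis used Matlab; inputs here are hypothetical lists of per-neuron d′ values):

```python
import math

def pearson_r(x, y):
    """Pearson correlation between, e.g., per-neuron d' values computed
    separately from the two retro cue sets (or from pro vs retro cues)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

A correlation near 1 indicates that the same neurons prefer the same cued location under both cue sets, i.e. a shared population code.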

Extended Data Fig. 5 Single neurons encoded the colours of remembered items.

a, Mean firing rates for example neurons during the retro condition, binned by the colour (indicated by line colour) of the selected (solid) or non-selected (dashed) stimulus. Example neurons are shown for all four brain regions (labelled at top left). b, Mean ± s.e.m. selectivity of neurons in all four regions for the colour of the selected and non-selected stimulus (in light and dark blue, respectively) in each brain region, averaged across neurons. Selectivity is measured using a PEV statistic (Methods) and shows similar results to when using an entropy statistic (Fig. 3). LPFC: 574 neurons, FEF: 163 neurons, parietal: 292 neurons, V4: 311 neurons. Horizontal bars indicate significant information for the selected item (light blue), the non-selected item (dark blue), and a significant difference in information about the selected and non-selected items (black). Bar width indicates significance: P < 0.05, 0.01, and 0.001 for thin, medium, and thick, respectively (two-sided cluster-corrected t-tests).
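PEV (percentage of explained variance) quantifies how much of a neuron's across-trial firing-rate variance is explained by a grouping factor such as the colour bin of the selected item. A minimal sketch of the simple eta-squared form is below; the exact estimator used in the paper (for example, whether it is bias-corrected) is specified in the Methods, so treat this as an assumption-laden illustration.

```python
def percent_explained_variance(groups):
    """Eta-squared style PEV: 100 * SS_between / SS_total, where `groups`
    is a list of lists of firing rates, one inner list per colour bin."""
    all_rates = [r for g in groups for r in g]
    grand = sum(all_rates) / len(all_rates)
    ss_total = sum((r - grand) ** 2 for r in all_rates)
    ss_between = sum(len(g) * ((sum(g) / len(g)) - grand) ** 2 for g in groups)
    return 100.0 * ss_between / ss_total
```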

Extended Data Fig. 6 Comparison of information about the reported/presented colour, the attended/unattended item, and the memory of items on prospective and retrospective trials.

a, Mean z-scored colour information for the reported colour (grey) and the colour of the presented, selected, item (light blue). Information was calculated on firing rates in a 200-ms window before onset of the response colour wheel for all neurons. Distributions show bootstrapped estimates of the mean across neurons (LPFC: 570 neurons, FEF: 163 neurons, parietal: 292 neurons, V4: 311 neurons). Horizontal lines indicate pairwise comparisons. *P < 0.05, **P < 0.01, ***P < 0.001 (two-sided uncorrected randomization tests). b, Mean ± s.e.m. z-scored colour information for the attended and non-attended colour on pro trials. LPFC: 543 neurons, FEF: 160 neurons, parietal: 272 neurons, V4: 300 neurons. Horizontal bars indicate significant information for the attended item (light orange), the non-attended item (dark orange), and significant differences in information about the attended and non-attended items (black). Bar width indicates significance: P < 0.05, 0.01, and 0.001 for thin, medium, and thick, respectively (two-sided cluster-corrected t-tests). c, Mean ± s.e.m. difference in z-scored colour information between retro and pro trials for the cued item (selected − attended; light purple) and uncued item (non-selected − non-attended; dark purple). Positive values reflect more information about an item on retro trials. LPFC: 511 neurons, FEF: 146 neurons, parietal: 258 neurons, V4: 285 neurons. Horizontal bars indicate significant differences from zero (that is, differences between retro and pro) for the cued item (light purple) and the non-cued item (dark purple). Bar width indicates significance: P < 0.05, 0.01, and 0.001 for thin, medium, and thick, respectively (two-sided cluster-corrected t-tests).
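One common way to z-score an information statistic, which may or may not match the Methods exactly, is against a null distribution obtained by shuffling colour labels across trials. A hedged Python sketch (function names and inputs are our own):

```python
import random
import statistics

def zscore_information(observed, null_values):
    """Z-score an information estimate against its label-shuffled null distribution."""
    return (observed - statistics.mean(null_values)) / statistics.pstdev(null_values)

def shuffle_null(info_fn, rates, labels, n_shuffle=200, seed=0):
    """Null distribution of `info_fn(rates, labels)` (e.g. a PEV estimate)
    under random shuffles of the trial labels."""
    rng = random.Random(seed)
    labels = list(labels)
    null = []
    for _ in range(n_shuffle):
        rng.shuffle(labels)
        null.append(info_fn(rates, labels))
    return null
```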

Extended Data Fig. 7 The effect of selection on colour information was greater when memories were more accurate.

a, Selection enhanced the representation of the selected item in frontal and parietal regions and reduced the representation of the unselected item in FEF. The y-axis shows the increase in colour information after selection (post-cue period: 200 to 500 ms after cue offset), relative to information before selection (pre-cue period: −300 to 0 ms before cue onset). Violin plots show the distribution of this difference, estimated by 1,000 bootstrapped resamples of neurons (LPFC: 577 neurons, FEF: 170 neurons, parietal: 299 neurons, V4: 316 neurons). *P < 0.05, **P < 0.01, ***P < 0.001 (two-sided uncorrected paired t-tests). b, Mean ± s.e.m. z-scored colour information for the selected (light blue) and non-selected item (dark blue) on retro trials, for trials with more accurate behavioural responses (left; error was less than median error) and less accurate behavioural responses (right; error was greater than median error). LPFC: 457/472 neurons, FEF: 134/135 neurons, parietal: 235/241 neurons, V4: 248/267 neurons for left/right, respectively. Plots follow Fig. 3. Horizontal bars indicate significant information for the selected item (light blue), the non-selected item (dark blue), and significant differences in information about the selected and non-selected items (black). Bar widths indicate significance: P < 0.05, 0.01, and 0.001 for thin, medium, and thick, respectively (two-sided cluster-corrected t-tests). c, Mean ± s.e.m. difference in z-scored colour information about the selected and non-selected items for more accurate and less accurate trials. LPFC: 435 neurons, FEF: 125 neurons, parietal: 221 neurons, V4: 240 neurons. As in b, trials were split on the basis of angular error (relative to median error). Positive values reflect more information about the selected item than the non-selected item. 
Horizontal bars indicate significant differences between more and less accurate trials; width indicates significance: P < 0.05, 0.01, and 0.001 for thin, medium, and thick, respectively (two-sided cluster-corrected t-tests).
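The trial split used in b and c is a simple median split on absolute angular error. An illustrative sketch, assuming trials exactly at the median are excluded (the legend specifies strictly less-than/greater-than the median):

```python
import statistics

def median_split(trial_errors):
    """Split trial indices into 'more accurate' (error below the median) and
    'less accurate' (error above the median) halves; ties at the median are
    dropped, matching a strict less-than/greater-than split."""
    med = statistics.median(trial_errors)
    accurate = [i for i, e in enumerate(trial_errors) if e < med]
    inaccurate = [i for i, e in enumerate(trial_errors) if e > med]
    return accurate, inaccurate
```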

Extended Data Fig. 8 Distributed representations of colour in prefrontal cortex were transformed over time.

a, Mean z-scored colour information for the upper (x-axis) and lower (y-axis) stimuli immediately before selection cue onset (average over −500 to 0 ms before the selection cue) for LPFC (583 neurons). Most neurons carried some amount of information about both items (that is, neurons did not lie along the axes). b, To check whether neurons that primarily carried information about just one item were driving the orthogonality between the colour planes in LPFC before the selection cue, we re-computed the cosine of the angle between the colour planes (Methods) using populations of neurons with significant colour information about one item only or both items (see Methods for description of this test). Histograms show the distribution of the cosine of the angle between the best-fitting planes for the upper and lower stimuli during the pre-cue period for these ‘both’ and ‘1 item’ populations of neurons (with each population subsampled to an equal number of neurons) (Methods). Distributions were estimated from 1,000 resamples of trials. Green squares indicate median values. While the ‘both’ neurons did display slightly less orthogonality than the ‘1 item’ neurons, this difference was not significant (P > 0.4, two-sided bootstrap of difference). Cosine angles are not zero for ‘1 item’ neurons because ‘1 item’ neurons still contain subthreshold information (P > 0.05) about the other item, as seen in a, and subsampling cells in this way decreases statistical power, thereby inflating low cosine values. c, Population trajectories for lower colours, over time, as projected into the lower colour subspace defined either before or after selection (left and right, respectively). Follows Fig. 4e. The lower colour subspace was defined as a 2D space that maximally explained variance across the four lower colours (Methods). As for the upper colour (Fig. 
4e), temporal cross-generalization was poor, suggesting that the colour information was represented in different subspaces before and after the selection cue. d, Before selection, colour representations in LPFC are better separated using the pre-selection subspace. After selection, colours are better separated in the post-selection subspace. Separability was measured as the area of the quadrilateral defined by the responses to colours (c, Fig. 4e), projected into either the pre-selection or post-selection subspaces (left and right columns in each plot; area averaged across upper and lower items). Violin plots show distributions estimated from 1,000 resamples of trials. *P < 0.05, **P < 0.01, ***P < 0.001 (two-sided bootstrap of difference).
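The "cosine of the angle between the colour planes" used in panel b can be illustrated in a reduced 3D space (as in the paper's visualizations): fit a best plane to each set of four colour means and take the absolute cosine between the two plane normals. A minimal NumPy sketch under these assumptions, not the authors' Matlab code:

```python
import numpy as np

def plane_normal(points):
    """Normal of the best-fitting plane for a set of 3D points (e.g. the mean
    population responses to the four colours, projected into a 3D space):
    the right singular vector with the smallest singular value."""
    pts = np.asarray(points, dtype=float)
    centred = pts - pts.mean(axis=0)
    _, _, vt = np.linalg.svd(centred)
    return vt[-1]

def plane_cosine(points_a, points_b):
    """|cos| of the angle between the two best-fitting planes:
    1 when the colour planes are parallel, 0 when they are orthogonal."""
    return abs(float(np.dot(plane_normal(points_a), plane_normal(points_b))))
```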

Extended Data Fig. 9 The alignment of selected items was greater in prefrontal cortex than other brain regions, was greater than the alignment of non-selected items, and was greater when memories were more accurate.

a, Projected population responses for selected upper and non-selected lower colours, computed as in Fig. 4a. The selected and non-selected colours remain orthogonal after the selection cue (main text). b, Projected population responses for non-selected upper and non-selected lower colours. As with the selected colour planes, the non-selected colour planes appear parallel after the selection cue. c, Mean correlation between the population representation of each colour in the upper and lower position during retro trials, when both items were selected (left), one item was selected and another item was non-selected (middle), and when both items were non-selected (right). Correlation was measured during an ‘early’ time period during the delay (dark grey; 150–350 ms after the offset of the stimulus) and a ‘late’ time period during the delay (light grey; 200–0 ms before the onset of the colour wheel). Correlation was measured after subtracting the mean response at each location (Methods). Violin plots show bootstrapped distributions estimated from 1,000 resamples of trials. Horizontal lines indicate pairwise comparisons (two-sided uncorrected bootstrap of difference). Lone asterisks denote two-sided uncorrected bootstrap versus zero: *P < 0.05, **P < 0.01, ***P < 0.001. d, Cosine of the angle between the best-fitting planes for the upper and lower stimuli. Planes were fit to selected and non-selected items during both the early and late time periods (as in c). Histograms show full distribution, estimated from 1,000 resamples of trials; green lines indicate median values. Horizontal lines indicate pairwise comparisons: *P < 0.05, **P < 0.01, ***P < 0.001 (two-sided uncorrected bootstrap of difference). e, To find out whether the selection process transformed the cued and non-cued item in similar ways, we estimated the transformation matrices that mapped pre-cue representations of an item onto their post-cue representation (Methods, Supplementary Discussion 3). 
Then, we tested whether these transformations were able to reconstruct representations on withheld trials. Transformations were tested on the same condition (withheld trials; first column); on the other item in a condition (for example, applying the transformation of a selected upper item to a non-selected lower item; second column); on the same item, but in a different condition (for example, applying the transformation of a selected upper item to a non-selected upper item; third column); and on the other item in a different condition (for example, applying the transformation of a selected upper item to a selected lower item; fourth column). Violin plots show distributions of these mean reconstruction errors, estimated from 1,000 resamples of trials. Red crosses indicate the distribution mean; dashed lines show the reconstruction error expected by chance, estimated by random shuffle (Methods). The results indicate a common component to the transformation of the selected and non-selected items in the same condition (second column), but there was also an item-specific transformation (reflected in the lower reconstruction error for the same item; first column). Horizontal lines show pairwise comparisons: ***P < 0.001 by two-sided uncorrected bootstrap of difference. f, The selected upper and selected lower colour planes do not align on inaccurate trials. Figure follows Fig. 4b, but shows data for trials in which absolute angular error was greater than the median error. Black markers show the cosine of the angle (y-axis) between the two colour planes around the time of cue onset (x-axis) and the black line shows the best-fitting logistic function.
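The transformation analysis in panel e can be summarized as follows: fit a linear map from pre-cue to post-cue population states by least squares, then score it by its reconstruction error on withheld trials against a shuffle-based chance level. The sketch below is not the authors' code; the trial counts, neuron counts, noise level, and function names are all illustrative assumptions.

```python
# Minimal sketch of the panel-e logic: estimate a linear transformation that
# maps pre-cue population states onto post-cue states, then score it by
# reconstruction error on held-out trials versus a trial-shuffled chance level.
import numpy as np

rng = np.random.default_rng(0)

def fit_transformation(pre, post):
    """Least-squares matrix T such that post ~= pre @ T.
    pre, post: (n_trials, n_neurons) population states."""
    T, *_ = np.linalg.lstsq(pre, post, rcond=None)
    return T

def reconstruction_error(T, pre, post):
    """Mean Euclidean distance between predicted and observed post-cue states."""
    pred = pre @ T
    return np.linalg.norm(pred - post, axis=1).mean()

# Toy data: 200 trials x 50 neurons, generated from a ground-truth linear
# map plus observation noise (assumed, for illustration only).
n_trials, n_neurons = 200, 50
true_T = rng.normal(size=(n_neurons, n_neurons)) / np.sqrt(n_neurons)
pre = rng.normal(size=(n_trials, n_neurons))
post = pre @ true_T + 0.1 * rng.normal(size=(n_trials, n_neurons))

# Cross-validate: fit on the first half, test on withheld trials.
T = fit_transformation(pre[:100], post[:100])
err_heldout = reconstruction_error(T, pre[100:], post[100:])

# Chance level: shuffle the trial pairing of the held-out post-cue states.
err_chance = reconstruction_error(T, pre[100:], post[100:][rng.permutation(100)])
```

A transformation that generalizes (first column of panel e) gives a held-out error well below the shuffled chance level; applying it across items or conditions (remaining columns) probes how much of the map is shared.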

Extended Data Fig. 10 Colour representations of the attended item were immediately aligned on prospective trials.

a, Population responses 200 ms after stimulus offset on pro trials (projected into a reduced subspace for visualization). As in Fig. 4a, markers indicate mean position of population activity for each condition (binned by the colour and location of the attended item) in a subspace spanned by the first three principal components that explain the most variance across all eight conditions. b, Mean ± s.e.m. correlation of population vectors representing colours at the same location (self; red line) or between locations (cross-location; blue line) on pro trials. Correlations were measured after subtracting the mean vector at each location (as in Fig. 4c; Methods). Distribution was estimated from 1,000 resamples of trials. Self-correlation was computed on held-out trials and provides an upper bound on the between-location correlation values, given the noise level. Bars reflect uncorrected two-sided bootstrap (P < 0.05) for each correlation type against zero (red and blue) and between each other (black). c, As in b, but for retro trials. d, Mean correlation between the population representations of each colour in the upper and lower position during pro trials, when both items were attended (left), when one item was attended and the other was non-attended (middle), and when both items were non-attended (right). Correlation was measured during an ‘early’ time period during the delay (dark grey; 150–350 ms after the offset of the stimulus) and a ‘late’ time period during the delay (light grey; 200–0 ms before the onset of the colour wheel). Correlation was measured after subtracting the mean response at each location (Methods). Violin plots show distributions, estimated from 1,000 resamples of trials. Horizontal lines indicate pairwise comparisons (two-sided uncorrected bootstrap of difference) and lone asterisks reflect two-sided uncorrected bootstrap against zero: *P < 0.05, **P < 0.01, ***P < 0.001. 
e, Cosine of the angle between the best-fitting planes for the upper and lower stimuli. Planes were fit to attended and non-attended items during both the early and late time periods, as in d. Histograms show the full distribution, estimated from 1,000 resamples of trials; green lines indicate median values. f, Mean correlation between the population representation for each colour during pro trials and the representations during the early or late time periods of retro trials. Correlation was computed between the colour representations taken from the 300 ms before the onset of the response wheel on pro trials and the colour representations taken from either a pre-selection period (left distribution; −300 to 0 ms before cue) or a post-selection period (right distribution; −300 to 0 ms before response wheel onset) on retro trials. Correlations were measured after subtracting the mean vector at each location, as in Fig. 4c (Methods). Violin plots reflect the distribution, estimated from 1,000 resamples of trials. Horizontal line indicates pairwise comparison (two-sided uncorrected bootstrap of difference) and lone asterisks reflect two-sided bootstrap against zero: *P < 0.05, **P < 0.01, ***P < 0.001.
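The alignment measure used throughout these panels — correlating the population representation of the same colour across the upper and lower locations after subtracting each location's mean response — can be sketched as below. This is not the authors' code; the number of colours, neuron count, and noise model are assumptions chosen only to illustrate aligned versus independent codes.

```python
# Minimal sketch of the cross-location correlation: subtract each location's
# mean response across colours, then correlate the matched colour vectors.
import numpy as np

rng = np.random.default_rng(1)

def cross_location_correlation(upper, lower):
    """upper, lower: (n_colours, n_neurons) mean responses per colour.
    Returns the mean Pearson correlation of matched, mean-subtracted
    colour vectors across the two locations."""
    u = upper - upper.mean(axis=0)
    w = lower - lower.mean(axis=0)
    rs = [np.corrcoef(u[c], w[c])[0, 1] for c in range(u.shape[0])]
    return float(np.mean(rs))

# Toy data: 4 colours x 60 neurons. An "aligned" pair of locations shares a
# common colour code; an "independent" pair uses unrelated codes.
n_colours, n_neurons = 4, 60
code = rng.normal(size=(n_colours, n_neurons))
upper = code + 0.2 * rng.normal(size=code.shape)
lower_aligned = code + 0.2 * rng.normal(size=code.shape)
lower_indep = rng.normal(size=code.shape)

r_aligned = cross_location_correlation(upper, lower_aligned)
r_indep = cross_location_correlation(upper, lower_indep)
```

Under this measure, a shared colour code yields a correlation near the noise-limited ceiling (the "self" correlation of held-out trials), whereas orthogonal, location-specific codes yield a correlation near zero.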

Supplementary information

Supplementary Information

This file contains Supplementary Table 1 and Supplementary Discussions 1-4. Supplementary Table 1 contains statistics of neural recordings. Supplementary Discussion 1 discusses balancing generalized and task-specific representations. Supplementary Discussion 2 discusses distributed memory representations. Supplementary Discussion 3 discusses transformation of selected and unselected items. Supplementary Discussion 4 discusses cognitive control through dynamic transformation of representations.

Reporting Summary

Peer Review File

Cite this article

Panichello, M.F., Buschman, T.J. Shared mechanisms underlie the control of working memory and attention. Nature 592, 601–605 (2021). https://doi.org/10.1038/s41586-021-03390-w
