Reward uncertainty asymmetrically affects information transmission within the monkey fronto-parietal network

Abstract

A central hypothesis in research on executive function is that controlled information processing is costly and is allocated according to the behavioral benefits it brings. However, while computational theories predict that the benefits of new information depend on prior uncertainty, the cellular effects of uncertainty on the executive network are incompletely understood. Using simultaneous recordings in monkeys, we describe several mechanisms by which the fronto-parietal network reacts to uncertainty. We show that the variance of expected rewards, independently of the value of the rewards, was encoded in single neuron and population spiking activity and local field potential (LFP) oscillations, and, importantly, asymmetrically affected fronto-parietal information transmission (measured through the coherence between spikes and LFPs). Higher uncertainty selectively enhanced information transmission from the parietal to the frontal lobe and suppressed it in the opposite direction, consistent with Bayesian principles that prioritize sensory information according to a decision maker’s prior uncertainty.

Introduction

Executive control is broadly understood as the ability to engage in information processing in pursuit of a goal, especially in circumstances requiring non-habitual or novel responses1. In humans and monkeys, executive function depends on a network of frontal and parietal areas, which is activated in relation to demanding behaviors requiring the suppression of inappropriate response tendencies, monitoring and adjusting behavioral strategies, and the goal-directed control of attention1,2.

Theories of computational rationality, like current frameworks of executive function, propose that controlled (rather than automatic) information processing is costly and is engaged in proportion to the benefits it brings to the organism1,3. Because the decision-theoretic (Bayesian) definition of information is in terms of a reduction of uncertainty, an important implication of this view is that control should be optimally allocated to tasks that not merely have reward value but, more specifically, have uncertainty. It is in conditions of higher ex ante uncertainty that animals can expect to obtain the greatest benefits from processing new information and improving prediction accuracy4,5,6,7.

Consistent with this view, a growing literature shows that attention is recruited by uncertainty independently of reward gains. Animals are intrinsically motivated to resolve uncertainty independently of instrumental incentives5,8, the expectation of new information influences eye movements in humans9 and monkeys10, and oculomotor neurons in monkey parietal cortex have stronger responses preceding saccades that are expected to reduce uncertainty11. And yet, while existing studies have tested neural activity in the fronto-parietal network in tasks involving risk and ambiguity, learning, exploration, novelty, or surprise (e.g. refs. 12,13,14,15,16,17), critical open questions remain about the cellular effects of uncertainty on this network.

One question concerns the distinction between uncertainty and reward gains. In instrumental conditions, when animals make reward-maximizing decisions, reductions of decision uncertainty are closely related to increases in long-term reward gains5,18. A handful of studies recently used non-instrumental conditions to show that individual neurons have distinct responses to the variance and value of expected rewards, but these studies have targeted the orbitofrontal cortex19 and subcortical structures20,21 rather than the fronto-parietal network22,23,24,25,26 (but see ref. 11 for a notable exception).

A second key question is how uncertainty affects not only neural activity within areas but information flow between areas. A central tenet of Bayesian4 and predictive coding theories6 is that, in states of high prior uncertainty, the brain downregulates top-down signals conveying uncertain prior expectations and upregulates the bottom-up transmission of sensory information. However, while this view is prevalent in computational theories, there has been no empirical demonstration of uncertainty-dependent modulations of functional connectivity.

To examine these questions, we simultaneously recorded single-neuron responses and local field potential (LFP) oscillations in the dorsolateral prefrontal cortex (dlPFC) and area 7A, two strongly interconnected nodes of the monkey fronto-parietal network. We used a simple task in which monkeys were cued to expect certain or uncertain rewards but could not make decisions to maximize those rewards. We show that uncertainty has representations in action potential activity and LFP oscillations that are distinct from those of expected value (EV). Importantly, uncertainty asymmetrically enhances spike-field coherence (SFC) from the parietal to the frontal lobe while suppressing SFC in the opposite direction, consistent with theoretical predictions of optimal inference under uncertainty.

Results

Two monkeys performed a visually guided saccade task in which they formed expectations about the trial’s rewards based on familiar visual cues. On each trial after achieving central fixation, the monkeys were shown a cue indicating the trial’s reward probability (Fig. 1a), which was followed, after a 400 ms delay period, by presentation of the target for the subsequent saccade. Upon making the required saccade, the monkeys received a reward according to the probability signaled by the cue. Cue and target locations were independently randomized across two locations (8° eccentricity to the right or left of fixation). Thus, the cue was only predictive of the coming reward but not the instrumental action, allowing us to examine behavioral and neural responses to reward expectations independently of saccade planning or reward-maximizing decision strategies.

Monkeys were familiarized with a set of cues signaling nine reward distributions, whose variance and EV were statistically dissociated. Three cues signaled deterministic rewards of, respectively, 3, 6, or 9 points, whereas the remaining six cues indicated probabilistic rewards, with a small or large reward size being equally likely to occur (Fig. 1b, single- and double-line cues). Reward sizes were determined through a mean-preserving procedure, whereby the small and large probabilistic rewards were symmetrically positioned, with low or high variance, around the deterministic EV. This produced a cue set that statistically dissociated three levels of variance (0, 1, and 4) and three levels of EV (Fig. 1c). Both monkeys achieved high proficiency, with a high fraction of correctly completed trials (monkey 1: 83% correct, monkey 2: 81% correct overall). Importantly, the fraction of correct trials did not vary with variance or EV (two-way ANOVA for each monkey, all p > 0.19), ensuring that the rewards that the monkeys experienced were not distorted by uneven performance and corresponded to those signaled by the cues.
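
The mean-preserving construction can be illustrated with a minimal sketch. It assumes that the two equally likely outcomes of a probabilistic cue lie at EV ± d, so that the variance equals d² and the reported variance levels of 0, 1, and 4 correspond to d = 0, 1, and 2; the half-spreads and all names below are inferred for illustration and are not taken from the task code.

```python
from statistics import mean, pvariance

# Deterministic reward sizes (points) as stated in the text.
EVS = [3, 6, 9]
# ASSUMPTION: half-spreads inferred from the reported variances (0, 1, 4),
# since two equally likely outcomes at EV - d and EV + d have variance d**2.
HALF_SPREADS = [0, 1, 2]


def build_cue_set():
    """Return {(ev, variance): (small, large)} for the nine cues."""
    cues = {}
    for ev in EVS:
        for d in HALF_SPREADS:
            outcomes = (ev - d, ev + d)  # mean-preserving spread around ev
            cues[(mean(outcomes), pvariance(outcomes))] = outcomes
    return cues


cue_set = build_cue_set()
for (ev, var), outcomes in sorted(cue_set.items()):
    print(f"EV={ev} variance={var} outcomes={outcomes}")
```

Because every EV level is paired with every variance level, the two factors are orthogonal across the cue set, which is what allows their effects to be estimated separately.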

Analyses of anticipatory licking confirmed that the monkeys were familiar with the cues and were cognizant of both variance and EV (Fig. 1d). The generalized linear model (GLM) coefficients for licking (see “Methods”) were significantly greater than zero for both variance and EV (mean ± SEM, variance: 0.03 ± 0.0006; EV: 0.24 ± 0.002, all p < 10−9 relative to 0, signed-rank test). In contrast, EV and variance did not consistently affect the monkeys’ saccades. Saccade reaction times (RTs) increased with EV in monkey 1 but not in monkey 2 (GLM coefficients, respectively, p = 2.6 × 10−8 and p = 0.35 relative to 0) and were not affected by variance in either monkey (all p > 0.1). Moreover, the effects of EV and variance on licking were uncorrelated with those on RT across sessions and showed no significant interactions with the location of the visual cue (all p > 0.3). Thus, reward expectations affected anticipatory licking independently of saccade orienting, consistent with previous reports that the two behaviors have different reward sensitivity27.

To investigate the neural correlates of variance and EV, we implanted multi-channel electrode arrays in area 7A and the dlPFC focusing on subdivisions that are reciprocally connected and have visual and attention-related activity—i.e., area OPT in the parietal cortex and the pre-arcuate portion of the dlPFC28,29 (Fig. S1). We describe the effects of variance and EV in single-neuron activity, followed by their influence on LFP oscillatory power and SFC.

Variance and EV have distinct single-neuron representations

In both 7A and dlPFC, individual neurons showed significant encoding of variance or EV (Table 1 and Fig. 2). The sensitive neurons were equally likely to respond to uncertainty and EV with increases or decreases in firing (Table 1 and Fig. 2). Importantly, the GLM coefficients capturing each effect (see “Methods”) were uncorrelated, suggesting that variance and EV are encoded in distinct populations of cells (7A: r = 0.03, p = 0.49, n = 522; dlPFC: r = 0.06, p = 0.15, n = 530).

Responses with positive and negative scaling had similar prevalence and strength in the two areas (Fig. 2 and Table 1) and, across all the cells, the average coefficients showed no net enhancement or suppression of firing with either variable (all p > 0.23, with the single exception of a net positive effect of EV in area 7A; mean ± SEM GLM coefficient of 0.171 ± 0.032, p = 10−6 relative to 0). Cells with positive and negative scaling had sustained effects throughout the cue and delay epochs (Fig. 3a). We found no correlation between trial-by-trial firing rates and licking responses, suggesting that the cells encoded expectations rather than licking per se. Moreover, cells with positive or negative scaling for one variable had no significant sensitivity to the other factor (Fig. 3b), confirming that variance and EV were encoded by distinct populations of cells.

Because cue location was included as a nuisance regressor in the GLM, the sensitivity to EV and variance was above and beyond any cue location response. Four additional observations support this conclusion. First, while some neurons encoded the location of the visual cue, location coefficients were uncorrelated with those for variance or EV (Fig. S2). Second, the EV and variance selective cells showed no consistent visual response, ruling out that they merely encoded the appearance of the cue (Fig. S2). Third, EV and variance coefficients that were estimated separately for each cue location were statistically equivalent (all p > 0.59, signed-rank test) and highly correlated (all p < 0.02). Finally, we found no significant correlation between trial-by-trial firing rates and saccadic RT. Thus, the neurons encoded global expectations of reward variance and EV independently of visuo-spatial selectivity or saccade planning activity.

Noise correlations: Given the stark segregation of EV and variance responses we found in both areas, we wondered whether the neurons encoding these variables had distinct functional connectivity. To examine this question, we computed noise correlations between trial-by-trial activity in pairs of simultaneously recorded cells, focusing on firing rates in a 600 ms epoch preceding cue onset to avoid confounds related to evoked activity30.

Noise correlations were higher in pairs in which both neurons coded for the same factor (both neurons encoding variance or both encoding EV) relative to pairs with mixed selectivity (Fig. 4a, “within-factor” vs “across-factor”) and this difference was highly robust in both areas (Table 2, dlPFC, p = 2.5 × 10−7, n = 47 and n = 56 pairs; 7A, p = 5.1 × 10−8, n = 79 and n = 43 pairs; Kruskal–Wallis test). In addition, in pairs with homogeneous selectivity, noise correlations were larger if the two neurons had the same versus opposite polarity (Fig. 4a) for both variables and both areas (Table 2). Variance, EV, or response polarity had no effect on across-trial variability (Fano factor), ruling out that this may have produced apparent effects on noise correlations. Thus, subject to their encoding polarity, neurons responding to variance shared distinct variability relative to those encoding EV.
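
As an illustration of the measure, the sketch below computes a noise correlation as the Pearson correlation between the trial-by-trial pre-cue spike counts of two simultaneously recorded cells. The spike counts are made-up toy values and the function name is hypothetical; any normalization applied in the actual analysis is not specified here and is not reproduced.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of spike counts."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when a cell has constant counts
    return sxy / sqrt(sxx * syy)


# Toy pre-cue spike counts (one value per trial); illustrative only.
cell_a = [4, 6, 5, 7, 3, 6]
cell_b = [5, 7, 5, 8, 4, 6]   # tends to co-fluctuate with cell_a
cell_c = [6, 4, 5, 3, 7, 4]   # fluctuates in the opposite direction

r_ab = pearson_r(cell_a, cell_b)
r_ac = pearson_r(cell_a, cell_c)
print(f"r(a, b) = {r_ab:.2f}, r(a, c) = {r_ac:.2f}")
```

Restricting the counts to the pre-cue epoch, as described above, means that shared variability cannot be attributed to a common cue-evoked response.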

Decoding: Because information can be transmitted by neurons that lack linear selectivity, we conducted a final analysis to estimate the decoding capacity from the entire population of cells. We trained support vector machine (SVM) classifiers to perform pairwise discriminations between the different levels of variance and EV based on the population responses and analyzed the bootstrapped distributions of excess accuracy (the differences in accuracy between the real and label-shuffled (null) data sets). To determine the extent to which variance and EV had distinct or overlapping representations, we also tested incongruent training-testing regimes, in which the classifiers were trained on one variable (variance or EV) and tested on the other.

Decoding performance in congruent training-testing regimes was clearly superior to that in incongruent regimes for both variables in both areas. In both the pooled analyses (Fig. 4b) and pairwise comparisons (Fig. S3), 95% confidence bands were clearly above 0 for all congruent training-testing classifications, while decoding in incongruent regimes was significantly weaker and at chance levels in all cases. There were no significant differences between the decoding of variance and EV in 7A and dlPFC.

In sum, analyses of single-neuron activity, noise correlations, and population decoding show that variance and EV had clearly segregated representations that were similar in 7A and the dlPFC.

Variance and EV modulate oscillatory LFP power

Because, in addition to spiking activity, LFP oscillations are sensitive indicators of cognitive states31,32,33, we next examined how oscillations are affected by variance and EV. To this end, we divided single-trial LFP traces into 1 Hz × 1 ms pixels spanning the cue and delay epochs and, for each pixel, fit a GLM that included variance and EV as factors, controlling for cue location and interactions (identical to the model applied to spiking activity; “Methods”). The resulting coefficient maps showed that variance and EV exerted consistent effects in two frequency bands: a lower frequency band between 8 and 18 Hz, corresponding to α/low-β frequencies, and a higher band of 18–43 Hz, corresponding to the high-β/low-γ frequencies (Figs. 5 and 6).

Power in an α/low-β frequency band (8–18 Hz) is widely associated with task engagement and arousal in different tasks and brain areas in humans and monkeys34. Consistent with this widely replicated result, activity in this band was suppressed by variance and EV in both 7A and the dlPFC (Fig. 5, pink regions of interest (ROIs)). The strongest effects arose in the late cue and early delay periods (Fig. 6a, b) and were highly significant for both variables for each monkey (7A variance: monkey 1: p < 6 × 10−6 (Wilcoxon rank-sum test relative to 0 across all pixels in the ROI); monkey 2: p < 2 × 10−14; EV: monkey 1: p < 2 × 10−21, monkey 2: p < 9 × 10−13; dlPFC variance: monkey 1: p < 6 × 10−10, monkey 2: p < 8 × 10−8; EV: monkey 1: p < 5 × 10−28, monkey 2: p < 2 × 10−12).

In contrast with the uniform suppression in the low-frequency band, the effects in the high-β/low-γ band differed for variance and EV and across the two areas (Fig. 6c vs d and Fig. 5, purple ROI). In 7A, power in this band was suppressed by variance and enhanced by EV (Fig. 6c vs d, dashed traces), while the dlPFC showed the opposite pattern, being enhanced by variance and suppressed by EV (Fig. 6c vs d, solid traces). Each effect was highly robust in each monkey (7A variance: monkey 1: p < 2 × 10−13, monkey 2: p < 4 × 10−17; 7A EV: monkey 1: p < 2 × 10−16, monkey 2: p < 4 × 10−18; dlPFC variance: monkey 1: p < 3 × 10−6, monkey 2: p < 2 × 10−22; dlPFC EV: monkey 1: p < 4 × 10−28, monkey 2: p < 9.5 × 10−4). As for the single-neuron results, these effects were above and beyond location selectivity, were equivalent at the two cue locations, and were uncorrelated with the sensitivity to variance and EV in saccadic RT. Thus, variance and EV reduced power in the α/low-β frequency range in both areas but had distinct area-specific effects in the high-β/low-γ frequency range.

Variance enhances parietal-to-frontal information transmission

Given theoretical predictions that uncertainty modulates the balance between top-down and bottom-up information transmission4,6, we asked how variance and EV modulate functional interactions between the two areas. To this end, we calculated SFC using the method of Vinck et al. that is known to compensate for biases due to low spike counts and volume conduction35,36. The SFC measures the extent to which spikes arrive at a consistent phase of the LFP oscillations and provides an index of directional interactions. The SFC between spikes in area A and LFPs in area B measures the extent to which outputs from area A influence area B, while the SFC between spikes in area B and LFPs in area A measures the opposite interactions35,36,37 (see also “Methods”).
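
To make the idea of phase consistency concrete, the sketch below computes a spike-count-unbiased phase-locking statistic from the LFP phases sampled at spike times. It assumes that the estimator is the pairwise phase consistency (PPC), i.e., the average cosine of the phase difference over all pairs of spikes; the text above cites only the method of Vinck et al., so this choice, the function name, and the toy phases are assumptions made for illustration.

```python
import cmath
import math


def pairwise_phase_consistency(phases):
    """Average cos(phase_j - phase_k) over all distinct spike pairs.

    Uses the identity |sum_j exp(i*phase_j)|**2 = N + 2 * sum_{j<k} cos(phase_j - phase_k),
    which avoids an explicit loop over pairs.
    """
    n = len(phases)
    if n < 2:
        raise ValueError("need at least two spikes")
    resultant = sum(cmath.exp(1j * p) for p in phases)
    return (abs(resultant) ** 2 - n) / (n * (n - 1))


# Toy LFP phases (radians) at the times of spikes from the other area.
locked = [0.3, 0.3, 0.3, 0.3]                        # every spike at the same phase
spread = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]  # phases evenly spread

print(pairwise_phase_consistency(locked))   # close to 1: perfect locking
print(pairwise_phase_consistency(spread))   # negative: no consistent phase
```

Under the interpretation given above, passing phases of the dlPFC LFP sampled at 7A spike times would index parietal-to-frontal interactions, and swapping the roles of the two areas would index the opposite direction.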

The most robust modulation we found was an asymmetric effect of variance on fronto-parietal SFC. Higher uncertainty was associated with enhanced SFC from 7A spikes to dlPFC LFPs, suggesting enhanced information transmission from 7A to the dlPFC (Fig. 7a). Conversely, higher variance was associated with reduced SFC in the opposite direction, suggesting reduced information transmission from dlPFC to 7A (Fig. 7b). These effects were consistent in both monkeys and could not be explained by changes in LFP power, which had opposite signs in the two areas (Figs. 5 and 6) or by LFP–LFP coherence, which did not show consistent modulations with variance or EV (Fig. S5). The SFC modulations were unique to variance and to across-area communications, with only weak and inconsistent effects being produced by EV on SFC across areas (Fig. S4a) and by both variance and EV within areas (Fig. S4b).

The SFC modulations by variance extended to all frequency bands and differed across the task epochs (Fig. 7c–f). In the α/low-β frequency band, the earliest modulation was an increase in parietal-to-frontal SFC followed by a decrease in the frontal-to-parietal direction (Fig. 7c, f; all p < 10−7 in each monkey, Kruskal–Wallis test; see figure legend for detailed statistics). In the high-β/low-γ frequency band this sequence was reversed, with the earliest modulation being a reduction in frontal-to-parietal SFC followed by increased parietal-to-frontal SFC (Fig. 7d, e; all p < 10−7 in each monkey). Thus, uncertainty sets off an intricate temporal sequence of increases and decreases in fronto-parietal functional connectivity.

Individual variability and risk preference

Although our study did not examine economic decisions, it is interesting to consider how the responses to reward expectancy we report may relate to risk preference. To explore this question we tested the monkeys, after neural recordings were complete, on a choice version of the task in which the monkeys received two cues on each trial and chose one cue whose reward probability they wished to obtain (Fig. S6 legend). This revealed that monkey 1 was risk seeking and monkey 2 was risk averse—a highly significant individual difference (% of choices to the higher variance of, respectively, 56.2% and 47.8%; both p < 0.014 signed-rank test against 50%; p < 10−9 between monkeys; Fig. S6).

These individual differences corresponded with the relative sensitivity to variance versus EV in several behavioral and neural indicators. Monkey 1, who was risk seeking, was relatively more sensitive to EV than to variance in his licking and saccadic RT; monkey 2, who was risk averse, showed the opposite pattern, and was more sensitive to variance relative to EV (Fig. S6). Among neural indicators, analogous differences were reflected in LFP power and the fraction of sensitive cells in both areas (Fig. S6), although not in measures of SFC. Thus, the relative weight that individuals afford to variance versus EV may relate to risk attitudes, a conclusion that can be verified in future investigations.

Discussion

We show that the uncertainty of an expected reward, independently of the value of the reward, affects multiple aspects of microscopic and mesoscopic fronto-parietal activity, including single-neuron responses, LFP oscillations, and SFC.

Uncertainty powerfully modulated LFP power, producing different effects in low- and high-frequency bands. Low-frequency (α/low-β) LFP power homogeneously decreased in 7A and the dlPFC as a function of EV and uncertainty. Because lower α/low-β LFP power has been linked with enhanced task engagement, reduced inhibition, and desynchronized neural activity in multiple structures38,39,40, this suggests that arousal was enhanced by both EV and uncertainty in our task. In contrast with the homogeneous effects of uncertainty in the lower frequency band, the signature of uncertainty in the higher frequency range differed markedly by area and was clearly distinct from that of EV. This heterogeneity is consistent with the diverse modulations previously reported for γ-band power, which consist of increases and decreases with attention across tasks and cortical areas38. Based on the prevailing view that γ-band oscillations primarily index feedforward sensory processing32,41, our findings suggest that feedforward processing is differentially affected by uncertainty versus EV.

Another clear distinction we find is that variance and EV had separate representations in spiking activity. Previous studies have shown that reward variance is encoded independently of value by individual neurons in the orbitofrontal cortex19, in subcortical structures such as the basal forebrain20,21,42 and, more recently, in the anterior cingulate cortex43,44. Our findings show that this segregation extends to lateral fronto-parietal areas and to circuit-based measures including noise correlations and population decoding capacity.

Importantly, we show that, rather than producing overall increases or decreases in firing, uncertainty and EV had opponent-coding representations, enhancing or suppressing responses in distinct classes of cells. An opponent-coding representation has been previously reported for EV in the dlPFC45,46 and here we show that it extends to the parietal cortex and to uncertainty. Our finding that neurons with similar polarity have higher noise correlations suggests that they form subnetworks with distinct functional properties. Neurons with positive and negative EV scaling may be associated with, respectively, approach and avoidance behaviors (go/no-go tendencies) that are mediated by distinct basal ganglia pathways47. Neurons with positive or negative variance coding, on the other hand, may arbitrate between different modes of cognitive control based on uncertainty—relying on a simpler striatal controller when familiar, habitual strategies are sufficient but engaging the prefrontal cortex in uncertain conditions48.

While uncertainty and value have been shown to modulate the representations of specific objects or actions (e.g., in the frontal cortex14,49 and dorsal striatum50) the EV and variance-sensitive cells we describe reflected global, non-spatial states of expectancy that were independent of visuo-spatial selectivity. This result most likely reflects the task we employed, in which monkeys merely formed expectations without formulating a choice, contrasting with previous studies in which monkeys made a deliberate choice. Thus, an important question for future investigations concerns the relation between expectancy and decision making, especially given our preliminary finding that the relative impact of uncertainty versus EV on expectations may be systematically related to individual risk attitudes.

A central result we report is that uncertainty had powerful effects on fronto-parietal connectivity. Higher uncertainty was associated with reduced SFC from the frontal to the parietal cortex but enhanced SFC from the parietal to the frontal lobe. These results are consistent with a recent report that, although fronto-parietal areas have similar single-neuron activity, the direction of their functional interactions can be strongly dependent on context28. That study found that, in monkeys performing a familiar categorization task, information about task context and rules was predominantly transmitted in a top-down direction, from dlPFC to 7A. This is consistent with our finding that, when monkeys have low uncertainty, frontal-to-parietal SFC is higher than parietal-to-frontal SFC (i.e., 0 variance, Fig. 7a, b). However, we show that, for higher uncertainty, this balance can drastically change in favor of parietal-to-frontal transmission, consistent with Bayesian optimal inference theories.

Our finding that uncertainty reduces SFC in the top-down direction does not imply that the dlPFC goes “offline” in uncertain conditions. Indeed, the modulations of frontal-to-parietal and parietal-to-frontal SFC had overlapping time-courses and some of the strongest effects of uncertainty—i.e., on high-β/low-γ frequency LFP power (Fig. 5) and SFC (Fig. 7c–f)—emerged first in the dlPFC and only later in 7A. It is thus possible that the connectivity changes relied on dynamic interactions between the frontal and parietal lobes. We propose that uncertainty is detected by frontal cortical areas including the dlPFC and the anterior cingulate cortex; these areas may provide the initial drive which, perhaps by triggering release of neuromodulators, ultimately leads to increases in sensory gains and enhancements in parietal-to-frontal transmission2,51.

Our findings also support the idea that the parietal cortex plays an important role in resolving uncertainty. Early support for this view comes from the reinforcement learning literature showing that rats have increases in associability (learning rates) for uncertain sensory cues and these increases are reduced by lesions of the parietal cortex52. Subsequent single-neuron recordings in monkeys show that parietal neurons have enhanced responses to novel stimuli and salient distractors16,53 and, in multi-step decision tasks, assign credit and learning specifically at junctures that resolve uncertainty11,54,55. Thus, an important direction for future research is to refine our understanding of the intricate mechanisms that allow the brain to allocate attention and other cognitive resources to task junctures that not only have high value but benefit from new information and a reduction of uncertainty5,7,18,26.

Methods

General methods

Data were collected from two adult male rhesus monkeys (Macaca mulatta; 9–12 kg) using standard behavioral and neurophysiological techniques as described previously56. All methods were approved by the Animal Care and Use Committees of Columbia University and New York State Psychiatric Institute as complying with the guidelines within the Public Health Service Guide for the Care and Use of Laboratory Animals. Visual stimuli were presented on a MS3400V XGA high definition monitor (CTX International, Inc., City of Industry, CA; 62.5 × 46.5 cm viewing area). Eye position was recorded using an eye tracking system (Arrington Research, Scottsdale, AZ). Licking was recorded with an in-house device that detected interruptions in a laser beam produced by extensions of the monkeys’ tongue.

A trial started with the presentation of two textured square placeholders (1° width) located along the horizontal meridian at 8° eccentricity to the right and left of a central fixation point (white square, 0.2° diameter). After a 300–500 ms period of central fixation (when the monkeys maintained gaze within a 1.5–2° square window centered on the fixation point) one of the placeholders was replaced by a randomly selected reward cue (a vertical rectangle measuring 1.2 × 5° with 11 gray bars indicating the reward scale, and one or two gradations highlighted in yellow, indicating the trial’s rewards). The cue was visible for 400 ms and was followed by a 400-ms delay period, after which the fixation point disappeared and one of the placeholders simultaneously increased in luminance, indicating the saccade target. The target location was randomized independently of the reward cue. If the monkey made a saccade to the target with an RT of 100–700 ms and maintained fixation within a 2.0–3.5° window for 377 ms, he received a reward with the magnitude and probability that had been indicated by the cue.

Neural recordings

After completing behavioral training, each monkey was implanted with two 48-electrode Utah arrays (electrode length 1.5 mm) arranged in rectangular grids (1 mm spacing; monkey 1, 7 × 7 mm; monkey 2, 5 × 10 mm) and positioned in the pre-arcuate portion of the dlPFC and the posterior portion of area 7A (Fig. S1). Data were recorded using the Cereplex System (Blackrock, Salt Lake City, Utah) over 24 sessions spanning 4 months after array implantation in monkey 1, and 11 sessions spanning 2 months after implantation in monkey 2.

Statistics and reproducibility

Data were analyzed with MATLAB (MathWorks, Natick, MA; version R2016b) and other specialized software as noted below. Raw spikes were sorted offline using WaveSorter57. We analyzed a total of 12,029 trials that (1) were correctly completed and (2) had RT within 2 standard deviations relative to the mean of each monkey’s full dataset (monkey 1: n = 8082 analyzed trials, monkey 2: n = 3947). These trials were further sub-selected for different analyses. For single-neuron analysis, we included only well-isolated cells, as defined by the automated sorting results followed by visual inspection to verify that only neurons with waveforms clearly separated from noise were included in the analysis, and that the population of cells was substantially different across days (ensuring that we did not systematically record from the same subsets of cells). For LFP and SFC analyses, trials were further cleaned to remove electrical artifacts as described in detail below. The unit of statistical comparison and the statistical tests differed across analyses and are described in detail throughout the text.

Analysis of behavior

The lickometer signal was digitized at 1 kHz to produce a trial-by-trial record of licking with 1 ms resolution. The probability of licking was measured in a time window centered on the time of each monkey’s average peak licking response (monkey 1: 400–800 ms after cue onset; monkey 2: 800–1100 ms after cue onset). Licking probabilities in individual trials were pooled across sessions and subjected to a GLM analysis with EV and variance as regressors of interest and with cue position and the EV × variance interaction as nuisance regressors (using a binomial distribution and logit link function, implemented with the fitglm function in the MATLAB statistics toolbox). Models that included a parametric uncertainty regressor outperformed those that included only a binary indicator of probabilistic versus deterministic cues and are presented throughout the paper.

Single-neuron spike analysis

Raw spikes were sorted offline using WaveSorter and produced a total of 1175 neurons in dlPFC (749 in monkey 1) and 971 neurons in 7A (755 in monkey 1). We focused the analysis on the subset of units that were well isolated, had at least five trials in each condition, and fired at least five spikes on average within the time interval from 500 ms before to 1000 ms after cue onset, comprising 530 neurons in dlPFC (432 in monkey 1) and 522 neurons in 7A (481 in monkey 1).

To measure neuronal selectivity, we fit each neuron’s trial-by-trial spike count in the interval 0–800 ms after cue onset using a GLM with factors EV, variance, EV × variance, and cue location, using a normal distribution and identity link function. To estimate changes in firing rate variability, we computed the Fano factor, the ratio of the across-trial variance of the spike count to its mean. Although the Fano factor was lower during the cue/delay relative to the pre-cue epochs, we found no consistent changes as a function of variance or EV in either area.
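
For reference, the Fano factor is the across-trial variance of the spike count divided by its mean. A minimal sketch follows, assuming the unbiased (n − 1) sample variance and using made-up spike counts; the function name is hypothetical.

```python
from statistics import mean, variance


def fano_factor(spike_counts):
    """Across-trial variance of the spike count divided by its mean.

    ASSUMPTION: uses the unbiased (n - 1) sample variance; requires >= 2 trials.
    """
    m = mean(spike_counts)
    if m == 0:
        return float("nan")  # undefined for a silent cell
    return variance(spike_counts) / m


# Toy spike counts from four trials of one condition (illustrative only).
counts = [2, 4, 4, 6]
print(fano_factor(counts))  # sample variance 8/3 over mean 4, i.e. about 0.67
```

A Poisson process has a Fano factor of 1, so values below 1 indicate spiking that is more regular across trials than Poisson.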

Peri-stimulus time histograms were constructed for display purposes by smoothing the cue-onset aligned spike train with a Gaussian kernel with 50 ms standard deviation, z-scoring using the mean and standard deviation during the cue and delay epochs for each cell, and averaging across cells.
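The smoothing and z-scoring steps can be sketched in Python as follows. This is an illustrative sketch only: the helper names, the 1 ms bin width, the truncation of the kernel at 3 standard deviations, and the use of the population standard deviation are all assumptions that the text does not specify.

```python
import math
from statistics import mean, pstdev


def gaussian_kernel(sd_ms, bin_ms=1.0, n_sd=3.0):
    """Unit-area Gaussian kernel, truncated at n_sd standard deviations (assumed)."""
    half = int(round(n_sd * sd_ms / bin_ms))
    w = [math.exp(-0.5 * (k * bin_ms / sd_ms) ** 2) for k in range(-half, half + 1)]
    total = sum(w)
    return [x / total for x in w]


def smooth(spike_train, kernel):
    """Convolve a binned spike train with the kernel, treating values past the edges as zero."""
    half = len(kernel) // 2
    n = len(spike_train)
    out = []
    for t in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            s = t + j - half
            if 0 <= s < n:
                acc += w * spike_train[s]
        out.append(acc)
    return out


def zscore(values, reference):
    """Z-score values using the mean and SD of a reference epoch (e.g., cue and delay)."""
    mu, sd = mean(reference), pstdev(reference)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


# Example: a single spike at 300 ms in a 601 ms train, smoothed with a 50 ms SD kernel.
kernel = gaussian_kernel(50.0)
train = [0] * 601
train[300] = 1
rate = smooth(train, kernel)
```

Because the kernel has unit area, smoothing spreads each spike over time without changing the total spike count, provided the spike is far enough from the edges of the window.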

SVM classification

We smoothed the raw spike train using a Gaussian kernel of 50 ms standard deviation and measured the average smoothed firing rate in each trial in the interval 0–800 ms after cue onset. We then evaluated decoding accuracy for each pairwise classification (e.g., EV3 vs EV6, variance 1 vs variance 4, etc.) using the data pooled across all the neurons in an array. To construct the pooled dataset, we randomly selected m trials from each neuron and every condition, where m was equal to the minimum number of trials across all neurons and all conditions. We used a fivefold cross-validation procedure with 200 repetitions to compute decoding accuracy in the original dataset and repeated the procedure after randomly shuffling trial labels to compute the baseline accuracy expected purely by chance.
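The construction of the pooled dataset can be sketched as below. This Python sketch covers only the trial-subsampling step (not the SVM or the cross-validation), and the nested-dictionary layout, the function name `build_pseudo_population`, and the fixed random seed are assumptions for illustration.

```python
import random


def build_pseudo_population(rates, seed=0):
    """Subsample m trials per neuron and condition and stack them into pseudo-trials.

    rates : dict mapping neuron -> condition -> list of single-trial firing rates
            (hypothetical layout; every neuron must have every condition).
    Returns (pooled, m), where pooled[condition] is a list of m pseudo-trials and
    each pseudo-trial holds one firing rate per neuron.
    """
    rng = random.Random(seed)
    neurons = sorted(rates)
    conditions = sorted(rates[neurons[0]])
    # m is the minimum trial count across all neurons and all conditions.
    m = min(len(rates[n][c]) for n in neurons for c in conditions)
    pooled = {}
    for c in conditions:
        # Draw m trials without replacement from each neuron, independently.
        columns = [rng.sample(rates[n][c], m) for n in neurons]
        pooled[c] = [list(trial) for trial in zip(*columns)]
    return pooled, m


# Example with two hypothetical neurons and two conditions.
rates = {
    "n1": {"EV3": [1, 2, 3, 4], "EV6": [5, 6, 7]},
    "n2": {"EV3": [10, 11, 12], "EV6": [13, 14, 15, 16, 17]},
}
pooled, m = build_pseudo_population(rates)
```

Equalizing the trial count in this way keeps the classes balanced, so that chance accuracy for a pairwise classification is close to 50% and can be checked against the shuffled-label baseline.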

LFP pre-processing

The raw LFP from each electrode and trial was measured from 1200 ms before to 2000 ms after cue onset, notch filtered at 60 Hz, low pass filtered at 100 Hz, and subjected to a linear trend removal. The traces from each session were then pooled and subjected to a two-step cleaning procedure to remove outliers in, respectively, the frequency and time domains. For the first step, which removed outliers in the frequency domain, we calculated the power spectrum of each LFP trace in the range of 0–90 Hz (using a multitaper method with four tapers) and characterized each trial with a five-dimensional vector containing the sum of the logarithm of the power spectrum in five frequency bands (0.5–4, 4–8, 8–12, 12–30, and 30–90 Hz). We then reduced the dimensionality of each session’s dataset to two principal components using principal component analysis and clustered this two-dimensional dataset using Gaussian Mixture Models (fitgmdist function in the MATLAB statistics and machine learning toolbox). This procedure produced, for each session, one or two “dense” clusters that contained most of the session’s data, and one or two “sparse” clusters containing the remaining trials, in which the LFP power in at least one frequency band was an outlier. We discarded the trials in the sparse clusters. In addition, we discarded trials that were identified as outliers within the dense clusters, i.e., for which the Mahalanobis distance to all other trials in the cluster was above the 90th percentile. The trials surviving the first step were subjected to a second step that removed outliers in the time domain. To this end, we computed the peak-to-peak amplitude of the broadband LFP trace in each trial, z-transformed these values, and removed trials for which this measure was more than half a standard deviation away from the mean across all trials. This was a conservative cleaning procedure that removed all the trials with poor signal quality due to a variety of reasons (e.g., low signal-to-noise ratio, artifact, or saturation). Overall, 39.5% of trials (12.3–77.4% across sessions) were excluded after pre-processing.
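The second, time-domain cleaning step can be sketched in Python as follows. The function names and the use of the population standard deviation for the z-transform are assumptions; the frequency-domain step (PCA and Gaussian mixture clustering) is not reproduced here.

```python
from statistics import mean, pstdev


def peak_to_peak(trace):
    """Peak-to-peak amplitude of one broadband LFP trace."""
    return max(trace) - min(trace)


def keep_by_peak_to_peak(traces, max_abs_z=0.5):
    """Return indices of trials whose z-scored peak-to-peak amplitude is within max_abs_z.

    traces    : list of LFP traces, one per trial (hypothetical input)
    max_abs_z : 0.5 corresponds to "half a standard deviation" in the text
    """
    p2p = [peak_to_peak(tr) for tr in traces]
    mu, sd = mean(p2p), pstdev(p2p)  # population SD (an assumption)
    if sd == 0:
        return list(range(len(traces)))
    return [i for i, v in enumerate(p2p) if abs((v - mu) / sd) <= max_abs_z]


# Example: nine trials with amplitude 10 and one saturated trial with amplitude 100.
traces = [[0, 10]] * 9 + [[-50, 50]]
kept = keep_by_peak_to_peak(traces)
```

In this toy example the nine ordinary trials are kept and the large-amplitude trial is removed.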

LFP power spectrum

For each trial that was accepted for analysis, we calculated the LFP power spectrum in 1 Hz frequency bands using Morlet wavelet transformation (ft_freqanalysis function of the FieldTrip toolbox58). The power in each band was then z-scored relative to all the trials and time points within the session, and normalized relative to the trial’s baseline using the following equation:

$$\mathrm{relative\ power\ change}\,(t,f) = \frac{\mathrm{power}_{tf} - \overline{\mathrm{baseline}}_f}{\overline{\mathrm{baseline}}_f}, \tag{1}$$

where $$\mathrm{power}_{tf}$$ is the power at time t and frequency f, and $$\overline{\mathrm{baseline}}_f$$ is the power in frequency f during the 300 ms interval before cue onset on the same trial. Normalization relative to the frequency-specific baseline accounted both for trial-by-trial variability and 1/f power distribution36.
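Eq. (1) can be illustrated with a short Python sketch for a single trial and frequency. The function name, the argument layout, and the half-open baseline window of −300 to 0 ms are assumptions for illustration.

```python
from statistics import mean


def relative_power_change(power, times_ms, baseline_ms=(-300.0, 0.0)):
    """Apply Eq. (1) to one trial at one frequency.

    power       : power values over time at a single frequency
    times_ms    : matching time stamps, relative to cue onset
    baseline_ms : baseline window [start, end); the 300 ms before cue onset
    """
    lo, hi = baseline_ms
    base = mean(p for p, t in zip(power, times_ms) if lo <= t < hi)
    return [(p - base) / base for p in power]


# Example: baseline power of 2, doubling at cue onset and halving afterwards.
power = [2, 2, 2, 4, 1]
times_ms = [-300, -200, -100, 0, 100]
change = relative_power_change(power, times_ms)
```

A value of 1.0 means the power doubled relative to baseline, and −0.5 means it halved.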

GLM of LFP power spectrum

The relative power change from Eq. (1) produced a time-frequency map of normalized LFP power for each trial and electrode. To determine how these maps varied as a function of uncertainty and EV, we pooled the single-trial maps across the electrodes of an array and fit this pooled dataset using a GLM with factors EV, variance, EV × variance, and Cue location, assuming a normal distribution and identity link function. This produced a time-frequency map of coefficients measuring the effects of EV and variance, controlling for any visuo-spatial response and EV × variance interaction (Fig. 5). To identify regions of interest (ROIs) within the GLM coefficient maps, we divided the cue and delay periods into 200 ms epochs, and identified frequencies for which the coefficients for a variable were significantly different from 0 with the same sign in both monkeys (Kruskal–Wallis test with false discovery rate (FDR) correction).
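The text does not state which FDR procedure was used. As one common possibility, the Benjamini–Hochberg step-up procedure is sketched below in Python; the choice of this procedure, the function name, and the level q = 0.05 are assumptions, not details taken from the study.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, True where the null hypothesis is rejected
    while controlling the false discovery rate at level q.
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= q * k / n.
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / n:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    reject = [False] * n
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff_rank:
            reject[i] = True
    return reject


# Example: p values for eight hypothetical frequency bins.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
significant = benjamini_hochberg(p)
```

Because the procedure is step-up, a p value that misses its own threshold can still be rejected if a larger p value passes a later threshold.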

Field–field coherence

Field–field coherence was measured using the weighted phase lag index (WPLI)37. The WPLI uses the imaginary part of the cross-spectrum to reduce the influence of volume conduction. Within a session, for every task condition this index was calculated across trials and LFP channel pairs for every time and frequency. GLMs with EV, variance, and EV × variance factors were then fitted to the coherence maps from different sessions, assuming a normal distribution and identity link function.
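The WPLI at a single time-frequency point can be sketched in Python as the magnitude of the mean imaginary part of the cross-spectrum divided by the mean magnitude of that imaginary part, taken across trials. This sketch implements the basic (not the debiased) estimator; the function names and input layout are assumptions, and it is not the toolbox implementation used in the study.

```python
def cross_spectrum(a, b):
    """Cross-spectrum of two complex Fourier coefficients from the same trial."""
    return a * b.conjugate()


def wpli(cross_spectra):
    """Weighted phase lag index across trials at one time-frequency point.

    cross_spectra : list of complex cross-spectrum values, one per trial.
    Returns 0.0 when every imaginary part is zero (purely zero-lag coupling).
    """
    imag = [x.imag for x in cross_spectra]
    denom = sum(abs(v) for v in imag)
    if denom == 0:
        return 0.0
    return abs(sum(imag)) / denom


# Example: a consistent phase lead gives 1; a lead that flips sign gives 0.
consistent = wpli([1 + 1j, 2 + 0.5j, -1 + 2j])
flipping = wpli([1 + 1j, 1 - 1j])
zero_lag = wpli([1 + 0j, 2 + 0j])
```

Purely real cross-spectra, as expected from instantaneous mixing of a common source, contribute nothing to the index, which is why it is less sensitive to volume conduction.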

Spike-field coherence

We used the FieldTrip toolbox58 to calculate the power spectrum for the trial-by-trial LFP using multitaper analysis, and the ft_spiketriggeredspectrum function to measure the phase in frequencies of 4–47 Hz. We estimated SFC using the average Pairwise Phase Consistency index (PPC2, ft_spiketriggeredspectrum_stat function), which is known to minimize biases due to low spike counts and volume conduction35,37. For every neuron–LFP channel pair, in each task condition and frequency, PPC2 was calculated across all spikes that the cell fired in the corresponding task condition. PPC2 values of all cell–LFP pairs (excluding pairs in which the neurons did not emit any spikes) were submitted to non-parametric analyses to detect influences of EV and variance (n ranging between 136,368 and 164,880 across conditions). In the frequency plots, p values were corrected for comparison across frequencies using the FDR correction method.
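To illustrate the idea behind pairwise phase consistency, the Python sketch below computes the basic all-pairs estimator, i.e., the average cosine of the phase difference over all pairs of spikes. This is a simplification: it is not the PPC2 variant used in the study, which treats spike pairs from the same trial differently, and the function name is an assumption.

```python
import cmath


def pairwise_phase_consistency(phases):
    """Basic pairwise phase consistency of spike-LFP phases (in radians).

    Equals the mean of cos(phase_j - phase_k) over all pairs j != k, computed
    from the resultant vector: (|sum exp(i*phase)|^2 - N) / (N * (N - 1)).
    """
    n = len(phases)
    if n < 2:
        return float("nan")
    resultant = sum(cmath.exp(1j * p) for p in phases)
    return (abs(resultant) ** 2 - n) / (n * (n - 1))


# Example: perfectly locked spikes versus spikes spread evenly around the cycle.
locked = pairwise_phase_consistency([0.3] * 5)
spread = pairwise_phase_consistency(
    [0.0, cmath.pi / 2, cmath.pi, 3 * cmath.pi / 2]
)
```

Unlike the squared resultant length, this estimator does not grow with low spike counts, and it can take small negative values for finite samples with no phase locking.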

Reporting summary

Further information on experimental design is available in the Nature Research Reporting Summary linked to this paper.

Data availability

The summary statistics are available within the article and its data supplement. All other data are available from the corresponding author upon reasonable request.

References

1. Shenhav, A., Botvinick, M. & Cohen, J. The expected value of control: an integrative theory of anterior cingulate cortex function. Neuron 79, 217–240 (2013).

2. Shenhav, A. et al. Toward a rational and mechanistic account of mental effort. Annu. Rev. Neurosci. 40, 99–124 (2017).

3. Grossberg, S. Adaptive pattern classification and universal recoding, II: Feedback, expectation, olfaction, and illusions. Biol. Cybern. 23, 187–202 (1976).

4. Yu, A. J. & Dayan, P. Uncertainty, neuromodulation, and attention. Neuron 46, 681–692 (2005).

5. Gottlieb, J. & Oudeyer, P. Y. Toward a neuroscience of active sampling and curiosity. Nat. Rev. Neurosci. 19, 758–770 (2018).

6. Pezzulo, G., Rigoli, F. & Friston, K. J. Hierarchical active inference: a theory of motivated control. Trends Cogn. Sci. 22, 294–306 (2018).

7. Fan, J. An information theory account of cognitive control. Front. Hum. Neurosci. 8, 680 (2014).

8. Kidd, C. & Hayden, B. Y. The psychology and neuroscience of curiosity. Neuron 88, 449–460 (2015).

9. Baranes, A. F., Oudeyer, P. Y. & Gottlieb, J. Eye movements encode epistemic curiosity in human observers. Vis. Res. 117, 81–90 (2015).

10. Daddaoua, N., Lopes, M. & Gottlieb, J. Intrinsically motivated oculomotor exploration guided by uncertainty reduction and conditioned reinforcement in non-human primates. Sci. Rep. 6, 20202 (2016).

11. Horan, M., Daddaoua, N. & Gottlieb, J. Parietal neurons encode information sampling based on decision uncertainty. Nat. Neurosci. 22, 1327–1335 (2019).

12. Bach, D. R. & Dolan, R. J. Knowing how much you don’t know: a neural organization of uncertainty estimates. Nat. Rev. Neurosci. 13, 572–586 (2012).

13. Platt, M. L. & Huettel, S. A. Risky business: the neuroeconomics of decision making under uncertainty. Nat. Neurosci. 11, 398–403 (2008).

14. Grabenhorst, F., Tsutsui, K. I., Kobayashi, S. & Schultz, W. Primate prefrontal neurons signal economic risk derived from the statistics of recent reward experience. Elife 8, e44838 (2019).

15. Ebitz, R. B., Albarran, E. & Moore, T. Exploration disrupts choice-predictive signals and alters dynamics in prefrontal cortex. Neuron 97, 450–461 (2018).

16. Foley, N. C., Jangraw, D. C., Peck, C. & Gottlieb, J. Novelty enhances visual salience independently of reward in the parietal lobe. J. Neurosci. 34, 7947–7957 (2014).

17. Hayden, B. Y., Heilbronner, S. R., Pearson, J. M. & Platt, M. L. Surprise signals in anterior cingulate cortex: neuronal encoding of unsigned reward prediction errors driving adjustment in behavior. J. Neurosci. 31, 4178–4187 (2011).

18. Gottlieb, J., Hayhoe, M., Hikosaka, O. & Rangel, A. Attention, reward and information seeking. J. Neurosci. 34, 15497–15504 (2014).

19. O’Neill, M. & Schultz, W. Coding of reward risk by orbitofrontal neurons is mostly distinct from coding of reward value. Neuron 68, 789–800 (2010).

20. Monosov, I. E. & Hikosaka, O. Selective and graded coding of reward uncertainty by neurons in the primate anterodorsal septal region. Nat. Neurosci. 16, 756–762 (2013).

21. Monosov, I. E., Leopold, D. A. & Hikosaka, O. Neurons in the primate medial basal forebrain signal combined information about reward uncertainty, value, and punishment anticipation. J. Neurosci. 35, 7443–7459 (2015).

22. Leong, Y., Radulescu, A., Daniel, R., DeWoskin, V. & Niv, Y. Dynamic interaction between reinforcement learning and attention in multidimensional environments. Neuron 93, 451–463 (2017).

23. Padmala, S. & Pessoa, L. Reward reduces conflict by enhancing attentional control and biasing visual cortical processing. J. Cogn. Neurosci. 23, 3419–3432 (2011).

24. Kennerley, S. W. & Wallis, J. D. Reward-dependent modulation of working memory in lateral prefrontal cortex. J. Neurosci. 29, 3259–3270 (2009).

25. Mansouri, F. A., Egner, T. & Buckley, M. J. Monitoring demands for executive control: shared functions between human and nonhuman primates. Trends Neurosci. 40, 15–27 (2017).

26. Nakamura, K. & Komatsu, M. Information seeking mechanism of neural populations in the lateral prefrontal cortex. Brain Res. 1707, 79–89 (2019).

27. Peck, C. J., Jangraw, D. C., Suzuki, M., Efem, R. & Gottlieb, J. Reward modulates attention independently of action value in posterior parietal cortex. J. Neurosci. 29, 11182–11191 (2009).

28. Crowe, D. A. et al. Prefrontal neurons transmit signals to parietal neurons that reflect executive control of cognition. Nat. Neurosci. 16, 1484–1491 (2013).

29. Katsuki, F. & Constantinidis, C. Early involvement of prefrontal cortex in visual bottom-up attention. Nat. Neurosci. 15, 1160–1166 (2012).

30. Cohen, M. R. & Kohn, A. Measuring and interpreting neuronal correlations. Nat. Neurosci. 14, 811–819 (2011).

31. Haegens, S. & Zion Golumbic, E. Rhythmic facilitation of sensory processing: a critical review. Neurosci. Biobehav. Rev. 86, 150–165 (2018).

32. Bastos, A. M. et al. Visual areas exert feedforward and feedback influences through distinct frequency channels. Neuron 85, 390–401 (2015).

33. Bastos, A. M., Vezoli, J. & Fries, P. Communication through coherence with inter-areal delays. Curr. Opin. Neurobiol. 31, 173–180 (2015).

34. Van Diepen, R. M., Foxe, J. J. & Mazaheri, A. The functional role of alpha-band activity in attentional processing: the current zeitgeist and future outlook. Curr. Opin. Psychol. 29, 229–238 (2019).

35. Vinck, M., Battaglia, F. P., Womelsdorf, T. & Pennartz, C. Improved measures of phase-coupling between spikes and the local field potential. J. Comput. Neurosci. 33, 53–75 (2012).

36. Cohen, M. X. Analyzing Neural Time Series Data: Theory and Practice (MIT Press, 2014).

37. Vinck, M., Oostenveld, R., van Wingerden, M., Battaglia, F. & Pennartz, C. M. A. An improved index of phase-synchronization for electrophysiological data in the presence of volume-conduction, noise and sample-size bias. NeuroImage 55, 1548–1565 (2011).

38. Thiele, A. & Bellgrove, M. A. Neuromodulation of attention. Neuron 97, 769–785 (2018).

39. An, J., Yadav, T., Hessburg, J. P. & Francis, J. T. Reward expectation modulates local field potentials, spiking activity and spike-field coherence in the primary motor cortex. eNeuro 6, https://doi.org/10.1523/ENEURO.0178-19.2019 (2019).

40. Lemaire, N. et al. Effects of dopamine depletion on LFP oscillations in striatum are task- and learning-dependent and selectively reversed by L-DOPA. Proc. Natl Acad. Sci. USA 109, 18126–18131 (2012).

41. Buffalo, E. A., Fries, P., Landman, R., Buschman, T. J. & Desimone, R. Laminar differences in gamma and alpha coherence in the ventral stream. Proc. Natl Acad. Sci. USA 108, 11262–11267 (2011).

42. Ledbetter, N. M., Chen, C. D. & Monosov, I. E. Multiple mechanisms for processing reward uncertainty in the primate basal forebrain. J. Neurosci. 36, 7852–7864 (2016).

43. White, J. K. et al. A neural network for information seeking. Nat. Commun. 10, 5168 (2019).

44. Monosov, I. E. Anterior cingulate is a source of valence-specific information about value and uncertainty. Nat. Commun. 8, 134 (2017).

45. Kennerley, S. W., Behrens, T. E. & Wallis, J. D. Double dissociation of value computations in orbitofrontal and anterior cingulate neurons. Nat. Neurosci. 14, 1581–1589 (2012).

46. Wallis, J. D. & Kennerley, S. W. Heterogeneous reward signals in prefrontal cortex. Curr. Opin. Neurobiol. 20, 191–198 (2010).

47. Calabresi, P., Picconi, B., Tozzi, A., Ghiglieri, V. & Di Filippo, M. Direct and indirect pathways of basal ganglia: a critical reappraisal. Nat. Neurosci. 17, 1022–1030 (2014).

48. Daw, N. D., Niv, Y. & Dayan, P. Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat. Neurosci. 8, 1704–1711 (2005).

49. Tsutsui, K. I., Grabenhorst, F., Kobayashi, S. & Schultz, W. A dynamic code for economic object valuation in prefrontal cortex neurons. Nat. Commun. 7, 12554 (2016).

50. White, J. K. & Monosov, I. E. Neurons in the primate dorsal striatum signal the uncertainty of object-reward associations. Nat. Commun. 7, 12735 (2016).

51. Silvetti, M., Vassena, E., Abrahamse, E. & Verguts, T. Dorsal anterior cingulate-brainstem ensemble as a reinforcement meta-learner. PLoS Comput. Biol. 14, e1006370 (2018).

52. Pearce, J. M. & Mackintosh, N. J. Two Theories of Attention: A Review and a Possible Integration (Oxford University Press, New York, 2010).

53. Suzuki, M. & Gottlieb, J. Distinct neural mechanisms of distractor suppression in the frontal and parietal lobe. Nat. Neurosci. 16, 98–104 (2013).

54. Gersch, T. M., Foley, N. C., Eisenberg, I. & Gottlieb, J. Neural correlates of temporal credit assignment in the parietal lobe. PLoS ONE 9, e88725 (2014).

55. Foley, N. C., Kelley, S. P., Mhatre, H., Lopes, M. & Gottlieb, J. Parietal neurons encode expected gains in instrumental information. Proc. Natl Acad. Sci. USA 114, E3315–E3323 (2017).

56. Oristaglio, J., Schneider, D. M., Balan, P. F. & Gottlieb, J. Integration of visuospatial and effector information during symbolically cued limb movements in monkey lateral intraparietal area. J. Neurosci. 26, 8310–8319 (2006).

57. Phillips, M. H. In Society for Neuroscience Vol. 508.12 (San Diego, CA, 2012).

58. Oostenveld, R., Fries, P., Maris, E. & Schoffelen, J.-M. FieldTrip: Open source software for advanced analysis of MEG, EEG, and invasive electrophysiological data. Comput. Intell. Neurosci. 156869 (2011).

Acknowledgements

The work was supported by a generous gift of surgical equipment from Synthes, Inc. and a McKnight Foundation Memory and Cognitive Disorder Award to J.G.

Author information

Contributions

N.C.F. and J.G. designed the experiment. N.C.F., S.A.S., M.C., M.S., R.L., and J.G. implemented the experiment and collected the data. B.T., S.K., R.L., and J.G. analyzed the data and wrote the manuscript.

Corresponding author

Correspondence to Jacqueline Gottlieb.

Ethics declarations

Competing interests

The authors declare that they have no competing interests. J.G. is an Editorial Board Member for Communications Biology, but was not involved in the editorial review of, nor the decision to publish, this article.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Taghizadeh, B., Foley, N.C., Karimimehr, S. et al. Reward uncertainty asymmetrically affects information transmission within the monkey fronto-parietal network. Commun Biol 3, 594 (2020). https://doi.org/10.1038/s42003-020-01320-6
