Introduction

Perceptual decision-making requires neural circuits to integrate evidence and classify a stimulus to trigger the correct behavioral response. Neurons in a range of cortical areas modulate their firing rate to signal animal’s choice1. The functional properties of decision-making neural circuits have been extensively studied and modeled2,3,4,5,6,7,8,9. Central to the function of these circuit models are attractors in the activity space which characterize the population’s encoding of a given choice. The attractor mechanism driving the decision-making activity in these models relies on structured recurrent connections between populations of excitatory neurons that are each selective for a different choice8,10,11. Inhibitory neurons, in this view, are merely supporting actors facilitating competition and providing balance to the excitatory neurons.

Since the canonical models of decision-making circuits were built, the diversity and complexity of inhibitory neurons within the cortex have been characterized in increasing detail12. In primary sensory areas, inhibitory neurons are generally more broadly tuned13 and more densely connected to neighboring excitatory neurons14,15. These inhibitory neurons reliably modulate spike output to reflect stimulus features and have highly specific connectivity to surrounding excitatory neurons16,17. The stimulus selectivity of inhibitory neurons is enhanced by learning and attention18 suggesting that task dependent modulation of inhibitory activity is necessary for cognition. Beyond the primary sensory cortex, stimulus information and animal choice can be decoded from the activity of inhibitory neurons in secondary sensory and association areas indicating a role for selective inhibition in higher cognitive functions, such as decision making19,20,21. While there is growing evidence that the activity and connectivity of inhibitory neurons is as complex as excitatory neurons, how the selectivity of inhibitory activity and the diversity of their connections affect the decision-making function of cortical circuits is still unknown.

To reveal the role of choice selective inhibitory neurons in decision-making computations we extended a well established mean-field model of decision-making circuits4 to account for the presence of inhibitory choice selectivity. Our model allows us to parametrically alter the specificity of connections between two choice-selective excitatory and two choice-selective inhibitory populations. Through analysis of this model, we found that while inhibition must drive competition between choice-selective excitatory populations it must also stabilize activity driven by recurrent excitation at the same time. These two concurrent roles are mediated by inhibitory connections to the excitatory populations and either role can be enhanced by structured inhibitory connectivity. We found that inhibitory selectivity expands the space of possible circuits which support decision-making by enhancing either a competitive or stabilizing role for inhibition. In addition, the connectivity motif between choice selective populations alters the underlying attractor dynamics and modulates the decision-making performance to prioritize speed or accuracy. We generalized these results by training recurrent neural networks (RNNs) to perform the same decision-making task. After training, RNNs had both excitatory and inhibitory units significantly selective for choice and displayed a similar dependence between the specificity of excitatory and inhibitory connections found in the mean-field model. Finally, we perturbed inhibitory neuron activity in these models to probe the dynamical regime in which the circuit operates. We found two regimes in which circuits respond differently to perturbations of inhibitory neurons: one in which the competitive role dominates and the other in which the stabilizing role dominates. Our work demonstrates that choice selective inhibition impacts decision-making behavior by enhancing either the competitive or the stabilizing role for inhibition in the circuit. These results generate testable predictions for perturbation experiments.

Results

We consider circuits where two excitatory (E) populations integrate dedicated streams of sensory evidence to produce a categorical choice (Fig. 1a). In contrast to previous circuit models of decision-making with global inhibition, we include two inhibitory (I) populations which can inherit choice selectivity from excitatory neurons (Methods). We model the circuit dynamics using two-dimensional mean-field equations where the mean presynaptic activation of N-methyl-D-aspartate (NMDA) receptor of the two excitatory (E1 and E2) populations are the dynamic variables4. The average strength of connections between the four choice selective populations is controlled by a specificity parameter γ. For each of three connection classes (E to E, E to I, and I to E; Fig. 1b), γEE, γEI, and γIE set the balance of connection strengths between populations with the same and opposite choice selectivity (Fig. 1c). For example, (1 + γEE) is the strength of feedback connections within excitatory populations selective for the same choice, and (1 − γEE) is the strength of connections between excitatory populations selective for the opposite choice. We keep γEE positive due to the importance of recurrent feedback excitation in the function of these circuits4. When γEE = 1, each of E1 and E2 have a strong self-excitatatory feedback and are not connected to each other. When γEE = 0, the strengths of excitatory connections between and within E1 and E2 are all equal. Inhibitory choice selectivity is controlled by γEI defined in the same way, which is also positive because inhibitory neurons inherit choice and stimulus information from the excitatory neurons. When γEI = 1, inhibitory population I1 receives excitatory inputs from E1 but not E2 and vice versa. When γEI = 0, each I1 and I2 receive equal excitatory inputs from E1 and E2. Thus, inhibitory activity is not choice selective when γEI = 0 because inhibitory neurons receive equal input from both excitatory populations. Inhibitory choice selectivity emerges as γEI increases (Fig. 1d).

Fig. 1: A mean-field circuit model of decision making with choice-selective inhibition.
figure 1

a The circuit diagram of the model with choice-selective excitatory and inhibitory populations. b The circuit model includes three connection classes: excitatory-excitatory (EE), excitatory-inhibitory (EI), and inhibitory-excitatory (IE). c The parameter γ controls the specificity of connections between choice-selective populations. The output connections preferentially target neurons with the same choice preference when γ is positive, and with the opposite choice preference when γ is negative. d γEI controls inhibitory choice selectivity. Firing rate of inhibitory populations for γEI = 0 (left), γEI = 0.05 (center), γEI = 0.25 (right) are shown for an example trial with stimulus strength equal to 20. e Circuits report choices by elevating the firing rate of one excitatory population. Example trials showing E1 (blue) and E2 (red) population activity for stimulus strength equal to 20 (upper panel), and for stimulus strength equal to 0 on a completed (middle panel) and invalid trial (lower panel). Gray shading indicates stimulation period. Numbers indicate activity corresponding to fixed points in f. f Eight fixed points are required for decision-making dynamics in the circuit: five in the unstimulated phase-plane (left) and three in the stimulated phase-plane (right, stimulus strength is equal to 0). Lines show nullclines of E1 and E2 populations, black squares indicate fixed-point attractors, and gray squares indicate saddle-points for a circuit with nonselective inhibition.

For inhibitory choice selectivity to have any effect on circuit function, the outputs of inhibitory populations must be structured (i.e. γIE ≠ 0; Fig. 1c). The specificity of inhibitory outputs γIE can range between [−1, 1] with negative values favoring connections between E and I populations with opposite choice preference and positive values favoring connections between E and I populations with the same choice preference. When γIE = 1, I1 sends inhibitory output to E1 but not E2. When γIE = −1, I1 sends inhibitory output to E2 but not E1. Thus, the specificity of inhibitory output connectivity defines three circuit motifs: contraspecific for γIE < 0, ipsispecific for γIE > 0, and nonspecific for γIE = 0.

In any decision-making circuit, inhibition concurrently fulfills two roles. The first is providing the substrate for competition between the excitatory populations, and the second is stabilizing the self-amplification driven by strongly recurrent excitatory populations. Both of these roles must be fulfilled for a circuit to function, but specific connections to and from inhibitory populations could enhance one of these roles (Fig. 1c). Specifically, ipsispecific inhibition can promote stabilizing feedback and contraspecific inhibition can maximize competition.

In response to an input stimulus, the circuit can produce different choice outcomes by changing the firing rates of the excitatory populations. Circuits report a choice by persistently raising the firing rate of one excitatory population at least 15 Hz above the other. Trials where this separation does not occur are considered invalid and not included in the calculation of psychometric or chronometric functions (Fig. 1e, Methods). We also require that prior to the stimulus onset, the circuit maintains low, symmetric activation of excitatory neurons. Persistence of the decision after stimulus offset allows for a choice readout to be made even after a significant delay and its utility led us to include the working memory of a choice in our criteria for inclusion as a circuit supporting decision-making (Fig. 1e). These dynamics are governed by eight fixed points across the phase planes of unstimulated and stimulated system, which are essential for the functional decision-making and working memory behavior (Fig. 1e, f). Prior to stimulus onset, both excitatory populations maintain low symmetric activation, which is set by an attractor located near the origin in the unstimulated phase plane. Following stimulus onset, the firing rate for both populations increases as the system approaches a saddle point along the stable manifold which acts as a separatrix between two choice attractors in the stimulated phase plane. Following stimulus offset, the system returns to its unstimulated phase plane and the choice of the circuit is preserved by one of two working memory attractors.

Inhibitory connection specificity expands the space of circuits that support decision making

Using the mean-field model, we investigated how the circuit’s ability to perform decision-making depends on the inhibitory connectivity structure. Specifically, we determined how choice-selective inhibition affects the presence of the eight fixed points (three attractors and two saddle points in the unstimulated phase plane, and two attractors and one saddle in the stimulated phase plane) governing decision-making behavior. We sampled the specificity parameter space to identify circuits which support these eight fixed points (Fig. 2a). We found that a broad range of circuit configurations can support decision making. There are two components of inhibitory choice selectivity which rely on specific connections to and from inhibitory populations. The first is the degree of choice selective firing by inhibitory neurons that is controlled by γEI. The second is the degree to which inhibitory populations have a specific effect on excitatory neurons that is controlled by γIE. We combine these two components into a specificity index γEIγIE, which is negative for contraspecific and positive for ipsispecific circuits following the sign of γIE. The specificity of excitatory and inhibitory connections is highly correlated in circuits supporting decision making (Fig. 2b). When inhibition is nonselective (γEI = 0) or nonspecific (γIE = 0), the strength of recurrent excitation (γEE) is highly constrained and deviations from a narrow range leads to the loss of one of the essential fixed points (Fig. 2c). For circuits with selective inhibition, a wider range of γEE will support decision making as long as a complementary inhibitory motif is present. For low γEE, the inhibitory motif must be contraspecific (γEIγIE < 0, Fig. 2b and d left) and for high γEE it must be ipsispecific (γEIγIE > 0, Fig. 2b, d right). A contraspecific inhibitory motif can promote competition in circuits where excitatory feedback connections are insufficiently strong to amplify firing rate differences between choice selective populations. An ispispecific inhibitory motif can stabilize excitatory feedback to prevent inadvertent winner-take-all dynamics in the absence of stimulus in circuits with strong excitatory specificity. By enhancing either the competitive or stabilizing role, circuits with choice selective inhibitory populations can support decision making for a wider range of γEE (Fig. 2b).

Fig. 2: Choice-selective inhibition expands the space of circuits supporting decision making.
figure 2

a The volume in the connection specificity parameter space where circuits have all fixed points necessary to support decision-making. Color indicates the inhibitory specificity index γEIγIE. b For nonselective or nonspecific inhibition, circuits that support decision-making exist only for a narrow range of γEE. If inhibition is selective and specific, a broader range of γEE becomes possible. Inhibitory specificity needs to be complementary to excitatory specificity, as reflected in the correlation between the inhibitory specificity index and γEE for circuits with all necessary fixed points. c With increasing inhibitory selectivity, a broader range of γEE can support decision making, evident as a steeper slope of the region with all required fixed points (green, good decision circuit). Panels show slices through the parameter space in a at γEI = 0.0 (left), γEI = 0.5 (center), γEI = 1.0 (right). Colored frames correspond to dots on the γEI axis in a. d When γEE is low inhibition must be contraspecific (left) and when γEE is high inhibition must be ipsispecific (right). Slices through the parameter space in a at γEE = 0.225 (left), γEE = 0.35 (center), γEE = 0.475 (right). Colored frames correspond to dots on the γEE axis in a. Arrows indicate a sequential loss of fixed points described in the text.

The emphasis on competition or stability can also be seen in which fixed points are lost when connection specificity between excitatory and inhibitory populations are not complementary. When γEE is low, nonspecific and ipsispecific circuits lack the fixed points representing choice both in the presence and absence of stimulation as well as the saddle point during the stimulus (Fig. 2d left, Supplementary Fig. 1), because recurrent excitation is too weak to drive competition alone. Contraspecific inhibition paired with low γEE restores these fixed points by emphasizing competition between populations selective for opposite choices. These fixed points emerge sequentially as the inhibitory motif becomes more contraspecific: first the choice attractors appear, followed by the saddle point, and finally by the working memory attractors (arrow in Fig. 2d left). For moderate γEE, nonspecific circuits have all eight necessary fixed points, but deviations to a contraspecific motif cause the loss of the attractor for the low initial state, whereas deviations to an ipsispecific motif cause the loss of the working memory attractors, then saddle point, and then choice attactors (arrow in Fig. 2d center, Supplementary Fig. 1). For circuits with high γEE to support decision making, inhibitory motif must be ipsispecific, as nonspecific and contraspecific circuits lack the initial low activation state attractor (Fig. 1d right, Supplementary Fig. 1). The trade-off between competition and stability across contraspecific and ipsispecific circuits is also evident in the size of choice-selective populations that support decision-making (Supplementary Fig. 2).

Specific connections between choice-selective inhibitory populations may also impact the attractors underlying decision-making. For example, competitive inhibitory-inhibitory connections can mediate disinhibition in contraspecific circuits22,23. We therefore investigated the effect of inhibitory-to-inhibitory connection specificity on decision-making dynamics. We extended our mean-field approach to explicitly model the activity of two choice-selective excitatory and two choice-selective inhibitory populations (Methods). This four-variable model produces the firing-rate dynamics and attractors similar to the original two-variable mean-field model (Supplementary Fig. 3a–c). In the four-variable model, we controlled the balance of connection strength between choice-selective inhibitory populations in the same manner as for other connection classes using a specificity parameter γII, which like γIE ranges between [−1, 1]. We sampled the four-dimensional specificity parameter space to identify points where the eight decision-making attractors are present (Supplementary Fig. 3d). As with the two-variable model, the main factor determining whether a circuit has the necessary fixed points is the linear relationship between γEE and γEIγIE. Inhibitory-to-inhibitory connection specificity γII has a limited impact on the presence of the fixed points (Supplementary Fig. 3d, cf. Fig. 2b). Circuits with negative γII are hyper-competitive and lose the low activation state attractor.

Inhibitory motif controls the speed versus accuracy trade-off

The roles enhanced by contra- and ipsispecific inhibititory motifs lead to differences in performance of decision circuits. In circuits with moderate strengths of recurrent excitation, all three motifs can support decision making for the same γEE. We found that circuits with three inhibitory motifs differ in choice accuracy on difficult trials where stimulus strength is weak (Fig. 3a). Relative to a circuit with nonspecific inhibitory outputs (γIE = 0), ipsipecific circuits are more accurate at classifying difficult stimuli but more often fail to separate the outputs sufficiently producing invalid trials (Fig. 3b). Contraspecific circuits, on the other hand, have lower accuracy for difficult stimuli. In addition, contraspecific circuits have a stimulus independent rate of trial failure attributable to trials where the firing rates of choice-selective populations separate prior to the stimulus onset (Fig. 3b), highlighting how these circuits are primed for competitive dynamics. It is well known that decision accuracy and decision time are linked through the speed-accuracy trade-off, where longer integration times lead to more accurate decisions24,25,26. Ipsispecific circuits could be more accurate at the expense of speed, so we compared the average time it takes circuits to cross the decision threshold for each stimulus strength as a proxy for decision time. Ipsispecific circuits do indeed arrive at choices more slowly than the less accurate contraspecific circuits (Fig. 3c). These differences in behavioral performance indicate a speed versus accuracy trade-off which is mediated by the specificity of connections between choice-selective populations in the circuit (also evident in the four-variable model, Supplementary Fig. 3e). These performance outcomes again highlight the roles enhanced by ipsispecific and contraspecific inhibition: the contraspecific motif primes a circuit for competition, whereas the ipsispecific motif promotes stability, lengthening integration times.

Fig. 3: Inhibitory circuit motifs mediate the speed-accuracy trade-off in decision-making.
figure 3

ac Contraspecific circuits are faster and less accurate, whereas ipsispecific circuits are slower and more accurate than nonspecific circuits. Psychometric functions (a), probability of trial completion (b), and chronometric functions (c) for circuits with different inhibitory motifs. d Contraspecific circuits deviate to a choice attractor earlier and faster than ipsispecific circuits. Single-trial trajectories are shown for three circuits with different inhibitory motifs, in 50 ms time steps (dots) for 0 stimulus strength. Noise is reduced by 50% for illustration clarity. The decision threshold for each circuit is shown by the dashed line. Squares indicate choice attractors, triangle indicates the saddle point. The stable (black) and unstable (gray) manifolds of the saddle point are shown. e As the circuit motif changes from contra- to ipsispecific, the time constant of the unstable eigenvector of the saddle point τslow increases, indicating stabilization of dynamics and longer integration times. f The time constant τslow is tightly correlated with decision time (shown for stimulus strength equal to 0). g The saddle point becomes an attractor for ipsispecific circuits with high γEIγIE. The bifurcation diagram for circuits driven by a stimulus of 0 strength shows the location of attractors (black solid line) and saddle points (black dashed line). Dashed vertical lines correspond to examples in ad. In all panels γEE = 0.32 and γEI = 0.25.

We can understand the speed-accuracy trade-off between ipsi- and contraspecific circuits by analyzing the dynamics around the saddle point. Differences in these dynamics are seen by comparing single-trial trajectories of ipsi-, non-, and contrapecific circuits in response to the neutral stimulus (Fig. 3d). At the trial start, both choice-selective populations are symmetrically activated and the trajectory moves along the stable manifold toward the saddle point. The circuit activity deviates to a choice attractor after approaching the saddle. Contra- and ipsispecific circuits differ in both how far along the stable manifold the activity progresses and how quickly it moves toward the choice attractor once it deviates. We can estimate how quickly the dynamics will leave the neighborhood of the saddle point with the time-constant τslow, which is the time-constant of dynamics moving along the unstable manifold of the saddle point4,27. Changing the circuit motif from contraspecific to ipsispecific by increasing γEIγIE leads to an increase in τslow (Fig. 3e) and slowing down the pace of decisions (Fig. 3f). The divergence of τslow indicates that ipsispecific inhibition stabilizes the saddle point until at high γEIγIE a bifurcation occurs and the saddle point becomes an attractor with a symmetric high activity state (Fig. 3g). This bifurcation leads to the system stabilizing in a state where the firing rates of two choice-selective populations do not sufficiently separate on neutral and difficult stimuli trials, a state where the circuit fails to produce a decision. Easy stimuli impose a stronger asymmetry on the phase plane4 allowing circuits with highly ipsispecific inhibition to converge to a choice on easy trials (Supplementary Fig. 4).

Strong ipsispecific inhibition destabilizes working memory

The inhibitory connectivity motif affects the circuit’s ability to maintain the working memory of a choice. Contraspecific and nonspecific circuits maintain a difference in excitatory firing rates of at least 15 Hz for a very long time following stimulus offset, whereas ipsispecific circuits exhibit a degradation of the choice readout (Fig. 4a). This behavior can be linked to the phase plane of the unstimulated circuit. Working memory is supported by two choice attractors that are separated by saddle points from the attractor with symmetric low activity state. The separation between the working memory attractors and the saddle points is smaller for more ipsispecific circuits (Fig. 4b). For highly ipsispecific circuits, working memory attractors are extinguished after merging with the saddle points (Fig. 4b).

Fig. 4: Strong ipsispecific inhibition destabilizes working memory attractors.
figure 4

a The probability of maintaining a choice after stimulus offset is diminished in ipsispecific circuits. b For ipsispecific circuits with high γEIγIE, working memory attractors are extinguished after merging with saddle points. The bifurcation diagram is for the same circuits as in Fig. 3g but in the absence of stimulus.

Inhibitory choice selectivity in trained recurrent neural networks

So far, we used the mean-field approach to establish that choice-selective inhibition supports the function of decision-making circuits by enhancing a competitive or stabilizing role. Next, we wanted to test whether this result holds broadly by using another class of decision-making network models. We therefore trained excitatory-inhibitory recurrent neural networks (RNNs) to perform a decision-making task28 and then tested whether inhibitory choice-selectivity regularly emerges in these networks after training and whether the dependence between the excitatory and inhibitory specificity aligns with the two roles for inhibition. We used RNNs with 100 excitatory and 25 inhibitory units (Fig. 5a), but our results are not specific to this number of units and hold in RNNs with twice the size (Supplementary Fig. 5). Two input streams projected to all excitatory units through input weights. Two output variables were calculated as a weighted sum of excitatory unit activity. We trained RNNs to perform an identical decision-making task as the mean-field circuits by raising an output variable which corresponds to the input stream with a higher mean value. Networks were trained by back-propagation through time to minimize the mean squared error between the network outputs and predefined targets. For a given trial, a choice was recorded when the output variables became separated by a fixed threshold set to 0.25. Trials were considered invalid if the outputs separated prior to the stimulus, failed to maintain separation after stimulus offset, or separation was never achieved. We trained networks until the correct choice was made on 85% of all trials (including correct, error, and invalid trials) in a 200 trial epoch. One hundred and fifty networks reached this training threshold in 104, 343 ± 9, 264 (mean ± s.d.) trials, ranging from 83,200 to 127,600 (Fig. 5b). Networks performed the task well, making errors and failing to complete trials only for difficult stimuli (Fig. 5c). Trained networks also took longer to make decisions when presented with a difficult stimulus, similarly to mean-field circuits (Supplementary Fig. 6).

Fig. 5: Inhibitory choice selectivity in trained recurrent neural networks.
figure 5

a RNNs are trained to compare two inputs and indicate which has higher mean by elevating the corresponding output. RNNs are composed of 100 excitatory and 25 inhibitory units. b We trained 150 RNNs to a consistent level of performance. RNN performance improved gradually during training. We stopped the training when the network performance reached 85% correct responses (gray dashed line). Lines show individual RNNs, color gradient indicates the network’s rank to reach 85% performance. c Psychometric functions (left) and probability of trial completion (right) for all trained RNNs (gray) and the mean across 150 networks (red). d Excitatory and inhibitory units in trained RNNs display choice selectivity. Traces show the activity of the RNN outputs (left), two excitatory units (center), and two inhibitory units (right) on two example trials with choice 1 (upper row, stimulus strength −20) and choice 2 (lower row, stimulus strength 20). Gray shading indicates the stimulus period. e In trained RNNs, the overall choice selectivity is greater for inhibitory than excitatory units. Distributions show the choice-selectivity index across all units from all networks. f In trained RNNs, the fraction of units with significant choice selectivity is greater for inhibitory than excitatory units. Distributions show the fraction of selective units across networks. g Trained RNNs show a range of specificity parameters γ for each connection class (colored lines - individual RNNs, black - mean ± s.d. across 150 networks). Color indicates networks sorted by γEE. h In trained RNNs, the specificity index γEIγIE is positively correlated with γEE. i In trained RNNs, excitatory-excitatory, excitatory-inhibitory, and inhibitory-excitatory specificity are correlated, whereas inhibitory-inhibitory specificity is uncorrelated with other connection classes. * indicates significant correlation (p < 0.05, two-tailed permutation test, 5000 permutations).

We determined whether inhibitory neurons in these RNNs were choice selective. We classified recurrent units as choice selective using receiver operator characteristic (ROC) analysis21 (Methods). We constructed ROC curves by decoding network choice from a unit’s activity on the time-step following stimulus offset. To identify which units significantly modulated their firing rate to reflect choice, we compared the area under the ROC curve (AUCROC) to a shuffle distribution generated from randomized trial labels (two-sided permutation test, p < 0.05, 150 permutations). Units that were identified as choice selective increased activation following the onset of a stimulus corresponding to their preferred choice (Fig. 5d). Inhibitory units had overall higher choice selectivity than excitatory units, as measured by the selectivity index AUCROC − 0.5 that can range from 0 to 0.5 (Fig. 5e, inhibitory 0.23 ± 0.17, excitatory 0.12 ± 0.16; mean ± s.d.; Wilcoxon rank-sum test p < 10−10). Also, the proportion of significantly selective units was higher for inhibitory than excitatory units (Fig. 5f, inhibitory 0.87 ± 0.07, excitatory 0.72 ± 0.06; mean ± s.d.; Wilcoxon Rank-Sum test p < 10−10). Thus, inhibitory unit activity contained overall more choice information than excitatory unit activity despite the fact that only excitatory units received stimulus input. In this respect RNNs differ from experimental data in which excitatory and inhibitory neurons contained similar choice information21.

Excitatory specificity aligns with ispi- and contraspecific inhibitory motifs in RNNs

Based on our mean-field model, we know that for choice-selective inhibition to impact circuit function, the connections from inhibitory to excitatory populations must be specific. Therefore, after identifying choice-selective units in RNNs, we sought to determine whether the connection specificity of excitatory-excitatory and excitatory-inhibitory pairs followed the relationship predicted by the mean-field model (Fig. 2b). To analyze the specificity of connections between choice-selective populations in the RNNs, we estimated the specificity parameter γ from the weights of trained RNNs defined in the same way as for the mean-field model (Methods). Trained networks consistently had strong excitatory-excitatory (γEE = 0.59 ± 0.07) and excitatory-inhibitory (γEI = 0.39 ± 0.06) specificity (Fig. 5g). This result is consistent with the constraint that inhibitory units inherit stimulus information from excitatory units to be choice or stimulus selective. Inhibitory-excitatory connections were nonspecific on average (γIE = 3.6 × 10−3 ± 0.03) but their distribution showed both ipsispecific and contraspecific motifs. Inhibitory-inhibitory connections were nonspecific on average with higher variation than inhibitory-excitatory connections (γII = −5.0 × 10−3 ± 0.06). Confirming the trend predicted by the mean-field model, excitatory specificity γEE was correlated with the inhibitory specificity index γEIγIE, where networks with stronger recurrent excitation were ipsispecific and networks with weaker recurrent excitation were contraspecific (Pearson’s r = 0.53, p < 10−10; Fig. 5h). When comparing the connection classes individually, we found positive correlations between excitatory-excitatory, excitatory-inhibitory, and inhibitory-excitatory specificity (Fig. 5i). Inhibitory-inhibitory connection specificity was not significantly correlated with any other connection class. The higher variance and negligible correlation with other connection classes suggest that the specificity of inhibitory-inhibitory connections was unconstrained in these networks, in line with the mean-field model, where specificity of inhibitory-inhibitory connections also had a small effect on whether circuits could perform decisions (Supplementary Fig. 3d). These results show that RNNs utilize choice selective inhibition to compensate for variation in excitatory-excitatory specificity.

To further test the relationship between the excitatory and inhibitory specificity, we trained additional sets of RNNs with higher or lower excitability of excitatory units. In the mean-field model, lower (higher) excitatory gain can be compensated by either an increase (decrease) in excitatory connection specificity or by strengthening of the contraspecific (ipsispecific) motif. Accordingly, we expect that changing the activation function slope of the excitatory units in RNNs should either shift the excitatory-excitatory specificity against the direction of the gain change or shift the inhibitory specificity towards contraselective (for lower slope) or ipsielective motif (for higher slope). We trained two additional sets of networks with hypoexcitable (slope 0.5) or hyperexcitable (slope 1.5) excitatory units. Changing the excitability of excitatory units led to large shifts in γEE without changing the distribution of inhibitory specificity (Supplementary Fig. 7). In these networks, γEE and γEIγIE were still correlated, with higher γEE leading to higher γEIγIE (Supplementary Fig. 8). These results indicate that excitatory-excitatory specificity is a higher leverage parameter that RNNs use as the most effective path to compensate for changes in the excitability of excitatory units. This observation is consistent with the effect of changes in γEE on the dynamics in the mean-field model. For accuracy, decision-time and τslow, changes in γEE are far more effective than changes in inhibitory specificity (Supplementary Fig. 9) when all other parameters are held constant. In both the mean-field and RNN models, excitatory-excitatory specificity has a larger effect than inhibitory specificity and is the main lever circuits use to compensate for changes in neural parameters.

Perturbing inhibitory neuron activity reveals regimes where stabilizing and competitive inhibition dominate

Using the mean-field and RNN models, we established how contra- and ipsispecific inhibitory motifs enhance two different roles for inhibition in decision making circuits. To further probe these roles, we next considered how circuits respond to perturbations of inhibitory neuron activity. We used perturbations that equally targeted all inhibitory neurons irrespective of their choice selectivity by driving them with a nonspecific input Δν0,I (Fig. 6a). Such perturbations could be realized in optogenetic experiments. In circuits where the competitive role of inhibition dominates, we expect that enhancing inhibitory activity should speed up dynamics whereas suppressing inhibition should slow them down (Fig. 6b). Vice versa, in circuits where the stabilizing role of inhibition dominates, we expect that enhancing inhibitory activity should slow dynamics down and suppressing inhibition should speed them up (Fig. 6b). Because τslow provides a readily available estimate of the pace of dynamics in the mean-field model, we calculated τslow for varying nonspecific baseline input to inhibitory neurons ν0,I. We found that depending on the baseline level of inhibitory activity both regimes are possible in the mean-field circuit: one where competitive role dominates and one where stabilizing role dominates (Fig. 6c). Around a low baseline value of inhibitory activity (ν0,I = 11.5 in Fig. 6c), contra-, ipsi-, and nonspecific circuits respond to perturbations similarly, such that enhancing inhibition (Δν0,I > 0) leads to a decrease in τslow, i.e. faster dynamics. Around a high baseline value of inhibitory activity (ν0,I = 14 in Fig. 6c), all circuits respond in the opposite way, such that enhancing inhibition increases τslow. This U-shaped dependence of τslow on the baseline input to inhibitory neurons ν0,I results from the system approaching bifurcation points at either extreme of the parameter range that supports decision making4 (Supplementary Fig. 10). These two regimes–a low inhibition and a high inhibition regime–differ in which role of inhibition dominates: competitive or stabilizing, respectively. The inhibitory motif (contra-, non-, or ipsispecific) further shifts this emphasis within the constraints of each regime. These regimes can be identified via perturbations by characterizing how the circuit dynamics respond to changes in inhibitory tone.

Fig. 6: Perturbations to inhibitory activity reveal regimes where stabilizing and competitive inhibition dominate.
figure 6

a We perturbed mean-field models by delivering a nonspecific input Δν0,I to all inhibitory neurons during stimulus period. Perturbations were similarly delivered to RNNs by adding a constant input ΔII to all inhibitory units during the stimulus. b Circuits where competitive (brown) or stabilizing (yellow) inhibition dominates are predicted to have diverging responses to perturbations of inhibitory activity. c In the mean-field model, dependence of τslow on the baseline input to inhibition ν0,I reveals two inhibitory regimes: a low inhibition regime where competitive inhibition dominates and a high inhibition regime where stabilizing inhibition dominates. Contra- or ipsispecific inhibitory motifs shift the emphasis within these regimes (e.g., τslow is always longer for ipsi- than contraspecific circuits). The effects of inhibitory perturbations in the mean-field model differ between these two regimes depending on the baseline value ν0,I. dg Around a low baseline (ν0,I = 11.5, brown line in c), enhancing inhibition speeds up decision times (magenta line in d; f), increases the rate of trial completion (e), and decreases accuracy (g), whereas suppressing inhibition produces the opposite effects, e.g., slows down decision times (green line in d; f). Results are shown for nonspecific circuits. Gray areas in d indicate stimulus strengths used to calculated the values in eg. Error bars indicate ± s.e.m. across 2000 trials. hk Same as dg for the high inhibition regime (ν0,I = 14, yellow line in c). Perturbations of inhibitory activity produce the reversed effects. Error bars indicate ± s.e.m. across 2000 trials. lo Same as hk for perturbations of inhibitory neurons in RNNs. RNN’s response to perturbations mirrors the effects in the mean-field model in the stabilizing regime (c.f. hk). Enhancing inhibition in RNNs slows down decision times, decreases the rate of trial completion, and increases accuracy. Error bars indicate ± s.e.m. across 75 networks.

To confirm the existence of competitive and stabilizing regimes, we perturbed the mean-field circuits around the low and high baseline values of the inhibitory activity. We enhanced or suppressed inhibition during the stimulus period of a trial and measured changes in the circuit performance. We constructed a set of metrics to quantify changes in the fraction of completed trials, decision time, and choice accuracy relative to the unperturbed circuit for all stimulus strengths. The effects of these perturbations followed the predictions from the calculation of τslow (Fig. 6d–k). Enhancing inhibition decreased decision time in the low inhibition regime, but increased decision time in the high inhibition regime (cf. Fig. 6d, f and h, j). Consistent with the slowing effects of the perturbation, circuits in the high inhibition regime failed more often to complete trials (Fig. 6e) and became more accurate (Fig. 6g) when inhibition was enhanced. Circuits in the low inhibition regime showed the opposite behavior (Fig. 6h–k). Thus, by perturbing inhibitory neuron activity we can determine whether the competitive or stabilizing inhibition dominates in a circuit.

We then delivered enhancing or suppressing perturbations to inhibitory units in trained RNNs during the stimulus period to identify in which inhibitory regime these networks operate. Enhancing inhibition increased decision times, reduced the fraction of completed trials, and increased accuracy, consistent with these RNNs operating in the stabilizing inhibition regime (cf. Fig. 6h–k and l–o).

Discussion

We showed that choice selectivity of inhibitory neurons can affect the function of decision making circuits by enhancing one of two roles for inhibition: facilitating competition or stabilizing recurrent excitation. In the mean-field model, choice selective inhibition and specific connections from inhibitory to excitatory populations expand the excitatory-excitatory specificity parameter space of circuits that support decision-making. For the range of excitatory connection specificities supporting both ipsispecific and contraspecific inhibitory circuits, the speed and accuracy of decisions tightly depend on whether the ipsi- or contraspecific inhibitory motif is present. Inhibitory choice selectivity also emerges in RNNs trained to perform a decision-making task, and the specificity of excitatory and inhibitory connections within trained RNNs is correlated, consistent with the mean-field model predictions. The mean-field model further predicts the existence of two dynamical regimes: (i) a low-inhibition regime where the competitive role dominates, and (ii) a high-inhibition regime where stabilizing role dominates. In trained RNNs, perturbations of all inhibitory neurons indicate that these networks operate in the stabilizing inhibition regime.

Decision-making circuits with non-selective inhibition exist only within a narrow range of excitatory-excitatory connection specificity. When inhibitory neurons inherit choice-selectivity from excitatory neurons and also project to excitatory neurons via specific connections, a broad range of circuit configurations can support decision-making. In circuits capable of decision-making, the correlation between the specificity of excitatory (γEE) and inhibitory connections (γEIγIE) reveals how the contra- and ipsispecific motifs enhance one of two roles for inhibition: facilitate competition between populations coding for opposite choices or stabilize amplification driven by strongly recurrent excitation. When γEE is low and excitatory populations alone cannot drive selective activation, contraspecific inhibitory motifs support decision-making by maximizing competition. Conversely, when γEE is high and excitatory self-amplification becomes unstable, ipsispecific inhibitory motifs stabilize firing rates.

The categorical output of decision-making circuits is thought to be driven by strongly selective excitatory to excitatory selectivity with the evidence accumulation based on amplification through NMDA receptors2,4. In these models the specificity of excitatory connections is sufficient to drive competition and selective activation. We found that deviations from a narrow range of γEE require complementary inhibitory circuitry. When recurrent excitatory specificity is low, contraspecific inhibition is required to form the attractors needed for decision-making computation. This mechanism was described in circuits where excitatory populations have limited capacity for amplification, such as the midbrain circuit in the owl22, and in linear integrator models29. On the other hand, when recurrent excitatory specificity is high, the strong excitatory feedback amplification needs matching ipsispecific inhibition to stabilize the circuit. This mode of inhibitory selectivity is known to improve stability and robustness of a circuit to perturbations17,30. Additionally, shifts in E/I balance through modulation of gain or synaptic efficacy can improve the robustness and parameter range of decicion-making circuit models31,32.

We found a similar relationship between excitatory and inhibitory connection specificity in RNNs suggesting the balance between competitive and stabilizing inhibition is a general principle in E-I networks. While specific connections between excitatory and inhibitory units were clearly important for the decision-making function in our networks, connections between inhibitory units appeared unconstrained, indicating this connection class has limited effect on circuit function like in the mean-field model. RNNs are increasingly often used to develop theories of how neural circuits perform computations23,28,33. Some studies trained RNNs under the constraint that units have either exclusively excitatory or exclusively inhibitory outputs28,34 (Dale’s law). Studies of E-I RNNs which focus on the impact of inhibitory connections show that specificity of inhibitory-inhibitory connections can be critical to circuit function23. The apparent difference in the importance of inhibitory-inhibitory selectivity between our networks and previous work could result from differences in the training procedures35. We observed a large impact of RNN training hyperparameters on the emerging circuit structure. Future work is needed to understand how details of training influence the emerging circuit structure and computations performed by RNNs.

Our results show that selective inhibition can have a marked effect on the function of neural circuits. Many models of categorical decision-making rely on a nonspecific pool of inhibitory neurons to enforce winner-take-all competition between excitatory neurons2,3. While these models reproduce the dynamics of decision-making circuits they do not fully account for the diversity of interneurons within the cortex. Cortical inhibitory neurons show selective activation in many modalities including primary sensory13,17,36,37 and association areas19,20,21. Moreover, choice-selectivity of parietal inhibitory neurons is equal to that of excitatory neurons during an audio-visual discrimination task21.

In the mean-field model, we assume that choice selectivity of inhibitory neurons arises from specific connections from choice-selective excitatory neurons (γEI in our model). While it is possible that choice selectivity could arise from external inputs to interneurons38 or even from random connections between excitatory and inhibitory neurons39, most circuit models assume stimulus information is exclusively provided by inputs to excitatory neurons. Inhibitory choice-selectivity also emerged in our RNNs trained to perform 2AFC task28. In our RNNs, inhibitory units can only inherit stimulus or choice information through specific connections from excitatory populations, unlike in other trained RNNs23. For both excitatory and inhibitory units in trained RNNs, we found that the fraction of selective units was higher than is commonly used in circuit models2,4 and found in experiments21. This difference could be due to the simplicity of RNNs compared to in vivo circuits, and also a training process which aims to minimize total activity through regularization. In addition, decisions in the RNN are fully determined by the local circuit, whereas an animal’s behavioral output arises from a broadly distributed circuitry. Although higher choice selectivity for inhibitory units was robust to doubling the network size (Supplementary Fig. 5b, c), it could result from the need to leverage all of these units in a network much smaller than those in the brain.

The core computation of the model is the selective activation of a single excitatory population when the stimulus is presented and a mechanism to integrate stimulus information before diverting to a choice attractor. By enhancing stability, ipsispecific circuits lengthen the period when a circuit can maintain mutual activation of populations encoding competing choices, thus increasing the integration window which leads to more accurate stimulus classifications. Contraspecific circuits, primed for competition, minimize the integration period which increases error frequency.

In attractor networks, modulation of τslow for controlling the speed and accuracy of decisions is well known and can arise from other mechanisms than inhibitory output specificity. In the model with nonspecific inhibition, τslow increases with stimulus difficulty4 and can be also modulated via top-down excitation27. Our finding that excitatory-inhibitory connectivity influences this well established mechanism highlights the importance of inhibitory circuitry to evidence accumulation. A key difference between controlling τslow via inhibitory motif versus top-down excitation is that the location of the saddle point is unaffected by γIE whereas increasing top-down excitation shifts the saddle towards the origin, effectively acting as a collapsing decision-bound27. Top-down excitation can be adjusted rapidly from one trial to the next to match the decision’s speed and accuracy to the task demands. Could the inhibitory motif also be dynamically changed to meet changing task requirements? Modulation of the speed-accuracy trade-off through changes of the inhibitory motif may be mediated by activation or inactivation of inhibitory subpopulations connected in either a contraspecific or ipsispecific pattern (representing a shift in γIE for the circuit as a whole).

Selective neuromodulatory control of genetically identifiable inhibitory subtypes may provide for control of inhibitory motifs. Inhibitory subtypes have distinct connectivity patterns to neighboring excitatory neurons: fast-spiking cells have far more reciprocal connections to excitatory neurons than adapting interneurons16. A shift in output specificity could be mediated through top-down activation of inhibitory subnetworks or through neuromodulation of distinct inhibitory subtypes such as PV+, SOM+, or VIP+. Acetylcholine has layer-dependent effects on the responsiveness of both regular spiking and fast spiking neurons in the visual cortex, which could differentially activate distinct inhibitory motifs on behaviorally relevant timescales40,41,42. Additionally, acetylcholine can reduce the release of inhibitory neurotransmitters in cortical neurons43, thus directly affecting inhibitory connectivity.

Our mean-field framework reduces the dynamics of the full network with 6 excitatory and inhibitory populations to a two-variable system using several approximations, in particular, the steady-state assumption for GABA dynamics4. This assumption is based on the timescale separation between decay time constants of slow NMDA (~100 ms) and fast GABA (~5 ms) conductance. The slow NMDA dynamics dominate the time evolution of network activity, and one can assume that all other variables reach their steady-state nearly instantly4. Despite being fast, the dynamics of GABA synapses can also affect decision-making behavior44. A study44 considered a set of circuits with parameters chosen so that when the steady-state assumption is applied, all models reduce to the two-variable model with the exact same parameter set. Thus, all differences in dynamics of these circuits were driven by GABA dynamics. In these circuits, the GABA dynamics mediated a speed-accuracy trade-off and, moreover, this tradeoff was more efficient in circuits with selective inhibition44. While this study considered only ipsispecific inhibitory connectivity and a narrow space of circuits that all map onto a single parameter set of a two-variable model, our work explores a wide range of circuit configurations ranging from contraspecific to ipsispecific inhibitory motifs. Our findings are robust to the steady-state approximation of GABA dynamics as we show using a four-variable mean-field model (Supplementary Fig. 3). Together these results show that inhibitory connectivity motifs and GABA dynamics both affect decision-making behavior.

Another key performance metric that depends on selective inhibition is the rate of trial completion. Our models (both the mean-field and RNNs) fail to reach the imposed decision threshold on a fraction of trials with low stimulus strength, which we call invalid trials. This behavior is common across spiking2,32,45, mean-field4,44 and RNN28 models of decision-making. Our treatment of invalid trials is conservative, as we report invalid trials as a separate behavioral outcome different from correct or incorrect decision32, whereas most other studies assign a choice at random on trials when the network does not reach the decision threshold2,4,28,44,45. The random assignment of choices on invalid trials can conceal differences in network dynamics, making distinct dynamical regimes indistinguishable in psychometric functions45. We find that the completion rate of difficult trials is reduced in circuits where stability is emphasized due to increased integration time. Circuit models frequently differ from experimental subjects in the rate of trial completion, which was attributed to an urgency signal gating the evidence accumulation process which is absent in circuit models46,47,48,49. One possible mechanism for an urgency signal in decision circuits could be a nonspecific external ramping input50. Incorporating such inputs into future models of decision-making would be an important next step in the study of selective inhibition.

We show that choice selective inhibition can enhance one of two roles for inhibition in decision-making circuits: facilitating competition or stabilizing excitatory feedback. Both these roles are simultaneously fulfilled by inhibition in any decision making circuit. Enhancing activity of all inhibitory neurons can shift the circuit from a regime where the competitive role dominates to a regime where the stabilizing role dominates regardless of which inhibitory motif is present. This effect echos results which find shifts in E/I balance can induce leaky or unstable integration45. The stabilizing and competitive regimes can be differentiated by the behavioral response to perturbations of inhibitory activity. Perturbations during reaction time tasks should reveal which inhibitory role is dominant in vivo. The balance of these two roles is critical for circuits to perform decision tasks, and shifts in this balance could align dynamics with changing task requirements. More experimental work is needed to uncover how inhibitory subnetworks strike this balance in the cortex. Specifically, whether functional selectivity is constrained to certain inhibitory subtypes and whether inhibitory neurons are recruited to perform a task in a state dependent manner are important questions for future work.

Methods

Mean-field model

Our mean-field model accounts for interactions among 6 populations: 3 excitatory (2 choice selective and 1 nonselective) and 3 inhibitory (2 choice selective and 1 nonselective). Including nonselective neurons in the model is consistent with previous work4 and reflects the experimental observation that only a fraction of all recorded neurons shows choice selectivity21. Each selective population contains the fraction f of the total number NE (NI) of excitatory (inhibitory) neurons, so that 1 − 2f is the proportion of nonselective neurons. We reduce the dynamics of the full network with 6 excitatory and inhibitory populations to a dynamical system with two variables representing the activations of N-methyl-D-aspartate (NMDA) conductances (in terms of fraction of channels open) for synapses originating from two choice-selective excitatory populations4. The model reduction to two dimensions leverages the timescale separation between decay time constants of the slow NMDA (~100 ms) and fast γ-aminobutyric acid (GABA, ~5 ms) and α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA, ~2 ms) receptors. The slow NMDA dynamics dominate the time evolution of the system, and one can assume that all other variables reach their steady-state nearly instantly4. The dynamics of the NMDA activation variable for population i (i {1, 2}) are governed by:

$$\frac{d{S}_{i}}{dt}=\frac{-{S}_{i}}{{\tau }_{{{{{{{{\rm{NMDA}}}}}}}}}}+(1-{S}_{i})\gamma {{\Phi }}({x}_{i}),$$
(1)

where τNMDA = 0.1 s and γ = 0.641. The non-linear function Φ transforms input current xi [nA] into firing rate:

$${{\Phi }}({x}_{i})=\frac{a{x}_{i}-b}{1-{e}^{-d(a{x}_{i}-b)}},$$
(2)

where a = 270 nC−1, b = 108 Hz, and d = 0.154 s. The input to population i is:

$${x}_{i}={\alpha }_{1}({\gamma }_{{{{{{{{\rm{EE}}}}}}}}},\,{\gamma }_{{{{{{{{\rm{EI}}}}}}}}},\,{\gamma }_{{{{{{{{\rm{IE}}}}}}}}}){S}_{i}+{\alpha }_{2}({\gamma }_{{{{{{{{\rm{EE}}}}}}}}},\,{\gamma }_{{{{{{{{\rm{EI}}}}}}}}},\,{\gamma }_{{{{{{{{\rm{IE}}}}}}}}}){S}_{j}+{I}_{0,\,i}({\gamma }_{{{{{{{{\rm{EE}}}}}}}}},\,{\gamma }_{{{{{{{{\rm{EI}}}}}}}}},\,{\gamma }_{{{{{{{{\rm{IE}}}}}}}}})+{I}_{{{{{{{{\rm{stim}}}}}}}},\,i}+{I}_{\eta,\,i},$$
(3)

where index j refers to the other excitatory population. The complexity of the circuit structure, including interactions between all selective and nonselective excitatory and inhibitory neurons, is collapsed into two-dimensional model through the variables α1, α2, and I0,i as described in the section Circuit Structure below.

The stimulus \({I}_{{{{{{{{\rm{stim}}}}}}}},i}\) is defined as an increase in the rate of external excitatory inputs to choice-selective excitatory neurons of magnitude μ. We define the strength of evidence for one versus the other choice as stimulus coherence c, which can range between −100% and 100%. For population i the stimulus is then defined as:

$${I}_{{{{{{{{\rm{stim}}}}}}}},\,i}(t,\,\mu,\,c)=\left\{\begin{array}{ll}{J}_{{{{{{{{\rm{AMPA}}}}}}}},{{{{{{{\rm{ext}}}}}}}}}\mu (1-\frac{c}{100})&{t}_{{{{{{{{\rm{stim}}}}}}}},{{{{{{{\rm{on}}}}}}}}} \, < \,t \, < \,{t}_{{{{{{{{\rm{stim}}}}}}}},{{{{{{{\rm{off}}}}}}}}},\; i=1,\\ {J}_{{{{{{{{\rm{AMPA}}}}}}}},{{{{{{{\rm{ext}}}}}}}}}\mu (1+\frac{c}{100})&{t}_{{{{{{{{\rm{stim}}}}}}}},{{{{{{{\rm{on}}}}}}}}} \, < \,t \, < \,{t}_{{{{{{{{\rm{stim}}}}}}}},{{{{{{{\rm{off}}}}}}}}},\; i=2,\\ 0 \hfill &{{{{{{{\rm{otherwise}}}}}}}}.\hfill\end{array}\right.$$
(4)

For all cases, we set μ to 40 Hz. Noise is introduced through the inputs Iη,i to the two excitatory populations filtered through fast synaptic activation of AMPA receptors:

$$\frac{d{I}_{\eta,i}}{dt}=-\frac{{I}_{\eta,i}}{{\tau }_{{{{{{{{\rm{AMPA}}}}}}}}}}+\frac{\eta (t)}{\sqrt{{\tau }_{{{{{{{{\rm{AMPA}}}}}}}}}}},$$
(5)

where τAMPA is 0.002 s and η(t) is a white Gaussian noise with zero mean and standard deviation 0.02 nA. We performed numerical simulations using the Euler method with a 2 ms time step.

Circuit structure

We derived two-dimensional mean-field equations, which model the dynamics of the entire circuit through the effective interaction strengths α1, α2 between the two excitatory populations, and the background currents I0,i. This reduced model is based on approximating the firing rates of all three inhibitory populations (two choice-selective and one nonselective) and of the nonselective excitatory population as linear functions of their inputs. Thus, the firing rates of these populations change linearly in response to changes in the firing rates of the two explicitly modeled excitatory populations E1 and E24. We define α1 as a term which describes how activity S1(2) from the excitatory population E1(2) filters through the circuit (i.e. via E2(1), E0, I0, I1, I2, and feeding back onto itself) to impact its own firing rate. Similarly, α2 describes how the activity S1(2) filters through the circuit to impact the firing rate of the opposite excitatory population. I0,i describes the net input from the population activity that does not depend on the activity of E1 or E2. Thus, this model accounts for interactions between all six populations with only two dynamical system equations Eq. (1).

We parametrized connection specificity between choice-selective populations by γJK between presynaptic population J and postsynaptic population K. The index J, K {E, I} defines neuron type as excitatory or inhibitory. We translate γJK to a synaptic weight under a constraint that the total input to each population remains constant for all values of γJK. To this end, we defined an intermediate weight \({\hat{w}}_{JK}={N}_{s}{w}_{J}/({N}_{{{{{{{{\rm{s}}}}}}}}}+{\gamma }_{JK}(2-{N}_{{{{{{{{\rm{s}}}}}}}}}))\), where Ns = 2 is the number of competing choice-selective populations and wE = wI = 1. We then set connection weights between populations with the same choice selectivity to \({w}_{JK}^{+}={\hat{w}}_{JK}+{\gamma }_{JK}{\hat{w}}_{JK}\) and between populations with opposite selectivity to \({w}_{JK}^{-}={\hat{w}}_{JK}-{\gamma }_{JK}{\hat{w}}_{JK}\). We can rewrite γ in terms of w+ and w as:

$$\gamma=\frac{{w}^{+}-{w}^{-}}{{w}^{+}+{w}^{-}}.$$
(6)

Connections to and from nonselective neurons were held at wJ = 1. This definition enforces that all neurons receive the same total input weight for any value of γJK. We set the specificity parameter γEE = 0.32 as in refs. 2,4, except in Figs. 1 and 2. We set γEI = 0.25 except in Figs. 1 and 2.

The effective interaction strengths α1 describes the recurrent feedback from an excitatory population’s activity onto itself fed through other populations in the circuit. This term consists of four components α1 = λ1(α1a + α1b + α1c + α1d):

$${\alpha }_{1a}=f{N}_{{{{{{{{\rm{E}}}}}}}}}{w}_{{{{{{{{\rm{EE}}}}}}}}}^{+}\,{J}_{{{{{{{{\rm{NMDA}}}}}}}},{{{{{{{\rm{eff}}}}}}}},{{{{{{{\rm{E}}}}}}}}},$$
(7)
$${\alpha }_{1b}=\frac{1}{\kappa {g}_{{{{{{{{\rm{I2}}}}}}}}}}({c}_{{{{{{{{\rm{I}}}}}}}}}\,f{N}_{{{{{{{{\rm{E}}}}}}}}}{w}_{{{{{{{{\rm{EI}}}}}}}}}^{+}\,{J}_{{{{{{{{\rm{NMDA}}}}}}}},{{{{{{{\rm{eff}}}}}}}},{{{{{{{\rm{I}}}}}}}}})(\,f{w}_{{{{{{{{\rm{IE}}}}}}}}}^{+}{N}_{{{{{{{{\rm{I}}}}}}}}}\,{J}_{{{{{{{{\rm{GABA}}}}}}}},{{{{{{{\rm{E}}}}}}}}}{\tau }_{{{{{{{{\rm{GABA}}}}}}}}}),$$
(8)
$${\alpha }_{1c}=\frac{1}{\kappa {g}_{{{{{{{{\rm{I2}}}}}}}}}}({c}_{{{{{{{{\rm{I}}}}}}}}}\,f{N}_{{{{{{{{\rm{E}}}}}}}}}{w}_{{{{{{{{\rm{EI}}}}}}}}}^{-}\,{J}_{{{{{{{{\rm{NMDA}}}}}}}},{{{{{{{\rm{eff}}}}}}}},{{{{{{{\rm{I}}}}}}}}})(\,f{w}_{{{{{{{{\rm{IE}}}}}}}}}^{-}{N}_{{{{{{{{\rm{I}}}}}}}}}\,{J}_{{{{{{{{\rm{GABA}}}}}}}},{{{{{{{\rm{E}}}}}}}}}{\tau }_{{{{{{{{\rm{GABA}}}}}}}}}),$$
(9)
$${\alpha }_{1d}=\frac{1}{\kappa {g}_{{{{{{{{\rm{I2}}}}}}}}}}({c}_{{{{{{{{\rm{I}}}}}}}}}\,f{N}_{{{{{{{{\rm{E}}}}}}}}}{w}_{{{{{{{{\rm{E}}}}}}}}}\,{J}_{{{{{{{{\rm{NMDA}}}}}}}},{{{{{{{\rm{eff}}}}}}}},{{{{{{{\rm{I}}}}}}}}})(\,f{w}_{{{{{{{{\rm{I}}}}}}}}}{N}_{{{{{{{{\rm{I}}}}}}}}}\,{J}_{{{{{{{{\rm{GABA}}}}}}}},{{{{{{{\rm{E}}}}}}}}}{\tau }_{{{{{{{{\rm{GABA}}}}}}}}}).$$
(10)

These components of α1 account for the effect of an excitatory population’s activity on its own activity filtered via (a) direct self-coupling, (b) the activity of the inhibitory population with the same choice selectivity, (c) the activity of the inhibitory population with the opposite choice selectivity, and (d) the activity of nonselective inhibitory neurons. Similarly, α2 describes the influence of one excitatory population’s activity onto the other fed through all other populations in the circuit and also consists of four components α2 = λ2(α2a + α2b + α2c + α2d):

$${\alpha }_{2a}=f{N}_{{{{{{{{\rm{E}}}}}}}}}{w}_{{{{{{{{\rm{EE}}}}}}}}}^{-}\,{J}_{{{{{{{{\rm{NMDA}}}}}}}},{{{{{{{\rm{eff}}}}}}}},{{{{{{{\rm{E}}}}}}}}},$$
(11)
$${\alpha }_{2b}=\frac{1}{\kappa {g}_{{{{{{{{\rm{I2}}}}}}}}}}({c}_{{{{{{{{\rm{I}}}}}}}}}\,f{N}_{{{{{{{{\rm{E}}}}}}}}}{w}_{{{{{{{{\rm{EI}}}}}}}}}^{-}\,{J}_{{{{{{{{\rm{NMDA}}}}}}}},{{{{{{{\rm{eff}}}}}}}},{{{{{{{\rm{I}}}}}}}}})(\,f{w}_{{{{{{{{\rm{IE}}}}}}}}}^{+}{N}_{{{{{{{{\rm{I}}}}}}}}}\,{J}_{{{{{{{{\rm{GABA}}}}}}}},{{{{{{{\rm{E}}}}}}}}}{\tau }_{{{{{{{{\rm{GABA}}}}}}}}}),$$
(12)
$${\alpha }_{2c}=\frac{1}{\kappa {g}_{{{{{{{{\rm{I2}}}}}}}}}}({c}_{{{{{{{{\rm{I}}}}}}}}}\,f{N}_{{{{{{{{\rm{E}}}}}}}}}{w}_{{{{{{{{\rm{EI}}}}}}}}}^{+}\,{J}_{{{{{{{{\rm{NMDA}}}}}}}},{{{{{{{\rm{eff}}}}}}}},{{{{{{{\rm{I}}}}}}}}})(\,f{w}_{{{{{{{{\rm{IE}}}}}}}}}^{-}{N}_{{{{{{{{\rm{I}}}}}}}}}\,{J}_{{{{{{{{\rm{GABA}}}}}}}},{{{{{{{\rm{E}}}}}}}}}{\tau }_{{{{{{{{\rm{GABA}}}}}}}}}),$$
(13)
$${\alpha }_{2d}=\frac{1}{\kappa {g}_{{{{{{{{\rm{I2}}}}}}}}}}({c}_{{{{{{{{\rm{I}}}}}}}}}\,f{N}_{{{{{{{{\rm{E}}}}}}}}}{w}_{{{{{{{{\rm{E}}}}}}}}}\,{J}_{{{{{{{{\rm{NMDA}}}}}}}},{{{{{{{\rm{eff}}}}}}}},{{{{{{{\rm{I}}}}}}}}})(\,f{w}_{{{{{{{{\rm{I}}}}}}}}}{N}_{{{{{{{{\rm{I}}}}}}}}}\,{J}_{{{{{{{{\rm{GABA}}}}}}}},{{{{{{{\rm{E}}}}}}}}}{\tau }_{{{{{{{{\rm{GABA}}}}}}}}}).$$
(14)

The components of α2 account for the effect on an excitatory population’s activity from the oppositely selective excitatory population’s activity filtered via (a) direct coupling, (b) the activity of the inhibitory population with the same selectivity, (c) the activity of the inhibitory population with the opposite selectivity, and (d) the activity of nonselective inhibitory neurons. The effects of nonselective neurons and external background inputs are described by I0,i = λI(I0,ia + I0,ib + I0,ic + I0,id):

$${I}_{0,ia}=(1-{N}_{{{{{{{{\rm{s}}}}}}}}}f)\,{N}_{{{{{{{{\rm{E}}}}}}}}}{w}_{{{{{{{{\rm{E}}}}}}}}}\,{J}_{{{{{{{{\rm{NMDA}}}}}}}},{{{{{{{\rm{eff}}}}}}}},{{{{{{{\rm{E}}}}}}}}}{\psi }_{{{{{{{{\rm{3,in}}}}}}}}},$$
(15)
$${I}_{0,ib}={I}_{{{{{{{{\rm{AMPA}}}}}}}},{{{{{{{\rm{ext}}}}}}}},i}-(1-{N}_{{{{{{{{\rm{s}}}}}}}}}\,f){w}_{{{{{{{{\rm{I}}}}}}}}}{N}_{{{{{{{{\rm{I}}}}}}}}}\,{J}_{{{{{{{{\rm{GABA}}}}}}}},{{{{{{{\rm{E}}}}}}}}}{\tau }_{{{{{{{{\rm{GABA}}}}}}}}}({\nu }_{0,{{{{{{{\rm{I}}}}}}}}}+({c}_{{{{{{{{\rm{I}}}}}}}}}{I}_{0,{{{{{{{\rm{I}}}}}}}}}-{I}_{{{{{{{{\rm{m,I}}}}}}}}})/{g}_{{{{{{{{\rm{I2}}}}}}}}})/\kappa,$$
(16)
$${I}_{0,ic}=-f{w}_{{{{{{{{\rm{IE}}}}}}}}}^{+}{N}_{{{{{{{{\rm{I}}}}}}}}}\,{J}_{{{{{{{{\rm{GABA}}}}}}}},{{{{{{{\rm{E}}}}}}}}}{\tau }_{{{{{{{{\rm{GABA}}}}}}}}}({\nu }_{0,{{{{{{{\rm{I}}}}}}}}}+({c}_{{{{{{{{\rm{I}}}}}}}}}{I}_{0,{{{{{{{\rm{I}}}}}}}}}-{I}_{{{{{{{{\rm{m,I}}}}}}}}})/{g}_{{{{{{{{\rm{I2}}}}}}}}})/\kappa,$$
(17)
$${I}_{0,id}=-f{w}_{{{{{{{{\rm{IE}}}}}}}}}^{-}{N}_{{{{{{{{\rm{I}}}}}}}}}\,{J}_{{{{{{{{\rm{GABA}}}}}}}},{{{{{{{\rm{E}}}}}}}}}{\tau }_{{{{{{{{\rm{GABA}}}}}}}}}({\nu }_{0,{{{{{{{\rm{I}}}}}}}}}+({c}_{{{{{{{{\rm{I}}}}}}}}}{I}_{0,{{{{{{{\rm{I}}}}}}}}}-{I}_{{{{{{{{\rm{m,I}}}}}}}}})/{g}_{{{{{{{{\rm{I2}}}}}}}}})/\kappa,$$
(18)

where:

$${I}_{{{{{{{{\rm{AMPA}}}}}}}},{{{{{{{\rm{ext}}}}}}}},i}={J}_{{{{{{{{\rm{AMPA}}}}}}}},{{{{{{{\rm{ext}}}}}}}},{{{{{{{\rm{E}}}}}}}}}{\tau }_{{{{{{{{\rm{AMPA}}}}}}}}}{N}_{{{{{{{{\rm{ext}}}}}}}}}{\nu }_{{{{{{{{\rm{ext}}}}}}}}},$$
(19)
$${I}_{0,{{{{{{{\rm{I}}}}}}}}}={I}_{{{{{{{{\rm{AMPA}}}}}}}},{{{{{{{\rm{ext}}}}}}}},{{{{{{{\rm{I}}}}}}}}}+{J}_{{{{{{{{\rm{NMDA}}}}}}}},{{{{{{{\rm{eff}}}}}}}},{{{{{{{\rm{I}}}}}}}}}{w}_{{{{{{{{\rm{E}}}}}}}}}(1-{N}_{{{{{{{{\rm{s}}}}}}}}}\,f){N}_{E}{\psi }_{{{{{{{{\rm{3,in}}}}}}}}},$$
(20)
$${I}_{{{{{{{{\rm{AMPA}}}}}}}},{{{{{{{\rm{ext}}}}}}}},{{{{{{{\rm{I}}}}}}}}}={J}_{{{{{{{{\rm{AMPA}}}}}}}},{{{{{{{\rm{ext}}}}}}}},{{{{{{{\rm{I}}}}}}}}}{\tau }_{{{{{{{{\rm{AMPA}}}}}}}}}{\nu }_{{{{{{{{\rm{ext}}}}}}}}},$$
(21)
$${\psi }_{{{{{{{{\rm{3,in}}}}}}}}}=\frac{\gamma {\tau }_{{{{{{{{\rm{NMDA}}}}}}}}}{\nu }_{{{{{{{{\rm{3,in}}}}}}}}}}{1+\gamma {\tau }_{{{{{{{{\rm{NMDA}}}}}}}}}{\nu }_{{{{{{{{\rm{3,in}}}}}}}}}}.$$
(22)

These terms account for the input to the excitatory population Ei from the nonselective excitatory population filtered via (a) direct coupling, (b) the nonselective inhibitory population, (c) the inhibitory population with the same choice selectivity, (d) the inhibitory population with the opposite selectivity. The term ψ accounts for the NMDA activation of nonselective excitatory neurons. We calculated the firing rate of inhibitory populations as ΦI,1(2) = α1,IS1(2) + α2,IS2(1) + I0,II, where:

$${\alpha }_{1,{{{{{{{\rm{I}}}}}}}}}=({c}_{{{{{{{{\rm{I}}}}}}}}}\,f{N}_{{{{{{{{\rm{E}}}}}}}}}{w}_{{{{{{{{\rm{EI}}}}}}}}}^{+}\,{J}_{{{{{{{{\rm{NMDAeff}}}}}}}},{{{{{{{\rm{I}}}}}}}}})/{g}_{{{{{{{{\rm{I2}}}}}}}}},$$
(23)
$${\alpha }_{2,{{{{{{{\rm{I}}}}}}}}}=({c}_{{{{{{{{\rm{I}}}}}}}}}\,f{N}_{{{{{{{{\rm{E}}}}}}}}}{w}_{{{{{{{{\rm{EI}}}}}}}}}^{-}\,{J}_{{{{{{{{\rm{NMDAeff}}}}}}}},{{{{{{{\rm{I}}}}}}}}})/{g}_{{{{{{{{\rm{I2}}}}}}}}},$$
(24)
$${I}_{{{{{{{{\rm{0,II}}}}}}}}}={\nu }_{0,{{{{{{{\rm{I}}}}}}}}}+({c}_{{{{{{{{\rm{I}}}}}}}}}{I}_{0,{{{{{{{\rm{I}}}}}}}}}-{I}_{{{{{{{{\rm{m,I}}}}}}}}})/{g}_{{{{{{{{\rm{I2}}}}}}}}}.$$
(25)

All parameter values are provided in Table 1.

Table 1 Mean-field model parameters

Evaluation of circuit performance

We considered a trial to be valid if the following criteria were met: (i) the firing rate difference between the two choice selective excitatory populations was less than 5 Hz for the entire period prior to stimulus onset, (ii) the firing rate difference was above the decision threshold of 15 Hz for at least one time step during the stimulus period and the time point following stimulus offset. Fraction completed trials for each stimulus level was defined as the number of valid trials out of all trials presented. Only valid trials were considered for computing chronometric and psychometric functions. Our treatment of invalid trials is more conservative than in many other studies, as we report invalid trials as a separate behavioral outcome different from correct or incorrect decision32, whereas many other studies assign a choice at random on trials when the network does not reach the decision threshold2,4,28,44,45. The random assignment of choices on invalid trials can conceal differences in network dynamics, making distinct dynamical regimes indistinguishable in psychometric functions45.

Phase plane and bifurcation analysis

We analyzed the mean-field model to find null-clines and fixed points using MatLab’s fsolve function with the Levenberg-Marquant algorithm and a tolerance of 1 × 10−6. To identify the stability of the fixed points, we computed the Jacobian matrix analytically and found its eigenvalues numerically using the eig() function in MatLab. For the saddle points, τslow is the inverse of the positive eigenvalue of the Jacobian matrix.

Recurrent neural network models

Recurrent neural networks (RNNs) were composed of 100 excitatory and 25 inhibitory units. We obtained the same results with networks twice as large (Supplementary Fig. 5). The dynamics of these networks were governed by the equations:

$${{{{{{{{\bf{x}}}}}}}}}_{{{{{{{{\rm{E}}}}}}}}}(t)= (1-{\alpha }_{{{{{{{{\rm{r}}}}}}}}}){{{{{{{{\bf{x}}}}}}}}}_{{{{{{{{\rm{E}}}}}}}}}(t-1)+{\alpha }_{{{{{{{{\rm{r}}}}}}}}}({{{{{{{{\bf{W}}}}}}}}}^{{{{{{{{\rm{EE}}}}}}}}}{{{{{{{{\bf{r}}}}}}}}}_{{{{{{{{\rm{E}}}}}}}}}(t-1)-{{{{{{{{\bf{W}}}}}}}}}^{{{{{{{{\rm{IE}}}}}}}}}{{{{{{{{\bf{r}}}}}}}}}_{{{{{{{{\rm{I}}}}}}}}}(t-1) \\ + {{{{{{{{\bf{W}}}}}}}}}^{{{{{{{{\rm{in}}}}}}}}}{{{{{{{{\bf{x}}}}}}}}}_{{{{{{{{\rm{in}}}}}}}}}(t)+{{{{{{{{\boldsymbol{\sigma }}}}}}}}}_{{{{{{{{\rm{r}}}}}}}}}^{{{{{{{{\rm{E}}}}}}}}}(t)),$$
(26)
$${{{{{{{{\bf{x}}}}}}}}}_{{{{{{{{\rm{I}}}}}}}}}(t)=(1-{\alpha }_{{{{{{{{\rm{r}}}}}}}}}){{{{{{{{\bf{x}}}}}}}}}_{{{{{{{{\rm{I}}}}}}}}}(t-1)+{\alpha }_{{{{{{{{\rm{r}}}}}}}}}({{{{{{{{\bf{W}}}}}}}}}^{{{{{{{{\rm{EI}}}}}}}}}{{{{{{{{\bf{r}}}}}}}}}_{{{{{{{{\rm{E}}}}}}}}}(t-1)-{{{{{{{{\bf{W}}}}}}}}}^{{{{{{{{\rm{II}}}}}}}}}{{{{{{{{\bf{r}}}}}}}}}_{{{{{{{{\rm{I}}}}}}}}}(t-1)+{{{{{{{{\boldsymbol{\sigma }}}}}}}}}_{{{{{{{{\rm{r}}}}}}}}}^{{{{{{{{\rm{I}}}}}}}}}(t)),$$
(27)
$${{{{{{{{\bf{x}}}}}}}}}_{{{{{{{{\rm{in}}}}}}}}}(t)=(1-{\alpha }_{{{{{{{{\rm{in}}}}}}}}}){{{{{{{{\bf{x}}}}}}}}}_{{{{{{{{\rm{in}}}}}}}}}(t-1)+{\alpha }_{{{{{{{{\rm{in}}}}}}}}}{{{{{{{\bf{u}}}}}}}}(t),$$
(28)
$${{{{{{{{\bf{r}}}}}}}}}_{{{{{{{{\rm{E}}}}}}}}({{{{{{{\bf{I}}}}}}}})}(t)={s}_{{{{{{{{\rm{E}}}}}}}}({{{{{{{\rm{I}}}}}}}})}{[{{{{{{{{\bf{x}}}}}}}}}_{{{{{{{{\rm{E}}}}}}}}({{{{{{{\bf{I}}}}}}}})}]}_{+},$$
(29)
$${{{{{{{\bf{z}}}}}}}}(t)={{{{{{{{\bf{W}}}}}}}}}^{{{{{{{{\rm{out}}}}}}}}}{{{{{{{{\bf{r}}}}}}}}}_{{{{{{{{\rm{E}}}}}}}}}(t).$$
(30)

Here xE and xI are the vectors of activation variables for excitatory and inhibitory units, respectively. rE and rI are the corresponding activities after applying the rectified linear (RELU) nonlinearity sE(I)[]+, where sE(I) sets the excitability of the excitatory or inhibitory units. xin is the input activation and u(t) is the instantaneous input. The time constants of recurrent units and inputs are set by αr and αin. Weights within and between units are housed in the matricies WEE, WEI, WIE, WII. Only the excitatory units receive projections from the input and project to the output through Win and Wout, respectively.

RNNs received two input streams u(t) = [u1(t), u2(t)] representing sensory evidence:

$${u}_{i}(t,c)=\left\{\begin{array}{lll}{u}_{0}+(1+\mu \frac{c}{100})+{\sigma }_{{{{{{{{\rm{in}}}}}}}},i}(t)&{t}_{{{{{{{{\rm{stim}}}}}}}},{{{{{{{\rm{on}}}}}}}}} \, < \,t \, < \,{t}_{{{{{{{{\rm{stim}}}}}}}},{{{{{{{\rm{off}}}}}}}}},& i=1\\ {u}_{0}+(1-\mu \frac{c}{100})+{\sigma }_{{{{{{{{\rm{in}}}}}}}},i}(t)&{t}_{{{{{{{{\rm{stim}}}}}}}},{{{{{{{\rm{on}}}}}}}}} \, < \,t \, < \,{t}_{{{{{{{{\rm{stim}}}}}}}},{{{{{{{\rm{off}}}}}}}}},& i=2\\ {u}_{0}+{\sigma }_{{{{{{{{\rm{in}}}}}}}},i}(t)\hfill&{{{{{{{\rm{otherwise}}}}}}}}.\hfill\end{array}\right.$$
(31)

The stimulus period was 21 time steps and \({t}_{{{{{{{{\rm{stim}}}}}}}},{{{{{{{\rm{on}}}}}}}}}\) and \({t}_{{{{{{{{\rm{stim}}}}}}}},{{{{{{{\rm{off}}}}}}}}}\) were uniquely chosen for each trial. The stimulus magnitude μ = 3.2 was fixed and stimulus difficulty was set by c which ranged between − 20 and 20.

The recurrent and input noise are modeled by the elements of \({{{{{{{{\boldsymbol{\sigma }}}}}}}}}_{r}^{{{{{{{{\rm{E}}}}}}}}({{{{{{{\rm{I}}}}}}}})}(t)\) and σin(t) that are sampled from a Gaussian distribution. We ensure that each element has a standard deviation σ0,r and σ0,in via scaling:

$${\sigma }_{{{{{{{{\rm{r}}}}}}}},i}^{{{{{{{{\rm{E}}}}}}}}({{{{{{{\rm{I}}}}}}}})}(t)=\sqrt{2{\alpha }_{{{{{{{{\rm{r}}}}}}}}}}{\sigma }_{0,{{{{{{{\rm{r}}}}}}}}}{{{{{{{\mathcal{N}}}}}}}}(0,\,1),$$
(32)
$${\sigma }_{{{{{{{{\rm{in}}}}}}}},i}(t)=\sqrt{\frac{2}{{\alpha }_{{{{{{{{\rm{in}}}}}}}}}}}{\sigma }_{0,{{{{{{{\rm{in}}}}}}}}}{{{{{{{\mathcal{N}}}}}}}}(0,\,1).$$
(33)

RNN training

The goal of RNN training is to minimize the difference between the output z (Ntrial × Ntime × Nout) and targets T (Ntrial × Ntime × Nout). We set the entries in T to the baseline value of 0.2 and, following a stimulus onset, raise the entries to 1 for the output corresponding to the correct choice. This target is designed to train the network to remain in a low activity state until stimulated and elevate the correct output in response to a stimulus. Half of training trials were catch trials, on which no stimulus was presented and target values remained at 0.2 throughout the trial. The training batch consisted of Ntrial = 200 trials which were randomly generated every training epoch. Within the training batch, noncatch trials were equally divided between possible choices and the difficulty was randomly sampled.

Recurrent network weights were randomly initialized from a Gamma distribution with a shape wμ = 0.0375 and scale wσ = 0.5 for excitatory weights WEE, WEI, and θwμ and scale wσ for inhibitory weights WIE, WII. The scaling factor θ = NEsE/NIsI adjusts the strength of inhibitory connections to offset for differences in the number and excitability between excitatory and inhibitory units. Input and output weights Win, Wout were randomly initialized from a uniform distribution and then values were normalized so the weights associated with each input and output summed to 1 across units. All weights were trained via back-propagation through time to minimize the loss function:

$${{{{{{{\mathcal{L}}}}}}}}= \frac{1}{{N}_{{{{{{{{\rm{trial}}}}}}}}}}\frac{1}{{N}_{{{{{{{{\rm{time}}}}}}}}}}\mathop{\sum }\limits_{i=1}^{{N}_{{{{{{{{\rm{trial}}}}}}}}}}\mathop{\sum }\limits_{t=1}^{{N}_{{{{{{{{\rm{time}}}}}}}}}}\left(\frac{1}{{N}_{{{{{{{{\rm{out}}}}}}}}}}\mathop{\sum }\limits_{o=1}^{{N}_{{{{{{{{\rm{out}}}}}}}}}}{M}_{i,t}{({T}_{i,t,o}-{z}_{i,t,o})}^{2}+\frac{{\lambda }_{x}}{{N}_{e}+{N}_{i}}\mathop{\sum }\limits_{n=1}^{{N}_{e} \!+\!{N}_{i}}{x}_{i,t,n}^{2}\right)\\ +\frac{{\lambda }_{w}}{{({N}_{e}+{N}_{i})}^{2}}\mathop{\sum }\limits_{m,l=1}^{{N}_{e}\!+\!{N}_{i}}\left|{W}_{ml}\right|.$$
(34)

Here x is a concatenation of xE and xI of the size Ntrial × Ntime × (NE + NI), and W is a concatenation of WEE, WEI, WIE, and WII of the size (NE + NI) × (NE + NI). To encourage the network to integrate the stimulus for extended time, we used a mask M (Ntrial × Ntime), where entries were zero during the stimulus period so that time points during the stimulus were not considered when calculating the error term of the loss function. On catch trials, all entries of M were set to 1. The hyperparameter λx = 0.1 controls the amount of L2 regularization intended to minimize the activation of each unit. The hyperparameter λw = 1.0 controls the amount of L1 regularization applied to weights. We updated the weights by stochastic gradient descent using the ADAM optimizer in PyTorch and Python 3.7 with a learning rate 0.01. During training, the norm of the gradient was clipped at 1.

To maintain the identity of excitatory and inhibitory units and to keep the input and output weights positive, all negative elements of WEE, WEI, WIE, WII, Win, and Wout were set to 0 after every training step. We prevent self-connections by elementwise multiplying WEE and WII by (1 − I), where I is the identity matrix and 1 is a matrix of 1s, after every training step.

We terminated RNN training based on its task performance. We tested RNN performance on a validation batch of trials after every training epoch. Each validation batch consisted of 100 trials with stimulus strength ranging between −20 and 20 in steps of 2. The network registered a decision when the difference between the output variables was above a threshold of 0.25. Trials were considered valid if at least 75% of the prestimulus period was below the decision threshold and at least 50% of the post stimulus period was above the decision threshold. Overall performance was measured as the fraction of correct choices out of all trials except for the ambiguous case where stimulus was equal to 0. We compute the accuracy and the psychometric function only using valid trials. We terminated training when a network’s overall performance reached 85%. RNN parameter values are shown in Table 2.

Table 2 Recurrent neural network parameters

Measuring choice selectivity of RNN units

After training, we analyzed the activity of excitatory and inhibitory RNN units to quantify their choice selectivity. Our metric is based on the ability to decode the choice registered by the network based on the activity of the unit at the time point immediately following stimulus offset21. For each unit, we computed the receiver operating characteristic (ROC) using the roc function and the area under the ROC curve (AUCROC) using the trapz function in Matlab. A unit with the same activity for either choice will have an AUCROC equal to 0.5, thus our choice selectivity measure was defined by AUCROC − 0.5. To identify significantly selective units, we compared AUCROC to a shuffled distribution generated from that unit’s activity by shuffling the choice outcomes 150 times. We considered units to be choice selective if their AUCROC fell within the lowest or highest 2.5% percentiles of the shuffled AUCROC distribution.

Measuring connection specificity in RNNs

We measured the specificity of connections between choice selective units in RNNs. For each connection class (EE, EI, IE, and II), we computed \(\left\langle {w}^{+}\right\rangle\) and \(\left\langle {w}^{-}\right\rangle\), the mean strength of the weights between significantly selective units with, respectively, the same and opposite selectivity. Then we computed the specificity γ as:

$$\gamma=\frac{\left\langle {w}^{+}\right\rangle -\left\langle {w}^{-}\right\rangle }{\left\langle {w}^{+}\right\rangle+\left\langle {w}^{-}\right\rangle }.$$
(35)

This expression is identical to the specificity γ used in the mean-field model. To assess significance of correlations between γ for the 4 connection classes, we computed a shuffled distribution constructed by shuffling the network labels 5000 times.

Perturbing inhibitory populations

We perturbed activity of inhibitory neurons by delivering the same constant input to all inhibitory neurons during the stimulus period. In the mean-field model, we modified the parameter ν0,I by a small amount within the range [−0.5, 0.5] around a baseline. We used two baseline values of ν0,I: 11.5 for low-inhibitory regime and 14 for high-inhibitory regime. In RNNs, we delivered perturbations in a similar manner, where we delivered a constant input within the range  [−1, 1] during the stimulus period.

Four-variable mean-field model

To model the effects of inhibitory-inhibitory specificity and dynamics of inhibitory synapses, we developed a simplified version of our model which explicitly modeled the activity of selective inhibitory populations. In this model, the dynamics of NMDA synapses for excitatory populations E1 and E2 (i = 1 and i = 2, respectively) are governed by:

$$\frac{d{S}_{i}}{dt}=-\frac{{S}_{i}}{{\tau }_{{{{{{{{\rm{NMDA}}}}}}}}}}+(1-{S}_{i})\gamma {{\Phi }}({x}_{i}),$$
(36)

and dynamics of GABA synapses for inhibitory populations I1 and I2 (i = 3 and i = 4, respectively) are governed by:

$$\frac{d{S}_{i}}{dt}=-\frac{{S}_{i}}{{\tau }_{{{{{{{{\rm{GABA}}}}}}}}}}+{{\Phi }}({x}_{i}).$$
(37)

The nonlinear activation function Φ(x) is of the form Eq. (2) with a = 310 nC−1, b = 125 Hz, and c = 0.16 s for excitatory populations E1 and E2, and a = 615 nC−1, b = 177 Hz, and c = 0.087 s for inhibitory populations I1 and I2. The input to population i is

$${x}_{i}=\mathop{\sum }\limits_{j=1}^{4}{A}_{i,j}{S}_{j}+{I}_{0,{{{{{{{\rm{E}}}}}}}}({{{{{{{\rm{I}}}}}}}})}+{I}_{{{{{{{{\rm{stim}}}}}}}},i}+{I}_{\nu,i},$$
(38)

where the adjacency matrix A is

$${{{{{{{\bf{A}}}}}}}}=\left(\begin{array}{llll}{w}_{{{{{{{{\rm{EE}}}}}}}}}^{+}\,{J}_{{{{{{{{\rm{NMDA}}}}}}}},{{{{{{{\rm{E}}}}}}}}}&{w}_{{{{{{{{\rm{EE}}}}}}}}}^{-}\,{J}_{{{{{{{{\rm{NMDA}}}}}}}},{{{{{{{\rm{E}}}}}}}}}&{w}_{{{{{{{{\rm{IE}}}}}}}}}^{+}\,{J}_{{{{{{{{\rm{GABA}}}}}}}},{{{{{{{\rm{E}}}}}}}}}&{w}_{{{{{{{{\rm{IE}}}}}}}}}^{-}\,{J}_{{{{{{{{\rm{GABA}}}}}}}},{{{{{{{\rm{E}}}}}}}}}\\ {w}_{{{{{{{{\rm{EE}}}}}}}}}^{-}\,{J}_{{{{{{{{\rm{NMDA}}}}}}}},{{{{{{{\rm{E}}}}}}}}}&{w}_{{{{{{{{\rm{EE}}}}}}}}}^{+}\,{J}_{{{{{{{{\rm{NMDA}}}}}}}},{{{{{{{\rm{E}}}}}}}}}&{w}_{{{{{{{{\rm{IE}}}}}}}}}^{-}\,{J}_{{{{{{{{\rm{GABA}}}}}}}},{{{{{{{\rm{E}}}}}}}}}&{w}_{{{{{{{{\rm{IE}}}}}}}}}^{+}\,{J}_{{{{{{{{\rm{GABA}}}}}}}},{{{{{{{\rm{E}}}}}}}}}\\ {w}_{{{{{{{{\rm{EI}}}}}}}}}^{+}{J}_{{{{{{{{\rm{NMDA}}}}}}}},{{{{{{{\rm{I}}}}}}}}}&{w}_{{{{{{{{\rm{EI}}}}}}}}}^{-}\,{J}_{{{{{{{{\rm{NMDA}}}}}}}},{{{{{{{\rm{I}}}}}}}}}&{w}_{{{{{{{{\rm{II}}}}}}}}}^{+}\,{J}_{{{{{{{{\rm{GABA}}}}}}}},{{{{{{{\rm{I}}}}}}}}}&{w}_{{{{{{{{\rm{II}}}}}}}}}^{-}\,{J}_{{{{{{{{\rm{GABA}}}}}}}},{{{{{{{\rm{I}}}}}}}}}\\ {w}_{{{{{{{{\rm{EI}}}}}}}}}^{-}\,{J}_{{{{{{{{\rm{NMDA}}}}}}}},{{{{{{{\rm{I}}}}}}}}}&{w}_{{{{{{{{\rm{EI}}}}}}}}}^{+}\,{J}_{{{{{{{{\rm{NMDA}}}}}}}},{{{{{{{\rm{I}}}}}}}}}&{w}_{{{{{{{{\rm{II}}}}}}}}}^{-}\,{J}_{{{{{{{{\rm{GABA}}}}}}}},{{{{{{{\rm{I}}}}}}}}}&{w}_{{{{{{{{\rm{II}}}}}}}}}^{+}\,{J}_{{{{{{{{\rm{GABA}}}}}}}},{{{{{{{\rm{I}}}}}}}}}\end{array}\right).$$
(39)

Only excitatory populations (i = 1 and i = 2) receive stimulus information through \({I}_{{{{{{{{\rm{stim}}}}}}}}}\), which is identical to Eq. (4). Noise is introduced by Iν,i which is implemented as in the two-variable model (Eq. (5)) with the standard deviation of ν(t) set to 0.2 nA.

The weight parameters \({w}_{{{{{{{{\rm{EE}}}}}}}}}^{+}\), \({w}_{{{{{{{{\rm{EE}}}}}}}}}^{-}\), \({w}_{{{{{{{{\rm{EI}}}}}}}}}^{+}\), \({w}_{{{{{{{{\rm{EI}}}}}}}}}^{-}\), \({w}_{{{{{{{{\rm{IE}}}}}}}}}^{+}\), \({w}_{{{{{{{{\rm{IE}}}}}}}}}^{-}\), \({w}_{{{{{{{{\rm{II}}}}}}}}}^{+}\), and \({w}_{{{{{{{{\rm{II}}}}}}}}}^{-}\) were defined as in the two-variable model. The difference is the addition of \({w}_{{{{{{{{\rm{II}}}}}}}}}^{+}\), and \({w}_{{{{{{{{\rm{II}}}}}}}}}^{-}\) which define the specificity of inhibitory-inhibitory connections and depend on γII which can range between [−1, 1]. The synaptic parameters JNMDA,E, JNMDA,I, JGABA,E, JGABA,I, and the background input currents I0,E(I) were chosen so that the firing rate dynamics of E1 and E2 matched that of the two-variable model on a noiseless trial with a stimulus strength of 0.05 using PyABC parameter inference51. The values of these parameters are defined in Table 3. Simulations of the four-variable model were performed in Python 3.7.

Table 3 Four-variable mean-field model parameters

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.