Sensory integration dynamics in a hierarchical network explains choice probabilities in cortical area MT

Neuronal variability in sensory cortex predicts perceptual decisions. This relationship, termed choice probability (CP), can arise from sensory variability biasing behaviour and from top-down signals reflecting behaviour. To investigate the interaction of these mechanisms during the decision-making process, we use a hierarchical network model composed of reciprocally connected sensory and integration circuits. Consistent with monkey behaviour in a fixed-duration motion discrimination task, the model integrates sensory evidence transiently, giving rise to a decaying bottom-up CP component. However, the dynamics of the hierarchical loop recruits a concurrently rising top-down component, resulting in sustained CP. We compute the CP time-course of neurons in the medial temporal area (MT) and find an early transient component and a separate late contribution reflecting decision build-up. The stability of individual CPs and the dynamics of noise correlations further support this decomposition. Our model provides a unified understanding of the circuit dynamics linking neural and behavioural variability.

and a local (solid lines) external input. We injected a constant input current I 1 into both E1 and E2 (global), or an input 2I 1 into E1 only (local). The inhibitory population received I 1 in both cases. Excitatory neurons also received a baseline input of I 0 = 80 pA, equivalent to the mean input of zero-coherence stimuli (Methods). A global increase in the input leads to a marginal increase of the rates r E1 and r E2 . In contrast, a local increase of the input to E1 yields a substantial increase of r E1 , and, due to the strong common inhibition, a corresponding decrease in r E2 . This is the mechanism underlying competitive dynamics in the circuit. (c-f) Spike count correlation matrices kk' ρ (c-d) and correlation histograms (e-f) of the sensory circuit obtained for global ( or partly local (background inputsColor plots show the correlation matrices from 100 + 100 + 50 randomly picked neurons from the E1, E2 and I populations, respectively. Histograms are shown for within (EiEi) and across population (EiEj) pairs. Correlations were obtained from the response to replicate stimuli (count window T = 2 s). Correlations in a pair EiEj caused by a global ( or local (> 0) fluctuating input are proportional to the product of the slopes 1 Ej 1 Ei dI dr dI dr  obtained using global or local constant input I 1 (b; ref. 1 ). This explains why local background inputs induce positive EiEi and negative EiEj correlations with average correlation across all EE-pairs close to zero 2 . Correlations EI and II also had a mean close to zero (not shown). This mechanism also explains why non-replicate stimuli or top-down inputs, which can be both viewed as local inputs into E1 and E2 that fluctuate from trial-to-trial, generate correlations with the same spatial structure as local background inputs (compare Fig. 2c,f with Supplementary   Fig. 3c).
Supplementary Figure 3. Impact of bottom-up correlations and sensory integration dynamics on choice probability and categorization accuracy. (a-c) Average population rates for preferred and non-preferred choice trials in the integration and sensory circuits (a), average CP (b) and spike count correlations (c) obtained in a network with spatially localized external background input ( = 0.67, except in b) using replicate stimuli and top-down connections set to zero (b FB = 0). Correlations were sustained and had a structure among sensory neurons similar to that caused by trial-to-trial stimulus fluctuations (compare c to Fig. 2c). In consequence, they produced a CP with the fast-rise-decay time-course described in Fig. 2b. Shaded areas in a-c represent the stimulus interval. (d) Percentage of correct categorizations versus stimulus coherence when stimuli are classified according to their accumulated evidence for motion in the β -direction (by computing the integrals  ) (t s  of stimulus inputs into populations β = E1, E2; "Stimulus integrator") or by using a perfect integrator reading out the sensory activity (see Supplementary Figure 5). For = 0, the perfect integrator reached the accuracy of the stimulus integrator. Noise correlations introduced by > 0 had a structure similar to signal correlations ( Supplementary Fig. 2d) and for that reason they decrease discrimination accuracy 3,4 . Note that due to temporal stimulus modulations (σ > 0), a fraction of low-coherence stimuli actually provides accumulated evidence favoring the opposite direction and this limits the accuracy that can be reached. Although trial-to-trial fluctuations caused by non-replicate stimuli can generate a similar structure of correlations than local background inputs, they do not cause a decrease in discrimination accuracy because they are not truly noise correlations but spurious signal correlations. (e) Percentage of correct categorizations as a function of stimulus coherence for the perfect integrator (black; replotted from d) and for the hierarchical network model (red) when = 0. Because the network performs a transient integration of the sensory information (i.e. non-uniform across time, Fig. 3b) it yields a lower classification accuracy than the perfect integrator (see also Figure 7h). Simulation data (squares) were fitted using a Weibull function (solid lines).  Fig. 2b and Fig. 3b, respectively). (d-f) As in a-c but for the perfect integrator model where evidence is accumulated during the whole stimulus period (no bounds). The decision is determined by the sensory population which fired more spikes during the entire period. Traces show the integration variable in two example trials yielding choices 1 (red) and 2 (blue). The dashed line indicates the decision boundary. Both the CP (e) and the psychophysical kernel (f) show a sustained time-course. This indicates that the impact of sensory evidence on the decision is constant across time, which is at odds with the experimentally observed decay in the psychophysical kernel derived from RDKs ( Supplementary Fig. 7c,e). For both integrator models we employed the spikes generated by the sensory circuit in the simulations used to generate

Supplementary
that represent oriented motion energy, were correlated (r = 0.5) rather than independent. Average CP caused by nonreplicate stimuli and top-down inputs (σ = 2.7, b FB = 2; black) is sustained during the stimulus interval. Two complementary contributions to CP are revealed by removing trial-to-trial stimulus fluctuations (i.e. using replicate stimuli; green) or by removing the top-down connections (b FB = 0; blue). Thus, the results of presentations of replicate (red) and non-replicate (black) zero-coherence RDKs (replotted from ref. 9 ). The rapid and strong rate surges in the replicate PSTH are presumably induced by random conjunctions of dot trajectories in the preferred direction of this neuron 9,10 . In contrast, the non-replicate PSTH shows no temporal fluctuations because they were averaged out across the ensemble of RDKs. Inset: example spike count histograms from the indicated count windows, with similar mean (replicate: 26.9, non-replicate: 26.7 sp/s) and higher variance in the non-replicate condition (replicate: 32.6, nonreplicate: 51.5 sp 2 /s 2 ). The larger variability across trials in the non-replicate condition is due to the trial-to-trial stimulus fluctuations acting as an additional source of variability. (b) Spike count correlations for a second pair of neurons (emu034) 8 . Same layout as Fig. 4b. Count window T = 100 ms (left and center). Spike count correlations over the whole stimulus period are significantly larger in the non-replicate than in the replicate condition (P < 0.05, permutation test), as for the neuron pair in Fig. 4b. Note that only 2 pairs have been recorded in both stimulus conditions during the motion discrimination task. Error bars indicate s.e.m. and thick black lines periods of significant differences between the nonreplicate and the replicate condition (P < 0.05, permutation test). The average CP of each group is characteristically different, reflecting the hardwired heterogeneity. (d) Same as b, but for the heterogeneous network. Again, increasing the number of trials yields higher C(t i , t i+1 ) values but does not perturb the rising of C(t i , t i+1 ) over time. Thus, the rising of CP correlations is robust to small trial number. (e) Time-course of adjacent CP correlations C(t i ,t i+1 ) for the MT data 7 recorded in the replicate condition. Note that statistical power is greatly decreased compared to the non-replicate case (Fig. 5f), because here there are only n = 41 neurons and RDKs compared to n = 143 neurons and 25-221 RDKs (median: 59) for the non-replicate case. The CP correlation in the replicate condition shows elevated values early followed by a drop and rampand-plateau behavior similar to the non-replicate condition (Fig. 5f). The elevated values of CP correlation at the beginning imply that the ordering of early CPs remains fairly stable, although early CPs have a small average magnitude (Fig. 4c). A possible explanation is that some neurons exhibit a pre-stimulus CP component, which, due to the absence of a stimulusdriven early CP component, is fairly stable until the top-down component kicks in. In order to investigate why CP correlation is elevated early, we would need more neurons in the replicate condition and pre-stimulus data, which are unavailable in our data set.

G: Measurements
Spikes from all neurons of populations E1 and E2.

Network model
A standardized description of the model and all simulation parameters can be found in Supplementary Tables 1-4 11,12 .
Choice bias: The random connectivity within the sensory circuit and across circuits (i.e. bottom-up and top-down) can cause that the network's behavioral responses exhibit a bias towards one of the two choice options. We avoided this "quenched" bias by testing 10 different connectivity matrices and selecting the one that showed the least bias (51% choice 1 vs. 49% choice 2). This connectivity matrix was used in all simulations. Our conclusions remain valid for networks that show higher choice biases.
, and zero for β α  . This stimulus parameterization, adding a common and an independent term, accounts for the fact that each MT neuron "sees" the RDK through the specific and heterogeneous receptive fields of its afferent inputs 13 , causing the PSTHs from pairs of neurons obtained from repeated presentations of the replicate stimuli to be partially correlated (average correlation coefficient ~0.5, PSTH bin size T = 20 ms; Supplementary Fig. 1b-c). This is comparable to the two pairs in the MT data set which were recorded with replicate RDKs and exhibited different PSTHs 10 (correlation = 0.47 and 0.26 with direction difference = 9º and 20º for pairs emu035 and emu034, respectively; bin size T = 60 ms; coherence 0%). Introducing higher correlations in the current inputs to sensory neurons led to an increase in pair-wise correlations and to an increase in the early CP (Fig.   2b,c) but left all results unchanged (not shown).

Psychophysical data
Because the stimuli used during the MT recordings were not stored, we trained two male adult macaque monkeys (Macaca mulatta) to report the net motion direction of a fixed-duration random dot kinematogram (RDK, see below) along the horizontal axis with varying levels of difficulty (depending on the motion coherence of the RDK). On each trial we recorded both the monkey's choice and the presented stimulus (i.e. the positions of the dots in each frame). These data were used to compute average motion energy traces (Supplementary Fig. 7c,e). Minor stimulus differences aside (see above), the task was very similar to the classical fixed-duration version 7,8,14 , except that: (1) the monkey reported his choice with a reaching response rather than with a saccadic eye movement, (2) the targets were displayed 500 ms before stimulus onset rather than only after stimulus offset. We are confident that these differences do not provide a reason for expecting a different dynamics of decision formation (reflected in a different psychophysical kernel). Importantly, and as in the original studies, the monkeys were only trained in the fixed-duration task. They were never exposed to other variants of the task, which could in principle encourage the decision to be formed rapidly (such as reaction time tasks or variable duration tasks). All surgical and behavioral procedures conformed to guidelines established by the National Institutes of Health and were approved by the Institutional Animal Care and Use Committee of Stanford University.

Random dot kinematograms
The stimulus used in our psychophysical experiment -a random dot kinematogram (RDK) 14 -was generated using MATLAB and Psychophysics Toolbox 15 . Stimuli were presented on a 19-inch LCD touch monitor (Elo Touchsystems) with 75 Hz frame rate and 800 x 600 pixels resolution positioned 30 cm away from the monkey. The details for generating the stimulus have been extensively described previously 5,16 . We used the same algorithm and parameters expect: (1) the stimulus duration was fixed at 1 s, (2) the diameter of the stimulus aperture was 14 degrees, and (3) the speed of the dots was 8 degrees / second. Otherwise, we used the same coherences (0, 3.2, 6.4, 12.8, 25.6, and 51.2%), dot density (16.7 dots deg -2 s -1 ), and dot size (2 pixels). To create the impression of motion, the dots in the RDK were split into 3 consecutive sets of the same size (1 set displayed for each individual frame) and displaced 3 frames (40 ms) later. The fraction of dots displaced coherently toward one of the two targets was determined by the coherence (motion strength), with the remaining dots being displaced randomly. For zero-coherence stimuli all dots were displaced randomly but, due to the stochasticity of that process, one obtains non-zero net motion toward the targets over a small number of frames (i.e. the motion energy shows temporal modulations).
The MT data 7,8,14 analyzed here was obtained using 2-second RDK stimuli, which varied from session to session to match the stimulus preference of the MT neuron under study (diameter of the stimulus, the speed of the dots, between 0.4 and 28.4 degrees/second, and the axis of motion). The algorithm used to generate the stimulus was also different 14 . Despite these differences in stimuli and task structure, the monkeys' behavioral performance was similar across all studies (including ours, see Supplementary Fig. 7b,d,f).

Psychophysical reverse correlation
We used psychophysical reverse correlation 17,18 in order to measure the amplitude and time-course of the impact of stimulus fluctuations on the decision. The psychophysical kernels were computed as the difference of the average stimulus leading to each of the two possible choices. For the network model, it was defined as: is the average of the stimulus fluctuations across trials yielding choice  ( =1,2). The stimulus fluctuations in each trial were defined as ) are the temporal modulations in sensory input into population  in trial l (see equation (2)). Psychophysical kernels were smoothed by convolution with a 100 ms rectangular window.
The sensory integration window is defined as the interval, starting at stimulus onset, which contains 85% of the total area under the psychophysical kernel. The length of this window quantifies the period in which the system is sensitive to the sensory input.
To compute the psychophysical kernel from the experimental data (Supplementary Fig. 7c,e) we used equation 1, but defined the stimulus fluctuations in each trial ) (t z l as the difference of the motion energies 6 in the two directions that the monkey had to discriminate (left vs. right) 5 The filters were implemented as in ref. 5 . The space constants σ c and σ g were multiplied by a factor 2.8 in order to adjust the spatiotemporal frequency passband to the speed of coherent motion which was higher in the current experiment (7.95 degrees / second vs. 2.84 degrees / second).

Statistical methods
We mainly used permutation tests to establish the statistical significance of our results. Paired tests were used to compare measurements within neurons, implemented by permutation of trials across different conditions. When performing twosample tests (non-paired tests), we permuted neurons across the two independent samples conserving their sample size. When the model prediction identified the sign of the comparison we used one-sided tests, and this is explicitly indicated in the text. In particular, we used the following procedures for the permutation tests: Significant deviation of CP and psychophysical kernel from chance level (Supplementary Fig. 7) was tested by permutation of the choice outcome in each trial. Significant difference between replicate and non-replicate conditions were tested by shuffling neurons (which were either recorded with replicate or non-replicate stimuli) between the two conditions (Fano factors and CPs, Fig. 4). For the subset of neurons for which both conditions were available, we instead shuffled trials between the conditions within each neuron . The same was done for the spike count correlations of the two available pairs (Fig. 4b and Supplementary Fig.   8b). Positive slope of the regression of CP correlations (Fig. 5f) was tested using permutations of the time stamps of CP measurements of individual neurons, recalculating the correlations and their slope for the shuffled data. Similarly, the positive slope of lagged correlations (Fig. 6g) was tested by individually shuffling the time stamps of lagged correlations for each neuron and recalculating the slope for the shuffled data. Standard errors were calculated using bootstrap, based on 1,000 samples obtained by randomly resampling with replacement.
When applying parametric tests (ANOVA , t-test), we verified that our data satisfied the assumptions of normality and homoscedasticity (Lilliefors test of normality, P > 0.05, and Levene's test for equality of variances, P > 0.05, respectively). Paired tests are used to compare measurements within neurons, in ANOVAs this is accomplished by using a mixed-effects design with neuron as random factor, nested in the monkey factor. In all analyses, outliers beyond 3 standard deviations of the population mean were removed for population tests and descriptive statistics are indicated by mean ± s.e.m.