Confidence through consensus: a neural mechanism for uncertainty monitoring

Paz, Luciano; Insabato, Andrea; Zylberberg, Ariel; Deco, Gustavo; Sigman, Mariano

doi:10.1038/srep21830

Article
Open access
Published: 24 February 2016

Luciano Paz¹,
Andrea Insabato²,
Ariel Zylberberg³,
Gustavo Deco² &
Mariano Sigman¹,⁴

Scientific Reports volume 6, Article number: 21830 (2016)

Abstract

Models that integrate sensory evidence to a threshold can explain task accuracy, response times and confidence, yet it is still unclear how confidence is encoded in the brain. Classic models assume that confidence is encoded in some form of balance between the evidence integrated in favor of and against the selected option. However, recent experiments that measure the sensory evidence’s influence on choice and confidence contradict these classic models. We propose that the decision is taken by many loosely coupled modules, each of which represents a stochastic sample of the sensory evidence integral. Confidence is then encoded in the dispersion between modules. We show that our proposal can account for the well established relations between confidence and stimuli discriminability and reaction times, as well as the fluctuations’ influence on choice and confidence.

Introduction

The decision-making process has been widely modeled as a stochastic integration of sensory evidence to a threshold^1,2,3,4,5. These models have been used to explain with quantitative detail task accuracy, response times and confidence^6,7,8,9, yet it is still unclear how confidence is encoded in the brain. A classic model by Vickers¹⁰ assumes that, when deciding amongst alternatives based on sensory evidence, the evidence in favor of each option is noisily integrated until one reaches a threshold. After this race, the option that reached the threshold first is selected and the balance of evidence between competing alternatives encodes the confidence. On the other hand, a typical model that is derived from optimal statistical decision theory encodes a single decision variable that integrates the difference between the evidence in favor of each alternative^11,12. This variable is related to the log odds of each alternative being correct given the evidence so far and follows a drift-diffusion process until a threshold of admissibility. The confidence is then given by the belief of having made a correct response.

Both models have been used to successfully explain several behavioral aspects of decision-making confidence, the most noteworthy being the relations between confidence and task difficulty and between confidence and response times^8,13,14,15. Typically, confidence is low for difficult trials and high for easy trials and, if subjects are free to respond whenever they choose, their confidence is higher for fast responses^{16,17,18,19,20,21,22,23}. Both models also have some shortcomings; for example, they are not well defined for studying confidence in tasks with more than two alternatives. Furthermore, recent psychophysical experiments have been able to measure the influence that the sensory evidence has on the resulting decision and confidence as a function of time²⁴. By letting external noise rather than internal noise limit task performance, the average of the noise conditioned on the subject’s choice provides an estimate of the integration window and the sensitivity of decisions to fluctuations as a function of time (i.e. the influence that sensory evidence at a given time has on choice)²⁵. These averages are referred to as the decision and confidence kernels. The main characteristic of the decision kernel (at least for two-alternative tasks) is that evidence fluctuations in favor of an option have the same influence on the resulting decision as fluctuations against the competing alternative. This is not the case for confidence, where fluctuations in favor of an option yield an increase in high confidence report rates, whereas fluctuations against the alternative option barely affect confidence. This property is in stark contradiction with what drift-diffusion and balance of evidence models of confidence predict²⁴. Drift-diffusion is inherently symmetric in the sense that fluctuations in favor of an option immediately become fluctuations against the competing option. The balance of evidence also fails as it takes into account the difference between accumulated evidence for each option to construct confidence reports, thus being symmetrically affected by fluctuations.

In this work we propose a model that is able to explain both the decision and confidence kernels. We make a key assumption based on the fact that the integration of evidence is corrupted by internal and external noise²⁶. This implies that the integral that drives choice will be randomly distributed. If subjects have access to a representation of this distribution (such as a summary statistic, like the distribution’s mean and variance), the belief that the decision is correct can be estimated. This belief will depend on the reliability of the evidence and we assume it is normatively associated with decision confidence through the dispersion of the distribution of evidence integrals²⁷. For instance, a highly reliable stimulus will be normatively associated with higher confidence reports whereas low reliability will produce low confidence reports. Furthermore, we ground our proposal on three key relations.

1. In stochastic processes, the variance of the evidence’s integral is strongly correlated with elapsed time²⁸ (see SI section A). Hence, by relying on the dispersion to construct confidence reports, confidence will be strongly correlated with response times.
2. Attractor neural networks have been shown to implement evidence integration for probabilistic decision-making^29,30. These attractors rely on the competition of two populations that encode the competing options and receive sensory input in favor of their encoded alternative. Competition is mediated by the dynamics of slow recurrent excitatory feedback, which integrates the fast sensory input, and of lateral inhibition, which forces the selection of an option. Fluctuations of the sensory evidence have an asymmetric effect on the attractor’s response time, as a consequence of two mechanisms: 1) positive fluctuations bring more total input current to the network compared to negative fluctuations and 2) the sensory input to one population affects its firing rate instantly but it takes more time to affect the dynamics of the competing population.
3. There is evidence that supports the coexistence of multiple integration processes in the brain, even for the same sensory cue^31,32,33. For example, Lafuente et al.³³ show in a random dot motion task that MIP and LIP integrate the evidence when the decision is to reach to a location, whilst LIP also integrates when the decision is signaled by saccades. We also propose that multiple almost independent integration processes coexist in the same brain region.
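
The first relation above can be illustrated with a minimal sketch. It assumes a discrete-time integrator of Gaussian noise with a constant drift, which is a hypothetical stand-in for the evidence integral and not the attractor network used in this work. The function name and the values of `drift` and `noise_sd` are illustrative only.

```python
import random
import statistics


def integral_variance(n_steps, n_samples=2000, drift=0.1, noise_sd=1.0, seed=0):
    """Variance across independent noisy evidence integrals after n_steps.

    drift and noise_sd are illustrative values, not fitted parameters.
    """
    rng = random.Random(seed)
    finals = []
    for _ in range(n_samples):
        x = 0.0
        for _ in range(n_steps):
            # Each step adds a constant drift plus independent Gaussian noise.
            x += drift + rng.gauss(0.0, noise_sd)
        finals.append(x)
    return statistics.pvariance(finals)


var_short = integral_variance(10)   # short integration time
var_long = integral_variance(40)    # four times longer
print(var_short, var_long)
```

Under these assumptions the dispersion of the integrals grows with elapsed time, so a readout based on dispersion would tend to assign lower confidence to slower decisions.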

We propose that the decision process is performed redundantly and in parallel by many loosely coupled modules that integrate the time varying sensory input. These modules form a sample of the evidence integral’s probability distribution and hence can be used to estimate its variance, which in turn is used to report confidence. Furthermore, we propose that each module is an attractor neural network (ANN). We hypothesize that the fluctuations’ asymmetric influence on each ANN’s response time should transfer to an asymmetric influence on the dispersion of samples, leading to asymmetric confidence kernels.

Our proposal is similar to Koriat’s self-consistency model³⁴ in which subjects are assumed to retrieve a sample of representations of the alternatives, decide based on a majority rule and assess their confidence based on the consistency of samples. Koriat assumes that the consistency of the samples is a measure of the selected alternative’s reliability just as we assume that the dispersion is a measure of sensory reliability. However, Koriat assumes that consistency is measured from the difference between the number of samples that are in favor and against the selected option, while we use the dispersion within the samples of the selected alternative. Our model also provides a neurophysiological implementation while Koriat’s is purely normative.

Our proposal is independent of a particular implementation of the decision modules. However, we provide an exemplary computational implementation in order to address the following issues:

1. What is the maximum degree of interconnection between modules so that they can no longer be considered independent samples of the evidence integral probability distribution? To address this, we study the correlation of the inter-module dispersion, increasing an inter-module coupling parameter (IC).
2. How can the decision be read from the activity of the modules? We propose a “voting scheme” where each module “votes” for an option and when an option gets more than half the votes, it is selected. We provide a simple firing rate threshold detection neural implementation for detecting each module’s vote.
3. How can the variance be read out from the distribution of firing rates? We show that an indirect measure of the dispersion can be computed by counting the number of modules that have their firing rate in a certain range above the vote threshold. This is equivalent to computing percentiles over the distribution of firing rates.
4. How is the variance transformed to a binary confidence report? We propose that this occurs in a separate network, which selects high or low confidence with a sigmoid probability that depends on the dispersion’s estimate.

Figure 1 shows a schematic representation of our model’s operation for decision making and confidence reports.

Results

Model construction

We assume that in order to decide, sensory evidence is integrated to a threshold. We assume that confidence is determined by an estimate of the sensory evidence’s reliability, which can be decoded from the dispersion of the evidence integral’s probability distribution. In order to estimate said variability we propose a network of many modules that integrate the same sensory evidence in parallel and decide collectively (Fig. 1A). We propose that each module can be thought of as a sample from the evidence integral probability distribution; thus the sensory reliability, and in turn confidence, can be decoded from an estimate of the inter-module firing rate dispersion (σ_dv) of the populations that are associated with the selected option at response time (Fig. 1B).

Additionally, we propose that each module is an attractor neural network (ANN), a widely studied neural implementation of evidence accumulation that relies on reverberant activity of competing neural populations and mutual inhibition mediated by slow NMDA channel opening dynamics^35,36,37,38. The decision of each ANN depends on the activity of the competing decision populations. When a threshold of activity (λ) is surpassed, the ANN commits to a choice. We study the simplified case where λ remains constant (we take it to be 15 Hz), although there is evidence that the decision threshold varies^39,40,41,42. When a module’s ANN chooses an option, it casts a vote in favor of it. The global decision is taken when an option has been voted for by over half the modules (Fig. 1A).
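
The voting rule can be sketched as follows, assuming each module is summarized by the current firing rates of its two populations. The function names and the example rates are hypothetical; only the 15 Hz threshold and the "more than half the modules" rule come from the text.

```python
def module_votes(rates_a, rates_b, threshold=15.0):
    """Return one vote per module: 'A', 'B' or None (threshold not surpassed)."""
    votes = []
    for ra, rb in zip(rates_a, rates_b):
        if ra > threshold and ra > rb:
            votes.append("A")
        elif rb > threshold and rb > ra:
            votes.append("B")
        else:
            votes.append(None)
    return votes


def global_decision(votes):
    """Select an option once it holds more than half of all module votes."""
    n_modules = len(votes)
    for option in ("A", "B"):
        if votes.count(option) > n_modules / 2:
            return option
    return None  # no majority yet, so the network keeps integrating


# Illustrative firing rates (Hz) for five modules.
rates_a = [18.0, 16.0, 9.0, 20.0, 3.0]
rates_b = [2.0, 4.0, 7.0, 1.0, 17.0]
votes = module_votes(rates_a, rates_b)
print(votes, global_decision(votes))
```

In a full simulation this check would be repeated at every time step, and the first time `global_decision` returns an option would define the network's response time.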

We assume that confidence is decoded from an estimate of σ_dv (we provide the details of the proposed estimate in sec. “A neural mechanism to decide and estimate σ_dv”) in a separate layer. Our aim is to model binary confidence reports (i.e. high or low confidence), hence the layer that decodes confidence must produce binary values. We propose that the probability of a high confidence report is given by a sigmoid function that takes the estimate of σ_dv as input. The parameters of the sigmoid control the bias and the slope of the transition from high to low confidence (Fig. 1B).
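
A minimal sketch of such a readout is given below. It assumes a logistic form with two hypothetical parameters, `midpoint` (the bias) and `slope`; the exact parametrization used in this work is not reproduced here. The input is taken to be a dispersion estimate for which larger values mean lower dispersion, so that the probability of a high confidence report increases with it.

```python
import math
import random


def p_high_confidence(estimate, midpoint, slope):
    """Logistic probability of a high confidence report (assumed form)."""
    return 1.0 / (1.0 + math.exp(-slope * (estimate - midpoint)))


def confidence_report(estimate, midpoint, slope, rng):
    """Draw a binary report ('high' or 'low') with the sigmoid probability."""
    if rng.random() < p_high_confidence(estimate, midpoint, slope):
        return "high"
    return "low"


rng = random.Random(0)
# midpoint and slope are illustrative values, not fitted parameters.
reports = [confidence_report(0.45, midpoint=0.35, slope=20.0, rng=rng)
           for _ in range(1000)]
print(reports.count("high") / len(reports))
```

If the raw dispersion σ_dv were used as input instead, the same function would apply with a negative slope.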

Effect of module interconnectivity

In order to test our proposal, we first study how module interconnectivity (controlled by parameter IC) affects our assumption that σ_dv can be used to decode confidence. To do this, we study σ_dv’s correlation with stimuli discriminability, task accuracy and reaction times (RT) in free to respond perceptual decisions, with varying IC values (Fig. 2). When IC = 0, modules are fully independent, whereas at the upper end of the IC range the modules are fully coupled. We simulate a network of N = 100 modules that must decide which of two stimuli is brighter. The brightness of both stimuli flickers around a fixed mean and is resampled every 40 ms. The distractor’s average luminance was fixed and the target luminance was varied. The stimuli discriminability measures the difference between the target and the distractor’s brightness. We simulated 2000 trials for each discriminability and IC value.

A first key result is that IC affects neither the mean accuracy nor the mean RT over trials as a function of stimuli discriminability (Fig. 2A,B). However, for small IC values, average σ_dv is strongly correlated with discriminability, task difficulty and accuracy (Fig. 2C–E). This correlation disappears for growing IC values. It is clear that for small IC values, average σ_dv decreases with discriminability and increases with mean RT (in fact, the positive correlation between σ_dv and RT also occurs within the same discriminability level as shown in SI Fig. S6). As we assume that σ_dv is an inverse measure of the reliability of the stimuli, high σ_dv will confer on average low confidence. This implies that our model reproduces the well known positive correlation between discriminability and confidence^16,19 and the negative correlation between RT and confidence^17,18. Furthermore, our model also reproduces the known positive resolution of confidence, i.e. higher confidence for correct than for incorrect responses (Fig. 2C). However, the studied task only yields slow errors (Fig. 2D) and is not suited to study our model’s ability to predict confidence in tasks characterized by fast errors^43,44.

Hence, even when the modules are interconnected, the known correlations between confidence and other relevant behavioral observables (accuracy, RT and discriminability) are preserved. However, this interconnection must be small.

A neural mechanism to decide and estimate σ_dv

In the previous sections we stated that the model commits to a choice based on the “votes” of all modules and decodes confidence from σ_dv. In this section we provide a neural implementation of the decision method and σ_dv’s estimation.

We propose that when the activity of one of the competing populations, e.g. populations sensitive to option A, in a module surpasses λ, the module “votes” for A. Detecting if a population has an activity greater than a certain threshold is easily accomplished using a disinhibition microcircuit^45,46,47. Briefly, the crossing is signaled by a separate binary population of neurons that are either silent or bursting. This binary population is normally inhibited by a group of interneurons. When one of the competing populations surpasses λ it inhibits the interneurons and releases the inhibition of the binary population, which begins to burst. Counting votes is simply accomplished by summing the activity of the binary populations.

This decision mechanism implies that, at the response time, the median of the distribution of firing rates for the selected option is λ. Hence, σ_dv can be estimated from the fraction of modules in the vicinity of λ⁴⁸. In particular, it is sufficient to count the fraction of modules that are between λ and a slightly higher value, λ + Δλ (FMC). We take λ = 15 Hz and Δλ = 5 Hz. This is accomplished by taking the summed activity of the bursting neurons that indicate activity greater than λ and subtracting from it the summed activity of a second group of binary neurons that signal activity greater than λ + Δλ (Fig. 1B and SI Fig. S3).

The intuition of the mechanism is simple: if the variance is low, the firing rate of all modules should be within a narrow range relative to the median. Instead, if the variance is high, only a few modules will have firing rates within a narrow interval relative to the median (Fig. 3A,B). This relation can be empirically tested (Fig. 3C). It is clear that FMC is inversely correlated with σ_dv and the correlations with discriminability, RT and accuracy are preserved with an inverted dependence (Fig. 3D–F; the inverse relation exists even within the same discriminability level, SI Fig. S7). Hence, FMC is a valid representation of stimuli reliability and can be used to decode confidence.
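
The counting mechanism can be sketched as follows. The function name `fmc` and the Gaussian firing rate samples are assumptions made for illustration; only λ = 15 Hz, Δλ = 5 Hz and the subtraction of the two threshold counts come from the text.

```python
import random


def fmc(rates, lam=15.0, delta=5.0):
    """Fraction of modules with firing rate above lam but not above lam + delta.

    Computed as in the text: modules above lam minus modules above lam + delta.
    """
    above_lam = sum(1 for r in rates if r > lam)
    above_upper = sum(1 for r in rates if r > lam + delta)
    return (above_lam - above_upper) / len(rates)


rng = random.Random(1)
# Hypothetical firing rates (Hz) of the selected populations, median near lam.
low_dispersion = [rng.gauss(15.0, 2.0) for _ in range(1000)]
high_dispersion = [rng.gauss(15.0, 6.0) for _ in range(1000)]
print(fmc(low_dispersion), fmc(high_dispersion))
```

With the median pinned near λ, a narrow distribution places close to half of the modules inside the window, while a wide distribution places fewer modules there, which is why FMC decreases as σ_dv grows.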

Fluctuations’ asymmetric influence on ANNs

A key relation that had not been studied previously, and on which we rely, is the asymmetric influence that fluctuations have on an ANN’s RT. We hypothesize that this asymmetric influence should transfer to the whole network’s RT and to confidence, due to the correlation between σ_dv and RT. To illustrate this property, we simulate a network of modules and measure the network RT, σ_dv, FMC, accuracy and the time at which one of the modules casts its vote (vote time) under two stimulation protocols (SP). In SP 1, a brief positive pulse (an increase of 1 cd/m² for 40 ms) is injected into the A population; in SP 2, a pulse of the same amplitude but opposite sign is delivered to the B population (Fig. 4, the basal luminance is 50 cd/m²). A simulation of 10⁵ trials of each condition revealed that, as expected, both pulse manipulations resulted in comparable effects on accuracy (Fig. 4A), increasing the probability of selecting option A relative to chance for both SPs. However, the two pulse conditions had markedly different effects on RTs. For the single module’s vote time, correct votes were fastest for SP 1 (Fig. 4B).

This shows that at the module level, vote times are asymmetrically affected by sensory fluctuations. This asymmetry can be interpreted to be a consequence of the fact that a positive fluctuation brings more total input current (the combined input to both competing populations) while a negative fluctuation brings less total input. When a module has a higher total input current, it commits to a vote faster, which is in fact the basis for the neural implementation of the balance of speed and accuracy in decision making^40,42. Furthermore, the asymmetric influence can also be interpreted to derive from the different latencies with which fluctuations in favor of an option become synaptic input against the competing alternative. Sensory evidence in favor of the encoded option propagates rapidly (in V1, it has been shown to be mediated by AMPA receptors⁴⁹) and hence is integrated rapidly, while lateral interactions have a slower build-up time governed by the temporal constants of NMDA receptors and the characteristic time of recursion in the circuit^35,50. Hence, this could be interpreted to cause an asymmetric influence of the fluctuations on each of the competing populations, which in turn transfers to module vote time.

We hypothesized that the fluctuations’ asymmetric influence on vote time transfers to an asymmetric influence on the network RT and dispersion. We are able to confirm our hypothesis, finding that the network’s RTs are significantly faster for SP 1 (Fig. 4C) and σ_dv is significantly lower (Fig. 4D). FMC is also asymmetrically affected (Fig. 4E), reflecting smaller dispersion for SP 1 and thus leading to on average higher confidence reports for SP 1 relative to SP 2. This implies that positive fluctuations that target the selected option produce on average higher confidence than negative fluctuations that target the non-selected option.

The model accounts for subjects’ experimental performance and confidence

In this section we show that our model is able to reproduce subjects’ performance and confidence reports and the decision and confidence kernels in a two alternative, reaction time, perceptual task.

We aim to model the experimental data obtained in²⁴. In this experiment, human subjects had to select the brightest of two patches, reporting simultaneously the choice (‘left’ or ‘right’) and the confidence in their choice (‘high’ or ‘low’) (Fig. 5A). The critical manipulation of the experiment was the addition of time-varying luminance noise to the average luminance of each patch. By letting external noise rather than internal noise limit task performance, the average of the noise conditioned on the subject’s choice provides an estimate of the integration window and the influence that sensory evidence at a given time has on choice²⁵ (Fig. 5B shows a sample trial’s fluctuations for the selected and non-selected patches). The average of the fluctuations conditioned only on choice gives the decision kernel (D_S and D_N) and measures the influence of the fluctuation of the selected and non-selected patches on choice (Fig. 5C). By discriminating confidence reports, the confidence kernel is computed (C_S and C_N), which measures the influence of the fluctuation of the selected and non-selected patches on confidence (Fig. 5D). Zylberberg et al. found a symmetrical decision kernel and an asymmetrical confidence kernel, which is inconsistent with both balance of evidence and diffusion decision confidence models²⁴.

In order to show that our model of confidence yields asymmetrical confidence kernels, we fit the subjects’ decision kernel and task performance. To do this, we simulate a network of 100 modules that receive sensory input from two sources with the same distribution of target and distractor luminances as seen by the subjects. We propose that the luminance signal is linearly transformed to the neural input current I, where L is the observed luminance, g is the input gain and b is the input bias. We determine the values of g and b by fitting the model’s decision kernel and task performance to the subjects’ (Fig. 6A). After obtaining the luminance transformation parameters, we fit the parameters of the sigmoid that is used to transform FMC into a binary confidence report. To do this, we fit the subjects’ performance discriminated by confidence (i.e. high-hits, high-misses, low-hits and low-misses, Fig. 6B). Crucially, the confidence kernels are not used for fitting and thus the model’s confidence kernels can be considered a prediction. We found that the model’s confidence kernels are asymmetric and in excellent agreement with the subjects’ data (Fig. 6C).

Discussion

In this work we presented a neural model for two alternative decisions where an ensemble of modules collectively decides and, more importantly, encodes the decision’s confidence in the dispersion of firing rates. We showed that our model is able to account for well established relations between confidence and task difficulty (higher confidence for easier trials^{16,17,18,19,20}) and confidence and RT (higher confidence for faster decisions^10,13,18,51). More importantly, it is able to account for the sensory evidence’s asymmetric influence on confidence²⁴.

Our model’s key assumption is that decision confidence is decoded from the inter-module distribution of firing rates of the neurons encoding the selected option. We assume that the activity of each module can be interpreted as a sample of the sensory evidence’s integral over time. If subjects have access to a representation of the probability density of the sensory evidence integrals, they can estimate their belief of having made a correct decision, i.e. their confidence. This interpretation is based on the Bayesian interpretation of confidence²⁷. Briefly, the Bayesian interpretation assumes that subjects use Bayesian inference to obtain the probability distribution (posterior) that each alternative is correct given the evidence (distributional confidence). The subjects then report a summary confidence rating that combines the information of the posterior distribution (it can be either a binary, high/low, or a continuous rating). In order to do this, subjects can rely on summary statistics (such as the mean and variance) instead of the entire distributional information. Our model only has access to a sample of evidence integrals, not the probability distribution, and decodes confidence from the inter-module dispersion of neurons sensitive to the selected option (a single summary statistic). We reason that it only requires this because the dispersion is strongly related to the sensory reliability, i.e. the signal to noise ratio, which by itself is a possible confidence heuristic⁵². It could however rely on more information of the samples (other statistics) to decode confidence. For instance, most models of confidence only rely on the difference between the firing rates of the selected and non-selected alternatives⁵³ (similar to Vickers’ balance of evidence¹⁰). Furthermore, this difference can be directly related to the log odds measure of confidence, used by diffusion models^11,12, if one assumes that neurons use a probabilistic population code^54,55. The main difference with our model is that we use many modules to decide and in turn have access to many samples of the sensory evidence’s integral. Hence our model has access to more summary statistics to report confidence and does not rely only on the difference between competing options.

There are other proposals that also assume that confidence is determined from multiple samples, e.g. Koriat’s self-consistency model³⁴. However, Koriat’s consistency rule is similar to the “balance of evidence” as it constructs confidence from the difference between the number of samples that are in favor of and against the selected option. Our model uses a statistic that only takes into account the samples of the selected option and disregards the samples of the non-selected option. Hence, our proposal, in a way, implements a confirmation bias^56,57, a ubiquitous stereotypical error in human confidence judgments, where only the evidence consistent with the decision is used to report confidence.

Our main contribution is that our model is able to account for the asymmetric confidence kernels, while “balance of evidence” models cannot²⁴. This is possible thanks to the asymmetric influence that sensory fluctuations have on each module’s vote time, which transfers to both network RT and inter-module dispersion. This asymmetry can be interpreted to be a consequence of the fact that a positive fluctuation brings more total input current (the combined input to both competing populations) while a negative fluctuation brings less total input. When a module has a higher total input current, it commits to a vote faster, which is in fact the basis for the neural implementation of the balance of speed and accuracy in decision making^40,42. Furthermore, the asymmetric influence can also be interpreted to derive from the different latencies with which fluctuations in favor of an option become synaptic input against the competing alternative. Sensory evidence in favor of the encoded option propagates rapidly (in V1, it has been shown to be mediated by AMPA receptors⁴⁹) and hence is integrated rapidly, while lateral interactions have a slower build-up time governed by the temporal constants of NMDA receptors and the characteristic time of recursion in the circuit^35,50. Hence, this could be interpreted to cause an asymmetric influence of the fluctuations on each of the competing populations, which in turn transfers to module vote time.

Furthermore, our model is also suited to explain confidence reports in different experimental paradigms such as the fixed delay paradigm^11,13,58. In this paradigm, subjects are presented with the stimulus during a fixed interval and are forced to decide after the stimulus is turned off. The main result is that the longer the stimulus is presented, the higher the confidence, a property that our model reproduces (SI sec. E). However, we do not study many known properties of confidence. For example, we did not study tasks characterized by fast errors^43,44. These are normally linked to tasks where subjects are forced to balance their speed and accuracy tradeoff with additional costs for the passage of time. Our model’s speed-accuracy tradeoff (i.e. decision policy) can be tuned by changing the background input that targets all the competing populations (SI sec. E). However, our model is constructed assuming a constant decision policy and thus we do not study the problem of confidence calibration²⁷. Calibration is the process through which a summary statistic (in our case σ_dv or FMC) is transformed to a certainty level or a confidence report (which we assume occurs in a separate layer). This problem requires feedback connections and parameter tuning to actively learn the proper calibration for a variable decision policy. This also implies that the parameters that determine the probability of high confidence should also change with background input. This interesting problem is well beyond what we studied in the present work.

One of the main questions that should be addressed by future studies is: “how does the modularity arise?” Our model assumes that the network that decides is formed by modules that are by themselves networks with many neurons. It is crucial to study how this modular architecture could emerge, either from a property of the topology, sparse connectivity or synaptic plasticity. However, one of our important findings is that the modules do not need to be fully independent. Some degree of interconnection does not make the dispersion among modules less informative, hence neurons residing in a given module can be connected to neurons outside it and the network could still function. A detailed study of the extension of our model to a spiking neuron network is necessary.

We also report a novel prediction of the model in order to falsify (or confirm) it with new experiments. The network modules are ANNs that, when stimulated, increase their firing rate until they are in the vicinity of a stable steady state that has one population with increased firing rate and the other with low firing rate. This implies that as time goes on, more modules will reach a stable steady state and variability will decrease or reach a constant level. Our prediction is that intermodule variability should increase as a function of RT and task difficulty. This prediction should be tested in a neurophysiological experiment, with simultaneous neural recordings, where RT and confidence are observed. Furthermore, our model is built upon the idea that the decision is taken redundantly by many modules. This implies that the covariance between pairs of “integrating” neurons is not merely described by two point-processes with the same underlying rate; to some degree they must integrate evidence independently and, thus, may arrive at opposite conclusions.

Methods

Network model

Here we detail the network model without the neural layers that monitor choice and FMC (the full network that contains the neural populations that monitor choice and FMC is described in SI sec. B). The network contains N decision modules, where each module is composed of two decision populations (A and B) described by rate equations similar to³⁶:

where the superscript k denotes the module and s_i is the fraction of NMDA channels open in population i (i can be either A or B). The remaining quantities are the NMDA closure time, the parameter γ of the NMDA opening rate, population i’s firing rate, an effective input-output relation between the synaptic input and the firing rate, and the connection weight from population i in module k to population j in a (possibly different) module. This connection weight is determined by the interconnection parameter IC and by the connection weight between populations in the same module when the modules are taken as fully independent. At the IC value for which all modules are fully connected, all populations i, throughout the entire network of modules, are connected with the same weight to every population j, independently of the module they belong to. The external synaptic current that arrives at population i in module k has two separate contributions: the sensory signal, which is the same for every module, and the background noise produced by synaptic bombardment, which differs for every module. The latter is modeled as an Ornstein-Uhlenbeck (O-U) process that arises from the background noise filtered by the AMPA channel time constant (the O-U time constant is 10 ms and the O-U variance is expressed in nA²). The connection weights are expressed in nA.
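
The background noise term can be sketched as below, assuming the exact discrete-time update of an Ornstein-Uhlenbeck process. Only the 10 ms time constant comes from the text; the function name and the values of `mean` and `sd` (in nA) are hypothetical placeholders, not the parameters used in this work.

```python
import math
import random
import statistics


def ou_trace(n_steps, dt=0.001, tau=0.010, mean=0.3, sd=0.02, seed=0):
    """Sample an Ornstein-Uhlenbeck trace with stationary mean and sd.

    tau = 10 ms follows the text; mean and sd are illustrative placeholders.
    """
    rng = random.Random(seed)
    decay = math.exp(-dt / tau)
    # Noise amplitude that keeps the stationary standard deviation equal to sd.
    kick = sd * math.sqrt(1.0 - decay ** 2)
    x = mean
    trace = []
    for _ in range(n_steps):
        x = mean + (x - mean) * decay + kick * rng.gauss(0.0, 1.0)
        trace.append(x)
    return trace


# One independent noise trace per population and module (different seeds).
noise = ou_trace(50000, seed=3)
print(statistics.fmean(noise), statistics.pstdev(noise))
```

Drawing a separate trace for every population in every module, while feeding all modules the same sensory signal, is what makes the modules behave as distinct samples of the evidence integral.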

Behavioral task

The task is taken from²⁴. Participants fixated a central red dot (diameter of 0.56°) on a gray background (50 cd/m²) for 200 ms. Two flickering gray patches were then presented, one on each side of the fixation dot, until a response was made. Patches were presented on the horizontal meridian, centered at ±1.04° from the fixation point. Each patch was composed of four vertical, spatially adjacent bars (0.14° × 0.56°). The luminance of the bars was updated synchronously every 40 ms, sampling from a Gaussian distribution with a standard deviation of 10 cd/m². The mean of this distribution equaled the luminance of the background for one of the patches and was set higher for the other (referred to as the “target”). Subjects simultaneously reported the location they considered to be the target and a binary confidence level (high or low).

Psychophysical kernels

The luminance fluctuations are pooled into four groups depending on the subject’s selection and reported confidence. The groups are: (fluctuations of the selected patch with high/low confidence) and (fluctuations of the non-selected patch with high/low confidence), where the superscript T indicates the trial, k indicates the bar within the patch and t is the time. The decision and confidence psychophysical kernels are computed from these groups as:

where the average is taken over trials and bars within each patch and and are the decision and confidence kernels respectively.
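As an illustration of the pooling and averaging described above, the following sketch computes kernels under one plausible convention, which is an assumption and may differ from the exact definition used here: the decision kernel of a patch is the mean fluctuation over all trials and bars, and the confidence kernel is the mean over high-confidence trials minus the mean over low-confidence trials. The trial layout (dictionary keys, traces truncated to a common length) is hypothetical.

```python
def mean_trace(trials, key):
    """Average fluctuation at each time step over trials and bars."""
    n_t = len(trials[0][key][0])
    total = [0.0] * n_t
    count = 0
    for trial in trials:
        for bar in trial[key]:
            for t in range(n_t):
                total[t] += bar[t]
            count += 1
    return [v / count for v in total]


def kernels(trials):
    """Decision and confidence kernels for selected and non-selected patches.

    Each trial is a dict with keys "selected" and "non_selected" (one list
    of fluctuations per bar) and "high_confidence" (bool). Both confidence
    levels must be present in `trials`.
    """
    high = [tr for tr in trials if tr["high_confidence"]]
    low = [tr for tr in trials if not tr["high_confidence"]]
    out = {}
    for key in ("selected", "non_selected"):
        # Assumed convention: decision kernel pools both confidence levels.
        out["decision_" + key] = mean_trace(trials, key)
        # Assumed convention: confidence kernel is high minus low.
        hi_mean = mean_trace(high, key)
        lo_mean = mean_trace(low, key)
        out["confidence_" + key] = [h - l for h, l in zip(hi_mean, lo_mean)]
    return out
```

Fluctuations are expected to be expressed relative to the mean luminance of the corresponding patch, so that a kernel value of zero indicates no systematic influence at that time.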

Simulation protocol

The correlations between our model’s accuracy, RT and task discriminability were studied by simulating 2000 independent trials for several ICs and discriminabilities (d_i), where a network of N = 100 modules was stimulated with two competing sensory signals. Both sensory signals were resampled every 40 ms from a Gaussian distribution with a standard deviation of 5 cd/m² and a mean of 50 cd/m² (distractor) and cd/m² (target). The network spent 0.2 s with no stimulation before stimulus onset. The sensory signal was transformed to neural input as , where nA m²/cd and cd/m².
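The stimulation timeline described above can be sketched as follows. The sketch assumes that the linear transformation has the form I = g (L − b), which is consistent with the stated units of the two parameters but is not confirmed by the text; the function names, the integration step and any values passed for `g` and `b` are hypothetical.

```python
import random


def luminance_trace(mean, duration=1.0, dt=0.0005, sd=5.0,
                    refresh=0.040, rng=None):
    """Piecewise-constant luminance (cd/m^2), resampled every `refresh` s."""
    rng = rng or random.Random()
    steps_per_frame = round(refresh / dt)
    n_steps = round(duration / dt)
    trace = []
    value = None
    for step in range(n_steps):
        if step % steps_per_frame == 0:
            # New Gaussian draw at the start of every 40 ms frame.
            value = rng.gauss(mean, sd)
        trace.append(value)
    return trace


def to_current(trace, g, b, onset=0.2, dt=0.0005):
    """Prepend a silent pre-stimulus period and map luminance to current.

    Assumes I = g * (L - b); g and b are placeholders, not fitted values.
    """
    silent = [0.0] * round(onset / dt)
    return silent + [g * (lum - b) for lum in trace]
```

A distractor input would use `mean=50.0`, while the target mean depends on the discriminability being simulated.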

The asymmetrical influence of the sensory fluctuations on an ANN’s RT was studied by performing 10000 independent simulations, in which a network of N = 100 modules had to decide under 2 different stimulation protocols (SP1 and SP2 from Fig. 4). In both stimulation protocols, the network received no stimulation during the first 0.2 s and then all stimuli were turned on. The baseline sensory input was 50 cd/m². In SP1, the input targeting A had a positive fluctuation of 1 cd/m² during the first 40 ms. In SP2, the input targeting B had a negative fluctuation of 1 cd/m² during the first 40 ms.

Data fitting

To fit the behavioral data, a sensory input representing the average patch luminance observed by the subjects was sent to the network of modules. Stimulation began after a 1 s wait period. The mean patch luminance was resampled every 40 ms from a Gaussian distribution with a standard deviation of 5 cd/m² and a mean value that changed across trials. The mean target luminance was taken from the distribution of mean luminances observed by the subjects, and the mean distractor luminance was fixed at 50 cd/m². The network was forced to decide within 1 s after stimulus presentation and was penalized for non-decided trials and for early decisions (prior to stimulus presentation). The sensory input was transformed into neural input by a linear transformation as . The parameters g and b were determined by fitting the subjects’ decision kernel and task performance, using a covariance matrix adaptation evolution strategy (CMA-ES). The algorithm efficiently explores the parameter space to find the parameter values that minimize a merit function. The implementation was taken from⁵⁹. The merit function we used was:

where the first term is the squared difference between the subject and model’s decision kernels, and respectively. In the second term, is the Pearson chi-squared test statistic⁶⁰ for the hypothesis that the subject’s and the simulation’s numbers of hits and misses come from the same multinomial distribution. is the weight of the least squares and the weight of the Pearson statistic. The merit function is penalized by the number of early decision trials (N_e, decisions prior to stimulus onset) and the number of non-decided trials (N_d, trials where no option was selected). These penalties forced the network to commit to a choice within 1 s, driven only by the sensory input. The model’s decision kernel and task accuracy were approximated by simulating 10000 trials for each parameter set. Each trial’s mean target luminance was taken from the distribution of mean target luminances observed by all subjects. The values of g and b that minimized the merit function were nA m²/cd and cd/m².
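The ingredients of the merit function can be sketched as below. The way the terms are combined here (a weighted sum, with an additive penalty proportional to N_e + N_d) is an assumption made for illustration; the weights, the argument names and the penalty form are hypothetical. The Pearson statistic is computed in its standard two-sample homogeneity form.

```python
def pearson_chi2(counts_a, counts_b):
    """Pearson chi-squared statistic for two samples of category counts.

    Tests whether both samples could come from the same multinomial,
    e.g. counts = (hits, misses) for subject and for model.
    """
    total_a, total_b = sum(counts_a), sum(counts_b)
    grand = total_a + total_b
    stat = 0.0
    for a, b in zip(counts_a, counts_b):
        column = a + b
        if column == 0:
            continue  # empty category contributes nothing
        for observed, row_total in ((a, total_a), (b, total_b)):
            expected = row_total * column / grand
            stat += (observed - expected) ** 2 / expected
    return stat


def merit(kernel_subject, kernel_model, counts_subject, counts_model,
          n_early, n_undecided, w_kernel=1.0, w_chi2=1.0, w_penalty=1.0):
    """Assumed merit: weighted least squares + weighted chi2 + penalty."""
    squared = sum((s - m) ** 2 for s, m in zip(kernel_subject, kernel_model))
    chi2 = pearson_chi2(counts_subject, counts_model)
    return (w_kernel * squared + w_chi2 * chi2
            + w_penalty * (n_early + n_undecided))
```

A function of this kind would be evaluated once per candidate (g, b) pair after simulating the trials, and the optimizer would then search for the pair with the lowest value.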

Once the decision kernel and performance were fitted using the luminance transformation parameters, the confidence report rate (number of high confidence hits and misses and low confidence hits and misses) was fitted. We propose that a separate neural layer decodes confidence from the estimate of σ_dv. The dispersion was estimated with the activity of pools CA/CB (the equivalent to FMC but computed with neural populations as shown in SI sec. B) at decision time. The probability of high confidence is taken as , where x is the dispersion’s estimate. We determine a and c by sampling the model’s binary confidence responses from and minimizing the Pearson chi-squared test statistic between the model and subject’s performance discriminated by confidence (number of high-hits, high-misses, low-hits and low-misses). Again, we used the same CMA-ES optimization algorithm to perform these fits⁵⁹. The resulting parameters were s and Hz. The results are qualitatively the same when taking the artificially counted FMC as the estimate of σ_dv at response time.
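The mapping from dispersion to a binary confidence report can be sketched as follows. The sketch assumes a decreasing logistic function, p = 1 / (1 + exp(a (x − c))), chosen because it is dimensionally consistent with a in seconds and c in Hz and because larger dispersion should yield lower confidence; the actual functional form is not recoverable from the text, and any values passed for `a` and `c` are hypothetical.

```python
import math
import random


def p_high_confidence(x, a, c):
    """Assumed logistic map from dispersion x (Hz) to P(high confidence).

    a (s) sets the slope and c (Hz) the dispersion at which p = 0.5.
    """
    return 1.0 / (1.0 + math.exp(a * (x - c)))


def sample_confidence(dispersions, a, c, rng=None):
    """Draw one binary report (True = high confidence) per trial."""
    rng = rng or random.Random()
    return [rng.random() < p_high_confidence(x, a, c) for x in dispersions]
```

The sampled reports can then be tallied into high-hits, high-misses, low-hits and low-misses and compared with the subjects’ counts.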

Additional Information

How to cite this article: Paz, L. et al. Confidence through consensus: a neural mechanism for uncertainty monitoring. Sci. Rep. 6, 21830; doi: 10.1038/srep21830 (2016).

References

1. Usher, M. & McClelland, J. L. The time course of perceptual choice: The leaky, competing accumulator model. Psychological Review 108, 550–592 (2001).
2. Smith, P. L. & Ratcliff, R. Psychology and neurobiology of simple decisions. Trends in Neurosciences 27, 161–168 (2004).
3. Bogacz, R., Brown, E., Moehlis, J., Holmes, P. & Cohen, J. D. The physics of optimal decision making: a formal analysis of models of performance in two-alternative forced-choice tasks. Psychological Review 113, 700–65 (2006).
4. Brown, S. D. & Heathcote, A. The simplest complete model of choice response time: Linear ballistic accumulation. Cognitive Psychology 57, 153–178 (2008).
5. Smith, P. L. & McKenzie, C. R. L. Diffusive information accumulation by minimal recurrent neural models of decision making. Neural Computation 23, 2000–2031 (2011).
6. Audley, R. J. A stochastic model for individual choice behavior. Psychological Review 67, 1–15 (1960).
7. Vickers, D., Burt, J., Smith, P. & Brown, M. Experimental paradigms emphasising state or process limitations: I. Effects on speed-accuracy tradeoffs. Acta Psychol 59, 129–161 (1985).
8. Kepecs, A., Uchida, N., Zariwala, H. A. & Mainen, Z. F. Neural correlates, computation and behavioural impact of decision confidence. Nature 455, 227–31 (2008).
9. Fetsch, C. R., Kiani, R., Newsome, W. T. & Shadlen, M. N. Effects of cortical microstimulation on confidence in a perceptual decision. Neuron 83, 797–804 (2014).
10. Vickers, D., Burt, J., Smith, P. & Brown, M. Experimental paradigms emphasising state or process limitations: II. Effects on confidence. Acta Psychol 59, 163–193 (1985).
11. Kiani, R. & Shadlen, M. N. Representation of confidence associated with a decision by neurons in the parietal cortex. Science (New York, N.Y.) 324, 759–64 (2009).
12. Kiani, R., Corthell, L. & Shadlen, M. N. Choice certainty is informed by both evidence and decision time. Neuron 84, 1329–1342 (2014).
13. Moreno-Bote, R. Decision confidence and uncertainty in diffusion models with partially correlated neuronal integrators. Neural Computation 22, 1786–1811 (2010).
14. Pleskac, T. J. & Busemeyer, J. R. Two-stage dynamic signal detection: a theory of choice, decision time and confidence. Psychological Review 117, 864–901 (2010).
15. Rolls, E. T., Grabenhorst, F. & Deco, G. Choice, difficulty and confidence in the brain. NeuroImage 53, 694–706 (2010).
16. Garrett, H. E. A study of the relation of accuracy to speed. Archs Psychol. 56, 1–105 (1922).
17. Johnson, D. M. Confidence and speed in the two-category judgment. Archs Psychol. 34, 1–53 (1939).
18. Festinger, L. Studies in decision: I. Decision-time, relative frequency of judgment and subjective confidence. J Exp Psychol 32, 291–306 (1943).
19. Vickers, D. Decision Processes in Visual Perception (Academic Press, New York, 1979).
20. Kornell, N., Rhodes, M. G., Castel, A. D. & Tauber, S. K. The ease-of-processing heuristic and the stability bias: Dissociating memory, memory beliefs and memory judgments. Psychol Sci 22, 787–794 (2011).
21. Henmon, V. A. C. The relation of the time of a judgment to its accuracy. Psychol Rev 18, 186 (1911).
22. Volkmann, J. The relation of time of judgment to certainty of judgment. Psychol Bull 31, 672–673 (1934).
23. Reed, J. B. The speed and accuracy in discriminating differences in hue, brilliance, area and shape. In Johnson, D. M. (ed.) The Psychology of Thought and Judgment 371–372 (Harper, New York, 1951).
24. Zylberberg, A., Barttfeld, P. & Sigman, M. The construction of confidence in a perceptual decision. Frontiers in Integrative Neuroscience 6, 79 (2012).
25. Ahumada, A. J. J. Perceptual classification images from vernier acuity masked by noise. Perception 25 (1996).
26. Kahneman, D. & Tversky, A. Variants of uncertainty. Cognition 11, 143–157 (1982).
27. Meyniel, F., Sigman, M. & Mainen, Z. Confidence as Bayesian probability: From neural origins to behavior. Neuron 88, 78–92 (2015).
28. Gardiner, C. W. Handbook of Stochastic Methods: for Physics, Chemistry and the Natural Sciences (Springer-Verlag, Berlin Heidelberg New York, 1985).
29. Gold, J. I. & Shadlen, M. N. The neural basis of decision making. Annual Review of Neuroscience 30, 535–74 (2007).
30. Wang, X.-J. Neural dynamics and circuit mechanisms of decision-making. Current Opinion in Neurobiology 22, 1039–46 (2012).
31. Brunton, B. W., Botvinick, M. M. & Brody, C. D. Rats and humans can optimally accumulate evidence for decision-making. Science 340, 95–8 (2013).
32. Hanks, T. D. et al. Distinct relationships of parietal and prefrontal cortices to evidence accumulation. Nature 520, 220–223 (2015).
33. Lafuente, V. D., Jazayeri, M. & Shadlen, M. N. Representation of accumulating evidence for a decision in two parietal areas. Journal of Neuroscience 35, 4306–4318 (2015).
34. Koriat, A. The self-consistency model of subjective confidence. Psychological Review 119, 80–113 (2012).
35. Wang, X.-J. Probabilistic decision making by slow reverberation in cortical circuits. Neuron 36, 955–68 (2002).
36. Wong, K.-F. & Wang, X.-J. A recurrent network mechanism of time integration in perceptual decisions. The Journal of Neuroscience: the official journal of the Society for Neuroscience 26, 1314–28 (2006).
37. Wang, X.-J. Decision making in recurrent neuronal circuits. Neuron 60, 215–34 (2008).
38. Martí, D., Deco, G., Mattia, M., Gigante, G. & Del Giudice, P. A fluctuation-driven mechanism for slow decision processes in reverberant networks. PloS One 3, e2534 (2008).
39. Churchland, A. K., Kiani, R. & Shadlen, M. N. Decision-making with multiple alternatives. Nature Neuroscience 11, 693–702 (2008).
40. Bogacz, R., Wagenmakers, E.-J., Forstmann, B. U. & Nieuwenhuis, S. The neural basis of the speed-accuracy tradeoff. Trends in Neurosciences 33, 10–6 (2010).
41. Thura, D., Beauregard-Racine, J., Fradet, C.-W. & Cisek, P. Decision making by urgency gating: theory and experimental support. Journal of Neurophysiology 108, 2912–2930 (2012).
42. Hanks, T. D., Kiani, R. & Shadlen, M. N. A neural mechanism of speed-accuracy tradeoff in macaque area LIP. eLife 2014, 1–17 (2014).
43. Swensson, R. G. & Edwards, W. Response strategies in a two-choice reaction task with a continuous cost for time. Journal of Experimental Psychology 88, 67–81 (1971).
44. Ratcliff, R. & Rouder, J. N. Modeling response times for two-choice decisions. Psychological Science 9, 347–356 (1998).
45. Lo, C.-C. & Wang, X.-J. Cortico-basal ganglia circuit mechanism for a decision threshold in reaction time tasks. Nature Neuroscience 9, 956–63 (2006).
46. Chevalier, G. & Deniau, J. M. Disinhibition as a basic process of striatal functions. Trends in Neurosciences 13, 277–280 (1990).
47. Letzkus, J. J. et al. A disinhibitory microcircuit for associative fear learning in the auditory cortex. Nature 480, 331–335 (2011).
48. Cecchi, G. A. et al. Noise in neurons is message dependent. Proceedings of the National Academy of Sciences of the United States of America 97, 5557–61 (2000).
49. Self, M. W., Kooijmans, R. N., Supèr, H., Lamme, V. A. & Roelfsema, P. R. Different glutamate receptors convey feedforward and recurrent processing in macaque V1. Proceedings of the National Academy of Sciences 109, 11031–11036 (2012).
50. Wang, M. et al. NMDA receptors subserve persistent neuronal firing during working memory in dorsolateral prefrontal cortex. Neuron 77, 736–49 (2013).
51. Vickers, D. & Packer, J. Effects of alternating set for speed or accuracy on response time, accuracy and confidence in a unidimensional discrimination task. Acta Psychologica 50, 179–197 (1982).
52. Maniscalco, B. & Lau, H. A signal detection theoretic approach for estimating metacognitive sensitivity from confidence ratings. Consciousness and Cognition 21, 422–430 (2012).
53. Wei, Z. & Wang, X.-J. Confidence estimation as a stochastic process in a neural dynamical system of decision making. Journal of Neurophysiology 114, 99–113 (2015).
54. Ma, W. J., Beck, J. M., Latham, P. E. & Pouget, A. Bayesian inference with probabilistic population codes. Nature Neuroscience 9, 1432–8 (2006).
55. Beck, J. M. et al. Probabilistic population codes for Bayesian decision making. Neuron 60, 1142–52 (2008).
56. Kahneman, D. & Tversky, A. On the psychology of prediction. Psychological Review 80, 237–251 (1973).
57. Nickerson, R. S. Confirmation bias: A ubiquitous phenomenon in many guises. Review of General Psychology 2, 175–220 (1998).
58. Irwin, F. W., Smith, W. A. S. & Mayfield, J. F. Tests of two theories of decision in an “expanded judgment” situation. J Exp Psychol 51, 261–268 (1956).
59. Hansen, N., Niederberger, A. S. P., Guzzella, L. & Koumoutsakos, P. A method for handling uncertainty in evolutionary optimization with an application to feedback control of combustion. IEEE Transactions on Evolutionary Computation 13, 180–197 (2009).
60. Plackett, R. L. Karl Pearson and the chi-squared test. International Statistical Review 51, 59–72 (1983).

Acknowledgements

This work was supported by CONICET-Argentina (to L.P. and M.S.), the Spanish Ministry of Science and Technology Grant BFM2002-02042 (to D.C. and J.L.R.), by National Science Foundation Grant DMS-0245242 (to C.F.), by Agència de Gestió d’Ajuts Universitaris i de Recerca (AGAUR - 2014SGR856 to A.I. and G.D.), by MINECO (PSI2013-42091-P to A.I. and G.D.) and by the James McDonnell Foundation 21st Century Science Initiative in Understanding Human Cognition - Scholar Award (to M.S.).

Author information

Authors and Affiliations

CONICET and Physics Department, Integrative Neuroscience Laboratory, IFIBA, FCEyN, UBA, Buenos Aires, Argentina
Luciano Paz & Mariano Sigman
Universidad Pompeu Fabra, Barcelona, Spain
Andrea Insabato & Gustavo Deco
Department of Neuroscience, Howard Hughes Medical Institute, Columbia University, New York, 10032, NY, USA
Ariel Zylberberg
Universidad Torcuato di Tella, Buenos Aires, Argentina
Mariano Sigman

Contributions

L.P., A.I., A.Z., G.D. and M.S. contributed to the model design. L.P. and A.I. wrote the computational implementation. L.P. wrote the parallel computational implementation and performed the simulations and data fitting. A.Z. collected the behavioral data used in figures 5–6. L.P. wrote the main manuscript and supplementary information. L.P., A.I., A.Z., G.D. and M.S. reviewed the manuscript.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Electronic supplementary material

Supplementary Information

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

About this article

Cite this article

Paz, L., Insabato, A., Zylberberg, A. et al. Confidence through consensus: a neural mechanism for uncertainty monitoring. Sci Rep 6, 21830 (2016). https://doi.org/10.1038/srep21830

Received: 21 August 2015
Accepted: 13 January 2016
Published: 24 February 2016
DOI: https://doi.org/10.1038/srep21830
