Abstract
Humans infer motion direction from noisy sensory signals. We hypothesize that to make these inferences more precise, the visual system computes motion direction not only from velocity but also from spatial orientation signals – a ‘streak’ created by moving objects. We implement this hypothesis in a Bayesian model, which quantifies knowledge with probability distributions, and test its predictions using psychophysics and fMRI. Using a probabilistic pattern-based analysis, we decode probability distributions of motion direction from trial-by-trial activity in the human visual cortex. Corroborating the predictions, the decoded distributions have a bimodal shape, with peaks that predict the direction and magnitude of behavioral errors. Interestingly, we observe similar bimodality in the distribution of the observers’ behavioral responses across trials. Together, these results suggest that observers use spatial orientation signals when estimating motion direction. More broadly, our findings indicate that the cortical representation of low-level visual features, such as motion direction, can reflect a combination of several qualitatively distinct signals.
Introduction
Estimating the direction of motion of an object is arguably one of the most ubiquitous tasks. Whether it is to catch a ball, cross a busy street, or make sure that your toddler does not run into something, the visual system needs to quickly and efficiently parse the retinal input and infer the direction in which things are moving. Yet, this task is also very difficult because of noise. The visual system needs to rapidly infer an object’s direction of motion, despite occlusion of motion paths, changes in motion speed and direction, and additional variability in neural signals. As a result of all these sources of variance, estimates of motion direction are necessarily uncertain – any given pattern of neural activity is almost always consistent with multiple interpretations.
Are there ways in which the nervous system could reduce this uncertainty in the interpretation of its sensory signals? One potential strategy for reducing uncertainty could be to use additional visual cues when inferring motion direction. Using multiple cues to obtain a more precise estimate of a visual feature is a well-known strategy observed for slant, depth, and shape perception, among others^{1}. Although motion direction is often thought of as a “simple” visual feature, the underlying inference steps need not be simple from a computational perspective and could involve cues other than velocity. For example, the nervous system could rely on “motion streaks” in its estimates of direction^{2} – a “streak” is a smeared representation of a fast-moving object due to the temporal integration of signals along its motion path. From a computational standpoint, combining such spatial orientation signals with the information provided by velocity-tuned neurons would decrease the uncertainty in inferred motion direction (as also illustrated in simulations below). Although the noise in velocity and spatial orientation signals is probably correlated (e.g., due to common retinal factors), integrating the information provided by both sources would still be beneficial as long as the correlation between them is not perfect (see^{1,3,4} for similar rationale). While behavioral studies suggest that observers are sensitive to motion streaks (e.g., ^{2,5,6}), direct neural evidence for streak-based computations in human motion perception is currently lacking.
Here, we investigate the neural and computational basis of human motion perception, using a combination of computational modeling, psychophysics, and functional MRI. To arrive at a set of quantitative predictions, we first implemented a Bayesian observer model that uses both velocity and spatial orientation signals in its estimates of motion direction. The model quantifies its beliefs about direction of motion as a probability distribution, wherein each direction of motion (interpretation) is assigned a probability of being true. As we will illustrate below, the observer model makes a surprising prediction: when an observer relies on both velocity and orientation signals, their belief should be represented as a bimodal probability distribution, with peaks around the true and opposite motion direction. We tested this prediction using fMRI in combination with a generative model-based decoding technique to extract probability distributions from cortical activity^{7}. Interestingly, this revealed that motion direction is represented as a bimodal probability distribution in visual areas V1, V2, V3, and V4, as well as motion-sensitive middle temporal cortex (hMT+). Notably, the shape of the decoded distribution (peak locations and entropy) predicted both the direction and magnitude of the behavioral errors made by the participants. The observer model furthermore predicted a perceptual illusion that we subsequently tested and found support for in a follow-up behavioral study: when sensory information is particularly noisy, participants sometimes perceive the stimulus as if it is moving in the direction opposite to the true direction of motion. Taken together, our findings provide strong evidence that spatial orientation signals are used by the nervous system in its judgments of motion direction and demonstrate complexity in the visual processing of a seemingly simple visual feature.
Results
Bayesian observer model
Visual perception is necessarily uncertain. Because of noise and ambiguity, it is impossible to infer with absolute precision the stimulus from the sensory response. Instead, any sensory measurement is consistent with a whole range of different interpretations. What strategies could the brain employ to reduce this uncertainty and improve the precision of its sensory estimates?
One well-known strategy to reduce uncertainty is to use additional sources of information. For example, when the observer’s task is to determine an object’s direction of motion from noisy sensory measurements, combining velocity signals with spatial orientation signals could help to decrease ambiguity^{2}. Orientation signals are useful because they convey information about the trajectory of a moving object. That is, if the observer integrates the object’s position over time, the orientation of the resulting path (a motion “streak”) will be aligned with motion direction. Assuming noise is sufficiently independent between the signals, orientation signals could provide additional information about an object’s motion direction and improve the observers’ estimates. Indeed, given known neural response properties in visual areas^{8,9,10,11}, it seems likely that velocity and spatial orientation signals remain largely independent in cortex. Here, we develop a Bayesian observer model that implements this strategy. The model results in a set of concrete predictions that we will test in experiments.
The observer’s task is to infer the direction of motion of a stimulus s using two types of signals: velocity and spatial orientation. These signals are corrupted by noise (Fig. 1a). Thus, across trials, the sensory measurements \({x}_{V}\) (based on velocity) and \({x}_{O}\) (based on orientation) can take different values and are best described by a probability distribution (the generative distribution). For the velocity measurements \({x}_{V}\), we assume that the probability distribution of their values, \(p\left({x}_{V}\mid s\right)\), is a von Mises (circular normal) distribution with variance \({\sigma }_{V}^{2}\). The values of the spatial orientation measurements, described by \(p\left({x}_{O}\mid s\right)\), also follow a von Mises distribution, but wrapped into the 180° orientation space and with variance \({\sigma }_{O}^{2}\).
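The two generative distributions can be sketched numerically. The snippet below is an illustration only (not the authors' code): concentration parameters \(\kappa\) play the role of the inverse variances \(1/\sigma^2\), and all parameter values are assumed.

```python
import numpy as np

def sample_measurements(s_deg, kappa_v, kappa_o, n, rng):
    """Draw n noisy measurements for a stimulus moving in direction s_deg.
    kappa_v / kappa_o are von Mises concentrations (roughly 1/sigma^2)."""
    s = np.deg2rad(s_deg)
    # velocity measurement: von Mises around the true direction (360° space)
    x_v = rng.vonmises(s, kappa_v, n) % (2 * np.pi)
    # orientation measurement: von Mises in the doubled-angle space, then
    # halved, so it lives on [0, 180°) and cannot distinguish s from s + 180°
    x_o = (rng.vonmises(2 * s, kappa_o, n) / 2) % np.pi
    return x_v, x_o
```

Sampling the orientation measurement in the doubled-angle space is what enforces the 180° wrapping: a stimulus moving at 30° and one moving at 210° produce identical orientation measurements.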
To infer the object’s direction of motion from the sensory signals, the Bayesian observer uses knowledge of these generative distributions to compute a likelihood function. Given the velocity measurements alone, the likelihood function \({L}_{V}\left(s\mid {x}_{V}\right)=p\left({x}_{V}\mid s\right)\). When computed as a function of s, the likelihood reflects the range of possible motion directions that are consistent with the velocity measurement \({x}_{V}\). The observer similarly computes a likelihood function from the orientation signals. Notably, compared to the velocity measurements, the orientation signals are even more ambiguous with respect to motion direction, because a given orientation is consistent with two ranges of opposite motion directions. For example, a snowflake moving left and a snowflake moving right move along the same horizontal motion path. This is why the likelihood function given the orientation measurement is bimodal, with two peaks that indicate opposite motion directions: \({L}_{O}\left(s\mid {x}_{O}\right)=p\left({x}_{O}\mid s\right)\), which, viewed as a function of the direction s, peaks at both \(s={x}_{O}\) and \(s={x}_{O}+180^\circ\).
How should the observer make use of all this information so as to determine the object’s direction of motion? Both likelihood functions provide information about the stimulus, and the Bayesian observer combines these to arrive at a better estimate. Specifically, under the assumption that the noise is independent, the observer simply multiplies the two likelihoods (see Supplementary Fig. 1 for predictions when noise is correlated). The observer then uses Bayes’ rule to infer the posterior distribution, \(p\left(s\mid {x}_{O},{x}_{V}\right)\), which describes the range of possible directions of motion given the two measurements. Assuming that the prior \(p(s)\) is flat, \(p\left(s\mid {x}_{O},{x}_{V}\right)\propto {L}_{V}\left(s\mid {x}_{V}\right){L}_{O}\left(s\mid {x}_{O}\right)\). While the prior is likely not flat for orientation signals^{12}, this assumption does not affect any of our conclusions. The peak of the posterior is the most likely direction of motion, and we assume that this is the observer’s decision (the maximum a posteriori estimate, \({\hat{s}}_{{MAP}}\); see Methods, Bayesian observer models, for a discussion of other read-out strategies). We take the entropy of the posterior distribution (the expected Shannon information; see Methods) as a measure of the degree of uncertainty in this estimate. Because the sensory measurements vary from trial to trial, so do the observer’s estimates of the direction of motion. Across trials, this results in a distribution, \(p({\hat{s}}_{{MAP}}\mid s)\), which we will call the behavioral response distribution. As we will demonstrate in simulations, combining the two sources of information reduces uncertainty. For behavior, the integration of velocity and orientation signals is beneficial as well, but under some conditions it also leads to a surprising bimodal pattern of responses, which we discuss below.
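Under a flat prior, multiplying the two likelihoods and reading out the MAP estimate and entropy can be sketched on a discrete grid of candidate directions. This is a minimal numerical illustration (concentration values are assumed; von Mises normalizing constants cancel under normalization and are dropped):

```python
import numpy as np

def posterior_on_grid(x_v, x_o, kappa_v, kappa_o):
    """Posterior over motion direction from one velocity measurement x_v
    (radians, 360° space) and one orientation measurement x_o (radians,
    180° space), assuming independent noise and a flat prior."""
    s = np.deg2rad(np.arange(360))                   # candidate directions
    L_v = np.exp(kappa_v * np.cos(s - x_v))          # unimodal velocity likelihood
    L_o = np.exp(kappa_o * np.cos(2 * (s - x_o)))    # 180°-periodic: two peaks
    post = L_v * L_o                                 # multiply the likelihoods
    post /= post.sum()                               # normalize to a distribution
    s_map = np.argmax(post)                          # MAP estimate (degrees)
    entropy = -np.sum(post * np.log2(post + 1e-12))  # uncertainty in bits
    return post, s_map, entropy
```

Setting `kappa_v` to a small value widens the velocity likelihood, so the two orientation peaks survive in the posterior; setting `kappa_o = 0` recovers a velocity-only observer.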
We simulated the trial-by-trial decisions of the Bayesian observer when presented with noisy stimuli. Specifically, we used different levels of noise in the observer’s measurements to illustrate the predictions. We computed the posterior distribution, quantified uncertainty, and obtained the observer’s behavioral responses. In what follows, we start with the description of the simulations in the low-stimulus-noise regime, which corresponds to our fMRI study design. Later, we will describe the results of the simulations in a high-stimulus-noise regime, which matches the design of a follow-up behavioral experiment.
First, we show that combining velocity and orientation signals is indeed beneficial for observers. To this end, we compare two different model observers. The first observer infers motion direction from velocity signals only, while the second one uses both orientation and velocity signals. We analyzed the relationship between behavioral variability and uncertainty for each model observer, with the velocity and orientation standard deviation (\({\sigma }_{V}\) and \({\sigma }_{O}\)) parameters spanning the range from 3° to 100° to ensure that our predictions hold for different parameter settings. We found that, in general, behavioral variability increases when uncertainty increases (Fig. 1b). In addition, the posterior distribution computed by the velocity-only observer always indicates greater levels of uncertainty than the posterior obtained by the observer who combines velocity and orientation likelihoods. This is because the orientation signals provide additional information as to which motion directions are likely. This reduction in uncertainty also results in improved behavior. That is, the more information (the less uncertainty) there is, the better the observer’s estimates of motion direction. Thus, behavioral variability decreases when both velocity and orientation signals are used. Overall, the simulations demonstrate that combining the velocity and spatial orientation signals is beneficial for the observer’s behavior: it decreases uncertainty, which reduces the variability of behavioral responses.
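A small simulation in the same spirit compares the behavioral variability of the two model observers. This is a sketch, not the authors' simulation code; the noise levels and trial count are assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
grid = np.deg2rad(np.arange(360))
s_true = np.deg2rad(45.0)
kappa_v, kappa_o = 6.0, 8.0          # illustrative concentrations (assumed)

def map_estimate(x_v, x_o, use_orientation):
    """MAP read-out of the posterior on a 1° grid, flat prior."""
    logp = kappa_v * np.cos(grid - x_v)
    if use_orientation:
        logp = logp + kappa_o * np.cos(2 * (grid - x_o))
    return grid[np.argmax(logp)]

def circ_sd(err):
    """Circular standard deviation of response errors (radians)."""
    R = np.abs(np.mean(np.exp(1j * err)))
    return np.sqrt(-2 * np.log(R))

n = 4000
x_v = rng.vonmises(s_true, kappa_v, n)
x_o = (rng.vonmises(2 * s_true, kappa_o, n) / 2) % np.pi

wrap = lambda e: np.angle(np.exp(1j * e))   # wrap errors to (-pi, pi]
err_v  = wrap(np.array([map_estimate(v, o, False) for v, o in zip(x_v, x_o)]) - s_true)
err_vo = wrap(np.array([map_estimate(v, o, True)  for v, o in zip(x_v, x_o)]) - s_true)
# the observer that also uses orientation signals is less variable:
# circ_sd(err_vo) < circ_sd(err_v)
```

The velocity-only observer's variability is set by \(\sigma_V\) alone, whereas the combined observer pools the curvature of both likelihoods around the true direction, yielding a tighter error distribution.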
A second prediction of the model is that the posterior distribution could become bimodal when the orientation likelihood is combined with a sufficiently uncertain velocity likelihood. In other words, the inference process results in a posterior distribution that has two peaks (see Fig. 1c). In our simulations, we varied uncertainty by manipulating the amount of noise in the velocity measurements (\({\sigma }_{V} > 30^\circ\) for our set of simulations; see Methods for details). When velocity signals indicate high uncertainty (so the likelihood function \({L}_{V}\left(s\mid {x}_{V}\right)\) is wide), the orientation signal dominates in the posterior, resulting in a bimodal distribution. At the extreme, when the velocity signals provide no information at all (the velocity likelihood function is flat), the posterior becomes fully proportional to a bimodal orientation likelihood function (Supplementary Fig. 2). Of course, this is an extreme scenario in which the internal representation strongly deviates from the physical stimulus (for example, a low-coherence stimulus combined with strong attentional effects on motion streaks might create a representation that is dominated by orientation signals). But even in less extreme noise scenarios, when velocity signals do provide some imprecise clues to motion direction (i.e., the velocity likelihood function is wide, but not flat), bimodality stemming from the presence of orientation signals is still visible, because the ambiguity of the orientation signal cannot be fully resolved by the velocity signals. This bimodal shape is exclusively tied to the presence of orientation signals, as the velocity-only observer always arrives at unimodal posteriors, even with increased velocity uncertainty. Notably, bimodal posteriors can also be observed when stimulus noise is low but there is additional noise from other sources, such as fMRI noise (see details in Methods).
That is, the noise incurred by fMRI measurements further increases the uncertainty associated with each of the two cues (Supplementary Fig. 3) and can result in bimodality at the level of voxels – even when at the neural level the posterior is unimodal. To account for this in our predictions, we included fMRI noise in our simulations of this experiment (see Eq. 30). We verified that our predictions are qualitatively the same when the posterior distribution is estimated directly from voxel population activity (see Supplementary Methods). The bimodality in the observed posterior is particularly evident when averaging distributions across trials: When orientation signals are present, the resulting mean posterior distribution is bimodal with peaks around the true and opposite motion directions (Fig. 2a).
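The emergence of the second peak can be made concrete with a one-parameter sketch: as the velocity concentration shrinks toward zero (a flat velocity likelihood), the posterior mass at the opposite direction approaches the mass at the true direction. All values below are assumed for illustration:

```python
import numpy as np

grid = np.deg2rad(np.arange(360))
kappa_o = 8.0                      # orientation concentration (assumed)
x = np.deg2rad(30)                 # both measurements at the true direction

def opposite_to_true_ratio(kappa_v):
    """Posterior height at the opposite peak (210°) relative to the
    true peak (30°), for a velocity likelihood of concentration kappa_v."""
    logp = kappa_v * np.cos(grid - x) + kappa_o * np.cos(2 * (grid - x))
    post = np.exp(logp - logp.max())
    post /= post.sum()
    return post[210] / post[30]
```

In this idealized setting the ratio is \(e^{-2\kappa_V}\): it equals 1 for a flat velocity likelihood and decays toward 0 as the velocity signal becomes more reliable, which is why the second peak is visible only when velocity uncertainty is high.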
The bimodality in the shape of the posterior can be observed not only when averaged across trials, but also on a trial-by-trial basis for the observer who uses both sources of information. The exact shape of the posterior varies from trial to trial and can be quantified by fitting a mixture of two von Mises components (basis functions) to the posterior, in line with its analytic description (see Methods). Parameters of the von Mises basis functions provide a convenient quantification of the posterior shape and allow us to trace the bimodality in the posterior at the single-trial level. Specifically, we analyzed the location of the peaks of the two components and plotted, for each possible combination of locations, the probability of observing a trial with that particular combination (i.e., the joint probability distribution of the component locations, Fig. 2b). Three clusters of trials are readily observed in this plot. The first large cluster has two components co-located around the true motion direction. This combination of components corresponds to a unimodal posterior. In the second large cluster, the larger component is located around the true direction and a smaller one is located around the opposite direction of motion. This cluster is important for us, as it describes a bimodal posterior with a larger peak around the true direction. Finally, a small third cluster also corresponds to a bimodal posterior, but in this case the velocity measurement just happened to be closer to the opposite motion direction because of random noise. Accordingly, the larger component is located around the opposite direction, and the smaller fitted peak lies closer to the true direction of motion. This cluster of trials also signifies that orientation signals are present, but it might be difficult to detect in our empirical analyses, because it comprises only a small number of trials (about 0.3% in our simulations).
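This basis-function quantification can be sketched as a least-squares fit of a two-component von Mises mixture to a posterior defined on a 1° grid. This is an illustration only: the fixed concentration and the simple multi-start Nelder-Mead optimizer are assumptions, not the authors' actual fitting procedure:

```python
import numpy as np
from scipy.optimize import minimize

def fit_two_von_mises(post, kappa=8.0):
    """Fit a mixture of two von Mises components (fixed concentration)
    to a posterior given on a 360-point, 1° grid. Returns the two
    component means (degrees) and the weight of the first component."""
    grid = np.deg2rad(np.arange(360))

    def mixture(params):
        mu1, mu2, w = params
        f1 = np.exp(kappa * np.cos(grid - mu1)); f1 /= f1.sum()
        f2 = np.exp(kappa * np.cos(grid - mu2)); f2 /= f2.sum()
        return w * f1 + (1 - w) * f2

    def loss(params):
        return np.sum((mixture(params) - post) ** 2)

    # restart from a few component placements to avoid local minima
    best = min((minimize(loss, [m, m + np.pi, 0.7], method="Nelder-Mead")
                for m in np.deg2rad([0.0, 90.0, 180.0, 270.0])),
               key=lambda r: r.fun)
    mu1, mu2, w = best.x
    return np.rad2deg(mu1) % 360, np.rad2deg(mu2) % 360, w
```

On a unimodal posterior the two fitted means coincide; on a bimodal posterior they separate by roughly 180°, which is the diagnostic used in the cluster analysis described above.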
In sum, when the observer computes a posterior distribution from two likelihoods, one obtained from velocity and the other from spatial orientation signals, then we should observe a large cluster of bimodal trials.
For the third prediction, we turned to the observer’s behavioral errors. The model predicts that the direction and magnitude of the observer’s behavioral errors are correlated with the location of either of the two peaks in a bimodal posterior. Fig. 2c shows the error in the observer’s simulated behavioral estimates of direction of motion (i.e., the difference between the MAP estimate and the true motion direction) as a function of the location of each peak in the posterior distribution. For the observer who uses both sources of information, there is a positive relationship between either peak location and the direction and magnitude of the behavioral error (Fig. 2c, left). That is, if the first peak is shifted clockwise relative to the true direction of motion (or the second peak is shifted clockwise relative to the opposite direction), the observed response is also shifted clockwise relative to the true direction. In contrast, for an observer whose representations are bimodal, but who nonetheless uses only velocity (and not orientation) signals, the relationship between the second peak location and behavioral errors has the opposite sign (Fig. 2c, right). This inverse relationship arises because the second (orientation) peak is “pulled” towards the velocity peak in the posterior. This gives rise to a negative circular correlation between the location of the second peak and the velocity estimate: when the velocity estimate shifts clockwise relative to the true stimulus, the second peak shifts towards it, counterclockwise. When the observer’s behavioral response is based on velocity signals alone, the behavioral response also has this inverse relationship with the second peak of the integrated posterior.
Thus, the observed relationship between the second peak location and behavioral errors should enable us to determine whether or not orientation signals are used in the observer’s estimates of direction of motion; in other words, whether the bimodality in brain signals is behaviorally relevant.
Finally, we turned to a high-noise regime, in which the level of noise associated with the observer’s measurements is high (note that this refers to noise in the neural signals, and not the additional noise due to fMRI recordings). This noise regime matches that of our follow-up behavioral experiment. The model predicts that under these conditions, the behavioral response distribution should become bimodal. Specifically, the model shows that if the observer uses both sources of information, then bimodality should become more pronounced under high levels of noise. Fig. 2d shows the expected distribution of behavioral responses across trials for this condition; that is, the probability of an estimated motion direction given the true direction of motion. Interestingly, we observed a striking bimodal response distribution across trials. In other words, the observer cannot always reliably tell apart the true and the opposite motion direction in the posterior distribution, and sometimes mistakes the opposite direction for the true one. Altogether, this means that if the task becomes sufficiently difficult, we should sometimes observe a behavioral response that matches the stimulus in its orientation but not its direction of motion. We additionally analyzed the relationship between the shape of the posterior distribution and the observer’s behavioral responses, but found that it would not allow us to further adjudicate between the models (Supplementary Fig. 4). For this reason, we focus exclusively on the behavioral response distribution here, as these data are most informative.
In sum, when the observer uses velocity and spatial orientation signals, the model makes the following predictions:

1. Uncertainty (posterior entropy) should be linked to behavioral variability.
2. The posterior distribution should be of bimodal shape (with peaks at the true and the opposite motion direction), both when averaged across trials and for a large portion of trials in trial-by-trial analyses.
3. Both peaks of the posterior should predict behavioral responses; that is, the direction and the magnitude of the observer’s errors.
4. High levels of uncertainty should result in a bimodal response distribution in behavior.
With these predictions in hand, we now turn to the experimental data to see if they hold true.
fMRI study
How do human observers represent and estimate motion direction? We ran an fMRI study and a follow-up behavioral experiment to address this question. In the fMRI study, we used random-dot kinematograms with 100% coherent motion. Because of relatively low levels of neural variability, we predicted that, behaviorally, this would correspond to the low-noise motion regime of the simulations. Participants viewed dots moving in a single direction in an annular window for 1.5 seconds. After a brief delay period, they reported the direction of motion of the stimulus by rotating a bar presented at fixation. Under these conditions, the task was relatively easy for the participants, whose behavior showed a mean absolute error of M = 6.18°, 95% CI = [5.72°, 6.62°]. For each trial of fMRI data, we decoded a probability distribution over motion direction from patterns of BOLD activity in visual areas V1, V2, V3, hV4 and hMT+ combined, using a probabilistic decoding technique^{7,17}. We took the most likely value of the decoded distribution as the decoded direction of motion, and its entropy reflected the degree of decoded uncertainty. Please note that the reliability of the two cues likely changes with stimulus parameters such as motion speed and dot size, so our results likely depend on the specific dot size and chosen speed of 7 deg/s.
To benchmark the decoding approach, we first tested the degree to which the decoded direction of motion matched the true motion direction of the stimulus (Fig. 3a). The decoded and true directions were significantly correlated, with a mean circular correlation coefficient across participants of r = 0.72, 95% CI = [0.62, 0.80], BF = \(2.82\times {10}^{5}\) (Bayes factors above 3.2 are usually taken to indicate substantial evidence, above 10 – strong, and above 100 – decisive evidence^{13}). We then tested the decoder’s assumptions about the covariance structure of the data. To the extent that the model assumptions match the true generative structure of the data, the decoded uncertainty should be correlated with the magnitude of the error in the decoded direction of motion. Indeed, we found this to be the case (BF > \({10}^{170}\)). Together, these findings indicate that the decoder provides a reasonable estimate of the aggregated sources of uncertainty in the data.
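Circular correlation between decoded and true directions can be computed with the standard Jammalamadaka–SenGupta estimator; a minimal sketch assuming numpy and radian-valued inputs:

```python
import numpy as np

def circ_corr(a, b):
    """Circular correlation (Jammalamadaka & SenGupta) between two
    arrays of angles in radians."""
    # circular means of each variable
    abar = np.angle(np.mean(np.exp(1j * a)))
    bbar = np.angle(np.mean(np.exp(1j * b)))
    # correlate the sine deviations from the circular means
    sa, sb = np.sin(a - abar), np.sin(b - bbar)
    return np.sum(sa * sb) / np.sqrt(np.sum(sa ** 2) * np.sum(sb ** 2))
```

Like its linear counterpart, the coefficient is near 1 when one set of angles tracks the other and near 0 for independent angles.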
To test the degree to which the algorithm also captured neural sources of uncertainty (as opposed to imprecision due to the fMRI measurements), we then turned to behavior. Specifically, we reasoned that a more precise neural representation in cortex should result in more precise behavior (as also quantified by our simulations, see Fig. 1b). To test this relationship and benchmark the degree to which the decoding technique was able to capture neural sources of uncertainty in particular, we first investigated the link between decoded uncertainty and behavioral variability across motion directions, using Bayesian hierarchical regression to estimate the within-subject effect of uncertainty on behavioral variability while accounting for individual differences between participants (Fig. 3b). The judgments of motion direction of our participants showed a classical oblique effect^{14,15,16}: observer responses were more variable at oblique compared to cardinal directions. Specifically, behavioral variability increased from 6.18° at cardinal to 9.42° at oblique directions (BF = \(2.1\times {10}^{15}\)). Importantly, decoded uncertainty also increased from cardinal to oblique directions of motion (entropy changed from 7.40 bits for cardinal to 7.49 bits for oblique, BF = \(1.6\times {10}^{13}\)), and this change in entropy across directions was significantly correlated with behavioral variability, b = 9.17, 95% HPDI = [4.73, 13.65], BF = \(8.54\times {10}^{5}\) (b is the regression coefficient). This indicates that across motion directions, the decoded distributions reflect the precision of the information contained in underlying neural activity.
Even when the stimulus is held constant, uncertainty varies on a trial-by-trial basis due to random fluctuations in cortical activity. Is there a relationship between uncertainty and behavior when between-stimulus variability is accounted for? We tested whether our approach captures trial-by-trial fluctuations in the fidelity of the cortical representation by quantifying the effect of uncertainty on behavioral variability in a hierarchical Bayesian regression model that accounted for the oblique effect in addition to between-subject differences. Decoded uncertainty again reliably predicted behavioral variability (b = 5.93, 95% HPDI = [1.98, 9.96], BF = 157). Control analyses showed that these results cannot be explained by mean BOLD amplitude, head motion, gaze fixation position, or variability of gaze fixation positions (Supplementary Fig. 5). This suggests that our decoder captures not only between-stimulus uncertainty, but also fluctuations in the quality of the underlying cortical representation on a trial-by-trial basis (Fig. 3c).
Having established that the decoded distributions are meaningful and capture the degree of uncertainty in the underlying cortical representation, we then turned to the shape of the distribution. To what degree do the decoded distributions show evidence of an advantageous estimation process in which velocity and orientation signals are combined for estimates of motion direction? Our simulations suggest that for such an advantageous decision process, the mean decoded posterior across trials should be bimodal, with a larger peak on the true direction and a smaller peak at the opposite direction of motion (Fig. 2a). Indeed, this is what we found when we analyzed the shape of the decoded posterior (Fig. 4a). When averaged across trials, the decoded distributions had peaks located around the true (M = 0.01°, 95% CI = [−0.13, 0.10]) and opposite (M = 178.69°, 95% CI = [177.04, 180.00]) motion directions for 16 out of 18 participants (for the remaining two participants, the average posterior was unimodal and located at the true direction of motion). This similarly held for individual visual areas, including hMT+ (Supplementary Fig. 6). That is, on average, the decoded posterior is bimodal, matching our predictions.
Our decoding approach further allowed us to more formally test for the presence of bimodality and demonstrate that the same pattern is observed on a trial-by-trial basis. This is important because, when evaluated across trials, the bimodality in the averaged decoded posterior might result from an aggregation of unimodal trials with peaks concentrating either at the true or at the opposite direction. To address this concern, we quantified the shape of the posterior for each individual trial as a mixture of two von Mises components (basis functions) and estimated the location of each component (i.e., its mean). Fig. 4b shows how these locations are distributed across trials (their joint distribution). This pattern of results is qualitatively very similar to what is predicted when the posterior reflects both velocity and spatial orientation signals, and rather distinct from when it would be based on orientation or velocity signals alone (Supplementary Fig. 7). To quantify these results, we estimated the number of clusters in the observed pattern of location combinations. We modeled the joint distribution of the peak locations with models that assumed one to three independent clusters of peaks (see Methods). For example, a model with a single cluster assumed that the peak locations on all trials are similar (belong to a single bivariate von Mises distribution). The best fit was provided by a two-cluster model (∆WAIC against the single-cluster model = 9741, \({BF} > {10}^{20}\); adding a third cluster did not significantly improve the fit, ∆WAIC = −0.2), with the first cluster corresponding to unimodal trials (the two peaks overlap and are located at the true direction) and the second cluster corresponding to bimodal trials (a larger peak at the true and a smaller peak at the opposite direction). This bimodality in the posterior distribution was observed for a significant portion of the trials (52% were allocated to the second cluster).
These results match the model predictions (Fig. 2b) and show that the bimodality in the decoded posterior is not an artifact of aggregation, but rather a property of the single-trial posterior. This provides further support for our theoretical predictions: the presence of orientation signals leads to a bimodal posterior distribution both on average and in trial-by-trial analyses.
Next, we turned to the third prediction of the Bayesian observer model: if the two peaks of the decoded distribution are relevant for behavior (which would suggest that the observer uses spatial orientation signals to estimate direction of motion), then there should be a relationship between peak location and the direction and magnitude of the error in the participant’s behavioral response. More specifically, the relationship should be positive for either peak of the distribution. Thus, when the first peak is shifted clockwise relative to the true direction of motion (or the second peak is shifted clockwise relative to the opposite direction), the observed response should also be shifted clockwise relative to the true direction. Testing this prediction revealed that the location of either peak is indeed positively and reliably correlated with the behavioral errors of our participants (the peak located closer to the true direction: b = 0.63, 95% HPDI = [0.40, 0.88], BF = \(2.46\times {10}^{5}\); the peak located closer to the opposite direction: b = 0.30, 95% HPDI = [0.10, 0.50], BF = 19.05; Fig. 3c). Control analyses showed that these results cannot be explained by the direction in which saccades are made (Supplementary Fig. 8). Thus, the decoded direction of motion (i.e., the location of the first peak) reliably predicts the trial-by-trial behavioral responses of our participants. Crucially, the second peak of the decoded distribution also appears to be behaviorally relevant, providing further support for the hypothesis that human observers use orientation signals when estimating direction of motion.
Follow-up psychophysical experiment
We then tested the final prediction of our model: when uncertainty is high, the behavioral estimates of the participants should also follow a bimodal distribution, much like their internal representation. In a follow-up behavioral experiment, we increased levels of uncertainty using stimuli in which only 18% of dots moved in a single direction while all other dots moved in random directions. The conditions were otherwise identical to the fMRI study, and the participants performed the same task. We first confirmed that our manipulation increased uncertainty. Indeed, participants performed on average much worse than in the main study (a mean absolute error of M = 41.02°, 95% CI = [33.42°, 49.44°] in this experiment versus M = 6.18°, 95% CI = [5.72°, 6.62°] in the main study, Supplementary Fig. 9; but note that the mean error estimates are less informative for this experiment because of the shape of the response distribution, as we discuss next). Further analysis of the behavioral data revealed clear bimodality in the behavioral estimates of our participants. Their responses clustered not only around the true direction (the main diagonal in Fig. 5a, b), but also around the opposite direction of motion (the dashed lines parallel to the main diagonal in Fig. 5a, b). To quantify the degree of bimodality in the behavioral responses, we fitted and then compared two models equivalent to the descriptive models we used in the analysis of the decoded posterior shape: 1) a unimodal (von Mises) distribution centered on the true direction, and 2) a bimodal mixture distribution (two von Mises components) with peaks centered at the true and the opposite direction. Both models additionally included a uniform component to account for random guesses. The models were first fitted to each observer’s responses separately, and the results were then combined across participants. The bimodal model provided a significantly better fit than the model consisting of a single peak (group ∆BIC = 322).
This indicates that across participants, the distribution of behavioral responses is indeed bimodal, with peaks centered at the true and opposite motion directions. That is, for higher levels of uncertainty, the participants reported seeing dots moving in the direction opposite to the true direction of motion. Altogether, this shows that human participants use spatial orientation signals when estimating motion direction, leading to a surprising bimodality not only in the representations decoded from early visual cortex, but also in behavior.
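The model comparison described above can be sketched as follows. This is a simplified stand-in, not the authors’ exact fitting code: the parameterization (shared concentration across peaks), the optimizer settings, and the synthetic data are our own assumptions.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import vonmises

def neg_log_lik(params, errors, bimodal):
    """Negative log-likelihood of response errors (radians, relative to the
    true direction) under a von Mises mixture plus a uniform guess component."""
    if bimodal:
        kappa, w_true, w_guess = params
        w_opp = 1.0 - w_true - w_guess        # weight of the opposite-direction peak
        if w_opp < 0:
            return 1e10                        # penalize invalid mixtures
        like = (w_true * vonmises.pdf(errors, kappa)
                + w_opp * vonmises.pdf(errors - np.pi, kappa)
                + w_guess / (2 * np.pi))
    else:
        kappa, w_guess = params
        like = (1 - w_guess) * vonmises.pdf(errors, kappa) + w_guess / (2 * np.pi)
    return -np.sum(np.log(like + 1e-300))

def fit_bic(errors, bimodal):
    """Fit one observer's errors and return the BIC of the chosen model."""
    x0 = [2.0, 0.5, 0.1] if bimodal else [2.0, 0.1]
    bounds = [(0.01, 100.0)] + [(0.0, 1.0)] * (len(x0) - 1)
    res = minimize(neg_log_lik, x0, args=(errors, bimodal), bounds=bounds)
    return 2.0 * res.fun + len(x0) * np.log(len(errors))

# Synthetic observer: 70% true-direction peak, 20% opposite peak, 10% guesses.
rng = np.random.default_rng(0)
errors = np.concatenate([vonmises.rvs(3, size=700, random_state=rng),
                         vonmises.rvs(3, size=200, random_state=rng) + np.pi,
                         rng.uniform(-np.pi, np.pi, 100)])
errors = (errors + np.pi) % (2 * np.pi) - np.pi
```

On data with an opposite-direction cluster, the bimodal model’s BIC comes out lower despite its extra parameter, mirroring the group ∆BIC comparison reported above.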
Discussion
What neural computations enable human observers to infer the direction in which an object is moving? Here, we argue that the nervous system uses not only velocity but also spatial orientation signals to estimate motion direction^{2}. We implemented this hypothesis in a Bayesian observer model and tested its predictions using a combination of psychophysics and fMRI. Using a generative model-based fMRI analysis technique^{7,17}, we decoded probability distributions of motion direction from activity in areas V1–V4 and hMT+. Corroborating the predictions of the Bayesian observer model, we discovered that the decoded distribution of motion direction has a bimodal shape. Moreover, the two peaks of the distribution predicted the magnitude and direction of the participants’ behavioral errors. In a follow-up behavioral experiment, we furthermore showed that this bimodal shape is also observed in the distribution of the participants’ behavioral responses when analyzed across trials, indicating that the participants sometimes reported the direction of motion opposite to the true one, as predicted by the Bayesian model. Altogether, this suggests that the nervous system uses not only velocity signals, as assumed by dominant models of motion perception (e.g.,^{18}), but also spatial orientation cues to infer motion direction.
It is interesting that the decoded direction of motion predicted not only the presented direction of dot motion with relatively high accuracy across the 360-degree motion feature space, but also the subjective judgments of the observer, irrespective of the presented stimulus. That is, the direction decoded from cortical activity predicted the participants’ behavioral errors on a trial-by-trial basis. This suggests that the decoded representations are behaviorally relevant and not an artifact of the fMRI measurements. This relationship between cortical activity and behavior is reminiscent of previous neurophysiological results showing a correlation between neural activity and behavioral choices (“choice probability”^{19,20,21,22,23}) – a link that is often taken to indicate that the signals are used by the animal to determine its decision. Our ability to predict the participants’ judgments surpasses that of previous fMRI findings, and is likely due to our use of the TAFKAP decoder, which improves decoding precision by estimating not only voxel tuning properties, but also the voxel (co)variance induced by the fMRI measurements and neural variability^{7,24}.
Previous work has shown that the degree of imprecision in the cortical representation of orientation^{17,25,26} and location^{27} can reliably be decoded from fMRI activity patterns. The current study extends these earlier findings by showing that the fidelity of the cortical representation of motion can also be successfully characterized with fMRI. That is, the decoder’s trial-by-trial estimates of uncertainty predicted behavioral variability – a measure of perceptual imprecision. This illustrates the versatility of the probabilistic decoding approach and suggests that the neural code for uncertainty may be similar across different visual features, such as motion, location and orientation.
As integrating fully correlated signals would make little sense from a computational standpoint (no additional information would be gained from doing so), our Bayesian model assumes that orientation and velocity signals are conditionally independent. While full independence is likely too strong an assumption for visual cortical neurons (e.g., due to common retinal input), it seems nonetheless likely that a good fraction of the noise added by post-retinal stages of processing will not be shared – for example, because orientation and velocity signals are processed in segregated visual pathways^{9,11} and by neurons with different spatiotemporal receptive fields^{28,29,30}. Computationally, as long as some of the noise is independent, the estimate will be better when signals are combined. Indeed, our results suggest that human participants use such integration strategies, even when the cues are likely partially correlated.
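The computational point can be made explicit with the standard cue-combination result for two conditionally independent Gaussian cues (a textbook identity, not a derivation from the present study). With orientation- and velocity-based estimates \(\hat{s}_o\) and \(\hat{s}_v\) carrying independent noise of variance \(\sigma_o^2\) and \(\sigma_v^2\):

```latex
\hat{s} \;=\; \frac{\sigma_o^{-2}\,\hat{s}_o + \sigma_v^{-2}\,\hat{s}_v}{\sigma_o^{-2} + \sigma_v^{-2}},
\qquad
\sigma^2 \;=\; \left(\sigma_o^{-2} + \sigma_v^{-2}\right)^{-1} \;\le\; \min\left(\sigma_o^2,\, \sigma_v^2\right).
```

The combined estimate is thus never less precise than the better single cue; even when only part of the noise is independent, some reduction in variance remains.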
It is well known that fMRI signals reflect many forms of (correlated) noise, in addition to the correlated sources of noise in the orientation and velocity signals. For example, the amount of noise in the orientation and velocity signals could fluctuate jointly due to factors such as attention or alertness. Voxel responses could also be correlated due to non-neural sources of noise associated with the fMRI measurements themselves, such as participant head motion. However, none of these correlations can explain the observed relationship between the locations of the peaks in the decoded posterior and the direction and magnitude of behavioral errors (Supplementary Figs. 10 and 11). That is, for both scenarios, the predicted correlation between the second peak location and behavioral errors would be negative, much like the pattern shown in Fig. 2c (“velocity-only readout”). This is opposite to what we find in the data, further supporting the conclusion that orientation signals are used by the observers in their direction judgments.
Previous behavioral work has also suggested that the nervous system might use spatial orientation signals to determine motion direction^{2,5,6,31,32,33,34,35}. For example, prolonged exposure to moving dot stimuli creates aftereffects similar to those produced by static gratings^{33}, and removing information about streaks increases thresholds for motion detection^{5}. Other studies have highlighted potential neural mechanisms that could give rise to motion streak sensitivity^{9,10,11,29,36,37,38,39,40} or found preliminary evidence to suggest that streak-based signals are represented in the human visual cortex^{41}. Our work extends these previous findings in several ways. Our normative model explains why observers should use both velocity and spatial orientation cues, as integrating these signals reduces uncertainty and improves direction estimates. The model furthermore made a number of quantitative and qualitative predictions that we tested in experiments. This revealed that motion direction is represented in cortex as a bimodal probability distribution – a level of complexity that stands in sharp contrast to previously observed probabilistic representations, such as those for orientation and location^{17,25,26,27}. No less importantly, we discovered that the shape of the bimodal distribution is linked to the participants’ behavioral estimates in various ways. Altogether, this suggests that the human visual system uses spatial orientation signals for determining direction of motion and reveals the hidden complexity of probabilistic feature representations in cortex.
We also considered several alternative strategies for judging direction of motion, in addition to the Bayesian observer and velocity-only models (Supplementary Fig. 12). The first alternative model assumed that the observer only uses velocity signals to infer motion direction, with an arbitrary constant bias away from the velocity-based estimate. However, this strategy cannot capture bimodality in the decoded posterior distribution, nor does it explain the bimodal behavioral response distribution as observed here. The second model assumed that observers combine spatial orientation and velocity signals while ignoring uncertainty. That is, the response is a weighted average of the velocity-based and orientation-based estimates, where the weights are assigned randomly. While this model does capture the bimodal behavioral response distribution for high levels of orientation noise, it wrongly predicts a very wide behavioral response distribution when the precision of velocity and orientation signals is, respectively, high and low. This is inconsistent with behavioral data from previous studies showing that observers perform relatively well at slow motion speeds, when orientation information is presumably very noisy or even absent (e.g., ^{5}). The third model assumed that observers randomly switch between the orientation and motion likelihoods when making the decision^{42}. This model predicts that bimodality is always observed, regardless of the degree of uncertainty. Critically, this is not what we observe in our data, where bimodality in the behavioral response distribution clearly depends on stimulus reliability. Finally, we considered a strategy based on the motion aftereffect. Here, the hypothesis is that observers experience and report aftereffects after viewing the stimulus, which results in behavioral responses that are opposite to the true direction of motion.
However, such a model would predict stronger aftereffects with greater motion coherence in the stimulus (and thus lower uncertainty), because the strength of aftereffects is positively related to signal strength (e.g., ^{43,44}). Therefore, greater bimodality in behavior would be expected with greater certainty, which is the opposite of our results. In sum, none of the alternative models considered can explain the full scope of our findings.
While our data suggest that observers combine velocity and orientation cues when inferring motion direction, we do not argue that these cues are necessarily integrated optimally – that is, that each estimate is perfectly weighted by its uncertainty. We believe it will be difficult, if not impossible, to argue and test for optimality in this particular situation. The main reason for this is that it is difficult to infer the likelihoods for the spatial orientation and velocity-based signals alone. That is, in a typical cue integration experiment, the likelihood of each cue is manipulated by the experimenter and therefore (roughly) known. This makes it possible to predict what the behavioral response should be for both the optimal integration strategy and alternative strategies that ignore uncertainty. For the integration problem considered here, however, the likelihoods are not known a priori to the experimenter, and would have to be inferred from brain data. An fMRI voxel, moreover, reflects the aggregate response of many neural populations, where the responses from the individual populations are unknown. Without knowledge of the individual signals for spatial orientation and velocity, their likelihoods cannot be calculated, which makes it impossible to predict and compare between the Bayesian and alternative integration strategies.
What neural mechanisms might underlie the observed bimodal distribution in visual cortex? Studies in non-human primates^{9,10,29,36,37}, mice^{8} and cats^{10} have shown that many orientation-tuned neurons in primary visual cortex respond to dots moving parallel to their spatial orientation receptive field. It seems likely that similar neural response properties could have led to the bimodal posterior distribution observed here. Interestingly, many direction-selective neurons in V1 are also tuned somewhat bimodally, with strong responses to one motion direction and a weaker response to the opposite direction^{9,10}. To address whether these tuning properties could similarly give rise to a bimodal posterior distribution, we simulated neural population activity using a realistic range of direction selectivities (see Supplementary Fig. 13). We found that the posterior distribution decoded from the obtained population response is always unimodal and never bimodal. This strengthens the hypothesis that the empirically observed posteriors reflect the combined responses of direction-selective neurons and orientation-tuned cells whose spatial orientation receptive field runs parallel to the presented motion direction.
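A toy version of this kind of simulation is sketched below. It is not the paper’s Supplementary analysis: the tuning shape, the direction-selectivity ratio (`dsi`), the gain, and the Poisson encoding model are all our own assumptions, used only to show how a posterior is decoded from a population of partially direction-selective neurons.

```python
import numpy as np

def tuning(s, pref, kappa=2.0, dsi=0.5, gain=20.0):
    """Direction tuning with a strong lobe at `pref` and a weaker lobe at the
    opposite direction (dsi < 1 scales the anti-preferred response). All
    parameter values are illustrative assumptions."""
    return gain * (np.exp(kappa * np.cos(s - pref))
                   + dsi * np.exp(kappa * np.cos(s - pref - np.pi)))

rng = np.random.default_rng(1)
prefs = np.linspace(0, 2 * np.pi, 64, endpoint=False)  # preferred directions
s_true = np.pi / 3
counts = rng.poisson(tuning(s_true, prefs))            # one trial of spike counts

# Posterior over a direction grid under the Poisson encoding model, flat prior:
# log p(s | r) = sum_i [ r_i * log f_i(s) - f_i(s) ] + const.
grid = np.linspace(0, 2 * np.pi, 360, endpoint=False)
rates = tuning(grid[:, None], prefs[None, :])          # (directions, neurons)
log_post = counts @ np.log(rates).T - rates.sum(axis=1)
post = np.exp(log_post - log_post.max())
post /= post.sum()
decoded = grid[np.argmax(post)]
```

With these assumed parameters the decoded posterior peaks near the true direction; whether a secondary mode survives depends on the tuning parameters, which is the question the paper’s simulation (Supplementary Fig. 13) addresses with realistic selectivities.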
Interestingly, we observed bimodal posterior distributions throughout visual cortex (i.e., areas V1, V2, V3, hV4, and hMT+, see Supplementary Fig. 6), with no clear differences between areas. It is important to keep in mind, however, that the signal-to-noise ratio is likely not constant across these regions, which makes it difficult to draw firm conclusions from this finding. The signal-to-noise ratio also makes it difficult to test which cortical areas integrate the velocity and spatial orientation signals, as a unimodal posterior at the level of voxels does not necessarily imply unimodality in the underlying populations (Supplementary Fig. 14). Notwithstanding these considerations, it does seem likely that all the areas investigated here should show at least some degree of bimodality, as they all contain orientation- and velocity-sensitive neurons (e.g., ^{11,28}).
Our stimulus consisted of dots moving at 7 degrees of visual angle per second. Interestingly, this speed roughly matches that of optic flow in the natural environment. That is, for a person with eyes 1.5 m above the ground who is walking at 1.4 m/s (the average walking speed), the optic flow in the ground plane will be 7 deg/s at 5 m to the left and right. This suggests that the observed motion streak signals may be highly relevant for the encoding of optic flow and other forms of real-world motion. To capture more complex real-world scenarios, our Bayesian model would have to be extended to include, for example, mechanisms of causal inference^{45}. This would enable the observer to determine whether or not the motion and orientation signals share a common cause and should be integrated or rather segregated, much like earlier mechanistic models have proposed^{2}.
Our Bayesian decoding approach differs from previous methods in that we explicitly describe the generative structure of the data – that is, we model the effects of each stimulus on the cortical response. To infer the range of motion directions that could have caused a given cortical response (i.e., the posterior probability distribution), we simply invert this model using Bayes’ rule (see Methods). This contrasts strongly with other decoding methods, such as SVM^{46} or IEM^{47}, which focus merely on the response’s single most likely interpretation. This methodological difference may explain why previous fMRI decoding studies^{46,48,49,50} did not observe bimodality in the across-trial histogram of decoded motion directions: under low levels of uncertainty, the best-guess estimate usually falls within a narrow range of the true direction of motion, and hardly ever on the opposite motion direction. Altogether, our findings show how characterizing the full probability landscape can improve our understanding of the computational mechanisms of cortical feature extraction.
Our work furthermore demonstrates the added value of visualizing cortical representations for understanding behavior. Crucially, while the influence of spatial orientation signals on direction estimation remained hidden in the participants’ behavioral estimates, their effects were uncovered via the decoding of activity patterns in visual cortex. That is, while the behavioral histograms showed bimodality for high-uncertainty conditions only, the decoded probabilistic cortical motion representation nonetheless revealed that orientation signals provide information about motion direction even when uncertainty is relatively low. These results point to the veiled intricacy of perceptual decision-making in a direction estimation task.
Furthermore, our results suggest that even at the earliest levels of cortical processing, multiple sources of evidence are combined to better represent the visual environment. It is well known that the visual system integrates multiple cues to infer mid-level object properties, such as depth or object shape^{1,4}. At first glance, motion direction estimation may seem like a straightforward task, devoid of the need for additional evidence – after all, why would direction-sensitive neurons not provide sufficient information for direction estimation (see, e.g., ^{51}, for a review)? However, as we show here, even for simple tasks the brain appears to utilize additional cues, such as spatial orientation, to reduce uncertainty. This highlights that even simple tasks might be more intricate from the brain’s perspective, and that cue integration may be a ubiquitous feature of the human visual system.
Methods
Participants
This study complies with all relevant ethical regulations and was approved by the local ethics committee (Commissie Mensgebonden Onderzoek Regio Arnhem-Nijmegen, The Netherlands; Protocol CMO2014/288). Participants provided written informed consent before participation and received monetary compensation. 18 participants (aged 18–32, ten female, based on self-report) with normal or corrected-to-normal vision participated in the study. We did not test for differences in effect across gender, as it is unlikely that this factor underlies differences in low-level visual cortical processing.
fMRI data acquisition
MRI data were acquired using a Siemens 3 T MAGNETOM PrismaFit MR scanner with a 32-channel head coil located at the Donders Center for Cognitive Neuroimaging. For each participant and each session, a high-resolution T1-weighted magnetization-prepared rapid gradient-echo anatomical scan (MPRAGE, FOV 256 × 256, 1-mm isotropic voxels) was collected at the start of the session. Functional imaging data were acquired using a T2*-weighted gradient-echo echo-planar imaging sequence covering the whole brain (68 slices, TR 1500 ms, TE 38.60 ms, FOV 210 × 210, slice thickness 2 mm, in-plane resolution 2.019 × 2.019 mm).
Experimental design and stimuli
fMRI experiment
The fMRI experiment was run using an ASUS GL502V laptop (OS Kubuntu 17.04) connected to a luminance-calibrated EIKI LC-XL100 projector (resolution 1024 × 768 pixels, refresh rate 60 Hz). Participants viewed the visual display through a mirror mounted on the head coil. The stimuli were generated, and the experiment controlled, using MATLAB and the Psychophysics Toolbox^{52,53,54}.
The stimulus consisted of dots coherently moving in a pseudorandomly chosen direction (i.e., to ensure an even sampling of directions in each run, 18 evenly spaced directions were selected from the full 360-deg range with a random offset and were presented in random order during the run) within a circular aperture centered at the fixation point (inner radius 1.5 degrees of visual angle, dva; outer radius 7.5 dva; dot contrast reduced to 0 over the outer and inner 0.5 dva radius of the aperture). Each dot was white and had a Gaussian envelope with SD = 0.03 dva. There were 530 dots in total, resulting in an average density of approximately 3 dots/dva^{2}. The dot density was uniform within the aperture. Each dot moved at 7 dva/s and had a limited lifetime of 10 to 14 frames (167 to 233 ms, randomly chosen for each dot). At the end of a dot’s lifetime, it was pseudorandomly repositioned in such a way that uniform dot density was maintained.
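The quoted density follows directly from the aperture geometry; as a quick sanity check (our own arithmetic, ignoring the 0.5-dva contrast ramps at the aperture edges):

```python
import math

inner, outer = 1.5, 7.5                  # aperture radii (dva)
area = math.pi * (outer**2 - inner**2)   # annulus area: pi * (7.5^2 - 1.5^2) dva^2
density = 530 / area                     # dots per square dva
print(round(area, 1), round(density, 2)) # → 169.6 3.12
```

This reproduces the stated figure of approximately 3 dots/dva².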
Participants were required to maintain fixation on a bull’s-eye target (diameter 0.5 dva) throughout the experiment. Each run consisted of an initial fixation period (12 s), followed by 18 trials (12.5 s each, with a 4-s intertrial interval) and a final fixation period (12 s). Each trial began with the disappearance of the fixation target, which reappeared after 100 ms. After another 400 ms, the stimulus was presented and remained on the screen for 1500 ms. This was followed by a 6-s fixation interval, after which a black line (length 0.9 dva) appeared at fixation (Supplementary Fig. 15). The participants reported the direction of motion of the dots by rotating this line. They did this by pressing the upper buttons on a Current Designs HHSC-2×4-C fMRI response pad with the index fingers of the right and left hands. The response window was 4.5 s in duration, and the line began to dim after 3.5 s to indicate the approaching end of this window. Participants received no trial-by-trial feedback about the accuracy of their judgments.
The participants completed 39–49 stimulus runs during three experimental sessions on separate days. Before the experiment, the participants additionally completed a 30-minute training session outside the scanner to ensure that they understood the task. Each scan session also included two visual localizer runs, which were used to select voxels that responded to the retinotopic location of the stimulus. The localizer stimulus consisted of moving dots presented within a circular aperture (described by the same parameters as the main experiment; it did not include the retinotopic area in which the response bar appeared). The dots were presented in seven 12-s intervals (“stimulus intervals”), interleaved with fixation intervals of equal duration. During stimulus intervals, dot motion direction changed every 1.5 s, resulting in 8 directions of motion per interval. The 8 directions were chosen pseudorandomly in the same way as during the main task. To ensure that participants paid attention to the localizer stimulus, they were asked to press a response button when the stimulus briefly dimmed to 50% contrast. Dimming events lasted 500 ms and appeared at random intervals, with 2 to 7 seconds between events.
Retinotopic maps of the visual cortex were acquired in a separate scanning session using conventional retinotopic mapping procedures^{55,56,57}. To determine the cortical boundaries of hMT+, we used two functional localizers based on a combination of approaches from previous studies^{50,58,59,60}. Each localizer was repeated 3 to 7 times, either within a separate session or combined with the retinotopy session. For the first localizer, the participants viewed coherently moving dots presented in seven 12-s intervals, interleaved with seven 12-s intervals in which randomly moving dots were presented. For the second localizer, we contrasted dots moving inwards or outwards from the fixation point (an optic flow pattern) with a static dot pattern, again presented in interleaved fashion with 12-s intervals. For both localizers, the dots had the same parameters as in the main fMRI experiment (i.e., dot color: white; Gaussian dot envelope with SD = 0.03 dva; 530 dots; uniform dot density of approximately 3 dots/dva^{2}; each dot moving at 7 dva/s; limited lifetime of 10 to 14 frames (167 to 233 ms), randomly chosen for each dot). The dots were presented within an aperture with a radius of 8.7 dva and no inner window (dot contrast reduced linearly to 0 over the 0.5 dva outer radius of the aperture). During the coherent motion presentation (first localizer), the dots’ direction changed every 1.5 s (8 evenly spaced directions were selected from the full 360-deg range with a random offset, and were presented in random order for each 12-s interval). During the random motion presentation, each dot’s direction was selected randomly. For the optic flow pattern (second localizer), motion direction (inward or outward) changed every 1.5 s. For the static dot pattern, the dots were generated using the same parameters as for the other localizers but did not move. The dot pattern was generated anew every 1.5 s.
For both localizers, the same attention probes as for the within-session localizer were used: participants were asked to press a response button when the stimulus briefly dimmed to 50% contrast (dimming events lasted 500 ms and appeared at random intervals, with 2 to 7 seconds between events).
Follow-up psychophysical experiment
The follow-up behavioral study was run on a luminance-calibrated LaCie Electron 22blue II CRT display, using the same laptop and software as in the fMRI experiment. A chinrest was used to stabilize the participant’s head and reduce motion. The experiment was run in a dark, soundproof room with the display as the only light source. The task, run and trial structure, and stimuli were the same as in the main fMRI experiment, except that only 18% of the dots (randomly selected) on each trial moved in a single direction, while the directions of the remaining dots were distributed uniformly (i.e., randomly sampled from a uniform distribution). Participants responded using the left and right arrow keys on a keyboard. Each participant completed 12–20 runs within a single session.
Data preprocessing
Behavioral data
Cardinal biases (Supplementary Fig. 16) were removed from the behavioral data by fitting four 4th-degree polynomials to each observer’s behavioral errors as a function of motion direction. Specifically, we first determined the direction of the bias by fitting two models that described either attraction to or repulsion from the cardinal directions. For the model describing attraction biases, the behavioral errors are expected to be close to zero at cardinal directions; hence, trials were split into four 90-degree bins centered at the cardinal directions ({0, 90, 180, 270} degrees). Dividing trials into bins enabled us to model the discontinuity that arises from repulsive biases around the cardinals (see subject B in Supplementary Fig. 16 for an example; see also^{17,61}). For each bin, we then fitted a regression model with 4th-degree orthogonal polynomials of motion direction (computed relative to the bin center) as independent variables and behavioral error as the dependent variable, using the GAMLSS package in R^{62}. To account for the heterogeneity of responses across motion directions (e.g., the oblique effect), the standard deviation of the behavioral errors was allowed to vary with distance to the polynomial’s center. Accordingly, for each bin, we predicted the mean and standard deviation of participant errors as a function of the distance to the bin center. For the repulsion biases, on the other hand, the errors are expected to be close to zero at oblique directions. Hence, trials were split into four 90-degree bins centered at the oblique directions ({45, 135, 225, 315} degrees), but all remaining steps were identical. Both polynomial models were fitted to the data of each bin using maximum likelihood estimation. To remove the bias, the better-fitting model (i.e., either attraction or repulsion bias), as indicated by its likelihood, was selected. We used the residuals of these fits in subsequent analyses.
We verified that our conclusions remain the same if no bias correction is applied. Errors that were larger than ±3 times the predicted standard deviation (obtained from the regression models described above) were considered outliers and not included in subsequent analyses (0.7% of all trials).
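A simplified version of this bias-correction and outlier procedure is sketched below. Unlike the paper’s GAMLSS fits, this sketch uses plain least-squares polynomials and a single global residual SD rather than a direction-dependent SD; the synthetic bias is also our own assumption.

```python
import numpy as np

def debias(directions, errors, centers=(0, 90, 180, 270), degree=4):
    """Fit a 4th-degree polynomial of the error against the signed distance
    to the nearest bin center (one fit per 90-degree bin); return residuals
    and a +/-3 SD outlier mask (simplified: one global SD)."""
    residuals = np.full(errors.shape, np.nan)
    for c in centers:
        dist = (directions - c + 180.0) % 360.0 - 180.0   # signed distance to center
        in_bin = (dist >= -45.0) & (dist < 45.0)          # each trial in exactly one bin
        coefs = np.polynomial.polynomial.polyfit(dist[in_bin], errors[in_bin], degree)
        residuals[in_bin] = (errors[in_bin]
                             - np.polynomial.polynomial.polyval(dist[in_bin], coefs))
    keep = np.abs(residuals) <= 3.0 * residuals.std()
    return residuals, keep

# Synthetic check: a smooth direction-dependent bias is largely removed.
rng = np.random.default_rng(2)
dirs = rng.uniform(0, 360, 2000)
errs = 8.0 * np.sin(np.deg2rad(4.0 * dirs)) + rng.normal(0, 3.0, 2000)
resid, keep = debias(dirs, errs)
```

After correction, the residual spread approaches the underlying noise level, and only the tail of extreme errors is flagged for exclusion.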
fMRI data
Functional images were motion-corrected using FSL’s MCFLIRT^{63} and passed through a high-pass temporal filter with a cutoff period of 50 s to remove slow drifts in the BOLD signal. Residual motion-induced fluctuations in the BOLD signal were removed through linear regression, based on the alignment parameters generated by MCFLIRT. Functional volumes were aligned to an unbiased within-subject anatomical template, which was created from the participant’s anatomical scans obtained in each scanning session using FreeSurfer’s longitudinal processing stream^{64,65,66}.
Regions of interest (ROIs) were defined using standard retinotopic procedures^{55,56,57} (visual areas V1, V2, V3AB, and hV4) and a functional localizer (hMT+). Specifically, for each individual participant, hMT+ was delineated manually on the inflated cortical surface as the area that included voxels responding more strongly to both 1) moving (i.e., optic flow patterns) rather than static dots (p < .05, FDR-corrected), and 2) coherent rather than random motion (p < .05, FDR-corrected; see Supplementary Fig. 17 for an example participant). Unless otherwise specified, individual ROIs were combined into a single ROI for the main analyses.
Voxels that responded to the retinotopic location of the stimulus were selected based on the within-session stimulus localizer. Specifically, within the native space of each participant, we selected all voxels within the ROI that were activated by the within-session stimulus localizer at a lenient threshold of p < .01 (uncorrected). The BOLD response of each voxel and time point within a given trial was z-normalized using the corresponding time points of all trials within the run. Activation patterns for each trial were defined by averaging together the first 4.5 s (3 TRs) of each trial, after adding a 3-s (2 TRs) temporal shift to account for the hemodynamic delay (Supplementary Fig. 18). Control analyses verified that the results were similar for individual visual areas (Supplementary Fig. 6), and not strongly affected by changes in the number of voxels selected for analysis (Supplementary Fig. 19). We furthermore confirmed that the selected time window was close to the peak of the hemodynamic response function (Supplementary Fig. 18).
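In code, this trial-pattern extraction might look as follows (a sketch; the array layout, the onset bookkeeping, and all names are our own assumptions):

```python
import numpy as np

def trial_patterns(bold, trial_onsets, shift_trs=2, avg_trs=3):
    """bold: (timepoints, voxels) array for one run; trial_onsets: onsets in TRs.
    For each trial, take the window starting 2 TRs (3 s) after onset to account
    for the hemodynamic delay, z-score each voxel and within-trial time point
    across the run's trials, then average the first 3 TRs (4.5 s)."""
    win = np.stack([bold[t + shift_trs : t + shift_trs + avg_trs]
                    for t in trial_onsets])          # (trials, timepoints, voxels)
    z = (win - win.mean(axis=0)) / win.std(axis=0)   # normalize across trials
    return z.mean(axis=1)                            # one pattern per trial

# Toy run: 220 TRs, 100 voxels, 18 hypothetical trial onsets.
rng = np.random.default_rng(3)
bold = rng.normal(0, 1, size=(220, 100))
onsets = np.arange(8, 200, 11)
patterns = trial_patterns(bold, onsets)
```

By construction, each voxel’s pattern values have zero mean across the run’s trials, so only trial-to-trial variations around the run average enter the decoder.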
For the control analyses involving head motion (Supplementary Fig. 5), we calculated, for each participant and at each time step, the square root of the sum of squares (i.e., the Euclidean norm) of the temporal derivatives of the realignment parameters as estimated by the motion correction algorithm; this quantity reflects the amount of head motion per time step. To obtain a measure of head motion per trial, the data were subsequently averaged across the trial’s first 4.5 s, similar to our main analysis.
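A minimal version of this head-motion summary (our own sketch; the six realignment parameters per TR would come from MCFLIRT):

```python
import numpy as np

def motion_per_trial(realign, trial_onsets, avg_trs=3):
    """realign: (timepoints, 6) realignment parameters; returns one value per
    trial: the Euclidean norm of the parameters' temporal derivative, averaged
    over each trial's first 3 TRs (4.5 s)."""
    deriv = np.diff(realign, axis=0, prepend=realign[:1])  # first frame: zero motion
    framewise = np.linalg.norm(deriv, axis=1)
    return np.array([framewise[t : t + avg_trs].mean() for t in trial_onsets])

# Toy example: a steady 0.1-per-TR drift in one parameter.
params = np.outer(np.arange(50), [0.1, 0, 0, 0, 0, 0])
motion = motion_per_trial(params, [5, 20])
```

A perfectly still head yields zeros; the steady drift above yields a constant framewise value of 0.1.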
Decoding analysis
We used a generative model-based method for estimating the degree of uncertainty in the cortical representation. This method (called TAFKAP^{7,17,24}) computes the posterior distribution of motion direction from a given cortical response as measured with fMRI.
Generative model
The TAFKAP decoding algorithm assumes that BOLD activity varies randomly from trial to trial around a fixed stimulus-dependent mean that is different for each voxel:

\({b}_{i}={f}_{i}(s)+{\varepsilon }_{i}\)
where \({b}_{i}\) is the response of voxel i, \({f}_{i}(s)\) is the voxel’s mean response to stimulus s, and \({\varepsilon }_{i}\) reflects random noise.
The mean response of the ith voxel as a function of stimulus s (i.e., the voxel’s “tuning function,” \({f}_{i}(s)\)) is modeled as a weighted sum of \(K=8\) bell-shaped basis functions:
where \({\varphi }_{k}\) is the preferred motion direction of the kth basis function (in radians), and the K \({\varphi }_{k}\)’s are spread evenly across the 2π motion space.
Around its tuning function \({f}_{i}\left(s\right)\), each voxel is assumed to fluctuate randomly due to Normally distributed noise \({\varepsilon }_{i}\). This noise is described by covariance matrix \({{\mathbf{\Omega }}}\), such that \({{\boldsymbol{\varepsilon }}} \sim {{\mathscr{N}}}(0,\,{{\mathbf{\Omega }}})\). The probability of cortical activity pattern \({{\bf{b}}}={\left[{b}_{i}\right]}^{T}\) is therefore given by
where \({{\mathbf{\theta }}}=\{{{\bf{W}}},\,{{\mathbf{\Omega }}}\}\) denotes the model’s free parameters (determined by the data), and \({{\boldsymbol{f}}}\left(s\right)=\left[{f}_{i}(s)\right]\) collects the voxel tuning functions (themselves determined by the data, via \({{\bf{W}}}\)).
The covariance matrix of this multivariate Normal distribution was obtained as follows. Ideally, we would have used the sample covariance. However, when the number of voxels is larger than the number of trials, the sample covariance matrix is singular and cannot be inverted. To improve the estimate of the covariance matrix, TAFKAP therefore uses a concept called “shrinkage.” Specifically, the model’s covariance matrix \({{\mathbf{\Omega }}}\) is modeled as the sample covariance matrix \({{\mathbf{\Omega }}}_{{\rm{sample}}}\) “shrunk” towards a parametrized theoretical covariance matrix \({{\mathbf{\Omega }}}_{0}\):

\({{\mathbf{\Omega }}}=\lambda \,{{\mathbf{\Omega }}}_{0}+(1-\lambda )\,{{\mathbf{\Omega }}}_{{\rm{sample}}}\)
where λ is a shrinkage parameter. The sample covariance is defined as follows:
And given TAFKAP’s assumptions, the theoretical matrix \({{{{{{\mathbf{\Omega }}}}}}}_{0}\) is given by:
where the first component (\({\sigma }^{2}{{\bf{W}}}{{{\bf{W}}}}^{{\rm{T}}}\)) describes variance shared among similarly-tuned voxels, the second component (\((1-\rho )\,{{\rm{diag}}}({{{\boldsymbol{\tau }}}}^{2})\)) describes independent sources of variance, and the third component (\(\rho \,{{\boldsymbol{\tau }}}{{{\boldsymbol{\tau }}}}^{{\rm{T}}}\)) captures noise shared globally across all voxels. The \({\tau }_{i}^{2}\) parameter in TAFKAP is given by:
where \({\lambda }_{{{{{\mathrm{var}}}}}}\) is another shrinkage parameter. Please see ref. ^{17} for the derivation of the theoretical covariance matrix, and ref. ^{7} for further detail and rationale regarding TAFKAP’s shrinkage estimation of the model’s parameters.
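The resulting covariance estimate can be sketched as follows (an illustrative Python sketch, not the TAFKAP implementation; we assume the conventional convex blend of the sample and theoretical matrices, and all names are ours):

```python
import numpy as np

def shrunk_covariance(residuals, W, sigma2, rho, tau, lam):
    """Shrinkage estimate of the noise covariance.

    residuals: (n_trials, n_voxels) residuals around the voxel tuning curves.
    W: (n_voxels, K) basis-function weights; tau: per-voxel noise scales.
    """
    omega_sample = np.cov(residuals, rowvar=False)    # sample covariance
    omega_0 = (sigma2 * W @ W.T                       # shared, tuning-related
               + (1 - rho) * np.diag(tau ** 2)        # independent, per voxel
               + rho * np.outer(tau, tau))            # global, shared by all
    # Shrink the sample covariance towards the theoretical matrix.
    return lam * omega_0 + (1 - lam) * omega_sample
```

With `lam = 1` the estimate reduces to the parametrized theoretical matrix; with `lam = 0` it is the raw (possibly singular) sample covariance.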
Training and testing
Model parameters were estimated for each individual participant in a leave-one-run-out cross-validation procedure. That is, the model’s free parameters were first estimated from the data of all but one run, and the remaining run was then used to decode posterior distributions on a trial-by-trial basis. Each run served as the test set once. While training the model, TAFKAP uses “bootstrap aggregating,” or “bagging,” to take the uncertainty of the model parameters into account. Specifically, trials in the training set were resampled many times (with replacement) to generate resampled data sets, each of which had the same number of trials as the original set. Model parameters were estimated for each set using ordinary least squares (see ref.^{7} for details). For each trial in the test set, the posterior distribution over motion direction was subsequently computed, conditioned on the fitted model parameters for a given bootstrapped training sample. The posterior distribution was obtained using Bayes’ rule:
where the prior \(p(s)\) was flat (reflecting the uniform distribution of motion directions in the experiment) and the normalizing constant \(\int p({{\bf{b}}} \mid s,\,\hat{{{\mathbf{\theta }}}})\,p\left(s\right)\,{ds}\) was estimated numerically. The posterior distribution was then averaged across the bootstrapping iterations to obtain one posterior per test trial. We took the circular mean of the decoded distribution as the estimated motion direction, and its entropy as a measure of uncertainty.
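This final decoding step (numerical normalization, circular mean, and entropy) can be sketched numerically (an illustration with our own names; TAFKAP itself additionally averages over bootstrap samples):

```python
import numpy as np

def decode_summary(log_lik, directions):
    """Turn a log-likelihood over a direction grid into a posterior
    (flat prior) and summarize it by its circular mean and entropy.

    log_lik: log p(b | s, theta) evaluated at each direction (radians).
    """
    # Flat prior: the posterior is the normalized likelihood; the
    # normalizing constant is computed numerically on the grid.
    post = np.exp(log_lik - log_lik.max())
    post /= post.sum()
    # Circular mean of the posterior = decoded motion direction.
    est = np.angle(np.sum(post * np.exp(1j * directions))) % (2 * np.pi)
    # Entropy of the discretized posterior = decoded uncertainty.
    entropy = -np.sum(post * np.log(post + 1e-12))
    return est, entropy
```

A narrower posterior (larger concentration) yields a lower entropy, i.e., lower decoded uncertainty.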
Statistical procedures
Benchmarking analyses
When analyzing decoding accuracy, we computed the circular correlation coefficient between the decoded and the true direction for each participant. We applied the Fisher transformation to the individual coefficients and computed a Bayesian t-test on the transformed values. We used the standard (recommended) conservative priors in our Bayesian statistical analyses^{67,68}, both here and in the remaining analyses. The mean correlation coefficient across observers and its confidence interval were computed on Fisher-transformed individual coefficients, and the resulting values were transformed back to the correlation scale for reporting.
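This benchmarking statistic can be sketched as follows (an illustration; we assume the common Jammalamadaka-style form of the circular correlation coefficient, and all names are ours):

```python
import numpy as np

def circ_corr(a, b):
    """Circular correlation between two angle vectors (radians), computed
    from the sines of the deviations from each circular mean."""
    sa = np.sin(a - np.angle(np.mean(np.exp(1j * a))))
    sb = np.sin(b - np.angle(np.mean(np.exp(1j * b))))
    return np.sum(sa * sb) / np.sqrt(np.sum(sa ** 2) * np.sum(sb ** 2))

def fisher_z(r):
    """Fisher transformation applied before group-level statistics."""
    return np.arctanh(r)
```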
For analyses relating trial-by-trial uncertainty to behavioral variability, we used a Bayesian hierarchical regression with the brms^{67} library in R. The analysis across motion directions included the bias-corrected squared behavioral error as the dependent variable. Trial-by-trial uncertainty (demeaned across all trials for each individual participant) was used as an independent variable, both as a population-level (fixed) effect and as a participant-level (random) effect, along with participant-level (random) intercepts. This design allowed us to estimate the within-subject effect of uncertainty on behavioral variability while accounting for individual differences between participants. In the second set of analyses, we additionally controlled for differences between motion directions by including the oblique effect (distance to the nearest cardinal direction) in the model, both at the population and participant levels. In control analyses, we log-transformed the squared behavioral errors to account for the non-normality of their distribution; this did not change any of our conclusions.
Analyses of the shape of the decoded distribution
To test the predictions about the number of peaks in the decoded posterior (see Results; Bayesian observer model), we first fitted, for each subject, a descriptive bimodal model to both the mean posterior across trials and to single-trial posteriors. The model enabled us to estimate the location of the distribution’s two peaks without additional assumptions about the relationship between these peaks. The model is a mixture of two von Mises distributions and a circular uniform distribution:

\(p\left(x\right)=\lambda \,{f}_{UC}\left(x\right)+\left(1-\lambda \right)\left[\alpha \,{f}_{{VM}}\left(x;{\mu }_{1},{\kappa }_{1}\right)+\left(1-\alpha \right)\,{f}_{{VM}}\left(x;{\mu }_{2},{\kappa }_{2}\right)\right]\)
where \(\lambda\) is the weight of the uniform component, \(\alpha\) is the relative weight of the first von Mises component, and \({\mu }_{i}\) and \({\kappa }_{i}\) are the mean and precision of the respective component. The probability density function \({f}_{{VM}}\) is a von Mises distribution with two parameters, location (mean, \({\mu }_{i}\)) and precision (\({\kappa }_{i}\)):

\({f}_{{VM}}\left(x;{\mu }_{i},{\kappa }_{i}\right)=\frac{\exp \left({\kappa }_{i}\cos \left(x-{\mu }_{i}\right)\right)}{2\pi {I}_{0}\left({\kappa }_{i}\right)}\)
where \({I}_{0}\) is a modified Bessel function of order 0. Because the component labels in this model are arbitrary, we disambiguate the components based on which one is larger (i.e., is higher at the maximum) or which one is located closer to the true direction of motion.
The model was fitted by minimizing the Jensen–Shannon divergence (JSD, a symmetrized version of the Kullback–Leibler divergence) between the decoded posterior and the model. The parameters were minimally constrained to avoid degenerate solutions: \({\kappa }_{i}\in [0.001,\,100]\), \(\alpha \in [1\times {10}^{-5},\,1-1\times {10}^{-5}]\), \(\lambda \in [0,\,0.9]\). To avoid local minima and to assess the uniqueness of the solutions, we ran the optimization algorithm 100 times for each trial using random starting parameters, and computed the circular standard deviation of the estimated component locations across optimization runs. We found the solutions to be largely unique: the SD across the estimated locations, averaged across trials and participants, was 0.15° for the larger component and 1.23° for the smaller component, for solutions with a JSD up to 5% larger than that of the optimal solution (to allow for small numerical errors). The results were similar when the components were disambiguated based on their closeness to the true direction rather than their height. Thus, while some trials did not yield a unique description of the decoded posterior (e.g., this might happen for a uniform posterior), most decoded posteriors were consistent with a unique description of the component locations.
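The descriptive fit can be sketched as follows (an illustrative Python sketch using scipy in place of the authors' optimizer, with fewer restarts than the 100 used in the paper; all names are ours):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import i0
from scipy.spatial.distance import jensenshannon

def mixture_pdf(x, mu1, k1, mu2, k2, alpha, lam):
    """Mixture of two von Mises components and a circular uniform."""
    vm1 = np.exp(k1 * np.cos(x - mu1)) / (2 * np.pi * i0(k1))
    vm2 = np.exp(k2 * np.cos(x - mu2)) / (2 * np.pi * i0(k2))
    return lam / (2 * np.pi) + (1 - lam) * (alpha * vm1 + (1 - alpha) * vm2)

def fit_mixture(grid, target, n_starts=15, seed=0):
    """Fit the descriptive model by minimizing the Jensen-Shannon
    divergence between the (discretized) posterior and the model."""
    rng = np.random.default_rng(seed)
    bounds = [(-np.pi, np.pi), (1e-3, 100), (-np.pi, np.pi), (1e-3, 100),
              (1e-5, 1 - 1e-5), (0.0, 0.9)]
    q = target / target.sum()

    def loss(p):
        f = mixture_pdf(grid, *p)
        # jensenshannon returns the JS distance; its square is the JSD.
        return jensenshannon(f / f.sum(), q) ** 2

    best = None
    for _ in range(n_starts):  # random restarts to escape local minima
        x0 = [rng.uniform(lo, hi) for lo, hi in bounds]
        res = minimize(loss, x0, bounds=bounds, method="L-BFGS-B")
        if best is None or res.fun < best.fun:
            best = res
    return best
```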
For the analyses of the mean posterior distribution, we averaged the posterior distribution across trials for each observer. We then estimated the bestfitting parameters for Eq. 9 as described above and computed the mean across observers and confidence intervals for both peak locations.
For analyses of peak location on a trial-by-trial basis, we did not categorize trials into unimodal or bimodal based on the number of peaks in the single-trial posteriors. Instead, we used peak locations, which provide objective, quantitative criteria for analyzing posterior shape. That is, any goodness-of-fit measure is based on a statistical model (which describes how the data are generated, so that the model’s likelihood can be estimated). However, it is not clear how best to specify such a statistical model for a mixture of functions fitted to single-trial decoded posteriors. We therefore tested for peak location instead, as the statistical model for peak location is much better understood. Specifically, we first fitted the von Mises mixture model (Eq. 9) to the trial-by-trial decoded posteriors for each individual participant. This gave us two peak locations for each trial, which are plotted as a joint probability distribution of peak locations across trials in Fig. 4c. For comparison, Supplementary Fig. 7 shows the predicted probability distributions for the fMRI data assuming that only orientation, only velocity, or both signals are used. We then fitted two bivariate von Mises mixture models to the joint distribution of peak locations using the BAMBI package in R^{69}. The first model assumed that all location pairs belong to the same bivariate distribution (that is, a single cluster of trials is present), while the second assumed that they are best described as a mixture of two distributions (two clusters are present). We compared the models’ fits using the Watanabe–Akaike information criterion (WAIC).
To analyze the relationship between behavioral errors and the locations of the first and second peaks in the decoded posterior, we first selected trials for which one peak was closer to the true (−90° to 90°) and the other closer to the opposite (90° to 270°) direction of motion (this selection criterion is conservative, as it excludes trials for which both peaks correspond to approximately the same direction, that is, the “unimodal” trials). The peak locations were then transformed as \({\mu }^{{\prime} }=\sin \left(\frac{\pi }{90}\mu \right)\) to account for the nonlinear circular relationship predicted by the Bayesian observer model. In other words, we transformed the axes because of a nonlinearity in the model predictions, which arises from the circularity of the motion space. One standard way to linearize a circular variable is to apply a sine and cosine transformation (as is done, for example, in circular–circular regression). Because the model predictions are linear in the sine-transformed space (as shown in Fig. 2c), this transformation simplified our subsequent analyses. Next, we estimated the relationship between the transformed peak locations and the behavioral errors using a Bayesian hierarchical regression model that included behavioral errors as the dependent variable and the two peak locations as independent variables, modeled as both population-level (fixed) and participant-level (random) effects, along with participant-level (random) intercepts.
Follow-up psychophysical experiment
In the analyses of the follow-up behavioral experiment, we compared two models fitted to the behavioral errors of each participant. We reasoned that if participants use both orientation and velocity signals, there should be two peaks in the error distribution: one at the true and one at the opposite direction of motion. Accordingly, we fitted a von Mises mixture model (Eq. 9) to the behavioral error distribution with peak locations constrained to the true (\({\mu }_{1}=0^\circ\)) and opposite (\({\mu }_{2}=180^\circ\)) directions. Alternatively, if observers do not use spatial orientation signals in their decision, there should be just one peak, at the true direction of motion. For this alternative hypothesis, we fitted a single-peak von Mises model (\(\alpha=1\) in Eq. 9). Both models included a uniform noise component. Because we fitted the models to the behavioral errors (rather than to the decoded posterior distribution), we used a maximum likelihood estimation (MLE) approach with the DEoptim package in R^{70} instead of the JSD-based approach. The models were fitted to the data of each individual participant, and Bayesian information criterion (BIC) differences were summed across participants for group-wise inference.
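This model comparison can be sketched as follows (illustrative; scipy's bounded optimizer stands in for DEoptim, and all names are ours):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import i0

def vm_pdf(x, mu, k):
    """Von Mises probability density."""
    return np.exp(k * np.cos(x - mu)) / (2 * np.pi * i0(k))

def neg_ll_bimodal(p, err):
    # Peaks fixed at the true (0) and opposite (pi) direction of motion.
    k1, k2, alpha, lam = p
    f = lam / (2 * np.pi) + (1 - lam) * (
        alpha * vm_pdf(err, 0.0, k1) + (1 - alpha) * vm_pdf(err, np.pi, k2))
    return -np.sum(np.log(f))

def neg_ll_unimodal(p, err):
    # Single peak at the true direction, plus a uniform noise component.
    k1, lam = p
    f = lam / (2 * np.pi) + (1 - lam) * vm_pdf(err, 0.0, k1)
    return -np.sum(np.log(f))

def compare_models(err):
    """Fit both error models by maximum likelihood and return their BICs."""
    fit2 = minimize(neg_ll_bimodal, [2.0, 2.0, 0.7, 0.1], args=(err,),
                    bounds=[(1e-3, 100), (1e-3, 100),
                            (1e-5, 1 - 1e-5), (0.0, 0.9)])
    fit1 = minimize(neg_ll_unimodal, [2.0, 0.1], args=(err,),
                    bounds=[(1e-3, 100), (0.0, 0.9)])
    n = len(err)
    bic1 = 2 * np.log(n) + 2 * fit1.fun   # 2 free parameters
    bic2 = 4 * np.log(n) + 2 * fit2.fun   # 4 free parameters
    return bic1, bic2, fit1, fit2
```

On simulated errors containing a secondary cluster at 180°, the bimodal model should win despite its larger parameter penalty.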
Eye tracking data
Eye movements were recorded during the main fMRI experiment using an SR Research EyeLink 1000 eye tracker with a 1000 Hz sampling rate, and were used for control analyses (Supplementary Figs. 5 and 8). After removing blinks, four variables were computed for the first 4.5 s of each trial. First, we computed gaze position as the absolute Euclidean distance between the fixation point (the screen center) and the mean gaze position within this time period. Second, we computed gaze position variability as the mean absolute Euclidean distance between point-by-point gaze position and the mean position within this time period. Third, we computed the circular mean of saccade direction. Finally, we computed the mean axis of saccade direction as the circular average of all saccade directions wrapped into a 180-degree space.
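The four eye-movement summaries can be sketched as follows (illustrative; names are ours, and gaze samples are assumed to be expressed relative to the fixation point):

```python
import numpy as np

def gaze_metrics(x, y, saccade_dirs):
    """Per-trial eye-movement summaries.

    x, y: gaze samples relative to the fixation point (screen center).
    saccade_dirs: saccade directions in radians.
    """
    mx, my = x.mean(), y.mean()
    gaze_pos = np.hypot(mx, my)                 # mean offset from fixation
    gaze_var = np.hypot(x - mx, y - my).mean()  # variability around the mean
    # Circular mean of saccade directions (360-degree space).
    mean_dir = np.angle(np.mean(np.exp(1j * saccade_dirs)))
    # Mean axis: double the angles to wrap directions into a 180-degree space.
    mean_axis = 0.5 * np.angle(np.mean(np.exp(2j * saccade_dirs)))
    return gaze_pos, gaze_var, mean_dir, mean_axis
```

Note that two saccades in exactly opposite directions cancel in the circular mean but share the same axis after angle doubling.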
Bayesian observer models
The goal is to infer the direction of motion from noisy sensory signals. We consider two observer models for this task. The Bayesian observer model uses both the velocity and the spatial orientation measurements to estimate motion direction. The velocity-only model bases its judgments on velocity signals alone. Both models are described in three steps. First, we define the generative model that describes how the stimulus generates the observers’ velocity and orientation measurements. Second, we describe how each observer infers the range of motion directions that are likely given their measurement(s) – that is, how they compute the posterior distribution. The Bayesian model performs inference using both cues, whereas the velocity-only model uses the velocity signals alone and ignores the spatial orientation measurements altogether. Finally, each observer selects their response given the computed posterior distribution combined with a cost function that determines how “bad” or costly potential errors are.
Generative model (both models)
To infer the direction of motion s of the stimulus, the observer measures its velocity (\({x}_{V}\)) and spatial orientation (\({x}_{O}\)) signals. These measurements are noisy, and are therefore best described as being drawn from a probability distribution; \(p\left({x}_{V} \mid s\right)\) and \(p\left({x}_{O} \mid s\right)\) for the velocity and spatial orientation signals, respectively. For velocity, we define the measurement probabilities as a von Mises (VM) distribution:

\(p\left({x}_{V} \mid s\right)={f}_{{VM}}\left({x}_{V};s,\kappa \right)=\frac{\exp \left(\kappa \cos \left({x}_{V}-s\right)\right)}{2\pi {I}_{0}\left(\kappa \right)}\)
where \({I}_{0}\) is a modified Bessel function of order 0, and \(\kappa\) is a precision parameter. Note that higher precision corresponds to lower circular standard deviation, \(\sigma\):
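The precision parameter κ maps onto circular standard deviation through the mean resultant length; a sketch of the standard conversion (assuming the usual von Mises definition, with \({I}_{1}\) the modified Bessel function of order 1):

```python
import numpy as np
from scipy.special import i0, i1

def circular_sd(kappa):
    """Circular standard deviation (radians) of a von Mises distribution
    with precision kappa, via the mean resultant length R = I1(k) / I0(k)."""
    r = i1(kappa) / i0(kappa)
    return np.sqrt(-2.0 * np.log(r))
```

For large κ, the circular SD approaches \(1/\sqrt{\kappa }\), so higher precision indeed corresponds to a lower circular standard deviation.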
For spatial orientation, the measurement distribution \(p\left({x}_{O} \mid s\right)\) is similar to the velocity measurement distribution, but wrapped into the 180° orientation space. It is defined as follows:
for any \({x}_{O}\in \left[0,\pi \right)\). Note that this function is unimodal in the orientation space, but bimodal in direction of motion. While this particular shape of the distribution is chosen to simplify the later analytical derivations, in principle any bellshaped circular distribution can be used with qualitatively similar predictions.
Finally, the probability distribution of the stimuli (i.e., the prior distribution \(p(s)\)) is assumed to be a circular uniform distribution, \({f}_{{UC}}\left(s\right)\), corresponding to the uniform distribution of motion directions used in our task:
Together, \(p\left(s\right)\), \(p\left({x}_{V} \mid s\right)\) and \(p\left({x}_{O} \mid s\right)\) define the generative model of how the moving stimulus gives rise to the measurements in our task.
Inference (Bayesian model)
To infer the motion direction of the stimulus from the noisy sensory measurements, the Bayesian observer inverts the generative model. In other words, this observer estimates the most likely causes of the observed measurements by calculating the likelihood \(L\left(s \mid {x}_{V},\,{x}_{O}\right)\) of different stimulus values given the velocity and orientation measurements. The likelihood function given the velocity measurement alone is computed as follows:

\({L}_{V}\left(s \mid {x}_{V}\right)=p\left({x}_{V} \mid s\right)={f}_{{VM}}\left({x}_{V};s,{\kappa }_{V}\right)\)
while the likelihood given the orientation measurement is:
with the locations of the two peaks separated by 180 degrees (\(\pi\) radians). Note that a horizontal orientation measurement is equally likely to be caused by a stimulus moving left and by a stimulus moving right. Hence, in the motion feature space, the likelihood becomes bimodal.
The Bayesian observer estimates the posterior distribution of motion direction s, \(p\left(s \mid {x}_{V},\,{x}_{O}\right)\), by computing the product of the stimulus likelihood \(L\left(s \mid {x}_{V},\,{x}_{O}\right)\) and the prior distribution \(p\left(s\right)\):
Given that in our case the prior \(p\left(s\right)\) is uniform, it can be subsumed under the proportionality sign:
In words, the posterior distribution is proportional to the likelihood of stimulus motion direction given the two measurements. Assuming that the velocity and orientation measurements are independent (the results are qualitatively similar if they are correlated, Supplementary Fig. 1), the likelihood \(L\left(s \mid {x}_{V},\,{x}_{O}\right)\) is:
Given the bimodality of the orientationbased likelihood, the likelihood given both measurements is bimodal, as well:
Using Eqs. 18 and 21 and properties of the von Mises distribution (see details in the Supplementary Methods), we reformulate the posterior distribution as a weighted sum of two von Mises probability density functions, A and B, with weights (\({w}_{A}\), \({w}_{B}\)), means (\({\theta }_{A}\), \({\theta }_{B}\)), and precisions (\({\kappa }_{A}\), \({\kappa }_{B}\)) that depend on the precision of the velocity and orientation components and on the distance between their locations:
Eq. 22 shows how the posterior can be described as a mixture of two components, which is useful when we quantify the shape of the posterior on a trial-by-trial basis (Eq. 9). While the equations specifying the parameters are given in the Supplementary Methods, we highlight three specific cases to provide an intuition about the posterior. First, when the variance of the velocity measurements is relatively low (that is, \({\kappa }_{V}\) is high), the weight of the second component \({w}_{B}\) approaches zero, so that the posterior becomes a unimodal von Mises distribution. In contrast, if the variance of the velocity measurements is high (\({\kappa }_{V}\) approaches zero) and the variance of the spatial orientation measurements is relatively low (\({\kappa }_{O}\) is high), the posterior becomes bimodal, with two identical peaks at the true and the opposite motion direction. Finally, when both velocity and orientation measurements are highly variable (both \({\kappa }_{O}\) and \({\kappa }_{V}\) are close to zero), the posterior becomes close to uniform (Supplementary Fig. 2).
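These three regimes can be verified numerically (an illustrative sketch; the bimodal orientation likelihood is written directly as two von Mises lobes 180° apart, and all names are ours):

```python
import numpy as np

grid = np.linspace(0, 2 * np.pi, 720, endpoint=False)

def posterior(x_v, x_o, k_v, k_o):
    """Posterior over direction s given a velocity measurement x_v and an
    orientation measurement x_o (numerical sketch; flat prior)."""
    lik_v = np.exp(k_v * np.cos(grid - x_v))
    # Orientation likelihood: bimodal, with peaks pi radians (180 deg) apart.
    lik_o = (0.5 * np.exp(k_o * np.cos(grid - x_o))
             + 0.5 * np.exp(k_o * np.cos(grid - x_o - np.pi)))
    post = lik_v * lik_o
    return post / post.sum()

# High velocity precision -> effectively one peak at the true direction.
# Low velocity / high orientation precision -> two near-equal peaks.
# Both imprecise -> close to uniform.
```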
Decision-making (Bayesian model)
To judge the stimulus’ direction of motion, the Bayesian observer estimates the relative cost associated with each response, as defined by the cost function combined with the posterior distribution. Which cost function is sensible for a decision about motion direction? An object moving in a given direction cannot simultaneously move in the opposite direction; hence, if the posterior is bimodal, only one of the peaks corresponds to the true direction of motion, while the other is merely a by-product of the orientation signals. This suggests that a sensible strategy is to use a delta cost function, selecting the most probable direction according to the posterior:
Note that the squared-error cost function (which corresponds to taking the mean of the posterior) would be problematic for high-noise scenarios in which bimodality is observed: because the mean of a bimodal distribution falls in between the two peaks, it would create a paradoxical situation in which the true stimulus is never chosen as the response. Please also note that if the peaks in the posterior are well separated, the MAP estimate matches a heuristic two-step strategy, by which observers first select the orientation peak that is more probable given the velocity peak location, and then estimate the true direction by combining this orientation peak and the velocity peak with any symmetric cost function (e.g., squared error). In other words, if observers use velocity estimates to disambiguate the orientation signals, and the velocity signals provide relatively precise information, then the resulting estimate is the same as the maximum a posteriori (MAP) estimate from the full posterior. The same results are also obtained when the velocity and orientation likelihoods are computed in a 180-degree space, and a separate binary variable (obtained from the velocity measurement \({x}_{V}\)) indicates which half of the 360-degree motion space most likely contains the true direction of motion.
Given this decision strategy, what shape should we expect for the distribution of behavioral responses across trials? The distribution of maximum a posteriori estimates is linked to the posterior distribution. When the velocity measurements have relatively low variance, the weight of the posterior component corresponding to the opposite motion direction (\({w}_{B}\)) will approach zero and become negligible. The posterior distribution is then just the product of two unimodal likelihoods computed from the velocity and orientation measurements. In this case, the distribution of MAP estimates can be approximated by a von Mises distribution (Murray & Morgenstern, 2010) with:
That is, on a trial-by-trial basis, the maximum a posteriori estimate depends on the velocity and orientation measurements (\({x}_{V}\) and \({x}_{O}\)), and the variability of these direction estimates is inversely related to the precision parameters \({\kappa }_{O}\) and \({\kappa }_{V}\). Across trials, the distribution of the estimates is centered on the true stimulus (\(s\)) and its precision \({\kappa }_{{MAP}}\) is equal to the sum of the velocity and orientation precision parameters. However, if the velocity measurements are highly variable, this approximation no longer holds, and simulations are necessary to assess the across-trial distribution of responses and its relationship with the parameters of the posterior.
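The low-noise approximation (precision of the MAP estimates ≈ \({\kappa }_{V}+{\kappa }_{O}\)) can be checked by simulation (illustrative; the MAP of the product of two von Mises likelihoods is computed as the precision-weighted resultant, and the parameter values are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
s, k_v, k_o, n = 0.7, 20.0, 12.0, 20000

# Noisy measurements on each trial (von Mises around the true direction s;
# in the low-noise regime the orientation lobe near s dominates).
x_v = rng.vonmises(s, k_v, n)
x_o = rng.vonmises(s, k_o, n)

# MAP of the product of the two von Mises likelihoods: the direction of the
# resultant of the two precision-weighted unit vectors.
map_est = np.angle(k_v * np.exp(1j * x_v) + k_o * np.exp(1j * x_o))

# Across trials the estimates concentrate around s with precision ~ k_v + k_o.
R = abs(np.mean(np.exp(1j * (map_est - s))))
kappa_map = R * (2 - R ** 2) / (1 - R ** 2)  # standard kappa approximation
```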
Inference (velocity-only model)
The velocity-only model follows the same inference steps but relies only on velocity estimates. First, the observer estimates the most likely causes of the observed measurements by calculating the likelihood \({L}_{V}\left(s \mid {x}_{V}\right)\) of different stimulus values given the velocity measurements (Eq. 16, repeated here for convenience):

\({L}_{V}\left(s \mid {x}_{V}\right)=p\left({x}_{V} \mid s\right)={f}_{{VM}}\left({x}_{V};s,{\kappa }_{V}\right)\)
The observer then estimates the posterior distribution of motion direction s, \(p\left(s \mid {x}_{V}\right)\), by computing the product of the stimulus likelihood \({L}_{V}\left(s \mid {x}_{V}\right)\) and the prior distribution \(p\left(s\right)\):
Given that in our case the prior \(p\left(s\right)\) is uniform, it can be subsumed under the proportionality sign:
In words, for the velocity-only model, the posterior distribution is proportional to the likelihood of the stimulus given the velocity measurements.
Decision-making (velocity-only model)
For the velocity-only model, any symmetric cost function (e.g., squared error or a delta function) results in the same decision. For consistency, we used the delta cost function, as for the Bayesian observer model:
The maximum of the posterior for the von Mises distribution lies at the measurement value, and the acrosstrial distribution of the MAP estimates is a von Mises distribution centered at the stimulus with precision \({\kappa }_{V}\).
Simulations
To obtain the predictions shown in Fig. 1b, we first simulated the posterior distribution for all possible combinations of the velocity and orientation standard deviation parameters, spanning the range from 3° to 100° in 8 steps: \({\sigma }_{V},{\sigma }_{O}\in \left\{3,5,10,20,30,40,60,100\right\}\). For each combination, 10,000 trials were simulated using the measurement distributions for velocity and orientation (Eqs. 11, 14). The posterior distribution was calculated as described above (Eq. 21). For each trial, we obtained the maximum a posteriori (MAP) estimate (i.e., the observer’s judgment of motion direction) by locating the maximum of the generated posterior on a grid with 0.5° steps. For the velocity-only model, the same simulated measurements were used, but the inference was based on the velocity measurements alone (Eqs. 28, 29). The same simulated data were used for Fig. 2a and d, with the results split into low and high levels of uncertainty in the velocity likelihood. Fig. 2a compares low levels of uncertainty (\({\sigma }_{V}\le 30^\circ\)) with high levels of uncertainty (\({\sigma }_{V} > 30^\circ\)). Fig. 2d shows the results for high levels of uncertainty (\({\sigma }_{V} > 30^\circ\)).
To facilitate a direct comparison with the posterior distribution decoded from the brain data (see Fig. 2b and c), in a second set of simulations, we simulated posteriors that are corrupted by additional (i.e., non-neuronal) sources of noise due to the fMRI measurements. MRI noise was modeled as independent noise on the observer’s internal measurements, drawn from a von Mises distribution centered on 0, with precision parameters \({\kappa }_{V}^{{\prime} }\) and \({\kappa }_{O}^{{\prime} }\) for the velocity and orientation measurements, respectively. This resulted in the following “decoded” likelihood:
where \({x}_{V}^{{\prime} }\) and \({x}_{O}^{{\prime} }\) refer to random noise offsets caused by the fMRI measurements. Importantly, fMRI noise affected only the decoded likelihood and the MRI measurements; the observer’s own measurements were unaffected by this particular form of noise.
For these simulations (Fig. 2b and c), the observer’s measurements were drawn independently from two von Mises distributions (one for velocity and one for orientation). Because neural uncertainty varies on a trial-by-trial basis, the precision parameters of these von Mises distributions fluctuated across trials. Namely, on each trial, \({\kappa }_{V}\) and \({\kappa }_{O}\) were drawn independently from a lognormal distribution with \({\mu }_{{neur}}=3.8\) and \({\sigma }_{{neur}}=0.6\). Parameter values were chosen such that the predicted distribution of behavioral responses (Eq. 24) matched the variability of the human participants’ responses across trials (as estimated from the empirical data). These parameters were used in the model to predict the model’s behavioral responses. The additional noise in the fMRI measurement of the observer’s cortical representation (\({\kappa }_{V}^{{\prime} }\), \({\kappa }_{O}^{{\prime} }\)) was also drawn randomly from a lognormal distribution, with \({\mu }_{{MRI},\,{velocity}}=0.9\) and \({\sigma }_{{MRI},\,{velocity}}=1.1\) for velocity, and \({\mu }_{{MRI},\,{orientation}}=1.4\) and \({\sigma }_{{MRI},\,{orientation}}=0.7\) for orientation. These parameter values were chosen to match the actual posterior distribution decoded from the brain data (Eq. 30, obtained by searching a parameter grid with steps of 0.1 for all four parameters). For the velocity-only model, the same simulated measurements were used, but the observer’s decision was based only on the likelihood computed from the velocity signals (Eqs. 28, 29).
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
Preprocessed behavioral and fMRI data of individual participants generated in this study have been deposited in the Donders Repository database: https://doi.org/10.34973/yk4ktp41^{71}. This includes the data necessary to reproduce the figures. These data are available open access. The raw fMRI data are protected and are available upon request from the last author (Janneke F.M. Jehee) due to data privacy regulations. Requests for data will be answered within a reasonable timeframe (1 month).
Code availability
Custom code for data analysis can be obtained via the Donders Repository: https://doi.org/10.34973/yk4ktp41^{71}. Custom code for the probabilistic decoding technique can also be found at https://github.com/jeheelab/^{72}.
References
Landy, M. S., Banks, M. S. & Knill, D. C. IdealObserver Models of Cue Integration. in Sensory Cue Integration (eds. Trommershäuser, J., Kording, K. & Landy, M. S.) 5–29 (Oxford University Press, 2011). https://doi.org/10.1093/acprof:oso/9780195387247.003.0001.
Geisler, W. S. Motion streaks provide a spatial code for motion direction. Nature 400, 65–69 (1999).
Oruç, I., Maloney, L. T. & Landy, M. S. Weighted linear cue combination with possibly correlated error. Vis. Res. 43, 2451–2468 (2003).
Knill, D. C. & Saunders, J. A. Do humans optimally integrate stereo and texture information for judgments of surface slant? Vis. Res. 43, 2539–2558 (2003).
Edwards, M. & Crane, M. F. Motion streaks improve motion detection. Vis. Res. 47, 828–833 (2007).
Burr, D. C. & Ross, J. Direct evidence that ‘speedlines’ influence motion mechanisms. J. Neurosci. 22, 8661–8664 (2002).
van Bergen, R. S. & Jehee, J. F. M. TAFKAP: an improved method for probabilistic decoding of cortical activity. bioRxiv https://doi.org/10.1101/2021.03.04.433946 (2021).
Tohmi, M., Tanabe, S. & Cang, J. Motion streak neurons in the mouse visual cortex. Cell Rep. 34, 108617 (2021).
Gur, M. & Snodderly, D. M. Direction selectivity in V1 of alert monkeys: Evidence for parallel pathways for motion processing. J. Physiol. 585, 383–400 (2007).
Geisler, W. S., Albrecht, D. G., Crane, A. M. & Stern, L. Motion direction signals in the primary visual cortex of cat and monkey. Vis. Neurosci. 18, 501–516 (2001).
An, X. et al. Distinct functional organizations for processing different motion signals in V1, V2, and V4 of macaque. J. Neurosci. 32, 13363–13379 (2012).
Girshick, A. R., Landy, M. S. & Simoncelli, E. P. Cardinal rules: visual orientation perception reflects knowledge of environmental statistics. Nat. Neurosci. 14, 926–932 (2011).
Kass, R. E. & Raftery, A. E. Bayes factors. J. Am. Stat. Assoc. 90, 773–795 (1995).
Gros, B. L., Blake, R. & Hiris, E. Anisotropies in visual motion perception: a fresh look. J. Opt. Soc. Am. A Opt. Image Sci. Vis. 15, 2003–2011 (1998).
Ball, K. & Sekuler, R. Directionspecific improvement in motion discrimination. Vis. Res. 27, 953–965 (1987).
Dakin, S. C., Mareschal, I. & Bex, P. J. An oblique effect for local motion: Psychophysics and natural movie statistics. J. Vis. 5, 9 (2005).
van Bergen, R. S., Ma, W. J., Pratte, M. S. & Jehee, J. F. M. Sensory uncertainty decoded from visual cortex predicts behavior. Nat. Neurosci. 18, 1728–1730 (2015).
Rust, N. C., Mante, V., Simoncelli, E. P. & Movshon, J. A. How MT cells analyze the motion of visual patterns. Nat. Neurosci. 9, 1421–1431 (2006).
Britten, K. H., Newsome, W. T., Shadlen, M. N., Celebrini, S. & Movshon, J. A. A relationship between behavioral choice and the visual responses of neurons in macaque MT. Vis. Neurosci. 13, 87–100 (1996).
Cumming, B. G. & Nienborg, H. Feedforward and feedback sources of choice probability in neural population responses. Curr. Opin. Neurobiol. 37, 126–132 (2016).
Dodd, J. V., Krug, K., Cumming, B. G. & Parker, A. J. Perceptually bistable three-dimensional figures evoke high choice probabilities in cortical area MT. J. Neurosci. 21, 4809–4821 (2001).
Goris, R. L. T., Ziemba, C. M., Stine, G. M., Simoncelli, E. P. & Movshon, J. A. Dissociation of choice formation and choicecorrelated activity in macaque visual cortex. J. Neurosci. 37, 5195–5203 (2017).
Nienborg, H. & Cumming, B. G. Decisionrelated activity in sensory neurons reflects more than a neuron’s causal effect. Nature 459, 89–92 (2009).
van Bergen, R. S. & Jehee, J. F. M. Modeling correlated noise is necessary to decode uncertainty. NeuroImage 180, 78–87 (2018).
Walker, E. Y., Cotton, R. J., Ma, W. J. & Tolias, A. S. A neural basis of probabilistic computation in visual cortex. Nat. Neurosci. 23, 122–129 (2020).
Geurts, L. S., Cooke, J. R. H., van Bergen, R. S. & Jehee, J. F. M. Subjective confidence reflects representation of Bayesian probability in cortex. Nat. Hum. Behav. 6, 294–305 (2022).
Li, H.-H., Sprague, T. C., Yoo, A. H., Ma, W. J. & Curtis, C. E. Joint representation of working memory and uncertainty in human cortex. Neuron 109, 3699–3712.e6 (2021).
Albright, T. D. Direction and orientation selectivity of neurons in visual area MT of the macaque. J. Neurophysiol. 52, 1106–1130 (1984).
Gur, M., Kagan, I. & Snodderly, D. M. Orientation and direction selectivity of neurons in V1 of alert monkeys: Functional relationships and laminar distributions. Cereb. Cortex 15, 1207–1221 (2005).
Shmuel, A. & Grinvald, A. Functional organization for direction of motion and its relationship to orientation maps in cat area 18. J. Neurosci. 16, 6945–6964 (1996).
Apthorp, D., Wenderoth, P. & Alais, D. Motion streaks in fast motion rivalry cause orientation-selective suppression. J. Vis. 9, 1–14 (2009).
Tong, J., Aydin, M. & Bedell, H. E. Direction-of-motion discrimination is facilitated by visible motion smear. Percept. Psychophys. 69, 48–55 (2007).
Apthorp, D., Cass, J. & Alais, D. The spatial tuning of “motion streak” mechanisms revealed by masking and adaptation. J. Vis. 11, 17 (2011).
Burr, D. C. & Thompson, P. Motion psychophysics: 1985–2010. Vis. Res. 51, 1431–1456 (2011).
Manning, C., Meier, K. & Giaschi, D. The reverse motion illusion in random dot motion displays and implications for understanding development. J. Illusion 3, 7916 (2022).
An, X., Gong, H., McLoughlin, N., Yang, Y. & Wang, W. The mechanism for processing random-dot motion at various speeds in early visual cortices. PLoS ONE 9, e93115 (2014).
Rasch, M. J., Chen, M., Wu, S., Lu, H. D. & Roe, A. W. Quantitative inference of population response properties across eccentricity from motion-induced maps in macaque V1. J. Neurophysiol. 109, 1233–1249 (2013).
Jancke, D. Orientation formed by a spot’s trajectory: a two-dimensional population approach in primary visual cortex. J. Neurosci. 20, RC86 (2000).
Basole, A., White, L. E. & Fitzpatrick, D. Mapping multiple features in the population response of visual cortex. Nature 423, 986–990 (2003).
Barlow, H. B. & Olshausen, B. A. Convergent evidence for the visual analysis of optic flow through anisotropic attenuation of high spatial frequencies. J. Vis. 4, 1–1 (2004).
Apthorp, D. et al. Direct evidence for encoding of motion streaks in human visual cortex. Proc. Biol. Sci. 280, 20122339 (2013).
Laquitaine, S. & Gardner, J. L. A switching observer for human perceptual estimation. Neuron 97, 462–474.e6 (2018).
Keck, M. J., Palella, T. D. & Pantle, A. Motion aftereffect as a function of the contrast of sinusoidal gratings. Vis. Res. 16, 187–191 (1976).
Nishida, S., Ashida, H. & Sato, T. Contrast dependencies of two types of motion aftereffect. Vis. Res. 37, 553–563 (1997).
Körding, K. P. et al. Causal inference in multisensory perception. PLoS ONE 2, e943 (2007).
Kamitani, Y. & Tong, F. Decoding seen and attended motion directions from activity in the human visual cortex. Curr. Biol. 16, 1096–1102 (2006).
Sprague, T. C., Ester, E. F. & Serences, J. T. Restoring latent visual working memory representations in human cortex. Neuron 91, 694–707 (2016).
Hong, S. W., Tong, F. & Seiffert, A. E. Directionselective patterns of activity in human visual cortex suggest common neural substrates for different types of motion. Neuropsychologia 50, 514–521 (2012).
Hebart, M. N., Donner, T. H. & Haynes, J. D. Human visual and parietal cortex encode visual choices independent of motor plans. NeuroImage 63, 1393–1403 (2012).
Wang, H. X., Merriam, E. P., Freeman, J. & Heeger, D. J. Motion direction biases and decoding in human visual cortex. J. Neurosci. 34, 12601–12615 (2014).
Manning, T. S. & Britten, K. H. Motion Processing in Primates. Oxf. Res. Encycl. Neurosci. 1–29 (2017) https://doi.org/10.1093/acrefore/9780190264086.013.76.
Kleiner, M., Brainard, D. H. & Pelli, D. G. What’s new in Psychtoolbox-3? in Perception 36 ECVP Abstract Supplement (2007).
Brainard, D. H. The psychophysics toolbox. Spat. Vis. 10, 433–436 (1997).
Pelli, D. G. The VideoToolbox software for visual psychophysics: Transforming numbers into movies. Spat. Vis. 10, 437–442 (1997).
Engel, S. A., Glover, G. H. & Wandell, B. A. Retinotopic organization in human visual cortex and the spatial precision of functional MRI. Cereb. Cortex 7, 181–192 (1997).
DeYoe, E. A. et al. Mapping striate and extrastriate visual areas in human cerebral cortex. Proc. Natl Acad. Sci. USA. 93, 2382–2386 (1996).
Sereno, M. I. et al. Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science 268, 889–893 (1995).
Huk, A. C., Dougherty, R. F. & Heeger, D. J. Retinotopy and functional subdivision of human areas MT and MST. J. Neurosci. 22, 7195–7205 (2002).
Vintch, B. & Gardner, J. L. Cortical correlates of human motion perception biases. J. Neurosci. 34, 2592–2604 (2014).
Maloney, R. T., Watson, T. L. & Clifford, C. W. G. Determinants of motion response anisotropies in human early visual cortex: The role of configuration and eccentricity. NeuroImage 100, 564–579 (2014).
Wei, X.-X. & Stocker, A. A. A Bayesian observer model constrained by efficient coding can explain ‘anti-Bayesian’ percepts. Nat. Neurosci. 18, 1509–1517 (2015).
Stasinopoulos, M. D., Rigby, R. A., Heller, G. Z., Voudouris, V. & Bastiani, F. D. Flexible Regression and Smoothing: Using GAMLSS in R. (Chapman and Hall/CRC, 2017). https://doi.org/10.1201/b21973.
Jenkinson, M., Bannister, P., Brady, M. & Smith, S. Improved optimization for the robust and accurate linear registration and motion correction of brain images. NeuroImage 17, 825–841 (2002).
Fischl, B., Sereno, M. I. & Dale, A. M. Cortical surfacebased analysis: II: inflation, flattening, and a surfacebased coordinate System. NeuroImage 9, 195–207 (1999).
Reuter, M., Rosas, H. D. & Fischl, B. Highly accurate inverse consistent registration: a robust approach. NeuroImage 53, 1181–1196 (2010).
Reuter, M., Schmansky, N. J., Rosas, H. D. & Fischl, B. Withinsubject template estimation for unbiased longitudinal image analysis. NeuroImage 61, 1402–1418 (2012).
Bürkner, P.-C. brms: An R package for Bayesian multilevel models using Stan. J. Stat. Softw. 80 (2017).
Morey, R. D. & Rouder, J. N. Bayes factor approaches for testing interval null hypotheses. Psychol. Methods 16, 406–419 (2011).
Chakraborty, S. & Wong, S. W. K. BAMBI: An R package for fitting bivariate angular mixture models. J. Stat. Softw. 99, 1–69 (2021).
Mullen, K. M., Ardia, D., Gil, D. L., Windover, D. & Cline, J. DEoptim: An R package for global optimization by differential evolution. J. Stat. Softw. 40, 1–26 (2011).
Chetverikov, A. & Jehee, J. F. M. Data accompanying the paper ‘Motion direction is represented as a bimodal probability distribution in the human visual cortex’. Radboud University. https://doi.org/10.34973/yk4ktp41 (2023)
van Bergen, R. S. & Jehee, J. F. M. TAFKAP [computer software]. GitHub. https://github.com/jeheelab/TAFKAP (2023).
Acknowledgements
We would like to thank P. Gaalman for MRI support. This work was supported by European Research Council Starting Grant No. 677601 (to J.F.M.J.) and a Radboud Excellence Initiative fellowship (to A.C.). The funder had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
Funding
Open access funding provided by University of Bergen.
Author information
Authors and Affiliations
Contributions
Conceptualization: A.C. and J.F.M.J.; Data curation: A.C. and J.F.M.J.; Formal analysis: A.C. and J.F.M.J.; Funding acquisition: A.C. and J.F.M.J.; Investigation: A.C.; Methodology: A.C. and J.F.M.J.; Project administration: A.C. and J.F.M.J.; Resources: J.F.M.J.; Software: A.C.; Supervision: J.F.M.J.; Validation: A.C.; Visualization: A.C. and J.F.M.J.; Writing – original draft: A.C. and J.F.M.J.; Writing – review & editing: A.C. and J.F.M.J.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Wilson Geisler and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Chetverikov, A., Jehee, J.F.M. Motion direction is represented as a bimodal probability distribution in the human visual cortex. Nat. Commun. 14, 7634 (2023). https://doi.org/10.1038/s41467-023-43251-w
DOI: https://doi.org/10.1038/s41467-023-43251-w