Single-trial visually evoked potentials predict both individual choice and market outcomes

A central assumption in the behavioral sciences is that choice behavior generalizes well enough across individuals that measurements from a sampled group can predict the behavior of the population. Following from this assumption, the unit of behavioral sampling or measurement for most neuroimaging studies is the individual; however, cognitive neuroscience increasingly acknowledges a dissociation between neural activity that predicts individual behavior and neural activity that predicts the average or aggregate behavior of the population, suggesting a greater importance of individual differences than is typically acknowledged. For instance, past work has demonstrated that some, but not all, of the neural activity observed during value-based decision-making can predict not just individual subjects' choices but also the success of products on large, online marketplaces, even when those two behavioral outcomes deviate from one another, suggesting that some neural component processes of decision-making generalize to aggregate market responses more readily across individuals than others do. While the bulk of such research has highlighted affect-related neural responses (e.g., in the nucleus accumbens) as a better predictor of group-level behavior than frontal cortical activity associated with the integration of more idiosyncratic choice components, more recent evidence has implicated responses in visual cortical regions as strong predictors of group preference. Taken together, these findings suggest a role for neural responses during early perception in reinforcing choice consistency across individuals and raise fundamental scientific questions about the role of sensory systems in value-based decision-making processes.
We use a multivariate pattern analysis approach to show that single-trial visually evoked electroencephalographic (EEG) activity can predict individual choice throughout the post-stimulus epoch; however, a nominally sparser set of activity predicts the aggregate behavior of the population. These findings support an account in which a subset of the neural activity underlying individual choice processes can scale to predict behavioral consistency across people, even when the choice behavior of the sample does not match the aggregate behavior of the population.

The area under the receiver operating characteristic curve (ROC-AUC) is a non-parametric measure of class separation that follows a known probability distribution; specifically, it is directly proportional to the generalized U-statistic used in the non-parametric Mann-Whitney U-test, which follows a normal distribution with known mean under the null hypothesis (Mason et al., 2002). If the ROC-AUC values we measured were independent measurements, we could have used this known distribution to compute a p-value directly, which constitutes an accepted statistical test for comparing ROC curves to each other or to chance (DeLong et al., 1988). However, our cross-validated ROC-AUCs are not independent, as the training and test sets used to compute them overlap across repeated cross-validation folds (Bengio and Grandvalet, 2003). We therefore compared them to chance (ROC-AUC = 0.5) in the main text using a version of the t-test that explicitly accounts for (1) co-dependence between cross-validated measurements and (2) variability in the test statistic due to random cross-validation splits (Dietterich, 1998). Since the other assumptions of a t-test are handily met by a sample of generalized U-statistics, this procedure can be expected to conservatively control the false-positive rate. Because we apply a parametric t-test to a non-parametrically computed U-statistic, one might call our significance testing procedure semi-parametric.
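As an illustrative sketch only (not the exact implementation used here), a corrected comparison of cross-validated ROC-AUCs against chance can be written as a resampled t-test whose variance term is inflated to account for train/test overlap, in the style of Nadeau and Bengio; the function name and the `train_frac` parameter are our own assumptions for illustration:

```python
import numpy as np
from scipy import stats

def corrected_cv_ttest(aucs, chance=0.5, train_frac=0.8):
    """Corrected resampled t-test comparing cross-validated ROC-AUCs
    to chance (illustrative sketch, Nadeau & Bengio-style correction).

    aucs: ROC-AUC from each repetition x fold of cross-validation.
    train_frac: fraction of the data in each training set; the
    correction replaces 1/n with 1/n + n_test/n_train to account
    for the overlap between training sets across folds.
    """
    d = np.asarray(aucs, dtype=float) - chance
    n = d.size
    correction = 1.0 / n + (1.0 - train_frac) / train_frac
    t = d.mean() / np.sqrt(correction * d.var(ddof=1))
    # one-tailed: only above-chance decoding is interpretable
    p = stats.t.sf(t, df=n - 1)
    return t, p
```

The inflated variance term is what makes the test conservative relative to a naive t-test on the same fold-wise AUCs.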
In the larger machine learning literature, the problem of significance testing for comparing the cross-validation performance of classifiers is well studied, and approaches such as ours that leverage known distributional information are often preferred (Nadeau and Bengio, 1998; Bouckaert and Frank, 2004), as they offer exact p-values and greater computational expediency than randomization-based methods. However, the multivariate pattern analysis literature tends to favor permutation or Monte Carlo simulation approaches to significance testing (e.g., Bae and Luck, 2018) to account for biases that may arise from the random cross-validation splits (though this is mainly a problem when the test statistic is sensitive to class imbalances, as accuracy is). As mentioned above, our testing procedure is relatively insensitive to these anticipated biases, but we nonetheless present permutation test results here to demonstrate consistency with the semi-parametric approach taken in the main text.

Permutation Tests
For our permutation tests, we compute the same corrected, cross-validated t-statistic as in the main text (Dietterich, 1998) at each time point across 10,000 permutations of the test labels, using the same cross-validation scheme across permutations so that any bias induced by the cross-validation splits is reflected in the permutation null distribution. Since we use a one-tailed test, as only above-chance decoding performance is interpretable, the p-value at each time point is computed as one minus the percentile rank of the observed t-statistic in the permutation null distribution (i.e., the test is significant at the 0.05 level if the observed t-statistic meets or exceeds the 95th percentile of the permutation null distribution).
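The per-time-point permutation scheme described above can be sketched as follows; `stat_fn` is a hypothetical stand-in for the corrected cross-validated t-statistic, and the function name and seed handling are illustrative assumptions, not the paper's actual code:

```python
import numpy as np

def permutation_pvalue(scores, labels, stat_fn, n_perm=10_000, seed=0):
    """One-tailed permutation p-value for a decoding statistic.

    stat_fn(scores, labels) -> scalar test statistic (hypothetical
    signature). Labels are shuffled while the cross-validation scheme
    inside stat_fn stays fixed, so any bias induced by the splits is
    reflected in the permutation null distribution.
    """
    rng = np.random.default_rng(seed)
    observed = stat_fn(scores, labels)
    null = np.empty(n_perm)
    for i in range(n_perm):
        null[i] = stat_fn(scores, rng.permutation(labels))
    # one-tailed: proportion of null statistics >= the observed one
    # (+1 terms keep the estimated p-value strictly positive)
    p = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, null, p
```

Note that `1 - percentile rank` and the proportion of null values at or above the observed statistic are the same quantity up to the finite-sample `+1` adjustment.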
However, it is necessary to correct for multiple comparisons. To this end, we use the t-max procedure to strongly control the family-wise error rate (Nichols and Holmes, 2002), in which a null distribution is constructed by taking the largest t-statistic across all individual tests (i.e., time points) on each permutation: the max-t statistic. Each individual test is then compared not to its own null distribution but to the null distribution of the max-t statistic; the multiple-comparisons-corrected rejection threshold, then, is the 95th percentile of the max-t null distribution.
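A minimal sketch of the t-max correction, assuming the permutation t-statistics are stored as an (n_permutations × n_timepoints) array; the function names are our own for illustration:

```python
import numpy as np

def maxt_threshold(null_t, alpha=0.05):
    """Family-wise-error-controlling threshold via the t-max procedure.

    null_t: array of shape (n_permutations, n_timepoints) holding the
    t-statistic at every time point on every permutation. The null
    distribution of the maximum t across time points yields a single
    threshold that strongly controls the family-wise error rate.
    """
    max_t = null_t.max(axis=1)  # largest t-statistic on each permutation
    return np.quantile(max_t, 1 - alpha)

def maxt_significant(observed_t, null_t, alpha=0.05):
    """Boolean mask of time points whose observed t meets or exceeds
    the 95th percentile of the max-t null distribution."""
    return observed_t >= maxt_threshold(null_t, alpha)
```

Because every time point is compared to the same max-t threshold, a time point is flagged only if its statistic is extreme relative to the most extreme value expected anywhere in the epoch under the null.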
We additionally visualize the observed ROC curves for market-level predictions against 1,000 permutation ROC curves at select time points (namely, those at which market-level prediction is significant after correcting for multiple comparisons).