Long-term priors influence visual perception through recruitment of long-range feedback

Perception results from the interplay of sensory input and prior knowledge. Despite behavioral evidence that long-term priors powerfully shape perception, the neural mechanisms underlying these interactions remain poorly understood. We obtained direct cortical recordings in neurosurgical patients as they viewed ambiguous images that elicit constant perceptual switching. We observe top-down influences from the temporal to the occipital cortex during the preferred percept, which is congruent with the long-term prior. By contrast, stronger feedforward drive is observed during the non-preferred percept, consistent with a prediction error signal. A computational model based on hierarchical predictive coding and attractor networks reproduces all key experimental findings. These results suggest that a change in large-scale information flow underlies the influence of long-term priors on perception, and they constrain theories of how such priors shape perception.


SUPPLEMENTARY NOTE 1
To address the question of how stimulus characteristics including size, color, and spatial location influence perceptual bias, we conducted an online behavioral task (using Gorilla.sc). We recruited subjects who held the MTurk Masters qualification through Amazon Mechanical Turk (MTurk). We further selected participants who were using a computer with a monitor and who reported that they were not colorblind. Subjects provided informed consent and were paid for a 30-minute task (including 10 minutes of resting time).
Before the task, we ran a screen calibration in which participants matched the size of an on-screen image of a credit card to an actual credit card and reported their distance from the screen. This enabled us to present images at a specified visual angle. Each participant was assigned to one of three images: the same Necker cube ('ViewFromAboveGreen') and FaceVase image as used for ECoG participants, and a Necker cube image where the blue and green edges were swapped ('ViewFromAboveBlue'). In the task, subjects were asked to always fixate on a cross in the center of the screen and to report their perception of the image presented in different conditions (see below) for 60 seconds each. For the cube images, participants reported the color of the cube face that was closest (as in the ECoG study).
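The conversion from a calibrated screen to a stimulus size in pixels can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names and the example values (card width in pixels, viewing distance) are hypothetical, and the card width of 85.6 mm assumes a standard ISO/IEC 7810 ID-1 card. The geometry is the usual relation size = 2 · d · tan(θ/2).

```python
import math

# Assumption: a standard ID-1 credit card is 85.6 mm wide (ISO/IEC 7810).
CARD_WIDTH_MM = 85.6


def pixels_per_mm(card_width_px: float) -> float:
    """Screen resolution implied by the on-screen card the participant matched."""
    return card_width_px / CARD_WIDTH_MM


def visual_angle_to_pixels(angle_deg: float, distance_mm: float, px_per_mm: float) -> float:
    """Size in pixels of a centrally presented stimulus subtending angle_deg."""
    size_mm = 2.0 * distance_mm * math.tan(math.radians(angle_deg) / 2.0)
    return size_mm * px_per_mm


def pixels_to_visual_angle(size_px: float, distance_mm: float, px_per_mm: float) -> float:
    """Inverse mapping, useful as a sanity check on the calibration."""
    size_mm = size_px / px_per_mm
    return math.degrees(2.0 * math.atan(size_mm / (2.0 * distance_mm)))


if __name__ == "__main__":
    # Hypothetical participant: card matched at 342.4 px, seated 600 mm away.
    scale = pixels_per_mm(342.4)
    for angle in (4, 8, 12, 16):
        print(angle, "deg ->", round(visual_angle_to_pixels(angle, 600.0, scale), 1), "px")
```

Because the viewing distance is self-reported, the resulting visual angles are approximate; the inverse function makes it easy to see how an error in the reported distance changes the effective stimulus size.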
To assess the effect of image size, the images were presented in the center of the screen at 3 different sizes (Cube: 4°, 8°, 12°; FaceVase: 8°, 12°, 16°). Different sizes were used for the two images because, during piloting, we did not experience perceptual switches for the FaceVase image at 4°. This is also consistent with prior literature, which has typically presented the FaceVase image at a larger visual angle (8-24°, see Supplementary Table 6) than the Necker cube (3-14°). To assess the effect of spatial location, images with a size of 8° were presented with a 5° offset from the central fixation point in 4 locations (left/right/up/down). Each participant completed 2 trials of each condition (7 conditions total: 3 sizes presented at fixation and 4 peripheral locations), making a total of 14 image presentations.
Sixty participants completed the full task (22 female; mean age 37.7, range 25-66; handedness: 3 left-handed, 1 ambidextrous, 56 right-handed), of whom 14 were removed from further analysis for not following task instructions. The analysis reported below includes 16 participants for the FaceVase image, 12 participants for the 'ViewFromAboveGreen' Cube, and 18 participants for the 'ViewFromAboveBlue' Cube.
For each stimulus condition we calculated the percentage of time perceiving the ViewFromAbove or Vase percepts (excluding unsure time). For the Necker cube, we first combined data across the two participant groups who were shown different color versions of the cube image ('ViewFromAboveGreen' and 'ViewFromAboveBlue'). In all 7 stimulus conditions (3 image sizes at fixation and 4 peripheral locations), participants on average tended to perceive the 'view-from-above' percept more often (Fig. S7, left). This bias was significant for 3 out of 7 stimulus conditions following Bonferroni correction (Fig. S7, left; *: two-tailed Wilcoxon signed-rank tests, p<0.05, Bonferroni corrected). Importantly, there was no significant effect of image size (F(2,87)=0.32, p=0.73) or image location (F(4,147)=2.13, p=0.08) on perceptual bias.
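The bias measure and the correction across conditions can be illustrated with a short sketch. It is not the analysis code of the study: the report format (a time-ordered list of onsets with percept labels), the label names, and the example values are all hypothetical. The signed-rank test itself is not reimplemented here; the p-values are assumed to come from a standard routine such as `scipy.stats.wilcoxon`, applied to each condition's bias values relative to 50%.

```python
def percent_time(reports, trial_end_s, target):
    """Percentage of decided (non-'unsure') time spent in the target percept.

    reports: list of (onset_s, label) tuples sorted by onset; each report is
    assumed to hold until the next report or until trial_end_s.
    """
    durations = {}
    for i, (onset, label) in enumerate(reports):
        offset = reports[i + 1][0] if i + 1 < len(reports) else trial_end_s
        durations[label] = durations.get(label, 0.0) + (offset - onset)
    decided = sum(d for lab, d in durations.items() if lab != "unsure")
    if decided == 0:
        return float("nan")  # no decided time in this trial
    return 100.0 * durations.get(target, 0.0) / decided


def bonferroni_significant(p_values, alpha=0.05):
    """Flag tests that survive Bonferroni correction (p < alpha / n_tests)."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


if __name__ == "__main__":
    # Hypothetical 60-s trial: 40 s 'above', 10 s 'below', 10 s 'unsure'.
    trial = [(0, "unsure"), (5, "above"), (25, "below"), (35, "above"), (55, "unsure")]
    print(percent_time(trial, 60, "above"))  # share of the 50 decided seconds

    # Hypothetical uncorrected p-values for the 7 stimulus conditions.
    print(bonferroni_significant([0.001, 0.004, 0.007, 0.01, 0.03, 0.2, 0.6]))
```

With 7 conditions, the corrected threshold is 0.05/7 ≈ 0.0071, so a condition with an uncorrected p of 0.01 would not count as significant.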
To assess the effect of the coloring scheme of the cube image, we compared the perceptual bias for each of the 7 stimulus conditions between the two versions of the cube image (ViewFromAboveGreen and ViewFromAboveBlue). There were no significant differences between the two versions of the cube image (two-tailed Wilcoxon rank-sum tests; all p>0.05). A two-way 2x7 ANOVA with image color and presentation condition as the two factors also yielded no significant main or interaction effect (all p>0.1).
For the FaceVase image, we found no significant group-level bias after Bonferroni correction across the 7 stimulus conditions (two-tailed Wilcoxon signed-rank tests), consistent with the lack of a significant group-level effect in the ECoG patients. In addition, there was no significant effect of image size on perceptual bias (F(2,46)=1.33, p=0.275), but there was a significant effect of spatial location (F(4,79)=3.68, p=0.0086): the vase percept was perceived more often in the up and down locations, and the face percept was perceived more often in the left and right locations. This is likely because the faces are at the right and left flanks of the image and are therefore closer to the fovea when the entire image is presented in the left or right location.
To summarize, we observed a strong group-level perceptual bias toward the 'view-from-above' percept for the cube image (same as for the ECoG participants; Fig. 1B), which was robust to the color, size, and visual field location of the image. There was a mild effect of visual field location consistent with prior studies (see Discussion), which did not reach statistical significance. For the FaceVase image, we did not observe a significant group-level perceptual bias, consistent with results from the ECoG participants (Fig. 1B); in addition, there was no significant effect of image size. The spatial location effect for the FaceVase image can be explained by the asymmetry within the image itself. Lastly, an individual participant's perceptual bias is strongly correlated across different stimulus conditions. Together with behavioral results from a separate group of healthy participants (N=24, tested in the laboratory) showing that individual perceptual bias is stable across weeks (see Results, section "Perceptual Bias during Bistable Perception of Ambiguous Images"), these findings strengthen the conclusion that perceptual biases reflect individual-specific long-term priors.

Lighter shades indicate electrodes with significant 'switch' or 'maintain' behavior for both ambiguous images; darker shades indicate electrodes with significant 'switch' or 'maintain' behavior for one image. Source data are provided as a Source Data file.

Figure 3 (Complement to Fig. 3). Frequency-domain inter-lobe feedforward-feedback biases during the preferred percept (green) and non-preferred percept (magenta) of the Cube image. Horizontal bars: p<0.05, 2-sided binomial test, cluster-corrected. Format is the same as Fig. 3D. Source data are provided as a Source Data file.

Fig. 4). (A) Same as Fig. 4D, except that only participants who had a significant perceptual bias (i.e., a significant preference for one of the two percepts, see Supplementary Table 2) were included in the analysis. Line-width indicates significance of 2-sided binomial test (uncorrected). (B) Same as Fig. 4D, except that only inter-lobe electrode pairs where at least one electrode exhibited significant 'switch' or 'maintain' behavior (Figure 2D) were included in the analysis. Line-width indicates significance of 2-sided binomial test (uncorrected). (C) The temporal distance of each 'trial' (i.e., 250-ms time window, see Methods) from the nearest button press was calculated, and overall distributions are shown for the preferred (green) and non-preferred (magenta) trials. Trials were dropped from the preferred percept to match the distance distribution between the two percepts (gray). Example data from one participant are shown. (D) Same as Fig. 4D, except that trials for the preferred and non-preferred percept were selected so that their distributions of temporal distance to the nearest button press were matched within each participant. Line-width indicates significance of 2-sided binomial test (uncorrected). (E) Locations of the 7 ROIs. Colored spheres represent a 20-mm-radius sphere around each ROI's center coordinate; these spheres were used to identify ECoG electrodes within each ROI. (F) Same as Fig. 4D, except that connectivity is defined between electrodes located within the 7 ROIs instead of lobes. Line-width indicates significance of 2-sided binomial test (uncorrected). Source data are provided as a Source Data file.