Introduction

Psychotic symptoms, which include hallucinations and delusions, are a defining feature of schizophrenia, schizoaffective disorder, and bipolar disorder. These symptoms can have a dramatic impact on a person’s ability to perform daily tasks [1, 2]. In addition to frank hallucinations, distorted perceptions of real stimuli are often reported, most often in the auditory domain, but visual distortions are common as well [3, 4]. This type of visual dysfunction can be studied using well-established and tested behavioral paradigms that allow the measurement of more subtle behavioral and cognitive disruptions [3]. Studying visual dysfunction in people with psychotic psychopathology (PwPP) using behavioral paradigms may thus provide insight into the neural basis of psychotic symptoms, and may also inform a better understanding of basic sensory processing [5,6,7].

Bi-stable perception is a phenomenon that occurs when the same external stimulus evokes two alternating dominant percepts. Since the early 20th century [8, 9], researchers have studied these bi-stable stimuli by measuring the rate at which participants’ subjective percept switches from one interpretation to the other. There are many examples of this phenomenon, including both static and dynamic stimuli. Static versions include the Necker cube and Rubin’s face-vase illusions, while dynamic versions include the spinning dancer, point light walker, and the rolling wheel. There are both static and dynamic examples of binocular rivalry [10,11,12].

Another dynamic bi-stable percept is the rotating cylinder [13,14,15]. In this illusion, dots traverse from left to right and right to left in a rectangular area, speeding up when approaching the middle and slowing down when approaching the edge, which gives the impression that the dots are ‘painted’ on the surface of a translucent rotating cylinder. The rotating cylinder is an example of structure-from-motion (SFM) perception, which involves integrating many moving elements into one coherent percept, where the perceived object is defined by the movement of the elements [16]. This stimulus provides an opportunity to study the interaction between the perception of object motion and form.

The rotating cylinder is thought to involve neural processing at multiple levels within the visual hierarchy, including within regions selective for motion and shape processing. Functional magnetic resonance imaging (fMRI) studies have shown that responses in the human middle temporal complex (hMT+) are selective for stimulus features including motion direction and coherence [17,18,19]. Additionally, hMT+ shows selectivity for motion in depth and structure-from-motion [20, 21], suggesting that hMT+ may play an important role in the perception of the rotating cylinder illusion. Likewise, the lateral occipital complex (LOC), which plays an important role in object and form processing [22], has been shown to respond more strongly to rotation-in-depth SFM stimuli (versus scrambled or translational motion), indicating a role for LOC in the processing of motion-defined shapes [20, 23].

Abnormalities in laboratory assessments of visual perception have been reported in psychotic disorders, ranging from low to higher-level processes [5,6,7, 24], including both atypical motion perception and visual integration. For example, studies in PwPP have shown impairments in some aspects of visual motion perception [25,26,27,28,29,30,31,32]. In general, these findings suggest there may be deficits in global motion perception associated with psychosis, while local motion perception may be relatively intact [28, 33, 34]. Visual integration is a process by which visual information is combined to create the perception of a more complex whole [35, 36]. Integration of visual forms also involves processing across multiple levels within the visual hierarchy, and includes visual functions such as center-surround suppression, contour integration, border detection, shape representation, and figure-ground segmentation, all of which have been found to be impaired in PwPP [29, 32, 37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55]. Thus, impairments in visual integration and visual motion perception may both be relevant for SFM perception in PwPP.

One potential explanation for abnormal vision in psychosis is a disruption in excitatory/inhibitory (E/I) balance [56, 57]. This has been proposed in PwPP based on evidence showing disruption in glutamate [58, 59] and γ-amino butyric acid (GABA) [60, 61]. Bi-stable visual paradigms may provide a convenient behavioral marker of E/I balance in the visual cortex. Specifically, the rate at which bi-stable stimuli alternate, or the duration for which one percept dominates, has been linked to excitatory (i.e., glutamatergic) and inhibitory (i.e., GABA-ergic) neural processes in humans [62,63,64,65,66]. If PwPP show different bi-stable switch rates versus controls, this may indicate an impairment in the neural processes that promote one percept while suppressing the other, which could reflect disrupted E/I balance. Prior research has shown that switch rates during binocular rivalry are slower in PwPP relative to healthy controls [67,68,69,70,71,72]. Additionally, studies using MR spectroscopy in PwPP have found abnormalities in neural metabolite concentrations in the visual cortex, including glutamate and GABA [73,74,75,76,77]. However, the specific role of different neurochemicals during bi-stable perception in psychosis is currently unknown.

Genetics is another factor known to play a significant role in the etiology of psychotic disorders, including schizophrenia and bipolar disorder [78, 79]. One way to study the role of genetic liability for psychosis in perceptual and cognitive dysfunction is to study task performance among unaffected biological relatives of PwPP [45, 46, 80, 81]. First-degree biological relatives (i.e., parents, siblings, and children) share on average 50% of their genes with an individual with psychosis. If abnormal perceptual or cognitive performance is observed among such relatives, this may provide evidence for an endophenotype [82,83,84], which is an anomaly associated with genetic liability for psychosis, rather than with a diagnosed psychotic disorder per se.

In the current study, we tested the hypothesis that PwPP have impaired visual integration of local motion cues into a global percept. Specifically, we examined bi-stable perception in PwPP, their close biological relatives, and healthy controls using an SFM rotating cylinder task. We also measured metabolite concentrations in the visual cortex using MR spectroscopy at 7 tesla, including glutamate, GABA, and glutamine levels, to examine the potential role of these metabolites during SFM perception in PwPP. Based on previous studies of binocular rivalry [67,68,69], we expected to find slower switch rates during SFM perception in PwPP. Contrary to our prediction, we in fact observed faster switch rates during SFM perception in both PwPP and their biological relatives, compared to healthy controls. Our findings suggest an impairment in bi-stable visual motion and form perception that is linked to genetic liability for psychosis.

Methods

This study was conducted as part of a larger series of experiments in the Psychosis Human Connectome Project (pHCP). The essential details of our experimental approach are described below. For a full description of all pHCP study methods, please see [85, 86].

Participants

Participants were recruited as part of the pHCP study at the University of Minnesota (Table 1). Data were collected as part of a series of experiments focused on visual perception, which included fMRI and MR spectroscopy at 7 tesla [86].

Table 1 Subject group demographics, clinical symptom levels, and cognitive measures.

A total of 152 people participated in the SFM task. Participants were divided into three groups: people with a history of psychotic psychopathology (PwPP hereafter), their first-degree biological relatives (relatives hereafter), and healthy controls (see Table 1 for detailed demographics; see Supplemental Table 1 for demographics after exclusion). We chose our sample size for the 7 T portion of the pHCP to ensure adequate power to detect small to medium effect sizes, based on previous studies of visual dysfunction in PwPP. Of the 152 participants initially tested, 49 returned for a second (repeat) session (Table 1). The median interval between the initial and repeat sessions was 133.5 days (range: 36–1173 days; see Supplementary Fig. 1).

All participants were between 18 and 65 years of age, spoke English as their primary language, had not been diagnosed with any learning disability, had an IQ of at least 70, and had no current or past central nervous system disease. PwPP had a history of a disorder with psychotic psychopathology (i.e., schizophrenia, schizoaffective, bipolar), although not all were symptomatic at the time of their participation. People with bipolar disorder may or may not experience psychotic symptoms (delusions and hallucinations) [87, 88]; we included only those who had a history of psychotic psychopathology. Relatives were biological parents, siblings, or offspring of individuals with psychotic psychopathology. All participants had Snellen visual acuity (with correction, if used) of 20/40 or better. Individuals with poorer than 20/40 acuity were excluded. A total of 3 participants were excluded based on the criteria above (0 controls, 2 relatives, 1 PwPP) and are not included in Table 1. As part of the pHCP, participants completed two 3 T fMRI scanning sessions and one or two 7 T scanning sessions. Details of the clinical assessments and the 3 T scanning protocol are included in our recent publication [85]. SFM task data were collected outside of the scanner, in a separate psychophysics room, prior to 7 T scanning on the same day. All participants provided written informed consent prior to participating and were compensated $20 per hour. All procedures were in compliance with the Declaration of Helsinki and were approved by the University of Minnesota IRB.

Apparatus

Psychophysical data were collected at the Center for Magnetic Resonance Research at the University of Minnesota. Data were collected on an Apple Mac Pro using an Eizo FlexScan SX2462W monitor with a 60 Hz refresh rate (mean luminance = 61.2 cd/m²). A Bits# stimulus processor (Cambridge Research Systems) was connected but not used for this particular experiment. Luminance values were linearized using a PR655 spectrophotometer (Photo Research). The head position was stabilized using an adjustable chin rest positioned at a viewing distance of 70 cm. Stimuli were generated and responses were collected using PsychoPy [89].

Stimuli

The visual stimuli used were standard structure-from-motion rotating cylinders [13, 14]. The rotating cylinder is a classic illusion in which small moving elements (here, black and white squares) move back and forth across a rectangular area in order to induce the perception of a cylinder rotating in 3-dimensional space. The rotating cylinder stimuli used here were composed of 400 small black and white squares (each 0.25° of visual angle; 200 black and 200 white) that alternated between moving from left to right and right to left across the width of a rectangular area and each positioned pseudo-randomly along the height of the rectangular region (height = 10°, width = 7°). The squares sped up as they approached the center of the rectangle and slowed down as they approached each edge. This was done to simulate the perceived speed of the squares as if they were positioned on a transparent cylinder rotating in the depth plane (simulated rotation speed of 90°/s).
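The speed profile described above is what orthographic projection of a point on a rotating cylinder produces: horizontal position follows a sine of the rotation angle, so horizontal velocity follows a cosine, which is largest at the center and falls to zero at the edges. The following is a minimal sketch of that geometry, not the authors' PsychoPy code. The width, rotation speed, and refresh rate are taken from the text; the function names and the orthographic-projection assumption are ours.

```python
import math

WIDTH_DEG = 7.0          # width of the rectangular area (from the text)
ROTATION_DEG_PER_S = 90  # simulated rotation speed (from the text)
REFRESH_HZ = 60          # monitor refresh rate (from the text)


def square_x_position(phase_deg, t_s, width_deg=WIDTH_DEG,
                      rotation_deg_per_s=ROTATION_DEG_PER_S):
    """Horizontal position (deg of visual angle, 0 = center) of one square.

    Assumes orthographic projection of a point on the cylinder surface:
    x = radius * sin(angle). phase_deg is the square's starting angle.
    """
    radius = width_deg / 2.0
    angle = math.radians(phase_deg + rotation_deg_per_s * t_s)
    return radius * math.sin(angle)


def square_x_velocity(phase_deg, t_s, width_deg=WIDTH_DEG,
                      rotation_deg_per_s=ROTATION_DEG_PER_S):
    """Signed horizontal velocity (deg of visual angle per second).

    This is the time derivative of square_x_position.
    """
    radius = width_deg / 2.0
    omega = math.radians(rotation_deg_per_s)
    angle = math.radians(phase_deg + rotation_deg_per_s * t_s)
    return radius * omega * math.cos(angle)


# Positions of one square over the first second, sampled once per frame.
frame_positions = [square_x_position(0.0, frame / REFRESH_HZ)
                   for frame in range(REFRESH_HZ + 1)]
```

Under these assumptions a square starting at the center reaches the edge (3.5°) after one second, and its velocity drops from about 5.5°/s at the center to zero at the edge.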

Two versions of the stimuli with subtle but important differences were used for the two different task conditions, referred to as the real switch and bi-stable tasks. In both tasks, the same black and white squares moved back and forth across the rectangular area with speed, size, and position being constant across the two tasks. In the bi-stable task, a small red fixation point (0.6° diameter) was positioned in front of all squares in the center of the rectangular area (Fig. 1A). When dots collided in this task, they occluded one another randomly in order to remove this potential depth cue. In the absence of inherent depth cues, the direction of motion is ambiguous, and perception spontaneously alternates between the front surface rotating to the left and to the right.

Fig. 1: Stimulus and task design.

Static examples of (A) the bi-stable and (B) real switch task stimuli. (C) Hypothetical time course of rotation switches during the bi-stable task. Figure reproduced from ref. [86].

In the real switch task, we simulated physical switches of the rotation direction in depth by changing which set of dots (i.e., those rotating left to right or right to left, surface #1 and surface #2 of the cylinder) was presented in the front or back. To do so, we used the same red fixation dot overlaid on a larger blue circle (1.8° diameter) and alternated which set of dots passed in front of or behind the larger fixation circle (Fig. 1B). This provided an explicit depth cue, thus biasing one percept (e.g., front rotating to the left) to become dominant. Switches occurred every 9–13 s with a total of 11 real switches per block (order and timing were pseudorandomized, but fixed and identical for all subjects). The real switch task was added to the experiment in October 2017, so data in this condition were not collected for a number of the early data sets (10 controls, 0 relatives, 2 PwPP). Of these, 1 control and 2 PwPP were excluded based on other exclusion criteria listed in the Participants section above, leaving 9 control data sets with missing real switch task data. Four of these controls returned for a repeat session during which real switch task data were collected (none of which were excluded for poor real switch task data quality, see below). We chose to retain the bi-stable task data from participants who did not have real switch task data, which gave us different numbers of control participants for the two tasks.

Experimental procedure and task design

In both the real switch and bi-stable tasks, participants were asked to fixate on the small red central circle and use their peripheral vision to determine the direction of rotation of the front surface of the cylinder, either left or right. This was done in order to help ensure the same visual stimulation for all participants, and to hold the viewing position constant over time. Participants were then instructed to respond using the left or right arrow keys to indicate which direction of rotation they perceived. Importantly, they were told to respond immediately at the beginning of the stimulus presentation to their initial percept in order to ensure this initial dominant percept was recorded, in addition to each change in percept. Each participant first completed a short 30 s practice version of the real switch task to familiarize themselves with the task. They then ran one block of the real switch task and 5 blocks of the bi-stable task. Each block was 2 min long; the rotating cylinder was presented for the entirety of the block.

Behavioral task analysis

Real switch task

Data from the real switch task were used to exclude subjects who had trouble detecting real stimulus direction switches from further analyses (i.e., of their bi-stable task data). We defined correct responses in the real switch task as those that matched the stimulus rotation direction and occurred within 4 s after a direction change. This was based on our examination of the average reaction time for correct responses (Supplementary Fig. 2), the average percept durations towards and away from the physical rotation direction (Supplementary Fig. 3), the distribution of reaction times during the real switch task (Supplementary Fig. 4), and the total number of reported switches (Supplementary Fig. 5). We defined poor real switch task performance as having 6 or fewer correct responses (i.e., fewer than 7 of the 11 switches, or < 63.6% accuracy) made within 4 s of a stimulus direction change (Supplementary Fig. 4B). We chose these thresholds after visually inspecting the data and determining that the large majority of participants and responses fell above these thresholds. Of the 147 total data sets with real switch data, including repeat sessions, we excluded a total of 30 data sets that did not meet the above post-hoc criteria (6 controls, 7 relatives, 17 PwPP).
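The scoring rule above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' analysis code: the data layout (lists of time and direction pairs), the function names, and the choice to credit each physical switch at most once are hypothetical, while the 4 s window and the 7-of-11 criterion come from the text.

```python
RT_LIMIT_S = 4.0   # response window after a direction change (from the text)
MIN_CORRECT = 7    # fewer than 7 of 11 correct counts as poor performance
N_SWITCHES = 11    # physical switches per block (from the text)


def count_correct(switches, responses, rt_limit_s=RT_LIMIT_S):
    """Count physical switches followed by a matching response in the window.

    switches:  list of (time_s, direction) for physical direction changes
    responses: list of (time_s, direction) for key presses
    """
    correct = 0
    for switch_time, direction in switches:
        for response_time, response_dir in responses:
            rt = response_time - switch_time
            if 0.0 <= rt <= rt_limit_s and response_dir == direction:
                correct += 1
                break  # credit each physical switch at most once
    return correct


def is_poor_performer(n_correct, min_correct=MIN_CORRECT):
    """True when a data set falls below the 7-of-11 criterion."""
    return n_correct < min_correct


# Hypothetical example: three physical switches and four key presses.
example_switches = [(10.0, "left"), (21.0, "right"), (30.0, "left")]
example_responses = [(11.2, "left"), (26.0, "right"),
                     (30.5, "right"), (31.0, "left")]
example_correct = count_correct(example_switches, example_responses)
```

In the hypothetical example the second switch is missed because the matching response arrives 5 s late, so two of the three switches are scored as correct.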

Bi-stable Task

In order to assess the stability of ambiguous motion direction percepts in this task, we compared the distribution of durations for all reported percepts across the three groups using two-sample Kolmogorov-Smirnov tests. This allows us to examine how the durations of reported percepts may have differed between groups.
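For readers unfamiliar with the test, the two-sample Kolmogorov-Smirnov statistic D is the largest vertical gap between the two empirical cumulative distributions. The sketch below computes D only, in plain Python with a hypothetical function name; it does not compute p values, for which a statistics library such as SciPy (`scipy.stats.ks_2samp`) would normally be used. Nothing here is taken from the authors' code.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic D: largest gap between the empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    n_a, n_b = len(a), len(b)
    i = j = 0
    d_max = 0.0
    while i < n_a and j < n_b:
        value = min(a[i], b[j])
        # Step both CDFs past every observation equal to the current value.
        while i < n_a and a[i] == value:
            i += 1
        while j < n_b and b[j] == value:
            j += 1
        d_max = max(d_max, abs(i / n_a - j / n_b))
    return d_max
```

Identical samples give D = 0, and samples with no overlap give D = 1; a shift toward shorter percept durations in one group shows up as an intermediate value.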

Next, we calculated the switch rate (switches reported per second) and average percept duration (average length of percept dominance) during each task for each subject. To quantify switch rates, we measured the total number of switches in each block and divided it by the total time (2 min per block). Because the switch rate data were highly skewed (based upon visual inspection), we then normalized the data by performing a log10 transformation. As it is possible to not respond during a 2 min block in the bi-stable task, some participants can have switch rates of 0 Hz, which when normalized using log10, become values of negative infinity. Therefore, we replaced values of 0 switches per block with 0.5 switches/120 s when normalizing. This occurred for a total of 2 participants (1 PwPP, 1 control). Although we measured both switch rate and percept duration, both of these measures displayed the same trends, and were highly correlated with one another. Therefore, to avoid repetition, we present the results of the switch rate analysis in the main text, while percept duration data are presented in the Supplemental Information. We note that although the switch rate and percept duration data were very similar, they were not identical. This is because the percept duration is calculated only after the participant’s initial response in each block, which occurs a short time after the stimulus onset, whereas the switch rate was defined based on the full two-minute duration of the block. As the data in both the real switch and bi-stable tasks did not have equal variance across groups and were skewed (even after log10 transformation), we performed Kruskal-Wallis one-way non-parametric ANOVAs to assess group differences in switch rates in each task.
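As a worked illustration of the switch rate calculation, the sketch below applies the log10 transformation with the zero-count substitution and computes a mean percept duration from response times. The block duration and the 0.5 substitution come from the text. The function names are hypothetical, and two details are assumptions: the transformation is applied per block, and percept durations are taken as gaps between successive responses.

```python
import math

BLOCK_DURATION_S = 120.0  # each block lasted 2 min (from the text)
ZERO_REPLACEMENT = 0.5    # switches substituted when none were reported


def log_switch_rate(n_switches, block_duration_s=BLOCK_DURATION_S):
    """log10 of switches per second, with the zero-count substitution."""
    if n_switches == 0:
        n_switches = ZERO_REPLACEMENT
    return math.log10(n_switches / block_duration_s)


def mean_percept_duration(response_times_s):
    """Mean time between successive responses, starting at the first one."""
    gaps = [later - earlier
            for earlier, later in zip(response_times_s, response_times_s[1:])]
    return sum(gaps) / len(gaps) if gaps else float("nan")
```

For example, 12 switches in a block correspond to 0.1 Hz, or -1 after the log10 transformation, and a block with no responses maps to log10(0.5/120), roughly -2.38, instead of negative infinity. Because the percept duration ignores the interval before the first response, it is not simply the reciprocal of the switch rate.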

We also measured the stability of bi-stable perception dynamics over time by examining longitudinal variability in a subset of our participants. To do so, we compared switch rates in the bi-stable task measured across two different experimental sessions (months apart) within the same individuals (49 individuals with repeat data; 10 controls, 0 relatives, 39 PwPP). Of these individuals with repeat data, 9 participants (1 control and 8 PwPP) had 1 or more of their data sets excluded based on poor real switch task performance. This left a total of 40 individuals (9 controls and 31 PwPP) with usable repeat data. Information about the amount of time between testing sessions and the number of participants who completed a second test session is provided in Table 1. We calculated the intraclass correlation coefficient (ICC(3,k)) between switch rates for the first and second sessions (ICC model type 3, used for interrater reliability for k number of measures taken per subject; [90]).
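ICC(3,k) is conventionally obtained from a two-way ANOVA decomposition as (MS_subjects − MS_error) / MS_subjects, where the rows are subjects and the columns are the k sessions. The following is a minimal sketch of that textbook formula, assuming a complete subjects-by-sessions table; the function name and data layout are hypothetical and this is not the authors' implementation.

```python
def icc_3k(scores):
    """ICC(3,k) for a subjects x sessions table (list of equal-length rows)."""
    n = len(scores)      # number of subjects
    k = len(scores[0])   # number of sessions per subject
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects, sessions, and residual error.
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / ms_rows
```

Because the session effect is removed before the error term is computed, a uniform shift between sessions (for example, everyone switching slightly faster at the second visit) does not lower ICC(3,k).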

Finally, we sought to identify any differences between clinical diagnoses within our PwPP group. Specifically, we compared participants with schizophrenia (n = 25) and bipolar disorder (n = 16) to healthy controls (n = 37). People with schizoaffective disorder were not included in this analysis, due to a smaller sample size (n = 7).

Association between clinical symptom level and switch rate

We used a set of measures to assess psychiatric symptoms and cognitive functioning, which included the Schizotypal Personality Questionnaire (SPQ; [91]), the Brief Assessment of Cognition in Schizophrenia (BACS; [92]), the Brief Psychiatric Rating Scale (BPRS; [93]), and the Sensory Gating Inventory (SGI; [94, 95]). Out of a variety of clinical assessments administered as a part of the pHCP [85], we chose these as they provide measures of overall psychiatric symptom levels, schizotypy, and cognitive functions, and previous work has found relationships between these factors and visual dysfunction in people with psychotic disorders [67, 96, 97]. The SGI was chosen as a measure of sensory abnormalities (i.e., sensory modulation, filtering, discrimination, and attention) in PwPP and used to measure the relationship between self-reported sensory dysfunction and bi-stable perception. These measures were collected for participants in all three groups (unlike other measures that were collected only from PwPP). In addition to the four primary measures mentioned here, we also chose to include the cognitive-perceptual subscale from the SPQ (SPQ-CP) and a disorganization factor from the BPRS (BPRS-D; [98]), as they are particularly relevant to our analyses and hypotheses. Specifically, the disorganization factor of the BPRS is interesting as previous studies have observed correlations between disorganized symptoms and abnormal visual task performance [39, 99].

To investigate relationships between symptom severity and percept switch rate, we first correlated individual symptom measures with average switch rates across all three groups. We used Spearman rank correlations to avoid assuming linear relationships. Data from repeat sessions were excluded from these correlations, as Spearman correlations assume independent data points.
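Spearman's coefficient is the Pearson correlation computed on ranks, which is why it captures any monotonic relationship rather than only linear ones. A minimal plain-Python sketch is given below, with tied values assigned their mean rank; the function names are hypothetical and no p value is computed.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

A monotonic but non-linear pairing such as x = 1, 2, 3, 4 with y = 1, 4, 9, 16 gives a coefficient of 1, whereas the Pearson correlation of the raw values would be below 1.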

Quantification and analysis of neurochemical concentrations from MR spectroscopy

Neurochemical concentrations in the mid-occipital lobe (i.e., centered on the calcarine sulcus) were collected as part of a 7 T magnetic resonance spectroscopy (MRS) scan on the same day as behavioral SFM data. For full scanning details, see [86]. Data were acquired on a Siemens MAGNETOM 7 T scanner with a custom-built surface radio frequency head coil using a STEAM sequence [100] with the following parameters: TR = 5000 ms, TE = 8 ms, volume of interest (VOI) size = 30 mm (left-right) × 18 mm (anterior-posterior) × 18 mm (inferior-superior), 3D outer volume suppression interleaved with VAPOR water suppression [101], 2048 complex data points with a 6000 Hz spectral bandwidth, chemical shift displacement error = 4% per ppm. B0 shimming was performed using FAST(EST)-MAP to ensure a linewidth of water within the occipital VOI ≤ 15 Hz [102].

We processed our MRS data using the matspec toolbox (github.com/romainVala/matspec) in MATLAB, including frequency and phase correction. Concentrations for 18 different metabolites including glutamate, glutamine, and GABA were quantified in each scanning session using LCModel. We scaled metabolite concentrations relative to an unsuppressed water signal reference, after correcting for differences in gray matter, white matter, and CSF fractions within each subject’s MRS VOI, the proportion of water in these different tissue types, and the different T1 and T2 relaxation times of the different tissue types. Tissue fractions within the VOI were quantified in each subject using individual gray matter and white matter surface models from FreeSurfer [103]. MRS data sets were excluded based on the following data quality criteria: H2O line width > 15 Hz, LCModel spectrum line width > 5 Hz, or LCModel SNR < 40. Out of a total of 193 MRS datasets (54 controls, 44 relatives, and 95 PwPP), 10 datasets (1 control, 4 relatives, 5 PwPP) were excluded in this way, leaving 183 total MRS datasets. In addition to subjects whose SFM data we excluded for having poor real switch task performance, and excluding repeat sessions, this left a total of 114 participants with usable SFM and MRS data (37 controls, 33 relatives, and 44 PwPP).
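The three data quality criteria can be expressed as a simple filter. The sketch below is illustrative only: the function and constant names are hypothetical, and the water line width threshold is assumed to be in Hz, matching the shimming target given for the occipital VOI.

```python
MAX_WATER_LINEWIDTH_HZ = 15.0    # H2O line width limit (units assumed Hz)
MAX_LCMODEL_LINEWIDTH_HZ = 5.0   # LCModel spectrum line width limit
MIN_LCMODEL_SNR = 40.0           # LCModel signal-to-noise limit


def passes_mrs_quality(water_linewidth_hz, lcmodel_linewidth_hz, lcmodel_snr):
    """True when a data set meets all three quality criteria."""
    return (water_linewidth_hz <= MAX_WATER_LINEWIDTH_HZ
            and lcmodel_linewidth_hz <= MAX_LCMODEL_LINEWIDTH_HZ
            and lcmodel_snr >= MIN_LCMODEL_SNR)
```

A data set is excluded as soon as any one criterion fails; values exactly at a threshold are retained, since the exclusion rules are stated as strict inequalities.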

In order to probe the possible role of excitatory and inhibitory markers during bi-stable perception in PwPP, we examined relationships between metabolite concentrations from MRS and our bi-stable SFM behavioral measures. Specifically, we used Spearman rank correlation to test for correlations between metabolite levels from MRS (i.e., GABA, glutamate, and glutamine) and average switch rates across participants from all three groups. As in our other correlational analyses, data from repeat sessions were excluded, as Spearman correlations assume independence across data points.

Results

Individuals with psychotic psychopathology reliably detect real switches in rotation direction

To ensure that all participants understood the task and were able to detect real changes in the rotation direction, we used a real switch task in which physical depth cues were present. This allowed us to measure the detection of real rotation direction switches.

We examined the distributions of the number of responses made in the real switch task in each group (Supplementary Fig. 5; panel A shows all responses, B shows responses with reaction time < 4 s), to ensure participants were responding to the majority of physical switches. The large majority of participants responded to ≥ 63.6% of the physical switches, or 7/11 total switches, with a reaction time < 4 s (83.3% controls, 81.4% relatives, and 70.7% PwPP). We excluded a total of 30 individual participants who did not meet the above post-hoc criteria (6 controls, 7 relatives, 17 PwPP; i.e., individuals with an accuracy below the dashed line in Fig. 2), leaving a total of 122 participants with bi-stable task data. To determine whether exclusion rates differed across groups, we performed a contingency table analysis. The proportion of excluded participants across groups was not significantly different from what would be expected by chance (i.e., the number of participants excluded within each group was proportional to the total size of the group; χ²(2) = 1.55, p = 0.46).

Fig. 2: Real switch task results.

Box plots depict the median accuracy (middle line) as well as the 25th and 75th percentiles (box), 1.5 × the interquartile range (whiskers), and outliers (plus) for the real switch task for each group (controls: green, relatives: blue, psychosis: red). Data plotted represent detection accuracy for all correct responses (those that matched the real rotation direction, and that were made within 4 s of the direction change). The dashed horizontal line represents the cutoff value of 63.6% accuracy (7 of 11 switches); data sets below this line were excluded.

As can be seen in Fig. 2, PwPP did not differ significantly in real switch task accuracy (χ²(2) = 5.32, p = 0.069), as compared to healthy controls or first-degree biological relatives of PwPP. However, there was a significant difference in reaction time in the real switch task (χ²(2) = 9.33, p = 0.009; Supplementary Fig. 2), with PwPP tending to respond most slowly. Importantly, our analyses of the real switch task data demonstrate that participants in this study, including PwPP, can accurately detect real switches in the direction of the rotating cylinder. This suggests that participants in each group generally understood the task instructions, could integrate the local stimulus elements into a global rotational motion percept, and could correctly respond to physical changes in depth cues (i.e., occlusion) to disambiguate the cylinder’s rotation direction.

Faster switch rates in individuals with psychotic psychopathology

We hypothesized that integration of local information into a global percept is impaired for individuals with psychotic psychopathology, which results in abnormal bi-stable switch rates. To examine this, we first looked at histograms of the distributions of reported percept durations in each group, in order to test whether the three groups reported similar switch dynamics. As shown in Fig. 3A, B, although the distributions of percept durations from the three groups follow similar trends, they are significantly different (Kolmogorov-Smirnov (K-S) test real switch task – controls vs. relatives: D(32,37) = 0.146, p < 0.001; controls vs. PwPP: D(32,48) = 0.281, p < 0.001; relatives vs. PwPP: D(37,48) = 0.147, p < 0.001; K-S test bi-stable task - controls vs. relatives: D(37,37) = 0.082, p < 0.001; controls vs. PwPP: D(37,48) = 0.13, p < 0.001; relatives vs. PwPP: D(37,48) = 0.075, p < 0.001). For the real switch task (Fig. 3A), this indicates that participants in all 3 groups could detect the real switches occurring at intervals of 9, 11, and 13 s, but that relatives, and especially PwPP, tended to report shorter percept durations. For the bi-stable task (Fig. 3B), distributions were also shifted toward shorter percept durations for PwPP versus controls, with relatives showing an intermediate pattern. Below, we tested the directionality of these effects using average switch rate and percept duration metrics.

Fig. 3: Histograms of percept durations for the real switch and bi-stable tasks.

Shown are normalized distributions of percept durations, defined as times between reported switches, across the three groups (controls, n = 32 & 37 – green, relatives, n = 37 & 37 – blue, psychosis, n = 48 & 48 – red) for the real switch task (left) and bi-stable task (right). Distributions were normalized by dividing by the total number of responses across all participants within each group.

We examined switch rates for both the real switch and bi-stable tasks (Fig. 4). The real switch task data allowed us to assess how well participants could detect real switches in rotation direction based on explicit depth cues, whereas the bi-stable task probed spontaneous percept switches in the absence of such cues.

Fig. 4: Switch rates for the real switch task and bi-stable task.

Real switch task (A) and bi-stable task (B) switch rates are plotted in the left and right panels, respectively. Box plots show median (middle line), 25th and 75th percentiles (box), 1.5 × the interquartile range (whiskers), and outliers (plus) for the three groups. Gray dots show average switch rates from individual participants. Dashed line in A shows average switch rate for physical stimulus changes in the real switch task. Bi-stable switch rates (B) were significantly higher among PwPP vs. controls as well as among relatives vs. controls, using post-hoc chi-squared tests (*). The effect size for the difference in switch rates between PwPP vs. controls, as measured by Cohen’s d, was 0.58, indicating a medium effect.

During the real switch task, PwPP, relatives, and controls tended to report perceptual switches at similar rates (Fig. 4A). As described in the Methods section, the real switch task had physical switches overtly implemented that occurred at an average rate of 0.09 Hz (i.e., every 9–13 s). There was no significant group difference in switch rates in the real switch task (χ²(2) = 5.36, p = 0.069), and median values across participants were close to the true physical switch rate, indicating that participants in each group generally could detect real switches in rotation direction in this task. This suggests that all participant groups were able to integrate local motion signals into an unambiguous global percept of rotational motion given some explicit depth cues.

During the bi-stable task, we observed a significant difference in switch rates across groups (χ²(2) = 6.49, p = 0.039), with PwPP and their relatives having faster switch rates overall versus controls (Fig. 4B). Post-hoc comparisons showed that switch rates among PwPP were significantly higher than in controls (χ²(1) = 6.93, p = 0.0085). This suggests that the ability to integrate ambiguous motion stimuli into a global percept in the absence of contextual depth information may be disrupted among PwPP.

Switch rates for the relatives group also differed significantly from controls but not from PwPP (controls vs. relatives: X2(1) = 3.98, p = 0.046; PwPP vs. relatives: X2(1) = 0.516, p = 0.473). This suggests that genetic risk for psychotic disorders, as expressed by unaffected first-degree relatives, may be associated with an attenuated disruption in bi-stable motion perception during our SFM task.
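The p-values reported for the bi-stable task can be checked by hand from the test statistics, assuming each X2 value is referred to a chi-squared distribution with the stated degrees of freedom. For 2 degrees of freedom the upper-tail probability is exp(−x/2), and for 1 degree of freedom it is erfc(√(x/2)). The helper name below is hypothetical, and the sketch deliberately handles only these two cases.

```python
import math


def chi2_upper_tail(x, df):
    """Upper-tail probability of a chi-squared variable (df = 1 or 2 only)."""
    if x < 0:
        raise ValueError("the test statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


# Statistics quoted in the text for the bi-stable task.
reported = [
    ("all three groups", 6.49, 2),
    ("controls vs. PwPP", 6.93, 1),
    ("controls vs. relatives", 3.98, 1),
    ("PwPP vs. relatives", 0.516, 1),
]

if __name__ == "__main__":
    for label, stat, df in reported:
        print(f"{label}: X2({df}) = {stat}, p = {chi2_upper_tail(stat, df):.4f}")
```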

Additionally, we found a main effect of time (i.e., block number: F(119,2) = 27.15, p < 0.001) in the bi-stable task, indicating a slight overall decrease in percept switch rates across the five 2-min task blocks. As can be seen in Supplementary Fig. 6, the pattern of differences in switch rates between the three groups remained consistent across blocks. Changes in switch rates over time did not differ significantly between groups (F(119,2) = 1.046, p = 0.354). The slower switch rates during later task blocks may reflect an effect of practice, learning, or adaptation during bi-stable task performance.

We measured both switch rates and percept durations in the bi-stable task; both of these measures displayed the same trends across groups (group difference in percept duration, X2(2) = 8.67, p = 0.013; controls vs. PwPP: X2(1) = 7.45, p = 0.0063; relatives vs. PwPP: X2(1) = 5.23, p = 0.022) and were highly correlated with one another. Likewise, for the real switch task, we saw no significant group differences in percept duration (X2(2) = 2.85, p = 0.24). We show the average percept duration results in Supplementary Fig. 7.

To assess whether variability in bi-stable perception differed across groups, we calculated the coefficient of variation (i.e., the standard deviation of percept duration divided by the mean) for both task types within each participant. We found no significant difference in percept variability between groups (real switch: X2(2) = 5.351, p = 0.067; bi-stable: X2(2) = 4.06, p = 0.131; see Supplementary Fig. 8), suggesting that the variability of percept durations was generally comparable across groups.
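The variability measure described above is the standard deviation of percept durations divided by their mean, computed separately for each participant. A minimal sketch follows, assuming the sample (n − 1) standard deviation; the text does not state which estimator was used, and the function name and example durations are hypothetical.

```python
import statistics


def coefficient_of_variation(durations_s):
    """Standard deviation of percept durations divided by their mean.

    Uses the sample (n - 1) standard deviation, which is an assumption.
    """
    if len(durations_s) < 2:
        raise ValueError("need at least two percept durations")
    mean = statistics.fmean(durations_s)
    if mean <= 0:
        raise ValueError("mean percept duration must be positive")
    return statistics.stdev(durations_s) / mean


# Hypothetical percept durations (in seconds) for two participants.
durations_by_participant = {
    "P01": [4.0, 6.0, 8.0],
    "P02": [5.0, 5.5, 4.5, 5.0],
}

if __name__ == "__main__":
    for participant, durations in durations_by_participant.items():
        print(participant, round(coefficient_of_variation(durations), 3))
```

Because the measure is unitless, it allows variability to be compared between participants whose mean percept durations differ.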

Lastly, we analyzed differences in bi-stable perception across different clinical diagnoses. We compared switch rates among people with schizophrenia, people with bipolar disorder, and healthy controls (Supplementary Fig. 9). Unlike in the full group comparisons, we did see significant differences between these diagnostic groups for the real switch task (X2(2) = 6.85, p = 0.033). Post-hoc comparisons between the individual groups showed a significant difference between the control and bipolar groups (controls vs. bipolar disorder: X2(1) = 4.34, p = 0.037). This suggests that patients with bipolar disorder may have had particular difficulty in detecting real switches. However, we did not see any significant differences between the schizophrenia group and healthy controls (controls vs. schizophrenia: X2(1) = 1.45, p = 0.228) or between the two psychosis groups (schizophrenia vs. bipolar disorder: X2(1) = 0.818, p = 0.366). For the bi-stable task, we once again saw a significant effect of group, where people with schizophrenia and bipolar disorder reported switches at a faster rate than controls (X2(2) = 11.28, p = 0.004). Post-hoc comparisons showed no significant differences between people with bipolar disorder and healthy controls (controls vs. bipolar disorder: X2(1) = 2.52, p = 0.112) or between the two psychosis groups (schizophrenia vs. bipolar disorder: X2(1) = 0.086, p = 0.769). However, there was a significant difference between people with schizophrenia and healthy controls (controls vs. schizophrenia: X2(1) = 5.304, p = 0.021). As can be seen in Supplementary Fig. 9, these results suggest that differences in switch rates between the diagnostic sub-groups and controls may be smaller or less reliable than for the psychosis group as a whole. Alternatively, we may have had insufficient statistical power to detect diagnostic group differences, given the smaller number of participants in each sub-group.

Switch rates remain consistent over months

Next, to examine the stability of bi-stable perception dynamics over a long period of time, we brought back a subset of participants for a second experimental session. Table 1 and Supplementary Fig. 1 show the median and range of the time between the first and second sessions in each group (median = 133.5 days). Across participants, the intraclass correlation for switch rates during the first and second sessions was high (ICC(3,k) = 0.88; Fig. 5), indicating that switch rates remained rather consistent within participants over a period of several months. Additionally, this provides evidence that abnormalities in bi-stable perception among the psychosis group are relatively stable over a long time period.
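For readers unfamiliar with the ICC(3,k) notation, it denotes the two-way mixed-effects, consistency, average-measures form of the intraclass correlation, which equals (MSR − MSE) / MSR, where MSR is the between-participant mean square and MSE is the residual mean square. The sketch below computes it from a participants-by-sessions table; the function name and example values are hypothetical and are not the study's data or code.

```python
def icc_3k(scores):
    """ICC(3,k) for a table with one row per participant and k sessions."""
    n = len(scores)
    k = len(scores[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least 2 participants and 2 sessions, no gaps")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Between-participant sum of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    # Residual sum of squares after removing participant and session effects.
    ss_err = sum(
        (scores[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )

    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    if ms_rows == 0:
        raise ValueError("no between-participant variance")
    return (ms_rows - ms_err) / ms_rows


# Hypothetical switch rates (Hz): [session 1, session 2] per participant.
example = [[0.10, 0.12], [0.20, 0.19], [0.35, 0.33], [0.15, 0.18]]

if __name__ == "__main__":
    print(round(icc_3k(example), 3))
```

Because this is a consistency form, a uniform shift between sessions (for example, every participant switching slightly more slowly at the second visit) does not lower the value.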

Fig. 5: Longitudinal variability of bi-stable switch rates.

Plotted are switch rates (Hz) for session 1 (x-axis) and session 2 (y-axis) for all subjects who participated in two sessions. Red dots show data from PwPP (n = 31), and green dots show data from controls (n = 9).

Association between bi-stable perception and clinical symptoms

We next examined the relationships between measures of clinical psychopathology and bi-stable perception. To do so, we quantified Spearman correlations between clinical scores and bi-stable switch rates (Supplementary Fig. 10). We found no significant relationships between switch rates and measures of cognitive functioning (BACS; r(119) = −0.139, uncorrected p = 0.129) or sensory gating (SGI; r(120) = 0.161, uncorrected p = 0.077). However, we did find a correlation between faster switch rates and higher overall psychiatric symptom scores (BPRS; r(103) = 0.207, uncorrected p = 0.034) as well as schizotypy (SPQ; r(120) = 0.188, uncorrected p = 0.038). We also examined the cognitive-perceptual subscale of the SPQ and a disorganization factor from the BPRS [98] as these measures are relevant to cognitive/perceptual dysfunction. We found positive relationships with switch rate for both measures (BPRS-D; r(103) = 0.264, uncorrected p = 0.006, Bonferroni corrected p = 0.036; SPQ-CP; r(120) = 0.209, uncorrected p = 0.021). This indicates that across all groups, individuals who had higher levels of psychiatric symptoms, especially disorganization, also showed greater perceptual abnormalities in terms of faster bi-stable switch rates. However, of these, only the BPRS-D correlation survived correction for multiple comparisons. Thus, these findings may be considered preliminary and in need of confirmation.
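To make the correction step concrete: a Bonferroni adjustment multiplies each uncorrected p-value by the number of tests, capped at 1. The corrected value of 0.036 for BPRS-D is consistent with 0.006 × 6, which suggests six tests (BACS, SGI, BPRS, SPQ, BPRS-D, SPQ-CP); that count is an inference from the reported numbers, not something stated in the text. The sketch below also shows a rank-based (Spearman) correlation; all function names and example values are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def bonferroni(p_uncorrected, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_uncorrected * n_tests)


if __name__ == "__main__":
    # Hypothetical switch rates (Hz) and symptom scores.
    rates = [0.10, 0.22, 0.15, 0.30, 0.18]
    scores = [3, 7, 4, 6, 5]
    print("rho =", round(spearman_rho(rates, scores), 3))
    print("corrected p =", round(bonferroni(0.006, 6), 3))
```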

Neurochemical concentrations and perceptual bi-stability

We quantified concentrations for a number of metabolites, including GABA, glutamate, and glutamine in the medial occipital lobe using 7 T 1H-MRS. We chose to focus our analyses on these 3 metabolites as they are putative excitatory and inhibitory markers and have, in the case of GABA, previously been shown to correlate with SFM switch rates among healthy adults [62]. First, we compared concentrations across our three groups for each of these three metabolites (Supplementary Fig. 11). We did not observe any significant differences across the three groups for these three metabolites (GABA: X2(2) = 0.194, p = 0.91; glutamate: X2(2) = 0.80, p = 0.67; glutamine: X2(2) = 1.07, p = 0.58). We then measured correlations between metabolite concentrations and switch rates from the bi-stable task across all participants, excluding repeated scan data (Supplementary Fig. 12). We found no significant relationships between switch rates and occipital concentrations of glutamate (r(112) = −0.058, p = 0.54) or glutamine (Spearman’s: r(112) = 0.06, p = 0.53). We did see a non-significant trend between higher occipital GABA levels and slower SFM switch rates (r(106) = −0.16, p = 0.1), which is in the same direction as a relationship observed in healthy adults that has been reported previously [62]. However, as our result does not reach statistical significance, this finding should be interpreted with caution. Correlations between GABA levels and switch rates within each group were not significant for either controls (r(35) = −0.091, uncorrected p = 0.59) or PwPP (r(42) = −0.091, uncorrected p = 0.558). There was a correlation between higher GABA levels and slower switch rates within the relatives group, but it did not survive correction for multiple comparisons (r(31) = −0.367, uncorrected p = 0.036, Bonferroni corrected p = 0.11). Thus, we did not find strong evidence from MRS to support a relationship between excitatory or inhibitory markers in the visual cortex and switch rates from our bi-stable SFM task.

Discussion

We examined visual motion and form integration across the psychosis spectrum using a bi-stable SFM perception task among PwPP, their first-degree relatives, and healthy controls. In our bi-stable task, we found significantly faster switch rates among PwPP compared to healthy controls, with relatives showing a similar pattern of results. Faster switch rates in PwPP and relatives suggest that SFM percepts were less robust or stable during the task. The level of instability tended to be consistent across experimental sessions held several months apart. There were no significant group differences in switch rates during a real switch task, indicating a similar ability to detect actual physical changes in the rotation direction. Across participants, faster switch rates were significantly correlated with higher psychiatric symptom levels, particularly disorganization, suggesting that this perceptual abnormality reflected psychiatric symptom severity. This is in line with previous findings demonstrating correlations between disorganization symptoms and abnormal visual perception [39, 99, 104,105,106]. Together, our findings indicate that abnormal perception of the bi-stable rotating cylinder illusion is associated with psychotic psychopathology as well as unexpressed genetic liability for psychosis.

Prior research has shown that perception of the rotating cylinder requires processing of both object shape [20, 23] and complex rotational motion [17,18,19], and involves brain regions implicated in both processes, such as LOC and hMT+. Therefore, differences in switch rates among PwPP may be caused by impaired integration of local motion cues into a larger rotating object percept. Consistent with this notion, previous studies have found that visual functions involving integration (including center-surround suppression, contour integration, border detection, object and shape representation, figure-ground segmentation, and structure-from-motion) are also impaired in PwPP [26, 28, 29, 32, 33, 37, 40, 42,43,44,45,46].

Our results suggest that the faster switch rates we observed in our SFM task among PwPP may not be attributed solely to impaired local motion detection, but rather involve impairments in the ability to combine those local motion signals into one of two stable competing percepts of a rotating cylinder. Performance in our real switch task depends on the ability to integrate local motion cues to detect rotation direction, and the explicit depth information facilitates the perception of an unambiguous global motion percept. Although we observed differences across groups in the distributions of percept durations during our real switch task (i.e., greater numbers of shorter percepts in PwPP), switch rates in the real switch task did not differ significantly between groups. Reaction times were significantly longer among PwPP during the real switch task, but this does not appear sufficient to explain our observations of faster switch rates among PwPP, as slower reaction times alone would not be expected to yield a greater number of reported switches. Thus, relatively normal performance in the real switch task among PwPP suggests that their ability to detect changes in local motion direction was not dramatically impaired, in agreement with previous work [27, 33, 34].

Our results suggest that PwPP experienced less stable global SFM percepts, as switch rates during the bi-stable task were significantly faster than for control participants. Models of bi-stable perception propose that reciprocally connected neural populations with different tuning properties (e.g., selectivity for rightward vs. leftward rotation in depth) may compete when a person views ambiguous bi-stable stimuli. Percepts alternate as one neural population suppresses the other, and then is suppressed in return [107,108,109]. Our results suggest that the depth of suppression between competing populations may be weaker among PwPP during SFM perception; weaker suppression might be expected to yield faster switch rates, as the dominant population would be less able to sustain suppression of the competing percept. This would appear consistent with other reports of weaker perceptual suppression in PwPP (e.g., center-surround suppression, temporal masking; [37, 45, 81, 110, 111]).

A pair of recent studies examined bi-stable perception among people with schizophrenia using a rotating sphere illusion [96, 112]. Those studies used methods to bias perception towards one rotation direction by stopping and starting the stimulus (continuation bias) or through the repeated presentation of unambiguous stimuli that tended to rotate more in one direction (learning bias). Their goal was to examine bi-stable SFM perception under the framework of predictive coding. They found that people with schizophrenia showed less biased perception in both their continuation and learning paradigms. Although bi-stable switch rates were not reported in these previous studies, our current findings appear generally consistent with their notion of weaker sensory predictions during bi-stable SFM perception in PwPP.

We predicted switch rates would be slower among PwPP compared to controls, based on results in the literature showing slower switch rates during binocular rivalry among people with schizophrenia [67,68,69,70,71,72]. Contrary to this prediction, we found switch rates in our SFM task were in fact faster among PwPP when compared to healthy controls. There are a number of factors that may explain this apparent discrepancy. Previous studies used a different bi-stable paradigm, binocular rivalry, in which participants are shown two dissimilar stimuli (one in each eye, e.g., orthogonal gratings) that compete for perceptual dominance. This task is thought to particularly engage neural populations selective for low-level image features (e.g., orientation, color), such as those in V1, to drive interocular suppression and perceptual rivalry [10, 113]. Thus, binocular rivalry may depend more strongly on suppression at an earlier stage of neural processing compared to bi-stable SFM. Indeed, a recent study by Cao and colleagues [114] showed that switch rates from rotating cylinder and binocular rivalry tasks were only weakly correlated across individuals.

In addition, previous studies have focused specifically on people with schizophrenia or bipolar disorder, whereas we examined bi-stable visual perception across the psychosis spectrum. Although we saw significantly faster switch rates among PwPP versus controls overall, and significant differences in the post-hoc comparison between schizophrenia and controls, post-hoc tests between bipolar and control groups were not significant. This may reflect a lack of statistical power due to the smaller sample sizes within these diagnostic sub-groups. Our schizophrenia and bipolar sub-groups showed very similar switch rates to one another, suggesting that bi-stable SFM perception was not dramatically different across different diagnostic categories within the psychosis spectrum.

We hypothesized that abnormal switch rates during bi-stable SFM perception might be related to impaired E/I balance in psychosis. Although the binocular rivalry literature has shown slower switching in PwPP compared to healthy controls, a reason one might expect faster switch rates comes from reports of reduced GABA in calcarine cortex in schizophrenia [74, 115]. Among healthy adults, lower occipital GABA levels have been associated with increased perceptual switches during SFM [62]. Dysfunction in both excitatory glutamate [58, 59] and inhibitory GABA functioning [60, 61, 74, 115] has been reported in psychotic disorders and suggests disruptions in E/I balance. Investigations support the contention that bi-stable perception, including in the rotating cylinder illusion, is linked to E/I balance in the visual cortex among healthy adults [62, 64, 65]. Although we observed a relationship between higher inhibitory GABA levels and slower SFM switch rates in a direction that matched previous reports [62], this correlation did not reach statistical significance, and we saw no significant relationships between SFM switch rates and glutamate or glutamine. Methodological differences may explain the discrepancy between our results and those of van Loon and colleagues [62]. For example, their study used a MEGA-PRESS sequence at 3 T with a relatively long echo time (73 ms), whereas we used an ultra-short echo time (8 ms) STEAM sequence at 7 T (see [86] for a discussion of the advantages of our technique). Because we saw no significant group differences in GABA levels, nor any significant correlations within groups between GABA and switch rates (after correction for multiple comparisons), we cannot draw any strong conclusions about the role of GABAergic inhibition in abnormally fast switch rates among PwPP. However, we note that group differences in bi-stable switch rates remained significant even after excluding participants taking benzodiazepines (3 relatives, 17 PwPP; X2(2) = 6.47, p = 0.04, Cohen’s d = 0.49). Finally, it is not clear to what extent neurochemical concentrations in other brain regions (e.g., those with greater foveal selectivity, and/or higher visual areas such as hMT+) may be linked to the perception of bi-stable SFM.

Examining SFM task performance among first-degree relatives allowed us to explore the role of genetic liability for psychosis during bi-stable SFM perception. Across our analyses, results from relatives tended to fall in between controls and PwPP. This pattern of results can be seen both in the average switch rates (Fig. 4) and the distribution of percept durations (Fig. 3). Previous studies from our group and others have found that biological relatives show attenuated visual dysfunction (e.g., backward masking, perceptual organization) that falls on a continuum between healthy controls and PwPP [42, 80, 81, 116]. The results of the current study appear consistent with this notion and suggest that faster-than-normal switch rates during bi-stable SFM perception may be linked in part to genetic liability for psychosis.