Introduction

Schizophrenia is a mental disorder characterized by psychosis, that is, delusions and hallucinations. Previous studies have successfully identified various behavioral markers of schizophrenia1. Various abnormalities in eye movements have been reported in patients with schizophrenia2,3. Among them, significant differences in eye movements are observed in a simple free-viewing task. Gaze covers a relatively large area of the images in healthy participants, whereas participants with schizophrenia tend to limit their gaze to a narrower area of the images, resulting in shorter scanpath lengths than healthy participants4,5,6,7. Visual and visuo-cognitive processing are affected8,9,10, while the motor aspects of saccadic eye movements are less affected7,11. In schizophrenia, it would therefore be reasonable to assume that abnormalities in visual exploration result from abnormalities in visual or visuo-cognitive processing such as visual attention. While deficits in top-down attention have been well documented in schizophrenia research12, the effect of bottom-up visual attention or, more specifically, visual salience in schizophrenia has been studied relatively little. Here, we examined the possibility that processing of visual salience is affected in schizophrenia. It is important to note that accumulating studies have shown that motivational salience is affected in schizophrenia13,14, and the aberrant salience hypothesis of psychosis15 proposes that aberrant assignment of salience to elements of one’s experience leads to delusion and hallucination. The motivation of the current study is to extend the concept of the aberrant salience hypothesis explicitly to the visual domain.

More specifically, we sought to determine whether these abnormalities in eye movements were due to altered visual salience to objects in the visual scene. To this end, we examined eye movements during free-viewing of natural images in participants with schizophrenia (SZ) and healthy controls (HC). The visual salience of the presented images was quantified by Itti and Koch’s computational model of salience, which has already proven useful for predicting and analyzing the visual exploration behavior of humans and non-human primates16,17,18,19. In the present study, we first tested whether the gaze sequences of participants with schizophrenia differed from those of healthy controls in terms of salience values. Since the results showed case–control differences, we examined in detail which low-level visual features contributed to the differences in eye movements and found that the difference stemmed specifically from orientation salience. We then explored the origin of the effects of orientation salience by examining the stages of the salience computation in the model.

Results

Visual-oculomotor properties are affected in schizophrenia but oculomotor properties are not

Eye-tracking data were obtained from 82 participants with schizophrenia and 252 healthy controls while they freely viewed 56 natural and/or complex images. Table S1 summarizes the demographics of the participants, scores of cognitive tests, and saccade characteristics during free-viewing. Analysis of saccades revealed that visual exploration is affected in schizophrenia, as reported in previous studies7,11. The average number of saccades during the 8 s viewing period, the average amplitude of saccades, and the scanpath length were smaller in participants with schizophrenia than in healthy controls (Table S1). In contrast, oculomotor properties, as assessed by the fitted parameters of the main sequence relationship, were relatively unaffected (Table S1).

Salience-guided eye movements are affected in examples of single images

To investigate the relationship between the gaze positions (defined as endpoints of regular saccades) and the visual salience of the test images, saliency maps were computed using the Itti–Koch saliency computational model (Fig. 1A and Fig. S1)16,20.

Figure 1

The mean value of orientation salience at the gaze of participants with schizophrenia is higher than that of healthy control subjects. (A) The saliency map was calculated from the Itti–Koch model. Visual salience for low-level visual features (color, “Col”; luminance, “Lum”; orientation, “Ori”) was also computed in this model. (B) Gaze positions of two control subjects (left) and two SZs (right) represented by numbers and superimposed on the saliency map of test images. Numbers indicate saccade order. (C) The saliency values averaged across test images and participants are plotted across saccade numbers on a log scale, for the full model and for the single-feature salience models. Age-matched resampled data are plotted for healthy controls. Magenta, the healthy controls (HC; n = 252); blue, the participants with schizophrenia (SZ; n = 82). Numbers on the plots denote P values for the main effect of the participant group. (D) Mean saliency values for all images and saccades are plotted for the healthy controls (HC) and the schizophrenia group (SZ). Data from four salience models are plotted. Symbols denote median values. Error bars denote the first and the third quartile. Numbers on the plots denote P values (“p”) and the effect sizes (“δ”, Cliff’s delta) of the Wilcoxon rank-sum test.

In representative examples (for test images #8 and #38), saccades of healthy controls were distributed not only in highly salient positions (shown in yellow) but also in other locations in the image (Fig. 1B, left). In contrast, saccades of participants with schizophrenia remained at salient locations of the image during free-viewing (Fig. 1B, right). To quantify the time course of salience values during the 8-s viewing period, the salience values for each gaze were averaged across participants for the test images and were plotted across time (Fig. S2A, left). The mean salience values were higher throughout the viewing period in participants with schizophrenia (SZ; blue in Fig. S2A, left) than in healthy controls (HC; magenta in Fig. S2A, left). The difference between the participant groups became more evident when the salience values were averaged across all saccades during the viewing period (Fig. S2A, right).
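The lookup described above, reading the saliency value at each saccade endpoint and averaging over the viewing period, can be illustrated with a minimal sketch. It assumes the saliency map is a row-major grid indexed as map[row][column] and that gaze positions are given in pixel coordinates; the function name, the toy map, and the handling of off-image samples are illustrative assumptions, not the authors' implementation.

```python
from statistics import mean


def salience_at_gaze(saliency_map, gaze_xy):
    """Return the saliency value under each gaze position.

    saliency_map: list of rows (hypothetical layout, map[row][col]).
    gaze_xy: iterable of (x, y) saccade endpoints in pixel coordinates.
    Endpoints that fall outside the image are skipped (an assumption).
    """
    height = len(saliency_map)
    width = len(saliency_map[0])
    values = []
    for x, y in gaze_xy:
        col, row = int(round(x)), int(round(y))
        if 0 <= row < height and 0 <= col < width:
            values.append(saliency_map[row][col])
    return values


# Toy 3 x 4 saliency map with one salient location at (x=1, y=1).
toy_map = [
    [0.0, 0.1, 0.2, 0.0],
    [0.1, 0.9, 0.3, 0.0],
    [0.0, 0.2, 0.1, 0.0],
]
# Saccade endpoints in order; the last one lies off the image.
saccade_endpoints = [(1, 1), (2, 1), (3, 2), (10, 10)]

values = salience_at_gaze(toy_map, saccade_endpoints)
mean_salience = mean(values)  # time-averaged salience for this image
```

Keeping the per-saccade values before averaging makes it possible to plot them against saccade number as well as to summarize them as a single time-averaged value.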

Visual salience affects the gaze of participants with schizophrenia

To assess population averages, we plotted the salience values for each saccade averaged across all test images and all participants (Fig. S2B, top). Consistent with the single image results (Fig. S2A), the mean saliency values were consistently higher in SZs than in HCs.

For statistical analyses, a linear mixed model with random intercepts and slopes was used to test the main effect of the participant group (SZ vs. HC) and the interaction between the participant group and the saccade number. The log of the saccade number was used as a factor because the log-transformed data are better suited to a linear fit (Fig. S2B, bottom). Since the interaction term was not significant (F(1, 330.11) = 0.17 and P = 0.67), a model without interaction was selected. The main effect for the participant group was highly significant (F(1, 332.00) = 44.29 and P = 1.17 × 10^−10). These results suggest that the gaze of SZs is more likely to be directed toward visually salient locations than that of HCs.

When the age-matched resampled data for HC were compared with the data from SZ (Fig. 1C, leftmost), the results of the statistical analysis were not affected. The interaction term was not significant (F(1, 159.69) = 0.02 and P = 0.87), and a model without interaction was selected. The main effect for the participant group was highly significant (F(1, 162.00) = 25.49 and P = 1.18 × 10^−6). The resampling procedure was repeated 100 times, one of which was used throughout the following analyses. See also the supplementary text, “Consideration of resampling scheme” for more details.

The gaze of participants with schizophrenia is affected by orientation salience

To assess the contribution of low-level visual features, single-feature saliency maps for color (“Col”), luminance (“Lum”), and orientation (“Ori”) were used for analysis as in the full salience model (Fig. 1A and Fig. S1). In all three models, interaction terms were not statistically significant (F(1, 160.79) = 0.93 and P = 0.34 for color, F(1, 161.60) = 0.52 and P = 0.47 for luminance, and F(1, 160.14) = 1.79 and P = 0.18 for orientation, respectively). In the color (Fig. 1C, middle left) and the luminance (Fig. 1C, middle right) models, the main effects for the participant group were not significant (F(1, 162.01) = 0.16 and P = 0.69 for color and F(1, 161.97) = 0.12 and P = 0.72 for luminance, respectively). In contrast, the main effect of the participant group was highly significant (F(1, 161.99) = 44.80 and P = 3.39 × 10^−10) in the orientation model (Fig. 1C, rightmost). These results suggest that the main effect of the participant group in the full model (Fig. 1C, leftmost) is explained solely by orientation salience.

The specific effect of orientation salience is also evident in time-averaged data

Since the linear mixed model showed no significant interaction effect between the participant group and saccade numbers in all four salience models (Fig. 1C), it is reasonable to summarize the data by averaging all saccades during the viewing period (Fig. 1D). The mean saliency value from the full model was higher in SZ than in HC (Z = 4.82, P = 5.75 × 10^−6; in Wilcoxon rank-sum test with Bonferroni correction). The effect size evaluated by Cliff’s delta was 0.44. The mean salience value from the orientation model was higher in SZ than in HC (Z = 6.41, P = 5.64 × 10^−10; in Wilcoxon rank-sum test with Bonferroni correction). The effect size evaluated by Cliff’s delta was 0.58, which is classified as a large effect. On the other hand, the mean salience values from the color and the luminance models were not significantly different between SZ and HC (Z = 0.08, P > 1.0 for color; Z = 0.26, P > 1.0 for luminance; in Wilcoxon rank-sum test with Bonferroni correction). Based on these findings, the following analyses were performed on the time-averaged data.
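The group comparison used here combines a Wilcoxon rank-sum statistic, a Bonferroni correction, and Cliff’s delta as the effect size. The sketch below shows how these three quantities relate, using a normal approximation without tie correction for Z. All function names and the toy values are hypothetical, and leaving the corrected P value uncapped (so that it can exceed 1) is an assumption made for illustration rather than a description of the software actually used.

```python
import math


def midranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum_z(a, b):
    """Z statistic of the rank-sum test (normal approximation, no tie correction)."""
    ranks = midranks(list(a) + list(b))
    n1, n2 = len(a), len(b)
    w = sum(ranks[:n1])                      # rank sum of the first group
    mu = n1 * (n1 + n2 + 1) / 2              # expected rank sum under H0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (w - mu) / sigma


def two_sided_p(z):
    """Two-sided P value from a standard normal Z."""
    return math.erfc(abs(z) / math.sqrt(2))


def bonferroni(p, n_tests):
    """Bonferroni-corrected P; left uncapped here (assumption), so it may exceed 1."""
    return p * n_tests


def cliffs_delta(a, b):
    """P(a > b) - P(a < b) over all cross-group pairs; ranges from -1 to 1."""
    greater = sum(1 for x in a for y in b if x > y)
    less = sum(1 for x in a for y in b if x < y)
    return (greater - less) / (len(a) * len(b))


# Toy time-averaged saliency values per participant (not real data).
sz = [0.5, 0.6, 0.7, 0.8]
hc = [0.1, 0.2, 0.3, 0.65]

z = rank_sum_z(sz, hc)
p_corrected = bonferroni(two_sided_p(z), n_tests=4)  # e.g. four salience models
delta = cliffs_delta(sz, hc)
```

A positive delta in this sketch means that values in the first group tend to exceed those in the second, which matches the sign convention used for SZ versus HC in the text.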

All of these mean salience values were significantly higher than chance. See also the supplementary text, “Consideration of random sampling scheme” for more details. We also tested whether the saliency values for the first saccade were significantly different between SZ and HC (Fig. S5). See “Discussion” for details.

Image category does not affect the effect of orientation salience

To examine whether the effects of orientation salience depend on image category, the effect size (Cliff’s delta) that evaluates the difference between the time-averaged saliency values of HC and SZ for each test image was plotted for all four salience models (Fig. S3). Positive values in the effect size indicate that saliency values were higher in SZ than in HC. The results (Fig. S3) show that the effect sizes for the Full and the orientation models were consistently positive for all image categories except for the Face and Noise categories. This suggests that the effect of orientation salience is overall robust to image categories.

The L + M channel of the DKL color space is dominant in the effect of orientation salience

Since it is already known that the magnocellular pathway conveys relatively low spatial frequency signals21 and that the magnocellular pathway is specifically impaired in schizophrenia8, we directly examined the contribution of this pathway. In the Itti–Koch salience model (hereafter referred to as the “original model”), the orientation salience is calculated from grayscale images. Instead, it is possible to decompose the test images into images of three channels (L + M, L − M, and S-(L + M)), based on the Derrington–Krauskopf–Lennie (DKL) color space (Fig. 2A). In the DKL color space, the three channels (L + M, L − M, and S-(L + M)) separately convey the visual information for the magnocellular, parvocellular, and koniocellular pathways22. The decomposed images were then processed to obtain saliency maps with (orientation map) or without (intensity map) Gabor filtering (Fig. 2B). We refer to this as the “Extended six-channel model.”
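As a rough illustration of the channel decomposition, the sketch below forms the three cone-opponent combinations named above from per-pixel cone excitations. It assumes the image has already been converted to L, M, and S cone excitations and applies no background normalization or axis scaling, both of which a full DKL transform would require; the function name and the unit weights are hypothetical and are not taken from the paper's Methods.

```python
def dkl_like_channels(lms_pixels):
    """Split (L, M, S) cone excitations into three opponent channels.

    Returns three lists: L + M (achromatic), L - M, and S - (L + M).
    Unit weights are an illustrative assumption, not calibrated DKL axes.
    """
    lum, rg, by = [], [], []
    for l, m, s in lms_pixels:
        lum.append(l + m)        # L + M channel
        rg.append(l - m)         # L - M channel
        by.append(s - (l + m))   # S - (L + M) channel
    return lum, rg, by


# Two toy pixels given as (L, M, S) excitations.
pixels = [(0.5, 0.25, 0.125), (0.25, 0.25, 1.0)]
lum, rg, by = dkl_like_channels(pixels)
```

Each of the three resulting channel images would then be passed through the intensity and orientation stages separately, giving six maps in total.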

Figure 2

The L + M channel of the DKL color space is dominant in the effect of orientation salience. (A) The Derrington–Krauskopf–Lennie (DKL) color space. See “Results” and “Methods” for details. (B) To construct the extended six-channel model, the original image was decomposed into three channels in the DKL color space: the magnocellular L + M channel, the parvocellular L − M channel, and the koniocellular S-(L + M) channel. Then saliency maps for intensity and orientation were obtained for each of the three channels (six maps in total). (C,D) As in Fig. 1D, the salience values averaged across test images and saccades were plotted, but for six saliency maps. Symbols are the same as in Fig. 1D.

Then the salience values for these six maps were compared between SZ and HC (Fig. 2C,D). In the intensity maps (Fig. 2C), none of the differences was statistically significant (Z = 0.21, P > 1.0 for the L + M channel; Z = 1.09, P > 1.0 for the L − M channel; Z = −0.51, P > 1.0 for the S-(L + M) channel; in Wilcoxon rank-sum test with Bonferroni correction). In the orientation maps (Fig. 2D), all of the differences were statistically significant (Z = 6.25, P = 2.51 × 10^−9 for the L + M channel; Z = 4.20, P = 1.59 × 10^−4 for the L − M channel; Z = 3.84, P = 7.49 × 10^−4 for the S-(L + M) channel; in Wilcoxon rank-sum test with Bonferroni correction). These results further support the finding of a specific effect of orientation salience and suggest that the L + M channel of the DKL color space is dominant in the effect of orientation salience.

Saliency maps for orientation are correlated with gaze maps

In Fig. 1C,D, saliency values were evaluated separately for each feature/channel. It is important to note that saliency maps for the different features/channels are spatially correlated (Fig. S4A). This correlation may affect statistical analyses in Fig. 1C,D. To address this issue, we calculated the partial correlation between the saliency maps and the “gaze maps”, which quantify gaze distribution as a density map summed across participants for each test image (Fig. S4B). We also created a gaze map for the difference between SZ and HC (“SZ-HC”). The partial correlations of these maps were then calculated for each test image (n = 56).
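To make the role of the partial correlation concrete, the sketch below computes it with the standard recursive formula, which removes the linear contribution of each control map in turn. The maps are assumed to be flattened into equal-length lists of pixel values; the function names and the toy vectors are hypothetical, chosen so that a color map correlates with a gaze map only through its shared dependence on an orientation map.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, controls):
    """Correlation between x and y after controlling for every map in `controls`.

    Uses the recursion
    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)),
    applied once per control variable.
    """
    if not controls:
        return pearson(x, y)
    z, rest = controls[0], controls[1:]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy flattened maps (four pixels each, not real data).
ori_map = [3, 3, 1, 1]    # orientation saliency
gaze_map = [4, 2, 2, 0]   # gaze density: orientation plus an independent part
col_map = [4, 2, 0, 2]    # color saliency: orientation plus another independent part

raw = pearson(gaze_map, col_map)
controlled = partial_corr(gaze_map, col_map, [ori_map])
```

In this toy case the raw correlation between the gaze and color maps is 0.5, but it vanishes once the orientation map is controlled for, which is the kind of confound the partial correlation is meant to remove.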

For the original model, we plotted the partial correlations between the gaze maps and the saliency maps for the three features (color, luminance, and orientation) (Fig. S4C). For the three gaze maps (SZ, HC, and SZ-HC), the partial correlation with the saliency map for orientation was significantly larger than zero (P = 4.0 × 10^−8 for SZ, P = 1.7 × 10^−7 for HC, and P = 1.6 × 10^−7 for SZ-HC, respectively; Wilcoxon signed-rank test with Bonferroni correction). In contrast, the partial correlations with the saliency map for color and luminance were not significantly larger than zero (P > 0.05, Wilcoxon signed-rank test with Bonferroni correction) except for a negative correlation in luminance for SZ-HC (P = 2.7 × 10^−4).

For the extended six-channel model, three maps for orientation were shown (Fig. S4D). For two gaze maps (SZ, HC), the partial correlations with the saliency map for the L + M and the L − M channels were significantly larger than zero (P = 2.0 × 10^−5 and 0.002 for SZ, P = 6.0 × 10^−5 and 0.008 for HC; Wilcoxon signed-rank test with Bonferroni correction). On the other hand, for the SZ-HC gaze map, the partial correlation with the saliency map of the L + M channel was significantly larger than zero (P = 5.2 × 10^−4, Wilcoxon signed-rank test with Bonferroni correction). These results suggest that the L + M channel contributes specifically to the effect of orientation salience, even after accounting for spatial correlations between saliency maps.

The key computational stage that produces differences in orientation salience is Gabor filtering

Next, we examined which stage of the salience computation produces differences in orientation salience. In the Itti–Koch salience model, saliency maps for the orientation feature are generated through five processing stages (Fig. S1): (1) transforming the input image into a grayscale image with various spatial resolutions (i.e., Gaussian pyramids), (2) filtering the images with Gabor patches with four different orientations, (3) center-surround inhibition, (4) peak normalization, and finally, (5) adding all images together to obtain the saliency map of the orientation feature. These intermediate maps were calculated and used for analysis as was done in the analysis of Fig. 1D. Differences between participant groups were detected not only in the saliency map for the orientation feature (Fig. 3A, rightmost) but also in peak normalization, center-surround inhibition, and Gabor filter (Fig. 3A, middle). In contrast, no differences between participant groups were detected in intensity (Fig. 3A, leftmost). These differences became more evident when each comparison was evaluated by the effect size (Fig. 3B). These results suggest that the differences between participant groups emerge at the stage of Gabor filtering with four orientations, a unique computational stage that is not present in the color or luminance salience but only in the orientation salience.
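Stage (2), Gabor filtering at four orientations, can be sketched as follows. This is a minimal illustration, not the filter bank of the model: the wavelength, envelope width, and kernel size are hypothetical values, the envelope is isotropic, and the angle is defined here as the direction along which luminance is modulated (so 0° responds best to vertical stripes). Later stages such as center-surround inhibition and peak normalization are omitted.

```python
import math


def gabor_kernel(theta_deg, wavelength=4.0, sigma=2.0, half_size=6):
    """Even-symmetric Gabor kernel as a list of rows.

    theta_deg is the direction of the sinusoidal carrier (assumed convention).
    wavelength, sigma, and half_size are in pixels (illustrative values).
    """
    theta = math.radians(theta_deg)
    kernel = []
    for y in range(-half_size, half_size + 1):
        row = []
        for x in range(-half_size, half_size + 1):
            # Coordinate along the carrier direction.
            xr = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel


def response_at_center(patch, kernel):
    """Filter response at the patch center (elementwise product, then sum)."""
    return sum(p * k for prow, krow in zip(patch, kernel)
               for p, k in zip(prow, krow))


# Toy stimulus: a vertical grating (luminance varies along x only).
half = 6
grating = [[math.cos(2 * math.pi * x / 4.0) for x in range(-half, half + 1)]
           for _ in range(-half, half + 1)]

# Rectified response of each of the four orientation channels.
responses = {theta: abs(response_at_center(grating, gabor_kernel(theta)))
             for theta in (0, 45, 90, 135)}
best_orientation = max(responses, key=responses.get)
```

Because the intensity pathway skips this filtering step, a group difference that first appears in the Gabor outputs points to orientation-selective processing rather than to the stages shared with luminance salience.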

Figure 3

The key computational stage that produces differences in orientation salience is Gabor filtering. (A) The Itti–Koch saliency model computes the saliency map of the orientation feature through five stages (Fig. S1). The map values calculated for these intermediate images are plotted as in Fig. 1C. (B) Effect sizes (Cliff’s delta) for the tests in (A). Numbers indicate the spatial scales of the Gaussian pyramids. In the three plots in the center, the mean effect size for four orientations (bar), as well as the effect size for each of the four orientations (symbols), were plotted. “o,” 0°; “x,” 45°; triangle, 90°; and square, 135°. The horizontal lines in the rightmost plot (“Saliency map (Orientation)”) indicate the common guidelines for the effect size (Cliff’s delta). See “Methods” for details.

Orientation salience is correlated with the scores of cognitive tests and visual oculomotor characteristics

Finally, we examined whether orientation salience is correlated with demographic data, cognitive ability, and eye movement characteristics collected from the same participants (see7,23 for details). When the dependent variables (demographic data, cognitive tests, or eye movement characteristics) were fitted with orientation salience with the participant group as a covariate, age was not correlated with orientation salience, but WAIS-3 processing speed (PS), social functioning scale (SFS), and scanpath length (SPL) were correlated with orientation salience (Fig. 4A).

Figure 4

Orientation salience is correlated with the scores of cognitive tests and visual oculomotor characteristics. (A) Scatter plots for the mean orientation salience (averaged across images and saccades) and age, WAIS-3 processing speed (PS), social functioning scale (SFS), and scanpath length (SPL). Each dot represents the value for one participant. Magenta, healthy controls (HC); blue, participants with schizophrenia (SZ). Lines indicate regression lines for each participant group. (B) Absolute value of t-values from regression analysis 1, where the dependent variables were fitted individually with the saliency values of the original model. Gray bars indicate statistical significance (P < 0.05) after correction for multiple comparisons by FDR. (C) As in (B), but those from regression analysis 2, where the dependent variables were fitted individually with the saliency values of the extended six-channel model.

For a more systematic analysis, two regression models were constructed. In the first model, the dependent variables were fitted with saliency values from the original model, with the participant group as a covariate. The t-values obtained from the fitting were then plotted (Fig. 4B). Orientation salience is correlated with various scores of cognitive tests and eye movement characteristics (indicated by the gray bars in Fig. 4B). This is consistent with previous findings that scores of cognitive tests are correlated with visual oculomotor characteristics such as scanpath length23. Regression analysis also suggested that these correlations are specific to orientation.

In the second model, in which the dependent variables were fitted with saliency values from the extended six-channel model, with the participant group as a covariate, orientation salience in the L + M channel was significantly correlated with a cognitive test score (SFS) and eye movement characteristics (saccade amplitude and SPL) (Fig. 4C). In contrast, orientation salience in the other channels was not correlated with these values. These results again suggest that the L + M channel is dominant in the effect of orientation salience.

Discussion

In this study, we examined whether abnormalities in eye movements result from aberrant processing of visual salience. For this purpose, we analyzed the eye movement data during free-viewing with Itti–Koch’s salience model (Fig. 1A). We found that the saliency values at the gaze were reduced across saccades, which is consistent with previous findings24,25. We found that the saliency values at the gaze of SZs were persistently higher during the viewing period compared to the HCs (Fig. 1C). Further analysis using single-feature saliency maps revealed that this difference was due to orientation salience (Fig. 1C,D). We then confirmed that these results are robust for various image categories (Fig. S3). We also analyzed the gaze with an extended salience model that evaluates the channels for the DKL color space separately and found that the L + M channel has a dominant role in the effects of orientation salience (Fig. 2). We also evaluated a spatial correlation between gaze maps and saliency maps and found that orientation salience in the L + M channel correlates specifically with the difference between the gaze maps of the SZs and those of HCs (Fig. S4). In addition, we delved into the stages of salience computation and found that differences between SZs and HCs emerged at an early stage of salience computation, where grayscale images were filtered with Gabor patches with four orientations (Fig. 3). Finally, the saliency values at gazes were not correlated with symptom-related measures such as PANSS scores but were correlated with various measures of cognitive functions and saccade-related characteristics (Fig. 4). These results suggest that the difference between participants with schizophrenia and healthy controls emerges at an early stage of the computation, suggesting functional decline in early visual processing. Our findings also suggest that visual salience is affected in schizophrenia, thereby expanding the concept of the aberrant salience hypothesis of psychosis to the visual domain.

Relationship with eye movement abnormalities: visual-oculomotor properties

As described in the “Introduction”, visual-oculomotor properties such as scanpath length, saccade number, and saccade amplitudes are affected in the gaze of SZs during free-viewing (see also Table S1). We found that these visual-oculomotor properties are correlated with saliency values (Fig. 4). Since the bottom-up salience of the images is related to target selection, the abnormalities in visual exploration that have been reported in various studies8 arise, at least in part, from the affected salience-guided eye movement found in this study.

Relationship with eye movement abnormalities: inhibition-of-return

Recently, our research group reported that inhibition-of-return is impaired in SZs26. This might raise the question of whether return saccades toward salient stimuli are more frequent in schizophrenia (Fig. S5A). If this is the case, salience values at the gaze could be higher in SZ than in HC. However, this is unlikely because salience values at the first gaze of the viewing period, to which inhibition-of-return cannot contribute, were significantly higher in SZ than in HC in the orientation model (Fig. S5C). Furthermore, if the SZs made more return saccades toward salient positions than the HCs, the slope of the time course of saliency values would be shallower in the SZs than in the HCs (Fig. S5B, left). However, this is not the case because analyses using linear mixed models showed no interaction between the participant group and saccade numbers. Thus, abnormalities in visual salience found in this study have a different origin from abnormal inhibition-of-return. For further analysis and discussion, see also Okada et al.26.

Contribution of low-level visual features, early visual cortex, and the magnocellular pathway

We found that salience computation for the orientation feature is specifically affected in schizophrenia. Consistent with this finding, previous findings have also shown that orientation processing is affected in schizophrenia27,28. Other papers have shown that the early visual cortex is involved in changes in contextual modulation in schizophrenia29,30,31. The effect of orientation salience can explain impairment in contour integration in schizophrenia28 because contour integration can be distracted by aberrant processing of orientation salience.

Based on the analysis of Fig. 3 and the accompanying text, we argue that the crucial difference between SZs and HCs stems from their response to Gabor-filtered images. Since the spatial frequencies of the Gabor filters were less than 5 cycles per degree (see also Fig. S1), they resemble responses in the early visual cortex. On the other hand, there was no difference in gazes toward luminance salience. Computation of luminance salience involves center-surround inhibition without Gabor filtering. Such processing resembles the response of the lateral geniculate nucleus (LGN). We propose that aberrant orientation salience reflects abnormalities at the level of the primary visual cortex but not at the level of the LGN. Post-mortem morphological studies of brains from patients with schizophrenia support this possibility. The volume and number of neurons in the primary visual cortex are reduced in schizophrenia32,33, but this is not the case for the LGN34,35. Thus, the idea that visual abnormalities in schizophrenia occur between the LGN and primary visual cortex is consistent with both previous studies and the present study.

The analyses in Figs. 2 and 4 suggest that orientation salience in the L + M channel of the DKL color space is specifically affected in schizophrenia. Since the L + M channel carries achromatic information in the magnocellular pathway22, the present results are consistent with previous findings that visual information processing in the magnocellular pathway is specifically affected in schizophrenia36,37,38.

Relationship with clinical/cognitive tests

Figure 4 shows that symptom-related measures such as the PANSS total score, CPZ equivalent, and duration of illness were not significantly correlated with orientation salience. These results suggest that orientation salience is a trait marker rather than a state marker. This is consistent with the view that eye movement-related measures are trait markers of schizophrenia39.

Figure 4 also shows that scores on some of the cognitive tests are correlated with orientation salience. This is consistent with previous findings that scores of cognitive tests are correlated with visual oculomotor characteristics such as scanpath length23. It is unlikely that the abnormality in scanpath length causes abnormalities in orientation salience, as oculomotor properties of eye movements are less likely to be affected in schizophrenia than visual or visuo-cognitive processing. Rather, we propose that aberrant processing of orientation salience causes abnormalities in visual exploration, which can be assessed by visual oculomotor properties such as scanpath length, and that these abnormalities in turn cause difficulty in cognitive abilities such as social functioning.

Relevance to the aberrant salience hypothesis of psychosis

The aberrant salience hypothesis of psychosis proposes that an aberrant assignment of salience to the elements of one’s experience leads to delusion and hallucination15. The concept of salience in this hypothesis covers not only salience derived from emotion and motivation (motivational/incentive salience)40, but also salience due to novelty and sensory features (perceptual salience)41. The present study provides direct evidence that visual salience is affected in schizophrenia, thereby explicitly extending the concept of aberrant salience to the visual domain. In support of our findings, a recent study revealed that brain responses to images with various forms of salience such as novelty, negative emotion, targetness, and rarity/deviance are affected in schizophrenia42.

Future projects

In this study, we obtained clues about what is affected in the brain in schizophrenia, for example, Gabor filtering that is presumably performed in the magnocellular pathway of the early visual cortex. To understand exactly how the brain performs such computation, it is necessary to understand the processes at the neuronal level. To this end, neurophysiological studies using animal models of schizophrenia are needed. Since eye-tracking during free-viewing and analysis of visual salience is an experimental paradigm that has been successfully used in non-human primates such as macaque monkeys17 and marmosets19, replicating the present results in a non-human primate model of schizophrenia43,44 would open the door to understanding the precise brain mechanism of schizophrenia.

Methods

Participants

Eye movement data were sampled from 82 SZs (male, 42; female, 40) and 252 HCs (male, 144; female, 108) as part of a large-scale cohort recruited at Osaka University (Table S1)7,11,23. There was an overlap in data with a previous study on eye movement abnormalities in schizophrenia7. All participants were biologically unrelated, were of Japanese descent, and had no history of ophthalmologic disease or of neurological/medical conditions that could influence the central nervous system. Specific exclusion criteria included atypical headaches, head trauma with loss of consciousness, chronic lung disease, kidney disease, chronic hepatic disease, thyroid disease, active cancer, cerebrovascular disease, epilepsy, seizures, substance-related disorders, or mental retardation7,11.

SZs were recruited from Osaka University Hospital and had been diagnosed by two or more trained psychiatrists according to criteria from the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) based on the Structured Clinical Interview for DSM-IV (SCID). Estimated cognitive decline was calculated by the methods described by Fujino45. The current symptoms of the SZs were assessed using the Positive and Negative Syndrome Scale (PANSS)46, and daily antipsychotic use was calculated using chlorpromazine (CPZ) equivalents (mg/day)47.

HCs were assessed for psychiatric, medical, and neurological concerns using a non-patient version of the SCID to exclude individuals with current or past contact with psychiatric services or who had received psychiatric medication.

Informed consent was obtained from all subjects after a full explanation of the study procedures. Anonymity was preserved for all participants. The study was performed in accordance with the World Medical Association’s Declaration of Helsinki and was approved by the Research Ethical Committee of Osaka University, the National Center of Neurology and Psychiatry, and the Center for Experimental Research in Social Sciences, Hokkaido University.

Task and stimuli

In the free-viewing task, the participants faced a 19-inch liquid crystal display monitor (1280 × 1024 pixels) placed 70 cm from the observer’s eyes. The visual stimuli were presented using the Psychophysics Toolbox extension48 in MATLAB (The Mathworks, Natick, MA, USA). Each trial began with the presentation of a fixation point at the center of the display. Once the participant had fixated on the fixation point for a random duration, a test image was presented for 8 s. The participant was instructed to view the image freely. One task consisted of 56 images, the order of which was randomly shuffled for each participant. The images were chosen from eight categories: natural environments, buildings, everyday items, foods, faces, animals, fractal patterns, and noise (seven images for each). The images of natural environments and animals were selected from the International Affective Picture System (IAPS)49, and the face images from Matsumoto and Ekman50. Since IAPS images are not allowed to be published in scientific journals, only the saliency maps for these images (#8 and #38) are shown in the figures. In cases when images needed to be shown as examples (Figs. 1A, 2B, and Fig. S1), a photograph taken by one of the authors was used.

Recording and preprocessing of eye movement data

Recording and preprocessing of eye movement data were performed as described previously7. Eye position and pupil area of the left eye were measured at 1 kHz using EyeLink1000 (SR Research, Ontario, Canada). Eye position data (in degrees) were smoothed with a digital finite impulse response (FIR) filter (−3 dB at 30 Hz), and eye velocity and acceleration traces were derived using a two-point difference algorithm. Eye movement recordings were segmented into blink, saccade, and fixation periods. Detected saccades included both regular saccades and microsaccades. Here, following previous studies on microsaccades during free-viewing51, saccades with amplitudes greater than one degree were selected as regular saccades.
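As a rough illustration of the two-point difference step and the amplitude criterion, the sketch below derives a velocity trace from a position trace sampled at 1 kHz and keeps saccades larger than one degree. The central-difference form, the function names, and the NaN padding at the trace edges are assumptions made for illustration; the exact variant used in the preprocessing pipeline is not specified here.

```python
def two_point_velocity(position_deg, sampling_rate_hz=1000.0):
    """Velocity (deg/s) from a position trace (deg) by a central two-point difference."""
    n = len(position_deg)
    dt = 1.0 / sampling_rate_hz
    # Assumption: the first and last samples have no two-sided neighbours, so they stay NaN.
    velocity = [float("nan")] * n
    for i in range(1, n - 1):
        velocity[i] = (position_deg[i + 1] - position_deg[i - 1]) / (2.0 * dt)
    return velocity


def regular_saccades(amplitudes_deg, threshold_deg=1.0):
    """Keep saccades whose amplitude exceeds the threshold (one degree in the text)."""
    return [a for a in amplitudes_deg if a > threshold_deg]
```

Applying `two_point_velocity` to the velocity trace itself would give an acceleration trace under the same assumption.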

We examined the main sequence relationship of the saccades of individual subjects by fitting the function V = a × {1 − exp(−b × A)} + c to the amplitude (A) and peak eye velocity (V) of the saccades obtained from all trials, where a, b and c were optimized (Fig. 4B,C and Table S1).
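The fitting procedure is not detailed in the text, so the following is only a minimal sketch of one way to fit the main sequence function. It exploits the fact that, for a fixed b, the model is linear in a and c: a grid of candidate b values is scanned, and a and c are solved in closed form by ordinary least squares for each. The function name `fit_main_sequence` and the grid range are hypothetical choices.

```python
import math


def fit_main_sequence(amplitudes, peak_velocities, b_grid=None):
    """Fit V = a * (1 - exp(-b * A)) + c by scanning b and solving a, c linearly."""
    if b_grid is None:
        # Assumption: b lies between 0.001 and 2.0 per degree.
        b_grid = [k * 0.001 for k in range(1, 2001)]
    n = len(amplitudes)
    mean_v = sum(peak_velocities) / n
    best = None
    for b in b_grid:
        x = [1.0 - math.exp(-b * amp) for amp in amplitudes]
        mean_x = sum(x) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        if sxx == 0.0:
            continue  # all amplitudes identical: slope is undefined
        sxv = sum((xi - mean_x) * (vi - mean_v) for xi, vi in zip(x, peak_velocities))
        a = sxv / sxx
        c = mean_v - a * mean_x
        sse = sum((vi - (a * xi + c)) ** 2 for xi, vi in zip(x, peak_velocities))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    if best is None:
        raise ValueError("need at least two distinct amplitudes")
    _, a, b, c = best
    return a, b, c
```

A general-purpose nonlinear optimizer would serve equally well; the grid scan is used here only because it needs no external libraries.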

Computational models and saliency analysis

To assess salience-guided eye movements, we used a validated computational model of visual attention and compared its predictions with individual eye movements. The saliency maps for the test images were computed with the Itti–Koch saliency model for static images16 implemented in the Graph-Based Visual Saliency (GBVS) toolbox for MATLAB20. The Itti–Koch model is a neurobiologically inspired model that computes salient locations for low-level visual features (Fig. 1A and Fig. S1). We chose the Itti–Koch model because we are interested in the neurobiological mechanisms of saliency-guided eye movements.

In the full model, saliency maps of the three features were summed with equal weights. When evaluating the contribution of low-level visual features, single-feature saliency maps were used (Fig. 1A). For a test image of 640 × 512 pixels, we obtained a saliency map of 80 × 64 pixels. To treat the saliency maps as density maps, all maps were normalized so that the sum of saliency values of all pixels was one.
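The normalization and the lookup of a salience value at a saccade endpoint can be sketched as follows. This is an illustration only: the function names are hypothetical, and it assumes that gaze coordinates are already expressed in pixels of the 640 × 512 test image, so that each saliency-map cell covers an 8 × 8 pixel block.

```python
def normalize_to_density(saliency_map):
    """Scale a 2D map (list of rows) so that all values sum to one."""
    total = sum(sum(row) for row in saliency_map)
    if total <= 0:
        raise ValueError("saliency map must have a positive sum")
    return [[value / total for value in row] for row in saliency_map]


def salience_at(density_map, x_px, y_px, image_size=(640, 512)):
    """Salience at an image location; NaN when the location is outside the image."""
    rows, cols = len(density_map), len(density_map[0])
    width, height = image_size
    if not (0 <= x_px < width and 0 <= y_px < height):
        return float("nan")
    col = int(x_px * cols / width)   # 640 px -> 80 columns
    row = int(y_px * rows / height)  # 512 px -> 64 rows
    return density_map[row][col]
```

Because every map sums to one, values from different images and feature channels are on a common scale, and a uniform map would give 1/5120 at every location.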

The details of the computational stages in the Itti–Koch saliency model are shown in Fig. S1. The original image is decomposed into the luminance channel, the red-green opponent channel (‘R-G’ in Fig. S1), and the blue-yellow opponent channel (‘B-Y’ in Fig. S1). The two color channels were calculated as the L − M and S-(L + M) channels of the DKL color space, respectively. Then Gaussian pyramids at five scales were obtained (‘2–6’ in Fig. S1). For orientation salience, the luminance images were processed with Gabor filters with four different orientations (0°, 45°, 90°, and 135°; Fig. S1). The formula of the 2D Gabor filter with 0-degree orientation is as follows.

$${\text{G}}({\text{x}},{\text{y}}) = {\text{cos}}(2*{\text{pi/freq}})*{\text{exp}}( - {\text{stdx}}^{2} *{\text{x}} - {\text{stdy}}^{2} *{\text{y}}).$$

The size of the filter is 27 × 27 pixels. The spatial frequency (‘freq’ in the formula) was 3.14 pixels. The standard deviations along the x and y axes (‘stdx’ and ‘stdy’) were 2 and 4 pixels, respectively. The filter was convolved with the Gaussian pyramids at five scales to obtain five filtered images (‘Gabor filter’ at 0° in Fig. S1). Similarly, the Gabor filters with the other three orientations (45°, 90°, and 135°) were applied to obtain the filtered images. This is equivalent to processing the images with Gabor filters at five spatial frequencies (3.3, 1.6, 0.82, 0.41, and 0.20 cycles per degree). Then center-surround inhibition and peak normalization were applied to these images. These intermediate files for salience computation, which were also generated by the GBVS toolbox, were used for the analysis in Fig. 3 and the accompanying text.
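For illustration, the sketch below builds a 27 × 27 kernel for each of the four orientations using the stated parameters. It assumes the conventional Gabor form, a cosine carrier along the rotated x axis multiplied by a Gaussian envelope, and treats the 3.14-pixel value as the carrier period; the exact expression evaluated by the GBVS toolbox may differ. The helper `pyramid_frequencies` further assumes that each pyramid level halves the image resolution, which is consistent with the approximately halving spatial frequencies listed above. All function names are hypothetical.

```python
import math


def gabor_kernel(theta_deg, size=27, period_px=3.14, std_x=2.0, std_y=4.0):
    """Cosine-phase Gabor kernel as a size x size list of rows.

    Assumed form: cos(2*pi*x'/period) * exp(-(x'^2/(2*std_x^2) + y'^2/(2*std_y^2))),
    where (x', y') are pixel coordinates rotated by theta.
    """
    half = size // 2
    theta = math.radians(theta_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * cos_t + y * sin_t
            yr = -x * sin_t + y * cos_t
            carrier = math.cos(2.0 * math.pi * xr / period_px)
            envelope = math.exp(-(xr ** 2 / (2.0 * std_x ** 2)
                                  + yr ** 2 / (2.0 * std_y ** 2)))
            row.append(carrier * envelope)
        kernel.append(row)
    return kernel


def pyramid_frequencies(base_cpd, n_levels=5):
    """Effective spatial frequency per pyramid level, halving at each level."""
    return [base_cpd / (2 ** level) for level in range(n_levels)]


kernels = {theta: gabor_kernel(theta) for theta in (0, 45, 90, 135)}
freqs = pyramid_frequencies(3.3)  # starts at 3.3 cycles per degree and halves per level
```

Applying one fixed kernel to progressively downsampled images is what makes a single filter act like a bank of filters at several spatial frequencies.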

Extended six-channel model for visual salience

An “extended six-channel model” for visual salience was constructed for the analysis in Figs. 2 and 4 and the accompanying text. For this purpose, the test images were decomposed into three-channel images (L + M, L − M, and S-(L + M)) based on the DKL color space22. The decomposed images were then subjected to salience computation separately with (orientation map) or without (intensity map) Gabor filtering (Fig. 2B). The luminance salience in the original model corresponds approximately to the intensity map of the L + M channel. The color salience in the original model corresponds approximately to the intensity map of the L − M channel plus the S-(L + M) channel. The orientation salience in the original model corresponds approximately to the orientation map of the L + M channel. The other two maps (the orientation map of the L − M channel and the orientation map of the S-(L + M) channel) are new components not found in the original model.

Age-matched resampling

To account for the effect of the age difference between the healthy control group (HC, 28.8 ± 11.5 years, mean ± SD) and the schizophrenia group (SZ, 35.1 ± 12.4 years, mean ± SD), the HC data were resampled so that the age distribution of the HC participants matched that of the SZ group. For this purpose, the ages of both groups were grouped into 5-year bins. The HC participants in each bin were then randomly selected to match the number of SZ participants in that bin. This procedure was repeated 100 times to obtain 100 sets of resampled HC participants. The mean age of the resampled HC participants ranged from 34.9 to 35.2 years, in close agreement with that of the SZ group (35.1 years). The standard deviation of the age of the resampled HC participants ranged from 12.0 to 12.5 years, in close agreement with that of the SZ group (12.4 years). We confirmed that the results of statistical analysis were not affected by the choice of the resampled dataset (see supplementary text, “Consideration of resampling scheme”). Therefore, we selected one of the resampled datasets and used it throughout our analysis.
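The binned resampling can be sketched as below. The function names are hypothetical, and one detail is an assumption not stated in the text: if a bin contains fewer HC than SZ participants, all available HC participants in that bin are taken.

```python
import random
from collections import defaultdict


def age_bin(age, width=5):
    """Index of the 5-year bin that an age falls into (e.g. 20-24 -> 4)."""
    return int(age // width)


def resample_controls(hc_ages, sz_ages, rng, width=5):
    """Indices of HC participants drawn to match the SZ count in each age bin."""
    hc_by_bin = defaultdict(list)
    for idx, age in enumerate(hc_ages):
        hc_by_bin[age_bin(age, width)].append(idx)
    sz_counts = defaultdict(int)
    for age in sz_ages:
        sz_counts[age_bin(age, width)] += 1
    selected = []
    for bin_index, n_needed in sz_counts.items():
        pool = hc_by_bin.get(bin_index, [])
        # Assumption: take every HC in the bin when there are too few to match.
        selected.extend(rng.sample(pool, min(n_needed, len(pool))))
    return sorted(selected)


def repeated_resampling(hc_ages, sz_ages, n_sets=100):
    """Draw n_sets independent resampled HC sets, one seed per set."""
    return [resample_controls(hc_ages, sz_ages, random.Random(seed))
            for seed in range(n_sets)]
```

Seeding each repetition separately keeps every resampled set reproducible, which is convenient when one set is later chosen for the remaining analyses.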

Linear mixed models

The procedure for the statistical analysis of the time course of the mean salience values at saccade endpoints (Fig. 1 and Fig. S2) is as follows. First, a cut point for the number of saccades was set because the number of saccades during the 8-s viewing time varied between trials and between the participant groups (see Table S1 for the mean numbers of saccades; 16.1 in SZ and 20.9 in HC). The cut point was set to 16 because, beyond this saccade number, the mean number of images contributing data (out of 56) fell below half (Fig. S2B, top).

We then fit linear mixed models using the R package lme452 to obtain estimates and statistics. For this purpose, the salience values for each saccade number and each participant were averaged over the 56 test images. Saccades that landed off-screen were treated as missing values (not a number, NaN). The mean salience values (“Sal”) were then treated as the dependent variable and modeled as a linear function of participant group (HC and SZ, as “SubjectGroup”) and saccade number (1st, 2nd, …, 16th, as “SaccadeNum”). The random effects were modeled using random intercepts and random slopes for individual differences (ranging from 1 to 334, as “SubjectID”) nested under the participant group. The model formulae in R format are as follows:

  (1) Sal ~ SubjectGroup + SaccadeNum + (SaccadeNum | SubjectID)

  (2) Sal ~ SubjectGroup * SaccadeNum + (SaccadeNum | SubjectID)

Model #1 is a model with the main effects of participant group and saccade number. Model #2 is a model with an interaction between the participant group and saccade number. The two-sided probability values and degrees of freedom associated with each statistic were then determined using the Satterthwaite approximation implemented in the R package lmerTest53.
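Before the models are fitted, the per-trial salience values have to be reduced to one value per participant and saccade number. A minimal sketch of that preparation step follows. The input layout and function names are hypothetical, and excluding NaN entries from the average is an assumption about how off-screen saccades were handled.

```python
import math


def mean_salience_by_saccade(trials, max_saccades=16):
    """Average salience per saccade number over one participant's images.

    trials: one list per image, holding salience values ordered by saccade number.
    Returns a list of length max_saccades (NaN where no valid value exists).
    """
    means = []
    for k in range(max_saccades):
        values = [trial[k] for trial in trials
                  if len(trial) > k and not math.isnan(trial[k])]
        means.append(sum(values) / len(values) if values else float("nan"))
    return means


def long_format_rows(subject_id, subject_group, trials, max_saccades=16):
    """Rows of (SubjectID, SubjectGroup, SaccadeNum, Sal) for a mixed-model fit."""
    means = mean_salience_by_saccade(trials, max_saccades)
    return [(subject_id, subject_group, k + 1, sal)
            for k, sal in enumerate(means) if not math.isnan(sal)]
```

Stacking the rows from all participants gives a long-format table whose column names match the variables in the formulae above.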

General linear models

To examine whether the salience values are correlated with other measures such as demographic data, cognitive tests, and eye movement properties, general linear models were constructed (Fig. 4). The model formulae in R format are as follows:

  (1) Dependent variables ~ SubjectGroup + Sal(Col) + Sal(Lum) + Sal(Ori)

  (2) Dependent variables ~ SubjectGroup + Sal(Int, L + M) + Sal(Int, L − M) + Sal(Int, S-(L + M)) + Sal(Ori, L + M) + Sal(Ori, L − M) + Sal(Ori, S-(L + M))

In Model #1, the dependent variables (demographic data, cognitive tests, and saccade-related properties) were fitted with the participant group (“SubjectGroup”) and the salience values calculated by the original saliency model consisting of three features. In Model #2, the dependent variables were fitted with the participant group (“SubjectGroup”) and the salience values calculated by the extended six-channel model. P values were adjusted for multiple comparisons by controlling the false discovery rate (FDR).
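The text does not name the FDR procedure. Assuming the widely used Benjamini–Hochberg step-up method, the adjustment can be sketched as follows; the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted values never exceed one
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

An adjusted value below 0.05 then corresponds to a discovery at a false discovery rate of 5%.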

Quantification and statistical analysis

Statistical analyses were performed using MATLAB (The Mathworks, Natick, MA, USA), except for the linear mixed models, which were fitted using the R packages described above. The significance level was set at P < 0.05. Non-parametric Wilcoxon rank-sum tests with Bonferroni correction were performed in a post-hoc analysis to compare SZ and HC for the median salience values or other indices (Figs. 1D, 2C,D, Fig. S5C). Cliff’s delta (Δ) was used to quantify the effect size of the estimated difference. For interpretation, we followed general guidelines54: negligible for |Δ| < 0.147, small for 0.147 ≤ |Δ| < 0.33, medium for 0.33 ≤ |Δ| < 0.474, and large for |Δ| ≥ 0.474.
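Cliff’s delta is the proportion of cross-group pairs in which the first group’s value is larger, minus the proportion in which it is smaller. The sketch below computes it by direct pairwise comparison and applies the interpretation thresholds given above; the function names are hypothetical.

```python
def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all cross-group pairs."""
    if not xs or not ys:
        raise ValueError("both groups must be non-empty")
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


def interpret_delta(delta):
    """Label the effect size using the thresholds stated in the text."""
    magnitude = abs(delta)
    if magnitude < 0.147:
        return "negligible"
    if magnitude < 0.33:
        return "small"
    if magnitude < 0.474:
        return "medium"
    return "large"
```

The pairwise loop costs one comparison per cross-group pair, which is trivial for group sizes of 82 and 252.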