Prevalence and assessment of self-disorders in the schizophrenia spectrum: a systematic review and meta-analysis

Self-disorders have been proposed as the “clinical core” of the schizophrenia spectrum. This has been explored in recent studies using self-disorder assessment tools. However, there are few systematic discussions of their quality and utility. Therefore, a literature search was performed on Medline, Embase, PsycINFO, PubMed and the Web of Science. Studies using these assessment tools to explore self-disorders within schizophrenia spectrum disorders (SSDs) were included. A meta-analysis was performed on the outcomes of total self-disorder score and odds ratios of self-disorders, using Comprehensive Meta-Analysis software. Weighted pooled effect sizes in Hedges' g were calculated using a random-effects model. Fifteen studies were included, giving a sample of 810 participants on the schizophrenia spectrum. Self-disorders showed a greater aggregation within schizophrenia spectrum groups compared to non-schizophrenia spectrum groups, as measured with the Bonn Scale for the Assessment of Basic Symptoms (Hedges' g = 0.774, p < 0.01) and Examination of Anomalous Self-Experiences (Hedges' g = 1.604, p < 0.01). Also, self-disorders had a greater likelihood of occurring within SSDs (odds ratio = 5.435, p < 0.01). These findings help to validate self-disorders as a core clinical feature of the broad schizophrenia spectrum.


Results
Study selection. Following
Study characteristics. Table 1 shows the characteristics of all included studies. Five studies were performed in inpatient units, seven in outpatient units, and three in combined inpatient and outpatient units. The geographical setting varied, although Denmark was the most common setting (eight studies). The other studies were set in Norway (two studies), Australia (Melbourne; three studies), Portugal (one study), and Italy (one study). Of the included studies, six used the BSABS and nine used the EASE for assessment of SDs. Studies using the EASE varied in how the SD score outcome was measured: six reported dichotomous scores, two reported continuous scores, and one reported both.
Regarding the target population, studies varied in terms of which participants with SSDs were recruited. Eight out of the 15 studies recruited participants with SSDs exclusively, one study used participants with schizotypal personality disorder (SPD), and one study recruited participants with non-affective psychosis (NAP), which included schizophrenia. Five studies recruited participants with an SSD or SPD and one study recruited participants with either SPD or NAP, which included schizophrenia. Only one study recruited participants based upon symptoms rather than diagnosis, recruiting participants with first rank symptoms (FRS) instead.
When the samples of all included studies were combined, a population of 810 participants on the schizophrenia spectrum was included. This consisted of 56 participants with an unspecified SSD, 150 participants with schizophrenia, and 262 participants with an NAP that included schizophrenia. It also contained 262 participants with SPD, 50 CHR participants (with SPD), and 30 participants with FRS.
There was significant variation in the comparison groups for each included study. A mixed composition of OMI was the most common comparison group (five studies), followed by HC (three studies). Other comparison groups included participants with no FRS (one study), autism spectrum disorder (ASD) (one study), no SSD (two studies), non-schizophrenic NAP (one study), bipolar disorder (BD) (one study) and obsessive-compulsive disorder (OCD) (one study).
A comparison population of 781 participants without an SSD was included. This comprised 302 participants with a variety of OMIs, 195 HCs, and 86 participants with no SSD. Smaller numbers of other comparison groups were also included: no FRS (68), BD (67), OCD (28), ASD (22), and non-schizophrenic NAP (13).
Whilst all studies reported either total SD score or the odds ratio of SDs as a primary outcome, secondary outcomes varied between included studies. Most studies reported clinical secondary outcomes, notably the OPCRIT (six studies), PANSS (seven studies) and GAF (six studies). Neurocognitive outcomes (two studies), aberrant salience outcomes (two studies) and EEG neurophysiology outcomes (one study) were other notable secondary outcomes reported in included studies.
Risk of bias and quality of evidence assessment. The risk of bias and quality of evidence rating for included studies can be found in Table 1, with a detailed breakdown of each rating in Supplementary Table S2. Concerning the quality of evidence, most studies (11) achieved a moderate quality of evidence score. Four of the 15 studies were determined to have a low quality of evidence. None of the included studies were determined to have a high quality of evidence.
Regarding risk of bias, three of the 15 included studies were determined to have a low risk of bias. Two studies were judged to have a low to moderate risk of bias. Nearly half of the studies (seven) were determined to have a moderate risk of bias. Although three studies were judged to have a moderate to high risk of bias, no studies clearly had a high risk of bias.
In the FEP group: SZ spectrum diagnoses included SZ (8) and schizophreniform (9). Non-SZ spectrum diagnoses included mood disorders + psychosis (9) and non-specified psychoses (13). In the UHR group: SZ spectrum diagnoses included paranoid personality (2) and SPD (2). Non-SZ spectrum diagnoses included anxiety disorder (9).
Differences in mean self-disorder score between SSD and control groups in studies using the BSABS. Panel (a) of Fig. 2 portrays the standardised mean effect sizes and 95% confidence intervals for studies using the BSABS. Three of the four studies (the exception being study 36) showed statistically significant effect sizes suggestive of greater SD aggregation in SSD groups compared to control groups. The pooled effect size for BSABS studies was Hedges' g = 0.774, 95% CI 0.529-1.019. The test of the overall effect gave Z = 6.191, and the pooled effect size was statistically significant (p < 0.01). Heterogeneity was moderate (I² = 49%).
The likelihood of expressing self-disorders in SSD versus control groups in studies using the BSABS. Panel (b) of Fig. 2 displays the odds ratios (OR) and 95% confidence intervals (CI) for studies using the BSABS. Four of the five studies (the exception being study 36) showed a significantly greater likelihood of SDs in SSD groups compared to controls. The pooled effect size for BSABS studies was OR = 5.435, 95% CI 2.499-11.823. Heterogeneity was judged to be high (I² = 66%). A sensitivity analysis was performed given the high heterogeneity and large variance (> two standard deviations) in three of the studies: 20,]4,34,3535 . Panels (a) to (d) of Supplementary Fig. S1 describe the odds ratio effect sizes when each potential outlier is removed. Panel (e) of Supplementary Fig. S1 describes the odds ratio effect sizes when all potential outliers are removed. A detailed description of the results from the sensitivity analysis can be found in Supplementary Material 1.
Differences in mean self-disorder score between SSD and control groups in studies using the EASE with dichotomous scores. Panel (a) of Fig. 3 presents the standardised mean effect sizes and 95% confidence intervals for studies using the EASE with dichotomous scores. The effect sizes for all seven studies showed greater SD scores within SSD groups when compared to control groups. The pooled effect size for EASE studies using dichotomous scores was Hedges' g = 1.604, 95% CI 1.176-2.032. The test of the overall effect gave Z = 7.343, and the pooled effect size was statistically significant (p < 0.01). Despite a statistically significant pooled effect size, heterogeneity bordered on very high (I² = 76%). It must be noted that in Nordgaard et al.'s 2020 study on FRS 33 , some patients with schizophrenia but no FRS were included as controls; as such, it would be inaccurate to assume that there were no patients with schizophrenia in the control group. To deal with this issue, we performed an additional sensitivity analysis removing this study (also see "Methods" section for three-level analysis for dependent effect sizes). The pooled effect size for EASE studies using dichotomous scores after removing the FRS study was Hedges' g = 1.707, 95% CI 1.266-2.148. The test of the overall effect gave Z = 7.591. The pooled effect size was statistically significant (p < 0.01) and heterogeneity still bordered on very high (I² = 72%).
Fig. 3 presents the standardised mean effect size and 95% confidence intervals for studies using the EASE with continuous scores. All three studies demonstrated effect sizes suggestive of greater SD scores in SSD groups compared to control groups. The pooled effect size for these studies was Hedges' g = 2.584, 95% CI 1.476-3.693. The test of the overall effect gave Z = 4.568, and the pooled effect size was statistically significant (p < 0.01). However, one study 16 had an effect size which was not significant at the 1% level, although it was significant at the 5% level. Heterogeneity was very high (I² = 92%).
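As a consistency check on the pooled estimates reported above, the Z value of each pooled Hedges' g can be approximately recovered from its 95% confidence interval. The following Python sketch assumes symmetric normal-theory intervals, so that the standard error equals the interval width divided by 2 × 1.96. The dictionary labels and the function name `z_from_ci` are illustrative and are not taken from the original analysis.

```python
# Pooled Hedges' g, 95% CI bounds, and Z value, copied from the Results text.
reported = {
    "BSABS": (0.774, 0.529, 1.019, 6.191),
    "EASE dichotomous": (1.604, 1.176, 2.032, 7.343),
    "EASE dichotomous, FRS study removed": (1.707, 1.266, 2.148, 7.591),
    "EASE continuous": (2.584, 1.476, 3.693, 4.568),
}


def z_from_ci(estimate, lower, upper, crit=1.959964):
    """Recover the standard error from a symmetric 95% CI and return estimate / SE."""
    standard_error = (upper - lower) / (2 * crit)
    return estimate / standard_error


# Assumption: each interval is a normal-theory (Wald) interval around g.
recomputed = {
    name: z_from_ci(g, lower, upper)
    for name, (g, lower, upper, _) in reported.items()
}

for name, z in recomputed.items():
    print(f"{name}: recomputed Z = {z:.2f}, reported Z = {reported[name][3]}")
```

Under this assumption, each recomputed value agrees with the reported Z to within about 0.01, which is the level of rounding in the published interval bounds.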

Three-level analysis of combined odds ratio and Hedges' g effect sizes.
In a three-level random-effects meta-analysis combining all the effect sizes, with odds ratios log-transformed to approximate a normal distribution similar to Hedges' g, the Q-statistic for the test of homogeneity of effect sizes was 94.514 (p < 0.001). The estimated heterogeneity variances (Tau-squared) at level 2 and level 3 were 0.5908 and 0.8039, respectively. The level 2 and level 3 I-squared values were 0.4028 and 0.5481, respectively. SSD status (level 2) and cluster of studies (level 3) therefore explain about 40% and 55% of the total variation, respectively. The average population effect (Z-statistic and its 95% Wald CI) was 2.1429 (1.0915-3.1942).
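The relationship between the level-specific Tau-squared and I-squared values can be illustrated with a short Python sketch. It assumes the commonly used definition in which each I-squared is that level's Tau-squared divided by the sum of both Tau-squared values and a typical within-study sampling variance. The source does not state which definition was used, and the function name and the back-calculated sampling variance are illustrative only.

```python
def three_level_i2(tau2_level2, tau2_level3, typical_sampling_var):
    """Return the share of total variation attributed to level 2 and level 3."""
    total = tau2_level2 + tau2_level3 + typical_sampling_var
    return tau2_level2 / total, tau2_level3 / total


# Values reported in the text.
TAU2_LEVEL2, TAU2_LEVEL3 = 0.5908, 0.8039
I2_LEVEL2_REPORTED, I2_LEVEL3_REPORTED = 0.4028, 0.5481

# Assumption: the typical sampling variance is not reported, so it is
# back-calculated here from the reported level 2 figures.
implied_total = TAU2_LEVEL2 / I2_LEVEL2_REPORTED
implied_sampling_var = implied_total - TAU2_LEVEL2 - TAU2_LEVEL3

i2_level2, i2_level3 = three_level_i2(TAU2_LEVEL2, TAU2_LEVEL3, implied_sampling_var)
print(f"level 2 I2 = {i2_level2:.4f}, level 3 I2 = {i2_level3:.4f}")
```

Under this assumption, the level 3 value recovered from the level 2 figures matches the reported 0.5481 to three decimal places, and the remaining share of roughly 5% corresponds to sampling error.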

Discussion
This meta-analysis is among the first to test theories which posit that SDs are specific to the schizophrenia spectrum, and its findings are consistent with those of two recent reviews 26,27 .
Our meta-analysis appears to indicate a significant magnitude of effect suggestive of a greater expression of SDs within the schizophrenia spectrum population, when compared with HCs and OMIs. This magnitude of effect was observed both in studies using the BSABS and in studies using the EASE. Thus, we have found good evidence to support the over-expression of these SD phenomena within the schizophrenia spectrum, whether they are interpreted as a subgroup of basic symptoms or a more pervasive distortion in the minimal self 7,14,39 . This over-expression of SDs within the schizophrenia spectrum is further supported by our meta-analysis of odds ratios for the likelihood of SDs occurring. Based on the 95% confidence interval of the pooled odds ratio, this meta-analysis found a 2.5-12 times greater likelihood of SDs occurring within the schizophrenia spectrum population, when compared with non-schizophrenia spectrum populations. Even following the removal of outliers, SDs were over one to 4.5 times more likely to occur within schizophrenia spectrum populations when compared to non-schizophrenia spectrum populations.
Despite good evidence suggesting that SDs are a core clinical feature of the schizophrenia spectrum, there are some limitations to the evidence. The variation in pooled effect sizes suggests that SDs are not experienced by everyone within the schizophrenia spectrum. Given that our meta-analysis did not subgroup for different comparator groups, it is difficult to establish the boundaries of SDs in SSDs. This is perhaps reflected in the significant heterogeneity observed across all pooled effect sizes. Alongside methodological differences and variability in target population, the range of different comparison groups likely contributed to this generally high heterogeneity. With high heterogeneity, this study has less confidence in its pooled effect sizes. Also, these results should be interpreted with caution given the results of the three-level meta-analysis, which found the effect sizes generated by the meta-analysis to be highly dependent. Another methodological consideration that must be borne in mind is the inclusion of the same patients as separate samples in different types of analyses. Whilst we do not consider it at all likely that this approach is an intrinsic deficiency or a source of significant bias with regard to our findings, we must interpret these results with some caution given the high degrees of variability and inconsistency in the included studies' original methodologies, which necessitated our analytical approach in the first place.
We anticipated from the outset that there would likely be some differences in the patterns of results from the EASE and the BSABS, given their conceptual and methodological differences. In particular, we expected results from BSABS studies to demonstrate less variance than those from EASE studies. Hence, we chose to analyse the BSABS and EASE separately, which has enabled this systematic review to empirically compare the two assessment tools.
There are, of course, important caveats with regard to making this comparison. Studies utilising the BSABS have often not used the full scale, and some components of the BSABS (e.g. perceptual disorders) cannot be rated by the EASE and vice versa. However, there are still significant overlaps as to what these two scales measure, which focus on SDs albeit from different schools of thought. The BSABS was designed to facilitate the prediction of imminent risk of psychosis, hence its empirical grounding 7 . This is reflected in the results of our meta-analysis. For studies using the BSABS, we observed a medium to large effect size suggesting greater SD aggregation in SSDs, less variance compared to the EASE, and moderate heterogeneity. The BSABS was developed from the unpublished Heidelberg checklist using in-depth interviews to identify basic symptoms, which were then grouped by clinical reasoning 7,40,41 . The smaller range of items, refined incrementally with a good empirical basis is a likely explanation for these results.
In contrast, the EASE was developed with a focus on exploring the nature and experience of SDs as a core phenotype within the broad schizophrenia spectrum based on self-descriptions obtained from patients suffering from SSDs, thus it has a more theoretical grounding and is informed by the Husserlian approach to phenomenology 7,19 . Studies using the EASE showed a very large effect size suggestive of greater SD aggregation within SSDs, greater variance, and very high heterogeneity. The EASE was developed from a subgroup of BSABS items which were hybridised with philosophical concepts and qualitative explorations of the abnormal self-experiences of those with SSDs 19,41 . Therefore, it is logical that studies using the EASE, with items assessing a greater range of SDs but with less empirical grounding, would express a greater effect size with more variance. It is important to recognise these conceptual differences as they may explain some of the differences observed in the BSABS and EASE study results.
This study gives validation to the concept of SDs as a core clinical feature of the broad schizophrenia spectrum 3 . Therefore, this review hopes to encourage clinicians' interest in phenomenology, since there are vital and clinically relevant findings to be drawn from it. From the perspective of the ipseity disturbance model, the magnitude of effect observed within this analysis gives credit to the concept of SDs as a core clinical vulnerability phenotype of the schizophrenia spectrum, thus informing the construct validity of SSDs. From the perspective of the perceptual anomalies model, this study's lack of focus on CHR groups prevents commenting on the BSABS's use in predicting conversion to psychosis. However, the model is supported by the observed effect sizes for SD score and significant odds of SDs being present within SSDs. Regardless of the favoured model, it is apparent that SDs are a core phenomenon within SSDs. These findings could improve clinicians' understanding of the lived experience of individuals on the schizophrenia spectrum, enabling them to improve patients' quality of life. From a research perspective, exploration of SDs via the BSABS and the EASE provides one of the most promising avenues for advancing current understanding of psychosis development.
In addition, with these findings on the validity of SD assessment tools in identifying SDs within SSDs, this study hopes to encourage the adoption of assessment tools for SDs within clinical practice. However, we do not recommend the implementation of current assessment tools. Despite the high interrater reliability of the EASE and BSABS 12,42 , they are lengthy and resource-intensive assessments 7,43 . This creates a difficult conundrum in clinical practice, where the volume of first-hand personal data gathered must be balanced with the limited time clinicians have to perform assessments. This review proposes the development of a shorter assessment than the current BSABS and EASE, aimed at capturing key SD manifestations. What classifies as a key SD phenomenon is beyond the scope of this review but should be investigated. However, this review would like to emphasise that tick-box checklists should not necessarily be pursued for clinical use when assessing SD in patients; their utility is perhaps better suited for the purpose of screening large, potentially healthy, populations for SD. Self-report assessments, such as the FCQ and IPASE, have been shown to be unreliable when used as an SD assessment tool despite their potential as screening measures for SD 43-46 . Both have poor agreement with interviewer assessments, frequently overestimating the presence of SDs. Although there are many valid and reliable tick-box checklists, in the context of SDs, it is a logical extrapolation that tick-box checklists would have the same unreliability in this area. Perhaps future research into SD assessment tools should consider a mixed-methods approach, not unlike the EASE. However, a greater emphasis should be placed on empirically based items, as with the BSABS.
Finally, this study recommends that future research into SDs adopt a more robust methodological framework with more consistent reporting. This recommendation is based on the generally high heterogeneity and inconsistent quality of included studies. At least one study made inappropriate use of statistics, for example using Fisher's exact test to calculate odds ratios in small sample sizes. Often a lower quality was due to bias introduced by a lack of random selection, failure to blind participants and assessors, and samples unrepresentative of the target population. It is worth noting it would be difficult to reduce this bias in some of these studies. Current SD assessments require intense interviewing, making it difficult to introduce blinding. This also makes it difficult to recruit participants with poor cognitive functioning or aggression, an underrepresented subpopulation in included studies. However, if future research can form a standardised protocol for the exploration of SD phenomena and transparently report methodology, then the reliability of both individual and pooled study results will be improved.
The pooled results of our meta-analysis provide more powerful evidence for the association of SDs and the schizophrenia spectrum than existing individual studies. The findings of this meta-analysis echo those of the recent meta-analysis by Raballo et al. 26 . Their meta-analysis demonstrated greater effect sizes than ours; however, this can likely be explained by the greater number of studies they included, which in turn increases their analysis' power. Our meta-analysis included fewer studies as our eligibility criteria excluded studies involving adolescent and CHR groups. By contrast, the current analysis included longitudinal studies and performed a three-level meta-analysis with nested effect sizes. This analysis showed a high level of dependency and so reduces the confidence with which conclusions can be drawn. However, by performing it the robustness of this meta-analysis' methodology and the reliability of its results have been increased.
Whilst not the first published meta-analysis within this field 26,27 , this meta-analysis is the first to opt to analyse the BSABS and EASE separately. Although this reduces the power of pooled effect sizes, it improves the precision of the results and allows for more accurate conclusions to be drawn. It also facilitates comparison of each assessment tool.
However, there are several limitations to this study. Firstly, the meta-analysis did not perform subgroup analysis on different comparison groups. HCs and OMIs as comparisons provide their own respective strengths and weaknesses. This study chose not to perform the subgroup analysis to maintain sample sizes and power. However, this likely accounted for a proportion of observed heterogeneity.
This meta-analysis originally intended to calculate the prevalence of SDs in SSDs. However, the lack of a cutoff score for the presence/absence of an SD prevented this from being done. Although this was compensated for by pooling odds ratios for the likelihood of SDs, it still diminishes the accuracy and generalisability of our results.
Finally, this meta-analysis chose not to explore populations with a CHR for SSDs. This was done given the sometimes fleeting and non-specific nature of SDs within CHR populations, which would not warrant accurate measurement of SDs. However, the use of these assessment tools in the prediction of SSD conversion risk is a major area of interest for their clinical utility. Therefore, by not analysing this population, this study cannot make a complete assessment of the clinical utility of these SD assessment tools.
In summary, evidence from this meta-analysis suggests that SDs show a greater aggregation and likelihood of occurring within the broad schizophrenia spectrum, when compared to HCs and OMIs. This aids in validating SDs as a core clinical feature of SSDs, which carries implications for aetiological research into SSDs. Assessment tools for SDs have potential for clinical application; however, this might be unlikely in their current iterations.

Methods
Search strategy and data collection. This systematic review was conducted following the guidelines provided in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement checklist 47 . The electronic literature search was conducted by one researcher (S.B.) using the databases Medline, Embase, PsycINFO, PubMed and the Web of Science. To ensure all relevant literature was captured, grey literature was searched for on Google Scholar, OpenGrey, ProQuest, and PsycEXTRA. The references and citations of included studies were also explored to gather literature missed in the initial search. Where relevant, the researchers contacted the authors of identified studies with inaccessible, incomplete, or ongoing trials to gather extra data.
In line with other systematic reviews, a condition, context, and population (CoCoPop) process was developed for this systematic review and was as follows: the condition as adults with an SD; the context as any setting; and the population as adults with a clinically diagnosed SSD or first episode non-affective psychosis (NAP).
A detailed description of the methodology for data collection can be found in Supplementary Materials. This includes rationales for the search terms, the screening process, the review process, and the process for handling disputes. Supplementary Material 3 presents an example search strategy. The eligibility criteria for included studies are shown in Supplementary Table S1. To summarise, the following inclusion criteria were applied: participants with a diagnosis of SSD or NAP only, inclusion of an SD assessment tool, inclusion of an observer rated SD score, adult only participants (mean age > 18 years old), English language only, participants with a clinical diagnosis only, and inclusion of a comparison group. The following exclusion criteria were applied: publication pre-1967, inclusion of a self-reported SD score, inclusion of children or adolescents, non-English language, participants with research diagnoses only, no comparison group, single case studies, and qualitative studies.
Data extraction. The researcher (S.B.) responsible for data collection carried out data extraction in line with the PRISMA statement guidelines 47 . The research supervisor (C.H.) provided oversight to the process for quality assurance. The full texts of included articles were re-read, with key characteristics extracted and placed within a summary of findings table (Table 1). Key characteristics reported within the table included geographical setting, instrument for clinical diagnosis, SD assessment tool, sample type, sample size, SD assessment results (mean SD score), SD odds ratio, key findings, descriptive psychopathology, and demographic features of the sample. The summary of findings table included a rating for "quality of evidence" and "risk of bias". To maintain transparency, a detailed summary of how each quality of evidence and risk of bias rating was determined can be found in Supplementary Table S2.
A detailed description of the methodology for quality of evidence and risk of bias assessments is given in Supplementary Materials. The quality of included studies was determined through assessment with the Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) handbook 48 (Supplementary Materials). Study quality was graded from "high" to "very low". The risk of bias for included studies was determined using a risk-of-bias tool designed specifically for systematic reviews of prevalence studies (Supplementary Materials) 49 . Study bias was rated from "low" to "high".

Data analysis (systematic review and meta-analysis). A narrative synthesis in line with Cochrane guidelines was performed on the 15 studies which met the aims and eligibility criteria of this systematic review 50,51 . This involved an initial synthesis of data relating to the utility of SD assessment tools, followed by the extraction of relevant findings (Table 1) mentioned above. Findings relevant to the research aims were systematically discussed and both the quality and potential bias in included studies were critically appraised. A meta-analysis was performed on all included studies with an SSD group and at least one other comparison group. Given the varied comparison groups for different included studies, we anticipated considerable heterogeneity. Meta-analysis was performed using the Comprehensive Meta-analysis Professional statistical software package version 3.0.
For descriptive analysis, all results were reported via a random-effects model 52 . The random-effects model was favoured because of the anticipated heterogeneity in the methodologies of different included studies in this field of research. The random-effects model's aim of facilitating inferences about population level effects also aligned with this systematic review's research aims 52,53 . Effect sizes were calculated for standardised differences (Hedges' g with 95% confidence intervals) in mean total SD scores and the odds ratios of having an SD.
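The two effect-size types described above can be sketched in Python as follows. This is a minimal illustration using the standard textbook formulas for Hedges' g (Cohen's d with a small-sample correction) and the log odds ratio of a 2x2 table; it is not a reproduction of the Comprehensive Meta-Analysis implementation. The function names and the input numbers are hypothetical and are not taken from any included study.

```python
import math


def hedges_g(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Return Hedges' g and its approximate variance for two independent groups."""
    df = n_1 + n_2 - 2
    pooled_sd = math.sqrt(((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / df)
    d = (mean_1 - mean_2) / pooled_sd              # Cohen's d
    correction = 1 - 3 / (4 * df - 1)              # small-sample correction factor
    var_d = (n_1 + n_2) / (n_1 * n_2) + d ** 2 / (2 * (n_1 + n_2))
    return correction * d, correction ** 2 * var_d


def log_odds_ratio(a, b, c, d):
    """Return ln(OR) and its variance for a 2x2 table with no zero cells.

    a, b: SSD participants with / without SDs
    c, d: comparison participants with / without SDs
    """
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def ci_95(estimate, variance):
    """Normal-theory 95% confidence interval."""
    half_width = 1.959964 * math.sqrt(variance)
    return estimate - half_width, estimate + half_width


# Hypothetical data, for illustration only.
g, var_g = hedges_g(20.0, 8.0, 30, 10.0, 6.0, 30)
log_or, var_log_or = log_odds_ratio(20, 10, 5, 25)
or_low, or_high = (math.exp(bound) for bound in ci_95(log_or, var_log_or))

g_low, g_high = ci_95(g, var_g)
print(f"g = {g:.3f}, 95% CI {g_low:.3f} to {g_high:.3f}")
print(f"OR = {math.exp(log_or):.2f}, 95% CI {or_low:.2f} to {or_high:.2f}")
```

The odds ratio interval is built on the log scale and then exponentiated, which is why it is asymmetric around the point estimate. Tables with zero cells would need a continuity correction, which this sketch does not handle.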
Using Comprehensive Meta-Analysis, heterogeneity was quantified and assessed with Cochran's Q and I² statistics 54 . Regarding I², heterogeneity was graded as follows: low heterogeneity (0-25%), moderate heterogeneity (25-50%), high heterogeneity (50-75%), and very high heterogeneity (75-100%) 55,56 . Sensitivity analyses were performed on some models with very high heterogeneity. Potential outliers were identified as studies with point estimates over 2 standard deviations and p < 0.05. Each potential outlier was removed individually, and the model recalculated to determine their individual impact on the effect size and heterogeneity. Finally, all outliers were removed, and the model recalculated.
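The pooling, heterogeneity grading, and remove-one-study sensitivity steps described above can be sketched in Python. The sketch assumes the DerSimonian-Laird estimator of between-study variance, which the source does not name, and it assigns boundary values (25%, 50%, 75%) to the higher band because the published bands overlap at their edges. All function names and the input data are hypothetical.

```python
import math


def pool_random_effects(effects, variances):
    """DerSimonian-Laird random-effects pooling with Cochran's Q and I2 (%)."""
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {"pooled": pooled, "se": se, "z": pooled / se, "q": q, "tau2": tau2, "i2": i2}


def grade_i2(i2_percent):
    """Map I2 (%) to the bands used in the text (boundaries go to the higher band)."""
    if i2_percent < 25:
        return "low"
    if i2_percent < 50:
        return "moderate"
    if i2_percent < 75:
        return "high"
    return "very high"


def leave_one_out(effects, variances):
    """Re-pool after dropping each study in turn."""
    return [
        pool_random_effects(effects[:i] + effects[i + 1:], variances[:i] + variances[i + 1:])
        for i in range(len(effects))
    ]


# Hypothetical effect sizes and sampling variances, for illustration only.
effects = [0.2, 0.8, 1.4]
variances = [0.04, 0.04, 0.04]
result = pool_random_effects(effects, variances)
sensitivity = leave_one_out(effects, variances)
print(f"pooled = {result['pooled']:.3f}, I2 = {result['i2']:.1f}% ({grade_i2(result['i2'])})")
```

Applied to the I² values reported in the Results, `grade_i2` returns "moderate" for 49%, "high" for 66%, and "very high" for 76% and 92%.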
Due to the significant differences between assessment with the EASE and BSABS, subgroup analysis was performed on the type of SD assessment tool used. Assessment with the EASE does not enable the generation of odds ratios because there is no quantitative cut-off score for the presence/absence of SDs. Therefore, odds ratios for the presence/absence of SDs were only calculated for studies using the BSABS. A further subgroup analysis was performed on EASE studies based on the type of scoring reported (dichotomous or continuous scores).
Some included studies used the same sample of participants as other included studies or were follow-ups of other included studies. For this reason, a three-level random-effects meta-analysis with nested (dependent) effect sizes was performed in R using the metaSEM package 57 . For the purpose of this analysis, relevant studies using the BSABS and EASE were mixed together. This was done given the highly dependent nature of the effect sizes. Odds ratios, such as in the Parnas et al. studies 5,35 , were log-transformed to approximate a normal distribution like Hedges' g. Also, only one set of effect sizes (usually SSD/HC) was selected for each comparison.
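The rationale for log-transforming odds ratios can be illustrated with the pooled BSABS odds ratio reported in the Results. The Python sketch below is only an illustration of the transformation, not part of the original three-level analysis, and the variable names are assumptions. It shows that the confidence interval is strongly asymmetric on the raw scale but approximately symmetric around the estimate on the log scale, which is the property a normal approximation relies on.

```python
import math

# Pooled BSABS odds ratio and 95% CI as reported in the Results.
POOLED_OR, CI_LOW, CI_HIGH = 5.435, 2.499, 11.823

log_or = math.log(POOLED_OR)
log_low, log_high = math.log(CI_LOW), math.log(CI_HIGH)

# On the log scale the interval should be roughly centred on the estimate.
log_midpoint = (log_low + log_high) / 2
raw_midpoint = (CI_LOW + CI_HIGH) / 2

# Assumption: a normal-theory interval, so SE = width / (2 * 1.96).
log_se = (log_high - log_low) / (2 * 1.959964)

print(f"raw scale: OR = {POOLED_OR}, CI midpoint = {raw_midpoint:.3f}")
print(f"log scale: ln(OR) = {log_or:.3f}, CI midpoint = {log_midpoint:.3f}, SE = {log_se:.3f}")
```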
Finally, to assess potential publication bias, a funnel plot was produced encompassing all studies included in the current meta-analysis (see Supplementary Materials) 58 .