Introduction

In psychiatry, phenotypic heterogeneity of disorders and their overlapping symptoms that may presumably share some fundamental biologic underpinnings is a major challenge for tailoring individualized therapies1. Similarly, the course and phenotypic expression of schizophrenia are variable2. Schizophrenia is a complex polygenic psychotic disorder with a lifetime morbidity risk of 0.7%3. The twin- and SNP-based heritability estimate of schizophrenia was 80%4 and 30%5, respectively.

According to the Diagnostic and Statistical Manual of Mental Disorders (DSM) criteria, the clinical manifestations of schizophrenia are positive (e.g., hallucinations, delusions and disorganized behaviour) and negative (e.g., emotional expressive deficit, social amotivation, social withdrawal and difficulty in experiencing pleasure) symptoms6. Cognitive deficit is also one of the hallmark manifestations of schizophrenia; it occurs in 75–80% of patients and is often associated with poor daily functioning and quality of life7. Cognitive impairment in schizophrenia can be selective or general, though the most common deficits occur in executive function, processing speed, memory (e.g. episodic, verbal and working), attention, verbal fluency, problem-solving and social cognition8,9,10,11. Patients harbor a wide range of subjectively defined symptoms, which together yield intrinsically heterogeneous groups of people who are collectively diagnosed with schizophrenia. Subclinical or prodromal symptoms are also evident in relatives of patients with schizophrenia and in the healthy general population12,13,14.

Despite a century of effort, attempts to understand the heterogeneity in the clinical presentation and course of schizophrenia have been unsuccessful. This can be due to the subjective measurement of its clinical symptoms, variation in response to treatment, lack of valid, stable, and meaningful sub-phenotyping methods, and molecular complexity with limited understanding of the pathophysiology15,16,17. Phenotypic heterogeneity can be related to several intrinsic and extrinsic factors and expressed across patients, time, and disease sub-phenotypes16,18. Identification of meaningful homogeneous subgroups of people based on their symptoms or endophenotypes (e.g. neuropsychological markers, neural substrates, and neurological soft signs) requires the use of both supervised and unsupervised analyses. Distinguishing heterogeneous patients into more behaviorally and biologically similar subgroups is expedient not only to unveil common etiologies but also to examine the patterns of clinical symptoms, understand the biology of disease, predict treatment response and develop new targeted treatments that improve recovery and functional outcomes15,16,19,20.

For tackling heterogeneity, in the past decade, numerous efforts have been undertaken by carefully designing studies and developing statistical models implemented in various programming languages and software16. In 2013, the American Psychiatric Association also endorsed a dimensional approach to identify intermediate categories based on the subjective report of severity of symptoms6. As a result, researchers have been using latent class cluster analyses and growth mixture models to explore clusters of individuals and trajectories of clinical symptoms in various settings15,21,22. Statistical methods can be used to identify subgroups and describe within and between individual variations to guide clinicians and statisticians to explore the relationship of diseases with various clinical and functional outcomes, treatment response, and neuropathological change. Moreover, subtyping using imaging, biological and symptom data is a recognizable method and widely used in psychiatric research21.

Several reviews have been conducted on positive symptoms23, negative symptoms24,25,26 and cognitive dysfunction7,9,27,28,29,30,31,32,33,34,35. However, these reviews have largely focused on the conventional approach of determining an average change in the course of symptoms over time and the difference between subjects (e.g., patient vs sibling, sibling vs control, or patient vs control) and diagnoses. The reviewed studies are also based on correlation analysis, which is believed not to be a strong measure of association between predictors and outcomes36. Besides, these primary studies vary in terms of study population and use of assessment tools, scoring and standardization techniques, and have several limitations, such as small sample size, short duration of follow-up and limited use of data from healthy siblings and/or controls9,37,38. Of interest, none of these reviews fully addressed evidence from both longitudinal and cross-sectional data-driven studies on schizophrenia symptoms and cognitive deficits among patients with schizophrenia spectrum disorders, relatives and healthy controls. Taken together, our understanding of the heterogeneity of the course of schizophrenia symptoms and cognitive deficits is thus still limited. In the present systematic review, we summarized the contemporary evidence from cross-sectional and longitudinal studies on positive and negative symptoms and cognitive deficits among patients with schizophrenia spectrum disorders, their unaffected siblings and healthy people. Additionally, we explored the extent and origin of heterogeneity across studies. We further highlighted common methodological gaps and pointed out future directions to optimize the translatability of evidence from data-driven studies within the outlook of a personalized approach.

Methods

Registration and reporting

This systematic review was conducted and reported based on a registered protocol39 and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (Supplementary File 1), respectively40,41. The screening and selection process of the reviewed articles is further illustrated using a PRISMA flow diagram.

Databases and search terms

A systematic search of PubMed, PsycINFO, PsycTESTS, PsycARTICLES, SCOPUS, EMBASE and Web of Science electronic databases was performed. A comprehensive search strategy was developed for PubMed and adapted for each database in consultation with a medical information specialist (Supplementary File 2). The following search terms were used in their singular or plural form in the title, abstract, keywords and text fields of the articles: “schizophrenia”, “psychosis”, “non-affective psychosis”, “cognitive deficit”, “cognitive dysfunction”, “cognitive alteration”, “negative symptoms”, “deficit syndrome”, “positive symptoms”, “psychopathology”, “cognit*”, “neuropsycholog*”, “neurocognition”, “longitudinal”, “follow-up”, “course”, “heterogeneity”, “endophenotype”, “profile”, “cluster analysis”, “siblings”, “healthy controls”, “latent class analyses”, “Symptom trajectories”, “traject*”, “group modelling” and “trajectory”. Cross-references of included articles and grey literature were also hand-searched. Furthermore, we searched the table of contents of the journals of Schizophrenia Research, Schizophrenia Bulletin, Acta Psychiatrica Scandinavica, BMC Psychiatry, American Journal of Psychiatry and British Journal of Psychiatry to explore relevant studies. The freezing date for the final search was August 2019. In this review, we use ‘trajectory’ for groups identified in longitudinal studies and “cluster” for groups identified in cross-sectional studies.

Inclusion and exclusion criteria

Studies that met the following criteria were included: (1) longitudinal and cross-sectional studies; (2) studies that reported at least two clusters or trajectory groups of individuals using a statistical method based on a distinct positive symptom, negative symptom, and cognitive deficit or a combination of these symptoms; (3) studies conducted in patients with schizophrenia spectrum disorders, unaffected relatives, or healthy individuals irrespective of their clinical (e.g. medication status, severity of illness) and sociodemographic characteristics; and (4) studies published in English from 2008 to 2020. The publication year was limited to the last decade to capture the latest available evidence, which is likely to provide statistically powerful estimates and to subtype schizophrenia symptoms successfully given the increased number of large cohorts. To maximize the number of searched articles, the follow-up period in longitudinal studies was not restricted. Longitudinal studies based on the analysis of mean levels of change in symptom scores were excluded because, by treating between-subject variation as error, they do not capture individuals’ patterns of change over time, so the actual heterogeneity of groups cannot be revealed42. Studies based on non-statistical methods of clustering (e.g. family-based clustering) were also excluded, as were review papers, commentaries, conference abstracts, duplicate studies, editorials, and qualitative studies. Furthermore, we excluded studies in which the trajectory groups or clusters were generated based on scores constructed using a combination of schizophrenia symptoms and other unspecified psychotic symptoms.
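The rationale for excluding mean-level analyses can be illustrated with a small worked example. The sketch below uses made-up symptom scores (not data from any reviewed study) for two hypothetical subgroups, one improving and one worsening; the names `improving`, `worsening` and `visit_means` are illustrative assumptions.

```python
from statistics import mean

# Hypothetical symptom scores at four visits; NOT data from any reviewed study.
improving = [[6, 5, 3, 2], [7, 5, 4, 2]]   # scores fall over time
worsening = [[2, 3, 5, 6], [2, 4, 5, 7]]   # scores rise over time
cohort = improving + worsening


def visit_means(group):
    """Mean score at each visit across all individuals in the group."""
    n_visits = len(group[0])
    return [mean(person[v] for person in group) for v in range(n_visits)]


print("whole cohort:", visit_means(cohort))
print("improving   :", visit_means(improving))
print("worsening   :", visit_means(worsening))
```

In this constructed example the cohort mean is identical at every visit, although each subgroup changes markedly. This is the kind of between-subject variation that trajectory-based methods are intended to recover.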

Data retrieval and synthesis

Studies retrieved from all databases were exported to RefWorks version 2.0 for Windows web-based citation manager, followed by the removal of close and exact duplicates. All independent studies were exported to a Microsoft Excel spreadsheet to screen for further inclusion criteria. Authors T.D.H. and L.H.R. independently screened the titles and abstracts. The two reviewers had substantial agreement (Kappa statistic (κ) = 0.62). Inconsistent decisions were discussed and resolved by consensus. Finally, full texts were reviewed, and the following data were independently extracted by T.D.H. and L.H.R.: first author name, publication year, country, cohort/research center, study population, sample size, symptom dimension(s), assessment tool, study design, duration of follow-up for longitudinal studies, frequency of assessment, method of calculating composite score, method of clustering/trajectory analysis, number of identified clusters or trajectory groups and significant correlates of clusters and predictors of trajectories43. The corresponding author was contacted by email if the full text of an included article was not accessible. When studies did not report the cohort or research center, we extracted the institutional affiliation of the first or corresponding author.
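For readers unfamiliar with the Kappa statistic, the following is a minimal sketch of how Cohen's kappa is computed for two reviewers making include/exclude decisions. The screening counts are hypothetical and were chosen only for illustration; the review reports κ = 0.62 but not the underlying 2 × 2 table, and the function name `cohen_kappa` is an assumption.

```python
def cohen_kappa(both_include, only_a, only_b, both_exclude):
    """Cohen's kappa for two raters making binary include/exclude decisions."""
    n = both_include + only_a + only_b + both_exclude
    # Observed agreement: proportion of records on which the raters agree.
    observed = (both_include + both_exclude) / n
    # Chance agreement, from each rater's marginal inclusion rate.
    a_inc = (both_include + only_a) / n
    b_inc = (both_include + only_b) / n
    expected = a_inc * b_inc + (1 - a_inc) * (1 - b_inc)
    return (observed - expected) / (1 - expected)


# Hypothetical counts; NOT the counts behind the kappa reported in this review.
kappa = cohen_kappa(both_include=40, only_a=20, only_b=20, both_exclude=920)
print(round(kappa, 2))
```

Because most records are excluded by both reviewers, raw agreement is very high in such a table, and kappa corrects for the agreement expected by chance alone.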

Results

Search results

In total, 2262 articles were identified through database searching and an additional 26 articles were obtained through manual searching of cross-references and tables of content of relevant journals. After removing duplicate and unrelated articles, the titles and abstracts of 1292 articles were screened. The evaluation of titles and abstracts resulted in the exclusion of 1231 articles. In total, 61 articles were selected for full-text review, and eight articles44,45,46,47,48,49,50,51 were excluded due to unclear outcomes, mixed diagnosis of the study population and use of a non-statistical method of clustering or clustering based on different phenotypes of schizophrenia. Finally, data were extracted from 53 longitudinal and cross-sectional studies. The PRISMA flow diagram of screening and the selection process is shown in Fig. 1.

Fig. 1 PRISMA flow diagram illustrating the screening and selection of literature.
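The counts reported in the search results can be checked with simple arithmetic. The sketch below uses only the numbers stated in the text; the number of records removed before screening is derived here rather than reported, and the variable names are illustrative.

```python
# Counts as stated in the search results.
identified = 2262 + 26             # database searching + manual searching
screened = 1292                    # titles and abstracts screened
excluded_title_abstract = 1231
excluded_full_text = 8

full_text_reviewed = screened - excluded_title_abstract
included = full_text_reviewed - excluded_full_text

# Derived, not stated in the text: duplicates and unrelated records removed.
removed_before_screening = identified - screened

print(full_text_reviewed, included, removed_before_screening)
```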

Overview of included studies

The 53 included studies were conducted in 30 countries and published between 2009 and 2020. Seventeen studies were conducted in the USA and a few studies were conducted internationally. Of these, 19 studies were longitudinal and involved 11,684 patients, 1059 siblings and 2194 controls or members of the general population from more than eight countries, whereas 34 studies were cross-sectional and involved 6138 patients, 7670 siblings, and 3326 controls from 14 countries. Most of the longitudinal studies examined trajectories of positive and negative symptoms in patients, whereas most of the cross-sectional studies explored cognitive subtypes in patients. Only one longitudinal study52 and three cross-sectional studies53,54,55 examined cognitive subtypes among siblings. Overall, two to six subtypes of positive and negative symptoms and cognitive deficits were identified.

Longitudinal studies

In total, 19 longitudinal studies were reviewed. They included all population age groups, with the duration of follow-up ranging from six weeks to 10 years. The sample size ranged from 138 to 1990 subjects. Even though all studies had a similar aim, they used slightly different models of trajectory analysis and model selection criteria. The reported data-driven methods were growth mixture modelling (GMM)17,56,57, latent class growth analysis (LCGA)16,19,20,58,59,60,61, mixed-mode latent class regression modelling22,62,63, group-based trajectory modelling (GBTM)52,64,65,66 and Ward’s method67. The reported model selection indices were Akaike’s Information Criterion (AIC), Bayesian information criterion (BIC) (used in most studies), deviance information criterion (DIC), logged Bayes factor, sample size adjusted BIC (aBIC), bootstrap likelihood ratio test (BLRT), Gap statistic, Lo–Mendell–Rubin Likelihood Ratio Test (LMR-LRT) and entropy.
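Several of the model selection indices listed above are simple functions of a fitted model's log-likelihood. The sketch below shows the standard formulas for AIC, BIC, sample size adjusted BIC and relative entropy. The log-likelihoods, parameter counts and sample size are hypothetical values chosen for illustration; they do not come from any reviewed study, and the names `information_criteria`, `relative_entropy` and `fits` are assumptions.

```python
import math


def information_criteria(log_lik, n_params, n_obs):
    """AIC, BIC and sample size adjusted BIC; lower values indicate better fit."""
    aic = 2 * n_params - 2 * log_lik
    bic = n_params * math.log(n_obs) - 2 * log_lik
    # The adjusted BIC replaces n with (n + 2) / 24 in the penalty term.
    abic = n_params * math.log((n_obs + 2) / 24) - 2 * log_lik
    return {"AIC": aic, "BIC": bic, "aBIC": abic}


def relative_entropy(posteriors):
    """Classification quality in [0, 1]; 1 means every person is assigned with certainty.

    posteriors: one row per person holding that person's class-membership probabilities.
    """
    n, k = len(posteriors), len(posteriors[0])
    uncertainty = -sum(p * math.log(p) for row in posteriors for p in row if p > 0)
    return 1 - uncertainty / (n * math.log(k))


# Hypothetical fits: number of classes -> (log-likelihood, number of free parameters).
fits = {2: (-5210.0, 7), 3: (-5150.0, 10), 4: (-5142.0, 13)}
n_obs = 500

best_by_bic = min(fits, key=lambda k: information_criteria(*fits[k], n_obs)["BIC"])
best_by_aic = min(fits, key=lambda k: information_criteria(*fits[k], n_obs)["AIC"])
print("BIC selects", best_by_bic, "classes; AIC selects", best_by_aic, "classes")
```

With these made-up numbers, BIC favours three classes and AIC favours four, because BIC penalizes additional parameters more heavily. This illustrates why studies that rely on different indices can arrive at different numbers of trajectories.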

Most longitudinal studies (Table 1) investigated the trajectory of positive symptoms, negative symptoms or both in patients, whereas one study68 explored the trajectory of schizotypy in a nonclinical population. Another study57 examined the association between positive and negative symptom trajectories in patients. Moreover, three studies examined the long-term trajectories of cognitive impairment in patients, their unaffected siblings and healthy controls16,52,66. One study52 investigated the association between patients’ and siblings’ cognitive trajectories as well. Overall, these studies characterized the general pattern of identified trajectories as progressive deterioration, relapsing, progressive amelioration and stable. The detailed results are presented per symptom domain as follows.

Table 1 Detailed characteristics of longitudinal studies (n = 19).

Positive symptoms

As presented in Table 1a, four studies19,20,57,65 investigated the trajectory of positive symptoms in patients with first-episode schizophrenia spectrum disorders who had received no prior antipsychotic treatment or treatment for less than three months. The duration of follow-up ranged from six weeks to 10 years, and symptoms were assessed five to seven times. Two studies19,65 used the Scale for the Assessment of Positive Symptoms (SAPS) to assess positive symptoms and identified five trajectories, with more than one-third of patients subtyped as having decreasing positive symptoms or as good responders. The other two studies used the Positive and Negative Syndrome Scale (PANSS) to assess positive symptoms: one identified three trajectories, with most patients grouped in class two57, and the other identified two trajectories, with most patients following an early-response and stable trajectory over time20. The identified predictors were male gender, low educational status, substance use, diagnosis of schizophrenia, long duration of untreated psychosis, poor global functioning, and severe baseline positive and negative symptoms (Fig. 2).

Fig. 2 Schizophrenia spectrum circle illustrating the schizophrenia symptoms and cognitive deficits (innermost circle), sample groups (inner circle), identified trajectories (outer circle) and predictors (outermost circle) in longitudinal studies. Findings are read and interpreted based on the line up in the circle.

Negative symptoms

Eight longitudinal studies19,20,57,58,61,64,65,67 explored negative symptom trajectories among patients with first-episode non-affective psychosis with no prior treatment or minimal treatment of up to three months (Table 1b). Two studies19,65 used the Scale for the Assessment of Negative Symptoms (SANS), four studies20,57,61,64 used the PANSS and two studies used the High Royds Evaluation of Negativity Scale58 and the Clinical Global Impressions-Schizophrenia scale67 to assess negative symptoms. The duration of follow-up ranged from 6 weeks to 10 years, and symptoms were assessed three to 64 times. Five studies19,20,57,61,64 identified four trajectories of negative symptoms with variable patterns, whereas one study65 found five trajectories, with approximately half of the patients having persistent symptoms or a poor response to treatment. The other two studies58,67 found three trajectories, with most of the participants having minimal and stable negative symptoms. Our review showed that trajectories of negative symptoms were predicted by older age, male gender, low educational status, ethnic minority, being unmarried, family history of psychosis, long duration of untreated psychosis, poor premorbid adjustment, severe depressive and disorganized symptoms, diagnosis of schizophrenia, unemployment, poor functioning and quality of life, high antipsychotic dosage, low cognitive performance, and high levels of baseline negative and positive symptoms (Fig. 2).

Positive and negative symptoms

Combining both positive and negative symptom dimensions, which is illustrated in Table 1c, eight studies17,22,56,57,59,60,62,63 inspected trajectories in patients with first-episode or chronic schizophrenia with antipsychotics treatment for more than three months and all of these studies used the PANSS scale to measure positive and negative symptoms. The duration of follow-up and frequency of assessment ranged from three months to two years and five to ten times, respectively. Among these studies, four22,56,60,62 of them revealed five trajectories, two57,63 of them revealed three trajectories, one study17 found four trajectories and another study59 found two trajectories with substantial difference in the nature, pattern and distribution of trajectories. Symptom trajectories were predicted by older age, male gender, ethnic minority, increased weight, diagnosis with schizophrenia, late age of illness onset, depressive and extrapyramidal symptoms, general psychopathology, type of antipsychotics treatment (e.g., aripiprazole, olanzapine), exacerbation, long duration of illness, poor premorbid and cognitive functioning, low global functioning and quality of life, living situation, involuntary admission, previous hospitalization and severe baseline positive and negative symptoms (Fig. 2).

Cognitive deficits

As shown in Table 1d, three studies investigated the trajectories of global cognitive deficits in patients with first-episode psychosis, their siblings and healthy controls52,66, and in clinically stable outpatients with schizophrenia (SCZ) together with healthy controls16. The first, a six-year longitudinal study52 in which cognitive function was assessed with a cognitive test battery, found five trajectories of cognitive impairment in patients (most of them with mild to moderate deficits) and four trajectories in healthy siblings (most of them with normal cognitive function). The second study66, which was a follow-up of the previous study, found six cognitive trajectories (nearly half of the population had mild to severe cognitive impairment) by combining patients, siblings and controls. The third longitudinal study16 used the Mattis Dementia Rating Scale and reported three trajectories of global cognitive function (half of the participants had a high and stable trajectory) by combining patients and controls. Two studies found that patients with poor cognitive trajectories were younger, had low educational status, were of non-Caucasian ethnicity, lived in a sheltered facility, and had low IQ, poor premorbid adjustment, severe positive and negative symptoms, and low baseline cognitive performance16,52. Likewise, siblings with poor cognitive trajectories were younger, were more often female, had low educational status, were of non-Caucasian ethnicity, and had low IQ, poor premorbid adjustment, severe schizotypy, frequent positive psychotic experiences, and low baseline cognitive performance (Fig. 2)52. One study discovered that the polygenic risk score for schizophrenia significantly predicted poor long-term cognitive trajectory in combined samples of patients, siblings and controls66.

Schizotypy

A single longitudinal study assessed schizotypy in healthy college students using the Chapman Psychosis Proneness Scales (CPPS) and found four trajectories, in which nearly three-fourths of students were categorized as non-schizotypal68. This study also found that male gender and a high level of baseline schizotypy significantly predicted trajectories (Table 1e, Fig. 2).

In summary, when inspecting the findings of the longitudinal studies shown in Table 1, studies that found the same number of trajectories differed substantially in participant composition (patients, siblings and controls), assessment instruments, symptom dimensions, frequency of assessment, duration of follow-up, methods used to generate a composite score, data-driven methods applied, label, proportion, pattern and type of trajectories, and identified predictors. In addition, there was no link between the number and type of trajectories and the trajectory analysis method, study population or symptom dimension.

Cross-sectional studies

Of the 53 included studies, 34 were cross-sectional (Table 2) and were conducted in different population groups. The total sample size per study ranged from 62 to 8231 individuals irrespective of participants’ diagnostic status. The reported clustering methods were K-means or non-hierarchical clustering analysis21,53,55,69,70,71,72,73,74,75,76, Ward’s method or hierarchical analysis77,78,79,80,81,82,83, K-means clustering combined with Ward’s method18,38,54,84,85,86,87,88,89, latent class or profile analysis15,90,91 and two-step cluster analysis92,93,94. One study95 identified clusters using a combination of clinical/empirical and statistical clustering methods. The model selection criteria or similarity metrics were visual inspection of the dendrogram, Pearson correlation, squared Euclidean distance (the most common index), agglomeration coefficients, Dunn index, Silhouette width, Duda and Hart index, elbow test, variance explained, inverse scree plot, average proportion of non-overlap, AIC, BIC, aBIC, Schwarz’s BIC, Lo–Mendell–Rubin (LMR) test, adjusted LMR and BLRT.
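As an illustration of the most frequently reported approach, the following is a minimal sketch of K-means clustering with squared Euclidean distance on standardized test scores. The scores are made-up values for nine hypothetical participants on two cognitive tests, and the starting centers are fixed by hand so that the example is reproducible. Statistical packages typically use random or optimized initialization, and none of the names below are taken from the reviewed studies.

```python
import math


def zscore(rows):
    """Standardize each column to mean 0 and sample standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (len(c) - 1))
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centers, n_iter=100):
    """Alternate nearest-center assignment and center updates until stable."""
    for _ in range(n_iter):
        labels = [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the previous center if a cluster happens to be empty.
            new_centers.append([sum(c) / len(members) for c in zip(*members)]
                               if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Hypothetical scores on two cognitive tests; NOT data from any reviewed study.
scores = [[20, 22], [22, 21], [21, 23],
          [50, 48], [52, 51], [49, 50],
          [80, 79], [78, 81], [81, 80]]
z = zscore(scores)
labels, centers = kmeans(z, centers=[z[0], z[3], z[6]])
print(labels)
```

In this constructed example the three groups are well separated, so the algorithm recovers a low, an intermediate and a high performance cluster. Real data are rarely this clean, which is one reason indices such as silhouette width or the elbow test are used to decide on the number of clusters.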

Table 2 Detailed characteristics of cross-sectional studies (n = 34).

Among the 34 studies (Table 2), 22 studies18,38,53,54,70,71,75,76,78,79,80,81,82,84,86,87,88,90,91,92,93,94,95 reported cognitive clusters in patients with first-episode, stable or chronic schizophrenia with or without antipsychotic treatment and one study54 reported cognitive clusters in unaffected siblings. Other studies investigated clusters of negative symptoms15,85, positive symptoms83, and positive and negative symptoms21,69,77 in patients, and positive and negative schizotypy in a nonclinical population55,72,73,83. Furthermore, two studies75,90 investigated data-driven clusters by combining cognitive deficits and negative symptoms. Details on clusters and correlates of clusters are presented per symptom dimension as follows.

Positive symptoms

Only one study83 assessed hallucinatory experience in patients with schizophrenia using the Launay–Slade Hallucination Scale-Revised (LSHS-R) and identified three clusters (Table 2a). Given that this was an exploratory study, correlates of clusters were not studied.

Negative symptoms

Two studies15,85 reported three clusters of patients with (chronic) schizophrenia based on negative symptoms assessed by the SANS85 and the Schedule for the Deficit Syndrome15 (Table 2b). Identified clusters were significantly correlated with male gender, ethnic minority, low educational status, summer season of birth, early age of illness onset, severity of positive and negative symptoms, poor cognitive performance, poor functioning, high level of general psychopathology, previous hospitalization, poor premorbid adjustment, social anhedonia and poor attitude (Fig. 3).

Fig. 3 Schizophrenia spectrum circle illustrating the schizophrenia symptoms and cognitive deficits (innermost circle), sample groups (inner circle), identified clusters (outer circle) and correlates (outermost circle) in cross-sectional studies. Findings are read and interpreted based on the line up in the circle.

Positive and negative symptoms

Two studies21,77 assessed positive and negative symptoms in patients with childhood-onset or first-episode schizophrenia using the SAPS and SANS scales, respectively and found three clusters, while another study69 used the PANSS scale and found four clusters (Table 2c). Reported symptom clusters were characterized as low positive and negative symptoms, high positive and low negative, low positive and high negative, and high positive and high negative though the patterns and distributions of clusters were different across studies. Identified clusters were significantly correlated with male gender, low IQ, poor global functioning, poorer metacognitive ability, and high level of positive and negative symptoms (Fig. 3).

Cognitive deficits

Of the 22 studies conducted on neurocognitive deficits, 17 studies38,53,70,71,76,78,79,80,81,82,84,90,92,93,94,95 found three clusters, five studies18,75,86,87,91 reported four clusters and one study88 discovered five clusters among patients (Table 2d). Most studies assessed global cognitive function using a comprehensive neuropsychological test battery that included three to 18 cognitive subtests. Poor cognitive function in patients was associated with age, gender, non-Caucasian ethnicity, low socioeconomic and educational status, poor premorbid adjustment, low premorbid and current IQ, early age of illness onset, long duration of illness, severe positive and negative symptoms, poor social cognition, high antipsychotic dosage, use of second-generation antipsychotics, poor functioning and poor quality of life (Fig. 3). In unaffected siblings, one study54 found three cognitive clusters that were associated with young age, low educational status, low IQ, poor premorbid adjustment and severe positive schizotypy (Table 2d, Fig. 3). One study92 found that polygenic risk scores (PRS) for schizophrenia, cognition, educational attainment and attention deficit hyperactivity disorder (ADHD) correlated with cognitive clusters in patients and their unaffected siblings.

Negative symptoms and cognitive deficits

One study89 found three clusters of (out)patients with stable schizophrenia spectrum disorder by combining social cognition, assessed by the Mayer-Salovey-Caruso Emotional Intelligence Test, and negative symptoms, assessed by the PANSS, whereas another study74 found four clusters in patients by combining neurocognition, assessed by Continuous Performance Tests, and negative symptoms, assessed by the PANSS (Table 2e). Clusters were significantly correlated with being unmarried, poor self-esteem, low cognitive (attention, social) performance, stigma, severity of positive and negative symptoms, poor social functioning and quality of life, and previous hospitalization (Fig. 3).

Schizotypy

Three studies investigated schizotypy in unaffected first-degree relatives of patients with schizophrenia55 and healthy college students72,73 using the CPPS and found four clusters, whereas another study83 found two clusters based on hallucinatory experience assessed by the LSHS-R in the healthy general population (Table 2f). Schizotypy clusters were significantly associated with male gender, lack of pleasure experiences, difficulty of emotional expression, psychotic-like symptoms, severity of positive and negative schizotypy, depressive, schizoid and somatic symptoms, poor social and cognitive functioning, substance abuse and poor personality (Fig. 3).

To summarize, as we observed in longitudinal studies, cross-sectional studies that found the same number of clusters were conducted in a different group of samples and used various assessment instruments and methods of generating composite scores and clustering. The labeling, pattern, proportion, and type of clusters were remarkably different. Generally, three clusters were the most replicated number of clusters and characterized by low (severe deficits), mixed (intermediate deficits) and high (intact or normal performance) cognitive function. In addition, cognitive clustering, such as verbal fluency deficit, verbal memory and executive function deficit, face memory and processing deficits, or global cognitive deficits were revealed. Cross-sectional studies that found the same number of clusters were largely different in the characteristics of study population, pattern of identified clusters, symptom dimensions, methodology of assessment, applied data-driven methods and identified associated factors.

Overall, as shown in Table 3, the reviewed studies reported two to six clusters or trajectories and 58 factors that were linked with the identified clusters and/or trajectories across all study participants and symptom dimensions. The most common associated factors were old age, male gender, non-Caucasian ethnicity, low educational status, late age of illness onset, diagnosis of schizophrenia, severe general psychopathology and depressive symptoms, severe positive and negative symptoms, low cognitive performance, and poor premorbid functioning, quality of life and global functioning.

Table 3 Heatmap summary of clusters/trajectories and predictors across study participants, symptom dimensions and study design.

Discussion

To our knowledge, this is the first comprehensive systematic review based on recent cross-sectional and longitudinal data-driven studies in positive and negative symptoms, and cognitive deficits in patients with schizophrenia spectrum disorders, their relatives and healthy people. Our review has three major findings. First, longitudinal trajectory-based studies found two to five positive and negative symptoms trajectories in patients and four to six cognitive trajectories in patients, siblings, controls, or combined samples. Second, cross-sectional cluster-based studies identified three positive and negative symptoms clusters among patients and four positive and negative schizotypy clusters among healthy siblings. Additionally, three to five cognitive clusters were discovered in patients and their unaffected relatives. Third, numerous sociodemographic, clinical and genetic factors that determine trajectories and/or clusters were identified.

We showed that longitudinal and cross-sectional studies in patients, their siblings and the healthy general population have inconsistently identified two to five trajectories/clusters and various predictors across schizophrenia symptoms and cognitive deficits. Several shortcomings across studies may cause this inconsistency. Previous longitudinal studies did not uniformly research symptoms and cognitive deficits. For example, only three studies16,52,66 longitudinally investigated cognitive trajectories, but 22 cross-sectional studies investigated cognitive clusters. Notably, none of the reviewed longitudinal and cross-sectional studies validated their models using empirical methods or comparable statistical methods, even though they used different complex data-driven methods. Accumulating evidence shows that the number of classes in the optimal model derived from one method can be remarkably different from that derived from another method96. Given that these studies were conducted in patients who differed in stage at diagnosis, disease course, severity of illness and treatment status, the results cannot be expected to be consistent either. For example, studies that included only first-episode psychosis, chronic or stable patients may identify a smaller number of clusters than studies that included a mixture of patients with first-episode and chronic psychosis or patients with severe illness. Additionally, since the reported studies were conducted in more than 20 countries, the use of different treatment strategies and assessment methods in different countries could further confound the assessment of symptoms and clinical heterogeneity. In patients who are treated, the observed symptoms and cognitive characteristics are the product of both the features that were present before treatment and the response to treatment. Moreover, the different measurement tools may lead to discrepant results. For instance, the discrepancy in negative and positive symptom trajectories (or cross-sectional clusters) might partly be attributable to the use of a specific negative (e.g. SANS) and positive (e.g. SAPS) symptom scale or a more general symptom scale (e.g. PANSS) that includes items measuring cognitive or disorganization symptoms. Additionally, some studies administered up to 18 different neuropsychological tests to measure cognition, while others used as few as two or three cognitive assessment tests.

We further observed common methodological limitations across studies. Firstly, the reviewed studies included various groups of participants from different age groups and ethnicities. Secondly, while the comparison of patient clusters and trajectories with those of healthy siblings or controls could provide an accurate means of disentangling the heterogeneity of schizophrenia symptoms and its causes, only four studies (three of them cross-sectional) examined clusters in siblings. Likewise, most studies used healthy controls to standardize patients' neurocognitive composite scores, and a few other studies used controls to compare the distribution of patient clusters or trajectory groups. Thirdly, substantial differences between studies were also noted in the construction of composite scores, the use of model selection criteria and the method of parameter estimation. Fourthly, we observed several ways of subtyping and naming clusters or trajectories, which may make it difficult for clinicians to translate the evidence into diagnosis and treatment. This is due to the lack of a standard for designing a study (e.g. adequate sample size), reporting data analysis approaches and publishing results42.
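The control-referenced standardization mentioned above can be illustrated with a minimal Python sketch. It assumes that each test score is converted to a z-score using the mean and sample standard deviation of the healthy controls, and that the composite is the unweighted mean of those z-scores. The reviewed studies differed in exactly these choices, so the function names (`control_referenced_z`, `composite`), the test names and all numbers are hypothetical.

```python
from statistics import mean, stdev


def control_referenced_z(patient_scores, control_scores):
    """Express patient scores in units of the control group's standard deviation."""
    mu = mean(control_scores)
    sd = stdev(control_scores)  # sample SD: an assumption, other norms are possible
    return [(x - mu) / sd for x in patient_scores]


def composite(z_by_test):
    """Unweighted mean z-score per participant across tests (assumed rule)."""
    return [mean(zs) for zs in zip(*z_by_test.values())]


# Hypothetical raw scores: three controls and two patients on two tests.
controls = {"processing_speed": [48, 50, 52], "verbal_memory": [27, 30, 33]}
patients = {"processing_speed": [46, 50], "verbal_memory": [24, 27]}

z_scores = {
    test: control_referenced_z(patients[test], controls[test]) for test in patients
}
composites = composite(z_scores)
print(composites)
```

Because the composite depends on which tests enter the average and how they are weighted, differences in this step alone may contribute to discrepant cluster solutions between studies.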

Generally, we saw that studies conducted in patients at similar stages of illness (i.e., first-episode, stable or chronic stage, with or without treatment) that used similar assessment methods (i.e., SANS, SAPS or PANSS) showed some similarity in the identified trajectories and predictors, but these studies differed largely in duration of follow-up, frequency of assessment and methods used to assess symptoms or cognition. By the same token, studies that used similar data-driven statistical methods showed similarity in the number of identified trajectories/clusters but differed largely in study population, stage of illness, measurement tools, duration of follow-up, frequency of assessment and identified factors. Moreover, studies with less than two years of follow-up and those with more than two years showed a similar level of heterogeneity in symptoms and cognitive deficits and in identified predictors. In addition, a 10-year study with five assessments showed findings on positive symptoms similar to those of a 6-week study with weekly assessments. On the other hand, a 2-year study with five assessments identified only two trajectories. Despite these differences, all studies showed heterogeneity of symptoms and cognitive deficits at various levels, with “four trajectories” being the most replicated finding in longitudinal studies and “three clusters” the most replicated in cross-sectional studies. Besides, these studies consistently reported age, gender, ethnicity, educational status, age of illness onset, diagnosis, general psychopathology and depressive symptoms, positive and negative symptoms, cognitive performance, functioning and quality of life as determinants of trajectories and/or clusters.

In the era of team science and big data, data-driven statistical methods are becoming increasingly popular for the analysis of longitudinal repeated-measures data (i.e., latent growth mixture models (LGMMs)) and cross-sectional data (i.e., cluster analysis) (Fig. 4). In our review, we observed that LGMMs, such as growth mixture modelling (GMM), latent class growth analysis (LCGA), mixed mode latent class regression modelling and group-based trajectory modelling (GBTM), were the commonly used data-driven methods in longitudinal studies. LGMMs can identify realistic categories based on temporal patterns of change in an outcome by assuming the existence of latent classes or subgroups of subjects who are similar with regard to unobserved (latent) variables19,97. LGMMs have four advantages for modelling longitudinal data. First, they are flexible, data-driven methods that can accurately reveal actual heterogeneity. Second, they allow individual subjects to be classified into latent classes based on the largest probability of class membership. Third, they are sensitive to the pattern of change over time and robust in the presence of missing data. Fourth, subject-level predictors can be directly assessed for association with class membership and hence with different trajectory subtypes16,19,97. Cluster analysis, which is commonly used in cross-sectional studies, is also a data-driven approach that classifies people into homogeneous groups by determining clusters of participants that display less within-cluster variation relative to between-cluster variation81. Among the reviewed cross-sectional studies, K-means and Ward’s method were commonly used alone or in combination. K-means cluster analysis is a non-hierarchical form of cluster analysis that is appropriate when previous evidence or hypotheses exist regarding the number of clusters in a sample74. On the other hand, Ward’s method is a hierarchical cluster analysis that aims to determine group assignment without a prior hypothesis74. K-means cluster analysis is believed to handle larger data sets than Ward’s method73.
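The contrast between the two clustering approaches can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the procedure of any reviewed study: the K-means routine uses a deterministic farthest-point initialization (real analyses typically use repeated random starts), the Ward routine is a naive implementation suited only to small samples, and the two-domain z-scores are hypothetical.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm; the number of clusters k must be fixed in advance."""
    centers = [points[0]]  # farthest-point start: a simplifying assumption
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster empties
                centers[j] = centroid(members)
    return labels


def ward(points, k):
    """Agglomerative clustering with Ward's criterion, with the tree cut at k clusters."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pa = [points[i] for i in clusters[a]]
                pb = [points[i] for i in clusters[b]]
                # Increase in within-cluster sum of squares if a and b were merged.
                cost = len(pa) * len(pb) / (len(pa) + len(pb)) * sq_dist(centroid(pa), centroid(pb))
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = [0] * len(points)
    for j, members in enumerate(clusters):
        for i in members:
            labels[i] = j
    return labels


def as_partition(labels):
    """Label-independent view of a clustering, for comparing two solutions."""
    return {frozenset(i for i, lab in enumerate(labels) if lab == v) for v in set(labels)}


# Hypothetical z-scores (processing speed, memory) for nine participants.
scores = [(0.1, 0.2), (-0.1, 0.0), (0.2, -0.1),
          (-1.0, -0.9), (-1.1, -1.2), (-0.9, -1.0),
          (-2.1, -2.0), (-2.0, -2.2), (-1.9, -2.1)]

kmeans_labels = kmeans(scores, 3)
ward_labels = ward(scores, 3)
print(as_partition(kmeans_labels) == as_partition(ward_labels))
```

In this sketch the hierarchical routine builds the full merge sequence and is only cut at three clusters at the end, whereas K-means needs the number of clusters before it starts. The pairwise search in `ward` also grows quickly with sample size, which is one reason K-means is often preferred for larger data sets.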

Fig. 4. A hypothetical model for driving big multidimensional data towards a personalized selection of treatments in schizophrenia spectrum disorders. GBTM: Group-based trajectory modeling; LCGA: Latent class growth analysis; CBT: Cognitive behavioural therapy.

The results of statistical subtyping approaches, such as cluster or trajectory analysis, depend on mathematical assumptions, type of data, number of variables or tests, sample size and sampling characteristics. Therefore, the models can be unstable, and parameter estimates of clinical symptoms may not converge to a consistent set of subgroups and may lack a direct relationship to clinical reality59,87,98. For example, intermediate clusters and trajectories vary substantially between studies that used the same cluster or trajectory analysis method87. We advocate that study results from data-driven methods should be applicable, comparable, generalizable and interpretable in clinical practice. Therefore, we recommend validating models using at least one additional comparable statistical method, combining statistical methods of subtyping with empirical/clinical methods, or working together with clinicians to create a common understanding and a clinically relevant nomenclature for clusters or trajectories. Furthermore, it is relevant to replicate clusters or trajectory groups using independent samples, different assessment tools that measure the same construct, or different linkage methods38,99. Finally, further studies are required that focus on longitudinal study designs, unaffected siblings, genetic markers and more detailed measures of brain network function to improve our understanding of the biological mechanisms underlying the heterogeneity of schizophrenia.
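One way to quantify whether two subtyping methods recover comparable subgroups is an agreement index between their label assignments. The sketch below uses the adjusted Rand index, which is chosen here for illustration only and is not a method prescribed by the reviewed studies. The function name and the label vectors are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two clusterings of the same subjects."""
    if len(labels_a) != len(labels_b) or len(labels_a) < 2:
        raise ValueError("need two label lists of equal length (at least 2 subjects)")
    n = len(labels_a)
    # Number of subject pairs placed together by both methods, by A, and by B.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / comb(n, 2)  # agreement expected by chance
    maximum = (together_a + together_b) / 2
    if maximum == expected:  # degenerate case, e.g. both put everyone in one group
        return 1.0
    return (together_both - expected) / (maximum - expected)


# Hypothetical assignments of nine patients by two methods.
method_1 = [0, 0, 0, 1, 1, 1, 2, 2, 2]
method_2 = [2, 2, 2, 0, 0, 0, 1, 1, 1]  # same groups, different label names
method_3 = [0, 0, 1, 1, 1, 1, 2, 2, 2]  # one patient assigned differently

same_groups = adjusted_rand_index(method_1, method_2)
partial_agreement = adjusted_rand_index(method_1, method_3)
print(same_groups, partial_agreement)
```

The index ignores how clusters are named, so it compares group membership only. It does not address whether the subgroups are clinically meaningful, which still requires empirical or clinical validation.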

Future clinical advances may benefit from the subgrouping of patients to implement tailored therapy. In our review, we observed that several longitudinal studies were conducted based on drug response. One study found that individuals treated with aripiprazole had a delayed response56, whereas another study found that olanzapine-treated patients had a good response63. Another study revealed that individuals receiving standard treatment, compared with assertive treatment, showed a delayed negative symptom trajectory19. Furthermore, individuals with substantial cognitive deficits received high doses of antipsychotics18,82,87. Subtyping of symptoms and cognitive deficits can also help to uncover the biological basis of individual symptoms, rather than studying constellations of co-occurring symptoms1. The identified factors associated with clusters and/or trajectories could be used to develop a clinical risk prediction model for high-risk individuals with prodromal symptoms100,101.

Thus far, findings from this review show that data-driven approaches could play a substantial role in optimizing the efficacy of personalized care by predicting individual susceptibility to disease, providing accurate assessments of disease course, contributing to the best choice of early intervention, and selecting treatments (e.g., antipsychotics, cognitive behavioral therapy, social skill training, family therapy) targeting subgroups of patients with similar phenotypic or psychosocial characteristics (Fig. 4)102. When data-driven methods are implemented on samples/cohorts following different pharmacological and non-pharmacological interventions, we believe that our proposed model (Fig. 4) can identify individuals who are successfully treated, not treated or even harmed, and those who need further intervention and close follow-up, to protect them from unnecessary costs and side effects of medication(s). Therefore, findings from our review could assist in the implementation of personalized and preventive strategies for clinical practice, at least at the national or regional level.

Conclusions

Our review indicated significant heterogeneity in the results and conclusions of both cross-sectional and longitudinal studies in terms of the number of groups identified for positive and negative symptoms and cognition, as well as the factors (predictors) associated with group membership. This review also identified several methodological issues contributing to the discrepant results. Generally, the longitudinal studies identified trajectories characterized by progressive deterioration, relapsing, progressive amelioration and stability, whereas cross-sectional studies identified low, mixed (intermediate) and high psychotic symptom and cognitive clusters. Future studies may benefit more from data-driven methods if these are applied on the basis of pharmacological and non-pharmacological treatment responses. The use of empirical methods to distinguish more homogeneous subgroups of patients along heterogeneous symptom dimensions has gained traction in the last several years and is an essential step toward more precise prediction of disease risk and individualized selection of interventions.