Introduction

Psychotic symptomatology includes loss of contact with reality, thought disorder, delusions and hallucinations, unusual or bizarre behavior, impaired social interactions and difficulty carrying out daily activities1. While psychosis can be caused by recreational drug use, physical illness or brain trauma, it is often symptomatic of the onset of severe psychiatric disorders such as schizophrenia, schizoaffective disorder or bipolar disorder.

According to current guidelines, first-line treatments of psychosis involve the use of the minimum effective dose of second-generation antipsychotics whenever possible. Whatever criteria are used to assess response to treatment, responses are highly heterogeneous. While 25–30% of first-episode psychosis (FEP) patients fully respond, a majority respond partially or not at all, and are therefore switched to second-line treatments2. As early response to treatment is one of the main factors associated with improved long-term prognosis3,4,5, identifying predictors of treatment response in FEP patients is an important issue in the field6. Response to treatment could be assessed either using predefined cutoffs in the percentage of reduction of baseline scores on a psychopathology rating scale7, or by measuring the proportion of patients meeting remission criteria. According to the definition proposed by the Remission in Schizophrenia Working Group (RSWG), remission can be defined by an absolute threshold of severity of symptoms in three dimensions: reality distortion, disorganization and negative symptoms8. Using this consensus definition, it was found that global functioning in the year before admission, the total score of the Strauss Carpenter Prognostic Scale and the Positive and Negative Syndrome Scale (PANSS) negative sub-score at admission were all predictive of symptom remission in cohorts of schizophrenia inpatients9,10. Despite these latter studies, clinicians still lack reliable predictors of remission in FEP patients.

Several environmental risk factors for psychosis have been identified11 including autoimmune disorders12 and infection with Toxoplasma gondii13, cytomegalovirus (CMV)14 and herpes simplex virus (HSV) type 115. Meta-analyses have shown that drug-naive FEP patients exhibit altered serum levels of various cytokines compared to healthy individuals16,17,18. Since these data suggested a possible link between immune dysregulation and psychosis, it was proposed that serum levels of cytokines, chemokines and biomarkers of inflammation could predict early response to treatment6. However, antipsychotic treatments can alter cytokine levels, including those of interleukin (IL)-1β, interferon (IFN)-γ, IL-12 and tumor necrosis factor (TNF)-α19,20,21,22,23. Therefore, the predictive value of serum biomarkers should ideally be assessed in minimally treated or untreated FEP patients, which is challenging because such patients are difficult to enroll in clinical trials. This may be why only a few studies have investigated the association between baseline levels of peripheral biomarkers and remission in FEP patients. To identify biological predictors of remission in FEP patients, we analyzed clinical data and biological samples from the multinational, multicenter, randomized, double-blind “Optimization of Treatment and Management of Schizophrenia in Europe (OPTiMiSE)” study, in which FEP patients were clinically assessed before and after 4 weeks of treatment with the second-generation antipsychotic amisulpride24. Our results demonstrate that serum levels of immune-related proteins before treatment, combined with a few clinical variables, could predict remission in at least one subtype of FEP patients.

Materials and methods

Patients

The OPTiMiSE study was conducted in 27 general hospitals and clinics in 14 European countries, Israel and Australia (ClinicalTrials.gov identifier: NCT01248195). FEP patients, as defined in the EUFEST (European First Episode Schizophrenia Trial) study25, were recruited between May 2011 and April 2016 at the participating centers from nearby healthcare facilities. Eligible patients were aged 18–40 years and met criteria of the Diagnostic and Statistical Manual of Mental Disorders (4th edition) for schizophrenia, schizophreniform disorder or schizoaffective disorder. A total of 479 patients signed informed consent. Diagnoses were confirmed by the Mini International Neuropsychiatric Interview plus. Patients were excluded if more than 2 years had passed since the start of the FEP; if any antipsychotic drug had been used for more than 2 weeks in the previous year and/or for a total of 6 weeks in their lifetime; if patients had a known intolerance to one of the study drugs; if patients met any of the contraindications for any of the study drugs; if patients were coercively treated and/or represented by a legal guardian or under legal custody; or if patients were pregnant or breastfeeding. Patients were required to provide written informed consent.

Patient clinical assessment and primary outcome

A screening visit was conducted during which eligibility was assessed. Baseline data were obtained regarding demographics, diagnoses, current treatments and psychopathology: PANSS total score and sub-scores, overall severity of symptoms assessed using the Clinical Global Impression (CGI) scale26, depression assessed using the Calgary Depression Scale for Schizophrenia (CDSS)27 and social functioning assessed using the Personal and Social Performance Scale (PSP)28. Recreational drug use was also assessed. Data were collected at baseline and 4–5 weeks later.

All patients were treated for 4 weeks with up to 800 mg/day amisulpride in an open design. The primary outcome was symptomatic remission according to the criteria of Andreasen et al.8: a score of ≤3 (on a scale ranging from 1 to 7) simultaneously on 8 PANSS items: P1, P2, P3, N1, N4, N6, G5 and G9.
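
The remission rule above can be written as a short check. The following is a minimal sketch in Python, assuming each patient's PANSS ratings are available as a dictionary keyed by item code; the function names and the data layout are illustrative and are not taken from the study.

```python
# PANSS items used by the remission criteria, as listed in the text.
REMISSION_ITEMS = ("P1", "P2", "P3", "N1", "N4", "N6", "G5", "G9")
MAX_ITEM_SCORE = 3  # every item must be rated 3 or lower on the 1-7 scale


def is_in_remission(panss):
    """Return True if all eight remission items are scored <= 3.

    `panss` is a hypothetical dict mapping PANSS item codes to scores (1-7).
    A missing item raises KeyError instead of being silently ignored.
    """
    return all(panss[item] <= MAX_ITEM_SCORE for item in REMISSION_ITEMS)


def remission_rate(patients):
    """Proportion of patients (list of dicts) meeting the remission criteria."""
    return sum(is_in_remission(p) for p in patients) / len(patients)
```

Because the criterion requires all eight items to be at or below the threshold simultaneously, a single item rated 4 or higher is enough to classify a patient as a non-remitter.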

Blood samples

Peripheral blood samples were obtained from fasting subjects between 7:00 am and 9:00 am. Five milliliters of peripheral blood were drawn by venipuncture into serum Vacutainer tubes. For the serum collection, the blood was allowed to clot for 1 h before centrifugation (1500 × g, 10 min). The serum and plasma samples were stored in 0.5 ml aliquots at −80 °C. For measuring protein and antibody levels, serum samples were thawed on ice, and 50 µl aliquots were prepared and stored at −80 °C.

Immunoassay

Serum levels of IL-1α, IL-1β, IL-2, IL-4, IL-5, IL-6, IL-7, IL-8, IL-10, IL-12p40, IL-12p70, IL-13, IL-15, IL-16, IL-17, IL-18, IL-21, IL-23, IL-27, IFN-γ, chemokines (C-C motif chemokine ligand (CCL)-2, CCL3, CCL4, CCL11, CCL13, CCL17, CCL19, CCL20, CCL22, CCL26, CCL27, and C-X3-C motif chemokine ligand (CX3CL)-1, CXCL10, CXCL11, CXCL12), TNF-α, TNF-β, granulocyte-macrophage colony-stimulating factor, vascular endothelial growth factor (VEGF), C-reactive protein (CRP), serum amyloid A protein (SAA), soluble intercellular adhesion molecule 1 (sICAM-1) and soluble vascular adhesion molecule 1 (sVCAM-1) were measured using the Pro-inflammatory Panel 1, Cytokine Panel 1, Chemokine Panel 1, Th17 Panel 1 and Vascular Injury Panel 2 V-PLEX® kits (MSD). All assays were performed according to the manufacturer’s instructions. The data were acquired on the V-PLEX® Sector Imager 2400 plate reader and analyzed using the Discovery Workbench 3.0 software (MSD). The standard curves for each cytokine were generated using the premixed lyophilized standards provided in the kits. Serial twofold dilutions of the standards were run to generate a 13-standard concentration set, and the diluent alone was used as a blank. The cytokine concentrations were determined from the standard curve using a 4-parameter logistic curve fit to transform the mean light intensities into concentrations. The lower limit of detection (LLOD) was determined for each cytokine and for each plate as the signal recorded for the blank plus 2 standard deviations (SDs).
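
To make the back-calculation step concrete, the sketch below shows a generic 4-parameter logistic (4PL) curve, its inverse, and an LLOD signal computed as blank plus 2 SDs. It is an illustration only: the parameter names (a, b, c, d), the use of the sample SD of replicate blank wells, and the function names are assumptions, and the fitting itself (done by the vendor software in the study) is not reproduced.

```python
import statistics


def four_pl(conc, a, b, c, d):
    """4PL curve: signal as a function of concentration.

    a: signal at zero concentration, d: signal at saturating concentration,
    c: inflection point, b: slope factor (all assumed already fitted).
    """
    return d + (a - d) / (1.0 + (conc / c) ** b)


def inverse_four_pl(signal, a, b, c, d):
    """Back-calculate a concentration from a signal using fitted 4PL parameters."""
    low, high = min(a, d), max(a, d)
    if not low < signal < high:
        raise ValueError("signal lies outside the range of the fitted curve")
    return c * ((a - d) / (signal - d) - 1.0) ** (1.0 / b)


def llod_signal(blank_signals, n_sd=2.0):
    """Blank signal plus n_sd standard deviations (sample SD of blank replicates)."""
    return statistics.mean(blank_signals) + n_sd * statistics.stdev(blank_signals)
```

A sample whose signal falls below `llod_signal(...)` for its plate would be flagged as below the LLOD rather than assigned a back-calculated concentration.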

Serology

We measured plasma immunoglobulin G (IgG) antibodies reacting to HSV type 1, CMV and T. gondii using previously described immunoassay methods29. Diluted plasma was applied to antigens immobilized on the wells of microtiter plates and bound antibodies were quantified by means of reaction with enzyme-labeled anti-human IgG and the corresponding substrate. Reagents and assay kits for anti-HSV-1 were obtained from Focus Laboratories (USA). Anti-CMV and anti-Toxoplasma antibodies were obtained from IBL Laboratories (Germany). Results were obtained as quantitative values determined by comparison of the level of reactivity to standards run with each assay, as well as qualitative results listed as “positive” or “negative”.

Statistical analysis

In univariate analysis, Mann–Whitney–Wilcoxon tests were performed to assess statistical significance of non-Gaussian distributed data. To develop a predictive model for remission we used the elastic net, which is a regularized regression model, i.e., a generalized linear model with penalties that shrink extreme parameter values and thereby limit overfitting30. The elastic net also performs variable selection and addresses the multicollinearity that arises in our dataset because cytokines and chemokines are not independent of each other. To minimize variation across testing datasets, we repeated fivefold cross-validation 100 times with independent random dataset partitions to optimize stability31. We tuned the hyper-parameters α and λ 10 times for each partition via fivefold cross-validation with the optimal tuning parameter values chosen to maximize the area under the receiver operating characteristic (ROC) curve (AUC)32. Weighted odds ratios (ORs) were calculated using the proportion of drawings in which the variable was selected as a weight. All statistical analyses were performed using the R packages stats33, caret34, glmnet35, pROC36 and eNetXplorer37.
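
For readers unfamiliar with the roles of α and λ, the sketch below writes out the elastic net penalty in the parameterization used by glmnet, together with a generator of repeated fivefold partitions. It is written in Python for illustration only (the study used R packages); the function names are hypothetical, the partitions are not stratified by outcome, and the model fitting and tuning steps are not shown.

```python
import random


def elastic_net_penalty(coefs, alpha, lam):
    """glmnet-style penalty: lam * (alpha * L1 + (1 - alpha) / 2 * squared L2).

    alpha = 1 gives the lasso, alpha = 0 gives ridge regression.
    `coefs` is assumed to exclude the (unpenalized) intercept.
    """
    l1 = sum(abs(b) for b in coefs)
    l2_squared = sum(b * b for b in coefs)
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2_squared)


def repeated_kfold(n_samples, k=5, repeats=100, seed=0):
    """Yield (repeat, fold, test_indices, train_indices) for independent random partitions."""
    rng = random.Random(seed)
    for repeat in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)                      # a fresh random partition per repeat
        folds = [order[i::k] for i in range(k)]
        for fold, test in enumerate(folds):
            train = [i for g, other in enumerate(folds) if g != fold for i in other]
            yield repeat, fold, test, train
```

In a full analysis, each training set would be used to fit the penalized logistic model over a grid of α and λ values, and the held-out fold would provide the AUC used to choose among them.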

Unsupervised statistical classification

To stratify m patients into k clusters based on their PANSS scores (d items per patient), we prepared a matrix X with one patient per row and one PANSS item per column (Supplementary Figure 1). Our objective was to find a matrix Y of cluster labels. We therefore solved an optimization problem to find a space that discriminated clusters based on a limited number of weighted PANSS items. The output was a weight matrix W with d rows and k columns giving the weight of each PANSS item. We achieved this goal using an alternating minimization procedure on Y and W in which we tried to minimize the Frobenius norm38.

Results

Soluble serum biomarkers did not predict remission in non-stratified FEP patients

A total of 479 patients were included in the OPTiMiSE clinical trial. Out of the 446 patients in the intention-to-treat sample, 371 completed amisulpride treatment. Among those, 325 had serum samples collected before the study treatment was initiated and were included in the present study (Table 1). Clinical assessment 3–4 weeks after treatment initiation revealed that 68.6% of the patients were in symptomatic remission39 according to the consensus definition8. As a first attempt to identify biomarkers that could predict remission, we analyzed serum samples for 43 interleukins, chemokines and biomarkers of inflammation. Among these proteins, 8 were below the LLOD in more than 10% of the samples and were not included in downstream analyses (Supplementary Table 1). In an exploratory analysis, we compared the levels of the 35 remaining proteins in remitters and non-remitters using univariate analysis. After correction for multiple testing, none of these 35 proteins was present at different levels in remitters and non-remitters (Supplementary Table 2).
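
The exclusion rule for poorly detected proteins can be sketched as follows. This is an illustrative Python fragment, assuming one list of measured concentrations per analyte and, for simplicity, a single LLOD per analyte (the study determined the LLOD per plate); the function and variable names are hypothetical.

```python
def fraction_below_llod(values, llod):
    """Fraction of samples whose measured value is below the LLOD."""
    return sum(v < llod for v in values) / len(values)


def select_analytes(measurements, llods, max_fraction=0.10):
    """Keep analytes that are below the LLOD in at most `max_fraction` of samples.

    `measurements` maps analyte name -> list of values;
    `llods` maps analyte name -> LLOD (simplifying assumption: one per analyte).
    """
    return [
        name
        for name, values in measurements.items()
        if fraction_below_llod(values, llods[name]) <= max_fraction
    ]
```

An analyte is dropped only when more than 10% of its samples fall below the LLOD, which matches the wording of the rule above.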

Table 1 Patient clinical characteristics

In contrast to univariate methods that assess the differential expression of proteins on a single feature level, multivariate classification methods such as regularized logistic regression allow a prediction model to be established from samples with known class outcomes, e.g., remission versus non-remission40. A set of clinical and biological variables with the best joint discriminatory ability to differentiate between classes can be identified, and the resulting prediction model can then be used to predict the class outcomes of new patient samples. As a second attempt to predict remission, we investigated the association between serum protein levels and remission using regularized logistic regression after adjustment for age, gender, body mass index (BMI), waist circumference, use of recreational drugs and seropositivity to T. gondii, CMV and HSV-1. Applying this method to the dataset did not identify any protein whose serum level was associated with increased odds of being a non-remitter (not shown).

One obvious explanation for this negative result could be that none of the studied serum proteins is relevant for discriminating remitters and non-remitters among FEP patients. Alternatively, the heterogeneity of psychotic disorders in terms of symptomatology and likely etiology and pathophysiology may impede the identification of underlying remission predictors in a general population of FEP patients. To overcome this issue, we sought to stratify FEP patients based on their individual symptomatology assessed using the PANSS instrument41.

Patient clustering

We sought to stratify patients into clusters such that patients within a cluster would be more similar to one another (cohesion) than to patients in other clusters (separation). We applied a two-step hierarchical unsupervised clustering method to a dataset consisting of the 30 individual PANSS scores of the 325 patients in the OPTiMiSE study sample, resulting in four clusters: C1A and C1B, and C2A and C2B. We compared two methods for data clustering: principal component analysis (PCA)-K-means42, which is a popular method for cluster analysis, and K-sparse*, which is a modified version of K-sparse38. While both methods were successful at stratifying the 325 patients of the OPTiMiSE study sample, K-sparse* outperformed PCA-K-means as demonstrated by both a higher mean silhouette value (0.76 compared to 0.43) and t-distributed stochastic neighbor embedding (t-SNE) graphical representations (Fig. 1a). We therefore selected K-sparse* for data clustering. First-level classification using K-sparse* identified two subtypes: C1 (n = 159) and C2 (n = 166). K-sparse* selected nine items that discriminated C1 and C2 patients, among which five belonged to the negative PANSS sub-scale (NPANSS) and four to the general psychopathology PANSS sub-scale (GPANSS) (Supplementary Table 3). Second-level classification identified four subtypes: C1A (n = 97) and C1B (n = 62) on one hand, and C2A (n = 95) and C2B (n = 71) on the other. K-sparse* selected eight PANSS items for discriminating C1A from C1B patients, among which four belonged to the positive PANSS (PPANSS) sub-scale and four to the GPANSS sub-scale (Supplementary Table 3). K-sparse* selected seven PANSS items for discriminating C2A from C2B patients, among which three belonged to the PPANSS sub-scale and four to the GPANSS sub-scale (Supplementary Table 3).
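
The silhouette value used above to compare clustering methods can be computed as in the following sketch. It is a plain-Python illustration that assumes Euclidean distance and at least two clusters; the distance actually used in the study, and the space in which it was computed, are not stated here, so both are assumptions.

```python
import math


def silhouette_values(points, labels):
    """Return one silhouette value per point (Euclidean distance, >= 2 clusters)."""
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)

    values = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:                      # singleton cluster: silhouette set to 0
            values.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


def mean_silhouette(points, labels):
    """Mean silhouette value (silhouette coefficient) of a clustering."""
    values = silhouette_values(points, labels)
    return sum(values) / len(values)
```

Values close to +1 indicate points that sit well inside their own cluster, while negative values indicate points that are on average closer to another cluster.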

In agreement with the nature and the weight of the PANSS items selected by K-sparse*, C1A and C1B patients exhibited more severe negative and general psychopathology symptoms compared to C2A and C2B patients, respectively (Table 1). C1A and C2A patients exhibited more prominent positive and general psychopathology symptoms compared to C1B and C2B patients, respectively (Table 1). Compared to other patients from the study sample, those from the C1A subtype exhibited more severe symptoms in the positive, negative and general psychopathology dimensions (Table 2, Fig. 1c). C1A patients also exhibited higher clinical scores as measured by the CGI and the CDSS and showed the worst psychosocial performance/functioning as measured by the PSP scale (Table 2). In contrast, C2B patients exhibited less severe symptoms in the positive, negative and general psychopathology dimensions, exhibited lower CGI and CDSS scores and showed the best psychosocial performance/functioning as measured by the PSP scale (Table 2, Fig. 1c).

Fig. 1: Clinical characteristics of patient subtypes.

a First-level stratification. Silhouettes and t-distributed stochastic neighbor embedding (t-SNE) representations of patient clustering using principal component analysis (PCA)-K-means (left panels) and K-sparse* (right panels). Silhouette values can range from −1 to +1, with higher values reflecting higher similarity within a cluster. Mean silhouette values (coefficient) are indicated. b Second-level stratification. Silhouettes of C1 (left) and C2 (right) patient clustering using K-sparse*. Mean silhouette values (coefficient) are indicated. c Three-dimensional (3D) scatter plot representation of positive PANSS (PPANSS), negative PANSS (NPANSS) and general psychopathology PANSS (GPANSS) sub-scores of C1A, C1B, C2A and C2B patients. PANSS Positive and Negative Syndrome Scale

Table 2 Clinical score comparisons between individual patient subtypes

In summary, applying a two-step hierarchical unsupervised classification method to FEP patients identified four patient subtypes characterized by different symptom profiles. C1A patients exhibited the most severe symptoms in all dimensions, and 57.7% of them were remitters after 4 weeks of treatment with amisulpride, compared to 68.6% in the study sample as a whole (Table 1). In contrast, C2B patients exhibited less severe symptoms, and 90.4% of them were remitters after 4 weeks of treatment (Table 1).

Validation of the clustering solution

The complexity of deriving clustering solutions makes validation crucial not only to ensure reproducibility but also to confirm that the derived clusters index clinically meaningful variations43,44. We first sought to validate our clustering solution using cross-validation, i.e., by first splitting the data into a training and a test sample, and then assigning each patient of the test sample to one of the clusters derived from the training sample. Results from 50 independent random drawings showed that our clustering solution was robust, with 86.8% to 95.5% of the patients in the test sample (depending on the cluster) being correctly classified (Supplementary Table 4).

As an alternative and complementary approach, we sought to validate our clustering solution on external biological measures, i.e., to investigate whether reducing clinical heterogeneity also reduces biological heterogeneity45,46,47. To this end, we searched for serum biomarkers that were present at different levels between clusters. Univariate analysis did not identify serum proteins that distinguished C1B or C2A patients from the others. In contrast, C1A patients exhibited significantly higher levels of IL-7, IL-15, IL-17, IFN-γ, TNF-α, sICAM-1 and sVCAM-1 after correction for multiple testing (Table 3). The probability that seven biomarkers or more would have been expressed at significantly higher levels in 97 randomly selected patients (to match the number of C1A patients) compared to the others was 5.47 × 10⁻⁶ as estimated by 10,000 successive random drawings (Supplementary Table 5). We also found that C2B patients exhibited lower levels of CXCL12 and higher levels of IL-8. Effect sizes were small to medium (0.5 > Cohen’s d coefficient > 0.2) for IFN-γ, IL-7, IL-17, TNF-α, sICAM-1, CXCL12 and sVCAM-1, and medium to high (1.0 > Cohen’s d coefficient > 0.5) for both IL-15 and IL-8. Because the K-sparse* clustering approach that we used to define the C1A, C1B, C2A and C2B subtypes was based on clinical features only, the fact that several peripheral biomarkers distinguished at least two patient subtypes from the others validated our clustering approach on external biological measures48.
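
As a reminder of how the effect sizes quoted above are defined, the sketch below computes Cohen's d with a pooled standard deviation and maps it onto the two bands used in the text. This is an assumption about the variant of d that was used (the text does not say whether values were transformed or which SD was pooled), and the function names are illustrative.

```python
import statistics


def cohens_d(group_a, group_b):
    """Cohen's d: difference in means divided by the pooled sample SD."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled_sd = (((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)) ** 0.5
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


def effect_size_band(d):
    """Map |d| onto the bands quoted in the text."""
    magnitude = abs(d)
    if 0.2 < magnitude < 0.5:
        return "small to medium"
    if 0.5 < magnitude < 1.0:
        return "medium to high"
    return "outside the bands quoted in the text"
```

For example, comparing a biomarker between C1A patients and all other patients would call `cohens_d(c1a_values, other_values)`, where both arguments are hypothetical lists of serum levels.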

Table 3 Differences in serum biomarker levels between patient subtypes

Predicting remission in individual patient subtypes

In an attempt to identify serum biomarkers associated with remission in individual patient subtypes, we applied regularized logistic regression to clinical and biological data from C1A, C1B, C2A and C2B patients. None of the analyzed variables was associated with remission in C1B, C2A and C2B patients (not shown). In striking contrast, lower serum levels of IL-15, higher serum levels of CXCL12, seropositivity to CMV, use of recreational drugs and being younger were all associated with increased odds of being non-remitters in C1A patients (Table 4, model 1). Among these five variables, IL-15 was selected in 99.6% of the training/test runs and had a p value < 0.001. To estimate the predictive value of these five combined variables, we applied a regularized logistic regression to these five variables only (Table 4, model 2). All variables were selected more than 95% of the time and exhibited p values < 0.1. The predictive value of this model, assessed by the area under the ROC curve, was 73 ± 0.10%, and its specificity and sensitivity were 45 ± 0.09% and 83 ± 0.03%, respectively.
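
The ROC-based performance measures quoted above can be computed from predicted scores as in the following sketch. It is a generic Python illustration, assuming that non-remission is coded as the positive class (1) and that a fixed decision threshold is applied to obtain class predictions; neither choice is specified in the text, and the function names are hypothetical.

```python
def auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as one half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def sensitivity_specificity(predicted, labels):
    """Return (sensitivity, specificity) for binary predictions and labels."""
    tp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 1)
    fn = sum(1 for p, y in zip(predicted, labels) if p == 0 and y == 1)
    tn = sum(1 for p, y in zip(predicted, labels) if p == 0 and y == 0)
    fp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 0)
    return tp / (tp + fn), tn / (tn + fp)
```

Sensitivity is the fraction of true positives that are detected and specificity the fraction of true negatives that are correctly rejected; both depend on the threshold, whereas the AUC summarizes performance across all thresholds.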

Table 4 Clinical and biological variables associated with non-remission in C1A patients

Discussion

Heterogeneity of patients with mental disorders may impede identification of adequate predictors of remission43. In keeping with this hypothesis, we failed to identify serum biomarkers associated with remission in non-stratified FEP patients. To overcome this problem, we used a hierarchical clustering approach to identify subtypes of patients based on their clinical symptoms. Several unsupervised clustering methods have been used to stratify patients with mental disorders based on clinical symptoms and case history variables49,50,51,52,53 or social cognitive measures54. Given a set of data points, clustering methods aim to partition data into a specified number (k) of clusters, such that the samples in each cluster are more similar to one another than to those in the other clusters. This entails defining a measure of similarity or distance between data points. As recently pointed out43, the outcome of clustering is highly dependent on the input data, with relatively little convergence towards a coherent and consistent set of subtypes. Unfortunately, the biological relevance of the few subtypes identified so far was generally limited and did not clearly reflect underlying biological mechanisms. Here, we used a two-step hierarchical unsupervised clustering method to stratify FEP patients into four subtypes, termed C1A, C1B, C2A and C2B, based on their clinical symptoms. C1A patients were characterized by the most severe symptoms in the positive, negative and general psychopathology dimensions. In contrast, C2B patients were the least severely affected. Most importantly, C1A and C2B patients not only differed from other patients in terms of symptom severity but also exhibited specific peripheral immune signatures, suggesting that these subtypes reflect distinct pathophysiological entities45,55. Our study therefore provides the proof of concept that clustering methods aimed at reducing clinical heterogeneity may also reduce biological heterogeneity.

Several authors in the field have tried to stratify psychosis spectrum patients on the basis of symptoms. In a pioneering study, Dollfus et al.51 used Ward’s method of hierarchical clustering to identify four subtypes of schizophrenia patients that they called “positive”, “negative”, “mixed” and “disorganized”. These four subtypes closely resemble the four subsets that we describe here, with our C1B, C2A, C1A and C2B subsets corresponding to Dollfus’ “positive”, “negative”, “mixed” and “disorganized” subtypes, respectively.

In contrast to the current “one size fits all” or “trial and error” approach in healthcare, stratified medicine aims at sorting a population into biologically relevant subtypes. We found that the vast majority (90.4%) of C2B patients were remitters after treatment. This agrees with previous studies which have shown that patients with less severe negative symptoms are more likely to be remitters than others9. In contrast, the proportion of remitters among C1A, C1B and C2A patients ranged between 54.8% and 61.3%. Thus, the clustering solution that we describe here constitutes a first step towards stratified medicine for psychotic patients.

In support of a critical role of inflammation in psychiatric diseases, add-on treatments with anti-inflammatory drugs have been tested in severe and treatment-resistant psychiatric patients56,57,58,59,60,61,62,63,64. For example, acetylsalicylic acid (aspirin), which interrupts the immuno-inflammatory cascade by inhibiting cyclooxygenase (COX)-1 and COX-2, showed promising results as an add-on treatment of schizophrenia in comparison to treatment as usual60,63. In most cases however, add-on anti-inflammatory treatments in psychotic patients only provided modest improvements in clinical outcome. This could be explained if only a subtype of the treated patients exhibited a pro-inflammatory profile at baseline. In agreement with this latter hypothesis, an add-on trial in patients with psychotic disorders showed that those with increased CRP levels had the largest response to add-on aspirin as compared to those with lower levels65. Compared to others, C1A patients exhibited higher levels of IL-7, IL-15, IL-17, IFN-γ, TNF-α, sICAM-1 and sVCAM-1. Therefore, a reasonable and testable hypothesis is that C1A patients are those who could benefit the most from add-on anti-inflammatory treatment. In addition, several authors have proposed that inflammation is associated with poor clinical outcome in psychosis66,67,68,69. In agreement with these studies, C1A patients were characterized by both higher levels of several inflammatory biomarkers and a lower proportion of remitters (57.7% compared to 68.6% in the study sample as a whole, and 90.4% in the C2B subtype).

Compared to stratified medicine, personalized medicine builds on a finer sub-classification of patients to enable individual tailoring of treatment to maximize response. Bearing this in mind, we used regularized logistic regression to select variables that could predict remission in individual patient subtypes. In C1A patients but not in others, lower levels of IL-15, higher levels of CXCL12, recreational drug use, being seropositive to CMV and being younger were all associated with increased odds of being non-remitters after adjustment for covariates suspected to impact cytokine levels or response to treatment. While IL-15 is mainly known for its role in regulating natural killer and T cells70,71, it is also produced by astrocytes and neural progenitors72,73, regulates neurogenesis and exerts anti-depressive effects in mice74,75. Likewise, while CXCL12 was first described as a chemotactic factor for lymphocytes and macrophages, it is also secreted by glial cells and neurons and plays a role in brain plasticity and function76. One of the two CXCL12 receptors, CXCR4, acts at both the presynaptic and postsynaptic levels by promoting the release of glutamate and γ-aminobutyric acid (GABA) and by activating the voltage-gated K⁺ channel Kv2.1, respectively. Whether and how IL-15 and CXCL12 impact response to antipsychotics remains to be elucidated.

In addition to lower levels of IL-15 and higher levels of CXCL12, being seropositive to CMV and the use of recreational drugs were both associated with an increased risk of being non-remitters in C1A patients. Previous studies have identified CMV infection77 and use of recreational drugs78 as risk factors for schizophrenia. Why these two variables are also associated with an increased risk of being non-remitters in C1A patients is unclear.

Our results, if replicated, could pave the way for the development of a blood-based clinical decision support system for selecting the most appropriate treatment in psychotic patients.