Introduction

Allogeneic hematopoietic stem cell transplantation (allo-HSCT) is a curative treatment for adult patients with high-risk acute leukemia or severe hematopoietic failure syndromes. Overall survival is about 40% for leukemia patients, depending on primary disease, stage, conditioning regimens1, 2 and risk group (range: 25% for high-risk leukemia to 62% for good-risk leukemia),3 and about 90% for hematopoietic failure syndrome patients.4, 5, 6 However, allo-HSCT is associated with major complications, such as severe acute graft-versus-host disease (aGvHD) and infections.7, 8, 9 Differential diagnosis of aGvHD from treatment-related toxicities can be difficult and is mainly made according to clinical symptoms and biopsies. Thus, a method is urgently needed to diagnose early onset of aGvHD and to identify patients at risk of developing severe GvHD in an observer-independent, unbiased fashion. Depending on the type of transplantation, patient age, the immunosuppressive prophylaxis and the underlying disorders, 35–85% of transplanted patients develop aGvHD.7, 10, 11 First-line therapy of aGvHD consists of steroids, resulting in a response rate of about 70% for patients with aGvHD grade I or II without a significant increase in mortality.10 In contrast, patients developing aGvHD grades III or IV have a mortality risk of about 80–90% due to aGvHD-specific organ dysfunction or concomitant infections.12

Recently, proteome analysis of body fluids using capillary electrophoresis (CE) coupled on-line to mass spectrometry (MS) to define differentially excreted peptides has been shown to be a powerful new diagnostic tool in a variety of diseases and is broadly applicable.13, 14, 15, 16, 17 Since 2003, CE-MS has been applied to identify biomarkers for early detection of aGvHD in patients undergoing allo-HSCT.18, 19, 20 We employed these biomarkers to generate an aGvHD-specific classifier, aGvHD_MS17, that allowed distinction of patients with severe aGvHD (grades III and IV) from those who never developed aGvHD, patients with low or moderate aGvHD (grades I and II) and patients with chronic GvHD (cGvHD) after allo-HSCT.

In the present study, we prospectively evaluated the predictive value of aGvHD_MS17 in 423 patients who were enrolled in one of five participating transplant centers and who were transplanted between 2005 and 2010. Results obtained from aGvHD_MS17 analysis were superior to results for other biomarkers previously described for prediction or diagnosis of aGvHD, such as loss of serum albumin,21 C-reactive protein22 and plasma biomarkers.23 This report represents the largest study using proteomics in patient assessment. Our results demonstrate the predictive value, clinical usefulness and applicability of this novel diagnostic tool in post-HSCT surveillance.

Patients and methods

Patients

Prospectively collected midstream urine samples from 429 patients undergoing allo-HSCT between 2005 and 2010 were obtained after informed consent (ethics protocol number 3790). Six patients died before engraftment and were excluded from further analysis. A summary of all clinical data is shown in Tables 1a–c. Of 423 recipients, 242 were male; 80 of those were transplanted from female donors, and for 16 no information on donor gender was available. Immunosuppressive antibodies were administered to 308 (72%) patients. For 17 patients, no information regarding antibody treatment was available. Diagnosis of aGvHD was based on clinical criteria24 and on histopathology of biopsies, if available (Table 1b and c). Diagnosis of cGvHD followed criteria established in the cGvHD diagnosis and treatment consensus conferences 2007 and 2009 (ref. 25) and adapted to European needs.26 Incidence and severity of acute GvHD and information on biopsies are summarized in Tables 1b and c. Twenty-five patients died before day +100; in six of them, aGvHD was the cause of death. All patients were examined daily during hospitalization and weekly thereafter for the first 130 days post allo-HSCT. Clinical aGvHD was assessed according to the aGvHD score from grade 0 (no sign of GvHD) to IV.24

Table 1a Clinical characteristics of all patients
Table 1b Incidence and severity of acute GvHD after allogeneic HSCT and biopsy and proteomic pattern information
Table 1c Acute GvHD manifestation, proteomic profiling and biopsy information

Urine sample collection and preparation

A volume of 10 ml of second morning midstream urine was obtained from the participants and immediately frozen at −20 °C. Samples were collected before HSCT, and on days 0 to 35 (+/−3 days) on a weekly basis and bimonthly thereafter. Sample preparation was done as previously described.19 A median of three samples (range 1–10) was analyzed per patient.

CE-MS analysis and data processing

CE-MS analysis was performed as previously described15, 16, 19, 20 using a P/ACE MDQ (Beckman Coulter, Fullerton, CA, USA) coupled on-line to a Micro-TOF MS (Bruker Daltonic, Bremen, Germany). Mass spectral ion peaks representing identical peptides at different charge states were deconvoluted into molecular mass using MosaVisu software.14 Migration times and ion signal intensities (amplitude) were normalized using internal polypeptide standards.27 The resulting peak list characterizes each polypeptide by its molecular mass (kDa), normalized migration time (min) and normalized signal intensity. Polypeptides within different samples were considered identical if the mass deviation was <50 p.p.m., and the CE migration time deviation was <2 min.19
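The matching rule in the last sentence can be written down as a short check. The following is only a minimal sketch, not the MosaVisu implementation: the `Peak` record layout, the use of Da rather than kDa, and the choice of the first peak's mass as the reference for the p.p.m. calculation are all assumptions made for illustration. Only the two tolerance values are taken from the text.

```python
from dataclasses import dataclass

MASS_TOL_PPM = 50.0      # mass deviation threshold stated in the text
MIGRATION_TOL_MIN = 2.0  # CE migration-time threshold stated in the text


@dataclass(frozen=True)
class Peak:
    """One deconvoluted, normalized peak (hypothetical record layout)."""
    mass_da: float        # molecular mass in Da (assumed unit for this sketch)
    migration_min: float  # normalized CE migration time in minutes
    intensity: float      # normalized signal intensity


def ppm_deviation(mass_a: float, mass_b: float) -> float:
    """Relative mass difference in parts per million, referenced to mass_a."""
    return abs(mass_a - mass_b) / mass_a * 1e6


def same_polypeptide(a: Peak, b: Peak) -> bool:
    """Two peaks match only if both tolerance criteria hold."""
    within_mass = ppm_deviation(a.mass_da, b.mass_da) < MASS_TOL_PPM
    within_time = abs(a.migration_min - b.migration_min) < MIGRATION_TOL_MIN
    return within_mass and within_time
```

For a 2000 Da peptide, a 50 p.p.m. window corresponds to 0.1 Da, so two peaks at 2000.00 and 2000.05 Da with migration times one minute apart would be treated as the same polypeptide under these assumptions.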

Adaptation of the aGvHD-specific proteomic pattern and support vector machine-based cluster analysis

The training set for the aGvHD-specific pattern was published previously19 and expanded here. Thirty-three samples were collected from patients with biopsy-proven aGvHD grade II or higher at the time of diagnosis (range: day +4 to +79). Controls consisted of 76 time-matched samples of patients without aGvHD and without infections or relapse at the time of sampling (Supplementary Table S1). All identified discriminatory polypeptides were combined into a support vector machine (SVM) classification model using the MosaCluster software.17 The SVM classifier generates a dimensionless membership probability value on the basis of a patient’s peptide marker profile, termed the classification factor (CF).19, 20
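To make the notion of a CF concrete, the sketch below computes a signed distance of a sample to a separating hyperplane and compares it with the cut-off of 0.1 reported in this study. It assumes a linear hyperplane with given weights, which may differ from the kernel and scaling actually used by MosaCluster; the weight vector, offset and the convention that a CF at or above the cut-off counts as positive are hypothetical.

```python
import math
from typing import Sequence

CF_CUTOFF = 0.1  # cut-off reported in this study for aGvHD_MS17


def classification_factor(x: Sequence[float], w: Sequence[float], b: float) -> float:
    """Signed Euclidean distance of sample x to the hyperplane w.x + b = 0.

    Assumes a linear SVM; w and b are hypothetical, already-trained parameters.
    """
    if len(x) != len(w):
        raise ValueError("feature and weight vectors must have equal length")
    norm = math.sqrt(sum(wi * wi for wi in w))
    if norm == 0.0:
        raise ValueError("weight vector must be non-zero")
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def is_pattern_positive(cf: float, cutoff: float = CF_CUTOFF) -> bool:
    """Assumed convention: a CF at or above the cut-off is scored positive."""
    return cf >= cutoff
```

In the real classifier each sample would be a vector of 17 normalized peptide intensities; the sign of the CF indicates the side of the hyperplane and its magnitude the distance from it.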

Statistical methods

Sensitivity and specificity were estimated by tabulating the number of correctly classified samples in receiver operating characteristic analysis; group-specific CF distributions are presented as Box-and-Whisker plots. Only samples collected until clinical diagnosis of aGvHD were included in this evaluation. Confidence intervals (95%) were based on exact binomial calculations using MedCalc (MedCalc version 8.1.1.0 software, Mariakerke, Belgium).
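For readers who wish to reproduce this type of estimate, the sketch below computes sensitivity and specificity from classification counts together with an exact binomial (Clopper–Pearson) 95% confidence interval. It is an illustration using only the Python standard library, not the MedCalc routine; the function names are hypothetical, and whether MedCalc uses exactly this interval definition is an assumption.

```python
from math import comb
from typing import Callable, Tuple


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), summed term by term (modest n only)."""
    return sum(comb(n, i) * p**i * (1.0 - p)**(n - i) for i in range(k + 1))


def _solve_increasing(func: Callable[[float], float], target: float) -> float:
    """Bisection on [0, 1] for a function that increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_binomial_ci(successes: int, trials: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Clopper-Pearson interval for a proportion successes/trials."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials and trials > 0")
    # Lower bound p solves P(X >= successes | p) = alpha/2.
    lower = 0.0 if successes == 0 else _solve_increasing(
        lambda p: 1.0 - binom_cdf(successes - 1, trials, p), alpha / 2.0)
    # Upper bound p solves P(X <= successes | p) = alpha/2.
    upper = 1.0 if successes == trials else _solve_increasing(
        lambda p: 1.0 - binom_cdf(successes, trials, p), 1.0 - alpha / 2.0)
    return lower, upper


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float]:
    """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP)."""
    return tp / (tp + fn), tn / (tn + fp)
```

The interval for sensitivity is obtained by calling `exact_binomial_ci(tp, tp + fn)`, and the interval for specificity by calling `exact_binomial_ci(tn, tn + fp)`.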

Binomial logistic regression analysis was performed to determine the relationship between proteomic classification with the aGvHD_MS17 model, demographic and clinical data (Table 2).
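As a reminder of what such a model computes, the following sketch evaluates a binomial logistic model for one patient. The covariate names and coefficient values in the usage note are placeholders chosen for illustration and are not the fitted values of Table 2.

```python
import math
from typing import Mapping


def predicted_probability(intercept: float,
                          coefficients: Mapping[str, float],
                          covariates: Mapping[str, float]) -> float:
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum_i b_i * x_i)))."""
    missing = set(coefficients) - set(covariates)
    if missing:
        raise KeyError(f"missing covariates: {sorted(missing)}")
    linear = intercept + sum(b * covariates[name] for name, b in coefficients.items())
    # Numerically stable evaluation of the logistic function.
    if linear >= 0:
        return 1.0 / (1.0 + math.exp(-linear))
    e = math.exp(linear)
    return e / (1.0 + e)


def odds_ratio(coefficient: float) -> float:
    """Odds ratio associated with a one-unit increase of a covariate."""
    return math.exp(coefficient)
```

With placeholder values such as `intercept=-2.0` and `{"ms17_positive": 2.0}`, a pattern-positive patient (`ms17_positive=1`) would receive a predicted probability of 0.5; the fitted model combines the CF-based result with the demographic and clinical variables in the same additive way.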

Table 2 Multiparameter logistic regression analysis of demographic and clinical variables for the prediction of aGvHD grade III or IV development

Peptide sequencing

Urine samples were analyzed on a Dionex Ultimate 3000 RSLC nano flow system (Dionex, Camberley, UK) as described previously.19 All polypeptides forming aGvHD_MS17 are shown with their CE-MS characteristics (Table 3) and sequences. More detailed information and additional data can be found in the Supplementary Material provided at the journal’s website.

Table 3 Characteristics of urine peptides forming the aGvHD_MS17 pattern

Results

Patient characteristics

In this prospective validation study, 423 patients from five transplant centers were evaluated with the aGvHD-specific aGvHD_MS17 peptide marker pattern. A summary of relevant clinical data is shown in Table 1a and described in Methods. Table 1b lists the incidence and severity of aGvHD and gives information on biopsies obtained within our cohort. Acute GvHD developed in 215 patients (50%). Grade I was diagnosed in 21.5% (n=89), whereas 17.5% (n=74) had aGvHD grade II. Twelve percent (n=52) of the patients developed aGvHD III (n=29) or IV (n=23) despite GvHD prophylaxis and additional immunosuppressive antibodies (antithymocyte globulin) (Table 1b). Biopsy results and proteome analysis at the same time point were available from 80 patients. aGvHD was histologically confirmed in 70 patients. Of those, 32 had aGvHD grade I or II and 38 had aGvHD grade III or IV. Only the latter were included in the in-depth analysis. Diagnosis based on biopsy and proteomic profiling is compared in Table 1b. Table 1c summarizes the data of biopsies and aGvHD_MS17 diagnostics.

Proteomic patterns (aGvHD_MS17) for aGvHD assessment

The aGvHD_MS17 proteomic classifier was designed to predict patients at risk for development of severe aGvHD. Quantitative differences in the excretion of the pattern-forming peptides were observed upon comparison of patients without aGvHD, patients with aGvHD grade I and those with biopsy-proven aGvHD grade II or more sampled at clinical diagnosis of aGvHD (Table 2). The differences in the excretion of the peptides included in the proteomic classification model aGvHD_MS17 were converted to a numerical CF, using an SVM-based clustering software as described.19 Box-and-Whisker plot analysis of CF values in the case and control patient groups of the training set (Supplementary Table S1) demonstrated a significant difference of the aGvHD_MS17 classifier in samples from patients without aGvHD or aGvHD grade I (P<0.0001) when compared with patients with aGvHD grade II or more (Figure 1a). Analyses of 1106 samples collected from our prospective cohort provided further evidence that the proteome classifier aGvHD_MS17 can significantly distinguish patients with no aGvHD from those with aGvHD grade I (P=0.0004), grade II (P<0.0001) or grades III/IV (P<0.0001), respectively (Figure 1b). To evaluate the specificity of aGvHD_MS17, additional control samples including chronic renal failure syndromes and autoimmune diseases were analyzed with the same classifier as patients after allo-HSCT (Figure 1c). Only samples from patients after allo-HSCT with severe aGvHD were positive in aGvHD_MS17 classification. Organ manifestation of aGvHD was analyzed in the prospective set for prediction of organ involvement. aGvHD_MS17 scoring was investigated for skin, intestine or liver manifestation of aGvHD to examine possible organ-specific effects on the classification. 
Although no significant difference between the different manifestations could be detected (data not shown), indicating absence of organ specificity of aGvHD_MS17, involvement of more than 1 organ, which usually correlated with a higher grade of aGvHD, resulted in higher CF values (Figure 1d), as expected.

Figure 1

Patients and samples in the model establishment and prospective evaluation phase. (a) Distribution of the CF in the training set. Box-and-Whisker plot presentation showing the difference in aGvHD_MS17 classification between patients with aGvHD grade II or more compared with the controls for the training set. The training set consists of 33 samples with aGvHD grade II or more, and 76 samples from control patients. The pattern was transformed into a CF shown on the y axis using MosaCluster, an SVM-based program. MosaCluster constructs a separation hyperplane between the case and control samples of the training set in the n-dimensional aGvHD biomarker space. The result of SVM classification is a dimensionless positive or negative number, termed the CF, representing the Euclidean distance of a sample data point to the constructed separation hyperplane. The CF with the best sensitivity–specificity ratio in receiver operating characteristic evaluation of SVM values of the training set was defined as the cut-off point, in this case CF 0.1, and used subsequently as the decision criterion for aGvHD prediction in all prospectively collected samples. (b) Distribution of the CF in the prospective samples (n=1106). Comparison of aGvHD_MS17 CF values in the prospective HSCT patient cohort for the differentiation of aGvHD grade I from grade II and >grade II. All samples of the prospective cohort were analyzed and correlated with the clinical data. Box-and-Whisker representation of group-specific CF distribution is shown for the groups ‘no GvHD’, ‘aGvHD grade I’, ‘aGvHD grade II’ and ‘aGvHD grade III/IV’ of the prospective validation cohort (423 patients, 1106 samples) until clinical diagnosis of aGvHD. For the calculation of P-values, a post-hoc rank test was performed for average rank differences between the aGvHD grade I reference group and the aGvHD grade II and >grade II case groups after a significant result in the global Kruskal–Wallis test (P<0.0001).
(c) Specificity of aGvHD_MS17. Comparative analysis of aGvHD_MS17 model classification of samples collected from: NC, normal controls (n=76); NS, patients with nephrotic syndromes (n=253) including minimal change disease (n=12), focal segmental glomerulosclerosis (n=106), membranous glomerulonephritis (n=55), membranoproliferative glomerulonephritis (n=4) and IgA nephropathy (n=76); CVD, patients with cardiovascular diseases (n=234) including myocardial infarction (n=87), atherosclerosis (n=7), hypertension (n=45) and coronary disease (n=95); TU, patients with tumors (n=160) including Kaposi’s sarcoma (n=68), pancreatic carcinoma (n=11), cholangiocarcinoma (n=68), hepatocellular carcinoma (n=9) and tumors of other origin (n=4); IEM, patients with inborn error of metabolism (n=239) including type 2 diabetes mellitus (n=78) and Fabry disease (n=161); AI/ID, patients with autoimmune or inflammatory disorders (n=661) including type 1 diabetes mellitus (n=503), systemic lupus erythematosus (n=18), cholestasis (n=115) and vasculitis (n=25); GD, patients with genetic diseases (n=118) including autosomal-dominant polycystic kidney disease (n=71) and polycystic ovary syndrome (n=47). These non-disease-related control groups were compared with samples collected from patients after allo-HSCT without aGvHD or aGvHD grade I, aGvHD grade II or aGvHD III and IV. (d) Organ involvement in severe aGvHD. Figure 1d shows the Box-and-Whisker analyses of aGvHD_MS17 scoring for organ involvement in severe aGvHD. Applying proteomic profiling does not describe involvement of particular organs; however, severity of aGvHD is usually also accompanied by more than one organ manifestation. Manifestation of aGvHD in specific organs is indicated. GI, gastrointestinal manifestation.

Peptides and proteins forming the aGvHD_MS17 proteomic pattern

To date, we have successfully sequenced 10 of 17 pattern-forming, naive peptides. In patients with aGvHD, we found increased excretion of fragments of albumin (N-terminal), β2-microglobulin, collagen-α1 and -α2, and decreased excretion of fragments of CD99, fibronectin and collagen-α1 (Table 3).

Multivariable logistic regression and receiver operating characteristic analysis

Consecutive logistic regression analysis using aGvHD grade III or IV onset 14 days before any clinical signs for aGvHD as a dependent binary variable (Methods and Table 2) demonstrated that positivity in the aGvHD_MS17 model was the strongest predicting variable (P<0.0001) for the development of severe aGvHD. Recipient gender (P=0.0001) was also a highly significant predictor in our cohort (Table 2), with a predisposition of aGvHD development in males. Donor gender (P=0.037) was also a significant variable; male recipients transplanted from female donors had the highest risk for aGvHD development. Other significant variables were age, conditioning (P=0.05), immunosuppressive antibodies (P=0.02), primary disease (acute myeloid leukemia; P=0.046) and days post HSCT (P=0.001). C-reactive protein and serum albumin did not correlate with aGvHD development (P-values of 0.72 and 0.07, respectively) and therefore did not improve classification performance of the logistic regression model.

A logistic regression model combining the aGvHD_MS17 CF values with the statistically significant demographic and clinical variables presented in Table 2 enabled diagnosis of severe aGvHD with a sensitivity of 82.4% and a specificity of 77.3% about 14 days before clinical diagnosis and at a time when the patients had no clinical signs of aGvHD (Figure 2a). A CF of 0.1 was determined to be the most discriminatory cut-off. Separate analyses of recipients of bone marrow (BM) grafts (n=39) revealed high sensitivity (83%) and specificity (93%) for prediction of severe aGvHD development (Figure 2b). In addition, we compared the proteomics data with data obtained from biopsies where available. Figure 2c shows the receiver operating characteristic curves for both diagnostic tools in comparison. The prediction of severe aGvHD by aGvHD_MS17 proteomic profiling is comparable to the diagnosis based on biopsies (Table 1c, Figure 2c). Patients with biopsy-proven aGvHD grade III/IV were predicted correctly with aGvHD_MS17 with 91% sensitivity and 80% specificity. In addition, positivity of aGvHD_MS17 was usually detected earlier than positivity in biopsies (Table 1c, Figure 2c).
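The relationship between a score cut-off, sensitivity, specificity and the area under the receiver operating characteristic curve can be illustrated with a few lines of code. This is a sketch under stated assumptions: the area is computed as the Mann–Whitney probability that a case scores higher than a control, and the cut-off is selected by Youden's index (sensitivity + specificity − 1), which is a common criterion substituted here and may differ from the criterion used in the study. The scores in the usage note are invented.

```python
from typing import Sequence, Tuple


def roc_auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """AUC as P(case score > control score); ties count one half."""
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def sens_spec_at(cutoff: float,
                 case_scores: Sequence[float],
                 control_scores: Sequence[float]) -> Tuple[float, float]:
    """Score >= cutoff is called positive (assumed convention)."""
    sens = sum(s >= cutoff for s in case_scores) / len(case_scores)
    spec = sum(s < cutoff for s in control_scores) / len(control_scores)
    return sens, spec


def best_cutoff(case_scores: Sequence[float],
                control_scores: Sequence[float]) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) maximizing Youden's index."""
    best = None
    for cutoff in sorted(set(case_scores) | set(control_scores)):
        sens, spec = sens_spec_at(cutoff, case_scores, control_scores)
        if best is None or sens + spec - 1.0 > best[1] + best[2] - 1.0:
            best = (cutoff, sens, spec)
    return best
```

For the invented scores `[0.9, 0.8, 0.6, 0.5]` (cases) and `[0.7, 0.2, 0.1]` (controls), ten of the twelve case-control pairs are ordered correctly, so `roc_auc` returns 10/12.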

Figure 2

(a) Prediction of severe aGvHD 14 days before clinical signs in the prospective patient cohort. Receiver operating characteristic (ROC) curve (bold line, area under the curve (AUC)=0.85) of aGvHD grade III/IV prediction 14 days before any signs of aGvHD by the logistic regression model that was generated by combining proteomic pattern diagnosis with statistically significant demographic and medical variables such as age, immunosuppressive antibodies (antithymocyte globulin/thymoglobulin), recipient and donor gender, conditioning regimen, primary disease, human leukocyte antigen-match of donor and recipient and days post HSCT. Samples taken under steroid therapy were excluded to prevent confounding effects of steroids in the blinded set (Tables 1a–c, Supplementary Table S1). The 95% confidence intervals (95% CIs) are indicated by thin, broken lines. (b) Prediction of aGvHD grade II or more: BM-HSCT versus PB-HSCT. Separate analyses of samples collected from 39 patients after allogeneic BM and 379 patients after PB stem cell HSCT are shown. Only samples of patients with information on all clinical and demographic variables were analyzed. Cord blood SCT recipients (n=5) were excluded from this analysis. Pending severe aGvHD was analyzed by application of aGvHD_MS17 positivity in combination with statistically significant demographic and medical variables. The resulting ROC curve is compared with that of patients after PB-HSCT. The AUCs (0.95 and 0.84, respectively) are shown by the bold line, and 95% CIs are indicated by dotted lines. (c) Biopsy-proven aGvHD: correlation to prediction of aGvHD by proteomic profiling. Biopsies of the suspected organ were available in 80 patients. In 10 cases, aGvHD was not confirmed by biopsy (control). Only patients with biopsy-confirmed aGvHD grades III/IV were included in the analysis. The correlation of aGvHD_MS17 prediction of pending aGvHD with the later biopsy-confirmed aGvHD is shown here. AUC (0.89) and 95% CI are shown.

To test the ability of the aGvHD_MS17 pattern to discriminate between aGvHD and cGvHD, we evaluated samples from patients with manifested cGvHD and samples collected after day +130 post HSCT upon complete withdrawal of immunosuppression. The aGvHD_MS17 pattern did not cross-react with patients with manifested cGvHD (Supplementary Figure S1). Late-onset aGvHD upon withdrawal of immunosuppression was diagnosed using aGvHD_MS17 and presented as ‘aGvHD’ in our biomarker panel. The data demonstrate that the combination of aGvHD_MS17 with relevant demographic and medical variables provides for the first time the opportunity for preemptive treatment of patients at risk for severe aGvHD.

Discussion

Evaluation of the aGvHD-specific proteomic pattern aGvHD_MS17 over a period of 5 years in five different transplant centers demonstrated its power to predict aGvHD and potential usefulness to select patients for preemptive therapy. Blinded samples were classified correctly, with a sensitivity of 82.4% (95% confidence interval: 71–92.4) and specificity of 77.3% (95% confidence interval: 73.7–79.2) in combination with demographic and medical variables using a logistic regression model (Figure 2). Separate analyses of samples from patients after BM or peripheral blood (PB) stem cell transplantation showed that the performance of aGvHD_MS17 was statistically significantly better (P=0.01) in patients after BM-HSCT (area under the curve: 0.95). The sensitivity and specificity were 83% and 93% compared with 83% and 76%, respectively, in the PB-HSCT (area under the curve: 0.84) recipients. However, only 39 patients received BM-HSCT grafts, whereas 379 received PB-HSCT grafts.

Importantly, the aGvHD_MS17 is specific for prediction of aGvHD, especially grades III and IV, and does not cross-react with patients with other diseases or complications tested (Figure 1) or samples from patients with cGvHD (Supplementary Figure S1). In addition, aGvHD_MS17 positivity was the most significant independent variable in the multivariable logistic regression model, predicting development of aGvHD grades III and IV, followed by gender, whereas conditioning regimen and even matched donor transplantation were less significant (Table 2).

The loss of serum albumin in patients developing aGvHD grades III and IV of the intestine has been described recently, leading the authors to speculate that albumin might be lost via the intestine as aGvHD-initiated organ damage progresses.21 The majority of patients had decreased albumin levels early after HSCT; however, inclusion of serum albumin levels in our multivariate regression model showed that serum albumin loss was not statistically significant in our cohort for prediction of severe aGvHD. The decreased serum albumin levels observed in our study may have resulted from the administration of immunosuppressive antibodies to 72% of our patients during conditioning (Tables 1a–c). Capillary leakage syndromes are common under this conditioning therapy and may be the underlying cause of serum albumin loss in our patients independent of aGvHD. However, we detected increased urinary excretion of a specific N-terminal fragment of albumin as aGvHD progressed (Table 3). Albumin uptake in T cells was described to be associated with aGvHD development.28 Thus, our results confirm those of Rezvani et al.,21 but suggest changes in serum albumin metabolism/catabolism or possible GvHD-induced vascular damage in the kidney rather than mere intestinal loss of serum albumin as a pathological component of aGvHD.

Others have applied new technologies for aGvHD diagnosis, underlining the need for advances in the ability to diagnose GvHD in patients undergoing allogeneic HSCT.23, 29, 30 A biomarker panel consisting of six proteins potentially involved in the pathogenesis of aGvHD (IL-2 receptor-α, tumor necrosis factor receptor-1, hepatocyte growth factor, IL-8, elafin, a skin-specific marker,23 and regenerating islet-derived 3-α)31 was established for serum using enzyme-linked immunosorbent assay. These biomarkers, present at the time of diagnosis of manifested aGvHD, were investigated in a multicenter trial to predict treatment response and survival of patients with aGvHD.30 Sampling was done at diagnosis of manifested aGvHD and 14 and 28 days after initiation of treatment, and the pattern could predict response to therapy and survival. However, these markers are not suitable for preemptive diagnosis of aGvHD.30 The special value of our aGvHD-specific classifier (aGvHD_MS17) is its capacity to identify patients before any clinical signs of developing aGvHD, independent of organ manifestation and at least 14 days before clinical manifestation of aGvHD. The aGvHD_MS17 classifier is in very good agreement with the gold standard for aGvHD diagnosis, namely tissue biopsies (Tables 1a–c, Figure 2c). Tissue biopsy cannot be used for routine monitoring requiring repeated sampling, and its predictive value is therefore not easily assessable. Prediction of pending severe aGvHD can currently only be accomplished by the proteomic pattern. No association with specific organ manifestations of aGvHD was detectable. However, the severity of pending aGvHD and the manifestation of aGvHD in more than one organ were both associated with aGvHD_MS17 scoring. In our cohort, patients with severe aGvHD generally had more than one organ involved in aGvHD, as well as a higher score in the aGvHD_MS17 classifier (Figure 1d).

Sequencing the naive peptides forming the classifier (aGvHD_MS17) provided insight into aGvHD pathophysiology and, ultimately, may help to identify novel potential therapeutic targets for aGvHD therapy. We observed increased or decreased excretion of the pattern-forming peptides. For example, increased β2-microglobulin excretion may indicate cell death as aGvHD progresses in severity. In addition, we observed increased or decreased excretion of particular collagen fragments, indicating very early changes in collagen metabolism, possibly indicating inflammation and/or early vascular damage that may consequently lead to organ damage. It is well accepted that conditioning, especially with total body irradiation, leads to an inflammatory environment, which causes activation of recipient antigen-presenting cells and donor T cells. CD99, for example, is an activation marker of T cells, and excretion was decreased as aGvHD severity increased. One can speculate that in the activation state (aGvHD) turnover of CD99 may be reduced. Interestingly, the decreased excretion of the fibrinogen fragment points toward unsuccessful repair of the microdamages to the vasculature in patients prone to develop aGvHD III/IV (Table 3).

In summary, application of the proteomic classifier (aGvHD_MS17) to evaluate allo-HSCT recipients allowed reliable prediction of specific changes and damage relevant for our understanding of aGvHD development. Urinary proteomic monitoring introduces the first unbiased, investigator-independent diagnosis of pending severe aGvHD and is currently being investigated in clinical trials to guide preemptive treatment of aGvHD_MS17 pattern-positive patients.