Exploratory classification of clinical phenotypes in Japanese patients with antineutrophil cytoplasmic antibody-associated vasculitis using cluster analysis

A novel patient cluster in antineutrophil cytoplasmic antibody (ANCA)-associated vasculitis (AAV) may be identified in Japan. We performed multiple correspondence analysis and cluster analysis of 427 clinically diagnosed AAV patients, excluding those with eosinophilic granulomatosis with polyangiitis. Model 1 included the ANCA phenotype, items of the Birmingham Vasculitis Activity Score, and interstitial lung disease; model 2 included serum creatinine (s-Cr) and C-reactive protein (CRP) levels in addition to the model 1 components. Among the seven clusters determined in model 1, the ANCA-negative (n = 8) and proteinase 3-ANCA-positive (n = 41) groups emerged as two distinct clusters. The other five clusters, all myeloperoxidase-ANCA-positive, were characterized by ear, nose, and throat (ENT) symptoms (n = 47); cutaneous symptoms (n = 36); renal symptoms (n = 256); absence of renal symptoms (n = 33); and both ENT and cutaneous symptoms (n = 6). The four clusters in model 2 were characterized by myeloperoxidase-ANCA negativity (n = 42), absence of s-Cr elevation (< 1.3 mg/dL) (n = 157), s-Cr elevation (≥ 1.3 mg/dL) with high CRP (> 10 mg/dL) (n = 71), and s-Cr elevation (≥ 1.3 mg/dL) without high CRP (≤ 10 mg/dL) (n = 157). Overall, renal, and relapse-free survival rates differed significantly across the four clusters in model 2. ENT, cutaneous, and renal symptoms may be useful for characterizing Japanese AAV patients with myeloperoxidase-ANCA. The combination of s-Cr and CRP levels may be predictive of prognosis.

Abbreviations: PR3, proteinase-3; RemIT-JAV, Remission induction therapy in Japanese patients with AAV; RemIT-JAV-RPGN, Remission induction therapy in Japanese patients with AAV and rapidly progressive glomerulonephritis; s-Cr, serum creatinine.

Antineutrophil cytoplasmic antibody (ANCA)-associated vasculitis (AAV) is a multisystem autoimmune disease characterized by ANCA production and inflammation of small- and medium-sized blood vessels 1. Eosinophilic granulomatosis with polyangiitis (EGPA), granulomatosis with polyangiitis (GPA), and microscopic polyangiitis (MPA) are the major categories of AAV, and proteinase 3 (PR3) and myeloperoxidase (MPO) are the two major antigens of ANCA 2. PR3-ANCA is generally regarded as a marker for GPA 2, while MPO-ANCA is regarded as a marker for MPA/renal-limited vasculitis 3. Although various classification and diagnostic criteria combining clinical symptoms and ANCA phenotypes have been used in clinical studies and practice, unclassifiable patients still remain 3. Moreover, a recent genome-wide association study revealed that the ANCA phenotype classified patients with GPA and MPA better than clinical classification did, suggesting that a better classification is needed 4. Cluster analysis is a statistical method of exploratory data mining that groups objects into homogeneous groups according to their similarity. A previous report using cluster analysis for AAV suggested that the ANCA phenotype and renal involvement could predict prognosis better than clinical classification such as GPA and MPA 5. We have previously reported that predominance of MPA and MPO-ANCA positivity is a common characteristic of AAV in Japan and other East Asian countries 6, in marked contrast to the results of studies from Western countries 7,8.
Based on the above-mentioned genetic study, marked differences are likely between the genetic backgrounds of AAV patients in Japan and those in Western countries 4. Thus, another relevant cluster may be determined in Japan, where MPO-ANCA and MPA are dominant among AAV patients.
Although prognostic factors in AAV have not yet been fully explored, the Remission Induction Therapy in Japanese Patients with AAV (RemIT-JAV) study revealed that the European Vasculitis Study Group (EUVAS) criteria for disease severity were useful for predicting the prognosis of Japanese patients with AAV 6, while the Remission Induction Therapy in Japanese Patients with AAV and Rapidly Progressive Glomerulonephritis (RemIT-JAV-RPGN) study further showed that the Japanese RPGN clinical grading system was more suitable than the EUVAS criteria for disease severity 8. The Japanese RPGN clinical grading system consists of four components: age, serum creatinine (s-Cr) levels, lung complication, and C-reactive protein (CRP) levels. CRP levels have been reported to be useful for predicting mortality and for distinguishing between active AAV and remission 9,10. However, no study has elucidated the characteristics of AAV that are represented by CRP levels.
The objectives of the present study were (1) to explore novel clinical groups of MPA, GPA, and unclassifiable patients using cluster analysis in terms of clinical phenotype or severity assessment; (2) to evaluate the associations between the determined clusters and clinical outcomes; and (3) to elucidate the characteristics of AAV associated with CRP levels.

Methods
Database. This study used data from the RemIT-JAV study and the RemIT-JAV-RPGN study. Twenty-two tertiary care institutions (university hospitals and referring hospitals) participated in RemIT-JAV, and 53 participated in RemIT-JAV-RPGN.

Clinical variables. Data from the two studies were merged into a single dataset. Forty-two patients with EGPA were excluded because EGPA phenotypes differ markedly from those of other types of AAV. We also excluded eight patients whose PR3-ANCA results were unavailable at diagnosis. In the present study, nine items of the Birmingham Vasculitis Activity Score (BVAS) 2003, interstitial lung disease (ILD), s-Cr levels, serum CRP levels, and the ANCA phenotype (proteinase-3 (PR3)-ANCA or MPO-ANCA) were used as clinical variables. ILD was confirmed radiologically. The BVAS 2003 includes the following symptoms: general; cutaneous; mucous membrane/eye; ear, nose and throat (ENT); chest; cardiovascular; abdominal; renal; and nervous system symptoms 12. ILD was selected as a candidate clinical variable because of its high prevalence in our cohorts 3,8.

The patients enrolled in RemIT-JAV and RemIT-JAV-RPGN were evaluated at 3, 6, 12, 18, and 24 months after diagnosis and at the time of relapse. We collected the following outcome measures: remission rate, overall survival rate, end-stage renal disease (ESRD)-free survival rate, and relapse rate. Remission was defined as a BVAS of 0 (new or worse) on two consecutive occasions at least one month apart 13. ESRD was defined as dependence on dialysis or an irreversible increase in s-Cr level to > 5.6 mg/dL (500 μmol/L) 14. Relapse was defined as the recurrence or new onset of clinical signs and symptoms attributable to active vasculitis 15.

Statistical analysis. Cluster analysis was performed based on two models. Model 1 included the nine clinical symptoms considered in the BVAS, PR3-ANCA and MPO-ANCA, and ILD for assessment of the clinical phenotype. Model 2 additionally included two laboratory variables (s-Cr and CRP levels) for the assessment of disease severity [8][9][10]. The s-Cr levels were categorized on the basis of the thresholds of the EUVAS criteria for disease severity (1.3 mg/dL (120 μmol/L) and 5.6 mg/dL (500 μmol/L)) 15, and the CRP levels were categorized on the basis of the thresholds of the Japanese RPGN clinical grading system (2.6 mg/dL and 10.0 mg/dL), which can stratify the prognosis of patients with AAV and/or RPGN 8,16. There were no missing data for these variables among the enrolled patients.
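The categorization of the two laboratory variables can be illustrated with a minimal sketch. The function names are hypothetical, and the handling of values exactly at a threshold is partly an assumption: the text states that s-Cr elevation means ≥ 1.3 mg/dL and that high CRP means > 10 mg/dL, but it does not say on which side the 5.6 mg/dL and 2.6 mg/dL boundaries fall.

```python
# Thresholds quoted in the text, all in mg/dL.
SCR_LOW, SCR_HIGH = 1.3, 5.6    # EUVAS criteria for disease severity
CRP_LOW, CRP_HIGH = 2.6, 10.0   # Japanese RPGN clinical grading system


def scr_category(scr_mg_dl):
    """Return 0, 1, or 2 for increasing serum creatinine.

    >= 1.3 counts as elevated (as in the text); treating exactly 5.6
    as the top category is an assumption.
    """
    if scr_mg_dl >= SCR_HIGH:
        return 2
    if scr_mg_dl >= SCR_LOW:
        return 1
    return 0


def crp_category(crp_mg_dl):
    """Return 0, 1, or 2 for increasing CRP.

    > 10.0 counts as high (as in the text); treating exactly 2.6
    as the middle category is an assumption.
    """
    if crp_mg_dl > CRP_HIGH:
        return 2
    if crp_mg_dl >= CRP_LOW:
        return 1
    return 0


# Hypothetical patient: s-Cr 2.1 mg/dL, CRP 10.0 mg/dL.
print(scr_category(2.1), crp_category(10.0))
```

Coding both variables as ordered categories keeps them on the same footing as the binary symptom items that enter the correspondence analysis.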
First, multiple correspondence analysis was performed to select candidate variables. Using principal component analysis (PCA), the contribution rate of each variable was calculated according to its distance from principal component 1 and principal component 2. Variables that together accounted for at least 90% of the total contribution rate were included in the cluster analysis. Subsequently, hierarchical clustering based on the Ward method, followed by consolidation (K-means algorithm), was performed using the selected variables. To decide the optimal number of clusters, a dendrogram was plotted for each model, and the clusters were determined from the branches and vertical distances of each dendrogram. A dominant clinical feature (present in > 75% or 0% of patients in each cluster) was used to name each cluster.
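The variable selection step can be sketched as follows. This is an illustration only: the text does not specify how the contribution rates were computed or ranked, so the sketch assumes that variables are taken in descending order of contribution until the cumulative share reaches 90%. The function name and the example values are hypothetical.

```python
def select_by_contribution(contributions, threshold=0.90):
    """Pick variables, largest contribution first, until their
    cumulative share of the total reaches `threshold`.

    contributions: dict mapping variable name -> non-negative contribution.
    """
    total = sum(contributions.values())
    ranked = sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)
    selected = []
    cumulative = 0.0
    for name, value in ranked:
        if cumulative >= threshold * total:
            break  # the threshold is already met by the variables chosen so far
        selected.append(name)
        cumulative += value
    return selected


# Hypothetical contribution rates (percent), not taken from the study.
example = {"A": 50, "B": 30, "C": 15, "D": 5}
print(select_by_contribution(example))  # → ['A', 'B', 'C']
```

In this example the first two variables cover 80% of the total, so a third is needed to pass the 90% threshold, and the fourth is dropped.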
For evaluation of the discrimination ability of the determined characteristics in each cluster, classification tree analysis was conducted subsequently. The predictive accuracies of the algorithms were calculated using the observed numbers of individuals allocated to the predicted classes. The overall survival, ESRD-free survival, cumulative remission, and relapse rates were analyzed using the Kaplan-Meier method and the log-rank test across the determined clusters.
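For readers unfamiliar with the Kaplan-Meier method, the product-limit estimate can be sketched in a few lines. This is a generic textbook version, not the JMP or R routine used in the study, and it omits the log-rank test; the function name and the example data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each patient (e.g. days).
    events: 1 if the event (e.g. death) was observed, 0 if censored.
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_removed  # events and censored patients leave the risk set
    return curve


# Hypothetical follow-up of five patients, censored at 200 and 730 days.
print(kaplan_meier([100, 200, 200, 300, 730], [1, 0, 1, 1, 0]))
```

With these five patients the estimate steps down to about 0.8, 0.6, and 0.3 at days 100, 200, and 300. Applying the function separately to each cluster gives the curves that a log-rank test would then compare.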
To explore the clinical symptoms associated with the serum CRP levels, multiple linear regression analysis was performed using stepwise backward selection to minimize the Bayesian information criterion. Among all 63 items of the BVAS and ILD, the items observed in > 5% of enrolled patients were used as candidate variables. The prevalence of each BVAS item or ILD is shown in Supplementary Table S1.
All statistical analyses were performed by a biostatistician using the JMP version 10.0.2 statistical package for Windows (SAS Institute Inc., Cary, NC, USA) or R 3.2.3 (R Foundation for Statistical Computing, Vienna, Austria). A two-tailed P < 0.05 was considered statistically significant. When comparing seven or four clusters, the statistical significance was determined by P < 0.05/7 or P < 0.05/4 by the Bonferroni correction to adjust for multiple testing.
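The Bonferroni-adjusted thresholds described above amount to simple arithmetic, as in the following sketch (the function names are hypothetical):

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni correction."""
    return alpha / n_comparisons


def is_significant(p_value, alpha=0.05, n_comparisons=1):
    """True if p_value is below the Bonferroni-adjusted threshold."""
    return p_value < bonferroni_threshold(alpha, n_comparisons)


# Thresholds for the seven-cluster and four-cluster comparisons.
print(bonferroni_threshold(0.05, 7))   # roughly 0.0071
print(bonferroni_threshold(0.05, 4))   # 0.0125
```

Under the four-cluster threshold, a P value of 0.0024 would be judged significant, whereas a hypothetical P value of 0.02 would not.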
Ethics approval and consent to participate. This study was approved by the Ethics Committee of the Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Sciences (authorization number: No. 1909-016), and conducted according to the Declaration of Helsinki and the Ethical Guidelines for Epidemiological Research in Japan. Written informed consent was obtained from each participant, and the study protocol was approved by the ethics committee of each participating hospital. The RemIT-JAV study and RemIT-JAV-RPGN study were registered with the University Hospital Medical Information Network Clinical Trials Registry (UMIN000001648 and 000005136).

Results
Patient characteristics and clinical outcomes.
Of 477 patients with AAV enrolled in the two cohort studies, 427 were included in the present study (142 from the RemIT-JAV study and 285 from the RemIT-JAV-RPGN study; Supplementary Fig. S1). The characteristics of the enrolled patients are shown in Supplementary Table S2. Among the 427 enrolled patients, 15 were positive for both types of ANCA. Remission was achieved in 88% (n = 376) of the enrolled patients, and relapse occurred in 15% (n = 57) of the patients who achieved remission. During a median (IQR) observation period of 730 (654-730) days, 47 deaths and 46 cases of ESRD were reported.

Cluster analysis in model 1.
On the basis of the contribution rates of the candidate variables in model 1 calculated by PCA, eight variables accounting for 91% of the total contribution rate were selected: MPO-ANCA, PR3-ANCA, ENT symptoms, nervous system symptoms, general symptoms, renal symptoms, cutaneous symptoms, and ILD (Supplementary Table S3). The dendrogram of model 1 suggested seven clusters (Supplementary Fig. S2). Patient characteristics were compared across the seven clusters, as presented in Table 1.

Cluster analysis in model 2.
Next, PCA of model 2 (including the CRP and s-Cr levels) was performed, and nine variables accounting for 93% of the total contribution rate were selected: MPO-ANCA, PR3-ANCA, general symptoms, ENT symptoms, CRP, nervous system symptoms, creatinine, mucous membrane/eye symptoms, and renal symptoms (Supplementary Table S4). The dendrogram of model 2 suggested four clusters (Fig. 2). Patient characteristics were compared across the four clusters, as presented in Table 2. Cluster 1 was characterized by MPO-ANCA negativity (39 of 42, 93%), Cluster 2 by s-Cr elevation (≥ 1.3 mg/dL (120 μmol/L)) with high CRP (> 10 mg/dL) (54 of 71, 76%; renal with high CRP), Cluster 3 by absence of s-Cr elevation (< 1.3 mg/dL) (140 of 157, 89%; non-renal), and Cluster 4 by s-Cr elevation (≥ 1.3 mg/dL) without high CRP (≤ 10 mg/dL) (117 of 157, 75%; renal without high CRP). Classification tree analysis based on the specified characteristics of each cluster is presented in Fig. 3. First, 47 patients were classified as MPO-ANCA-negative. Among the MPO-ANCA-positive patients, 178 were classified into a non-renal group, 65 into a renal group with high CRP levels, and 137 into a renal group without high CRP levels. The classification tree analysis was consistent with the cluster analysis in model 2: 343 (80%) patients were classified into the same cluster.
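The decision rules described for the classification tree can be written out as a short sketch. The order of the splits among MPO-ANCA-positive patients (s-Cr first, then CRP) is an assumption based on the description above, since Fig. 3 is not reproduced here; the function name and group labels are hypothetical.

```python
def classify(mpo_anca_positive, scr_mg_dl, crp_mg_dl):
    """Assign a patient to one of the four model 2 groups.

    Split order after MPO-ANCA status is assumed, not confirmed by the source.
    """
    if not mpo_anca_positive:
        return "MPO-ANCA-negative"
    if scr_mg_dl < 1.3:
        return "non-renal"
    if crp_mg_dl > 10.0:
        return "renal with high CRP"
    return "renal without high CRP"


# Group sizes reported for the classification tree.
tree_counts = {
    "MPO-ANCA-negative": 47,
    "non-renal": 178,
    "renal with high CRP": 65,
    "renal without high CRP": 137,
}
total = sum(tree_counts.values())   # 427 patients
agreement = 343 / total             # share assigned to the same cluster
print(total, round(agreement, 2))   # → 427 0.8
```

The four group sizes sum to the 427 enrolled patients, and 343 of 427 corresponds to the reported 80% agreement with the cluster analysis.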

Discussion
This is the first study to classify Japanese AAV patients by cluster analysis and to evaluate prognosis across the determined clusters. In the first model, seven distinct clinical clusters were identified: ENT, cutaneous, combined ENT and cutaneous, non-renal, and renal clusters among MPO-ANCA-positive patients; a PR3-ANCA-positive cluster; and an ANCA-negative cluster. Moreover, clusters separated on the basis of CRP and s-Cr levels predicted prognosis with respect to overall, ESRD-free, and relapse-free survival rates. Further, this is the first study to evaluate the symptoms that contribute to the CRP level, and the analysis suggested that the CRP level reflected general, pulmonary, and neural symptoms but not renal symptoms. Although the ANCA phenotype and ENT, cutaneous, and renal symptoms are universal and specific features of AAV, the priority of each feature in the classification of AAV may differ among ethnicities or regions. In this study, PR3-ANCA-positive and ANCA-negative patients were separated first. A previous report from a Western country demonstrated that PR3-ANCA positivity was used to classify patients with GPA and MPA into two clusters in the final branch of classification 5. In that study, 56% of patients were PR3-ANCA-positive and 32% were MPO-ANCA-positive, whereas in our study 89% were MPO-ANCA-positive and only 11% were PR3-ANCA-positive. Although the importance of the ANCA phenotype in our study was consistent with that in the previous study, this marked difference in ANCA positivity may explain why clusters were separated according to the ANCA phenotype ahead of clinical symptoms in our study 5. Subsequently, patients with ENT symptoms were classified among MPO-ANCA-positive patients.
Although ENT symptoms are generally regarded as surrogate markers for GPA 11, which is characterized by PR3-ANCA, there are several reports of MPO-ANCA-positive GPA in which the majority of patients showed ENT symptoms, consistent with our results 17,18. Recently, the notion of otitis media with AAV (OMAAV) irrespective of the ANCA phenotype has emerged 19; OMAAV is frequently accompanied by facial palsy and hypertrophic pachymeningitis. Thus, ENT symptoms may be a surrogate clinical marker even in MPO-ANCA-positive AAV. After classification based on ENT symptoms, patients with cutaneous symptoms were identified. This cluster exhibited involvement of various organs except for ENT symptoms (Table 1). According to a previous report 20, cutaneous vasculitis rarely preceded the renal and pulmonary symptoms in patients with MPA. Cutaneous vasculitis was also reported to be associated with general, renal, pulmonary, and nervous system symptoms 20,21. Indeed, the cutaneous cluster showed a tendency toward worse overall survival in our study. Therefore, patients with cutaneous symptoms should be carefully evaluated for systemic organ involvement. Finally, clusters with and without renal symptoms were separated. In our previous report, using a different cohort, renal vasculitis was observed in 75% of patients with MPO-ANCA-positive AAV 22. In the present study, we were able to confirm that renal symptoms are specific for MPO-ANCA-positive MPA and GPA.

[Figure legend] Analysis was performed using a log-rank test. Cluster 2 showed a worse survival rate compared to Cluster 1 (P = 0.0024) and Cluster 3 (P = 0.0003). Cluster 2 exhibited a worse ESRD-free survival rate compared to Cluster 1 (P < 0.0001) and Cluster 3 (P < 0.0001). One patient in Cluster 3 and one patient in Cluster 4 were excluded from these analyses because of missing follow-up data. Abbreviations: ESRD, end-stage renal disease; CRP, C-reactive protein; MPO-ANCA, myeloperoxidase-antineutrophil cytoplasmic antibody; s-Cr, serum creatinine.

s-Cr and CRP levels could be good biomarkers for the prediction of prognosis. In the present study, patients with s-Cr levels ≥ 1.3 mg/dL (120 μmol/L) showed poor overall and ESRD-free survival. Patients with s-Cr levels < 1.3 mg/dL were categorized as having the localized type or early systemic type of AAV 15. The BVAS categorizes s-Cr levels into three groups: 1.4 mg/dL (125 μmol/L) to 2.8 mg/dL (249 μmol/L), 2.8 mg/dL (250 μmol/L) to 5.6 mg/dL (499 μmol/L), and ≥ 5.6 mg/dL (500 μmol/L) 12. The 1996 Five-Factor Score also included s-Cr levels of > 1.6 mg/dL as the indicator of renal insufficiency 23. Although previous reports have shown that better renal function at diagnosis is associated with improved survival and renal outcome [24][25][26], the presence of renal insufficiency might be more important than its severity for the prediction of prognosis in patients with AAV. In addition to the s-Cr levels, the CRP levels could differentiate the prognosis of AAV. Previous reports have shown that an increase in CRP levels before treatment of AAV was associated with relapse and mortality 9,10. Although the appropriate cut-off of CRP for the prediction of AAV outcomes had not been determined, we have reported the utility of the Japanese RPGN clinical grading system for predicting prognosis; this system consists of age, s-Cr levels, lung complication, and CRP levels, with the CRP levels separated at 2.6 mg/dL and 10.0 mg/dL 8. In the present study, we were able to validate these cut-offs of CRP levels. Further studies are needed to confirm the roles of s-Cr and CRP levels as biomarkers for prognosis in patients with AAV.
CRP is a surrogate marker for involvement of important organs other than the kidney. CRP levels have not been observed to increase in patients with glomerulonephritis, including IgA nephropathy, membranous nephropathy, and minimal change disease, compared to those in controls 27. Consistent with these previous observations, in the present study the CRP levels were associated with fever, myalgia, and pulmonary and neurological vasculitis but not with renal symptoms. Although several previous reports have described CRP as a possible biomarker in patients with systemic lupus erythematosus or AAV with renal involvement 28,29, CRP levels may be related to involvement of organs other than the kidney in patients with AAV. Therefore, it could be rational and relevant to perform a combined evaluation of s-Cr and CRP as indicators of the severity of renal and non-renal symptoms, respectively, for predicting prognosis in patients with AAV. Importantly, these biomarkers are modifiable by treatment. Hence, future research should elucidate whether improvements in CRP and/or s-Cr levels with treatment can predict prognosis.
Several limitations of this study should be acknowledged. First, we could not validate our classification models. Different populations are required for the validation of our results, which will be conducted in the near future. Second, MPO-ANCA-positive MPA is dominant in the Japanese population, in contrast to the dominance of PR3-ANCA in Western populations. However, this point may be a strength of our study, because our cluster analyses provided new insights into clinical characteristics that differ from those obtained in Western populations. Third, AAV was diagnosed clinically by the site investigators, although all patients fulfilled the criteria for primary systemic vasculitis and were classified by the EMEA algorithm. Fourth, the treatment strategy was decided at the discretion of each attending physician; therefore, it is possible that patients with higher s-Cr levels were treated intensively, leading to underestimation of outcomes. Nevertheless, patients with higher s-Cr levels showed worse overall and renal survival, supporting the relevance of renal vasculitis in AAV. Fifth, BVAS items with low prevalence were excluded from the analysis of the association with CRP levels. Among the excluded items, congestive cardiac failure, peritonitis, bloody diarrhea, and ischemic abdominal pain have been reported to be related to survival in patients with AAV 30. Therefore, careful workup for these cardiovascular and abdominal involvements may be necessary, regardless of CRP levels.

Conclusions
In summary, we have identified novel clusters of AAV among Japanese patients. We have also identified s-Cr level, CRP level, and MPO-ANCA negativity as prognostic biomarkers for overall, ESRD-free, and relapse-free survival. Further, we found that CRP was associated with non-renal symptoms such as general, pulmonary and nervous system symptoms but not with renal symptoms.