Feature selection and classification of urinary mRNA microarray data by iterative random forest to diagnose renal fibrosis: a two-stage study

Renal fibrosis is a common pathological pathway of progressive chronic kidney disease (CKD). However, kidney function parameters are suboptimal for detecting early fibrosis, and therefore, novel biomarkers are urgently needed. We designed a 2-stage study and constructed a targeted microarray to detect urinary mRNAs of CKD patients with renal biopsy and healthy participants. We analysed the microarray data by an iterative random forest method to select candidate biomarkers and produce a more accurate classifier of renal fibrosis. Seventy-six and 49 participants were enrolled into stage I and stage II studies, respectively. By the iterative random forest method, we identified a four-mRNA signature in urinary sediment, including TGFβ1, MMP9, TIMP2, and vimentin, as important features of tubulointerstitial fibrosis (TIF). All four mRNAs significantly correlated with TIF scores and discriminated TIF with high sensitivity, which was further validated in the stage-II study. The combined classifiers showed excellent sensitivity and outperformed serum creatinine and estimated glomerular filtration rate measurements in diagnosing TIF. Another four mRNAs significantly correlated with glomerulosclerosis. These findings showed that urinary mRNAs can serve as sensitive biomarkers of renal fibrosis, and the random forest classifier containing urinary mRNAs showed favourable performance in diagnosing early renal fibrosis.

Scientific RepoRts | 7:39832 | DOI: 10.1038/srep39832 Using targeted microarrays, our previous study showed that urinary vimentin mRNA was significantly upregulated in moderate-to-severe fibrosis 9 . However, whether urinary mRNAs can efficiently identify patients with renal fibrosis in the context of CKD has not been investigated yet.
Another problem is that datasets generated by microarrays are often noisy, multicollinear, and high dimensional, which make it difficult to process. Machine learning is a subfield of computer science that evolved from artificial intelligence. Machine learning can be used to process much more complex data than traditional statistical methods and make predictions with higher accuracy 10 . Random forest (RF) methods, constructed from decision tree predictors, represent one of the most prevalent supervised machine learning methods, which was first introduced by Breiman in 2001 11 . RF methods return measures of variable importance and have superior performance with respect to the problems that microarray data bring, making it well suited for microarray analysis 12 . An empirical study by Archer et al. showed that RF is a robust method for making an accurate classifier and evaluating the discriminative ability of individual predictors in classification problems 13 .
Patients who undergo renal biopsy are usually at the early stages of CKD, which is the ideal target population. The primary objective of this two-stage study was to test and validate the hypothesis that mRNAs from urinary sediment could provide useful information of early renal fibrosis. To our knowledge, this is the first machine-learning analysis of the diagnostic performance of urinary mRNAs in renal fibrosis. By iterative random forest analysis of a targeted microarray, we aimed to discover a panel of mRNAs and develop a more powerful classifier for improved diagnosis of renal fibrosis.
Forty-one biopsy-proven CKD patients and 8 healthy participants, who were screened according to the same criteria used for stage I, were enrolled in the validation set (Table 1). Primary glomerulonephritis was still the leading cause of CKD, and 16 IgAN, 7 non-IgA MsPGN, 3 MCD, 3 MN, 1 FSGS, 1 IgM, and 1 C1q nephropathy cases were found. Other causes of CKD included lupus nephritis (n = 3), DN (n = 2), Alport's syndrome (n = 2), and ANCA-associated vasculitis (n = 3). The basic clinical characteristics of all participants, with or without TIF, are shown in Table 1.
Four mRNAs were identified as important features of TIF by RF. As shown in Fig. 1, the initial out-of-bag (OOB) estimates of the error rates were 0.329 and 0.306 in the test set and validation set, respectively. After the first four iterations, the OOB error decreased as the noisy mRNAs were eliminated. The error rates of both sets rebounded at the fifth iteration; thus, the "early stop" strategy was applied and the final OOB error was 0.210 in the test set and 0.183 in the validation set. Consequently, four mRNAs including TGFβ 1 (TGB1), MMP9, TIMP2, and vimentin (VIM) were identified as important features of TIF by RF (Fig. 1).
Use of four selected mRNAs to diagnose TIF with a high sensitivity. We first examined the basic statistical associations between the selected mRNAs and TIF in the test set. The relative expression levels of the four mRNAs in the TIF group were significantly higher than those in the group without TIF (Fig. 2). In addition, the TIF score (the healthy participants were not included in this analysis) and relative expression levels of TGFβ 1 (r = 0.281, p = 0.028), MMP9 (r = 0.338, p = 0.007), TIMP2 (r = 0.326, p = 0.009), and vimentin (r = 0.397, p = 0.001) were significantly correlated (Fig. 3).
Then, we assessed the individual diagnostic power of the four mRNA biomarkers in the test set by receiver-operating characteristic (ROC) curve analysis. As shown in Fig. 4, all biomarkers showed moderate performance in discriminating TIF, with areas under the ROC curve (AUCs) ranging from 0.727 to 0.757. The best cut-off of the four mRNAs yielded good sensitivity (0.762 to 0.976) but poor specificity (0.471 to 0.647), indicating the screening value of these biomarkers. Their performance was further validated in the stage-II study (AUCs between 0.668 and 0.748) (Fig. 4). In the test set, eGFR and SCr yielded better overall performance than did the individual mRNAs with AUCs of 0.781 (95% CI, 0.672-0.890, p < 0.001) and 0.775 (95% CI, 0.669-0.881, p < 0.001), respectively. The best cut-off values for both eGFR and SCr yielded excellent specificity (0.912 for eGFR and 0.903 for SCr), but poor sensitivity (0.690 for eGFR and 0.622 for SCr). In contrast, 24 urine proteins failed to discriminate TIF, both in the test set (AUC = 0.613, p = 0.092) and validation set (AUC = 0.598, p = 0.245). mRNA classifier trained by RF outperformed SCr and eGFR in diagnosing TIF. As shown in Table 2, the best cut-offs of SCr and eGFR obtained from the test set (87.1 μ mol/LforSCrand86.2ml• min•

Discussion
A major challenge for early detection of renal fibrosis is the lack of early and noninvasive biomarkers. However, previous evidence revealed that kidney function parameters were suboptimal for detecting early fibrosis owing to the compensatory effect 14 . Moreover, these markers reveal limited information regarding the underlying molecular mechanism. Recent data have shown that measuring transcriptional differences in urine sediment from patients with CKD may provide a novel means for early and sensitive diagnosis 7-9 . As microarray data is high-dimensional and reflects gene-gene interactions, univariate selection methods may generate less accurate classifiers and fail to capture these interactions 15 . In this study, we applied RF, a novel and powerful gene selection strategy, to discover renal fibrosis biomarkers.
Here, we identified a four-mRNA signature in urinary sediment including TGFβ 1, MMP9, TIMP2, and vimentin, which were important features of TIF. We showed that these four mRNAs could individually serve as sensitive diagnostic biomarkers for TIF. We also found that a classifier containing all four mRNAs had favourable performance in diagnosing TIF. Impressively, the combined classifiers outperformed SCr and eGFR in the validation set, with excellent sensitivity. We also investigated correlations between urinary mRNAs and GS severity by RF regression. As a secondary result, four mRNAs showed significant correlations with the GS score. These findings extended our previous findings on the diagnostic value of urinary mRNAs in renal fibrosis.
Far from being a simple disposition of collagen, renal fibrosis is modulated by a complex signalling network 16 . The role of TGFβ 1 in renal fibrosis has been well established 17 . Accumulating evidence has shown that both circulating and urinary TGFβ 1 levels can serve as biomarkers for CKD 18,19 . Recently, TGFβ 1 mRNA expression in urinary sediment was found to correlate with the degree of tubulointerstitial scarring in a small scale observational study 20 . Our study further tested the diagnostic value of urinary TGFβ 1 mRNA levels on a larger sample. A vast array of additional molecules has also been demonstrated to serve modulatory roles. Metal matrix proteinases (MMPs) and tissue inhibitor of metalloprotease (TIMP) proteins are recognized as the major cellular factors mediating matrix turnover 16 . In addition to ECM proteins, growth factor receptors and cell-adhesion molecules are also MMP substrates. Friese et al. reported that MMP-9 was up-regulated in hypertension and hypertensive end-stage kidney disease (ESKD) 21 . Recent findings have implied that TEC, MMP-9, and TIMP-2 expression can be upregulated by TGF-β in disease models 22,23 . We previously reported that the expression of MMP9 mRNA was strongly correlated with kidney function parameters 9 . Here, we further showed that these two mRNAs could serve as TIF biomarkers. Vimentin, a marker for EMT (epithelial-mesenchymal transition), has been demonstrated to be overexpressed in tubular cells during renal fibrosis 24 . Lee et al. found that a four-gene signature of mRNAs, including vimentin, was a predictor for renal fibrosis in human kidney allografts 7 . Data from our previous study revealed that vimentin mRNA detection enables good discriminative power of moderate to severe TIF. In this study, we also found that vimentin mRNA was a sensitive diagnostic marker of TIF. Although the overall performance of every single urinary mRNA markers was not superior to kidney function parameters in this analysis, ROC curve analysis showed that these markers all had better sensitivity. In fact, kidney function parameters changed little at the onset of fibrosis 4,5 . So urinary mRNAs can make up for the lack of sensitivity of current parameters and serve as screening markers for renal fibrosis. Moreover, kidney function parameters are also affected by non-renal factors such as prerenal ischemia and postrenal obstruction 5 . The value of urinary mRNAs in differential diagnosis needs further investigation.
No single biomarker alone is considered to be sensitive and specific enough for fibrosis screening. One possible reason is that a common pathway may not be involved in all CKD patients 25 . A previous report by Zeisberg et al. showed that renal fibrosis also occurs in a TGF-β signaling-independent manner 26 . Moreover, single-molecule measurements might have larger variations, which would make the associated predictions unstable. Our results showed that the combined mRNAs outperformed single mRNAs and kidney function parameters in detecting TIF, supporting the possibility that combined biomarkers could yield more accurate classification. In particular, when combined with kidney function parameters, the mRNA classifiers trained by RF yielded excellent accuracy of classification.
The main strengths of our study include: (1) the targeted microarray was constructed for high throughput analysis of urinary mRNAs; (2) strong methodology (iterative random forest) was applied to select features and generate a classifier that had a favorable performance in diagnosing TIF. (3) a two-stage study design was adopted to further validate the performance of biomarkers in diagnosing TIF. Our study has some limitations. First, the sample size of the validation set was relatively small, and a larger study is needed to further confirm our conclusions. Second, the constituents of urinary sediment in enrolled CKD patients may also interfere with the diagnostic performance of mRNAs. Urinary cell-specific mRNAs are expected to better reflect kidney injury. Third, to test the screening value of urinary mRNA biomarkers in renal fibrosis, we also included several healthy participants who did not undergo renal fibrosis and classified them into the no-fibrosis group.
In summary, we showed that among the target genes examined, four fibrosis-associated mRNAs in urinary sediments can serve as sensitive predictors of TIF. Combined classifier showed excellent sensitivity and better overall diagnostic performance than eGFR and SCr. Large-scale studies are needed to confirm the utility of urinary mRNA biomarkers for diagnosing renal fibrosis and predicting CKD progression.   younger than 18 years old; b) patients with acute kidney injury, chronic liver disease, cardiovascular disorders, urinary tract infections, or cancer; and c) administration of immunosuppressive medications. A group of age-and gender-matched healthy volunteers from the Zhong Da Hospital Health Care Center was also included in this study, all of whom met the following criteria: (1) no record of abnormal renal function (eGFR < 90mL• min• 1.73m 2 ); (2) normal routine urinalysis, ACR (albumin-creatinine ratio), and 24-h urinary protein test results; (3) no record of hypertension, diabetes, hyperlipidaemia, or hyperuricaemia; (4) no family history of kidney diseases.
Construction of the urinary mRNA microarray. We designed an mRNA microarray containing 61 genes participating in the well-accepted molecular mechanism of renal fibrosis. PRIMER 5 software was used to design the qPCR primer sets. Six reference genes (GAPDH, B2M, OAZ1, RPL27, HRPT1, and ACTB) were included in the microarray to normalize the transcription levels. Furthermore, we also included a genomic DNA control (GDC) to control for DNA contamination. The list of included genes is shown in Table 3. The primer sequences of all included genes are shown in the supplemental material.
Urine samples and mRNA measurements. First morning urine samples obtained at the time of kidney biopsy were centrifuged at 3,000 × g for 20 min at 4 °C within 2 h after sample collection. Then, the obtained urinary sediments were resuspended in 1.5 ml DEPC-treated PBS and centrifuged at 12,000 × g for 5 min at 4 °C. One millilitre of RNAiso Plus (Takara, Life Technologies) was added to preserve total RNA, and the samples were stored at − 80 °C until use. Total RNA was extracted according to the manufacturer's protocol (Ambion, Life Technologies). We measured the RNA concentrations using a NanoDrop 2000 (Thermo) based on the relative absorbance ratio at 260/280, and the RNA purity was also assessed by agarose gel electrophoresis. The qualified RNA samples were stored at − 80 °C until use.
We used the kit from Invitrogen to reverse transcribe the total RNA (Invitrogen, Life Technologies), which was stored at − 20 °C until use. mRNA-expression levels were quantified by real-time quantitative polymerase-chain-reaction (qPCR) assays in an ABI PRISM7700 system (Applied Biosystems). The thermocycling conditions were set as follows: 95 °C for 10 min, followed by 40 cycles of 15 s at 95 °C and 60 °C for 1 min. Dissociation curves and melting temperatures were recorded and relative mRNA expression levels were calculated by the Δ Δ Ct method 27 . Evaluation of renal fibrosis. Periodic acid Schiff (PAS) and Masson trichrome staining were used to determine the severity of GS and TIF, respectively. Two experienced pathologists (P.S. Chen and H.F. Ni) were blinded to the microarray data and scored the severity of renal fibrosis. For GS, a semiquantitative scoring system was used, as described previously by Raji et al. 28 . In specimens containing at least 20 glomeruli, each glomerulus was graded from 0 to 4 according to the percentage of fibrotic area (0 for 0%; 1 for 1-25% affected glomerular area; 2 for 26-50% affected glomerular area; 3 for 51-75% affected glomerular area; 4 for 76-100% affected glomerular area). The final score was a weighted average of all grades obtained. The percentage of fibrotic area in tubulointerstitium was recorded as the TIF score, and its grade was based on the following well-accepted rules: grade 0, no more than 5% fibrotic area; 1, 6-25% fibrotic area; 2, 26-50% fibrotic area; and 3, > 50% fibrotic area. A grade of 0 was considered to reflect the absence of TIF. Participants showing grades of 1-3 were combined into the TIF group. All healthy participants were classified in the no-TIF group.
Statistical analysis. Our analyses of the microarray database involved the following steps: (1) select important TIF features by iterative RF; (2) assess the individual diagnostic power of selected mRNAs for TIF, as well as their correlation with scores of TIF, in the test set; (3) validate the individual diagnostic power of selected mRNAs and routine parameters for TIF in the validation set; (4) assess the diagnostic power of the classifiers trained by RF for TIF and compare them with those of eGFR and SCr in the validation set; (5) select important features of GS by iterative RF regression and assess their correlation with GS scores.

Iterative RF method.
In the test set, iterative RF was performed to classify cases with and without TIF, using the randomForest package (from A. Liaw and M. Wiener) in R software (version i386 3.2.4) 29 . After each iteration, mRNAs showing the smallest mean decrease in accuracy were discarded, and a new RF with a lower OOB error rate was constructed. Data from a previous study have shown that the OOB estimate is as accurate as using a test set of the same sample size as the training set 11 . Although RF was more resistant to over-fitting than a support vector machine or artificial neural network, the validation set was involved in the same process simultaneously and an "early-stop" strategy was applied to prevent "over-fitting" 12 . The final set of mRNAs with smallest estimated OOB was identified as important features of TIF. For regression, the % IncMSE was used as a parameter for determining the importance of each mRNA, and the same iterative process was undertaken.
Other statistical methods. SPSS 18.0 software was used for all additional statistical analyses. The Kolmogorov-Smirnov test was used to determine the normality of the data. Numeric results subjected to normal distribution were presented as the mean ± SD. Non-normal numeric results were presented as quartiles. Student's t-test was applied to compare the means of normal data. Otherwise, the Mann-Whitney test was applied. The correlation between gene-expression levels and pathological parameters was analysed by Spearman's rank-order test. The diagnostic performance of single biomarker was evaluated by generating ROC curves. The AUC was used to assess the overall discriminative power. An AUC of 0.6-0.7 was considered as poor, 0.7-0.8 was moderate, 0.8-0.9 was good, and > 0.9 was excellent. The best cut-offs were determined by selecting the data points that maximized the sum of specificity and sensitivity on the ROC curve. Two-tailed P values of < 0.05 were considered statistically significant.  HAVCR1  HGF  ICAM1  IGF1  IL10  IL18  IL1B  IL6  IL8  ILK  LCN2  MMP2 MMP7 MMP9 NFKB2  NLRP3   NPHS1  NPHS2 PDGFA  PDGFB  PLAUR  PODXL REN S100A4 SERPINE1 SMAD2 SMAD3 SMAD4 SMAD7 SNAI1 SNAI2 ST6GALNAC2   SYNPO  TF  TFRC  TGFB1  TGFB2  TIMP1 TIMP2  TNF  TNFSF13  TP53 TWIST1 VEGFA  VIM   Table 3. A display of 61 genes in the targeted microarray.