MRI-based radiomics signature is a quantitative prognostic biomarker for nasopharyngeal carcinoma

This study aimed to develop prognosis signatures through a radiomics analysis for patients with nasopharyngeal carcinoma (NPC) by their pretreatment diagnosis magnetic resonance imaging (MRI). A total of 208 radiomics features were extracted for each patient from a database of 303 patients. The patients were split into the training and validation cohorts according to their pretreatment diagnosis date. The radiomics feature analysis consisted of cluster analysis and prognosis model analysis for disease free-survival (DFS), overall survival (OS), distant metastasis-free survival (DMFS) and locoregional recurrence-free survival (LRFS). Additionally, two prognosis models using clinical features only and combined radiomics and clinical features were generated to estimate the incremental prognostic value of radiomics features. Patients were clustered by non-negative matrix factorization (NMF) into two groups. It showed high correspondence with patients’ T stage (p < 0.00001) and overall stage information (p < 0.00001) by chi-squared tests. There were significant differences in DFS (p = 0.0052), OS (p = 0.033), and LRFS (p = 0.037) between the two clustered groups but not in DMFS (p = 0.11) by log-rank tests. Radiomics nomograms that incorporated radiomics and clinical features could estimate DFS with the C-index of 0.751 [0.639, 0.863] and OS with the C-index of 0.845 [0.752, 0.939] in the validation cohort. The nomograms improved the prediction accuracy with the C-index value of 0.029 for DFS and 0.107 for OS compared with clinical features only. The DFS and OS radiomics nomograms developed in our study demonstrated the excellent prognostic estimation for NPC patients with a noninvasive way of MRI. The combination of clinical and radiomics features can provide more information for precise treatment decision.

Our study aimed to investigate the prognostic value of the radiomics features of primary and neck lymph node metastasis lesions on MRI and to develop and validate a radiomics signature for estimating NPC prognosis after evaluating feature reliability.

Results
Patient characteristics and follow-up. The patient characteristics are shown in Table 1. Fifty-eight patients died or experienced at least one disease relapse during follow-up. The disease relapse and death positivity were 20% and 17.5% in the training and validation cohorts. No differences in patient clinical features were observed between the training and validation cohorts, as shown in Data Supplement Table 1.

Radiomics analysis on primary lesions.
In cluster analysis, patients were clustered into two groups of 164 and 139 patients by NMF and the details are provided in Data Supplement Table 2. Clustering results are shown as a heatmap in Fig. 1, with 303 patients on the x-axis and the expression of 208 radiomics features on the y-axis. The feature heatmap showed that patients within the same cluster displayed similar radiomics patterns. The cluster showed high correspondence with patients' T stage (p < 0.00001) and overall stage (p < 0.00001) but not N stage (p = 0.372) according to chi-squared tests.
Kaplan-Meier survival curves of the two NMF clustered groups for DFS, OS, DMFS and LRFS are also shown in Fig. 1. The log-rank tests showed significant difference in DFS (p = 0.0052), OS (p = 0.033) and LRFS (p = 0.037), but not in DMFS (p = 0.11) between the two groups.
After primary feature selection based on contour reproducibility and nonredundancy, 13 radiomics features were selected according to our criteria for the subsequent construction of prognostic models; the selection results are shown in Data Supplement Table 3.
Details of the prognostic model-building and the respective 10-fold cross-validation results are shown as Data Supplement Tables 4 and 5. The C-index values of all the prognostic models for both training and validation cohorts are presented in Table 2.  for OS in the validation (training) cohort. The data also demonstrated the capability of improving the predictive power when combining radiomics and clinical features. That indicated that radiomics features provide additional prognostic information for DFS and OS prediction. All the C-index values of the DMFS models were less than 0.65 in the validation cohort. LASSO-Cox regression failed to identify any radiomics-related features to establish a prognostic model for LRFS.
As the validation of the radiomics signature, the nomogram of the DFS signature and the OS signature are shown in Data Supplement Figs 1 and 2. Five radiomics related features were selected in the DFS signature and two were selected in the OS signature (shown in Data Supplement). The radiomics features of "LL_HIST.kurtosis" and "HL_GLCM.information_Measure_II" contributed to both DFS and OS signatures. "LL_HIST.kurtosis" describes the peak of the tumor image intensity histogram. The gray level co-occurrence matrix (GLCM) represents the joint probability of two particular gray levels within the matrix, and "HL_GLCM.information_Measure_ II" describes the inner GLCM linearity dependence after wavelet transformation of the tumor image 13 .
The median DFS risk score (the calculation formula is shown in the Data Supplement) was −16.66 in the training cohort. A higher DFS risk score indicated higher DFS risk. The patients in the validation cohort was divided into a high-risk group of 44 patients and a low-risk group of 59 patients according this threshold. The disease progression positivity was 4.5% in the low-risk group versus 27.1% in the high risk group. There were significant differences between the DFS of the high-risk and low-risk patients according to the log-rank tests in the stratified survival analyses, which are shown in Fig. 2.
Radiomics analysis on neck lymph-node metastatic lesions. Patients were clustered into two groups of 227 and 67 patients by NMF according to their radiomics patterns in lymph node metastatic lesions. The chi-squared tests revealed that the clustering results were significantly related to patients' N stage (p < 0.0001) and overall stage (p < 0.0001), but not T stage (p = 0.315), as shown in Data Supplement Table 2. Kaplan-Meier survival curves constructed by the NMF clustering results and prognostic model analysis results are shown in Data Supplement Fig. 4 and Data Supplement Table 6. No valuable prognostic information was available through this radiomics analysis focused on neck lymph node metastatic lesions.

Discussion
MRI has become the main imaging protocol for NPC diagnosis because soft tissue contrast on MRI is superior to that on CT. The MRI-based radiomics features capture tumor characteristics in a quantitative form, and several previous works have proven the utility of such features for diagnosis 14,15 , prediction of treatment responses 16,17 and prognosis estimation 18 . In this study, the prognostic values of MRI-based radiomics features were explored through different survival analysis methods, and the results demonstrated the prognostic value of these quantitative features.
Radiomics features were primarily selected before prognostic analysis, which focused on contour reproducibility and nonredundancy. Radiomics-related studies usually involve hundreds of features. This high-dimensional feature matrix requires a reasonable test of feature reliability for quality assurance and dimension reduction to avoid redundancy before being integrated as a prognostic signature 19 . Balagurunathan et al. 20 proposed three indicators for radiomics feature selection, which were reproducibility, informativeness and nonredundancy. The features with high reproducibility provided high robustness and were considered to be reliable information for model building 21 .
In the radiomics analysis of primary lesions, the clustering results showed significant correlations with patients' T stage and overall stage but not N stage. The radiomics features in our study and patients' T stage are both extensions of primary tumor imaging information. However, MRI-based radiomics features capture the spatial heterogeneity information inside the tumor 22 , while T staging is based on the distance and position of tumor invasion. The relationship between a patient's radiomics pattern and clinical stage illustrates that the intratumoral heterogeneity generated by cellular and genetic variety can reflect the tumor malignancy level 23 . DMFS is considered more relevant to patient N stage 24 , which is determined based on metastatic lymph nodes. This observation is consistent with the result that the DMFS of patients was not significantly different between the two clustered  www.nature.com/scientificreports www.nature.com/scientificreports/ groups in this study. Patients in the two clustered groups displayed different radiomics patterns and had significantly different DFS, OS, and LRFS in the clustering analysis. These differences illustrate the potential of certain radiomics pattern to supplement the tumor staging system 25 as precise staging is crucial for NPC treatment design and patient prognosis.
The radiomics signatures for DFS and OS developed in our study demonstrated effective prognostic estimation. Compared with clinical feature-based models, these signatures also showed an improvement for both DFS for OS in their discrimination performances. This improvement demonstrates that spatial heterogeneity inside the tumor provides complementary prognostic information. Stratified survival analyses in the validation cohort, as an application of the DFS radiomics signature, enhanced the accuracy of prognostic prediction after evaluating the patient risk from the signature. Therefore, the DFS signature can help physicians to reasonably adjust the treatment strategy and improve the quality of health care 8 . Two radiomics features, "LL_HIST.kurtosis" and "HL_GLCM.information_Measure_II" were shared in both the DFS and OS radiomics signatures. They captured essential tumor characteristics and might be general prognostic indicators. Parmar et al. 26 studied CT-based radiomics features and found that their radiomics clusters could express the cancer phenotypic characteristics across different cancer types. Patients with higher "LL_HIST.kurtosis" and lower "HL_GLCM.information_Measure_ II" presented higher risk of DFS and OS. Both higher "LL_HIST.kurtosis" and lower "HL_GLCM.information_ Measure_II" suggest the dense and fine textures of high signals in T1-weighted contrast-enhanced MRI, which are usually indicative of vessels. This agrees with the theory that highly vascular tumors are more likely to progress and lead to poorer prognosis because of the abundant nutrition supplied by the vessels 27,28 . Therefore, integrating MRI-based radiomics with genomics studies 21 could reveal the underlying mechanisms of tumor growth, such as regulation of angiogenesis.
No additional prognostic information from radiomics features of neck lymph node metastatic lesions was obtained through our approach. Lymph node metastasis lesions generally consist of several smaller positive nodes. These small fragments may cause difficulty in accurate feature calculation because of the limited imaging contrast, which is recognized as partial volume effect 29 . The poor prognostic performance of the features extracted from positive cervical lymph nodes also corresponds to the clinical consensus that particular nodal stations and side distributions are considered to contain more prognostic information 12 . www.nature.com/scientificreports www.nature.com/scientificreports/ One limitation of our study was that no radiomics feature was available in LASSO-Cox shrinkage for LRFS estimation, although the clustering results showed a significant difference (p = 0.0037) in patients' LRFS. This failure was likely caused by an insufficient number of events during follow-up. Therefore, other appropriate feature selection methods are expected for LRFS model building based on larger sample sizes. Moreover, the radiomics signature generated in this study was based on patients from the same clinical center, and external validation is warranted to assess the predictive power 30 . A larger-scale patient dataset for validation can also reduce the false discovery rate of the radiomics based models 31 .
In conclusion, this study demonstrated the utility of radiomics signatures from routine MRI in the evaluation of DFS and OS for NPC patients. Our findings may represent a step toward guiding individualized treatment options and improving health care quality.

Methods
Study design. Fudan University Shanghai Cancer Center Institutional Review Board approved this retrospective study and waived the requirement to obtain informed consent. All methods were performed in accordance with the guidelines and regulations of this ethics board.
The workflow of this study is shown in Fig. 3. A total of 303 patients were enrolled. After tumor segmentation, 208 radiomics features were separately extracted from the patient's primary or neck lymph-node metastasis lesions. Then, an unsupervised cluster analysis was implemented to study the radiomics patterns of the 303 patients. After primary feature selection for feature reduction, the prognostic models were built. Disease-free survival (DFS), overall survival (OS), distant metastasis-free survival (DMFS) and locoregional recurrence-free survival (LRFS) were studied as survival outcomes.
Patients. All patients who underwent MRI scans for NPC pretreatment diagnosis and subsequent treatment at our center from January 2010 to February 2012 were enrolled in this retrospective study (226 men and 77 women; mean age, 48.8 ± 12.7 years; range, 11 to 80 years). Clinical staging of the tumor was performed based on the 7th edition of the American Joint Committee on Cancer (AJCC) TNM staging system 32 . All patients were free of distant metastases (M0) before treatment. The patients' T stage, N stage, sex, age and digital MRI data were collected from the medical records. The patient inclusion and exclusion criteria are shown in the Data Supplement.
MRI acquisition, ROI segmentation and radiomics feature extraction methodology. All patients underwent a 1.5 T MRI scan as diagnostic imaging. Transversal contrast-enhanced T1-weighted Digital Imaging and Communications in Medicine (DICOM) images were studied in this radiomics analysis. The detailed MRI acquisition parameters are described in the Data Supplement.
DICOM images of the patients' MRI were gathered and imported into MIM (version 6.6; MIM Software Inc. Cleveland, OH) for manual segmentation. The primary and lymph-node metastasis lesions of all 303 patients were contoured by an oncologist with 5 years of experience. Nineteen patients were randomly selected and their lesions were recontoured by another oncologist for the contour reproducibility study. All segmentation was checked by a senior oncologist experienced in NPC.
After tumor segmentation, 208 radiomics features were extracted from the contrast-enhanced T1-weighted MRI data by an in-house algorithm called "QIAT" in MATLAB R2015a (The MathWorks Inc, Natick, MA) for each patient. The radiomics features included image intensity histogram analysis (10), texture analysis (31), wavelet analysis 21 (164) and fractal analysis 33 (3). The feature extraction methodology has been described in detail in the Data Supplement. The primary and lymph-node metastatic lesions served as two different ROIs and the radiomics features were separately extracted from the two. www.nature.com/scientificreports www.nature.com/scientificreports/ Cluster analysis. Non-negative matrix factorization (NMF) is widely used for the clustering of high-dimensional datasets in computational biology 34 . After feature extraction, NMF was used to cluster 303 patients into two groups according to their radiomics feature patterns. Details are provided in the Data Supplement. Then, the chi-squared test was used to study the relationship between tumor stage and the clustering results. Kaplan-Meier survival curves for DFS, OS, DMFS and LRFS were also constructed for the two clustered groups. The log-rank test was used to study the significance of the differences among the clustered groups for each endpoint.
Prognostic model analysis. Patients were first segregated according to their pretreatment diagnosis date into a training cohort and a validation cohort. The training cohort of 200 patients was used to build and optimize the prognostic model, and a validation cohort of 103 people was used to validate the estimation power of the model. The Mann-Whitney U test was used to study the differences in clinical features between the training and validation cohorts.
Primary feature selection on contour reproducibility and nonredundancy was implemented before building the prognostic models. To study contour reproducibility, 19 randomly selected patients in the training cohort were contoured independently by two oncologists. The intra-class correlation coefficient (ICC2) was calculated by the radiomics features extracted from both segmentations to evaluate the segmentation reproducibility 35 . Features with ICC2 > 0.8 36 were selected. Redundant features with pairwise correlation were removed by the "findCorrelation" function with a cutoff of 0.5 using the "caret" packages in R for nonredundancy selection.
After primary feature selection, a prognostic model for each endpoint was built by the least absolute shrinkage and selection operator (LASSO) Cox regression 37 using the training cohort. When building a model, LASSO shrinks the algebraic sum of the feature coefficients into a penalty parameter "lambda". Therefore, some feature coefficients were reduced into zero to achieve the minimum lambda. The leave-one-out cross validation (LOOCV) method was used to optimize the penalty parameter "lambda" for LASSO because LOOCV achieves a thorough data mining and provides an almost unbiased estimator 38 . Additionally, the widely recommended "lambda.1se" 39 was used as the penalty parameter of LASSO to establish a concise prediction model. Further 10-fold cross-validation was also performed to achieve a steady result.
To study the incremental prognostic value of radiomics features, two other prognostic models were built using the same methods: one based on clinical features (T stage, N stage, age and sex) only and the other based on a combination of both radiomics and clinical features. The Harrell concordance index (C-index) 40 and 95% confidence intervals (CIs) for the respective models were calculated to assess the estimation performance of the models in both the training and validation cohorts.
Construction and validation of the radiomics signature. The radiomics signature for each endpoint combined both clinical features and the radiomics features extracted from the primary lesions. The signatures were generated by LASSO for the training cohort after summing the radiomics score. The radiomics score was calculated by summing up the non-zero radiomics features according to their coefficient weights. A nomogram was constructed for respective signatures based on the validation cohort.
The DFS risk score of the patients in the training cohort were evaluated according to the DFS radiomics signature. The median value of their DFS risk score was applied as a threshold to divide the patients on the training cohort into a high-risk group or low-risk group. Stratified analyses were implemented to compare the high-risk and low-risk patients' DFS in various subgroups.
All statistical analyses were conducted with R software (version 3.3.1; http://www.Rproject.org), and all related analysis packages are listed in the Data Supplement Table 7. Statistical significance levels are indicated by two-sided p values with α set at 0.05.