Introduction

Nasopharyngeal carcinoma (NPC) is an endemic head and neck malignancy in southern China1. The main treatment for NPC is radiotherapy2. Radiation fractions and prescription doses are related to tumor stages determined by medical imaging at diagnosis3. Tumor stages are also important prognostic factors. However, even patients with the same stage may have significantly different treatment responses and prognoses. Increasingly -omics studies of gene and protein patterns4,5 are being conducted to provide a deeper understanding of NPC tumor characteristics and to provide decision-making guidance for individualized treatment.

As a routine noninvasive practice, medical imaging can be repeatedly performed in patients for tumor diagnosis and treatment estimation. However, the vast amount of information contained within imaging data need to be further explored. Radiomics is the data mining process that extracts high-throughput features from digital medical images in the region of interest (ROI) with automated algorithms6. These quantitative feature have shown validity for predicting tumor treatment responses7 and patient prognosis8,9.

Magnetic resonance imaging (MRI) has superior soft tissue contrast to computed tomography (CT). The proper selection and combination of a panel of features as a signature has yielded great value for NPC prognosis estimation10,11. However, there is a high prevalence of lymph node metastasis among NPC cases12. Lymph node status is also important for NPC staging, treatment design and prognosis. To the best of our knowledge, no study has reported an MRI-based radiomics analysis based on both primary and neck lymph node metastatic lesions in NPC patients.

Our study aimed to investigate the prognostic value of the radiomics features of primary and neck lymph node metastasis lesions on MRI and to develop and validate a radiomics signature for estimating NPC prognosis after evaluating feature reliability.

Results

Patient characteristics and follow-up

The patient characteristics are shown in Table 1. Fifty-eight patients died or experienced at least one disease relapse during follow-up. The disease relapse and death positivity were 20% and 17.5% in the training and validation cohorts. No differences in patient clinical features were observed between the training and validation cohorts, as shown in Data Supplement Table 1.

Table 1 Patient characteristics of the training and validation cohorts.

Radiomics analysis on primary lesions

In cluster analysis, patients were clustered into two groups of 164 and 139 patients by NMF and the details are provided in Data Supplement Table 2. Clustering results are shown as a heatmap in Fig. 1, with 303 patients on the x-axis and the expression of 208 radiomics features on the y-axis. The feature heatmap showed that patients within the same cluster displayed similar radiomics patterns. The cluster showed high correspondence with patients’ T stage (p < 0.00001) and overall stage (p < 0.00001) but not N stage (p = 0.372) according to chi-squared tests.

Figure 1
figure 1

Three hundred and three patients were clustered into two groups according to their radiomics patterns by NMF. (A) The clustering results are shown as the heatmap, with 303 patients on the x-axis and the expression of 208 radiomics features on the y-axis. The feature heatmap showed that patients within the same cluster expressed similar radiomics patterns. The clusters showed high correspondence with patients’ T stage (p < 0.00001) and overall stage (p < 0.00001) but not N stage (p = 0.372) according to chi-squared tests. Kaplan-Meier survival curves were constructed for the two NMF clustered groups at each endpoint: (B) disease free-survival (DFS); (C) overall survival (OS); (D) distant metastasis-free survival (DMFS) and (E) locoregional recurrence-free survival (LRFS). P values are based on log-rank tests.

Kaplan-Meier survival curves of the two NMF clustered groups for DFS, OS, DMFS and LRFS are also shown in Fig. 1. The log-rank tests showed significant difference in DFS (p = 0.0052), OS (p = 0.033) and LRFS (p = 0.037), but not in DMFS (p = 0.11) between the two groups.

After primary feature selection based on contour reproducibility and nonredundancy, 13 radiomics features were selected according to our criteria for the subsequent construction of prognostic models; the selection results are shown in Data Supplement Table 3.

Details of the prognostic model-building and the respective 10-fold cross-validation results are shown as Data Supplement Tables 4 and 5. The C-index values of all the prognostic models for both training and validation cohorts are presented in Table 2.

Table 2 Prognostic model analysis result

The radiomics-based prognostic models were superior in OS prediction to the clinical-based prognostic models in both the training and validation groups. For the models that combined radiomics and clinical features, the C-index was 0.751 [0.639, 0.863] (0.736 [0.665, 0.807]) for DFS and 0.845 [0.752, 0.939] (0.717 [0.624, 0.811]) for OS in the validation (training) cohort. The data also demonstrated the capability of improving the predictive power when combining radiomics and clinical features. That indicated that radiomics features provide additional prognostic information for DFS and OS prediction. All the C-index values of the DMFS models were less than 0.65 in the validation cohort. LASSO-Cox regression failed to identify any radiomics-related features to establish a prognostic model for LRFS.

As the validation of the radiomics signature, the nomogram of the DFS signature and the OS signature are shown in Data Supplement Figs 1 and 2. Five radiomics related features were selected in the DFS signature and two were selected in the OS signature (shown in Data Supplement). The radiomics features of “LL_HIST.kurtosis” and “HL_GLCM.information_Measure_II” contributed to both DFS and OS signatures. “LL_HIST.kurtosis” describes the peak of the tumor image intensity histogram. The gray level co-occurrence matrix (GLCM) represents the joint probability of two particular gray levels within the matrix, and “HL_GLCM.information_Measure_II” describes the inner GLCM linearity dependence after wavelet transformation of the tumor image13.

The median DFS risk score (the calculation formula is shown in the Data Supplement) was −16.66 in the training cohort. A higher DFS risk score indicated higher DFS risk. The patients in the validation cohort was divided into a high-risk group of 44 patients and a low-risk group of 59 patients according this threshold. The disease progression positivity was 4.5% in the low-risk group versus 27.1% in the high risk group. There were significant differences between the DFS of the high-risk and low-risk patients according to the log-rank tests in the stratified survival analyses, which are shown in Fig. 2.

Figure 2
figure 2

Patients in the validation cohort were divided into a high-risk group and a low-risk group according to their DFS risk score based on a threshold value of −16.66. Kaplan-Meier survival curves of the high- and low-risk patients were constructed on the (A) entire validation cohort and in the subgroups stratified by (B) overall stage, (C) gender and (D) age.

Radiomics analysis on neck lymph-node metastatic lesions

Patients were clustered into two groups of 227 and 67 patients by NMF according to their radiomics patterns in lymph node metastatic lesions. The chi-squared tests revealed that the clustering results were significantly related to patients’ N stage (p < 0.0001) and overall stage (p < 0.0001), but not T stage (p = 0.315), as shown in Data Supplement Table 2. Kaplan-Meier survival curves constructed by the NMF clustering results and prognostic model analysis results are shown in Data Supplement Fig. 4 and Data Supplement Table 6. No valuable prognostic information was available through this radiomics analysis focused on neck lymph node metastatic lesions.

Discussion

MRI has become the main imaging protocol for NPC diagnosis because soft tissue contrast on MRI is superior to that on CT. The MRI-based radiomics features capture tumor characteristics in a quantitative form, and several previous works have proven the utility of such features for diagnosis14,15, prediction of treatment responses16,17 and prognosis estimation18. In this study, the prognostic values of MRI-based radiomics features were explored through different survival analysis methods, and the results demonstrated the prognostic value of these quantitative features.

Radiomics features were primarily selected before prognostic analysis, which focused on contour reproducibility and nonredundancy. Radiomics-related studies usually involve hundreds of features. This high-dimensional feature matrix requires a reasonable test of feature reliability for quality assurance and dimension reduction to avoid redundancy before being integrated as a prognostic signature19. Balagurunathan et al.20 proposed three indicators for radiomics feature selection, which were reproducibility, informativeness and nonredundancy. The features with high reproducibility provided high robustness and were considered to be reliable information for model building21.

In the radiomics analysis of primary lesions, the clustering results showed significant correlations with patients’ T stage and overall stage but not N stage. The radiomics features in our study and patients’ T stage are both extensions of primary tumor imaging information. However, MRI-based radiomics features capture the spatial heterogeneity information inside the tumor22, while T staging is based on the distance and position of tumor invasion. The relationship between a patient’s radiomics pattern and clinical stage illustrates that the intratumoral heterogeneity generated by cellular and genetic variety can reflect the tumor malignancy level23. DMFS is considered more relevant to patient N stage24, which is determined based on metastatic lymph nodes. This observation is consistent with the result that the DMFS of patients was not significantly different between the two clustered groups in this study. Patients in the two clustered groups displayed different radiomics patterns and had significantly different DFS, OS, and LRFS in the clustering analysis. These differences illustrate the potential of certain radiomics pattern to supplement the tumor staging system25 as precise staging is crucial for NPC treatment design and patient prognosis.

The radiomics signatures for DFS and OS developed in our study demonstrated effective prognostic estimation. Compared with clinical feature-based models, these signatures also showed an improvement for both DFS for OS in their discrimination performances. This improvement demonstrates that spatial heterogeneity inside the tumor provides complementary prognostic information. Stratified survival analyses in the validation cohort, as an application of the DFS radiomics signature, enhanced the accuracy of prognostic prediction after evaluating the patient risk from the signature. Therefore, the DFS signature can help physicians to reasonably adjust the treatment strategy and improve the quality of health care8. Two radiomics features, “LL_HIST.kurtosis” and “HL_GLCM.information_Measure_II” were shared in both the DFS and OS radiomics signatures. They captured essential tumor characteristics and might be general prognostic indicators. Parmar et al.26 studied CT-based radiomics features and found that their radiomics clusters could express the cancer phenotypic characteristics across different cancer types. Patients with higher “LL_HIST.kurtosis” and lower “HL_GLCM.information_Measure_II” presented higher risk of DFS and OS. Both higher “LL_HIST.kurtosis” and lower “HL_GLCM.information_Measure_II” suggest the dense and fine textures of high signals in T1-weighted contrast-enhanced MRI, which are usually indicative of vessels. This agrees with the theory that highly vascular tumors are more likely to progress and lead to poorer prognosis because of the abundant nutrition supplied by the vessels27,28. Therefore, integrating MRI-based radiomics with genomics studies21 could reveal the underlying mechanisms of tumor growth, such as regulation of angiogenesis.

No additional prognostic information from radiomics features of neck lymph node metastatic lesions was obtained through our approach. Lymph node metastasis lesions generally consist of several smaller positive nodes. These small fragments may cause difficulty in accurate feature calculation because of the limited imaging contrast, which is recognized as partial volume effect29. The poor prognostic performance of the features extracted from positive cervical lymph nodes also corresponds to the clinical consensus that particular nodal stations and side distributions are considered to contain more prognostic information12.

One limitation of our study was that no radiomics feature was available in LASSO-Cox shrinkage for LRFS estimation, although the clustering results showed a significant difference (p = 0.0037) in patients’ LRFS. This failure was likely caused by an insufficient number of events during follow-up. Therefore, other appropriate feature selection methods are expected for LRFS model building based on larger sample sizes. Moreover, the radiomics signature generated in this study was based on patients from the same clinical center, and external validation is warranted to assess the predictive power30. A larger-scale patient dataset for validation can also reduce the false discovery rate of the radiomics based models31.

In conclusion, this study demonstrated the utility of radiomics signatures from routine MRI in the evaluation of DFS and OS for NPC patients. Our findings may represent a step toward guiding individualized treatment options and improving health care quality.

Methods

Study design

Fudan University Shanghai Cancer Center Institutional Review Board approved this retrospective study and waived the requirement to obtain informed consent. All methods were performed in accordance with the guidelines and regulations of this ethics board.

The workflow of this study is shown in Fig. 3. A total of 303 patients were enrolled. After tumor segmentation, 208 radiomics features were separately extracted from the patient’s primary or neck lymph-node metastasis lesions. Then, an unsupervised cluster analysis was implemented to study the radiomics patterns of the 303 patients. After primary feature selection for feature reduction, the prognostic models were built. Disease-free survival (DFS), overall survival (OS), distant metastasis-free survival (DMFS) and locoregional recurrence-free survival (LRFS) were studied as survival outcomes.

Figure 3
figure 3

Study work flow. A total of 303 patients were enrolled in this study. The unsupervised cluster analysis was implemented to study the radiomics patterns of the total 303 patients. Primary feature selection, prognostic model building and construction of the radiomics signature were implemented based on the training cohort (n = 200). A validation cohort of 103 patients was used for model validation.

Patients

All patients who underwent MRI scans for NPC pretreatment diagnosis and subsequent treatment at our center from January 2010 to February 2012 were enrolled in this retrospective study (226 men and 77 women; mean age, 48.8 ± 12.7 years; range, 11 to 80 years). Clinical staging of the tumor was performed based on the 7th edition of the American Joint Committee on Cancer (AJCC) TNM staging system32. All patients were free of distant metastases (M0) before treatment. The patients’ T stage, N stage, sex, age and digital MRI data were collected from the medical records. The patient inclusion and exclusion criteria are shown in the Data Supplement.

MRI acquisition, ROI segmentation and radiomics feature extraction methodology

All patients underwent a 1.5 T MRI scan as diagnostic imaging. Transversal contrast-enhanced T1-weighted Digital Imaging and Communications in Medicine (DICOM) images were studied in this radiomics analysis. The detailed MRI acquisition parameters are described in the Data Supplement.

DICOM images of the patients’ MRI were gathered and imported into MIM (version 6.6; MIM Software Inc. Cleveland, OH) for manual segmentation. The primary and lymph-node metastasis lesions of all 303 patients were contoured by an oncologist with 5 years of experience. Nineteen patients were randomly selected and their lesions were recontoured by another oncologist for the contour reproducibility study. All segmentation was checked by a senior oncologist experienced in NPC.

After tumor segmentation, 208 radiomics features were extracted from the contrast-enhanced T1-weighted MRI data by an in-house algorithm called “QIAT” in MATLAB R2015a (The MathWorks Inc, Natick, MA) for each patient. The radiomics features included image intensity histogram analysis (10), texture analysis (31), wavelet analysis21 (164) and fractal analysis33 (3). The feature extraction methodology has been described in detail in the Data Supplement. The primary and lymph-node metastatic lesions served as two different ROIs and the radiomics features were separately extracted from the two.

Cluster analysis

Non-negative matrix factorization (NMF) is widely used for the clustering of high-dimensional datasets in computational biology34. After feature extraction, NMF was used to cluster 303 patients into two groups according to their radiomics feature patterns. Details are provided in the Data Supplement. Then, the chi-squared test was used to study the relationship between tumor stage and the clustering results. Kaplan-Meier survival curves for DFS, OS, DMFS and LRFS were also constructed for the two clustered groups. The log-rank test was used to study the significance of the differences among the clustered groups for each endpoint.

Prognostic model analysis

Patients were first segregated according to their pretreatment diagnosis date into a training cohort and a validation cohort. The training cohort of 200 patients was used to build and optimize the prognostic model, and a validation cohort of 103 people was used to validate the estimation power of the model. The Mann-Whitney U test was used to study the differences in clinical features between the training and validation cohorts.

Primary feature selection on contour reproducibility and nonredundancy was implemented before building the prognostic models. To study contour reproducibility, 19 randomly selected patients in the training cohort were contoured independently by two oncologists. The intra-class correlation coefficient (ICC2) was calculated by the radiomics features extracted from both segmentations to evaluate the segmentation reproducibility35. Features with ICC2 > 0.836 were selected. Redundant features with pairwise correlation were removed by the “findCorrelation” function with a cutoff of 0.5 using the “caret” packages in R for nonredundancy selection.

After primary feature selection, a prognostic model for each endpoint was built by the least absolute shrinkage and selection operator (LASSO) Cox regression37 using the training cohort. When building a model, LASSO shrinks the algebraic sum of the feature coefficients into a penalty parameter “lambda”. Therefore, some feature coefficients were reduced into zero to achieve the minimum lambda. The leave-one-out cross validation (LOOCV) method was used to optimize the penalty parameter “lambda” for LASSO because LOOCV achieves a thorough data mining and provides an almost unbiased estimator38. Additionally, the widely recommended “lambda.1se”39 was used as the penalty parameter of LASSO to establish a concise prediction model. Further 10-fold cross-validation was also performed to achieve a steady result.

To study the incremental prognostic value of radiomics features, two other prognostic models were built using the same methods: one based on clinical features (T stage, N stage, age and sex) only and the other based on a combination of both radiomics and clinical features. The Harrell concordance index (C-index)40 and 95% confidence intervals (CIs) for the respective models were calculated to assess the estimation performance of the models in both the training and validation cohorts.

Construction and validation of the radiomics signature

The radiomics signature for each endpoint combined both clinical features and the radiomics features extracted from the primary lesions. The signatures were generated by LASSO for the training cohort after summing the radiomics score. The radiomics score was calculated by summing up the non-zero radiomics features according to their coefficient weights. A nomogram was constructed for respective signatures based on the validation cohort.

The DFS risk score of the patients in the training cohort were evaluated according to the DFS radiomics signature. The median value of their DFS risk score was applied as a threshold to divide the patients on the training cohort into a high-risk group or low-risk group. Stratified analyses were implemented to compare the high-risk and low-risk patients’ DFS in various subgroups.

All statistical analyses were conducted with R software (version 3.3.1; http://www.Rproject.org), and all related analysis packages are listed in the Data Supplement Table 7. Statistical significance levels are indicated by two-sided p values with α set at 0.05.