Radiomics-based model for predicting early recurrence of intrahepatic mass-forming cholangiocarcinoma after curative tumor resection

To investigate the ability of CT-based radiomics signature for pre-and postoperatively predicting the early recurrence of intrahepatic mass-forming cholangiocarcinoma (IMCC) and develop radiomics-based prediction models. Institutional review board approved this study. Clinicopathological characteristics, contrast-enhanced CT images, and radiomics features of 125 IMCC patients (35 with early recurrence and 90 with non-early recurrence) were retrospectively reviewed. In the training set of 92 patients, preoperative model, pathological model, and combined model were developed by multivariate logistic regression analysis to predict the early recurrence (≤ 6 months) of IMCC, and the prediction performance of different models were compared using the Delong test. The developed models were validated by assessing their prediction performance in test set of 33 patients. Multivariate logistic regression analysis identified solitary, differentiation, energy- arterial phase (AP), inertia-AP, and percentile50th-portal venous phase (PV) to construct combined model for predicting early recurrence of IMCC [the area under the curve (AUC) = 0.917; 95% CI 0.840–0.965]. While the AUC of pathological model and preoperative model were 0.741 (95% CI 0.637–0.828) and 0.844 (95% CI 0.751–0.912), respectively. The AUC of the combined model was significantly higher than that of the preoperative model (p = 0.049) or pathological model (p = 0.002) in training set. In test set, the combined model also showed higher prediction performance. CT-based radiomics signature is a powerful predictor for early recurrence of IMCC. Preoperative model (constructed with homogeneity-AP and standard deviation-AP) and combined model (constructed with solitary, differentiation, energy-AP, inertia-AP, and percentile50th-PV) can improve the accuracy for pre-and postoperatively predicting the early recurrence of IMCC.

www.nature.com/scientificreports/ Thus, there is a need to identify high-risk early recurrence patients with IMCC so that a more aggressive surgery as well as a strict follow-up protocol can be established. Previous studies have suggested that clinicopathological variables, including preoperative carbohydrate antigen 19-9 (CA19-9) elevation, obstructive jaundice, tumor size, number of lesions, satellite lesions, lymph node metastases, perineural invasion, and macrovascular invasion were significantly associated with recurrence of IMCC patients after curative resection, but failed to elucidate predictive performance in clinical applications and some of these predictors can only be evaluated with postoperative pathological examination 4,[7][8][9][10] .
Radiomics is gaining importance in personalized cancer treatment. It involves the extraction of quantitative features from digital medical images that enables mineable high-dimensional data to be applied in clinical decision support to provide improved diagnostic, prognostic, and predictive accuracy [11][12][13][14][15][16] . Science prior studies have reported the high prediction performance (AUC ≥ 0.80) of CT-based radiomics models to predict early recurrence of HCC 17,18 , we hypothesized that CT-based radiomics could improve the accuracy for predicting the early recurrence of IMCC.
Thus, the aim of this study was to investigate the ability of CT-based radiomics signature for pre-and postoperatively predicting the early recurrence of IMCC and develop radiomics-based prediction models.

Materials and methods
The study protocol was in complies with the Declaration of Helsinki and acts in accordance to ICH GCP guidelines. Institutional review board of Nanjing Drum Tower Hospital, the affiliated hospital of Nanjing University Medical School approved this study and explicitly waived the informed consent due to its retrospective nature. It also clarified that authors had access to identifying patient information when analyzing the data.
Patients. From June 2012 to June 2020, a total of 249 patients with a clinical diagnosis of intrahepatic cholangiocarcinoma (ICC) were reviewed. The inclusion criteria were: (a) initial curative hepatectomy with a pathologically confirmed diagnosis of IMCC according to the 2010 World Health Organization classification; (b) preoperative contrast-enhanced CT performed within one month before surgery; (c) without any local or systematic treatment such as transcatheter arterial chemoembolization, percutaneous ethanol injection, radiofrequency ablation, radiotherapy or chemotherapy before CT examination. The exclusion criteria were: (a) with a history of malignancy or combined with other malignancy; (b) with tumor-positive resection margin; (c) those who were followed up for less than six months. The remaining 125 patients, including 67 men and 58 women, with a median age of 56 years (range 34-79) served as our study population (Fig. 1).
CT examination. Each patient underwent plain and dynamic contrast-enhanced CT scans on a multidetector CT scanner (Lightspeed, VCT, or Discovery HD750, GE Healthcare, US). The scanning parameters are the same as the details in the previous research of our team 19 . The medium interval between CT examination and surgery was 9 days (range 4-23).

Region-of-interest segmentation and radiomics features extraction. The region of interest (ROI)
segmentation were the same as the steps described in previous study 20 . Two radiologists (Z.Y. and M.Y.F., with 7 and 5 years' experience in abdominal radiology, respectively) manually drew along the margin of the tumor independently. The software automatically read the CT value of each pixel within the volume of interest (VOI) and generated 87 parameters as follows: (1) shape-based features describing three-dimensional size and tumor shape (n = 8); (2) the first-order statistics features describing the distribution of pixel intensity within the VOI (n = 14); (3) the texture features assessing tumor heterogeneity by analyzing the distribution and relationship of pixel or voxel gray levels in the VOI (n = 65). Radiomics features were selected by using a parametric method, www.nature.com/scientificreports/ the least absolute shrinkage and selection operator (LASSO) logistic regression [21][22][23] , after manually eliminating the features that had an absolute value less than 0.6 for the coefficients of early recurrence from the radiomics features. The other observer repeated image analysis independently as discussed 1 months later in order to assess the intra-observer reliability for the selected features.
Basic CT imaging analysis. Basic CT imaging features of IMCCs included liver surface contour (smooth, bulging or retraction), intratumoral arteries (defined as an artery entering the tumor and remaining inside the tumor 24 ) and bile duct dilatation (with or without). Quantitative analysis was also performed. The axial CT image showing the largest slice of the tumor was selected. The first ROI was drawn manually as large as possible within the lesion area to obtain the mean CT attenuation (in Hounsfield Unit, HU), which was recorded as CT value-mean of tumor. Then, a second round/ oval ROI was placed in non-tumorous liver parenchyma in the same slice as large as possible avoiding visible vessels to obtain the mean CT attenuation of liver parenchyma (in HU). The ROIs in the tumor and non-tumorous liver parenchyma were kept identical in unenhanced, arterial, portal venous and equilibrium phases. The mean values of the two radiologists were calculated as the final results. Tumor-liver ratio was calculated by CT valuemean/CT value of non-tumorous liver parenchyma in each phase.
Surgical treatment, clinical-pathological data collection and follow-up. Sixty-one patients underwent major hepatectomy with bile duct resection and other 64 patients received partial hepatectomy without bile duct resection by two surgeons (QYD and ML, with 26 and 15 years' experience in hepatobiliary surgery). Demographic and clinicopathological variables were collected for each patient. Solitary, size, vascular invasion, neurological and membrane invasion, lymph node metastasis and histological grade were also documented based on final pathology reports. Data on tumor stage were collected according to the AJCC seventh edition staging system 25 . After surgery, patients were followed in the outpatient clinic every 3 months for the first 2 years and every 6 months thereafter. Examinations consisted of ultrasonography and serum tumor markers, and contrast-enhanced CT, MR imaging or PET would be ordered if a recurrence or progression of the disease was suspected. Further therapeutic regimes such as radiotherapy or chemotherapy would be undertaken if disease relapse or progression was confirmed. During a median follow-up time of 22 months (mean 22.6, range 17-85), 70 patients were still alive, 46 patients died, and 9 patients were lost to follow-up. Due to the high rate of recurrence after liver resection and short median time (8-9 months) to recurrence 7,26,27 , early recurrence was defined as suspicious imaging findings or biopsy-proven tumor within 6 months.
Statistical analysis. Categorical variables were compared by using the ×2 test or Fisher exact test. Continuous variables were compared by using the Student t test or Mann-Whitney U test, when appropriate. In training set, the selected features extracted from the VOIs and other variables that reached statistical significance (p value < 0.05) at the univariable analysis were considered for the multivariable logistic regression model. Receiver operating characteristics (ROC) curves were generated and the area under the curve (AUC) was used to evaluate the accuracy of preoperative model, pathological model, and combined model in predicting the early recurrence of IMCC. Then, the developed models were validated by assessing their prediction performance in test set. Comparisons between three models were performed using the Delong test in the training and test sets. Higher prediction accuracy presented with a larger AUC and a p value < 0.05 (two-tailed) indicated statistical significance. Decision curve analysis (DCA) of training and test sets were conducted to determine the clinical usefulness by quantifying the net benefits at different threshold probabilities in the models. Interobserver agreement of radiomics features between two radiologists were assessed with intraclass correlation coefficients (0.000-0.200, poor; 0.201-0.400, fair; 0.301-0.600, moderate; 0.601-0.800, good; 0.801-1.000, excellent). Survival curves were drawn by using the Kaplan-Meier method, and differences of survival rates were compared with the log rank, breslow, and tarone ware test. Other statistical analyses were performed with SPSS, version 22.0 (IBM, Armonk, NY, USA) and R software (R Foundation for Statistical Computing, version 3.4.1; https:// www.r-proje ct. org/). Ethics approval and consent to participate. Institutional review board of Nanjing Drum Tower Hospital, the affiliated hospital of Nanjing University Medical School, approved this study and the informed consent from patients was waived due to its retrospective nature.

Results
Patient characteristics. Early recurrence was detected in 28.0% (35/125) of IMCCs in our study. In training set, the baseline characteristics of patients are summarized in Table 1. Early recurrence group IMCCs were depicted as multiple lesions significantly more often than that in non-early recurrence group IMCCs (41.7% vs. 16.2%; p = 0.011). Early recurrence group IMCCs appear as larger size than non-early recurrence group IMCCs (p = 0.021). IMCCs in early recurrence group showed poorly to undifferentiated differentiation and membrane invasion more often than those in non-early recurrence group (p = 0.025 and 0.024, respectively). There were no significant differences in clinical characteristics between the early recurrence and non-early recurrence group IMCCs. www.nature.com/scientificreports/ Overall survival in IMCC patients with early recurrence was significantly poorer than those without early recurrence both in training set (11.9 ± 2.0 vs. 64.0 ± 4.7 months, p < 0.001) and test set (16.8 ± 3.9 vs. 74.4 ± 6.0 months, p < 0.001) (Fig. 2a,b). Development and performance of prediction models. Based on basic CT imaging features and clinical indicators in training set (Table A.1), PV CT value-mean and EP CT value-mean were significantly associated www.nature.com/scientificreports/ with early recurrence. However, neither of them were identified as independent risk factors for early recurrence by multivariable analysis (p > 0.05).
Of 348 extracted radiomics features in plain, arterial, portal venous, and equilibrium phase CT images, 20 most stable features (15 first-order features and 5 texture features) were considered for subsequent analysis. All texture features showed excellent interobserver agreement with intraclass correlation coefficients ≥ 0.80 (Table  A.2). Radiomics features in AP (homogeneity-AP and standard deviation-AP) constructed the preoperative model (AUC = 0.844; 95% CI 0.751-0.912) ( Table 2), all basic CT imaging features and clinical indicators were excluded in multivariate logistic regression analysis.
Multivariate logistic regression analysis identified 2 independent risk factors in pathological model for predicting early recurrence of IMCC in training set, including solitary and membrane invasion ( Table 3). The area under the ROC curve was considered normal (0.741; 95% CI 0.637-0.828). Multivariate logistic regression analysis based on basic CT imaging features, radiomics features, clinical indicators, and pathological characteristics identified solitary, differentiation, energy-AP, inertia-AP, and percentile50th-PV to construct combined model for predicting early recurrence of IMCC (AUC = 0.917; 95% CI 0.840-0.965) ( Table 4).
AUC estimates were compared between prediction models by using the Delong nonparametric approach in training and test sets (Fig. 3a,b). In training set, the AUC of the combined model was significantly higher than that of the preoperative model (p = 0.049) and pathological model (p = 0.002). In test set, the combined model  Clinical application. DCA in training set and test set for the preoperative model, pathological model and combined model was performed (Fig. 4a,b). Compared with scenarios in which no prediction model would be used (i.e., treat-all or treat-none scheme), the highest curve (representing the combined model) at most of given  Figure 3. Delong non-parametric approach, AUC estimates for predicting early recurrence of IMCC were compared between different prediction models in the training set (a) and test set (b). Table 5. Predictive performance of the radiomics signature and the two prediction models for the discrimination of early recurrence in training set and test set. AUC area under the curve, CI confidence interval. *p < 0.05. www.nature.com/scientificreports/ threshold probability is the optimal decision-making strategy to maximize the net benefit compared with other two models.

Discussion
Based on basic CT imaging features, radiomics features, clinical indicators and pathological characteristics, we developed preoperative model, pathological model, and combined model to predict the early recurrence of IMCC. Finally, the prediction performance of different models were compared. Early recurrence (≤ 6 months) was detected in 28.0% of IMCCs in our study, which was basically consistent with previous studies 4,26,28 , and nearly half of IMCC patients had recurrence within 6 months of surgery. Due to the high rate and short median time of recurrence 7,26,27 , we suggested that predicting early recurrence of IMCC with the cutoff time of 6 months would help improve clinical prognosis. We confirmed that the prognosis of IMCC patients with early recurrence was significantly poorer than those without. Park et al. 29 found that HCC patients with early recurrences (≤ 6 months) showed a poorer prognosis than those who did not. Zhang et al. 4 defined 24 months as the optimal cutoff value for early and late recurrence of ICC identified with linear regression. Ohira et al. 27 divided ICC after curative resection into early recurrence and late recurrence based on time of 12 months and found no prognostic difference between them. Determination of the optimal cutoff value for early and late recurrence of IMCC require further investigation. In addition, we found that larger tumor size, multiple lesions, poorly to undifferentiated differentiation and membrane invasion were associated with increased risk of early recurrence. Zhang et al. 4 concluded that early recurrence occurred more commonly in ICC patients with tumors that had more aggressive biological features.
In our study, multivariate logistic regression analysis identified homogeneity-AP and standard deviation-AP to construct the preoperative mode for predicting early recurrence of IMCC (AUC = 0.844; 95% CI 0.751-0.912). However, when based on basic CT imaging features and clinical indicators, none of them were identified as independent risk factors. This result indicated that radiomics approach could generate more potentially informative and relevant metrics than basic CT imaging features and clinical indicators 11,17,18 . Early recurrence of IMCC might be related to the aggressiveness of the tumor, which could better reflect the heterogeneity inside the tumor, and radiomics can better reflect this internal heterogeneity. Indeed, two dominant features (homogeneity and standard deviation) in the preoperative mode quantified intra-tumor heterogeneity and distribution of pixel values, which were reported to be associated with histopathologic and prognosis in other solid tumors 30,31 . Accurately identifying high-risk early recurrence patients with IMCC by preoperative model could provide a basis for developing more aggressive surgery and neoadjuvant chemotherapy.
In our study, multivariate logistic regression analysis identified solitary and membrane invasion as independent risk factors in pathological model for predicting early recurrence of IMCC (AUC = 0.741), which was lower than AUC of the preoperative model (p = 0.144). We extracted shape-based, first-order statistics and texture features from whole tumor volume based on multiphase contrast enhanced CT imaging 32 , which completely reflect tumor microstructures and heterogeneity and might provide more valuable information reflecting the biological characteristics than several routine pathological features recorded. Prior studies have also reported the advantages of CT-based radiomics models to predict early recurrence of liver tumors. Zhou et al. 17 developed a CT-based radiomics model that demonstrated an AUC of 0.82 in predicting the early recurrence (≤ 1 year) of HCC. Shan et al. 18 supposed that CT-based peritumoral radiomics model can effectively predict early recurrence in HCC after curative tumor resection or ablation. www.nature.com/scientificreports/ In our study, multivariate logistic regression analysis identified solitary, differentiation, energy-AP, inertia-AP and percentile50th-PV to construct combined model for predicting early recurrence of IMCC (AUC = 0.917). The high AUC in training set and test set suggested that the combined model performed well in discriminating for early recurrence and which improved the accuracy compared with either preoperative model or pathological model individually. Notably, DCA showed that the combined model adds more benefit to predicting early recurrence than the preoperative model or pathological model at most given threshold probability. For postoperative IMCC patients with high-risk early recurrence identified by combined model, it was significant to adopt more stringent follow-up protocol to detect and manage lesions early.
Our study had some limitations. First, the proposed combined model was established on the basis of data obtained from a single center. Prospective multicenter studies with considerably large datasets are needed to further validate the robustness and reproducibility of our prediction model. Second, CT images were obtained from multiple CT scanners, which might bring some potential bias. Nevertheless, a good inter-scanner agreement of CT-based radiomics method has been confirmed 33 . Furthermore, logistic regression analysis did not find significant correlation between basic CT imaging features and early recurrence of IMCCs. Combining different CT and MRI imaging characteristics to predict early recurrence requires further investigations.

Conclusion
Our study indicates that CT-based radiomics signature is a powerful predictor for early recurrence of IMCC. Preoperative model (constructed with homogeneity-AP and standard deviation-AP) and combined model (constructed with solitary, differentiation, energy-AP, inertia-AP and percentile50th-PV) can improve the accuracy for pre-and postoperatively predicting the early recurrence of IMCC, which may help better stratify patients for surgery and enable clinicians to select optimal treatment strategies and an individualized follow-up protocol after initial surgery to improve clinical outcomes.

Data availability
The datasets during and/or analysed during the current study available from the corresponding author on reasonable request.