Introduction

In the past decade, radiomics has emerged as a field that extracts additional information from medical images to improve the diagnosis and prognosis of cancer. As a quantitative approach, radiomics comprises the extraction and analysis of quantitative medical imaging features and the establishment of correlations between these features and clinical outcomes such as patient survival1,2,3,4,5. Several radiomic features have been found to be significantly associated with various clinical outcomes in multiple cancer sites such as lung, pancreas, and kidney2,6,7,8,9,10,11,12.

In the past few years, the pipeline for traditional radiomics analysis has been established1,2,9,13. This traditional pipeline consists of four steps: image acquisition, region of interest (ROI) segmentation or annotation, feature extraction, and predictive model building. As the core of this pipeline, radiomics features are extracted from medical images using predefined mathematical equations14. These engineered equations are designed to capture different characteristics of images15. For example, first-order features measure the distribution of pixel intensities, while second-order features extract texture information based on matrices such as the grey-level co-occurrence matrix (GLCM) and the grey-level run length matrix (GLRLM)14. Efforts have been made to standardize the feature banks by implementing open-source libraries such as PyRadiomics15. In these feature banks, thousands of engineered features from different classes can be extracted from 2D or 3D medical images15. These features can be further tested for their associations with clinical outcomes such as overall survival, recurrence, or genetic mutations4,8,16,17. Cross-cohort and multi-centre studies have also shown that several PyRadiomics features are robust to different scanners and clinician annotations8,15,18,19.

Despite recent progress, the traditional radiomics analytics pipeline has a few drawbacks. First, the feature equations are predefined, and many formulas are similar; thus, some radiomics features are highly correlated with each other. As a result, if a feature is found to be significantly associated with a certain clinical outcome, other highly correlated features may be significant as well. Consequently, the high dimensionality of significant features increases the complexity of the prognostic model without a corresponding increase in performance. Second, testing radiomics features one by one increases the family-wise error rate (FWER), which is the probability of making one or more false discoveries. Previous publications have pointed out that several radiomics studies lacked multiple-testing control and, hence, some discovered significant features may be the result of type I errors20,21. These shortcomings in the traditional radiomics analytics pipeline have inspired new research that takes advantage of recent progress in deep learning and convolutional neural networks (CNNs) to improve the performance of predictive models.

CNNs are one of the most frequently used deep learning architectures in computer vision22. CNNs apply a series of convolution operations on input images, preserving the spatial relationship between pixels and mapping these relationships onto outputs. During the training phase, parameters of the convolution operations are tuned based on the outcome. Consequently, convolution layers can capture information specifically related to the classification task (e.g., outcome prediction) at hand. In medical imaging, this allows generating customized feature maps for specific modalities or diseases, which further improves performance23,24. However, training CNN parameters requires a large sample size, which is usually not available in typical medical imaging research settings. To overcome this limitation, transfer learning-based feature extraction has been proposed25,26,27.

Transfer learning was developed based on the assumption that the structure of CNNs is similar to the mechanism of the human visual cortex22,28. The top (early) layers of CNNs extract general features from images, while the deeper layers are more specific to the target task22. Pretraining CNNs on large image datasets such as ImageNet helps the model learn how to extract general features29,30. Since many image recognition tasks are similar, the top layers of the network can be transferred to another target domain26. On the other hand, deeper layers of CNNs extract “higher-order” information associated with the target outcome. Thus, if the target domain is similar to the pretrained domain, deeper layers can also be transferred to extract features25,31.

Deep learning and transfer learning-based feature extraction have shown promising results in cancer assessment31,32,33. Furthermore, it has also been shown that combining predefined features with deep learning-based features can improve the performance in the prognosis of Glioblastoma Multiforme31. To gain a deeper understanding of the relationship between traditional radiomics and transfer learning features, it is crucial to map the correlation between these two sets of features. In addition, it is imperative to develop an optimal feature fusion pipeline that can exploit the prognostic information from both feature sets to improve the overall performance of the model.

The aim of this study was to assess the complementary prognostic information of predefined radiomic features and transfer learning features for overall survival in CT scans of Pancreatic Ductal Adenocarcinoma (PDAC) patients. Using CT images from PDAC patients, we mapped the association between PyRadiomics features and a set of transfer learning features and showed the correlation between the two classes of features. Next, we applied four existing feature fusion and reduction methods, namely principal component analysis (PCA), Boruta34, feature-wise selection using the Cox Proportional Hazards model (CPH)35, and LASSO36, to combine the predefined radiomic features with transfer learning features for the prognosis of overall survival in PDAC patients. We then proposed a novel pipeline for combining predefined radiomics features and transfer learning features using a risk score-based model and compared its performance to that of the four existing feature fusion and reduction methods in an independent test cohort.

Methods

Dataset

Two cohorts from two independent hospitals, consisting of 68 patients (training cohort) and 30 patients (test cohort) who had pre-operative contrast-enhanced CT available for analysis, were enrolled in this retrospective study. All patients underwent curative-intent surgical resection for PDAC between 2008 and 2013 (training cohort) or between 2007 and 2012 (test cohort), and none received neo-adjuvant treatment. CT scans were performed on Toshiba Aquilion (training cohort) and GE Medical Systems LightSpeed VCT (test cohort) scanners using 2–3 mm slice thickness in the portal venous phase without advanced dose reduction algorithms.

Survival data were collected retrospectively (training cohort: 52 deaths vs. 16 survivors; test cohort: 15 deaths vs. 15 survivors at the end of follow-up). The median follow-up time was 21 months (range: 101 days to 1890 days) and 19 months (range: 109 days to 2569 days) for the training and test cohorts, respectively. We selected two-year survival as the primary outcome, determined by the last follow-up date or the date of death within 2 years after surgery (training cohort: 38 deaths vs. 30 survivors; test cohort: 11 deaths vs. 19 survivors). Further demographic information about the two cohorts can be found in Table 18. To exclude the effect of postoperative complications on the prognosis, patients who died within 90 days after surgery were excluded. An in-house developed region of interest (ROI) contouring tool (ProCanVAS)37 was used by an experienced radiologist to annotate the ROIs. The reader contoured the ROIs blinded to the outcome.

Table 1 Demographic information of training and test cohorts8.

Ethics approval and consent to participate

For the training cohort, University Health Network Research Ethics Boards approved the retrospective study and informed consent was obtained. For the test cohort, the Sunnybrook Health Sciences Centre Research Ethics Boards approved the retrospective study and waived the requirement for informed consent. All methods were performed in accordance with the relevant guidelines and regulations of both institutions.

Radiomics feature extraction

Pre-defined radiomic features were extracted using the PyRadiomics library (version 2.0.0) in Python15. To ensure that features were extracted exclusively from tumour regions, voxels with Hounsfield unit (HU) values < −10 or > 500 were excluded to eliminate fat and stents from the feature values. A threshold of 500 excludes only large parts of blood vessels in the portal venous phase, which are not part of the tumour contour; these are normal structures that, if included, would confound the analysis. This threshold, however, does not exclude tumour neovasculature or hyperenhancing subcomponents of the tumour, which do not reach such a high attenuation level. In total, 1,428 radiomic features were extracted for both cohorts from the contoured ROIs. Details of the extracted features are listed in Table 2.
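The HU windowing described above can be sketched as a simple NumPy mask (a minimal illustration; the `tumour_voxels` helper name is hypothetical, and in practice PyRadiomics can apply such a window through its resegmentation settings):

```python
import numpy as np

def tumour_voxels(ct_roi, hu_min=-10, hu_max=500):
    """Restrict ROI voxel values to the tumour HU window.

    Keeps only voxels with hu_min <= HU <= hu_max, excluding fat
    (very low HU) and stents or large vessels (very high HU) before
    feature extraction.
    """
    roi = np.asarray(ct_roi, dtype=float)
    mask = (roi >= hu_min) & (roi <= hu_max)
    return roi[mask]

# Toy ROI containing fat (-50 HU) and a stent (900 HU):
roi = np.array([[-50.0, 40.0], [120.0, 900.0]])
print(tumour_voxels(roi))  # keeps 40.0 and 120.0; excludes -50 and 900
```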

Table 2 Number of radiomics features extracted for different feature classes and image filters.

Transfer learning feature extraction

Transfer learning features were extracted using a CNN model (LungTrans) pretrained on Non-Small Cell Lung Cancer (NSCLC) CT images38. The NSCLC dataset was published as the Lung Nodule Analysis (LUNA16) challenge, with CT images from 888 patients39. Images were extracted from the largest contoured ROI of each patient without preprocessing. All input ROIs were resized to 32 × 32 greyscale. Since the shape of the ROI is not rectangular, the region outside the ROI was set to black. Using this dataset, an 8-layer CNN (LungTrans) was trained de novo with a batch size of 16 and a learning rate of 0.001, with the architecture shown in Fig. 140. Every convolutional layer has a 3 × 3 kernel with a stride of 1 and zero padding, except for the Conv_5 layer, which has a 2 × 2 kernel with a stride of 1 and no padding. All max-pooling layers have a 2 × 2 kernel size.

Figure 1

Architecture of the 8-layer CNN used to extract LungTrans Features.

The process of transfer learning varies depending on the similarity between the pretrained domain and the target domain. If the pretrained and target domains are different (e.g., natural images vs. pancreatic CT images), features are generally extracted from upper layers for better generalization. However, if the pretrained and target domains are similar (e.g., they share the same imaging modality, similar resolution, and similar outcome), features can be extracted from deeper layers. In this study, since the pretrained and target domains are similar (lung and pancreatic CT), features were extracted from the Conv_5 layer, a deep layer just before the classification layers. Feeding the LungTrans CNN with contoured PDAC CT images using the same settings as the pretrained domain (32 × 32 greyscale ROI images with black background), 64 LungTrans features were extracted. After eliminating 29 LungTrans features with zero variance, the remaining 35 LungTrans features were used in this study.
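The layer arithmetic behind the 64 Conv_5 features can be checked with the standard convolution output-size formula. The exact ordering of the stages is not fully specified here, so the trace below assumes four stages of (3 × 3 same-padded convolution + 2 × 2 max-pool) followed by Conv_5 — one plausible reading of the architecture that reproduces a 1 × 1 spatial output, i.e. one value per Conv_5 channel:

```python
def conv_out(size, kernel, stride=1, padding=0):
    # Standard convolution output-size formula.
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    return (size - kernel) // stride + 1

# Hypothetical trace of the spatial dimension through the network:
size = 32
for _ in range(4):
    size = conv_out(size, kernel=3, padding=1)  # "same" conv: unchanged
    size = pool_out(size)                       # halved by pooling
size = conv_out(size, kernel=2, padding=0)      # Conv_5: 2x2, no padding
print(size)  # -> 1 (so 64 Conv_5 filters yield 64 scalar features)
```

Under this assumed layout, the spatial dimension collapses to 1 × 1 at Conv_5, which is consistent with 64 filters yielding the 64 extracted LungTrans features.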

Correlation

To investigate the correlation between the features extracted using the traditional radiomics pipeline (PyRadiomics) and transfer learning (LungTrans), Pearson correlation coefficients were calculated for every pair of features in the training cohort (n = 68). The mean absolute correlation coefficient was calculated for each feature set (PyRadiomics and LungTrans). The distributions of the correlation coefficients were also calculated.
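The pairwise analysis above can be sketched in NumPy; `np.corrcoef` stacks the two feature matrices, so the off-diagonal block of the result holds the cross-correlations between the two banks (the function name and the random stand-in matrices are illustrative only):

```python
import numpy as np

def mean_abs_cross_corr(a, b):
    """Mean absolute Pearson correlation between the columns of two
    feature matrices a (n_patients x p) and b (n_patients x q)."""
    p = a.shape[1]
    # With rowvar=False, columns are variables; the full matrix is
    # (p + q) x (p + q) and its top-right block is the cross-correlation.
    full = np.corrcoef(a, b, rowvar=False)
    cross = full[:p, p:]
    return np.abs(cross).mean()

rng = np.random.default_rng(0)
pyrad = rng.normal(size=(68, 5))  # stand-in for PyRadiomics features
lung = rng.normal(size=(68, 3))   # stand-in for LungTrans features
print(mean_abs_cross_corr(pyrad, lung))  # a value in [0, 1]
```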

Proposed prognosis model

To investigate the optimal feature reduction and fusion methods, we first trained four prognosis models using CT images from the training cohort (n = 68) and validated them in the test cohort (n = 30), targeting two-year survival. In each model, features from PyRadiomics and LungTrans were fused or selected in the training cohort using the PCA, Boruta34, feature-wise CPH35, or LASSO36 method. These selected/fused features were then used to train Random Forest-based prognosis models (number of trees to grow (ntree) = 500; number of randomly sampled variables as candidates at each split (mtry) varied depending on the setting that had the best performance in the training cohort). These prognosis models were further validated in the test cohort. The pipelines of the four traditional feature fusion/reduction algorithms (PCA, Boruta34, CPH-based feature reduction35, and LASSO36) are shown in Fig. 2A–D, respectively. In the following, each method is described in detail.

  A. Unsupervised feature fusion using PCA: Features from the two feature banks were fused using PCA, generating 30 components. These components were used to build a model (Random Forest, mtry = 2) in the training cohort, which was then evaluated in the test cohort.

  B. Supervised feature reduction using Boruta: Boruta identified prognostic features, which were then used to build a prognosis model (Random Forest, mtry = 2) in the training cohort. The model’s performance was validated in the test cohort.

  C. Supervised feature reduction using Cox-Regression: Each feature was tested using univariate Cox-regression in the training cohort. Significant features were then used to build a prognosis model (Random Forest, mtry = 310), which was validated in the test cohort.

  D. Supervised feature selection using correlation cut-off and LASSO Regression: In the training cohort, features with correlation coefficients higher than 0.7 were removed. The remaining features were reduced using LASSO logistic regression with optimized lambda. The features with nonzero coefficients in the LASSO regression were selected to build the Random Forest model (mtry = 2), which was then evaluated in the test cohort.

Figure 2

Pipelines for different feature reduction/fusion methods. (A) Unsupervised feature fusion using PCA. (B) Supervised feature reduction using Boruta. (C) Supervised feature reduction using Cox-Regression. (D) Supervised feature reduction using LASSO Regression. (E) The proposed risk-score based feature fusion method.

Our proposed risk score-based method is illustrated in Fig. 2E. First, using the training cohort, two Random Forest classification models were trained separately on the two feature banks (PyRadiomics and LungTrans). Each of these models was then used to produce the probability of death for every patient in the training cohort through tenfold cross-validation41. At this point, each patient in the training cohort had two probabilities (training risk scores) of death based on the two feature banks (PyRadiomics and LungTrans). Similarly, feeding these two Random Forest models (trained on the entire training cohort) with the PyRadiomics and LungTrans features of the test cohort, two risk scores were generated for each patient in the test cohort (test risk scores). We then used the two training risk scores to train another Random Forest-based prognosis model in the training cohort and validated this model in the test cohort using the test risk scores.
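As a minimal sketch, the risk-score fusion described above could be written with scikit-learn as follows (the original work used R packages; the function name and synthetic inputs here are hypothetical):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_predict

def risk_score_fusion(X_pyrad, X_trans, y, X_pyrad_te, X_trans_te, seed=0):
    """One Random Forest per feature bank produces out-of-fold death
    probabilities (risk scores) for the training cohort; a final
    Random Forest is then trained on the two scores."""
    banks = [(X_pyrad, X_pyrad_te), (X_trans, X_trans_te)]
    train_scores, test_scores = [], []
    for X_tr, X_te in banks:
        rf = RandomForestClassifier(n_estimators=500, random_state=seed)
        # 10-fold cross-validated probabilities keep the training risk
        # scores out-of-fold, avoiding label leakage.
        oof = cross_val_predict(rf, X_tr, y, cv=10,
                                method="predict_proba")[:, 1]
        train_scores.append(oof)
        rf.fit(X_tr, y)  # refit on the entire training cohort
        test_scores.append(rf.predict_proba(X_te)[:, 1])
    S_tr = np.column_stack(train_scores)
    S_te = np.column_stack(test_scores)
    fused = RandomForestClassifier(n_estimators=500, random_state=seed)
    fused.fit(S_tr, y)
    return fused.predict_proba(S_te)[:, 1]
```

The key design point is that the stacked model only ever sees two inputs (one risk score per bank), regardless of how many raw features each bank contains.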

To address the imbalanced outcome in the training cohort, the SMOTE algorithm42 was applied in the training process of all five models, as it has been shown that SMOTE’s performance is comparable to that of more recent balancing methods such as ADASYN43. The following settings were used for the SMOTE algorithm:

  • k (number of nearest neighbours used to generate the new examples of the minority class) = 5.

  • perc.over = 200, perc.under = 200 (a common default setting to balance the amount of over-sampling of the minority class and under-sampling of the majority class).
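SMOTE's core interpolation step can be sketched in NumPy as follows (a simplified illustration of the algorithm, not the R implementation used in this study; in the DMwR convention, perc.over = 200 corresponds to generating two synthetic examples per minority case):

```python
import numpy as np

def smote_oversample(X_min, n_new, k=5, seed=0):
    """Minimal sketch of SMOTE: each synthetic minority sample is an
    interpolation between a real minority sample and one of its k
    nearest minority-class neighbours."""
    rng = np.random.default_rng(seed)
    n = len(X_min)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(n)
        # k nearest neighbours of X_min[i] within the minority class
        # (index 0 of argsort is the point itself, so skip it).
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbours = np.argsort(d)[1:k + 1]
        j = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(synthetic)
```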

The area under the ROC curve (AUC) was used to measure the performance of the five approaches44. Youden’s J statistic was used to identify the optimal threshold for sensitivity and specificity45. DeLong tests were applied to test the differences between the AUCs of the models. The classification modeling, AUC calculation, and DeLong tests were performed using the “caret”, “survival”, and “pROC” packages in R (version 3.5.1)46,47,48.
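Youden's J threshold selection can be sketched as a direct search over observed score cut-offs (a NumPy illustration; this study used the R "pROC" package):

```python
import numpy as np

def youden_threshold(y_true, scores):
    """Pick the score threshold maximising Youden's J,
    J = sensitivity + specificity - 1, over all observed cut-offs."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    best_j, best_t = -1.0, None
    for t in np.unique(scores):
        pred = scores >= t
        tp = np.sum(pred & (y_true == 1))
        fn = np.sum(~pred & (y_true == 1))
        tn = np.sum(~pred & (y_true == 0))
        fp = np.sum(pred & (y_true == 0))
        sens = tp / (tp + fn)
        spec = tn / (tn + fp)
        j = sens + spec - 1
        if j > best_j:
            best_j, best_t = j, t
    return best_t, best_j

y = [0, 0, 0, 1, 1, 1]
s = [0.1, 0.4, 0.35, 0.8, 0.65, 0.9]
print(youden_threshold(y, s))  # -> (0.65, 1.0): perfect separation at 0.65
```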

Results

Correlation analysis between predefined and deep radiomic features

Within each feature bank, the average absolute Pearson correlation coefficients of the 1,428 PyRadiomics and 35 LungTrans features were 0.27 (standard deviation: 0.23) and 0.32 (standard deviation: 0.32), respectively. The average absolute correlation coefficient between PyRadiomics and LungTrans features was 0.17 (standard deviation: 0.18). The weak linear relationship between PyRadiomics and LungTrans features suggests that the LungTrans features may harbor new information that PyRadiomics does not capture.

The heatmap in Fig. 3 shows the correlation details between the two feature sets. Each dot in Fig. 3 represents a correlation coefficient: white indicates a coefficient of 0, while red and blue dots represent positive and negative correlations, respectively. There are several colour blocks in the PyRadiomics vs. PyRadiomics region, indicating high correlations among the PyRadiomics features. Several colour bands in the PyRadiomics vs. LungTrans region also suggest that some LungTrans features may have strong linear relationships with PyRadiomics features.

Figure 3

Correlation heatmap of PyRadiomics and LungTrans features.

The distribution of the absolute correlation coefficients is displayed as a histogram in Fig. 4. As illustrated by the skewed distribution, most of the predefined and deep radiomic features have weak correlations with one another. However, strong linear associations exist between certain features, given the high correlation coefficients (> 0.70)49. More details on the correlation between PyRadiomics and transfer learning features can be found in Table 3, where the average absolute correlation coefficients were calculated for each type of filter and feature.

Figure 4

Histogram of absolute correlation coefficients between PyRadiomics and LungTrans.

Table 3 Mean absolute correlation coefficients between PyRadiomics and LungTrans features across different types of filters and features.

Performance of the proposed prognosis model

The performances of four existing feature reduction methods (PCA, Boruta, feature-wise selection through CPH, and LASSO) were compared to that of the proposed risk score-based prognosis model. The PCA method generated 30 components in the training cohort, representing 95% of the variance in the original 1,463 features from the PyRadiomics (1,428 features) and LungTrans (35 features) feature banks. In 100 iterations, the Boruta method selected only one feature in the training cohort, which was from the PyRadiomics feature bank (Wavelet GLDM Small Dependence Low Gray Level Emphasis), with a cut-off at 0.05 (p-value cut-off for the Boruta method). The CPH method identified 310 features associated with overall survival in the training cohort. As shown in Table 4, 308 of them belong to the PyRadiomics feature bank, while LungTrans contributed only 2 features. While some of the PyRadiomics features have been previously identified for PDAC prognosis (e.g., SumEntropy8), other well-known features such as ROI size were not significant. In the LASSO model, 14 features were identified as potential prognostic biomarkers (3 features from LungTrans and 11 features from PyRadiomics). Our proposed risk score-based model utilized the probabilities of the two individually trained Random Forest models. The performance of the five models was measured using the area under the ROC curve (AUC) for overall survival in the test cohort.

Table 4 Significant PyRadiomics features in univariate CPH across different types of filters and features.

In the validation (test cohort), the AUCs for PCA, Boruta, CPH, and LASSO methods were 0.60 (95% Confidence Interval (CI): 0.37–0.82), 0.60 (95% CI: 0.38–0.81), 0.55 (95% CI: 0.32–0.77), and 0.50 (95% CI: 0.28–0.72), respectively. The proposed risk score-based method produced the highest AUC (AUC of 0.84, 95% CI: 0.70–0.98).

Comparing the feature reduction methods using the DeLong test, the performance of the proposed risk score-based method was significantly higher than that of PCA (0.84 vs. 0.60, p-value = 0.044, FDR-adjusted p-value = 0.044), Boruta (0.84 vs. 0.60, p-value = 0.040, FDR-adjusted p-value = 0.044), the Cox-regression method (0.84 vs. 0.55, p-value = 0.0086, FDR-adjusted p-value = 0.017), and LASSO (0.84 vs. 0.50, p-value = 0.0062, FDR-adjusted p-value = 0.017). These results suggest that a risk score model, which is based on probabilities calculated by multiple individual small models, outperformed the other models. The ROC curves for the four traditional feature reduction methods (PCA, Boruta, CPH, and LASSO) and the proposed risk score-based model are shown in Fig. 5.

Figure 5

ROC curves of models using four feature reduction/fusion methods. (A) ROC curve for PCA based fusion method, AUC = 0.60, specificity = 0.58, sensitivity = 0.64. (B) ROC curve for Boruta based feature reduction method, AUC = 0.60, specificity = 0.47, sensitivity = 0.48. (C) ROC curve for CPH based feature reduction method, AUC = 0.55, specificity = 1.00, sensitivity = 0.18. (D) ROC curve for LASSO based feature selection method, AUC = 0.50, specificity = 0.26, sensitivity = 0.91. (E) ROC curve for the proposed risk-score based feature fusion method, AUC = 0.84, specificity = 0.68, sensitivity = 0.91.

Discussion

As deep transfer learning becomes increasingly popular in medical imaging studies, there is an urgent need to identify an optimal feature reduction and fusion method that can combine the information from traditional radiomics and transfer learning features. In this study, we proposed a risk score-based feature reduction and fusion method for a medical imaging-based PDAC prognosis model. The proposed risk score-based method had significantly better prognostic performance than traditional supervised and unsupervised methods, increasing the AUC by at least 40% (from 0.60 using PCA to 0.84). This result is consistent with previous studies, which have shown that ensemble methods can outperform traditional feature-wise selection models50,51,52.

As deep transfer learning plays an increasingly vital role in medical image analysis, the curse of dimensionality is becoming more acute in radiomics-based prognosis models1. Supervised feature reduction methods such as univariate CPH and Boruta have difficulty balancing the false positive rate and statistical power. When testing 1,463 features (1,428 PyRadiomics features and 35 LungTrans features) using univariate CPH, the probability of at least one false positive (FWER) is higher than 99%. Hence, supervised feature reduction methods may lose their utility as feature banks continue to grow in size. In addition, PCA, an unsupervised method, was not able to boost the prognostic performance due to the inherent noise in image features. Feature reduction using a correlation cut-off with LASSO was previously used in a similar study for Glioblastoma prognosis31, but this method also failed in our independent test cohort. On the other hand, ensemble methods, which use multiple models to generate risk scores, may overcome these limitations of traditional feature reduction methods53,54. Additionally, since the risk scores were generated by a nonlinear classifier (Random Forest), they were in fact nonlinear mappings from the original feature space, providing better fits for patients’ survival patterns and leading to a higher AUC.
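The FWER figure quoted above follows from the standard formula for m independent tests at level alpha (the features here are correlated, so independence is only an illustrative assumption, but the conclusion is unchanged):

```python
# FWER for m independent hypothesis tests at significance level alpha:
# P(at least one false positive) = 1 - (1 - alpha)^m
alpha, m = 0.05, 1463
fwer = 1 - (1 - alpha) ** m
print(fwer)  # numerically indistinguishable from 1.0: a false positive is near-certain
```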

It is worth noting that although there were high Pearson correlation coefficients between certain transfer learning and PyRadiomics features, most deep radiomics features have weak linear relationships with PyRadiomics features. The nature of PyRadiomics and LungTrans features is different: a PyRadiomics feature is extracted from medical images using a predefined formula, while LungTrans features were extracted using parameters fine-tuned on lung CT images. This result suggests that the relationship between transfer learning and PyRadiomics features is complementary rather than redundant. Thus, we hypothesized that fusing these two feature banks might provide more information to the prognosis model. Future studies can further test the associations between conventional radiomics features and transfer learning features from different pretrained models. A thorough understanding of these associations will provide a solid base for developing more sophisticated feature fusion methods, which may further improve prognostic performance for different cancer types.

Although the proposed risk score-based method outperformed traditional approaches, it had limitations. First, compared to supervised methods, where certain biomarkers can be identified during the process, the risk score method is hard to interpret since the stacked model is based on the results (probabilities) of other models. Although one may derive the final prognosis probability (risk score) from the original features mathematically by using more interpretable algorithms such as logistic regression instead of Random Forests, it would be a complicated task. Second, although lung cancer and pancreatic cancer are both adenocarcinomas, pancreatic cancer tends to exhibit much more stromal reaction; thus, the features relevant to prognosis might be expected to differ. The effect of this on the transfer learning model is uncertain, and further validation with a variety of adenocarcinoma types may be of interest to determine whether some transfer learning features are invariant across tumour types. Third, for practical applications, a model must include other known prognostic factors; in the case of pancreatic cancer, these include variables such as age, tumour size, grade, and stage. Although it has been shown that none of these clinical variables is prognostic of overall survival in PDAC patients8, nor does adding them to radiomic features improve the prognostic model8, further work is necessary to incorporate them into a practical prognostic model for PDAC. Fourth, the aim of this paper was primarily to explore approaches to fuse radiomics and transfer learning features. We recognize that validation in a larger cohort with careful attention to covariates will be required for practical application and for examining the effectiveness of the proposed feature fusion method.

Conclusion

Deep radiomics features are complementary to conventional radiomics features. The proposed risk score-based prognosis model, which fuses deep transfer learning and radiomics features, significantly improved prognostication performance for resectable PDAC patients compared with traditional feature fusion and reduction methods.