Prognostic Impact of the Findings on Thin-Section Computed Tomography in Stage I Lung Adenocarcinoma with Visceral Pleural Invasion

Visceral pleural invasion (VPI) in stage I lung adenocarcinoma is an independent negative prognostic factor. However, no study has shown that any morphologic pattern can serve as a prognostic factor. We therefore aimed to investigate the potential prognostic impact of VPI by extracting high-dimensional radiomics features from thin-section computed tomography (CT). A total of 327 surgically resected pathological-N0M0 lung adenocarcinomas 3 cm or less in size were evaluated. A radiomics signature was generated by calculating the contribution weight of each feature and was validated using a repeated ten-fold cross-validation approach in which one fold is left out for validation. The accuracy of the proposed radiomics signature for predicting VPI was 90.5% with ROC analysis (AUC, 0.938; sensitivity, 90.6%; specificity, 93.2%; PPV, 91.2%; NPV, 92.8%). The cut-off value separated patients in the validation data into high-risk and low-risk groups with an odds ratio of 12.01. With regression analysis, the radiomics signature showed a concordance index of 0.895 and an AIC value of 88.9%. Among the radiomics features, percentile 10%, wavEnLL_S_2 and S_0_1_SumAverage were independent factors for determining VPI. These results suggest that the CT radiomics signature is an independent prognostic factor for discriminating VPI in lung adenocarcinoma and could help to discriminate differences in prognosis in stage I lung adenocarcinoma.

accuracy in previous studies, few studies have investigated why, at a similar distance from the pleura or even at a much greater distance, some peripheral NSCLC show VPI while others do not. Does this correspond to potentially malignant characteristics of the primary tumor, such as the predominant subtype, the pathological grade or certain malignant features on CT? It remains unclear whether any prognostic factor can be used to discriminate VPI in early-stage NSCLC 3 cm or less in size. Our study was therefore designed to integrate comprehensive information in order to evaluate potentially malignant characteristics and prognostic factors for discriminating VPI in early-stage NSCLC.
Recent advances in radiomics enable the noninvasive evaluation of internal tumor heterogeneity by extracting and analyzing a large number of advanced quantitative imaging features from medical images. These high-dimensional extracted features, termed radiomics features, can provide a comprehensive characterization of tumors with complex components and detect their potentially malignant features [8]. Selecting and integrating features from a radiomics database can yield information about the tumor phenotype or the clinical outcome and can serve as a predictive biomarker, which is termed a radiomics signature [9][10][11]. Here, we hypothesize that a large number of extracted radiomics features, coupled with appropriate statistical analysis, may be able to detect the potentially malignant characteristics of NSCLC with VPI.
Within NSCLC, squamous cell carcinoma and adenocarcinoma show significantly different biological behavior and prognosis [12][13]. Moreover, tumors greater than 3 cm, with or without VPI, show a significantly different prognosis from tumors 3 cm or less in size. To avoid confusion and complexity, our study evaluated only lung adenocarcinoma, which is the most common subtype of peripheral lung cancer, with a tumor size of 3 cm or less. The purpose of the study was therefore to assess the ability of multi-feature-based radiomics, combined with pathological findings, to differentiate the phenotypes of stage I lung adenocarcinoma with visceral pleural invasion.

Results
Clinical and Histopathological Findings. The  S_3_3_Contrast, with total weights to 89.0%. The diagnostic performance of the five best-performing features by ROC curve analysis is shown in Table 2 and Fig. 2. Univariate logistic analysis revealed that the four best-performing features differed significantly between VPI (+) and VPI (−) (Table 3).
Validation of the Radiomics Signature. The workflow for generating the radiomics signature is illustrated in Fig. 3. When SVM was used to confirm the diagnostic performance in the validation cohort, the accuracy for discriminating VPI (+) from VPI (−) was 90.5% with ROC curve analysis (AUC, 0.938; sensitivity, 90.6%; specificity, 93.2%; PPV, 91.2%; NPV, 92.8%; +LR, 13.4; −LR, 0.101). The optimum cut-off value generated by ROC analysis after SVM modeling was 1.003, with a Pi value of 0.787. Patients in the validation data were classified accordingly into high-risk and low-risk groups. Regression analysis of the SVM model showed that the radiomics signature differed significantly between the VPI (+) and VPI (−) groups (P < 0.001; odds ratio, 12.01). In the validation cohort, the AIC value was 88.9% and the concordance index was 0.895. When each of the five best-performing features was validated separately, percentile 10%, wavEnLL_S_2 and S_0_1_SumAverage were independent factors for stratifying patients into high-risk and low-risk groups in the validation data set with the Wilcoxon signed-rank test. However, compared with the signature based on multiple radiomics features, each individual prognostic radiomics feature showed a significantly lower Az value and concordance index (Tables 2 and 3; Fig. 2).

Discussion
Our study showed that visceral pleural invasion occurred significantly more often in micropapillary (MP) and solid predominant subtypes and less frequently in lepidic predominant adenocarcinoma. The acinar subtype showed no predominance. Most of the patients included in the present study were grade II (259 of 327 [79.2%]), with no predominance between stage IA and stage IB. The multi-feature-based radiomics signature was identified as an independent factor for identifying peripheral lung adenocarcinoma with VPI (+), with an accuracy of 90.5%. In addition, with the cut-off value generated by ROC analysis, patients were successfully stratified into high-risk and low-risk groups, which allowed risk stratification to be evaluated in a quantitative and non-invasive way.
Visceral pleural invasion has been shown to be an independent adverse prognostic factor, with the potential to invade lymphatic and blood vessels and to progress to metastatic disease (lymph node or distant metastasis). Evaluating the morphologic manifestation of lung adenocarcinoma with VPI on CT images is insufficient, because no study to date has demonstrated that any morphologic pattern can serve as a prognostic factor. Apart from CT morphologic changes, few studies have integrated comprehensive information to evaluate potentially malignant characteristics and prognostic factors in patients with VPI. To the best of our knowledge, this is the first report investigating prognostic factors for predicting VPI in early-stage lung adenocarcinoma. By analyzing the predominant subtypes of 327 stage I adenocarcinomas, we found that VPI occurred significantly more often in the MP and solid subtypes, which have been shown to carry a poor prognosis [14][15]. However, the sample size for the MP and solid subtypes is limited in stage I lung adenocarcinoma. With regard to pathological grade, most of the patients were grade II, with no predominance between stage IA and IB. The predictive value of the predominant subtype and pathologic grade for discriminating between VPI (+) and VPI (−) therefore appeared to be limited. The extraction of an advanced radiomics signature allowed us to quantitatively assess the heterogeneous internal features of lung adenocarcinoma with different tumor phenotypes on a macroscopic tissue scale by converting imaging data into high-dimensional quantitative descriptors [13]. Intratumor heterogeneity calculated by radiomics has been suggested to correlate with worse clinical outcome and greater risk of lymph node involvement and distant metastasis [8][13][16], and should therefore be able to reveal the potentially malignant characteristics of primary lung adenocarcinoma with VPI. Consistent with this hypothesis, the present study showed that the integrated radiomics cut-off value in patients with stage IB lung adenocarcinoma with VPI (+) was significantly higher than in patients with stage IA VPI (−), and the radiomics signature was validated as a prognostic factor in the validation data. These findings support the view that the radiomics signature may infer tumor phenotypic characteristics in stage I lung adenocarcinoma and discriminate the potentially malignant characteristics of primary adenocarcinoma with VPI.
Table 3. Prognostic Models for Predicting NSCLC with Visceral Pleural Invasion. Note: AIC = Akaike information criterion. † Potential significances were identified in four-best performing radiomics features with univariate analysis. ‡ Percentile 10%, WavEnLL_S_2, 45dgr_GLevNonU showed to be independent factors with multiple regression analysis. ↑ Multiple radiomics features-based signature showed significantly higher Concordance Index than single radiomics features.
The radiomics signature included the five best-performing radiomics features: percentile 10%, wavEnLL_S_2, S_0_1_SumAverage, 45dgr_GLevNonU and S_3_3_Contrast, which represent intratumor heterogeneity within different radiomics feature groups. All features differed significantly between stage IA and stage IB with VPI, which is partly consistent with the results of recent studies on prognostic stratification [16]. Among them, percentile 10%, wavEnLL_S_2 and S_0_1_SumAverage were identified as independent factors in the validation data set with the Wilcoxon signed-rank test. Percentile 10% is the value below which 10% of the voxel values in the histogram lie, counted from the left. WavEnLL_S_2, a label for a wavelet feature, is the energy of the wavelet coefficients in subband LL. S_0_1_SumAverage is one of the labels for the co-occurrence matrix features; the values in the label represent coordinates that carry information about the distance and direction between pixels. However, when the diagnostic performance of single radiomics features was compared with that of the signature based on multiple radiomics features, the signature showed a significantly higher Az value and concordance index, indicating that high-dimensional radiomics factors should be integrated to select valuable biomarkers of phenotypic characteristics, as is done in genomics.
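The two simplest feature definitions above can be illustrated with a short sketch. It is not the implementation used by the feature-extraction software in this study: the nearest-rank percentile rule, the symmetric co-occurrence matrix, the use of raw grey-level values as matrix indices, and the reading of the label "0_1" as a (row, column) pixel offset are all assumptions made for illustration, and the function names are hypothetical.

```python
from collections import defaultdict


def percentile_nearest_rank(values, p):
    """Return the smallest value v such that at least p percent of values are <= v.

    Assumption: nearest-rank percentile; other software may interpolate.
    """
    ordered = sorted(values)
    # rank = ceil(p * n / 100), computed with integer arithmetic, at least 1
    rank = max(1, -(-p * len(ordered) // 100))
    return ordered[rank - 1]


def glcm_sum_average(image, dy=0, dx=1):
    """Sum average of a symmetric, normalised grey-level co-occurrence matrix.

    image is a 2-D list of integer grey levels; (dy, dx) is the pixel offset.
    Assumption: grey-level values are used directly as matrix indices.
    """
    counts = defaultdict(int)
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[(a, b)] += 1  # pair in the forward direction
                counts[(b, a)] += 1  # mirrored pair makes the matrix symmetric
    total = sum(counts.values())
    # Sum average = sum over all pairs of (i + j) * p(i, j)
    return sum((a + b) * n for (a, b), n in counts.items()) / total


# Hypothetical voxel values and a tiny hypothetical 2 x 2 image
voxels = list(range(1, 101))
p10 = percentile_nearest_rank(voxels, 10)
sum_avg = glcm_sum_average([[0, 1], [0, 1]])
```

A low percentile 10% value means that the darkest tenth of the voxels in the volume of interest has low attenuation, while the sum average grows with the mean grey level of neighbouring pixel pairs.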
Our study has several limitations. First, the sample size is relatively small and external validation is lacking. Second, although radiomics demonstrated intratumor heterogeneity within invasive adenocarcinoma, whether these macroscopic imaging features have underlying biologic relevance is not clear and should be investigated further. Third, disease-free survival in this study is based on previous studies. We did not evaluate the difference in 5-year disease-free survival rate between stage IA and stage IB with VPI (+) in lung adenocarcinoma, because the median follow-up time was short given that treatment failure may occur up to several years after surgery, and the survival data were insufficient for statistical analysis. Further validation against survival is needed.
In conclusion, a multi-feature-based radiomics signature on thin-section CT was designed to identify tumor phenotypes of lung adenocarcinoma with visceral pleural invasion. The new radiomics biomarker was an independent prognostic factor for discriminating VPI and may provide a non-invasive means of evaluating prognosis in early-stage lung adenocarcinoma.

Materials and Methods
Patients. This study was approved by the First Affiliated Hospital of Nanjing Medical University (Nanjing, China), and all methods were carried out in accordance with the approved guidelines. All subjects provided written informed consent to participate in the study. We systematically reviewed 739 patients with peripheral lung adenocarcinoma who underwent chest CT with thin-section (1.0 mm) images between January 2014 and December 2016. Inclusion criteria were as follows: (a) surgical resection with diagnosis by pathologic examination; (b) pathological-N0M0 peripheral lung adenocarcinoma 3.0 cm or less in greatest dimension according to the eighth edition of the TNM staging system [4]; (c) thin-section CT performed within 90 days before surgery; (d) available clinical data, including age, sex and smoking history. A total of 327 Asian patients met all the inclusion criteria, and 412 patients were excluded for one or more of the following reasons: (a) CT with intravenous administration of contrast material (n = 180); (b) unsatisfactory image quality due to respiratory artifact during the examination (n = 41); (c) pathological invasion of the parietal pleura (PL3) (n = 50); (d) tumor size >3 cm (n = 100); (e) a separate tumor nodule accompanying the primary tumor, or direct invasion of any of the following structures: chest wall, phrenic nerve, parietal pericardium, diaphragm, mediastinum, heart, great vessels, trachea, recurrent laryngeal nerve, esophagus, vertebral body or carina (n = 41).
CT scanning. All patients underwent unenhanced chest CT with a 64-slice (Definition) or 128-slice (Definition AS+; Siemens, Malvern, Pa) CT scanner with 1.0 mm slice thickness and 0.8 mm reconstruction interval. The protocol was as follows: 100-120 kVp, with mAs set by CARE Dose4D for exposure dose reduction. All images were reconstructed with a high kernel (b60) and a 512 × 512 matrix. Window settings were standard lung (window width, 1500 HU; window level, −600 HU) and mediastinum (window width, 350 HU; window level, 50 HU).
A pathologist with 10 years of experience who was blinded to the imaging findings evaluated the histopathologic patterns and the T and N descriptors according to the eighth edition of the TNM staging system [4]. Elastic stains were performed to clarify the status of VPI when the initial hematoxylin and eosin stained slides showed that the tumor was adjacent to the pleura. VPI was classified according to the eighth edition of the TNM classification: PL0 (T1), no pleural invasion beyond the elastic layer; PL1 (T2), invasion beyond the elastic layer; PL2 (T2), invasion to the surface of the visceral pleura; and PL3 (T3), invasion of the parietal pleura. The differences in survival between PL0 and PL1 or PL2 were statistically significant for tumors both ≤3 cm and >3 cm in size, whereas there was no statistically significant difference in survival between PL1 and PL2 [3]. In the present study, PL1 and PL2 were therefore combined into the VPI (+) group and upstaged to T2a for lesions ≤3 cm in size, and PL0 was classified as the VPI (−) group. We evaluated the differences in histopathological and CT radiomics features between these two groups.
Segmentation and morphological features extraction. Each nodule was automatically segmented with lungCAD software (Siemens SOMATOM Force CT). First, two thoracic radiologists (author 2 and author 3, with 8 years and 6 years of experience in chest imaging, respectively) who were blinded to the pathologic results independently marked the longest diameter of the lesion. Second, the precise edge of the whole-tumor volume of interest (VOI) was automatically segmented by lungCAD. Visually identified mismatches, and blood vessels and chest wall adjacent to the nodule margin, were adjusted manually. The three-dimensional longest diameter and 7 other morphological features were computed separately by lungCAD (Fig. 4).

Radiomics features extraction and selection.
A total of 308 radiomics features describing tumor phenotype were extracted and quantified from the whole VOI segmented previously by using AnalysisKit (GE Healthcare, China). The two experienced radiologists who performed the lesion segmentation extracted these features independently. Radiomics features were divided into four groups: I) shape, II) tumor intensity, III) texture and IV) wavelet features. Tumor intensity was estimated by histogram analysis with 9 features. Then, 271 texture features, derived from the gray-level co-occurrence matrix (GLCM) and the gray-level run-length matrix (GLRLM), were extracted from the CT scans. Finally, the coiflet wavelet transformation was used to compute 20 wavelet features, which are transformed-domain representations of the intensity and textural features.
Feature selection on the basis of reproducibility and redundancy was performed to prioritize these high-dimensional features. First, the concordance correlation coefficient (CCC) was used to test the reproducibility and stability of each imaging feature. The 100 most stable features with a CCC value ≥0.9 were kept. The CCC is defined as

$$\rho_c = \frac{2\rho\,\sigma_x\sigma_y}{\sigma_x^2 + \sigma_y^2 + (\mu_x - \mu_y)^2}$$

where $\mu_x$ and $\mu_y$ are the mean values of variables x and y; $\sigma_x^2$ and $\sigma_y^2$ are the corresponding mean squares (variances); and $\rho$ represents the correlation coefficient of x and y. We then removed redundant features with a nearest neighbor distance <0.05. Eighty-three features were kept after adjusting for redundancy.
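As a minimal sketch of the reproducibility filter, the concordance correlation coefficient between two readers' values of one feature can be computed as follows. The use of population (1/n) moments, the function names and the example feature values are assumptions for illustration and are not taken from the study data or software.

```python
from statistics import fmean


def concordance_correlation(x, y):
    """Concordance correlation coefficient between paired measurements x and y.

    Uses population (1/n) moments:
    CCC = 2 * cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2),
    where cov(x, y) = rho * sigma_x * sigma_y.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    mx, my = fmean(x), fmean(y)
    var_x = fmean([(a - mx) ** 2 for a in x])
    var_y = fmean([(b - my) ** 2 for b in y])
    cov_xy = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return 2 * cov_xy / (var_x + var_y + (mx - my) ** 2)


def stable_features(reader_a, reader_b, threshold=0.9):
    """Keep the feature names whose CCC between the two readers is >= threshold.

    reader_a and reader_b map feature name -> list of per-lesion values.
    """
    return [
        name
        for name in reader_a
        if concordance_correlation(reader_a[name], reader_b[name]) >= threshold
    ]


# Hypothetical values for two features measured by two readers on four lesions
reader_a = {"feat_stable": [1.0, 2.0, 3.0, 4.0], "feat_shifted": [1.0, 2.0, 3.0, 4.0]}
reader_b = {"feat_stable": [1.0, 2.0, 3.0, 4.0], "feat_shifted": [3.0, 4.0, 5.0, 6.0]}
kept = stable_features(reader_a, reader_b)
```

Unlike the Pearson correlation, the coefficient penalises a systematic offset between readers: the shifted feature in the example is perfectly correlated but falls below the 0.9 threshold.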

Feature Selection and Classification.
In the present study, we implemented a robust recursive feature elimination (RFE) method based on SVM for feature selection. RFE-SVM was used to build an integrated radiomics data set and to return a ranking of features by recursively training the SVM. In each iteration, the feature with the smallest ranking score (contribution weight ω) was removed, until the cumulative ω of the retained features reached 80%.
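The stopping rule can be sketched as below. This is a simplification under stated assumptions: a full RFE retrains the SVM after every removal and recomputes the weights, whereas this sketch takes one fixed set of weights and only shows how removing the lowest-ranked feature stops once the retained share would fall below the target. The function name and the example weights are hypothetical.

```python
def eliminate_until_target(weights, target=0.80):
    """Drop the lowest-weighted feature repeatedly while the retained features
    still account for at least `target` of the total weight.

    weights maps feature name -> non-negative contribution weight.
    Returns (retained feature names ranked by weight, retained share of weight).
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    ranked = sorted(weights, key=weights.get, reverse=True)
    retained_share = 1.0
    while len(ranked) > 1:
        share = weights[ranked[-1]] / total
        if retained_share - share < target:
            break  # removing this feature would drop below the target
        ranked.pop()
        retained_share -= share
    return ranked, retained_share


# Hypothetical weights for four features
example = {"a": 40.0, "b": 30.0, "c": 20.0, "d": 10.0}
selected, share = eliminate_until_target(example, target=0.80)
```

With these example weights the loop removes only "d", because removing "c" as well would leave 70% of the total weight, which is below the 80% target.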
An SVM with a radial basis function (RBF) kernel was applied to separate the labeled training data into two classes. The SVM classifier is a supervised learning method that projects the data into a multidimensional space and separates the two classes with a hyperplane. For an SVM with an RBF kernel, the kernel is

$$K(x, x_i) = \exp\left(-\gamma \lVert x - x_i \rVert^2\right)$$

where $x$ and $x_i$ are two input vectors and gamma ($\gamma$) controls the shape of the hyperplane. Because the number of patients was relatively small, SVM classifiers were trained (cohort 1) and validated (cohort 2) using a repeated (10 repeat iterations) ten-fold cross-validation approach with one fold left out: one fold was used for validation (cohort 2) and the other nine folds were used for training (cohort 1). This procedure was repeated until each case in the database had been used once in the validation set. Because the direct output value of the classifier is not a probability of VPI, we converted the output values to probabilities (Pi) by applying a sigmoid function:

$$P_i = \frac{1}{1 + e^{-x}}$$

where $x$ is the output value of the classifier. The value of Pi, which indicates the probability that the target lesion has VPI, was also termed the radiomics signature, because it integrates multi-feature-based radiomics information and provides a cut-off point for the probability of the VPI phenotype.
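The building blocks of this step can be sketched in plain Python. The sketch assumes the standard Gaussian RBF kernel and the standard logistic sigmoid with no fitted scaling parameters, and it only generates the training and validation index sets for repeated ten-fold cross-validation; it does not train an SVM. The function names and the random seed are illustrative choices, not the study's code.

```python
import math
import random


def rbf_kernel(x, xi, gamma):
    """Gaussian RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-gamma * squared_distance)


def sigmoid(x):
    """Logistic sigmoid mapping a classifier output to the interval (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # avoids overflow for large negative x
    return z / (1.0 + z)


def repeated_kfold(n_cases, k=10, repeats=10, seed=0):
    """Yield (repeat, training indices, validation indices).

    Within one repeat every case appears in exactly one validation fold.
    """
    rng = random.Random(seed)
    for rep in range(repeats):
        order = list(range(n_cases))
        rng.shuffle(order)
        folds = [order[i::k] for i in range(k)]
        for held_out in range(k):
            validation = folds[held_out]
            training = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
            yield rep, training, validation


# 327 cases, 10 repeats of ten-fold cross-validation -> 100 train/validation splits
splits = list(repeated_kfold(327))
```

Each of the 100 splits would train one classifier on about 294 cases and validate it on the remaining 32 or 33 cases.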
Performance Evaluation and Statistical Analysis. We evaluated the predictive performance in the validation cohort using the receiver operating characteristic (ROC) curve, quantified by the area under the ROC curve (AUC) according to the method of DeLong et al. [17]. Diagnostic accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (+LR) and negative likelihood ratio (−LR) were calculated. A regression model of the radiomics signature was generated, and the Akaike information criterion (AIC) was used as a measure of goodness of fit. The concordance index (range, 0-1) was used to assess the prognostic capability of the radiomics signature. Univariate and multivariate logistic regression analyses were used to determine the prognostic value of the radiomics signature and of the significant radiomics features. The Wilcoxon signed-rank test was used to validate the performance of each single best-performing feature.
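The diagnostic measures listed above follow directly from a 2 × 2 confusion matrix, and the AUC can be obtained from the ranks of the classifier outputs. The sketch below uses the standard definitions; the counts and scores are hypothetical and are not the study data, and the AUC is computed with the pairwise (Mann-Whitney) formulation rather than with any particular statistics package.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    false_positive_rate = fp / (fp + tn)
    false_negative_rate = fn / (fn + tp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # +LR = sensitivity / (1 - specificity)
        "plr": sensitivity / false_positive_rate if false_positive_rate else float("inf"),
        # -LR = (1 - sensitivity) / specificity
        "nlr": false_negative_rate / specificity if specificity else float("inf"),
    }


def auc_from_scores(positive_scores, negative_scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical counts and scores for illustration only
metrics = diagnostic_metrics(tp=90, fp=10, tn=90, fn=10)
auc = auc_from_scores([0.9, 0.8], [0.1, 0.8])
```

Applying the same likelihood-ratio definitions to the reported sensitivity (90.6%) and specificity (93.2%) gives +LR = 0.906 / 0.068 ≈ 13.3 and −LR = 0.094 / 0.932 ≈ 0.101, in line with the reported 13.4 and 0.101 after rounding.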