Peritumoral radiomics features on preoperative thin-slice CT images can predict the spread through air spaces of lung adenocarcinoma

The spread through air spaces (STAS) is recognized as a negative prognostic factor in patients with early-stage lung adenocarcinoma. The present study aimed to develop a machine learning model for the prediction of STAS using peritumoral radiomics features extracted from preoperative CT imaging. A total of 339 patients who underwent lobectomy or limited resection for lung adenocarcinoma were included. The patients were randomly divided (3:2) into training and test cohorts. Two prediction models were created using the training cohort: a conventional model based on the tumor consolidation/tumor (C/T) ratio and a machine learning model based on peritumoral radiomics features. The areas under the curve for the two models in the testing cohort were 0.70 and 0.76, respectively (P = 0.045). The cumulative incidence of recurrence (CIR) was significantly higher in the STAS high-risk group when using the radiomics model than that in the low-risk group (44% vs. 4% at 5 years; P = 0.002) in patients who underwent limited resection in the testing cohort. In contrast, the 5-year CIR was not significantly different among patients who underwent lobectomy (17% vs. 11%; P = 0.469). In conclusion, the machine learning model for STAS prediction based on peritumoral radiomics features performed better than the C/T ratio model.


Scientific Reports
| (2022) 12:10323 | https://doi.org/10.1038/s41598-022-14400-w www.nature.com/scientificreports/ Radiomics is a quantitative approach that makes use of statistical patterns in tumor CT images to predict tumor pathology, tumor treatment response, or cancer prognosis using a large number of features extracted from medical images. Numerous studies have shown that radiomics features could quantify tumor characteristics and can potentially be applied as clinical biomarkers [11][12][13][14][15] . Several authors applied the radiomics approach to the prediction of STAS. Most previous studies have investigated the association between STAS and radiomics features inside the tumor 16,17 . Since STAS is a pathological finding present at the tumor edge, radiomics features at the tumor margins on preoperative CT images may lead to a more accurate prediction of STAS. Zhuo et al. had evaluated peritumoral radiomics features from preoperative CT, which resulted in no significant improvement in the prediction performance of STAS compared with a clinical model including maximal diameter of the solid component and mediastinal node metastasis 18 . This may be because the radiomics features were obtained from regions of interest (ROIs) outside the tumor contour, which may not truly represent the tumor edge characteristics associated with STAS, as described earlier, because of the limits of CT resolution. The peritumoral ring-shaped ROI, which contains both inside and outside the tumor edge, may overcome this limitation.
In the present study, we investigated the performance of machine learning models based on peritumoral radiomics features, aiming to improve the prediction performance of STAS in comparison with the conventional method using the C/T ratio.

Methods
Ethics. All study procedures complied with the 1964 Declaration of Helsinki and its later amendments. The study was approved by the Ethics Committee at Kyoto University Graduate School and Faculty of Medicine (approval no. R2272). As this study was performed retrospectively, the requirement for informed consent was waived.
Patients. From January 2007 to December 2015, 802 patients with pathologically confirmed lung adenocarcinoma were identified from our surgery database. Of these, 463 patients were excluded because of induction chemotherapy (n = 23), multiple lung cancer nodules (n = 69), absence of thin-slice plain CT (n = 367), tumor diameter greater than 5 cm (n = 2), and presence of lymph node metastasis (n = 2). The remaining 339 patients were included in the analyses (Fig. 1).
Histological evaluation. Two experienced pathologists reviewed the hematoxylin and eosin tissue sections with a Nikon Eclipse 80i optical microscope (Nikon Corporation, Tokyo, Japan) according to the WHO definitions of STAS. The edge of the main tumor was defined as a smooth surface that could be easily recognized by low-power visual field examination. STAS was defined as tumor aggregates floating in the air cavity at least one alveolus away.
Image acquisition. CT scans were performed using a 64-detector-row CT scanner (Aquillion 64, Canon Medical Systems, Otawara, Japan) or a 320-detector-row scanner (Aquillion ONE, Canon Medical Systems). Images were reconstructed with a soft-tissue kernel (FC11, 13) and a slice thickness of 1 mm for radiomics analysis and with lung kernel (FC51) and a slice thickness of 0.5 mm for evaluation of the C/T ratio, using a filtered back-projection algorithm. Table E1 enumerates the detailed scan parameters.
Radiological evaluation of the C/T ratio and the type of nodule. The largest diameter of the whole tumor, the largest diameter of consolidation (solid part), and types of nodules (solid, part-solid, and ground- www.nature.com/scientificreports/ glass nodule) were determined by an experienced radiation oncologist (K.T. with ten years of experience in radiotherapy for lung cancer and in lung cancer related-image interpretation). A board-certified radiologist (R.S. with 14 years of experience in lung image interpretation) independently confirmed the results and consensus was reached by discussion in the event of disagreement. All cases were anonymized and both readers were blinded to the presence or absence of STAS and to clinical outcomes. The largest diameter of the tumor was measured on the axial, coronal, or sagittal planes of the CT in the lung window (window level, − 600 HU; window width, 1500 HU). The largest consolidation diameter was measured on the same plane where the largest tumor diameter was measured.
Tumor segmentation and feature extraction. Peritumoral ROI was defined as a ring-shaped ROI 5 mm inward and 5 mm outward from the tumor surface, excluding surrounding soft tissues, such as the chest wall or mediastinum. A radiation oncologist (K.T.) segmented the peritumoral ROIs using 3D Slicer (version 4.10.2), which is a free, open source and multi-platform software package for medical, biomedical, and related imaging research (https:// www. slicer. org/). Details on the segmentation procedures are given in Figure E1. Segmentation in randomly selected patients was also performed by a radiologist (R.S.) to assess the reproducibility of radiomics features. Dice coefficients were calculated to compare the lesion segmentation and assess the interobserver variability. The radiomics features were extracted from peritumoral ROIs using PyRadiomics (version 3.0), supported by the image biomarker standardization initiative (IBSI) 19 . All slices were resampled to 1 × 1 mm 2 in the horizontal and vertical directions before the feature extraction. The features included 14 shapes, 18 first-order, 22 gray level co-occurrence matrices (GLCM), 14 gray level dependence matrices (GLDM), 16 gray level size zone matrices (GLSZM), 16 gray-level run-length matrices (GLRLM), and 5 neighboring gray-tone difference matrices (NGTDM). In addition to the original image, images processed with Laplacian of Gaussian (LoG) and coiflet wavelet filters were applied for six feature classes (first-order, GLCM, GLDM, GLSZM, GLRLM, and NGTDM). Consequently, 1288 features were extracted from each ROI. A complete list of radiomics features is provided in Table E2.
Model development. The patient cohort was randomly divided into training and testing cohorts (3:2) using the two stratification factors (the presence of STAS and nodule types). We developed two models for the prediction of STAS (Fig. 2) in the present study. One was a machine learning model based on peritumoral radiomics features (peritumoral radiomics model). The other was a logistic regression model based on the tumor C/T ratio (C/T ratio model). www.nature.com/scientificreports/ Before developing the peritumoral radiomics model, we selected non-redundant, reproducible features. The intraclass correlation coefficients (ICCs) were calculated from ROIs independently by two researchers to quantify the interobserver reproducibility of the radiomics features. An ICC of > 0.75 was considered reproducible 20 . The absolute pairwise Spearman's correlation coefficient values were calculated to remove redundant features in the training set (|ρ| > 0.7).
Finally, the peritumoral radiomics model was developed using the least absolute shrinkage and selection operator (LASSO) classification algorithm (Python scikit-learn environment, version 0.22.1) as a peritumoral radiomics model. The regularization parameters for LASSO were tuned using fivefold cross-validation on the training dataset. The regularization parameters of the model with the highest AUC in the fivefold cross validation were used to create the final model.
Evaluation of predictive performance. The predictive performance of the models was compared using the area under the curve (AUC) of the receiver operating characteristic curve. The 95% confidence interval (CI) for the AUC was calculated by bootstrapping with 2,000 iterations. For comparison, the P value was calculated by the DeLong method using the pROC package in R (version 1.16.1). The cut-off value dividing the cohort into STAS high-risk and STAS low-risk groups was selected to maximize Youden's index for each model.
The follow-up and survival periods were calculated from the surgery day. The cumulative incidence of recurrence (CIR) was calculated using competing risk analysis, with death without recurrence regarded as a competing event. The differences in CIR between the groups were tested by the Gray's test using the cmprsk package in R (version 2.2-9). The level of significance was set at P < 0.05.

Results
Patients. The patient cohort was divided into the training (n = 203) and testing (n = 136) cohorts. STAS was positive in 57 patients (28%) in the training cohort and 38 patients (28%) in the testing cohort. The demographic and clinical characteristics of the patients are shown in Table 1. There were no statistically significant differences in the patient characteristics between the training and testing cohorts.   Table 2), respectively. The cut-off values between the STAS high-and low-risk groups were determined to be 0.53 and 0.54 for the peritumoral radiomics and the C/T ratio models, respectively.
Predictive performance. The AUCs in the testing cohort of the peritumoral radiomics model and the C/T ratio model were 0.76 (95% CI 0.67-0.84) and 0.70 (95% CI 0.61-0.78), respectively (Fig. 4). The predictive performance of the peritumoral radiomics model was significantly higher than that of the C/T ratio model (P = 0.045, Table 2). The distribution of feature values used in the peritumoral radiomics model in the training and testing cohorts is shown in Fig. 5. In the testing cohort, patients at STAS-high risk according to the radiomics model were significantly associated with greater age, lobectomy, larger tumor diameter with higher C/T ratio compared with those at low risk ( Table 3). The median follow-up duration of the test group was 58 months (range 1-130 months). In the test group, 61 patients (45%) underwent limited resection (partial resection [n = 13] and segmentectomy [n = 48]), and 75 (55%) underwent lobectomy. In the limited resection group, 6 (10%) patients developed recurrence (locoregional [n = 4] and distant [n = 5]), and two (3%) died without recurrence. In contrast, 10 (13%) patients developed recurrence (locoregional [n = 4] and distant [n = 8]), and 5 (7%) died without recurrence in the lobectomy group.
In the testing cohort, the patients were divided into STAS high-risk and STAS low-risk groups based on the predicted probability calculated from the peritumoral radiomics model. In the limited resection group, the risk of recurrence was significantly higher in the STAS high-risk group than that in the STAS low-risk group (5-year CIR, 44% vs. 4%; P = 0.002) in the limited resection group (Fig. 6a). In contrast, in the lobectomy group, the risk of recurrence was not significantly different between the STAS high-risk and STAS low-risk groups (5-year CIR, 17% vs. 11%; P = 0.469) (Fig. 6b).

Discussion
We demonstrated that the model using radiomics features extracted from the peritumoral ROI could significantly improve the prediction performance of STAS as compared to the conventional model using C/T ratio and could also predict the prognosis after limited surgery in lung adenocarcinoma. To the best of our knowledge, no studies  www.nature.com/scientificreports/ so far have used STAS prediction models to predict the prognosis after limited resection. Our prediction model had the potential to diagnose patients who benefit from limited resection. We focused on the radiomics features at the tumor margins (peritumoral radiomics) to predict the presence of STAS since STAS is a pathological finding found at the tumor margins. Toyokawa et al. reported that imaging findings of nodule margins, such as peritumoral notch and surrounding GGO, are independent predictive factors for the presence of STAS 10 . In the study by Zhuo et al., radiomics analysis was performed focusing on the outer region of the tumor, but no improvement in prediction was observed 18 . In their report, fine characteristics of the tumor margins might not have been adequately included. In our study, the marginal nature of the tumor was captured in detail by adding 5-mm margins for both inside and outside the tumor. According to Kadota et al., the median distance of STAS was 1.5 mm (range 0.2-8.5) from the tumor edge, and a 5-mm margin outside the      www.nature.com/scientificreports/ As with pathological STAS, the risk of postoperative recurrence was significantly higher in the high-risk STAS group than that in the low-risk STAS group only in the limited resection group. This implies that our model could be used for additional decision-making information when selecting the surgical technique. Indeed, in the STAS low-risk group, more patients underwent limited resection compared with the STAS high. That might reflect that surgeons chose a less invasive surgery for low-risk patients according to the tumor size and C/T ratio. Masai et al. reported that the presence of STAS and tumor margins < 10 mm are significant risk factors for local recurrence in early-stage lung cancer after limited resection, and preoperative prediction of this may allow optimization of the resection margin 24 .
Our prediction model would be applied to medically inoperable patients with early-stage lung cancer, where stereotactic body radiation therapy (SBRT) is delivered. In the SBRT cases, evaluation of the presence of STAS in pathological specimens is not possible. The clinical target volume (CTV), which is the extent of tumor microinvasion, is usually defined as the same as gross tumor volume in the planning of SBRT 25,26 . An appropriate CTV margin may help local control if an accurate prediction of STAS is possible.
There are several limitations to this study. First, this was a single-center retrospective study, and prospective external model validation is needed. Second, patients who underwent limited resection included those who could not undergo lobectomy because of complications or elderly age and were subject to death due to causes other than lung cancer. The usefulness of this model in predicting the probability of STAS as a basis for clinical decision-making needs to be verified in a randomized study. Third, the radiomics features used in this model were not compared with imaging findings interpreted by radiologists; therefore, further studies are needed to identify the significance of those features. Fourth, our prediction model was solely based on imaging features because our main hypothesis was that the imaging features could reflect the information of the existence of STAS. However, clinical factors such as age, gender, and serum carcinoembryonic antigen are also reported to be predictors of STAS, and the addition of these factors may lead to a more predictive model 27 . Lastly, there is room to further reduce data sampling bias by applying sophisticated methods to building a prediction model, such as nested cross validation; however, we needed to leave a test cohort aside in advance for the following survival outcome analysis.
In conclusion, we developed a machine learning model based on peritumoral radiomics features. Its prediction performance is superior compared to that of the conventional model using the C/T ratio. Our model could further predict prognosis after limited surgery, as well as pathological STAS.