Introduction

Lung cancer is one of the most prevalent cancers worldwide, with non-small cell lung cancer (NSCLC) accounting for 85% to 90% of all forms of lung cancer1. Radiotherapy is an option for patients who are unable to tolerate or who decline surgery. Stereotactic body radiotherapy2 and particle therapy3 have been widely applied to such patients. Carbon‐ion radiotherapy (CIRT) has excellent dose‐localizing properties4, and can thus deliver a high dose to a target while avoiding adjacent critical organs at risk. The high radiosensitivity of the lungs constitutes a critical dose-limiting factor for treating thoracic tumors with radiation5,6. Various pulmonary side effects can arise after CIRT, such as radiographic lung damage, pleural reactions, pneumonitis, or fibrosis7,8.

Generally, CT cannot distinguish necrotic tumors or fibrotic scar tissue from residual or recurrent tumors9, which delays the decision to repeat CIRT. In contrast, metabolic imaging using 18F-fluorodeoxyglucose (18F-FDG) positron emission tomography/computed tomography (18F-FDG PET/CT) can discriminate between recurrence and post-treatment changes10,11,12, because it provides a noninvasive, indirect measure of glucose metabolism in vivo.

The maximum standardized uptake value (SUVmax) is the most prevalent parameter used to estimate tumor metabolic activity in 18F-FDG PET/CT images. However, the SUVmax is the value of the single most intense pixel in a region of interest; it reflects only the area of highest 18F-FDG uptake in a tumor and cannot reflect metabolic activity in the whole tumor13. Therefore, the extent of active lesions in malignant tumors and of active residual lesions after treatment is difficult to ascertain accurately using SUVmax. To overcome these issues, metabolic tumor volume (MTV) and total lesion glycolysis (TLG) have been proposed as supplementary diagnostic indicators of tumor activity14.

Few studies have discriminated lung cancer from radiation pneumonitis (RP) in images based on accumulated 18F-FDG, even though 18F-FDG uptake after radiotherapy (RT) might be due not only to recurrent tumors but also to RT-induced inflammation. The characterization of uptake heterogeneity is gaining popularity through radiomics-based analyses that extract high-throughput features based on the intensity, shape, and texture of uptake within regions of interest. Using textural features has improved the ability of 18F-FDG PET/CT to discriminate abnormal from normal tissues and to delineate lesions. Textural features derived from Neighboring Gray Tone Difference Matrices describe properties such as coarseness, contrast, and busyness on PET images15. They can differentiate cancerous tumors of the head and neck from normal tissues16. However, whether textural features can discriminate malignant from benign lesions remains unknown. The value of quantitative heterogeneity in 18F-FDG PET/CT images for RP after CIRT has also not been investigated. The aim of this study was to assess the ability of 18F-FDG PET/CT metabolic parameters and textural image features to differentiate NSCLC from RP after CIRT, as a step toward the differential diagnosis of malignant and benign lesions.

Methods

Patients

We retrospectively analyzed 18F-FDG PET/CT data obtained from 32 patients with NSCLC who underwent 18F-FDG PET/CT before CIRT (50.0 GyE/day), and from 31 patients with RP who were diagnosed by biopsy or by clinical follow-up at > 1 year after CIRT (50.0 GyE/day) for NSCLC. The patients with NSCLC had not received chemotherapy before CIRT and underwent 18F-FDG PET/CT at baseline.

This clinical study was approved by the Ethics Committee at the National Institute of Radiological Sciences (Approval No. 19–019), and written informed consent was obtained from all patients. This study was conducted as research involving human participants in accordance with the principles outlined in the 1964 Declaration of Helsinki and its later amendments. The results of this retrospective study did not influence further therapeutic decision-making.

18F-FDG PET/CT

Patients fasted for at least 6 h before being injected with 4 MBq/kg 18F-FDG. After voiding, whole-body images were acquired from the top of the skull to the midthigh at a mean of 60 min after injection using an Aquiduo PET/CT scanner (Canon Corp., Japan). The Aquiduo PET/CT scanner has the following technical characteristics: detector material, Lu2SiO5(Ce) (LSO); crystal size, 4.0 × 4.0 × 20 mm³; detector ring diameter, 830 mm; transaxial field of view (FOV), 585 mm; axial FOV, 162 mm; coincidence window, 4.5 ns; energy window, 425–650 keV; maximum ring difference, 27; random correction, delayed; scatter correction, single scatter simulation. Emission data were acquired for 2–3 min per bed position17. The PET images were reconstructed using an iterative algorithm (the combination of Fourier rebinning and the ordered subsets expectation–maximization [FORE + 2D-OSEM]; 4 iterations, 14 subsets) with an 8-mm Gaussian filter, a 128 × 128 matrix (3.9 mm/pixel) and 81 slices (2 mm/slice). Time-of-flight and point spread function correction were not applied. The spatial resolution of this scanner according to NEMA NU 2-2007 is 6.5 mm in full width at half maximum (FWHM) at 10 mm off center. Whole-body spiral CT scanning proceeded under the following parameters: 120 kV; auto exposure control (noise level: SD 10); 512 × 512 matrix; beam pitch, 0.94; 2 mm × 16-row mode. The CT data were used for attenuation correction.
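For orientation, the body-weight-normalized SUV that underlies the parameters analyzed in this study can be sketched as follows. This is a minimal illustration, assuming the standard body-weight definition, decay correction of the injected dose to the scan time using the 18F half-life, and a tissue density of 1 g/mL. The function and argument names are hypothetical and are not taken from the scanner or analysis software.

```python
# Physical half-life of 18F in minutes.
F18_HALF_LIFE_MIN = 109.77


def suv_body_weight(activity_kbq_per_ml, injected_mbq, weight_kg,
                    minutes_since_injection):
    """Body-weight SUV for one voxel (hypothetical helper).

    Assumes the measured activity refers to the scan time, so the
    injected dose is decayed to that same time point.
    """
    decay = 0.5 ** (minutes_since_injection / F18_HALF_LIFE_MIN)
    dose_at_scan_kbq = injected_mbq * 1000.0 * decay
    weight_g = weight_kg * 1000.0
    # Dose per gram of body weight (kBq/g); with 1 g/mL the ratio is unitless.
    return activity_kbq_per_ml / (dose_at_scan_kbq / weight_g)


# Example: a 60-kg patient injected with 4 MBq/kg, scanned 60 min later.
example_suv = suv_body_weight(10.0, 4.0 * 60.0, 60.0, 60.0)
```

With a weight-based injection of 4 MBq/kg, the dose per gram is the same for every patient at injection (4 kBq/g), so under these assumptions differences in SUV arise from tissue uptake and uptake time rather than from body weight.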

Quantitative analysis

We used the PET/CT medical imaging viewer PETSTAT (AdIn Research Inc., Tokyo, Japan) for texture analysis. Volumes of interest (VOI) on tumors were delineated using a threshold of 40% (40P) of the SUVmax in each lesion. In texture analysis of 18F-FDG PET/CT in NSCLC, the 40P threshold has shown excellent inter-operator reproducibility of texture features and has proven acceptable in previous studies18. It is also recommended for determining the heterogeneity of lesions affected by respiratory motion19.
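The 40P delineation and the volume-based SUV parameters derived from it can be sketched as follows. This is a simplified illustration and not the PETSTAT implementation: it assumes that the SUVs of the voxels in a box around one lesion are supplied as a flat list (a hypothetical input layout), it ignores spatial connectivity, and it uses the conventional definitions MTV = number of VOI voxels × voxel volume and TLG = SUVmean × MTV.

```python
from statistics import fmean


def suv_metrics(suv_values, voxel_volume_ml, threshold_fraction=0.40):
    """SUVmax, SUVmean, MTV and TLG for one lesion (hypothetical helper).

    suv_values: SUVs of all voxels in a box drawn around the lesion.
    voxel_volume_ml: volume of one voxel in millilitres.
    """
    suv_max = max(suv_values)
    cutoff = threshold_fraction * suv_max
    # The VOI keeps every voxel at or above 40% of the lesion SUVmax.
    voi = [v for v in suv_values if v >= cutoff]
    suv_mean = fmean(voi)
    mtv = len(voi) * voxel_volume_ml
    tlg = suv_mean * mtv
    return {"SUVmax": suv_max, "SUVmean": suv_mean, "MTV": mtv, "TLG": tlg}


# Voxel volume implied by the reconstruction described above:
# 3.9 mm x 3.9 mm x 2 mm = 30.42 mm^3 = 0.03042 mL.
VOXEL_VOLUME_ML = 3.9 * 3.9 * 2.0 / 1000.0
example = suv_metrics([1.0, 2.5, 4.0, 5.0, 3.0], VOXEL_VOLUME_ML)
```

Because the cutoff is relative to SUVmax, a lesion with a lower SUVmax has a lower absolute cutoff, and the resulting VOI can include a larger volume of less active tissue.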

The VOI was set so as not to include physiological accumulation. We calculated the SUV parameters (SUVmax, SUVpeak, SUVmean, MTV, and TLG) as well as 56 texture parameters derived from seven matrices for each VOI. SUVpeak is defined as the mean SUV in a 1-cm³ sphere around the pixel with the highest uptake and is assumed to be less affected by image noise than SUVmax. For images with noise properties typically associated with clinical PET images, SUVpeak can provide a slightly more robust alternative to SUVmax. To extract the texture parameters, we first equalized histograms by rescaling the intensity within each ROI between the 1st and 99th percentiles of the ROI over 64 bins. Using 64 equally divided bins has been a common approach for image quantitation in radiomics analysis, and it also allows exploration of the entire range of tumor signal intensity20.
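The intensity rescaling step can be illustrated with the sketch below. It assumes a linear rescaling between the 1st and 99th percentiles into 64 equal-width bins, with values outside that range clipped to the first or last bin; the exact procedure used by the analysis software may differ, and the function name is hypothetical.

```python
from statistics import quantiles


def discretize(values, n_bins=64, lower_pct=1, upper_pct=99):
    """Map ROI intensities to integer gray levels 0 .. n_bins - 1.

    Assumption: linear rescaling between the lower and upper
    percentiles of the ROI; outliers are clipped to the end bins.
    """
    # quantiles(..., n=100) returns the 1st to 99th percentiles.
    cuts = quantiles(values, n=100, method="inclusive")
    lo, hi = cuts[lower_pct - 1], cuts[upper_pct - 1]
    if hi == lo:
        # Degenerate ROI with (almost) constant intensity.
        return [0 for _ in values]
    levels = []
    for v in values:
        clipped = min(max(v, lo), hi)
        level = int((clipped - lo) / (hi - lo) * n_bins)
        levels.append(min(level, n_bins - 1))  # upper edge goes to last bin
    return levels


gray_levels = discretize([float(i) for i in range(101)])
```

Clipping at the 1st and 99th percentiles keeps a few extreme voxels from compressing the remaining intensities into a small number of bins.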

Statistical analysis

We compared all data between NSCLC and RP after CIRT using Wilcoxon rank-sum tests. Values lying nearest the upper left corner in ROC curves were considered to indicate optimal diagnostic accuracy. Sensitivity, specificity, and accuracy were calculated using appropriate cutoffs. Diagnostic accuracy was compared using areas under ROC curves (AUC), and ROC curves were also compared. All data were statistically analyzed using JMP v.14.2.0 software (SAS Institute Inc., Cary, NC, USA) and values with P < 0.05 were considered statistically significant.
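The cutoff selection described above can be sketched as follows. This is an illustration in plain Python rather than the JMP procedure: it assumes that higher parameter values indicate the positive class (for parameters that are lower in the positive class, the values would be negated first), computes the AUC from its rank-based (Mann-Whitney) form, and selects the cutoff whose ROC point lies nearest the upper left corner. All function names and example values are hypothetical.

```python
from math import hypot


def auc_mann_whitney(pos, neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_cutoff(pos, neg):
    """Cutoff whose ROC point is nearest to (sensitivity 1, specificity 1).

    A case is called positive when its value is >= cutoff.
    """
    best = None
    for cutoff in sorted(set(pos) | set(neg)):
        tp = sum(p >= cutoff for p in pos)
        tn = sum(n < cutoff for n in neg)
        sens = tp / len(pos)
        spec = tn / len(neg)
        dist = hypot(1 - sens, 1 - spec)
        if best is None or dist < best["distance"]:
            best = {
                "cutoff": cutoff,
                "sensitivity": sens,
                "specificity": spec,
                "accuracy": (tp + tn) / (len(pos) + len(neg)),
                "distance": dist,
            }
    return best


# Made-up parameter values for illustration only.
positives = [0.8, 0.9, 0.7, 0.6, 0.75]
negatives = [0.5, 0.65, 0.4, 0.3]
auc = auc_mann_whitney(positives, negatives)
result = best_cutoff(positives, negatives)
```

The rank-based form of the AUC is closely related to the Wilcoxon rank-sum statistic used for the group comparisons, which is why a parameter that differs significantly between groups also tends to show a high AUC.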

Results

Characteristics of patients with NSCLC and RP

The 63 patients (age, 76.8 ± 8.2 years; male, n = 47 [74.6%]) included 8 with squamous cell carcinoma (SCC), 33 with adenocarcinoma (ADC), 2 with neuroendocrine carcinoma, and 18 with unclassified non-small cell carcinoma. The sizes of the NSCLC and RP were 2.90 ± 1.59 (range, 71.9–19.0) and 5.08 ± 2.12 (range, 82.4–33.4) cm, respectively. Gender and tumor size significantly differed between the NSCLC and RP groups (P < 0.05; Table 1).

Table 1 Characteristics of patients with NSCLC and RP.

Comparison of NSCLC and RP

Statistical differences among parameters in the texture analysis were explored using Wilcoxon rank-sum tests (Table 2). Most texture parameters significantly differed between NSCLC and RP (P < 0.05), whereas most SUV parameters did not.

Table 2 Significant differences in each parameter.

Diagnostic accuracy

Figures 1 and 2 show the outcomes of ROC analyses for each parameter. The AUCs of SUVmax, SUVpeak, MTV, and TLG were 0.64, 0.63, 0.86, and 0.75, respectively. In contrast, the AUCs of the gray-level run-length matrix (GLRLM) run percentage and neighborhood gray-tone difference matrix (NGTDM) coarseness were 0.83 and 0.82, respectively, and significantly differed from the AUC of the SUVmax (Table 3).

Figure 1

Receiver operating characteristic curves of ability of SUV parameters to discriminate NSCLC from RP.

Figure 2

Receiver operating characteristic curves of ability of texture parameters to discriminate NSCLC from RP. GLRLM, gray-level run-length matrix run percentage; GLSZM, gray-level size-zone matrix intensity variability; NGLCM, normalized gray-level cooccurrence matrix dissimilarity; NGTDM, neighborhood gray-tone difference matrix coarseness; SUV Histogram, SUV histogram variance.

Table 3 18F-FDG PET/CT metrics and cutoffs for differentiation between NSCLC and RP.

Comparison of NSCLC and RP images

Figure 3 shows representative images. The GLRLM run percentage and NGTDM coarseness were significantly higher in NSCLC than in RP, whereas the SUVmax values of both lesions were similar.

Figure 3

Representative CT and fused PET/CT images of NSCLC and RP. (A) CT image shows mass in right middle lobe of 79-year-old male with NSCLC. (B) Axial fused PET/CT image shows high 18F-FDG uptake (SUVmax, 4.32) and heterogeneous 18F-FDG distribution (GLRLM run percentage, 0.81; NGTDM coarseness, 0.0068). (C) CT image of 77-year-old male with RP shows mass-like attenuation in left upper lobe. (D) Axial fused PET/CT image shows high 18F-FDG uptake (SUVmax, 3.75) and homogeneous 18F-FDG distribution (GLRLM run percentage, 0.65; NGTDM coarseness, 0.0017).

Discussion

To support the differential diagnosis of malignant and benign lesions, we examined whether texture parameters could differentiate NSCLC from RP in patients after CIRT more accurately than SUV parameters such as SUVmax and MTV. We found that the GLRLM run percentage and NGTDM coarseness, textural parameters that measure heterogeneity on 18F-FDG PET/CT images, might be able to differentiate RP from NSCLC.

The MTV and TLG were significantly increased, whereas the SUVmax was reduced in RP compared with NSCLC. Buyger et al. reported that a fixed threshold could substantially underestimate TLG and MTV in lesions with high 18F-FDG uptake21. Therefore, a decrease in SUVmax might erroneously increase MTV because a larger volume of a less active tumor will be included in the MTV22. The SUV is not helpful for differentiating benign from malignant lesions23, and MTV and TLG are calculated based on SUV. Thus, we considered that MTV and TLG are not suitable for differentiating NSCLC and RP.

Almost all texture parameters significantly differed between groups and had significantly better diagnostic ability than SUVmax. In particular, the GLRLM run percentage and NGTDM coarseness would be appropriate parameters due to their high diagnostic accuracy. The GLRLM run percentage corresponds to the number of homogeneous runs of voxels with a specific intensity within an image. We found that the GLRLM run percentage was significantly reduced in RP compared with NSCLC. This suggests that 18F-FDG accumulation is more heterogeneous in NSCLC than in RP. Hotta et al. reported that the GLRLM run percentage combined with machine learning in 11C-methionine PET images could distinguish recurrent brain tumors from radiation necrosis and that the value was significantly lower in radiation necrosis than in recurrent brain tumors24. Adding a machine learning model might yield higher sensitivity and specificity than the approach proposed here for differentiating benign from malignant lung lesions. The NGTDM coarseness is based on differences between each voxel and neighboring voxels in adjacent image planes, and thus measures granularity within an image25. Chen et al. found significantly lower NGTDM coarseness in benign than in malignant solitary pulmonary nodules26. The present findings revealed significantly lower NGTDM coarseness in RP than in NSCLC, which is considered to reflect biological differences between benign and malignant lesions. NGTDM coarseness is also useful for discriminating benign from malignant lesions of the head and neck27.
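To make the two texture parameters concrete, the sketch below computes them for a small two-dimensional image of integer gray levels. It is a simplified illustration and not the PETSTAT implementation: it assumes the commonly used definitions (run percentage = number of runs divided by number of pixels; coarseness = reciprocal of the probability-weighted sum of absolute differences between each pixel and the mean of its neighbors), works in 2D along rows with an 8-pixel neighborhood instead of in 3D, and averages over whichever neighbors exist at the image border. Function names and the small constant `eps` are hypothetical.

```python
def run_percentage(image):
    """GLRLM run percentage along rows: number of runs / number of pixels."""
    n_runs = 0
    n_pixels = 0
    for row in image:
        n_pixels += len(row)
        previous = None
        for level in row:
            if level != previous:  # a new run starts when the level changes
                n_runs += 1
            previous = level
    return n_runs / n_pixels


def ngtdm_coarseness(image, eps=1e-12):
    """NGTDM coarseness with a 3 x 3 neighborhood (2D simplification)."""
    rows, cols = len(image), len(image[0])
    counts = {}  # number of pixels per gray level
    diffs = {}   # summed |level - mean of neighbors| per gray level
    for r in range(rows):
        for c in range(cols):
            neighbours = [
                image[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
                if (rr, cc) != (r, c)
            ]
            if not neighbours:
                continue
            level = image[r][c]
            mean_nb = sum(neighbours) / len(neighbours)
            counts[level] = counts.get(level, 0) + 1
            diffs[level] = diffs.get(level, 0.0) + abs(level - mean_nb)
    n_valid = sum(counts.values())
    weighted = sum(counts[k] / n_valid * diffs[k] for k in counts)
    # eps avoids division by zero for a perfectly uniform image.
    return 1.0 / (weighted + eps)


uniform = [[1, 1, 1, 1] for _ in range(4)]
checker = [[1, 2, 1, 2], [2, 1, 2, 1], [1, 2, 1, 2], [2, 1, 2, 1]]
```

In this toy example, the uniform image consists of a few long runs and therefore has a low run percentage, whereas the checkerboard consists only of single-pixel runs and has the maximum run percentage of 1. This mirrors the interpretation above that a higher run percentage indicates more heterogeneous uptake.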

The present findings for texture features are considered to reflect the heterogeneity of 18F-FDG accumulation in lesions. Several studies have found that the intratumoral heterogeneity of 18F-FDG uptake might be useful for evaluating therapeutic responses and predicting the prognosis of NSCLC20,28,29, although a review by Han et al. found limited evidence to support the prognostic value of texture analysis in 18F-FDG PET in lung cancer30. Yet, little is understood about the application of 18F-FDG PET/CT texture analysis to the differential diagnosis of NSCLC and RP. The effects of CIRT on NSCLC can be highly beneficial. The application of 18F-FDG PET/CT texture analysis should improve the ability to distinguish NSCLC from RP. If so, then the amount of time required to determine a treatment regimen such as repeated irradiation could be decreased.

Most 18F-FDG PET/CT texture parameters (51 of 56 parameters; Table 2) were useful for differentiating NSCLC from RP after CIRT. Gunma University has recently reported that SUVmax and MTV had better predictive or prognostic power for CIRT-treated NSCLC patients31,32. The metabolic information in 18F-FDG PET/CT images reflects the biological properties of the tumor as well as metabolically diverse components, including inflammatory tissues and surrounding normal tissues such as vessels, bronchi, and pleura after CIRT29,33. 18F-FDG PET/CT is considered to be a key metabolic imaging modality in the management of CIRT for NSCLC.

This study had some limitations. Texture parameters could be affected by lesion size, histological tumor type, and inflammatory status31. Textural analysis also requires the evaluation of many variables, and the sample size was relatively small for this number of variables. Therefore, further studies in more patients with different types of tumors are needed. In addition, due to the limited spatial resolution of PET, image noise and partial volume effects may have affected the results of this study33.

Conclusion

We determined that texture parameters can differentiate NSCLC from RP after CIRT more accurately than SUV parameters such as SUVmax and MTV. The intratumoral heterogeneity of 18F-FDG uptake evaluated by texture analysis yielded better diagnostic ability for differentiating NSCLC from RP after CIRT than SUV. In particular, GLRLM run percentage and NGTDM coarseness are appropriate parameters with high diagnostic accuracy. Our findings provide important information for understanding 18F-FDG PET/CT imaging in the management of CIRT for NSCLC.