Differentiation between non-small cell lung cancer and radiation pneumonitis after carbon-ion radiotherapy by 18F-FDG PET/CT texture analysis

Differentiating non-small cell lung cancer (NSCLC) from radiation pneumonitis (RP) is essential for selecting the optimal therapeutic strategy for patients with NSCLC after carbon-ion radiotherapy (CIRT). The aim of this study was to assess the ability of 18F-FDG PET/CT metabolic parameters and textural image features to differentiate NSCLC from RP after CIRT, and thereby to support the differential diagnosis of malignant and benign lesions. We retrospectively analyzed 18F-FDG PET/CT image data from 32 patients with histopathologically proven NSCLC who were scheduled to undergo CIRT and 31 patients diagnosed with RP after CIRT. The SUV parameters, metabolic tumor volume (MTV), and total lesion glycolysis (TLG), as well as fifty-six texture parameters derived from seven matrices, were determined using PETSTAT image-analysis software. Data were compared between NSCLC and RP using Wilcoxon rank-sum tests. Diagnostic accuracy was assessed using receiver operating characteristic (ROC) curves. Several texture parameters differed significantly between NSCLC and RP (p < 0.05). The areas under the ROC curves (AUC) were as follows: SUVmax, 0.64; GLRLM run percentage, 0.83; and NGTDM coarseness, 0.82. Diagnostic accuracy was higher with GLRLM run percentage or NGTDM coarseness than with SUVmax (p < 0.01). The texture parameters of 18F-FDG uptake performed well in differentiating NSCLC from RP after CIRT and outperformed SUV-based evaluation. In particular, GLRLM run percentage and NGTDM coarseness of 18F-FDG PET/CT images appear to be appropriate parameters that offer high diagnostic accuracy.


Scientific Reports | (2021) 11:11509 | https://doi.org/10.1038/s41598-021-90674-w | www.nature.com/scientificreports/
discriminate between recurrence and post-treatment changes [10][11][12], since 18F-FDG PET/CT can noninvasively and indirectly measure glucose metabolism in vivo. The maximum standardized uptake value (SUVmax) is the most prevalent parameter used to estimate tumor metabolic activity in 18F-FDG PET/CT images. However, SUVmax is measured from the single highest-intensity voxel in a region of interest; it shows only the area of highest 18F-FDG uptake in a tumor and cannot reflect metabolic activity in the whole tumor 13. Therefore, the extent of active lesions in malignant tumors and of active residual lesions after treatment is difficult to ascertain accurately using SUVmax. To overcome these issues, metabolic tumor volume (MTV) and total lesion glycolysis (TLG) have been proposed as supplementary diagnostic indicators of tumor activity 14.
Few studies have discriminated lung cancer from radiation pneumonitis (RP) in images based on accumulated 18F-FDG. This suggests that 18F-FDG uptake after RT might be due not only to recurrent tumors but also to RT-induced inflammation. The characterization of uptake heterogeneity is gaining popularity through radiomics-based analyses that extract high-throughput features based on the intensity, shape, and texture of uptake within regions of interest. Textural features have improved the ability of 18F-FDG PET/CT to discriminate abnormal from normal tissues and to delineate lesions. Textural features derived from neighboring gray-tone difference matrices describe properties such as coarseness, contrast, and busyness on PET images 15. They can differentiate cancerous tumors of the head and neck from normal tissues 16. However, whether textural features can discriminate malignant from benign lesions remains unknown. The value of quantitative heterogeneity in 18F-FDG PET/CT images for RP after CIRT has also not been investigated. The aim of this study was to assess the ability of 18F-FDG PET/CT metabolic parameters and textural image features to differentiate NSCLC from RP after CIRT, and thereby to support the differential diagnosis of malignant and benign lesions.

Methods
Patients. We retrospectively analyzed 18F-FDG PET/CT data obtained from 32 patients with NSCLC who underwent 18F-FDG PET/CT before CIRT (50.0 GyE/day), and from 31 patients with RP who were diagnosed by biopsy or by clinical follow-up at > 1 year after CIRT (50.0 GyE/day) for NSCLC. The NSCLC patients had not received chemotherapy before CIRT and underwent 18F-FDG PET/CT at baseline.
This clinical study was approved by the Ethics Committee at the National Institute of Radiological Sciences (Approval No. 19-019), and written informed consent was obtained from all patients. This study was conducted as research involving human participants in accordance with the principles outlined in the 1964 Declaration of Helsinki and its later amendments. The results of this retrospective study did not influence further therapeutic decision-making.
images were acquired at a mean of 60 min later, after voiding, from the top of the skull to the midthigh using an Aquiduo PET/CT scanner (Canon Corp., Japan). The Aquiduo PET/CT scanner has the following technical characteristics: detector material, Lu2SiO5(Ce) (LSO); crystal size, 4.0 × 4.0 × 20 mm³; detector ring diameter, 830 mm; transaxial field of view (FOV), 585 mm; axial FOV, 162 mm; coincidence window, 4.5 ns; energy window, 425-650 keV; maximum ring difference, 27; random correction, delayed; scatter correction, single scatter simulation. Emission data were acquired for 2-3 min per bed position 17. The PET images were reconstructed using an iterative algorithm (the combination of Fourier rebinning and ordered subsets expectation-maximization [FORE + 2D-OSEM]; 4 iterations, 14 subsets) with an 8-mm Gaussian filter, a 128 × 128 matrix (3.9 mm/pixel) and 81 slices (2 mm/slice). Time-of-flight and point spread function corrections were not applied. The spatial resolution of this scanner according to NEMA NU 2-2007 is 6.5 mm full width at half maximum (FWHM) at 10 mm off center. Whole-body spiral CT scanning proceeded under the following parameters: 120 kV; auto exposure control (noise level: SD 10); 512 × 512 matrix; beam pitch, 0.94; 2 mm × 16-row mode. The CT data were used for attenuation correction.
Quantitative analysis. We used the PET/CT medical imaging viewer PETSTAT (AdIn Research Inc., Tokyo, Japan) for texture analysis. Volumes of interest (VOI) on tumors were delineated using a threshold of 40% (40P) of the SUVmax in each lesion. In texture analysis of 18F-FDG PET/CT in NSCLC, the 40P threshold has shown excellent inter-operator reproducibility of texture features and has proven acceptable in previous studies 18. It is also recommended for assessing the heterogeneity of lesions affected by respiratory motion 19.
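The 40P delineation rule can be illustrated with a minimal sketch. It is not the PETSTAT implementation: the function name `threshold_voi`, the flat list of voxel SUVs, the inclusive comparison, and the example values are all assumptions made for illustration, and a real delineation would also restrict the mask to the connected region containing the lesion.

```python
def threshold_voi(suv_values, fraction=0.40):
    """Return a boolean mask selecting voxels at or above fraction * SUVmax.

    suv_values: SUVs of the voxels in a box drawn around one lesion
    (hypothetical input layout; spatial connectivity is ignored here).
    """
    if not suv_values:
        raise ValueError("lesion contains no voxels")
    cutoff = fraction * max(suv_values)
    # Inclusive comparison is an assumption; the source does not specify it.
    return [v >= cutoff for v in suv_values]


# Hypothetical voxel SUVs around a single lesion (not data from the study)
lesion = [1.2, 2.0, 3.9, 4.1, 8.0, 10.0, 6.5, 3.0]
mask = threshold_voi(lesion)
voi = [v for v, keep in zip(lesion, mask) if keep]
print(len(voi), "of", len(lesion), "voxels fall inside the 40P VOI")
```

Because the cutoff scales with SUVmax, a lesion with a lower peak uptake admits relatively more low-uptake voxels into the VOI, which is relevant to the behavior of MTV discussed later.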
Each VOI was set so as to exclude physiological accumulation. We calculated the SUV parameters SUVmax, SUVpeak, and SUVmean, together with MTV and TLG, as well as fifty-six texture parameters derived from seven matrices for each VOI. SUVpeak is defined as the mean SUV in a 1 cm³ sphere around the pixel with the highest uptake and is assumed to be less affected by image noise than SUVmax. For images with noise properties typically associated with clinical PET images, SUVpeak can provide a slightly more robust alternative to SUVmax. To extract the texture parameters, we first equalized histograms by rescaling the intensity within each ROI between the 1st and 99th percentiles of the ROI over 64 bins. Using 64 equally divided bins has been a common approach for image quantitation in radiomics analysis, and it also allows exploration of the entire range of tumor signal intensity 20.
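As a rough illustration of these quantities, the sketch below computes SUVmax, SUVmean, MTV, and TLG for one VOI and rescales its intensities into 64 bins between the 1st and 99th percentiles. It assumes the usual definitions (MTV as voxel count times voxel volume, TLG as SUVmean times MTV) and one plausible reading of the rescaling step (clipping followed by linear binning). The function names, the linear-interpolation percentile, and the example values are hypothetical and are not taken from PETSTAT. SUVpeak is omitted because it needs voxel coordinates.

```python
def suv_summary(voi_suvs, voxel_volume_ml):
    """SUVmax, SUVmean, MTV (mL) and TLG for the voxels inside one VOI."""
    suv_max = max(voi_suvs)
    suv_mean = sum(voi_suvs) / len(voi_suvs)
    mtv = len(voi_suvs) * voxel_volume_ml   # volume of the thresholded VOI
    tlg = suv_mean * mtv                    # total lesion glycolysis
    return {"SUVmax": suv_max, "SUVmean": suv_mean, "MTV": mtv, "TLG": tlg}


def percentile(sorted_vals, q):
    """Percentile q (0-100) by linear interpolation between order statistics."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def discretize(voi_suvs, n_bins=64, lo_pct=1, hi_pct=99):
    """Clip to [P1, P99] and map linearly onto bin indices 0 .. n_bins - 1."""
    ordered = sorted(voi_suvs)
    lo, hi = percentile(ordered, lo_pct), percentile(ordered, hi_pct)
    if hi <= lo:                            # flat VOI: everything in one bin
        return [0] * len(voi_suvs)
    bins = []
    for v in voi_suvs:
        clipped = min(max(v, lo), hi)
        index = int((clipped - lo) / (hi - lo) * n_bins)
        bins.append(min(index, n_bins - 1))  # top edge belongs to last bin
    return bins


# Hypothetical VOI; 0.03042 mL corresponds to a 3.9 x 3.9 x 2 mm voxel
voi = [4.0, 6.0, 8.0, 10.0]
print(suv_summary(voi, voxel_volume_ml=0.03042))
print(discretize(voi))
```

Fixing the number of bins per lesion, rather than the bin width, means that each lesion's texture matrices are built on the same number of gray levels regardless of its absolute uptake.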

Statistical analysis. We compared all data between NSCLC and RP after CIRT using Wilcoxon rank-sum tests. The point on each ROC curve lying nearest the upper left corner was considered to indicate optimal diagnostic accuracy. Sensitivity, specificity, and accuracy were calculated using the corresponding cutoffs. Diagnostic accuracy was compared using areas under ROC curves (AUC), and ROC curves were also compared. All data were statistically analyzed using JMP v.14.2.0 software (SAS Institute Inc., Cary, NC, USA), and values with P < 0.05 were considered statistically significant.
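The analyses in this study were run in JMP; the following is only a minimal sketch of the same ideas in plain Python. It relies on the standard identity that the AUC equals the rank-sum (Mann-Whitney) U statistic divided by the product of the group sizes, uses a normal approximation without tie or continuity correction for the p value, and picks the cutoff whose ROC point lies closest to the upper left corner. All function names and example values are hypothetical, and the cutoff search assumes that higher values indicate NSCLC.

```python
import math


def midranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Return (AUC, two-sided p) for group x versus group y."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2      # pairs with x above y
    auc = u / (n1 * n2)
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - n1 * n2 / 2) / sigma                # normal approximation
    return auc, math.erfc(abs(z) / math.sqrt(2))


def best_cutoff(pos, neg):
    """Cutoff (value >= cutoff is positive) nearest the ROC corner (0, 1)."""
    best = None
    for c in sorted(set(pos) | set(neg)):
        sens = sum(v >= c for v in pos) / len(pos)
        spec = sum(v < c for v in neg) / len(neg)
        dist = math.hypot(1 - sens, 1 - spec)
        if best is None or dist < best[0]:
            best = (dist, c, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical feature values, not data from the study
nsclc = [0.81, 0.77, 0.90, 0.85, 0.70]
rp = [0.60, 0.72, 0.55, 0.66, 0.79]
print(rank_sum_test(nsclc, rp))
print(best_cutoff(nsclc, rp))
```

For small samples or heavily tied data an exact test or a tie-corrected variance would be preferable to this approximation.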

Comparison of NSCLC and RP. Statistical differences among parameters in the texture analysis were explored using Wilcoxon rank-sum tests (Table 2). Most texture parameters significantly differed between NSCLC and RP (P < 0.05), whereas most SUV parameters did not.
Diagnostic accuracy. Figures 1 and 2 show the outcomes of ROC analyses for each parameter. The AUCs of SUVmax, SUVpeak, MTV, and TLG were 0.64, 0.63, 0.86, and 0.75, respectively. In contrast, the AUCs of the gray-level run-length matrix (GLRLM) run percentage and the neighborhood gray-tone difference matrix (NGTDM) coarseness were 0.83 and 0.82, respectively, and differed significantly from the AUC of SUVmax (Table 3). Figure 3 shows representative images. The GLRLM run percentage and NGTDM coarseness were significantly higher in NSCLC than in RP, whereas the SUVmax values of both lesions were similar.

Discussion
To support the differential diagnosis of malignant and benign lesions, we examined whether texture parameters could differentiate NSCLC from RP after CIRT more accurately than SUV parameters such as SUVmax and MTV. We found that two textural parameters reflecting heterogeneity on 18F-FDG PET/CT images, GLRLM run percentage and NGTDM coarseness, may allow RP to be differentiated from NSCLC. The MTV and TLG were significantly increased, whereas the SUVmax was reduced, in RP compared with NSCLC. Buyger et al. reported that a fixed threshold could substantially underestimate TLG and MTV in lesions with high 18F-FDG uptake 21. Therefore, a decrease in SUVmax might erroneously increase MTV, because a larger volume of a less active lesion will be included in the MTV 22. The SUV is not helpful for differentiating benign from malignant lesions 23, and MTV and TLG are calculated based on SUV. Thus, we considered that MTV and TLG are not suitable for differentiating NSCLC from RP.
Almost all texture parameters differed significantly and had significantly better diagnostic ability than SUVmax. In particular, the GLRLM run percentage and NGTDM coarseness appear to be appropriate parameters because of their high diagnostic accuracy. The GLRLM run percentage reflects the number of homogeneous runs of voxels of a specific intensity within an image. We found that the GLRLM run percentage was significantly lower in RP than in NSCLC. This suggests that 18F-FDG accumulation is more heterogeneous in NSCLC than in RP. Hotta et al. reported that the GLRLM run percentage, combined with machine learning in 11C-methionine PET images, could distinguish recurrent brain tumors from radiation necrosis, and that it was significantly lower in radiation necrosis than in recurrent brain tumors 24. Adding machine learning might allow the creation of a model with higher sensitivity and specificity than the present approach for differentiating benign from malignant lung lesions. The NGTDM coarseness is based on differences between each voxel and neighboring voxels in adjacent image planes, and thus measures granularity within an image 25. Chen et al. found significantly lower NGTDM coarseness in benign than in malignant solitary pulmonary nodules 26. The present findings revealed significantly lower NGTDM coarseness in RP than in NSCLC, which is considered to reflect biological differences between benign and malignant lesions. NGTDM coarseness is also useful for discriminating benign from malignant lesions of the head and neck 27.
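To make the two features concrete, the sketch below computes them on a small two-dimensional gray-level image. It is a simplified illustration and not the PETSTAT implementation: run percentage is taken as the number of runs divided by the number of pixels along the horizontal direction only, and coarseness follows the common definition as the reciprocal of the probability-weighted sum of absolute differences between each pixel and the mean of its in-plane neighbors. The study computed these features on three-dimensional VOIs; the function names, the 8-neighborhood, the handling of flat images, and the example images are assumptions.

```python
def run_percentage(image):
    """Number of horizontal runs divided by the number of pixels."""
    n_runs = 0
    n_pixels = 0
    for row in image:
        previous = None
        for gray in row:
            n_pixels += 1
            if gray != previous:      # a new run starts when the level changes
                n_runs += 1
            previous = gray
    return n_runs / n_pixels


def ngtdm_coarseness(image):
    """1 / sum_i p_i * s_i over gray levels i, using the 8-neighborhood."""
    rows, cols = len(image), len(image[0])
    count = {}       # gray level -> number of pixels
    diff_sum = {}    # gray level -> summed |level - mean of neighbors|
    total = 0
    for r in range(rows):
        for c in range(cols):
            neighbors = [image[rr][cc]
                         for rr in range(max(r - 1, 0), min(r + 2, rows))
                         for cc in range(max(c - 1, 0), min(c + 2, cols))
                         if (rr, cc) != (r, c)]
            if not neighbors:         # single-pixel image
                continue
            gray = image[r][c]
            mean = sum(neighbors) / len(neighbors)
            count[gray] = count.get(gray, 0) + 1
            diff_sum[gray] = diff_sum.get(gray, 0.0) + abs(gray - mean)
            total += 1
    denom = sum(count[g] / total * diff_sum[g] for g in count)
    # A perfectly flat image has no local differences; inf is an assumption
    return float("inf") if denom == 0 else 1.0 / denom


# Hypothetical discretized images: two large blocks versus a checkerboard
blocks = [[1, 1, 2, 2] for _ in range(4)]
checker = [[1 + (r + c) % 2 for c in range(4)] for r in range(4)]
for name, img in (("blocks", blocks), ("checkerboard", checker)):
    print(name, run_percentage(img), ngtdm_coarseness(img))
```

In this toy example the checkerboard, which changes level at every pixel, has the higher run percentage, while the image made of two large blocks has the higher coarseness.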
The present findings for texture features are considered to reflect the heterogeneity of 18F-FDG accumulation in lesions. Several studies have found that the intratumoral heterogeneity of 18F-FDG uptake in tumors might […] Table 2) were useful for differentiating NSCLC from RP after CIRT. Gunma University has recently reported that SUVmax and MTV had better predictive or prognostic power for CIRT-treated NSCLC patients 31,32. Metabolic information from 18F-FDG PET/CT imaging reflects the biological properties of the tumor as well as of metabolically diverse components, including inflammatory tissues and surrounding normal tissues such as vessels, bronchi, and pleura after CIRT 29,33. 18F-FDG PET/CT is considered to play a key role as a metabolic imaging modality in the management of CIRT for NSCLC.
This study had some limitations. Texture parameters could be affected by lesion size, histological tumor type, and inflammatory status 31. Textural analysis also requires the evaluation of many variables, and the sample size was relatively small for this number of variables. Therefore, further studies in more patients with different types of tumors are needed. In addition, because of the limited spatial resolution of PET, image noise and partial volume effects may have affected the results of this study 33.

Conclusion
We determined that texture parameters can differentiate NSCLC from RP after CIRT more accurately than SUV parameters such as SUVmax and MTV. The intratumoral heterogeneity of 18F-FDG uptake evaluated by texture analysis yielded improved diagnostic ability for differentiating NSCLC from RP after CIRT that outperformed […]

Figure 2. Receiver operating characteristic curves of the ability of texture parameters to discriminate NSCLC from RP. GLRLM, gray-level run-length matrix run percentage; GLSZM, gray-level size-zone matrix intensity variability; NGLCM, normalized gray-level cooccurrence matrix dissimilarity; NGTDM, neighborhood gray-tone difference matrix coarseness; SUV Histogram, SUV histogram variance.