Diagnostic value of baseline 18FDG PET/CT skeletal textural features in follicular lymphoma

At present, 18F-fluorodeoxyglucose (18FDG) positron emission tomography (PET)/computed tomography (CT) cannot be used to omit a bone marrow biopsy (BMB) among initial staging procedures in follicular lymphoma (FL). Skeletal textural features on baseline 18FDG-PET/CT have shown promising additional diagnostic value in diffuse large B-cell lymphoma (DLBCL) patients. The aim of this study was to evaluate the value of 18FDG-PET/CT radiomics for the diagnosis of bone marrow involvement (BMI) in FL patients. This retrospective bicentric study enrolled newly diagnosed FL patients referred for baseline 18FDG PET/CT. For visual assessment, examinations were considered positive in cases of obvious focal bone uptake. For textural analysis, the skeleton volumes of interest (VOIs) were automatically extracted from segmented CT images and analysed using LifeX software. BMB and visual assessment together were taken as the gold standard: BMB −/PET − patients were considered bone-NEGATIVE patients, whereas BMB +/PET −, BMB −/PET + and BMB +/PET + patients were considered bone-POSITIVE patients. A LASSO regression algorithm was used to select features of interest and to build a prediction model. Sixty-six consecutive patients were included: 36 bone-NEGATIVE (54.5%) and 30 bone-POSITIVE (45.5%). The LASSO regression found variance_GLCM, correlation_GLCM, joint entropy_GLCM and busyness_NGLDM to have nonzero regression coefficients. Based on ROC analysis, a cut-off equal to −0.190 was found to be optimal for the diagnosis of BMI using the PET pred.score. The corresponding sensitivity, specificity, PPV and NPV values were 70.0%, 83.3%, 77.8% and 76.9%, respectively. When comparing the ROC AUCs obtained with BMB alone, visual PET assessment or PET pred.score, a significant difference was found between BMB and visual PET assessments (p = 0.010) but not between BMB and PET pred.score assessments (p = 0.097).
Skeleton texture analysis is worth exploring to improve the performance of 18FDG-PET/CT for the diagnosis of BMI at baseline in FL patients.

www.nature.com/scientificreports/

possible pain, bleeding or infection complications 19. Some studies have shown that a combination of PET and BMB could be an alternative for a more accurate BMI assessment than either PET or BMB alone 20,21. Today, there is a growing interest in haematology in using alternatives to visual or semiquantitative PET assessments that are based on textural features (TFs) 22,23. Indeed, the basic visual interpretation of diffuse bone marrow involvement without focal bone lesions on PET can be difficult, leading to false-negative results. The diagnostic value of skeletal TFs compared to BMB and PET visual analysis on baseline 18FDG PET/CT in DLBCL patients has been demonstrated 24. By extrapolation, we assume that the quantification of the metabolic heterogeneity of the skeleton could also significantly improve the bone pretherapeutic evaluation in FL patients. Therefore, the aim of this study was to evaluate the value of TFs for the diagnosis of BMI.

Results
Population characteristics. From the 113 FL patients identified from our database, 66 patients were ultimately included. Twenty-one patients were excluded because of missing BMB and 26 because of missing baseline 18FDG PET/CT. Fifty-nine patients were scanned on the Biograph TrueV PET system, and seven were scanned on the Vereos PET system. There were 36 bone-NEGATIVE patients (54.5%) and 30 bone-POSITIVE patients (45.5%). Among the bone-POSITIVE patients, there were four BMB −/PET VISU + patients (13.3%), 14 BMB +/PET VISU − patients (46.7%) and 12 BMB +/PET VISU + patients (40.0%). Focusing on BMB −/PET VISU + patients, hypermetabolic lesions were located in the axial skeleton and not in the appendicular skeleton, explaining the negativity of the BMB. Representative examples of each case are shown in Fig. 1.
The population characteristics are summarized in Table 1. Among the bone-NEGATIVE patients, there were 4 patients (11.1%) staged 1, 10 (27.8%) staged 2, 13 (36.1%) staged 3 and 9 (25.0%) staged 4. There was no difference in the technical PET parameters between the bone-NEGATIVE and bone-POSITIVE groups of patients. The mean injected dose (MBq/kg), uptake time (min) and glycaemia (g/L) were 289 ± 54.5 versus 287 ± 56.3 (p = 0.61), 58.32 ± 3.59 versus 59.15 ± 3.52 (p = 0.30) and 1.06 ± 0.29 versus 0.98 ± 0.11 (p = 0.21), respectively. Hip prostheses were encountered in only four patients, two with a unilateral hip prosthesis and two with a bilateral hip prosthesis. No other types of prosthesis were encountered.

2). Among the 10 false-negative results, 3 would have been reclassified as positive on the basis of visual PET assessment because of lesions located outside the field of analysis, especially in the rib cage or the atlas and axis vertebrae. Seven false-positive results were also observed.
Multivariable diagnostic value of PET radiomics for bone involvement at baseline staging. Thirteen of the 26 18FDG PET/CT variables analysed were significantly different between the bone-NEGATIVE and bone-POSITIVE patients (Table 2). The LASSO regression including all analysed PET radiomics (n = 26) found variance_GLCM, correlation_GLCM, joint entropy_GLCM and busyness_NGLDM to have nonzero regression coefficients. Coefficient and cross-validation plots are provided in Fig. 3. Correlations between these four PET radiomic features can be seen in Fig. 4. The corresponding linear equation for the computation of the prediction score was as follows:

The mean pred.score of the entire series was equal to −0.096 ± 1.383. Based on ROC analysis, a cut-off equal to −0.190 was found to be optimal for the diagnosis of BMI: AUC = 0.822 (95% CI = 0.721-0.924, p < 0.0001). The corresponding sensitivity, specificity, PPV and NPV values were 70.0%, 83.3%, 77.8% and 76.9%, respectively (Fig. 2). Twenty-seven patients had a pred.score > −0.190 and were considered positive for BMI, among whom six were false-positive results (BMB −/PET − patients). Additionally, nine false-negative results were observed, including seven BMB +/PET VISU − patients and two BMB +/PET VISU + patients whose lesions were outside the field of quantitative PET analysis. These two patients could easily be recovered by visual analysis: one had a lesion in the rib cage and the other a lesion in the upper jaw. Ultimately, only 7 bone-POSITIVE patients (23.3%) would have been missed using skeletal PET quantification analysis.

When comparing the AUCs from ROC analyses for BMI assessment with BMB alone, visual PET alone, PET skewness_HISTO alone and PET pred.score (Fig. 2, Table 3), significant differences were found between BMB and visual PET assessments (p = 0.010) and between BMB and PET skewness_HISTO assessments (p = 0.015). No difference was observed between BMB and PET pred.score assessments (p = 0.097). No difference was found among the PET pred.score, visual PET alone and PET skewness_HISTO regarding the assessment of BMI.

The correlation between selected PET radiomics and biological characteristics was explored and is summarized in Table 4. Significant negative correlations were found between haemoglobin blood level and variance_GLCM (ρ = −0.447, p = 0.0003) and joint entropy_GLCM (ρ = −0.498, p < 0.0001).
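As a worked check, the sensitivity, specificity, PPV and NPV reported for the −0.190 pred.score cut-off can be recomputed from the counts given in the text. The sketch below assumes a 2 × 2 table derived from those counts (27 score-positive patients including 6 false positives, 9 false negatives, 36 bone-NEGATIVE patients in total). The function and variable names are illustrative only and are not part of the original analysis, which was run in XLSTAT.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from 2 x 2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Counts derived from the reported results (an assumption of this sketch):
# 27 score-positive patients, 6 of them false positives -> 21 true positives;
# 9 false negatives; 36 bone-NEGATIVE patients minus 6 false positives
# -> 30 true negatives.
pred_score_metrics = diagnostic_metrics(tp=21, fp=6, fn=9, tn=30)

for name, value in pred_score_metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

With these counts the four ratios come out at 70.0%, 83.3%, 77.8% and 76.9%, which matches the values reported for the pred.score.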

Discussion
The aim of the present study was to extrapolate previous results obtained for the diagnosis of BMI using PET radiomics in DLBCL patients to FL patients.
Skewness was previously found to be a promising parameter for the identification of patients with BMB involvement without visually assessable focal lesions, with a positive likelihood ratio (LR) of 4.46. Interestingly, in our series of FL patients, the optimal cut-off value was consistent with that finding: 1.20 versus 1.26 previously in DLBCL. However, the diagnostic performance of skewness_HISTO for BMI was not as impressive, with little additional value over visual PET assessment alone: the sensitivity and NPV were 66.7% versus 53.3% and 74.4% versus 72.0% for skewness_HISTO and visual PET assessments, respectively. Well-known differences in metabolic characteristics between FL and DLBCL could explain these results. In particular, FL uptake is usually less intense than that of DLBCL 25. Another issue could be the marked difference in the rate of BMI at diagnosis between DLBCL and FL patients, with the rate of positive BMB estimated to be 15% in newly diagnosed DLBCL and 50% in FL 6. Additionally, BMB +/PET − cases were previously estimated to represent only 3.1% of DLBCL patients 26 but 13% of FL patients 18. It is worth noting that this rate was even slightly higher in our series, reaching 21% of patients. However, it should be emphasized that cases of pure diffuse FDG uptake were considered positive in the study performed by Nakajima et al., whereas they were considered negative in the present study, which could partly explain this difference. Moreover, BMB −/PET + patients were also estimated at 13% and 12% in another publication 26, meaning that BMB itself can be misleading.
All things considered, it seemed to us that a multivariable approach using radiomics could be more accurate.
As has been highlighted in the literature, radiomic index values are highly dependent on the segmentation method 27–30. The CT bone segmentation methods used here to draw VOIs were semiautomatic, with very little manual intervention, and had already been shown to have excellent interobserver agreement, which supports their robustness 24. Regarding further methodological considerations, the robustness of radiomic indices to the intensity discretization method has been widely evaluated in the literature. Indices can be compared only if the same calculation parameters are used, which is the case here owing to absolute resampling 31.
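To illustrate what absolute resampling means in practice, the sketch below maps an SUV value to one of 64 bins over a fixed 0-30 SUV range, the settings given in the Methods. It assumes a simple equal-width binning convention with clipping at the bounds; the exact rounding rule used by LIFEx may differ, and the function name is hypothetical.

```python
import math


def discretize_absolute(suv, lower=0.0, upper=30.0, n_bins=64):
    """Map an SUV value to a bin index in 1..n_bins using fixed bounds.

    Equal-width binning is an assumption of this sketch, not necessarily
    the exact rule implemented in LIFEx.
    """
    width = (upper - lower) / n_bins          # 30 / 64 SUV units per bin here
    clipped = min(max(suv, lower), upper)     # values outside the range are clipped
    index = int(math.floor((clipped - lower) / width)) + 1
    return min(index, n_bins)                 # the upper bound falls in the last bin


# The same SUV always lands in the same bin, whatever the patient or scanner,
# because the bounds do not depend on the image content.
example_bins = [discretize_absolute(v) for v in (0.0, 1.0, 30.0)]
```

Because the bin edges are fixed rather than derived from each image's own minimum and maximum, a given bin index corresponds to the same SUV interval in every patient, which is what makes the texture indices comparable across examinations.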
In doing so, variance_GLCM, correlation_GLCM, joint entropy_GLCM and busyness_NGLDM were identified by LASSO analysis as potential variables of interest to build a linear prediction model. None of the histogram or size-zone matrix features were retained. It seemed that parameters extracted from the GLCM or NGLDM were ideal candidates for describing skeletal tumour heterogeneity. Unlike histogram-based indices, calculated from original images, they reflect the spatial arrangement of voxel intensities. Even though statistical significance was not reached, with an optimal PET pred.score threshold set to −0.19, the sensitivity and NPV were improved compared to visual PET assessment alone: 70.0% versus 53.3% and 76.9% versus 72.0%, respectively (Fig. 2). Finally, even if the performance of the BMB appeared to be better than that of the PET pred.score, it was still notable that there was no statistically significant difference between the ROC AUCs of these two diagnostic tests (p = 0.097). This may suggest that PET pred.score BMI assessment could perform as well as BMB, provided that the model is strengthened with a larger database.
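The difference between histogram-based and co-occurrence-based indices can be shown on a toy example. The sketch below builds a symmetric grey-level co-occurrence matrix for a small 2D image with a single horizontal offset and computes its joint entropy. It is a deliberate simplification: the study used 3D VOIs in LIFEx, and the single offset, the base-2 logarithm and the function names are assumptions made here for illustration.

```python
import math
from collections import Counter


def glcm_probabilities(image, offset=(0, 1)):
    """Symmetric co-occurrence probabilities for one offset in a 2D image."""
    dr, dc = offset
    rows, cols = len(image), len(image[0])
    counts = Counter()
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[(a, b)] += 1   # pair as read
                counts[(b, a)] += 1   # mirrored pair, making the matrix symmetric
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def joint_entropy(probabilities):
    """Joint entropy (base 2) of a co-occurrence probability table."""
    return -sum(p * math.log2(p) for p in probabilities.values() if p > 0)


# Two toy images with identical histograms (four 1s and four 2s each)
# but different spatial arrangements.
blocks = [[1, 1, 2, 2],
          [1, 1, 2, 2]]
stripes = [[1, 2, 1, 2],
           [2, 1, 2, 1]]

entropy_blocks = joint_entropy(glcm_probabilities(blocks))
entropy_stripes = joint_entropy(glcm_probabilities(stripes))
```

Any histogram index (mean, skewness, kurtosis) is identical for the two images, whereas the joint entropy differs, because only the co-occurrence matrix records which grey levels sit next to each other.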
Notably, some examinations were found to be negative in terms of the PET pred.score but positive on visual PET assessment because of lesions located outside the VOIs. This result means either that CT bone segmentation has to be improved to encompass the whole skeleton or that visual and quantitative PET assessments have to be made jointly. The current paradigm of radiomic analysis adds quantitative information to visual analysis or biology without totally replacing them 32, and it seems that the best option would be to combine visual and quantitative PET assessments. Presently, using this combined strategy, 7 bone-POSITIVE patients would have been missed compared to 14 patients using visual PET alone.
A more complex strategy combining clinical, biological and PET features should also be explored. However, the number of patients included in the present study did not allow us to test such strategies. We still looked for correlations between biological and PET variables and found significant negative correlations between haemoglobin level and variance_GLCM and joint entropy_GLCM. Some studies have demonstrated that marrow hypermetabolism correlates with leukocyte and neutrophil levels, both of which are associated with a poor response to treatment 33,34, but this was not observed in our series. Furthermore, the limited number of included patients did not allow us to perform internal testing. Therefore, the reliability of such a model should be evaluated on an independent dataset, ideally acquired on a different PET system or from a different centre, for its performance to be definitively validated.
Applying a multivariable PET radiomics model to baseline 18FDG PET/CT images could be a promising path to improve the diagnosis of BMI in follicular lymphoma patients. Prospective and larger clinical studies are needed to strengthen the model and to definitively confirm this hypothesis.

Methods
Population. In this retrospective double-centre study, we enrolled 113 patients newly diagnosed with FL from November 2014 to May 2019 who were treated with a chemotherapy regimen. The inclusion criteria were as follows: patients over 18 years old, histopathologically proven FL, pretherapeutic bone marrow biopsy and 18FDG PET/CT. Clinical variables, including age at diagnosis, sex, body mass index, Ann Arbor stage, bulky mass, FLIPI score, first-line treatment type, serum haemoglobin level, serum platelet level, serum white cell level, serum β2-microglobulin (β2M) level, serum lactate dehydrogenase (LDH) level, serum albumin level, serum calcium level and serum alkaline phosphatase level, were recorded. All procedures performed in studies involving human participants were approved by the local ethics committee and were in accordance with the 1964 Helsinki Declaration. In accordance with European regulations, observational studies without any additional therapy or monitoring procedures do not need the approval of an ethics committee. Additionally, the need for informed signed consent was waived. The procedure was declared to the National Institute for Health Data, with registration no. F20201023145322.

Extraction of PET bone textural features. All images were analysed by the same reviewer with 5 years of experience in PET interpretation using MIM (MIM Software, Cleveland, OH, USA, version 5.6.5). For visual PET/CT assessment, examinations were considered to be positive in cases of one or several obvious focal bone uptakes on PET images with or without bone lesions on CT images. Doubtful diffuse and/or heterogeneous skeletal uptake was not considered a positive finding. In case of discrepancy, the examination was jointly reviewed to reach a consensus with a second experienced nuclear medicine physician having more than 10 years of experience in PET.
For textural analysis, the skeleton volumes of interest (VOIs) from the C3 vertebra to the upper third of femurs were automatically extracted from CT images for each examination (Supplemental Fig. 1).

PET acquisition and reconstruction parameters.
In the case of hip prostheses, the zone was excluded to avoid PET attenuation correction artefacts. The final CT VOIs were then transferred to PET images. All possible lymph node areas of increased FDG uptake in the vicinity of the skeleton (especially in the retroperitoneum) that could affect texture features because of a partial volume effect were checked 36. Finally, the VOIs were saved in DICOM-RT structure format so that they could be loaded in LIFEx software version 5.1 37. For the resampling step, 64 discrete values with a range of SUV units set to 0-30 and a spatial resampling set to 2.0 × 4.0 × 4.0 mm were used. The following PET variables were extracted:
• five conventional PET parameters: SUVmax, SUVpeak, SUVskewness, SUVkurtosis and SUVexcessKurtosis
• six grey-level co-occurrence matrix (GLCM) parameters: inverse difference, angular second moment, variance, correlation, joint entropy and dissimilarity
• three neighbourhood grey-level difference matrix (NGLDM) parameters: coarseness, contrast and busyness

Statistical analysis. Quantitative data are presented as the mean ± standard deviation (SD) or median (interquartile range) when appropriate. Characteristics of populations and PET radiomics were compared using Fisher's exact tests for discrete variables and Mann-Whitney tests for continuous variables with Bonferroni correction. Both BMB and visual PET assessment as described above were taken as the gold standard for patient classification. BMB −/PET − patients were considered disease-free patients (bone-NEGATIVE patients), whereas BMB +/PET −, BMB −/PET + and BMB +/PET + patients were considered bone-POSITIVE patients. A least absolute shrinkage and selection operator (LASSO) regression algorithm with tenfold cross-validation was used to select features of interest, namely, those with nonzero coefficients.
This regression method performs both variable selection and regularization to enhance the prediction accuracy and interpretability of the resulting statistical model 39. A prediction score (pred.score) was computed for each patient by means of a linear regression combining all selected PET variables. Receiver operating characteristic (ROC) curves were used to define the optimal pred.score cut-off value for the diagnosis of BMI, by maximizing the Youden index (sensitivity + specificity − 1), and to compare diagnostic performances using the method of DeLong et al. Finally, Spearman correlation tests were used to determine the relationship between biological variables and PET radiomics of interest. Statistical analysis and figure preparation were performed using XLSTAT software (XLSTAT 2019: Data Analysis and Statistical Solution for Microsoft Excel. Addinsoft).
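The cut-off selection step can be sketched as follows. This is a minimal illustration of choosing a threshold by maximizing the Youden index, not the XLSTAT procedure used in the study; the function name is hypothetical, the toy scores and labels are invented for the example, and a case is called positive when its score is strictly above the cut-off, as for the pred.score.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximizing J = sens + spec - 1.

    labels: 1 for bone-POSITIVE, 0 for bone-NEGATIVE.
    A case is called positive when score > cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cutoff in sorted(set(scores)):        # every observed score is a candidate
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s > cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s <= cutoff)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:       # keep the first maximum found
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Toy data (hypothetical, not patient data): three negatives, three positives.
toy_scores = [-1.5, -0.8, -0.4, -0.3, 0.4, -1.0]
toy_labels = [0, 0, 0, 1, 1, 1]

cutoff, sensitivity, specificity = youden_cutoff(toy_scores, toy_labels)
```

On the toy data the maximum is reached at a cut-off of −0.4, where all three negatives are correctly classified and two of the three positives are detected. Maximizing the Youden index weights sensitivity and specificity equally; a different trade-off would be needed if missing a positive patient were judged costlier than a false alarm.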
Ethical approval. The authors are accountable for all aspects of the work and guarantee that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. All procedures performed in the studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration (as revised in 2013) and its later amendments or comparable ethical standards.
Consent to participate and for publication. In accordance with European regulations, French observational studies without any additional therapy or monitoring procedures do not need the approval of an ethics committee. Additionally, the need for informed signed consent was waived. Nevertheless, global information for people participating in research was provided, including a specific paragraph on the possibility of using health data for research purposes. The patient had the right to oppose the transmission of data covered by medical confidentiality that may be used and processed in the context of this research. The procedure was declared to the National Institute for Health Data with the registration no. F20201023145322.

Data availability
The data supporting the conclusions of this article will be made available by the authors, upon reasonable request.