Metabolic tumor volume on interim PET is a better predictor of outcome in diffuse large B-cell lymphoma than semiquantitative methods

Radiologic methods that accurately assess clinical response are essential for evaluating current and experimental regimens used to treat hematologic malignancies. Recent advances that combine chemotherapy with the anti-CD20-targeted agent rituximab (Rituxan) have improved the clinical outcome of patients diagnosed with diffuse large B-cell lymphoma (DLBCL), but only 60% of all DLBCL patients are potentially cured and achieve sustained progression-free survival (PFS). PFS after salvage therapies, including autologous stem cell transplantation, drops to 30%, reflecting frequent disease relapse and a poor prognosis. 1 A response-adaptive imaging strategy that accurately determines the initial response to therapy and then individualizes subsequent treatment could improve PFS, reduce relapse rates and improve clinical outcomes.
Positron emission tomography (PET) integrated with computerized tomography (CT), which combines anatomical delineation with the metabolic activity of tumor tissue, is the main tool used to determine the therapeutic response of DLBCL patients. 18F-labeled fluorodeoxyglucose (18F-FDG) PET can differentiate viable tumor from posttreatment necrotic tissue or fibrosis, making it the imaging modality of choice upon completion of chemotherapy. 2 Although there has been an increasing trend to perform interim PET/CT (interim PET) after 2-4 cycles of induction chemotherapy to monitor response and tailor consolidation therapy, the optimal interpretation method for interim PET analyses remains uncertain. 3 Importantly, there is an unmet need for a quantitative, standardized and reproducible method for this purpose. 4 Although semiquantitative methods, such as determination of the maximum standard uptake value (SUVmax), partially meet these criteria, studies have not defined a uniformly applicable SUVmax reduction cutoff that accurately predicts PFS or clinical outcome. 5,6 SUVmax represents a single-pixel value, which reflects the maximum intensity of 18F-FDG activity in the tumor and ignores the extent of metabolic abnormality and changes in the distribution of the tracer within a lesion. SUVmax reflects increased anaerobic metabolism and higher glucose consumption. The region of highest uptake is located in the hypoxic tumor core, where irregular angiogenesis results in leakier and less effective vasculature that may impair drug delivery. 7 Thus, using SUVmax reduction to assess chemotherapy effectiveness may miss the more dynamic areas of the tumor and those with better drug delivery. Although complete disappearance of SUVmax may indicate complete response, SUVmax may not, in fact, be the best index of early tumor response to a given treatment.
Therefore, alternative metabolic parameters that integrate both tumor volume and intensity of uptake may provide more precise clinical information. We hypothesized that a method that maximizes the detection of all metabolically active regions within the tumor mass, defined as the metabolic tumor volume (MTV), could serve as a better predictor of clinical outcome than semiquantitative methods, that is, SUVmax measurement. Here, we compared the ability of MTV measured by gradient- or threshold-based methods with that of semiquantitative SUVmax measurement on interim PET to predict the PFS of DLBCL patients after initial therapy.
A total of 197 patients with a pathology-confirmed diagnosis of DLBCL were treated from December 2006 to December 2014. Of the 197 patients, 140 underwent interim PET analysis. Patient characteristics are shown in Table 1. The primary end point of the study was PFS, defined as the time from the beginning of treatment to first progression, relapse, death from any cause or last follow-up visit. Patients still alive were censored at the date of last contact. Interim PET was performed after 2-4 cycles of chemotherapy, with images acquired from the orbits to the proximal third of the thighs. All patients fasted >6 h before intravenous injection of 18F-FDG and had glucose levels >90 and <160 mg/dl at the time of injection; scans were performed within 90 min after injection, and granulocyte-colony stimulating factor was stopped >48 h before imaging. Interim results were interpreted as either positive or negative by visual dichotomous response criteria according to the five-point Deauville scoring system.
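The end point and censoring rule above are what a Kaplan-Meier estimate of PFS consumes. The following is a minimal sketch, not the statistical software used in the study; the function name `kaplan_meier` and the follow-up times are hypothetical and chosen only for illustration. An event flag of 1 marks progression, relapse or death, and 0 marks a patient censored at last contact.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each time where at least one event occurs.

    times  : follow-up in months (hypothetical values in the example below)
    events : 1 = progression, relapse or death; 0 = censored at last contact
    """
    order = sorted(zip(times, events))
    at_risk = len(order)
    survival = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = order[i][0]
        n_total = sum(1 for tt, _ in order if tt == t)
        n_events = sum(1 for tt, e in order if tt == t and e == 1)
        if n_events:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Events and censored patients both leave the risk set after t.
        at_risk -= n_total
        i += n_total
    return curve


# Illustrative data only: five patients, two events, three censored.
curve = kaplan_meier([6, 12, 12, 20, 37], [1, 0, 1, 0, 0])
```

Censored patients contribute to the number at risk until their last contact but never lower the curve themselves, which is why the estimate only steps down at event times.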
To evaluate the contribution of metabolic activity within the tumor periphery in assessing clinical outcomes, two different methods, fixed threshold-based and gradient-based, were used to measure MTV. The fixed threshold-based method measures tumor volume using software that includes all detectable areas with 18F-FDG uptake greater than a fixed percentage of SUVmax (usually defined as 37%). 8 Gradient-based methods are designed to allow a better estimation of intensity by reconstructing images that are denoised and deblurred with an edge-preserving filter and an iterative deconvolution algorithm. 9 The tumor periphery, where a sharp drop in FDG uptake is seen, is taken as the edge of the metabolically active tumor volume. Gradient-based methods appear to be more accurate than source-to-background ratio methods for segmenting FDG-PET images. 10 SUVmax and MTV were determined from the initial and interim PET images using PET Edge software (MIM Software Inc., Cleveland, OH, USA).
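The fixed threshold rule described above can be sketched in a few lines. This is an illustrative approximation, not the PET Edge implementation: it assumes the input array has already been cropped to a volume of interest around a single lesion, it does not enforce connectivity of the selected voxels, and the function name `fixed_threshold_mtv`, the toy SUV values and the voxel volume are all hypothetical.

```python
import numpy as np


def fixed_threshold_mtv(suv, voxel_volume_ml, fraction=0.37):
    """Return (SUVmax, MTV in ml) for one lesion volume of interest.

    suv             : array of SUV values inside the volume of interest
    voxel_volume_ml : volume of a single voxel in ml (scanner dependent)
    fraction        : fixed fraction of SUVmax used as the cutoff (37% here)
    """
    suv = np.asarray(suv, dtype=float)
    suv_max = float(suv.max())
    # Keep every voxel whose uptake exceeds the fixed fraction of SUVmax.
    mask = suv > fraction * suv_max
    mtv_ml = float(mask.sum()) * voxel_volume_ml
    return suv_max, mtv_ml


# Toy 3x3x3 volume of interest: background SUV of 1 with a small hot region.
voi = np.full((3, 3, 3), 1.0)
voi[1, 1, 1] = 10.0  # hottest voxel, defines SUVmax
voi[1, 1, 2] = 5.0   # above 37% of SUVmax (3.7), counted
voi[1, 2, 1] = 3.0   # below the cutoff, excluded
suv_max, mtv_ml = fixed_threshold_mtv(voi, voxel_volume_ml=0.064)
```

Because the cutoff scales with SUVmax, a lesion with a very hot core can yield a small segmented volume even when its periphery remains metabolically active, which is the limitation that gradient-based edge detection is meant to address.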
The median follow-up period for patients in the study was 37 months. R-CHOP (Rituximab, Cyclophosphamide, doxorubicin/Hydroxydaunomycin, vincristine/Oncovin and Prednisone) and R-DA-EPOCH (Rituximab, Dose-Adjusted Etoposide, Prednisone, Oncovin, Cyclophosphamide, Hydroxydaunorubicin) were the first line of therapy in 74% and 26% of patients, respectively. On interim PET/CT, 69% of patients achieved complete response, with the remaining patients showing partial response based on visual assessment. Dichotomous visual interpretation of interim PET did not correlate with PFS (log-rank P = 0.37). Compared with the threshold-based method, the gradient-based method yielded a statistically significantly greater MTV on both pretreatment and interim PET images. However, no significant difference was noted between the reduction in MTV determined by the threshold-based (ΔMTV T) and gradient-based (ΔMTV G) methods (median 34% vs 36%, P = 0.29). In contrast, the reduction in SUVmax (ΔSUVmax) was greater than the reduction in MTV measured by either method (median ΔSUVmax, ΔMTV T and ΔMTV G of 65%, 34% and 36%, respectively; P = 0.043).
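The reductions reported above are percentage changes between the pretreatment and interim scans. The text does not spell out the formula, so the sketch below assumes the conventional definition, (baseline - interim) / baseline x 100; the function name `percent_reduction` and the input values are hypothetical and are not patient data.

```python
def percent_reduction(baseline, interim):
    """Percentage reduction from the pretreatment to the interim value."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - interim) / baseline


# Illustrative values only: SUVmax falls from 20 to 7, MTV from 50 to 33 ml.
delta_suv_max = percent_reduction(20.0, 7.0)
delta_mtv = percent_reduction(50.0, 33.0)
```

In this toy case the hottest voxel cools by 65% while the metabolically active volume shrinks by only 34%, showing how the two indices can diverge within the same lesion.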
As no difference was found between the two methods in determining ΔMTV, and as the threshold-based method was more versatile, this method was used to correlate interim PET values with PFS. To identify an optimal cutoff that could predict PFS more accurately, receiver operating characteristic (ROC) curve analysis was used. The area under the ROC curve (AUC) provides a measure of the accuracy of a diagnostic test and ranges from 0.5 (random guessing) to 1.0 (perfect test). 11 The thresholds of ΔSUVmax and ΔMTV by this method were 72% and 52%, respectively. ΔMTV predicted PFS better than ΔSUVmax, as the AUC for ΔMTV was significantly larger than that for ΔSUVmax (AUC ΔMTV: 0.713 and AUC ΔSUVmax: 0.873; P = 0.0324) (Figure 1a). All patients who achieved an SUVmax reduction greater than the cutoff value determined by the ROC analysis (ΔSUVmax >72%) were then stratified into two groups based on a ΔMTV cutoff value of > or <52%. Of a total of 115 patients who achieved a ΔSUVmax >72% on interim PET/CT imaging, 77 (67%) had a ΔMTV >52%. Importantly, patients who achieved a ΔMTV >52% had a statistically significantly greater PFS than patients who achieved a ΔMTV <52% (hazard ratio: 1.37; confidence interval: 1.03-1.71; P = 0.02; Figure 1b).
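The ROC analysis above can be illustrated with a small, self-contained sketch. The text does not state how the optimal cutoff was selected, so the use of Youden's J (sensitivity + specificity - 1) here is an assumption, as is treating the outcome as a binary progression-free flag; the function names and the eight data points are hypothetical and do not reproduce the study's values.

```python
import numpy as np


def roc_auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]
    # Ties between a positive and a negative case count as half a win.
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)


def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing sensitivity + specificity - 1."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best_cut, best_j = None, -1.0
    for cut in np.unique(scores):
        predicted = scores >= cut  # reduction at or above cutoff = responder
        sensitivity = (predicted & labels).sum() / labels.sum()
        specificity = (~predicted & ~labels).sum() / (~labels).sum()
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_cut, best_j = float(cut), float(j)
    return best_cut, best_j


# Illustrative data only: percentage reduction and progression-free status.
delta_mtv = [80, 70, 60, 55, 40, 30, 20, 65]
progression_free = [1, 1, 1, 1, 0, 0, 0, 0]
auc = roc_auc(delta_mtv, progression_free)
cutoff, j = youden_cutoff(delta_mtv, progression_free)
```

Running the same two functions on ΔSUVmax and ΔMTV for the same patients gives the pair of AUC values and cutoffs that a comparison such as Figure 1a is built from; testing whether two AUCs differ significantly requires an additional method that this sketch does not cover.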
In this retrospective study, we correlated the reduction in MTV and SUVmax on interim PET with PFS. MTV measurement using a gradient-based method yielded a greater tumor volume than the threshold-based method. The two methods revealed a similar percent reduction in MTV and appear equivalent with respect to interim PET results. However, MTV reduction by either method after initial treatment was a better predictor of PFS than SUVmax reduction. Further analysis also revealed the importance of MTV reduction on interim PET in predicting PFS for patients who had also achieved a significant reduction in SUVmax (Figure 1b). Although SUVmax assessment represents a significant improvement over subjective visual assessment of interim PET scans, alone it does not adequately predict PFS. 12 In contrast, MTV assessment (by either gradient- or threshold-based methods) more accurately predicted PFS, as it incorporates the metabolic contribution of the tumor periphery. Commonly, the peripheral tumor is not adequately assessed, although it is metabolically active.
Although prior reports highlight the prognostic value of interim PET based on visual assessment, other studies, including ours, have not demonstrated a statistically significant difference in PFS between patients with positive and negative scans. 13 Such results may be due to the high degree of interobserver variability inherent in visual assessment methods. The ΔSUVmax cutoff values estimated by ROC analysis and used here to distinguish good and bad responders were similar to values previously reported in independent cohorts after either two or four cycles of induction treatment. 4,14 Thus, these thresholds appear to be robust and reproducible regardless of patient age and International Prognostic Index in DLBCL patients. Our study highlights the importance of MTV assessment combined with semiquantitative measurements on interim PET to better predict the clinical outcome of DLBCL patients. Metabolic activity of the peripheral tumor should be incorporated into response-adaptive strategies and prospective trials that evaluate the response to current and novel therapeutic regimens used to treat DLBCL patients.
Figure 1. (a) ROC curves for ΔMTV and ΔSUVmax for predicting PFS. MTV was measured by two different methods: threshold-based, using 37% of SUVmax as the threshold, and gradient-based, using the PET Edge software. The software calculates spatial derivatives along the tumor radii and then defines the tumor edge on the basis of derivative levels and the continuity of the tumor edge. All measurements were performed by a single operator. The thresholds of ΔSUVmax and ΔMTV by ROC curve analysis were 72% and 52%, respectively. ΔMTV predicted PFS better than ΔSUVmax, as the AUC for ΔMTV was significantly larger than the AUC for ΔSUVmax (AUC ΔMTV: 0.713 and AUC ΔSUVmax: 0.873; P = 0.0324). (b) Kaplan-Meier curves for patients who achieved adequate SUVmax reduction (ΔSUVmax >72%), stratified into two groups based on ΔMTV. ΔMTV can predict PFS in a subset of patients who had a significant SUVmax reduction on interim PET.