Introduction

Gastric cancer (GC) is the fifth most common cancer and the fourth leading cause of cancer-related deaths worldwide1. Although the mortality rate of GC patients has continuously decreased, the survival rate markedly varies among countries, and the prognosis remains poor2,3. For potentially resectable GC patients, a combined-modality treatment that includes surgery, perioperative chemotherapy, or chemoradiation therapy is recommended4. In recent years, new treatment options such as targeted agents or immunotherapy have become available for high-risk disease patients5,6. Accordingly, improving risk stratification for GC patients after surgery has become more important for predicting long-term survival outcomes and further therapeutic planning.

Currently, the tumor-node-metastasis (TNM) staging system developed by the American Joint Committee on Cancer (AJCC) is the most commonly used method for classifying and predicting GC prognosis7. However, the anatomy-based TNM staging has limited use as a predictive tool for assessing individual patient survival because it does not include several significant factors that affect prognosis owing to the need for simplicity and uniform application of the staging system8. Better prognostic models for the accurate prediction of survival outcomes and identification of patients with poor prognoses are needed for tailored treatment.

Positron emission tomography/computed tomography (PET/CT) with [18F]Fluorodeoxyglucose ([18F]FDG) has become a widely used method for staging, response evaluation, recurrence detection, and restaging of GC9,10. Despite the controversy regarding its routine use in GC patients owing to its unsatisfactory sensitivity for a primary tumor or lymph node (LN) involvement11,12, recent studies have demonstrated a prognostic value of preoperative [18F]FDG PET/CT for GC. [18F]FDG uptake by the primary tumor reflects biological aggressiveness, and the high specificity of PET/CT for detecting LN and occult metastases could be valuable in predicting prognosis based on metabolic information, although it does not provide exquisite anatomic details9,13,14.

Several studies have revealed that positive [18F]FDG uptake by the primary gastric tumor is associated with inferior overall survival15,16 and that [18F]FDG uptake by metastatic LNs could be a surrogate prognostic marker17; however, no study has jointly analyzed the metabolic activities of the primary tumor and metastatic LNs for risk stratification in GC patients. Furthermore, no studies have attempted to develop a staging system for disease-specific survival (DSS) in GC patients using metabolic information from pretreatment [18F]FDG PET/CT.

Thus, this retrospective study aimed to investigate the prognostic impact of a metabolic score comprising the maximum standardized uptake value (SUVmax) of the primary tumor (T_SUVmax) and the metastatic LN (N_SUVmax) obtained via preoperative [18F]FDG PET/CT to predict DSS in GC patients. Furthermore, we assessed the additional prognostic value of an [18F]FDG PET/CT parameter for improving risk stratification. Finally, we developed a novel staging system using the metabolic score and compared its efficiency with that of conventional TNM staging for predicting DSS after curative surgery for GC.

Results

Patient characteristics

Among the 731 patients included in the analysis, 90 (12.3%) died during a median follow-up period of 87.2 (range 1.8–109.2) months. The overall 3- and 5-year DSS rates were 90.8% and 88.2%, respectively. The median DSS was 88.8 (range 1.8–109.2) months among survivors and 21.3 (range 1.9–93.1) months among nonsurvivors. The mean T_SUVmax and N_SUVmax were 3.4 ± 4.6 and 0.4 ± 1.8 in survivors and 6.7 ± 4.8 and 1.8 ± 3.6 in nonsurvivors, respectively.

The characteristics of the enrolled patients in the training cohort (n = 543) and validation cohort (n = 188) are summarized in Table 1. Regarding the TNM stage, 314 (57.8%) patients had stage I GC, 94 (17.3%) stage II GC, 132 (24.3%) stage III GC, and 3 (0.6%) stage IV GC in the training cohort. In the validation cohort, 123 (65.4%) patients had stage I GC, 36 (19.1%) stage II GC, 28 (14.9%) stage III GC, and 1 (0.5%) stage IV GC. Three of the four stage IV GC patients presented with hepatic metastasis, and the other presented with a seeding mass in the sigmoid colon mesentery. These patients underwent radical gastrectomy with a curative aim accompanied by metastatic mass excision.

Table 1 Patient characteristics in the training and validation cohorts.

Prognostic factors for DSS

We developed the metabolic score based on PET-derived variables, i.e. T_SUVmax and N_SUVmax, for predicting DSS using conditional inference trees (CTree) analysis in the training cohort. The optimal cut-off values of T_SUVmax and N_SUVmax for the metabolic score determined by CTree analysis were 4.5 and 1.9, respectively. The scores were as follows: metabolic score 1 (T_SUVmax ≤ 4.5 and N_SUVmax ≤ 1.9), metabolic score 2 (T_SUVmax > 4.5 and N_SUVmax ≤ 1.9), metabolic score 3 (T_SUVmax ≤ 4.5 and N_SUVmax > 1.9), and metabolic score 4 (T_SUVmax > 4.5 and N_SUVmax > 1.9). In total, 359 (66.1%), 125 (23.0%), 11 (2.0%), and 48 (8.9%) patients had metabolic scores of 1, 2, 3, and 4, respectively (Fig. 1). Kaplan–Meier analysis showed that a higher metabolic score was associated with a poorer DSS, and the log-rank test showed a significant difference in survival between the metabolic scores (P < 0.001) (Fig. 2A).
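The four-level scoring rule above can be written as a short function. This is a minimal illustrative sketch, not the authors' code: the function name and parameter names are hypothetical, and only the cut-off values (4.5 and 1.9) and the score definitions come from the text.

```python
def metabolic_score(t_suvmax: float, n_suvmax: float,
                    t_cutoff: float = 4.5, n_cutoff: float = 1.9) -> int:
    """Return the metabolic score (1-4) for one patient.

    The cut-offs default to the values reported for the training cohort.
    The function and argument names are illustrative assumptions.
    """
    high_t = t_suvmax > t_cutoff   # primary tumor uptake above cut-off
    high_n = n_suvmax > n_cutoff   # nodal uptake above cut-off
    # 1: both low, 2: high T only, 3: high N only, 4: both high
    return 1 + int(high_t) + 2 * int(high_n)


# Patients with no visible uptake are recorded with SUVmax = 0, so they score 1.
print(metabolic_score(0.0, 0.0))   # both low
print(metabolic_score(6.7, 0.0))   # high T_SUVmax only
print(metabolic_score(3.0, 2.5))   # high N_SUVmax only
print(metabolic_score(6.7, 3.6))   # both high
```

Because the thresholds use strict inequalities, a value exactly at the cut-off (for example T_SUVmax = 4.5) falls in the lower group, matching the "≤" in the definitions.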

Figure 1

Tree-structured survival analyses for the metabolic score by combining T_SUVmax and N_SUVmax in the training cohort. The optimal cut-off values of T_SUVmax and N_SUVmax for the metabolic score determined using conditional inference trees analysis were 4.5 and 1.9, respectively. T_SUVmax = maximum standardized uptake value of primary tumor; N_SUVmax = maximum standardized uptake value of metastatic lymph node.

Figure 2

Cumulative DSS curves of the 543 gastric cancer patients according to the metabolic score (A), TNM stage (B), and new staging system (C) in the training cohort. DSS = disease-specific survival; TNM = tumor-node-metastasis.

Univariate Cox proportional hazards regression analysis revealed that pathologic T (pT) stage, pathologic N (pN) stage, lymphovascular invasion, neural invasion, and the metabolic score were significantly associated with DSS in the training cohort (Table 2). In the multivariate analysis, pT stage (hazard ratio (HR) 1.69; 95% confidence interval (CI) 1.25–2.30; P < 0.001), pN stage (HR 1.69; 95% CI 1.29–2.22; P < 0.001), and the metabolic score (HR 1.28; 95% CI 1.06–1.56; P = 0.012) remained independent prognostic factors of DSS.

Table 2 Univariate and multivariate analyses of prognostic factors of disease-specific survival.

Construction of a new staging system incorporating the TNM stage and metabolic score

We constructed a hierarchical prognostic model for predicting DSS. Because metabolic score was determined to be an independent prognostic factor, we combined the metabolic score with the TNM stage for the new staging system. Patients were categorized into five new risk groups based on the results of the CTree analysis in the training cohort (Fig. 3): category I (TNM stage I, II and metabolic score 1), category II (TNM stage I, II and metabolic score 2), category III (TNM stage I, II and metabolic score ≥ 3), category IV (TNM stage III, IV and metabolic score ≤ 3), and category V (TNM stage III, IV and metabolic score 4). Regarding stage distribution according to the new prognostic system, 311 (57.3%) patients were classified into category I; 80 (14.7%) into category II; 17 (3.1%) into category III; 97 (17.9%) into category IV; and 38 (7.0%) into category V (Table 3).
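The five risk categories described above amount to a two-level lookup: TNM stage first, then metabolic score. The following is a minimal sketch under that reading. The function name and the integer encoding of TNM stage (1 to 4 for stages I to IV) are assumptions made for illustration.

```python
def new_staging_category(tnm_stage: int, score: int) -> str:
    """Map TNM stage (1-4, i.e. I-IV) and metabolic score (1-4) to category I-V.

    Integer encoding of the TNM stage is an illustrative assumption.
    """
    if tnm_stage not in (1, 2, 3, 4) or score not in (1, 2, 3, 4):
        raise ValueError("tnm_stage and score must each be in 1..4")
    if tnm_stage <= 2:            # TNM stage I-II branch
        if score == 1:
            return "I"
        if score == 2:
            return "II"
        return "III"              # metabolic score >= 3
    # TNM stage III-IV branch
    return "V" if score == 4 else "IV"


print(new_staging_category(1, 1))  # category I
print(new_staging_category(2, 3))  # category III
print(new_staging_category(3, 4))  # category V
```

Note that in the stage I–II branch the split between categories I and II depends only on T_SUVmax, whereas category III requires a high N_SUVmax.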

Figure 3

A new staging system combining the metabolic score and TNM stage, constructed using tree-structured survival analyses. Five terminal risk groups (new staging system) were established in the training cohort. TNM = tumor-node-metastasis.

Table 3 Patient distribution according to the 8th AJCC TNM stage and new staging system.

Comparison of prognostic performance between the TNM stage and new staging system

According to the TNM staging system, the DSS rates were 97.5% for stage I GC patients, 91.5% for stage II, 56.1% for stage III, and 0% for stage IV (P < 0.001) (Fig. 2B). In the new staging system, the DSS rates were 98.1% for category I, 92.5% for category II, 76.5% for category III, 60.8% for category IV, and 39.5% for category V (P < 0.001) (Fig. 2C). The calculated HRs for DSS (reference group: category I in the new staging system) increased in a stepwise manner (3.95 for category II, 13.26 for category III, 25.06 for category IV, and 54.24 for category V) in the training cohort.

Comparison of prognostic performance of the two prognostic models according to the χ2 likelihood ratio, Harrell’s C index, and Akaike information criterion (AIC) values showed that compared with the TNM stage, the new staging system had better homogeneity (χ2 likelihood ratio: 147.6 vs 131.3) and discriminatory capability (Harrell’s C index: 0.849 vs 0.823; AIC value: 806.3 vs 822.7) in the training cohort. This finding indicates that combining PET metabolic variables with the pathologic TNM stage could provide better prognostic stratification of GC patients.

Validation of the metabolic score and the new staging system

We performed internal validation of the metabolic score and new staging system established from the training cohort. Figure 4 shows the cumulative DSS curves according to the metabolic score, TNM stage, and the new staging system in the validation cohort. Kaplan–Meier analysis showed that a higher metabolic score was associated with poorer DSS, and the log-rank test showed a significant difference in survival between metabolic scores (P < 0.001) (Fig. 4A). The DSS rates were 99.2% in stage I, 91.7% in stage II, 67.9% in stage III, and 100% in stage IV (P < 0.001) under the TNM staging system (Fig. 4B). The single stage IV patient in the validation cohort survived after radical gastrectomy with curative excision of a solitary hepatic metastasis. In the new staging system, the DSS rates were 99.2% in category I, 92.6% in category II, 85.7% in category III, 70.0% in category IV, and 66.7% in category V (P < 0.001) (Fig. 4C). In the validation cohort, the χ2 likelihood ratio, Harrell’s C index, and AIC value were 23.98, 0.857, and 112.9 for the new staging system and 23.09, 0.849, and 113.7 for the TNM stage, respectively.

Figure 4

Cumulative DSS curves of the 188 gastric cancer patients according to the metabolic score (A), TNM stage (B), and new staging system (C) in the validation cohort. DSS = disease-specific survival; TNM = tumor-node-metastasis.

Discussion

In this study, we developed and internally validated a new risk prediction metabolic score and a new staging system using metabolic parameters of [18F]FDG PET/CT for predicting DSS in GC patients who undergo curative surgical resection. First, we developed a metabolic score that combines T_SUVmax and N_SUVmax. Second, because the pT/pN stages and metabolic score were independent prognostic factors for DSS in the multivariate analysis, we developed a new prognostic model by incorporating the metabolic score into the conventional TNM stage for improved DSS prediction in the training cohort. Our new staging system showed better performance for predicting DSS than the conventional TNM stage in the validation and training cohorts.

Several studies have revealed that the degree of [18F]FDG uptake by the primary tumor on [18F]FDG PET/CT could help predict survival in GC patients15,16,18,19. Mochiki et al. demonstrated that those with FDG PET-positive GCs showed significantly lower survival rates than those with FDG PET-negative tumors16. Furthermore, among patients who underwent curative surgical resection, those with a higher T_SUVmax had poorer overall survival than those with a lower T_SUVmax16. In the metastatic setting, Chung et al. showed that high T_SUVmax was associated with inferior overall survival in patients with metastatic gastric adenocarcinoma18. Similarly, Park et al. verified that T_SUVmax was the most robust independent factor for predicting prognosis in stage IV GC patients receiving palliative chemotherapy19. Despite the relatively small patient populations of these studies and the different thresholds they proposed, most studies reported that T_SUVmax was a significant prognostic factor for predicting the survival of GC patients possibly because an increased [18F]FDG uptake by primary tumors indicates the metabolic status and tumor aggressiveness20,21. However, T_SUVmax alone may be inadequate for precise survival prediction in GC patients when N_SUVmax is also considered a prognostic factor. We previously found that the metabolic information of metastatic LNs has a greater prognostic value than that of the primary tumor for predicting the survival of GC patients17.

There have been studies on the prognostic value of the metabolic activity of metastatic LNs15,17,22. Coupe et al. demonstrated that [18F]FDG positivity of LNs and primary tumors was associated with worse overall survival of GC patients15. We also found that N_SUVmax was an independent prognostic factor for overall and recurrence-free survival after curative resection in GC patients with LN involvement17. More recently, Wang et al. showed that the number of [18F]FDG PET-positive LNs could be a useful predictive marker for prognosis in locally advanced GC patients22. However, no study has evaluated the prognostic value of the combination of T_SUVmax and N_SUVmax in predicting the survival of GC patients. Notably, the present study showed that a higher metabolic score correlated with poor DSS and was confirmed to be an independent prognostic factor in GC patients.

Two recent studies developed a prognostic model using [18F]FDG PET/CT parameters for hepatocellular carcinoma and breast cancer23,24; however, a prognostic model using a combination of T_SUVmax and N_SUVmax has not been reported for GC. Accordingly, we developed and internally validated a new staging system that comprises a combination of the metabolic score and the conventional TNM stage, which are independent prognostic factors for DSS in GC patients. Five risk groups were derived from the model, and we found significant differences in DSS among the risk groups. Although direct comparison between the stages of the two models is difficult because of the different subgroups in each stage, the new staging system, which was developed using statistical methods for scientific rationality, showed better discriminatory capability than the conventional TNM staging. In the decision tree, the conventional TNM stage was selected as the first-order risk factor, and patients were divided into two groups (the TNM stage I–II group and the TNM stage III–IV group). The metabolic score was then added as the risk factor for subgroup classification of these two groups. In patients with TNM stage I–II, T_SUVmax could potentially aid in stratifying categories I and II in the new staging system. Furthermore, patients with TNM stage I–II GC and high N_SUVmax were classified into the new staging category III, while all patients with TNM stage III–IV GC were grouped into the new staging category IV–V. All three patients with distant metastasis (TNM stage IV) were also classified into the new staging category V.

Many studies have shown that despite its high prognostic value in many cancers, [18F]FDG PET/CT has low sensitivity for detecting LN metastasis15,17,25. In the training cohort of the present study, only 59 (10.9%) of 543 patients had a metabolic score of 3 or 4, that is, a high N_SUVmax. As such, if only N_SUVmax were used as a prognostic factor, most patients would not be further classified into subgroups. However, when T_SUVmax was added as a second prognostic factor, the 484 patients with a low N_SUVmax were further divided into those with a metabolic score of 1 (n = 359) or 2 (n = 125). This classification using the combination of T_SUVmax and N_SUVmax showed a significant additional effect for the prognostic model as well as independent prognostic value. Although N_SUVmax has a high prognostic value, many patients show negative nodal FDG uptake. Thus, T_SUVmax helps to further group these patients.

The present study had a few limitations. First, this was a single-institution retrospective study that might have been subject to selection bias. For example, all patients who underwent preoperative treatment were excluded because any treatment before surgical resection could affect histopathologic results. Moreover, patients who received delayed surgical treatment (i.e. more than 1 month after [18F]FDG PET/CT) were excluded. Second, T_SUVmax in early GC and N_SUVmax in patients with small-sized metastatic LN could have been underestimated due to partial volume effects. Third, we could not completely rule out the possibility of the impact of physiological FDG uptake by the normal stomach wall or increased FDG uptake by inflammatory LNs. Finally, although we internally validated our risk prediction model, further prospective studies and external validation should be conducted to generalize the prognostic impact of the metabolic score and new prognostic model in patients with GC.

In conclusion, this study identified the metabolic score comprising T_SUVmax and N_SUVmax as an independent predictor of DSS after curative surgical resection in GC patients. Furthermore, the new staging system comprising the metabolic score and TNM stage showed better prognostic performance for DSS risk stratification than the TNM staging system alone. Therefore, [18F]FDG PET/CT could be used not only for individualized preoperative therapeutic planning but also for stratifying patients into different survival groups after surgical treatment and determining appropriate additional treatments.

Methods

This study followed the medical research protocols and ethical guidelines laid down by the World Medical Association’s Declaration of Helsinki. The Institutional Review Board of Keimyung University Dongsan Hospital approved this retrospective study and waived the requirement to obtain informed consent (2018-06-028).

Patients

We retrospectively reviewed the medical records of 1141 patients who underwent surgery for primary GC at our institution between January 2008 and December 2011. Of these, 731 patients who underwent preoperative [18F]FDG PET/CT for a staging workup and subsequent curative surgical resection were enrolled in this study. The entire cohort was divided into a training cohort (n = 543) that underwent surgery between January 2008 and December 2010, and a validation cohort (n = 188) that underwent surgery between January 2011 and December 2011 (Fig. 5). The exclusion criteria were as follows: multiple primary malignancies, microscopic or macroscopic residual disease after surgical resection, any other treatment before surgery, surgery for recurrent GC, death within 30 days after surgery, an unavailable pathologic report, or an interval of more than 1 month between [18F]FDG PET/CT and surgery.
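The temporal split into training and validation cohorts depends only on the date of surgery. A minimal sketch of that assignment rule is shown below. The function name is hypothetical, and the exact day boundaries (first and last day of each month) are an assumption, since the text gives only months.

```python
from datetime import date


def assign_cohort(surgery_date: date) -> str:
    """Assign a patient to a cohort by surgery date (temporal split).

    Day-level boundaries are assumed; the source states months only.
    """
    if date(2008, 1, 1) <= surgery_date <= date(2010, 12, 31):
        return "training"
    if date(2011, 1, 1) <= surgery_date <= date(2011, 12, 31):
        return "validation"
    raise ValueError("surgery date outside the study period")


print(assign_cohort(date(2009, 6, 15)))   # training
print(assign_cohort(date(2011, 3, 2)))    # validation
```

Splitting by calendar period rather than at random means the validation cohort reflects later practice, including the second scanner introduced for part of that cohort.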

Figure 5

Flow diagram of patient selection.

All patients underwent subtotal or total gastrectomy with D2 lymphadenectomy for advanced GC or with D1 + β or D2 lymphadenectomy for early GC. Clinicopathologic data, including sex, age at surgery, tumor location, World Health Organization and Lauren histopathological subtypes, lymphovascular invasion, neural invasion, and pT and pN stages, were retrieved from patients’ medical records. The survival data were retrieved from the National Health Insurance Service. The pT and pN stages were classified according to the 8th edition of the AJCC TNM staging system.

[18F]FDG PET/CT and image analysis

Before [18F]FDG injection, all patients fasted for at least 6 h, and the blood glucose level was kept below 150 mg/dL. Patients were instructed to rest during the [18F]FDG uptake period. Images were acquired 60 min after intravenous administration of 5.5 MBq/kg of [18F]FDG. [18F]FDG PET/CT was performed using one of two integrated PET/CT systems (Discovery STe, GE Healthcare; or Biograph mCT, Siemens Healthineers). First, a low-dose CT image was acquired for attenuation correction (Discovery STe: peak voltage, 120 kV; automated tube current, 60–150 mA; slice thickness, 3.75 mm. Biograph mCT: peak voltage, 120 kV; automated exposure control using CARE Dose4D; slice thickness, 3 mm). No oral or intravenous contrast was used. Immediately following CT, PET was performed in three-dimensional mode with an acquisition time of 3 min per bed position on the Discovery STe and 1.5 min per bed position on the Biograph mCT. PET images were reconstructed using an ordered-subset expectation maximization iterative reconstruction algorithm. All 543 training cohort patients were scanned on the Discovery STe. Of the 188 validation cohort patients, 82 were scanned on the Discovery STe and 106 on the Biograph mCT.

The images were retrospectively interpreted on an AW server 3.2 (GE Healthcare) by two experienced nuclear medicine physicians who were blinded to patient survival outcomes, and a consensus was achieved. First, all [18F]FDG PET/CT images were visually assessed and classified as positive or negative with respect to [18F]FDG uptake by the primary tumor. Positive uptake was defined as abnormally increased [18F]FDG uptake that exceeded the physiologic uptake by the surrounding stomach wall and corresponded to cancer lesions on esophagogastroduodenoscopy. Negative uptake was defined as no significantly visible [18F]FDG uptake or diffusely increased uptake indistinguishable from physiologic gastric wall uptake. Focally increased [18F]FDG uptake lesions that did not correspond to cancer lesions on esophagogastroduodenoscopy and histopathological findings were also judged to be negative. Consequently, T_SUVmax was obtained only in lesions with positive [18F]FDG uptake. For metastatic LNs, N_SUVmax was measured in the LN with the highest focal [18F]FDG uptake on the PET image, regardless of its size on CT, for semiquantitative analysis. Circular regions of interest were manually drawn over the lesions with maximum [18F]FDG uptake on the attenuation-corrected transaxial [18F]FDG PET images. We assigned an SUVmax of 0 to patients with negative [18F]FDG uptake of the primary tumor or LNs. The SUVmax was calculated using the following formula: SUVmax = maximum activity in the region of interest (MBq/g) / (injected dose [MBq] / body weight [g]).
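The SUVmax formula can be illustrated with a small worked example. This is a sketch of the arithmetic only: the function names are hypothetical, the numbers are invented for illustration, and decay correction (which clinical software normally applies) is deliberately left out because the formula as stated does not include it.

```python
def suv_max(max_activity_mbq_per_g: float,
            injected_dose_mbq: float,
            body_weight_g: float) -> float:
    """SUVmax = max ROI activity (MBq/g) / (injected dose (MBq) / body weight (g))."""
    if injected_dose_mbq <= 0 or body_weight_g <= 0:
        raise ValueError("dose and body weight must be positive")
    return max_activity_mbq_per_g / (injected_dose_mbq / body_weight_g)


def recorded_suvmax(uptake_positive: bool, measured: float) -> float:
    """Visually negative lesions are recorded as 0, as described in the text."""
    return measured if uptake_positive else 0.0


# Hypothetical 70 kg patient dosed at 5.5 MBq/kg (385 MBq in total).
dose = 5.5 * 70                      # MBq
value = suv_max(0.022, dose, 70_000)  # 0.022 MBq/g is an invented ROI maximum
print(round(value, 2))
print(recorded_suvmax(False, value))
```

With weight-based dosing, the denominator reduces to the dose per gram (here 0.0055 MBq/g), so SUVmax is simply the ratio of the lesion's peak activity concentration to the whole-body average.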

Statistical analysis

Continuous variables are expressed as means ± standard deviations, and categorical variables as numbers and percentages. The metabolic score, which is a combined index of T_SUVmax and N_SUVmax, was developed for predicting DSS using CTree analysis with the R package “party”26. DSS was defined as the interval between surgery and the date of cancer-specific death (deaths from other causes were censored) and was estimated using the Kaplan–Meier method. Multivariate Cox proportional hazards regression analyses were performed to identify independent variables affecting DSS, and the HRs and 95% CIs were estimated for each parameter. The new staging system, which was a tree-structured survival model created using CTree analysis, was established by combining the metabolic score and TNM stage in the training cohort.
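To make the DSS definition concrete, the sketch below implements the Kaplan–Meier product-limit estimate in plain Python, with deaths from other causes treated as censored observations. It is an illustration of the estimator only, not the software used in the study (MedCalc and R), and the function name and toy data are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  : follow-up time per patient (e.g. months from surgery)
    events : 1 = cancer-specific death, 0 = censored (alive at last
             follow-up, or death from another cause)
    Returns a list of (time, survival probability) at each death time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)        # everyone leaves the risk set at their time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:                       # survival only drops at death times
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]       # remove deaths and censored patients
    return curve


# Toy data: five patients, two cancer-specific deaths, three censored.
print(kaplan_meier([5, 8, 12, 12, 20], [1, 0, 1, 0, 0]))
```

Censored patients never lower the curve themselves. They only shrink the risk set, which is why coding non-cancer deaths as censored yields disease-specific rather than overall survival.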

The prognostic performance of the new staging system and that of the TNM stage were statistically compared. To compare the homogeneity of the TNM stage with that of the new staging system, the χ2 likelihood ratio test related to the Cox regression model was used. The discriminatory capability of gradient assessments was evaluated using the AIC and Harrell’s C index. Models with higher χ2 likelihood ratios and Harrell’s C indices were deemed more accurate. In addition, a lower AIC value indicates that the model attains a better balance between the overall fit to the data and the model’s simplicity27,28. To determine the generalizability of the metabolic score and new staging system derived from the training cohort, internal validation was performed in the validation cohort. All statistical analyses were performed using MedCalc for Windows, version 18.6 (MedCalc Software), and R version 3.4.3 software (http://www.r-project.org, R Foundation for Statistical Computing). P values < 0.05 were considered statistically significant.
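The two discrimination measures can be sketched as follows. Harrell’s C index is the fraction of comparable patient pairs in which the patient with the higher predicted risk has the shorter survival, and the AIC is computed as 2k − 2 ln L, where L is the (partial) likelihood of the fitted Cox model and k the number of estimated parameters. The code is a simplified illustration with hypothetical function names: pairs with tied survival times are skipped, which full implementations handle more carefully.

```python
from itertools import combinations


def harrell_c(times, events, risk):
    """Harrell's C index for right-censored data (simplified tie handling).

    risk: higher value = worse predicted prognosis (e.g. staging category).
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue                 # simplification: skip tied times
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue                 # earlier time censored: order unknown
        comparable += 1
        if risk[first] > risk[second]:
            concordant += 1.0        # higher risk died earlier
        elif risk[first] == risk[second]:
            concordant += 0.5        # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


print(harrell_c([5, 8, 12, 20], [1, 0, 1, 0], [4, 3, 2, 1]))
print(aic(-400.0, 2))
```

A C index of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so the reported values above 0.8 indicate strong discrimination for both staging systems.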