Introduction

In China, esophageal cancer was associated with a diagnosis rate of 477.9 cases/100,000 population and a mortality rate of 375/100,000 population during 20151,2. The standard treatment for locally advanced esophageal squamous cell carcinoma (ESCC) is multimodal treatment, including definitive chemotherapy and radiotherapy (CRT) or preoperative CRT3,4,5,6,7. To improve patient management, it is important to predict each patient’s prognosis and response to multimodal treatment. The current esophageal cancer treatment guidelines are mainly based on clinical or pathological staging, although the clinical stage cannot predict patients’ response to multimodal treatment4,8,9,10. This is because clinical staging is based on various examinations, including computed tomography (CT), esophageal ultrasonography, positron emission tomography–computed tomography (PET-CT)11, and magnetic resonance imaging (MRI)12,13. Various imaging-related parameters, such as CT-based or MR-based tumor volume, may have prognostic value in this setting. However, the measured volume can vary for many reasons, such as delineation by different oncologists or the use of different imaging equipment13, which makes tumor volume alone an unreliable prognostic predictor. Interestingly, CT-based compactness, which is calculated from the primary tumor’s volume and surface area, is a prognostic factor in head, neck, and lung cancers14,15,16. Compactness is defined as a numerical quantity that can be calculated for three-dimensional objects as a function of the volume and surface area. However, to the best of our knowledge, no studies have examined the prognostic value of ESCC compactness or its correlations with treatment response, TNM stage, and radio-sensitivity. Therefore, the present study aimed to develop and validate a CT-based compactness risk model for predicting prognosis and the response of ESCC to multimodal treatment.

Materials and Methods

Study design

The present study evaluated data from three separate cancer centers, and the institutional review board of each center (National Cancer Center, Sichuan Cancer Center, and Fujian Cancer Hospital) approved the study’s retrospective protocol. The risk model was created based on separate datasets from the participating centers, which included pre-treatment, imaging, treatment, and outcome data. Disease staging was performed according to the sixth edition (2002) of the AJCC staging manual17. The characteristics of the patients in the training and validation datasets are listed in Table 1. The treatment details are provided in the Supplementary Information. The study’s design is shown in Fig. 1.

Table 1 Characteristics of the patients in each dataset.
Figure 1

The definition of compactness was applied to the four datasets. One dataset was used for training to determine the model’s value for predicting prognosis and treatment response. The other three datasets were used to validate the model and clarify the relationships between compactness, clinical TNM staging, and radio-sensitivity. CCRT: concurrent chemoradiotherapy, RT: radiotherapy, Preo: preoperative, pCR: pathological complete response.

Datasets

The training dataset (ESCC1) included data from 83 patients who participated in a prospective randomized study (NCT01551589) that examined involved field irradiation and elective nodal irradiation for esophageal cancer18. The patients had undergone concurrent chemoradiotherapy (CCRT) for locally advanced ESCC at the Sichuan Cancer Center, and had available data regarding the CT simulation, gross tumor volume (GTV) delineations, clinical TNM stage (IIB–III), and survival outcomes. This dataset was used to assess the ability of tumor compactness to predict prognosis and treatment response, and to identify the optimal cut-off value for the risk model. In this dataset, 20 patients were randomly selected for multiple contour analysis by different oncologists.

The first validation dataset (ESCC2) consisted of 98 patients who underwent CCRT for ESCC at the Fujian Cancer Center. These patients also had available data regarding the CT simulation, GTV delineations, PET-CT findings, clinical TNM stage (IIB–IVB), and survival outcomes. This dataset was used to evaluate the ability of tumor compactness to predict prognosis and treatment response, as well as the relationships with TNM stage and lymph node metastasis.

The second validation dataset (ESCC3) consisted of 283 patients who were treated for ESCC (56 patients received CCRT and 227 patients received RT alone) at the National Cancer Center (Beijing). These patients also had available data regarding CT simulation, GTV delineations, ultrasonography findings, clinical TNM stage (I–IVB), and survival outcomes. This dataset was used to evaluate the ability of tumor compactness to predict prognosis and treatment response, as well as the relationships with clinical T stage and radio-sensitivity.

The third validation dataset (ESCC4) consisted of 48 patients who underwent preoperative CCRT at the National Cancer Center. This dataset was used to evaluate the relationship between tumor compactness and radio-sensitivity.

CT data acquisition and compactness calculation

All patients underwent radiotherapy (3D-CRT or IMRT) with or without chemotherapy. The CT data were downloaded from each center’s treatment planning system (step 1, Figure S1). Using Imaging Biomarker Explorer software (IBEX version 1.0β)19, the CT data were imported (step 2) and the GTV delineations were performed by three radiation oncologists in each center (steps 3 and 4). Esophageal stents, nasogastric tubes, intraluminal air, and oral contrast material were excluded from the GTV, and then descriptors were created for the GTV’s three-dimensional size and shape (step 5). Compactness is defined as a numerical quantity that can be calculated for three-dimensional objects as a function of the volume and surface area (step 6):

$$\mathrm{Compactness}=\frac{\mathrm{volume}}{\sqrt{\pi}\times {(\mathrm{surface\ area})}^{2/3}}.$$
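As an illustration only, the compactness expression given in the text can be evaluated as in the minimal Python sketch below. The function name `compactness` and the example inputs are hypothetical, and the sketch does not reproduce the IBEX implementation. Because the expression as written is not dimensionless, the result depends on the length unit of the inputs, which the text does not state.

```python
import math


def compactness(volume: float, surface_area: float) -> float:
    """Evaluate volume / (sqrt(pi) * surface_area ** (2/3)).

    Hypothetical helper that follows the expression in the text.
    Both inputs must use consistent length units (for example,
    mm^3 for volume and mm^2 for surface area; this is an assumption).
    """
    if volume < 0 or surface_area <= 0:
        raise ValueError("volume must be >= 0 and surface_area must be > 0")
    return volume / (math.sqrt(math.pi) * surface_area ** (2.0 / 3.0))


if __name__ == "__main__":
    # Illustrative numbers only; not taken from the study data.
    print(compactness(volume=8.0, surface_area=8.0))
```

In practice the volume and surface area would come from the delineated GTV, so any difference in contouring changes both inputs at once.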

Data analysis

The analysis was divided into training and validation phases. For the training phase, we evaluated the prognostic values of compactness, volume, and surface area using Cox proportional hazards regression models, and compactness was selected because it had the best prognostic performance. In the training cohort, the tumors of 20 patients were separately delineated by four radiation oncologists (one vice chief doctor and three associate doctors), with delineation stability being evaluated using the Friedman test. The optimal compactness cut-off values for predicting survival were identified using X-tile software (version 3.6.1), and the patients were stratified according to their compactness values. For the validation phase, the compactness-based risk model was applied to three separate cohorts (ESCC2–4), and the model’s prognostic value was assessed using Kaplan-Meier curves and the log-rank test. Univariate Cox analyses were performed with several clinical variables, and multivariate Cox analysis was subsequently performed to determine whether the risk model was an independent prognostic factor. The associations between CT-based compactness and tumor length or TNM staging were also evaluated using Spearman’s rank correlation coefficient or the Mann-Whitney U test. The relative predictive values of tumor compactness, tumor length, and TNM staging were evaluated using each factor’s Harrell concordance index (C-index) value, with higher values indicating better ability to predict prognosis.
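To make the C-index comparison concrete, the sketch below computes a simplified Harrell concordance index in plain Python. It is not the routine used in the study (the analyses were run in SPSS and R); the function name `harrell_c_index` and the toy data are hypothetical. A pair of patients is treated as comparable only when the patient with the shorter follow-up time had an observed event, and pairs with tied times are ignored.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Simplified Harrell C-index (hypothetical helper).

    times       : follow-up time for each patient
    events      : 1 if the event was observed, 0 if censored
    risk_scores : higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # higher risk failed earlier
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Toy data only; not taken from the study.
    print(harrell_c_index([5, 10, 15], [1, 1, 0], [0.9, 0.3, 0.6]))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is why higher values are read as better prognostic discrimination.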

The ability of the model to predict treatment response was evaluated using the Beijing dataset (ESCC3) that included patients who underwent RT or CCRT. To further adjust for unbalanced factors, propensity score matching was performed to create comparable groups that underwent RT or CCRT. The propensity score for each patient was estimated using a logit model that included age, sex, tumor location, and clinical stage. Nearest neighbor matching (1:1) was then performed within a prespecified caliper width but without replacement. The survival benefits of CCRT and RT were compared using Kaplan-Meier curves and the log-rank test for the various compactness-based subgroups.
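The matching step can be sketched as follows, assuming that propensity scores have already been estimated by a separate logit model. This is a greedy 1:1 nearest-neighbor match within a caliper and without replacement. The function name `match_nearest_neighbor`, the processing order (treated patients in descending score), and the caliper value in the example are assumptions, since the text does not report them.

```python
from typing import Dict, List, Tuple


def match_nearest_neighbor(
    treated: Dict[str, float],
    control: Dict[str, float],
    caliper: float,
) -> List[Tuple[str, str]]:
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, control : patient id -> propensity score
    caliper          : maximum allowed absolute score difference
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)  # controls not yet used
    pairs: List[Tuple[str, str]] = []
    # Assumption: process treated patients from highest to lowest score.
    ordered = sorted(treated.items(), key=lambda kv: kv[1], reverse=True)
    for t_id, t_score in ordered:
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # without replacement
    return pairs


if __name__ == "__main__":
    # Toy scores and a hypothetical caliper of 0.05.
    ccrt = {"t1": 0.80, "t2": 0.50}
    rt_alone = {"c1": 0.78, "c2": 0.52, "c3": 0.10}
    print(match_nearest_neighbor(ccrt, rt_alone, caliper=0.05))
```

Treated patients with no control inside the caliper are left unmatched, so the matched cohort can be smaller than the treated group.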

All statistical analyses were performed using IBM SPSS software (version 22.0; IBM Corp., Armonk, NY) and R software (version 3.2.0 for Microsoft Windows). Differences were considered statistically significant at two-tailed P-values of <0.05.

Ethical approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committees (National Cancer Center, Sichuan Cancer Center, and Fujian Cancer Hospital) and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Results

Compactness is an independent prognostic factor for ESCC

In the training set, 103 patients with advanced ESCC (cT2–4N1M0) received CCRT during 2012–2016 as part of the NCT01551589 trial, although 20 patients were excluded from the present study based on M1 status (7 patients), age of >75 years (5 patients), and abnormal liver function (8 patients). Thus, 83 patients from that dataset were included in the present study and their characteristics are shown in Table 1. The median OS and PFS values for that group were 36.7 months and 24.0 months, respectively. Univariate analyses revealed that compactness (as a continuous variable) predicted OS (HR = 2.74, 95% CI = 1.17–6.44, P < 0.02) and PFS (HR = 3.74, 95% CI = 1.57–8.88, P = 0.003). The X-tile software identified two cut-off values (0.56 and 0.85) that defined three compactness-based risk groups: low risk (<0.56), moderate risk (0.56–0.85), and high risk (>0.85) (Figure S2). The risk model significantly predicted PFS (Fig. 2A, P < 0.001) and OS (Fig. 2B, P = 0.012). The median OS times were 52.6 months in the low-risk group, 32.2 months in the moderate-risk group, and 20.8 months in the high-risk group. The median PFS times were 29.0 months in the low-risk group, 23.2 months in the moderate-risk group, and 9.0 months in the high-risk group. Multivariate Cox models indicated that the risk model based on compactness was able to independently predict OS and PFS in the training dataset (Table 2).
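The three-group stratification described above can be written as a small lookup. This is a sketch that only encodes the reported cut-off values (0.56 and 0.85); the function and constant names are hypothetical, and both boundary values are assigned to the moderate-risk group, following the ranges stated in the text.

```python
LOW_CUTOFF = 0.56   # below this value: low risk
HIGH_CUTOFF = 0.85  # above this value: high risk


def risk_group(compactness_value: float) -> str:
    """Map a compactness value to a risk group (hypothetical helper)."""
    if compactness_value < LOW_CUTOFF:
        return "low"
    if compactness_value <= HIGH_CUTOFF:
        return "moderate"
    return "high"


if __name__ == "__main__":
    # Illustrative values only; not taken from the study data.
    for value in (0.40, 0.56, 0.70, 0.85, 0.90):
        print(value, risk_group(value))
```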

Figure 2

Prognostic value of the compactness-based risk model. Kaplan-Meier curves for overall survival (OS) and progression-free survival (PFS) are shown for the various datasets and compactness-based risk groupings (low, moderate, and high). P-values were calculated using the log-rank test. The OS and PFS results are presented for the training dataset (A,B) and the validation datasets from Fujian (C,D) and Beijing CCRT + RT alone (E,F) or Beijing RT alone (G,H).

Table 2 Cox regression analyses of overall and progression-free survival in the training dataset.

The risk model was then validated using the Fujian dataset (98 patients in the ESCC2 dataset) and the Beijing dataset (283 patients in the ESCC3 dataset) (Fig. 2C–F), and it significantly predicted OS in the Fujian dataset (P = 0.022) and in the Beijing dataset (P = 0.003). The median OS times in the Fujian dataset were 46.7 months in the low-risk group, 23.6 months in the moderate-risk group, and 19.6 months in the high-risk group. The median OS times in the Beijing dataset were 31.9 months in the low-risk group, 20.8 months in the moderate-risk group, and 15.3 months in the high-risk group. The median PFS times in the Fujian dataset were 17.8 months in the low-risk group, 12.8 months in the moderate-risk group, and 8.4 months in the high-risk group (P = 0.003). The median PFS times in the Beijing dataset were 21.5 months in the low-risk group, 11.2 months in the moderate-risk group, and 8.6 months in the high-risk group (P = 0.005). Among the 227 patients who only received RT in the Beijing dataset, the risk model still significantly predicted PFS (Fig. 2G, P = 0.006) and OS (Fig. 2H, P = 0.002).

Multivariate Cox models indicated that compactness was an independent prognostic factor for PFS in the two validation datasets (Table S1). Similarly, compactness significantly predicted OS in the Beijing dataset (ESCC3), although it did not reach statistical significance in the Fujian dataset (ESCC2) (Table S1).

Compactness is correlated with clinical T stage

Compactness was significantly correlated with clinical T stage in the training dataset (Fig. 3A, P < 0.001), the Fujian dataset (Fig. 3B, P = 0.03), and the Beijing dataset (Fig. 3C, P < 0.001). Nodal metastasis is an important prognostic factor that can guide treatment for ESCC, and we detected a significant difference in the compactness values according to nodal status in the Beijing dataset (Figure S3, P = 0.015). In the Beijing dataset, the compactness-based risk model predicted prognosis among the 233 patients with N1 disease, which highlights the complementary nature of CT-based imaging and nodal status. There was no significant correlation of compactness with N stage in the Fujian dataset (P = 0.468). Furthermore, compactness was not significantly correlated with clinical M stage in the Fujian dataset (P = 0.152) or in the Beijing dataset (P = 0.598) (Figure S2).

Figure 3

Compactness was significantly correlated with clinical T stage in the training dataset (P < 0.001), the Fujian dataset (P = 0.03), and the Beijing dataset (P < 0.001).

Compactness is better than clinical T stage for predicting ESCC prognosis

To compare the compactness-based model and clinical T stage, C-index values were calculated for each grouping’s ability to predict OS and PFS. In the training dataset (Table 3), the C-index values of compactness were 0.64 (95% CI: 0.55–0.73) for predicting OS and 0.66 (95% CI: 0.58–0.74) for predicting PFS. In comparison, the C-index values of T stage were 0.58 for both OS and PFS. In the validation datasets (Fujian and Beijing), compactness also had higher C-index values than clinical T stage for predicting OS and PFS (Table 3). Compactness was also superior to the other TNM stages for both OS and PFS.

Table 3 Compactness and staging factors for predicting overall and progression-free survival.

Ability of the compactness-based risk model to guide treatment option

The Beijing dataset (ESCC3) included 283 patients who were treated for ESCC (56 patients received CCRT and 227 patients received RT alone), with CCRT being associated with prolonged PFS (P = 0.091) and OS (P = 0.003) (Fig. 4A,E). In the high-risk compactness group, CCRT also provided prolonged PFS (P = 0.09) and OS (P = 0.01) (Fig. 4D,H). However, in the low-to-moderate risk compactness groups, no significant differences in PFS (Fig. 4B,F) or OS (Fig. 4C,G) were observed between CCRT and RT alone. We also performed propensity score matching according to age, sex, KPS, and clinical TNM stage, which produced 56 matched pairs of patients who underwent CCRT or RT alone (Table S2). Among these patients, CCRT was associated with significantly prolonged PFS (P = 0.006) and OS (P < 0.001) (Fig. 4I,M). Similarly, in the high-risk compactness group, CCRT was associated with significantly prolonged PFS (P = 0.015) and OS (P = 0.001) (Fig. 4L,P). Among 10 randomly selected pairing results, 6 revealed that the high-risk patients experienced a PFS benefit from CCRT, although no benefits were observed in the low-to-moderate compactness groups (Table S3). Moreover, all 10 pairing results indicated that the high-risk patients experienced an OS benefit from CCRT, although no benefits were observed in the low- or moderate-risk groups (Table S3).

Figure 4

Survival benefit from concurrent chemoradiotherapy for esophageal squamous cell carcinoma according to the compactness-based risk model. The Kaplan-Meier curves for overall survival (OS) and progression-free survival (PFS) were compared for concurrent chemoradiotherapy (CCRT) and radiotherapy alone (RT) in the Beijing dataset (A–H). P-values were calculated using the log-rank test. The survival analyses were performed for all patients and patients in the low-risk, moderate-risk, and high-risk compactness subgroups. A subcohort from the Beijing dataset was generated using propensity score matching and subjected to the same analyses (I–P).

Compactness as a validated biomarker for treatment response

In the Sichuan dataset, the patients underwent RT (40 Gy in 20 fractions of 2 Gy), after which compactness decreased in 65 of the 83 patients, and a post-RT compactness cut-off of 0.56 significantly predicted both OS and PFS (both P < 0.001). In the Fujian dataset, post-treatment compactness (cut-off value: 0.56) also significantly predicted OS (P = 0.028) and PFS (P = 0.01). Moreover, among patients with moderate-to-high pre-treatment compactness, the post-treatment compactness also predicted OS (P = 0.052) and PFS (P = 0.002).

Among the 48 patients in the Beijing dataset who underwent preoperative CCRT (ESCC4), pathological complete response (pCR) was significantly associated with survival (OS: P = 0.02, PFS: P = 0.02). However, among patients without pCR, low-to-moderate compactness was associated with longer OS (P = 0.009) and PFS (P = 0.03) than in the high-risk group (Figure S4). Moreover, low- and moderate-risk patients with and without pCR had similar OS (P = 0.127) and PFS (P = 0.176) (Figure S3).

Discussion

The present study revealed that CT-based compactness could predict OS and PFS among patients who received primary radiation-based treatment for ESCC. Furthermore, compactness was correlated with clinical T stage but was better for predicting prognosis in this setting. Therefore, tumor compactness reflects both the tumor burden and likelihood of treatment response. The clinical TNM stage is currently used to guide treatment, with staging performed based on findings from various imaging modalities (e.g., CT, MRI, PET-CT, and esophageal ultrasonography)12, although many hospitals lack sufficient equipment to accurately determine the clinical TNM stage. Thus, it would be useful to have a clear CT-based parameter for predicting TNM stage, the patient’s prognosis, and the likelihood of treatment response. Therefore, we evaluated tumor compactness in this setting, as a previous study has indicated that compactness was a prognostic factor for head, neck, and lung cancers16.

The burden of esophageal tumors may help predict treatment response based on the tumor’s length, thickness, volume, and surface area13,20,21,22. Furthermore, the present study revealed that compactness was a better prognostic marker than the tumor’s volume or surface area (Fig. S4), and compactness had better stability than volume and surface area in the multiple delineation analysis that involved different oncologists (Supplementary Data). Moreover, compactness was closely related to clinical T stage (a measure of tumor burden), although we detected center-specific differences in the correlations between compactness and T stage, which highlights the difficulty of standardizing clinical staging and subsequent treatment between centers. In addition, clinical T staging represents only one dimension of tumor burden, as a higher T stage corresponds to a greater depth of infiltration. In contrast, tumor compactness reflects tumor burden in multiple dimensions because it incorporates both tumor volume and surface area. Therefore, although tumor compactness is correlated with T stage, it may represent tumor burden better than T stage does.

The RTOG8501 trial showed that CCRT significantly increased OS relative to RT alone3. However, 8% of the CCRT group experienced acute life-threatening toxic effects based on the RTOG acute morbidity scale and an additional 2% died as a direct consequence of treatment. In contrast, only 2% of patients who received RT alone experienced acute grade 4 toxic effects and there were no related deaths3. Thus, CCRT has significantly increased toxicity and cost, relative to RT alone, and remains a controversial treatment regimen especially for elderly patients with ESCC23,24,25, which highlights the importance of identifying patients who are not expected to benefit from CCRT. Interestingly, the present study revealed that patients with low-risk compactness did not benefit from CCRT, although further studies are needed to validate this relationship. Based on the results shown in Fig. 4B,F, it appears that patients who received CCRT had longer survival after 24 months, relative to patients who only received RT, and 14 patients who received CCRT had KPS scores of ≥80, while some patients who received RT alone had lower KPS scores (KPS 90: 29.8%, KPS 70: 14.0%; P = 0.05). Therefore, among patients in the low-risk group, KPS may help predict prolonged survival.

Among patients undergoing CCRT or preoperative CCRT, clinical TNM stage cannot accurately predict treatment response and prognosis4,8,9,26. In the Sichuan dataset (training) and the Fujian dataset (validation), the patients underwent CT examinations before and after treatment, with a post-treatment decrease in compactness (from ≥0.56 to <0.56) being associated with prolonged OS and PFS. Furthermore, after preoperative CCRT, the OS outcomes were similar among patients in the low-risk group who did and did not achieve a pCR (P = 0.127). Therefore, compactness may be a useful biomarker for predicting treatment response.

The present study used datasets from different centers, and different radiation oncologists performed the GTV contouring, which is a potential limitation because clinical staging can vary between cancer centers. In this context, the Sichuan dataset involved prospectively enrolled patients with clinical stage IIB–III disease and a median OS of 36.3 months. In contrast, the Fujian and Beijing datasets were retrospectively obtained from patients with clinical stage I–IVb disease who underwent CCRT, RT alone, or preoperative CCRT. Thus, it is possible that differences in the prognostic value of compactness were related to the GTV contouring being performed by various oncologists at different centers. Nevertheless, we did not detect any significant differences when we compared the contoured values from 4 oncologists for 20 patients. Another potential limitation is that the Sichuan dataset was smaller than the Fujian and Beijing datasets, with significantly lower proportions of patients with T4 and stage IV disease in the Sichuan dataset.

In conclusion, our findings indicate that tumor compactness can supplement the traditional evaluation of clinical T stage for ESCC. Furthermore, compactness based on volume and surface area independently predicted prognosis and treatment response in this setting.