Main

Oesophageal carcinoma (EC) is one of the most malignant gastrointestinal cancers worldwide (Sterne et al, 2009b; Peters et al, 2010a). China is one of the countries with a high risk of EC. Within China, EC has a distinctive geographic distribution, with affected areas including Chaoshan in Guangdong province, the southern region of Fujian province, Linzhou in Henan province, Cixian in Hebei province, and Yangcheng in Shanxi province (Su et al, 2001). The annual average crude incidence of EC in Chaoshan was >100 per 100 000 people from 1995 to 2004 (Umar and Fleischer, 2008). Oesophageal squamous cell carcinoma (ESCC) is the predominant subtype of EC in China (Pisani et al, 1999; Lehrbach et al, 2003), with an overall 5-year survival rate of <10% (Takeno et al, 2007).

One of the most commonly used staging systems for ESCC is the tumour-node-metastasis (TNM) system, which was proposed by the American Joint Committee on Cancer (AJCC) in 1988 and revised in 2009 (the 7th edition) (ohlschlager et al, 2010). The TNM staging system does not reflect the whole cancer progression status because the clinical and pathologic findings are gathered for staging at the time of surgery (Kunisaki et al, 2005; Goldstraw et al, 2007), and the system has become complex (Varotti et al, 2005).

With the development of cancer biology, clinical oncology has changed greatly. Advances in cancer biology will hasten the development of new prognostic and predictive models that may be more relevant in some cases (Zisman et al, 2001).

Recent advances in molecular biology have demonstrated the benefit of molecular profiling in clarifying subtypes of cancer pertinent to cancer progression (Wang et al, 1999; Takeno et al, 2007). Peters et al (2010b) proposed five gene signatures closely associated with overall survival in oesophageal adenocarcinoma, with good external validation. Chen et al (2007) considered four genes as the prognostic genetic signature in non-small-cell lung cancer. However, these authors did not use critical clinical and pathological information in their studies. Univariate survival analyses have suggested that survival in ESCC is associated with the proteins Ezrin, Fascin, pFascin, connective-tissue growth factor (CTGF), cysteine-rich angiogenic protein 61 (CYR61), neutrophil gelatinase-associated lipocalin (NGAL), NGAL receptor (NGALR), desmocollin 2 (DSC2), and activating transcription factor 3 (ATF3) (Wang et al, 1999; Zhang et al, 2006; Chen et al, 2007; Gao et al, 2009; Xie et al, 2009, 2010, 2011; Fang et al, 2010; Zhao et al, 2010; Du et al, 2011).

To identify independent prognostic factors, we performed a retrospective review of 461 patients with ESCC. We built a novel, simple-to-use model, FENSAM, by combining pathology information with other important clinical indicators, and validated it in a contemporary independent cohort.

Patients and methods

Patients

We included two independent cohorts of patients with ESCC in this study. A total of 596 patients were enrolled to construct the classifier. We excluded 135 patients who had >8 missing biomarkers each, resulting in a final learning cohort of 461 patients. All patients had undergone surgical resection at the Central Hospital of Shantou City, Guangdong province, China, from 1987 to 1997. The validation cohort included 290 patients from the same hospital during 2007–2010. The clinicopathological characteristics of the patients were comparable in the two cohorts (Table 3 in the Supplementary Appendix).

Computerised tomography (CT) was used to confirm ESCC in patients. Patients were excluded if they declined to undergo surgical resection or to sign the informed consent form. We excluded patients who died from diseases other than ESCC. In addition, surgery was not possible for patients who were confirmed by CT to be at TNM stage IV from 2007 to 2010, so these patients were not included in the test cohort.

Because communication was difficult in Shantou in the 1990s, follow-up data after surgical resection were mainly collected from medical records and by mail. Nonetheless, all patients in the two cohorts were followed up for 3–10 years or until death.

Tissue microarray and immunohistochemistry

The study was approved by the ethics committee of the Central Hospital of Shantou City. Pathology was used to help clinicians with proper assessment. Tissue microarray (TMA) and immunohistochemistry were performed as described (Xie et al, 2009; Du et al, 2011). Briefly, we obtained two or more tissue cores from each specimen; each core was 1.8 mm in diameter and 1.0–3.0 mm in length, depending on the tissue depth in the donor block. Each core was set into a new paraffin block. The antibodies used for incubation are listed in Table 4 in the Supplementary Appendix. The Polymer Detection Systems for Immuno-Histological Staining Kit and the Liquid Substrate kit (ZSGB-BIO, Beijing, China) were used for immunohistochemistry. Each section was independently assessed by two pathologists who were blinded to patient data.

The immunohistochemical staining results were assigned a score that considered both the intensity of staining and the proportion of tumour cells showing unequivocal positive reactions (see Figures 1–3 in the Supplementary Appendix). Positive reactions were defined as brown signals in the cell cytoplasm, nucleus, and membrane. The intensity of staining was scored as 0, no staining; 1, weak staining; 2, moderate staining; and 3, strong staining. The tumour cell area showing positive staining was scored as 0, <5% of tumour cells; 1, 5–25% of tumour cells; 2, 26–50% of tumour cells; 3, 51–75% of tumour cells; and 4, 76–100% of tumour cells. A final score was calculated by multiplying the intensity score by the tumour cell area score, for a total score ranging from 0 to 12.
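The scoring rule described above can be sketched in Python. This is an illustration only, not the authors' software: the function names are hypothetical, and the sketch assumes that the final score is the product of the intensity score (0–3) and the area score (0–4) and that percentages above 75% fall into area score 4.

```python
def area_score(percent_positive: float) -> int:
    """Map the percentage of positively stained tumour cells to a 0-4 score."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_positive < 5:
        return 0
    if percent_positive <= 25:
        return 1
    if percent_positive <= 50:
        return 2
    if percent_positive <= 75:  # assumption: exactly 75% belongs to score 3
        return 3
    return 4


def final_score(intensity: int, percent_positive: float) -> int:
    """Multiply the intensity score (0-3) by the area score (0-4)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2, or 3")
    return intensity * area_score(percent_positive)
```

For instance, moderate staining (intensity 2) in 30% of tumour cells (area score 2) would give a final score of 4 under these assumptions.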

Statistical analysis

Multivariate analysis of survival and model selection

The association between survival and each of 22 potential prognostic factors was evaluated by univariate Cox proportional hazards regression analysis. The hazard ratio was used to determine which markers were associated with survival among patients with high vs low frequency of ESCC (see Table 1 in the Supplementary Appendix). Combining these results with those of previous studies, we used 12 significant prognostic factors as initial covariates for the multivariate Cox proportional hazards model (tumour size, surgery extent, differentiation, T-stage, N-stage, M-stage, Ezrin, CYR61, DSC2, Fascin, pFascin, and ATF3). To select significant prognostic factors associated with survival, we used a standard variable-selection procedure, LASSO (Datta et al, 2007), in a multivariate Cox model, with the Bayesian information criterion used to select a proper tuning parameter for LASSO (O’Hara and Sillanpaa, 2009; Alhamzawi and Yu, 2012). Six variables were found to be independent and significant predictors of risk of death. We used a linear combination of the protein-expression values weighted by the regression coefficients to determine a hazard risk score (RS) for each patient (for details, see Methods section in the Supplementary Appendix). We applied the K-means clustering method to the RSs for the six selected variables to set the cutoff values (Ferreira et al, 2007). These cutoff values were used to classify patients into four ordered subgroups.
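The last two steps, the risk score and the clustering-based cutoffs, can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure in the Supplementary Appendix: the function names are hypothetical, the clustering is a plain one-dimensional Lloyd's (K-means) iteration with a deterministic quantile start, and taking the midpoints between adjacent cluster centres as cutoffs is an assumption about how boundaries are derived.

```python
from statistics import mean


def risk_score(values: dict, coefficients: dict) -> float:
    """Linear combination of covariate values weighted by Cox coefficients."""
    return sum(coefficients[name] * values[name] for name in coefficients)


def kmeans_1d(scores, k=4, max_iter=100):
    """One-dimensional K-means (Lloyd's algorithm); returns sorted centres."""
    ordered = sorted(scores)
    n = len(ordered)
    # Deterministic start: centres at evenly spaced quantiles of the data.
    centres = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for s in ordered:
            nearest = min(range(k), key=lambda j: abs(s - centres[j]))
            clusters[nearest].append(s)
        # Keep the old centre if a cluster happens to be empty.
        new_centres = [mean(c) if c else centres[j]
                       for j, c in enumerate(clusters)]
        if new_centres == centres:
            break
        centres = new_centres
    return sorted(centres)


def cutoffs_from_centres(centres):
    """Assumed rule: boundaries are midpoints between adjacent centres."""
    return [(a + b) / 2 for a, b in zip(centres, centres[1:])]


# Toy risk scores (made up for illustration, not study data).
toy_scores = [0.1, 0.15, 0.2, 0.5, 0.55, 0.6, 1.0, 1.05, 1.1, 1.6, 1.65, 1.7]
toy_cutoffs = cutoffs_from_centres(kmeans_1d(toy_scores, k=4))
```

With four clusters, three cutoffs result, which matches the four ordered subgroups described in the text.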

To compare the FENSAM with the traditional TNM stage system, we repeated the following procedure 200 times. Each time, 75% of the data (n=346) was chosen as training data to build the FENSAM, with the remaining data (n=115) used as the validation data. The 200 P-values for the log-rank test were recorded, with box plots used to depict the difference between the two staging systems. To further evaluate the FENSAM and the TNM stage system, we used receiver operating characteristic (ROC) curve and Kaplan–Meier survival analyses of the four subgroups along with the log-rank statistics for the 200 P-values. For more details, see Methods section in the Supplementary Appendix.
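The resampling scheme can be sketched as below. The sketch only generates the 200 random 75%/25% partitions of patient indices; fitting the model and running the log-rank test on each partition are omitted. The function name, the seed, and the use of rounding to obtain 346 training patients are assumptions made for illustration.

```python
import random


def repeated_splits(n_patients=461, train_fraction=0.75, repeats=200, seed=0):
    """Yield (train_indices, test_indices) for each random split."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_train = round(n_patients * train_fraction)  # 345.75 rounds to 346
    indices = list(range(n_patients))
    for _ in range(repeats):
        shuffled = rng.sample(indices, n_patients)  # random permutation
        yield shuffled[:n_train], shuffled[n_train:]


splits = list(repeated_splits())
```

Each element of `splits` holds disjoint training and test index lists of sizes 346 and 115, so a model fitted on one list can be evaluated on the other.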

Validation

According to these results, we validated the model with an independent cohort of 290 patients by ROC curve analysis and the log-rank test.

Results

Combining the selected variables to build a model

Previous univariate survival analyses suggested that 12 factors could be potential prognostic factors in ESCC (Zhang et al, 2006; Gao et al, 2009; Xie et al, 2009, 2010, 2011; Fang et al, 2010; Zhao et al, 2010; Du et al, 2011; see Tables 1 and 2 in the Supplementary Appendix).

Table 1 lists the estimated coefficients for the characteristics in the Cox proportional-hazards model. We selected six variables, and all except ATF3 had positive coefficients, that is, they were associated with a higher hazard of death. N-stage had the largest absolute effect size, 0.763, so the hazard ratio for N-stage was 2.145 (=exp(0.763)); in other words, patients with N-stage=1 may die at about twice the rate per unit time as patients with N-stage=0. ATF3, in contrast, had a negative coefficient.
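The conversion from a Cox regression coefficient to a hazard ratio is a simple exponentiation, as this short sketch shows. Only the N-stage coefficient (0.763) is taken from the text; the function name is hypothetical.

```python
import math


def hazard_ratio(coefficient: float, delta: float = 1.0) -> float:
    """Hazard ratio for an increase of `delta` units in a covariate."""
    return math.exp(coefficient * delta)


hr_n_stage = hazard_ratio(0.763)
print(f"{hr_n_stage:.3f}")  # prints 2.145
```

A negative coefficient gives a hazard ratio below 1, which corresponds to a lower hazard of death as the covariate increases.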

Table 1 Estimated coefficients for variables used in the FENSAM (Fascin Ezrin N M Surgery extent ATF3) model

The RSs for the selected characteristics were calculated from the estimated coefficients in Table 1. The RS-based FENSAM system is presented in Table 2. The optimal cutoffs calculated by the clustering method were 0.359, 0.797, and 1.304. Among the 461 patients, 75 were classified as FENSAM stage I, 132 as stage II, 119 as stage III, and 138 as stage IV.
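Given the three reported cutoffs, assigning a FENSAM stage to a risk score is a lookup, sketched below. The function name is hypothetical, and the handling of a score that falls exactly on a cutoff (assigned to the higher stage here) is an assumption, since the text does not state which side the boundary belongs to.

```python
from bisect import bisect_right

FENSAM_CUTOFFS = (0.359, 0.797, 1.304)  # cutoffs reported in the text
STAGES = ("I", "II", "III", "IV")


def fensam_stage(risk_score: float) -> str:
    """Return the FENSAM stage for a hazard risk score."""
    # bisect_right counts how many cutoffs are <= the score.
    return STAGES[bisect_right(FENSAM_CUTOFFS, risk_score)]
```

For example, a risk score of 1.0 lies between 0.797 and 1.304 and would be assigned stage III.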

Table 2 Thresholds of the FENSAM (Fascin Ezrin N M Surgery extent ATF3) stages for ESCC

The median area under the ROC curve (AUC) values for the FENSAM and TNM systems are shown in Figure 1A. The median AUC was larger for the FENSAM than for the TNM system (0.6829 vs 0.6196). A comparison of the other quantiles is depicted in Figure 1B.
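For readers unfamiliar with the AUC, the sketch below computes it from its rank-based (Mann–Whitney) definition: the probability that a randomly chosen patient with the event has a higher score than a randomly chosen patient without it, with ties counted as one half. It assumes a binary outcome and a numeric stage or risk score; how the study defined the outcome for the ROC analysis is described in its Supplementary Appendix and is not reproduced here. The function name and the toy data are hypothetical.

```python
def auc(scores, events):
    """Rank-based AUC for numeric scores and binary event indicators."""
    pos = [s for s, e in zip(scores, events) if e]
    neg = [s for s, e in zip(scores, events) if not e]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    # Count concordant pairs; a tie contributes half a pair.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: stage coded 1-4, event = 1 if the patient died (made-up data).
toy_auc = auc([1, 2, 2, 3, 4, 4], [0, 0, 1, 0, 1, 1])
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation, which is the scale on which the FENSAM and TNM values are compared.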

Figure 1

Receiver operating characteristic (ROC) curve analysis of the FENSAM and the TNM stage with the test data. (A) ROC curve for the FENSAM and the TNM stage. (B) Box plots for median AUC for the log-rank test of the FENSAM and TNM stage systems.

Survival differed more significantly among stages with the FENSAM than with the TNM system (log-rank P=7.11e−8 and P=1.07e−3, respectively) (Figure 2A and B). For survival time <30 months, survival was indistinguishable between TNM stage III and IV (Figure 2B) but was distinct between FENSAM III and IV (Figure 2A). Furthermore, the 5-year survival rate for TNM stages I–IV was 0.9%, 53.91%, 35.7%, and 1.0%, respectively, and for FENSAM stages I–IV was 20.9%, 37.4%, 19.1%, and 22.6%, respectively. In addition, the negative log-transformed P-values were larger with the FENSAM than with the TNM system (Figure 2C). Detailed comparisons based on Kaplan–Meier survival analysis are shown in Figures 4–7 in the Supplementary Appendix.

Figure 2

Comparison of the FENSAM and the TNM stage systems with the test data by Kaplan–Meier analysis. (A) FENSAM model. (B) TNM stage model. (C) Box plots for log-transformed P-values (−log10(P-value)) from the log-rank test with the 200 test data sets for the FENSAM and TNM models.

Validation

In the validation data set, no patient was at TNM stage IV, whereas 14 patients were classified as FENSAM stage IV, and all of them survived <3 years (Figure 3A and B). Therefore, the FENSAM stage coincided with the practical experience of clinicians and could increase the accuracy of ESCC prognosis. In addition, AUC scores were greater for the FENSAM than for the TNM stages (0.678 vs 0.626) (Figure 3C).

Figure 3

The predictive ability of the FENSAM as compared with the TNM stage model by Kaplan–Meier analysis in the validation cohort (A and B) and ROC curves (C).

Discussion

Oesophageal squamous cell carcinoma is the predominant subtype of EC in China, with an overall 5-year survival rate of <10%. The current staging system is based on the anatomic extent of the cancer. We aimed to combine clinical variables and biomarkers to develop and validate a single, reliable model, named FENSAM, for ESCC prognosis. To build the FENSAM, we analysed 22 potential prognostic factors from 461 patients. We used multivariate analysis and variable selection to identify significant factors associated with the survival of ESCC patients. These selected factors were used to build the FENSAM. We then used the hazard RS of the model to classify ESCC patients. In addition, we validated the model in an independent cohort of 290 patients from the same hospital. The predictive performance of the model was assessed by AUC and Kaplan–Meier survival analysis.

In short, FENSAM performed better than the TNM staging system on our validation data. Although both the learning and validation data were collected from the same hospital and some data were missing, the superior performance of FENSAM compared with TNM is noteworthy; further validation using multi-centre data will improve the generalisability of our result.

Missing data are a ubiquitous problem in everyday clinical practice, and incomplete records are often removed during data analysis. More attention should be paid to missing data, which can introduce bias and lead to inaccurate conclusions. Although a variety of ad hoc approaches are commonly used to handle missing data, some of them are statistically invalid and can lead to serious bias (Sterne et al, 2009a). Rubin’s multiple imputation (MI) method (Sterne et al, 2009a) (see Methods section in the Supplementary Appendix), a general approach, can improve the validity of medical research. This study featured a large amount of missing data, especially for the nine biochemical indicators in the test data set. In particular, Fascin, pFascin, and ATF3 each had more than 300 missing values, a missing rate of >50% (Table 3). Because we focussed on how the biomarkers influenced ESCC prognosis in this study, we removed patients with more than eight missing biomarkers. As a result, we included only 461 patients in our prediction model. Admittedly, these missing data undermine the validity of our results. To reduce bias and increase efficiency as much as possible, we imputed data with the EMB algorithm, which combines the classic EM algorithm with a bootstrap approach.

Table 3 The number of missing values for the oesophageal squamous cell carcinoma (ESCC) cohort at the Central Hospital of Shantou from 1987 to 1999

Previously, six biomarkers (Ezrin, CYR61, DSC2, Fascin, pFascin, and ATF3) and six other clinicopathological characteristics (tumour size, surgery extent, differentiation, T-stage, N-stage, and M-stage) were found to be associated with the overall survival of ESCC patients by the univariate Cox proportional hazards model and in previous studies (Wang et al, 1999; Zhang et al, 2006; Chen et al, 2007; Gao et al, 2009; Xie et al, 2009, 2010, 2011; Fang et al, 2010; Zhao et al, 2010; Du et al, 2011). However, our multivariate analysis revealed that only Ezrin, Fascin, ATF3, N-stage, M-stage, and surgery extent were associated with survival. Both Ezrin and Fascin are actin-binding proteins that are overexpressed in ESCC and have key roles in the progression of ESCC (Xie et al, 2005, 2009). ATF3 is a transcription factor involved in cell-cycle progression, invasiveness, and metastasis of cancer cells. We recently revealed that ATF3 was involved in Ezrin- or Fascin-mediated proliferation and invasiveness of ESCC cells (Xie et al, 2009, 2010). Therefore, molecules involved in a certain biological process or signal cascade may help in developing a cancer staging system. In addition, in our model, coefficients were higher for clinicopathological characteristics than for biomarkers, so regional lymph-node involvement and distant metastasis were still the predominant prognostic factors in ESCC.

Previous studies considered TNM clinical stage to be the best prognostic indicator, and in most of them the T-, N-, and M-stage factors were associated with ESCC prognosis on univariate analysis (Goldstraw et al, 2007). However, our study showed that only N- and M-stages were independent prognostic factors on multivariate analysis. This result was consistent with several previous studies (Ikeda et al, 1999; Kuo et al, 2003).

Through an extensive literature review, we selected six variables in the present study as prognostic factors in ESCC: three biomarkers (Ezrin, Fascin, and ATF3) and three clinical indicators (surgery extent, N-stage, and M-stage). In addition, with the development of molecular techniques, more new biomarkers might emerge, which might be used as novel clinical indicators and provide higher specificity for the diagnosis and prognosis of ESCC.

Prognostic classification of ESCC is clinically important: prognosis can be used to evaluate clinical outcome, guide treatment decisions, and account for the results of clinical trials (Zisman et al, 2001). The classification method we used could better separate patients into prognostic groups and improve the accuracy of survival prediction on the basis of easily accessible clinical indicators and biomarkers. The six variables are readily available, are statistically significant prognostic factors, and were easily validated by confirmatory studies. FENSAM is based on pathology response rate, clinical indicators, and pathologic markers (Mittendorf et al, 2011; Yi et al, 2011) and is reliable and easy to use. On the basis of a simple model, FENSAM provides an alternative classifier for ESCC patients with high precision. The method could also be applied to other malignant tumours.

The FENSAM is available at our website: http://escc.bio.med.stu.edu.cn/member/login.php. Our data sets are available upon request.