Main

Squamous cell carcinoma (SCC) is the most common pathological type of oesophageal cancer in the East (Law and Wong, 2002). According to the practice guidelines of the National Comprehensive Cancer Network, adjuvant chemotherapy was not recommended for completely resected oesophageal SCC (OSCC; National Comprehensive Cancer Network Esophageal Cancer Panel, 2012). However, many patients develop local recurrence or distant organ metastasis after surgery, and the 5-year survival rate is only about 42% (Rice et al, 2009). For patients with local recurrence, use of chemoradiotherapy can achieve good disease control. However, for patients with distant organ metastasis, the prognosis is much worse, with a 5-year survival rate of less than 7% (Wijnhoven et al, 2007). A review summarised that metastasis caused 90% of deaths from solid tumours (Nguyen and Massague, 2007). Therefore, it is very important to accurately select the patients with a high risk of postoperative distant organ metastasis and give them tailored adjuvant therapy to improve overall survival.

The American Joint Committee on Cancer (AJCC) staging system (Rice et al, 2009) is commonly used to assess clinical outcomes in patients with OSCC. However, some patients may survive for a long time without recurrence, whereas others with disease in the same tumour node metastasis (TNM) stage may have a more unfavourable prognosis. Advances in molecular biology have clarified the biological behaviour underlying cancer metastasis. Nevertheless, in view of the complexity of cancer progression (Chiang and Massague, 2007; Gupta et al, 2007; Nguyen and Massague, 2007), it is fairly unlikely that one single molecule will perfectly predict a patient’s outcome. Thus, the combination of TNM staging with multiple molecular markers and other clinicopathological parameters may be a promising method of selecting the patients with a high risk of postoperative distant organ metastasis for the purpose of guiding tailored therapy. As the relationship between TNM stages, molecular markers, and other clinicopathological parameters is very complex, traditional statistical methods based on linear combinations may not offer a reliable outcome. A support vector machine (SVM), a state-of-the-art statistical method for classification, has recently been used to build extremely reliable cancer classifiers in terms of overall survival (Vapnik, 1998; Zhu et al, 2009).

In this study we used the SVM method to develop effective models for predicting postoperative distant organ metastasis of OSCC. We used data from two institutes, one for data training and the other for data validation.

Materials and methods

Patient selection

This study was approved by the Ethics Committee of the Sun Yat-sen University Cancer Center and Linzhou Oesophageal Cancer Hospital. We used data from the Sun Yat-sen University Cancer Center (in South China) to establish the training cohort and data from Linzhou Oesophageal Cancer Hospital (in North China) to establish the validation cohort. Both centres have broad experience in oesophageal surgery. The training cohort data came from an oesophageal cancer database that comprised information from 1071 consecutive patients with OSCC who received surgical treatment with curative intent between January 1997 and January 2004. We used data from the training cohort to establish SVM-based models to predict the risk of postoperative distant organ metastasis occurring within 5 years after surgery.

The validation cohort data came from an oesophageal cancer database that comprised information from 612 consecutive patients with OSCC who received surgical treatment with curative intent between August 2000 and June 2007. All of the patients included in the study were restaged according to the seventh edition of the AJCC Cancer Staging Manual (Rice et al, 2009).

All patients included in the analysis fit the following criteria: (1) their disease was histologically defined as OSCC; (2) they underwent complete resection; (3) they had complete information for stage grouping; (4) they fit into pathological AJCC stages I–III; (5) their resections were neither preceded nor followed by adjuvant chemotherapy or radiotherapy (oesophagectomy alone); (6) for patients who were recorded with distant organ metastasis during follow-up, the metastatic organs were clearly recorded; and (7) they had adequate paraffin-embedded cancer tissue samples for use in constructing the tissue microarray. We defined complete resection as resection with negative margins.

We excluded patients with a history of concurrent malignant disease or other previous primary cancers, as well as operative deaths. Operative death was defined as death within 30 days of the operation, or at any time after the operation if the patient did not leave the hospital alive.

As a result, 319 cases and 164 cases fit the inclusion criteria and were included in the training cohort and validation cohort, respectively (Table 1).

Table 1 Clinicopathological characteristics of the two cohorts of patients

Follow-up of patients

July 2012 was the last time of contact with both cohorts of patients. The median time from surgery to the last time of contact with the training cohort was 128.9 months (range, 93.6 to 188.1 months). The median time from surgery to the last time of contact with the validation cohort was 118.5 months (range, 66.2 to 152.6 months). Detailed information on patient follow-up is described in the Appendix (online only).

Tissue microarray construction

The method for tissue microarray construction is described in the Appendix (online only).

Molecular marker selection and immunohistochemical staining and scoring

Twenty-three molecular markers from different signal pathways were chosen for investigation (Table 2). The detailed method for molecular marker selection and tissue microarray construction is described in the Appendix (online only).

Table 2 The cutoff points and outcome for 23 immunomarkers and 7 clinicopathological variables in predicting postoperative distant organ metastasis in the training cohort (n=319)

Statistical analysis

The SPSS statistical software package (Standard version 16.0; Chicago, IL, USA) was used for data analysis. The mean values are presented as the mean±s.d. Independent Student’s t-tests were used to compare groups of continuous, normally distributed variables. The Pearson χ2-test was used to determine the significance of differences between groups for dichotomous variables. All statistical tests were two-tailed, and P<0.05 was considered statistically significant.

To avoid a predetermined cutoff value, receiver-operating characteristic (ROC) analysis was used to define the cutoff value of the immunoreactivity score for each immunomarker. An ROC curve for the cutoff value was generated using the MedCalc statistical software package 11.0.1 (MedCalc Software bvba, Mariakerke, Belgium). The score closest to the point of both maximum sensitivity and specificity was selected as the cutoff point, leading to the greatest number of cases correctly classified as having or not having the clinical outcome. Binary logistic regression analysis was used to determine the independent factors impacting postoperative metastasis within 5 years.
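The cutoff rule described above can be illustrated with a short sketch. It is not the MedCalc procedure itself; it assumes that a case counts as high expression when its score is at or above the cutoff, and it picks the candidate cutoff whose ROC point lies closest to (0, 1). The function name `roc_cutoff` and the ten scores are hypothetical and are not study data.

```python
from math import hypot


def roc_cutoff(scores, outcomes):
    """Return (cutoff, sensitivity, specificity) for the ROC point closest to (0, 1).

    Assumption: a case is "high expression" when score >= cutoff.
    outcomes: 1 = distant organ metastasis within 5 years, 0 = none.
    """
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need cases both with and without the outcome")

    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, outcomes) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, outcomes) if s >= cutoff and y == 0)
        sensitivity = tp / positives
        specificity = 1 - fp / negatives
        # Distance from the ROC point (1 - specificity, sensitivity) to (0, 1).
        distance = hypot(1 - specificity, 1 - sensitivity)
        if best is None or distance < best[0]:
            best = (distance, cutoff, sensitivity, specificity)
    return best[1:]


# Hypothetical immunoreactivity scores for ten patients (illustration only).
scores = [1, 2, 2, 3, 4, 6, 6, 8, 9, 12]
outcomes = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
cutoff, sensitivity, specificity = roc_cutoff(scores, outcomes)
print(cutoff, sensitivity, specificity)
```

In this toy example the selected cutoff is 4, which classifies all five metastatic cases as high expression while misclassifying one of the five non-metastatic cases.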

Survival time was measured from the date of surgery to the date of death or last follow-up. We used Matlab software (version 7.7.0.471, MathWorks, Inc, Natick, MA, USA) to establish SVM-based models for predicting the risk of postoperative distant organ metastasis and the risk of death within 5 years.

Strategies to establish the SVM-based models

First, we established four SVM models (SVM1–SVM4) to predict distant organ metastasis within 5 years after oesophagectomy in patients with OSCC, using data from the Sun Yat-sen University Cancer Center as the training cohort and data from the Linzhou Oesophageal Cancer Hospital as the validation cohort. The SVM1 model included two clinicopathological features: pathological T category and pathological N category. We selected these two variables to construct SVM1 because they are the characteristics most commonly used to predict the prognosis of operable OSCC. The SVM2 model included four clinicopathological features: pathological T category, pathological N category, cell differentiation, and tumour length. We included cell differentiation and tumour length in the SVM2 model because these two features are not only commonly used in predicting prognosis but also showed statistically significant P-values in predicting postoperative distant organ metastasis in univariate analysis in the training cohort (Table 2). The SVM3 model integrated the 4 clinicopathological features in SVM2 and 12 immunomarkers: Cox-2, Cyclin B1, Cyclin D1, EGFR, HER2/neu, NF-κB, Integrin β1, NM23-H1, Ki67, p21Waf1/Cip1, uPAR, and VEGF. We included these 12 immunomarkers because they were identified as potentially valuable variables by univariate analysis for prediction of distant organ metastasis in the training cohort (Table 2). The SVM4 model integrated the four clinicopathological features in SVM2 and nine immunomarkers: Cyclin D1, EGFR, HER2/neu, NF-κB, Integrin β1, Ki67, p21Waf1/Cip1, uPAR, and VEGF. We included only these nine immunomarkers in SVM4 because they were identified as potentially valuable variables by univariate analysis for prediction of distant organ metastasis in both the training cohort (Table 2) and the validation cohort (Table 3), and were therefore considered more stable and representative in predicting metastasis than the immunomarkers that showed a significant P-value in only one cohort.
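The nesting of the four variable sets can be written out explicitly. The sketch below only records which variables enter each model, as listed in the text; the constant names and the dictionary layout are illustrative assumptions and do not reproduce the Matlab implementation.

```python
# Variable sets for SVM1-SVM4 as listed in the text (names are illustrative).
CLINICAL_SVM1 = ["pathological T category", "pathological N category"]
CLINICAL_SVM2 = CLINICAL_SVM1 + ["cell differentiation", "tumour length"]

MARKERS_SVM3 = [
    "Cox-2", "Cyclin B1", "Cyclin D1", "EGFR", "HER2/neu", "NF-κB",
    "Integrin β1", "NM23-H1", "Ki67", "p21Waf1/Cip1", "uPAR", "VEGF",
]
MARKERS_SVM4 = [
    "Cyclin D1", "EGFR", "HER2/neu", "NF-κB", "Integrin β1",
    "Ki67", "p21Waf1/Cip1", "uPAR", "VEGF",
]

FEATURE_SETS = {
    "SVM1": CLINICAL_SVM1,
    "SVM2": CLINICAL_SVM2,
    "SVM3": CLINICAL_SVM2 + MARKERS_SVM3,
    "SVM4": CLINICAL_SVM2 + MARKERS_SVM4,
}

# Markers significant in the training cohort only, and so left out of SVM4.
dropped_from_svm4 = [m for m in MARKERS_SVM3 if m not in MARKERS_SVM4]

for name, features in FEATURE_SETS.items():
    print(name, len(features))
print(dropped_from_svm4)
```

Written this way, SVM3 uses 16 variables and SVM4 uses 13; the three markers that separate them are Cox-2, Cyclin B1, and NM23-H1.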

Table 3 The cutoff points and outcome for 12 immunomarkers and 7 clinicopathological variables in predicting postoperative distant organ metastasis in the validation cohort (n=164)

Second, as the training cohort (from the Sun Yat-sen University Cancer Center) and the validation cohort (from the Linzhou Oesophageal Cancer Hospital) differ in some characteristics, such as gender, AJCC stage, and risk of distant organ metastasis (Table 1), we pooled the two cohorts, randomly split the pooled patients into a mixed training cohort (2/3) and a mixed validation cohort (1/3), and repeated the prediction modelling (SVM1’–SVM4’) using the same variables as in SVM1–SVM4, respectively.
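A minimal sketch of such a pooled random split is given below. The randomisation procedure actually used is not described, so the shuffle, the fixed seed, and the placeholder patient identifiers are all assumptions made for illustration.

```python
import random


def split_two_thirds(patient_ids, seed=0):
    """Shuffle the pooled patients and return (training 2/3, validation 1/3).

    The seed is a hypothetical choice that makes the illustration repeatable.
    """
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(2 * len(ids) / 3)
    return ids[:n_train], ids[n_train:]


# Placeholder identifiers for the 319 + 164 included patients.
pooled = [f"centre1-{i}" for i in range(319)] + [f"centre2-{i}" for i in range(164)]
mixed_training, mixed_validation = split_two_thirds(pooled)
print(len(mixed_training), len(mixed_validation))
```

With 483 pooled patients this yields 322 and 161 patients, which matches the sizes of the mixed training and mixed validation cohorts.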

Results

General patient characteristics

A total of 319 cases fit the inclusion criteria and were included in the training cohort (Table 1). The other patients were excluded for the following reasons: incomplete resection (99 cases); resection preceded or followed by adjuvant chemotherapy or radiotherapy (263 cases); secondary primary tumours (nine cases, including three with small cell lung cancer, two with colon cancer, two with laryngeal cancer, one with breast cancer, and one with tongue cancer); incomplete information for accurate staging (83 cases); and incomplete follow-up information on the accurate time and site of distant organ metastasis (298 cases). Seven operative deaths occurred and were excluded from this study; of these, five had received neoadjuvant chemotherapy and/or radiotherapy, one had an incomplete resection, and one had incomplete information for accurate staging.

A total of 164 cases fit the inclusion criteria and formed the validation cohort (Table 1). The other patients were excluded for the following reasons: incomplete resection (21 cases); resection preceded or followed by adjuvant chemotherapy or radiotherapy (197 cases); secondary primary tumours (three cases, including one with colon cancer and two with breast cancer); incomplete information for accurate staging (73 cases); incomplete follow-up information on the accurate time and site of distant organ metastasis (97 cases); and inadequate paraffin-embedded cancer tissues (57 cases).

Supplementary Table S2 (online only) gives detailed information on metastatic sites for both cohorts of patients with high risk of postoperative distant organ metastasis. In the training cohort, metastases were diagnosed by pathology in 28 patients (8 with liver metastases, 11 with lung metastases, 7 with soft tissue metastases, and 2 with multi-organ metastases) and by cross-sectional imaging and clinical presentation in 66 patients. In the validation cohort, metastases were diagnosed by pathology in 20 patients (5 with liver metastases, 7 with lung metastases, 5 with soft tissue metastases, and 3 with multi-organ metastases) and by cross-sectional imaging and clinical presentation in 56 patients.

Variables and distant organ metastasis in the two cohorts

Table 2 gives the detailed cutoff points and outcomes for 23 immunomarkers and 7 clinicopathological variables in predicting postoperative distant organ metastasis in univariate analysis in the training cohort. The immunomarkers with a P-value of less than 0.05 in univariate analysis were selected to be further tested in the validation cohort. TIMP-2 and Rb were also excluded from further testing in the validation cohort, because the number of high expression cases (25 for TIMP-2, 20 for Rb) was too small in the training cohort, and the small number may cause selection bias. Representative figures on immunohistochemical (IHC) staining for 23 molecular markers are shown in Figure 1.

Figure 1 Representative figures on IHC staining for 23 molecular markers (×200).

Twelve immunomarkers and seven clinicopathological variables were selected for further analysis in the validation cohort (Table 3). We used the same cutoff point for each variable in both the validation and the training cohorts.

Outcome of SVM1–SVM4 in predicting distant organ metastasis and long-term survival

Supplementary Table S3 (online only) lists the outcome of prediction of postoperative distant organ metastasis for the validation cohort. Table 4 gives the detailed values of SVM1–SVM4 in predicting distant organ metastasis for the validation cohort. SVM4 showed higher predictive accuracy than the other models. Although it is not the main endpoint of this study, the outcomes of the four models in predicting postoperative death within 5 years are also listed (Supplementary Tables S4 and S5, online only). SVM4 showed higher accuracy in predicting long-term survival than the other models (Supplementary Tables S4 and S5, online only).

Table 4 Outcomes of SVM-based models in predicting distant metastasis for the two validation cohorts
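For readers comparing the models, the sketch below shows how sensitivity, specificity, and accuracy follow from a model's predictions in a validation cohort. The function name and the ten example cases are hypothetical and do not reproduce any value in Table 4.

```python
def classification_metrics(predicted, observed):
    """Sensitivity, specificity, and accuracy for binary predictions.

    1 = distant organ metastasis within 5 years, 0 = no metastasis.
    """
    pairs = list(zip(predicted, observed))
    tp = sum(1 for p, o in pairs if p == 1 and o == 1)
    tn = sum(1 for p, o in pairs if p == 0 and o == 0)
    fp = sum(1 for p, o in pairs if p == 1 and o == 0)
    fn = sum(1 for p, o in pairs if p == 0 and o == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need cases both with and without the outcome")
    return {
        "sensitivity": tp / (tp + fn),  # metastatic cases correctly flagged
        "specificity": tn / (tn + fp),  # non-metastatic cases correctly cleared
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical predictions for ten validation patients (illustration only).
observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
metrics = classification_metrics(predicted, observed)
print(metrics)
```

Here three of four metastatic cases and five of six non-metastatic cases are classified correctly, giving a sensitivity of 0.75 and an accuracy of 0.8.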

Multivariate analysis indicated that SVM3 and SVM4 were significant factors associated with postoperative distant organ metastasis (Supplementary Tables S6-S9, online only).

The ROC curves for each feature clearly illustrate the point on the curve closest to (0.0, 1.0), which maximises both sensitivity and specificity for the outcome in the training cohort (Figure 2A) and the validation cohort (Figure 2B), respectively.

Figure 2 ROC curves for receptors for 12 immunomarkers, pathological T category, pathological N category, cell differentiation, tumour length, and the SVM-based models using (A) the training cohort (n=319), (B) the validation cohort (n=164), (C) the mixed training cohort (n=322), and (D) the mixed validation cohort (n=161).

Outcome of SVM1’–SVM4’ in predicting distant organ metastasis and long-term survival

Table 4 gives the detailed values of SVM1’–SVM4’ in predicting distant organ metastasis for the mixed validation cohort. The detailed outcomes of SVM1’–SVM4’ are listed in the Appendix (Supplementary Tables S10–S16, online only). Multivariate analysis indicated that SVM3’ and SVM4’ were significant factors associated with postoperative distant organ metastasis (Supplementary Tables S13–S16, online only).

The ROC curves for each feature clearly illustrate the point on the curve closest to (0.0, 1.0), which maximises both sensitivity and specificity for the outcome in the mixed training cohort (Figure 2C) and the mixed validation cohort (Figure 2D), respectively.

Discussion

Our data demonstrated that SVM-based models could help us predict the risk of postoperative distant organ metastasis of OSCC. When only the pathological T category and the pathological N category were included, the prediction sensitivity was much lower. This result suggested that the AJCC T and N categories alone are not enough to achieve an effective prediction of the postoperative risk of distant organ metastasis of OSCC, and that more features are essential to construct a more efficacious prediction model. However, when cell differentiation and tumour length were added as variables to SVM1 (SVM1’) to construct a new model (SVM2 or SVM2’), the outcome did not improve as much as we had expected. A possible reason for this result is that tumour length and cell differentiation are highly related to tumour staging.

A number of previous gene profile-based prognosis techniques, combined with the use of microarrays or PCR, have been applied in survival prediction for patients with breast cancer (Huang et al, 2003; Wang et al, 2005; Liu et al, 2007), lung cancer (Hayes et al, 2006; Potti et al, 2006; Yanaihara et al, 2006; Chen et al, 2007), and OSCC (Kan et al, 2004; Tamoto et al, 2004; Guo et al, 2008; Mathé et al, 2009). However, these studies focused on the gene expression of cancer cells, whereas the clinicopathological features that are considered important in survival prediction were neglected. Moreover, these studies had other limitations for clinical practice: the requirement for fresh tissue, expensive examination costs, and, more importantly, uncertainties about the reproducibility of complicated molecular biology methods. In this study, we used tissue microarray and IHC techniques, which are already widely used in laboratories and are convenient and affordable. This makes our result more adaptable to clinical practice. We chose multiple molecular markers with potential roles in distant organ metastasis, which may represent different mechanisms in the process of metastasis. Therefore, these molecular markers may be the best candidates for distant organ metastasis prediction models.

Lagarde et al (2007, 2008) developed prognostic models for patients who underwent potentially curative surgery for adenocarcinoma of the oesophagus and gastroesophageal junction, but they did not consider the impact of molecular markers on prognosis. Takeno et al (2007) assessed the clinical outcome in 70 patients with OSCC by using four molecular markers based on IHC analysis in addition to TNM classification, but the relationship between the features was considered linear in their study. The small number of cases and weaker validation limit the clinical application of their result. Sato et al (2005) created a comprehensive prognostic model for oesophageal carcinoma using an artificial neural network (ANN) technique. The major shortcomings of their model are the large number of variables (135 variables) and the inclusion of categories of postoperative treatment as variables. These shortcomings limit its clinical utility, because in clinical practice simplicity is essential, and the most important decisions regarding postoperative management should be made just after surgery and before postoperative therapy. Mofidi et al (2006) also used an ANN technique to construct a prognosis prediction model for carcinoma of the oesophagus and gastroesophageal junction, in which 15 clinicopathological variables were included. Unfortunately, they did not incorporate molecular markers into the design of the ANN to improve its accuracy.

Our distant metastasis prediction models provide a new strategy and approach for making the optimal clinical decision. SVM3 (SVM3’) and SVM4 (SVM4’) have the highest sensitivity, specificity, and accuracy of prediction among the four models. For practical application, we recommend SVM4 (SVM4’), because it requires fewer markers than SVM3 (SVM3’). If a patient with OSCC is predicted to have a high risk of postoperative distant organ metastasis, adjuvant chemotherapy might be recommended; conversely, for patients with a low risk of postoperative distant metastasis, observation is preferred. Although further evaluation of this strategy for clinical use will be necessary, it may help clinicians to select the most appropriate therapies for individual OSCC patients in advance.

One meta-analysis showed a significant survival benefit for preoperative chemoradiotherapy in patients with resectable oesophageal carcinoma (Gebski et al, 2007). However, another recent meta-analysis indicated that OSCC did not benefit from neoadjuvant chemoradiotherapy (Jin et al, 2009). Patients who received preoperative therapy were excluded from this study because of the controversy about neoadjuvant therapy in operable OSCC and the fact that neoadjuvant therapy for operable OSCC was not widely accepted by patients and doctors in China.

Our study has its limitations. Because of the retrospective nature of this study, some patients were reported to have died of postoperative tumour metastasis without the metastatic sites being accurately recorded. These patients were excluded, because without a record of the metastatic sites we could not be sure that they died of distant organ metastases rather than local recurrence. As many patients were excluded from this study, a selection bias may exist, which may limit the generalisability of our results. Tissue microarrays are useful for initial screening of large numbers of patients in clinical research; however, because of the heterogeneity of tumours (Gerlinger et al, 2012), the results may need validation by analysing larger tissue specimens before clinical application. Immunohistochemical analysis is the most widely used method to detect gene expression, because it is easy to apply and inexpensive. However, variation in methodological factors, such as different primary antibodies, a wide range of dilutions, different cutoff points used by investigators, and the storage time and fixation method of paraffin-embedded tissues, may contribute to different results for protein expression. We believe that more prospectively collected data from multiple centres are essential to test our results.

In conclusion, we have designed effective SVM-based models that combine clinicopathological features and molecular markers as variables to help select the patients with OSCC who have a high risk of postoperative distant organ metastasis. With the help of these models, a tailored adjuvant therapy strategy for operable OSCC might become possible. An evaluation of this strategy for clinical use in a prospective randomised clinical trial will be necessary.