A systematic review and meta-analysis of prognostic biomarkers in resectable esophageal adenocarcinomas

Targeted therapy is lagging behind in esophageal adenocarcinoma (EAC). To guide the development of new treatment strategies, we provide an overview of the prognostic biomarkers in resectable EAC treated with curative intent. The Medline, Cochrane and EMBASE databases were systematically searched, focusing on overall survival (OS). The quality of the studies was assessed using a scoring system ranging from 0–7 points based on modified REMARK criteria. To evaluate all identified prognostic biomarkers, the hallmarks of cancer were adapted to fit all biomarkers based on their biological function in EAC, resulting in the features angiogenesis, cell adhesion and extra-cellular matrix remodeling, cell cycle, immune, invasion and metastasis, metabolism, proliferation, and self-renewal. Pooled hazard ratios (HR) and 95% confidence intervals (CI) were derived by random effects meta-analyses performed on each hallmark of cancer feature. Of the 3298 unique articles identified, 84 were included, with a mean quality of 5.9 points (range 3.5–7). The hallmark of cancer feature ‘immune’ was most significantly associated with worse OS (HR 1.88 (95%CI 1.20–2.93)). Of the 82 unique prognostic biomarkers identified, meta-analyses showed prominent biomarkers, including COX-2, PAK-1, p14ARF, PD-L1, MET, LC3B, IGFBP7 and LGR5, associated with each hallmark of cancer feature.


Results
Study characteristics. All 3,298 identified articles were screened on title and abstract (Fig. 1). After assessing 466 articles on full text, 84 articles were included. Six articles were grouped in the adapted hallmark of cancer feature 'multiple', resulting in 78 articles that could be included in the meta-analysis, investigating a total population of 12,876 EAC patients. The main characteristics of the studies are shown in Supplementary Table S1. A total of 82 unique biomarkers were identified. The majority of the biomarkers were detected by immunohistochemistry (IHC) or a combination of IHC and an in situ hybridization (ISH) method. Less frequently applied detection methods were PCR, RNA sequencing and DNA sequencing, and one article used a combination of reverse phase protein array (RPPA) analysis, reverse transcriptase-PCR and IHC 95 . Most articles (N = 61) included a study population consisting of EAC only, 12 articles included a population that consisted of ≥70% adenocarcinomas, and 11 articles performed separate OS analyses on EAC and other histological subtypes. Of the assessed patients, 1822 (14.2%) received prior chemo(radiation)therapy. The mean study sample size was 152 patients (standard deviation = 112.16) and the mean impact factor (IF) of the journals was 4.54.
Quality assessment. Assessment of study quality using the adapted REMARK criteria resulted in a mean quality of 5.9 points (range 3.5-7) (Supplementary Table S2). Three studies had a low quality score and were included in the sensitivity analyses 31 . In general, points were most often lost on criterion C5: reporting whether patients received therapy and, if so, specifying the chemo(radio)therapy regimen. In addition, C1 (a representative cohort with clear baseline characteristics) and C2 (reasons for patient drop-out) were often absent. A positive correlation (R = 0.480) was observed between study size and the impact factor of the journal in which the study was published (p = 0.0005) (Supplementary Fig. S3). There was no correlation (R = 0.058) between the study quality assessed by the adapted REMARK criteria and impact factor (p = 0.601).
Proliferation. The majority of the biomarkers studied are involved in tumor cell proliferation, of which HER2, EGFR, cyclin D, Ki-67 and mTOR were the most frequently reported (Fig. 2). Subgroup analysis on EGFR demonstrated an association with worse OS (HR 1.43 (95% CI 1.04-1.95)). Analysis of the HER2 subgroup, however, showed no significant association with OS (HR 1.28 (95% CI 0.96-1.70)). HER2 remained not significantly associated with worse OS when the subgroup was restricted to data on HER2 expression assessed by means of the gold standard (IHC and, in case of equivocal HER2 expression (Hoffman scoring system 2+), an additional in situ hybridization method 96 ) (HR 1.09 (95%CI 0.46-2.60)), or when data on EAC with a Barrett's esophagus (BE) segment were replaced by data on EAC without BE (HR 1.33 (95%CI 0.78-2.28)) (Table 1). The overall pooled effect of the proliferation feature was significantly associated with worse OS (HR 1.41 (95%CI 1.22-1.63)); however, significant test heterogeneity was found. IGFBP7, a member of the insulin-like growth factor binding protein family, was identified as the most promising prognostic biomarker in this hallmark of cancer feature. Funnel plot analyses showed no indication of publication bias (Supplementary Material S4).
Hallmark specific markers. All identified biomarkers and hallmarks of cancer features are summarized in Fig. 3. The potential of all identified prognostic biomarkers was evaluated by assembling the biomarkers according to their main function in tumor biology in their corresponding hallmark of cancer feature. When meta-analysis was performed on all features, most were significantly associated with worse OS, except metabolism (HR 1.56 (95%CI 0.98-2.47)) and self-renewal (HR 1.08 (95%CI 0.81-1.43)). The hallmark of cancer feature 'immune' was most significantly associated with worse OS (HR 1.88 (95%CI 1.20-2.93)). Of the 82 unique prognostic biomarkers identified, meta-analyses showed several promising biomarkers, including COX-2, PAK-1, p14ARF, PD-L1, MET, LC3B and LGR5, associated with each hallmark of cancer feature. After excluding articles of low study quality, there was no longer a significant association with OS in the cell adhesion group (N = 1, n = 52, SPARC and SPP1; HR changing from 1.49 (95% CI 1.07-2.07) to 1.24 (95% CI 0.83-1.86)) (Table 2) 31,45,58 . In additional sensitivity analyses on EAC treated with surgery as single treatment modality vs. EAC treated with neoadjuvant treatment and surgery, the hallmark of cancer feature 'cell cycle' was no longer significantly associated with OS (HR changing from 1.43 (95%CI 1.08-1.89) to 1.09 (95%CI 0.75-1.57)), although the same biomarkers were tested. The feature 'metabolism' remained not significantly associated with OS. After sensitivity analyses, the prognostic biomarkers identified as most promising remained unchanged for each hallmark of cancer feature. Funnel plot analyses showed no indication of publication bias.

Discussion
This review summarizes the great diversity of prognostic biomarkers studied in EAC thus far. In total, 82 unique biomarkers were identified and evaluated by grouping them, based on their role in tumor biology, into the best-fitting hallmark of cancer feature.
Interestingly, the hallmark of cancer feature 'immune' was most significantly associated with worse OS, and may therefore harbor potential for targeted therapies. Owing to increased understanding of the tumor immune microenvironment and promising trial results, new immune-based therapies have recently emerged, such as the agents targeting the PD-L1/PD-1 axis, nivolumab and pembrolizumab 97 . Targeting PD-L1/PD-1, a critical immune checkpoint, releases the inhibitory effect on both the humoral and cellular immune response, activating T-cells to enhance the antitumor response. These PD-1 pathway inhibitors have previously been FDA approved in several solid tumors, including melanoma and non-small cell lung cancer. Indeed, we here identify PD-L1, a ligand of the co-inhibitory receptor PD-1, as the most promising prognostic biomarker in this hallmark of cancer feature. However, the clinical applicability of these drugs has not yet been proven in resectable EAC, and whether PD-1 is a predictive biomarker, reflective of response to treatment, remains to be elucidated 4,97 .
For all other hallmark of cancer features, promising prognostic biomarkers were identified as well, including COX-2, PAK-1, p14ARF, MET, LC3B, IGFBP7 and LGR5. For the MET, IGFBP7 and LGR5 pathways, targeted therapies have already been studied in other cancer types with varying results; however, the potential to target these biomarkers in EAC is yet to be investigated 98,99 . Likewise, the inhibition of CDK4/6 in p14ARF mutant patients by small molecules or pan-CDK inhibitors is being investigated as an add-on to standard chemotherapy backbones, potentially enabling blockage of unrestricted cell division caused by p14ARF mutations 100 . Non-steroidal anti-inflammatory drugs (NSAIDs), which inhibit COX-2, are commonly used and safe. Hence, inhibition of COX-2, an important regulator of cell growth, differentiation and apoptosis, may be a valuable contribution to the treatment of EAC. Thus far, COX-2 has been demonstrated to be involved in the neoplastic formation of esophageal cancer 101 . Moreover, the use of NSAIDs is associated with a reduced risk of EAC development and has been shown to reduce cell growth in 8 esophageal cell lines. In contrast, little is yet known about the druggability of PAK-1 in cancer, even though its recently elucidated central role in oncogenic signaling has enhanced interest in small-molecule based PAK-1 targeting 102 . Similarly, the inhibition of autophagy by blocking LC3B has so far been explored only in vitro in oncological diseases. Therefore, the therapeutic potential of these targets remains to be clarified.
Even though promising prognostic biomarkers were identified, limitations should be recognized. Firstly, after performing sensitivity analysis on the study quality, the feature cell adhesion was no longer significantly associated with OS when excluding articles scoring low on the adapted REMARK criteria 31,45,58 . In addition, as it is known that studies with low quality hamper extrapolation of the data to clinical practice, it is surprising to notice that study size and impact factor were correlated, while no correlation between the study quality and the impact factor was found. Although after sensitivity analyses on articles scoring low on the adapted REMARK criteria the same promising biomarkers were still identified, the varying study quality is worrying. Frequently, articles failed to report the received therapy, and if this information was supplied, often did not specify the treatment regimen. As nowadays neoadjuvant treatment has become standard of care for operable EAC, reporting these baseline characteristics has become increasingly important.
In this meta-analysis, 1822 (14.2%) resection specimens were evaluated for prognostic biomarker status after patients received neoadjuvant chemo(radiation)therapy. It should be noted that in specimens of good responders few or no remaining tumor cells may be found, biasing the prognostic potential of the assessed biomarker. Moreover, if post-neoadjuvant therapy samples are included in biomarker analyses, treatment regimens should be clearly described. It is known that a better response to therapy is attained with neoadjuvant chemoradiation therapy than with radiation therapy as single treatment modality. This could further bias the results found. In addition, when extrapolating these results to a predictive setting for the identification of new therapy options, these biomarkers might not have predictive potential in the neoadjuvant setting. Indeed, sensitivity analyses on articles reporting on patients who received neoadjuvant therapy demonstrated the influence of these treatment regimens on the association between biomarker status and survival. The feature 'cell cycle' was significantly associated with worse OS in all patients but, when testing the same biomarkers, no longer showed this association if solely neoadjuvant treated EAC was included in the analysis. Since commonly used DNA-damaging chemotherapeutics such as carboplatin and paclitaxel influence the cell cycle, this effect was expected, highlighting the importance of reporting the received treatment regimen.
The importance of clear reporting standards for biomarker research and standardization of the detection method used is also demonstrated by the subgroup analyses on HER2. In contrast to the current notion, no association with decreased survival was found when analyzing the data of all articles reporting on the prognostic potential of HER2. When exclusively including data on HER2 positivity assessed by means of the gold standard (IHC and, in case of equivocal HER2 expression (Hoffman scoring system 2+), an additional in situ hybridization method), the association with worse OS remained not significant 5,103 . The significant test heterogeneity found in the corresponding hallmark of cancer feature 'proliferation' could at least partly be attributed to the varying detection methods applied. As each test has its own sensitivity and specificity, outcomes can be greatly influenced by the method of biomarker assessment. The applied detection method will not only reflect underlying tumor biology, but also affect the relation of the biomarker with prognostic outcomes and targetability. For example, it has been demonstrated that assessing HER2 positivity solely by amplification of the HER2 gene with an in situ hybridization method does not correlate with efficacy of HER2-targeted therapy 103 . Likewise, different IHC cutoff points for biomarker positivity influence both prognostic and predictive outcomes. As has been demonstrated in this meta-analysis, even for well-known biomarkers used in clinical practice, such as HER2, articles use varying definitions of biomarker positivity, thereby limiting comparison of data.
Several promising biomarkers in resectable EAC have been identified; however, in order to stratify patients according to their tumor biology and to develop new targeted anti-cancer treatments, future research is needed. First, standardization of reporting on biomarker research is needed to further identify prognostic biomarkers. Subsequently, large-scale multicenter randomized controlled trials should be conducted to validate the clinical applicability of these biomarkers and to evaluate their potential targetability.
To conclude, a wide variety of prognostic proteins and their expression have been studied in EAC treated with curative intent. Despite varying study quality of the published data, promising biomarkers could be identified, including COX-2, PAK-1, p14ARF, PD-L1, MET, LC3B, IGFBP7 and LGR5. The clinical application and targetability of these biomarkers as anti-cancer therapy in operable EAC should be addressed in future research.

Methods
Search strategy. Literature was retrieved from the Medline, Cochrane and Embase databases on the 19th of January 2017 to identify articles published in the preceding 10 years, with the publication date restricted to the period from the first of January 2007 until the first of January 2017. In addition to MeSH terms, free text words were added to the search to include all relevant articles that might not yet have been assigned MeSH terms. The full search is available in Supplementary Information S5.
Screening and selection of studies. All titles, abstracts and full text articles were screened independently by two researchers (AC and EAE); discrepancies were resolved by discussion. Articles were selected based on the following criteria: (i) the research population included adenocarcinomas of the esophagus or the gastro-esophageal junction, defined as Siewert class I and II, that could be treated with curative intent; and (ii) the article reported biomarker related overall survival (OS) data, described with hazard ratios (HR), 95% confidence intervals (CI), and p-values. If both EAC and esophageal squamous cell carcinoma (ESC) were studied, the research population had to include at least 70% EAC or the article had to present separate survival analyses. Reviews, case reports, (meeting) abstracts, phase I studies and articles without full text in English were excluded. When articles reported on the same biomarker(s) in an identical patient population, the publication examining the most biomarkers was included. Endnote X7 (Clarivate Analytics, Boston, USA) was used to select and screen the literature.
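The eligibility rules above can be read as a simple filter over screened articles. The following Python sketch is illustrative only: the `Candidate` record and its field names are hypothetical and not part of the review protocol, and only the 70% threshold and the excluded publication types are taken from the text.

```python
from dataclasses import dataclass

# Publication types excluded by the selection criteria described in the text.
EXCLUDED_TYPES = {"review", "case report", "abstract", "phase I study"}
MIN_EAC_FRACTION = 0.70  # at least 70% EAC when histologies are mixed


@dataclass
class Candidate:
    """Hypothetical record for one screened article (field names are assumptions)."""
    publication_type: str
    english_full_text: bool
    curative_intent: bool
    eac_fraction: float          # share of adenocarcinomas in the cohort (0..1)
    separate_eac_analysis: bool  # separate survival analysis reported for EAC
    reports_hr_ci_p: bool        # OS reported as HR, 95% CI and p-value


def is_eligible(c: Candidate) -> bool:
    """Return True when the article passes the inclusion and exclusion rules."""
    if c.publication_type in EXCLUDED_TYPES or not c.english_full_text:
        return False
    if not c.curative_intent or not c.reports_hr_ci_p:
        return False
    # Mixed cohorts qualify through the 70% rule or a separate EAC analysis.
    return c.eac_fraction >= MIN_EAC_FRACTION or c.separate_eac_analysis
```

In practice both reviewers would apply such rules independently and resolve disagreements by discussion; the sketch only captures the decision logic, not the screening workflow.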
Data extraction and outcomes. Data extraction was done by AC and EAE following a predefined protocol and double checked until consensus was reached. The following data were extracted: first author, publication year, journal, patient population (EAC only, >70% EAC or EAC and ESC with separate survival analysis), tumor material studied (blood, biopsy, resection specimen or a combination), reported tissue handling, method of biomarker detection, used scoring methods and cut-off values for biomarker positivity, received therapy (yes (including a clear description of the treatment regimen), no, or not reported (NR)), the duration of follow-up, and reported confounders in multivariate analyses. Lastly, the primary outcome of this review was extracted: overall survival data of univariate and/or multivariate analyses, presented as HR, 95% CI, and p-value. The impact factor (IF) of each journal at the time of publication of the study was extracted from bioxbio.com/if/.

Study quality assessment. To assess the quality of the included studies, the REporting recommendations for tumor MARKer prognostic studies (REMARK) criteria for biomarker studies were adapted into a scoring system (Table 3) 11 . The adapted scoring criteria were chosen by discussion between AC, EAE, MvO and HvL. Articles could score 1 point per item, with a maximum of 7 points. In case of ambiguity or incompleteness, half a point was allocated. A study was defined as being of low quality when ≤3.5 points were assigned. The study quality was assessed by AC and EAE; in case of disagreement, consensus was reached by discussion.
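The scoring rule described above (seven items, one point each, half a point for ambiguous or incomplete reporting, low quality at ≤3.5 points) can be sketched as follows. This is a minimal illustration, not the authors' tooling; the function names are hypothetical and the item scores in the usage note are made up.

```python
from typing import Sequence

N_CRITERIA = 7                        # adapted REMARK items (Table 3)
ALLOWED_ITEM_SCORES = (0.0, 0.5, 1.0)  # absent, ambiguous or incomplete, present
LOW_QUALITY_THRESHOLD = 3.5           # a total of <= 3.5 points means low quality


def remark_score(item_scores: Sequence[float]) -> float:
    """Sum the per-item points after checking they follow the scoring rule."""
    if len(item_scores) != N_CRITERIA:
        raise ValueError(f"expected {N_CRITERIA} item scores, got {len(item_scores)}")
    for score in item_scores:
        if score not in ALLOWED_ITEM_SCORES:
            raise ValueError(f"item score must be 0, 0.5 or 1, got {score}")
    return float(sum(item_scores))


def is_low_quality(total: float) -> bool:
    """Apply the low-quality definition used for the sensitivity analyses."""
    return total <= LOW_QUALITY_THRESHOLD
```

For example, a hypothetical study scored `[1, 1, 0.5, 1, 0, 1, 1]` totals 5.5 points and would not be flagged as low quality.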
Statistics. The potential of all identified prognostic biomarkers was evaluated by grouping the biomarkers according to their main function in tumor biology in the corresponding hallmark of cancer 104 . To fit all identified biomarkers, the hallmarks of cancer were adapted, resulting in the following features: angiogenesis, cell adhesion and extra-cellular matrix remodeling, cell cycle, immune, invasion and metastasis, metabolism, proliferation, and self-renewal. Some articles showed data on a cluster of genes; these were assembled in the hallmark of cancer feature 'multiple'. Due to the heterogeneous scope of action of the biomarkers, we did not perform meta-analysis on papers included in the 'multiple' group. Pooled hazard ratios (HR) and 95% confidence intervals (CI) were derived by random effects meta-analyses performed on each hallmark of cancer feature. HR and 95%CI data of univariate and multivariate analyses were combined in the meta-analysis; data derived from multivariate analysis were used by default, and univariate values were used when multivariate data were absent. If the data related to absence rather than presence of the biomarker, the HR data were inverted. When identical biomarkers were reported in more than two studies, these biomarkers were included in subgroup analysis. To determine the influence of a low quality score, sensitivity analyses were performed on studies with a low study quality on the adapted REMARK criteria scale. Additional sensitivity analyses were conducted on studies showing data on both EAC treated with surgery as single treatment modality and neoadjuvant treated EAC. Finally, the most promising biomarker for each hallmark of cancer feature was selected based on the most optimal combination of a high HR and a narrow 95% CI. Consensus was reached between AC, EAE, MvO and HvL on the selected biomarkers. Publication bias was evaluated by means of a funnel plot for all hallmark of cancer features.
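To make the pooling step concrete, the sketch below shows one common way to combine reported HRs and 95% CIs in a random effects model. It rests on assumptions the text does not state: the DerSimonian-Laird estimator for the between-study variance, inverse-variance weights, and standard errors recovered from the CI on the log scale. All function names are hypothetical and any input values are illustrative, not data from the included studies.

```python
import math
from typing import Dict, List, Tuple

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% CI

Study = Tuple[float, float, float]  # (HR, lower 95% CI, upper 95% CI)


def invert_hr(hr: float, lower: float, upper: float) -> Study:
    """Re-express an HR reported for biomarker absence as an HR for presence."""
    return 1.0 / hr, 1.0 / upper, 1.0 / lower


def log_hr_and_se(hr: float, lower: float, upper: float) -> Tuple[float, float]:
    """Recover log(HR) and its standard error from the reported 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * Z_95)
    return math.log(hr), se


def pool_random_effects(studies: List[Study]) -> Dict[str, float]:
    """DerSimonian-Laird random effects pooling (an assumed estimator)."""
    if not studies:
        raise ValueError("at least one study is required")
    effects, variances = [], []
    for hr, lower, upper in studies:
        y, se = log_hr_and_se(hr, lower, upper)
        effects.append(y)
        variances.append(se ** 2)

    # Fixed effect estimate and Cochran's Q, needed to estimate tau^2.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(studies) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random effects weights add the between-study variance to each study.
    w_re = [1.0 / (v + tau2) for v in variances]
    sum_w_re = sum(w_re)
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum_w_re
    se_pooled = math.sqrt(1.0 / sum_w_re)
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    return {
        "hr": math.exp(pooled),
        "lower": math.exp(pooled - Z_95 * se_pooled),
        "upper": math.exp(pooled + Z_95 * se_pooled),
        "tau2": tau2,
        "i2": i2,
    }
```

A pooled estimate is conventionally read as significant at the 5% level when the returned interval excludes 1, and the `i2` value gives a rough indication of the heterogeneity discussed in the Results.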
Random effects meta-analyses were performed in Review Manager V5 (The Cochrane Collaboration, Copenhagen, Denmark). Pearson's correlations with linear regression analysis between IF, adapted REMARK quality score, and patient cohort size were performed using GraphPad Prism 6 (GraphPad Software, La Jolla, CA, USA).
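The correlation analysis was run in GraphPad Prism; as an illustration of the underlying calculation, the sketch below computes Pearson's correlation coefficient and the least-squares regression line in plain Python. The function names are hypothetical, and the inputs would be paired per-study values such as impact factor and quality score.

```python
import math
from typing import Sequence, Tuple


def _centered_sums(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float, float, float]:
    """Return the means of x and y and the centered sums Sxy, Sxx, Syy."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return mean_x, mean_y, sxy, sxx, syy


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient between two paired samples."""
    _, _, sxy, sxx, syy = _centered_sums(x, y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def linear_fit(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept of y regressed on x."""
    mean_x, mean_y, sxy, sxx, _ = _centered_sums(x, y)
    if sxx == 0:
        raise ValueError("slope is undefined when x is constant")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

The p-value reported alongside R in the Results would additionally require a t-distribution with n - 2 degrees of freedom, which this sketch leaves out.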
Ethics statement. This article does not contain any studies with human or animal subjects performed by any of the authors.
Table 3. The adapted version of the REporting recommendations for tumor MARKer prognostic studies (REMARK) criteria for biomarker studies 11 . A study could be allocated one point for each of the seven criteria; in case of ambiguity, half a point was assigned. Sensitivity analyses were performed on studies assigned ≤3.5 points on the adapted REMARK criteria scale.