The Prognostic Implications of Tumor Infiltrating Lymphocytes in Colorectal Cancer: A Systematic Review and Meta-Analysis

Tumor-infiltrating lymphocytes (TILs) are an important histopathologic feature of colorectal cancer that confers prognostic information. Previous clinical and epidemiologic studies have found that the presence and quantity of TILs are significantly associated with disease-specific and overall survival in colorectal cancer. We performed a systematic review and meta-analysis to establish pooled estimates for survival outcomes based on the presence of TILs in colorectal cancer. PubMed (Medline), Embase, Cochrane Library, Web of Science, and ClinicalTrials.gov were searched from inception to April 2017. We included studies that determined the prognostic significance of intratumoral TILs, as well as subsets of CD3, CD8, FOXP3, and CD45R0 lymphocytes, within the solid tumor center, the invasive margin, and the tumor stroma. Random-effects models were used to estimate summary effects using hazard ratios (HRs). Forty-three relevant studies describing 21,015 patients were included in our meta-analysis. The results demonstrate that high levels of generalized TILs, as compared to low levels, were associated with improved overall survival (OS), with an HR of 0.65 (p < 0.01). In addition, histologically localized CD3+ T-cells at the tumor center were significantly associated with better disease-free survival (HR = 0.46, 95% CI 0.36–0.61, p = 0.05), and CD3+ cells at the invasive margin were associated with improved disease-free survival (HR = 0.57, 95% CI 0.38–0.86, p = 0.05). CD8+ T-cells at the tumor center had statistically significant prognostic value for cancer-specific survival and overall survival, with HRs of 0.65 (p = 0.02) and 0.71 (p < 0.01), respectively. Lastly, FOXP3+ T-cells at the tumor center were associated with improved prognosis for cancer-specific survival (HR = 0.65, p < 0.01) and overall survival (HR = 0.70, p < 0.01).
These findings suggest that TILs and specific TIL subsets serve as prognostic biomarkers for colorectal cancer.

Although advances in screening and treatment have substantially improved survival from colorectal cancer (CRC), clinical outcomes vary widely among patients with tumors diagnosed at the same TNM stage, and disease relapse occurs in 20-30% of patients with localized cancer [1]. Colorectal cancers with high microsatellite instability (MSI-H) have a better prognosis than microsatellite stable (MSS) colorectal cancers [2][3][4][5]. The mechanisms that confer this benefit are not fully understood, but the benefit has been linked to the prominent infiltration of immune cells within the tumor [6]. Increased focus on the tumor microenvironment has identified inflammatory activity as a critical predictor of disease activity that affects patient prognosis.
The host immune response has been implicated in tumor behavior, as it influences all phases of tumor development and growth [7][8][9]. The presence of tumor-infiltrating lymphocytes (TILs) in histopathological analysis of CRC is often interpreted as the host protecting against tumor development [10,11]. TILs mediate recruitment, maturation, and activation of immune cells that suppress tumor growth. Tumor infiltration by T lymphocytes is a highly informative prognostic factor for CRC outcome, independent of traditional prognostic indicators [12][13][14]. Numerous studies have demonstrated that the type, density, and site of TILs in primary tumors are prognostic for disease-free survival (DFS) and overall survival (OS) from CRC, hinting at a fundamental function of the immune system in the tumor microenvironment [15-18]. However, the literature is heterogeneous: studies vary in design, outcomes, sample size, and methods of measuring the host immune response. This heterogeneity motivated the present systematic analysis. Recently, large retrospective studies have reported data on the prognostic performance of TILs in CRC survival. To obtain a more precise estimate of the effect in populations with CRC, we performed an updated systematic review and meta-analysis to measure the impact of TILs on CRC survival.

Methods
Protocol and registration. We developed a protocol based on standard guidelines for the systematic review of prognostic studies and followed suggestions on updating systematic reviews as outlined by Moher et al. [19]. We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement for reporting our systematic review [20]. Methods of the analysis and inclusion criteria were specified a priori in the protocol.
Data sources and search strategy. A librarian (LK) developed searches using a combination of keywords and controlled vocabulary (when available) in the following databases: PubMed (Appendix 1), Embase (Appendices 2 and 3; initially through OvidSP and later via Elsevier), Cochrane Library (Appendix 4), Web of Science (Appendix 5), and ClinicalTrials.gov (1997 to April 2017). In addition, we searched grey literature sources (https://www.usa.gov, https://scholar.google.com) to identify relevant publications. The English language filter was used when available. We also examined bibliographies of related papers and reviews and consulted with experts in the field. In addition, we evaluated reference lists of previously published systematic reviews and meta-analyses.
Eligibility criteria. All studies were reviewed initially based on title and abstract. If the title and abstract provided insufficient data, the full-text article was reviewed. Two independent reviewers (GEI and NB) reviewed the first 100 results of the Ovid Medline search to assess agreement in article selection, yielding a kappa of 0.82. The remaining search results were then divided equally between GEI and NB. Disagreements were resolved by discussion and consensus or by a third party (SBG).
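The kappa statistic reported above measures agreement between two reviewers beyond what chance alone would produce. The following is a minimal sketch of Cohen's kappa for two raters, assuming binary include/exclude screening decisions; the function name `cohens_kappa` and the screening decisions are hypothetical illustrations, not the study's data, and the source does not state which software was used for this calculation.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over labels of the product of marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / n**2
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical screening decisions for 100 records (illustrative only).
reviewer_1 = ["include"] * 20 + ["exclude"] * 80
reviewer_2 = (["include"] * 17 + ["exclude"] * 3
              + ["include"] * 2 + ["exclude"] * 78)
kappa = cohens_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.2f}")
```

In this hypothetical example the reviewers agree on 95 of 100 records, but kappa is lower than 0.95 because a large share of that agreement is expected by chance when most records are excluded.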
For study inclusion, the keywords focused on generalized tumor inflammatory infiltrate and associated T-lymphocyte subsets (CD4, CD8) in colorectal cancer patients, identified with hematoxylin and eosin (HE) staining or immunohistochemical (IHC) staining, with reported prognostic information. IHC staining was evaluated in subgroup analysis for the tumor center (CC), the tumor stroma (TS), and the invasive tumor margin (IM). Prognostic information included overall survival (OS) and disease-free survival (DFS).
We excluded publications that provided insufficient data to estimate a hazard ratio (HR) with a 95% confidence interval (CI). However, references from review articles, case reports, commentaries, and letters were reviewed to identify any additional studies that met the inclusion criteria. An effort was made to contact the authors for any clarifications.
Data extraction and quality assessment. Two reviewers (GEI, JK) independently evaluated and extracted relevant information from each included study. We utilized a form originally developed from the work of McShane et al. [21] and Hayes et al. [22] and adapted by Mei et al. [18] for quality assessment in their systematic review and meta-analysis, as this adaptation was comprehensive (see Supplement 1). It resulted in a quality rating of 0-9 based on reporting of inclusion and exclusion criteria, study design (prospective or retrospective), patient and tumor characteristics, description of the method or assay, study endpoints, follow-up time, and the number of patients who dropped out during the follow-up period [18].

Data collection process.
A standardized data abstraction form was adapted from Mei et al. to include key elements pertaining to the study design, sample size, patient age, stage of disease, assay method, follow-up duration, and HR estimates (with the corresponding 95% CIs) for TILs at certain locations within tumors (CC, TS, or IM), along with the HR cutoff point and the method of quantification (immunohistochemistry, PCR, sequencing). For time-to-event outcomes, we retrieved and curated the HR estimates with 95% CIs from the original articles [18]. Discrepancies in interpretation between reviewers (GEI, NB) were resolved by discussion with a third reviewer (SBG) to reach consensus.
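Pooling requires each extracted HR on the log scale together with its standard error, which published reports rarely state directly. The sketch below shows the conventional back-calculation from a reported 95% CI, assuming the interval is symmetric on the log scale (a Wald-type interval); the function name `log_hr_and_se` is hypothetical, and the source does not describe the exact conversion the authors applied. The input values are the pooled CD3+ tumor-center DFS estimate quoted in the abstract, used here only as a worked example.

```python
import math
from statistics import NormalDist


def log_hr_and_se(hr, ci_low, ci_high, level=0.95):
    """Return (log HR, standard error) from an HR and its confidence interval.

    Assumes the interval was built as exp(log HR +/- z * SE).
    """
    if not (0 < ci_low <= hr <= ci_high):
        raise ValueError("expected 0 < ci_low <= hr <= ci_high")
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    log_hr = math.log(hr)
    # The interval spans 2 * z standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return log_hr, se


# Worked example with figures quoted in the abstract: HR 0.46, 95% CI 0.36-0.61.
log_hr, se = log_hr_and_se(0.46, 0.36, 0.61)
print(f"log HR = {log_hr:.3f}, SE = {se:.3f}")
```

A study that reports an HR without a CI or another measure of precision cannot be converted this way, which is consistent with the exclusion of publications with insufficient data.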

Subgroup analyses.
We performed analyses to estimate the association between prognostic outcomes (OS, CSS, DFS) and both T-lymphocyte subsets (CD3+, CD8+, FOXP3+, CD45R0+) and T-lymphocyte infiltrate location (CC, TS, or IM). Survival time was recorded from either the date of diagnosis or the initiation of treatment, as available from the published reports. Random-effects models were used, consistent with prior published meta-analyses that showed evidence of heterogeneity for similar subgroup analyses.
Summary measures. Meta-analyses were performed using the R package 'meta' (version 4.9-0) in the statistical software R (version 3.4.3). Random-effects models were calculated based on HR estimates and their standard errors; inverse-variance weighting was used for pooling. We then produced forest plots by T-cell type, T-cell source, and outcome to display the pooled estimates, and funnel plots to evaluate for publication bias. Interstudy heterogeneity was quantified using the I² statistic, with an I² value > 50% as our a priori threshold for substantial heterogeneity [23].
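The pooling described above can be illustrated with a short sketch of an inverse-variance random-effects model together with the I² statistic. The source does not state which between-study variance estimator was used, so the DerSimonian-Laird estimator is assumed here purely for illustration; the function name `pool_random_effects` and the study-level inputs are hypothetical, and this is not the authors' R code.

```python
import math


def pool_random_effects(log_hrs, std_errs, z=1.959964):
    """Pool log hazard ratios with a DerSimonian-Laird random-effects model."""
    k = len(log_hrs)
    if k < 2 or k != len(std_errs):
        raise ValueError("need at least two studies with matching inputs")
    # Fixed-effect inverse-variance weights and pooled estimate.
    w = [1.0 / se**2 for se in std_errs]
    fixed = sum(wi * yi for wi, yi in zip(w, log_hrs)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_hrs))
    df = k - 1
    # Method-of-moments estimate of between-study variance, truncated at zero.
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    # I^2: share of total variability attributed to heterogeneity, in percent.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # Random-effects weights add tau^2 to each study's sampling variance.
    w_re = [1.0 / (se**2 + tau2) for se in std_errs]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_hrs)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return {
        "hr": math.exp(pooled),
        "ci_low": math.exp(pooled - z * se_pooled),
        "ci_high": math.exp(pooled + z * se_pooled),
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical study-level inputs (illustrative only, not from Table 2).
hypothetical_log_hrs = [math.log(0.55), math.log(0.70), math.log(0.80)]
hypothetical_ses = [0.15, 0.20, 0.10]
print(pool_random_effects(hypothetical_log_hrs, hypothetical_ses))
```

Under this model an I² above 50% would meet the a priori threshold for substantial heterogeneity, and a larger between-study variance pulls the study weights closer together, so that large studies dominate the pooled estimate less than they would in a fixed-effect model.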

Results
Literature search. Eligible studies were identified and selected as shown in Fig. 1. Among the 3,789 studies identified for initial evaluation, 1,963 studies were eligible for further assessment based on pre-specified criteria. Abstracts of these studies were reviewed and 1,804 studies were excluded for the reasons delineated in Fig. 1.
After abstract review, we identified 159 articles for full manuscript review, and 106 of these studies were excluded. The most common reasons for exclusion were the following: no relevant outcome (N = 63); shared identical population (N = 23); and editorial, letter, or commentary (N = 19). There were 53 studies eligible for inclusion, but 10 studies were found to have insufficient data. Therefore, 43 studies were included in the final meta-analysis (Table 1) [24-65].
Study characteristics. Characteristics for each study are summarized in Table 1. The forty-three studies had a median quality score of 6 out of 9 (range: 3-8) and consisted of a median of 243 patients (range: 42-2,369), with a median follow-up of 64 months (range: 18-240). All studies were published from 1997 to 2017. One study was included from an abstract because of the large number of patients (N = 2,293) in that retrospective study (Sinicrope et al. [64]). Study sample sizes ranged from 42 to 2,369 patients, representing an overall total of 21,015 patients. HRs and 95% CIs for overall survival (OS), cancer-specific survival (CSS), or disease-free survival (DFS) were derived directly when available. Study variables and results are summarized in Table 1 and Table 2, respectively.
Subgroup analysis. Prognostic effect estimates were pooled for generalized tumor inflammatory infiltrate counts and T-lymphocyte subsets stratified by tumor location (IM, TS, CC) in CRC. Due to the limited number and small sample sizes of studies within each subset, estimates of between-study heterogeneity were imprecise. Therefore, we performed funnel plot analyses for both generalized tumor inflammatory infiltrates and T-cell subsets.
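A funnel plot places each study's log HR against its standard error; in the absence of small-study effects, the points scatter symmetrically around the pooled estimate and mostly fall within pseudo 95% limits that widen as the standard error grows. The sketch below computes those coordinates without drawing the figure. The function name `funnel_points` and the inputs are hypothetical, and the pseudo-limit construction (pooled estimate ± 1.96 × SE) is the common convention rather than a detail confirmed by the source.

```python
def funnel_points(log_hrs, std_errs, pooled_log_hr, z=1.959964):
    """Return (log_hr, se, inside_funnel) for each study.

    inside_funnel is True when the study lies within the pseudo 95% limits
    pooled_log_hr +/- z * se drawn on a funnel plot.
    """
    if len(log_hrs) != len(std_errs):
        raise ValueError("log_hrs and std_errs must have the same length")
    points = []
    for y, se in zip(log_hrs, std_errs):
        half_width = z * se  # limits widen for less precise studies
        points.append((y, se, abs(y - pooled_log_hr) <= half_width))
    return points


# Hypothetical inputs (illustrative only): three studies around a pooled
# log HR of -0.40.
for y, se, inside in funnel_points([-0.45, -0.30, -1.20],
                                   [0.10, 0.25, 0.30], -0.40):
    print(f"log HR {y:+.2f}, SE {se:.2f}, inside funnel: {inside}")
```

Points falling predominantly on one side of the pooled estimate among the less precise studies would suggest asymmetry, which is one possible sign of publication bias; visual inspection of this kind is subjective when few studies are available.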

CD8+ T lymphocyte subset. CD8+ T cells are cytotoxic T cells that promote apoptosis of cancer cells [66]. The association between the presence of CD8+ T lymphocytes and survival of CRC patients was extracted from thirteen studies (Fig. 4).

Discussion
We have performed a systematic review and quantitative meta-analysis of the prognostic impact of tumor-infiltrating lymphocyte density and composition on CRC outcome. Through a computerized, systematic literature search of Medline, Embase, Web of Science, and Scopus databases using pre-determined inclusion criteria, we identified 43 studies published between August 1997 and April 2017 (representing a total of 21,015 CRC patients with available samples) that evaluate specific marker subset populations of tumor-infiltrating lymphocytes in CRC and survival outcomes. We separately considered generalized TIL density, CD3, CD8, FOXP3, and CD45R0 as the focus of our meta-analysis, recognizing that there are other systems of scoring the host immune response that are beyond the scope of the current meta-analysis. Since the publication of an initial meta-analysis of TILs and CRC in 2014 by Mei et al., which included 7,840 patients, tissue samples from an additional 13,175 CRC patients have undergone analysis for TIL density by T-cell subset and histopathologic location. Due to the increasing recognition of the intratumoral adaptive immune reaction as a prognostic marker for survival and as a therapeutic target of immune checkpoint inhibitors, we performed an updated systematic review and meta-analysis of TILs. Pooled analysis from an extensive compilation of studies suggests that high generalized TIL counts and CD3+ T-cell density have the strongest association with survival benefit, as compared to low generalized TIL counts and CD3+ T-cell density, with regard to disease-free survival (DFS), cancer-specific survival (CSS), and overall survival (OS). The pooled summary HRs for each T-cell subset were inconsistent across different studies. Some markers (CD3, CD8, FOXP3) trended towards a stronger prognostic association with survival than in the earlier analysis performed by Mei et al.
The effect of the immune system in colorectal cancer is still being elucidated, as several prospective and retrospective studies demonstrate that robust antitumor immunity is associated with favorable prognosis in patients with CRC. Notably, we confirmed in our study a prognostic benefit of FOXP3+ T-cell infiltrates, which stands in contrast to previous meta-analyses suggesting that tumor-infiltrating FOXP3+ T-cells are associated with poor clinical outcomes in solid cancers [68,69]. Recent studies elucidating the interplay between the tumor microenvironment and colonic microbiome have identified two distinct subpopulations of immunosuppressive and proinflammatory FOXP3+ T-cells. Investigators found that proinflammatory FOXP3(lo) T-cells were associated with an increased presence of Fusobacterium nucleatum and better CRC patient prognosis, while immunosuppressive FOXP3+ T-cells were associated with worse outcomes [70]. Additional TIL research is ongoing to understand the modulation of T-cell trafficking by the gut microbiome and the control of tumor growth by direct lysis of cancer cells through the production of cytokines that promote a cytotoxic response [71,72]. In addition, new immunotherapies are being developed that harness adoptive transfer of marker-specific TIL populations to elicit an immune response to tumors [73].
Our meta-analysis demonstrates that generalized TIL density is a strong prognostic marker for survival in patients with colorectal cancer. This result is concordant with previous studies that identified the association of TILs with increased survival [74]. The strengths of the study include the addition of large retrospective studies by Rozek et al. [63] and Sinicrope et al. [64], which included 2,369 patients and 2,293 patients, respectively, adding further precision and generalizability to the recognition that TILs confer a prognostic advantage, with a maximum likelihood HR = 0.65 for overall survival.
These findings are consistent with previous meta-analyses [18], yet our results have caveats that are relevant to this type of summary analysis. First, heterogeneity existed in most analyses even though subgroup and overall summary estimates were similar. Also, studies that used different methods of TIL identification, small populations, and variations associated with archival specimens were pooled. Nonetheless, the more homogeneous TIL density summary estimates were similar to the overall summary estimates, suggesting that the overall summary measures are a reasonable estimation of prognosis associated with TILs. Second, the meta-analysis was subject to detection, verification, and spectrum biases from the original studies. We may have overlooked relevant studies with results (negative or limited) that would modify the estimates. In addition, the different cutoff values for designation of high vs low TIL were a source of bias for this meta-analysis. Among the analyzed studies, the cutoff values included presence or absence (Nagtegaal et al. [28]; Cianchi et al. [30]; Gao et al. [32]; Ogino et al. [37]; Richards et al. [50,57]), TIL count with a different threshold for high vs low (Lee et al. [44]; Rozek et al. [63]), and mean, median, and quartiles (Naito et al. [25]; Guidoboni et al. [27]; Chiba et al. [31]; Menon et al. [33]; Galon et al. [14]; Salama et al. [39]; Frey et al. [43]; Lee et al. [44]; Nosho et al. [45]; Sinicrope et al. [40,64]; Yoon et al. [51]; Di Caro et al. [54]). Some studies detected TILs by tissue microarray while others used full histologic sections.
These differences could be responsible for the variability in reaching a