Introduction

The Tumor, Node, Metastasis (TNM) staging system, developed by the American Joint Committee on Cancer (AJCC) and the Union for International Cancer Control, is the most widely used cancer staging system worldwide [1]. The tumor stage is the most important factor in guiding treatment decisions and prognosticating outcome in patients with colorectal adenocarcinoma. The TNM staging system is periodically updated as new prognostic and therapeutic information accrues. These cancer staging guidelines are also incorporated into the College of American Pathologists (CAP) cancer staging templates [2]. The latest AJCC 8th edition Cancer Staging Manual defines pT3 tumors as “tumor that invades through the muscularis propria into pericolorectal tissues” and pT4a tumors as “tumor invades through the visceral peritoneum (including gross perforation of the bowel through tumor and continuous invasion of tumor through area of inflammation to the surface of the visceral peritoneum)” [1, 2]. Tumor cells unequivocally identified at the serosal surface are consistently classified as pT4a, and tumors invading through the muscle wall and identified more than 1 mm from the serosal surface are consistently classified as pT3 by pathologists. However, tumor cells present ≤1 mm from the serosal surface are inconsistently classified as either pT3 or pT4a, more so when there is contiguous inflammation or a fibro-inflammatory reaction [3,4,5,6,7,8,9,10]. This has been confirmed by recent survey-based studies, which show that despite the newer AJCC 8th edition staging manual guidelines, there is considerable variability in staging pT3 versus pT4a cancer in the above-mentioned challenging scenarios, even amongst subspecialized gastrointestinal pathologists [11,12,13,14]. Not surprisingly, this may lead to inconsistencies in cancer staging across institutions, potentially affecting postsurgical treatment decisions and causing difficulties in comparing outcome-based studies across various institutions.

To this end, the aim of our study was to compare the clinical and pathologic features and clinical outcomes in tumors that would be consistently classified as pT4a (defined as tumor cells at the serosal surface in this study) and tumors that would be consistently classified as pT3 (defined as tumors invading through the muscle wall but present >1 mm from the serosal surface in this study) versus the challenging inconsistent group (i.e., tumors ≤1 mm from the serosal surface, separated from it by either reactive fibrosis or inflammation). Our goal was to discern whether the tumors in this latter challenging group behave more like bona fide pT4a tumors unequivocally present at the serosal surface or more like bona fide pT3 tumors present >1 mm from the serosal surface. In addition, we evaluated the outcomes of the lymph node-negative subset of these three groups, as well as the outcomes of these three groups when adjusted for adjuvant therapy and nodal metastases, in an attempt to reduce selection bias.

Materials and methods

Study base and patient eligibility

Institutional review board approval was obtained from the two participating academic institutions (Penn State Health Milton S. Hershey Medical Center and Albany Medical Center). A retrospective search of final pathology reports from 2010 to 2015 was performed for any colectomy specimen signed out as colonic adenocarcinoma and staged as either pT3 or pT4a (per the then-applicable AJCC staging manual). Cases that had received neoadjuvant therapy were excluded from the study. Given that most rectal cancers receive neoadjuvant therapy, coupled with the fact that lower rectal cancers cannot be staged as pT4a, and owing to different therapeutic implications (i.e., total mesorectal excision for rectal cancers), all rectal cancers were excluded from this study. Cases that were not adequately staged at the time of excision with regard to lymph nodes (i.e., pN0 with <12 total lymph nodes retrieved) were also excluded. Lastly, colon cancer patients without available 5-year follow-up data were excluded, as were all patients with known distant metastatic disease (stage IV) at the time of surgery and/or initial presentation. Clinical data were extracted from the patients’ electronic medical records and/or tumor registry records. The data collected included age at the time of surgery, sex, date of surgery, site of the tumor, recurrence (defined as loco-regional recurrence and/or distant metastases), date of local recurrence (if applicable), date and site of distant metastatic disease (if applicable), date of last follow-up/death (due to disease), and status of adjuvant therapy.

Tumor groups and histopathologic features

All the H&E slides from each case were re-reviewed by the gastrointestinal pathologist at the respective institution to re-assess the greatest depth of invasion. Although the number of sections taken in each case was not uniform, as these were older cases, the departmental grossing protocols were followed, multiple sections were taken from the grossly evident deepest extent of invasion, and multiple recuts/step sections were performed by the primary pathologists, as deemed appropriate, in cases with tumor close to the serosal surface. All the cases were classified into one of three groups for this study: Group 1: tumor cells present at the serosal surface (designated “bona fide” pT4a tumors in this study) (Fig. 1a); Group 2: tumor cells ≤1 mm from the serosal surface (not further subdivided into pT3 or pT4a); and Group 3: tumor invading through the muscularis propria with greatest depth of tumor cells present >1 mm from the serosal surface (designated “bona fide” pT3 tumors in this study) (Fig. 1b). Group 2 was further sub-grouped into Group 2a: tumor cells ≤1 mm from the serosal surface, separated by fibrosis (reactive fibroblastic reaction) (Fig. 2a), and Group 2b: tumor cells ≤1 mm from the serosal surface, separated by contiguous inflammation (Fig. 2b). Some of the Group 2 cases had both fibrosis and inflammation (fibro-inflammatory reaction) as the serosal reaction separating the tumor cells from the serosal surface. In these cases, the predominant pattern was used to subgroup the tumor. If the serosal reaction showed more fibrosis than inflammation, the case was placed in Group 2a (Fig. 2c), whereas if the intervening tissue showed more inflammation than fibrosis, it was placed in Group 2b (Fig. 2d).
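The grouping rule described above can be written down as a small decision function. This is only an illustrative sketch: the function name `assign_group`, its parameters, and the string labels are hypothetical and not part of the study, and the function assumes that the tumor has already been confirmed to invade through the muscularis propria and that the distance has been measured on the slide by a pathologist.

```python
from typing import Optional


def assign_group(distance_mm: float, predominant_reaction: Optional[str] = None) -> str:
    """Return the study group label for one tumor (hypothetical helper).

    distance_mm: distance from the deepest tumor cells to the serosal
        surface, with 0 meaning tumor cells at the serosal surface.
    predominant_reaction: "fibrosis" or "inflammation"; only used to
        sub-group tumors that are <=1 mm from the serosal surface.
    """
    if distance_mm < 0:
        raise ValueError("distance cannot be negative")
    if distance_mm == 0:
        return "1"  # tumor cells at the serosal surface ("bona fide" pT4a)
    if distance_mm <= 1.0:
        if predominant_reaction == "fibrosis":
            return "2a"  # separated by (predominantly) reactive fibrosis
        if predominant_reaction == "inflammation":
            return "2b"  # separated by (predominantly) contiguous inflammation
        return "2"  # reaction type not specified
    return "3"  # >1 mm from the serosal surface ("bona fide" pT3)
```

The 1 mm boundary is inclusive for Group 2, matching the ≤1 mm definition used throughout the study.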

Fig. 1: Histopathologic appearance of colonic adenocarcinoma.

(a) Unequivocal pT4a tumor identified at the inked serosal surface (designated as Group 1 tumor in this study); (b) pT3 tumor identified more than 1 mm from the serosal surface (designated as Group 3 tumor in this study).

Fig. 2: Histopathologic appearance of deeply invasive colonic adenocarcinoma present ≤1 mm from the serosal surface.

Tumor (designated as Group 2 tumors in this study) separated from the serosal surface by (a) reactive fibrosis; (b) contiguous inflammation; (c) fibro-inflammatory reaction with possibly more fibrosis than inflammation; (d) fibro-inflammatory reaction with possibly more inflammation than fibrosis.

The other histopathologic features reviewed included tumor differentiation (all well (>90% gland formation) and moderately differentiated (50–90% gland formation) tumors were classified as low-grade adenocarcinoma and all poorly differentiated (<50% gland formation) and undifferentiated adenocarcinoma were classified as high-grade adenocarcinoma), tumor size (determined by measuring the greatest dimension), small vessel (lymphovascular) and large vessel invasion, number of examined and positive lymph nodes, and the presence of tumor deposits.

Statistical analysis

Descriptive statistics were used to characterize the sample: frequencies and percentages for categorical measures, and means, standard deviations, medians, and quartiles for continuous measures. The association between the groups and categorical measures was evaluated using chi-square or Fisher’s exact tests, while medians and quartiles of continuous measures were compared among the groups using the Kruskal–Wallis test. Survival was initially compared among the groups with Kaplan–Meier curves and the log-rank test. Results were reported in terms of Kaplan–Meier plots and 5-year survival rates compared via log-rank p values. Proportional hazards regression was then used to compare hazard rates among groups with adjustment for adjuvant therapy. Additional baseline factors were evaluated for inclusion in the multivariable model from among those characteristics that were significantly different among the groups. If these factors did not significantly contribute to the predictive power of the model, they were subsequently removed. Results were reported in terms of adjusted hazard ratios, 95% confidence intervals, and p values. Recurrence was compared among the groups using the same survival analysis methods described above. All analyses for survival and recurrence were then repeated within the subset of subjects with no positive lymph nodes. SAS version 9.4 was used for all analyses, and significance was defined as p < 0.05.
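For readers unfamiliar with the Kaplan–Meier method, the following minimal sketch shows how a 5-year survival rate is obtained from follow-up times with censoring. It is written in plain Python for illustration only; the study itself used SAS 9.4, and the function names and the six-patient data set below are hypothetical, not study data.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times: follow-up in months; events: 1 = death observed, 0 = censored.
    Subjects censored at time t are counted as at risk at t.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # everyone leaving the risk set at t
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            survival *= 1 - d / at_risk  # product-limit step
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


def survival_at(curve, t):
    """Step-function lookup: the last estimate at or before time t."""
    s = 1.0
    for time, value in curve:
        if time > t:
            break
        s = value
    return s


# Hypothetical toy cohort (months of follow-up, event indicator).
toy_times = [6, 12, 12, 30, 60, 60]
toy_events = [1, 1, 0, 1, 0, 0]
curve = kaplan_meier(toy_times, toy_events)
five_year = survival_at(curve, 60)  # estimated survival probability at 60 months
```

In this toy cohort the estimate steps down at months 6, 12, and 30, and patients censored at month 60 contribute to the risk set until they leave it, which is what distinguishes the estimator from a simple proportion of survivors.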

Results

Tumor re-categorization into Groups 1, 2, and 3

A total of 151 cases from two academic institutions were retrieved after applying the above exclusion criteria. Group 1 (tumor cells at the serosal surface; “bona fide” pT4a tumors) consisted of 38 (25%) cases (Fig. 1a). Group 2 (tumor ≤1 mm from the serosal surface, separated by either reactive fibrosis or contiguous inflammation) comprised 49 (32%) cases, and the remaining 64 (42%) cases were classified as Group 3 (pT3 tumor >1 mm from the serosal surface; “bona fide” pT3 tumors) (Fig. 1b). The 49 cases in Group 2 were further subcategorized into 40 cases in which the tumor was separated from the serosal surface by ≤1 mm of reactive fibrosis (Group 2a) (Fig. 2a, c) and nine cases with tumor ≤1 mm from the serosal surface with contiguous inflammation to the serosal surface (Group 2b) (Fig. 2b, d). Of note, all of our deeply invasive tumors ≤1 mm from the serosal surface exhibited either fibrosis or inflammation in the intervening subserosal tissue; none showed unremarkable fibrofatty subserosal tissue. The clinico-pathologic features of the tumors in the three groups are listed in Table 1.

Table 1 Clinico-pathologic features of tumors in different groups.

Nodal status at the time of initial presentation

No nodal metastases were seen in 82 cases (54%), single-node involvement was identified in 20 cases (13%), and two to three lymph nodes were positive in 17 cases (11%). The remaining 32 cases (21%) had at least four positive lymph nodes. Amongst the 82 lymph node-negative cases, 42 cases (51%) were Group 3 tumors, while the remainder were roughly equally divided between Group 1 (19 cases, 23%) and Group 2 (21 cases, 26%). The nodal metastases in the three groups are listed in Table 1. All patients with distant metastases at the time of initial presentation and/or surgery (synchronous metastases) were excluded from the study.

Clinical outcomes

About half of the patients (76/151, 50.3%) received adjuvant therapy after the surgical resection. Forty patients (26.5%) showed clinical evidence of recurrence (loco-regional and/or distant) within 5 years of resection; of these, 37 patients showed evidence of distant metastatic disease (Table 1). Seventeen of these 37 patients had metachronous peritoneal metastases with or without other organ metastases (M1c).

Association between tumor groups and patient outcomes

There was a significant difference among the three groups with regard to recurrence-free rates and 5-year survival rates (overall log-rank p < 0.001 for both; Table 2). The 5-year recurrence-free rates for Group 1 (0.56) and Group 2 (0.57) were significantly lower when compared with Group 3 (0.87) for the entire patient cohort (Table 2, Fig. 3a). Similarly significant differences in recurrence-free rates were seen for the lymph node-negative subset of the three groups (Table 2, Fig. 3b). The 5-year survival rates for Group 1 (0.18) and Group 2 (0.29) were also significantly lower when compared with Group 3 (0.83) for the entire patient cohort (Table 2, Fig. 4a). A similar pattern in survival rates was seen for the lymph node-negative subset of the three groups (Table 2, Fig. 4b). We attempted to further study the 5-year recurrence-free rates and survival rates (Table 2, Fig. 4c) by stratifying Group 2 into Groups 2a (intervening fibrosis) and 2b (intervening inflammation). There was a significant difference among the four groups (Table 2; Fig. 4c), although the numbers in Group 2b were small and did not provide sufficient power for robust comparisons.
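The log-rank comparison behind these p values can be illustrated with a simplified two-group version. Note the assumptions: the study compared three (and four) groups in SAS, whereas this sketch handles only two groups with one degree of freedom; the function name and the toy data are hypothetical and do not reproduce any study result.

```python
import math


def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Return (chi_square, p_value) for a two-group log-rank test (1 df)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in data if u >= t)  # at risk overall
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)  # at risk in group A
        d = sum(1 for u, e, _ in data if u == t and e)  # events overall at t
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n  # expected events in A under the null
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    chi_square = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


# Hypothetical toy data: early events in group A, late events in group B.
chi_sq, p = logrank_two_groups([1, 2, 3], [1, 1, 1], [10, 11, 12], [1, 1, 1])
```

The test compares, at every event time, the number of events observed in one group with the number expected if both groups shared the same hazard; identical groups therefore give a statistic of zero.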

Table 2 Kaplan–Meier rates and log-rank comparisons in different tumor groups.
Fig. 3: Kaplan–Meier curves showing the recurrence-free probability for the different tumor groups.

(a) Kaplan–Meier curve for the entire cohort of patients; (b) Kaplan–Meier curve for the lymph node-negative subset of the three groups. (Group 1: pT4a colonic adenocarcinoma at the serosal surface; Group 2: deeply invasive colonic adenocarcinoma ≤1 mm from the serosal surface; Group 3: pT3 colonic adenocarcinoma >1 mm from the serosal surface.)

Fig. 4: Kaplan–Meier curves showing the survival probability for the different tumor groups.

(a) Kaplan–Meier curve for the entire cohort of patients; (b) Kaplan–Meier curve for the lymph node-negative subset of the three groups; (c) Kaplan–Meier curve for the entire cohort of patients with Group 2 further divided into Groups 2a and 2b. (Group 1: pT4a colonic adenocarcinoma at the serosal surface; Group 2: deeply invasive colonic adenocarcinoma ≤1 mm from the serosal surface [2a: separated by fibrosis; 2b: separated by contiguous inflammation]; Group 3: pT3 colonic adenocarcinoma >1 mm from the serosal surface.)

Based on the comparison of clinical and pathologic features among the three groups (Table 1), tumor size, large vessel invasion, tumor deposits, and lymph node metastases were evaluated as potential covariates to include in the proportional hazards regression models for recurrence and survival, along with adjuvant therapy. All potential factors were included in the multivariable model and considered for removal in a manual backward selection process if they did not significantly contribute to the predictive power of the model. The only factor that remained significant was the number of positive lymph nodes, which was retained in each model along with adjuvant therapy and the group indicator. Even after adjusting for adjuvant therapy and number of positive lymph nodes, the hazard ratios for the risk of death (p < 0.001) and risk of recurrence (p = 0.005) showed significantly higher risk in Groups 1 and 2 compared with Group 3 (Table 3). When stratifying Group 2 into 2a and 2b, the risk of death was 6.5 times greater for those in Group 2a versus Group 3 (95% CI 3.20–13.22, p < 0.001), and 12.9 times greater for those in Group 2b versus Group 3 (95% CI 5.12–32.66, p < 0.001). The wide confidence interval for the Group 2b comparison was a result of the small sample size. In addition, the risk of death was 2.3 times greater for those who did not receive adjuvant therapy (95% CI 1.45–3.79, p < 0.001). Even in the lymph node-negative subset of the three groups, there was a significant difference in the risk of recurrence (p = 0.004) and risk of death (p < 0.001) when adjusted for adjuvant therapy (Table 3).
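The relationship between the reported hazard ratios and their confidence intervals can be checked with a short calculation. The sketch below assumes that the intervals are Wald-type, i.e. symmetric on the log scale as exp(β ± 1.96·SE), which is an assumption about the software output rather than something stated in the text; the helper name `log_scale_summary` is hypothetical.

```python
import math


def log_scale_summary(lower, upper, z=1.96):
    """Recover the implied hazard ratio and SE of log(HR) from a 95% CI.

    Assumes a Wald-type interval: exp(beta - z*SE) to exp(beta + z*SE).
    """
    log_lower, log_upper = math.log(lower), math.log(upper)
    point = math.exp((log_lower + log_upper) / 2)  # geometric mean of the bounds
    se = (log_upper - log_lower) / (2 * z)
    return point, se


# Confidence intervals as reported in the text (risk of death).
hr_2a, se_2a = log_scale_summary(3.20, 13.22)  # Group 2a versus Group 3
hr_2b, se_2b = log_scale_summary(5.12, 32.66)  # Group 2b versus Group 3
hr_no_adj, se_no_adj = log_scale_summary(1.45, 3.79)  # no adjuvant therapy
```

Under this assumption the geometric mean of each pair of bounds lands on the reported point estimate (about 6.5, 12.9, and 2.3), and the larger implied standard error for Group 2b reflects its wider interval, consistent with the small number of cases in that subgroup.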

Table 3 Multivariable proportional hazards regression analysis comparison for the three tumor groups.

The median time (in months) to outcome was also calculated for each group to better depict the comparison. For both the entire cohort and the lymph node-negative cohort, the median times to recurrence and to death were significantly lower for Groups 1 and 2 than for Group 3 (log-rank p value < 0.001 for both time to recurrence and time to death) (Table 4). The median rather than the mean time to outcome is reported because the mean is underestimated when some patients have not yet experienced recurrence or death by the end of the study (censored observations).
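The effect of censoring on the mean can be seen in a toy example. All numbers below are hypothetical and chosen only to illustrate the argument; in this example fewer than half of the patients are censored, so the simple sample median coincides with the median that a Kaplan–Meier estimate would give.

```python
from statistics import mean, median

STUDY_END = 60  # months of follow-up available (hypothetical)

# Hypothetical true times to event; two patients would fail after the study ends.
true_times = [10, 20, 30, 80, 100]

# What the study can actually record: follow-up stops at STUDY_END (censoring).
observed_times = [min(t, STUDY_END) for t in true_times]

naive_mean = mean(observed_times)  # treats censored follow-up as if it were an event time
true_mean = mean(true_times)
observed_median = median(observed_times)
true_median = median(true_times)
```

Here the mean of the recorded times is 36 months against a true mean of 48 months, while the median is 30 months in both cases, which is why the median is the more robust summary when some observations are censored.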

Table 4 Median time (in months) to outcome comparison for the three tumor groups.

Discussion

The AJCC 8th edition Cancer Staging Manual defines pT4a tumors as “tumor invades through the visceral peritoneum (including gross perforation of the bowel through tumor and continuous invasion of tumor through areas of inflammation to the surface of the visceral peritoneum)” [1]. The latter half of this updated guideline, particularly the phrase “through areas of inflammation,” is subject to different interpretations and leaves some ambiguity as to whether tumor cells communicating with the serosal surface through inflammation or serosal reaction without gross perforation should be staged as pT4a. The manual also states that “the significance of tumors that are <1 mm from the serosal surface and accompanied by serosal reaction is unclear, with some but not all studies indicating a higher risk of peritoneal recurrence” [1, 2, 15, 16]. Even in CAP synoptic reporting for colonic cancers, in the tumor extension category, tumor continuous with serosal surface through inflammation is included in the category of tumor invading peritoneum, causing some confusion and leading pathologists to subjective interpretation in challenging cases [2].

Not surprisingly, despite the newest AJCC 8th edition staging manual guidelines, challenges and significant interobserver variability still exist when assigning a pT stage to a deeply invasive colorectal tumor in which neoplastic cells are present close (≤1 mm) to the serosal surface, as concluded by some recent survey-based studies [11,12,13,14]. Kirsch et al. queried 389 pathologists (132 gastrointestinal pathologists and 257 general pathologists) in North America and found that 34% of pathologists classified cancers ≤1 mm from a serosal reaction as pT4a, with subspecialized gastrointestinal pathologists doing so more often (42.4%) than general surgical pathologists (29.6%; p = 0.02). Seventy-seven percent of the respondents who staged these tumors as pT3 commented that there could be a breach in the peritoneum and that this tumor may behave as a pT4a cancer. The authors commented that such variation in the assessment and reporting of colorectal cancer resection specimens may affect tumor staging, prognosis, and patient management [12]. Another recent international survey-based study with 118 respondents concluded that tumors that communicated with the serosa through inflammatory foci were staged as pT3 (49%) or pT4a (51%) by nearly equal numbers of pathologists. No significant differences were seen with respect to practice model, region, or years of experience [11]. In addition, over half (53%) of pathologists also provided clinical colleagues with information regarding tumors <1 mm from the serosa, routinely noting a higher risk of peritoneal recurrence [11]. A more recent study published this year also found substantial variability in diagnosing pT4a colon cancer, at both the pathologist and laboratory level [13].

All these recent studies reaffirm that many pathologists indeed reject the notion that a pT4a designation requires tumor cells at the serosal surface and support the need for updated staging guidelines for consistent and reproducible staging of deeply invasive carcinomas that are ≤1 mm from the serosal surface, which are often associated with fibro-inflammatory reaction. These guidelines need to be supported by clinical outcome data as well. In our study, we found that tumors ≤1 mm from the serosal surface (Group 2 tumors) fared significantly worse than the pT3 tumors >1 mm from the serosal surface (Group 3 tumors) and behaved more like bona fide pT4a tumors present at the serosal surface (Group 1 tumors) with regard to clinical outcomes (overall log-rank p value < 0.001 for both recurrence-free rates and survival rates). This is also depicted by the Kaplan–Meier curves, wherein the Group 2 tumors lie closer to the Group 1 tumors in both the recurrence-free and the survival curves. Similarly, in the lymph node-negative subset of these three groups, tumors ≤1 mm from the serosal surface fared significantly worse than pT3 tumors >1 mm from the serosal surface with regard to both recurrence-free rates (p = 0.026) and overall survival rates (p < 0.001).

The staging of tumors ≤1 mm from a serosal reaction has been a subject of debate, with various studies showing contradictory results [2, 4, 9, 15,16,17]. In a seminal study, Shepherd et al. assessed various levels of peritoneal involvement in 412 colonic cancers. They found that intraperitoneal recurrence was higher in tumors that unequivocally invaded the serosal surface (defining pT4 tumors) than in tumors that were away from or close to the serosa. The authors acknowledged the subjectivity of the assessment, which could be the reason why “closeness” to the serosa was not defined [16]. Subsequent studies supported Shepherd’s conclusions [16,17,18,19,20,21]. In one such study comprising 889 colon cancer patients, Snaebjornsson et al. [17] found that the 5-year survival rate for patients with unequivocal pT4 tumors was 41%, compared with 58% for tumors close to the serosal surface and 71% for bona fide pT3 tumors. Based on these results, the authors suggested that only tumors that unequivocally invade the serosal surface should be classified as pT4. However, the survival rate of the tumors close to the serosal surface was clearly worse than that of bona fide pT3 tumors in their study, and again, the “closeness” of the tumor to the serosa was not defined, probably because of the subjectivity involved in the assessment. A more recent study by Klaver et al. [19] used a 1 mm criterion in assessing serosal involvement and found that tumors that clearly involved the serosa had a higher chance of peritoneal recurrence (33%) than tumors that were close to the serosal surface (21%). However, in this study, the 5-year risk of metachronous peritoneal metastases was significant only in the univariate analysis, and tumors <1 mm from the serosal surface were not compared with pT3 tumors >1 mm from the serosal surface. In our study, the 5-year recurrence rate was significantly worse in tumors that were ≤1 mm from the serosal surface when compared with bona fide pT3 tumors (>1 mm) and was more in alignment with bona fide pT4 tumors.

Our results were very similar to those of Panarelli et al., who evaluated serosal cytologic smears from 120 colon cancer resection specimens. The authors found that 46% of pT3 tumors that were close to (≤1 mm from) a serosal tissue reaction were associated with positive serosal cytology preparations, similar to pT4a tumors (55%) in their study. Cytologic smears from pT3 tumors distant (>1 mm) from the serosal surface were negative for tumor cells. Twenty-two percent of pT3 tumors that were ≤1 mm from the serosal surface had disease recurrence, including 11% with peritoneal carcinomatosis. The rates were comparable to those of their pT4a colonic adenocarcinomas (36% with recurrent disease and 18% with peritoneal recurrence). Based on their findings, the authors proposed that pT4a be expanded to include deeply invasive tumors ≤1 mm from a serosal reaction [15]. Several other authors have also indicated that serosal involvement is often underestimated and that tumors ≤1 mm from the serosa that are accompanied by a serosal reaction are at higher risk of peritoneal recurrence [10, 14, 15, 22,23,24,25].

One of the weaknesses of our study is that the number of cases is relatively small, especially the cases with intervening contiguous inflammation. However, we believe that fibrosis, inflammation, and fibro-inflammatory reaction all represent variants of serosal reaction. Normally, subserosal tissue is composed of fibrofatty tissue with mature fat with delicate septa, blood vessels, lymphatics, and nerve fibers without inflammation [25]. Tumors elicit a reactive stroma that may be composed of reactive fibroblasts (leading to reactive fibrotic stroma) or inflammatory cells with or without granulation tissue. Often the serosal reaction is a mixture of the two: a fibro-inflammatory reaction composed of fibrosis admixed with inflammation. Therefore, trying to stratify these cases as having either intervening fibrosis or intervening inflammation, or deciding which component is predominant, again adds to the interobserver variability. Given that both intervening reactive fibrosis and inflammation represent serosal reaction to the tumor, it may be redundant to separate the two. Thus, we combined the entire group of tumors ≤1 mm from the serosal surface with any serosal reaction to achieve statistical power. In our study, all tumors ≤1 mm from the serosal surface elicited either reactive fibrosis, reactive inflammation, or a mixed fibro-inflammatory response, and the clinical outcome of the entire combined group, regardless of the subserosal tissue reaction type, was more akin to that of the bona fide pT4a tumors. In addition, since this was a retrospective study of older cases with at least 5-year survival data, the number of sections submitted per case was not uniform. However, the departmental grossing protocols were followed, and multiple recuts were performed by the primary pathologists, as deemed appropriate, in cases with tumor close to the serosal surface. Also, we did not include N1c disease (tumor deposits without nodal metastases) under lymph node metastases, given the variability in diagnosis of tumor deposits [26]. However, the presence of tumor deposits was evaluated as a potential covariate in the multivariable proportional hazards regression model and did not significantly contribute to the predictive power of the model. Lastly, we were not able to correlate the results with mismatch repair protein status, as many of the resections were performed before universal screening for mismatch repair protein deficiency came into effect at both institutions.

To conclude, our study supports the view that deeply invasive colonic adenocarcinoma ≤1 mm from the serosal surface with any type of serosal reaction (inflammation, fibrosis, or mixed fibro-inflammatory reaction) clinically appears to be a different group from bona fide pT3 tumors >1 mm from the serosal surface, and fares significantly worse. Similar results were seen in the lymph node-negative subset of these groups as well as when adjusted for adjuvant therapy and nodal metastases. We suggest that these tumors ≤1 mm from the serosal surface be staged as pT4a (rather than pT3), as the clinical outcome data in our cohort of patients show that these tumors behave more like the bona fide pT4a tumors identified at the serosal surface than like pT3 tumors >1 mm from the serosa. In addition, we hope that clearly stated, concrete guidelines would ensure consistency and reproducibility amongst pathologists when staging deeply invasive colonic adenocarcinomas. This would lead not only to optimal patient care but also to optimal comparisons of outcome data amongst different institutions.